The Messiness of Data

Nik Stanbridge, 'Underwater Mud,' 2014.

During a recent conversation with two colleagues I was lamenting the lack of continuity in our writing center's data. Exciting, right? Okay, it's not necessarily exciting, but it's certainly interesting. It turns out that this whole data thing is pretty messy, and, most surprising to me, it turns out that that's okay.

The problem under discussion was, and is, this: we know on a semi-formal basis (personal observation plus some quantitative work) that in the center I help staff, students who present as male are in the minority and that they tend to come to the center both later in the day and later in the semester. This pattern has clear policy implications: if male students are coming to the center later in the day and later in the semester, we should extend our hours at those times, because males are an underserved population for our writing center. However, when students fill out an information sheet before their consultation, the field in which that data is gathered is a blank text-entry box.

Having the field be text-entry means that we as researchers are dependent on our own ability to sift the data (does 'Ma' mean male? 'M'? What about 'Boi'?), and this makes using the data more difficult, not to mention slower. For instance, I wanted to perform a microstudy showing the breakdown of student use of the writing center by sex to publish on the blog, in order to demonstrate how even small amounts of data can drive policy in writing centers. However, there is a lot of work, and a lot of guesswork, involved in generating what would seem on the face of it a simple metric. In this sense the way the field is set up is, to put it simply, bad.
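To make the sifting problem concrete, here is a minimal Python sketch of one way a researcher might tally free-text responses. Everything in it is an assumption made for illustration: the lookup table `KNOWN_VARIANTS`, the function name `tally_responses`, and the sample entries are hypothetical, not our center's actual form data or workflow. The sketch only maps spellings it has been explicitly told about, and it sets aside everything else verbatim for a human to read rather than guessing.

```python
from collections import Counter

# Hypothetical lookup table. Every mapping here is a researcher's
# assumption about what a student meant, not a fact about the student.
KNOWN_VARIANTS = {
    "m": "male",
    "male": "male",
    "man": "male",
    "f": "female",
    "female": "female",
    "woman": "female",
}


def tally_responses(responses):
    """Count responses by category and keep unrecognized entries verbatim."""
    counts = Counter()
    needs_review = []
    for raw in responses:
        key = raw.strip().lower()
        if not key:
            counts["(blank)"] += 1
        elif key in KNOWN_VARIANTS:
            counts[KNOWN_VARIANTS[key]] += 1
        else:
            # Do not guess: 'Ma' or 'Boi' is left for a person to interpret.
            counts["(unrecognized)"] += 1
            needs_review.append(raw)
    return counts, needs_review


# Made-up sample entries, for illustration only.
sample = ["M", "male ", "Ma", "F", "Boi", "", "Female"]
counts, needs_review = tally_responses(sample)
print(dict(counts))
print(needs_review)
```

Even in this toy version the guesswork does not disappear. It moves into the lookup table, where each line records a decision the researcher made on the student's behalf.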

If viewed in another light, however, it is the opposite. Having the field be text-entry means that people are not constrained to Male/Female as they would be in a drop-down menu with two choices, and while we briefly considered the pros and cons of a drop-down menu with Male/Female/Other, it's not a particularly good solution. After all, in my own scholarly work drawing on psychoanalytic theory I refer to 'the Other' frequently, and usually I do so in the course of demonstrating how it is a category reserved for the denigrated, excessive, and unwanted among us. Using the Male/Female/Other drop-down menu would simplify the researcher's job and make the data more continuous, but it might enact a microaggression on a student about to enter the writing center encounter, and that's the opposite of what we're trying to do as writing center staffers and as scholars interested in the quantitative study of writing centers.

This problem with messy data reminds me of a recent interview Stephanie Graeter conducted with Amade M'Charek in which they discuss the praxiographic method. In the interview M'Charek states that praxiographic approaches allow her to step outside the limits of previous anthropological discussions of race as an object of study:

Praxiography compares and contrasts nicely with ethnography where the focus is more on human actors and what they say and do... Here not only humans but also non-humans are analyzed as actors, as things with agency. Also, I think, a post-linguistic turn in cultural anthropology should care for the suffix “graphy,” for it helps to denature culture. The language we use, the concepts we invent, and the style we device (sic) to describe the world helps to enact it as well.

In this response I hear more than a little echo of the New Materialism coming out of the Netherlands, among other places, and also of the voice of that granddaddy of Science and Technology Studies, Bruno Latour. Basically the argument as I understand it is that the ways our observational systems work, whether those be machines, data visualizations, or even our own minds, radically change the kinds of data we observe and, further, the ways we use that data. More than this, our observation brings the data into being: as M'Charek goes on to say, 'race' as an object of study is brought into being and made a vital part of the nature-culture assemblages in which we live through observation/surveillance, and the praxiographic approach allows her to make 'race' an object of study without either committing it to the dustbin of history as insufficiently real to deserve notice, or acceding to the existence of biological 'race' as an unproblematic fact.

In the writing center I help staff I'm not sure we'll be using a praxiographic approach any time soon. As Christian Bueger notes,

The core claim of praxiography is that ‘the social’, ‘the cultural’, and ‘the political’ are based primarily and in the last instance in implicit knowledge and meaning. The focus of praxiographers is on implicit or tacit knowledge, that is, a type of knowledge which is rarely verbalized and is hence not easily readable from signifiers, speech, and discourse. Practices are taken to be the mediator and carrier of such knowledge. Hence to understand social and political order, praxiography suggests studying practices which constitute these orders of knowledge. For praxiography explicit knowledge – such as norms or rules – and articulated meaning – for instance through speech – are of secondary relevance. Explicit knowledge and articulated meaning requires and depends on practice. Practice in this sense is simply ontologically prior. (386)

Just as the entry form is chronologically prior to the writing center encounter for a student, a praxiographic view of our intake form tells us that in some ways it is a powerful force shaping how, in retrospect, we understand our work in that encounter. It would require an ethnography of an entire writing center over the course of months (at least) to understand the praxiographic effects of the form on writing center encounters, and while that would be utterly fascinating, we can't perform that kind of study every time we want to maybe change a data field. Perhaps leaving the field as text-entry allows us to understand something truer than the simple, clarified M/F/O categories that might (de)form the praxiographics of our encounters with writers: in our encounters with data we shouldn't be too tempted to simplify, clarify, even unify data that should instead be various. This doesn't mean we should champion confusing methods of gathering data as being inherently better; using 'messier' entry techniques means that in some ways we are more dependent on the preconceptions of the researcher working with any given data set. However, it does mean that we shouldn't expect tidy data to come from a messy world, and we shouldn't give up a fealty to an actual, imperfect reality in favor of a more hygienic, orderly representation of that reality on a spreadsheet.