Digital Humanities 2011 and the elephant in the tent

July 15th, 2011

I couldn’t keep up with all that was going on at the Digital Humanities 2011 conference at Stanford last month, but I thoroughly enjoyed it, learned from it, and found myself thinking about unexpected connections while trying to make sense of it. The three key things I kept thinking about are: scale, materiality, and agency. The semi-processed notes that follow are not so much a record of the conference as they are an experiment to see if a succession of points might hint at an arc.

The paradox of Digital Humanities is that it is a term that attracts more interesting people than it can stably support through careful definition of a coherent field, clear identification of an object of study, or singular commitment to a fixed methodology. The theme of this year’s Stanford conference, the “big tent,” was meant to be welcoming to all, and I think on the ground it succeeded quite well. It wasn’t an occasion of anxious definitional boundary-drawing. People seemed willing to do what they were doing with confidence not only in their own work, but in its acceptance by others within the big tent. This is a strength not to be underestimated, I would like to think, and yet it could be difficult to see from a distance, outside the tent altogether.

A conference is a difficult thing to claim as an object of knowledge. There were more than 300 participants and usually four concurrent sessions. There is a substantive 417-page book of abstracts (closer to full papers than the term “abstracts” might imply; available as a PDF), and also a #dh11 Twitter hashtag that I did not manage to keep up with. Amid the divergent particulars of papers, posters, and projects, common themes or problematics kept emerging.

  • David Rumsey’s opening keynote showed us implicitly how unsatisfying many of our existing tools are through a masterful demonstration of the kind of digital experience that would enable exploration across multiple levels of scale, seamlessly going from small map scales to large ones, from thumbnails to full screen, from close reading to distant and back, without artificially drawing sharp distinctions between macro-scale discovery, broad analytic purposes, careful examination of detail, and speculative browsing.
  • I understand a perennial question at this conference has often been one or another version of “What is text?” in the light of digital text encoding practices. Statistical text mining approaches are getting a lot of attention now, and perhaps put these questions in a new light. Tastes differ in such things, as do philosophical commitments, but I thought generally the people I heard wore their humanities practice pretty easily and confidently, whatever their methodology. They were quite willing to explore how texts and meaning are contextually produced, not assuming a positivistic mechanism or scientistic magic somehow inherent in digital bits, even as they were in some cases interested in looking across different levels of scale. (I won’t say much directly about “distant reading” here, but recommend this recent post by conference co-organizer Matt Jockers, whose presentation at the conference was quite impressive.) The conference wasn’t quite what one might imagine of digital humanities solely from reading about “culturomics” in the popular press. Seen from the midst of the conference participants, what’s interesting is not so much what’s new as how much a horizon of the humanities can go without saying among people with relatively diverse backgrounds and occupational and disciplinary commitments.
  • I heard the conclusion of a paper Julia Flanders presented on behalf of herself and Jacqueline Wernimont, trying out the idea that textual markup could conceivably be a practice of philosophical exploration of “possible worlds.” An insightful comment during the question period suggested an analogy to Rumsey’s layering of historic maps, which Julia enthusiastically endorsed. To draw out just one of the implications of that comment: it might be helpful to people outside digital humanities to understand the humanistic character of practices like text encoding and database modeling by analogy to cartography. We can value the work of cartographers without getting particularly confused by the fact that the map is not the territory, and we can appreciate the cultural and rhetorical as well as technical “making” that goes into cartography. And we can accept that maps of different scales have different purposes, without pretending that differently scaled maps necessarily invalidate each other.
  • The panel on materiality was well attended and energetic. Jean-Francois Blanchette, Johanna Drucker, and Matt Kirschenbaum all spoke well about the material basis of technology, against an antihistorical techno-fantasy of disembodied bits. A couple of Blanchette’s slides reminded us of the shipping container industry as analogue and material support for digital technology, and that image seemed to give a particularly vivid ethical and environmental grounding to our sense of all having a stake in materiality. Matt Kirschenbaum talked about the materiality of born-digital archives, the physicality and historical boundedness of hardware, and the importance of engagement with archives, archivists, and materiality in digital humanities scholarship.
  • Johanna Drucker spoke brilliantly at a remarkable pace in what I understood to be advocacy of theoretically mature critical intellectual practice. I won’t pretend to be able to summarize her argument adequately, but toward the end of her talk she included theoretical gestures reminding us of the non-self-identicality of cultural objects, and of the idea of parallax, which I perhaps half-understood as looking at an object simultaneously from different perspectives. And I felt like I was hearing these gestures simultaneously from at least two perspectives: I had glimpsed these ideas before, and was intrigued by them, because something about them sounds right to me. Yet I was also hearing these gestures with sympathy for people who might be impatient of theory, and who might take non-self-identicality in particular as simply a logical offense.
  • Fred Gibbs gave a superb, understated presentation based on his work with Dan Cohen using text mining and visualization techniques to explore questions about Victorian intellectual history. He argued for starting with simple questions, simple tools, skepticism, modest results, and scholarly transparency in making code and data available. He was not cheerleading for his methods, but testing them, playing with them, and evaluating them. He showed how the same underlying question might lead him to produce different visualization graphs depending on whether the source data was just book titles or Google’s full-text n-grams. In the context of my conference experience Fred’s presentation turned into an especially significant highlight, for two reasons. 

First, his practice sounded to me like a pragmatic historian’s version of what I understood Johanna Drucker to be calling for. Gibbs produced and then wondered about three graphs of the “same” phenomenon that turned out not to be conclusively the same, for reasons that bear further exploration. I don’t know if this is what non-self-identicality means, but I appreciated that Gibbs is not sitting still, debating which side to pick in a false dichotomy between historical research object and digital method. He’s productively moving between imperfect method and uncertainly mediated object in an active hermeneutic process that he is transparently implicated in and willing to share, and he is unapologetic about the fact that the process isn’t over and its ultimate results cannot be fully and comfortably spoken for, because there is still something to learn.

Second, Gibbs was quite frank about the nature of his use of digital tools of inquiry in this process. He doesn’t need particularly fancy and complicated cyberinfrastructure to ask some of his questions; he just needs to be free to write short scripts and see results in quick cycles of exploration. I started to wonder whether having that kind of practice in mind, and then asking questions about scaling up, shouldn’t happen more often. The risk of creating sophisticated, methodologically committed digital tools in anticipation of supporting future scholarship is that unless there is the flexibility for quick iteration and change, complex tools might end up silencing considerable fields of potential evidence by hard-coding initial presumptions that will be expensive or difficult to change.

  • I made the most of an opportunity to find out what would bring digital poets to the digital humanities conference, a community I’m interested in but don’t know much about. It seemed to me that the digital poesis folks, much like Gibbs, are using technology critically and experimentally, fiddling with knobs to see what happens, and adjusting based on what they find. John Cayley’s exploration of language through Google searches, looking for short sequences of words that appear together in prior usage (but not too often) to incorporate into poems, catalyzed discussion around Google, which was perhaps a somewhat underacknowledged elephant in the big tent throughout much of the conference. Cayley understands himself to be simultaneously exploring the never-fully-accessible expansive world of language on the web and also the imperfect, essential mediating tool of the search engine. He drives his own exploratory process in a manner not so different from Gibbs. He doesn’t use Google’s n-gram data; as a matter of principle and method he insists on getting his search counts from “The Mouth,” his term for the simple Google web search box. Cayley spoke of chafing under Google’s terms of service and observed that Google didn’t want him to be a robot. For his purposes, it’s only as a robot, or a potential fast-typing equivalent of a robot, that he effectively has any agency at the interface. Debating culturomics on all sides can make it hard to see the immediacy of this sublime paradox. But it’s not a question for some speculative future. It matters now.
  • Plenty of presentations, papers, and posters involved various kinds of digital work in modeling, visualizing, organizing, and presenting human cultural materials. (My own poster [PDF], too, relates to a project that fits this general description.) Many people working in digital humanities understand that visualization tools and interface design are critically important, and there’s much more work to be done. But I found myself wondering how design necessarily looks different from the perspectives of anticipated future use and immediate active inquiry. Often, with “end users” in mind, we assume that applications on the web or elsewhere bind interfaces and data pretty tightly, and that a primary issue is the quality and clarity of the interface. It does not always go without saying that any interface can be a bad interface for scholarship if it’s the only one. The Linked Open Data movement is part of an answer to moving beyond this. The focus there still is mostly on the data publishing side, understandably. But on the demand side, there is not nearly enough exploration yet of paths toward researcher- or reader-driven (as opposed to user-tested) tools for inquiry.
  • Perhaps the most materially significant statement about the future of humanities research at network scale came during a reception honoring the establishment of the HathiTrust Research Center, which will provide computational research access (in some manner to be determined) to the full-text collections of millions of books in the digital collections of HathiTrust. This is an important and exciting initiative, but I got the sense that the vision is still well out ahead of what one would want in a community of practice able to provide informed feedback at appropriate scale.

Google’s Jon Orwant, who has put considerable work into engaging with the digital humanities community in the past few years, explained that the HTRC will support “nonconsumptive” research, meaning computational research on the aggregate of millions of digital volumes, including those in copyright, without providing full-text access that could be in potential violation of copyright. For rights management reasons and also for material engineering reasons, the research architecture will move the computation to the data. That is, the vision of the future here is not one in which major data providers give access to data in big downloadable chunks for reuse and querying in other contexts, but one in which researchers’ queries are somehow formalized in code that the data provider’s servers will run on the researcher’s behalf, presumably also producing economically sized result sets. The economic logic makes sense, given bandwidth limitations and data collections growing at the scale of petabytes. But it seems likely there are intellectual consequences here for research in the humanities that aren’t that easy yet to envision. For ordinary humanities researchers who are already not in a position to fix or route around the interfaces thought up by those of us who in good faith imperfectly make such things, what does it mean to announce that the way of the world will be to “move the computation to the data”? I can imagine many people in the humanities thinking: there were websites before, there will be websites after, what difference will this make to me?
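To make the idea concrete, here is a toy sketch of what “moving the computation to the data” could look like in miniature. It is purely illustrative and assumes nothing about how the HTRC will actually work: the corpus, the `run_on_provider` function, and the `words_of_interest` query are all hypothetical names invented for this example. The researcher hands over a small piece of code, the provider runs it next to the texts, and only aggregate counts come back.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for a provider-side collection. In the scenario
# described above, the full texts would never leave the provider's servers.
CORPUS: Dict[str, str] = {
    "vol1": "science and faith in the victorian age",
    "vol2": "faith and doubt",
    "vol3": "the age of science",
}


def run_on_provider(
    query: Callable[[str], Iterable[str]],
    max_results: int = 10,
) -> List[Tuple[str, int]]:
    """Run a researcher-supplied query over every volume and return only
    a small table of aggregate counts, never the underlying text."""
    totals: Counter = Counter()
    for text in CORPUS.values():
        totals.update(query(text))
    return totals.most_common(max_results)


def words_of_interest(text: str) -> Iterable[str]:
    """The researcher's side of the bargain: a short, replaceable script
    that picks out the terms being tracked (an assumed example query)."""
    watched = {"science", "faith", "doubt"}
    return [word for word in text.split() if word in watched]


# The researcher sees counts per term, not the volumes themselves.
result = run_on_provider(words_of_interest)
print(result)
```

Even in this toy version, the shape of the question is fixed by what the provider agrees to run and return, which is where the intellectual consequences begin.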

If “moving computation to the data” is going to work as an active humanities research practice, the expressiveness of computation as inquiry will have to move toward researchers and readers. Whatever we mean by “computation,” that is, can’t be locked up in an interface that tightly binds computation and data. Readers already need (and for the most part do not have) our own agents and our own data, our own algorithms for testing, validating, calibrating, and recording our interaction with the black boxes of external infrastructure. The black boxes, too, are time-bound artifacts of culture, and must be read, within and without.

