Why Using a Machine to Study Literature Isn’t as Heretical as it Sounds


by Mae Capozzi

Of the backlash against the Digital Humanities (of which there is plenty, I assure you), the most interesting to me is the fear that if we begin to use computers to read, we will irrevocably lose the humanistic aspect of reading. For many, reading is a sensory experience. Readers want to touch and smell the book––they want to feel something. Many scholars just don’t want to fully quantify what they view as a wonderfully qualitative experience. It is one thing to theorize based on a close reading of hundreds of texts over the course of a lifetime; it is another thing entirely to analyze thousands of texts in just a few hours using a computer program like MALLET.

Franco Moretti, in “The Slaughterhouse of Literature,” explains that “if we set today’s canon of nineteenth-century British novels at two hundred titles (which is a very high figure), they would still be only about 0.5 per cent of all published novels.” So here we have a task that is only solvable with the help of a computer––it would take thousands of lifetimes to read the other 99.5% of nineteenth-century British novels. How about all of the other novels written in the nineteenth century in the Western world, or the non-Western world, or in other centuries? I could go on and on. The sheer massiveness of this project makes it seem wrong not to take advantage of this technology (or at least to give it a whirl).


While I locate myself firmly within the contra-canon camp, there is always the pro-canon argument that the canon consists of the best books ever written and anything outside of the canon is not worth serious literary study. The pro-canon group can certainly ask why, if only the canon is worthwhile, anyone should take the time to read the other 99.5% of texts. While scholars may never agree on this point, I believe that even supporters of the traditional canon can get behind DH, because by allowing us to read outside of the canon, it can give us a deeper understanding of why canonical texts are not “sent to the slaughterhouse” with the other 99.5%.

In “The Limits of the Digital Humanities,” Adam Kirsch––one of the more vehement critics of the Digital Humanities––argues: “The best thing that the humanities could do at this moment…is not to embrace the momentum of the digital, the tech tsunami, but to resist it and to critique it.” At the core of Kirsch’s response is a fear that the “digital” aspect of DH will destroy the humanistic nature of the humanities. Rather than acknowledging the possibility for interdisciplinarity, Kirsch seems to believe that DH disregards a “humanities culture” that “prizes thinking and writing skills” in favor of the “making and building” typical of scientific research. I see this analysis as overly reductive and ultimately false. It seems abundantly clear that the Digital Humanities require not only a familiarity with technology, but also a deep understanding of the discipline being studied. To return to Moretti: he makes the depth of his knowledge about both the Western canon and other literatures abundantly clear in Distant Reading, which is why he can support the grand claims he makes about the novel as a whole. If he were less well-read, he would fail to see the trends emerging in his research and he would be unable to back that research up.

Kirsch somewhat cynically wonders: “Was it necessary for a humanist in the past five hundred years to know how to set type and publish a book?” I’d like to rephrase his question: Was it necessary for a writer in the 1800s to know how to use a computer? Obviously not, but that doesn’t mean that authors writing in the twenty-first century shouldn’t be at least casually familiar with Microsoft Word. What he misses here, and what seems fairly apparent, is that literature is constantly evolving and changing. This does not mean that when a new theory rolls around, all scholars must immediately abandon everything that came before in favor of it. Rather, humanists should continue to build on pre-existing skill sets. So, the answer to Kirsch’s tongue-in-cheek question is no, humanists did not have to set type or publish a book, but that does not mean they never should. I imagine the study of literature as a bag filled with every text ever written, every theory ever conceived, and every technology ever used. This bag holds everything from papyrus, to Woolf, to Derrida. Literary theorists must then reach into the bag and pull from it what they will. What they choose defines them as thinkers. I think it is safe to say most of them will not choose papyrus, because that is a technology that has ceased to be relevant––it was replaced by something better. That does not mean papyrus scrolls have ceased to exist; it just means that technology has advanced since then. Moving towards the digital is just another advance in a long line of technological advances.

The goal of my and Professor Enderle’s project is not to avoid the challenge of close reading, or to undermine the “thinking and writing” skills so valued by Kirsch. Rather, we seek to use the knowledge we have compiled after hours of studying British novels and British colonial history to draw conclusions from a wider set of texts than we could ever read ourselves. We are seeking unexpected connections that will then inform our close readings of eighteenth- and nineteenth-century texts, and we are using our knowledge to understand the patterns that emerge in the topic models we create. That happens to involve running MALLET from the command line, and using Python to create visualizations of our data, but it does not mean we are not practicing more traditionally humanistic skills. Finally, we are approaching our research in the same way as we would approach a text. Rather than formulating a clear hypothesis and trying to prove it, we are analyzing the data without preconceived notions of what we might find. This is, again, intensely humanistic. It is not that we are not reading; rather, we are applying our reading skills to a different type of text.

We no longer practice Petrarch’s “umanesimo,” and frankly, I do not think that is a bad thing. Even if humanists learn to code, and technology is used more frequently in the classroom, readers will continue to read and books will continue to exist. An e-book has all of the same information as a hard copy; it just looks a little different. If we begin to encourage quantitative analysis while emphasizing the importance of close reading, we may begin to make connections we have yet to imagine, and I feel confident these connections will prove fruitful.


