The following is excerpted from a talk I gave at Middlebury College on February 24th, 2016 as part of the Carol Rifelj Faculty Lecture Series. It’s a literature talk which means major spoiler alerts! The data & code for many of the visualizations below are available here. This is very much a work-in-progress talk, and I welcome feedback, questions, suggestions, and collaborations!
Early last year, the award-winning writer and scholar Robert Macfarlane published an article in The Guardian, intended as a preview to his book Landmarks, which was to be published later that week. In the article (and in the book), Macfarlane laments the loss of the language of landscapes, which he argues is being displaced by the language of technology. He cites the Oxford Junior Dictionary’s contentious decision to “cull” many “nature words” (Macfarlane n.p.) from its list, removing words like “adder, ash, beech, hazel, and willow” and replacing them with “attachment, blog, broadband, and chatroom.”
In response to this loss, Macfarlane works to collect and preserve hyper-local place words. He records words like “pirr”: “a light breath of wind, such as will make a cat’s paw on the water” and “smeuse”: an English dialect noun for “the gap in the base of a hedge made by the regular passage of a small animal”:
“Now I know the word ‘smeuse,’” Macfarlane writes, “I notice these signs of creaturely commute more often.”
These words are sometimes collected through casual conversations, but more frequently, he reports, through archival research, web searches, email conversations between scholars, and recorded oral histories. Technology is often characterized in opposition to the natural world (and sometimes with good reason). But it is clear from Macfarlane’s account of his own research that digital technology has not displaced the language of the natural world—it has recorded, transmitted, publicized, and preserved it.
As Macfarlane’s book illustrates, our research methods have been deeply altered by digital technologies, so it behooves us to increase the specificity of our understanding of technology. We are perhaps most familiar with “black box” technologies that separate user from code and mystify the processes through which new knowledge creation is facilitated. This mystification impedes knowledge creation and innovation: we learn to adapt our current methods or practices to new technologies rather than to create new technologies that better serve our practices and philosophies. And if there’s one thing you take away today, it’s this: the more humanists participate in the creation of their own research tools, the better those tools will be for our research.
As I’ve learned (sometimes quite painfully) through the process of this project, the more I understand about digital technologies the better my ability to create, adapt, and critique the limits of that technology. But perhaps more surprisingly, the more I learned about digital technologies, the better equipped I also was to push back on literary theory. What follows is both an account of my attempts to adapt concepts from ecology to literary texts via technology as well as reflections on what the process has revealed about the novels.
This project began with a question: How might we digitally represent the ecosystems of novels? At the time I was writing a dissertation about the visual and textual relationships between nationalism and specific engagements with the natural world (e.g. gardening, natural history, witchcraft, and herbalism) and at the same time tending a small garden I dug in some borrowed space in a friend’s lawn outside of Boston. There was something about the space of the garden that pulled my attention in a really different and corporeal way than the way I was engaging with representations of the natural world in texts. I was reading the poetry of the writer and gardener Vita Sackville-West–who is perhaps most famous in literary circles for her relationship with Virginia Woolf and in gardening circles for the creation of her “White” garden at Sissinghurst Castle in Kent and the gardening column she wrote for The Observer (now The Guardian) for many years. In her long poem The Garden, she writes of a gardener who:
Those of us who garden in public enough spaces likely recognize this characterization of the corporeal pull that keeps us in thrall to our gardens, even in the presence of visitors.
There was something about the very different corporeal and affective responses I was experiencing while gardening that had me thinking about the effectiveness of the different ways I was engaging with the natural world: from the close reading and deep analysis I was performing on texts about the natural world to produce my dissertation to my attempts to create the conditions to allow the plants in my garden to grow and thrive.
At the same time, I was beginning to engage with some of the digital methods emerging from the field of digital humanities, not for my dissertation but for fun. And I found myself pulled into programming in a way that was different from, but analogous to, the way I often get pulled into weeding.
Over time, my single question evolved into a series of questions:
I started to think about the possible consonances of close reading, programming, and gardening: all require some initial deep attention and planning. But in a garden, it’s often the letting go of control that produces the best results. So I started to wonder whether I could create the conditions through which a novel’s ecological thumbprint or ecosystem could emerge, or through which other readers might come into deeper contact with the novel and, ultimately, the natural world.
For several decades, close reading has been the primary method for producing the evidence with which literary arguments are made in scholarship and, especially, in our classrooms. More recently, Franco Moretti has controversially advocated for “distant reading” where the researcher “reads” thousands or even millions of texts algorithmically, looking for patterns and structures that span multiple texts. However, both closeness and distance can be dangerous positions to occupy in relation to our environment. To be too close is to lose sight of the larger systems in which we are imbricated. To be too distant is to remain outside, to see patterns without acknowledging one’s complicity within those systems.
By thinking beyond the binary of close and distant reading, I offer new modes of experiencing literatures of the environment that seek to connect readers and viewers not only to individual novels but also to thinking about and attending to the environment. My methods are derived from Jerome McGann and Lisa Samuels’s concept of “interpretive deformation” (115) in which “interpretation of works of imagination call for responsive works of imagination, not reflexive works of analysis” (109). McGann and Samuels explore the productive activities of deformation in a chapter of McGann’s book Radiant Textuality (2001). In the chapter, McGann and Samuels argue that “criticism (scholarship as well as interpretation) tends to imagine itself as an informative rather than deformative activity” (144).
They go on to argue that all interpretation is deformative, but that engaging explicitly in deformative procedures (such as reading a poem backwards, or reading only the verbs in a poem) foregrounds the subjectivity of the critic and her subsequent interpretation. Further, when deformation is performed through criticism or, in my case, through computer programming and visualizations, “the reader [is put] in a highly idiosyncratic relation to the work” (McGann and Samuels 116). When readers occupy the space of this idiosyncratic relation, the reverberations of difference between their own point of view and the critic’s invite reflection.
Today you’re going to hear about some as yet early steps towards producing and theorizing these other kinds of engagement in the context of two novels: Thomas Hardy’s Tess of the d’Urbervilles and Mary Webb’s Gone to Earth.
Born in Dorset in the southwest of England in 1840, Hardy was a prolific novelist and poet whose life spanned the Victorian and Modern eras. At the beginning of his writing career, he primarily wrote novels—publishing more than a dozen between 1870 and 1895. Critics have long noted the importance of the natural world to Hardy’s novels.
Tess of the D’Urbervilles: A Pure Woman Faithfully Presented, published in 1891, is regarded by many as his best novel and by John Humma as “England’s if not the world’s most famous nature novel” (63). The early troubles of the eponymous protagonist double when she meets Alec D’Urberville, who relentlessly pursues her, resulting in a sexual assault that leaves Tess pregnant. Returning to her family, Tess gives birth to a child she names Sorrow who dies within weeks. A few years later, Tess seeks employment at Talbothays dairy where she meets and falls in love with Angel Clare, a son of a minister, whom she eventually marries. On the night of their wedding each confesses to having had sexual relations previous to the marriage. While Tess is accepting of Angel’s past, his worldview is completely shattered by Tess’s admission, and he leaves her (with some money) to start a farm in Brazil. Tess is driven further and further into financial difficulties. She is eventually discovered by Alec who again relentlessly pursues her until she agrees to live with him. When Angel returns from Brazil to reclaim Tess, Tess murders Alec and flees with Angel. Within a few days she is apprehended and hanged.
Mary Webb was a great admirer of Hardy’s, corresponding with him and dedicating her fourth novel Seven for a Secret to him, with his permission, in 1922. When Webb sent Hardy a presentation copy of the novel she included a note: “I do appreciate with intensity the rich beauty and majesty of your own interpretation of nature & humanity” (qtd. in Crawford & Crawford 97). And Hardy and Webb are often connected in terms of their understanding and treatment of the natural world.
A 1917 review in the New Statesman claimed that “Gone to Earth is the most impressive novel since Thomas Hardy gave us Tess of the D’Urbervilles” (qtd. in Crawford & Crawford 150). And indeed Gone to Earth might be considered a kind of rewriting of Tess. In the novel, a young Hazel Woodus lives with her ignorant and indifferent father and a menagerie of rescued animals (including a one-eyed cat and a pet fox named, well, “Foxy”). Webb writes that:
Like Tess, Hazel is relentlessly pursued by two male suitors: the first, a man named Reddin, is a country squire who lives in his decaying ancestral home with his misogynist manservant (who is obsessed with clipping a collection of yews into the shapes of swans). Edward Marston, the second suitor, is a minister new to the area who eventually marries Hazel against his mother’s wishes and brings Hazel to live with him in a state of unconsummated marriage. Reddin continues to pursue Hazel after her marriage. Catching her outside of her cottage one day, he assaults her and impregnates her, convincing her to run away with him back to his ancestral home. When Edward finds out where she is, he violently confronts Reddin and brings Hazel back to live with him, promising to raise the child as his own. Before the child is born, however, Hazel dies trying to protect Foxy from a pack of hounds driven by Reddin on a foxhunt.
Webb characterizes Hazel’s suitors as “Reddin the destroyer” and “Edward the savior” (final page of novel), a description that also maps onto Tess, where Alec is the destroyer and Angel the savior. And yet both Angel and Edward are failed saviors: in each novel neither the protagonist nor her child survives. The similarities in the plots and themes of the two novels make them ideal for comparison.
Before we get to the visualizations, I want to talk briefly about my digital methods and tools. I’m not going to go into great detail, but I’ve provided a list of resources and hyperlinks in the credits slide below and linked to many of them within the text.
Both of the texts I selected are in the public domain and freely available on Project Gutenberg. However, because no dataset existed that would adequately address my research questions, I needed to create my own structured data from these texts. I did this through a process of tagging every instance of a named character and reference to the natural world in both novels. Some of this I was able to automate through a tool called CATMA (Computer-Aided Textual Markup and Analysis) but a lot of it was done by hand.
This markup process is probably the most important step in this entire research project—and one that takes the most time and labor. The actual tagging can feel very tedious, but the intellectual work that goes into designing a markup schema is integral to the success of addressing the research questions. How the structure of the information is designed determines how the information can be used. For example, we are able to search library catalogues by author or by Library of Congress subject headings because book records are structured in such a way that specific fields like “author” correspond to specific pieces of information, like “Robert Frost.” If you searched for “Robert Frost” in a date field, the catalogue would (hopefully) return zero results. We take advantage of good and bad structured data all the time, but humanists are rarely involved in creating that data. The process of creating a markup schema surfaces individual and cultural ways of thinking and valuing concepts—in my case—novels and the natural world.
Like in many archival preservation projects, the most important and difficult question is not what to keep or what to tag but what to dispose of, what not to tag. This became painfully evident when—after attempting to mark up and categorize all features of landscapes at the beginning of this project—my schema became needlessly complex and my progress was very, very slow. Knowing that this is a forever iterative process, I eventually compromised by focusing only on “living” beings in the novel—the flora, fauna, and fungi—and only as they were directly named. I realized very quickly that significant references to the natural world emerged—particularly in Webb’s novel—not through direct references but through invocation, through metaphor and simile. For example, Hazel is described as “tawny and foxlike” (Ch. 1) and “sexless as a leaf” (Ch. 1). In a work of fiction, does it matter if the word “leaf” describes an actual feature of the fictional landscape or if it’s an example of figurative language? I ended up marking both.
Finally, I struggled with how to mark humans—clearly we are a part of the kingdom Fauna, but I wanted to be able to isolate the main characters of each novel and contextualize them in relation to the nonhuman others with whom they shared textual space (more on this later). As such, when you see the tag “Fauna” today, recognize that it is fauna absent of humans (something that will be fixed in later iterations).
Because the goal was not to make exact or precise comparisons, but rather to trace major connections among living beings and patterns across texts, I had to increase my tolerance for flattening some of the ambiguities I had been trained to highlight. For the remainder of this talk, I’m going to share four different sets of visualizations I’ve created based on this data that represent key concepts and patterns within the novels.
Absence & Presence
Early in my process, I wanted to get a visual overview of the notable presences and absences of the novel. I began by graphing every instance of a tag in the two novels in Tableau Public. Each vertical line is a tag in the novel and the horizontal axis is the novel from start to finish.
As you can see in the above visualization, Tess appears frequently throughout the majority of the novel. While Angel appears early on (when happening upon young women dancing in Tess’s village), his presence is “thickest” in the middle of the novel and spotty in the final third when he is in Brazil and no longer within the space of Tess’s world. Fauna is particularly dense in the first half of the novel, before the major turn, which is when Angel rejects Tess after their marriage. The density of these descriptions supports various ecocritical readings of the novel which portray Tess’s time at Talbothays Dairy (where she meets Angel) as the most idyllic (Meadowsong) or pastoral (Martell) of the novel. However, the presence of flora does not as neatly correspond to this characterization. Indeed, the most dense moments of flora in the novel occur nearer the end of the novel when Tess is being worked to the bone at a farm in Flintcomb-Ash where she digs swedes (rutabagas) and threshes wheat. These direct comparisons open up questions about how we have understood and characterized the pastoral mode in relation to the presence of nonhuman others.
Similar to Tess, Hazel as the protagonist is the most frequent tag in Gone to Earth. Hazel’s father, Abel, disappears from the novel almost entirely once Hazel has married Edward. Like Angel, Edward isn’t fleshed out as a character until well into the novel. In both novels, the antagonists (Alec and Reddin) occupy more of the narrative frame than the (failed) “saviors,” whose later entrances aren’t enough to counteract the damage done by the antagonists.
These visualizations were useful as exploratory tools, which helped me dig in deeper to the data and the novel. For example, many scholars have argued that Tess’s time at Talbothays dairy is the most pastoral section of the novel, dense with description of the natural world and eroticized connections between Tess and the environment she occupies. This graph suggests that the most organically dense section of the novel precedes her time at Talbothays and her assault. This raises the question: which literary categories or concepts are objectively measurable? For instance, are there measurable qualities of the pastoral mode that could be identified by an algorithm? Or is the pastoral a case of “you know it when you see it”?
Second, and paradoxically, the more a human is named in the novel, the more absent he might be. For instance, when Angel is away in Brazil, Tess thinks about him constantly, but we, as readers, are rarely transported to Brazil to hear from Angel. In other words, “Angel” is present in the text of the novel, but he is not present in the space of the fictional ecosystem that Tess occupies (while he’s in Brazil). Angel’s textual presence is invoked by Tess’s memory and desire. The presence and absence of Angel and Alec in this graph are greatly impacted by Tess’s attention: when Alec intrudes, she thinks less of Angel.
While these two sets of graphs around the two novels are interesting as separate artifacts, they are difficult to compare. By isolating the “flora” tags in each novel, and rescaling the horizontal axis (that is—the length of the novel—since Tess is nearly twice as long as Gone to Earth) a visual comparison becomes possible. The result is a kind of botanical barcode—one that will be unique and visually distinct for each novel.
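The rescaling step can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind the barcodes: it assumes each tag is stored as a character offset from the start of the novel, and the function name `normalize_positions` and the sample offsets are hypothetical. Dividing each offset by the length of its novel places every mention on a shared scale from 0 (first page) to 1 (last page), so a long and a short novel can be drawn on the same horizontal axis.

```python
def normalize_positions(offsets, text_length):
    """Map character offsets onto a 0-to-1 scale (start to end of the novel)."""
    if text_length <= 0:
        raise ValueError("text_length must be positive")
    return [offset / text_length for offset in offsets]


# Hypothetical offsets for illustration only (not data from either novel).
# The "long" novel is twice the length of the "short" one.
long_novel = normalize_positions([0, 400_000, 800_000], text_length=800_000)
short_novel = normalize_positions([0, 200_000, 400_000], text_length=400_000)

# After rescaling, a mention halfway through either novel sits at 0.5.
print(long_novel)
print(short_novel)
```

Each rescaled position can then be drawn as one vertical line, which is what gives the barcode its look.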
These kinds of distillations of novels, or, to return to McGann and Samuels’s term, “deformations” of novels, often experience brief bursts of popularity online. You may have seen Adam Calhoun’s visualizations of the punctuation of novels popping up in your social media and other feeds last week. Or Jaz Parkinson’s color signatures of novels a few years ago.
My botanical barcodes are particularly useful in understanding how mentions of the natural world (or in this case, the flora) fit within or help frame the narrative structure of the novel. For example, in Tess’s botanical barcode, mentions of flora appear most frequently at the beginning and end of the novel—like an establishing shot in a film.
From this visual, we might be able to conclude that the botanical world acts as a narrative frame which contains and somewhat overlaps with human drama. As human dramas increase, mentions of flora generally decrease. At the end of the novel, mentions of flora again increase, when Tess and Angel are reunited, fleeing together through the countryside.
In contrast, in Gone to Earth, flora appear most densely in the middle of the novel, and least frequently of all in the third section of the novel.
The dramatic contrast between the density and paucity of flora occurs after Reddin takes Hazel to live with him at Undern. Webb writes:
Here the character who has had the most reciprocal relationship and empathy with the natural world is rejected by the “freemasonry of the green world.” The relationship becomes unidirectional. This shift results in fewer mentions of the natural world: as if only one half of a conversation is being reported. Without knowing the context of the novel, the increased absence of flora in the third section of the novel might be attributed to a displacement of a passive natural world by “human life.” However, such a reading denies the agency of, say, the primrose who withholds “comprehension” and whose active silence is a protest against Hazel’s abandonment. Active silence registers in the novel as textual absence and in the visuals as white space.
Like the punctuation & color visualizations, these botanical barcodes may be useful for noticing patterns across texts (for distant reading) but not for close reading. The punctuation charts were created by a neuroscientist (and data visualizer) & the color signatures by a graphic designer. Neither set was intended as a tool for literary scholarship. I began playing with this possibility: could I produce something that was both publicly accessible and scholastically useful? With a few modifications of the original code, I began to address this:
Access the code behind this visualization here.
Here you can see that running a cursor over the lines in the barcode causes a window to appear above the chart that provides the context of the mention of flora (with the flora highlighted in yellow) and the frequency with which this word appears in the novel.
This is far from perfect, but it is, I hope, a step towards creating tools to facilitate literary research—that let scholars and students move between close and distant readings. In future iterations of this visualization, I could, for instance, also compare the frequency with which an individual word appears in the entire corpus, which would help to identify and measure uniqueness and even, perhaps, add to Macfarlane’s lists of “nature words.”
Drawing on what I had learned from earlier visuals, I wanted to see where the natural world was at its densest. Around what or whom did instances of the natural world coalesce? In other words, I wanted to better understand the ecological context of the major human characters and be able to compare those contexts as a whole rather than to compare a single passage to another single passage (which is one way in which a close reading might happen).
In order to make these comparisons, I wrote a script that identified all of the “fauna” words that appeared within 120 characters—or the average length of a sentence in the novels—of, say, words tagged “Tess.” And I did this for each of the major characters (and for a few more general categories). I then graphed these moments of interspecies cohabitation as individual clusters.
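For readers curious about what such a script might look like, here is a minimal sketch in Python. It is not the script used for this project. It assumes the tags have been exported as `(offset, label, word)` tuples, where `offset` is the character position of the tagged word. The function name `collocated_words`, the tuple layout, and the sample data are all hypothetical. Distance is measured between the starting offsets of the two tags, which is one of several reasonable ways to read “within 120 characters.”

```python
from collections import Counter


def collocated_words(tags, anchor_label, neighbor_label, window=120):
    """Count neighbor-labelled words tagged within `window` characters
    of any tag carrying `anchor_label`.

    `tags` is an iterable of (offset, label, word) tuples. Each neighbor
    tag is counted at most once, even if several anchor tags are nearby.
    """
    tags = list(tags)
    anchors = [offset for offset, label, _ in tags if label == anchor_label]
    counts = Counter()
    for offset, label, word in tags:
        if label != neighbor_label:
            continue
        if any(abs(offset - anchor) <= window for anchor in anchors):
            counts[word.lower()] += 1
    return counts


# Hypothetical tags for illustration only (not data from the novels).
sample_tags = [
    (100, "Tess", "Tess"),
    (150, "fauna", "cow"),     # 50 characters from "Tess": counted
    (500, "fauna", "horse"),   # 400 characters from any character: skipped
    (900, "Alec", "Alec"),
    (1000, "fauna", "horse"),  # 100 characters from "Alec": counted
]

tess_cluster = collocated_words(sample_tags, "Tess", "fauna")
alec_cluster = collocated_words(sample_tags, "Alec", "fauna")
print(tess_cluster)
print(alec_cluster)
```

The resulting counts are the raw material for a cluster: one node per fauna word, sized or weighted by how often it shares textual space with the character. Changing `window` changes what counts as cohabitation, so the 120-character threshold is itself an interpretive decision.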
Setting these individual clusters side by side in a kind of triptych emphasizes the variances between characters’ interactions with the natural world, or in this case, their interactions with fellow fauna. In looking at the triptych below, for Tess of the d’Urbervilles, we can see that Tess’s cluster is the most dense, Alec’s is the least dense, and Angel is somewhere in the middle.
Comparing the above triptych to the triptych of Reddin, Hazel, and Edward’s clusters below reveals that there are far fewer differences between the three main characters in Gone to Earth than in Tess. While Tess and Hazel both have the most connections to fauna, there is a striking difference between Alec’s and Reddin’s textual cohabitations with fauna.
You can explore the individual clusters at your leisure here, but in the interest of space, I’ll provide an example of how one particular cluster illuminates or was useful to me in better understanding Edward’s relationship to the natural world. While Angel’s views of both Tess and the natural world are colored by his Wordsworthian idealization (Lowe 57), Edward’s most frequent interactions with the natural world are a direct result of his relationship with Hazel:
As the short video above shows, Edward’s most common fauna word is “bees.” Hazel’s father Abel keeps bees, so Edward is constantly among them as he courts her, but the narrator tells us that “Edward loathed bees in or out of boxes” (loc 171950 near fauna). Despite his “loathing” Edward courteously engages with both Abel and the bees—unlike Alec and his horses. Among the most common fauna words associated with Edward are “rabbit,” “fox,” and “cat”—the animals that Hazel brings with her after their marriage: “She and Foxy and the one-eyed cat, her rabbit, and the blackbird, were going to a country far from troublous things, to the peace of Edward’s love on the slope of God’s Little Mountain.” In this cluster, then, Edward’s varied relationships with different species are represented, but the force that created the conditions through which these relationships could occur—Hazel—is absent.
Similar to the botany barcodes, these clusters by themselves can’t tell us much about the kind of relationships each human has with various fauna. And this was really striking to me in one particular instance where both Reddin & Hazel had connections to the word “urchin” (a colloquial word for a hedgehog). The passage that both of these connections refer to is a section in which Reddin brings Hazel a hedgehog:
“‘Got something for you,’ he said, pulling at his pocket. ‘Oh! It’s an urchin!’ cried Hazel delightedly. Reddin began bruising and pulling at its spines with his gloved hands. ‘Dunna!’ cried Hazel. Reddin pulled and wrenched until at last the hedgehog screamed a thin, piercing wail.”
I began to address this issue by making contextual information from the novel available when the viewer clicks on a particular node. Even so, one of the major issues with these “clusters,” as I call them, is that they put humans at the center of the visualizations. If my goal is to contextualize humans within the environments they inhabit, these visualizations do that but at the expense of visually overemphasizing the importance of human beings. The diagrams reinforce the very anthropocentrism I’m seeking to resist. To address this concern, I’ve begun playing with other kinds of visualizations including the network graphs below.
Whereas the clusters are a kind of moving snapshot of a given human’s habitat centered around that human, the graphs above represent the entire biosphere of the novel—that is, all the biotic beings that have at least one textual connection (measured by collocation) to another living being. The resulting visualizations are basically “spaghetti.” Illegible. Slow to render in a browser. Impossible to analyze in their current forms.
Each imperfect visualization fails in ways that provoke more and more interesting questions. And this is the space that, for the moment, is most exciting—to create, to make, to imagine, to question.