Digital Humanities, Part 1

One of the questions I get asked comparatively often is, as you might imagine, “So what is it you do?”

There are two ways to answer this question.

1) I say “I study Victorian Literature in the context of Psychology and Digital Humanities”. This is a completely useless answer, because if you know anything about ANY of those three terms, then you also know that what I’ve said is almost broad enough to be completely meaningless and, if you don’t know anything about one or all of those terms, I could just as easily have informed you that I study bogdlefrindling in the context of the elusive umlaut (it’s an interdisciplinary minor).

2) I read them my thesis.

There should be something in between these two possible answers. What I usually do is say the former and, based on a complex analysis of when exactly my interlocutor’s face looks most confused, clarify. As a rule, the moment of confusion occurs when I say the magic words “Digital Humanities”. And I tend to fumble around and explain it as best I can and then, if all goes well, end up in an exciting discussion about the future of literary studies and, if all fails, end up entirely misunderstood.

Now, it isn’t entirely my fault that I lack the words to explain a movement within criticism (and anyone who has ever tried to explain semiotics at a cocktail party (or anywhere else!) will know what I mean). It’s simply too difficult to distill into a sentence a broad field in which many different scholars work with different philosophies and take different approaches. Basically, Digital Humanities is a microcosm of academia in which everyone working in it has at least one thing in common: We are all fascinated by what new technology can do to how we study our fields.

This post, as you may have noticed from the title, is going to be in two parts. The first part is in answer to the question “What is it you do?” as illustrated by an example from recent news and some (inevitable) clarification and commentary on my part. The second post will be an attempt to explain some of what it is that those of us who practice Digital Humanities can do and why, as a field, it deserves all the attention it is already getting and more.

But first things first – what is it about DH that interests me? We’ll start with an article by Stanley Fish, a fairly well-known detractor of the Digital Humanities (err, well-known within a given context) who, nonetheless, decided to take it upon himself to write an Op-Ed in the NYTimes about what Digital Humanities does and why he does not like it.

I’ll save you the trouble of reading most of it (though you can find it at the following link if you so desire: The Digital Humanities and Interpretation) because it boils down to “But I don’t want to practice literary criticism your way and you can’t make me.” Which is certainly true and almost entirely irrelevant. (And, I should note, we digital humanists would spend much less time and fewer resources explaining how our ideas can be beneficial/will revolutionize academia if you, Dr. Fish, would stop forcing us to.)

But that’s not why I’m interested in this piece. Fish began his argument with what he describes as the brand of literary criticism he practices.

“Halfway through “Areopagitica” (1644), his celebration of freedom of publication, John Milton observes that the Presbyterian ministers who once complained of being censored by Episcopalian bishops have now become censors themselves. Indeed, he declares, when it comes to exercising a “tyranny over learning,” there is no difference between the two: “Bishops and Presbyters are the same to us both name and thing.” That is, not only are they acting similarly; their names are suspiciously alike.

“In both names the prominent consonants are “b” and “p” and they form a chiasmic pattern: the initial consonant in “bishops” is “b”; “p” is the prominent consonant in the second syllable; the initial consonant in “presbyters” is “p” and “b” is strongly voiced at the beginning of the second syllable. The pattern of the consonants is the formal vehicle of the substantive argument, the argument that what is asserted to be different is really, if you look closely, the same. That argument is reinforced by the phonological fact that “b” and “p” are almost identical. Both are “bilabial plosives” (a class of only two members), sounds produced when the flow of air from the vocal tract is stopped by closing the lips.

“There is more. (I know that’s not what you want to hear.) In the sentences that follow the declaration of equivalence, “b’s” and “p’s” proliferate in a veritable orgy of alliteration and consonance. Here is a partial list of the words that pile up in a brief space: prelaty, pastor, parish, Archbishop, books, pluralists, bachelor, parishioner, private, protestations, chop, Episcopacy, palace, metropolitan, penance, pusillanimous, breast, politic, presses, open, birthright, privilege, Parliament, abrogated, bud, liberty, printing, Prelatical, people.”

So far so good. Later, he points out that this is a style of criticism that is dependent first on an interpretive assumption and then on noticing a correlation between the text and the interpretation. That is, first you have an idea about the text, then you use the text to prove it. He says Digital Humanities works backwards, that we first find something of interest in a text and THEN we flail around for whatever interpretation might fit.

While that last assertion is untrue, there is a bigger problem with Fish’s analysis and it is one that a bit more experience with some of the tools we use in the Digital Humanities might have helped him overcome.

You see, Fish is wrong about the proliferation of ps and bs. In the blog Language Log, Mark Liberman checks Fish’s assertions about ps and bs and finds that, while there seems to be a local peak in their usage around this passage, there are other places within the same text that use roughly the same number of plosives:

His words, not mine

This image comes straight from Liberman’s data and, as you can see, it’s a relatively high usage, but not a globally high usage. Or, to put it another way, it’s probably just a coincidence. If you want to read the rest of Liberman’s article, it can be found on Language Log.

So how does this relate to me? See, what Fish has done is exactly what I want to AVOID doing in my study of literature. If you want to look for patterns across a book, or a broad range of books, the first thing you want to make sure of is that you have all the data. If you’re going to assert, for example, that a certain author uses longer sentences than his contemporaries, you had better have the data to back it up. If you want, as I do, to make claims about the kinds of books the Victorian public was enjoying versus the kinds of books we consider as “the best” from that era, you need a way to compare a large number of books and verify your assumptions about them.
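For the curious, here is a minimal sketch of the kind of check described above: cut a text into equal-sized chunks, measure how much of each chunk is made up of ps and bs, and see whether the passage you care about really stands out from the rest. This is not Liberman’s actual method or code. The function names, the 50-word window and the choice to count letters rather than sounds are all my own assumptions for illustration.

```python
import re


def pb_share_by_window(text, window_size=50):
    """Split `text` into consecutive windows of `window_size` words and
    return, for each full window, the share of letters that are 'p' or 'b'.

    Counting letters is only a rough stand-in for counting sounds
    (an illustrative assumption, not a phonological analysis).
    """
    words = re.findall(r"[a-z]+", text.lower())
    shares = []
    # Step through the text one full window at a time; a trailing
    # partial window is ignored.
    for start in range(0, len(words) - window_size + 1, window_size):
        letters = "".join(words[start:start + window_size])
        pb_count = letters.count("p") + letters.count("b")
        shares.append(pb_count / len(letters))
    return shares


def rank_of_window(shares, index):
    """Return how many windows have a share at least as high as the
    window at `index` (1 means it is the single highest)."""
    return sum(1 for share in shares if share >= shares[index])
```

If the window containing “Bishops and Presbyters” comes back with a rank of 1, far ahead of everything else, Fish has a point; if several other windows score about as high, the peak is local rather than global.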

Now, unlike what Fish asserts, I am definitely NOT simply throwing books at the computer and seeing what data comes out. I have a hunch that the popular books will be different than the ones that have survived to be included in the canon and that difference will lie not only in the subject of the stories, but also in the style in which they are told. Furthermore, I believe that by looking at the books using digital methodology, specifically by employing computer programs to look at word frequencies, sentence length, use of punctuation and more, I can come to understand something about the human response to literature.
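To make “word frequencies, sentence length, use of punctuation” a little more concrete, here is a rough sketch of what such a program might compute for a single text. It is an illustration only, not the tooling I actually use: the function name `style_profile`, the naive sentence splitting on full stops, question marks and exclamation marks, and the short list of punctuation marks are all assumptions made for the example.

```python
import re
from collections import Counter


def style_profile(text, top_n=5):
    """Return a few simple style measurements for `text`.

    Sentence splitting here is deliberately naive (it breaks on . ! ?),
    so abbreviations such as "Mr." would be miscounted.
    """
    # Words: runs of letters, optionally with an internal straight
    # apostrophe (as in "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    punctuation = re.findall(r"[,;:.!?]", text)
    return {
        "top_words": Counter(words).most_common(top_n),
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "punctuation_per_100_words": 100 * len(punctuation) / len(words) if words else 0.0,
    }
```

Run over two piles of novels, one popular in its day and one canonical now, profiles like these give you numbers to compare, which is the point: the hunch comes first and the measurements are there to test it.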

To return to Fish: after having looked at the data and found that this portion of the text is not that noticeably different from the rest, a question remains. Why would Fish, a respected academic, have thought it was? (The other question, “why didn’t he check?”, is one I don’t feel up to examining.) If there isn’t “a veritable orgy” of ps and bs in this portion of the text, what is it that Fish was picking up on?

And here is where a knowledge of psychology becomes important. You see, one of the things that the human brain is very good at is picking out patterns. You show us clouds, we find faces; you show us a jumble of numbers, we look for order; you show us a collection of words and we look for something that ties them together. In fact, we are so good at finding patterns that we find them even when they aren’t there. Our brains work in such a way that they find false positives (evidence of a pattern when there isn’t one), because it is riskier to miss a pattern that exists than to find one that doesn’t.*

What happened to Fish now makes sense. Because he is a person, he has a tendency to look at everything in the hope it will turn into a pattern. He doesn’t necessarily do this consciously, but his brain is always pattern-making. On encountering Milton’s prose, the slightly higher, but still within normal range, frequency of ps and bs stood out to his mind and made him assume there was a distinct pattern even when the data says there probably wasn’t.

Now here’s the thing. Technically, Fish’s conclusion might still be valid; Milton himself was human and might also have sensed a patterned proliferation of plosives (see what I did there) and left them in the prose because it sounded better. The data suggests that it was likely unintentional (and the fact that Fish was the first one in nearly 400 years to notice this would corroborate that argument) and I will side with the data.

Still, the interesting thing about this little exercise is to show how, with an understanding of digital techniques and human psychology, we can understand what is going on when we’re reading. We have access not only to our own responses, but also to the tools that explain how those responses happen. As I said, Literature, Digital Humanities and Psychology. It’s amazing what you can do with them.

Tune in next time for a whirlwind tour of what else is possible in the wild and wonderful world of the Digital Humanities.


*The argument behind this explanation is that, when we look at the environment in which human beings evolved, CONSTANT VIGILANCE was the order of the day. You were more likely to survive if you noticed that the animals all went to the same watering hole at the same point during the day. You were more likely to survive if you noticed that every time you struck your flint in a specific way, it would spark. Pattern-spotting was adaptive behavior and it was more adaptive to be hyperaware of patterns than to miss one.




4 responses to “Digital Humanities, Part 1”

  1. Ema

    This might be an obvious question, but where does one get the information on what literature was popular during the Victorian times, versus what remains popular today? I was assuming that the popular literature survived and that is why we continue to read it today. Is that an erroneous assumption? Are there authors the Victorians were reading that we don’t pick up today?

    • Yes, there were Twilight equivalents back then too. Some were called penny dreadfuls.
      But even if we look at enjoyable and well-written books, like “North and South” or “The Moonstone” that went over quite well in their day (the latter more than the former), those books are very rarely taught in undergraduate literature classes, even once you get past Intro to Brit Lit. I wasn’t asked to read Gaskell until grad school and still haven’t read Collins except for my own enjoyment.
      So why “Wuthering Heights” instead of “North and South” (the latter was more popular)?
      The best way to gauge popularity is to look at book sales, letters and reviews, all of which will tell you something about how the books were received. Also, there were lending libraries then and some of those records survived. It’s a historical data mining project, but one that is already being done.

  2. You might want to dip into some of the essays from Fish’s 1980 collection, Is There a Text in This Class? In his seminal essay on “Literature in the Reader: Affective Stylistics” (originally published in 1970) for example, he made the general point that the pattern of expectations, some satisfied and some not, which is set up in the process of reading literary texts is essential to the meaning of those texts. Hence any adequate analytic method must describe that essentially temporal pattern. In the course of developing his argument Fish asserted that “What is required, then, is a method, a machine if you will, which in its operation makes observable, or at least accessible, what goes on below the level of self-conscious response.”

    Someone reading that phrase “a method, a machine if you will” in a digital humanities environment, or a cognitive science environment, might think, ah, yes, computer simulation. Let’s simulate the process of reading and then examine what the computer does as it moves through a text. But that’s not where Fish went and there’s no reason to think he even considered such a thing. And yet computers and computation were certainly in the air even then; in other essays in that collection he looks at some work in computational stylistics.

    Why did Fish even think about such a possibility? Given the general tenor of the times, with generative grammar all new and exciting and cognitive science happening all around (the term was coined in ’73), it’s hard not to think that Fish absorbed that language from computing, either directly or indirectly.

    In 1976 I published a “hand simulation” of the semantics of Shakespeare’s Sonnet 129 in the Centennial issue of MLN (see abstract below). That same year David Hays and I published a review of the computational linguistics literature in which we, in effect, proposed an algorithmic criticism, which we called Prospero. What we proposed wasn’t possible at that time, which we acknowledged, and still isn’t. But . . . Maybe something’s to be gained by reopening that conversation.

    William Benzon. Cognitive Networks and Literary Semantics. MLN 91: 952-982, 1976.

    Abstract: A cognitive network is a type of semantic model developed for simulating natural language on digital computers. A concept is a node in the network while connections between nodes represent relations between concepts. One generates a text by tracing a path through the network and rendering the successive concepts and relations into language according to the appropriate conventions. Elementary concepts are grounded in sensor-motor schemas while abstract concepts are grounded in patterns of network relationship. The semantic structure for Shakespeare’s “Th’ expense of spirit” (Sonnet 129) is given by an abstract pattern for the Fortunate Fall, which is linked to a pattern specifying a fragment of the conceptual basis for faculty psychology.

    William Benzon and David G. Hays. Computational Linguistics and the Humanist. Computers and the Humanities 10: 265 – 274, 1976.

    I co-authored this piece with David Hays. It was published in 1976 in what was then Computers and the Humanities and is a review of the computational linguistics literature. At the end we imagined Project Prospero, a computer simulation of the human mind with which we could simulate reading a literary text. It wasn’t possible to do such a thing then, and it still isn’t, but as a way of thinking about literature it’s a thought-experiment worth resurrecting.

  3. Yael Shayne

    OK. But, I guess I go back to your question there “why ‘Wuthering Heights’ instead of ‘North and South'” if “North and South” was more popular in its time. Are we ultimately defining classics? What defines staying power? Why don’t we (as a reading public and not academics) read Trollope and Thackeray (we know why we don’t read Richardson, but besides him….), all social commentators, just like Dickens and Austen? I am starting to wonder if staying power has to do with good BBC adaptations.
