I’m collecting programming & methodological textbooks for humanists as part of a reflective study on DH, but figured it’d also be useful for those interested in teaching themselves to code, or teachers who need a textbook for their class. Though I haven’t read them all yet, I’ve organized them into very imperfect categories and provided (hopefully) some useful comments. Short coding exercises, books that assume some pre-existing knowledge of coding, and theoretical introductions are not listed here.
An open access introduction to programming in Python. Mostly web scraping and basic text analysis. Probably best to look to newer resources, due to the date. Although it’s aimed at historians, the methods are broadly useful to all text-based DH.
The Programming Historian, 2nd edition (ongoing). Afanador-Llach, Maria José, Antonio Rojas Castro, Adam Crymble, Víctor Gayol, Fred Gibbs, Caleb McDaniel, Ian Milligan, Amanda Visconti, and Jeri Wieringa, eds.
Constantly updating lessons, ostensibly aimed at historians, but useful to all of DH. Includes introductions to web development, text analysis, GIS, network analysis, etc. in multiple programming languages. Not a monograph, and no real order.
A series of lessons in R, still under development with quite a few chapters missing. Probably the only programming book aimed at historians that actually focuses on historical questions and approaches.
About natural language processing, but not an introduction to coding. Instead, an introduction to the methodological approaches of natural language processing specific to historical texts (OCR, spelling normalization, choosing a corpus, part of speech tagging, etc.). Teaches a variety of tools and techniques.
Step-by-step introduction to learning R, specifically focused on literary text analysis, both for close and distant reading, with primers on the statistical approaches being used. Includes approaches to, e.g., word frequency distribution, lexical variety, classification, and topic modeling.
A growing, interactive textbook similar in scope to Jockers’ book (close & distant reading in literary analysis), but in Python rather than R. Heavily focused on the code itself, and includes such methods as topic modeling and sentiment analysis.
Many of the above books are focused on literary or historical analysis only in name, but are really useful for everyone in DH. The below are similar in scope, but don’t aim themselves at one particular group.
A Mathematica notebook (thus, not accessible unless you have an appropriate reader) teaching text, image, and geo-based analysis. Mathematica itself is an expensive piece of software without an institutional license, so this resource may be inaccessible to many learners. [NOTE: Arno Bosse wrote positive feedback on this textbook in a comment below.]
An introduction to the fundamentals of programming specifically for the arts and humanities, using Python and Processing, that goes through statistics, text, sound, animation, images, and so forth. Much more expansive than many other options listed here, but not as focused on the needs of text analysis (which is probably a good thing).
A brief textbook with exercises and explanatory notes specific to text analysis for the study of literature and history. Not an introduction to programming, but covers some of the mathematical and methodological concepts used in these sorts of studies.
Interactive (Jupyter) notebooks teaching Python for statistical text analysis. Quite thorough, teaching methodological reasoning and examples, including quizzes and other lesson helpers, going from basic tokenization up through unsupervised learning, object-oriented programming, etc.
Not an introduction to coding of any sort, but a solid intro to statistics geared at the sort of stats needed by humanists (archaeologists, literary theorists, philosophers, historians, etc.). Reading this should give you a solid foundation of statistical methods (sampling, confidence intervals, bias, etc.).
A practical intro to machine learning in Weka, Java-based software for data mining and modeling. Not aimed at humanists, but legible to the dedicated amateur. It really gets into the weeds of how machine learning works.
Introduction to text mining aimed at data scientists in the statistical programming language R. Some knowledge of R is expected; the authors suggest using R for Data Science (2016) by Grolemund & Wickham to get up to speed. This is for those interested in current data science coding best-practices, though it does not get as in-depth as some other texts focused on literary text analysis. Good as a solid base to learn from.
Full-length introduction to Drupal, a web platform that allows you to build “environments for gathering, annotating, arranging, and presenting their research and supporting materials” on the web. Useful for those interested in getting started with the creation of web-based projects but who don’t want to dive head-first into from-scratch web development.
French introduction to LaTeX for humanists. LaTeX is the primary means scientists use to prepare documents (instead of MS Word or similar software), which allows for more sustainable, robust, and easily typeset scholarly publications. If humanists wish to publish in natural (or some social) science journals, this is an important skill.
The below is the transcript from my October 29 keynote presented to the Creativity and The City 1600-2000 conference in Amsterdam, titled “Punched-Card Humanities”. I survey historical approaches to quantitative history, how they relate to the nomothetic/idiographic divide, and discuss some lessons we can learn from past successes and failures. For ≈200 relevant references, see this Zotero folder.
I’m here to talk about Digital History, and what we can learn from its quantitative antecedents. If yesterday’s keynote was framing our mutual interest in the creative city, I hope mine will help frame our discussions around the bottom half of the poster; the eHumanities perspective.
Specifically, I’ve been delighted to see at this conference, we have a rich interplay between familiar historiographic and cultural approaches, and digital or eHumanities methods, all being brought to bear on the creative city. I want to take a moment to talk about where these two approaches meet.
Yesterday’s wonderful keynote brought up the complicated goal of using new digital methods to explore the creative city, without reducing the city to reductive indices. Are we living up to that goal? I hope a historical take on this question might help us move in this direction, that by learning from those historiographic moments when formal methods failed, we can do better this time.
Digital History is different, we’re told. “New”. Many of us know historians who used computers in the 1960s, for things like demography or cliometrics, but what we do today is a different beast.
Commenting on these early punched-card historians, in 1999, Ed Ayers wrote, quote, “the first computer revolution largely failed.” The failure, Ayers claimed, was in part due to their statistical machinery not being up to the task of representing the nuances of human experience.
We see this rhetoric of newness or novelty crop up all the time. It cropped up a lot in pioneering digital history essays by Roy Rosenzweig and Dan Cohen in the 90s and 2000s, and we even see a touch of it, though tempered, in this conference’s theme.
In yesterday’s final discussion on uncertainty, Dorit Raines reminded us that the difference between quantitative history in the 70s and today’s Digital History is that today’s approaches broaden our sources, whereas early approaches narrowed them.
To say “we’re at a unique historical moment” is something common to pretty much everyone, everywhere, forever. And it’s always a little bit true, right?
It’s true that every historical moment is unique. Unprecedented. Digital History, with its unique combination of public humanities, media-rich interests, sophisticated machinery, and quantitative approaches, is pretty novel.
But as the saying goes, history never repeats itself, but it rhymes. Each thread making up Digital History has a long past, and a lot of the arguments for or against it have been made many times before. Novelty is a convenient illusion that helps us get funding.
Not coincidentally, it’s this tension I’ll highlight today: between revolution and evolution, between breaks and continuities, and between the historians who care more about what makes a moment unique, and those who care more about what connects humanity together.
To be clear, I’m operating on two levels here: the narrative and the metanarrative. The narrative is that the history of digital history is one of continuities and fractures; the metanarrative is that this very tension between uniqueness and self-similarity is what swings the pendulum between quantitative and qualitative historians.
Now, my claim that debates over continuity and discontinuity are a primary driver of the quantitative/qualitative divide comes a bit out of left field — I know — so let me back up a few hundred years and explain.
Francis Bacon wrote that knowledge would be better understood if it were collected into orderly tables. His plea extended, of course, to historical knowledge, and inspired renewed interest in a genre already over a thousand years old: tabular chronology.
These chronologies were world histories, aligning the pasts of several regions which each reckoned the passage of time differently.
Isaac Newton inherited this tradition, and dabbled throughout his life in establishing a more accurate universal chronology, aligning Biblical history with Greek legends and Egyptian pharaohs.
Newton brought to history the same mind he brought to everything else: one of stars and calculations. Like his peers, Newton relied on historical accounts of astronomical observations to align simultaneous events across thousands of miles. Kepler and Scaliger, among others, also partook in this “scientific history”.
Where Newton departed from his contemporaries, however, was in his use of statistics for sorting out history. In the late 1500s, the average or arithmetic mean was popularized by astronomers as a way of smoothing out noisy measurements. Newton co-opted this method to help him estimate the length of royal reigns, and thus the ages of various dynasties and kingdoms.
On average, Newton figured, a king’s reign lasted 18-20 years. If the history books record 5 kings, that means the dynasty lasted between 90 and 100 years.
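Newton’s arithmetic here is simple enough to sketch in a few lines of Python. This is only an illustration of the calculation as described above: the function name is made up, and the 18 and 20 year defaults are the figures just quoted, not values taken from Newton’s own tables.

```python
def dynasty_length_range(num_kings, min_reign=18, max_reign=20):
    """Return (low, high) estimates, in years, for a dynasty's total length.

    Assumes every reign falls within the stated average range, which is
    exactly the simplification Newton's critics objected to.
    """
    if num_kings < 0:
        raise ValueError("num_kings must be non-negative")
    return num_kings * min_reign, num_kings * max_reign


low, high = dynasty_length_range(5)
print(f"5 kings: between {low} and {high} years")
```

The whole estimate rests on one assumption, that kings everywhere reign for about the same length of time, and that assumption is what the critics went after.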
Newton was among the first to apply averages to fill in chronologies, though not the first to apply them to human activities. By the late 1600s, demographic statistics of contemporary life — of births, burials and the like — were becoming common. They were ways of revealing divinely ordered regularities.
Incidentally, this is an early example of our illustrious tradition of uncritically appropriating methods from the natural sciences. See? We’ve all done it, even Newton!
Joking aside, this is an important point: statistical averages represented divine regularities. Human statistics began as a means to uncover universal truths, and they continue to be employed in that manner. More on that later, though.
Newton’s method didn’t quite pass muster, and skepticism grew rapidly on the whole prospect of mathematical history.
Criticizing Newton in 1782, for example, Samuel Musgrave argued, in part, that there are no discernible universal laws of history operating in parallel to the universal laws of nature. Nature can be mathematized; people cannot.
Not everyone agreed. Francesco Algarotti passionately argued that Newton’s calculation of average reigns, the application of math to history, was one of his greatest achievements. Even Voltaire tried Newton’s method, aligning a Chinese chronology with Western dates using average length of reigns.
Which brings us to the earlier continuity/discontinuity point: quantitative history stirs debate in part because it draws together two activities Immanuel Kant sets in opposition: the tendency to generalize, and the tendency to specify.
The tendency to generalize, later dubbed Nomothetic, often describes the sciences: extrapolating general laws from individual observations. Examples include the laws of gravity, the theory of evolution by natural selection, and so forth.
The tendency to specify, later dubbed Idiographic, describes, mostly, the humanities: understanding specific, contingent events in their own context and with awareness of subjective experiences. This could manifest as a microhistory of one parish in the French Revolution, a critical reading of Frankenstein focused on gender dynamics, and so forth.
These two approaches aren’t mutually exclusive, and they frequently come in contact around scholarship of the past. Paleontologists, for example, apply general laws of biology and geology to tell the specific story of prehistoric life on Earth. Astronomers, similarly, combine natural laws and specific observations to trace the origins of our universe.
Historians have, with cyclically recurring intensity, engaged in similar efforts. One recent nomothetic example is that of cliodynamics: the practitioners use data and simulations to discern generalities such as why nations fail or what causes war. Recent idiographic historians associate more with the cultural and theoretical turns in historiography, often focusing on microhistories or the subjective experiences of historical actors.
Both tend to meet around quantitative history, but the conversation began well before the urge to quantify. They often fruitfully align and improve one another when working in concert; for example when the historian cites a common historical pattern in order to highlight and contextualize an event which deviates from it.
But more often, nomothetic and idiographic historians find themselves at odds. Newton extrapolated “laws” for the length of kings, and was criticized for thinking mathematics had any place in the domain of the uniquely human. Newton’s contemporaries used human statistics to argue for divine regularities, and this was eventually criticized as encroaching on human agency, free will, and the uniqueness of subjective experience.
I’ll highlight some moments in this debate, focusing on English-speaking historians, and will conclude with what we today might learn from foibles of the quantitative historians who came before.
Let me reiterate, though, that quantitative history is not the same as nomothetic history; the two invite each other, so it would be ahistorical of me to divide them.
Take Henry Buckle, who in 1857 tried to bridge the two-culture divide posed by C.P. Snow a century later. He wanted to use statistics to find general laws of human progress, and apply those generalizations to the histories of specific nations.
Buckle was well-aware of historiography’s place between nomothetic and idiographic cultures, writing: “it is the business of the historian to mediate between these two parties, and reconcile their hostile pretensions by showing the point at which their respective studies ought to coalesce.”
In direct response, James Froude wrote that there can be no science of history. The whole idea of Science and History being related was nonsensical, like talking about the colour of sound. They simply do not connect.
This was a small exchange in a much larger Victorian debate pitting narrative history against a growing interest in scientific history. The latter rose on the coattails of growing popular interest in science, much like our debates today align with broader discussions around data science, computation, and the visible economic successes of startup culture.
This is, by the way, contemporaneous with something yesterday’s keynote highlighted: the 19th century drive to establish ‘urban laws’.
By now, we begin seeing historians leveraging public trust in scientific methods as a means for political control and pushing agendas. This happens in concert with the rise of punched cards and, eventually, computational history. Perhaps the best example of this historical moment comes from the American Census in the late 19th century.
Briefly, a group of 19th century American historians, journalists, and census chiefs used statistics, historical atlases, and the machinery of the census bureau to publicly argue for the disintegration of the U.S. Western Frontier in the late 19th century.
These moves were, in part, made to consolidate power in the American West and wrest control from the native populations who still lived there. They accomplished this, in part, by publishing popular atlases showing that the western frontier was so fractured that it was difficult to maintain and defend.
The argument, it turns out, was pretty compelling.
Part of what drove the statistical power and scientific legitimacy of these arguments was the new method, in 1890, of entering census data on punched cards and processing them in tabulating machines. The mechanism itself was wildly successful, and the inventor’s company wound up merging with a few others to become IBM. As was true of punched-card humanities projects through the time of Father Roberto Busa, this work was largely driven by women.
It’s worth pausing to remember that the history of punch card computing is also a history of the consolidation of government power. Seeing like a computer was, for decades, seeing like a state. And how we see influences what we see, what we care about, how we think.
Recall the Ed Ayers quote I mentioned at the beginning of this talk. He said the statistical machinery of early quantitative historians could not represent the nuance of historical experience. That doesn’t just mean the math they used; it means the actual machinery involved.
See, one of the truly groundbreaking punch card technologies at the turn of the century was the card sorter. Each card could represent a person, or household, or whatever else, which is sort of legible one-at-a-time, but unmanageable in giant stacks.
Now, this is still well before “computers”, but machines were being developed which could sort these cards into one of twelve pockets based on which holes were punched. So, for example, if you had cards punched for people’s age, you could sort the stacks into 10 different pockets to break them up by age groups: 0-9, 10-19, 20-29, and so forth.
This turned out to be amazing for eyeball estimates. If your 20-29 pocket was twice as full as your 10-19 pocket after all the cards were sorted, you had a pretty good idea of the age distribution.
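For those who think in code, the sorter’s job can be imitated in Python. This is a loose sketch, not a model of any real tabulating machine: the stack of cards is invented, and dropping over-age cards into the last pocket is my own simplifying assumption.

```python
from collections import Counter


def sort_into_pockets(ages, width=10, num_pockets=10):
    """Count how many cards land in each age pocket (0-9, 10-19, ...).

    Ages beyond the last pocket are placed in the last pocket, a
    simplifying assumption rather than documented sorter behaviour.
    """
    pockets = Counter()
    for age in ages:
        index = min(age // width, num_pockets - 1)
        pockets[index] += 1
    return pockets


# Hypothetical stack of cards, one age per card.
cards = [12, 15, 19, 21, 22, 24, 25, 27, 28, 34, 8]
pockets = sort_into_pockets(cards)

# Print each pocket as a row of marks, the textual version of eyeballing
# how full each pocket is.
for index in sorted(pockets):
    low = index * 10
    print(f"{low}-{low + 9}: {'#' * pockets[index]}")
```

In this made-up stack the 20-29 pocket ends up twice as full as the 10-19 pocket, which is the kind of at-a-glance comparison the machine made easy.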
Over the next 50 years, this convenience would shape the social sciences. Consider demographics or marketing. Both developed in the shadow of punch cards, and both relied heavily on what’s called “segmentation”, the breaking of society into discrete categories based on easily punched attributes. Age ranges, racial background, etc. These would be used to, among other things, determine who was interested in what products.
They’d eventually use statistics on these segments to inform marketing strategies.
But, if you look at the statistical tests that already existed at the time, these segmentations weren’t always the best way to break up the data. For example, age flows smoothly between 0 and 100; you could easily contrive a statistical test to show that, as a person ages, she’s more likely to buy one product over another, over a set of smooth functions.
That’s not how it worked though. Age was, and often still is, chunked up into ten or so distinct ranges, and those segments were each analyzed individually, as though they were as distinct from one another as dogs and cats. That is, 0-9 is as related to 10-19 as it is to 80-89.
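The contrast between the two views can be sketched in Python. Everything here is hypothetical: the purchase records are generated by a toy rule so that buying rises steadily with age, and the straight-line fit is just the simplest stand-in for the “smooth functions” mentioned above (a real analysis of a yes/no outcome would more likely use something like logistic regression).

```python
from statistics import mean

# Hypothetical purchase records: (age, bought) pairs built so that the
# share of buyers rises by ten percentage points per decade of age.
records = [(age, 1 if age % 10 < age // 10 else 0) for age in range(90)]


def segment_rates(records, width=10):
    """Punched-card view: one purchase rate per age segment, keyed by label."""
    groups = {}
    for age, bought in records:
        low = (age // width) * width
        groups.setdefault(f"{low}-{low + width - 1}", []).append(bought)
    return {label: mean(values) for label, values in groups.items()}


def linear_trend(records):
    """Continuous view: least-squares slope and intercept of purchase on age."""
    ages = [age for age, _ in records]
    bought = [b for _, b in records]
    age_mean, bought_mean = mean(ages), mean(bought)
    covariance = sum((a - age_mean) * (b - bought_mean)
                     for a, b in zip(ages, bought))
    variance = sum((a - age_mean) ** 2 for a in ages)
    slope = covariance / variance
    return slope, bought_mean - slope * age_mean


rates = segment_rates(records)
slope, intercept = linear_trend(records)
print(rates)
print(f"change in purchase probability per year of age: {slope:.4f}")
```

The segmented result is a bag of labelled numbers with no built-in notion that 10-19 sits next to 20-29, while the fitted slope treats age as the continuous quantity it is.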
What we see here is the deep influence of technological affordances on scholarly practice, and it’s an issue we still face today, though in different form.
As historians began using punch cards and social statistics, they inherited, or appropriated, a structure developed for bureaucratic government processing, and were rightly soon criticized for its dehumanizing qualities.
Unsurprisingly, given this backdrop, historians in the first few decades of the 20th century often shied away from or rejected quantification.
The next wave of quantitative historians, who reached their height in the 1930s, approached the problem with more subtlety than the previous generations in the 1890s and 1860s.
Charles Beard’s famous Economic Interpretation of the Constitution of the United States used economic and demographic stats to argue that the US Constitution was economically motivated. Beard, however, did grasp the fundamental idiographic critique of quantitative history, claiming that history was, quote:
“beyond the reach of mathematics — which cannot assign meaningful values to the imponderables, immeasurables, and contingencies of history.”
The other frequent critique of quantitative history, still heard, is that it uncritically appropriates methods from stats and the sciences.
This also wasn’t entirely true. The slide behind me shows famed statistician Karl Pearson’s attempt to replicate the math of Isaac Newton that we saw earlier using more sophisticated techniques.
By the 1940s, Americans with graduate training in statistics like Ernest Rubin were actively engaging historians in their own journals, discussing how to carefully apply statistics to historical research.
On the other side of the channel, the French Annales historians were advocating longue durée history; a move away from biographies to prosopographies, from events to structures. In its own way, this was another historiography teetering on the edge between the nomothetic and idiographic, an approach that sought to uncover the rhymes of history.
Interest in quantitative approaches surged again in the late 1950s, led by a new wave of Annales historians like Fernand Braudel and American quantitative manifestos like those by Benson, Conrad, and Meyer.
William Aydelotte went so far as to point out that all historians implicitly quantify, when they use words like “many”, “average”, “representative”, or “growing” – and the question wasn’t can there be quantitative history, but when should formal quantitative methods be utilized?
By 1968, George Murphy, seeing the swell of interest, asked a very familiar question: why now? He asked why the 1960s were different from the 1860s or 1930s, why were they, in that historical moment, able to finally do it right? His answer was that it wasn’t just the new technologies, the huge datasets, the innovative methods: it was the zeitgeist. The 1960s was the right era for computational history, because it was the era of computation.
By the early 70s, there was a historian using a computer in every major history department. Quantitative history had finally grown into itself.
Of course, in retrospect, Murphy was wrong. Once the pendulum swung too far towards scientific history, theoretical objections began pushing it the other way.
In Poverty of Historicism, Popper rejected scientific history, but mostly as a means to reject historicism outright. Popper’s arguments represent an attack from outside the historiographic tradition, but one that eventually had significant purchase even among historians, as an indication of the failure of nomothetic approaches to culture. It is, to an extent, a return to Musgrave’s critique of Isaac Newton.
At the same time, we see growing criticism from historians themselves. Arthur Schlesinger famously wrote that “important questions are important precisely because they are not susceptible to quantitative answers.”
There was a converging consensus among English-speaking historians, as in the early 20th century, that quantification erased the essence of the humanities, that it smoothed over the very inequalities and historical contingencies we needed to highlight.
Jacques Barzun summed it up well, if scathingly, saying history ought to free us from the bonds of the machine, not feed us into it.
The skeptics prevailed, and the pendulum swung the other way. The post-structural, cultural, and literary-critical turns in historiography pivoted away from quantification and computation. The final nail was probably Fogel and Engerman’s 1974 Time on the Cross, which reduced the Atlantic slave-trade to economic figures, and didn’t exactly treat the subject with nuance and care.
The cliometricians, demographers, and quantitative historians didn’t disappear after the cultural turn, but their numbers shrank, and they tended to find themselves in social science departments, or fled here to Europe, where social and economic historians were faring better.
Which brings us, 40 years on, to the middle of a new wave of quantitative or “formal method” history. Ed Ayers, like George Murphy before him, wrote, essentially, this time it’s different.
And he’s right, to a point. Many here today draw their roots not to the cliometricians, but to the very cultural historians who rejected quantification in the first place. Ours is a digital history steeped in the values of the cultural turn, that respects social justice and seeks to use our approaches to shine a light on the underrepresented and the historically contingent.
But that doesn’t stop a new wave of critiques that, if not repeating old arguments, certainly rhymes. Take Johanna Drucker’s recent call to rebrand data as capta, because when we treat observations objectively, as if they were the same as the phenomena observed, we collapse the critical distance between the world and our interpretation of it. And interpretation, Drucker contends, is the foundation on which humanistic knowledge is based.
Which is all to say, every swing of the pendulum between idiographic and nomothetic history was situated in its own historical moment. It’s not a clock’s pendulum, but Foucault’s pendulum, with each swing’s apex ending up slightly off from the last. The issues of chronology and astronomy are different from those of eugenics and manifest destiny, which are themselves different from the capitalist and dehumanizing tendencies of 1950s mainframes.
But they all rhyme. Quantitative history has failed many times, for many reasons, but there are a few threads that bind them which we can learn from — or, at least, a few recurring mistakes we can recognize in ourselves and try to avoid going forward.
We won’t, I suspect, stop the pendulum’s inevitable about-face, but at least we can continue our work with caution, respect, and care.
The lesson I’d like to highlight may be summed up in one question, asked by Humpty Dumpty to Alice: which is to be master?
Over several hundred years of quantitative history, the advice of proponents and critics alike tends to align with this question. Indeed in 1956, R.G. Collingwood wrote specifically “statistical research is for the historian a good servant but a bad master,” referring to the fact that statistical historical patterns mean nothing without historical context.
Schlesinger, the guy I mentioned earlier who said historical questions are interesting precisely because they can’t be quantified, later acknowledged that while quantitative methods can be useful, they’ll lead historians astray. Instead of tackling good questions, he said, historians will tackle easily quantifiable ones, and Schlesinger was uncomfortable with the tail wagging the dog.
I’ve found many ways in which historians have accidentally given over agency to their methods and machines over the years, but these five, I think, are the most relevant to our current moment.
Unfortunately, since we’re running out of time, you’ll just have to trust me that these are historically recurring.
Number 1 is the uncareful appropriation of statistical methods for historical uses. It controls us precisely because it offers us a black box whose output we don’t truly understand.
A common example I see these days is in network visualizations. People visualize nodes and edges using what are called force-directed layouts in Gephi, but they don’t exactly understand what those layouts mean. As these layouts were designed, physical proximity of nodes is not meant to represent relatedness, yet I’ve seen historians interpret two neighboring nodes as being related because of their visual adjacency.
This is bad. It’s false. But because we don’t quite understand what’s happening, we get lured by the black box into nonsensical interpretations.
The second way methods drive us is in our reliance on methodological imports. That is, we take the time to open the black box, but we only use methods that we learn from statisticians or scientists. Even when we fully understand the methods we import, if we’re bound to other people’s analytic machinery, we’re bound to their questions and biases.
Take the example I mentioned earlier, with demographic segmentation, punch card sorters, and their influence on social scientific statistics. The very mechanical affordances of early computers influenced the sort of questions people asked for decades: how do discrete groups of people react to the world in different ways, and how do they compare with one another?
The next thing to watch out for is naive scientism. Even if you know the assumptions of your methods, and you develop your own techniques for the problem at hand, you still can fall into the positivist trap that Johanna Drucker warns us about — collapsing the distance between what we observe and some underlying “truth”.
This is especially difficult when we’re dealing with “big data”. Once you’re working with so much material you couldn’t hope to read it all, it’s easy to be lured into forgetting the distance between operationalizations and what you actually intend to measure.
For instance, if I’m finding friendships in Early Modern Europe by looking for particular words being written in correspondences, I will completely miss the existence of friends who were neighbors, and thus had no reason to write letters for us to eventually read.
A fourth way we can be misled by quantitative methods is the ease with which they lend an air of false precision or false certainty.
This is the problem Matthew Lincoln and the other panelists brought up yesterday, where missing or uncertain data, once quantified, falsely appears precise enough to make comparisons.
I see this mistake crop up in early and recent quantitative histories alike; we measure, say, the changing rate of transnational shipments over time, and notice a positive trend. The problem is the positive difference is quite small, easily attributable to error, but because numbers are always precise, it still feels like we’re being more precise than doing a qualitative assessment. Even when it’s unwarranted.
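A quick way to see the problem is to put an error estimate next to the trend. The sketch below is an assumption-laden illustration, not a recommended workflow: the shipment counts are invented, and the interval of two standard errors either side of the slope is a rule of thumb (a proper t-based interval would be somewhat wider with so few data points).

```python
from math import sqrt
from statistics import mean


def slope_with_interval(xs, ys, multiplier=2.0):
    """Least-squares slope plus a rough interval of +/- multiplier standard errors."""
    n = len(xs)
    x_mean, y_mean = mean(xs), mean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares around the fitted line.
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(xs, ys))
    std_error = sqrt(residual_ss / (n - 2) / sxx)
    return slope, slope - multiplier * std_error, slope + multiplier * std_error


# Hypothetical shipment counts for ten consecutive years.
years = list(range(10))
shipments = [100, 104, 97, 103, 99, 105, 98, 104, 101, 103]

slope, low, high = slope_with_interval(years, shipments)
print(f"slope: {slope:.2f} shipments/year, rough interval: ({low:.2f}, {high:.2f})")
if low <= 0 <= high:
    print("The interval includes zero: the 'trend' may be noise.")
```

With these made-up numbers the slope is positive, but the interval around it comfortably spans zero, so the precise-looking figure supports no stronger claim than a qualitative “roughly flat” would.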
The last thing to watch out for, and maybe the most worrisome, is the blinders quantitative analysis places on historians who don’t engage in other historiographic methods. This has been the downfall of many waves of quantitative history in the past; the inability to care about or even see that which can’t be counted.
This was, in part, what led Time on the Cross to become the excuse to drive historians from cliometrics. The indicators of slavery that were measurable were sufficient to show it to have some semblance of economic success for black populations; but it was precisely those aspects of slavery they could not measure that were the most historically important.
So how do we regain mastery in light of these obstacles?
1. Uncareful Appropriation – Collaboration
Regarding the uncareful appropriation of methods, we can easily sidestep the issue of accidentally misusing a method by collaborating with someone who knows how the method works. This may require a translator; statisticians can as easily misunderstand historical problems as historians can misunderstand statistics.
Historians and statisticians can fruitfully collaborate, though, if they have someone in the middle trained to some extent in both — even if they’re not themselves experts. For what it’s worth, Dutch institutions seem to be ahead of the game in this respect, which is something that should be fostered.
2. Reliance on Imports – Statistical Training
Getting away from reliance on disciplinary imports may take some more work, because we ourselves must learn the approaches well enough to augment them, or create our own. Right now in DH this is often handled by summer institutes and workshop series, but I’d argue those are not sufficient here. We need to make room in our curricula for actual methods courses, or even degrees focused on methodology, in the same fashion as social scientists, if we want to start a robust practice of developing appropriate tools for our own research.
3. Naive Scientism – Humanities History
The spectre of naive scientism, I think, is one we need to be careful of, but we are also already well-equipped to deal with it. If we want to combat the uncareful use of proxies in digital history, we need only to teach the history of the humanities; why the cultural turn happened, what’s gone wrong with positivistic approaches to history in the past, etc.
Incidentally, I think this is something digital historians already guard well against, but it’s still worth keeping in mind and making sure we teach it. Particularly, digital historians need to remain aware of parallel approaches from the past, rather than tracing their background only to the textual work of people like Roberto Busa in Italy.
False precision and false certainty have some shallow fixes, and some deep ones. In the short term, we need to be better about understanding things like confidence intervals and error bars, and use methods like what Matthew Lincoln highlighted yesterday.
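To make the short-term fix concrete, here is a minimal sketch of putting an error bar on a trend. The shipment counts are invented purely for illustration, and the "two standard errors" band is a rough rule of thumb rather than a proper confidence interval (with this few data points a t-based interval would be wider still).

```python
from math import sqrt

# Hypothetical yearly shipment counts, invented for illustration only.
years = [0, 1, 2, 3, 4]
shipments = [100, 104, 99, 105, 102]


def slope_with_error(xs, ys):
    """Least-squares slope of ys on xs, plus the slope's standard error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = sqrt(ssr / (n - 2) / sxx)
    return slope, std_err


slope, std_err = slope_with_error(years, shipments)
# Rough band of two standard errors either side of the estimate (an assumption).
low, high = slope - 2 * std_err, slope + 2 * std_err
print(f"trend: {slope:+.2f} per year, rough band: [{low:+.2f}, {high:+.2f}]")
```

The estimated trend is positive, but the band comfortably includes zero. That is exactly the situation where a number feels more precise than the evidence behind it.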
In the long term, though, digital history would do well to adopt triangulation strategies to help mitigate these issues. That means trying to reach the same conclusion using multiple different methods in parallel, and seeing if they all agree. If they do, you can be more certain your results are something you can trust, and not just an accident of the method you happened to use.
5. Quantitative Blinders – Rejecting Digital History
Avoiding quantitative blinders – that is, the tendency to only care about what’s easily countable – is an easy fix, but I’m afraid to say it, because it might put me out of a job. We can’t call what we do digital history, or quantitative history, or cliometrics, or whatever else. We are, simply, historians.
Some of us use more quantitative methods, and some don’t, but if we’re not ultimately contributing to the same body of work, both sides will do themselves a disservice by not bringing every approach to bear in the wide range of interests historians ought to pursue.
Qualitative and idiographic historians will be stuck unable to deal with the deluge of material that can paint us a broader picture of history, and quantitative or nomothetic historians will lose sight of the very human irregularities that make history worth studying in the first place. We must work together.
If we don’t come together, we’re destined to remain punched-card humanists – that is, we will always be constrained and led by our methods, not by history.
Of course, this divide is a false one. There are no purely quantitative or purely qualitative studies; close-reading historians will continue to say things like “representative” or “increasing”, and digital historians won’t start publishing graphs with no interpretation.
Still, silos exist, and some of us have trouble leaving the comfort of our digital humanities conferences or our “traditional” history conferences.
That’s why this conference, I think, is so refreshing. It offers a great mix of both worlds, and I’m privileged and thankful to have been able to attend. While there are a lot of lessons we can still learn from those before us, from my vantage point, I think we’re on the right track, and I look forward to seeing more of those fruitful combinations over the course of today.
This account is influenced by some talks by Ben Schmidt. Any mistakes are from my own faulty memory, and not from his careful arguments. ↩
“God willing, we’ll all meet again in Spaceballs 2: The Search for More Money.” -Mel Brooks, Spaceballs, 1987
A long time ago in a galaxy far, far away (2012 CE, Indiana), I wrote a few blog posts explaining that, when writing history, it might be good to talk to historians (1,2,3). They were popular posts for the Irregular, and inspired by Mel Brooks’ recent interest in making Spaceballs 2, I figured it was time for a sequel of my own. You know, for all the money this blog pulls in. 1
Two teams recently published very similar articles, attempting cultural comparison via a study of historical figures in different-language editions of Wikipedia. The first, by Gloor et al., is for a conference next week in Japan, and frames itself as cultural anthropology through the study of leadership networks. The second, by Eom et al. and just published in PLoS ONE, explores cross-cultural influence through historical figures who span different language editions of Wikipedia.
Before reading the reviews, keep in mind I’m not commenting on method or scientific contribution—just historical soundness. This often doesn’t align with the original authors’ intents, which is fine. My argument isn’t that these pieces fail at their goals (science is, after all, iterative), but that they would be markedly improved by adhering to the same standards of historical rigor as they adhere to in their home disciplines, which they could accomplish easily by collaborating with a historian.
The road goes both ways. If historians don’t want physicists and statisticians bulldozing through history, we ought to be open to collaborating with those who don’t have a firm grasp on modern historiography, but who nevertheless have passion, interest, and complementary skills. If the point is understanding people better, by whatever means relevant, we need to do it together.
“Cultural Anthropology Through the Lens of Wikipedia – A Comparison of Historical Leadership Networks in the English, Chinese, Japanese and German Wikipedia” by Gloor et al. analyzes “the historical networks of the World’s leaders since the beginning of written history, comparing them in the four different Wikipedias.”
Their method is simple (simple isn’t bad!): take each “people page” in Wikipedia, and create a network of people based on who else is linked within that page. For example, if Wikipedia’s article on Mozart links to Beethoven, a connection is drawn between them. Connections are only drawn between people whose lives overlap; for example, the Mozart (1756-1791) Wikipedia page also links to Chopin (1810-1849), but because they did not live concurrently, no connection is drawn.
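A minimal sketch of that filtering step, using only the example from the paragraph above. The data structures and function names are my own, and whether Gloor et al. treat links as directed is an assumption here; only the lifespan-overlap rule comes from their description.

```python
# Lifespans as (birth year, death year).
lifespans = {
    "Mozart": (1756, 1791),
    "Beethoven": (1770, 1827),
    "Chopin": (1810, 1849),
}

# Hypothetical link lists: only Mozart's links are taken from the example above.
page_links = {
    "Mozart": ["Beethoven", "Chopin"],
    "Beethoven": [],
    "Chopin": [],
}


def lives_overlap(a, b):
    """True if the two people were alive at the same time."""
    (birth_a, death_a), (birth_b, death_b) = lifespans[a], lifespans[b]
    return birth_a <= death_b and birth_b <= death_a


def build_edges(links):
    """Keep a link only when both people are known and their lives overlap."""
    edges = set()
    for source, targets in links.items():
        for target in targets:
            if target in lifespans and lives_overlap(source, target):
                edges.add((source, target))
    return edges


edges = build_edges(page_links)
print(edges)
```

Mozart and Beethoven end up connected; the Mozart to Chopin link is dropped because Chopin was born after Mozart died.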
A separate network is created for four different language editions of Wikipedia (English, Chinese, Japanese, German), because biographies in each edition are rarely exact translations, and often different people will be prominent within the same biography across all four languages. PageRank was calculated for all the people in the resulting networks, to get a sense of who the most central figures are according to the Wikipedia link structure.
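For readers who haven't met PageRank before, here is a bare-bones power-iteration version. It's a sketch of the general algorithm, not the authors' implementation: the damping factor of 0.85 is just the conventional default, the iteration count is arbitrary, and the three-node network is hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict of node -> list of linked nodes."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the small "random jump" share.
        new_rank = {node: (1 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy network: A and B both link to C, and C links back to A.
toy_links = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_links)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 3))
```

C comes out on top because two pages point to it, A comes second because the highly ranked C points to it, and B, which nobody links to, comes last. Note that nothing in the algorithm knows or cares what "important" means historically.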
“Who are the most important people of all times?” the authors ask, to which their data provides them an answer. 2 In China and Japan, they show, only warriors and politicians make the cut, whereas religious leaders, artists, and scientists made more of a mark on Germany and the English-speaking world. Historians and biographers wind up central too, given how often their names appear on the pages of famous contemporaries on whom they wrote.
Diversity is also a marked difference: 80% of the “top 50” people for the English Wikipedia were themselves non-English, whereas only 4% of the top people from the Chinese Wikipedia are not Chinese. The authors conclude that “probing the historical perspective of many different language-specific Wikipedias gives an X-ray view deep into the historical foundations of cultural understanding of different countries.”
Small quibbles aside (e.g. their data include the year 0 BC, which doesn’t exist), the big issue here is the ease with which they claim these are the “most important” actors in history, and that these datasets provide an “X-ray” into the language cultures that produced them. This betrays the same naïve assumptions that plague much of culturomics research: that you can uncritically analyze convenient datasets as a proxy for analyzing larger cultural trends.
You can in fact analyze convenient datasets as a proxy for larger cultural trends; you just need some cultural awareness and a critical perspective.
In this case, several layers of assumptions are open for questioning, including:
Is the PageRank algorithm a good proxy for historical importance? (The answer turns out to be yes in some situations, but probably not this one.)
Is the link structure in Wikipedia a good proxy for historical dependency? (No, although it’s probably a decent proxy for current cultural popularity of historical figures, which would have been a better framing for this article. Better yet, these data can be used to explore the many well-known and unknown biases that pervade Wikipedia.)
Can differences across language editions of Wikipedia be explained by any factors besides cultural differences? (Yes. For example, editors of the German-language Wikipedia may be less likely to write a German biography if one already exists in English, given that ≈64% of Germany speaks English.)
These and other questions, unexplored in the article, make it difficult to take at face value that this study can reveal important historical actors or compare cultural norms of importance. Which is a shame, because simple datasets and approaches like this one can produce culturally and scientifically valid results that wind up being incredibly important. And the scholars working on the project are top-notch, it’s just that they don’t have all the necessary domain expertise to explore their data and questions.
The great thing about PLoS is the quality control on its publications: there isn’t much. As long as primary research is presented, the methods are sound, the data are open, and the experiment is well-documented, you’re in.
It’s a great model: all reasonable work by reasonable people is published, and history decides whether an article is worthy of merit. Contrast this against the current model, where (let’s face it) everything gets published eventually anyway, it’s just a question of how many journal submissions and rounds of peer review you’re willing to sit through. Research sits for years waiting to be published, subject to the whims of random reviewers and editors who may hold long grudges, when it could be out there the minute it’s done, open to critique and improvement, and available to anyone to draw inspiration or to learn from someone’s mistakes.
“Interactions of Cultures and Top People of Wikipedia from Ranking of 24 Language Editions” by Eom et al. is a perfect example of this model. Do I consider it a paragon of cultural research? Obviously not, if I’m reviewing it here. Am I happy the authors published it, respectful of their attempt, and willing to use it to push forward our mutual goal of soundly-researched cultural understanding? Absolutely.
Eom et al.’s piece, similar to that of Gloor et al. above, uses links between Wikipedia people pages to rank historical figures and to make cultural comparisons. The article explores 24 different language editions of Wikipedia, and goes one step further, using the data to explore intercultural influence. Importantly, given that this is a journal-length article and not a paper from a conference proceeding like Gloor et al.’s, extra space and thought were clearly put into the cultural biases of Wikipedia across languages. That said, neither of the articles reviewed here includes any authors who identify themselves as historians or cultural experts.
This study collected data a bit differently from the last. Instead of a network connecting only those people whose lives overlapped, this network connected all pages within a single-language edition of Wikipedia, based only on links between articles. 3 They then ranked pages using a number of metrics, including but not limited to PageRank, and only then automatically extracted people to find who was the most prominent in each dataset.
In short, every Wikipedia article is linked in a network and ranked, after which all articles are culled except those about people. The authors explain: “On the basis of this data set we analyze spatial, temporal, and gender skewness in Wikipedia by analyzing birth place, birth date, and gender of the top ranked historical figures in Wikipedia.” By birth place, they mean the country currently occupying the location where a historical figure was born, such that Aristophanes, born in Byzantium 2,300 years ago, is considered Turkish for the purpose of this dataset. The authors note this can lead to cultural misattributions ≈3.5% of the time (e.g. Kant is categorized as Russian, having been born in a city now in Russian territory). They do not, however, call attention to the mutability of culture over time.
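The order of operations (rank everything, then keep only the people) can be sketched like so. This is a toy under loud assumptions: the six-article "edition" is invented, and a simple count of incoming links stands in for the several ranking metrics the authors actually used. The birthplace lookup follows their rule of using the modern country.

```python
from collections import Counter

# Hypothetical miniature edition: article -> articles it links to.
article_links = {
    "Carl Linnaeus": ["Taxonomy"],
    "Taxonomy": ["Carl Linnaeus"],
    "Honey bee": ["Carl Linnaeus", "Taxonomy"],
    "Brown bear": ["Carl Linnaeus", "Taxonomy"],
    "Immanuel Kant": ["Philosophy"],
    "Philosophy": ["Immanuel Kant"],
}

# Only people pages appear here; the value is the modern country
# occupying the birthplace, per the authors' rule.
people = {
    "Carl Linnaeus": "Sweden",
    "Immanuel Kant": "Russia",
}


def rank_articles(links):
    """Rank every article by incoming links (a stand-in for the real metrics)."""
    in_links = Counter()
    for source, targets in links.items():
        in_links[source] += 0  # make sure articles with no in-links still appear
        for target in targets:
            in_links[target] += 1
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(in_links, key=lambda article: (-in_links[article], article))


ranked = rank_articles(article_links)          # step 1: rank all articles
top_people = [(a, people[a]) for a in ranked if a in people]  # step 2: cull
print(top_people)
```

Even in a toy this small, articles about animals and concepts shape who counts as a "top person", since the people are ranked by a network they make up only part of.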
It is unsurprising, though comforting, to note that the fairly different approach to measuring prominence yields many of the same top-10 results as Gloor’s piece: Shakespeare, Napoleon, Bush, Jesus, etc.
Analysis of the dataset resulted in several worthy conclusions:
Many of the “top” figures across all language editions hail from Western Europe or the U.S.
Language editions bias local heroes (half of top figures in Wikipedia English are from the U.S. and U.K.; half of those in Wikipedia Hindi are from India) and regional heroes (among Wikipedia Korean, many top figures are Chinese).
Top figures are distributed throughout time in a pattern you’d expect given global population growth, excepting periods representing foundations of modern cultures (religions, politics, and so forth).
The farther you go back in time, the less likely a top figure from a certain edition of Wikipedia is to have been born in that language’s region. That is, modern prominent figures in Wikipedia English are from the U.S. or the U.K., but the earlier you go, the less likely top figures are born in English-speaking regions. (I’d question this a bit, given cultural movement and mutability, but it’s still a result worth noting).
Women are consistently underrepresented in every measure and edition. More recent top people are more likely to be women than those from earlier years.
The article goes on to describe methods and results for tracking cultural influence, but this blog post is already tediously long, so I’ll leave that section out of this review.
There are many methodological limitations to their approach, but the authors are quick to notice and point them out. They mention that Linnaeus ranks so highly because “he laid the foundations for the modern biological naming scheme so that plenty of articles about animals, insects and plants point to the Wikipedia article about him.” This research was clearly approached with a critical eye toward methodology.
Eom et al. do not fare as well historically as methodologically; opportunities to frame claims more carefully, or to ask different sorts of questions, are overlooked. I mentioned earlier that the research assumes historical cultural consistency, but cultural currents intersect languages and geography at odd angles.
The fact that Wikipedia English draws significantly from other locations the earlier you look should come as no surprise. But, it’s unlikely English Wikipedians are simply looking to more historically diverse subjects; rather, the locus of some cultural current (Christianity, mathematics, political philosophy) has likely moved from one geographic region to another. This should be easy to test with their dataset by looking at geographic clustering and spread in any given year. It’d be nice to see them move in that direction next.
I do appreciate that they tried to validate their method by comparing their “top people” to lists other historians have put together. Unfortunately, the only non-Wikipedia-based comparison they make is to a book written by an astrophysicist and white separatist with no historical training: “To assess the alignment of our ranking with previous work by historians, we compare it with [Michael H.] Hart’s list of the top 100 people who, according to him, most influenced human history.”
Both articles claim that an algorithm analyzing Wikipedia networks can compare cultures and discover the most important historical actors, though neither defines what they mean by “important.” The claim rests on the notion that Wikipedia’s grand scale and scope smooth out enough authorial bias that analyses of Wikipedia can inductively lead to discoveries about Culture and History.
And critically approached, that notion is more plausible than historians might admit. These two reviewed articles, however, don’t bring that critique to the table. 4 In truth, the dataset and analysis let us look through a remarkably clear mirror into the cultures that created Wikipedia, the heroes they make, and the roots to which they feel most connected.
Usefully for historians, there is likely much overlap between history and the picture Wikipedia paints of it, but the nature of that overlap needs to be understood before we can use Wikipedia to aid our understanding of the past. Without that understanding, boldly inductive claims about History and Culture risk reinforcing the same systemic biases which we’ve slowly been trying to fix. I’m absolutely certain the authors don’t believe that only 5% of history’s most important figures were women, but the framing of the articles does nothing to disabuse readers of this notion.
Eom et al. themselves admit “[i]t is very difficult to describe history in an objective way,” which I imagine is a sentiment we can all get behind. They may find an easier path forward in the company of some historians.
If you’re curious, the 10 most important people in the English-speaking world, in order, are George W. Bush, ol’ Willy Shakespeare, Sidney Lee, Jesus, Charles II, Aristotle, Napoleon, Muhammad, Charlemagne, and Plutarch. ↩
What do you think, is a year long enough to wait between Networks Demystified posts? I don’t think so, which is why it’s been a year and a month. Welcome back! A recent twitter back-and-forth culminated in a request for a discussion of “bimodal networks”, and my Networks Demystified series seemed like a perfect place for just such a discussion.
@scott_bot what I need, I think, is a discussion of network metrics from a bimodal perspective.
What’s a bimodal network, you ask? (Go on, ask aloud at your desk. Nobody will look at you funny, this is the age of Siri!) A bimodal network is one which connects two varieties of things. It’s also called a bipartite, 2-partite, or 2-mode network. A network of authors connected to the papers they write is bimodal, as are networks of books to topics, and people to organizations they are affiliated with.
This is a bimodal network which connects people and the clubs they belong to. Alice is a member of the Network Club and the We Love History Society, Bob’s in the Network Club and the No Adults Allowed Club, and Carol’s in the No Adults Allowed Club.
Bimodal networks are part of a larger class of k-partite networks. Unipartite/unimodal networks have only one type of node (remember, nodes are the stuff being connected by the edges), bipartite/bimodal networks have two types of nodes, tripartite/trimodal networks have three types of nodes, and so on to infinity.
The most common networks you’ll see being researched are unipartite. Who follows whom on Twitter? Who’s writing to whom in early modern Europe? What articles cite which other articles? All are examples of unipartite networks. It’s important to realize this isn’t necessarily determined by the dataset, but by the researcher doing the studying. For example, you can use the same organization affiliation dataset to create a unipartite network of who is in a club with whom, or a bipartite network of which person is affiliated with each organization.
The above illustration shows the same dataset used to create a unimodal and a bimodal network. The process of turning a pre-existing bimodal network into a unimodal network is called a bimodal projection. This process collapses one set of nodes into edges connecting the other set. In this case, because Alice and Bob are both members of the Network Club, the Network Club collapses into becoming an edge between those two people. The No Adults Allowed Club collapses into an edge between Bob and Carol. Because only Alice is a member of the We Love History Society, it does not collapse into an edge connecting any people.
You can also collapse the network in the opposite direction, connecting organizations who share people. No Adults Allowed and Network Club would share an edge (Bob), as would Network Club and We Love History Society (Alice).
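If it helps to see the projection spelled out, here is a small sketch in plain Python using the Alice, Bob, and Carol data. The function names are my own; real network libraries have their own projection routines, which I'm deliberately not relying on here.

```python
from itertools import combinations

memberships = {
    "Alice": {"Network Club", "We Love History Society"},
    "Bob": {"Network Club", "No Adults Allowed Club"},
    "Carol": {"No Adults Allowed Club"},
}


def project(affiliations):
    """Connect every pair of keys sharing an affiliation; label = what they share."""
    edges = {}
    for a, b in combinations(sorted(affiliations), 2):
        shared = affiliations[a] & affiliations[b]
        if shared:
            edges[(a, b)] = shared
    return edges


def invert(affiliations):
    """Flip person -> clubs into club -> people."""
    inverted = {}
    for key, items in affiliations.items():
        for item in items:
            inverted.setdefault(item, set()).add(key)
    return inverted


people_edges = project(memberships)          # clubs collapse into edges
club_edges = project(invert(memberships))    # people collapse into edges
print(people_edges)
print(club_edges)
```

The We Love History Society vanishes from the people projection entirely, because only Alice belongs to it. That's information the projection throws away.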
Why Bimodal Networks?
If the same dataset can be described with unimodal networks, which are less complex, why go to bi-, tri-, or multimodal? The answer to that is in your research question: different network representations suit different questions better.
Collaboration is a hot topic in bibliometrics. Who collaborates with whom? Why? Do your collaborators affect your future collaborations? Co-authorship networks are well-suited to some of these questions, since they directly connect collaborators who author a piece together. This is a unimodal network: I wrote The Historian’s Macroscope with Shawn Graham and Ian Milligan, so we draw an edge connecting each of us together.
Some of the more focused questions of collaboration, however, require a more nuanced view of the data. Let’s say you want to know how individual instances of collaboration affect individual research patterns going forward. In this case, you want to know more than the fact that I’ve co-authored two pieces with Shawn and Ian, and they’ve co-authored three pieces together.
For this added nuance, we can draw an edge from each of us to The Historian’s Macroscope (rather than each other), then another set of edges to the piece we co-authored in The Programming Historian, and a last set of edges going from Shawn and Ian to the piece they wrote in the Journal of Digital Humanities. That’s three people nodes and three publication nodes.
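Here is the same example as data, as a rough sketch. "Scott" stands in for me, and the two shortened publication labels are placeholders of my own rather than real titles.

```python
from collections import Counter
from itertools import combinations

# Publication -> authors. "Scott" is the author of this post.
authorship = {
    "The Historian's Macroscope": ["Scott", "Shawn", "Ian"],
    "Programming Historian piece": ["Scott", "Shawn", "Ian"],
    "Journal of Digital Humanities piece": ["Shawn", "Ian"],
}

# Bimodal view: one edge per (author, publication) pair.
bimodal_edges = [
    (author, publication)
    for publication, authors in authorship.items()
    for author in authors
]

# Unimodal view: count how many publications each pair of authors shares.
coauthor_weights = Counter()
for authors in authorship.values():
    for pair in combinations(sorted(authors), 2):
        coauthor_weights[pair] += 1

print(len(bimodal_edges), "bimodal edges")
print(dict(coauthor_weights))
```

The unimodal version can tell you that Shawn and Ian have three shared pieces, but only the bimodal version can tell you which pieces those were.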
Why Not Bimodal Networks?
Humanities data are often a rich array of node types: people, places, things, ideas, all connected to each other via a complex network. The trade-off is, the more complex and multimodal your dataset, the less you can reasonably do with it. This is one of the fundamental tensions between computational and traditional humanities. More categories lead to a richer understanding of the diversity of human experience, but are incredibly unhelpful when you want to count things.
Consider two pie-charts showing the religious makeup of the United States. The first chart groups together religions that fall under a similar umbrella, and the second does not. That is, the first chart groups religions like Calvinists and Lutherans together into the same pie slice (Protestants), and the second splits them into separate slices. The second, more complex chart obviously presents a richer picture of religious diversity in the United States, but it’s also significantly more difficult to read. It might trick you into thinking there are more Catholics than Protestants in the country, due to how the pie is split.
The same is true in network analysis. By creating a dataset with a hundred varieties of nodes, you lose your ability to see a bigger picture through meaningful aggregations.
Surely, you’re thinking, bimodal networks, with only two categories, should be fine! Wellllll, yes and no. You don’t bump into the same aggregation problem you do with very multimodal networks; instead, you bump into technical and mathematical issues. These issues are why I often warn non-technical researchers away from bimodal networks in research. They’re not theoretically unsound, they’re just difficult to work with properly unless you know what changes when you’re working with these complex networks.
The following section will discuss a few network metrics you may be familiar with, and what they mean for bimodal networks.
Network Metrics and Bimodality
The easiest thing to measure in a network is a node’s degree centrality. You’ll recall this is a measurement of how many edges are attached to a node, which gives a rough proxy for this concept we’ve come to call network “centrality”. It means different things depending on your data and your question: the most important or well-connected person in your social network; the point in the U.S. electrical grid which is most vulnerable to attack; the book that shares the most concepts with other books (the encyclopedia?); the city that the most traders pass through to get to their destination. These are all highly “central” in the networks they occupy.
Degree centrality is the easiest such proxy to compute: how many connections does a node have? The idea is that nodes that are more highly connected are more central. The assumption only goes so far, and it’s easy to come up with nodes that are central that do not have a high degree, as with the network below.
That’s the thing with these metrics: if you know how they work, you know which networks they apply well to, and which they do not. If what you mean by “centrality” is “has more friends”, and we’re talking about a Facebook network, then degree centrality is a perfect metric for the job.
If what you mean is “an important stop for river trade”, and we’re talking about 12th century Russia, then degree centrality sucks. The below is an illustration of such a network by Pitts (1978):
Moscow is number 35, and pretty clearly the most central according to the above criteria (you’ll likely pass through it to reach other destinations). But it only has a degree centrality of four! Node 9 also has a degree centrality of four, but clearly doesn’t play as important a structural role as Moscow in this network.
We already see that depending on your question, your definitions, and your dataset, specific metrics will either be useful or not. Metrics may change meanings entirely from one network to the next – for example, looking at bimodal rather than unimodal networks.
Consider what degree centrality means for Alice, Bob, and Carol’s bimodal affiliation network above, where each is associated with a different set of clubs. Calculate the degree centralities in your head (hint: if you can’t, you haven’t learned what degree centrality means yet. Try again.).
Alice and Bob have a degree of 2, and Carol has a degree of 1. Is this saying anything about how central each is to the network? Not at all. Compare this to the unimodal projection, and you’ll see Bob is clearly the only structurally central actor in the network. In a bimodal network, degree centrality is nothing more than a count of affiliations with the other half of the network. It is much less likely to tell you something structurally useful than if you were looking at a unimodal network.
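You can check the arithmetic with a few lines of Python. This is just a sketch with the toy data; the variable names are mine.

```python
from itertools import combinations

memberships = {
    "Alice": {"Network Club", "We Love History Society"},
    "Bob": {"Network Club", "No Adults Allowed Club"},
    "Carol": {"No Adults Allowed Club"},
}

# Bimodal degree: simply how many clubs each person is attached to.
bimodal_degree = {person: len(clubs) for person, clubs in memberships.items()}

# Projected degree: how many other people each person shares a club with.
projected_degree = {person: 0 for person in memberships}
for a, b in combinations(memberships, 2):
    if memberships[a] & memberships[b]:
        projected_degree[a] += 1
        projected_degree[b] += 1

print("bimodal:  ", bimodal_degree)
print("projected:", projected_degree)
```

Alice and Bob tie in the bimodal count, but only Bob stands out once the clubs are collapsed into ties between people.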
Consider another common measurement: clustering coefficient. You’ll recall that a node’s local clustering coefficient is the extent to which its neighbors are neighbors to one another. If all my Facebook friends know each other, I have a high clustering coefficient; if none of them know each other, I have a low clustering coefficient. If all of a power plant’s neighbors directly connect to one another, it has a high clustering coefficient, and if they don’t, it has a low clustering coefficient.
This measurement winds up being important for all sorts of reasons, but one way to interpret its meaning is as a proxy for the extent to which a node bridges diverse communities, the extent to which it is an important broker. In the 17th century, Henry Oldenburg was an important broker between disparate scholarly communities, in that he corresponded with people all across Europe, many of whom would never meet one another. The fact that they’d never meet is represented by the local clustering coefficient. It’s low, so we know his neighbors were unlikely to be neighbors of one another.
You can get creative (and network scientists often are) with what this metric means in the context of your own dataset. As long as you know how the algorithm works (taking the fraction of neighbors who are neighbors to one another), and the structural assumptions underlying your dataset, you can argue why clustering coefficient is a useful proxy for answering whatever question you’re asking.
Your argument may be pretty good, like if you say clustering coefficient is a decent (but not the best) proxy for revealing nodes that broker between disparate sections of a unimodal social network. Or your argument may be bad, like if you say clustering coefficient is a good proxy for organizational cohesion on the bimodal Alice, Bob, and Carol affiliation network above.
A careful look at the network, with our earlier definition of clustering coefficient in mind (taking the fraction of neighbors who are neighbors to one another), should reveal why this is a bad justification. Alice’s clustering coefficient is zero. As is Bob’s. As is the Network Club’s. Every node has a clustering coefficient of zero, because no node’s neighbors connect to each other. That’s just the nature of bimodal networks: they connect across, rather than within, modes. Alice can never connect directly with Bob, and the Network Club can never connect directly with the We Love History Society.
Bob’s neighbors (the organizations) can never be neighbors with each other. There will never be a clustering coefficient as we defined it.
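To see this for yourself, here is a sketch of the simple local clustering coefficient run on the affiliation network, with a hypothetical three-person triangle for contrast. Scoring nodes with fewer than two neighbors as zero is a convention I'm assuming; some tools leave that case undefined.

```python
from itertools import combinations


def to_adjacency(edges):
    """Turn an undirected edge list into node -> set of neighbors."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def local_clustering(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adjacency[node]
    if len(neighbors) < 2:
        return 0.0  # assumed convention for nodes with fewer than two neighbors
    pairs = list(combinations(sorted(neighbors), 2))
    linked = sum(1 for a, b in pairs if b in adjacency[a])
    return linked / len(pairs)


bimodal = to_adjacency([
    ("Alice", "Network Club"),
    ("Alice", "We Love History Society"),
    ("Bob", "Network Club"),
    ("Bob", "No Adults Allowed Club"),
    ("Carol", "No Adults Allowed Club"),
])
# Hypothetical unimodal triangle, for contrast.
triangle = to_adjacency([("X", "Y"), ("Y", "Z"), ("X", "Z")])

bimodal_scores = {node: local_clustering(bimodal, node) for node in bimodal}
print(bimodal_scores)
print(local_clustering(triangle, "X"))
```

Every node in the bimodal network scores zero, and it would stay that way no matter how many people or clubs you added.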
In short, the simplest definition of clustering coefficient doesn’t work on bimodal networks. It’s obvious if you know how your network works, and how clustering coefficient is calculated, but if you don’t think about it before you press the easy “clustering coefficient” button in Gephi, you’ll be led astray.
Gephi doesn’t know if your network is bimodal or unimodal or ∞modal. Gephi doesn’t care. Gephi just does what you tell it to. You want Gephi to tell you the degree centralities in a bimodal network? Here ya go! You want it to give you the local clustering coefficients of nodes in a bimodal network? Voila! Everything still works as though these metrics would produce meaningful, sensible results.
But they won’t be meaningful on your network. You need to be your own network’s sanity check, and not rely on software to tell you something’s a bad idea. Think about your network, think about your algorithm, and try to work through what an algorithm means in the context of your data.
Using Bimodal Networks
This doesn’t mean you should stop using bimodal networks. Most of the easy network software out there comes with algorithms made for unimodal networks, but other algorithms exist and are available for more complex networks. Very occasionally, but by no means always, you can project your bimodal network to a unimodal network, as described above, and run your unimodal algorithms on that new network projection.
There are a number of times when this doesn’t work well. At 2,300 words, this tutorial is already too long, so I’ll leave thinking through why as an exercise for the reader. It’s less complicated than you’d expect, if you have a pen and paper and know how fractions work.
The better solution, usually, is to use an algorithm meant for bi- or multimodal networks. Tore Opsahl has put together a good primer on the subject with regard to clustering coefficient (slightly mathy, but you can get through it with ample use of Wikipedia). He argues that projection isn’t an optimal solution, but gives a simple algorithm for finding bimodal clustering coefficients, and directions to do so in R. Essentially the algorithm extends the visibility of the clustering coefficient, asking whether a node’s neighbors 2 hops away can reach the others via 2 hops as well. Put another way, I don’t want to know what clubs Bob belongs to, but rather whether Alice and Carol can also connect to one another through a club.
It’s a bit difficult to write without the use of formulae, but looking at the bimodal network and thinking about what clustering coefficient ought to mean should get you on the right track.
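If code is easier to follow than formulae, the idea can be sketched roughly as follows. This is my own simplified reading of the two-hop logic described above, not a transcription of Opsahl's R code. The function name, the example data, and the rule that the "closing" club must differ from the two clubs already on the path are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def two_mode_clustering(memberships, person):
    """Share of a person's two-club paths that are closed by some third club.

    A path runs left - club_a - person - club_b - right. It counts as closed
    when left and right also share a club other than club_a and club_b.
    Returns None when the person sits at the center of no such path.
    """
    clubs_of = defaultdict(set)
    members_of = defaultdict(set)
    for p, club in memberships:
        clubs_of[p].add(club)
        members_of[club].add(p)

    paths = closed = 0
    for club_a, club_b in combinations(sorted(clubs_of[person]), 2):
        for left in members_of[club_a] - {person}:
            for right in members_of[club_b] - {person, left}:
                paths += 1
                shared = (clubs_of[left] & clubs_of[right]) - {club_a, club_b}
                if shared:
                    closed += 1
    return closed / paths if paths else None


# Hypothetical illustration data: (person, club) pairs.
memberships = [
    ("Bob", "Chess Club"),
    ("Alice", "Chess Club"),
    ("Dave", "Chess Club"),
    ("Bob", "Book Club"),
    ("Carol", "Book Club"),
    ("Alice", "Film Club"),
    ("Carol", "Film Club"),
]

print(two_mode_clustering(memberships, "Bob"))
```

Here Bob sits between Carol (Book Club) and both Alice and Dave (Chess Club). Alice and Carol can reach each other through the Film Club, but Dave and Carol cannot, so one of Bob's two paths is closed.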
The issue is there aren’t easy solutions through platforms like Gephi, and that’s probably on us as Digital Humanists. I’ve found that DHers are much more likely to have bi- or multimodal datasets than most network researchers. If we want to be able to analyze them easily, we need to start developing our own plugins to Gephi, or our own tools, to do so. Push-button solutions are great if you know what’s happening when you push the button.
So let this be an addendum to my previous warnings against using bimodal networks: by all means, use them, but make sure you really think about the algorithms and your data, and what any given metric might imply when run on your network specifically. There are all sorts of free resources online you can find by googling your favorite algorithm. Use them.
For more information, read up on specific algorithms, methods, interpretations, etc. for two-mode networks from Tore Opsahl.
History and astronomy are a lot alike. When people claim history couldn’t possibly be scientific, because how can you do science without direct experimentation, astronomy should be used as an immediate counterexample.
Astronomers and historians both view their subjects from great distances; too far to send instruments for direct measurement and experimentation. Things have changed a bit in the last century for astronomy, of course, with the advent of machines sensitive enough to create earth-based astronomical experiments. We’ve also built ships to take us to the farthest reaches, for more direct observations.
It’s unlikely we’ll invent a time machine any time soon, though, so historians are still stuck looking at the past in the same way we looked at the stars for so many thousands of years: through a glass, darkly. Like astronomers, we face countless observational distortions, twisting the evidence that appears before us until we’re left with an echo of a shadow of the past. We recreate the past through narratives, combining what we know of human nature with the evidence we’ve gathered, eventually (hopefully) painting ever-clearer pictures of a time we could never touch with our fingers.
Some take our lack of direct access as a good excuse to shake away all trappings of “scientific” methods. This seems ill-advised. Retaining what we’ve learned over the past 50 years about how we construct the world we see is important, but it’s not the whole story, and it’s got enough parallels with 17th century astronomy that we might learn some lessons from that example.
In the summer of 1610, Galileo observed Saturn through a telescope for the first time. He wrote with surprise that
the star of Saturn is not a single star, but is a composite of three, which almost touch each other, never change or move relative to each other, and are arranged in a row along the zodiac, the middle one being three times larger than the two lateral ones…
This curious observation would take half a century to resolve into what we today see as Saturn’s rings. Galileo wrote that others, using inferior telescopes, would report seeing Saturn as oblong, rather than as three distinct spheres. Lo and behold, within months, several observers reported an oblong Saturn.
What shocked Galileo even more, however, was an observation two years later when the two smaller bodies disappeared entirely. They appeared consistently, with every observation, and then one day poof they’re gone. And when they eventually did come back, they looked remarkably odd.
Saturn sometimes looked as though it had “handles”, one connected to either side, but the nature of those handles was unknown to Galileo, as was the reason why sometimes it looked like Saturn had handles, sometimes moons, and sometimes nothing at all.
Saturn was just really damn weird. Take a look at these observations from Gassendi a few decades later:
What the heck was going on? Many unsatisfying theories were put forward, but there was no real consensus.
Enter Christiaan Huygens, who in the 1650s was fascinated by the Saturn problem. He believed a better telescope was needed to figure out what was going on, and eventually got some help from his brother to build one.
The idea was successful. Within short order, Huygens developed the hypothesis that Saturn was encircled by a ring. This explanation, along with the various angles we would be viewing Saturn and its ring from Earth, accounted for the multitude of appearances Saturn could take. The figure below explains this:
The explanation, of course, was not universally accepted. An opposing explanation by an anti-Copernican Jesuit contested that Saturn had six moons, the configuration of which accounted for the many odd appearances of the planet. Huygens countered that the only way such a hypothesis could be sustained would be with inferior telescopes.
While the exact details of the dispute are irrelevant, the proposed solution was very clever, and speaks to contemporary methods in digital history. The Accademia del Cimento devised an experiment that would, in a way, test the opposing hypotheses. They built two physical models of Saturn, one with a ring, and one with six satellites configured just-so.
In 1660, the experimenters at the academy put the model of a ringed Saturn at the end of a 75-meter / 250-foot hallway. Four torches illuminated the model but were obscured from observers, so they wouldn’t be blinded by the torchlight. Then they had observers view the model through various quality telescopes from the other end of the hallway. The observers were essentially taken from the street, so they wouldn’t have preconceived notions of what they were looking at.
Depending on the distance and quality of the telescope, observers reported seeing an oblong shape, three small spheres, and other observations that were consistent with what astronomers had seen. When seen through a glass, darkly, a ringed Saturn does indeed form the most unusual shapes.
In short, the Accademia del Cimento devised an experiment, not to test the physical world, but to test whether an underlying reality could appear completely different through the various distortions that come along with how we observe it. If Saturn had rings, would it look to us as though it had two small satellites? Yes.
This did not prove Huygens’ theory, but it did prove it to be a viable candidate given the observational instruments at the time. Within a short time, the ring theory became generally accepted.
The Battle of Trafalgar
So what’s Saturn’s ring have to do with the price of tea in China? What about digital history?
The importance is in the experiment and the model. You do not need direct access to phenomena, whether they be historical or astronomical, to build models, conduct experiments, or generally apply scientific-style methods to test, elaborate, or explore a theory.
In October 1805, Lord Nelson led the British navy to a staggering victory against the French and Spanish during the Napoleonic Wars. The win is attributed to Nelson’s unusual and clever battle tactics of dividing his forces in columns perpendicular to the single line of the enemy ships. Twenty-seven British ships defeated thirty-three Franco-Spanish ones. Nelson didn’t lose a single British ship, while the Franco-Spanish fleet lost twenty-two.
But let’s say the prevailing account is wrong. Let’s say, instead, due to the direction of the wind and the superior weaponry of the British navy, victory was inevitable: no brilliant naval tactician required.
This isn’t a question of counterfactual history, it’s simply a question of competing theories. But how can we support this new theory without venturing into counterfactual thinking, speculation? Obviously Nelson did lead the fleet, and obviously he did use novel tactics, and obviously a resounding victory ensued. These are indisputable historical facts.
It turns out we can use a similar trick to what the Accademia del Cimento devised in 1660: pretend as though things are different (Saturn has a ring; Nelson’s tactics did not win the battle), and see whether our observations would remain the same (Saturn looks like it is flanked by two smaller moons; the British still defeated the French and Spanish).
It turns out, further, that someone’s already done this. In 2003, two Italian physicists built a simulation of the Battle of Trafalgar, taking into account details of the ships, various strategies, wind direction, speed, and so forth. The simulation is a bit like a video game that runs itself: every ship has its own agency, with the ability to make decisions based on its environment, to attack and defend, and so forth. It’s from a class of simulations called agent-based models.
When the authors directed the British ships to follow Lord Nelson’s strategy, of two columns, the fleet performed as expected: little loss of life on behalf of the British, major victory, and so forth. But when they ran the model without Nelson’s strategy, a combination of wind direction and superior British firepower still secured a British victory, even though the fleet was outnumbered.
…[it’s said] the English victory in Trafalgar is substantially due to the particular strategy adopted by Nelson, because a different plan would have led the outnumbered British fleet to lose for certain. On the contrary, our counterfactual simulations showed that English victory always occur unless the environmental variables (wind speed and direction) and the global strategies of the opposed factions are radically changed, which lead us to consider the British fleet victory substantially ineluctable.
Essentially, they tested assumptions of an alternative hypothesis, and found those assumptions would also lead to the observed results. A military historian might (and should) quibble with the details of their simplifying assumptions, but that’s all part of the process of improving our knowledge of the world. Experts disagree, replace simplistic assumptions with more informed ones, and then improve the model to see if the results still hold.
The Parable of the Polygons
This agent-based approach to testing theories about how society works is exemplified by the Schelling segregation model. This week the model shot to popularity through Vi Hart and Nicky Case’s Parable of the Polygons, a fabulous, interactive discussion of some potential causes of segregation. Go click on it, play through it, experience it. It’s worth it. I’ll wait.
Finished? Great! The model shows that, even if people only move homes if less than 1/3rd of their neighbors are the same color that they are, massive segregation will still occur. That doesn’t seem like too absurd a notion: everyone being happy with 2/3rds of their neighbors as another color, and 1/3rd as their own, should lead to happy, well-integrated communities, right?
Wrong, apparently. It turns out that people wanting 33% of their neighbors to be the same color as they are is sufficient to cause segregated communities. Take a look at the community created in Parable of the Polygons under those conditions:
This shows that very light assumptions of racism can still easily lead to divided communities. It’s not making claims about racism, or about society: what it’s doing is showing that this particular model, where people want a third of their neighbors to be like them, is sufficient to produce what we see in society today. Much like Saturn having rings is sufficient to produce the observation of two small adjacent satellites.
More careful work is needed, then, to decide whether the model is an accurate representation of what’s going on, but establishing that base, that the model is a plausible description of reality, is essential before moving forward.
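To make the mechanics concrete, here is a toy version of a Schelling-style model in Python. It is a sketch under my own assumptions, not the code behind Parable of the Polygons: the grid size, the share of empty cells, the random seed, and the rule that unhappy agents jump to a random vacant cell are all choices made for illustration. Only the "move if less than 1/3rd of your neighbors are like you" rule comes from the description above.

```python
import random


def neighbors(grid, r, c):
    """Return the colors found in the (up to) eight cells surrounding (r, c)."""
    size = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < size and 0 <= cc < size and grid[rr][cc]:
                found.append(grid[rr][cc])
    return found


def share_alike(grid, r, c):
    """Fraction of an agent's neighbors sharing its color (1.0 if it has none)."""
    near = neighbors(grid, r, c)
    if not near:
        return 1.0
    return sum(n == grid[r][c] for n in near) / len(near)


def mean_share_alike(grid):
    """Average share of like-colored neighbors over all agents on the grid."""
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size) if grid[r][c]]
    return sum(share_alike(grid, r, c) for r, c in cells) / len(cells)


def run(size=20, empty=0.2, threshold=1 / 3, sweeps=50, seed=1):
    """Shuffle two colors onto a grid, then let unhappy agents relocate."""
    rng = random.Random(seed)
    n_empty = int(size * size * empty)
    n_each = (size * size - n_empty) // 2
    cells = ["A"] * n_each + ["B"] * n_each + [None] * (size * size - 2 * n_each)
    rng.shuffle(cells)
    grid = [cells[i * size:(i + 1) * size] for i in range(size)]
    before = mean_share_alike(grid)

    for _ in range(sweeps):
        unhappy = [(r, c) for r in range(size) for c in range(size)
                   if grid[r][c] and share_alike(grid, r, c) < threshold]
        if not unhappy:
            break  # everyone is content, so nothing more will change
        rng.shuffle(unhappy)
        for r, c in unhappy:
            if share_alike(grid, r, c) >= threshold:
                continue  # earlier moves in this sweep already made this agent happy
            vacant = [(rr, cc) for rr in range(size) for cc in range(size)
                      if grid[rr][cc] is None]
            rr, cc = rng.choice(vacant)
            grid[rr][cc], grid[r][c] = grid[r][c], None
    return before, mean_share_alike(grid), grid


before, after, grid = run()
print(f"average share of like neighbors: {before:.2f} -> {after:.2f}")
```

The two printed numbers are the average share of same-color neighbors before and after agents relocate. In models of this kind the second number typically ends up well above the first, even though nobody asked for more than a third.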
Digital history is a ripe field for this sort of research. Like astronomers, we cannot (yet?) directly access what came before us, but we can still devise experiments to help support our research, in finding plausible narratives and explanations of the past. The NEH Office of Digital Humanities has already started funding workshops and projects along these lines, although they are most often geared toward philosophers and literary historians.
The person doing the most thoughtful theoretical work at the intersection of digital history and agent-based modeling is likely Marten Düring, who is definitely someone to keep an eye on if you’re interested in this area. An early innovator and strong practitioner in this field is Shawn Graham, who actively blogs about related issues. This technique, however, is far from the only one available to historians for devising experiments with the past. There’s a lot we can still learn from 17th century astronomers.
We interrupt this usually-DH blog because I got in a discussion about Special Relativity with a friend, and promised it was easily understood using only the math we use for triangles. But I’m a historian, so I can’t leave a good description alone without some background.
If you just want to learn how relativity works, skip ahead to the next post, Relativity Made Simple [Note! I haven’t written it yet, this is a two-part post. Stay tuned for the next section]; if you hate science and don’t want to know how the universe functions, but love history, read only this post. If you have a month of time to kill, just skip this post entirely and read through my 122-item relativity bibliography on Zotero. Everyone else, disregard this paragraph.
An Oddly Selective History of Relativity
This is not a history of how Einstein came up with his Theory of Special Relativity as laid out in Zur Elektrodynamik bewegter Körper in 1905. It’s filled with big words like aberration and electrodynamics, and equations with occult symbols. We don’t need to know that stuff. This is a history of how others understood relativity. Eventually, you’re going to understand relativity, but first I’m going to tell you how other people, much smarter than you, did not.
There’s an infamous (potentially mythical) story about how difficult it is to understand relativity: Arthur Eddington, a prominent astronomer, was asked whether it was true that only three people in the world understood relativity. After pausing for a moment, Eddington replies “I’m trying to think who the third person is!” This was about General Relativity, but it was also a joke: good scientists know relativity isn’t incredibly difficult to grasp, and even early on, lots of people could claim to understand it.
Good historians, however, know that’s not the whole story. It turns out a lot of people who thought they understood Einstein’s conceptions of relativity actually did not, including those who agreed with him. This, in part, is that story.
Relativity Before Einstein
Einstein’s special theory of relativity relied on two assumptions: (1) you can’t ever tell whether you’re standing still or moving at a constant velocity (or, in physics-speak, the laws of physics in any inertial reference frame are indistinguishable from one another), and (2) light always looks like it’s moving at the same speed (in physics-speak, the speed of light is always constant no matter the velocity of the emitting body nor that of the observer’s inertial reference frame). Let’s trace these concepts back.
Our story begins in the 14th century. William of Occam, famous for his razor, claimed motion was merely the location of a body and its successive positions over time; motion itself was in the mind. Because position was simply defined in terms of the bodies that surround it, this meant motion was relative. Occam’s student, Buridan, pushed that claim forward, saying “If anyone is moved in a ship and imagines that he is at rest, then, should he see another ship which is truly at rest, it will appear to him that the other ship is moved.”
The story moves forward at irregular speed (much like the speed of this blog, and the pacing of this post). Within a century, scholars introduced the concepts of an infinite universe without any center, nor any other ‘absolute’ location. Copernicus cleverly latched onto this relativistic thinking by showing that the math works just as well, if not better, when the Earth orbits the Sun, rather than vice versa. Galileo claimed there was no way, on the basis of mechanical experiments, to tell whether you were standing still or moving at a uniform speed.
For his part, Descartes disagreed, but did say that the only way one could discuss movement was relative to other objects. Christiaan Huygens takes Descartes a step forward, showing that there are no ‘privileged’ motions or speeds (that is, there is no intrinsic meaning of a universal ‘at rest’ – only ‘at rest’ relative to other bodies). Isaac Newton knew that it was impossible to measure something’s absolute velocity (rather than velocity relative to an observer), but still, like Descartes, supported the idea that there was an absolute space and absolute velocity – we just couldn’t measure it.
Let’s skip ahead some centuries. The year is 1893; the U.S. Supreme Court declared the tomato was a vegetable, Gandhi campaigned against segregation in South Africa, and the U.S. railroad industry bubble had just popped, forcing the government to bail out AIG for $85 billion. Or something. Also, by this point, most scientists thought light traveled in waves. Given that in order for something to travel in a wave, something has to be waving, scientists posited there was this luminiferous ether that pervaded the universe, allowing light to travel between stars and candles and those fish with the crazy headlights. It makes perfect sense. In order for sound waves to travel, they need air to travel through; in order for light waves to travel, they need the ether.
Ernst Mach, a philosopher read by many contemporaries (including Einstein), said that Newton and Descartes were wrong: absolute space and absolute motion are meaningless. It’s all relative, and only relative motion has any meaning. It is both physically impossible to measure an object’s “real” velocity, and also philosophically nonsensical. The ether, however, was useful. According to Mach and others, we could still measure something kind of like absolute position and velocity by measuring things in relationship to that all-pervasive ether. Presumably, the ether was just sitting still, doing whatever ether does, so we could use its stillness as a reference point and measure how fast things were going relative to it.
Well, in theory. Earth is hurtling through space, orbiting the sun at about 70,000 miles per hour, right? And it’s spinning too, at about a thousand miles an hour. But the ether is staying still. And light, supposedly, always travels at the same speed through the ether no matter what. So in theory, light should look like it’s moving a bit faster if we’re moving toward its source, relative to the ether, and a bit slower, if we’re moving away from it, relative to the ether. It’s just like if you’re in a train hurtling toward a baseball pitcher at 100 mph, and the pitcher throws a ball at you, also at 100 mph, in a futile attempt to stop the train. To you, the baseball will look like it’s going twice as fast, because you’re moving toward it.
It turns out measuring the speed of light in relation to the ether was really difficult. A bunch of very clever people made a bunch of very clever instruments which really should have measured the speed of earth moving through the ether, based on small observed differences of the speed of light going in different directions, but the experiments always showed light moving at the same speed. Scientists figured this must mean the earth was actually exerting a pull on the ether in its vicinity, dragging it along with it as the earth hurtled through space, explaining why light seemed to be constant in both directions when measured on earth. They devised even cleverer experiments that would account for such an ether drag, but even those seemed to come up blank. Their instruments, it was decided, simply were not yet fine-tuned enough to measure such small variations in the speed of light.
Not so fast! shouted Lorentz, except he shouted it in Dutch. Lorentz used the new electromagnetic theory to suggest that the null results of the ether experiments were actually a result, not of the earth dragging the ether along behind it, but of physical objects compressing when they moved against the ether. The experiments weren’t showing any difference in the speed of light they sought because the measuring instruments themselves contracted to just the right length to perfectly offset the difference in the velocity of light, when measuring “into” the ether. The ether was literally squeezing the electrons in the meter stick together so it became a little shorter; short enough to inaccurately measure light’s speed. The set of equations used to describe this effect became known as Lorentz Transformations. One property of these transformations was that the physical contractions would, obviously, appear the same from any observer. No matter how fast you were going relative to your measuring device, if it were moving into the ether, you would see it contracting slightly to accommodate the measurement difference.
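To get a feel for how small the effect would be, here is a quick back-of-the-envelope sketch in Python. The formula is the standard textbook contraction factor, sqrt(1 - v²/c²), which I'm supplying here rather than taking from anything above. The speed is the rough 70,000 mph figure for Earth's orbit quoted earlier, used as a stand-in for Earth's speed "into" the ether.

```python
import math

C = 299_792_458.0            # speed of light in meters per second
MPH_TO_MS = 1609.344 / 3600  # miles per hour -> meters per second


def contraction_factor(speed_ms):
    """Return sqrt(1 - v^2/c^2): the factor by which a length shrinks along its motion."""
    return math.sqrt(1.0 - (speed_ms / C) ** 2)


earth_speed = 70_000 * MPH_TO_MS                   # rough orbital speed, in m/s
shrinkage = 1.0 - contraction_factor(earth_speed)  # meters lost by a one-meter stick

print(f"a meter stick would shrink by about {shrinkage:.1e} meters")
```

That works out to roughly five billionths of a meter per meter stick, which gives some sense of why the instruments had to be so clever.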
Not so fast! shouted Poincaré, except he shouted it in French. This property of transformations to always appear the same, relative to the ether, was actually a problem. Remember that 500 years of physics that said there is no way to mechanically determine your absolute speed or absolute location in space? Yeah, so did Poincaré. He said the only way you could measure velocity or location was matter-to-matter, not matter-to-ether, so the Lorentz transformations didn’t fly.
It’s worth taking a brief aside to talk about the underpinnings of the theories of both Lorentz and Poincaré. Their theories were based on experimental evidence, which is to say, they based their reasoning on contraction on apparent experimental evidence of said contraction, and they based their theories of relativity off of experimental evidence of motion being relative.
Einstein and Relativity
When Einstein hit the scene in 1905, he approached relativity a bit differently. Instead of trying to fit the apparent contraction of objects from the ether drift experiment to a particular theory, Einstein began with the assumption that light always appeared to move at the same rate, regardless of the relative velocity of the observer. The other assumption he began with was that there was no privileged frame of reference; no absolute space or velocity, only the movement of matter relative to other matter. I’ll work out the math later, but, unsurprisingly, it turned out that working out these assumptions led to exactly the same transformation equations as Lorentz came up with experimentally.
The math was the same. The difference was in the interpretation of the math. Einstein’s theory required no ether, but what’s more, it did not require any physical explanations at all. Because Einstein’s theory of special relativity rested on two postulates about measurement, the theory’s entire implications rested in its ability to affect how we measure or observe the universe. Thus, the interpretation of objects “contracting,” under Einstein’s theory, was that they were not contracting at all. Instead, objects merely appear as though they contract relative to the movement of the observer. Another result of these transformation equations is that, from the perspective of the observer, time appears to move slower or faster depending on the relative speed of what is being observed. Lorentz’s theory predicted the same time dilation effects, but he just chalked it up to a weird result of the math that didn’t actually manifest itself. In Einstein’s theory, however, weird temporal stretching effects were Actually What Was Going On.
To reiterate: the math of Lorentz, Einstein, and Poincaré was (at least at this early stage) essentially equivalent. The result was that no experimental result could favor one theory over another. The observational predictions between each theory were exactly the same.
Relativity’s Supporters in America
I’m focusing on America here because it’s rarely focused on in the historiography, and it’s about time someone did. If I were being scholarly and citing my sources, this might actually be a novel contribution to historiography. Oh well, BLOG! All my primary sources are in that Zotero library I linked to earlier.
In 1910, Daniel Comstock wrote a popular account of the relativity of Lorentz and Einstein, to some extent conflating the two. He suggested that if Einstein’s postulates could be experimentally verified, his special theory of relativity would be true. “If either of these postulates be proved false in the future, then the structure erected can not be true in its present form. The question is, therefore, an experimental one.” Comstock’s statement betrays a misunderstanding of Einstein’s theory, though, because, at the time of that writing, there was no experimental difference between the two theories.
Gilbert Lewis and Richard Tolman presented a paper at the 1908 American Physical Society meeting in New York, where they describe themselves as fully behind Einstein over Lorentz. Oddly, they consider Einstein’s theory to be correct, as opposed to Lorentz’s, because his postulates were “established on a pretty firm basis of experimental fact.” Which, to reiterate, couldn’t possibly have been a difference between Lorentz and Einstein. Even more oddly still, they presented the theory not as one of physics or of measurement, but of psychology (a bit like 14th century Oresme). The two went on to separately write a few articles which supposedly experimentally confirmed the postulates of special relativity.
In fact, the few Americans who did seem to engage with the actual differences between Lorentz and Einstein did so primarily in critique. Louis More, a well-respected physicist from Cincinnati, labeled the difference as metaphysical and primarily useless. This American critique was fairly standard.
At the 1909 American Physical Society meeting in Boston, one physicist (Harold Wilson) claimed his experiments showed the difference between Einstein and Lorentz. One of the few truly theoretical American physicists, W.S. Franklin, was in attendance, and the lectures he saw inspired him to write a popular account of relativity in 1911; in it, he found no theoretical difference between Lorentz and Einstein. He tended to side theoretically with Einstein, but assumed Lorentz’s theory implied the same space and time dilation effects, which they did not.
Even this series of misunderstandings should be taken as shining examples in the context of an American approach to theoretical physics that was largely antagonistic, at times decrying theoretical differences entirely. At a symposium on Ether Theories at the 1911 APS, the presidential address by William Magie was largely about the uselessness of relativity because, according to him, physics should be a functional activity based in utility and experimentation. Joining Magie’s “side” in the debate were Michelson, Morley, and Arthur Gordon Webster, the co-founder of the American Physical Society. Of those at the meeting supporting relativity, Lewis was still convinced Einstein differed experimentally from Lorentz, and Franklin and Comstock each felt there was no substantive difference between the two. In 1912, Indiana University’s R.D. Carmichael stated Einstein’s postulates were “a direct generalization from experiment.” In short, the Americans were really focused on experiment.
Of Einstein’s theory, Louis More wrote in 1912:
Professor Einstein’s theory of Relativity [… is] proclaimed somewhat noisily to be the greatest revolution in scientific method since the time of Newton. That [it is] revolutionary there can be no doubt, in so far as [it] substitutes mathematical symbols as the basis of science and denies that any concrete experience underlies these symbols, thus replacing an objective by a subjective universe. The question remains whether this is a step forward or backward […] if there is here any revolution in thought, it is in reality a return to the scholastic methods of the Middle Ages.
More goes on to say how the “Anglo-Saxons” demand practical results, not the unfathomable theories of “the German mind.” Really, that quote about sums it up. By this point, the only Americans who even talked about relativity were the ones who trained in Germany.
I’ll end here, where most histories of the reception of relativity begin: the first Solvay Conference. It’s where this beautiful picture was taken.
To sum up: in the seven years following Einstein’s publication, the only Americans who agreed with Einstein were ones who didn’t quite understand him. You, however, will understand it much better, if you only read the next post [coming this week!].
[edit: I’m realizing I didn’t make it clear in this post that I’m aware many historians consider themselves scientists, and that there’s plenty of scientific historical archaeology and anthropology. That’s exactly what I’m advocating there be more of, and more varied.]
Short Answer: Yes.
Less Snarky Answer: Historians need to be flexible to fresh methods, fresh perspectives, and fresh blood. Maybe not that last one, I guess, as it might invite vampires. Okay, I suppose this answer wasn’t actually less snarky.
The long answer is that historians don’t necessarily need scientists, but that we do need fresh scientific methods. Perhaps as an accident of our association with the ill-defined “humanities”, or as a result of our being placed in an entirely different culture (see: C.P. Snow), most historians seem fairly content with methods rooted in thinking about text and other archival evidence. This isn’t true of all historians, of course – there are economic historians who use statistics, historians of science who recreate old scientific experiments, classical historians who augment their research with archaeological findings, archival historians who use advanced ink analysis, and so forth. But it wouldn’t be stretching the truth to say that, for the most part, historiography is the practice of thinking cleverly about words to make more words.
I’ll argue here that our reliance on traditional methods (or maybe more accurately, our odd habit of rarely discussing method) is crippling historiography, and is making it increasingly likely that the most interesting and innovative historical work will come from non-historians. Sometimes these studies are ill-informed, especially when the authors decide not to collaborate with historians who know the subject, but to claim that a few ignorant claims about history negate the impact of these new insights is an exercise in pedantry.
In defending the humanities, we like to say that scientists and technologists with liberal arts backgrounds are more well-rounded, better citizens of the world, more able to contextualize their work. Non-humanists benefit from a liberal arts education in pretty much all the ways that are impossible to quantify (and thus, extremely difficult to defend against budget cuts). We argue this in the interest of rounding a person’s knowledge, to make them aware of their past, of their place in a society with staggering power imbalances and systemic biases.
Humanities departments should take a page from their own books. Sure, a few general ed requirements force some basic science and math… but I got an undergraduate history degree in a nice university, and I’m well aware how little STEM I actually needed to get through it. Our departments are just as guilty of narrowness as those of our STEM colleagues, and often because of it, we rely on applied mathematicians, statistical physicists, chemists, or computer scientists to do our innovative work for (or sometimes, thankfully, with) us.
Of course, there’s still lots of innovative work to be done from a textual perspective. I’m not downplaying that. Not everyone needs to use crazy physics/chemistry/computer science/etc. methods. But there’s a lot of low hanging fruit at the intersection of historiography and the natural sciences, and we’re not doing a great job of plucking it.
The story below is illustrative.
Last night, Blaise Agüera y Arcas presented his research on Gutenberg to a packed house at our rare books library. He’s responsible for a lot of the cool things that have come out of Microsoft in the last few years, and just got a job at Google, where presumably he will continue to make cool things. Blaise has degrees in physics and applied mathematics. And, a decade ago, Blaise and historian/librarian Paul Needham sent ripples through the History of the Book community by showing that Gutenberg’s press did not work at all the way people expected.
It was generally assumed that Gutenberg employed a method called punchcutting in order to create a standard font. A letter carved into a metal rod (a “punch”) would be driven into a softer metal (a “matrix”) in order to create a mold. The mold would be filled with liquid metal which hardened to form a small block of a single letter (a “type”), which would then be loaded onto the press next to other letters, inked, and then impressed onto a page. Because the mold was metal, many duplicate “types” could be made of the same letter, thus allowing many uses of the same letter to appear identical on a single pressed page.
This process is what allowed all the duplicate letters to appear identical in Gutenberg’s published books. Except, of course, careful historians of early print noticed that letters weren’t, in fact, identical. In the 1980s, Paul Needham and a colleague attempted to produce an inventory of all the different versions of letters Gutenberg used, but they stopped after frequently finding 10 or more obviously distinct versions of the same letter.
This was perplexing, but the subject was bracketed away for a while, until Blaise Agüera y Arcas came to Princeton and decided to work with Needham on the problem. Using extremely high-resolution imaging techniques, Blaise noted that there were in fact hundreds of versions of every letter. Not only that, there were actually variations and regularities in the smaller elements that made up letters. For example, an “n” was formed by two adjacent vertical lines, but occasionally the two vertical lines seem to have flipped places entirely. The extremely basic letter “i” itself had many variations, but within those variations, many odd self-similarities.
Historians had, until this analysis, assumed most letter variations were due to wear of the type blocks. This analysis blew that hypothesis out of the water. These “i”s were clearly not all made in the same mold; but then, how had they been made? To answer this, they looked even closer at the individual letters.
It’s difficult to see at first glance, but they found something a bit surprising. The letters appeared to be formed of overlapping smaller parts: a vertical line, a diagonal box, and so forth. The below figure shows a good example of this. The glyphs on the bottom have a stem dipping below the bottom horizontal line, while the glyphs at the top do not.
The conclusion Needham and Agüera y Arcas drew, eventually, was that the punchcutting method must not have been used for Gutenberg’s early material. Instead, a set of carved “strokes” were pushed into hard sand or soft clay, configured such that the strokes would align to form various letters, not unlike the formation of cuneiform. This mold would then be used to cast letters, creating the blocks we recognize from movable type. The catch is that this soft clay could only cast letters a few times before it became unusable and would need to be recreated. As Gutenberg needed multiple instances of individual letters per page, many of those letters would be cast from slightly different soft molds.
At the end of his talk, Blaise made an offhand comment: how is it that historians/bibliographers/librarians have been looking at these Gutenbergs for so long, discussing the triumph of their identical characters, and not noticed that the characters are anything but uniform? Or, of those who had noticed it, why hadn’t they raised any red flags?
The insights they produced weren’t staggering feats of technology. He used a nice camera, a light shining through the pages of an old manuscript, and a few simple image recognition and clustering algorithms. The clustering part could even have been done by hand, and actually had been, by Paul Needham. And yes, it’s true, everything is obvious in hindsight, but there were a lot of eyes on these bibles, and odds are if some of them had been historians who were trained in these techniques, this insight could have come sooner. Every year students do final projects and theses and dissertations, but what percent of those use techniques from outside historiography?
In short, there are a lot of very basic assumptions we make about the past that could probably be updated significantly if we had the right skillset, or knew how to collaborate with those who did. I think people like William Newman, who performs Newton’s alchemical experiments, are on the right track. So are Shawn Graham, who reanimates the trade networks of ancient Rome using agent-based simulations, and Devon Elliott, who creates computational and physical models of objects from the history of stage magic. Elliott’s models have shown that certain magic tricks couldn’t possibly have worked as they were described to.
The challenge is how to encourage this willingness to reach outside traditional historiographic methods to learn about the past. Changing curricula to be more flexible is one way, but that is a slow and institutionally difficult process. Perhaps faculty could assign group projects to students taking their gen-ed history courses, encouraging disciplinary mixes and non-traditional methods. It’s an open question, and not an easy one, but it’s one we need to tackle.
Operationalize: to express or define (something) in terms of the operations used to determine or prove it.
Precision deceives. Quantification projects an illusion of certainty and solidity no matter the provenance of the underlying data. It is a black box, through which uncertain estimations become sterile observations. The process involves several steps: a cookie cutter to make sure the data are all shaped the same way, an equation to aggregate the inherently unique, a visualization to display exact values from a process that was anything but.
In this post, I suggest that Moretti’s discussion of operationalization leaves out an integral discussion on precision, and I introduce a new term, appreciability, as a constraint on both accuracy and precision in the humanities. This conceptual constraint paves the way for an experimental digital humanities.
Operationalizing and the Natural Sciences
An operationalization is the use of definition and measurement to create meaningful data. It is an incredibly important aspect of quantitative research, and it has served the western world well for at least 400 years. Franco Moretti recently published a LitLab Pamphlet and a nearly identical article in the New Left Review about operationalization, focusing on how it can bridge theory and text in literary theory. Interestingly, his description blurs the line between the operationalization of his variables (what shape he makes the cookie cutters that he takes to his text) and the operationalization of his theories (how the variables interact to form a proxy for his theory).
Moretti’s account anchors the practice in its scientific origin, citing primarily physicists and historians of physics. This is a deft move, but an unexpected one in a recent DH environment which attempts to distance itself from a narrative of humanists just playing with scientists’ toys. Johanna Drucker, for example, commented on such practices:
[H]umanists have adopted many applications […] that were developed in other disciplines. But, I will argue, such […] tools are a kind of intellectual Trojan horse, a vehicle through which assumptions about what constitutes information swarm with potent force. These assumptions are cloaked in a rhetoric taken wholesale from the techniques of the empirical sciences that conceals their epistemological biases under a guise of familiarity.
Rendering observation (the act of creating a statistical, empirical, or subjective account or image) as if it were the same as the phenomena observed collapses the critical distance between the phenomenal world and its interpretation, undoing the basis of interpretation on which humanistic knowledge production is based.
But what Drucker does not acknowledge here is that this positivist account is a century-old caricature of the fundamental assumptions of the sciences. Moretti’s account of operationalization as it percolates through physics is evidence of this. The operational view very much agrees with Drucker’s thesis, where the phenomena observed take second place to a definition steeped in the nature of measurement itself. Indeed, Einstein’s introduction of relativity relied on an understanding that our physical laws and observations of them rely not on the things themselves, but on our ability to measure them in various circumstances. The prevailing theory of the universe on a large scale is a theory of measurement, not of matter. Moretti’s reliance on natural scientific roots, then, is not antithetical to his humanistic goals.
I’m a bit horrified to see myself typing this, but I believe Moretti doesn’t go far enough in appropriating natural scientific conceptual frameworks. When describing what formal operationalization brings to the table that was not there before, he lists precision as the primary addition. “It’s new because it’s precise,” Moretti claims, “Phaedra is allocated 29 percent of the word-space, not 25, or 39.” But he asks himself: is this precision useful? Sometimes, he concludes, “It adds detail, but it doesn’t change what we already knew.”
I believe Moretti is asking the wrong first question here, and he’s asking it because he does not steal enough from the natural sciences. The question, instead, should be: is this precision meaningful? Only after we’ve assessed the reliability of new-found precision can we understand its utility, and here we can take some inspiration from the scientists, in their notions of accuracy, precision, uncertainty, and significant figures.
First, some definitions. The accuracy of a measurement is how close it is to the true value you are trying to capture, whereas the precision of a measurement is how often a repeated measurement produces the same results. The number of significant figures is a measurement of how precise the measuring instrument can possibly be. False precision is the illusion that one’s measurement is more precise than is warranted given the significant figures. Propagation of uncertainty is the pesky habit of false precision to weasel its way into the conclusion of a study, suggesting conclusions that might be unwarranted.
Accuracy roughly corresponds to how well-suited your operationalization is to finding the answer you’re looking for. For example, if you’re interested in the importance of Gulliver in Gulliver’s Travels, and your measurement is based on how often the character name is mentioned (12 times, by the way), you can be reasonably certain your measurement is inaccurate for your purposes.
Precision roughly corresponds to how fine-tuned your operationalization is, and how likely it is that slight changes in measurement will affect the outcomes of the measurement. For example, if you’re attempting to produce a network of interacting characters from The Three Musketeers, and your measuring “instrument” is to increase the strength of connection between two characters every time they appear in the same 100-word block, then you might be subject to difficulties of precision. That is, your network might look different if you start your sliding 100-word window from the 1st word, the 15th word, or the 50th word. The amount of variation in the resulting network is the degree of imprecision of your operationalization.
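To make that concrete, here is a minimal Python sketch of the 100-word-block “instrument,” run on an invented toy text with a block size of 5 so the effect is easy to see. The function name, the token list, and the block size are all my own assumptions for illustration; nothing here is taken from an actual study of The Three Musketeers.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_network(tokens, characters, block_size=100, offset=0):
    """Count, for each pair of characters, the blocks in which both appear.

    The text is cut into consecutive blocks of `block_size` tokens, starting
    at position `offset`; tokens before the offset are ignored.
    """
    edges = Counter()
    for start in range(offset, len(tokens), block_size):
        block = set(tokens[start:start + block_size])
        present = sorted(c for c in characters if c in block)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges

# Hypothetical toy text: "x" stands in for any word that is not a character name.
tokens = ("athos x x porthos x x x aramis x x x x "
          "athos x aramis x x x porthos x").split()
characters = {"athos", "porthos", "aramis"}

# Same text, same rule, three different starting points for the blocks.
for offset in (0, 2, 3):
    network = cooccurrence_network(tokens, characters, block_size=5, offset=offset)
    print(offset, dict(network))
```

On this toy text, each of the three offsets yields a different set of edges. The spread between those networks is a rough stand-in for the imprecision of the instrument.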
Significant figures are a bit tricky to port to DH use. When you’re sitting at home, measuring some space for a new couch, you may find that your meter stick only has tick marks to the centimeter, but nothing smaller. This is your highest threshold for precision; if you eyeballed and guessed your space was actually 250.5cm, you’ll have reported a falsely precise number. Others looking at your measurement may have assumed your meter stick was more fine-grained than it was, and any calculations you make from that number will propagate that falsely precise number.
Uncertainty propagation is especially tricky when you wind up combining two measurements together, when one is more precise and the other less. The rule of thumb is that your results can only be as precise as the least precise measurement that made its way into your equation. The final reported number is then generally in the form of 250 (±1 cm). Thankfully, for our couch, the difference of a centimeter isn’t particularly appreciable. In DH research, I have rarely seen any form of precision calculated, and I believe some of those projects would have reported different results had they accurately represented their significant figures.
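The rule of thumb can be sketched in a few lines of Python. This is only the significant-figures shortcut described above, not a formal treatment of error propagation, and the second measurement (a shelf measured to the millimeter) is a hypothetical I’ve added for illustration.

```python
def add_measurements(a, a_resolution, b, b_resolution):
    """Add two measurements and report the sum at the coarser resolution.

    Each resolution is the smallest increment the instrument can distinguish,
    in the same unit as the measurement.
    """
    coarsest = max(a_resolution, b_resolution)
    total = a + b
    # Round the sum to the nearest multiple of the coarsest resolution.
    rounded = round(total / coarsest) * coarsest
    return rounded, coarsest

# The couch space, read off a meter stick marked in centimeters.
space_cm, space_resolution = 250, 1
# Hypothetical second measurement from a finer instrument.
shelf_cm, shelf_resolution = 42.3, 0.1

total, resolution = add_measurements(space_cm, space_resolution,
                                     shelf_cm, shelf_resolution)
print(f"{total} (±{resolution} cm)")
```

The extra decimal from the finer instrument does not survive the sum: the result is reported to the centimeter, because that is all the meter stick can support.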
Precision, Accuracy, and Appreciability in DH
Moretti’s discussion of the increase of precision granted by operationalization leaves out any discussion of the certainty of that precision. Let’s assume for a moment that his operationalization is accurate (that is, his measurement is a perfect conversion between data and theory). Are his measurements precise? In the case of Phaedra, the answer at first glance is yes, words-per-character in a play would be pretty robust against slight changes in the measurement process.
And yet, I imagine, that answer will probably not sit well with some humanists. They may ask themselves: Is Oenone’s 12% appreciably different from Theseus’s 13% of the word-space of the play? In the eyes of the author? Of the actors? Of the audience? Does the difference make a difference?
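For readers who want to see how small such a measurement really is, here is a rough Python sketch of a word-space calculation. It assumes the play has already been reduced to (speaker, speech) pairs; the speakers and speeches below are invented placeholders, not lines from Phaedra, and the function name is my own.

```python
from collections import Counter

def word_space(speeches):
    """Return each speaker's share of all spoken words, as a fraction of 1."""
    counts = Counter()
    for speaker, speech in speeches:
        counts[speaker] += len(speech.split())
    total = sum(counts.values())
    return {speaker: n / total for speaker, n in counts.items()}

# Hypothetical dialogue; each "speech" is just filler words.
speeches = [
    ("First", "one two three four five six"),
    ("Second", "one two three"),
    ("First", "one two"),
    ("Third", "one"),
]

shares = word_space(speeches)
for speaker, share in shares.items():
    print(f"{speaker}: {share:.0%}")
```

The arithmetic will happily report a one-point gap between two characters. Whether anyone in the audience could register that gap is a separate question, and not one the code can answer.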
The mechanisms by which people produce and consume literature are not precise. Surely Jean Racine did not sit down intending to give Theseus a fraction more words than Oenone. Perhaps in DH we need a measurement of precision, not of the measuring device, but of our ability to interact with the object we are studying. In a sense, I’m arguing, we are not limited to the precision of the ruler when measuring humanities objects, but to the precision of the human.
In the natural sciences, accuracy is constrained by precision: you can only have as accurate a measurement as your measuring device is precise. In the corners of humanities where we study how people interact with each other and with cultural objects, we need a new measurement that constrains both precision and accuracy: appreciability. A humanities quantification can only be as precise as that precision is appreciable by the people who interact with the matter at hand. If two characters differ by a single percent of the wordspace, and that difference is impossible to register at a conscious or subconscious level, what is the meaning of additional levels of precision (and, consequently, additional levels of accuracy)?
Experimental Digital Humanities
Which brings us to experimental DH. How does one evaluate the appreciability of an operationalization except by devising clever experiments to test the extent of granularity a person can register? Without such understanding, we will continue to create formulae and visualizations which portray a false sense of precision. Without visual cues to suggest uncertainty, graphs present a world that is exact and whose small differentiations appear meaningful or deliberate.
Experimental DH is not without precedent. In Reading Tea Leaves (Chang et al., 2009), for example, the authors assessed the quality of certain topic modeling tweaks based on how a large number of people assessed the coherence of certain topics. If this approach were to catch on, as well as more careful acknowledgements of accuracy, precision, and appreciability, then those of us who are making claims to knowledge in DH can seriously bolster our cases.
There are some who present the formal nature of DH as antithetical to the highly contingent and interpretative nature of the larger humanities. I believe appreciability and experimentation can go some way toward alleviating the tension between the two schools, building one into the other. On the way, it might build some trust in humanists who think we sacrifice experience for certainty, and in natural scientists who are skeptical of our abilities to apply quantitative methods.
Right now, DH seems to find its most fruitful collaborations in computer science or statistics departments. Experimental DH would open the doors to new types of collaborations, especially with psychologists and sociologists.
I’m at an extremely early stage in developing these ideas, and would welcome all comments (especially those along the lines of “You dolt! Appreciability already exists, we call it x.”) Let’s see where this goes.
There’s an oft-spoken and somewhat strawman tale of how the digital humanities is bridging C.P. Snow’s “Two Cultures” divide, between the sciences and the humanities. This story is sometimes true (it’s fun putting together Ocean’s Eleven-esque teams comprising every discipline needed to get the job done) and sometimes false (plenty of people on either side still view the other with skepticism), but as a historian of science, I don’t find the divide all that interesting. As Snow’s title suggests, this divide is first and foremost cultural. There’s another overlapping divide, a bit more epistemological, methodological, and ontological, which I’ll explore here. It’s the nomothetic(type)/idiographic(token) divide, and I’ll argue here that not only are its barriers falling, but also that the distinction itself is becoming less relevant.
Nomothetic (Greek for “establishing general laws”-ish) and Idiographic (Greek for “pertaining to the individual thing”-ish) approaches to knowledge have often split the sciences and the humanities. I’ll offload the hard work onto Wikipedia:
Nomothetic is based on what Kant described as a tendency to generalize, and is typical for the natural sciences. It describes the effort to derive laws that explain objective phenomena in general.
Idiographic is based on what Kant described as a tendency to specify, and is typical for the humanities. It describes the effort to understand the meaning of contingent, unique, and often subjective phenomena.
These words are long and annoying to keep retyping, and so in the longstanding humanistic tradition of using new words for words which already exist, henceforth I shall refer to nomothetic as type and idiographic as token. 1 I use these because a lot of my digital humanities readers will be familiar with their use in text mining. If you counted the number of unique words in a text, you’d be counting the number of types. If you counted the number of total words in a text, you’d be counting the number of tokens, because each token (word) is an individual instance of a type. You can think of a type as the platonic ideal of the word (notice the word typical?), floating out there in the ether, and every time it’s actually used, it’s one specific token of that general type.
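For anyone who hasn’t met the text-mining usage, a minimal Python sketch of the count looks something like this. The tokenizing rule (lowercase, then runs of letters and apostrophes) is a deliberately naive assumption of mine, and the sample sentence is made up.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a naive rule)."""
    return re.findall(r"[a-z']+", text.lower())

text = "The cat sat on the mat. The mat sat still."

tokens = tokenize(text)   # every individual occurrence of a word
types = set(tokens)       # each distinct word, counted once
counts = Counter(tokens)  # how many tokens belong to each type

print(len(tokens))    # 10 tokens
print(len(types))     # 6 types
print(counts["the"])  # the type "the" has 3 tokens
```

Each of the three occurrences of “the” is its own token, but all three are instances of the single type “the.”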
Usually the natural and social sciences look for general principles or causal laws, of which the phenomena they observe are specific instances. A social scientist might note that every time a student buys a $500 textbook, they actively seek a publisher to punch, but when they purchase $20 textbooks, no such punching occurs. This leads to the discovery of a new law linking student violence with textbook prices. It’s worth noting that these laws can be, and often are, nuanced and carefully crafted, with an awareness that they are neither wholly deterministic nor ironclad.
The humanities (or at least history, which I’m more familiar with) are more interested in what happened than in what tends to happen. Without a doubt there are general theories involved, just as in the social sciences there are specific instances, but the intent is most-often to flesh out details and create a particular internally consistent narrative. They look for tokens where the social scientists look for types. Another way to look at it is that the humanist wants to know what makes a thing unique, and the social scientist wants to know what makes a thing comparable.
It’s been noted these are fundamentally different goals. Indeed, how can you in the same research articulate the subjective contingency of an event while simultaneously using it to formulate some general law, applicable in all such cases? Rather than answer that question, it’s worth taking time to survey some recent research.
A recent digital humanities panel at MLA elicited responses by Ted Underwood and Haun Saussy, of which this post is in part itself a response. One of the papers at the panel, by Long and So, explored the extent to which haiku-esque poetry preceded what is commonly considered the beginning of haiku in America by about 20 years. They do this by teaching the computer the form of the haiku, and having it algorithmically explore earlier poetry looking for similarities. Saussy comments on this work:
[…] macroanalysis leads us to reconceive one of our founding distinctions, that between the individual work and the generality to which it belongs, the nation, context, period or movement. We differentiate ourselves from our social-science colleagues in that we are primarily interested in individual cases, not general trends. But given enough data, the individual appears as a correlation among multiple generalities.
One of the significant difficulties faced by digital humanists, and a driving force behind critics like Johanna Drucker, is the fundamental opposition between the traditional humanistic value of stressing subjectivity, uniqueness, and contingency, and the formal computational necessity of filling a database with hard decisions. A database, after all, requires you to make a series of binary choices in well-defined categories: is it or isn’t it an example of haiku? Is the author a man or a woman? Is there an author or isn’t there an author?
Underwood addresses this difficulty in his response:
Though we aspire to subtlety, in practice it’s hard to move from individual instances to groups without constructing something like the sovereign in the frontispiece for Hobbes’ Leviathan – a homogenous collection of instances composing a giant body with clear edges.
But he goes on to suggest that the initial constraint of the digital media may not be as difficult to overcome as it appears. Computers may even offer us a way to move beyond the categories we humanists use, like genre or period.
Aren’t computers all about “binary logic”? If I tell my computer that this poem both is and is not a haiku, won’t it probably start to sputter and emit smoke?
Well, maybe not. And actually I think this is a point that should be obvious but just happens to fall in a cultural blind spot right now. The whole point of quantification is to get beyond binary categories — to grapple with questions of degree that aren’t well-represented as yes-or-no questions. Classification algorithms, for instance, are actually very good at shades of gray; they can express predictions as degrees of probability and assign the same text different degrees of membership in as many overlapping categories as you like.
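As a toy illustration of what “degrees of membership” can look like, here is a short Python sketch. The raw scores are invented numbers standing in for whatever a trained classifier would produce; only the final step, squashing each score independently into a value between 0 and 1, is shown. This is an assumption about one common way to do it, not a description of the method used in any of the work discussed here.

```python
import math

def sigmoid(score):
    """Map any real-valued score to a degree of membership between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical raw scores for a single poem, one per category.
scores = {"haiku": 0.4, "imagist": 1.2, "sonnet": -3.0}

# Each category is scored on its own, so the memberships need not sum to 1:
# the same poem can be fairly haiku-like and fairly imagist at once.
memberships = {category: sigmoid(score) for category, score in scores.items()}

for category, degree in memberships.items():
    print(f"{category}: {degree:.2f}")
```

Because each category is judged independently, nothing forces the poem into exactly one box.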
Here we begin to see how the questions asked of digital humanists (on the one side; computational social scientists are tackling these same problems) are forcing us to reconsider the divide between the general and the specific, as well as the meanings of categories and typologies we have traditionally taken for granted. However, this does not yet cut across the token/type divide: this has gotten us to the macro scale, but it does not address general principles or laws that might govern specific instances. Historical laws are a murky subject, prone to inducing fits of anti-deterministic rage. Complex Systems Science and the lessons we learn from Agent-Based Modeling, I think, offer us a way past that dilemma, but more on that later.
For now, let’s talk about influence. Or diffusion. Or intertextuality. 2 Matthew Jockers has been exploring these concepts, most recently in his book Macroanalysis. The undercurrent of his research (I think I’ve heard him call it his “dangerous idea”) is a thread of almost-determinism. It is the simple idea that an author’s environment influences her writing in profound and easy to measure ways. On its surface it seems fairly innocuous, but it’s tied into a decades-long argument about the role of choice, subjectivity, creativity, contingency, and determinism. One word that people have used to get around the debate is affordances, and it’s as good a word as any to invoke here. What Jockers has found is a set of environmental conditions which afford certain writing styles and subject matters to an author. It’s not that authors are predetermined to write certain things at certain times, but that a series of factors combine to make the conditions ripe for certain writing styles, genres, etc., and not for others. The history of science analog would be the idea that, had Einstein never existed, relativity and quantum physics would still have come about; perhaps not as quickly, and perhaps not from the same person or in the same form, but they were ideas whose time had come. The environment was primed for their eventual existence. 3
It is here we see the digital humanities battling with the token/type distinction, and finding that distinction less relevant to its self-identification. It is no longer a question of whether one can impose or generalize laws on specific instances, because the axes of interest have changed. More and more, especially under the influence of new macroanalytic methodologies, we find that the specific and the general contextualize and augment each other.
The computational social sciences are converging on a similar shift. Jon Kleinberg likes to compare some old work by Stanley Milgram 4, where he had people draw maps of cities from memory, with digital city reconstruction projects which attempt to bridge the subjective and objective experiences of cities. The result in both cases is an attempt at something new: not quite objective, not quite subjective, and not quite intersubjective. It is a representation of collective individual experiences which in its whole has meaning, but also can be used to contextualize the specific. That these types of observations can often lead to shockingly accurate predictive “laws” isn’t really the point; they’re accidental results of an attempt to understand unique and contingent experiences at a grand scale. 5
It is no surprise that the token/type divide is woven into the subjective/objective divide. However, as Daston and Galison have pointed out, objectivity is not an ahistorical category. 6 It has a history, is only positively defined in relation to subjectivity, and neither were particularly useful concepts before the 19th century.
I would argue, as well, that the nomothetic and idiographic divide is one which is outliving its historical usefulness. Work from both the digital humanities and the computational social sciences is converging to a point where the objective and the subjective can peaceably coexist, where contingent experiences can be placed alongside general predictive principles without any cognitive dissonance, under a framework that allows both deterministic and creative elements. It is not that purely nomothetic or purely idiographic research will no longer exist, but that they no longer represent a binary category which can usefully differentiate research agendas. We still have Snow’s primary cultural distinctions, of course, and a bevy of disciplinary differences, but it will be interesting to see where this shift in axes takes us.
I am not the first to do this. Aviezer Tucker (2012) has a great chapter in The Oxford Handbook of Philosophy of Social Science, “Sciences of Historical Tokens and Theoretical Types: History and the Social Sciences” which introduces and historicizes the vocabulary nicely.
Hah! I tricked you. I don’t intend to define digital humanities here—too much blood has already been spilled over that subject. I’m sure we all remember the terrible digital humanities / humanities computing wars of 2004, now commemorated yearly under a Big Tent in the U.S., Europe, or in 2015, Australia. Most of us still suffer from ACH or ALLC (edit: I’ve been reminded the more politically correct acronym these days is EADH).
Instead, I’m here to report the findings of an extremely informal survey, with a sample size of 5, inspired by Paige Morgan’s question of what courses an undergraduate interested in digital humanities should take:
Should undergrads w/ humanities majors & interest in #digitalhumanities grad work pursue a second CS major, or a CS minor?
The question inspired a long discussion, worth reading through if you’re interested in digital humanities curricula. I suggested, were the undergrad interested in the heavily computational humanities (like Ted Underwood, Ben Schmidt, etc.), they might take linear algebra, statistics for social science, programming 1 & 2, web development, and a social science (like psych) research methods course, along with all their regular humanities courses. Others suggested removing some and including others, and of course all of these are pipe dreams unless our mystery undergrad is in the six year program.
The discussion got me thinking: how did the digital humanists we know and love get to where they are today? Given that the basic ethos of DH is that if you want to know something, you just have to ask, I decided to ask a few well-respected DHers how someone might go about reaching expertise in their subject matter. This isn’t a question of how to define digital humanities, but of the sorts of things the digital humanists we know and love learned to get where they are today. I asked:
Some of you may have seen this tweet by Paige Morgan this morning, asking about what classes an undergraduate student should take hoping to pursue DH. I’ve emailed you, a random and diverse smattering of highly recognizable names associated with DH, in the hopes of getting a broader answer than we were able to generate through twitter alone.
I know you’re all extremely busy, so please excuse my unsolicited semi-mass email and no worries if you don’t get around to replying.
If you do reply, however, I’d love to get a list of undergraduate courses (traditional humanities or otherwise) that you believe was or would be instrumental to the research you do. My list, for example, would include historical methods, philosophy of science, linear algebra, statistics, programming, and web development. I’ll take the list of lists and write up a short blog post about them, because I believe it would be beneficial for many new students who are interested in pursuing DH in all its guises. I’d also welcome suggestions for other people and “schools of DH” I’m sure to have missed.
And because the people in DH are awesome and forthcoming, I got many replies back. I’ll list them first here, and then attempt some preliminary synthesis below.
The first reply was from Ted Underwood, who was afraid my question skirted a bit too close to defining DH, saying:
No matter how heavily I hedge and qualify my response (“this is just a personal list relevant to the particular kind of research I do …”), people will tend to read lists like this as tacit/covert/latent efforts to DEFINE DH — an enterprise from which I never harvest anything but thorns.
Thankfully he came back to me a bit later, saying he’d worked up the nerve to reply to my survey because he’s “coming to the conclusion that this is a vital question we can’t afford to duck, even if it’s controversial [emphasis added]”. Ted continued:
So here goes, with three provisos:
I’m talking only about my own field (literary text mining), and not about the larger entity called “DH,” which may be too deeply diverse to fit into a single curriculum.
A lot of this is not stuff I actually took in the classroom.
I really don’t have strong opinions about how much of this should be taken as an undergrad, and what can wait for grad school. In practice, no undergrad is going to prepare themselves specifically for literary text mining (at least, I hope not). They should be aiming at some broader target.
But at some point, as preparation for literary text-mining, I’d recommend
A lot of courses in literary history and critical theory (you probably need a major’s worth of courses in some aspect of literary studies).
At least one semester of experience programming. Two semesters is better. But existing CS courses may not be the most efficient delivery system. You probably don’t need big-O notation. You do need data structures. You may not need to sweat the fine points of encapsulation. You probably do need to know about version control. I think there’s room for a “Programming for Humanists” course here.
Maybe one semester of linguistics (I took historical linguistics, but corpus linguistics would also work).
Statistics — a methods course for social scientists would be great.
At least one course in data mining / machine learning. This may presuppose more math than one semester of statistics will provide, so
Your recommendation of linear algebra is probably also a good idea.
I doubt all of that will fit in anyone’s undergrad degree. So in practice, any undergrad with courses in literary history plus a semester or two of programming experience, and perhaps statistics, would be doing very well.
So Underwood’s reply was that focusing too much in undergrad is not necessarily ideal, but were an undergraduate interested in literary text mining, they wouldn’t go astray with literary history, critical theory, a programming for humanists course, linguistics, statistics, data mining, and potentially linear algebra.
While Underwood is pretty well known for his computational literary history, Johanna Drucker is probably best known in our circles for her work in DH criticism. Her reply was concise and helpful:
In the best of all possible worlds, this would be followed by specialized classes in database design, scripting for the humanities, GIS/mapping, virtual worlds design, metadata/classification/culture, XML/markup, and data mining (textual corpora, image data mining, network analysis), and complex systems modeling, as well as upper division courses in disciplines (close/distant reading for literary studies, historical methods and mapping etc.).
The site she points to is an online coursebook that provides a broad overview of DH concepts, along with exercises and tutorials, that would make a good basic course on the groundwork of DH. She then gives a familiar list of computer-related and humanities courses that might be useful.
The next reply came from Melissa Terras, the director of the DH center (I’m sorry, centre) at UCL. Her response was a bit more general:
My first response is that they must be interested in Humanities research – and make the transition to being taught about Humanities, to doing research in the Humanities, and get the bug for finding out new information about a Humanities topic. It doesn’t matter what the Humanities subject is – but they must understand Humanities research questions, and what it means to undertake new research in the Humanities proper. (Doesn’t matter if their research project has no computing component, it’s about a hunger for new knowledge in this area, rather than digesting prior knowledge).
Like Underwood and Drucker, Terras is stressing that students cannot forget the humanities for the digital.
Then they must become information literate, and IT literate. We have a variety of training courses at our institution, and there is also the “European Driving License in IT” which is basic IT skills. They must get the bug for learning more about computing too. They’ll know after some basic courses whether they are a natural fit to computing.
Without the bug to do research, and the bug to understand more about computing, they are sunk for pursuing DH. These are the two main prerequisites.
Interestingly (but not surprisingly, given general DH trends), Terras frames passion about computing as more important than any particular skill.
Once they get the bug, then taking whatever courses are on offer to them at their institution – either for credit modules, or pure training courses in various IT methods, would stand them in good stead. For example, you are not going to get a degree course in Photoshop, but attending 6 hours of training in that…. plus spreadsheets, plus databases, plus XML, plus web design, would prepare you for pursuing a variety of other courses. Even if the institution doesnt offer taught DH courses, chances are they offer training in IT. They need to get their hands dirty, and to love learning more about computing, and the information environment we inhabit.
Her stress on hyper-focused courses of a few hours each is also interesting, and very much in line with our “workshop and summer school”-focused training mindset in DH.
It’s at that stage I’d be looking for a master’s program in DH, to take the learning of both IT and the humanities to a different level. Your list excludes people who have done “pure” humanities as an undergrad to pursuing DH, and actually, I think DH needs people who are, ya know, obsessed with Byzantine Sculpture in the first instance, but aren’t afraid of learning new aspects of computing without having any undergrad credit courses in it.
I’d also say that there is plenty room for people who do it the other way around – undergrads in comp sci, who then learn and get the bug for humanities research.
Terras continued that taking everything as an undergraduate would equate more to liberal arts or information science than a pure humanities degree:
As with all of these things, it depends on the make up of the individual programs. In my undergrad, I did 6 courses in my final year. If I had taken all of the ones you suggest: (historical methods, philosophy of science, linear algebra, statistics, programming, and web development) then I wouldn’t have been able to take any humanities courses! which would mean I was doing liberal arts, or information science, rather than a pure humanities degree. This will be a problem for many – just sayin’. 🙂
But yes, I think the key thing really is the *interest* and the *passion*. If your institution doesnt allow that type of courses as part of a humanities degree, you haven’t shot yourself in the foot, you just need to learn computing some other way…
Self-teaching is something that I think most people reading this blog can get behind (or commiserate with). I’m glad Terras shifted my focus away from undergraduate courses and toward how to get a DH education.
John Walsh is known in the DH world for his work on TEI, XML, and other formal data models of humanities media. He replied:
I started undergrad as a fine arts major (graphic design) at Ohio University, before switching to English literary studies. As an art major, I was required during my freshman year to take “Comparative Arts I & II,” in which we studied mostly the formal aspects of literature, visual arts, music, and architecture. Each of the two classes occupied a ten-week “quarter” (fall winter spring summer), rather than a semester. At the time OU had a department of comparative arts, which has since become the School of Interdisciplinary Arts.
In any case, they were fascinating classes, and until you asked the question, I hadn’t really considered those courses in the context of DH, but they were definitely relevant and influential to my own work. I took these courses in the 80s, but I imagine an updated version that took into account digital media and digital representations of non-digital media would be especially useful. The study of the formal aspects of these different art forms and media and shared issues of composition and construction gave me a solid foundation for my own work constructing things to model and represent these formal characteristics and relationships.
Walsh was the first one to single out a specific humanities course as particularly beneficial to the DH agenda. It makes sense: the course appears to have crossed many boundaries, focusing particularly on formal similarities. I’d hazard that this approach is at the heart of many of the more computational and formal areas of digital humanities (but perhaps less so for those areas more aligned with new media or critical theory).
I agree web development should be in the mix somewhere, along with something like Ryan Cordell’s “Text Technologies” that would cover various representations of text/documents and a look at their production, digital and otherwise, as well as tools (text analysis, topic modeling, visualization) for doing interesting things with those texts/documents.
Otherwise, Walsh’s courses aligned with those of Underwood and Drucker.
Matt Jockers’ expertise, like Underwood’s, tends toward computational literary history and criticism. His reply was short and to the point:
The thing I see missing here are courses Linguistics and Machine Learning. Specifically courses in computational linguistics, corpus linguistics, and NLP. The later are sometimes found in the CS depts. and sometimes in linguistics, it depends. Likewise, courses in Machine Learning are sometimes found in Statistics (as at Stanford) and sometimes in CS (as at UNL).
Jockers, like Underwood, mentioned that I was missing linguistics. In the Twitter conversation, Heather Froehlich pointed out the same deficiency. Jockers and Underwood also pointed to machine learning, which is particularly useful for the sort of research they both do.
I was initially surprised by how homogeneous the answers were, given the much-touted diversity of the digital humanities. I had also asked a few others, situated more closely in the new media, alt-ac, and library camps, who for various reasons couldn’t get back to me at the time, but even the similarity among those who replied was a bit surprising. Is it that DH is slowly canonizing around particular axes and methods, or are my selection criteria just woefully biased? I wouldn’t be too surprised if it were the latter.
In the end, it seems (at least according to the life paths of these particular digital humanists) that the modern digital humanist should be a passionate generalist, well-versed in their particular field of humanistic inquiry, and decently versed in a dizzying array of subjects and methods that are tied to computers in some way or another. The path is not necessarily one an undergraduate curriculum is well-suited for, but the self-motivated have many potential sources for education.
I was initially hoping to turn this short survey into a list of potential undergraduate curricula for different DH paths (much like my list of DH syllabi), but it seems we’re either not yet at that stage, or DH is particularly ill-suited to undergraduate-style curricula. I’m hoping some of you will leave comments on the areas of DH I’ve clearly missed, but from the view thus far, there seem to be more similarities than differences.