tl;dr Academics’ individual policing of disciplinary boundaries at the expense of intellectual merit does a disservice to our global research community, which is already structured to reinforce disciplinarity at every stage. We should work harder to encourage research misfits to offset this structural pull.
The academic game is stacked to reinforce old community practices. PhDs aren’t only about specialization, but about teaching you to think, act, write, and cite like the discipline you’ll soon join. Tenure is about proving to your peers you are like them. Publishing and winning grants are as much about goodness of fit as about quality of work.
This isn’t bad. One of science’s most important features is that it’s often cumulative or at least agglomerative, that scientists don’t start from scratch with every new project, but build on each other’s work to construct an edifice that often resembles progress. The scientific pipeline uses PhDs, tenure, journals, and grants as built-in funnels, ensuring everyone is squeezed snugly inside the pipes at every stage of their career. It’s a clever institutional trick to keep science cumulative.
But the funnels work too well. Or at least, there’s no equally entrenched clever institutional mechanism for building new pipes, for allowing the development of new academic communities that break the mold. Publishing in established journals that enforce their community boundaries is necessary for your career; most of the world’s scholarly grant programs are earmarked for and evaluated by specific academic communities. It’s easy to be disciplinary, and hard to be a misfit.
To be sure, this is a known problem. Patches abound. Universities set aside funds for “interdisciplinary research” or “underfunded areas”; postdoc positions, centers, and antidisciplinary journals exist to encourage exactly the sort of weird research I’m claiming has little place in today’s university. These solutions are insufficient.
University or even external grant programs fostering “interdisciplinarity” for its own sake become mostly useless because of the laws of Goodhart & Campbell. They’re usually designed to bring disciplines together rather than to sidestep disciplinarity altogether. That goal is admirable, but the resulting system is pretty easy to game, and it often leads to awkward alliances of convenience.
Universities do a bit better in encouraging certain types of centers that, rather than being “interdisciplinary”, are focused on a specific goal, method, or topic that doesn’t align easily with the local department structure. A new pipe, to extend my earlier bad metaphor. The problems arise here because centers often lack the institutional benefits available to departments: they rely on soft money, don’t get kickback from grant overheads, don’t get money from cross-listed courses, and don’t get tenure lines. Antidisciplinary postdoc positions suffer a similar fate, allowing misfits to thrive for a year or so before having to go back on the job market to rinse & repeat.
In short, the overwhelming inertial force of academic institutions pulls towards disciplinarity despite frequent but half-assed or poorly-supported attempts to remedy the situation. Even when new disciplinary configurations break free of institutional inertia, presenting themselves as means to knowledge every bit as legitimate as traditional departments (chemistry, history, sociology, etc.), it can take decades for them to even be given the chance to fail.
It is perhaps unsurprising that the community which taught us about autopoiesis proved incapable of sustaining itself, though half a century on its influences are glaringly apparent and far-reaching across today’s research universities. I wonder, if we reconfigured the organization of colleges and departments from scratch today, whether there would be more departments of environmental studies and fewer departments of [redacted] 1.
I bring this all up to raise awareness of the difficulty facing good work with no discernible home, and to advocate for some individual action which, though it won’t change the system overnight, will hopefully make the world a bit easier for those who deserve it.
It is this: relax the reflexive disciplinary boundary drawing, and foster programs or communities which celebrate misfits. I wrote a bit about this last year in the context of history and culturomics; historians clamored to show that culturomics was bad history, but culturomics never attempted to be good history; it attempted to be good culturomics. Though I’d argue it often failed at that as well, it should have been evaluated by its own criteria, not the criteria of some related but different discipline.
Some potential ways to move forward:
If you are reviewing for a journal or grant and the piece is great, but doesn’t quite fit, and you can’t think of a better home for it, push against the editor to let it in anyway.
If you’re a journal editor or grant program officer, be more flexible with submissions which don’t fit your mold but don’t have easy homes elsewhere.
If you control funds for research grants, earmark half your money for good work that lacks a home. Not “good work that lacks a home but still looks like the humanities”, or “good work that looks like economics but happens to involve a computer scientist and a biologist”, but truly homeless work. I realize this won’t happen, but if I’m advocating, I might as well advocate big!
If you are training graduate students, hiring faculty, or evaluating tenure cases, relax the boundary-drawing urge to say “her work is fascinating, but it’s not exactly our department.”
If you have administrative and financial power at a university, commit to supporting nondisciplinary centers and agendas with the creation of tenure lines, the allocation of course & indirect funds, and some of the security offered to departments.
Ultimately, we need clever systems to foster nondisciplinary thinking which are as robust as those systems that foster cumulative research. This problem is above my paygrade. In the meantime, though, we can at least avoid the urge to equate disciplinary fitness with intellectual quality.
1. You didn’t seriously expect me to name names, did you?
The below is the transcript from my October 29 keynote presented to the Creativity and The City 1600-2000 conference in Amsterdam, titled “Punched-Card Humanities”. I survey historical approaches to quantitative history, how they relate to the nomothetic/idiographic divide, and discuss some lessons we can learn from past successes and failures. For ≈200 relevant references, see this Zotero folder.
I’m here to talk about Digital History, and what we can learn from its quantitative antecedents. If yesterday’s keynote was framing our mutual interest in the creative city, I hope mine will help frame our discussions around the bottom half of the poster; the eHumanities perspective.
Specifically, I’ve been delighted to see that, at this conference, we have a rich interplay between familiar historiographic and cultural approaches, and digital or eHumanities methods, all being brought to bear on the creative city. I want to take a moment to talk about where these two approaches meet.
Yesterday’s wonderful keynote brought up the complicated goal of using new digital methods to explore the creative city, without reducing the city to reductive indices. Are we living up to that goal? I hope a historical take on this question might help us move in this direction, that by learning from those historiographic moments when formal methods failed, we can do better this time.
Digital History is different, we’re told. “New”. Many of us know historians who used computers in the 1960s, for things like demography or cliometrics, but what we do today is a different beast.
Commenting on these early punched-card historians, in 1999, Ed Ayers wrote, quote, “the first computer revolution largely failed.” The failure, Ayers claimed, was in part due to their statistical machinery not being up to the task of representing the nuances of human experience.
We see this rhetoric of newness or novelty crop up all the time. It cropped up a lot in pioneering digital history essays by Roy Rosenzweig and Dan Cohen in the 90s and 2000s, and we even see a touch of it, though tempered, in this conference’s theme.
In yesterday’s final discussion on uncertainty, Dorit Raines reminded us that the difference between quantitative history in the 70s and today’s Digital History is that today’s approaches broaden our sources, whereas early approaches narrowed them.
To say “we’re at a unique historical moment” is something common to pretty much everyone, everywhere, forever. And it’s always a little bit true, right?
It’s true that every historical moment is unique. Unprecedented. Digital History, with its unique combination of public humanities, media-rich interests, sophisticated machinery, and quantitative approaches, is pretty novel.
But as the saying goes, history never repeats itself, but it rhymes. Each thread making up Digital History has a long past, and a lot of the arguments for or against it have been made many times before. Novelty is a convenient illusion that helps us get funding.
Not coincidentally, it’s this tension I’ll highlight today: between revolution and evolution, between breaks and continuities, and between the historians who care more about what makes a moment unique, and those who care more about what connects humanity together.
To be clear, I’m operating on two levels here: the narrative and the metanarrative. The narrative is that the history of digital history is one of continuities and fractures; the metanarrative is that this very tension between uniqueness and self-similarity is what swings the pendulum between quantitative and qualitative historians.
Now, my claim that debates over continuity and discontinuity are a primary driver of the quantitative/qualitative divide comes a bit out of left field — I know — so let me back up a few hundred years and explain.
Francis Bacon wrote that knowledge would be better understood if it were collected into orderly tables. His plea extended, of course, to historical knowledge, and inspired renewed interest in a genre already over a thousand years old: tabular chronology.
These chronologies were world histories, aligning the pasts of several regions which each reckoned the passage of time differently.
Isaac Newton inherited this tradition, and dabbled throughout his life in establishing a more accurate universal chronology, aligning Biblical history with Greek legends and Egyptian pharaohs.
Newton brought to history the same mind he brought to everything else: one of stars and calculations. Like his peers, Newton relied on historical accounts of astronomical observations to align simultaneous events across thousands of miles. Kepler and Scaliger, among others, also partook in this “scientific history”.
Where Newton departed from his contemporaries, however, was in his use of statistics for sorting out history. In the late 1500s, the average or arithmetic mean was popularized by astronomers as a way of smoothing out noisy measurements. Newton co-opted this method to help him estimate the length of royal reigns, and thus the ages of various dynasties and kingdoms.
On average, Newton figured, a king’s reign lasted 18-20 years. If the history books record 5 kings, that means the dynasty lasted between 90 and 100 years.
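Newton's arithmetic here is simple enough to sketch in a few lines. The following is only an illustration of the calculation as described above; the function name and default values are my own shorthand, not anything Newton or his critics wrote down.

```python
def dynasty_span(num_kings, reign_low=18, reign_high=20):
    """Estimate a dynasty's length in years from its number of recorded kings.

    Assumes every reign falls within the average range (18-20 years),
    which is exactly the assumption Newton's critics would later attack.
    """
    return num_kings * reign_low, num_kings * reign_high


low, high = dynasty_span(5)
print(f"{low}-{high} years")  # 5 recorded kings -> 90-100 years
```

The sketch also makes the weakness visible: the estimate knows nothing about any particular king, only about how many of them the history books record.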
Newton was among the first to apply averages to fill in chronologies, though not the first to apply them to human activities. By the late 1600s, demographic statistics of contemporary life — of births, burials and the like — were becoming common. They were ways of revealing divinely ordered regularities.
Incidentally, this is an early example of our illustrious tradition of uncritically appropriating methods from the natural sciences. See? We’ve all done it, even Newton!
Joking aside, this is an important point: statistical averages represented divine regularities. Human statistics began as a means to uncover universal truths, and they continue to be employed in that manner. More on that later, though.
Newton’s method didn’t quite pass muster, and skepticism grew rapidly on the whole prospect of mathematical history.
Criticizing Newton in 1782, for example, Samuel Musgrave argued, in part, that there are no discernible universal laws of history operating in parallel to the universal laws of nature. Nature can be mathematized; people cannot.
Not everyone agreed. Francesco Algarotti passionately argued that Newton’s calculation of average reigns, the application of math to history, was one of his greatest achievements. Even Voltaire tried Newton’s method, aligning a Chinese chronology with Western dates using average length of reigns.
Which brings us to the earlier continuity/discontinuity point: quantitative history stirs debate in part because it draws together two activities Immanuel Kant sets in opposition: the tendency to generalize, and the tendency to specify.
The tendency to generalize, later dubbed Nomothetic, often describes the sciences: extrapolating general laws from individual observations. Examples include the laws of gravity, the theory of evolution by natural selection, and so forth.
The tendency to specify, later dubbed Idiographic, describes, mostly, the humanities: understanding specific, contingent events in their own context and with awareness of subjective experiences. This could manifest as a microhistory of one parish in the French Revolution, a critical reading of Frankenstein focused on gender dynamics, and so forth.
These two approaches aren’t mutually exclusive, and they frequently come in contact around scholarship of the past. Paleontologists, for example, apply general laws of biology and geology to tell the specific story of prehistoric life on Earth. Astronomers, similarly, combine natural laws and specific observations to trace the origins of our universe.
Historians have, with cyclically recurring intensity, engaged in similar efforts. One recent nomothetic example is that of cliodynamics: the practitioners use data and simulations to discern generalities such as why nations fail or what causes war. Recent idiographic historians associate more with the cultural and theoretical turns in historiography, often focusing on microhistories or the subjective experiences of historical actors.
Both tend to meet around quantitative history, but the conversation began well before the urge to quantify. They often fruitfully align and improve one another when working in concert; for example when the historian cites a common historical pattern in order to highlight and contextualize an event which deviates from it.
But more often, nomothetic and idiographic historians find themselves at odds. Newton extrapolated “laws” for the length of kings’ reigns, and was criticized for thinking mathematics had any place in the domain of the uniquely human. Newton’s contemporaries used human statistics to argue for divine regularities, and this was eventually criticized as encroaching on human agency, free will, and the uniqueness of subjective experience.
I’ll highlight some moments in this debate, focusing on English-speaking historians, and will conclude with what we today might learn from foibles of the quantitative historians who came before.
Let me reiterate, though, that quantitative history is not nomothetic history, but they invite each other, so I shouldn’t be ahistorical by dividing them.
Take Henry Buckle, who in 1857 tried to bridge the two-culture divide posed by C.P. Snow a century later. He wanted to use statistics to find general laws of human progress, and apply those generalizations to the histories of specific nations.
Buckle was well-aware of historiography’s place between nomothetic and idiographic cultures, writing: “it is the business of the historian to mediate between these two parties, and reconcile their hostile pretensions by showing the point at which their respective studies ought to coalesce.”
In direct response, James Froude wrote that there can be no science of history. The whole idea of Science and History being related was nonsensical, like talking about the colour of sound. They simply do not connect.
This was a small exchange in a much larger Victorian debate pitting narrative history against a growing interest in scientific history. The latter rose on the coattails of growing popular interest in science, much like our debates today align with broader discussions around data science, computation, and the visible economic successes of startup culture.
This is, by the way, contemporaneous with something yesterday’s keynote highlighted: the 19th century drive to establish ‘urban laws’.
By now, we begin seeing historians leveraging public trust in scientific methods as a means for political control and pushing agendas. This happens in concert with the rise of punched cards and, eventually, computational history. Perhaps the best example of this historical moment comes from the American Census in the late 19th century.
Briefly, a group of 19th century American historians, journalists, and census chiefs used statistics, historical atlases, and the machinery of the census bureau to publicly argue for the disintegration of the U.S. Western Frontier in the late 19th century.
These moves were, in part, made to consolidate power in the American West and wrest control from the native populations who still lived there. They accomplished this, in part, by publishing popular atlases showing that the western frontier was so fractured that it was difficult to maintain and defend. 1
The argument, it turns out, was pretty compelling.
Part of what drove the statistical power and scientific legitimacy of these arguments was the new method, in 1890, of entering census data on punched cards and processing them in tabulating machines. The mechanism itself was wildly successful, and the inventor’s company wound up merging with a few others to become IBM. As was true of punched-card humanities projects through the time of Father Roberto Busa, this work was largely driven by women.
It’s worth pausing to remember that the history of punch card computing is also a history of the consolidation of government power. Seeing like a computer was, for decades, seeing like a state. And how we see influences what we see, what we care about, how we think.
Recall the Ed Ayers quote I mentioned at the beginning of this talk. He said the statistical machinery of early quantitative historians could not represent the nuance of historical experience. That doesn’t just mean the math they used; it means the actual machinery involved.
See, one of the truly groundbreaking punch card technologies at the turn of the century was the card sorter. Each card could represent a person, or household, or whatever else, which is sort of legible one-at-a-time, but unmanageable in giant stacks.
Now, this is still well before “computers”, but machines were being developed which could sort these cards into one of twelve pockets based on which holes were punched. So, for example, if you had cards punched for people’s age, you could sort the stacks into 10 different pockets to break them up by age groups: 0-9, 10-19, 20-29, and so forth.
This turned out to be amazing for eyeball estimates. If your 20-29 pocket was twice as full as your 10-19 pocket after all the cards were sorted, you had a pretty good idea of the age distribution.
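For those who think better in code than in cardboard, the sorter's job can be sketched roughly as follows. This is a minimal illustration, not a model of any real tabulating machine; the stack of ages is invented, and the function name is hypothetical.

```python
from collections import Counter


def sort_into_pockets(ages, width=10):
    """Count how many cards land in each age-range pocket (0-9, 10-19, ...)."""
    pockets = Counter()
    for age in ages:
        start = (age // width) * width
        pockets[f"{start}-{start + width - 1}"] += 1
    return pockets


# Hypothetical stack of cards, one punched age per card (invented data).
cards = [12, 15, 21, 23, 24, 28, 34]
pockets = sort_into_pockets(cards)
print(pockets["20-29"], pockets["10-19"])  # 4 2
```

Comparing the two counts is the code equivalent of eyeballing which pocket is fuller: here the 20-29 pocket holds twice as many cards as the 10-19 pocket.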
Over the next 50 years, this convenience would shape the social sciences. Consider demographics or marketing. Both developed in the shadow of punch cards, and both relied heavily on what’s called “segmentation”, the breaking of society into discrete categories based on easily punched attributes. Age ranges, racial background, etc. These would be used to, among other things, determine who was interested in what products.
They’d eventually use statistics on these segments to inform marketing strategies.
But, if you look at the statistical tests that already existed at the time, these segmentations weren’t always the best way to break up the data. For example, age flows smoothly between 0 and 100; you could easily contrive a statistical test to show that, as a person ages, she’s more likely to buy one product over another, over a set of smooth functions.
That’s not how it worked though. Age was, and often still is, chunked up into ten or so distinct ranges, and those segments were each analyzed individually, as though they were as distinct from one another as dogs and cats. That is, 0-9 is as related to 10-19 as it is to 80-89.
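One way to see what is lost is to compare how "far apart" three people are under each representation. The sketch below is only an illustration under my own assumptions: it encodes each segment as an unordered category (a one-hot vector, a modern convention rather than anything the punched-card analysts used) and contrasts that with the plain difference in years.

```python
def segment(age, width=10):
    """Index of the ten-year segment an age falls into (0 for 0-9, 1 for 10-19, ...)."""
    return age // width


def one_hot(seg, n_segments=10):
    """Encode a segment as an unordered category: one slot per segment."""
    return [1 if i == seg else 0 for i in range(n_segments)]


def hamming(a, b):
    """Number of positions at which two encodings differ."""
    return sum(x != y for x, y in zip(a, b))


child, teen, elder = 5, 15, 85  # invented example ages

# As categories, every segment is equally distant from every other segment.
print(hamming(one_hot(segment(child)), one_hot(segment(teen))))   # 2
print(hamming(one_hot(segment(child)), one_hot(segment(elder))))  # 2

# As a continuous quantity, the ordering and spacing of ages survive.
print(abs(child - teen), abs(child - elder))  # 10 80
```

Once age is treated as a set of pockets, the fact that a five-year-old is closer to a fifteen-year-old than to an eighty-five-year-old simply disappears from the analysis.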
What we see here is the deep influence of technological affordances on scholarly practice, and it’s an issue we still face today, though in different form.
As historians began using punch cards and social statistics, they inherited, or appropriated, a structure developed for bureaucratic government processing, and were rightly soon criticized for its dehumanizing qualities.
Unsurprisingly, given this backdrop, historians in the first few decades of the 20th century often shied away from or rejected quantification.
The next wave of quantitative historians, who reached their height in the 1930s, approached the problem with more subtlety than the previous generations in the 1890s and 1860s.
Charles Beard’s famous Economic Interpretation of the Constitution of the United States used economic and demographic stats to argue that the US Constitution was economically motivated. Beard, however, did grasp the fundamental idiographic critique of quantitative history, claiming that history was, quote:
“beyond the reach of mathematics — which cannot assign meaningful values to the imponderables, immeasurables, and contingencies of history.”
The other frequent critique of quantitative history, still heard, is that it uncritically appropriates methods from stats and the sciences.
This also wasn’t entirely true. The slide behind me shows famed statistician Karl Pearson’s attempt to replicate the math of Isaac Newton that we saw earlier using more sophisticated techniques.
By the 1940s, Americans with graduate training in statistics like Ernest Rubin were actively engaging historians in their own journals, discussing how to carefully apply statistics to historical research.
On the other side of the channel, the French Annales historians were advocating longue durée history; a move away from biographies to prosopographies, from events to structures. In its own way, this was another historiography teetering on the edge between the nomothetic and idiographic, an approach that sought to uncover the rhymes of history.
Interest in quantitative approaches surged again in the late 1950s, led by a new wave of Annales historians like Fernand Braudel and American quantitative manifestos like those by Benson, Conrad, and Meyer.
William Aydelotte went so far as to point out that all historians implicitly quantify, when they use words like “many”, “average”, “representative”, or “growing” – and the question wasn’t can there be quantitative history, but when should formal quantitative methods be utilized?
By 1968, George Murphy, seeing the swell of interest, asked a very familiar question: why now? He asked why the 1960s were different from the 1860s or 1930s, why were they, in that historical moment, able to finally do it right? His answer was that it wasn’t just the new technologies, the huge datasets, the innovative methods: it was the zeitgeist. The 1960s was the right era for computational history, because it was the era of computation.
By the early 70s, there was a historian using a computer in every major history department. Quantitative history had finally grown into itself.
Of course, in retrospect, Murphy was wrong. Once the pendulum swung too far towards scientific history, theoretical objections began pushing it the other way.
In The Poverty of Historicism, Popper rejected scientific history, but mostly as a means to reject historicism outright. Popper’s arguments represent an attack from outside the historiographic tradition, but one that eventually had significant purchase even among historians, as an indication of the failure of nomothetic approaches to culture. It is, to an extent, a return to Musgrave’s critique of Isaac Newton.
At the same time, we see growing criticism from historians themselves. Arthur Schlesinger famously wrote that “important questions are important precisely because they are not susceptible to quantitative answers.”
There was a converging consensus among English-speaking historians, as in the early 20th century, that quantification erased the essence of the humanities, that it smoothed over the very inequalities and historical contingencies we needed to highlight.
Jacques Barzun summed it up well, if scathingly, saying history ought to free us from the bonds of the machine, not feed us into it.
The skeptics prevailed, and the pendulum swung the other way. The post-structural, cultural, and literary-critical turns in historiography pivoted away from quantification and computation. The final nail was probably Fogel and Engerman’s 1974 Time on the Cross, which reduced American slavery to economic figures, and didn’t exactly treat the subject with nuance and care.
The cliometricians, demographers, and quantitative historians didn’t disappear after the cultural turn, but their numbers shrank, and they tended to find themselves in social science departments, or fled here to Europe, where social and economic historians were faring better.
Which brings us, 40 years on, to the middle of a new wave of quantitative or “formal method” history. Ed Ayers, like George Murphy before him, wrote, essentially, this time it’s different.
And he’s right, to a point. Many here today draw their roots not to the cliometricians, but to the very cultural historians who rejected quantification in the first place. Ours is a digital history steeped in the values of the cultural turn, that respects social justice and seeks to use our approaches to shine a light on the underrepresented and the historically contingent.
But that doesn’t stop a new wave of critiques that, if not repeating old arguments, certainly rhyme with them. Take Johanna Drucker’s recent call to rebrand data as capta, because when we treat observations objectively, as if they were the same as the phenomena observed, we collapse the critical distance between the world and our interpretation of it. And interpretation, Drucker contends, is the foundation on which humanistic knowledge is based.
Which is all to say, every swing of the pendulum between idiographic and nomothetic history was situated in its own historical moment. It’s not a clock’s pendulum, but Foucault’s pendulum, with each swing’s apex ending up slightly off from the last. The issues of chronology and astronomy are different from those of eugenics and manifest destiny, which are themselves different from the capitalist and dehumanizing tendencies of 1950s mainframes.
But they all rhyme. Quantitative history has failed many times, for many reasons, but there are a few threads that bind them which we can learn from — or, at least, a few recurring mistakes we can recognize in ourselves and try to avoid going forward.
We won’t, I suspect, stop the pendulum’s inevitable about-face, but at least we can continue our work with caution, respect, and care.
The lesson I’d like to highlight may be summed up in one question, asked by Humpty Dumpty to Alice: which is to be master?
Over several hundred years of quantitative history, the advice of proponents and critics alike tends to align with this question. Indeed in 1956, R.G. Collingwood wrote specifically “statistical research is for the historian a good servant but a bad master,” referring to the fact that statistical historical patterns mean nothing without historical context.
Schlesinger, the guy I mentioned earlier who said historical questions are interesting precisely because they can’t be quantified, later acknowledged that while quantitative methods can be useful, they’ll lead historians astray. Instead of tackling good questions, he said, historians will tackle easily quantifiable ones, and Schlesinger was uncomfortable with the tail wagging the dog.
I’ve found many ways in which historians have accidentally given over agency to their methods and machines over the years, but these five, I think, are the most relevant to our current moment.
Unfortunately, since we’re running out of time, you’ll just have to trust me that these are historically recurring.
Number 1 is the uncareful appropriation of statistical methods for historical uses. It controls us precisely because it offers us a black box whose output we don’t truly understand.
A common example I see these days is in network visualizations. People visualize nodes and edges using what are called force-directed layouts in Gephi, but they don’t exactly understand what those layouts mean. As these layouts were designed, physical proximity of nodes is not meant to represent relatedness, yet I’ve seen historians interpret two neighboring nodes as being related because of their visual adjacency.
This is bad. It’s false. But because we don’t quite understand what’s happening, we get lured by the black box into nonsensical interpretations.
The second way methods drive us is in our reliance on methodological imports. That is, we take the time to open the black box, but we only use methods that we learn from statisticians or scientists. Even when we fully understand the methods we import, if we’re bound to other people’s analytic machinery, we’re bound to their questions and biases.
Take the example I mentioned earlier, with demographic segmentation, punch card sorters, and their influence on social scientific statistics. The very mechanical affordances of early computers influenced the sort of questions people asked for decades: how do discrete groups of people react to the world in different ways, and how do they compare with one another?
The next thing to watch out for is naive scientism. Even if you know the assumptions of your methods, and you develop your own techniques for the problem at hand, you still can fall into the positivist trap that Johanna Drucker warns us about — collapsing the distance between what we observe and some underlying “truth”.
This is especially difficult when we’re dealing with “big data”. Once you’re working with so much material you couldn’t hope to read it all, it’s easy to be lured into forgetting the distance between operationalizations and what you actually intend to measure.
For instance, if I’m finding friendships in Early Modern Europe by looking for particular words being written in correspondences, I will completely miss the existence of friends who were neighbors, and thus had no reason to write letters for us to eventually read.
A fourth way we can be misled by quantitative methods is the ease with which they lend an air of false precision or false certainty.
This is the problem Matthew Lincoln and the other panelists brought up yesterday, where missing or uncertain data, once quantified, falsely appears precise enough to make comparisons.
I see this mistake crop up in early and recent quantitative histories alike; we measure, say, the changing rate of transnational shipments over time, and notice a positive trend. The problem is that the positive difference is quite small, easily attributable to error, but because numbers always look precise, it still feels like we’re being more precise than we would be with a qualitative assessment, even when that feeling is unwarranted.
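A back-of-the-envelope check can guard against this. The sketch below is a rough illustration with invented figures: it assumes each estimate comes with an uncertainty we can put a number on, and that the two errors are independent so they can be combined in quadrature. Neither assumption will hold for every historical dataset, and the function name is hypothetical.

```python
import math


def difference_is_distinguishable(a, a_err, b, b_err):
    """True if |b - a| exceeds the combined uncertainty of the two estimates.

    Assumes independent errors, combined in quadrature (an assumption,
    not a universal rule for historical data).
    """
    combined = math.sqrt(a_err ** 2 + b_err ** 2)
    return abs(b - a) > combined


# Hypothetical shipment counts and uncertainties (invented for illustration).
early, early_err = 1000, 80
late, late_err = 1040, 80

print(late - early)  # 40, which looks like a positive trend
print(difference_is_distinguishable(early, early_err, late, late_err))  # False
```

The increase of 40 is real arithmetic, but it sits well inside the combined uncertainty of roughly 113, so the numbers alone cannot support a claim that shipments grew.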
The last thing to watch out for, and maybe the most worrisome, are the blinders quantitative analysis places on historians who don’t engage in other historiographic methods. This has been the downfall of many waves of quantitative history in the past; the inability to care about or even see that which can’t be counted.
This was, in part, what led Time on the Cross to become the excuse to drive historians from cliometrics. The indicators of slavery that were measurable were sufficient to show it to have some semblance of economic success for black populations; but it was precisely those aspects of slavery they could not measure that were the most historically important.
So how do we regain mastery in light of these obstacles?
1. Uncareful Appropriation – Collaboration
Regarding the uncareful appropriation of methods, we can easily sidestep the issue of accidentally misusing a method by collaborating with someone who knows how the method works. This may require a translator; statisticians can as easily misunderstand historical problems as historians can misunderstand statistics.
Historians and statisticians can fruitfully collaborate, though, if they have someone in the middle trained to some extent in both — even if they’re not themselves experts. For what it’s worth, Dutch institutions seem to be ahead of the game in this respect, which is something that should be fostered.
2. Reliance on Imports – Statistical Training
Getting away from reliance on disciplinary imports may take some more work, because we ourselves must learn the approaches well enough to augment them, or create our own. Right now in DH this is often handled by summer institutes and workshop series, but I’d argue those are not sufficient here. We need to make room in our curricula for actual methods courses, or even degrees focused on methodology, in the same fashion as social scientists, if we want to start a robust practice of developing appropriate tools for our own research.
3. Naive Scientism – Humanities History
The spectre of naive scientism, I think, is one we need to be careful of, but we are also already well-equipped to deal with it. If we want to combat the uncareful use of proxies in digital history, we need only to teach the history of the humanities; why the cultural turn happened, what’s gone wrong with positivistic approaches to history in the past, etc.
Incidentally, I think this is something digital historians already guard well against, but it’s still worth keeping in mind and making sure we teach it. Particularly, digital historians need to remain aware of parallel approaches from the past, rather than tracing their background only to the textual work of people like Roberto Busa in Italy.
4. False Precision – Triangulation
False precision and false certainty have some shallow fixes, and some deep ones. In the short term, we need to be better about understanding things like confidence intervals and error bars, and use methods like those Matthew Lincoln highlighted yesterday.
In the long term, though, digital history would do well to adopt triangulation strategies to help mitigate against these issues. That means trying to reach the same conclusion using multiple different methods in parallel, and seeing if they all agree. If they do, you can be more certain your results are something you can trust, and not just an accident of the method you happened to use.
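As a minimal sketch of what triangulation can look like in practice, the code below asks the same question (is this series going up or down?) three different ways and reports whether the answers agree. The three estimators and the function names are my own choices for illustration, not a standard recipe; any set of genuinely different methods would serve.

```python
from itertools import combinations
from statistics import mean, median


def ols_slope(xs, ys):
    """Least-squares slope: sensitive to outliers."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def theil_sen_slope(xs, ys):
    """Median of all pairwise slopes: robust to a few extreme points."""
    return median(
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    )


def half_difference(xs, ys):
    """Mean of the later half minus mean of the earlier half (needs at least 2 points)."""
    ordered = [y for _, y in sorted(zip(xs, ys))]
    mid = len(ordered) // 2
    return mean(ordered[-mid:]) - mean(ordered[:mid])


def sign(value):
    """Return 1, 0, or -1 according to the direction of the value."""
    return (value > 0) - (value < 0)


def triangulate(xs, ys):
    """Return each method's direction and whether all methods agree."""
    directions = {
        "ols": sign(ols_slope(xs, ys)),
        "theil_sen": sign(theil_sen_slope(xs, ys)),
        "halves": sign(half_difference(xs, ys)),
    }
    agree = len(set(directions.values())) == 1
    return directions, agree


# A series whose apparent rise comes entirely from one final value.
directions, agree = triangulate([0, 1, 2, 3, 4, 5], [0, 0, 0, 0, 0, 10])
print(directions, agree)
```

In the example series, the least-squares slope and the half comparison both report a rise, while the median-of-slopes estimator reports no trend at all. That disagreement is the useful signal: the "trend" rests on a single observation, which is exactly the kind of accident of method triangulation is meant to expose.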
5. Quantitative Blinders – Rejecting Digital History
Avoiding quantitative blinders – that is, the tendency to only care about what’s easily countable – is an easy fix, but I’m afraid to say it, because it might put me out of a job. We can’t call what we do digital history, or quantitative history, or cliometrics, or whatever else. We are, simply, historians.
Some of us use more quantitative methods, and some don’t, but if we’re not ultimately contributing to the same body of work, both sides will do themselves a disservice by not bringing every approach to bear in the wide range of interests historians ought to pursue.
Qualitative and idiographic historians will be stuck unable to deal with the deluge of material that can paint us a broader picture of history, and quantitative or nomothetic historians will lose sight of the very human irregularities that make history worth studying in the first place. We must work together.
If we don’t come together, we’re destined to remain punched-card humanists – that is, we will always be constrained and led by our methods, not by history.
Of course, this divide is a false one. There are no purely quantitative or purely qualitative studies; close-reading historians will continue to say things like “representative” or “increasing”, and digital historians won’t start publishing graphs with no interpretation.
Still, silos exist, and some of us have trouble leaving the comfort of our digital humanities conferences or our “traditional” history conferences.
That’s why this conference, I think, is so refreshing. It offers a great mix of both worlds, and I’m privileged and thankful to have been able to attend. While there are a lot of lessons we can still learn from those before us, from my vantage point, I think we’re on the right track, and I look forward to seeing more of those fruitful combinations over the course of today.
This account is influenced by some talks by Ben Schmidt. Any mistakes are from my own faulty memory, and not from his careful arguments.
If you claim computational approaches to history (“digital history”) let historians ask new types of questions, or that they offer new historical approaches to answering or exploring old questions, you are wrong. You’re not actually wrong, but you are institutionally wrong, which is maybe worse.
This is a problem, because rhetoric from practitioners (including me) is that we can bring something “new” to the table, and when we don’t, we’re called out for not doing so. The exchange might (but probably won’t) go like this:
Digital Historian: And this graph explains how velociraptors were of utmost importance to Victorian sensibilities.
Historian in Audience: But how is this telling us anything we haven’t already heard before? Didn’t John Hammond already make the same claim?
DH: That’s true, he did. One thing the graph shows, though, is that velociraptors in general tend to play much less important roles across hundreds of years, which lends support to the Victorian thesis.
HiA: Yes, but the generalized argument doesn’t account for cultural differences across those times, so doesn’t meaningfully contribute to this (or any other) historical conversation.
History (like any discipline) is made of people, and those people have Ideas about what does or doesn’t count as history (well, historiography, but that’s a long word so let’s ignore it). If you ask a new type of question or use a new approach, that new thing probably doesn’t fit historians’ Ideas about proper history.
The age of peak celebrity has been consistent over time: about 75 years after birth. But the other parameters have been changing. Fame comes sooner and rises faster. Between the early 19th century and the mid-20th century, the age of initial celebrity declined from 43 to 29 years, and the doubling time fell from 8.1 to 3.3 years.
Historians saw those claims and asked “so what”? It’s not interesting or relevant according to the things historians usually consider interesting or relevant, and it’s problematic in ways historians find things problematic. For example, it ignores cultural differences, does not speak to actual human experiences, and has nothing of use to say about a particular historical moment.
It’s true. Culturomics-style questions do not fit well within a humanities paradigm (incommensurable, anyone?). By the standard measuring stick of what makes a good history project, culturomics does not measure up. A new type of question requires a new measuring stick; in this case, I think a good one for culturomics-style approaches is the extent to which they bridge individual experiences with large-scale social phenomena, or how well they are able to reconcile statistical social regularities with free or contingent choice.
The point, though, is a culturomics presentation would fit few of the boxes expected at a history conference, and so would be considered a failure. Rightly so, too—it’s a bad history presentation. But what culturomics is successfully doing is asking new types of questions, whether or not historians find them legitimate or interesting. Is it good culturomics?
To put too fine a point on it, since history is often a question-driven discipline, new types of questions that are too different from previous types are no longer legitimately within the discipline of history, even if they are intrinsically about human history and do not fit in any other discipline.
What’s more, new types of questions may appear simplistic by historians’ standards, because they fail at fulfilling even the most basic criteria usually measuring historical worth. It’s worth keeping in mind that, to most of the rest of the world, our historical work often fails at meeting their criteria for worth.
New approaches to old questions share a similar fate, but for different reasons. That is, if they are novel, they are not interesting, and if they are interesting, they are not novel.
Traditional historical questions are, let’s face it, not particularly new. Tautologically. Some old questions in my field are: what role did now-silent voices play in constructing knowledge-making instruments in 17th century astronomy? How did scholarship become institutionalized in the 18th century? Why was Isaac Newton so annoying?
My own research is an attempt to provide a broader view of those topics (at least, the first two) using computational means. Since my topical interest has a rich tradition among historians, it’s unlikely any of my historically-focused claims (for example, that scholarly institutions were built to replace the really complicated and precarious role people played in coordinating social networks) will be without precedent.
After decades, or even centuries, of historical work in this area, there will always be examples of historians already having made my claims. My contribution is the bolstering of a particular viewpoint, the expansion of its applicability, the reframing of a discussion. Ultimately, maybe, I convince the world that certain social network conditions play an important role in allowing scholarly activity to be much more successful at its intended goals. My contribution is not, however, a claim that is wholly without precedent.
But this is a problem, since DH rhetoric, even by practitioners, can understandably lead people to expect such novelty. Historians in particular are very good at fitting old patterns to new evidence. It’s what we’re trained to do.
Any historical claim (to an acceptable question within the historical paradigm) can easily be countered with “but we already knew that”. Either the question’s been around long enough that every plausible claim has been covered, or the new evidence or theory is similar enough to something pre-existing that it can be taken as precedent.
The most masterful recent discussion of this topic was Matthew Lincoln’s Confabulation in the humanities, where he shows how easy it is to make up evidence and get historians to agree that they already knew it was true.
To put too fine a point on it, new approaches to old historical questions are destined to produce results which conform to old approaches; or if they don’t, it’s easy enough to stretch the old & new theories together until they fit. New approaches to old questions will fail at producing completely surprising results; this is a bad standard for historical projects. If a novel methodology were to create truly unrecognizable results, it is unlikely those results would be recognized as “good history” within the current paradigm. That is, historians would struggle to care.
What Is This Beast?
What is this beast we call digital history? Boundary-drawing is a tried-and-true tradition in the humanities, digital or otherwise. It’s theoretically kind of stupid but practically incredibly important, since funding decisions, tenure cases, and similar career-altering forces are at play. If digital history is a type of history, it’s fundable as such, tenurable as such; if it isn’t, it ain’t. What’s more, if what culturomics researchers are doing is also history, their already-well-funded machine can start taking slices of the sad NEH pie.
So “what counts?” is unfortunately important to answer.
This discussion around what is “legitimate history research” is really important, but I’d like to table it for now, because it’s so often conflated with the discussion of what is “legitimate research” sans history. The former question easily overshadows the latter, since academics are mostly just schlubs trying to make a living.
For the last century or so, history and philosophy of science have been smooshed together in departments and conferences. It’s caused a lot of concern. Does history of science need philosophy of science? Does philosophy of science need history of science? What does it mean to combine the two? Is what comes out of the middle even useful?
Weirdly, the question sometimes comes down to “does history and philosophy of science even exist?”. It’s weird because people identify with that combined title, so I published a citation analysis in Erkenntnis a few years back that basically showed that, indeed, there is an area between the two communities, and indeed those people describe themselves as doing HPS, whatever that means to them.
I bring this up because digital history, as many of us practice it, leaves us floating somewhere between public engagement, social science, and history. Culturomics occupies a similar interstitial space, though inching closer to social physics and complex systems.
From this vantage point, we have a couple of options. We can say digital history is just history from a slightly different angle, and try to be evaluated by standard historical measuring sticks—which would make our work easily criticized as not particularly novel. Or we can say digital history is something new, occupying that in-between space—which could render the work unrecognizable to our usual communities.
The either/or proposition is, of course, ludicrous. The best work being done now skirts the line, offering something just novel enough to be surprising, but not so out of traditional historical bounds as to be grouped with culturomics. But I think we need to be more deliberate and organized in this practice, lest we want to be like History and Philosophy of Science, still dealing with basic questions of legitimacy fifty years down the line.
In the short term, this probably means trying not just to avoid the rhetoric of newness, but to actively curtail it. In the long term, it may mean allying with like-minded historians, social scientists, statistical physicists, and complexity scientists to build a new framework of legitimacy that recognizes the forms of knowledge we produce which don’t always align with historiographic standards. As Cassidy Sugimoto and I recently wrote, this often comes with journals, societies, and disciplinary realignment.
The least we can do is steer away from a novelty rhetoric, since what is novel often isn’t history, and what is history often isn’t novel.
Here’s a way of thinking that might get us past this muddle (and I think I agree with the authors that the hype around DH is a mistake): let’s stop branding our scholarship. We don’t need Next Big Things and we don’t need Academic Superstars, whether they are DH Superstars or Theory Superstars. What we do need is to find more democratic and inclusive ways of thinking about the value of scholarship and scholarly communities.
This is relevant here, and good, but tough to reconcile with the earlier post. In an ideal world, without disciplinary brandings, we can all try to be welcoming of works on their own merits, without relying on our preconceived disciplinary criteria. In the present condition, though, it’s tough to see such an environment forming. In that context, maybe a unified digital history “brand” is the best way to stay afloat. This would build barriers against whatever new thing comes along next, though, so it’s a tough question.
Nickoal Eichmann (corresponding author), Jeana Jorgensen, Scott B. Weingart 1
NOTE: This is a pre-peer reviewed draft submitted for publication in Feminist Debates in Digital Humanities, eds. Jacque Wernimont and Elizabeth Losh, University of Minnesota Press (2017). Comments are welcome, and a downloadable dataset / more figures are forthcoming. This chapter will be released alongside another on the history of DH conferences, co-authored by Weingart & Eichmann (forthcoming), which will go into further detail on technical aspects of this study, including the data collection & statistics. Many of the materials first appeared on this blog. To cite this preprint, use the figshare DOI: https://dx.doi.org/10.6084/m9.figshare.3120610.v1
Digital Humanities (DH) is said to have a light side and a dark side. Niceness, globality, openness, and inclusivity sit at one side of this binary caricature; commodification, neoliberalism, techno-utopianism, and white male privilege sit at the other. At times, the plurality of DH embodies both descriptions.
We hope a diverse and critical DH is a goal shared by all. While DH, like the humanities writ large, is not a monolith, steps may be taken to improve its public face and shared values through positively influencing its communities. The Alliance of Digital Humanities Organizations’ (ADHO’s) annual conference hosts perhaps the largest such community. As an umbrella organization of six international digital humanities constituent organizations, as well as 200 DH centers in a few dozen countries, ADHO and its conference ought to represent the geographic, disciplinary, and demographic diversity of those who identify as digital humanists.
The annual conference offers insight into how the world sees DH. While it may not represent the plurality of views held by self-described digital humanists, the conference likely influences the values of its constituents. If the conference glorifies Open Access, that value will be taken up by its regular attendees; if the conference fails to prioritize diversity, this too will be reinforced.
This chapter explores fifteen years of DH conferences, presenting a quantified look at the values implicitly embedded in the event. Women are consistently underrepresented, in spite of the fact that the most prominent figures at the conference are as likely women as men. The geographic representation of authors has become more diverse over time—though authors with non-English names are still significantly less likely to pass peer review. The topical landscape is heavily gendered, suggesting a masculine bias may be built into the value system of the conference itself. Without data on skin color or ethnicity, we are unable to address racial or related diversity and bias here.
There have been some improvements over time and, especially recently, a growing awareness of diversity-related issues. While many of the conference’s negative traits are simply reflections of larger entrenched academic biases, this is no comfort when self-reinforcing biases foster a culture of microaggression and white male privilege. Rather than using this study as an excuse to write off DH as just another biased community, we offer statistics, critiques, and suggestions as a vehicle to improve ADHO’s conference, and through it the rest of self-identified Digital Humanities.
Digital humanities (DH), we are told, exists under a “big tent”, with porous borders, little gatekeeping, and, heck, everyone’s just plain “nice”. Indeed, the term itself is not used definitionally, but merely as a “tactical convenience” to get stuff done without worrying so much about traditional disciplinary barriers. DH is “global”, “public”, and diversely populated. It will “save the humanities” from its crippling self-reflection (cf. this essay), while simultaneously saving the computational social sciences from their uncritical approaches to data. DH contains its own mirror: it is both humanities done digitally, and the digital as scrutinized humanistically. As opposed to the staid, backwards-looking humanities we are used to, the digital humanities “experiments”, “plays”, and even “embraces failure” on ideological grounds. In short, we are the hero Gotham needs.
Digital Humanities, we are told, is a narrowly-defined excuse to push a “neoliberal agenda”, a group of “bullies” more interested in forcing humanists to code than in speaking truth to power. It is devoid of cultural criticism, and because of the way DHers uncritically adopt tools and methods from the tech industry, they in fact often reinforce pre-existing power structures. DH is nothing less than an unintentionally rightist vehicle for techno-utopianism, drawing from the same font as MOOCs and complicit in their devaluing of education, diversity, and academic labor. It is equally complicit in furthering both the surveillance state and the surveillance economy, exemplified in its stunning lack of response to the Snowden leaks. As a progeny of the computer sciences, digital humanities has inherited the same lack of gender and racial diversity, and any attempt to remedy the situation is met with incredible resistance.
The truth, as it so often does, lies somewhere in the middle of these extreme caricatures. It’s easy to ascribe attributes to Digital Humanities synecdochically, painting the whole with the same brush as one of its constituent parts. One would be forgiven, for example, for coming away from the annual international ADHO Digital Humanities conference assuming DH were a parade of white men quantifying literary text. An attendee of HASTAC, on the other hand, might leave seeing DH as a diverse community focused on pedagogy, but lacking in primary research. Similar straw-snapshots may be drawn from specific journals, subcommunities, regions, or organizations.
But these synecdoches have power. Our public face sets the course of DH, via who it entices to engage with us, how it informs policy agendas and funding allocations, and who gets inspired to be the next generation of digital humanists. Especially important is the constituency and presentation of the annual Digital Humanities conference. Every year, several hundred students, librarians, staff, faculty, industry professionals, administrators and researchers converge for the conference, organized by the Alliance of Digital Humanities Organizations (ADHO). As an umbrella organization of six international digital humanities constituent organizations, as well as 200 DH centers in a few dozen countries, ADHO and its conference ought to represent the geographic, disciplinary, and demographic diversity of those who identify as digital humanists. And as DH is a community that prides itself on its activism and its social/public goals, if the annual DH conference does not celebrate this diversity, the DH community may suffer a crisis of identity (…okay, a bigger crisis of identity).
So what does the DH conference look like, to an outsider? Is it diverse? What topics are covered? Where is it held? Who is participating, who is attending, and where are they coming from? This essay offers incomplete answers to these questions for fifteen years of DH conferences (2000-2015), focusing particularly on DH2013 (Nebraska, USA), DH2014 (Lausanne, Switzerland), and DH2015 (Sydney, Australia). 2 We do so with a double-agenda: (1) to call out the biases and lack of diversity at ADHO conferences in the earnest hope it will help improve future years’ conferences, and (2) to show that simplistic, reductive quantitative methods can be applied critically, and need not feed into techno-utopic fantasies or an unwavering acceptance of proxies as a direct line to Truth. By “distant reading” DH and turning our “macroscopes” on ourselves, we offer a critique of our culture, and hopefully inspire fruitful discomfort in DH practitioners who apply often-dehumanizing tools to their subjects, but have not themselves fallen under the same distant gaze.
Among other findings, we observe a large gender gap for authorship that is not mirrored among those who simply attend the conference. We also show a heavily gendered topical landscape, which likely contributes to topical biases during peer review. Geographic diversity has improved over fifteen years, suggesting ADHO’s strategy to expand beyond the customary North American / European rotation was a success. That said, there continues to be a visible bias against non-English names in the peer review process. We could not get data on ethnicity, race, or skin color, but given our regional and name data, as well as personal experience, we suspect in this area, diversity remains quite low.
We do notice some improvement over time and, especially in the last few years, a growing awareness of our own diversity problems. The #whatifDH2016 3 hashtag, for example, was a reaction to an all-male series of speakers introducing DH2015 in Sydney. The hashtag caught on and made it to ADHO’s committee on conferences, who will use it in planning future events. Our remarks here are in the spirit of #whatifDH2016; rather than using this study as an excuse to defame digital humanities, we hope it becomes a vehicle to improve ADHO’s conference, and through it the rest of our community.
Social Justice and Equality in the Digital Humanities
Diversity in the Academy
In order to contextualize gender and ethnicity in the DH community, we must take into account developments throughout higher education. This is especially important since much of DH work is done in university and other Ivory Tower settings. Clear progress has been made from the times when all-male, all-white colleges were the norm, but there are still concerns about the marginalization of scholars who are not white, male, able-bodied, heterosexual, or native English-speakers. Many campuses now have diversity offices and have set diversity-related goals at both the faculty and student levels (for example, see the Ohio State University’s diversity objectives and strategies 2007-12). On the digital front, blogs such as Conditionally Accepted, Fight the Tower, University of Venus, and more all work to expose the normative biases in academia through activist dialogue.
From both a historical and contemporary lens, there is data supporting the clustering of women and other minority scholars in certain realms of academia, from specific fields and subjects to contingent positions. When it comes to gender, the phrase “feminization” has been applied both to academia in general and to specific fields. It contains two important connotations: that of an area in which women are in the majority, and the sense of a change over time, such that numbers of women participants are increasing in relation to men (Leathwood and Read 2008, 10). It can also signal a less quantitative shift in values, “whereby ‘feminine’ values, concerns, and practices are seen to be changing the culture of an organization, a field of practice or society as a whole” (ibid).
In terms of specific disciplines, the feminization of academia has taken a particular shape. Historian Lynn Hunt suggests the following propositions about feminization in the humanities and history specifically: the feminization of history parallels what is happening in the social sciences and humanities more generally; the feminization of the social sciences and humanities is likely accompanied by a decline in status and resources; and other identity categories, such as ethnic minority status and age/generation, also interact with feminization in ways that are still becoming coherent.
Feminization has clear consequences for the perception and assignation of value of a given field. Hunt writes: “There is a clear correlation between relative pay and the proportion of women in a field; those academic fields that have attracted a relatively high proportion of women pay less on average than those that have not attracted women in the same numbers.” Thus, as we examine the topics that tend to be clustered by gender in DH conference submissions, we must keep in mind the potential correlations of feminization and value, though it is beyond the scope of this paper to engage in chicken-or-egg debates about the causal relationship between misogyny and the devaluing of women’s labor and women’s topics.
There is no obvious ethnicity-based parallel to the concept of the feminization of academia; it wouldn’t be culturally intelligible to talk about the “people-of-colorization of academia”, or the “non-white-ization of academia.” At any rate, according to a U.S. Department of Education survey, in 2013 79% of all full-time faculty in degree-granting postsecondary institutions were white. The increase of non-white faculty from 2009 (19.2% of the whole) to 2013 (21.5%) is very small indeed.
Why does this matter? As Jeffrey Milem, Mitchell Chang, and Anthony Lising Antonio write in regard to faculty of color, “Having a diverse faculty ensures that students see people of color in roles of authority and as role models or mentors. Faculty of color are also more likely than other faculty to include content related to diversity in their curricula and to utilize active learning and student-centered teaching techniques…a coherent and sustained faculty diversity initiative must exist if there is to be any progress in diversifying the faculty” (25). By centering marginalized voices, scholarly institutions have the ability to send messages about who is worthy of inclusion.
Recent Criticisms of Diversity in DH
In terms of DH specifically, diversity within the community and conferences has been on the radar for several years, and has recently gained special attention, as digital humanists and other academics alike have called for critical and feminist engagement in diversity and a move away from what seems to be an exclusionary culture. In January 2011, THATCamp SoCal included a section called “Diversity in DH,” in which participants explored the lack of openness in DH and, in the end, produced a document, “Toward an Open Digital Humanities” that summarized their discussions. The “Overview” in this document mirrors the same conversation we have had for the last several years:
We recognize that a wide diversity of people is necessary to make digital humanities function. As such, digital humanities must take active strides to include all the areas of study that comprise the humanities and must strive to include participants of diverse age, generation, sex, skill, race, ethnicity, sexuality, gender, ability, nationality, culture, discipline, areas of interest. Without open participation and broad outreach, the digital humanities movement limits its capacity for critical engagement. (ibid)
This proclamation represents the critiques of the DH landscape in 2011, in which DH practitioners and participants were assumed to be privileged and white, to exclude student-learners, and to hold myopic views of what constitutes DH. Most importantly for this chapter, THATCamp SoCal’s “Diversity in DH” section participants called for critical approaches to, and social justice in, DH scholarship and participation, including “principles for feminist/non-exclusionary groundrules in each session (e.g., ‘step up/step back’) so that the loudest/most entitled people don’t fill all the quiet moments.” They also advocated defending the least-heard voices “so that the largest number of people can benefit…”
These voices certainly didn’t fall flat. However, since THATCamps are often comprised of geographically local DH microcommunities, they benefit from an inclusive environment but suffer as isolated events. As a result, it seems that the larger, discipline-specific venues which have greater attendance and attraction continue to amplify privileged voices. Even so, 2011 continued to represent a year that called for critical engagement in diversity in DH, with an explicit “Big Tent” theme for DH2011 held in Stanford, California. Embracing the concept of the “Big Tent” deliberately opened the doors and widened the spectrum of DH, at least in terms of methods and approaches. However, as Melissa Terras pointed out, DH was “still a very rich, very western academic field” (Terras, 2011), even with a few DH2011 presentations engaging specifically with topics of diversity in DH. 4
A focus on diversity-related issues has only grown in the interim. We’ve recently seen greater attention and criticism of DH exclusionary culture, for instance, at the 2015 Modern Language Association (MLA) annual convention, which included the roundtable discussion “Disrupting Digital Humanities.” It confronted the “gatekeeping impulse” in DH, and echoing THATCamp SoCal 2011, these panelists aimed to shut down hierarchical dialogues in DH, encourage non-traditional scholarship, amplify “marginalized voices,” advocate for DH novices, and generously support the work of peers. 5 The theme for DH2015 in Sydney, Australia was “Global Digital Humanities,” and between its successes and collective action arising from frustrations at its failures, the community seems poised to pay even greater attention to diversity. Other recent initiatives in this vein worth mention include #dhpoco, GO::DH, and Jacqueline Wernimont’s “Build a Better Panel,” 6 whose activist goals are helping diversify the community and raise awareness of areas where the community can improve.
While it would be fruitful to conduct a longitudinal historiographical analysis of diversity in DH, more recent criticisms illustrate a history of perceived exclusionary culture, which is why we hope to provide a data-driven approach to continue the conversation and call for feminist and critical engagement and intervention.
While DH as a whole has been critiqued for its lack of diversity and inclusion, how does the annual ADHO DH conference measure up? To explore this in a data-driven fashion, we have gathered publicly available annual ADHO conference programs and schedules from 2000-2015. From those conference materials, we have entered presentation and author information into a spreadsheet to analyze various trends over time, such as gender and geography as indicators of diversity. Particular information that we have collected includes: presentation title, keywords (if available), abstract and full-text (if available), presentation type, author name, author institutional affiliation and academic department (if available), and corresponding country of that affiliation at the time of the presentation(s). We normalized and hand-cleaned names, institutions, and departments, so that, to the best of our knowledge, each author entry represented a unique person and, accordingly, was assigned a unique ID. Next, we added gender information (m/f/other/unknown) to authors by a combination of hand-entry and automated inference. While this is problematic for many reasons, 7 since it does not allow for diversity in gender options and tracing gender changes over time, it does give us a useful preliminary lens to view gender diversity at DH conferences.
For 2013’s conference, ADHO instituted a series of changes aimed at improving inclusivity, diversity, and quality. This drive was steered by that year’s program committee chair, Bethany Nowviskie, alongside 2014’s chair, Melissa Terras. Their reformative goals matched our current goals in this essay, and speak to a long history of experimentation and improvement efforts on behalf of ADHO. Their changes included making the conference more welcoming to outsiders through ending policies that only insiders knew about; making the CFP less complex and easier to translate into multiple languages; taking reviewer language competencies into account systematically; and streamlining the submission and review process.
The biggest noticeable change to DH2013, however, was the institution of a reviewer bidding process and a phase of semi-open peer review. Peer reviewers were invited to read through and rank every submitted abstract according to how qualified they felt to review the abstract. Following this, the conference committee would match submissions to qualified peer reviewers, taking into account conflicts of interest. Submitting authors were invited to respond to reviews, and the committee would make a final decision based on the various reviews and rebuttals. This continues to be the process through DH2016. Changes continue to be made, most recently in 2016 with the addition of “Diversity” and “Multilinguality” as new keywords authors can append to their submissions.
While the list of submitted abstracts was private, accessible only to reviewers, as reviewers ourselves we had access to the submissions during the bidding phase. We used this access to create a dataset of conference submissions for DH2013, DH2014, and DH2015, which includes author names, affiliations, submission titles, author-selected topics, author-chosen keywords, and submission types (long paper, short paper, poster, panel).
We augmented this dataset by looking at the final conference programs in ‘13, ‘14, and ‘15, noting which submissions eventually made it onto the final conference program, and how they changed from the submission to the final product. This allows us to roughly estimate the acceptance rate of submissions, by comparing the submitted abstract lists to the final programs. It is not perfect, however, given that we don’t actually know whether submissions that didn’t make it to the final program were rejected, or if they were accepted and withdrawn. We also do not know who reviewed what, nor do we know the reviewers’ scores or any associated editorial decisions.
The original dataset, then, included fields for title, authors, author affiliations, original submission type, final accepted type, topics, keywords, and a boolean field for whether a submission made it to the final conference program. We cleaned the data up by merging duplicate people, ensuring e.g., if “Melissa Terras” was an author on two different submissions, she counted as the same person. For affiliations, we semi-automatically merged duplicate institutions, found the countries they reside in, and assigned those countries to broad UN regions. We also added data to the set, first automatically guessing a gender for each author, and then correcting the guesses by hand.
Given that abstracts were submitted to conferences with an expectation of privacy, we have not released the full submission dataset; we have, however, released the full dataset of final conference programs. 8
We would like to acknowledge the gross and problematic simplifications involved in this process of gendering authors without their consent or input. As Miriam Posner has pointed out, with regards to Getty’s Union List of Author Names, “no self-respecting humanities scholar would ever get away with such a crude representation of gender in traditional work”. And yet, we represent authors in just this crude fashion, labeling authors as male, female, or unknown/other. We did not encode changes of author gender over time, even though we know of at least a few authors in the dataset for whom this applies. We do not use the affordances of digital data to represent the fluidity of gender. This is problematic for a number of reasons, not least of which because, when we take a cookie cutter to the world, everything in the world will wind up looking like cookies.
We made this decision because, in the end, all data quality is contingent on the task at hand. It is possible to acknowledge an ontology’s shortcomings while still occasionally using that ontology to a positive effect. This is not always the case: often poor proxies get in the way of a research agenda (e.g., citations as indicators of “impact” in digital humanities), rather than align with it. In the humanities, poor proxies are much more likely to get in the way of research than help it along, and afford the ability to make insensitive or reductivist decisions in the name of “scale”.
For example, in looking for ethnic diversity of a discipline, one might analyze last names as a proxy for country of origin, or analyze the color of recognized faces in pictures from recent conferences as a proxy for ethnic genealogy. Among other reasons, this approach falls short because ethnicity, race, and skin color are often not aligned, and last names (especially in the U.S.) are rarely indicative of anything at all. But they’re easy solutions, so people use them. These are moments when a bad proxy (and for human categories, proxies are almost universally bad) does not fruitfully contribute to a research agenda. As George E.P. Box put it, “all models are wrong, but some are useful.”
Some models are useful. Sometimes, the stars align and the easy solution is the best one for the question. If someone were researching immediate reactions of racial bias in the West, analyzing skin tone may get us something useful. In this case, the research focus is not someone’s racial identity, but someone’s race as immediately perceived by others, which would likely align with skin tone. Simply: if a person looks black, they’re more likely to be treated as such by the (white) world at large. 9
We believe our proxies, though grossly inaccurate, are useful for the questions of gender and geographic diversity and bias. The first step to improving DH conference diversity is noticing a problem; our data show that problem through staggeringly imbalanced regional and gender ratios. With regards to gender bias, showing whether reviewers are less likely to accept papers from authors who appear to be women can reveal entrenched biases, whether or not the author actually identifies as a woman. With that said, we invite future researchers to identify and expand on our admitted categorical errors, allowing everyone to see the contours of our community with even greater nuance.
The annual ADHO conference has grown significantly in the last fifteen years, as described in our companion piece 10, within which can be found a great discussion of our methods. This piece, rather than covering overall conference trends, focuses specifically on issues of diversity and acceptance rates. We cover geographic and gender diversity from 2000-2015, with additional discussions of topicality and peer review bias beginning in 2013.
Women comprise 36.1% of the 3,239 authors to DH conference presentations over the last fifteen years, counting every unique author only once. Melissa Terras’ name appears on 29 presentations between 2000-2015, and Scott B. Weingart’s name appears on 4 presentations, but for the purpose of this metric each name counts only once. Female authorship representation fluctuates between 29%-38% depending on the year.
Weighting every authorship event individually (i.e., Weingart’s name counts 4 times, Terras’ 29 times), women’s representation drops to 32.7%. This reveals that women are less likely to author multiple pieces compared to their male counterparts. More than a third of the DH authorship pool are women, but fewer than a third of the names that appear on presentations are women’s. Even fewer single-authored pieces are by a woman; only 29.8% of the 984 single-authored works between 2000-2015 are female-authored. About a third (33.4%) of first authors on presentations are women. See Fig. 1 for a breakdown of these numbers over time. Note the lack of periodicity, suggesting gender representation is not affected by whether the conference is held in Europe or North America (until 2015, the conference alternated locations every year). The overall ratio wavers, but is neither improving nor worsening over time.
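The difference between the two counting schemes above (each unique author once, versus every authorship event) can be illustrated with a minimal sketch. The record layout, the `Authorship` type, and the toy rows are hypothetical, invented for illustration; they are not drawn from our dataset.

```python
from collections import namedtuple

# Hypothetical record layout: one row per (presentation, author) pair.
Authorship = namedtuple("Authorship", ["presentation_id", "author_id", "gender"])

def share_of_women(authorships):
    """Return (unique_share, weighted_share) of authors labeled 'f'."""
    # Unique count: each author_id contributes once, however often they appear.
    gender_by_author = {a.author_id: a.gender for a in authorships}
    unique_share = sum(g == "f" for g in gender_by_author.values()) / len(gender_by_author)
    # Weighted count: every authorship event contributes once.
    weighted_share = sum(a.gender == "f" for a in authorships) / len(authorships)
    return unique_share, weighted_share

# Toy data (invented): author B appears on three presentations.
rows = [
    Authorship(1, "A", "f"),
    Authorship(1, "B", "m"),
    Authorship(2, "B", "m"),
    Authorship(3, "B", "m"),
    Authorship(3, "C", "m"),
]
unique_share, weighted_share = share_of_women(rows)
```

In this toy example one of three unique authors is a woman, but only one of five authorship events is hers, so the weighted share is lower than the unique share. That is the same direction of gap as between the 36.1% and 32.7% figures.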
The gender disparity sparked controversy at DH2015 in Sydney. It was, however, at odds with a common anecdotal awareness that many of the most respected role-models and leaders in the community are women. To explore this disconnect, we experimented with using centrality in co-authorship networks as a proxy for fame, respectability, and general presence within the DH consciousness. We assume that individuals who author many presentations, co-author with many people, and play a central role in connecting DH’s disparate communities of authorship are the ones who are most likely to garner the respect (or at least awareness) of conference attendees.
We created a network of authors connected to their co-authors from presentations between 2000-2015, with ties strengthening the more frequently two authors collaborate. Of the 3,239 authors in our dataset, 61% (1,750 individuals) are reachable by one another via their co-authorship ties. For example, Beth Plale is reachable by Alan Liu because she co-authored with J. Stephen Downie, who co-authored with Geoffrey Rockwell, who co-authored with Alan Liu. Thus, 61% of the network is connected in one large component, and there are 299 smaller components, islands of co-authorship disconnected from the larger community.
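As a rough sketch of how reachability in such a network can be measured, the following groups authors into connected components using only the standard library. It ignores tie strength and authors with no co-authors, and the final edge is a hypothetical pair added only to show a disconnected island; it is not a description of the tooling we actually used.

```python
from collections import defaultdict

def connected_components(edges):
    """Group authors into components given (author, co_author) pairs."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first traversal collects everyone reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    # Largest component first.
    return sorted(components, key=len, reverse=True)

edges = [
    ("Beth Plale", "J. Stephen Downie"),
    ("J. Stephen Downie", "Geoffrey Rockwell"),
    ("Geoffrey Rockwell", "Alan Liu"),
    ("Hypothetical Author X", "Hypothetical Author Y"),  # invented isolated island
]
components = connected_components(edges)
largest = components[0]
share_in_largest = len(largest) / sum(len(c) for c in components)
```

Here `largest` contains the four-author chain from the example above, and `share_in_largest` is the fraction of authors who sit in that main component.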
The average woman co-authors with 5 other authors, and the average man co-authors with 5.3 other authors. The median number of co-authors for both men and women is 4. The average and median of several centrality measurements (closeness, betweenness, pagerank, and eigenvector) for both men and women are nearly equivalent; that is, any given woman is just as likely to be near the co-authorship core as any given man. Naturally, this does not imply that half of the most central authors are women, since only a third of the entire authorship pool are women. It means instead that gender does not influence one’s network centrality. Or at least it should.
The statistics show a curious trend for the most central figures in the network. Of the top 10 authors who co-author with the most others, 60% are women. Of the top 20, 45% are women. Of the top 50, 38% are women. Of the top 100, 32% are women. That is, over half of the DH co-authorship stars are women, but the further towards the periphery you look, the more men occupy the middle-tier positions (i.e., not stars, but still fairly active co-authors). The same holds true for the various centrality measurements: betweenness (60% women in top 10; 40% in top 20; 32% in top 50; 34% in top 100), pagerank (50% women in top 10; 40% in top 20; 32% in top 50; 28% in top 100), and eigenvector (60% women in top 10; 40% in top 20; 40% in top 50; 34% in top 100).
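The top-k breakdown can be expressed as a short sketch: rank authors by a score (here, number of distinct co-authors) and compute the share labeled as women within the top k. The function name and the toy scores and labels are hypothetical, and ties are broken by insertion order, which a real analysis would want to handle more deliberately.

```python
def share_of_women_in_top(score_by_author, gender_by_author, k):
    """Share of authors labeled 'f' among the k highest-scoring authors."""
    ranked = sorted(score_by_author, key=score_by_author.get, reverse=True)
    top = ranked[:k]
    return sum(gender_by_author[a] == "f" for a in top) / len(top)

# Toy data (invented): number of distinct co-authors per author.
degree_by_author = {"A": 12, "B": 9, "C": 7, "D": 3, "E": 2, "F": 1}
gender_by_author = {"A": "f", "B": "f", "C": "m", "D": "m", "E": "m", "F": "f"}

top2 = share_of_women_in_top(degree_by_author, gender_by_author, 2)
top4 = share_of_women_in_top(degree_by_author, gender_by_author, 4)
```

The same function works for any centrality score (betweenness, pagerank, eigenvector) once those scores have been computed by a network library.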
In short, half or more of the DH conference stars are women, but as you creep closer to the network periphery, you are increasingly likely to notice the prevailing gender disparity. This supports the mismatch between an anecdotal sense that women play a huge role in DH, and the data showing they are poorly represented at conferences. The results also match with the fact that women are disproportionately more likely to write about management and leadership, discussed at greater length below.
The heavily-male gender skew at DH conferences may lead one to suspect a bias in the peer review process. Recent data, however, show that if such a bias exists, it is not direct. Over the past three conferences, 71% of women and 73% of men who submitted presentations passed the peer review process. The difference is not great enough to rule out random chance (p=0.16 using χ²). The skew at conferences is more a result of fewer women submitting articles than of women’s articles not getting accepted. The one caveat, explained more below, is that certain topics women are more likely to write about are also less likely to be accepted through peer-review.
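For readers unfamiliar with the test, here is a minimal sketch of a Pearson χ² test of independence on a 2×2 table (accepted or rejected, by gender), without continuity correction. The counts are hypothetical, chosen only so that the acceptance rates come out at 71% and 73%; since they are not our submission counts, the resulting p-value will not reproduce the 0.16 reported above.

```python
import math

def chi_squared_2x2(accepted_a, rejected_a, accepted_b, rejected_b):
    """Pearson chi-squared test for a 2x2 table; returns (statistic, p_value)."""
    observed = [[accepted_a, rejected_a], [accepted_b, rejected_b]]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if acceptance were independent of group.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    # With one degree of freedom, the chi-squared tail probability
    # equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: 710 of 1,000 women (71%) and 1,460 of 2,000 men (73%) accepted.
statistic, p_value = chi_squared_2x2(710, 290, 1460, 540)
```

A p-value above a conventional threshold such as 0.05 means the difference in acceptance rates is not large enough to rule out chance.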
This does not imply a lack of bias in the DH community. For example, although only 33.5% of authors at DH2015 in Sydney were women, 46% of conference attendees were women. If women were simply uninterested in DH, the split in attendance vs. authorship would not be so high.
In regard to discussions of women in different roles in the DH community – less the publishing powerhouses and more the community leaders and organizers – the concept of the “glass cliff” can be useful. Research on the feminization of academia in Sweden uses the term “glass cliff” as a “metaphor used to describe a phenomenon when women are appointed to precarious leadership roles associated with an increased risk of negative consequences when a company is performing poorly and for example is experiencing profit falls, declining stock performance, and job cuts” (Peterson 2014, 4). The female academics (who also occupied senior managerial positions) interviewed in Helen Peterson’s study expressed concerns about increasing workloads, the precarity of their positions, and the potential for interpersonal conflict.
Institutional politics may also play a role in the gendered data here. Sarah Winslow says of institutional context that “female faculty are less likely to be located at research institutions or institutions that value research over teaching, both of which are associated with greater preference for research” (779). The research, teaching, and service divide in academia remains a thorny issue, especially given the prevalence of what has been called the pink collar workforce in academia, or the disproportionate number of women working in low-paying teaching-oriented areas. This divide likely also contributed to differing gender ratios between attendees and authors at DH2015.
While the gendered implications of time allocation in universities are beyond the scope of this paper, it might be useful to note that there might be long-term consequences for how people spend their time interacting with scholarly tasks that extend beyond one specific institution. Winslow writes: “Since women bear a disproportionate responsibility for labor that is institution-specific (e.g., institutional housekeeping, mentoring individual students), their investments are less likely to be portable across institutions. This stands in stark contrast to men, whose investments in research make them more highly desirable candidates should they choose to leave their own institutions” (790). How this plays out specifically in the DH community remains to be seen, but the interdisciplinarity of DH along with its projects that span multiple working groups and institutions may unsettle some of the traditional bias that women in academia face.
Until 2015, the DH conference alternated every year between North America and Europe. As expected, until recently, the institutions represented at the conference have hailed mostly from these areas, with the primary locus falling in North America. In fact, since 2000, North American authors were the largest authorial constituency at eleven of the fifteen conferences, even though North America only hosted the conference seven times in that period.
With that said, as opposed to gender representation, national and institutional diversity is improving over time. Using an Index of Qualitative Variation (IQV), institutional variation begins around 0.992 in 2000 and ends around 0.996 in 2015, with steady increases over time. National IQV begins around 0.79 in 2010 and ends around 0.83 in 2015, also with steady increases over time. The most recent conference was the first that included over 30% of authors and attendees arriving from outside Europe or North America. Now that ADHO has implemented a three-year cycle, with every third year marked by a movement outside its usual territory, that diversity is likely to increase further still.
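For reference, one common formulation of the Index of Qualitative Variation is IQV = (K / (K − 1)) · (1 − Σ pᵢ²), where K is the number of categories and pᵢ is the proportion of cases in category i. The sketch below assumes K is the number of categories actually observed, which may differ from the convention used for the figures above; the country labels are toy data.

```python
from collections import Counter

def iqv(labels):
    """Index of Qualitative Variation: (K / (K - 1)) * (1 - sum of squared proportions)."""
    counts = Counter(labels)
    k = len(counts)  # assumption: K = number of categories observed
    if k < 2:
        return 0.0  # a single category has no variation
    n = sum(counts.values())
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return (k / (k - 1)) * (1 - sum_sq)

# Toy data (invented): author countries at two imaginary conferences.
even = ["US", "UK", "JP", "DE"]
skewed = ["US"] * 7 + ["UK", "JP", "DE"]

iqv_even = iqv(even)
iqv_skewed = iqv(skewed)
```

An IQV of 1 means authors are spread evenly across categories, and values closer to 0 mean one category dominates. With a very large number of categories, as with institutions, the index sits close to 1, so small differences in the third decimal place can still be meaningful.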
The most well-represented institutions are not as dominating as some may expect, given the common view of DH as a community centered around particular powerhouse departments or universities. The university with the most authors contributing to DH conferences (2.4% of the total authors) is King’s College London, followed by the Universities of Illinois (1.85%), Alberta (1.83%), and Virginia (1.75%). The most prominent university outside of North America or Europe is Ritsumeikan University, contributing 1.07% of all DH conference authors. In all, over a thousand institutions have contributed authors to the conference, and that number increases every year.
While these numbers represent institutional origins, the data available does not allow any further diving into birth countries, native language, ethnic identities, etc. The 2013-2015 dataset, including peer review information, does yield some insight into geography-influenced biases that may map to language or identity. While the peer review data do not show any clear bias by institutional country, there is a very clear bias against names which do not appear frequently in the U.S. Census or Social Security Index. We discovered this when attempting to statistically infer the gender of authors using these U.S.-based indices. 11 From 2013-2015, presentations written by those with names appearing frequently in these indices were significantly more likely to be accepted than those written by authors with non-English names (p < 0.0001). Whereas approximately 72% of authors with common U.S. names passed peer review, only 61% of authors with uncommon names passed. Without more data, we have no idea whether this tremendous disparity is due to a bias against popular topics from non-English-speaking countries, a higher likelihood of peer reviewers rejecting text written by non-native writers, an implicit bias by peer reviewers when they see “foreign” names, or something else entirely.
When submitting a presentation, authors are given the opportunity to provide keywords for their submission. Some keywords can be chosen freely, while others must be chosen from a controlled list of about 100 potential topics. These controlled keywords are used to help in the process of conference organization and peer reviewer selection, and they stay roughly constant every year. New keywords are occasionally added to the list, as in 2016, where authors can now select three topics which were not previously available: “Digital Humanities – Diversity”, “Digital Humanities – Multilinguality”, and “3D Printing”. The 2000-2015 conference dataset does not include keywords for every article, so this analysis will only cover the more detailed dataset, 2013-2015, with additional data on submissions for DH2016.
From 2013-2016, presentations were tagged with an average of six controlled keywords per submission. The most-used keywords are unsurprising: “Text Analysis” (tagged on 22% of submissions), “Data Mining / Text Mining” (20%), “Literary Studies” (20%), “Archives, Repositories, Sustainability And Preservation” (19%), and “Historical Studies” (18%). The most frequently-used keyword potentially pertaining directly to issues of diversity, “Cultural Studies”, appears on 14% of submissions from 2013-2016. Only 2% of submissions are tagged with “Gender Studies”. The two diversity-related keywords introduced this year are already being used surprisingly frequently, with 9% of submissions in 2016 tagged “Digital Humanities – Diversity” and 6% of submissions tagged “Digital Humanities – Multilinguality”. With over 650 conference submissions for 2016, this translates to a reasonably large community of DH authors presenting on topics related to diversity.
Joining the topic and gender data for 2013-2015 reveals the extent to which certain subject matters are gendered at DH conferences. 12 Women are twice as likely to use the “Gender Studies” tag as male authors, whereas men are twice as likely to use the “Asian Studies” tag as female authors. Subjects related to pedagogy, creative / performing arts, art history, cultural studies, GLAM (galleries, libraries, archives, museums), DH institutional support, and project design/organization/management are more likely to be presented by women. Men, on the other hand, are more likely to write about standards & interoperability, the history of DH, programming, scholarly editing, stylistics, linguistics, network analysis, and natural language processing / text analysis. It seems DH topics have inherited the usual gender skews associated with the disciplines in which those topics originate.
We showed earlier that there was no direct gender bias in the peer review process. While true, there appears to be indirect bias with respect to how certain gendered topics are considered acceptable by the DH conference peer reviewers. A woman has just as much chance of getting a paper through peer review as a man if they both submit a presentation on the same topic (e.g., both women and men have a 72% chance of passing peer review if they write about network analysis, or a 65% chance of passing peer review if they write about knowledge representation), but topics that are heavily gendered towards women are less likely to get accepted. Cultural studies has a 57% acceptance rate, gender studies 60%, pedagogy 51%. Male-skewed topics have higher acceptance rates, like text analysis (83%), programming (80%), or Asian studies (79%). The female-gendering of DH institutional support and project organization also supports our earlier claim that, while women are well-represented among the DH leadership, they are more poorly represented in those topics that the majority of authors are discussing (programming, text analysis, etc.).
Regarding the clustering – and devaluing – of topics that women tend to present on at DH conferences, the widespread acknowledgement of the devaluing of women’s labor may help to explain this. We discussed the feminization of academia above, and indeed, this is a trend seen in practically all facets of society. The addition of emotional labor or caretaking tasks complicates this. Economist Teresa Ghilarducci explains: “a lot of what women do in their lives is punctuated by time outside of the labor market – taking care of family, taking care of children – and women’s labor has always been devalued…[people] assume that she had some time out of the labor market and that she was doing something that was basically worthless, because she wasn’t being paid for it.” In academia specifically, the labyrinthine relationship of pay to tasks/labor further obscures value: we are rarely paid per task (per paper published or presented) on the research front; service work is almost entirely invisible; and teaching factors in with course loads, often with more up-front transparency for contingent laborers such as adjuncts and part-timers.
Our results seem to point to less of an obvious bias against women scholars than a subtler bias against topics that women tend to gravitate toward, or are seen as gravitating toward. This is in line with the concept of postfeminism, or the notion that feminism has met its main goals (e.g. getting women the right to vote and the right to an education), and thus is irrelevant to contemporary social needs and discourse. Thoroughly enmeshed in neoliberal discourse, postfeminism makes discussing misogyny seem obsolete and obscures the subtler ways in which sexism operates in daily life (Pomerantz, Raby, and Stefanik 2013). While individuals may or may not choose to identify as postfeminist, the overarching beliefs associated with postfeminism have permeated North American culture at a number of levels, leading us to posit the acceptance of the ideals of postfeminism as one explanation for the devaluing of topics that seem associated with women.
Discussion and Future Research
The analysis reveals an annual DH conference with a growing awareness of diversity-related issues, with moderate improvements in regional diversity, stagnation in gender diversity, and unknown (but anecdotally poor) diversity with regards to language, ethnicity, and skin color. Knowledge at the DH conference is heavily gendered, though women are not directly biased against during peer review, and while several prominent women occupy the community’s core, women occupy less space in the much larger periphery. No single institution or small set of institutions dominates the conference attendance, and though North America’s influence on ADHO cannot be overstated, recent ADHO efforts are significantly improving the geographic spread of its constituency.
The DH conference, and by extension ADHO, is not the digital humanities. It is, however, the largest annual gathering of self-identified digital humanists, 13 and as such its makeup holds influence over the community at large. Its priorities, successes, and failures reflect on DH, both within the community and to the outside world, and those priorities get reinforced in future generations. If the DH conference remains as it is—devaluing knowledge associated with femininity, comprising only 36% women, and rejecting presentations by authors with non-English names—it will have significant difficulty attracting a more diverse crowd without explicit interventions. Given the shortcomings revealed in the data above, we present some possible interventions that can be made by ADHO or its members to foster a more diverse community, inspired by #WhatIfDH2016:
As pointed out by Yvonne Perkins, ask presenters to include a brief “Collections Used” section, when appropriate. Such a practice would highlight and credit the important work being done by those who aren’t necessarily engaging in publishable research, and help legitimize that work to conference attendees.
As pointed out by Vika Zafrin, create guidelines for reviewers explicitly addressing diversity, and provide guidance on noticing and reducing peer review bias.
As pointed out by Vika Zafrin, community members can make an effort to solicit presentation submissions from women and people of color.
As pointed out by Vika Zafrin, collect and analyze data on who is peer reviewing, to see whether, or the extent to which, biases creep in at that stage.
As pointed out by Aimée Morrison, ensure that the conference stage is at least as diverse as the conference audience. This can be accomplished in a number of ways, from conference organizers making sure their keynote speakers draw from a broad pool, to organizing last-minute lightning lectures specifically for those who are registered but not presenting.
As pointed out by Christina Boyles, encourage the submission of research focused around the intersection of race, gender, and sexuality studies. This may be partially accomplished by including more topical categories for conference submissions, a step which ADHO has already taken for 2016.
As pointed out by many, take explicit steps in ensuring conference access to those with disabilities. We suggest this become an explicit part of the application package submitted by potential host institutions.
As pointed out by many, ensure the ease of participation-at-a-distance (both as audience and as speaker) for those without the resources to travel.
Give marginalized communities greater representation in the DH Conference peer reviewer pool. This can be done grassroots, with each of us reaching out to colleagues to volunteer as reviewers, and organizationally, perhaps by ADHO creating a volunteer group to seek out and encourage more diverse reviewers.
Consider the difference between diversifying (verb) vs. talking about diversity (noun), and consider whether other modes of disrupting hegemony, such as decolonization and queering, might be useful in these processes.
Contribute to the #whatifDH2016 and #whatifDH2017 discussions on Twitter with other ideas for improvements.
Many options are available to improve representation at DH conferences, and some encouraging steps are already being taken by ADHO and its members. We hope to hear more concrete steps that may be taken, especially learned from experiences in other communities or outside of academia, in order to foster a healthier and more welcoming conference going forward.
In the interest of furthering these goals and improving the organizational memory of ADHO, the public portion of the data (final conference programs with full text and unique author IDs) is available alongside this publication [will link in final draft]. With this, others may test, correct, or improve our work. We will continue work by extending the dataset back to 1990, continuing to collect for future conferences, and creating an infrastructure that will allow the database to connect to others with similar collections. This will include the ability to encode more nuanced and fluid gender representations, and for authors to correct their own entries. Further work will also include exploring topical co-occurrence, institutional bias in peer review, how institutions affect centrality in the co-authorship network, and how authors who move between institutions affect all these dynamics.
The Digital Humanities will never be perfect. It embodies the worst of its criticisms and the best of its ideals, sometimes simultaneously. We believe a more diverse community will help tip those scales in the right direction, and present this chapter in service of that belief.
Peterson, Helen. “An Academic ‘Glass Cliff’? Exploring the Increase of Women in Swedish Higher Education Management.” Athens Journal of Education 1, no. 1 (February 2014): 32–44.
Pomerantz, Shauna, Rebecca Raby, and Andrea Stefanik. “Girls Run the World? Caught between Sexism and Postfeminism in the School.” *Gender & Society* 27, no. 2 (April 1, 2013): 185–207. doi:10.1177/0891243212473199
Each author contributed equally to the final piece; please disregard authorship order. ↩
See Melissa Terras, “Disciplined: Using Educational Studies to Analyse ‘Humanities Computing.’” Literary and Linguistic Computing 21, no. 2 (June 1, 2006): 229–46. doi:10.1093/llc/fql022. Terras takes a similar approach, analyzing Humanities Computing “through its community, research, curriculum, teaching programmes, and the message they deliver, either consciously or unconsciously, about the scope of the discipline.” ↩
The authors have created a browsable archive of #whatifDH2016 tweets. ↩
Of the 146 presentations at DH2011, two stand out in relation to diversity in DH: “Is There Anybody out There? Discovering New DH Practitioners in other Countries” and “A Trip Around the World: Balancing Geographical Diversity in Academic Research Teams.” ↩
See “Disrupting DH,” http://www.disruptingdh.com/ ↩
See Wernimont’s blog post, “No More Excuses” (September 2015) for more, as well as the Tumblr blog, “Congrats, you have an all male panel!” ↩
Miriam Posner offers a longer and more eloquent discussion of this in, “What’s Next: The Radical, Unrealized Potential of Digital Humanities.” Miriam Posner’s Blog. July 27, 2015. http://miriamposner.com/blog/whats-next-the-radical-unrealized-potential-of-digital-humanities/ ↩
[Link to the full public dataset, forthcoming and will be made available by time of publication] ↩
We would like to acknowledge that race and ethnicity are frequently used interchangeably, though both are cultural constructs with their roots in Darwinian thought, colonialism, and imperialism. We retain these terms because they express cultural realities and lived experiences of oppression and bias, not because there is any scientific validity to their existence. For more on this tension, see John W. Burton (2001), Culture and the Human Body: An Anthropological Perspective. Prospect Heights, Illinois: Waveland Press, 51–54. ↩
Weingart, S.B. & Eichmann, N. (2016). “What’s Under the Big Tent?: A Study of ADHO Conference Abstracts.” Manuscript submitted for publication. ↩
We used the process and script described in: Lincoln Mullen (2015). gender: Predict Gender from Names Using Historical Data. R package version 0.5.0.9000 (https://github.com/ropensci/gender) and Cameron Blevins and Lincoln Mullen, “Jane, John … Leslie? A Historical Method for Algorithmic Gender Prediction,” Digital Humanities Quarterly 9.3 (2015). ↩
For a breakdown of specific numbers of gender representation across all 96 topics from 2013-2015, see Weingart’s “Acceptances to Digital Humanities 2015 (part 4)”. ↩
While ADHO’s annual conference is usually the largest annual gathering of digital humanists, that place is constantly being vied for by the Digital Humanities Summer Institute in Victoria, Canada, which in 2013 boasted more attendees than DH2013 in Lincoln, Nebraska. ↩
Women are (nearly but not quite) as likely as men to be accepted by peer reviewers at DH conferences, but names foreign to the US are less likely than either men or women to be accepted to these conferences. Some topics are more likely to be written on by women (gender, culture, teaching DH, creative arts & art history, GLAM, institutions), and others more likely to be discussed by men (standards, archaeology, stylometry, programming/software).
You may know I’m writing a series on Digital Humanities conferences, of which this is the zillionth post. 1 This post has nothing to do with DH2015, but instead looks at DH2013, DH2014, and DH2015 all at once. I continue my recent trend of looking at diversity in Digital Humanities conferences, drawing especially on these two posts (1, 2) about topic, gender, and acceptance rates.
This post will be longer than usual, since Heather Froehlich rightly pointed out my methods in these posts aren’t as transparent as they ought to be, and I’d like to change that.
@scott_bot @nmhouston don’t get me wrong – i think they’re interesting and useful but i don’t think they’re 100% transparent
As someone who deals with algorithms and large datasets, I desperately seek out those moments when really stupid algorithms wind up aligning with a research goal, rather than getting in the way of it.
In the humanities, stupid algorithms are much more likely to get in the way of my research than help it along, and afford me the ability to make insensitive or reductivist decisions in the name of “scale”. For example, in looking for ethnic diversity of a discipline, I can think of two data-science-y approaches to solving this problem: analyzing last names for country of origin, or analyzing the color of recognized faces in pictures from recent conferences.
Obviously these are awful approaches, for a billion reasons that I need not enumerate, but including the facts that ethnicity and color are often not aligned, and last names (especially in the states) are rarely indicative of anything at all. But they’re easy solutions, so you see people doing them pretty often. I try to avoid that.
Sometimes, though, the stars align and the easy solution is the best one for the question. Let’s say we were looking to understand immediate reactions of racial bias; in that case, analyzing skin tone may get us something useful because we don’t actually care about the race of the person, what we care about is the immediate perceived race by other people, which is much more likely to align with skin tone. Simply: if a person looks black, they’re more likely to be treated as such by the world at large.
This is what I’m banking on for peer review data and bias. For the majority of my data on DH conferences, Nickoal Eichmann and I have been going in and hand-coding every single author with a gender that we glean from their website, pictures, etc. It’s quite slow and far from perfect (see my note), but it’s at least more sensitive than the brute force method, and it gets us a rough estimate of gender ratios in DH conferences. We hope to improve it quite soon with user-submitted genders.
But let’s say we want to discuss bias, rather than diversity. In that case, I actually prefer the brute force method, because instead of giving me a sense of the actual gender of an author, it can give me a sense of what the peer reviewers perceive an author’s gender to be. That is, if a peer reviewer sees the name “Mary” as the primary author of an article, how likely is the reviewer to think the article is written by a woman, and will this skew their review?
That’s my goal today, so instead of hand-coding like usual, I went to Lincoln Mullen’s fabulous package for inferring gender from first names in the programming language R. It does so by looking in the US Census and Social Security Database, looking at the percentage of men and women with a certain first name, and then gives you both the ratio of men-to-women with that name, and the most likely guess of the person’s gender.
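To make the mechanics concrete, here is a minimal Python sketch of name-based inference. It is not Mullen’s package or its API (that lives in R); the `NAME_COUNTS` table, the `Guess` type, and the `infer_gender` function are all hypothetical, and the counts are invented for illustration rather than taken from census data.

```python
from typing import NamedTuple, Optional

# Hypothetical counts of people recorded with each first name, by sex.
# These numbers are invented for illustration; they are not census data.
NAME_COUNTS = {
    "mary": {"female": 9960, "male": 40},
    "john": {"female": 50, "male": 9950},
    "leslie": {"female": 6200, "male": 3800},
}


class Guess(NamedTuple):
    name: str
    proportion_female: float
    proportion_male: float
    gender: str


def infer_gender(first_name: str) -> Optional[Guess]:
    """Return the share of each sex for a name plus the likelier label.

    Returns None when the name is missing from the table, which is the
    "could not be inferred" case.
    """
    counts = NAME_COUNTS.get(first_name.strip().lower())
    if counts is None:
        return None
    total = counts["female"] + counts["male"]
    p_female = counts["female"] / total
    p_male = counts["male"] / total
    label = "female" if p_female >= p_male else "male"
    return Guess(first_name, p_female, p_male, label)


print(infer_gender("Mary"))
print(infer_gender("Xiaoming"))  # not in the toy table, so None
```

The important design point is the `None` branch: a name that isn’t in the reference data gets no guess at all, rather than being forced into one of the two labels.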
Inferring Gender for Peer Review
I don’t have a palantír and my DH data access is not limitless. In fact, everything I have I’ve scraped from public or semi-public spaces, which means I have no knowledge of who reviewed what for ADHO conferences, the scores given to submissions, etc. What I do have is the titles and author names for every submission to an ADHO conference since 2013 (explanation), and the final program of those conferences. This means I can see which submissions don’t make it to the presentation stage; that’s not always a reflection of whether an article gets accepted, but it’s probably pretty close.
So here’s what I did: created a list of every first name that appears on every submission, fed the list into Lincoln Mullen’s gender inference machine, and then looked at how often authors guessed to be men made it through to the presentation stage, versus how often authors guessed to be women made it through. That is to say, if an article is co-authored by one man and three women, and it makes it through, I count it as one acceptance for men and three for women. It’s not the only way to do it, but it’s the way I did it.
I’m arguing this can be used as a proxy for gender bias in reviews and editorial decisions: that if first names that look like women’s names are more often rejected 2 than ones that look like men’s names, there’s likely bias in the review process.
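The counting scheme is easier to see in code than in prose. This is a rough Python sketch of it, not the script I actually ran; the `submissions` list and the `acceptance_rates` function are hypothetical, and the data is made up. Each author on a submission counts as one authorship event, so a paper by one man and three women contributes one event for men and three for women.

```python
from collections import Counter

# Hypothetical submissions: the inferred gender of each author (None when the
# name could not be inferred) and whether the paper reached the final program.
submissions = [
    {"authors": ["male", "female", "female", "female"], "accepted": True},
    {"authors": ["female", None], "accepted": False},
    {"authors": ["male"], "accepted": True},
    {"authors": ["male", None], "accepted": False},
]


def acceptance_rates(subs):
    """Share of authorship events, per inferred group, that were accepted."""
    submitted = Counter()
    accepted = Counter()
    for sub in subs:
        for gender in sub["authors"]:
            group = gender if gender is not None else "uninferred"
            submitted[group] += 1
            if sub["accepted"]:
                accepted[group] += 1
    return {group: accepted[group] / submitted[group] for group in submitted}


rates = acceptance_rates(submissions)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.1%}")
```

Counting per authorship rather than per paper means heavily co-authored papers weigh more, which is one reason this is only one of several defensible ways to do it.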
Results: Bias in Peer Review?
Totaling all authors from 2013-2015, the inference machine told me 1,008 names looked like women’s names; 1,707 looked like men’s names; and 515 could not be inferred. “Could not be inferred” is code for “the name is foreign-sounding and there’s not enough data to guess”. Remember as well, this is counting every authorship as a separate event, so if Melissa Terras submits one paper in 2013 and one in 2014, the name “Melissa” appears in my list twice.
So we see that in 2013-2015, 70.3% of woman-authorship-events get accepted, 73.2% of man-authorship-events get accepted, and only 60.6% of uninferrable-authorship-events get accepted. I’ll discuss gender more soon, but this last bit was totally shocking to me. It took me a second to realize what it meant: that if your first name isn’t a standard name on the US Census or Social Security database, you’re much less likely to get accepted to a Digital Humanities conference. Let’s break it out by year.
We see some interesting trends here, some surprising, some not. Least surprising is that the acceptance rates for non-US names are most equal this year, when the conference is being held so close to Asia (which the inference machine seems to have the most trouble with). My guess is that A) more non-US people who submit are actually able to attend, and B) reviewers this year are more likely to be from the same sorts of countries that the program is having difficulties with, so they’re less likely to be biased against non-US first names. There’s also potentially a language issue here: that non-US submissions are more likely to be rejected because they are either written in another language, or written in a way that native English speakers may find difficult to understand.
But the fact of the matter is, there’s a very clear bias against submissions by people with names non-standard to the US. The bias, oddly, is most pronounced in 2014, when the conference was held in Switzerland. I have no good guesses as to why.
So now that we have the big effect out of the way, let’s get to the small one: gender disparity. Honestly, I had expected it to be worse; it is worse this year than the two previous, but that may just be statistical noise. It’s true that women do fare worse overall by 1-3%, which isn’t huge, but it’s big enough to mention. However.
Topics and Gender
However, it turns out that the entire gender bias effect we see is explained by the topical bias I already covered the other day. (Scroll down for the rest of the post.)
What’s shown here will be fascinating to many of us, and some of it more surprising than others. A full 67% of authors on the 25 DH submissions labeled “gender studies” are labeled as women by Mullen’s algorithm. And remember, many of those may be the same author; for example if “Scott Weingart” is listed as an author on multiple submissions, this chart counts those separately.
Other topics that are heavily skewed towards women: drama, poetry, art history, cultural studies, GLAM, and (importantly), institutional support and DH infrastructure. Remember how I said a large percentage of those responsible for running DH centers, committees, and organizations are women? This is apparently the topic they’re publishing in.
If we look instead at the bottom of the chart, those topics skewed towards men, we see stylometrics, programming & software, standards, image processing, network analysis, etc. Basically either the CS-heavy topics, or the topics from when we were still “humanities computing”, a more CS-heavy community. These topics, I imagine, inherit their gender ratio problems from the various disciplines we draw them from.
You may notice I left out pedagogical topics from my list above, which are heavily skewed towards women. I’m singling that out specially because, if you recall from my previous post, pedagogical topics are especially unlikely to be accepted to DH conferences. In fact, a lot of the topics women are submitting in aren’t getting accepted to DH conferences, you may recall.
It turns out that the gender bias in acceptance ratios is entirely accounted for by the topical bias. When you break out topics that are not gender-skewed (ontologies, UX design, etc.), the acceptance rates between men and women are the same – the bias disappears. What this means is the small gender bias is coming at the topical level, rather than at the gender level, and since women are writing more about those topics, they inherit the peer review bias.
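Here is a toy Python illustration of how that can happen. The numbers are invented, not my conference data, and the `counts` table and helper functions are hypothetical. Within each topic men and women are accepted at the same rate, but because women submit more often to the topic with the lower acceptance rate, their overall rate comes out lower.

```python
# Hypothetical authorship counts: (topic, gender) -> (submitted, accepted).
# Invented numbers chosen so that within-topic rates are identical.
counts = {
    ("pedagogy", "women"): (60, 30),
    ("pedagogy", "men"): (20, 10),
    ("stylometry", "women"): (20, 16),
    ("stylometry", "men"): (60, 48),
}


def rate(pairs):
    """Acceptance rate over a list of (submitted, accepted) pairs."""
    submitted = sum(s for s, _ in pairs)
    accepted = sum(a for _, a in pairs)
    return accepted / submitted


def topic_rate(topic, gender):
    return rate([counts[(topic, gender)]])


def overall_rate(gender):
    return rate([v for (_, g), v in counts.items() if g == gender])


for topic in ("pedagogy", "stylometry"):
    print(topic, topic_rate(topic, "women"), topic_rate(topic, "men"))
print("overall", overall_rate("women"), overall_rate("men"))
```

In this toy case the whole gap lives in which topics people write about, which is the pattern I’m describing above.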
Does this mean there is no gender bias in DH conferences?
No. Of course not. I already showed yesterday that 46% of attendees to DH2015 are women, whereas only 35% of authors are. What it means is the bias against topics is gendered, but in a peculiar way that actually may be (relatively) easy to solve, and if we do solve it, it’d also likely go a long way in solving that attendee/authorship ratio too.
Get more women peer reviewing for DH conferences.
Although I don’t know who’s doing the peer reviews, I’d guess that the gender ratio of peer reviewers is about the same as the ratio of authors; 34% women, 66% men. If that is true, then it’s unsurprising that the topics women tend to write about are not getting accepted, because by definition these are the topics that men publishing at DH conferences find less interesting or relevant. 3 If reviewers gravitate towards topics of their own interest, and if their interests are skewed by gender, it’d also likely skew results of peer review. If we are somehow able to improve the reviewer ratio, I suspect the bias in topic acceptance, and by extension gender acceptance, will significantly reduce.
Jacqueline Wernimont points out in a comment below that another way of improving the situation is to break the “gender lines” I’ve drawn here, and make sure to attend presentations on topics that are outside your usual scope if (like me) you gravitate more towards one side than another.
Obviously this is all still preliminary, and I plan to show the breakdown of acceptances by topic and gender in a later post so you don’t just have to trust me on it, but at the 2,000-word-mark this is getting long-winded, and I’d like feedback and thoughts before going on.
[edit: I’m realizing I didn’t make it clear in this post that I’m aware many historians consider themselves scientists, and that there’s plenty of scientific historical archaeology and anthropology. That’s exactly what I’m advocating there be more of, and more varied.]
Short Answer: Yes.
Less Snarky Answer: Historians need to be flexible to fresh methods, fresh perspectives, and fresh blood. Maybe not that last one, I guess, as it might invite vampires. Okay, I suppose this answer wasn’t actually less snarky.
The long answer is that historians don’t necessarily need scientists, but that we do need fresh scientific methods. Perhaps as an accident of our association with the ill-defined “humanities”, or as a result of our being placed in an entirely different culture (see: C.P. Snow), most historians seem fairly content with methods rooted in thinking about text and other archival evidence. This isn’t true of all historians, of course – there are economic historians who use statistics, historians of science who recreate old scientific experiments, classical historians who augment their research with archaeological findings, archival historians who use advanced ink analysis, and so forth. But it wouldn’t be stretching the truth to say that, for the most part, historiography is the practice of thinking cleverly about words to make more words.
I’ll argue here that our reliance on traditional methods (or maybe more accurately, our odd habit of rarely discussing method) is crippling historiography, and is making it increasingly likely that the most interesting and innovative historical work will come from non-historians. Sometimes these studies are ill-informed, especially when the authors decide not to collaborate with historians who know the subject, but to claim that a few ignorant claims about history negate the impact of these new insights is an exercise in pedantry.
In defending the humanities, we like to say that scientists and technologists with liberal arts backgrounds are more well-rounded, better citizens of the world, more able to contextualize their work. Non-humanists benefit from a liberal arts education in pretty much all the ways that are impossible to quantify (and thus, extremely difficult to defend against budget cuts). We argue this in the interest of rounding a person’s knowledge, to make them aware of their past, of their place in a society with staggering power imbalances and systemic biases.
Humanities departments should take a page from their own books. Sure, a few general ed requirements force some basic science and math… but I got an undergraduate history degree in a nice university, and I’m well aware how little STEM I actually needed to get through it. Our departments are just as guilty of narrowness as those of our STEM colleagues, and often because of it, we rely on applied mathematicians, statistical physicists, chemists, or computer scientists to do our innovative work for (or sometimes, thankfully, with) us.
Of course, there’s still lots of innovative work to be done from a textual perspective. I’m not downplaying that. Not everyone needs to use crazy physics/chemistry/computer science/etc. methods. But there’s a lot of low hanging fruit at the intersection of historiography and the natural sciences, and we’re not doing a great job of plucking it.
The story below is illustrative.
Last night, Blaise Agüera y Arcas presented his research on Gutenberg to a packed house at our rare books library. He’s responsible for a lot of the cool things that have come out of Microsoft in the last few years, and just got a job at Google, where presumably he will continue to make cool things. Blaise has degrees in physics and applied mathematics. And, a decade ago, Blaise and historian/librarian Paul Needham sent ripples through the History of the Book community by showing that Gutenberg’s press did not work at all the way people expected.
It was generally assumed that Gutenberg employed a method called punchcutting in order to create a standard font. A letter carved into a metal rod (a “punch”) would be driven into a softer metal (a “matrix”) in order to create a mold. The mold would be filled with liquid metal which hardened to form a small block of a single letter (a “type”), which would then be loaded onto the press next to other letters, inked, and then impressed onto a page. Because the mold was metal, many duplicate “types” could be made of the same letter, thus allowing many uses of the same letter to appear identical on a single pressed page.
This process is what allowed all the duplicate letters to appear identical in Gutenberg’s published books. Except, of course, careful historians of early print noticed that letters weren’t, in fact, identical. In the 1980s, Paul Needham and a colleague attempted to produce an inventory of all the different versions of letters Gutenberg used, but they stopped after frequently finding 10 or more obviously distinct versions of the same letter.
This was perplexing, but the subject was bracketed away for a while, until Blaise Agüera y Arcas came to Princeton and decided to work with Needham on the problem. Using extremely high-resolution imaging techniques, Blaise noted that there were in fact hundreds of versions of every letter. Not only that, there were actually variations and regularities in the smaller elements that made up letters. For example, an “n” was formed by two adjacent vertical lines, but occasionally the two vertical lines seem to have flipped places entirely. The extremely basic letter “i” itself had many variations, but within those variations, many odd self-similarities.
Historians had, until this analysis, assumed most letter variations were due to wear of the type blocks. This analysis blew that hypothesis out of the water. These “i”s were clearly not all made in the same mold; but then, how had they been made? To answer this, they looked even closer at the individual letters.
It’s difficult to see at first glance, but they found something a bit surprising. The letters appeared to be formed of overlapping smaller parts: a vertical line, a diagonal box, and so forth. The below figure shows a good example of this. The glyphs on the bottom have a stem dipping below the bottom horizontal line, while the glyphs at the top do not.
The conclusion Needham and Agüera y Arcas drew, eventually, was that the punchcutting method must not have been used for Gutenberg’s early material. Instead, a set of carved “strokes” were pushed into hard sand or soft clay, configured such that the strokes would align to form various letters, not unlike the formation of cuneiform. This mold would then be used to cast letters, creating the blocks we recognize from movable type. The catch is that this soft clay could only cast letters a few times before it became unusable and would need to be recreated. As Gutenberg needed multiple instances of individual letters per page, many of those letters would be cast from slightly different soft molds.
At the end of his talk, Blaise made an offhand comment: how is it that historians/bibliographers/librarians have been looking at these Gutenbergs for so long, discussing the triumph of their identical characters, and not noticed that the characters are anything but uniform? Or, of those who had noticed it, why hadn’t they raised any red flags?
The insights they produced weren’t staggering feats of technology. He used a nice camera, a light shining through the pages of an old manuscript, and a few simple image recognition and clustering algorithms. The clustering part could even have been done by hand, and actually had been, by Paul Needham. And yes, it’s true, everything is obvious in hindsight, but there were a lot of eyes on these bibles, and odds are if some of them had been historians who were trained in these techniques, this insight could have come sooner. Every year students do final projects and theses and dissertations, but what percent of those use techniques from outside historiography?
In short, there are a lot of very basic assumptions we make about the past that could probably be updated significantly if we had the right skillset, or knew how to collaborate with those who did. I think people like William Newman, who performs Newton’s alchemical experiments, are on the right track. As is Shawn Graham, who reanimates the trade networks of ancient Rome using agent-based simulations, or Devon Elliott, who creates computational and physical models of objects from the history of stage magic. Elliott’s models have shown that certain magic tricks couldn’t possibly have worked as they were described to.
The challenge is how to encourage this willingness to reach outside traditional historiographic methods to learn about the past. Changing curricula to be more flexible is one way, but that is a slow and institutionally difficult process. Perhaps faculty could assign group projects to students taking their gen-ed history courses, encouraging disciplinary mixes and non-traditional methods. It’s an open question, and not an easy one, but it’s one we need to tackle.
There’s an oft-spoken and somewhat strawman tale of how the digital humanities is bridging C.P. Snow’s “Two Cultures” divide, between the sciences and the humanities. This story is sometimes true (it’s fun putting together Ocean’s Eleven-esque teams comprising every discipline needed to get the job done) and sometimes false (plenty of people on either side still view the other with skepticism), but as a historian of science, I don’t find the divide all that interesting. As Snow’s title suggests, this divide is first and foremost cultural. There’s another overlapping divide, a bit more epistemological, methodological, and ontological, which I’ll explore here. It’s the nomothetic(type)/idiographic(token) divide, and I’ll argue here that not only are its barriers falling, but also that the distinction itself is becoming less relevant.
Nomothetic (Greek for “establishing general laws”-ish) and Idiographic (Greek for “pertaining to the individual thing”-ish) approaches to knowledge have often split the sciences and the humanities. I’ll offload the hard work onto Wikipedia:
Nomothetic is based on what Kant described as a tendency to generalize, and is typical for the natural sciences. It describes the effort to derive laws that explain objective phenomena in general.
Idiographic is based on what Kant described as a tendency to specify, and is typical for the humanities. It describes the effort to understand the meaning of contingent, unique, and often subjective phenomena.
These words are long and annoying to keep retyping, and so in the longstanding humanistic tradition of using new words for words which already exist, henceforth I shall refer to nomothetic as type and idiographic as token. 1 I use these because a lot of my digital humanities readers will be familiar with their use in text mining. If you counted the number of unique words in a text, you’d be counting the number of types. If you counted the number of total words in a text, you’d be counting the number of tokens, because each token (word) is an individual instance of a type. You can think of a type as the platonic ideal of the word (notice the word typical?), floating out there in the ether, and every time it’s actually used, it’s one specific token of that general type.
Usually the natural and social sciences look for general principles or causal laws, of which the phenomena they observe are specific instances. A social scientist might note that every time a student buys a $500 textbook, they actively seek a publisher to punch, but when they purchase $20 textbooks, no such punching occurs. This leads to the discovery of a new law linking student violence with textbook prices. It’s worth noting that these laws can be and often are nuanced and carefully crafted, with an awareness that they are neither wholly deterministic nor ironclad.
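For readers who haven’t met the text mining usage, a few lines of Python show the difference. This is a minimal sketch; the `tokenize` helper and its crude regular expression are my own assumptions, and real tokenizers make many more decisions about punctuation, case, and hyphenation.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately crude rule)."""
    return re.findall(r"[a-z']+", text.lower())


text = "The cat saw the other cat"

tokens = tokenize(text)  # every individual occurrence of a word
types = set(tokens)      # the distinct words those occurrences instantiate

print(len(tokens))  # 6 tokens: the, cat, saw, the, other, cat
print(len(types))   # 4 types: the, cat, saw, other
```

“The” and “cat” each appear twice as tokens but only once as types.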
The humanities (or at least history, which I’m more familiar with) are more interested in what happened than in what tends to happen. Without a doubt there are general theories involved, just as in the social sciences there are specific instances, but the intent is most-often to flesh out details and create a particular internally consistent narrative. They look for tokens where the social scientists look for types. Another way to look at it is that the humanist wants to know what makes a thing unique, and the social scientist wants to know what makes a thing comparable.
It’s been noted these are fundamentally different goals. Indeed, how can you in the same research articulate the subjective contingency of an event while simultaneously using it to formulate some general law, applicable in all such cases? Rather than answer that question, it’s worth taking time to survey some recent research.
A recent digital humanities panel at MLA elicited responses by Ted Underwood and Haun Saussy, of which this post is in part itself a response. One of the papers at the panel, by Long and So, explored the extent to which haiku-esque poetry preceded what is commonly considered the beginning of haiku in America by about 20 years. They do this by teaching the computer the form of the haiku, and having it algorithmically explore earlier poetry looking for similarities. Saussy comments on this work:
[…] macroanalysis leads us to reconceive one of our founding distinctions, that between the individual work and the generality to which it belongs, the nation, context, period or movement. We differentiate ourselves from our social-science colleagues in that we are primarily interested in individual cases, not general trends. But given enough data, the individual appears as a correlation among multiple generalities.
One of the significant difficulties faced by digital humanists, and a driving force behind critics like Johanna Drucker, is the fundamental opposition between the traditional humanistic value of stressing subjectivity, uniqueness, and contingency, and the formal computational necessity of filling a database with hard decisions. A database, after all, requires you to make a series of binary choices in well-defined categories: is it or isn’t it an example of haiku? Is the author a man or a woman? Is there an author or isn’t there an author?
Underwood addresses this difficulty in his response:
Though we aspire to subtlety, in practice it’s hard to move from individual instances to groups without constructing something like the sovereign in the frontispiece for Hobbes’ Leviathan – a homogenous collection of instances composing a giant body with clear edges.
But he goes on to suggest that the initial constraint of the digital media may not be as difficult to overcome as it appears. Computers may even offer us a way to move beyond the categories we humanists use, like genre or period.
Aren’t computers all about “binary logic”? If I tell my computer that this poem both is and is not a haiku, won’t it probably start to sputter and emit smoke?
Well, maybe not. And actually I think this is a point that should be obvious but just happens to fall in a cultural blind spot right now. The whole point of quantification is to get beyond binary categories — to grapple with questions of degree that aren’t well-represented as yes-or-no questions. Classification algorithms, for instance, are actually very good at shades of gray; they can express predictions as degrees of probability and assign the same text different degrees of membership in as many overlapping categories as you like.
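Underwood’s point about degrees of membership can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the categories, the two toy features, and the hand-set numbers in `WEIGHTS` are all hypothetical, where a real classifier would learn its weights from labeled examples. Each category gets its own independent score, so one poem can belong partly to several categories at once.

```python
import math

# Hypothetical hand-set weights for two overlapping categories.
# A trained classifier would learn these from labeled poems.
WEIGHTS = {
    "haiku": {"bias": -1.0, "three_lines": 2.0, "nature_imagery": 1.5},
    "imagist": {"bias": -0.5, "three_lines": 0.5, "nature_imagery": 1.0},
}


def sigmoid(x):
    """Squash any real-valued score into a degree between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def memberships(features):
    """Degree of membership in each category, scored independently."""
    result = {}
    for category, weights in WEIGHTS.items():
        score = weights["bias"] + sum(
            weights[name] * value for name, value in features.items()
        )
        result[category] = sigmoid(score)
    return result


# A three-line poem without nature imagery (toy feature values).
poem = {"three_lines": 1, "nature_imagery": 0}
for category, degree in memberships(poem).items():
    print(f"{category}: {degree:.2f}")
```

Nothing here forces the degrees to sum to one, so the poem can be fairly haiku-like and somewhat imagist at the same time, with no yes-or-no decision required.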
Here we begin to see how the questions asked of digital humanists (on the one side; computational social scientists are tackling these same problems) are forcing us to reconsider the divide between the general and the specific, as well as the meanings of categories and typologies we have traditionally taken for granted. However, this does not yet cut across the token/type divide: this has gotten us to the macro scale, but it does not address general principles or laws that might govern specific instances. Historical laws are a murky subject, prone to inducing fits of anti-deterministic rage. Complex Systems Science and the lessons we learn from Agent-Based Modeling, I think, offer us a way past that dilemma, but more on that later.
For now, let’s talk about influence. Or diffusion. Or intertextuality. 2 Matthew Jockers has been exploring these concepts, most recently in his book Macroanalysis. The undercurrent of his research (I think I’ve heard him call it his “dangerous idea”) is a thread of almost-determinism. It is the simple idea that an author’s environment influences her writing in profound and easy to measure ways. On its surface it seems fairly innocuous, but it’s tied into a decades-long argument about the role of choice, subjectivity, creativity, contingency, and determinism. One word that people have used to get around the debate is affordances, and it’s as good a word as any to invoke here. What Jockers has found is a set of environmental conditions which afford certain writing styles and subject matters to an author. It’s not that authors are predetermined to write certain things at certain times, but that a series of factors combine to make the conditions ripe for certain writing styles, genres, etc., and not for others. The history of science analog would be the idea that, had Einstein never existed, relativity and quantum physics would still have come about; perhaps not as quickly, and perhaps not from the same person or in the same form, but they were ideas whose time had come. The environment was primed for their eventual existence. 3
It is here we see the digital humanities battling with the token/type distinction, and finding that distinction less relevant to its self-identification. It is no longer a question of whether one can impose or generalize laws on specific instances, because the axes of interest have changed. More and more, especially under the influence of new macroanalytic methodologies, we find that the specific and the general contextualize and augment each other.
The computational social sciences are converging on a similar shift. Jon Kleinberg likes to compare some old work by Stanley Milgram, in which he had people draw maps of cities from memory, with digital city reconstruction projects which attempt to bridge the subjective and objective experiences of cities. The result in both cases is an attempt at something new: not quite objective, not quite subjective, and not quite intersubjective. It is a representation of collective individual experiences which in its whole has meaning, but also can be used to contextualize the specific. That these types of observations can often lead to shockingly accurate predictive “laws” isn’t really the point; they’re accidental results of an attempt to understand unique and contingent experiences at a grand scale.
It is no surprise that the token/type divide is woven into the subjective/objective divide. However, as Daston and Galison have pointed out, objectivity is not an ahistorical category. It has a history, is only positively defined in relation to subjectivity, and neither was a particularly useful concept before the 19th century.
I would argue, as well, that the nomothetic and idiographic divide is one which is outliving its historical usefulness. Work from both the digital humanities and the computational social sciences is converging to a point where the objective and the subjective can peaceably coexist, where contingent experiences can be placed alongside general predictive principles without any cognitive dissonance, under a framework that allows both deterministic and creative elements. It is not that purely nomothetic or purely idiographic research will no longer exist, but that they no longer represent a binary category which can usefully differentiate research agendas. We still have Snow’s primary cultural distinctions, of course, and a bevy of disciplinary differences, but it will be interesting to see where this shift in axes takes us.
I am not the first to do this. Aviezer Tucker (2012) has a great chapter in The Oxford Handbook of Philosophy of Social Science, “Sciences of Historical Tokens and Theoretical Types: History and the Social Sciences,” which introduces and historicizes the vocabulary nicely.
Submissions for the 2014 Digital Humanities conference just closed. It’ll be in Switzerland this time around, which unfortunately means I won’t be able to make it, but I’ll be eagerly following along from afar. Like last year, reviewers are allowed to preview the submitted abstracts. Also like last year, I’m going to be a reviewer, which means I’ll have the opportunity to revisit the submissions to DH2013 to see how the submissions differed this time around. No doubt when the reviews are in and the accepted articles are revealed, I’ll also revisit my analysis of DH conference acceptances.
To start with, the conference organizers received a record number of submissions this year: 589. Last year’s Nebraska conference only received 348 submissions. The general scope of the submissions hasn’t changed much; authors were still supposed to tag their submissions using a controlled vocabulary of 95 topics, and were also allowed to submit keywords of their own making. Like last year, authors could submit long papers, short papers, panels, or posters, but unlike last year, multilingual submissions were encouraged (English, French, German, Italian, or Spanish). [edit: Bethany Nowviskie, patient awesome person that she is, has noticed yet another mistake I’ve made in this series of posts. Apparently last year they also welcomed multilingual submissions, and it is standard practice.]
Digital Humanities is known for its collaborative nature, and not much has changed in that respect between 2013 and 2014 (Figure 1). Submissions had, on average, between two and three authors, with 60% of submissions in both years having at least two authors. This year, a few fewer papers have single authors, and a few more have two authors, but the difference is too small to be attributable to anything but noise.
The distribution of topics being written about has changed mildly, though rarely in extreme ways. Any changes visible should also be taken with a grain of salt, because a trend over a single year is hardly statistically robust to small changes, say, in the location of the event.
The grey bars in Figure 2 show what percentage of DH2014 submissions are tagged with a certain topic, and the red dotted outlines show what the percentages were in 2013. The upward trends to note this year are text analysis, historical studies, cultural studies, semantic analysis, and corpora and corpus activities. Text analysis was tagged to 15% of submissions in 2013 and is now tagged to 20% of submissions, or one out of every five. Corpus analysis similarly bumped from 9% to 13%. Clearly this is an important pillar of modern DH.
I’ve pointed out before that History is secondary compared to Literary Studies in DH (although Ted Underwood has convincingly argued, using Ben Schmidt’s data, that the numbers may merely be due to fewer people studying history). This year, however, historical studies nearly doubled in presence, from 10% to 17%. I haven’t yet collected enough years of DH conference data to see if this is a trend in the discipline at large, or more of a difference between European and North American DH. Semantic analysis jumped from 1% to 7% of the submissions, cultural studies went from 10% to 14%, and literary studies stayed roughly equivalent. Visualization, one of the hottest topics of DH2013, has become even hotter in 2014 (14% to 16%).
The most visible drops in coverage came in pedagogy, scholarly editions, user interfaces, and research involving social media and the web. At DH2013, submissions on pedagogy had a surprisingly low acceptance rate, which, combined with the drop in pedagogy submissions this year (11% to 8% in “Digital Humanities – Pedagogy and Curriculum” and 7% to 4% in “Teaching and Pedagogy”), might suggest a general decline in interest in pedagogy in the DH world. “Scholarly Editing” went from 11% to 7% of the submissions, and “Interface and User Experience Design” from 13% to 8%, which is yet more evidence for the lack of research going into the creation of scholarly editions compared to several years ago. The most surprising drops for me were those in “Internet / World Wide Web” (12% to 8%) and “Social Media” (8.5% to 5%), which I would have guessed would be growing rather than shrinking.
The last thing I’ll cover in this post is the author-chosen keywords. While authors needed to tag their submissions from a list of 95 controlled vocabulary words, they were also encouraged to tag their entries with keywords they could choose themselves. In all they chose nearly 1,700 keywords to describe their 589 submissions. In last year’s analysis of these keywords, I showed that visualization seemed to be the glue that held the DH world together; whether discussing TEI, history, network analysis, or archiving, all the disparate communities seemed to share visualization as a primary method. The 2014 keyword map (Figure 3) reveals the same trend: visualization is squarely in the middle. In this graph, two keywords are linked if they appear together on the same submission, thus creating a network of keywords as they co-occur with one another. Words appear bigger when they span communities.
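For readers who want to try this on their own data, here is a minimal sketch of the linking rule just described: two keywords get an edge whenever they share a submission. The sample submissions are invented for illustration (the real data stays private), the function names are my own, and the neighbor count at the end is only a rough stand-in for "spanning communities," not the measure used to size words in the figure.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(submissions):
    """Count how often each unordered keyword pair shares a submission."""
    edges = Counter()
    for keywords in submissions:
        # Normalize case and drop duplicates within a single submission.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


def neighbor_counts(edges):
    """Number of distinct keywords each keyword co-occurs with."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {k: len(v) for k, v in neighbors.items()}


# Hypothetical toy data, not the actual DH2014 keywords.
submissions = [
    ["TEI", "XML", "visualization"],
    ["network analysis", "visualization"],
    ["history", "visualization", "archiving"],
    ["TEI", "XML"],
]

edges = cooccurrence_edges(submissions)
deg = neighbor_counts(edges)
```

In this toy example "visualization" ends up with the most neighbors because it appears alongside every other cluster of keywords, which is the same pattern the keyword map shows at a much larger scale.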
Despite the multilingual conference, the large component of the graph is still English. We can see some fairly predictable patterns: TEI is coupled quite closely with XML; collaboration is another keyword that binds the community together, as is (obviously) “Digital Humanities.” Linguistics and literature are tightly coupled, much more so than, say, linguistics and history. It appears the distant reading of poetry is becoming popular, which I’d guess is a relatively new phenomenon, although I haven’t gone back and checked.
This work has been supported by an ACH microgrant to analyze DH conferences and the trends of DH through them, so keep an eye out for more of these posts forthcoming that look through the last 15 years. Though I usually share all my data, I’ll be keeping these to myself, as the submitters to the conference did so under an expectation of privacy if their proposals were not accepted.
[edit: there was some interest on twitter last night for a raw frequency of keywords. Because keywords are author-chosen and I’m trying to maintain some privacy on the data, I’m only going to list those keywords used at least twice. Here you go (Figure 4)!]
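If you want to produce the same kind of list from your own tagged data, a rough sketch of the "used at least twice" filter follows. It assumes each keyword counts once per submission and that keywords are compared case-insensitively; both are my assumptions for the example, and the sample data is made up.

```python
from collections import Counter


def keyword_frequencies(submissions, min_count=2):
    """Return (keyword, count) pairs for keywords on at least min_count submissions."""
    counts = Counter()
    for keywords in submissions:
        # A set ensures a keyword counts only once per submission.
        counts.update({k.strip().lower() for k in keywords if k.strip()})
    # most_common() sorts by count, highest first.
    return [(k, n) for k, n in counts.most_common() if n >= min_count]


# Hypothetical toy data, not the actual conference keywords.
sample = [
    ["TEI", "XML"],
    ["tei", "visualization"],
    ["visualization", "poetry"],
    ["Visualization"],
]

frequent = keyword_frequencies(sample)
```

Dropping the keywords that appear only once keeps one-off, potentially identifying keywords out of the published list.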
Earlier today, Heather Froehlich shared what’s at this point become a canonical illustration among Ph.D. students: “The Illustrated Guide to a Ph.D.” The illustrator, Matt Might, describes the sum of human knowledge as a circle. As a child, you sit at the center of the circle, looking out in all directions.
Eventually, he describes, you get various layers of education, until by the end of your bachelor’s degree you’ve begun focusing on a specialty, focusing knowledge in one direction.
A master’s degree further deepens your focus, extending you toward an edge, and the process of pursuing a Ph.D., with all the requisite reading, brings you to a tiny portion of the boundary of human knowledge.
You push and push at the boundary until one day you finally poke through, pushing that tiny portion of the circle of knowledge just a wee bit further than it was. That act of pushing through is a Ph.D.
It’s an uplifting way of looking at the Ph.D. process, inspiring that dual feeling of insignificance and importance that staring at the Hubble Ultra-Deep Field tends to bring about. It also exemplifies, in my mind, one of the broken aspects of the modern Ph.D. But while we’re on the subject of the Hubble Ultra-Deep Field, let me digress momentarily about stars.
Quite a while before you or I were born, Great Thinkers with Big Beards (I hear even the Great Women had them back then) also suggested we sat at the center of a giant circle, looking outwards. The entire universe, or in those days, the cosmos (Greek: κόσμος, “order”), was a series of perfect layered spheres, with us in the middle, and the stars embedded in the very top. The stars were either gems fixed to the last sphere, or they were little holes poked through it that let the light from heaven shine through.
As I see it, if we connect the celestial spheres theory to “The Illustrated Guide to a Ph.D.”, we’d arrive at the inescapable conclusion that every star in the sky is another dissertation, another hole poked letting the light of heaven shine through. And yeah, it takes a very prescriptive view of knowledge and the universe that either you or I can argue with, but for this post we can let it slide because it’s beautiful, isn’t it? If you’re a Ph.D. student, don’t you want to be able to do this?
The problem is I don’t actually want to do this, and I imagine a lot of other people don’t want to do this, because there are already so many goddamn stars. Stars are nice. They’re pretty, how they twinkle up there in space, trillions of miles away from one another. That’s how being a Ph.D. student feels sometimes, too: there’s your research, my research, and a gap between us that can reach to Alpha Centauri and back again. Really, just astronomically far away.
It shouldn’t have to be this way. Right now a Ph.D. is about finding or doing something that’s new, in a really deep and narrow way. It’s about pricking the fabric of the spheres to make a new star. In the end, you’ll know more about less than anyone else in the world. But there’s something deeply unsettling about students being trained to ignore the forest for the trees. In an increasingly connected world, the universe of knowledge about it seems to be ever-fracturing. Very few are being trained to stand back a bit and try to find patterns in the stars. To draw constellations.
I should know. I’ve been trying to write a dissertation on something huge, and the advice I’ve gotten from almost every professor I’ve encountered is that I’ve got to scale it down. Focus more. I can’t come up with something new about everything, so I’ve got to do it about one thing, and do it well. And that’s good advice, I know! If a lot of people weren’t doing that a lot of the time, we’d all just be running around in circles and not doing cool things like going to the moon or watching animated pictures of cats on the internet.
But we also need to stand back and take stock, to connect things, and right now there are institutional barriers in place making that really difficult. My advisor, who stands back and connects things for a living (like the map of science below), gives me the same prudent advice as everyone else: focus more. It’s practical advice. For all that universities celebrate interdisciplinarity, in the end you still need to get hired by a department, and if you don’t fit neatly into their disciplinary niche, you’re not likely to make it. My request is simple. If you’re responsible for hiring researchers, or promoting them, or in charge of a department or (!) a university, make it easier to be interdisciplinary. Continue hiring people who make new stars, but also welcome the sort of people who want to connect them. There certainly are a lot of stars out there, and it’s getting harder and harder to see what they have in common, and to connect them to what we do every day. New things are great, but connecting old things in new ways is also great. Sometimes we need to think wider, not deeper.
After my last post about co-citation analysis, the author of one of the papers I was responding to, K. Brad Wray, generously commented and suggested I write up the results and send them off to Erkenntnis, the same journal in which he published his results. That sounded like a great idea, so I am.
Because so many good ideas have come from comments on this blog, I’d like to try opening my first draft to communal commenting. For those who aren’t familiar with Google Docs (anyone? Bueller?), you can comment by selecting text and either hitting ctrl-alt-m, or going to the insert-> menu and clicking ‘Comment’.
The paper is about the relationship between history of science and philosophy of science, and draws both from the blog post and from this page with additional visualizations. There is also an appendix (pdf, sorry) with details of data collection and some more interesting results for the HPS buffs. If you like history of science, philosophy of science, or citation analysis, I’d love to see your comments! If you have any general comments that don’t refer to a specific part of the text, just post them in the blog comments below.
This is a bit longer form than the usual blog, so who knows if it will inspire much interaction, but it’s worth a shot. Anyone who is signed in so I can see their name will get credit in the acknowledgements.