“Digital History” Can Never Be New

If you claim computational approaches to history (“digital history”) lets historians ask new types of questions, or that they offer new historical approaches to answering or exploring old questions, you are wrong. You’re not actually wrong, but you are institutionally wrong, which is maybe worse.

This is a problem, because rhetoric from practitioners (including me) is that we can bring some “new” to the table, and when we don’t, we’re called out for not doing so. The exchange might (but probably won’t) go like this:

Digital Historian: And this graph explains how velociraptors were of utmost importance to Victorian sensibilities.

Historian in Audience: But how is this telling us anything we haven’t already heard before? Didn’t John Hammond already make the same claim?

DH: That’s true, he did. One thing the graph shows, though, is that velicoraptors in general tend to play much more unimportant roles across hundreds of years, which lends support to the Victorian thesis.

HiA: Yes, but the generalized argument doesn’t account for cultural differences across those times, so doesn’t meaningfully contribute to this (or any other) historical conversation.

New Questions

History (like any discipline) is made of people, and those people have Ideas about what does or doesn’t count as history (well, historiography, but that’s a long word so let’s ignore it). If you ask a new type of question or use a new approach, that new thing probably doesn’t fit historians’ Ideas about proper history.

Take culturomics. They make claims like this:

The age of peak celebrity has been consistent over time: about 75 years after birth. But the other parameters have been changing. Fame comes sooner and rises faster. Between the early 19th century and the mid-20th century, the age of initial celebrity declined from 43 to 29 years, and the doubling time fell from 8.1 to 3.3 years.

Historians saw those claims and asked “so what”? It’s not interesting or relevant according to the things historians usually consider interesting or relevant, and it’s problematic in ways historians find things problematic. For example, it ignores cultural differences, does not speak to actual human experiences, and has nothing of use to say about a particular historical moment.

It’s true. Culturomics-style questions do not fit well within a humanities paradigm (incommensurable, anyone?). By the standard measuring stick of what makes a good history project, culturomics does not measure up. A new type of question requires a new measuring stick; in this case, I think a good one for culturomics-style approaches is the extent to which they bridge individual experiences with large-scale social phenomena, or how well they are able to reconcile statistical social regularities with free or contingent choice.

The point, though, is a culturomics presentation would fit few of the boxes expected at a history conference, and so would be considered a failure. Rightly so, too—it’s a bad history presentation. But what culturomics is successfully doing is asking new types of questions, whether or not historians find them legitimate or interesting. Is it good culturomics?

To put too fine a point on it, since history is often a question-driven discipline, new types of questions that are too different from previous types are no longer legitimately within the discipline of history, even if they are intrinsically about human history and do not fit in any other discipline.

What’s more, new types of questions may appear simplistic by historian’s standards, because they fail at fulfilling even the most basic criteria usually measuring historical worth. It’s worth keeping in mind that, to most of the rest of the world, our historical work often fails at meeting their criteria for worth.

New Approaches

New approaches to old questions share a similar fate, but for different reasons. That is, if they are novel, they are not interesting, and if they are interesting, they are not novel.

Traditional historical questions are, let’s face it, not particularly new. Tautologically. Some old questions in my field are: what role did now-silent voices play in constructing knowledge-making instruments in 17th century astronomy? How did scholarship become institutionalized in the 18th century? Why was Isaac Newton so annoying?

My own research is an attempt to provide a broader view of those topics (at least, the first two) using computational means. Since my topical interest has a rich tradition among historians, it’s unlikely any of my historically-focused claims (for example, that scholarly institutions were built to replace the really complicated and precarious role people played in coordinating social networks) will be without precedent.

After decades, or even centuries, of historical work in this area, there will always be examples of historians already having made my claims. My contribution is the bolstering of a particular viewpoint, the expansion of its applicability, the reframing of a discussion. Ultimately, maybe, I convince the world that certain social network conditions play an important role in allowing scholarly activity to be much more successful at its intended goals. My contribution is not, however, a claim that is wholly without precedent.

But this is a problem, since DH rhetoric, even by practitioners, can understandably lead people to expect such novelty. Historians in particular are very good at fitting old patterns to new evidence. It’s what we’re trained to do.

Any historical claim (to an acceptable question within the historical paradigm) can easily be countered with “but we already knew that”. Either the question’s been around long enough that every plausible claim has been covered, or the new evidence or theory is similar enough to something pre-existing that it can be taken as precedent.

The most masterful recent discussion of this topic was Matthew Lincoln’s Confabulation in the humanities, where he shows how easy it is to make up evidence and get historians to agree that they already knew it was true.

To put too fine a point on it, new approaches to old historical questions are destined to produce results which conform to old approaches; or if they don’t, it’s easy enough to stretch the old & new theories together until they fit. New approaches to old questions will fail at producing completely surprising results; this is a bad standard for historical projects. If a novel methodology were to create truly unrecognizable results, it is unlikely those results would be recognized as “good history” within the current paradigm. That is, historians would struggle to care.

What Is This Beast?

What is this beast we call digital history? Boundary-drawing is a tried-and-true tradition in the humanities, digital or otherwise. It’s theoretically kind of stupid but practically incredibly important, since funding decisions, tenure cases, and similar career-altering forces are at play. If digital history is a type of history, it’s fundable as such, tenurable as such; if it isn’t, it ain’t. What’s more, if what culturomics researchers are doing are also history, their already-well-funded machine can start taking slices of the sad NEH pie.

Artist's rendition of sad NEH pie. [via]
Artist’s rendition of sad NEH pie. [via]
So “what counts?” is unfortunately important to answer.

This discussion around what is “legitimate history research” is really important, but I’d like to table it for now, because it’s so often conflated with the discussion of what is “legitimate research” sans history. The former question easily overshadows the latter, since academics are mostly just schlubs trying to make a living.

For the last century or so, history and philosophy of science have been smooshed together in departments and conferences. It’s caused a lot of concern. Does history of science need philosophy of science? Does philosophy of science need history of science? What does it mean to combine the two? Is what comes out of the middle even useful?

Weirdly, the question sometimes comes down to “does history and philosophy of science even exist?”. It’s weird because people identify with that combined title, so I published a citation analysis in Erkenntnis a few years back that basically showed that, indeed, there is an area between the two communities, and indeed those people describe themselves as doing HPS, whatever that means to them.

Look! Right in the middle there, it's history and philosophy of science.
Look! Right in the middle there, it’s history and philosophy of science.

I bring this up because digital history, as many of us practice it, leaves us floating somewhere between public engagement, social science, and history. Culturomics occupies a similar interstitial space, though inching closer to social physics and complex systems.

From this vantage point, we have a couple of options. We can say digital history is just history from a slightly different angle, and try to be evaluated by standard historical measuring sticks—which would make our work easily criticized as not particularly novel. Or we can say digital history is something new, occupying that in-between space—which could render the work unrecognizable to our usual communities.

The either/or proposition is, of course, ludicrous. The best work being done now skirts the line, offering something just novel enough to be surprising, but not so out of traditional historical bounds as to be grouped with culturomics. But I think we need to more deliberate and organized in this practice, lest we want to be like History and Philosophy of Science, still dealing with basic questions of legitimacy fifty years down the line.

In the short term, this probably means trying not just to avoid the rhetoric of newness, but to actively curtail it. In the long term, it may mean allying with like-minded historians, social scientists, statistical physicists, and complexity scientists to build a new framework of legitimacy that recognizes the forms of knowledge we produce which don’t always align with historiographic standards. As Cassidy Sugimoto and I recently wrote, this often comes with journals, societies, and disciplinary realignment.

The least we can do is steer away from a novelty rhetoric, since what is novel often isn’t history, and what is history often isn’t novel.

“Branding” – An Addendum

After writing this post, I read Amardeep Singh’s call to, among other things, avoid branding:

Here’s a way of thinking that might get us past this muddle (and I think I agree with the authors that the hype around DH is a mistake): let’s stop branding our scholarship. We don’t need Next Big Things and we don’t need Academic Superstars, whether they are DH Superstars or Theory Superstars. What we do need is to find more democratic and inclusive ways of thinking about the value of scholarship and scholarly communities.

This is relevant here, and good, but tough to reconcile with the earlier post. In an ideal world, without disciplinary brandings, we can all try to be welcoming of works on their own merits, without relying our preconceived disciplinary criteria. In the present condition, though, it’s tough to see such an environment forming. In that context, maybe a unified digital history “brand” is the best way to stay afloat. This would build barriers against whatever new thing comes along next, though, so it’s a tough question.

Representation at Digital Humanities Conferences (2000-2015)

Nickoal Eichmann (corresponding author), Jeana Jorgensen, Scott B. Weingart 1

NOTE: This is a pre-peer reviewed draft submitted for publication in Feminist Debates in Digital Humanities, eds. Jacque Wernimont and Elizabeth Losh, University of Minnesota Press (2017). Comments are welcome, and a downloadable dataset / more figures are forthcoming. This chapter will be released alongside another on the history of DH conferences, co-authored by Weingart & Eichmann (forthcoming), which will go into further detail on technical aspects of this study, including the data collection & statistics. Many of the materials first appeared on this blog. To cite this preprint, use the figshare DOI:  https://dx.doi.org/10.6084/m9.figshare.3120610.v1


Digital Humanities (DH) is said to have a light side and a dark side. Niceness, globality, openness, and inclusivity sit at one side of this binary caricature; commodification, neoliberalism, techno-utopianism, and white male privilege sit at the other. At times, the plurality of DH embodies both descriptions.

We hope a diverse and critical DH is a goal shared by all. While DH, like the humanities writ large, is not a monolith, steps may be taken to improve its public face and shared values through positively influencing its communities. The Alliance of Digital Humanities Organizations’ (ADHO’s) annual conference hosts perhaps the largest such community. As an umbrella organization of six international digital humanities constituent organizations, as well as 200 DH centers in a few dozen countries, ADHO and its conference ought to represent the geographic, disciplinary, and demographic diversity of those who identify as digital humanists.

The annual conference offers insight into how the world sees DH. While it may not represent the plurality of views held by self-described digital humanists, the conference likely influences the values of its constituents. If the conference glorifies Open Access, that value will be taken up by its regular attendees; if the conference fails to prioritize diversity, this too will be reinforced.

This chapter explores fifteen years of DH conferences, presenting a quantified look at the values implicitly embedded in the event. Women are consistently underrepresented, in spite of the fact that the most prominent figures at the conference are as likely women as men. The geographic representation of authors has become more diverse over time—though authors with non-English names are still significantly less likely to pass peer review. The topical landscape is heavily gendered, suggesting a masculine bias may be built into the value system of the conference itself. Without data on skin color or ethnicity, we are unable to address racial or related diversity and bias here.

There have been some improvements over time and, especially recently, a growing awareness of diversity-related issues. While many of the conference’s negative traits are simply reflections of larger entrenched academic biases, this is no comfort when self-reinforcing biases foster a culture of microaggression and white male privilege. Rather than using this study as an excuse to write off DH as just another biased community, we offer statistics, critiques, and suggestions as a vehicle to improve ADHO’s conference, and through it the rest of self-identified Digital Humanities.


Digital humanities (DH), we are told, exists under a “big tent”, with porous borders, little gatekeeping, and, heck, everyone’s just plain “nice”. Indeed, the term itself is not used definitionally, but merely as a “tactical convenience” to get stuff done without worrying so much about traditional disciplinary barriers. DH is “global”, “public”, and diversely populated. It will “save the humanities” from its crippling self-reflection (cf. this essay), while simultaneously saving the computational social sciences from their uncritical approaches to data. DH contains its own mirror: it is both humanities done digitally, and the digital as scrutinized humanistically. As opposed to the staid, backwards-looking humanities we are used to, the digital humanities “experiments”, “plays”, and even “embraces failure” on ideological grounds. In short, we are the hero Gotham needs.

Digital Humanities, we are told, is a narrowly-defined excuse to push a “neoliberal agenda”, a group of “bullies” more interested in forcing humanists to code than in speaking truth to power. It is devoid of cultural criticism, and because of the way DHers uncritically adopt tools and methods from the tech industry, they in fact often reinforce pre-existing power structures. DH is nothing less than an unintentionally rightist vehicle for techno-utopianism, drawing from the same font as MOOCs and complicit in their devaluing of education, diversity, and academic labor. It is equally complicit in furthering both the surveillance state and the surveillance economy, exemplified in its stunning lack of response to the Snowden leaks. As a progeny of the computer sciences, digital humanities has inherited the same lack of gender and racial diversity, and any attempt to remedy the situation is met with incredible resistance.

The truth, as it so often does, lies somewhere in the middle of these extreme caricatures. It’s easy to ascribe attributes to Digital Humanities synecdochically, painting the whole with the same brush as one of its constituent parts. One would be forgiven, for example, for coming away from the annual international ADHO Digital Humanities conference assuming DH were a parade of white men quantifying literary text. An attendee of HASTAC, on the other hand, might leave seeing DH as a diverse community focused on pedagogy, but lacking in primary research. Similar straw-snapshots may be drawn from specific journals, subcommunities, regions, or organizations.

But these synecdoches have power. Our public face sets the course of DH, via who it entices to engage with us, how it informs policy agendas and funding allocations, and who gets inspired to be the next generation of digital humanists. Especially important is the constituency and presentation of the annual Digital Humanities conference. Every year, several hundred students, librarians, staff, faculty, industry professionals, administrators and researchers converge for the conference, organized by the Alliance of Digital Humanities Organizations (ADHO). As an umbrella organization of six international digital humanities constituent organizations, as well as 200 DH centers in a few dozen countries, ADHO and its conference ought to represent the geographic, disciplinary, and demographic diversity of those who identify as digital humanists. And as DH is a community that prides itself on its activism and its social/public goals, if the annual DH conference does not celebrate this diversity, the DH community may suffer a crisis of identity (…okay, a bigger crisis of identity).

So what does the DH conference look like, to an outsider? Is it diverse? What topics are covered? Where is it held? Who is participating, who is attending, and where are they coming from? This essay offers incomplete answers to these questions for fifteen years of DH conferences (2000-2015), focusing particularly on DH2013 (Nebraska, USA), DH2014 (Lausanne, Switzerland), and DH2015 (Sydney, Australia). 2 We do so with a double-agenda: (1) to call out the biases and lack of diversity at ADHO conferences in the earnest hope it will help improve future years’ conferences, and (2) to show that simplistic, reductive quantitative methods can be applied critically, and need not feed into techno-utopic fantasies or an unwavering acceptance of proxies as a direct line to Truth. By “distant reading” DH and turning our “macroscopes” on ourselves, we offer a critique of our culture, and hopefully inspire fruitful discomfort in DH practitioners who apply often-dehumanizing tools to their subjects, but have not themselves fallen under the same distant gaze.

Among other findings, we observe a large gender gap for authorship that is not mirrored among those who simply attend the conference. We also show a heavily gendered topical landscape, which likely contributes to topical biases during peer review. Geographic diversity has improved over fifteen years, suggesting ADHO’s strategy to expand beyond the customary North American / European rotation was a success. That said, there continues to be a visible bias against non-English names in the peer review process. We could not get data on ethnicity, race, or skin color, but given our regional and name data, as well as personal experience, we suspect in this area, diversity remains quite low.

We do notice some improvement over time and, especially in the last few years, a growing awareness of our own diversity problems. The #whatifDH2016 3 hashtag, for example, was a reaction to an all-male series of speakers introducing DH2015 in Sydney. The hashtag caught on and made it to ADHO’s committee on conferences, who will use it in planning future events. Our remarks here are in the spirit of #whatifDH2016; rather than using this study as an excuse to defame digital humanities, we hope it becomes a vehicle to improve ADHO’s conference, and through it the rest of our community.

Social Justice and Equality in the Digital Humanities

Diversity in the Academy

In order to contextualize gender and ethnicity in the DH community, we must take into account developments throughout higher education. This is especially important since much of DH work is done in university and other Ivory Tower settings. Clear progress has been made from the times when all-male, all-white colleges were the norm, but there are still concerns about the marginalization of scholars who are not white, male, able-bodied, heterosexual, or native English-speakers. Many campuses now have diversity offices and have set diversity-related goals at both the faculty and student levels (for example, see the Ohio State University’s diversity objectives and strategies 2007-12). On the digital front, blogs such as Conditionally Accepted, Fight the Tower, University of Venus, and more all work to expose the normative biases in academia through activist dialogue.

From both a historical and contemporary lens, there is data supporting the clustering of women and other minority scholars in certain realms of academia, from specific fields and subjects to contingent positions. When it comes to gender, the phrase “feminization” has been applied both to academia in general and to specific fields. It contains two important connotations: that of an area in which women are in the majority, and the sense of a change over time, such that numbers of women participants are increasing in relation to men (Leathwood and Read 2008, 10). It can also signal a less quantitative shift in values, “whereby ‘feminine’ values, concerns, and practices are seen to be changing the culture of an organization, a field of practice or society as a whole” (ibid).

In terms of specific disciplines, the feminization of academia has taken a particular shape. Historian Lynn Hunt suggests the following propositions about feminization in the humanities and history specifically: the feminization of history parallels what is happening in the social sciences and humanities more generally; the feminization of the social sciences and humanities is likely accompanied by a decline in status and resources; and other identity categories, such as ethnic minority status and age/generation, also interact with feminization in ways that are still becoming coherent.

Feminization has clear consequences for the perception and assignation of value of a given field. Hunt writes: “There is a clear correlation between relative pay and the proportion of women in a field; those academic fields that have attracted a relatively high proportion of women pay less on average than those that have not attracted women in the same numbers.” Thus, as we examine the topics that tend to be clustered by gender in DH conference submissions, we must keep in mind the potential correlations of feminization and value, though it is beyond the scope of this paper to engage in chicken-or-egg debates about the causal relationship between misogyny and the devaluing of women’s labor and women’s topics.

There is no obvious ethnicity-based parallel to the concept of the feminization of academia; it wouldn’t be culturally intelligible to talk about the “people-of-colorization of academia”, or the “non-white-ization of academia.” At any rate, according to a U.S. Department of Education survey, in 2013 79% of all full-time faculty in degree-granting postsecondary institutions were white. The increase of non-white faculty from 2009 (19.2% of the whole) to 2013 (21.5%) is very small indeed.

Why does this matter? As Jeffrey Milem, Mitchell Chang, and Anthony Lising Antonio write in regard to faculty of color, “Having a diverse faculty ensures that students see people of color in roles of authority and as role models or mentors. Faculty of color are also more likely than other faculty to include content related to diversity in their curricula and to utilize active learning and student-centered teaching techniques…a coherent and sustained faculty diversity initiative must exist if there is to be any progress in diversifying the faculty” (25). By centering marginalized voices, scholarly institutions have the ability to send messages about who is worthy of inclusion.

Recent Criticisms of Diversity in DH

In terms of DH specifically, diversity within the community and conferences has been on the radar for several years, and has recently gained special attention, as digital humanists and other academics alike have called for critical and feminist engagement in diversity and a move away from what seems to be an exclusionary culture. In January 2011, THATCamp SoCal included a section called “Diversity in DH,” in which participants explored the lack of openness in DH and, in the end, produced a document, “Toward an Open Digital Humanities” that summarized their discussions. The “Overview” in this document mirrors the same conversation we have had for the last several years:

We recognize that a wide diversity of people is necessary to make digital humanities function. As such, digital humanities must take active strides to include all the areas of study that comprise the humanities and must strive to include participants of diverse age, generation, sex, skill, race, ethnicity, sexuality, gender, ability, nationality, culture, discipline, areas of interest. Without open participation and broad outreach, the digital humanities movement limits its capacity for critical engagement. (ibid)

This proclamation represents the critiques of the DH landscape in 2011, in which DH practitioners and participants were assumed to be privileged and white, that they excluded student-learners, and that they held myopic views of what constitutes DH. Most importantly for this chapter, THATCamp SoCal’s “Diversity in DH” section participants called for critical approaches and social justice of DH scholarship and participation, including “principles for feminist/non-exclusionary groundrules in each session (e.g., ‘step up/step back’) so that the loudest/most entitled people don’t fill all the quiet moments.” They also advocated defending the least-heard voices “so that the largest number of people can benefit…”

These voices certainly didn’t fall flat. However, since THATCamps are often comprised of geographically local DH microcommunities, they benefit from an inclusive environment but suffer as isolated events. As result, it seems that the larger, discipline-specific venues which have greater attendance and attraction continue to amplify privileged voices. Even so, 2011 continued to represent a year that called for critical engagement in diversity in DH, with an explicit “Big Tent” theme for DH2011 held in Stanford, California. Embracing the concept the “Big Tent” deliberately opened the doors and widened the spectrum of DH, at least in terms of methods and approaches. However, as Melissa Terras pointed out, DH was “still a very rich, very western academic field” (Terras, 2011), even with a few DH2011 presentations engaging specifically with topics of diversity in DH. 4

A focus on diversity-related issues has only grown in the interim. We’ve recently seen greater attention and criticism of DH exclusionary culture, for instance, at the 2015 Modern Language Association (MLA) annual convention, which included the roundtable discussion “Disrupting Digital Humanities.” It confronted the “gatekeeping impulse” in DH, and echoing THATCamp SoCal 2011, these panelists aimed to shut down hierarchical dialogues in DH, encourage non-traditional scholarship, amplify “marginalized voices,” advocate for DH novices, and generously support the work of peers. 5 The theme for DH2015 in Sydney, Australia was “Global Digital Humanities,” and between its successes and collective action arising from frustrations at its failures, the community seems poised to pay even greater attention to diversity. Other recent initiatives in this vein worth mention include #dhpoco, GO::DH, and Jacqueline Wernimont’s “Build a Better Panel,” 6 whose activist goals are helping diversify the community and raise awareness of areas where the community can improve.

While it would be fruitful to conduct a longitudinal historiographical analysis of diversity in DH, more recent criticisms illustrate a history of perceived exclusionary culture, which is why we hope to provide a data-driven approach to continue the conversation and call for feminist and critical engagement and intervention.


While DH as a whole has been critiqued for its lack of diversity and inclusion, how does the annual ADHO DH conference measure up? To explore this in a data-driven fashion, we have gathered publicly available annual ADHO conference programs and schedules from 2000-2015. From those conference materials, we have entered presentation and author information into a spreadsheet to analyze various trends over time, such as gender and geography as indicators of diversity. Particular information that we have collected includes: presentation title, keywords (if available), abstract and full-text (if available), presentation type, author name, author institutional affiliation and academic department (if available), and corresponding country of that affiliation at the time of the presentation(s). We normalized and hand-cleaned names, institutions, and departments, so that, to the best of our knowledge, each author entry represented a unique person and, accordingly, was assigned a unique ID. Next, we added gender information (m/f/other/unknown) to authors by a combination of hand-entry and automated inference. While this is problematic for many reasons, 7 since it does not allow for diversity in gender options and tracing gender changes over time, it does give us a useful preliminary lense to view gender diversity at DH conferences.

For 2013’s conference, ADHO instituted a series of changes aimed at improving inclusivity, diversity, and quality. This drive was steered by that year’s program committee chair, Bethany Nowviskie, alongside 2014’s chair, Melissa Terras. Their reformative goals matched our current goals in this essay, and speak to a long history of experimentation and improvement efforts on behalf of ADHO. Their changes included making the conference more welcome to outsiders through ending policies that only insiders knew about; making the CFP less complex and easier to translate into multiple languages; taking reviewer language competencies into account systematically; and streamlining the submission and review process.

The biggest noticeable change to DH2013, however, was the institution of a reviewer bidding process and a phase of semi-open peer review. Peer reviewers were invited to read through and rank every submitted abstract according to how qualified they felt to review the abstract. Following this, the conference committee would match submissions to qualified peer reviewers, taking into account conflicts of interest. Submitting authors were invited to respond to reviews, and the committee would make a final decision based on the various reviews and rebuttals.This continues to be the process through DH2016. Changes continue to be made, most recently in 2016 with the addition of “Diversity” and “Multilinguality” as new keywords authors can append to their submissions.

While the list of submitted abstracts was private, accessible only to reviewers, as reviewers ourselves we had access to the submissions during the bidding phase. We used this access to create a dataset of conference submissions for DH2013, DH2014, and DH2015, which includes author names, affiliations, submission titles, author-selected topics, author-chosen keywords, and submission types (long paper, short paper, poster, panel).

We augmented this dataset by looking at the final conference programs in ‘13, ‘14, and ‘15, noting which submissions eventually made it onto the final conference program, and how they changed from the submission to the final product. This allows us to roughly estimate the acceptance rate of submissions, by comparing the submitted abstract lists to the final programs. It is not perfect, however, given that we don’t actually know whether submissions that didn’t make it to the final program were rejected, or if they were accepted and withdrawn. We also do not know who reviewed what, nor do we know the reviewers’ scores or any associated editorial decisions.

The original dataset, then, included fields for title, authors, author affiliations, original submission type, final accepted type, topics, keywords, and a boolean field for whether a submission made it to the final conference program. We cleaned the data up by merging duplicate people, ensuring e.g., if “Melissa Terras” was an author on two different submissions, she counted as the same person. For affiliations, we semi-automatically merged duplicate institutions, found the countries they reside in, and assigned those countries to broad UN regions. We also added data to the set, first automatically guessing a gender for each author, and then correcting the guesses by hand.

Given that abstracts were submitted to conferences with an expectation of privacy, we have not released the full submission dataset; we have, however, released the full dataset of final conference programs. 8

We would like to acknowledge the gross and problematic simplifications involved in this process of gendering authors without their consent or input. As Miriam Posner has pointed out, with regards to Getty’s Union List of Author Names, “no self-respecting humanities scholar would ever get away with such a crude representation of gender in traditional work”. And yet, we represent authors in just this crude fashion, labeling authors as male, female, or unknown/other. We did not encode changes of author gender over time, even though we know of at least a few authors in the dataset for whom this applies. We do not use the affordances of digital data to represent the fluidity of gender. This is problematic for a number of reasons, not least of which because, when we take a cookie cutter to the world, everything in the world will wind up looking like cookies.

We made this decision because, in the end, all data quality is contingent to the task at hand. It is possible to acknowledge an ontology’s shortcomings while still occasionally using that ontology to a positive effect. This is not always the case: often poor proxies get in the way a research agenda (e.g., citations as indicators of “impact” in digital humanities), rather than align with it. In the humanities, poor proxies are much more likely to get in the way of research than help it along, and afford the ability to make insensitive or reductivist decisions in the name of “scale”.

For example, in looking for ethnic diversity of a discipline, one might analyze last names as a proxy for country of origin, or analyze the color of recognized faces in pictures from recent conferences as a proxy for ethnic genealogy. Among other reasons, this approach falls short because ethnicity, race, and skin color are often not aligned, and last names (especially in the U.S.) are rarely indicative of anything at all. But they’re easy solutions, so people use them. These are moments when a bad proxy (and for human categories, proxies are almost universally bad) does not fruitfully contribute to a research agenda. As George E.P. Box put it, “all models are wrong, but some are useful.”

Some models are useful. Sometimes, the stars align and the easy solution is the best one for the question. If someone were researching immediate reactions of racial bias in the West, analyzing skin tone may get us something useful. In this case, the research focus is not someone’s racial identity, but someone’s race as immediately perceived by others, which would likely align with skin tone. Simply: if a person looks black, they’re more likely to be treated as such by the (white) world at large. 9

We believe our proxies, though grossly inaccurate, are useful for the questions of gender and geographic diversity and bias. The first step to improving DH conference diversity is noticing a problem; our data show that problem through staggeringly imbalanced regional and gender ratios. With regards to gender bias, showing whether reviewers are less likely to accept papers from authors who appear to be women can reveal entrenched biases, whether or not the author actually identifies as a woman. With that said, we invite future researchers to identify and expand on our admitted categorical errors, allowing everyone to see the contours of our community with even greater nuance.


The annual ADHO conference has grown significantly in the last fifteen years, as described in our companion piece 10, within which can be found a great discussion of our methods. This piece, rather than covering overall conference trends, focuses specifically on issues of diversity and acceptance rates. We cover geographic and gender diversity from 2000-2015, with additional discussions of topicality and peer review bias beginning in 2013.


Women comprise 36.1% of the 3,239 authors to DH conference presentations over the last fifteen years, counting every unique author only once. Melissa Terras’ names appears on 29 presentations between 200-2015, and Scott B. Weingart’s name appears on 4 presentations, but for the purpose of this metric each name counts only once. Female authorship representation fluctuates between 29%-38% depending on the year.

Weighting every authorship event individually (i.e., Weingart’s name counts 4 times, Terras’ 29 times), women’s representation drops to 32.7%. This reveals that women are less likely to author multiple pieces compared to their male counterparts. More than a third of the DH authorship pool are women, but fewer than a third of every name that appears on a presentation is a woman’s. Even fewer single-authored pieces are by a woman; only 29.8% of the 984 single-authored works between 2000-2015 female-authored. About a third (33.4%) of first authors on presentations are women. See Fig. 1 for a breakdown of these numbers over time. Note the lack of periodicity, suggesting gender representation is not affected by whether the conference is held in Europe or North America (until 2015, the conference alternated locations every year). The overall ratio wavers, but is neither improving nor worsening over time.

Figure 1. re
Figure 1. Representation of Women at ADHO Conferences, 2000-2015.

The gender disparity sparked controversy at DH2015 in Sydney. It was, however, at odds with a common anecdotal awareness that many of the most respected role-models and leaders in the community are women. To explore this disconnect, we experimented with using centrality in co-authorship networks as a proxy for fame, respectability, and general presence within the DH consciousness. We assume that individuals who author many presentations, co-author with many people, and play a central role in connecting DH’s disparate communities of authorship are the ones who are most likely to garner the respect (or at least awareness) of conference attendees.

We created a network of authors connected to their co-authors from presentations between 2000-2015, with ties strengthening the more frequently two authors collaborate. Of the 3,239 authors in our dataset, 61% (1,750 individuals) are reachable by one another via their co-authorship ties. For example, Beth Plale is reachable by Alan Liu because she co-authored with J. Stephen Downie, who co-authored with Geoffrey Rockwell, who co-authored with Alan Liu. Thus, 61% of the network is connected in one large component, and there are 299 smaller components, islands of co-authorship disconnected from the larger community.

The average woman co-authors with 5 other authors, and the average man co-authors with 5.3 other authors. The median number of co-authors for both men and women is 4. The average and median of several centrality measurements (closeness, betweenness, pagerank, and eigenvector) for both men and women are nearly equivalent; that is, any given woman is just as likely to be near the co-authorship core as any given man. Naturally, this does not imply that half of the most central authors are women, since only a third of the entire authorship pool are women. It means instead that gender does not influence one’s network centrality. Or at least it should.

The statistics show a curious trend for the most central figures in the network. Of the top 10 authors who co-author with the most others, 60% are women. Of the top 20, 45% are women. Of the top 50, 38% are women. Of the top 100, 32% are women. That is, the over half the DH co-authorship stars are women, but the further towards the periphery you look, the more men occupy the middle-tier positions (i.e., not stars, but still fairly active co-authors). The same holds true for the various centrality measurements: betweenness (60% women in top 10; 40% in top 20; 32% in top 50; 34% in top 100), pagerank (50% women in top 10; 40% in top 20; 32% in top 50; 28% in top 100), and eigenvector (60% women in top 10; 40% in top 20; 40% in top 50; 34% in top 100).

In short, half or more of the DH conference stars are women, but as you creep closer to the network periphery, you are increasingly likely to notice the prevailing gender disparity. This supports the mismatch between an anecdotal sense that women play a huge role in DH, and the data showing they are poorly represented at conferences. The results also match with the fact that women are disproportionately more likely to write about management and leadership, discussed at greater length below.

The heavily-male gender skew at DH conferences may lead one to suspect a bias in the peer review process. Recent data, however, show that if such a bias exists, it is not direct. Over the past three conferences, 71% of women and 73% of men who submitted presentations passed the peer review process. The difference is not great enough to rule out random chance (p=0.16 using χ²). The skew at conferences is more a result of fewer women submitting articles than of women’s articles not getting accepted. The one caveat, explained more below, is that certain topics women are more likely to write about are also less likely to be accepted through peer-review.

This does not imply a lack of bias in the DH community. For example, although only 33.5% of authors at DH2015 in Sydney were women, 46% of conference attendees were women. If women were simply uninterested in DH, the split in attendance vs. authorship would not be so high.

In regard to discussions of women in different roles in the DH community – less the publishing powerhouses and more the community leaders and organizers – the concept of the “glass cliff” can be useful. Research on the feminization of academia in Sweden uses the term “glass cliff” as a “metaphor used to describe a phenomenon when women are appointed to precarious leadership roles associated with an increased risk of negative consequences when a company is performing poorly and for example is experiencing profit falls, declining stock performance, and job cuts” (Peterson 2014, 4). The female academics (who also occupied senior managerial positions) interviewed in Helen Peterson’s study expressed concerns about increasing workloads, the precarity of their positions, and the potential for interpersonal conflict.

Institutional politics may also play a role in the gendered data here. Sarah Winslow says of institutional context that “female faculty are less likely to be located at research institutions or institutions that value research over teaching, both of which are associated with greater preference for research” (779). The research, teaching, and service divide in academia remains a thorny issue, especially given the prevalence of what has been called the pink collar workforce in academia, or the disproportionate amount of women working in low-paying teaching-oriented areas. This divide likely also contributed to differing gender ratios between attendees and authors at DH2015.

While the gendered implications of time allocation in universities are beyond the scope of this paper, it might be useful to note that there might be long-term consequences for how people spend their time interacting with scholarly tasks that extend beyond one specific institution. Winslow writes: “Since women bear a disproportionate responsibility for labor that is institution-specific (e.g., institutional housekeeping, mentoring individual students), their investments are less likely to be portable across institutions. This stands in stark contrast to men, whose investments in research make them more highly desirable candidates should they choose to leave their own institutions” (790). How this plays out specifically in the DH community remains to be seen, but the interdisciplinarity of DH along with its projects that span multiple working groups and institutions may unsettle some of the traditional bias that women in academia face.


Until 2015, the DH conference alternated every year between North America and Europe. As expected, until recently, the institutions represented at the conference have hailed mostly from these areas, with the primary locus falling in North America. In fact, since 2000, North American authors were the largest authorial constituency at eleven of the fifteen conferences, even though North America only hosted the conference seven times in that period.

With that said, as opposed to gender representation, national and institutional diversity is improving over time. Using an Index of Qualitative Variation (IQV), institutional variation begins around 0.992 in 2000 and ends around 0.996 in 2015, with steady increases over time. National IQV begins around 0.79 in 2010 and ends around 0.83 in 2015, also with steady increases over time. The most recent conference was the first that included over 30% of authors and attendees arriving from outside Europe or North America. Now that ADHO has implemented a three-year cycle, with every third year marked by a movement outside its usual territory, that diversity is likely to increase further still.

The most well-represented institutions are not as dominating as some may expect, given the common view of DH as a community centered around particular powerhouse departments or universities. The university with the most authors contributing to DH conferences (2.4% of the total authors) is King’s College London, followed by the Universities of Illinois (1.85%), Alberta (1.83%), and Virginia (1.75%). The most prominent university outside of North America or Europe is Ritsumeikan University, contributing 1.07% of all DH conference authors. In all, over a thousand institutions have contributed authors to the conference, and that number increases every year.

While these numbers represent institutional origins, the data available does not allow any further diving into birth countries, native language, ethnic identities, etc. The 2013-2015 dataset, including peer review information, does yield some insight into geography-influenced biases that may map to language or identity. While the peer review data do not show any clear bias by institutional country, there is a very clear bias against names which do not appear frequently in the U.S. Census or Social Security Index. We discovered this when attempting to statistically infer the gender of authors using these U.S.-based indices. 11 From 2013-2015, presentations written by those with names appearing frequently in these indices were significantly more likely to be accepted than those written by authors with non-English names (p < 0.0001). Whereas approximately 72% of authors with common U.S. names passed peer review, only 61% of authors with uncommon names passed. Without more data, we have no idea whether this tremendous disparity is due to a bias against popular topics from non-English-speaking countries, a higher likelihood of peer reviewers rejecting text written by non-native writers, an implicit bias by peer reviewers when they see “foreign” names, or something else entirely.


When submitting a presentation, authors are given the opportunity to provide keywords for their submission. Some keywords can be chosen freely, while others must be chosen from a controlled list of about 100 potential topics. These controlled keywords are used to help in the process of conference organization and peer reviewer selection, and they stay roughly constant every year. New keywords are occasionally added to the list, as in 2016, where authors can now select three topics which were not previously available: “Digital Humanities – Diversity”, “Digital Humanities – Multilinguality”, and “3D Printing”. The 2000-2015 conference dataset does not include keywords for every article, so this analysis will only cover the more detailed dataset, 2013-2015, with additional data on submissions for DH2016.

From 2013-2016, presentations were tagged with an average of six controlled keywords per submission. The most-used keywords are unsurprising: “Text Analysis” (tagged on 22% of submissions), “Data Mining / Text Mining” (20%), “Literary Studies” (20%), “Archives, Repositories, Sustainability And Preservation” (19%), and “Historical Studies” (18%). The most frequently-used keyword potentially pertaining directly to issues of diversity, “Cultural Studies”, appears on on 14% of submissions from 2013-2016. Only 2% of submissions are tagged with “Gender Studies”. The two diversity-related keywords introduced this year are already being used surprisingly frequently, with 9% of submissions in 2016 tagged “Digital Humanities – Diversity” and 6% of submissions tagged “Digital Humanities – Multilinguality”. With over 650 conference submissions for 2016, this translates to a reasonably large community of DH authors presenting on topics related to diversity.

Joining the topic and gender data for 2013-2015 reveals the extent to which certain subject matters are gendered at DH conferences. 12 Women are twice as likely to use the “Gender Studies” tag as male authors, whereas men are twice as likely to use the “Asian Studies” tag as female authors. Subjects related to pedagogy, creative / performing arts, art history, cultural studies, GLAM (galleries, libraries, archives, museums), DH institutional support, and project design/organization/management are more likely to be presented by women. Men, on the other hand, are more likely to write about standards & interoperability, the history of DH, programming, scholarly editing, stylistics, linguistics, network analysis, and natural language processing / text analysis. It seems DH topics have inherited the usual gender skews associated with the disciplines in which those topics originate.

We showed earlier that there was no direct gender bias in the peer review process. While true, there appears to be indirect bias with respect to how certain gendered topics are considered acceptable by the DH conference peer reviewers. A woman has just as much chance of getting a paper through peer review as a man if they both submit a presentation on the same topic (e.g., both women and men have a 72% chance of passing peer review if they write about network analysis, or a 65% chance of passing peer review if they write about knowledge representation), but topics that are heavily gendered towards women are less likely to get accepted. Cultural studies has a 57% acceptance rate, gender studies 60%, pedagogy 51%. Male-skewed topics have higher acceptance rates, like text analysis (83%), programming (80%), or Asian studies (79%). The female-gendering of DH institutional support and project organization also supports our earlier claim that, while women are well-represented among the DH leadership, they are more poorly represented in those topics that the majority of authors are discussing (programming, text analysis, etc.).

Regarding the clustering – and devaluing – of topics that women tend to present on at DH conferences, the widespread acknowledgement of the devaluing of women’s labor may help to explain this. We discussed the feminization of academia above, and indeed, this is a trend seen in practically all facets of society. The addition of emotional labor or caretaking tasks complicates this. Economist Teresa Ghilarducchi explains: “a lot of what women do in their lives is punctuated by time outside of the labor market — taking care of family, taking care of children — and women’s labor has always been devalued…[people] assume that she had some time out of the labor market and that she was doing something that was basically worthless, because she wasn’t being paid for it.” In academia specifically, the labyrinthine relationship of pay to tasks/labor further obscures value: we are rarely paid per task (per paper published or presented) on the research front; service work is almost entirely invisible; and teaching factors in with course loads, often with more up-front transparency for contingent laborers such as adjuncts and part-timers.

Our results seem to point to less of an obvious bias against women scholars than a subtler bias against topics that women tend to gravitate toward, or are seen as gravitating toward. This is in line with the concept of postfeminism, or the notion that feminism has met its main goals (e.g. getting women the right to vote and the right to an education), and thus is irrelevant to contemporary social needs and discourse. Thoroughly enmeshed in neoliberal discourse, postfeminism makes discussing misogyny seem obsolete and obscures the subtler ways in which sexism operates in daily life (Pomerantz, Raby, and Stefanik 2013). While individuals may or may not choose to identify as postfeminist, the overarching beliefs associated with postfeminism have permeated North American culture at a number of levels, leading us to posit the acceptance of the ideals of postfeminism as one explanation for the devaluing of topics that seem associated with women.

Discussion and Future Research

The analysis reveals an annual DH conference with a growing awareness of diversity-related issues, with moderate improvements in regional diversity, stagnation in gender diversity, and unknown (but anecdotally poor) diversity with regards to language, ethnicity, and skin color. Knowledge at the DH conference is heavily gendered, though women are not directly biased against during peer review, and while several prominent women occupy the community’s core, women occupy less space in the much larger periphery. No single or small set of institutions dominate the conference attendance, and though North America’s influence on ADHO cannot be understated, recent ADHO efforts are significantly improving the geographic spread of its constituency.

The DH conference, and by extension ADHO, is not the digital humanities. It is, however, the largest annual gathering of self-identified digital humanists, 13 and as such its makeup holds influence over the community at large. Its priorities, successes, and failures reflect on DH, both within the community and to the outside world, and those priorities get reinforced in future generations. If the DH conference remains as it is—devaluing knowledge associated with femininity, comprising only 36% women, and rejecting presentations by authors with non-English names—it will have significant difficulty attracting a more diverse crowd without explicit interventions. Given the shortcomings revealed in the data above, we present some possible interventions that can be made by ADHO or its members to foster a more diverse community, inspired by #WhatIfDH2016:

  • As pointed out by Yvonne Perkins, Ask presenters to include a brief “Collections Used” section, when appropriate. Such a practice would highlight and credit the important work being done by those who aren’t necessarily engaging in publishable research, and help legitimize that work to conference attendees.

  • As pointed out by Vika Zafrin, create guidelines for reviewers explicitly addressing diversity, and provide guidance on noticing and reducing peer review bias.

  • As pointed out by Vika Zafrin, community members can make an effort to solicit presentation submissions from women and people of color.

  • As pointed out by Vika Zafrin, collect and analyze data on who is peer reviewing, to see whether or the extent to which biases creep in at that stage.

  • As pointed out by Aimée Morrison, ensure that the conference stage is at least as diverse as the conference audience. This can be accomplished in a number of ways, from conference organizers making sure their keynote speakers draw from a broad pool, to organizing last-minute lightning lectures specifically for those who are registered but not presenting.

  • As pointed out by Tonya Howe, encourage presentations or attendance from more process-oriented liberal arts delegates.

  • As pointed out by Christina Boyles, encourage the submission of research focused around the intersection of race, gender, and sexuality studies. This may be partially accomplished by including more topical categories for conference submissions, a step which ADHO has already taken for 2016.

  • As pointed out by many, take explicit steps in ensuring conference access to those with disabilities. We suggest this become an explicit part of the application package submitted by potential host institutions.

  • As pointed out by many, ensure the ease of participation-at-a-distance (both as audience and as speaker) for those without the resources to travel.

  • As requested by Karina van Dalen-Oskam, chair of ADHO’s Steering Committee, send her an email on how to navigate the difficult cultural issues facing an international organization.

  • Give marginalized communities greater representation in the DH Conference peer reviewer pool. This can be done grassroots, with each of us reaching out to colleagues to volunteer as reviewers, and organizationally, perhaps by ADHO creating a volunteer group to seek out and encourage more diverse reviewers.

  • Consider the difference between diversifying (verb) vs. talking about diversity (noun), and consider whether other modes of disrupting hegemony, such as decolonization and queering, might be useful in these processes.

  • Contribute to the #whatifDH2016 and #whatifDH2017 discussions on twitter with other ideas for improvements.

Many options are available to improve representation at DH conferences, and some encouraging steps are already being taken by ADHO and its members. We hope to hear more concrete steps that may be taken, especially learned from experiences in other communities or outside of academia, in order to foster a healthier and more welcoming conference going forward.

In the interest of furthering these goals and improving the organizational memory of ADHO, the public portion of the data (final conference programs with full text and unique author IDs) is available alongside this publication [will link in final draft]. With this, others may test, correct, or improve our work. We will continue work by extending the dataset back to 1990, continuing to collect for future conferences, and creating an infrastructure that will allow the database to connect to others with similar collections. This will include the ability to encode more nuanced and fluid gender representations, and for authors to correct their own entries. Further work will also include exploring topical co-occurrence, institutional bias in peer review, how institutions affect centrality in the co-authorship network, and how authors who move between institutions affect all these dynamics.

The Digital Humanities will never be perfect. It embodies the worst of its criticisms and the best of its ideals, sometimes simultaneously. We believe a more diverse community will help tip those scales in the right direction, and present this chapter in service of that belief.

Works Cited

#whatifdh2015 “TAGS Searchable Twitter Archive,” n.d. http://hawksey.info/tagsexplorer/arc.html?key=10C2c1phG1QywDmy4lG4mro6VBiv0UuZlLL_uZ8HFfkc&gid=400689247

ADHO. “Our Mission,” n.d. http://adho.org/

“ADHO Announces New Steering Committee Chair.” ADHO, n.d. http://www.adho.org/announcements/2015/adho-announces-new-steering-committee-chair

“All Models Are Wrong.” Wikipedia, September 20, 2015. https://en.wikipedia.org/w/index.php?title=All_models_are_wrong&oldid=681908687

Blevins, Cameron, and Lincoln Mullen. “Jane, John … Leslie? A Historical Method for Algorithmic Gender Prediction.” Digital Humanities Quarterly 9, no. 3 (2015). http://www.digitalhumanities.org/dhq/vol/9/3/000223/000223.html

Boyles, Christina. “#WhatIfDH2016 Made Space for Scholars Who Are Interested in the Intersection(s) between DH and Race, Gender, and Sexuality Studies?” @clboyles, July 1, 2015. https://twitter.com/clboyles/statuses/616080151365861376

Burton, John W. Culture and the Human Body: An Anthropological Perspective. Prospect Heights, Ill.: Waveland Press, 2001.

“centerNet,” n.d. http://www.dhcenternet.org/

Cohen, Dan. “Catching the Good.” Dan Cohen, March 30, 2012. http://www.dancohen.org/2012/03/30/catching-the-good/

“Conditionally Accepted.” Inside Higher Education, n.d. https://www.insidehighered.com/users/conditionally-accepted

“Conference.” ADHO, n.d. http://adho.org/conference

“Congrats, You Have an All Male Panel!” n.d. http://allmalepanels.tumblr.com/

“DH Dark Sider (@DHDarkSider) | Twitter,” n.d. https://twitter.com/dhdarksider

“DH Enthusiast (@DH_Enthusiast) | Twitter,” n.d. https://twitter.com/DH_Enthusiast

“Disrupting the Digital Humanities.” Disrupting the Digital Humanities, n.d. http://www.disruptingdh.com/

Diversity in DH @ THATCamp. “Toward an Open Digital Humanities,” January 11, 2011. https://docs.google.com/document/d/1uPtB0xr793V27vHBmBZr87LY6Pe1BLxN-_DuJzqG-wU/edit?usp=sharing

Drucker, Johanna. “Humanistic Theory and Digital Scholarship.” In Debates in the Digital Humanities. University of Minnesota Press, 2012. http://dhdebates.gc.cuny.edu/debates/text/34

“Fight The Tower : Women of Color in Academia,” n.d. http://fighttower.com/

Ghilarducci, Teresa. “Why Women Over 50 Can’t Find Jobs.” Portside, n.d. http://portside.org/2016-01-18/why-women-over-50-can’t-find-jobs

“Global Outlook::Digital Humanities | Promoting Collaboration among Digital Humanities Researchers World-Wide,” n.d. http://www.globaloutlookdh.org/

Golumbia, David. “Right Reaction and the Digital Humanities.” Uncomputing, July 3, 2015. http://www.uncomputing.org/?p=1666

Howe, Tonya. “#whatifDH2016 Advocated for More Process-Oriented Liberal Arts Delegates?” Microblog. Twitter.com/howet, June 30, 2015. https://twitter.com/howet/statuses/616045260570030080

Hunt, Lynn. “Has the Battle Been Won? The Feminization of History.” Perspectives on History, May 1998. https://www.historians.org/publications-and-directories/perspectives-on-history/may-1998/has-the-battle-been-won-the-feminization-of-history

Lothian, Alexis. “THATCamp and Diversity in Digital Humanities.” Queer Geek Theory, n.d. http://www.queergeektheory.org/2011/01/thatcamp-and-diversity-in-digital-humanities/

Milen, Jeffrey F., Mitchell J. Chang, and Anthony Lising Antonio. “Making Diversity Work on Campus: A Research-Based Perspective.” Association American Colleges and Universities, 2005. http://citeseerx.ist.psu.edu/viewdoc/download?doi=

Morrison, Aimée. “#WhatIfDH2016 Had as Many Women on the Stage as in the Audience? http://www.scottbot.net/HIAL/?p=41355 #dh2015.” Microblog. @digiwonk, June 30, 2015. https://twitter.com/digiwonk/status/616042963093835776

Mullen, Lincoln. Ropensci/gender: Predict Gender from Names Using Historical Data, n.d. https://github.com/ropensci/gender

Nowviskie, Bethany. “Asking for It.” Bethany Nowviskie, February 8, 2014. http://nowviskie.org/2014/asking-for-it/

———. “Cats and Ships.” Bethany Nowviskie, November 2, 2012. http://nowviskie.org/2012/cats-and-ships/

Ohio State University. “Diversity Action Plan,” n.d. https://www.osu.edu/diversityplan/index.php

Perkins, Yvonne. “International Researchers Value Work of Australian Libraries and Archives.” Stumbling Through the Past, July 20, 2015. https://stumblingpast.wordpress.com/2015/07/21/intnl_researchers_value_oz_libraries_archives/

Peterson, Helen. “An Academic ‘Glass Cliff’? Exploring the Increase of Women in Swedish Higher Education Management.” Athens Journal of Education 1, no. 1 (February 2014): 32–44.

Pomerantz, Shauna, Rebecca Raby, and Andrea Stefanik. “Girls Run the World? Caught between Sexism and Postfeminism in the School.” *Gender & Society *27, no. 2 (April 1, 2013): 185-207. doi:10.1177/0891243212473199

Posner, Miriam. “What’s Next: The Radical, Unrealized Potential of Digital Humanities.” Miriam Posner’s Blog, July 27, 2015. http://miriamposner.com/blog/whats-next-the-radical-unrealized-potential-of-digital-humanities/

“Postcolonial Digital Humanities | Global Explorations of Race, Class, Gender, Sexuality and Disability within Cultures of Technology,” n.d. http://dhpoco.org/

Steiger, Kay. “The Pink Collar Workforce of Academia: Low-Paid Adjunct Faculty, Who Are Mostly Female, Have Started Unionizing for Better Pay—and Winning.” The Nation, July 11, 2013. http://www.thenation.com/article/academias-pink-collar-workforce/

Terras, Melissa. “Disciplined: Using Educational Studies to Analyse ‘Humanities Computing.’” Literary and Linguistic Computing 21, no. 2 (June 1, 2006): 229–46. doi:10.1093/llc/fql022

———. “Peering Inside the Big Tent: Digital Humanities and the Crisis of Inclusion.” Melissa Terras’ Blog, July 26, 2011. http://melissaterras.blogspot.com/2011/07/peering-inside-big-tent-digital.html

“THATCamp Southern California 2011 | The Humanities and Technology Camp,” n.d. http://socal2011.thatcamp.org/

“University of Venus.” Inside Higher Education, n.d. https://www.insidehighered.com/blogs/university-venus

U.S. Department of Education, National Center for Education Statistics. “Race/ethnicity of College Faculty,” 2015. https://nces.ed.gov/fastfacts/display.asp?id=61

Weingart, Scott. “Acceptances to Digital Humanities 2015 (part 4).” The Scottbot Irregular, June 28, 2015. http://www.scottbot.net/HIAL/?p=41375

———. “The Myth of Text Analytics and Unobtrusive Measurement.” The Scottbot Irregular, May 6, 2012. http://www.scottbot.net/HIAL/?p=16713

Wernimont, Jacqueline. “Build a Better Panel: Women in DH.” Jacqueline Wernimont. Accessed January 14, 2016. https://jwernimont.wordpress.com/2015/09/19/build-a-better-panel-women-in-dh/

———. “No More Excuses.” Jacqueline Wernimont, September 19, 2015. https://jwernimont.wordpress.com/2015/09/19/no-more-excuses/

Winslow, Sarah. “Gender Inequality and Time Allocations Among Academic Faculty.” Gender & Society 24, no. 6 (December 1, 2010): 769–93. doi:10.1177/0891243210386728.

Zafrin, Vika. “#WhatIfDH2016 Created Guidelines for Reviewers Explicitly Addressing Diversity & Providing Guidance on Reducing One’s Bias?” Microblog. @veek, June 30, 2015. https://twitter.com/veek/status/616041712163680256

———. “#WhatIfDH2016 Encouraged ALL Community Members to Reach out to Women & POC and Solicit Paper Submissions?” Microblog. @veek, June 30, 2015. https://twitter.com/veek/statuses/616041931949363200

———. “#WhatIfDH2016 Expanded ConfTool Pro to Record Reviewer Biases along Gender, Race, Country-of-Origin GDP Lines?” Microblog. @veek, June 30, 2015. https://twitter.com/veek/statuses/616043562799636481


  1. Each author contributed equally to the final piece; please disregard authorship order.
  2. See Melissa Terras, “Disciplined: Using Educational Studies to Analyse ‘Humanities Computing.’” Literary and Linguistic Computing 21, no. 2 (June 1, 2006): 229–46. doi:10.1093/llc/fql022. Terras takes a similar approach, analyzing Humanities Computing “through its community, research, curriculum, teaching programmes, and the message they deliver, either consciously or unconsciously, about the scope of the discipline.”
  3. The authors have created a browsable archive of #whatifDH2016 tweets.
  4. Of the 146 presentations at DH2011, two standout in relation to diversity in DH: “Is There Anybody out There? Discovering New DH Practitioners in other Countries” and “A Trip Around the World: Balancing Geographical Diversity in Academic Research Teams.”
  5. See “Disrupting DH,” http://www.disruptingdh.com/
  6. See Wernimont’s blog post, “No More Excuses” (September 2015) for more, as well as the Tumblr blog, “Congrats, you have an all male panel!”
  7. Miriam Posner offers a longer and more eloquent discussion of this in, “What’s Next: The Radical, Unrealized Potential of Digital Humanities.” Miriam Posner’s Blog. July 27, 2015. http://miriamposner.com/blog/whats-next-the-radical-unrealized-potential-of-digital-humanities/
  8. [Link to the full public dataset, forthcoming and will be made available by time of publication])
  9. We would like to acknowledge that race and ethnicity are frequently used interchangeably, though both are cultural constructs with their roots in Darwinian thought, colonialism, and imperialism. We retain these terms because they express cultural realities and lived experiences of oppression and bias, not because there is any scientific validity to their existence. For more on this tension, see John W.Burton, (2001), Culture and the Human Body: An Anthropological Perspective. Prospect Heights, Illinois: Waveland Press, 51-54.
  10. Weingart, S.B. & Eichmann, N. (2016). “What’s Under the Big Tent?: A Study of ADHO Conference Abstracts.” Manuscript submitted for publication.
  11. We used the process and script described in: Lincoln Mullen (2015). gender: Predict Gender from Names Using Historical Data. R package version (https://github.com/ropensci/gender) and Cameron Blevins and Lincoln Mullen, “Jane, John … Leslie? A Historical Method for Algorithmic Gender Prediction,” Digital Humanities Quarterly 9.3 (2015).
  12. For a breakdown of specific numbers of gender representation across all 96 topics from 2013-2015, see Weingart’s “Acceptances to Digital Humanities 2015 (part 4)”.
  13. While ADHO’s annual conference is usually the largest annual gathering of digital humanists, that place is constantly being vied for by the Digital Humanities Summer Institute in Victoria, Canada, which in 2013 boasted more attendees than DH2013 in Lincoln, Nebraska.

Do historians need scientists?

[edit: I’m realizing I didn’t make it clear in this post that I’m aware many historians consider themselves scientists, and that there’s plenty of scientific historical archaeology and anthropology. That’s exactly what I’m advocating there be more of, and more varied.]

Short Answer: Yes.

Less Snarky Answer: Historians need to be flexible to fresh methods, fresh perspectives, and fresh blood. Maybe not that last one, I guess, as it might invite vampires.Okay, I suppose this answer wasn’t actually less snarky.

Long Answer

The long answer is that historians don’t necessarily need scientists, but that we do need fresh scientific methods. Perhaps as an accident of our association with the ill-defined “humanities”, or as a result of our being placed in an entirely different culture (see: C.P. Snow), most historians seem fairly content with methods rooted in thinking about text and other archival evidence. This isn’t true of all historians, of course – there are economic historians who use statistics, historians of science who recreate old scientific experiments, classical historians who augment their research with archaeological findings, archival historians who use advanced ink analysis,  and so forth. But it wouldn’t be stretching the truth to say that, for the most part, historiography is the practice of thinking cleverly about words to make more words.

I’ll argue here that our reliance on traditional methods (or maybe more accurately, our odd habit of rarely discussing method) is crippling historiography, and is making it increasingly likely that the most interesting and innovative historical work will come from non-historians. Sometimes these studies are ill-informed, especially when the authors decide not to collaborate with historians who know the subject, but to claim that a few ignorant claims about history negate the impact of these new insights is an exercise in pedantry.

In defending the humanities, we like to say that scientists and technologists with liberal arts backgrounds are more well-rounded, better citizens of the world, more able to contextualize their work. Non-humanists benefit from a liberal arts education in pretty much all the ways that are impossible to quantify (and thus, extremely difficult to defend against budget cuts). We argue this in the interest of rounding a person’s knowledge, to make them aware of their past, of their place in a society with staggering power imbalances and systemic biases.

Humanities departments should take a page from their own books. Sure, a few general ed requirements force some basic science and math… but I got an undergraduate history degree in a nice university, and I’m well aware how little STEM I actually needed to get through it. Our departments are just as guilty of narrowness as those of our STEM colleagues, and often because of it, we rely on applied mathematicians, statistical physicists, chemists, or computer scientists to do our innovative work for (or sometimes, thankfully, with) us.

Of course, there’s still lots of innovative work to be done from a textual perspective. I’m not downplaying that. Not everyone needs to use crazy physics/chemistry/computer science/etc. methods. But there’s a lot of low hanging fruit at the intersection of historiography and the natural sciences, and we’re not doing a great job of plucking it.

The story below is illustrative.


Last night, Blaise Agüera y Arcas presented his research on Gutenberg to a packed house at our rare books library. He’s responsible for a lot of the cool things that have come out of Microsoft in the last few years, and just got a job at Google, where presumably he will continue to make cool things. Blaise has degrees in physics and applied mathematics. And, a decade ago, Blaise and historian/librarian Paul Needham sent ripples through the History of the Book community by showing that Gutenberg’s press did not work at all the way people expected.

It was generally assumed that Gutenberg employed a method called punchcutting in order to create a standard font. A letter carved into a metal rod (a “punch”) would be driven into a softer metal (a “matrix”) in order to create a mold. The mold would be filled with liquid metal which hardened to form a small block of a single letter (a “type”), which would then be loaded onto the press next to other letters, inked, and then impressed onto a page. Because the mold was metal, many duplicate “types” could be made of the same letter, thus allowing many uses of the same letter to appear identical on a single pressed page.

Punch matrix system. [via]
Punch matrix system. [via]
Type to be pressed. [via]
Type to be pressed. [via]
This process is what allowed all the duplicate letters to appear identical in Gutenberg’s published books. Except, of course, careful historians of early print noticed that letters weren’t, in fact, identical. In the 1980s, Paul Needham and a colleague attempted to produce an inventory of all the different versions of letters Gutenberg used, but they stopped after frequently finding 10 or more obviously distinct versions of the same letter.

Needham's inventory of Gutenberg type. [via]
Needham’s inventory of Gutenberg type. [via]
This was perplexing, but the subject was bracketed away for a while, until Blaise Agüera y Arcas came to Princeton and decided to work with Needham on the problem. Using extremely high-resolution imagining techniques, Blaise noted that there were in fact hundreds of versions of every letter. Not only that, there were actually variations and regularities in the smaller elements that made up letters. For example, an “n” was formed by two adjacent vertical lines, but occasionally the two vertical lines seem to have flipped places entirely. The extremely basic letter “i” itself had many variations, but within those variations, many odd self-similarities.

Variations in the letter "i" in Gutenberg's type. [via]
Variations in the letter “i” in Gutenberg’s type. [via]
Historians had, until this analysis, assumed most letter variations were due to wear of the type blocks. This analysis blew that hypothesis out of the water. These “i”s were clearly not all made in the same mold; but then, how had they been made? To answer this, they looked even closer at the individual letters.


Close up of Gutenberg letters, with light shining through page. [via]
Close up of Gutenberg letters, with light shining through page. [via]
It’s difficult to see at first glance, but they found something a bit surprising. The letters appeared to be formed of overlapping smaller parts: a vertical line, a diagonal box, and so forth. The below figure shows a good example of this. The glyphs on the bottom have have a stem dipping below the bottom horizontal line, while the glyphs at the top do not.

Abbreviation of 'per'. [via]
Abbreviation of ‘per’. [via]
The conclusion Needham and Agüera y Arcas drew, eventually, was that the punchcutting method must not have been used for Gutenberg’s early material. Instead, a set of carved “strokes” were pushed into hard sand or soft clay, configured such that the strokes would align to form various letters, not unlike the formation of cuneiform. This mold would then be used to cast letters, creating the blocks we recognize from movable type. The catch is that this soft clay could only cast letters a few times before it became unusable and would need to be recreated. As Gutenberg needed multiple instances of individual letters per page, many of those letters would be cast from slightly different soft molds.

Low-Hanging Fruit

At the end of his talk, Blaise made an offhand comment: how is it that historians/bibliographers/librarians have been looking at these Gutenbergs for so long, discussing the triumph of their identical characters, and not noticed that the characters are anything but uniform? Or, of those who had noticed it, why hadn’t they raised any red flags?

The insights they produced weren’t staggering feats of technology. He used a nice camera, a light shining through the pages of an old manuscript, and a few simple image recognition and clustering algorithms. The clustering part could even have been done by hand, and actually had been, by Paul Needham. And yes, it’s true, everything is obvious in hindsight, but there were a lot of eyes on these bibles, and odds are if some of them had been historians who were trained in these techniques, this insight could have come sooner. Every year students do final projects and theses and dissertations, but what percent of those use techniques from outside historiography?

In short, there’s a lot of very basic assumptions we make about the past that could probably be updated significantly if we had the right skillset, or knew how to collaborate with those who did. I think people like William Newman, who performs Newton’s alchemical experiments, is on the right track. As is Shawn Graham, who reanimates the trade networks of ancient Rome using agent-based simulations, or Devon Elliott, who creates computational and physical models of objects from the history of stage magic. Elliott’s models have shown that certain magic tricks couldn’t possibly have worked as they were described to.

The challenge is how to encourage this willingness to reach outside traditional historiographic methods to learn about the past. Changing curricula to be more flexible is one way, but that is a slow and institutionally difficult process. Perhaps faculty could assign group projects to students taking their gen-ed history courses, encouraging disciplinary mixes and non-traditional methods. It’s an open question, and not an easy one, but it’s one we need to tackle.

Bridging Token and Type

There’s an oft-spoken and somewhat strawman tale of how the digital humanities is bridging C.P. Snow’s “Two Culture” divide, between the sciences and the humanities. This story is sometimes true (it’s fun putting together Ocean’s Eleven-esque teams comprising every discipline needed to get the job done) and sometimes false (plenty of people on either side still view the other with skepticism), but as a historian of science, I don’t find the divide all that interesting. As Snow’s title suggests, this divide is first and foremost cultural. There’s another overlapping divide, a bit more epistemological, methodological, and ontological, which I’ll explore here. It’s the nomothetic(type)/idiographic(token) divide, and I’ll argue here that not only are its barriers falling, but also that the distinction itself is becoming less relevant.

Nomothetic (Greek for “establishing general laws”-ish) and Idiographic (Greek for “pertaining to the individual thing”-ish) approaches to knowledge have often split the sciences and the humanities. I’ll offload the hard work onto Wikipedia:

Nomothetic is based on what Kant described as a tendency to generalize, and is typical for the natural sciences. It describes the effort to derive laws that explain objective phenomena in general.

Idiographic is based on what Kant described as a tendency to specify, and is typical for the humanities. It describes the effort to understand the meaning of contingent, unique, and often subjective phenomena.

These words are long and annoying to keep retyping, and so in the longstanding humanistic tradition of using new words for words which already exist, henceforth I shall refer to nomothetic as type and idiographic as token. 1 I use these because a lot of my digital humanities readers will be familiar with their use in text mining. If you counted the number of unique words in a text, you’d be be counting the number of types. If you counted the number of total words in a text, you’d be counting the number of tokens, because each token (word) is an individual instance of a type. You can think of a type as the platonic ideal of the word (notice the word typical?), floating out there in the ether, and every time it’s actually used, it’s one specific token of that general type.

The Token/Type Distinction
The Token/Type Distinction

Usually the natural and social sciences look for general principles or causal laws, of which the phenomena they observe are specific instances. A social scientist might note that every time a student buys a $500 textbook, they actively seek a publisher to punch, but when they purchase $20 textbooks, no such punching occurs. This leads to the discovery of a new law linking student violence with textbook prices. It’s worth noting that these laws can and often are nuanced and carefully crafted, with an awareness that they are neither wholly deterministic nor ironclad.

The humanities (or at least history, which I’m more familiar with) are more interested in what happened than in what tends to happen. Without a doubt there are general theories involved, just as in the social sciences there are specific instances, but the intent is most-often to flesh out details and create a particular internally consistent narrative. They look for tokens where the social scientists look for types. Another way to look at it is that the humanist wants to know what makes a thing unique, and the social scientist wants to know what makes a thing comparable.

It’s been noted these are fundamentally different goals. Indeed, how can you in the same research articulate the subjective contingency of an event while simultaneously using it to formulate some general law, applicable in all such cases? Rather than answer that question, it’s worth taking time to survey some recent research.

A recent digital humanities panel at MLA elicited responses by Ted Underwood and Haun Saussy, of which this post is in part itself a response. One of the papers at the panel, by Long and So, explored the extent to which haiku-esque poetry preceded what is commonly considered the beginning of haiku in America by about 20 years. They do this by teaching the computer the form of the haiku, and having it algorithmically explore earlier poetry looking for similarities. Saussy comments on this work:

[…] macroanalysis leads us to reconceive one of our founding distinctions, that between the individual work and the generality to which it belongs, the nation, context, period or movement. We differentiate ourselves from our social-science colleagues in that we are primarily interested in individual cases, not general trends. But given enough data, the individual appears as a correlation among multiple generalities.

One of the significant difficulties faced by digital humanists, and a driving force behind critics like Johanna Drucker, is the fundamental opposition between the traditional humanistic value of stressing subjectivity, uniqueness, and contingency, and the formal computational necessity of filling a database with hard decisions. A database, after all, requires you to make a series of binary choices in well-defined categories: is it or isn’t it an example of haiku? Is the author a man or a woman? Is there an author or isn’t there an author?

Underwood addresses this difficulty in his response:

Though we aspire to subtlety, in practice it’s hard to move from individual instances to groups without constructing something like the sovereign in the frontispiece for Hobbes’ Leviathan – a homogenous collection of instances composing a giant body with clear edges.

But he goes on to suggest that the initial constraint of the digital media may not be as difficult to overcome as it appears. Computers may even offer us a way to move beyond the categories we humanists use, like genre or period.

Aren’t computers all about “binary logic”? If I tell my computer that this poem both is and is not a haiku, won’t it probably start to sputter and emit smoke?

Well, maybe not. And actually I think this is a point that should be obvious but just happens to fall in a cultural blind spot right now. The whole point of quantification is to get beyond binary categories — to grapple with questions of degree that aren’t well-represented as yes-or-no questions. Classification algorithms, for instance, are actually very good at shades of gray; they can express predictions as degrees of probability and assign the same text different degrees of membership in as many overlapping categories as you like.

Here we begin to see how the questions asked of digital humanists (on the one side; computational social scientists are tackling these same problems) are forcing us to reconsider the divide between the general and the specific, as well as the meanings of categories and typologies we have traditionally taken for granted. However, this does not yet cut across the token/type divide: this has gotten us to the macro scale, but it does not address general principles or laws that might govern specific instances. Historical laws are a murky subject, prone to inducing fits of anti-deterministic rage. Complex Systems Science and the lessons we learn from Agent-Based Modeling, I think, offer us a way past that dilemma, but more on that later.

For now, let’s talk about influence. Or diffusion. Or intertextuality. 2 Matthew Jockers has been exploring these concepts, most recently in his book Macroanalysis. The undercurrent of his research (I think I’ve heard him call it his “dangerous idea”) is a thread of almost-determinism. It is the simple idea that an author’s environment influences her writing in profound and easy to measure ways. On its surface it seems fairly innocuous, but it’s tied into a decades-long argument about the role of choice, subjectivity, creativity, contingency, and determinism. One word that people have used to get around the debate is affordances, and it’s as good a word as any to invoke here. What Jockers has found is a set of environmental conditions which afford certain writing styles and subject matters to an author. It’s not that authors are predetermined to write certain things at certain times, but that a series of factors combine to make the conditions ripe for certain writing styles, genres, etc., and not for others. The history of science analog would be the idea that, had Einstein never existed, relativity and quantum physics would still have come about; perhaps not as quickly, and perhaps not from the same person or in the same form, but they were ideas whose time had come. The environment was primed for their eventual existence. 3

An example of shape affording certain actions by constraining possibilities and influencing people. [via]
An example of shape affording certain actions by constraining possibilities and influencing people. [via]
It is here we see the digital humanities battling with the token/type distinction, and finding that distinction less relevant to its self-identification. It is no longer a question of whether one can impose or generalize laws on specific instances, because the axes of interest have changed. More and more, especially under the influence of new macroanalytic methodologies, we find that the specific and the general contextualize and augment each other.

The computational social sciences are converging on a similar shift. Jon Kleinberg likes to compare some old work by Stanley Milgram 4, where he had people draw maps of cities from memory, with digital city reconstruction projects which attempt to bridge the subjective and objective experiences of cities. The result in both cases is an attempt at something new: not quite objective, not quite subjective, and not quite intersubjective. It is a representation of collective individual experiences which in its whole has meaning, but also can be used to contextualize the specific. That these types of observations can often lead to shockingly accurate predictive “laws” isn’t really the point; they’re accidental results of an attempt to understand unique and contingent experiences at a grand scale. 5

Manhattan. Dots represent where people have taken pictures; blue dots are by locals, red by tourists, and yellow unsure. [via Eric Fischer]
Manhattan. Dots represent where people have taken pictures; blue dots are by locals, red by tourists, and yellow are uncertain. [via Eric Fischer]
It is no surprise that the token/type divide is woven into the subjective/objective divide. However, as Daston and Galison have pointed out, objectivity is not an ahistorical category. 6 It has a history, is only positively defined in relation to subjectivity, and neither were particularly useful concepts before the 19th century.

I would argue, as well, that the nomothetic and idiographic divide is one which is outliving its historical usefulness. Work from both the digital humanities and the computational social sciences is converging to a point where the objective and the subjective can peaceably coexist, where contingent experiences can be placed alongside general predictive principles without any cognitive dissonance, under a framework that allows both deterministic and creative elements. It is not that purely nomothetic or purely idiographic research will no longer exist, but that they no longer represent a binary category which can usefully differentiate research agendas. We still have Snow’s primary cultural distinctions, of course, and a bevy of disciplinary differences, but it will be interesting to see where this shift in axes takes us.


  1. I am not the first to do this. Aviezer Tucker (2012) has a great chapter in The Oxford Handbook of Philosophy of Social Science, “Sciences of Historical Tokens and Theoretical Types: History and the Social Sciences” which introduces and historicizes the vocabulary nicely.
  2. Underwood’s post raises these points, as well.
  3. This has sometimes been referred to as environmental possibilism.
  4. Milgram, Stanley. 1976. “Pyschological Maps of Paris.” In Environmental Psychology: People and Their Physical Settings, edited by Proshansky, Ittelson, and Rivlin, 104–124. New York.

    ———. 1982. “Cities as Social Representations.” In Social Representations, edited by R. Farr and S. Moscovici, 289–309.

  5. If you’re interested in more thoughts on this subject specifically, I wrote a bit about it in relation to single-authorship in the humanities here
  6. Daston, Lorraine, and Peter Galison. 2007. Objectivity. New York, NY: Zone Books.

Submissions to Digital Humanities 2014

Submissions for the 2014 Digital Humanities conference just closed. It’ll be in Switzerland this time around, which unfortunately means I won’t be able make it, but I’ll be eagerly following along from afar. Like last year, reviewers are allowed to preview the submitted abstracts. Also like last year, I’m going to be a reviewer, which means I’ll have the opportunity to revisit the submissions to DH2013 to see how the submissions differed this time around. No doubt when the reviews are in and the accepted articles are revealed, I’ll also revisit my analysis of DH conference acceptances.

To start with, the conference organizers received a record number of submissions this year: 589. Last year’s Nebraska conference only received 348 submissions. The general scope of the submissions haven’t changed much; authors were still supposed to tag their submissions using a controlled vocabulary of 95 topics, and were also allowed to submit keywords of their own making. Like last year, authors could submit long papers, short papers, panels, or posters, but unlike last year, multilingual submissions were encouraged (English, French, German, Italian, or Spanish). [edit: Bethany Nowviskie, patient awesome person that she is, has noticed yet another mistake I’ve made in this series of posts. Apparently last year they also welcomed multilingual submissions, and it is standard practice.]

Digital Humanities is known for its collaborative nature, and not much has changed in that respect between 2013 and 2014 (Figure 1). Submissions had, on average, between two and three authors, with 60% of submissions in both years having at least two authors. This year, a few fewer papers have single authors, and a few more have two authors, but the difference is too small to be attributable to anything but noise.

Figure 1. Number of authors per paper.
Figure 1. Number of authors per paper.

The distribution of topics being written about has changed mildly, though rarely in extreme ways. Any changes visible should also be taken with a grain of salt, because a trend over a single year is hardly statistically robust to small changes, say, in the location of the event.

The grey bars in Figure 2 show what percentage of DH2014 submissions are tagged with a certain topic, and the red dotted outlines show what the percentages were in 2013. The upward trends to note this year are text analysis, historical studies, cultural studies, semantic analysis, and corpora and corpus activities. Text analysis was tagged to 15% of submissions in 2013 and is now tagged to 20% of submissions, or one out of every five. Corpus analysis similarly bumped from 9% to 13%. Clearly this is an important pillar of modern DH.

Figure 2. Topics from DH2014 ordered by the percent of submissions which fall in that category. The dotted lines represent the percentage from DH2013.
Figure 2. Topics from DH2014 ordered by the percent of submissions which fall in that category. The red dotted outlines represent the percentage from DH2013.

I’ve pointed out before that History is secondary compared to Literary Studies in DH (although Ted Underwood has convincingly argued, using Ben Schmidt’s data, that the numbers may merely be due to fewer people studying history). This year, however, historical studies nearly doubled in presence, from 10% to 17%. I haven’t yet collected enough years of DH conference data to see if this is a trend in the discipline at large, or more of a difference between European and North American DH. Semantic analysis jumped from 1% to 7% of the submissions, cultural studies went from 10% to 14%, and literary studies stayed roughly equivalent. Visualization, one of the hottest topics of DH2013, has become even hotter in 2014 (14% to 16%).

The most visible drops in coverage came in pedagogy, scholarly editions, user interfaces, and research involving social media and the web. At DH2013, submissions on pedagogy had a surprisingly low acceptance rate, which combined the drop in pedagogy submissions this year (11% to 8% in “Digital Humanities – Pedagogy and Curriculum” and 7% to 4% in “Teaching and Pedagogy”) might suggest a general decline in interest in the DH world in pedagogy. “Scholarly Editing” went from 11% to 7% of the submissions, and “Interface and User Experience Design” from 13% to 8%, which is yet more evidence for the lack of research going into the creation of scholarly editions compared to several years ago. The most surprising drops for me were those in “Internet / World Wide Web” (12% to 8%) and “Social Media” (8.5% to 5%), which I would have guessed would be growing rather than shrinking.

The last thing I’ll cover in this post is the author-chosen keywords. While authors needed to tag their submissions from a list of 95 controlled vocabulary words, they were also encouraged to tag their entries with keywords they could choose themselves. In all they chose nearly 1,700 keywords to describe their 589 submissions. In last year’s analysis of these keywords, I showed that visualization seemed to be the glue that held the DH world together; whether discussing TEI, history, network analysis, or archiving, all the disparate communities seemed to share visualization as a primary method. The 2014 keyword map (Figure 3) reveals the same trend: visualization is squarely in the middle. In this graph, two keywords are linked if they appear together on the same submission, thus creating a network of keywords as they co-occur with one another. Words appear bigger when they span communities.

Figure 3. Co-occurrence of DH2014 author-submitted keywords.
Figure 3. Co-occurrence of DH2014 author-submitted keywords.

Despite the multilingual conference, the large component of the graph is still English. We can see some fairly predictable patterns: TEI is coupled quite closely with XML; collaboration is another keyword that binds the community together, as is (obviously) “Digital Humanities.” Linguistic and literature are tightly coupled, much moreso than, say, linguistic and history. It appears the distant reading of poetry is becoming popular, which I’d guess is a relatively new phenomena, although I haven’t gone back and checked.

This work has been supported by an ACH microgrant to analyze DH conferences and the trends of DH through them, so keep an eye out for more of these posts forthcoming that look through the last 15 years. Though I usually share all my data, I’ll be keeping these to myself, as the submitters to the conference did so under an expectation of privacy if their proposals were not accepted.

[edit: there was some interest on twitter last night for a raw frequency of keywords. Because keywords are author-chosen and I’m trying to maintain some privacy on the data, I’m only going to list those keywords used at least twice. Here you go (Figure 4)!]

Figure 4. Keywords used in DH2014 submissions ordered by frequency.
Figure 4. Keywords used in DH2014 submissions ordered by frequency.

Breaking the Ph.D. model using pretty pictures

Earlier today, Heather Froehlich shared what’s at this point become a canonical illustration among Ph.D. students: “The Illustrated guide to a Ph.D.” The illustrator, Matt Might, describes the sum of human knowledge as a circle. As a child, you sit at the center of the circle, looking out in all directions.

PhDKnowledge.002[1]Eventually, he describes, you get various layers of education, until by the end of your bachelor’s degree you’ve begun focusing on a specialty, focusing knowledge in one direction.

PhDKnowledge.004[1]A master’s degree further deepens your focus, extending you toward an edge, and the process of pursuing a Ph.D., with all the requisite reading, brings you to a tiny portion of the boundary of human knowledge.



You push and push at the boundary until one day you finally poke through, pushing that tiny portion of the circle of knowledge just a wee bit further than it was. That act of pushing through is a Ph.D.



It’s an uplifting way of looking at the Ph.D. process, inspiring that dual feeling of insignificance and importance that staring at the Hubble Ultra-Deep Field tends to bring about. It also exemplifies, in my mind, one of the broken aspects of the modern Ph.D. But while we’re on the subject of the Hubble Ultra-Deep Field, let me digress momentarily about stars.

1024px-Hubble_ultra_deep_field_high_rez_edit1[1]Quite a while before you or I were born, Great Thinkers with Big Beards (I hear even the Great Women had them back then) also suggested we sat at the center of a giant circle, looking outwards. The entire universe, or in those days, the cosmos (Greek: κόσμος, “order”), was a series of perfect layered spheres, with us in the middle, and the stars embedded in the very top. The stars were either gems fixed to the last sphere, or they were little holes poked through it that let the light from heaven shine through.


As I see it, if we connect the celestial spheres theory to “The Illustrated Guide to a Ph.D.”, we’d arrive at the inescapable conclusion that every star in the sky is another dissertation, another hole poked letting the light of heaven shine through. And yeah, it takes a very prescriptive view of the knowledge and the universe that either you or I can argue with, but for this post we can let it slide because it’s beautiful, isn’t it? If you’re a Ph.D. student, don’t you want to be able to do this?

Flammarion[1]The problem is I don’t actually want to do this, and I imagine a lot of other people don’t want to do this, because there are already so many goddamn stars. Stars are nice. They’re pretty, how they twinkle up there in space, trillions of miles away from one another. That’s how being a Ph.D. student feels sometimes, too: there’s your research, my research, and a gap between us that can reach from Alpha Centauri and back again. Really, just astronomically far away.


It shouldn’t have to be this way. Right now a Ph.D. is about finding or doing something that’s new, in a really deep and narrow way. It’s about pricking the fabric of the spheres to make a new star. In the end, you’ll know more about less than anyone else in the world. But there’s something deeply unsettling about students being trained to ignore the forest for the trees. In an increasingly connected world, the universe of knowledge about it seems to be ever-fracturing. Very few are being trained to stand back a bit and try to find patterns in the stars. To draw constellations.

orion-the-hunter[1]I should know. I’ve been trying to write a dissertation on something huge, and the advice I’ve gotten from almost every professor I’ve encountered is that I’ve got to scale it down. Focus more. I can’t come up with something new about everything, so I’ve got to do it about one thing, and do it well. And that’s good advice, I know! If a lot of people weren’t doing that a lot of the time, we’d all just be running around in circles and not doing cool things like going to the moon or watching animated pictures of cats on the internet.

But we also need to stand back and take stock, to connect things, and right now there are institutional barriers in place making that really difficult. My advisor, who stands back and connects things for a living (like the map of science below), gives me the same prudent advice as everyone else: focus more. It’s practical advice. For all that universities celebrate interdisciplinarity, in the end you still need to get hired by a department, and if you don’t fit neatly into their disciplinary niche, you’re not likely to make it.
430561725_4eb7bc5d8a_o1[1]My request is simple. If you’re responsible for hiring researchers, or promoting them, or in charge of a department or (!) a university, make it easier to be interdisciplinary. Continue hiring people who make new stars, but also welcome the sort of people who want to connect them. There certainly are a lot of stars out there, and it’s getting harder and harder to see what they have in common, and to connect them to what we do every day. New things are great, but connecting old things in new ways is also great. Sometimes we need to think wider, not deeper.


From Trees to Webs: Uprooting Knowledge through Visualization

[update: here are some of the pretty pictures I will be showing off in The Hague]

The blog’s been quiet lately; my attention has been occupied by various journal submissions and a new book in the works, but I figured my readers would be interested in one of those forthcoming publications. This is an article [preprint] I’m presenting at the Universal Decimal Classification Seminar in The Hague this October, on the history of how we’ve illustrated the interconnections of knowledge and scholarly domains. It’s basically two stories: one of how we shifted from understanding the world hierarchically to understanding it as a flat web of interconnected parts, and the other of how the thing itself and knowledge of that thing became separated.

Porphyrian Tree: tree of Aristotle's categories from the 6th century. [via]
Porphyrian Tree: tree of Aristotle’s categories originally dating from the 6th century. [via some random website about trees]
A few caveats worth noting: first, because I didn’t want to deal with the copyright issues, there are no actual illustrations in the paper. For the presentation, I’m going to compile a powerpoint with all the necessary attributions and post it alongside this paper so you can all see the relevant pretty pictures. For your viewing pleasure, though, I’ve included some of the illustrations in this blog post.

An interpretation of the classification of knowledge from Hobbes' Leviathan. [via e-ducation]
An interpretation of the classification of knowledge from Hobbes’ Leviathan. [via e-ducation]
Second, because the this is a presentation directed at information scientists, the paper is organized linearly and with a sense of inevitability; or, as my fellow historians would say, it’s very whiggish. I regret not having the space to explore the nuances of the historical narrative, but it would distract from the point and context of this presentation. I plan on writing a more thorough article to submit to a history journal at a later date, hopefully fitting more squarely in the historiographic rhetorical tradition.

H.G. Wells' idea of how students should be taught. [via H.G. Wells, 1938. World Brain. Doubleday, Doran & Co., Inc]
H.G. Wells’ idea of how students should be taught. [via H.G. Wells, 1938. World Brain. Doubleday, Doran & Co., Inc]
In the meantime, if you’re interested in reading the pre-print draft, here it is! All comments are welcome, as like I said, I’d like to make this into a fuller scholarly article beyond the published conference proceedings. I was excited to put this up now, but I’ll probably have a new version with full citation information within the week, if you’re looking to enter this into Zotero/Mendeley/etc. Also, hey! I think this is the first post on the Irregular that has absolutely nothing to do with data analysis.

Recent map of science by Kevin Boyack, Dick Klavans, W. Bradford Paley, and Katy Börner. [via SEED magazine]
Recent map of science by Kevin Boyack, Dick Klavans, W. Bradford Paley, and Katy Börner. [via SEED magazine]

An experiment in communal editing: Finding the history & philosophy of science.

After my last post about co-citation analysis, the author of one of the papers I was responding to, K. Brad Wray, generously commented and suggested I write up and publish the results and send them off to Erkenntnis, which is the same journal he published his results. That sounded like a great idea, so I am.

Because so many good ideas have come from comments on this blog, I’d like to try opening my first draft to communal commenting. For those who aren’t familiar with google docs (anyone? Bueller?), you can comment by selecting test and either hitting ctrl-alt-m, or going to the insert-> menu and clicking ‘Comment’.

The paper is about the relationship between history of science and philosophy of science, and draws both from the blog post and from this page with additional visualizations. There is also an appendix (pdf, sorry) with details of data collection and some more interesting results for the HPS buffs. If you like history of science, philosophy of science, or citation analysis, I’d love to see your comments! If you have any general comments that don’t refer to a specific part of the text, just post them in the blog comments below.

This is a bit longer form than the usual blog, so who knows if it will inspire much interaction, but it’s worth a shot. Anyone who is signed in so I can see their name will get credit in the acknowledgements.

Finding the History and Philosophy of Science (earlier draft)  ← draft 1, thanks for your comments.

Finding the History and Philosophy of Science (current draft) ← comment here!


Bridging the gap

Traditional disciplinary silos have always been useful fictions. They help us organize our research centers, our journals, our academies, and our lives. However much simplicity we gain from quickly and easily being able to place research X into box Y, however, is offset by the requirement of fitting research X into one and only one box Y. What we gain in simplicity, we lose in flexibility.

The academy is facing convergence on two fronts.

A turn toward computation, complicated methodologies, and more nuanced approaches to research is erecting increasingly complex barriers to entry on basic scholarship. Where once disparate disciplines had nothing in common besides membership in the academy, now they are connected by a joint need for computer infrastructure, algorithm expertise, and methodological training. I recently commiserated with a high energy physicist and a geneticist on the difficulties of parallelizing certain data analysis algorithms. Somehow, in the space of minutes, we three very unrelated researchers reached common ground.

An increasing reliance on consilience provides the other converging factor. A steady but relentless rise in interest in interdisciplinarity has manifested itself in scholarly writings through increasingly wide citation patterns. That is, scholars are drawing from sources further from their own, and with growing frequency. 1 Much of this may be attributed to the rise of computer-aided document searches. Whatever the reasons, scholars are drawing from a much wider variety of research, and this in turn often brings more variety to their research.

Google Ngrams shows us how much people like to say "interdisciplinarity."
Measuring the interdisciplinarity of papers over time. From Guo, Hanning, Scott B. Weingart, and Katy Börner. 2011. “Mixed-indicators model for identifying emerging research areas.” Scientometrics 89 (June 21): 421-435.

Methodological and infrastructural convergence, combined with subject consilience, is dislodging scholarship from its traditional disciplinary silos. Perhaps, in an age when one-item-one-box taxonomies are rapidly being replaced by more flexible categorization schemes and machine-assisted self-organizations, these disciplinary distinctions are no longer as useful as they once were.

Unfortunately, the boom of interdisciplinary centers and institutes in the 70’s and 80’s left many graduates untenurable. By focusing on problems out the scope of any one traditional discipline, graduates from these programs often found themselves outside the scope of any particular group that might hire them. A university system that has existed in some recognizable form for the last thousand years cannot help but pick up inertia, and that indeed is what has happened here. While a flexible approach to disciplinarity might be better if starting all over again, the truth is we have to work with what we have, and a total overhaul is unlikely.

The question is this: what are the smallest and easiest possible changes we can make, at the local level, to improve the environment for increasingly convergent research in the long term? Is there a minimal amount of work one can do such that the returns are sufficiently large to support flexibility? One inspiring step is Bethany Nowviskie‘s (and many others’) #alt-ac project and the movement surrounding it, which pushes for alternative or unconventional academic careers.

Alternative Academic Careers

The #alt-ac movement seems to be picking up the most momentum with those straddling the tech/humanities divide, however it is equally important for those crossing all traditional academic divides. This includes divides between traditionally diverse disciplines (e.g., literature and social science), between methods (e.g., unobtrusive measures and surveys), between methodologies (e.g., quantitative and qualitative), or in general between C.P. Snow’s “Two Cultures” of science and the humanities.

These divides are often useful and, given that they are reinforced by tradition, it’s usually not worth the effort to attempt to move beyond them. The majority of scholarly work still fits reasonably well within some pre-existing community. For those working across these largely constructed divides, however, an infrastructure needs to exist to support their research. National and private funding agencies have answered this call admirably, however significant challenges still exist at the career level.

C.P. Snow bridging the "Two Cultures." Image from Scientific American.

Novel and surprising research often comes from connecting previously unrelated silos. For any combination of communities, if there exists interesting research which could be performed at their intersection, it stands to reason that those which have been most difficult to connect would be the most fruitful if combined. These combinations would likely be the ones with the most low-hanging fruit.

The walls between traditional scholarly communities are fading. In order for the academy to remain agile and flexible, it must facilitate and adapt to the changing scholarly landscape. “The academy,” however, is not some unified entity which can suddenly change directions at the whim of a few; it is all of us. What can we do to affect the desired change? On the scholarly communication front, scholars are adapting  by signing pledges to limit publications and reviews to open access venues. We can talk about increasing interdisciplinarity, but what does interdisciplinarity mean when disciplines themselves are so amorphous?

Have any great ideas on what we can do to improve things? Want to tell me how starry-eyed and ignorant I am, and how unnecessary these changes would be? All comments welcome!

[Note: Surprise! I have a conflict of interest. I’m “interdisciplinary” and eventually want to find a job. Help?]


  1. Increasingly interdisciplinary citation patterns is a trend I noticed when working on a paper I recently co-authored in Scientometrics. Over the last 30 years, publications in the Proceedings of the National Academy of Sciences have shown a small but statistically significant trend in the interdisciplinarity of citations. Whereas a paper 30 years ago may have cited sources from one or a small set of closely related journals, papers now are somewhat more likely to cite a larger number of journals in increasingly disparate fields of study. This does take into account the average number of references per paper. A similar but more pronounced trend was shown in the journal Scientometrics. While this is by no means a perfect indicator for the rise of interdisciplinarity, a combination of this study and anecdotal evidence leads me to believe it is the case.