The Index focuses on the flagship ADHO conference, but encompasses other events as well. As of this release, it indexes 489 conferences from 39 countries dating back to 1960. We entered individual work metadata for 59 of those conferences, including titles, keywords, authorships, institutional affiliations, and so on. In all, there are 7,082 conference presentations by 8,392 authors, hailing from 1,830 institutions and 86 countries.
The where/when conference data was crowdsourced through a google sheet and twitter. Conference presentation metadata comes from a variety of public sources. Recent ADHO conferences generally come from XML files on ADHO’s GitHub page, and earlier ADHO / ACH / ALLC conference metadata was entered by hand from old websites, PDFs, listservs, and printed conference programs contributed by Joe Rudman (ACH Treasurer 1985-1989). Non-ADHO conference data was entered by hand from public programs (usually PDFs).
To ease browsing and analysis, we cleaned and merged as much as we could. That means connecting “Cambridge University” with “University of Cambridge” and the occasional “Camridge University”, as well as “Scott B. Weingart” with “Scott Weingart”.
When possible, we also indexed the full texts of conference abstracts / program entries. Currently those are only being used to power the search engine and generally not visible on a conference work page. We hope to be able to display full text in the coming months, as we get permissions to republish them from the original rights holders.
A crucial obstacle to the writing of histories of the field is that much of DH’s archival evidence has either not been preserved or is held by individuals (and so remains ‘hidden’ unless one can discover its existence and secure approval and the means to access it).
Nyhan & Flynn, 2016
This holds as true for DH conferences as for the sources Nyhan & Flinn were working with. There’s no single public archive for physical conference programs, most old conference websites no longer exist (often even absent from the Internet Archive), and even ADHO’s digital records are spread out across many sources or locked in byzantine and private ConfTool data dumps.
Ours isn’t the first attempt to put a DH conference database together, but it is the most extensive. It represents eight years of work by many collaborators and contributors, and builds off those earlier attempts. And it’s all open for anyone to use: anyone can download the data to browse or analyze.
In making this corner of our community’s history more accessible, we hope this helps fill the archival gap pointed out by Nyhan and Flinn.
This was always a passion project: never funded, nor supported by any scholarly organization. We work on it in empty moments, usually months apart. We don’t have resources for additional quality control, nor time to implement all the features we want on the website.
Errors are rampant. We made mistakes entering data, we made mistakes cleaning and merging data, and we inherited mistakes made by conference program editors or even, occasionally, authors themselves. Some common issues include merging two entries that should be separate (John Smith of Florida vs. John Smith of Maryland, not the same person), or not merging two entries who should be connected (J.B. Smith vs. John Smith, the same person).
If you spot an error, please reach out to me, and I’ll do my best to fix it in a timely fashion.
For a dataset that spent its first six years as a hilariously complicated excel spreadsheet shared on Dropbox, the web interface is amazing. It solves so many problems. But there’s still a lot missing, because we simply don’t have time to build it. Faceting works strangely, we’re missing a lot of useful search interfaces, and we just don’t have the infrastructure needed to turn this into a crowdsourcing project.
Errors in data cleaning and merging can be problematic, but generally not show-stopping. Unfortunately data in the Index does have a few show-stopping issues, but rather than keep everything private until we can fix them all, we’re releasing this publicly in the hope the community can help.
When merging people, unless we’re absolutely certain two people with different names are the same, we won’t connect them in the database. That means a Jane Smith who changes her name to Jane Doe after getting married will not be merged. The issue disproportionately affects women (who are more likely to change names during their careers), and as Jessica Otis and I point out, will put those women at a disadvantage in any eventual analysis.
The merging issue also affects people who for whatever reason (often gender/identity transition) decide to change their names. To complicate matters further, many who change their names do not want their birth name / deadname shared on the web, but since we don’t know that, their old names remain visible within our database.
Our data entry and cleaning gets worse the further we (the data collectors) get from our cultural comfort zones. This became a problem in the first and last name fields; figuring out what went where proved particularly difficult for us with respect to Hispanic and Indian authors. We reached out to friends for help, and used the separations entered by authors themselves when XML was available, but errors still abound.
Additionally, our department/institution database ontology (a strict hierarchy) fails for many Italian and French research units, which are often shared across many institutions. Italian and French affiliations are frankly in an abysmal state, and we’d appreciate input to help us untangle that mess.
Sir Not Appearing In This Film
The most apparent absence in the database and website is the full text of presentation abstracts (which are sometimes the entire conference paper).
We have full text for 75% of all works in the database (5,301 of them, to be exact), but because we either don’t know their copyright status or don’t have explicit permission to publish them, they are currently invisible. Searching in the works page will search through the full text of the abstracts, and present works with relevant hits, but visitors will have to find other sources to view the abstracts themselves. We’re working to secure permissions to share what we have, but it may take some time.
When I first started this project back in 2012, it was based on submissions to the DH conference, which I collected by scraping (without permission) the semi-private website for reviewers to select which submissions they were most qualified to review. I continued this for several years, comparing submitted abstracts to what made it to the final program, adding collaborators along the way. For privacy reasons, that data isn’t included here.
Such demographic data would also likely put as at odds with GDPR. It’s worth pointing out that ADHO is keeping its distance from this project because it is worried about potential GDPR ramifications even without demographic data, which is understandable. From several sources (e.g., 1, 2, 3), it seems this is an “archive in the public interest” and thus doesn’t violate GDPR, however I’m not a lawyer and I suppose anything can happen.
In the spirit of GDPR and the general “right to be forgotten”, we will happily take down any personal data for any reason. Just reach out to me if you’d like your materials to be removed.
As alluded to earlier, this was a group effort over eight years. Nickoal Eichmann-Kalwara has been my partner in crime (sometimes literally, but only small crimes) for most of it. We’ve put in countless hours guiding the project, entering and cleaning data, and strong-arming friends into becoming collaborators. Jeana Jorgensen contributed her inimitable expertise and guidance for many years. Matt Lincoln single-handedly turned our janky spreadsheets into an actual database and website.
There are 55 additional names on our credits page, and every one of them is worth mention.
Appendix: Conferences with Presentation Metadata
Although there are 489 conferences in the index, only 59 of them currently include presentation metadata. They are presented below, organized by conference series when applicable. Some series overlap (e.g., ACH/ALLC overlaps one year with ADHO), so the numbers will add up to higher than 59.
Conferences that aren’t part of a series
1964 Conference on the Use of Computers in Humanistic Research (1)
1964 Literary Data Processing Conference (1)
1968 Computers and their potential applications in museums (1)
1969 IBM Symposium on Introducing the Computer into the Humanities (1)
1979 International Conference on Literary and Linguistic Computing (1)
2019 The Arts, Knowledge, and Critique in the Digital Age in India: Addressing Challenges in the Digital Humanities (1)
Conferences that are part of a series
ADHO (the “DH” conference) (15)
Caribbean Digital (1)
Digital Humanities Alliance of India (DHAI) (1)
Digital Humanities Forum (1)
Encuentro de Humanistas Digitales (EHD) (1)
ALLC IM/AGM (4)
Japanese Association for Digital Humanities Annual Conference (JADH) (1)
This first post covers the basic landscape of submissions to next year’s conference: how many submissions there are, what they’re about, and so forth.
The analysis is opinionated and sprinkled with my own preliminary interpretations. If you disagree with something or want to see more, comment below, and I’ll try to address it in the inevitable follow-up. If you want the data, too bad—since it’s only available to reviewers, there’s an expectation of privacy. If you are sad for political or other reasons and live near me, I will bring you chocolate; if you are sad and do not live near me, you should move to Pittsburgh. We have chocolate.
Submission Numbers & Types
I’ll be honest, I was surprised by this year’s submission numbers. This will be the first ADHO conference held in North America since it was held in Nebraska in 2013, and I expected an influx of submissions from people who haven’t been able to travel off the continent for interim events. I expected the biggest submission pool yet.
What we see, instead, are fewer submissions than Kraków last year: 608 in all. The low number of submissions to Sydney was expected, given it was the first conference held outside Europe or North America, but this year’s numbers suggests the DH Hype Machine might be cooling somewhat, after five years of rapid growth.
We need some more years and some more DH-Hype-Machine Indicators to be sure, but I reckon things are slowing down.
The conference offers five submission tracks: Long Paper, Short Paper, Poster, Panel, and (new this year) Virtual Short Paper. The distribution is pretty consistent with previous years, with the only deviation being in Sydney in 2015. Apparently Australians don’t like short papers or posters?
I’ll be interested to see how the “Virtual Short Paper” works out. Since authors need to decide on this format before submitting, it doesn’t allow the flexibility of seeing if funding will become available over the course of the year. Still, it’s a step in the right direction, and I hope it succeeds.
More of the same! If nothing else, we get points for consistency.
Same as it ever was, nearly half of all submissions are by a single author. I don’t know if that’s because humanists need to justify their presentations to hiring and tenure committees who only respect single authorship, or if we’re just used to working alone. A full 80% of submissions have three or fewer authors, suggesting large teams are still not the norm, or that we’re not crediting all of the labor that goes into DH projects with co-authorships. [Post-publication note: See Adam Crymble’s comment, below, for important context]
Language, Topic, & Discipline
Authors choose from several possible submission languages. This year, 557 submissions were received in English, 40 in French, 7 in Spanish, 3 in Italian, and 1 in German. That’s the easy part.
The Powers That Be decided to make my life harder by changing up the categories authors can choose from for 2017. Thanks, Diane, ADHO, or whoever decided this.
In previous years, authors chose any number of keywords from a controlled vocabulary of about 100 possible topics that applied to their submission. Among other purposes, it helped match authors with reviewers. The potential topic list was relatively static for many years, allowing me to analyze the change in interest in topics over time.
This year, they added, removed, and consolidated a bunch of topics, as well as divided the controlled vocabulary into “Topics” (like metadata, morphology, and machine translation) and “Disciplines” (like disability studies, archaeology, and law). This is ultimately good for the conference, but makes it difficult for me to compare this against earlier years, so I’m holding off on that until another post.
But I’m not bitter.
This year’s options are at the bottom of this post in the appendix. Words in red were added or modified this year, and the last list are topics that used to exist, but no longer do.
So let’s take a look at this year’s breakdown by discipline.
Huh. “Computer science”—a topic which last year did not exist—represents nearly a third of submissions. I’m not sure how much this topic actually means anything. My guess is the majority of people using it are simply signifying the “digital” part of their “Digital Humanities” project, since the topic “Programming”—which existed in previous years but not this year—used to only connect to ~6% of submissions.
“Literary studies” represents 30% of all submissions, more than any previous year (usually around 20%), whereas “historical studies” has stayed stable with previous years, at around 20% of submissions. These two groups, however, can be pretty variable year-to-year, and I’m beginning to suspect that their use by authors is not consistent enough to take as meaningful. More on that in a later post.
That said, DH is clearly driven by lit, history, and library/information science. L/IS is a new and welcome category this year; I’ve always suspected that DHers are as much from L/IS as the humanities, and this lends evidence in that direction. Importantly, it also makes apparent a dearth in our disciplinary genealogies: when we trace the history of DH, we talk about the history of humanities computing, the history of the humanities, the history of computing, but rarely the history of L/IS.
I’ll have a more detailed breakdown later, but there were some surprises in my first impressions. “Film and Media Studies” is way up compared to previous years, as are other non-textual disciplines, which refreshingly shows (I hope) the rise of non-textual sources in DH. Finally. Gender studies and other identity- or intersectional-oriented submissions also seem to be on the rise (this may be an indication of US academic interests; we’ll need another few years to be sure).
If we now look at Topic choices (rather than Discipline choices, above), we see similar trends.
Again, these are just first impressions, there’ll be more soon. Text is still the bread and butter of DH, but we see more non-textual methods being used than ever. Some of the old favorites of DH, like authorship attribution, are staying pretty steady against previous years, whereas others, like XML and encoding, seem to be decreasing in interest year after year.
One last note on Topics and Disciplines. There’s a list of discontinued topics at the bottom of the appendix. Most of them have simply been consolidated into other categories, however one set is conspicuously absent: meta-discussions of DH. There are no longer categories for DH’s history, theory, how it’s taught, or its institutional support. These were pretty popular categories in previous years, and I’m not certain why they no longer exist. Perusing the submissions, there are certainly several that fall into these categories.
For Part 2 of this analysis, look forward to more thoughts on the topical breakdown of conference submissions; preliminary geographic and gender analysis of authors; and comparisons with previous years. After that, who knows? I take requests in the comments, but anyone who requests “Free Bird” is banned for life.
Appendix: Controlled Vocabulary
Words in red were added or modified this year, and the last list are topics that used to exist, but no longer do.
agent modeling and simulation
archives, repositories, sustainability and preservation
audio, video, multimedia
authorship attribution / authority
bibliographic methods / textual studies
concording and indexing
copyright, licensing, and Open Access
corpora and corpus activities
cultural and/or institutional infrastructure
data mining / text mining
data modeling and architecture including hypothesis-driven modeling
Nickoal Eichmann (corresponding author), Jeana Jorgensen, Scott B. Weingart1
NOTE: This is a pre-peer reviewed draft submitted for publication in Feminist Debates in Digital Humanities, eds. Jacque Wernimont and Elizabeth Losh, University of Minnesota Press (2017). Comments are welcome, and a downloadable dataset / more figures are forthcoming. This chapter will be released alongside another on the history of DH conferences, co-authored by Weingart & Eichmann (forthcoming), which will go into further detail on technical aspects of this study, including the data collection & statistics. Many of the materials first appeared on this blog. To cite this preprint, use the figshare DOI: https://dx.doi.org/10.6084/m9.figshare.3120610.v1
Digital Humanities (DH) is said to have a light side and a dark side. Niceness, globality, openness, and inclusivity sit at one side of this binary caricature; commodification, neoliberalism, techno-utopianism, and white male privilege sit at the other. At times, the plurality of DH embodies both descriptions.
We hope a diverse and critical DH is a goal shared by all. While DH, like the humanities writ large, is not a monolith, steps may be taken to improve its public face and shared values through positively influencing its communities. The Alliance of Digital Humanities Organizations’ (ADHO’s) annual conference hosts perhaps the largest such community. As an umbrella organization of six international digital humanities constituent organizations, as well as 200 DH centers in a few dozen countries, ADHO and its conference ought to represent the geographic, disciplinary, and demographic diversity of those who identify as digital humanists.
The annual conference offers insight into how the world sees DH. While it may not represent the plurality of views held by self-described digital humanists, the conference likely influences the values of its constituents. If the conference glorifies Open Access, that value will be taken up by its regular attendees; if the conference fails to prioritize diversity, this too will be reinforced.
This chapter explores fifteen years of DH conferences, presenting a quantified look at the values implicitly embedded in the event. Women are consistently underrepresented, in spite of the fact that the most prominent figures at the conference are as likely women as men. The geographic representation of authors has become more diverse over time—though authors with non-English names are still significantly less likely to pass peer review. The topical landscape is heavily gendered, suggesting a masculine bias may be built into the value system of the conference itself. Without data on skin color or ethnicity, we are unable to address racial or related diversity and bias here.
There have been some improvements over time and, especially recently, a growing awareness of diversity-related issues. While many of the conference’s negative traits are simply reflections of larger entrenched academic biases, this is no comfort when self-reinforcing biases foster a culture of microaggression and white male privilege. Rather than using this study as an excuse to write off DH as just another biased community, we offer statistics, critiques, and suggestions as a vehicle to improve ADHO’s conference, and through it the rest of self-identified Digital Humanities.
Digital humanities (DH), we are told, exists under a “big tent”, with porous borders, little gatekeeping, and, heck, everyone’s just plain “nice”. Indeed, the term itself is not used definitionally, but merely as a “tactical convenience” to get stuff done without worrying so much about traditional disciplinary barriers. DH is “global”, “public”, and diversely populated. It will “save the humanities” from its crippling self-reflection (cf. this essay), while simultaneously saving the computational social sciences from their uncritical approaches to data. DH contains its own mirror: it is both humanities done digitally, and the digital as scrutinized humanistically. As opposed to the staid, backwards-looking humanities we are used to, the digital humanities “experiments”, “plays”, and even “embraces failure” on ideological grounds. In short, we are the hero Gotham needs.
Digital Humanities, we are told, is a narrowly-defined excuse to push a “neoliberal agenda”, a group of “bullies” more interested in forcing humanists to code than in speaking truth to power. It is devoid of cultural criticism, and because of the way DHers uncritically adopt tools and methods from the tech industry, they in fact often reinforce pre-existing power structures. DH is nothing less than an unintentionally rightist vehicle for techno-utopianism, drawing from the same font as MOOCs and complicit in their devaluing of education, diversity, and academic labor. It is equally complicit in furthering both the surveillance state and the surveillance economy, exemplified in its stunning lack of response to the Snowden leaks. As a progeny of the computer sciences, digital humanities has inherited the same lack of gender and racial diversity, and any attempt to remedy the situation is met with incredible resistance.
The truth, as it so often does, lies somewhere in the middle of these extreme caricatures. It’s easy to ascribe attributes to Digital Humanities synecdochically, painting the whole with the same brush as one of its constituent parts. One would be forgiven, for example, for coming away from the annual international ADHO Digital Humanities conference assuming DH were a parade of white men quantifying literary text. An attendee of HASTAC, on the other hand, might leave seeing DH as a diverse community focused on pedagogy, but lacking in primary research. Similar straw-snapshots may be drawn from specific journals, subcommunities, regions, or organizations.
But these synecdoches have power. Our public face sets the course of DH, via who it entices to engage with us, how it informs policy agendas and funding allocations, and who gets inspired to be the next generation of digital humanists. Especially important is the constituency and presentation of the annual Digital Humanities conference. Every year, several hundred students, librarians, staff, faculty, industry professionals, administrators and researchers converge for the conference, organized by the Alliance of Digital Humanities Organizations (ADHO). As an umbrella organization of six international digital humanities constituent organizations, as well as 200 DH centers in a few dozen countries, ADHO and its conference ought to represent the geographic, disciplinary, and demographic diversity of those who identify as digital humanists. And as DH is a community that prides itself on its activism and its social/public goals, if the annual DH conference does not celebrate this diversity, the DH community may suffer a crisis of identity (…okay, a bigger crisis of identity).
So what does the DH conference look like, to an outsider? Is it diverse? What topics are covered? Where is it held? Who is participating, who is attending, and where are they coming from? This essay offers incomplete answers to these questions for fifteen years of DH conferences (2000-2015), focusing particularly on DH2013 (Nebraska, USA), DH2014 (Lausanne, Switzerland), and DH2015 (Sydney, Australia). 2 We do so with a double-agenda: (1) to call out the biases and lack of diversity at ADHO conferences in the earnest hope it will help improve future years’ conferences, and (2) to show that simplistic, reductive quantitative methods can be applied critically, and need not feed into techno-utopic fantasies or an unwavering acceptance of proxies as a direct line to Truth. By “distant reading” DH and turning our “macroscopes” on ourselves, we offer a critique of our culture, and hopefully inspire fruitful discomfort in DH practitioners who apply often-dehumanizing tools to their subjects, but have not themselves fallen under the same distant gaze.
Among other findings, we observe a large gender gap for authorship that is not mirrored among those who simply attend the conference. We also show a heavily gendered topical landscape, which likely contributes to topical biases during peer review. Geographic diversity has improved over fifteen years, suggesting ADHO’s strategy to expand beyond the customary North American / European rotation was a success. That said, there continues to be a visible bias against non-English names in the peer review process. We could not get data on ethnicity, race, or skin color, but given our regional and name data, as well as personal experience, we suspect in this area, diversity remains quite low.
We do notice some improvement over time and, especially in the last few years, a growing awareness of our own diversity problems. The #whatifDH2016 3 hashtag, for example, was a reaction to an all-male series of speakers introducing DH2015 in Sydney. The hashtag caught on and made it to ADHO’s committee on conferences, who will use it in planning future events. Our remarks here are in the spirit of #whatifDH2016; rather than using this study as an excuse to defame digital humanities, we hope it becomes a vehicle to improve ADHO’s conference, and through it the rest of our community.
Social Justice and Equality in the Digital Humanities
Diversity in the Academy
In order to contextualize gender and ethnicity in the DH community, we must take into account developments throughout higher education. This is especially important since much of DH work is done in university and other Ivory Tower settings. Clear progress has been made from the times when all-male, all-white colleges were the norm, but there are still concerns about the marginalization of scholars who are not white, male, able-bodied, heterosexual, or native English-speakers. Many campuses now have diversity offices and have set diversity-related goals at both the faculty and student levels (for example, see the Ohio State University’s diversity objectives and strategies 2007-12). On the digital front, blogs such as Conditionally Accepted, Fight the Tower, University of Venus, and more all work to expose the normative biases in academia through activist dialogue.
From both a historical and contemporary lens, there is data supporting the clustering of women and other minority scholars in certain realms of academia, from specific fields and subjects to contingent positions. When it comes to gender, the phrase “feminization” has been applied both to academia in general and to specific fields. It contains two important connotations: that of an area in which women are in the majority, and the sense of a change over time, such that numbers of women participants are increasing in relation to men (Leathwood and Read 2008, 10). It can also signal a less quantitative shift in values, “whereby ‘feminine’ values, concerns, and practices are seen to be changing the culture of an organization, a field of practice or society as a whole” (ibid).
In terms of specific disciplines, the feminization of academia has taken a particular shape. Historian Lynn Hunt suggests the following propositions about feminization in the humanities and history specifically: the feminization of history parallels what is happening in the social sciences and humanities more generally; the feminization of the social sciences and humanities is likely accompanied by a decline in status and resources; and other identity categories, such as ethnic minority status and age/generation, also interact with feminization in ways that are still becoming coherent.
Feminization has clear consequences for the perception and assignation of value of a given field. Hunt writes: “There is a clear correlation between relative pay and the proportion of women in a field; those academic fields that have attracted a relatively high proportion of women pay less on average than those that have not attracted women in the same numbers.” Thus, as we examine the topics that tend to be clustered by gender in DH conference submissions, we must keep in mind the potential correlations of feminization and value, though it is beyond the scope of this paper to engage in chicken-or-egg debates about the causal relationship between misogyny and the devaluing of women’s labor and women’s topics.
There is no obvious ethnicity-based parallel to the concept of the feminization of academia; it wouldn’t be culturally intelligible to talk about the “people-of-colorization of academia”, or the “non-white-ization of academia.” At any rate, according to a U.S. Department of Education survey, in 2013 79% of all full-time faculty in degree-granting postsecondary institutions were white. The increase of non-white faculty from 2009 (19.2% of the whole) to 2013 (21.5%) is very small indeed.
Why does this matter? As Jeffrey Milem, Mitchell Chang, and Anthony Lising Antonio write in regard to faculty of color, “Having a diverse faculty ensures that students see people of color in roles of authority and as role models or mentors. Faculty of color are also more likely than other faculty to include content related to diversity in their curricula and to utilize active learning and student-centered teaching techniques…a coherent and sustained faculty diversity initiative must exist if there is to be any progress in diversifying the faculty” (25). By centering marginalized voices, scholarly institutions have the ability to send messages about who is worthy of inclusion.
Recent Criticisms of Diversity in DH
In terms of DH specifically, diversity within the community and conferences has been on the radar for several years, and has recently gained special attention, as digital humanists and other academics alike have called for critical and feminist engagement in diversity and a move away from what seems to be an exclusionary culture. In January 2011, THATCamp SoCal included a section called “Diversity in DH,” in which participants explored the lack of openness in DH and, in the end, produced a document, “Toward an Open Digital Humanities” that summarized their discussions. The “Overview” in this document mirrors the same conversation we have had for the last several years:
We recognize that a wide diversity of people is necessary to make digital humanities function. As such, digital humanities must take active strides to include all the areas of study that comprise the humanities and must strive to include participants of diverse age, generation, sex, skill, race, ethnicity, sexuality, gender, ability, nationality, culture, discipline, areas of interest. Without open participation and broad outreach, the digital humanities movement limits its capacity for critical engagement. (ibid)
This proclamation represents the critiques of the DH landscape in 2011, in which DH practitioners and participants were assumed to be privileged and white, that they excluded student-learners, and that they held myopic views of what constitutes DH. Most importantly for this chapter, THATCamp SoCal’s “Diversity in DH” section participants called for critical approaches and social justice of DH scholarship and participation, including “principles for feminist/non-exclusionary groundrules in each session (e.g., ‘step up/step back’) so that the loudest/most entitled people don’t fill all the quiet moments.” They also advocated defending the least-heard voices “so that the largest number of people can benefit…”
These voices certainly didn’t fall flat. However, since THATCamps are often comprised of geographically local DH microcommunities, they benefit from an inclusive environment but suffer as isolated events. As result, it seems that the larger, discipline-specific venues which have greater attendance and attraction continue to amplify privileged voices. Even so, 2011 continued to represent a year that called for critical engagement in diversity in DH, with an explicit “Big Tent” theme for DH2011 held in Stanford, California. Embracing the concept the “Big Tent” deliberately opened the doors and widened the spectrum of DH, at least in terms of methods and approaches. However, as Melissa Terras pointed out, DH was “still a very rich, very western academic field” (Terras, 2011), even with a few DH2011 presentations engaging specifically with topics of diversity in DH. 4
A focus on diversity-related issues has only grown in the interim. We’ve recently seen greater attention and criticism of DH exclusionary culture, for instance, at the 2015 Modern Language Association (MLA) annual convention, which included the roundtable discussion “Disrupting Digital Humanities.” It confronted the “gatekeeping impulse” in DH, and echoing THATCamp SoCal 2011, these panelists aimed to shut down hierarchical dialogues in DH, encourage non-traditional scholarship, amplify “marginalized voices,” advocate for DH novices, and generously support the work of peers. 5 The theme for DH2015 in Sydney, Australia was “Global Digital Humanities,” and between its successes and collective action arising from frustrations at its failures, the community seems poised to pay even greater attention to diversity. Other recent initiatives in this vein worth mention include #dhpoco, GO::DH, and Jacqueline Wernimont’s “Build a Better Panel,” 6 whose activist goals are helping diversify the community and raise awareness of areas where the community can improve.
While it would be fruitful to conduct a longitudinal historiographical analysis of diversity in DH, more recent criticisms illustrate a history of perceived exclusionary culture, which is why we hope to provide a data-driven approach to continue the conversation and call for feminist and critical engagement and intervention.
While DH as a whole has been critiqued for its lack of diversity and inclusion, how does the annual ADHO DH conference measure up? To explore this in a data-driven fashion, we have gathered publicly available annual ADHO conference programs and schedules from 2000-2015. From those conference materials, we have entered presentation and author information into a spreadsheet to analyze various trends over time, such as gender and geography as indicators of diversity. Particular information that we have collected includes: presentation title, keywords (if available), abstract and full-text (if available), presentation type, author name, author institutional affiliation and academic department (if available), and corresponding country of that affiliation at the time of the presentation(s). We normalized and hand-cleaned names, institutions, and departments, so that, to the best of our knowledge, each author entry represented a unique person and, accordingly, was assigned a unique ID. Next, we added gender information (m/f/other/unknown) to authors by a combination of hand-entry and automated inference. While this is problematic for many reasons, 7 since it does not allow for diversity in gender options and tracing gender changes over time, it does give us a useful preliminary lense to view gender diversity at DH conferences.
For 2013’s conference, ADHO instituted a series of changes aimed at improving inclusivity, diversity, and quality. This drive was steered by that year’s program committee chair, Bethany Nowviskie, alongside 2014’s chair, Melissa Terras. Their reformative goals matched our current goals in this essay, and speak to a long history of experimentation and improvement efforts on behalf of ADHO. Their changes included making the conference more welcome to outsiders through ending policies that only insiders knew about; making the CFP less complex and easier to translate into multiple languages; taking reviewer language competencies into account systematically; and streamlining the submission and review process.
The biggest noticeable change to DH2013, however, was the institution of a reviewer bidding process and a phase of semi-open peer review. Peer reviewers were invited to read through and rank every submitted abstract according to how qualified they felt to review the abstract. Following this, the conference committee would match submissions to qualified peer reviewers, taking into account conflicts of interest. Submitting authors were invited to respond to reviews, and the committee would make a final decision based on the various reviews and rebuttals.This continues to be the process through DH2016. Changes continue to be made, most recently in 2016 with the addition of “Diversity” and “Multilinguality” as new keywords authors can append to their submissions.
While the list of submitted abstracts was private, accessible only to reviewers, as reviewers ourselves we had access to the submissions during the bidding phase. We used this access to create a dataset of conference submissions for DH2013, DH2014, and DH2015, which includes author names, affiliations, submission titles, author-selected topics, author-chosen keywords, and submission types (long paper, short paper, poster, panel).
We augmented this dataset by looking at the final conference programs in ‘13, ‘14, and ‘15, noting which submissions eventually made it onto the final conference program, and how they changed from the submission to the final product. This allows us to roughly estimate the acceptance rate of submissions, by comparing the submitted abstract lists to the final programs. It is not perfect, however, given that we don’t actually know whether submissions that didn’t make it to the final program were rejected, or if they were accepted and withdrawn. We also do not know who reviewed what, nor do we know the reviewers’ scores or any associated editorial decisions.
The original dataset, then, included fields for title, authors, author affiliations, original submission type, final accepted type, topics, keywords, and a boolean field for whether a submission made it to the final conference program. We cleaned the data up by merging duplicate people, ensuring e.g., if “Melissa Terras” was an author on two different submissions, she counted as the same person. For affiliations, we semi-automatically merged duplicate institutions, found the countries they reside in, and assigned those countries to broad UN regions. We also added data to the set, first automatically guessing a gender for each author, and then correcting the guesses by hand.
Given that abstracts were submitted to conferences with an expectation of privacy, we have not released the full submission dataset; we have, however, released the full dataset of final conference programs. 8
We would like to acknowledge the gross and problematic simplifications involved in this process of gendering authors without their consent or input. As Miriam Posner has pointed out, with regards to Getty’s Union List of Author Names, “no self-respecting humanities scholar would ever get away with such a crude representation of gender in traditional work”. And yet, we represent authors in just this crude fashion, labeling authors as male, female, or unknown/other. We did not encode changes of author gender over time, even though we know of at least a few authors in the dataset for whom this applies. We do not use the affordances of digital data to represent the fluidity of gender. This is problematic for a number of reasons, not least of which because, when we take a cookie cutter to the world, everything in the world will wind up looking like cookies.
We made this decision because, in the end, all data quality is contingent to the task at hand. It is possible to acknowledge an ontology’s shortcomings while still occasionally using that ontology to a positive effect. This is not always the case: often poor proxies get in the way a research agenda (e.g., citations as indicators of “impact” in digital humanities), rather than align with it. In the humanities, poor proxies are much more likely to get in the way of research than help it along, and afford the ability to make insensitive or reductivist decisions in the name of “scale”.
For example, in looking for ethnic diversity of a discipline, one might analyze last names as a proxy for country of origin, or analyze the color of recognized faces in pictures from recent conferences as a proxy for ethnic genealogy. Among other reasons, this approach falls short because ethnicity, race, and skin color are often not aligned, and last names (especially in the U.S.) are rarely indicative of anything at all. But they’re easy solutions, so people use them. These are moments when a bad proxy (and for human categories, proxies are almost universally bad) does not fruitfully contribute to a research agenda. As George E.P. Box put it, “all models are wrong, but some are useful.”
Some models are useful. Sometimes, the stars align and the easy solution is the best one for the question. If someone were researching immediate reactions of racial bias in the West, analyzing skin tone may get us something useful. In this case, the research focus is not someone’s racial identity, but someone’s race as immediately perceived by others, which would likely align with skin tone. Simply: if a person looks black, they’re more likely to be treated as such by the (white) world at large. 9
We believe our proxies, though grossly inaccurate, are useful for the questions of gender and geographic diversity and bias. The first step to improving DH conference diversity is noticing a problem; our data show that problem through staggeringly imbalanced regional and gender ratios. With regards to gender bias, showing whether reviewers are less likely to accept papers from authors who appear to be women can reveal entrenched biases, whether or not the author actually identifies as a woman. With that said, we invite future researchers to identify and expand on our admitted categorical errors, allowing everyone to see the contours of our community with even greater nuance.
The annual ADHO conference has grown significantly in the last fifteen years, as described in our companion piece 10, within which can be found a great discussion of our methods. This piece, rather than covering overall conference trends, focuses specifically on issues of diversity and acceptance rates. We cover geographic and gender diversity from 2000-2015, with additional discussions of topicality and peer review bias beginning in 2013.
Women comprise 36.1% of the 3,239 authors to DH conference presentations over the last fifteen years, counting every unique author only once. Melissa Terras’ names appears on 29 presentations between 200-2015, and Scott B. Weingart’s name appears on 4 presentations, but for the purpose of this metric each name counts only once. Female authorship representation fluctuates between 29%-38% depending on the year.
Weighting every authorship event individually (i.e., Weingart’s name counts 4 times, Terras’ 29 times), women’s representation drops to 32.7%. This reveals that women are less likely to author multiple pieces compared to their male counterparts. More than a third of the DH authorship pool are women, but fewer than a third of every name that appears on a presentation is a woman’s. Even fewer single-authored pieces are by a woman; only 29.8% of the 984 single-authored works between 2000-2015 female-authored. About a third (33.4%) of first authors on presentations are women. See Fig. 1 for a breakdown of these numbers over time. Note the lack of periodicity, suggesting gender representation is not affected by whether the conference is held in Europe or North America (until 2015, the conference alternated locations every year). The overall ratio wavers, but is neither improving nor worsening over time.
The gender disparity sparked controversy at DH2015 in Sydney. It was, however, at odds with a common anecdotal awareness that many of the most respected role-models and leaders in the community are women. To explore this disconnect, we experimented with using centrality in co-authorship networks as a proxy for fame, respectability, and general presence within the DH consciousness. We assume that individuals who author many presentations, co-author with many people, and play a central role in connecting DH’s disparate communities of authorship are the ones who are most likely to garner the respect (or at least awareness) of conference attendees.
We created a network of authors connected to their co-authors from presentations between 2000-2015, with ties strengthening the more frequently two authors collaborate. Of the 3,239 authors in our dataset, 61% (1,750 individuals) are reachable by one another via their co-authorship ties. For example, Beth Plale is reachable by Alan Liu because she co-authored with J. Stephen Downie, who co-authored with Geoffrey Rockwell, who co-authored with Alan Liu. Thus, 61% of the network is connected in one large component, and there are 299 smaller components, islands of co-authorship disconnected from the larger community.
The average woman co-authors with 5 other authors, and the average man co-authors with 5.3 other authors. The median number of co-authors for both men and women is 4. The average and median of several centrality measurements (closeness, betweenness, pagerank, and eigenvector) for both men and women are nearly equivalent; that is, any given woman is just as likely to be near the co-authorship core as any given man. Naturally, this does not imply that half of the most central authors are women, since only a third of the entire authorship pool are women. It means instead that gender does not influence one’s network centrality. Or at least it should.
The statistics show a curious trend for the most central figures in the network. Of the top 10 authors who co-author with the most others, 60% are women. Of the top 20, 45% are women. Of the top 50, 38% are women. Of the top 100, 32% are women. That is, the over half the DH co-authorship stars are women, but the further towards the periphery you look, the more men occupy the middle-tier positions (i.e., not stars, but still fairly active co-authors). The same holds true for the various centrality measurements: betweenness (60% women in top 10; 40% in top 20; 32% in top 50; 34% in top 100), pagerank (50% women in top 10; 40% in top 20; 32% in top 50; 28% in top 100), and eigenvector (60% women in top 10; 40% in top 20; 40% in top 50; 34% in top 100).
In short, half or more of the DH conference stars are women, but as you creep closer to the network periphery, you are increasingly likely to notice the prevailing gender disparity. This supports the mismatch between an anecdotal sense that women play a huge role in DH, and the data showing they are poorly represented at conferences. The results also match with the fact that women are disproportionately more likely to write about management and leadership, discussed at greater length below.
The heavily-male gender skew at DH conferences may lead one to suspect a bias in the peer review process. Recent data, however, show that if such a bias exists, it is not direct. Over the past three conferences, 71% of women and 73% of men who submitted presentations passed the peer review process. The difference is not great enough to rule out random chance (p=0.16 using χ²). The skew at conferences is more a result of fewer women submitting articles than of women’s articles not getting accepted. The one caveat, explained more below, is that certain topics women are more likely to write about are also less likely to be accepted through peer-review.
This does not imply a lack of bias in the DH community. For example, although only 33.5% of authors at DH2015 in Sydney were women, 46% of conference attendees were women. If women were simply uninterested in DH, the split in attendance vs. authorship would not be so high.
In regard to discussions of women in different roles in the DH community – less the publishing powerhouses and more the community leaders and organizers – the concept of the “glass cliff” can be useful. Research on the feminization of academia in Sweden uses the term “glass cliff” as a “metaphor used to describe a phenomenon when women are appointed to precarious leadership roles associated with an increased risk of negative consequences when a company is performing poorly and for example is experiencing profit falls, declining stock performance, and job cuts” (Peterson 2014, 4). The female academics (who also occupied senior managerial positions) interviewed in Helen Peterson’s study expressed concerns about increasing workloads, the precarity of their positions, and the potential for interpersonal conflict.
Institutional politics may also play a role in the gendered data here. Sarah Winslow says of institutional context that “female faculty are less likely to be located at research institutions or institutions that value research over teaching, both of which are associated with greater preference for research” (779). The research, teaching, and service divide in academia remains a thorny issue, especially given the prevalence of what has been called the pink collar workforce in academia, or the disproportionate amount of women working in low-paying teaching-oriented areas. This divide likely also contributed to differing gender ratios between attendees and authors at DH2015.
While the gendered implications of time allocation in universities are beyond the scope of this paper, it might be useful to note that there might be long-term consequences for how people spend their time interacting with scholarly tasks that extend beyond one specific institution. Winslow writes: “Since women bear a disproportionate responsibility for labor that is institution-specific (e.g., institutional housekeeping, mentoring individual students), their investments are less likely to be portable across institutions. This stands in stark contrast to men, whose investments in research make them more highly desirable candidates should they choose to leave their own institutions” (790). How this plays out specifically in the DH community remains to be seen, but the interdisciplinarity of DH along with its projects that span multiple working groups and institutions may unsettle some of the traditional bias that women in academia face.
Until 2015, the DH conference alternated every year between North America and Europe. As expected, until recently, the institutions represented at the conference have hailed mostly from these areas, with the primary locus falling in North America. In fact, since 2000, North American authors were the largest authorial constituency at eleven of the fifteen conferences, even though North America only hosted the conference seven times in that period.
With that said, as opposed to gender representation, national and institutional diversity is improving over time. Using an Index of Qualitative Variation (IQV), institutional variation begins around 0.992 in 2000 and ends around 0.996 in 2015, with steady increases over time. National IQV begins around 0.79 in 2010 and ends around 0.83 in 2015, also with steady increases over time. The most recent conference was the first that included over 30% of authors and attendees arriving from outside Europe or North America. Now that ADHO has implemented a three-year cycle, with every third year marked by a movement outside its usual territory, that diversity is likely to increase further still.
The most well-represented institutions are not as dominating as some may expect, given the common view of DH as a community centered around particular powerhouse departments or universities. The university with the most authors contributing to DH conferences (2.4% of the total authors) is King’s College London, followed by the Universities of Illinois (1.85%), Alberta (1.83%), and Virginia (1.75%). The most prominent university outside of North America or Europe is Ritsumeikan University, contributing 1.07% of all DH conference authors. In all, over a thousand institutions have contributed authors to the conference, and that number increases every year.
While these numbers represent institutional origins, the data available does not allow any further diving into birth countries, native language, ethnic identities, etc. The 2013-2015 dataset, including peer review information, does yield some insight into geography-influenced biases that may map to language or identity. While the peer review data do not show any clear bias by institutional country, there is a very clear bias against names which do not appear frequently in the U.S. Census or Social Security Index. We discovered this when attempting to statistically infer the gender of authors using these U.S.-based indices. 11 From 2013-2015, presentations written by those with names appearing frequently in these indices were significantly more likely to be accepted than those written by authors with non-English names (p < 0.0001). Whereas approximately 72% of authors with common U.S. names passed peer review, only 61% of authors with uncommon names passed. Without more data, we have no idea whether this tremendous disparity is due to a bias against popular topics from non-English-speaking countries, a higher likelihood of peer reviewers rejecting text written by non-native writers, an implicit bias by peer reviewers when they see “foreign” names, or something else entirely.
When submitting a presentation, authors are given the opportunity to provide keywords for their submission. Some keywords can be chosen freely, while others must be chosen from a controlled list of about 100 potential topics. These controlled keywords are used to help in the process of conference organization and peer reviewer selection, and they stay roughly constant every year. New keywords are occasionally added to the list, as in 2016, where authors can now select three topics which were not previously available: “Digital Humanities – Diversity”, “Digital Humanities – Multilinguality”, and “3D Printing”. The 2000-2015 conference dataset does not include keywords for every article, so this analysis will only cover the more detailed dataset, 2013-2015, with additional data on submissions for DH2016.
From 2013-2016, presentations were tagged with an average of six controlled keywords per submission. The most-used keywords are unsurprising: “Text Analysis” (tagged on 22% of submissions), “Data Mining / Text Mining” (20%), “Literary Studies” (20%), “Archives, Repositories, Sustainability And Preservation” (19%), and “Historical Studies” (18%). The most frequently-used keyword potentially pertaining directly to issues of diversity, “Cultural Studies”, appears on on 14% of submissions from 2013-2016. Only 2% of submissions are tagged with “Gender Studies”. The two diversity-related keywords introduced this year are already being used surprisingly frequently, with 9% of submissions in 2016 tagged “Digital Humanities – Diversity” and 6% of submissions tagged “Digital Humanities – Multilinguality”. With over 650 conference submissions for 2016, this translates to a reasonably large community of DH authors presenting on topics related to diversity.
Joining the topic and gender data for 2013-2015 reveals the extent to which certain subject matters are gendered at DH conferences. 12 Women are twice as likely to use the “Gender Studies” tag as male authors, whereas men are twice as likely to use the “Asian Studies” tag as female authors. Subjects related to pedagogy, creative / performing arts, art history, cultural studies, GLAM (galleries, libraries, archives, museums), DH institutional support, and project design/organization/management are more likely to be presented by women. Men, on the other hand, are more likely to write about standards & interoperability, the history of DH, programming, scholarly editing, stylistics, linguistics, network analysis, and natural language processing / text analysis. It seems DH topics have inherited the usual gender skews associated with the disciplines in which those topics originate.
We showed earlier that there was no direct gender bias in the peer review process. While true, there appears to be indirect bias with respect to how certain gendered topics are considered acceptable by the DH conference peer reviewers. A woman has just as much chance of getting a paper through peer review as a man if they both submit a presentation on the same topic (e.g., both women and men have a 72% chance of passing peer review if they write about network analysis, or a 65% chance of passing peer review if they write about knowledge representation), but topics that are heavily gendered towards women are less likely to get accepted. Cultural studies has a 57% acceptance rate, gender studies 60%, pedagogy 51%. Male-skewed topics have higher acceptance rates, like text analysis (83%), programming (80%), or Asian studies (79%). The female-gendering of DH institutional support and project organization also supports our earlier claim that, while women are well-represented among the DH leadership, they are more poorly represented in those topics that the majority of authors are discussing (programming, text analysis, etc.).
Regarding the clustering – and devaluing – of topics that women tend to present on at DH conferences, the widespread acknowledgement of the devaluing of women’s labor may help to explain this. We discussed the feminization of academia above, and indeed, this is a trend seen in practically all facets of society. The addition of emotional labor or caretaking tasks complicates this. Economist Teresa Ghilarducchi explains: “a lot of what women do in their lives is punctuated by time outside of the labor market — taking care of family, taking care of children — and women’s labor has always been devalued…[people] assume that she had some time out of the labor market and that she was doing something that was basically worthless, because she wasn’t being paid for it.” In academia specifically, the labyrinthine relationship of pay to tasks/labor further obscures value: we are rarely paid per task (per paper published or presented) on the research front; service work is almost entirely invisible; and teaching factors in with course loads, often with more up-front transparency for contingent laborers such as adjuncts and part-timers.
Our results seem to point to less of an obvious bias against women scholars than a subtler bias against topics that women tend to gravitate toward, or are seen as gravitating toward. This is in line with the concept of postfeminism, or the notion that feminism has met its main goals (e.g. getting women the right to vote and the right to an education), and thus is irrelevant to contemporary social needs and discourse. Thoroughly enmeshed in neoliberal discourse, postfeminism makes discussing misogyny seem obsolete and obscures the subtler ways in which sexism operates in daily life (Pomerantz, Raby, and Stefanik 2013). While individuals may or may not choose to identify as postfeminist, the overarching beliefs associated with postfeminism have permeated North American culture at a number of levels, leading us to posit the acceptance of the ideals of postfeminism as one explanation for the devaluing of topics that seem associated with women.
Discussion and Future Research
The analysis reveals an annual DH conference with a growing awareness of diversity-related issues, with moderate improvements in regional diversity, stagnation in gender diversity, and unknown (but anecdotally poor) diversity with regards to language, ethnicity, and skin color. Knowledge at the DH conference is heavily gendered, though women are not directly biased against during peer review, and while several prominent women occupy the community’s core, women occupy less space in the much larger periphery. No single or small set of institutions dominate the conference attendance, and though North America’s influence on ADHO cannot be understated, recent ADHO efforts are significantly improving the geographic spread of its constituency.
The DH conference, and by extension ADHO, is not the digital humanities. It is, however, the largest annual gathering of self-identified digital humanists, 13 and as such its makeup holds influence over the community at large. Its priorities, successes, and failures reflect on DH, both within the community and to the outside world, and those priorities get reinforced in future generations. If the DH conference remains as it is—devaluing knowledge associated with femininity, comprising only 36% women, and rejecting presentations by authors with non-English names—it will have significant difficulty attracting a more diverse crowd without explicit interventions. Given the shortcomings revealed in the data above, we present some possible interventions that can be made by ADHO or its members to foster a more diverse community, inspired by #WhatIfDH2016:
As pointed out by Yvonne Perkins, Ask presenters to include a brief “Collections Used” section, when appropriate. Such a practice would highlight and credit the important work being done by those who aren’t necessarily engaging in publishable research, and help legitimize that work to conference attendees.
As pointed out by Vika Zafrin, create guidelines for reviewers explicitly addressing diversity, and provide guidance on noticing and reducing peer review bias.
As pointed out by Vika Zafrin, community members can make an effort to solicit presentation submissions from women and people of color.
As pointed out by Vika Zafrin, collect and analyze data on who is peer reviewing, to see whether or the extent to which biases creep in at that stage.
As pointed out by Aimée Morrison, ensure that the conference stage is at least as diverse as the conference audience. This can be accomplished in a number of ways, from conference organizers making sure their keynote speakers draw from a broad pool, to organizing last-minute lightning lectures specifically for those who are registered but not presenting.
As pointed out by Christina Boyles, encourage the submission of research focused around the intersection of race, gender, and sexuality studies. This may be partially accomplished by including more topical categories for conference submissions, a step which ADHO has already taken for 2016.
As pointed out by many, take explicit steps in ensuring conference access to those with disabilities. We suggest this become an explicit part of the application package submitted by potential host institutions.
As pointed out by many, ensure the ease of participation-at-a-distance (both as audience and as speaker) for those without the resources to travel.
Give marginalized communities greater representation in the DH Conference peer reviewer pool. This can be done grassroots, with each of us reaching out to colleagues to volunteer as reviewers, and organizationally, perhaps by ADHO creating a volunteer group to seek out and encourage more diverse reviewers.
Consider the difference between diversifying (verb) vs. talking about diversity (noun), and consider whether other modes of disrupting hegemony, such as decolonization and queering, might be useful in these processes.
Contribute to the #whatifDH2016 and #whatifDH2017 discussions on twitter with other ideas for improvements.
Many options are available to improve representation at DH conferences, and some encouraging steps are already being taken by ADHO and its members. We hope to hear more concrete steps that may be taken, especially learned from experiences in other communities or outside of academia, in order to foster a healthier and more welcoming conference going forward.
In the interest of furthering these goals and improving the organizational memory of ADHO, the public portion of the data (final conference programs with full text and unique author IDs) is available alongside this publication [will link in final draft]. With this, others may test, correct, or improve our work. We will continue work by extending the dataset back to 1990, continuing to collect for future conferences, and creating an infrastructure that will allow the database to connect to others with similar collections. This will include the ability to encode more nuanced and fluid gender representations, and for authors to correct their own entries. Further work will also include exploring topical co-occurrence, institutional bias in peer review, how institutions affect centrality in the co-authorship network, and how authors who move between institutions affect all these dynamics.
The Digital Humanities will never be perfect. It embodies the worst of its criticisms and the best of its ideals, sometimes simultaneously. We believe a more diverse community will help tip those scales in the right direction, and present this chapter in service of that belief.
Peterson, Helen. “An Academic ‘Glass Cliff’? Exploring the Increase of Women in Swedish Higher Education Management.” Athens Journal of Education 1, no. 1 (February 2014): 32–44.
Pomerantz, Shauna, Rebecca Raby, and Andrea Stefanik. “Girls Run the World? Caught between Sexism and Postfeminism in the School.” *Gender & Society *27, no. 2 (April 1, 2013): 185-207. doi:10.1177/0891243212473199
Each author contributed equally to the final piece; please disregard authorship order. ↩
See Melissa Terras, “Disciplined: Using Educational Studies to Analyse ‘Humanities Computing.’” Literary and Linguistic Computing 21, no. 2 (June 1, 2006): 229–46. doi:10.1093/llc/fql022. Terras takes a similar approach, analyzing Humanities Computing “through its community, research, curriculum, teaching programmes, and the message they deliver, either consciously or unconsciously, about the scope of the discipline.” ↩
The authors have created a browsable archive of #whatifDH2016 tweets. ↩
Of the 146 presentations at DH2011, two standout in relation to diversity in DH: “Is There Anybody out There? Discovering New DH Practitioners in other Countries” and “A Trip Around the World: Balancing Geographical Diversity in Academic Research Teams.” ↩
See “Disrupting DH,” http://www.disruptingdh.com/ ↩
See Wernimont’s blog post, “No More Excuses” (September 2015) for more, as well as the Tumblr blog, “Congrats, you have an all male panel!” ↩
Miriam Posner offers a longer and more eloquent discussion of this in, “What’s Next: The Radical, Unrealized Potential of Digital Humanities.” Miriam Posner’s Blog. July 27, 2015. http://miriamposner.com/blog/whats-next-the-radical-unrealized-potential-of-digital-humanities/ ↩
[Link to the full public dataset, forthcoming and will be made available by time of publication]) ↩
We would like to acknowledge that race and ethnicity are frequently used interchangeably, though both are cultural constructs with their roots in Darwinian thought, colonialism, and imperialism. We retain these terms because they express cultural realities and lived experiences of oppression and bias, not because there is any scientific validity to their existence. For more on this tension, see John W.Burton, (2001), Culture and the Human Body: An Anthropological Perspective. Prospect Heights, Illinois: Waveland Press, 51-54. ↩
Weingart, S.B. & Eichmann, N. (2016). “What’s Under the Big Tent?: A Study of ADHO Conference Abstracts.” Manuscript submitted for publication. ↩
We used the process and script described in: Lincoln Mullen (2015). gender: Predict Gender from Names Using Historical Data. R package version 0.5.0.9000 (https://github.com/ropensci/gender) and Cameron Blevins and Lincoln Mullen, “Jane, John … Leslie? A Historical Method for Algorithmic Gender Prediction,” Digital Humanities Quarterly 9.3 (2015). ↩
For a breakdown of specific numbers of gender representation across all 96 topics from 2013-2015, see Weingart’s “Acceptances to Digital Humanities 2015 (part 4)”. ↩
While ADHO’s annual conference is usually the largest annual gathering of digital humanists, that place is constantly being vied for by the Digital Humanities Summer Institute in Victoria, Canada, which in 2013 boasted more attendees than DH2013 in Lincoln, Nebraska. ↩
[note: originally published as draft on March 17th, 2016]
DH2016 announced their final(ish) program yesterday and, of course, that means it’s analysis time. Every year, I steal scrape submission data from the reviewer interface, and then scrape the final conference program, to report acceptance rates and basic stats for the annual event. See my previous 7.2 million previous posts on the subject. Nobody gives me data, I take it (capta, amiright?), so take these results with as many grains of salt as you’ll find at the DH2016 salt mines.
As expected, this will be the biggest ADHO conference to date, continuing a mostly-consistent trend of yearly growth. Excluding workshops & keynotes, this year’s ADHO conference in Kraków, Poland will feature 417 posters & presentations, up from 259 in 2015 (an outlier, held in Australia) and the previous record of 345 in 2014 (Switzerland). At this rate, the number of DH presentations should surpass our human population by the year 2126 (or earlier in the case of unexpected zombies).
Acceptance rates this year are on par with previous years. An email from ADHO claims this year’s overall acceptance rate to be 62%, and my calculations put it at 64%. Previous years were within this range: 2013 Nebraska (64%), 2014 Switzerland (59%), and 2015 Australia (72%). Regarding form, the most difficult type of presentation to get through review is a long paper, with only 44% of submitted long papers being accepted as long papers. Another 7.5% of long papers were accepted as posters, and 10% as short papers. In total, 62% of long paper submissions were accepted in some form. Reviewers accepted 75% of panels and posters, leaving them mostly in their original form. The category least likely to get accepted in any form was the short paper, with an overall 59% acceptance rate (50% accepted as short papers; 8% accepted as posters). The moral of the story is that your best bet to get accepted to DH is to submit a poster. If you hate posters, submit a long paper; even if it’s not accepted as a long paper, it might still get in as a short or a poster. But if you do hate posters, maybe just avoid this conference.
About a third of this year’s presentations are single-authored, another third dual-authored, and the last third are authored by three or more people. As with 2013-2015, more authors means a more likely acceptance: reviewers accepted 51% of single-authored presentations, 66% of dual-authored presentations, and 74% of three-or-more-authored presentations.
Topically, the landscape of DH2016 will surprise few. A quarter of all presentations will involve text analysis, followed by historical studies (23% of all presentations), archives (21%), visualizations (20%), text/data mining (20%), and literary studies (20%). DH self-reflection is always popular, with this year’s hot-button issues being DH diversity (10%), DH pedagogy (10%), and DH facilities (7%). Surprisingly, other categories pertaining to pedagogy are also growing compared to previous years, though mostly it’s due to more submissions in that area. Reviewers still don’t rate pedagogy presentations very highly, but more on that in the next post. Some topical low spots compared to previous years include social media (2% of all presentations), anthropology (3%), VR/AR (3%), crowdsourcing (4%), and philosophy (5%).
This year will likely be the most linguistically diverse conference thus-far: 92% English, 7% French, 0.5% German, with other presentations in Spanish, Italian, Polish, etc. (And by “most linguistically diverse” obviously I mean “really not very diverse but have you seen the previous conferences?”) Submitting in a non-English language doesn’t appreciably affect acceptance rates.
That’s all for now. Stay-tuned for Pt. 2, with more thorough comparisons to previous years, actual granular data/viz on topics, analyses of gender and geography, as well as interpretations of what the changing landscape means for DH.
Twice a year I indulge my meta-disciplinary sweet tooth: once to look at who’s submitting what to ADHO’s annual digital humanities conference, and once to look at which pieces get accepted (see the rest of the series). This post presents my first look at DH2016 conference submissions, the data for which I scraped from ConfTool during the open peer review bidding phase. Open peer review bidding began in 2013, so I have 4 years of data. I opt not to publish this data, as most authors submit pieces under an expectation of privacy, and might violently throw things at my face if people find out which submissions weren’t accepted. Also ethics.
Submission Numbers & Types
The basic numbers: 652 submissions (268 long papers, 223 short papers, 33 panels / multiple paper sessions, 128 posters). For those playing along at home, that’s:
2013 Nebraska: 348 (144/118/20/66)
2014 Lausanne: 589 (250/198/30/111)
2015 Sydney: 360 (192/102/13/53)
2016 Kraków: 652 (268/223/33/128)
DH2016 submissions are on par to continue the consistent-ish trend of growth every year since 1999, the large dip in 2015 unsurprising given its very different author pool, and the fact that it was the first time the conference visited the southern hemisphere or Asia-Pacific. The different author pool in 2015 also likely explains why it was the only conference to deviate from the normal submission-type ratios.
Regarding co-authorship, the number has shifted this year, though not enough to pass any significance tests.
DH2016 has proportionally slightly fewer single authored papers than previous years, and slightly more 2-, 3-, and 4-authored papers. One submission has 17 authors (not quite the 5,154-author record of high energy physics, but we’re getting there, eh?), but mostly it’s par for the course here.
Topically, DH2016 submissions continue many trends seen previously.
Authors must tag their submissions into multiple categories, or topics, using a controlled vocabulary. The figure presents a list of topics tagged to submissions, ordered top-to-bottom by the largest proportion of submissions with a certain tag for 2016. Nearly 25% of DH2016 submissions, for example, were tagged with “Text Analysis”. The dashed lines represent previous years’ tag proportions, with the darkest representing 2015, getting lighter towards 2013. New topics, those which just entered the controlled vocabulary this year, are listed in red. They are 3D Printing, DH Multilinguality, and DH Diversity.
Scroll past the long figure below to read my analysis:
In a reveal that will shock all species in the known universe, text analysis dominates DH2016 submissions—the proportion even grew from previous years. Text & data mining, archives, and data visualization aren’t far behind, each growing from previous years.
What did actually (pleasantly) surprise me was that, for the first time since I began counting in 2013, history submissions outnumber literary ones. Compare this to 2013, when literary studies were twice as well represented as historical. Other top-level categories experiencing growth include: corpus studies, content analysis, knowledge representation, NLP, and linguistics.
Two areas which I’ve pointed out previously as needing better representation, geography and pedagogy, both grew compared to previous years. I’ve also pointed out a lack of discussion of diversity, but part of that lack was that authors had no “diversity” category to label their research with—that is, the issue I pointed out may have been as much a problem with the topic taxonomy as with the research itself. ADHO added “Diversity” and “Multilinguality” as potential topic labels this year, which were tagged to 9.4% and 6.5% of submissions, respectively. One-in-ten submissions dealing specifically with issues of diversity is encouraging to see.
Unsurprisingly, since Sydney, submissions tagged “Asian Studies” have dropped. Other consistent drops over the last few years include software design, A/V & multimedia (sadface), information retrieval, XML & text encoding, internet & social media-related topics, crowdsourcing, and anthropology. The conference is also getting less self-referential, with a consistent drop in DH histories and meta-analyses (like this one!). Mysteriously, submissions tagged with the category “Other” have dropped rapidly each year, suggesting… dunno, aliens?
I have the suspicion that some numbers are artificially growing because there are more topics tagged per article this year than previous years, which I’ll check and report on in the next post.
It may be while before I upload the next section due to other commitments. In the meantime, you can fill your copious free-time reading earlier posts on this subject or my recent book with Shawn Graham & Ian Milligan, The Historian’s Macroscope. Maybe you can buy it for your toddler this holiday season. It fits perfectly in any stocking (assuming your stockings are infinitely deep, like Mary Poppins’ purse, which as a Jew watching Christmas from afar I just always assume is the case).
Women are (nearly but not quite) as likely as men to be accepted by peer reviewers at DH conferences, but names foreign to the US are less likely than either men or women to be accepted to these conferences. Some topics are more likely to be written on by women (gender, culture, teaching DH, creative arts & art history, GLAM, institutions), and others more likely to be discussed by men (standards, archaeology, stylometry, programming/software).
You may know I’m writing a series on Digital Humanities conferences, of which this is the zillionth post. 1 This post has nothing to do with DH2015, but instead looks at DH2013, DH2014, and DH2015 all at once. I continue my recent trend of looking at diversity in Digital Humanities conferences, drawing especially on these two posts (1, 2) about topic, gender, and acceptance rates.
This post will be longer than usual, since Heather Froehlich rightly pointed out my methods in these posts aren’t as transparent as they ought to be, and I’d like to change that.
@scott_bot@nmhouston don’t get me wrong – i think they’re interesting and useful but i don’t think they’re 100% transparent
As someone who deals with algorithms and large datasets, I desperately seek out those moments when really stupid algorithms wind up aligning with a research goal, rather than getting in the way of it.
In the humanities, stupid algorithms are much more likely to get in the way of my research than help it along, and afford me the ability to make insensitive or reductivist decisions in the name of “scale”. For example, in looking for ethnic diversity of a discipline, I can think of two data-science-y approaches to solving this problem: analyzing last names for country of origin, or analyzing the color of recognized faces in pictures from recent conferences.
Obviously these are awful approaches, for a billion reasons that I need not enumerate, but including the facts that ethnicity and color are often not aligned, and last names (especially in the states) are rarely indicative of anything at all. But they’re easy solutions, so you see people doing them pretty often. I try to avoid that.
Sometimes, though, the stars align and the easy solution is the best one for the question. Let’s say we were looking to understand immediate reactions of racial bias; in that case, analyzing skin tone may get us something useful because we don’t actually care about the race of the person, what we care about is the immediateperceived race by other people, which is much more likely to align with skin tone. Simply: if a person looks black, they’re more likely to be treated as such by the world at large.
This is what I’m banking on for peer review data and bias. For the majority of my data on DH conferences, Nickoal Eichmann and I have been going in and hand-coding every single author with a gender that we glean from their website, pictures, etc. It’s quite slow, far from perfect (see my note), but it’s at least more sensitive than the brute force method, we hope to improve it quite soon with user-submitted genders, and it gets us a rough estimate of gender ratios in DH conferences.
But let’s say we want to discuss bias, rather than diversity. In that case, I actually prefer the brute force method, because instead of giving me a sense of the actual gender of an author, it can give me a sense of what the peer reviewers perceive an author’s gender to be. That is, if a peer reviewer sees the name “Mary” as the primary author of an article, how likely is the reviewer to think the author is written by a woman, and will this skew their review?
That’s my goal today, so instead of hand-coding like usual, I went to Lincoln Mullen’s fabulous package for inferring gender from first names in the programming language R. It does so by looking in the US Census and Social Security Database, looking at the percentage of men and women with a certain first name, and then gives you both the ratio of men-to-women with that name, and the most likely guess of the person’s gender.
Inferring Gender for Peer Review
I don’t have a palantír and my DH data access is not limitless. In fact, everything I have I’ve scraped from public or semi-public spaces, which means I have no knowledge of who reviewed what for ADHO conferences, the scores given to submissions, etc. What I do have the titles and author names for every submission to an ADHO conference since 2013 (explanation), and the final program of those conferences. This means I can see which submissions don’t make it to the presentation stage; that’s not always a reflection of whether an article gets accepted, but it’s probably pretty close.
So here’s what I did: created a list of every first name that appears on every submission, rolled the list it into Lincoln Mullen’s gender inference machine, and then looked at how often authors guessed to be men made it through to the presentation stage, versus how often authors guessed to women made it through. That is to say, if an article is co-authored by one man and three women, and it makes it through, I count it as one acceptance for men and three for women. It’s not the only way to do it, but it’s the way I did it.
I’m arguing this can be used as a proxy for gender bias in reviews and editorial decisions: that if first names that look like women’s names are more often rejected 2 than ones that look like men’s names, there’s likely bias in the review process.
Results: Bias in Peer Review?
Totaling all authors from 2013-2015, the inference machine told me 1,008 names looked like women’s names; 1,707 looked like men’s names; and 515 could not be inferred. “Could not be inferred” is code for “the name is foreign-sounding and there’s not enough data to guess”. Remember as well, this is counting every authorship as a separate event, so if Melissa Terras submits one paper in 2013 and one in 2014, the name “Melissa” appears in my list twice.
So we see that in 2013-2015, 70.3% of woman-authorship-events get accepted, 73.2% of man-authorship-events get accepted, and only 60.6% of uninferrable-authorship-events get accepted. I’ll discuss gender more soon, but this last bit was totally shocking to me. It took me a second to realize what it meant: that if your first name isn’t a standard name on the US Census or Social Security database, you’re much less likely to get accepted to a Digital Humanities conference. Let’s break it out by year.
We see an interesting trend here, some surprising, some not. Least surprising is that the acceptance rates for non-US names is most equal this year, when the conference is being held so close to Asia (which the inference machine seems to have the most trouble with). My guess is that A) more non-US people who submit are actually able to attend, and B) reviewers this year are more likely to be from the same sorts of countries that the program is having difficulties with, so they’re less likely to be biased towards non-US first names. There’s also potentially a language issue here: that non-US submissions are more likely to be rejected because they are either written in another language, or written in a way that native English speakers may find difficult to understand.
But the fact of the matter is, there’s a very clear bias against submissions by people with names non-standard to the US. The bias, oddly, is most pronounced in 2014, when the conference was held in Switzerland. I have no good guesses as to why.
So now that we have the big effect out of the way, let’s get to the small one: gender disparity. Honestly, I had expected it to be worse; it is worse this years than the two previous, but that may just be statistical noise. It’s true that women do fair worse overall by 1-3%, which isn’t huge, but it’s big enough to mention. However.
Topics and Gender
However, it turns out that the entire gender bias effect we see is explained by the topical bias I already covered the other day. (Scroll down for the rest of the post.)
What’s shown here will be fascinating to many of us, and some of it more surprising than others. A full 67% of authors on the 25 DH submissions labeled “gender studies” are labeled as women by Mullen’s algorithm. And remember, many of those may be the same author; for example if “Scott Weingart” is listed as an author on multiple submissions, this chart counts those separately.
Other topics that are heavily skewed towards women: drama, poetry, art history, cultural studies, GLAM, and (importantly), institutional support and DH infrastructure. Remember how I said a large percentage of of those responsible for running DH centers, committees, and organizations are women? This is apparently the topic they’re publishing in.
If we look instead at the bottom of the chart, those topics skewed towards men, we see stylometrics, programming & software, standards, image processing, network analysis, etc. Basically either the CS-heavy topics, or the topics from when we were still “humanities computing”, a more CS-heavy community. These topics, I imagine, inherit their gender ratio problems from the various disciplines we draw them from.
You may notice I left out pedagogical topics from my list above, which are heavily skewed towards women. I’m singling that out specially because, if you recall from my previous post, pedagogical topics are especially unlikely to be accepted to DH conferences. In fact, a lot of the topics women are submitting in aren’t getting accepted to DH conferences, you may recall.
It turns out that the gender bias in acceptance ratios is entirely accounted for by the topical bias. When you break out topics that are not gender-skewed (ontologies, UX design, etc.), the acceptance rates between men and women are the same – the bias disappears. What this means is the small gender bias is coming at the topical level, rather than at the gender level, and since women are writing more about those topics, they inherit the peer review bias.
Does this mean there is no gender bias in DH conferences?
No. Of course not. I already showed yesterday that 46% of attendees to DH2015 are women, whereas only 35% of authors are. What it means is the bias against topics is gendered, but in a peculiar way that actually may be (relatively) easy to solve, and if we do solve it, it’d also likely go a long way in solving that attendee/authorship ratio too.
Get more women peer reviewing for DH conferences.
Although I don’t know who’s doing the peer reviews, I’d guess that the gender ratio of peer reviewers is about the same as the ratio of authors; 34% women, 66% men. If that is true, then it’s unsurprising that the topics women tend to write about are not getting accepted, because by definition these are the topics that men publishing at DH conferences find less interesting or relevant3. If reviewers gravitate towards topics of their own interest, and if their interests are skewed by gender, it’d also likely skew results of peer review. If we are somehow able to improve the reviewer ratio, I suspect the bias in topic acceptance, and by extension gender acceptance, will significantly reduce.
Jacqueline Wernimont points out in a comment below that another way improving the situation is to break the “gender lines” I’ve drawn here, and make sure to attend presentations on topics that are outside your usual scope if (like me) you gravitate more towards one side than another.
Obviously this is all still preliminary, and I plan to show the breakdown of acceptances by topic and gender in a later post so you don’t just have to trust me on it, but at the 2,000-word-mark this is getting long-winded, and I’d like feedback and thoughts before going on.
There’s a disparity between gender diversity in authorship and attendance at DH2015; attendees are diverse, authors aren’t. That said, the geography of attendance is actually pretty encouraging this year. A lot of this work draws a project on the history of DH conferences I’m undertaking with the inimitable Nickoal Eichmann. She’s been integral on the research of everything you read about conferences pre-2013.
Diversity at DH2015: Preliminary Numbers
For those just joining us, I’m analyzing this year’s international Digital Humanities conference being held in Sydney, Australia (part 1, part 2). This is the 10th post in a series of reflective entries on Digital Humanities conferences, throughout which I explore the landscape of Digital Humanities as it is represented by the ADHO conference. There are other Digital Humanities (a great place to start exploring them in Alex Gil’s arounddh), but since this is the biggest event, it’s also an integral reflection on our community to the public and non-DH academic world.
If the DH conference is our public face, we all hope it does a good job of representing our constituent parts, big or small. It does not. The DH conference systematically underrepresents women and people from parts of the world that are not Europe or North America.
Until today, I wasn’t sure whether this was an issue of underrepresentation, an issue of lack of actual diversity among our constituents, or both. Today’s data have shown me it may be more underrepresentation than lack of diversity, although I can’t yet say anything with certainty without data from more conferences.
I come to this conclusion by comparing attendees to the conference to authors of presentations at the conference. My assumption is that if authorship and attendee diversity are equal, and both poor, then we have a diversity problem. If instead attendance is diverse but authorship is not, then we have a representation problem. It turns out, at least in this dataset, the latter is true. I’ve been able to reach the conclusion because the conference organizing committee (themselves a diverse, fantastic bunch) have published and made available the DH2015 attendance list.
Because this is an important subject, this post is more somber and more technically detailed than most others in this series.
The published Attendance List was nice enough to already attach country names to every attendee, so making an interactive map to attendees was a simple manner of cleaning the data (here it is as csv), aggregating it and plugging it into CartoDB.
Despite a lack of South American and African attendees, this is still a pretty encouraging map for DH2015, especially compared to earlier years. The geographic diversity of attendees is actually mirrored in the conference submissions (analyzed here), which to my mind means the ADHO decision to hold the conference somewhere other than North America or Europe succeeded in its goal of diversifying the organization. From what I hear, they hope to continue this trend by moving to a three-year rotation, between North America, Europe, and elsewhere. At least from this analysis, that’s a successful strategy.
If we look at the locations of authors at ADHO conferences from 2004-2013, we see a very different profile than is apparent this year in Sydney. The figure below, made by my collaborator Nickoal Eichmann, shows all author locations from ADHO conferences in this 10-year range.
Notice the difference in geographic profile from this year?
This also hides the sheer prominence of the Americas (really, just North America) at every single ADHO conference since 2004. The figure below shows the percentage of authors from different regions at DH2004-2013, with Europe highlighted in orange during the years the conference was held in Europe.
If you take a second to study this visualization, you’ll notice that with only one major exception in 2012, even when the conference was held in Europe, the majority of authors hailed from the Americas. That’s cray-cray, yo. Compare that to 2015 data from Figure 2; the Americas are still technically sending most of the authors, but the authorship pool is significantly more regionally diverse than the decade of 2004-2013.
Actually, even before the DH conference moved to Australia, we’ve been getting slightly more geographically diverse. Figure 5, below, shows a slight increase in diversity score from 2004-2013.
In sum, we’re getting better! Also, our diversity of attendance tends to match our diversity of authorship, which means we’re not suffering an underrepresentation problem on top of a lack of diversity. The lack of diversity is obviously still a problem, but it’s improving, and in no small part to the efforts of ADHO to move the annual conference further afield.
Gravy train’s over, folks. We’re getting better with geography, sure, but what about gender? Turns out our gender representation in DH sucks, it’s always sucked, and unless we forcibly intervene, it’s likely to continue to suck.
We’ve probably inherited our gender problem from computer science, which is weird, because such a large percentage of leadership in DH organizations, committees, and centers are women. What’s more, the issue isn’t that women aren’t doing DH, it’s that they’re not being well-represented at our international conference. Instead they’re going to other conferences which are focused on diversity, which as Jacqueline Wernimont points out, is less than ideal.
@scott_bot was talking about this w @amyeetx and others and said that I’m frustrated to have other major events be the “gender” events 1/2
So what’s the data here? Let’s first look historically.
Figure 6 shows percentage of women authors at DH2004-DH2013. The data were collected in collaboration with Nickoal Eichmann. 1
Notice the alarming tendency for DH conference authorship to hover between 30-35% women. Women fair slightly better as first authors—that is to say, if a woman authors an ADHO presentation, they’re more likely to be a first author than a second or third. This matches well with the fact that a lot of the governing body of DH organizations are women, and yet the ratio does not hold in authorship. I can’t really hazard a guess as to why that is.
Gender in 2015
Which brings us to 2015 in Sydney. I was encouraged to see the organizing committee publish an attendance list, and immediately set out to find the gender distribution of attendees. 2Hurray! I tweeted. About 46% of attendees to DH2015 were women. That’s almost 50/50!
Armed with the same hope I’ve felt all week (what with two fantastic recent Supreme Court decisions, a Papal decree on global warming, and the dropping of confederate flags all over the country), I set out to count gender among authors at DH2015.
Preliminary results show 34.6% 3 of authors at DH2015 are women. Status quo quo quo quo.
So how do we reconcile the fact that only 35% of authors at DH2015 are women, yet 46% of attendees are? I’m interpreting this to mean that we don’t have a diversity problem, but a representation problem; for some reason, though women comprise nearly half of active participants at DH conferences, they only comprise a third of what’s actually presented at them.
This representation issue is further reflected by the topical analysis of DH2015, which shows that only 10% of presentations are tagged as cultural studies, and only 1% as gender studies. Previous years show a similar low number for both topics. (It’s worth noting that cultural studies tend to have a slightly lower-than-average acceptance rate, while gender studies has a slightly higher-than-average acceptance rate. Food for thought.)
Given this, how do we proceed? At an individual level, obviously, people are already trying to figure out paths forward, but what about at the ADHO level? Their efforts, and efforts of constituent members, have been successful at improving regional diversity at our flagship annual event. What sort of intervention can we create to similarly improve our gender representation problems? Hopefully comments below, or Twitter conversation, might help us collaboratively build a path forward, or offer suggestions to ADHO for future events. 4
We labeled authors as male, female, or unknown/other. We did not encode changes of author gender over time, even though we know of at least a few authors in the dataset for whom this would apply. We hope to remedy this issue in the near future by asking authors themselves to help us with identification, and we ourselves at least tried to be slightly more sensitive by labeling author gender by hand, rather than by using an algorithm to guess based on the author’s first name.
This series of choices was problematic, but we felt it was worth it as a first pass as a vehicle to point out bias and lack of representation in DH, and we hope you all will help us improve our very rudimentary dataset soon. ↩
This is an even more problematic analysis than that of conference authorship. I used Lincoln Mullen’s fabulous gender guessing library in R, which guesses gender based on first names and statistics from US Social Security data, but obviously given the regional diversity of the conference, a lot of its guesses are likely off. As with the above data, we hope to improve this set as time goes on. ↩
Very preliminary, but probably not far off; again using Lincoln Mullen’s R library. ↩
Obviously I’m far from the first to come to this conclusion, and many ADHO committee members are already working on this problem (see GO::DH), but the more often we point out problems and try to come up with solutions, the better. ↩
This post’s about the topical coverage of DH2015 in Australia. If you’re curious about how the landscape compares to previous years, see this post. You’ll see a lot of text, literature, and visualizations this year, as well as archives and digitisation projects. You won’t see a lot of presentations in other languages, or presentations focused on non-text sources. Gender studies is pretty much nonexistent. If you want to get accepted, submit pieces about visualization, text/data, literature, or archives. If you want to get rejected, submit pieces about pedagogy, games, knowledge representation, anthropology, or cultural studies.
I’m sorry. This post is going to contain a lot of giant pictures, because I’m in the mountains of Australia and I’d much rather see beautiful vistas than create interactive visualizations in d3. Deal with it, dweebs. You’re just going to have to do a lot of scrolling down to see the next batch of text.
This year’s conference presents a mostly-unsurprising continuations of the status quo (see 2014’s and 2013’s topical landscapes). Figure 1, below, shows the top author-chosen topic words of DH2015, as a proportion of the total presentations at the conference. For example, an impressive quarter, 24%, of presentations at DH2015 are about “text analysis”. The authors were able to choose multiple topics for each presentation, which is why the percentages add up to way more than 100%.
Scroll down for the rest of the post.
Text analysis, visualization, literary studies, data mining, and archives take top billing. History’s a bit lower, but at least there’s more history than the abysmal showing at DH2013. Only a tenth of DH2015 presentations are about DH itself, which is maybe impressive given how much we talk about ourselves? (cf. this post)
As usual, gender studies representation is quite low (1%), as are foreign language presentations and presentations not centered around text. I won’t do a lot of interpretation this post, because it’d mostly be repeat of earlier years. At any rate, acceptance rate is a bit more interesting than coverage this time around. Figure 2 shows acceptance rates of each topic, ordered by volume. Figure 3 shows the same, sorted by acceptance rate.
The topics that appear most frequently at the conference are on the far left, and the red line shows the percent of submitted articles that will be presented at DH2015. The horizontal black line is the overall acceptance rate to the conference, 72%, just to show which topics are above or below average.
Notice that all the most well-represented topics at DH2015 have a higher-than-average acceptance rate, possibly suggesting a bit of path-dependence on the part of peer reviewers or editors. Otherwise, it could mean that, since a majority peer reviewers were also authors in the conference, and since (as I’ve shown) the majority of authors have a leaning toward text, lit, and visualization, it’s also what they’re likely to rate highly in peer review.
The first dips we see under the average acceptance rate is “Interdisciplinary Studies” and “Historical Studies” (☹), but the dips aren’t all that low, and we ought not to read too much into it without comparing it to earlier conferences. More significant are the low rates for “Cultural Studies”, and even more than that are the two categories on Teaching, Pedagogy, and Curriculum. Both categories’ acceptance rates are about 20% under the average, and although they’re obviously correlated with one another, the acceptance rates are similar to 2014 and 2013. In short, DH peer reviewers or editors are more unlikely to accept submissions on pedagogy than on most other topics, even though they sometimes represent a decent chunk of submissions.
Other low points worth pointing out are “Anthropology” (huh, no ideas there), “Games and Meaningful Play” (that one came as a surprise), and “Other” (can’t help you here). Beyond that, the submission counts are too low to read any meaningful interpretations into the data. The Game Studies dip is curious, and isn’t reflected in earlier conferences, so it could just be noise for 2015. The low acceptance rates in Anthropology are consistent 2013-2015, and it’d be worth looking more into that.
Topical Co-Occurrence, 2013-2015
Figure 4, below, shows how topics appear together on submissions to DH2013, DH2014, and DH2015. Technically this has nothing to do with acceptances, and little to do with this year specifically, but the visualization should provide a little context to the above analysis. Topics connect to one another if they appear on a submission together, and the line connecting them gets thicker the more connections two topics share.
Although the “Interdisciplinary Collaboration” topic has a low acceptance rate, it understandably ties the network together; other topics that play a similar role are “Visualization”, “Programming”, “Content Analysis”, “Archives”, and “Digitisation”. All unsurprising for a conference where people come together around method and material. In fact, this reinforces our “DH identity” along those lines, at least insofar as it is represented by the annual ADHO conference.
There’s a lot to unpack in this visualization, and I may go into more detail in the next post. For now, I’ve got a date with the Blue Mountains west of Sydney.
[Update!] Melissa Terras pointed out I probably made a mistake on 2015 long paper -> short paper numbers. I checked, and she was right. I’ve updated the figures accordingly.
Part 1 is about sheer numbers of acceptances to DH2015 and comparisons with previous years. DH is still growing, but the conference locale likely prohibited a larger conference this year than last. Acceptance rates are higher this year than previous years. Long papers still reign supreme. Papers with more authors are more likely to be accepted.
It’s that time of the year again, when all the good little boys, girls, and other genders of DH gather around the scottbot irregular in pointless meta-analysis quiet self-reflection. As most of you know, the 2015 Digital Humanities conference occurs next week in Sydney, Australia. They’ve just released the final program, full of pretty exciting work, which means I can compare it to my analysis of submissions to DH2015 (1, 2, & 3) to see how DH is changing, how work gets accepted or rejected, etc. This is part of my series on analyzing DH conferences.
Part 1 will focus on basic counts, just looking at percentages of acceptance and rejection by the type of presentation, and comparing it with previous years. Later posts will cover topical, gender, geography, and kangaroos. NOTE: When I say “acceptances”, I really mean “presentations that appear on the final program.” More presentations were likely accepted and withdrawn due to the expense of traveling to Australia, so take these numbers with appropriate levels of skepticism.1
Around 270 papers, posters, and workshops are featured in this year’s conference program, down from last year’s ≈350 but up from DH2013’s ≈240. Although this is the first conference since 2010 with fewer presentations than the previous year’s, I suspect this is due largely to geographic and monetary barriers, and we’ll see a massive uptick next year in Poland and the following in (probably) North America. Whether or not the trend will continue to increase in 2018’s Antarctic locale, or 2019’s special Lunar venue, has yet to be seen. 2
As you can see from the chart above, even given this year’s dip, both DH2015 and the annual DHSI event in Victoria reveals DH is still on the rise. It’s also worth noting that last year’s DHSI was likely the first where more people attended it than the international ADHO conference.
A full 72% of submissions to DH2015 will be presented in Sydney next week. That’s significantly more inclusive than previous years: 59% of submitted manuscripts made it to DH2014 in Lausanne, and 64% to DH2013.
At first blush, the loss of exclusivity may seem a bad sign of a conference desperate for attendees, but to my mind the exact opposite is true: this is a great step forward. Conference peer review & acceptance decisions aren’t particularly objective, so using acceptance as a proxy for quality or relevance is a bit of a misdirection. And if we can’t aim for consistent quality or relevance in the peer review process, we ought to aim at least for inclusivity, or higher acceptance rates, and let the participants themselves decide what they want to attend.
Acceptance rates broken down by form (panel, poster, short paper, long paper) aren’t surprising, but are worth noting.
73% of submitted long papers were accepted, but only 45% of them were accepted as long papers. The other 28% were accepted as posters or short papers.
61% of submitted short papers were accepted, but only 51% as short papers; the other 10% became posters.
85% of posters were accepted, all of them as posters.
85% of panels were accepted, but one of them was accepted as a long paper.
A few papers/panels were converted into workshops.
Weirdly, short papers tend to have a lower acceptance rate than long papers over the last three years. I think that’s because if a long paper is rejected, it’s usually further along in the process enough that it’s more likely to be secondarily accepted-as-a-poster, but even that doesn’t account for the entire differential in the acceptance rate. Anyone have any thoughts on this?
Looking over time, we see an increasingly large slice of the DH conference pie is taken up by long papers. My guess is this is just a natural growth as authors learn the difference between long and short papers, a distinction which was only introduced relatively recently.
This is simply wrong with the updated data (tip of the hat to Melissa Terras for pointing it out); the ratio of long papers to short papers is still in flux. My “guess” from earlier was just that, a post-hoc explanation attached to an incorrect analysis. Matthew Lincoln has a great description about why we should be wary of these just-so stories. Go read it.
The breakdown of acceptance rates for each conference isn’t very informative, due in part to the fact I only have the last three years. In another few years this will probably become interesting, but for those who just can’t get enough o’ them sweet sweet numbers, here they are, special for you:
DH is still pretty single-author-heavy. It’s getting better; over the last 10 years we’ve seen an upward trend in number of authors per paper (more details in a future blog post), but the last three years have remained pretty stagnant. This year, 35% of presentations & posters will be by a single author, 25% by two authors, 13% by 3 authors, and so on down the line. The numbers are unremarkably consistent with 2013 and 2014.
We do however see an interesting trend in acceptance rates by number of authors. The more authors on your presentation, the more likely your presentation is to be accepted. This is true of 2013, 2014, and 2015. Single-authored works are 54% likely to be accepted, while works authored by two authors are 67% likely to be accepted. If your submission has more than 7 authors, you’re incredibly unlikely to get rejected.
Obviously this is pure description and correlation; I’m not saying multi-authored works are higher quality or anything else. Sometimes, works with more authors simply have more recognizable names, and thus are more likely to be accepted. That said, it is interesting that large projects seem to be favored in the peer review process for DH conferences.
Stay-tuned for parts 2, π, 16, and 4, which will cover such wonderful subjects as topicality, gender, and other things that seem neat.
The appropriate level of skepticism here is 19.27 ↩
This is the third post in a three-part series analyzing submissions to the 2015 Digital Humanities conference in Australia. In parts 1 & 2, I covered submission volumes, topical coverage, and comparisons to conferences in previous years. This post will briefly address the geography of submissions, further exploring my criticism that this global-themed conference doesn’t feel so global after all. My geographic analysis shows the conference to be more international than I originally suspected.
I’d like to explore whether submissions to DH2015 are more broad in scope than those to previous conferences as well, but given time constraints, I’ll leave that exploration to a later post in this series, which has covered submissions and acceptances at DH conferences since 2013.
For this analysis, I looked at the universities of the submitting (usually lead) author on every submission, and used a geocoder to extract country, region, and continent data for each given university. This means that every submission is attached to one and only one location, even if other authors are affiliated with other places. Not perfect, but good enough for union work. After the geocoding, I corrected the results by hand 1, and present those results here.
It is immediately apparent that the DH2015 authors represent a more diverse geographical distribution than those in previous years. DH2013 in Nebraska was the only conference of the three where over half of submissions were concentrated in one continental region, the Americas. The Switzerland conference in 2014 had a slightly more even regional distribution, but still had very few contributions (11%) from Asia or Oceania. Contrast these heavily skewed numbers against DH2015 in Australia, with a third of the contributions coming from Asia or Oceania.
The trend continues broken down by UN micro-continental regions. The trends are not unexpected, but they are encouraging. When the conference was in Switzerland, Northern and Western Europe were much more well-represented, as was (surprisingly?) Eastern Asia. This may present the case that Eastern Asia’s involvement in DH is on the rise even not taking into account conference locations. Submissions for 2015 in Sydney are well-represented by Australia, New Zealand, Eastern Asia, and even Eastern Europe and Southern Asia.
One trend is pretty clear: the dominance of North America. Even at its lowest point in 2015, authors from North America comprise over a third of submissions. This becomes even more stark in the animation below, on which every submitting author’s country is represented.
The coverage from the United States over the course of the last three years barely changes, and from Canada shrinks only slightly when the conference moves off of North America. The UK also pretty much retains its coverage 2013-2015, hovering around 10% of submissions. Everywhere else the trends are pretty clear: a slow move eastward as the conference moves east. It’ll be interesting to see how things change in Poland in 2016, and wherever it winds up going in 2017.
In sum, it turns out “Global Digital Humanities 2015” is, at least geographically, much more global than the conferences of the previous two years. While the most popular topics are pretty similar to those in earlier years, I haven’t yet done an analysis of the diversity of the less popular topics, and it may be that they actually prove more diverse than those in earlier years. I’ll save that analysis for when the acceptances come in, though.
It’s a small enough dataset. There’s 648 unique institutional affiliations listed on submissions from 2013-2015, which resolved to 49 unique countries in 14 regions on 4 continents. ↩