A Review of Quantitative Analyses of the Digital Humanities
As an historian who combines quantitative and qualitative methods to study the scholarly activities of dead scientists, it’s no shock that I sometimes turn the macroscope back on my own scholarly community. Other people do too, and there are enough of them to warrant a list/review, so here it is.
Please contribute missing pieces in the comments, and I’ll add them.
Conferences & Events
There are a bunch of conferences, large and small, regional and global, digital humanities and humanities computing. Data for these are usually culled from conference programs. In some cases, submission data are also available, allowing for acceptance statistics.
The largest international conference is hosted by the Alliance of Digital Humanities Organizations (ADHO). The only other DH-related event that I know of that rivals ADHO’s in size is the Digital Humanities Summer Institute (DHSI) in Victoria.
Before ADHO, the equivalent conference was ACH/ALLC. In the LLC article “Disciplined: Using Educational Studies to Analyse ‘Humanities Computing’”, Melissa Terras asks about the disciplinarity of Humanities Computing from an educational perspective. Part of the piece is a concordance analysis of ACH/ALLC conference abstracts from 1996-2005 (excluding 2003), another section looks at ALLC/ACH/Humanist-L membership statistics, and a large portion of it provides an in-depth analysis of people, topics, and geography at ACH/ALLC 2005.
In 2012, Frédéric Clavert blogged a geographic map of reviewers for DH2012, using it to explain a discrepancy between his experience of DH and a more common view of the community.
More recently, I have worked on quantitative analyses of ADHO conferences, both submissions and acceptances, in a series of blog posts. They cover topic, people, geography, and peer review bias. I’ve covered ADHO 2013 (1, 2), 2014 (1, 2), 2015 (1, 2, 3, 4, 5, 6, 7), and 2016 (1, 2, …).
With Nickoal Eichmann, I started exploring earlier conferences in a similar fashion. We presented “What’s under the Big Tent?” at DHSI 2015, looking at gender, geography, and topic at ADHO’s DH2004-DH2014, a closer look at which is hopefully forthcoming in Digital Studies / Le champ numérique.
With Nickoal Eichmann and Jeana Jorgensen, I explore ADHO’s DH2000-DH2015, looking specifically at diversity-related issues at the conference. “Representation at Digital Humanities Conferences (2000-2015)” will hopefully appear in an upcoming book, Feminist Debates in Digital Humanities, in 2017.
At DH2018, Pino-Diaz & Fiormonte presented a geographic and topical analysis of presentations at DH2017 (Montreal) in La geopolítica de las humanidades digitales: un caso de estudio de DH2017 Montreal.
José Calvo Tello analyzed HDH2015 / EADH Day 2015 in Spain, looking at gender, academic affiliation, and geography. He also analyzed DHd2016 in Germany, looking primarily at institutional and country affiliation. Patrick Sahle followed up with an analysis of DHd2018.
Max Kemman does a similar set of analyses for submissions to DHBenelux 2016, and a comparative analysis for submissions to DHBenelux 2014-2016. Recently, he added an analysis of submissions to DHBenelux 2017 and DHBenelux 2018.
Eetu Mäkelä and Mikko Tolonen analyzed DHN(ordic) 2018 by region, topic, and time, and concluded with suggestions for future iterations of the conference.
In “Exploring Regional Development of Digital Humanities Research: A Case Study for Taiwan“, Kuang-hua Chen and Bi-Shin Hsueh analyze DADH 2009-2012 in Taiwan, focusing on topics, citations, authorship, geography, and time. Their article is especially notable for how they classify subject area in references.
The disciplinary breadth of studies taking data from journals is probably the largest, since bibliometric analysis is fairly popular, and journals are the coin of that realm. The journals I’m most familiar with are Digital Humanities Quarterly (DHQ), Digital Scholarship in the Humanities (DSH) (previously known as Literary and Linguistic Computing (LLC)), and Digital Studies / Le champ numérique. This betrays my Western perspective, so I hope those who know of studies of other journals will add them in the comments.
The study with the longest time span that I know of is by Julianne Nyhan and Oliver Duke-Williams, published in LLC, “Joint and multi-authored publication patterns in the Digital Humanities”. They analyze Computers and the Humanities (1966-2004) and Literary and Linguistic Computing (1986-2011), and find that a small group of people tend to co-author together, while the majority of authors do not connect with other co-authorship clusters. They also find that, despite common rhetoric of collaboration, single-authorships predominate and co-authorships aren’t really on the rise from 1966-2011. Some of their data are available here.
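For readers unfamiliar with this kind of analysis, here is a minimal sketch of how co-authorship clusters and the share of single-authored papers might be computed. It is not Nyhan and Duke-Williams’ actual method or data: the `papers` list, the author labels, and both function names are hypothetical, and clusters are taken to be connected components of the co-authorship graph, which is an assumption on my part.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_components(papers):
    """Group authors into connected components of the co-authorship graph.

    `papers` is a list of author lists, one list per paper.
    """
    neighbors = defaultdict(set)
    for authors in papers:
        for a in authors:
            neighbors[a]  # register every author, including solo authors
        for a, b in combinations(authors, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first walk collecting everyone reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbors[node] - seen)
        components.append(component)
    # Largest cluster first.
    return sorted(components, key=len, reverse=True)


def single_author_share(papers):
    """Fraction of papers that list exactly one author."""
    return sum(1 for authors in papers if len(authors) == 1) / len(papers)


# Hypothetical toy data: each inner list is the author list of one paper.
papers = [["A", "B"], ["B", "C"], ["D"], ["E"], ["F", "G"], ["H"]]
components = coauthor_components(papers)
print([sorted(c) for c in components])
print(single_author_share(papers))
```

On real journal metadata, a few large components alongside many singletons would correspond to the pattern described above, though author-name disambiguation would need to be handled first.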
A student of Duke-Williams, Jin Gao (along with co-authors), has recently presented a number of quantitative analyses of DH journals, which tend to focus on changing aspects of the community over time. Her DH2017 presentation “The Intellectual Structure of Digital Humanities: An Author Co-Citation Analysis” focused on citations, and her DH2018 presentation “Visualising The Digital Humanities Community: A Comparison Study Between Citation Network And Social Network” added Twitter data to the mix.
Edward Vanhoutte provides a thematic breakdown of articles in DSH (previously LLC) in 2015 in a slide/tweet.
Muh-Chyun Tang, Yun-Jen Cheng, Kuang-hua Chen, and Jieh Hsiang presented “A Longitudinal Analysis of Knowledge Integration in Digital Humanities Using Co-citation Analysis” at DH2015. In the study, they analyze 2,509 articles relating to DH from over six journals, looking at structural properties of the resulting citation network. Their findings include a fracturing of the DH community and a table of the most important cited sources in the dataset. If the preliminary DH2016 conference schedule is any indication, the same group will be following up on this study at DH2016. They published a follow-up piece in Scientometrics, A longitudinal study of intellectual cohesion in digital humanities using bibliometric analyses.
There is another study out of Taiwan, which I believe is a dissertation looking at references from 5 international DH journals, but since I cannot read the language I cannot report much more about it. Help would be appreciated.
In “Analysing an Academic Field through the Lenses of Internet Science: Digital Humanities as a Virtual Community“, Almila Akdag Salah, Andrea Scharnhorst, and Sally Wyatt explore nearly 700 DH-related journal articles from various sources, situating the community within a larger disciplinary context (mostly structural). The same authors, swapping Sally Wyatt with Loet Leydesdorff, presented “Mapping the Flow of ‘Digital Humanities’” at DH2010 with similar research. Akdag-Salah and Leydesdorff also published “Maps on the basis of the Arts & Humanities Citation Index: The journals Leonardo and Art Journal versus “digital humanities” as a topic“, which among other analyses presents a structural citation map drawn from around 100 digital humanities journal articles in 2008.
In “Mapping Cultures in the Big Tent: Multidisciplinary Networks in the Digital Humanities Quarterly”, a team of information visualization students (Dulce Maria de la Cruz, Jake Kaupp, Max Kemman, Kristin Lewis, and Teh-Hen Yu) performed a citation and co-authorship analysis of 200 Digital Humanities Quarterly (DHQ) articles, plus the 4,000 references they all cite. The study spans 2007-2014 and shows some structural properties of the networks, as well as key research articles and authors.
Gregory Palermo, a PhD student at Northeastern, also performed a citation analysis of Digital Humanities Quarterly (2007-2014), which helpfully includes his code in R.
Money makes the world go ’round. I have the suspicion that the A.W. Mellon Foundation (which funds my position at CMU) probably funds as much DH in the U.S. as the NEH, but only the NEH data are easily accessible.
John D. Martin III and Carolyn Runyon have an interesting piece exploring the representation of genders and races/ethnicities within grant projects, “Digital humanities, digital hegemony: exploring funding practices and unequal access in the digital humanities“. They look at $225 million in NEH funding for 656 DH-related projects from 2007-2016, analyzing what genders and races/ethnicities are the subjects of those grants.
Blogs & Platforms
DH happens in blogs, yeah? Also platforms like HASTAC and Hypotheses.org. Here’s some analysis:
Matt Burton’s 2015 dissertation, “Blogs as Infrastructure for Scholarly Communication”, analyzes 400 DH blogs comprising 100,000 posts, 1995-2013, using trace ethnography and topic modeling. Among other things, the piece explores the themes of DH blogging, and the topical diversity of individual blogs.
In 2015, Cornelius Puschmann and Marco Bastos published “How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms“, which thematically analyzes Hypotheses.org and HASTAC as scholarly platforms from 2006-2013. Their study utilizes word co-occurrence and topic modeling on 14,000 posts, and primarily looks for markers of disciplinarity. Their 14,000-post data are available at the link above.
When people talk about social media use in digital humanities, they mostly mean Twitter. If you want to know how DHers use Twitter, see Quan-Haase, Martin, & McCay-Peet (2015) for a survey perspective. Many in DH call social media the back-channel, where conferences and similar are the front-channel.
In “Disciplinary differences in Twitter scholarly communication”, Kim Holmberg & Mike Thelwall analyze many tweets, 90,000 of them being tagged as DH-related. The study was primarily focused on how different communities use Twitter differently.
Ross, Terras, Warwick, and Welsh looked at 5,000 tweets from three 2009 DH conferences in “Enabled backchannel: conference Twitter use by digital humanists“, based on hashtags. While they include some thematic and author-based analysis, the piece largely analyzes Twitter user practices: whether they RT, are taking part in a conversation, are sharing links, etc.
Two particularly prolific researchers in this area are Martin Grandjean and Yannick Rochat, who have collaborated on several analyses of DH on Twitter. Some analyses one or both have contributed to include a Twitter visualization of the DH Summer School in Switzerland, 2013, of the 5e colloque international de l’Institut Historique Allemand in Paris, 2013, of Francophone THATCamp St. Malo, 2013, of ADHO’s DH2014 in Switzerland, and others. Grandjean has also worked on the larger DH Twitter network, not associated with a particular event (2014 – 800 users, 2015 – 1400 users). He eventually published an article on this study, “A social network analysis of Twitter: Mapping the digital humanities community”, analyzing 2,500 DHers on Twitter.
A number of surveys have gone around asking for DH responses. They ask after what resources DHers use, how they interact online, what departments they find themselves in, etc. I’ll be honest, I haven’t really been following what comes out of them, so I’d particularly appreciate help tracking down these results.
One cool and expansive project has a home at mapahd.org, which tracks DH community happenings among speakers of Spanish and Portuguese. Élika Ortega and Silvia E. Gutiérrez ran the survey in 2013, which received around 80 respondents, and presented their analysis of geography, disciplinary home, methodological interests, institutional affiliation, etc. A writeup of their work is available at “Ciencias Sociales y Humanidades Digitales: Técnicas, herramientas y experiencias de e-Research e investigación en colaboración“.
Another survey & study, performed by Bowman, Demarest, me (ahem Weingart), Simpson, Lariviere, Thelwall, & Sugimoto, looks at the communicative practices of digital humanists. “Mapping DH through heterogeneous communicative practices” (and slides) presents analysis of a survey on where practitioners produce and consume research, as well as how they participate in conferences, mailing lists, etc.
A report by Marin Dacos in 2016, “La stratégie du sauna finlandais: Les frontières des Digital Humanities”, reported the results of a 2012 survey of 851 digital humanists, focusing particularly on the geography and language of respondents. Perhaps most interesting is the DHDP, an index Dacos coined to compare the number of surveyed DH practitioners in a particular region to the number of reviewers / gatekeepers from that region for DH2012, thus showing which regions held the most power in setting the agenda for the conference.
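To make the idea of such an index concrete, here is a rough sketch of one way to compare a region’s share of reviewers with its share of survey respondents. This is only my reading of the description above, not Dacos’ published formula: the ratio-of-shares definition, the function name `representation_ratio`, and the region labels and counts are all assumptions made up for illustration.

```python
def representation_ratio(respondents, reviewers):
    """Ratio of each region's reviewer share to its respondent share.

    Assumed definition (not taken from Dacos' report): a value above 1
    means the region supplies proportionally more reviewers than survey
    respondents; a value below 1 means proportionally fewer.
    """
    total_respondents = sum(respondents.values())
    total_reviewers = sum(reviewers.values())
    ratios = {}
    for region, n_respondents in respondents.items():
        if n_respondents == 0:
            continue  # ratio is undefined without any respondents
        respondent_share = n_respondents / total_respondents
        reviewer_share = reviewers.get(region, 0) / total_reviewers
        ratios[region] = reviewer_share / respondent_share
    return ratios


# Hypothetical counts, invented purely for illustration.
respondents = {"Region X": 50, "Region Y": 50}
reviewers = {"Region X": 30, "Region Y": 10}
ratios = representation_ratio(respondents, reviewers)
print(ratios)
```

With these made-up numbers, Region X holds three quarters of the reviewing seats but only half of the respondents, so its ratio comes out above 1, while Region Y’s falls below 1.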
Centers & Research Groups
There aren’t many analyses of DH centers or research communities, but there are a few large-scale visualizations to mention. One is project GrinUGR’s Atlas of Social Sciences & Digital Humanities, mapping people, centers, and projects in Spanish, English, and other languages. Another is ADHO’s centerNET, a map of about 200 DH centers around the world.
DHers love collecting syllabuses and arguing about the correct pluralization of the word syllabus, but I don’t believe there are many quantitative analyses of those syllabi or curricula.
Chris Alen Sula, Sarah Hackney, and Phillip Cunningham performed one such analysis recently, exploring the amount and makeup of global DH educational programs, 1990-2016. Their data and visualizations are available on Tableau.
Other work includes Melissa Terras’ “Disciplined: Using Educational Studies to Analyse ‘Humanities Computing’“, described above, and Lisa Spiro’s analysis of 134 DH syllabim (1, 2, 3), 2006-2011. Spiro covers topics such as disciplinary home, type of degree program, common words or readings, and so forth.
In 2015, Daniel Carter and Tanya Clement collected information on 22 DH certificate/degree programs in the US and Canada, and the 253 courses listed in those programs. They then hand-categorized the courses and Carter created a visualization connecting these 253 DH courses across 22 programs between 7 categories.
I know there are some out there – help me find them?
The two big DH mailing lists are Humanist-L and TEI-L. I think I’ve seen a handful of analyses of these listservs, but the only one I remember is David McClure’s great visualization from 2014, which analyzes 12 million words in Humanist-L messages from 1987-2014 (the data for which are available here). The project mostly presents common connecting words for the list, showing for example how “printers” and “wordperfect” were eventually replaced by “electronic” and “preservation” as topics of discussion.
An earlier analysis of Humanist-L by Geoffrey Rockwell & Stefan Sinclair was presented at DH2012, titled “The Swallow Flies Swiftly Through: An Analysis of Humanist“. A longer version was published as a chapter in their book Hermeneutica, and is available online.
Like DH more generally, some studies defy easy classification.
In 2009, Xiaoguang Wang & Mitsuyuki Inaba explored both conferences and journals in “Analyzing Structures and Evolution of Digital Humanities Based on Correspondence Analysis and Co-word Analysis“. They analyzed the text of DHQ, LLC, and DH2005-2008 to show the changing disciplinary landscape of the field, as well as the shift from humanities computing to digital humanities.
Probably the most famous DH quantification is Melissa Terras’ 2012 infographic “Quantifying Digital Humanities“, which includes lots of data gathered at and before that time. It’s a snapshot of the size of the community, showing the number of centers, subscriptions to the journal LLC, views of common DH resources, money allocations, and so forth.
Building on and problematizing Terras’ study, Amy Earhart explores the global context of digital humanities in her article “Digital Humanities Within a Global Context: Creating Borderlands of Localized Expression”, which reviews both quantitative and qualitative analyses of the geography of digital humanities.