Acceptances to Digital Humanities 2015 (part 2)

Had enough yet? Too bad! Full-ahead into my analysis of DH2015, part of my 6,021-part series on DH conference submissions and acceptances. If you want more context, read the Acceptances to DH2015 part 1.

tl;dr

This post’s about the topical coverage of DH2015 in Australia. If you’re curious about how the landscape compares to previous years, see this post. You’ll see a lot of text, literature, and visualizations this year, as well as archives and digitisation projects. You won’t see a lot of presentations in other languages, or presentations focused on non-text sources. Gender studies is pretty much nonexistent. If you want to get accepted, submit pieces about visualization, text/data, literature, or archives. If you want to get rejected, submit pieces about pedagogy, games, knowledge representation, anthropology, or cultural studies.

Topical analysis

I’m sorry. This post is going to contain a lot of giant pictures, because I’m in the mountains of Australia and I’d much rather see beautiful vistas than create interactive visualizations in d3. Deal with it, dweebs. You’re just going to have to do a lot of scrolling down to see the next batch of text.

This year’s conference presents a mostly-unsurprising continuation of the status quo (see 2014’s and 2013’s topical landscapes). Figure 1, below, shows the top author-chosen topic words of DH2015, as a proportion of the total presentations at the conference. For example, an impressive quarter, 24%, of presentations at DH2015 are about “text analysis”. The authors were able to choose multiple topics for each presentation, which is why the percentages add up to way more than 100%.

Scroll down for the rest of the post.

Figure 1. Topical coverage of DH2015. Percent represents the % of presentations which authors have tagged with a certain topical keyword. Authors could tag multiple keywords per presentation.
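If the "adds up to way more than 100%" bit seems odd, here is a minimal sketch of how coverage like this could be computed. The toy presentations and the `topical_coverage` helper are made up for illustration; this is not the actual script or data behind Figure 1.

```python
from collections import Counter

# Hypothetical toy data: each presentation carries a set of author-chosen topic tags.
presentations = [
    {"Text Analysis", "Visualization"},
    {"Text Analysis", "Literary Studies", "Data Mining"},
    {"Archives", "Visualization"},
    {"Historical Studies"},
]

def topical_coverage(tagged):
    """Return {topic: share of presentations tagged with that topic}."""
    counts = Counter(topic for tags in tagged for topic in tags)
    total = len(tagged)
    return {topic: n / total for topic, n in counts.items()}

coverage = topical_coverage(presentations)

# Most-covered topics first, ties broken alphabetically.
for topic, share in sorted(coverage.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{topic:20s} {share:.0%}")

# Each presentation counts once per tag, so the shares need not sum to 100%.
print(f"sum of shares: {sum(coverage.values()):.0%}")
```

Because a presentation with three tags is counted three times, the shares sum to the average number of tags per presentation, not to 1.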

Text analysis, visualization, literary studies, data mining, and archives take top billing. History’s a bit lower, but at least there’s more history than the abysmal showing at DH2013. Only a tenth of DH2015 presentations are about DH itself, which is maybe impressive given how much we talk about ourselves? (cf. this post)

As usual, gender studies representation is quite low (1%), as are foreign language presentations and presentations not centered around text. I won’t do a lot of interpretation this post, because it’d mostly be a repeat of earlier years. At any rate, acceptance rate is a bit more interesting than coverage this time around. Figure 2 shows acceptance rates of each topic, ordered by volume. Figure 3 shows the same, sorted by acceptance rate.

The topics that appear most frequently at the conference are on the far left, and the red line shows the percent of submitted articles that will be presented at DH2015. The horizontal black line is the overall acceptance rate to the conference, 72%, just to show which topics are above or below average.

Figure 2. Acceptance rates of topics to DH2015, sorted by volume. Click to enlarge.
Figure 3. Acceptance rates of topics to DH2015, sorted by acceptance rate. Click to enlarge.
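For the curious, the two orderings in Figures 2 and 3 could come out of something like the sketch below. The submissions, the accept flags, and the `acceptance_by_topic` helper are all invented for illustration, so treat it as a rough outline rather than the real pipeline.

```python
from collections import defaultdict

# Hypothetical toy data: (topic tags, whether the submission is on the final program).
submissions = [
    ({"Text Analysis", "Visualization"}, True),
    ({"Text Analysis", "Pedagogy"}, False),
    ({"Visualization", "Archives"}, True),
    ({"Pedagogy"}, False),
    ({"Text Analysis"}, True),
]

def acceptance_by_topic(subs):
    """Return {topic: (acceptance rate, number of submissions tagged with it)}."""
    submitted = defaultdict(int)
    accepted = defaultdict(int)
    for tags, ok in subs:
        for topic in tags:
            submitted[topic] += 1
            accepted[topic] += ok  # True counts as 1, False as 0
    return {t: (accepted[t] / submitted[t], submitted[t]) for t in submitted}

# The horizontal line in the figures: the overall acceptance rate.
overall = sum(ok for _, ok in submissions) / len(submissions)
rates = acceptance_by_topic(submissions)

# Figure 2 style: most-submitted topics first. Figure 3 style: highest rate first.
by_volume = sorted(rates, key=lambda t: (-rates[t][1], t))
by_rate = sorted(rates, key=lambda t: (-rates[t][0], t))

for topic in by_volume:
    rate, n = rates[topic]
    side = "above" if rate > overall else "at or below"
    print(f"{topic:15s} n={n} rate={rate:.0%} ({side} the overall {overall:.0%})")
```

Note that a submission with several tags contributes to every one of its topics, which is one reason the per-topic rates are correlated with each other.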

Notice that all the most well-represented topics at DH2015 have a higher-than-average acceptance rate, possibly suggesting a bit of path-dependence on the part of peer reviewers or editors. Otherwise, it could mean that, since a majority of peer reviewers were also authors in the conference, and since (as I’ve shown) the majority of authors have a leaning toward text, lit, and visualization, it’s also what they’re likely to rate highly in peer review.

The first dips we see under the average acceptance rate are “Interdisciplinary Studies” and “Historical Studies” (☹), but the dips aren’t all that low, and we ought not to read too much into them without comparing them to earlier conferences. More significant are the low rates for “Cultural Studies”, and lower still are the two categories on Teaching, Pedagogy, and Curriculum. Both categories’ acceptance rates are about 20% under the average, and although they’re obviously correlated with one another, the acceptance rates are similar to 2014 and 2013. In short, DH peer reviewers or editors are less likely to accept submissions on pedagogy than on most other topics, even though they sometimes represent a decent chunk of submissions.

Other low points worth pointing out are “Anthropology” (huh, no ideas there), “Games and Meaningful Play” (that one came as a surprise), and “Other” (can’t help you here). Beyond that, the submission counts are too low to read any meaningful interpretations into the data. The Game Studies dip is curious, and isn’t reflected in earlier conferences, so it could just be noise for 2015. The low acceptance rates in Anthropology are consistent 2013-2015, and it’d be worth looking more into that.

Topical Co-Occurrence, 2013-2015

Figure 4, below, shows how topics appear together on submissions to DH2013, DH2014, and DH2015. Technically this has nothing to do with acceptances, and little to do with this year specifically, but the visualization should provide a little context to the above analysis. Topics connect to one another if they appear on a submission together, and the line connecting them gets thicker the more connections two topics share.

Figure 4. Topical co-occurrence, 2013-2015. Click to enlarge.
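The edge weights in a network like Figure 4 boil down to counting topic pairs. A minimal sketch, assuming each submission is just a set of tags (the toy data and the `cooccurrence` helper are hypothetical, and the layout and drawing are left out entirely):

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: the topic tags on three submissions.
submissions = [
    {"Visualization", "Text Analysis", "Archives"},
    {"Visualization", "Text Analysis"},
    {"Archives", "Digitisation"},
]

def cooccurrence(tagged):
    """Count how often each unordered pair of topics shares a submission."""
    edges = Counter()
    for tags in tagged:
        # Sorting makes (a, b) and (b, a) land on the same key.
        for a, b in combinations(sorted(tags), 2):
            edges[(a, b)] += 1
    return edges

edges = cooccurrence(submissions)
for (a, b), weight in edges.most_common():
    print(f"{a} <-> {b}: {weight}")
```

The counts are the line thicknesses; a topic that shows up in many different pairs is one that ties the network together.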

Although the “Interdisciplinary Collaboration” topic has a low acceptance rate, it understandably ties the network together; other topics that play a similar role are “Visualization”, “Programming”, “Content Analysis”, “Archives”, and “Digitisation”. All unsurprising for a conference where people come together around method and material. In fact, this reinforces our “DH identity” along those lines, at least insofar as it is represented by the annual ADHO conference.

There’s a lot to unpack in this visualization, and I may go into more detail in the next post. For now, I’ve got a date with the Blue Mountains west of Sydney.

Acceptances to Digital Humanities 2015 (part 1)

[Update!] Melissa Terras pointed out I probably made a mistake on 2015 long paper -> short paper numbers. I checked, and she was right. I’ve updated the figures accordingly.

tl;dr

Part 1 is about sheer numbers of acceptances to DH2015 and comparisons with previous years. DH is still growing, but the conference locale likely prohibited a larger conference this year than last. Acceptance rates are higher this year than previous years. Long papers still reign supreme. Papers with more authors are more likely to be accepted.

Introduction

It’s that time of the year again, when all the good little boys, girls, and other genders of DH gather around the scottbot irregular in pointless meta-analysis, er, quiet self-reflection. As most of you know, the 2015 Digital Humanities conference occurs next week in Sydney, Australia. They’ve just released the final program, full of pretty exciting work, which means I can compare it to my analysis of submissions to DH2015 (1, 2, & 3) to see how DH is changing, how work gets accepted or rejected, etc. This is part of my series on analyzing DH conferences.

Part 1 will focus on basic counts, just looking at percentages of acceptance and rejection by the type of presentation, and comparing them with previous years. Later posts will cover topics, gender, geography, and kangaroos. NOTE: When I say “acceptances”, I really mean “presentations that appear on the final program.” More presentations were likely accepted and withdrawn due to the expense of traveling to Australia, so take these numbers with appropriate levels of skepticism. 1

Volume

Around 270 papers, posters, and workshops are featured in this year’s conference program, down from last year’s ≈350 but up from DH2013’s ≈240. Although this is the first conference since 2010 with fewer presentations than the previous year’s, I suspect this is due largely to geographic and monetary barriers, and we’ll see a massive uptick next year in Poland and the following in (probably) North America. Whether or not the trend will continue to increase in 2018’s Antarctic locale, or 2019’s special Lunar venue, has yet to be seen. 2

Annual presentations at DH conferences, compared to growth of DHSI in Victoria.

As you can see from the chart above, even given this year’s dip, both DH2015 and the annual DHSI event in Victoria reveal that DH is still on the rise. It’s also worth noting that last year’s DHSI was likely the first where more people attended it than the international ADHO conference.

Acceptance Rates

A full 72% of submissions to DH2015 will be presented in Sydney next week. That’s significantly more inclusive than previous years: 59% of submitted manuscripts made it to DH2014 in Lausanne, and 64% to DH2013.

At first blush, the loss of exclusivity may seem a bad sign of a conference desperate for attendees, but to my mind the exact opposite is true: this is a great step forward. Conference peer review & acceptance decisions aren’t particularly objective, so using acceptance as a proxy for quality or relevance is a bit of a misdirection. And if we can’t aim for consistent quality or relevance in the peer review process, we ought to aim at least for inclusivity, or higher acceptance rates, and let the participants themselves decide what they want to attend.

Form

Acceptance rates broken down by form (panel, poster, short paper, long paper) aren’t surprising, but are worth noting.

  • 73% of submitted long papers were accepted, but only 45% of them were accepted as long papers. The other 28% were accepted as posters or short papers.
  • 61% of submitted short papers were accepted, but only 51% as short papers; the other 10% became posters.
  • 85% of posters were accepted, all of them as posters.
  • 85% of panels were accepted, but one of them was accepted as a long paper.
  • A few papers/panels were converted into workshops.
How submitted articles eventually were rejected or accepted. (e.g. 45% of submitted long papers were accepted as long papers, 14% as short papers, 15% as posters, and 27% were rejected.)

Weirdly, short papers tend to have a lower acceptance rate than long papers over the last three years. I think that’s because if a long paper is rejected, it’s usually further along in the process enough that it’s more likely to be secondarily accepted-as-a-poster, but even that doesn’t account for the entire differential in the acceptance rate. Anyone have any thoughts on this?

Looking over time, we see an increasingly large slice of the DH conference pie is taken up by long papers. My guess is this is just a natural growth as authors learn the difference between long and short papers, a distinction which was only introduced relatively recently.

The paragraph above is simply wrong given the updated data (tip of the hat to Melissa Terras for pointing it out); the ratio of long papers to short papers is still in flux. My “guess” from earlier was just that, a post-hoc explanation attached to an incorrect analysis. Matthew Lincoln has a great description of why we should be wary of these just-so stories. Go read it.

A breakdown of presentation forms at the last three DH conferences.

The breakdown of acceptance rates for each conference isn’t very informative, due in part to the fact I only have the last three years. In another few years this will probably become interesting, but for those who just can’t get enough o’ them sweet sweet numbers, here they are, special for you:

Breakdown of conference acceptances 2013-2015. The right-most column shows the percent of, for example, long papers that were not only accepted, but accepted AS long papers. Yellow rows are total acceptance rates per year.

Authorship

DH is still pretty single-author-heavy. It’s getting better; over the last 10 years we’ve seen an upward trend in number of authors per paper (more details in a future blog post), but the last three years have remained pretty stagnant. This year, 35% of presentations & posters will be by a single author, 25% by two authors, 13% by 3 authors, and so on down the line. The numbers are unremarkably consistent with 2013 and 2014.

Percent of accepted presentations with a certain number of co-authors in a given year. (e.g. 35% of presentations in 2015 were single-authored.)

We do however see an interesting trend in acceptance rates by number of authors. The more authors on your presentation, the more likely your presentation is to be accepted. This is true of 2013, 2014, and 2015. Single-authored works are 54% likely to be accepted, while works authored by two authors are 67% likely to be accepted. If your submission has more than 7 authors, you’re incredibly unlikely to get rejected.

Acceptance rates by number of authors, 2013-2015. The more authors, the more likely a submission will be accepted.

Obviously this is pure description and correlation; I’m not saying multi-authored works are higher quality or anything else. Sometimes, works with more authors simply have more recognizable names, and thus are more likely to be accepted. That said, it is interesting that large projects seem to be favored in the peer review process for DH conferences.

Stay tuned for parts 2, π, 16, and 4, which will cover such wonderful subjects as topicality, gender, and other things that seem neat.

Notes:

  1. The appropriate level of skepticism here is 19.27
  2. I hear Elon Musk is keynoting in 2019.

Submissions to Digital Humanities 2015 (pt. 3)

This is the third post in a three-part series analyzing submissions to the 2015 Digital Humanities conference in Australia. In parts 1 & 2, I covered submission volumes, topical coverage, and comparisons to conferences in previous years. This post will briefly address the geography of submissions, further exploring my criticism that this global-themed conference doesn’t feel so global after all. My geographic analysis shows the conference to be more international than I originally suspected.

I’d like to explore whether submissions to DH2015 are more broad in scope than those to previous conferences as well, but given time constraints, I’ll leave that exploration to a later post in this series, which has covered submissions and acceptances at DH conferences since 2013.

For this analysis, I looked at the universities of the submitting (usually lead) author on every submission, and used a geocoder to extract country, region, and continent data for each given university. This means that every submission is attached to one and only one location, even if other authors are affiliated with other places. Not perfect, but good enough for union work. After the geocoding, I corrected the results by hand 1, and present those results here.
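To make the one-location-per-submission idea concrete, here is a rough sketch of the aggregation step. Everything in it is an assumption for illustration: the lookup tables stand in for whatever a real geocoder (plus hand correction) would return, the affiliation names are invented, and the `locate` and `shares` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical stand-ins for a geocoder's output; none of these names are from the dataset.
AFFILIATION_TO_COUNTRY = {
    "Example University A": "United States",
    "Example University B": "Australia",
    "Example Institute C": "Switzerland",
    "Example University D": "Japan",
}

# (UN sub-region, continent) per country, using UN geoscheme names.
COUNTRY_TO_REGION = {
    "United States": ("Northern America", "Americas"),
    "Australia": ("Australia and New Zealand", "Oceania"),
    "Switzerland": ("Western Europe", "Europe"),
    "Japan": ("Eastern Asia", "Asia"),
}

def locate(affiliation):
    """Resolve one affiliation to country, sub-region and continent, or None if unknown."""
    country = AFFILIATION_TO_COUNTRY.get(affiliation)
    if country is None:
        return None
    subregion, continent = COUNTRY_TO_REGION[country]
    return {"country": country, "subregion": subregion, "continent": continent}

def shares(lead_affiliations, level="continent"):
    """Share of submissions per place, counting only the submitting author's affiliation."""
    counts = Counter()
    for affiliation in lead_affiliations:
        place = locate(affiliation)
        # Anything the lookup misses is flagged for correction by hand.
        counts[place[level] if place else "UNRESOLVED"] += 1
    total = len(lead_affiliations)
    return {name: n / total for name, n in counts.items()}

# One entry per submission: the submitting author's affiliation (toy data).
leads = [
    "Example University A",
    "Example University B",
    "Example Institute C",
    "Example University D",
    "Example University B",
]

for name, share in sorted(shares(leads).items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{name:10s} {share:.0%}")
```

Passing `level="subregion"` or `level="country"` gives the finer-grained breakdowns; co-authors at other institutions are deliberately ignored, which is the simplification described above.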

It is immediately apparent that the DH2015 authors represent a more diverse geographical distribution than those in previous years. DH2013 in Nebraska was the only conference of the three where over half of submissions were concentrated in one continental region, the Americas. The Switzerland conference in 2014 had a slightly more even regional distribution, but still had very few contributions (11%) from Asia or Oceania. Contrast these heavily skewed numbers against DH2015 in Australia, with a third of the contributions coming from Asia or Oceania.

DH submissions broken down by UN macro-continental regions.

The trend continues when broken down by UN micro-continental regions. The trends are not unexpected, but they are encouraging. When the conference was in Switzerland, Northern and Western Europe were much more well-represented, as was (surprisingly?) Eastern Asia. This may suggest that Eastern Asia’s involvement in DH is on the rise even without taking conference locations into account. Submissions for 2015 in Sydney are well-represented by Australia, New Zealand, Eastern Asia, and even Eastern Europe and Southern Asia.

DH conferences broken down by % covered from region in a given year.

One trend is pretty clear: the dominance of North America. Even at its lowest point in 2015, authors from North America comprise over a third of submissions. This becomes even more stark in the animation below, on which every submitting author’s country is represented.

DH2013-2015 with dots sized by the percent coverage that year.

The coverage from the United States over the course of the last three years barely changes, and from Canada shrinks only slightly when the conference moves off of North America. The UK also pretty much retains its coverage 2013-2015, hovering around 10% of submissions. Everywhere else the trends are pretty clear: a slow move eastward as the conference moves east. It’ll be interesting to see how things change in Poland in 2016, and wherever it winds up going in 2017.

In sum, it turns out “Global Digital Humanities 2015” is, at least geographically, much more global than the conferences of the previous two years. While the most popular topics are pretty similar to those in earlier years, I haven’t yet done an analysis of the diversity of the less popular topics, and it may be that they actually prove more diverse than those in earlier years. I’ll save that analysis for when the acceptances come in, though.

Notes:

  1. It’s a small enough dataset. There are 648 unique institutional affiliations listed on submissions from 2013-2015, which resolved to 49 unique countries in 14 regions on 4 continents.

Submissions to Digital Humanities 2015 (pt. 2)

Do you like the digital humanities? Me too! You better like it, because this is the 700th or so in a series of posts about our annual conference, and I can’t imagine why else you’d be reading it.

My last post went into some summary statistics of submissions to DH2015, concluding in the end that this upcoming conference, the first outside the Northern Hemisphere, with the theme “Global Digital Humanities”, is surprisingly similar to the DH we’ve seen before. This post will compare this year’s submissions to those of the previous two conferences, in Switzerland and Nebraska. Part 3 will go into some more detail of geography and globalizing trends.

I can only compare the sheer volume of submissions this year to 2013 and 2014, which is as far back as I’ve got hard data. As many pieces were submitted for DH2015 as were submitted for DH2013 in Nebraska – around 360. Submissions to DH2014 shot up to 589, and it’s not yet clear whether the subsequent dip is an accident of location (Australia being quite far away from most regular conference attendees), or whether this signifies the leveling out of what’s been fairly impressive growth in the DH world.

DH by volume, 1999-2014. This chart shows how many DHSI workshops occurred per year (right axis), alongside how many pieces were actually presented at the DH conference annually (left axis). This year is not included because we don’t yet know which submissions will be accepted.

This graph shows a pretty significant recent upward trend in DH by volume; if acceptance rates to DH2015 are comparable to recent years (60-65%), then DH2015 will represent a pretty significant drop in presentation volume. My gut intuition is this is because of the location, and not a downward trend in DH, but only time will tell.

Replying to my most recent post, Jordan T. T-H commented on his surprise at how many single-authored works were submitted to the conference. I suggested this was because of our humanistic disciplinary roots, and that further analysis would likely reveal a trend of increasing co-authorship. My prediction was wrong: at least over the last three years, co-authorship numbers have been stagnant.

This chart shows that ~40% of submissions to DH conferences over the past three years have been single-authored.

Roughly 40% of submissions to DH conferences over the past three years have been single-authored; the trend has not significantly changed any further down the line, either. Nickoal Eichmann and I are looking into data from the past few decades, but it’s not ready yet at the time of this blog post. This result honestly surprised me; just from watching and attending conferences, I had the impression we’ve become more multi-authored over the past few years.

Topically, we are noticing some shifts. As a few people noted on Twitter, topics are not perfect proxies for what’s actually going on in a paper; every author makes different choices on how they tag their submissions. Still, it’s the best we’ve got, and I’d argue it’s good enough to run this sort of analysis on, especially as we start getting longitudinal data. This is an empirical question, and if we wanted to test my assumption, we’d gather a bunch of DHers in a room and see to what extent they all agree on submission topics. It’s an interesting question, but beyond the scope of this casual blog post.

Below is the list of submission topics in order of how much topical coverage has changed since 2013. For example, this year 21% of submissions were tagged as involving Text Analysis. By contrast, only 15% were tagged as Text Analysis in 2013, resulting in a growth of 6 percentage points over the last two years. Similarly, this year Internet and World Wide Web studies comprised 7% of submissions, whereas that number was 12% in 2013, showing coverage shrunk by 5 percentage points. My more detailed evaluation of the results is below the figure.

dh-topicalchange-2015
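The ordering in this figure is just a subtraction per topic. As a minimal sketch, using only the two topics whose numbers are quoted above (the `coverage_change` helper and the dictionary layout are my own illustration, not the script that produced the chart):

```python
# Coverage in whole percent of submissions, taken from the two examples in the text.
coverage_2013 = {"Text Analysis": 15, "Internet / World Wide Web": 12}
coverage_2015 = {"Text Analysis": 21, "Internet / World Wide Web": 7}

def coverage_change(before, after):
    """Change in coverage per topic, in percentage points (after minus before)."""
    topics = set(before) | set(after)
    # A topic missing from one year is treated as 0% coverage that year.
    return {t: after.get(t, 0) - before.get(t, 0) for t in topics}

change = coverage_change(coverage_2013, coverage_2015)

# Biggest gains first, biggest losses last.
for topic, delta in sorted(change.items(), key=lambda kv: -kv[1]):
    print(f"{topic:28s} {delta:+d} points")
```

These are differences in percentage points, not relative growth: going from 15% to 21% is a 6-point gain, even though it is a 40% relative increase.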

We see, as I previously suggested, that Text Analysis (unsurprisingly) has gained a lot of ground. Given the location, it should be unsurprising as well that Asian Studies has grown in coverage, too. Some more surprising results are the re-uptake of Digitisation, which has been pretty low recently, and the growth of GLAM (Galleries, Libraries, Archives, Museums), where I suspect we’d spot a consistent upward trend if we could look even further back. I’d guess it’s due to the proliferation of DH Alt-Ac careers within the GLAM world.

Not all of the trends are consistent: Historical Studies rose significantly between 2013 and 2014, but dropped a bit in submissions this year to 15%. Still, it’s growing, and I’m happy about that. Literary Studies, on the other hand, has covered a fifth of all submissions in 2013, 2014, and 2015, remaining quite steady. And I don’t see it dropping any time soon.

Visualizations are clearly on the rise, year after year, which I’m going to count as a win. Even if we’re not branching outside of text as much as we ought, the fact that visualizations are increasingly important means DHers are willing to move beyond text as a medium for transmission, if not yet as a medium of analysis. The use of Networks is also growing pretty well.

As Jacqueline Wernimont just pointed out, representation of Gender Studies is incredibly low. And, as the above chart shows, it’s even lower this year than it was in both previous years. Perhaps this isn’t so surprising, given the gender ratio of authors at DH conferences recently.

Gender ratio of authors at DH conferences 2010-2013. Women consistently represent a bit under a third of all authors.

Some categories involving Maps and GIS are increasing, while others are decreasing, suggesting small fluctuations in labeling practices, but probably no significant upward or downward trend in their methodological use. Unfortunately, most non-text categories dropped over the past three years: Music, Film & Cinema Studies, Creative/Performing Arts, and Audio/Video/Multimedia all dropped. Image Studies grew, but only slightly, and it’s too soon to say if this represents a trend.

We see the biggest drops in XML, Encoding, Scholarly Editing, and Interface & UX Design. This won’t come as a surprise to anyone, but it does show how much the past generation’s giant (putting together, cleaning, and presenting scholarly collections) is making way for the new behemoth (analytics). Internet / World Wide Web is the other big coverage loss, but I’m not comfortable giving any causal explanation for that one.

This analysis offers the same conclusion as the earlier one: with the exception of the drop in submissions, nothing is incredibly surprising. Even the drop is pretty well-expected, given how far the conference is from the usual attendees. The fact that the status is pretty quo is worthy of note, because many were hoping that a global DH would seem more diverse, or appreciably different, in some way. In Part 3, I’ll start picking apart geographic and deeper topical data, and maybe there we’ll start to see the difference.

Submissions to Digital Humanities 2015 (pt. 1)

It’s that time of the year again! The 2015 Digital Humanities conference will take place next summer in Australia, and as per usual, I’m going to summarize what is being submitted to the conference and, eventually, how those submissions become accepted. Each year reviewers get the chance to “bid” on conference submissions, and this lets us get a peek inside the general trends in DH research. This post (pt. 1) will focus solely on this year’s submissions, and next post will compare them to previous years and locations.

It’s important to keep in mind that trends in the conference over the last three years may be temporal, geographic, or accidental. The 2013 conference took place in Nebraska, 2014 in Switzerland, 2015 in Australia, and 2016 is set to happen in Poland; it’s to be expected that regional differences will significantly inform who is submitting pieces and what topics will be discussed.

This year, 358 pieces were submitted to the conference (about as many as were submitted to Nebraska in 2013, but more on that in the follow-up post). As with previous years, authors could submit four varieties of works: long papers, short papers, posters, and panels / multi-paper sessions. Long papers comprised 54% of submissions, panels 4%, posters 15%, and short papers 30%.

In total, there were 859 named authors on submissions – this number counts authors more than once if they appear on multiple submissions. Of those, 719 authors are unique. 1 Over half the submissions are multi-authored (58%), with 2.4 authors per submission on average, a median of 2 authors per submission, and a max of 10 authors on one submission. While the majority of submissions included multiple authors, the sheer number of single-authored papers still betrays the humanities roots of DH. The histogram is below.

A histogram of authors-per-submission.
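The headline average can be checked straight from the two totals quoted above (859 author slots, 358 submissions). The per-submission list in the sketch below is invented toy data, included only to show how the median, maximum, and multi-authored share would be derived; it is not the real distribution.

```python
from statistics import mean, median

# Totals quoted in the post.
author_slots = 859   # named authors, counted once per submission they appear on
n_submissions = 358

authors_per_submission = author_slots / n_submissions
print(f"mean authors per submission: {authors_per_submission:.1f}")

# Hypothetical toy distribution of author counts, one entry per submission.
toy_counts = [1, 1, 2, 2, 3, 5, 10]

toy_summary = {
    "mean": mean(toy_counts),
    "median": median(toy_counts),
    "max": max(toy_counts),
    "multi_authored_share": sum(n > 1 for n in toy_counts) / len(toy_counts),
}
print(toy_summary)
```

A handful of very large teams pulls the mean above the median, which is why both numbers are worth reporting.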

As with previous years, authors may submit articles in any of a number of languages. The theme of this year’s conference is “Global Digital Humanities”, but if you expected a multi-lingual conference, you might be disappointed. Of the 358 submissions, 353 are in English. The rest are in French (2), Italian (2), and German (1).

Submitting authors could select from a controlled vocabulary to tag their submissions with topics. There were 95 topics to choose from, and their distribution is not especially surprising. Two submissions each were tagged with 25 topics, suggesting they are impressively far reaching, but for the most part submissions stuck to 5-10 topics. The breakdown of submissions by topic is below, where the percentage represents the percentage of submissions which are tagged by a specific topic. My interpretation is below that.

Percentage of submissions tagged with a specific topic.

A full 21% of submissions include some form of Text Analysis, and a similar number claim Text or Data Mining as a topic. Other popular methodological topics are Visualizations, Network Analysis, Corpus Analysis, and Natural Language Processing. The DH-o-sphere is still pretty text-heavy; Audio, Video, and Multimedia are pretty low on the list, GIS even lower, and Image Analysis (surprisingly) even lower still. Bibliographic methods, Linguistics, and other approaches more traditionally associated with the humanities appear pretty far down the list. Other tech-y methods, like Stylistics and Agent-Based Modeling, are near the bottom. If I had to guess, the former is on its way down, and the latter on its way up.

Unsurprisingly, regarding disciplinary affiliations, Literary Studies is at the top of the food chain (I’ll talk more about how this compares to previous years in the next post), with Archives and Repositories not far behind. History is near the top tier, but not quite there, which is pretty standard. I don’t recall the exact link, but Ben Schmidt argued pretty convincingly that this may be because there are simply fewer new people in History than in Literary Studies. Digitization seems to be regaining some of the ground it lost in the previous years. The information science side (UX Design, Knowledge Representation, Information Retrieval, etc.) seems reasonably strong. Cultural Studies is pretty well-represented, and Media Studies, English Studies, Art History, Anthropology, and Classics are among the other DH-inflected communities out there.

Thankfully we’re not completely an echo chamber yet; only about a tenth of the submissions are about DH itself – not great, not terrible. We still seem to do a lot of talking about ourselves, and I’d like to see that number decrease over the next few years. Pedagogy-related submissions are also still a bit lower than I’d like, hovering around 10%. Submissions on the “World Wide Web” are decreasing, which is to be expected, and TEI isn’t far behind.

All in all, I don’t really see the trend toward “Global Digital Humanities” that the conference is themed to push, but perhaps a more complex content analysis will reveal a more global DH than we’ve seen in the past. The self-written Keyword tags (as opposed to the Topic tags, not a controlled vocabulary) reveal a bit more internationalization, although I’ll leave that analysis for a future post.

It’s worth pointing out there’s a statistical property at play that makes it difficult to see deviations from the norm. Shakespeare appears prominently because many still write about him, but even if Shakespearean research is outnumbered by work on more international playwrights, it’d be difficult to catch, because I have no category for “international playwright”; each one would be siphoned off into its own category. Thus, even if the less well-known long tail topics significantly outweigh the more popular topics, that fact would be tough to catch.
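A toy illustration of that long-tail effect, with entirely made-up numbers: one heavily-tagged subject next to a hundred subjects tagged once each. A ranked list still puts the single popular subject on top, even though the tail adds up to far more.

```python
from collections import Counter

# Hypothetical counts: 30 submissions on Shakespeare, and 100 other playwrights
# with one submission apiece, each sitting in its own category.
counts = Counter({"Shakespeare": 30})
counts.update({f"International playwright #{i}": 1 for i in range(1, 101)})

top_subject, top_count = counts.most_common(1)[0]
tail_total = sum(counts.values()) - top_count

print(f"top category: {top_subject} ({top_count})")
print(f"everything else combined: {tail_total}")
```

Unless the tail categories are grouped under a shared label, a per-category chart will never show that they collectively outweigh the head.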

All in all, it looks like DH2015 will be an interesting continuation of the DH tradition. Perhaps the most surprising aspect of my analysis was that nothing in it surprised me; half-way around the globe, and the trends over there are pretty identical to those in Europe and the Americas. It’ll take some more searching to see if this is a function of the submitting authors being the same as previous years (whether they’re all simply from the Western world), or whether it is actually indicative of a fairly homogeneous global digital humanities.

Stay tuned for Part 2, where I compare the analysis to previous years’ submissions, and maybe even divine future DH conference trends using tea leaves or goat entrails or predictive modeling (whichever seems the most convincing; jury’s still out).

Notes:

  1. As far as I can tell – I used all the text similarity methods I could think of to unify the nearly-duplicate names.

Acceptances to Digital Humanities 2014 (part 1)

It’s that time again! The annual Digital Humanities conference schedule has been released, and this time it’s in Switzerland. In an effort to console myself for not having the funding to make it this year, I’ve gone ahead and analyzed the nitty-gritty of acceptances and rejections to the conference. For those interested in this sort of analysis, you can find my take on submissions to DH2013, acceptances at DH2013, and submissions to DH2014. If you’re visiting this page from the future, you can find any future DH conference analyses at this tag link.

The overall acceptance rate to DH2014 was 59%, although that includes many papers and panels that were accepted as posters. There were 589 submissions this year (compared to 348 submissions last year), of which 345 were accepted. By submission medium, this is the breakdown:

  • Long papers: 62% acceptance rate (lower than last year)
  • Short papers: 52% acceptance rate (lower than last year)
  • Panels: 57% acceptance rate (higher than last year)
  • Posters: 64% acceptance rate (didn’t collect this data last year)
Figure 1: Acceptances to DH2014 by submission medium.

A surprising number of submitted papers switched from one medium to another when they were accepted. A number of panels became long papers, a bunch of short papers became long papers, and a bunch of long papers became short papers. Although a bunch of submissions became posters, no posters wound up “breaking out” to become some other medium. I was most surprised by the short papers which became long (13 in all), which leads me to believe some of them may have been converted for scheduling reasons. This is idle speculation on my part – the organizers may reply otherwise. [Edit: the organizers did reply, and assured us this was not the case. I see no reason to doubt that, so congratulations to those 13 short papers that became long papers!]

Figure 2: Medium switches in DH2014 between submission and acceptance.

It’s worth keeping in mind, in all analyses listed here, that I do not have access to any withdrawals; accepted papers were definitely accepted, but papers that were not accepted may have been withdrawn rather than rejected.

Figures 3 and 4 both present the same data, but shed slightly different lights on digital humanities. Each shows the acceptance rate by various topics, but they’re ordered slightly differently. All submitting authors needed to select from a limited list of topics to label their submissions, in order to aid with selecting peer reviewers and categorization.

Figure 3 sorts topics by the total number of submissions tagged with each that were accepted to DH2014. This differs from Figure 2 in my post on DH2014 submissions, which sorts by the total number of submissions tagged with each topic. The figure from my previous post gives a sense of what digital humanists are doing and submitting, whereas Figure 3 from this post gives a sense of what the visitor to DH2014 will encounter.

Figure 3: Topical acceptance to DH2014 sorted by total number of accepted papers tagged with a particular topic.

The visitor to DH2014 won’t see a hugely different topical landscape than the visitor to DH2013 (see analysis here). Literary studies, text analysis, and text mining still reign supreme, with archives and repositories not far behind. Visitors will see quite a bit fewer studies dedicated to the internet and the world wide web, and quite a bit more dedicated to historical and corpus-based research. More details can be seen by comparing the individual figures.

Figure 4, instead, sorts the topics by their acceptance rate. The most frequently accepted topics appear at the left, and the least frequently accepted appear at the right. A lighter red line is used to show acceptance rates of the same topics for 2013. This graph shows what peers consider to be more credit-worthy, and how this has changed since 2013.

Figure 4: Topical acceptance to DH2014 sorted by percentage of acceptance for each topic.

It’s worth pointing out that the highest and lowest acceptance rates shouldn’t be taken very seriously; with so few submitted articles, the rates are as likely random as indicative of any particularly interesting trend. Also, for comparisons with 2013, keep in mind the North American and European traditions of digital humanities may be driving the differences.
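For a rough sense of how much a small sample muddies an acceptance rate, here's a minimal sketch that puts an approximate 95% Wilson score interval around a rate. The overall figures (345 accepted of 589) come from this post; the "3 of 4" topic is a hypothetical stand-in for a sparsely-tagged topic, and the function name `wilson_interval` is my own. This is one common way to express the uncertainty, not something the figures above were built with.

```python
import math

def wilson_interval(accepted, submitted, z=1.96):
    """Approximate 95% Wilson score interval for an acceptance rate."""
    if submitted <= 0:
        raise ValueError("need at least one submission")
    p = accepted / submitted
    denom = 1 + z**2 / submitted
    centre = (p + z**2 / (2 * submitted)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / submitted + z**2 / (4 * submitted**2)) / denom
    )
    return centre - half_width, centre + half_width

# Overall DH2014 figures from this post: 345 accepted out of 589 submitted.
overall = wilson_interval(345, 589)

# Hypothetical sparsely-tagged topic: 3 accepted out of 4 submitted.
small = wilson_interval(3, 4)

print(overall)
print(small)
```

The interval for the tiny topic comes out far wider than the one for the conference as a whole, which is why the extremes of Figure 4 deserve a shrug more than a headline.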

There are a few acceptance ratios worthy of note. English studies and GLAM (Galleries, Libraries, Archives, Museums) both have acceptance rates well above average, and also quite a bit higher than their acceptance rates from the previous year. Studies of XML are accepted slightly above the average acceptance rate, and also accepted proportionally more frequently than they were in 2013. Acceptance rates for both literary and historical studies papers are about average, and haven’t changed much since 2013 (even though there were quite a few more historical submissions than the previous year).

Along with an increase in GLAM acceptance rates, there was a big increase in rates for studies involving archives and repositories. It may be they are coming back in style, or it may be indicative of a big difference between European and North American styles. There was a pretty big drop in acceptance rates for ontology and semantic web research, as well as in pedagogy research across the board. Pedagogy had a weak foothold in DH2013, and has an even weaker foothold in 2014, with both fewer submitted articles, and a lower rate of acceptance on those submitted articles.

In the next blog post, I plan on drilling a bit into author-supplied keywords, the role of gender in acceptance rates, and the geography of submissions. As always, I’m happy to share data, but in this case I will only share sufficiently aggregated/anonymized data, because submitting authors who did not get accepted have an expectation of privacy that I intend to keep.

Submissions to Digital Humanities 2014

Submissions for the 2014 Digital Humanities conference just closed. It’ll be in Switzerland this time around, which unfortunately means I won’t be able to make it, but I’ll be eagerly following along from afar. Like last year, reviewers are allowed to preview the submitted abstracts. Also like last year, I’m going to be a reviewer, which means I’ll have the opportunity to revisit the submissions to DH2013 to see how the submissions differed this time around. No doubt when the reviews are in and the accepted articles are revealed, I’ll also revisit my analysis of DH conference acceptances.

To start with, the conference organizers received a record number of submissions this year: 589. Last year’s Nebraska conference only received 348 submissions. The general scope of the submissions hasn’t changed much; authors were still supposed to tag their submissions using a controlled vocabulary of 95 topics, and were also allowed to submit keywords of their own making. Like last year, authors could submit long papers, short papers, panels, or posters, but unlike last year, multilingual submissions were encouraged (English, French, German, Italian, or Spanish). [edit: Bethany Nowviskie, patient awesome person that she is, has noticed yet another mistake I’ve made in this series of posts. Apparently last year they also welcomed multilingual submissions, and it is standard practice.]

Digital Humanities is known for its collaborative nature, and not much has changed in that respect between 2013 and 2014 (Figure 1). Submissions had, on average, between two and three authors, with 60% of submissions in both years having at least two authors. This year, a few fewer papers have single authors, and a few more have two authors, but the difference is too small to be attributable to anything but noise.

Figure 1. Number of authors per paper.

The distribution of topics being written about has changed mildly, though rarely in extreme ways. Any changes visible should also be taken with a grain of salt, because a trend over a single year is hardly statistically robust to small changes, say, in the location of the event.

The grey bars in Figure 2 show what percentage of DH2014 submissions are tagged with a certain topic, and the red dotted outlines show what the percentages were in 2013. The upward trends to note this year are text analysis, historical studies, cultural studies, semantic analysis, and corpora and corpus activities. Text analysis was tagged to 15% of submissions in 2013 and is now tagged to 20% of submissions, or one out of every five. Corpus analysis similarly bumped from 9% to 13%. Clearly this is an important pillar of modern DH.

Figure 2. Topics from DH2014 ordered by the percent of submissions which fall in that category. The red dotted outlines represent the percentage from DH2013.

I’ve pointed out before that History is secondary compared to Literary Studies in DH (although Ted Underwood has convincingly argued, using Ben Schmidt’s data, that the numbers may merely be due to fewer people studying history). This year, however, historical studies nearly doubled in presence, from 10% to 17%. I haven’t yet collected enough years of DH conference data to see if this is a trend in the discipline at large, or more of a difference between European and North American DH. Semantic analysis jumped from 1% to 7% of the submissions, cultural studies went from 10% to 14%, and literary studies stayed roughly equivalent. Visualization, one of the hottest topics of DH2013, has become even hotter in 2014 (14% to 16%).

The most visible drops in coverage came in pedagogy, scholarly editions, user interfaces, and research involving social media and the web. At DH2013, submissions on pedagogy had a surprisingly low acceptance rate, which, combined with the drop in pedagogy submissions this year (11% to 8% in “Digital Humanities – Pedagogy and Curriculum” and 7% to 4% in “Teaching and Pedagogy”), might suggest a general decline in interest in pedagogy in the DH world. “Scholarly Editing” went from 11% to 7% of the submissions, and “Interface and User Experience Design” from 13% to 8%, which is yet more evidence for the lack of research going into the creation of scholarly editions compared to several years ago. The most surprising drops for me were those in “Internet / World Wide Web” (12% to 8%) and “Social Media” (8.5% to 5%), which I would have guessed would be growing rather than shrinking.

The last thing I’ll cover in this post is the author-chosen keywords. While authors needed to tag their submissions from a list of 95 controlled vocabulary words, they were also encouraged to tag their entries with keywords they could choose themselves. In all they chose nearly 1,700 keywords to describe their 589 submissions. In last year’s analysis of these keywords, I showed that visualization seemed to be the glue that held the DH world together; whether discussing TEI, history, network analysis, or archiving, all the disparate communities seemed to share visualization as a primary method. The 2014 keyword map (Figure 3) reveals the same trend: visualization is squarely in the middle. In this graph, two keywords are linked if they appear together on the same submission, thus creating a network of keywords as they co-occur with one another. Words appear bigger when they span communities.

Figure 3. Co-occurrence of DH2014 author-submitted keywords.
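
For the curious, the linking rule described above (two keywords are connected if they appear on the same submission) can be sketched in a few lines of Python. The submissions below are invented for illustration, since the real data stays private, and the variable names are my own. I also use a plain neighbour count (degree) as a crude stand-in for how connected a keyword is; that is an assumption for the sketch, not the measure used to size words in the figure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per submission (not real conference data).
submissions = [
    ["visualization", "TEI", "XML"],
    ["visualization", "network analysis"],
    ["TEI", "XML"],
    ["visualization", "history", "network analysis"],
]

# Count how often each unordered pair of keywords shares a submission.
edge_weights = Counter()
for keywords in submissions:
    # Normalise case, drop repeats within a submission, and sort so that
    # each pair is always stored in the same order.
    unique = sorted({k.strip().lower() for k in keywords})
    for a, b in combinations(unique, 2):
        edge_weights[(a, b)] += 1

# Degree: the number of distinct keywords each keyword co-occurs with.
degree = Counter()
for a, b in edge_weights:
    degree[a] += 1
    degree[b] += 1

print(edge_weights.most_common(3))
print(degree.most_common(3))
```

Lowercasing and de-duplicating before pairing matters with free-text keywords, since otherwise "TEI" and "tei" would become separate nodes.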

Despite the multilingual conference, the large component of the graph is still English. We can see some fairly predictable patterns: TEI is coupled quite closely with XML; collaboration is another keyword that binds the community together, as is (obviously) “Digital Humanities.” Linguistics and literature are tightly coupled, much more so than, say, linguistics and history. It appears the distant reading of poetry is becoming popular, which I’d guess is a relatively new phenomenon, although I haven’t gone back and checked.

This work has been supported by an ACH microgrant to analyze DH conferences and the trends of DH through them, so keep an eye out for more of these posts forthcoming that look through the last 15 years. Though I usually share all my data, I’ll be keeping these to myself, as the submitters to the conference did so under an expectation of privacy if their proposals were not accepted.

[edit: there was some interest on twitter last night for a raw frequency of keywords. Because keywords are author-chosen and I’m trying to maintain some privacy on the data, I’m only going to list those keywords used at least twice. Here you go (Figure 4)!]

Figure 4. Keywords used in DH2014 submissions ordered by frequency.

From Trees to Webs: Uprooting Knowledge through Visualization

[update: here are some of the pretty pictures I will be showing off in The Hague]

The blog’s been quiet lately; my attention has been occupied by various journal submissions and a new book in the works, but I figured my readers would be interested in one of those forthcoming publications. This is an article [preprint] I’m presenting at the Universal Decimal Classification Seminar in The Hague this October, on the history of how we’ve illustrated the interconnections of knowledge and scholarly domains. It’s basically two stories: one of how we shifted from understanding the world hierarchically to understanding it as a flat web of interconnected parts, and the other of how the thing itself and knowledge of that thing became separated.

Porphyrian Tree: tree of Aristotle’s categories originally dating from the 6th century. [via some random website about trees]
A few caveats worth noting: first, because I didn’t want to deal with the copyright issues, there are no actual illustrations in the paper. For the presentation, I’m going to compile a powerpoint with all the necessary attributions and post it alongside this paper so you can all see the relevant pretty pictures. For your viewing pleasure, though, I’ve included some of the illustrations in this blog post.

An interpretation of the classification of knowledge from Hobbes’ Leviathan. [via e-ducation]
Second, because this is a presentation directed at information scientists, the paper is organized linearly and with a sense of inevitability; or, as my fellow historians would say, it’s very whiggish. I regret not having the space to explore the nuances of the historical narrative, but it would distract from the point and context of this presentation. I plan on writing a more thorough article to submit to a history journal at a later date, hopefully fitting more squarely in the historiographic rhetorical tradition.

H.G. Wells’ idea of how students should be taught. [via H.G. Wells, 1938. World Brain. Doubleday, Doran & Co., Inc]
In the meantime, if you’re interested in reading the pre-print draft, here it is! All comments are welcome since, like I said, I’d like to make this into a fuller scholarly article beyond the published conference proceedings. I was excited to put this up now, but I’ll probably have a new version with full citation information within the week, if you’re looking to enter this into Zotero/Mendeley/etc. Also, hey! I think this is the first post on the Irregular that has absolutely nothing to do with data analysis.

Recent map of science by Kevin Boyack, Dick Klavans, W. Bradford Paley, and Katy Börner. [via SEED magazine]

The Historian’s Macroscope

Whelp, it appears the cat’s out of the bag. Shawn Graham, Ian Milligan, and I have signed our ICP contract and will shortly begin the process of writing The Historian’s Macroscope, a book introducing the process and rationale of digital history to a broad audience. The book will be a further experiment in live-writing: as we have drafts of the text, they will go online immediately for comments and feedback. The publishers have graciously agreed to allow us to keep the live-written portion online after the book goes on sale, and though what remains online will not be the final copy-edited and typeset version, we (both authors and publishers) feel this is a good compromise to prevent the cannibalization of book sales while still keeping much of the content open and available for those who cannot afford the book or are looking for a taste before they purchase it. Thankfully, this plan also fits well with my various pledges to help make a more open scholarly world.

Microscope / Telescope / Macroscope [via The Macroscope by Joël de Rosnay]
We’re announcing the project several months earlier than we’d initially intended. In light of the American Historical Association’s recent statement endorsing the six-year embargo of dissertations on the unsupported claim that it will help career development, we wanted to share our own story to offset the AHA’s narrative. Shawn, Ian, and I have already worked together on a successful open access chapter in The Programming Historian, and have all worked separately releasing public material on our respective blogs. It was largely because of our open material that we were approached to write this book, and indeed much of the material we’ve already posted online will be integrated into the final publication. It would be an understatement to say our publisher’s liaison Alice jumped at this opportunity to experiment with a semi-open publication.

The disadvantage to announcing so early is that we don’t have any content to tease you with. Stay tuned, though. By September, we hope to have some preliminary content up, and we’d love to read your thoughts and comments, especially from those not already aligned with the DH world.

An experiment in communal editing: Finding the history & philosophy of science.

After my last post about co-citation analysis, the author of one of the papers I was responding to, K. Brad Wray, generously commented and suggested I write up and publish the results and send them off to Erkenntnis, which is the same journal in which he published his results. That sounded like a great idea, so I am.

Because so many good ideas have come from comments on this blog, I’d like to try opening my first draft to communal commenting. For those who aren’t familiar with Google Docs (anyone? Bueller?), you can comment by selecting text and either hitting ctrl-alt-m, or going to the insert-> menu and clicking ‘Comment’.

The paper is about the relationship between history of science and philosophy of science, and draws both from the blog post and from this page with additional visualizations. There is also an appendix (pdf, sorry) with details of data collection and some more interesting results for the HPS buffs. If you like history of science, philosophy of science, or citation analysis, I’d love to see your comments! If you have any general comments that don’t refer to a specific part of the text, just post them in the blog comments below.

This is a bit longer form than the usual blog, so who knows if it will inspire much interaction, but it’s worth a shot. Anyone who is signed in so I can see their name will get credit in the acknowledgements.

Finding the History and Philosophy of Science (earlier draft)  ← draft 1, thanks for your comments.

Finding the History and Philosophy of Science (current draft) ← comment here!