Argument Clinic

Zoe LeBlanc asked how basic statistics lead to a meaningful historical argument. A good discussion followed, worth reading, but since I couldn’t fit my response into tweets, I hoped to add a bit to the thread here on the irregular. I’m addressing only one tiny corner of her question, in a way that is peculiar to my own still-forming approach to computational history; I hope it will be of some use to those starting out.

In brief, I argue that one good approach to computational history cycles between data summaries and focused hypothesis exploration, driven by historiographic knowledge, in service to finding and supporting historically interesting agendas. There’s a lot of good computational history that doesn’t do this, and a lot of bad computational history that does, but this may be a helpful rubric to follow.

In the spirit of Monty Python, the below video has absolutely nothing to do with the discussion at hand.

Zoe’s question gets at the heart of one of the two most prominent failures of computational history in 2017 1: the inability to go beyond descriptive statistics into historical argument. 2 I’ve written before on one of the many reasons for this inability, but that’s not the subject of this post. This post covers some good practices in getting from statistics to arguments.

Describing the Past

Historians, for the most part, aren’t experimentalists. 3 Our goals vary, but they often include telling stories about the past that haven’t been told, by employing newly-discovered evidence, connecting events that seemed unrelated, or revisiting an old narrative with a fresh perspective.

Facts alone usually don’t cut it. We don’t care what Jane ate for breakfast without a so what. Maybe her breakfast choices say something interesting about her socioeconomic status, or about food culture, or about how her eating habits informed the way she lived. Alongside a fact, we want why or how it came to be, what it means, or its role in some larger or future trend. A sufficiently big and surprising fact may be worthy of note on its own (“Jane ate orphans for breakfast” or “The government did indeed collude with a foreign power”), but such surprising revelations are rare, not the only purpose for historians, and still beg for context.

Computational history has gotten away with a lot of context-free presentations of fact. 4 That’s great! It’s a sign there’s a lot we didn’t know that contemporary methods & data make easily visible. 5 Here’s an example of one of mine, showing that, despite evidence to the contrary, there is a thriving community at the intersection of history and philosophy of science:

My citation analysis showing a bridge between history & philosophy of science.

But, though we’re not running out of low-hanging fruit, the novelty of mere description is wearing thin. Knowing that a community exists between history & philosophy of science is not particularly interesting; knowing why it exists, what it changes, or whether it is less tenuous than any other disciplinary borderland are more interesting and historiographically recognizable questions.

Context is Key

So how to get from description to historical argument? Though there’s no right path, and the route depends on the type of claim, this post may offer some guidance. Before we get too far, though, a note:

Description has little meaning without context and comparison. The data may show that more people are eating apples for breakfast, but there’s a lot to unpack there before it can be meaningful, let alone relevant.

Line chart of # of people who eat apples over time.

It may be, for example, that the general population is growing just as quickly as the number of people who eat apples. If that’s the case, does it matter that apple-eaters themselves don’t seem to be making up any larger percent of the population?

Line chart of # of people who eat apples over time (left axis) compared to general population (right axis).

The answer for a historian is: of course it matters. If we were talking about casualties of war, or the number of cities in a country, rather than apples, a twofold increase in absolute value (rather than percentage of population) makes a huge difference. It’s more lives affected; it’s more infrastructure and resources for a growing nation.

But the nature of that difference changes when we know our subject of study matches population dynamics. If we’re looking at voting patterns across cities, and we notice population density correlates with party affiliation, we can use that as a launching point for so what. Perhaps sparser cities rely on fewer social services to run smoothly, leading the population to vote more conservative; perhaps past events pushed conservative families towards the outskirts; perhaps.

Without having a ground against which to contextualize our results, a base map like general population, the fact of which cities voted in which direction gives us little historical meat to chew on.

On the other hand, some surprising facts, when contextualized, leave us less surprised. A two-fold increase in apple eating across a decade is pretty surprising, until you realize it happened alongside a similar increase in population. The fact is suddenly less worthy of report by itself, though it may have implications for, say, the growth of the apple industry.

But Zoe asked about statistics, not counting, in finding meaning. I don’t want to divert this post into teaching stats, nor do I want to assume statistical knowledge, so I’ll opt for an incredibly simple metric: ratio.

The illustration above shows an increase in both population and apple-eating, and eyeball estimates show them growing apace. If we divide the total population by the number of people eating apples, however, our story is complicated.

Line chart of # of people who eat apples over time (left axis) compared to general population (right axis). A thick blue line in the middle (left axis) shows the ratio between the two.

Though both population and apple-eating increase, in 1806 the population begins rising much more rapidly than the number of apple-eaters. 6 It is in this statistically-derived difference that the historian may find something worth exploring and explaining further.
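
As a minimal sketch of that ratio, the snippet below computes a share for each year and flags the first year in which it drifts away from its starting level. All of the counts are invented for illustration, the function names are hypothetical, and the 10% tolerance is an arbitrary assumption. The share is expressed here as apple-eaters per person; dividing the other way round tells the same story with the curve flipped.

```python
# Hypothetical yearly counts, invented for illustration only.
years = [1800, 1802, 1804, 1806, 1808, 1810]
population = [1000, 1100, 1200, 1500, 1900, 2400]
apple_eaters = [100, 110, 120, 125, 130, 135]


def share_per_year(part, whole):
    """Return part / whole for each pair of yearly counts."""
    return [p / w for p, w in zip(part, whole)]


def first_divergence(years, shares, tolerance=0.1):
    """Return the first year whose share differs from the starting share
    by more than `tolerance` (as a fraction of the starting share)."""
    baseline = shares[0]
    for year, share in zip(years, shares):
        if abs(share - baseline) / baseline > tolerance:
            return year
    return None


shares = share_per_year(apple_eaters, population)
for year, share in zip(years, shares):
    print(year, f"{share:.3f}")
print("first divergence:", first_divergence(years, shares))
```

With these made-up numbers the share holds at 0.100 through 1804 and the first flagged year is 1806, even though both raw counts rise throughout.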

There are many ways to compare and contextualize data, of which this is one. They aren’t worth enumerating, but the importance of contextualization is relevant to what comes next.

Question- and Data-Driven History

Computational historians like to talk about question-driven analysis. Computational history is best, we say, when it is led by a specific question or angle. The alternative is dumping a bunch of data into a statistics engine, describing it, finding something weird, and saying “oh, this looks interesting.”

When push comes to shove, most would agree the above dichotomy is false. Historical questions don’t pop out of thin air, but from a continuously shifting relationship with the past. We read primary and secondary sources, do some data entry, do some analysis, do some more reading, and through it all build up a knowledge-base and a set of expectations about the past. We also by this point have a set of claims we don’t quite agree with, or some secondary sources with stories that feel wrong or incomplete.

This is where the computational history practice begins: with a firm grasp of the history and historiography of a period, and a set of assumptions, questions, and mild disagreements.

From here, if you’re reading this blog post, you’re likely in one of two camps:

  1. You have a big dataset and don’t know what to do with it, or
  2. You have a historiographic agenda (a point to prove, a question to answer, etc.) that you don’t know how to make computationally tractable.

We’ll begin with #1.

1. I have data. Now what?

Congratulations, you have data!

This is probably the thornier of the two positions, and the one more prone to results of mere description. You want to know how to turn your data into interesting history, but you may end up doing little more than enumerating the blades of grass on a field. To avoid that, you must begin down a process sometimes called scalable reading, or a special case of the hermeneutic circle.

You start, of course, with mere description. How many records are there? What are the values in each? Are there changes over time or place? Who is most central? Before you start quantifying the data, write down the answers you expect to these questions, with a bit of a causal explanation for each.

Now, barrage your dataset with visualizations and statistical tests to find out exactly what makes it up. See how the results align with the hypotheses you noted down. If you created the data yourself, one archival visit at a time, you won’t find a lot that surprises you. That’s alright. Be sure to take time to consider what’s missing from the dataset, due to archival lacunae, bias, etc.

If any results surprise you, dig into the data to try to understand why. If none do, think about claims from secondary sources–do any contradict the data? Align with it?

This is also a good point to bring in contextualization. If you’re looking at the number of people doing something over time, try to compare your dataset to population dynamics. If you’re looking at word usage, find a way to compare your data to base frequencies of that word in similar collections. If you’re looking at social networks, compare them to random networks to see if their average path length or degree distribution are surprising compared to networks of similar size. Every unexpected result is an opportunity for exploration.
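
For the network case, one rough way to make such a comparison is sketched below using only the standard library. The observed network is invented (a ring of twelve letter writers plus one hub), the function names are hypothetical, and "random network" is assumed here to mean a graph with the same number of nodes and edges wired at random. Other null models, such as ones that preserve each node's degree, may suit a given question better.

```python
import random
from collections import deque


def average_path_length(adjacency):
    """Mean shortest-path length over all connected ordered pairs of nodes."""
    total, pairs = 0, 0
    for source in adjacency:
        # Breadth-first search gives hop counts from `source`.
        distances = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distances:
                    distances[neighbor] = distances[node] + 1
                    queue.append(neighbor)
        total += sum(distances.values())
        pairs += len(distances) - 1
    return total / pairs if pairs else float("nan")


def random_graph_like(adjacency, rng):
    """Random graph with the same number of nodes and edges as `adjacency`."""
    nodes = list(adjacency)
    edge_count = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    possible = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]
    graph = {node: set() for node in nodes}
    for a, b in rng.sample(possible, edge_count):
        graph[a].add(b)
        graph[b].add(a)
    return graph


# Hypothetical correspondence network: a ring of 12 writers in which
# writer 0 also exchanges letters with everyone else.
observed = {n: set() for n in range(12)}
for n in range(12):
    for m in ((n + 1) % 12, 0):
        if m != n:
            observed[n].add(m)
            observed[m].add(n)

rng = random.Random(42)
baseline = [average_path_length(random_graph_like(observed, rng)) for _ in range(200)]
print("observed:", round(average_path_length(observed), 3))
print("random mean:", round(sum(baseline) / len(baseline), 3))
```

Note that unreachable pairs are simply ignored, so a badly fragmented random graph can look deceptively compact; for sparse networks that caveat matters.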

Internal comparisons may also yield interesting points to pursue further, especially if you think your data are biased. Given a limited dataset of actors, their genders, their roles, and play titles, for example, you may not be able to make broad claims about which plays are more popular, but you could see how different roles are distributed across genders within the group.
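
A minimal sketch of that kind of internal comparison follows. The records, the role categories, and the function name are all invented for illustration; the point is only that the shares are computed within each gender, so the comparison does not depend on the dataset being a representative sample of all plays.

```python
from collections import Counter, defaultdict

# Hypothetical records: (actor, gender, role type, play title).
records = [
    ("Actor 1", "F", "lead", "Play A"),
    ("Actor 2", "F", "supporting", "Play A"),
    ("Actor 3", "F", "supporting", "Play B"),
    ("Actor 4", "F", "supporting", "Play C"),
    ("Actor 5", "M", "lead", "Play B"),
    ("Actor 6", "M", "lead", "Play C"),
    ("Actor 7", "M", "supporting", "Play A"),
    ("Actor 8", "M", "supporting", "Play B"),
]


def role_shares_by_gender(records):
    """For each gender, the share of its recorded parts in each role type."""
    counts = defaultdict(Counter)
    for _actor, gender, role, _play in records:
        counts[gender][role] += 1
    return {
        gender: {role: n / sum(roles.values()) for role, n in roles.items()}
        for gender, roles in counts.items()
    }


for gender, shares in role_shares_by_gender(records).items():
    print(gender, shares)
```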

Internal comparisons could also be temporal. Given a dataset of occupations over time within a particular city, if you compare those numbers to population changes over time, you could find the moments where population and occupation dynamics part ways, and focus on those instances. Why, suddenly, are there more grocers?

The above boils down into two possible points of further research: deviations from expectation, or deviations from internal consistency.

Deviations from expectation–your own or that of some notable secondary source–can be particularly question-provoking. “Why didn’t this meet expectations” quickly becomes “what is wrong or incomplete about this common historical narrative?” From here, it’s useful to dig down into the points of data that exemplify such deviations, and see if you can figure out why and how they break from expectations.

Deviations from internal consistency–that is, when comparisons within the data wind up showing different trends–lead to positive rather than negative questions. Instead of “why is this theory wrong?”, you may ask, “why are these groups different?” or “why does this trend cease to keep pace with population during these decades?” Here you are asking specific questions that require new or shifted theories, whereas with deviations from expectations, you begin by seeing where existing narratives fail.

It’s worth reiterating that, in both scenarios, questions are drawn from deviations from some underlying theory.

In deviations from expectation, the underlying theory is what you bring to your data; you assume the data ought to look one way, but it doesn’t. You are coming with an internal, if not explicit, quantitative model of how the data ought to look.

In deviations from internal consistency, the data’s descriptive statistics provide the underlying theory against which there may be deviations. Apple-eaters deviating in number from population growth is only interesting if, at most points, apple-eaters grow evenly alongside population. That is, you assume general statistics should be the same between groups or over time, and if they are not, it is worthy of explanation.

This is an oversimplification, but a useful one. Undoubtedly, combinations of the two will arise: maybe you expect the differences between men and women in roles they play will be large, but it turns out they are small. This provides a deviation of both kinds, but no less legitimate for it. In this case, your recourse may be looking for other theatrical datasets to see if the gender dynamics play out the same across them, or if your data are somehow special and worthy of explanation outside the context of larger gender dynamics.

Which brings us, inexorably, to the cyclic process of computational history. Scalable reading. The hermeneutic circle. Whatever.

Point is, you’re at the point where some deviation or alignment seems worth explanation or exploration. You could stop here. You could present this trend, give a convincing causal just-so story of why it exists, and leave it at that. You will probably get published, since you’ve already gone farther than mere description, the trap of so much computational history.

But you shouldn’t stop here. You should take this opportunity to strengthen your story. Perhaps this is the point where you put your “traditional” historian’s cap back on, and go dust-diving for archival evidence to support your claims. I wouldn’t think less of you for it, but if you stop there, you’d only be reaping half the advantages of computational history.

The example above, in which you look for other theatrical datasets to contextualize the gender results in your own, hinted at the second half of the computational history research cycle: creating computationally tractable questions. Recall that this section described the first half: making sense of data. Although I presented the two as separate, they productively feed on one another.

Once you’ve gone through your data to find how it aligns with your or others’ preconceived notions of the past, or how by its own internal deviations it presents interesting dilemmas, you have found yourself in the second half of the cycle. You have questions or theories you want to ask of data, but you do not yet have the data or the statistics to explore them.

This seems counter-intuitive. Why not just use the data or statistics already gathered, sometimes painstakingly over several years? Because if you use the same data & stats to both generate and answer questions, your evidence is circular. Specifically, you risk making a scientistic claim of what could easily be a spurious trend. It may simply be that, by random chance, the breakfast record-keeper lost a bunch of records from 1806-1810, thus causing the decline seen in the population ratio.
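
One hedged way to get a feel for how easily a dip could arise from noise alone is a permutation test, sketched below with invented yearly shares and a hypothetical function name. It shuffles the "before" and "after" labels and asks how often a gap at least as large as the observed one appears by chance. It treats the years as exchangeable, which real time series rarely are, and it says nothing about lost records or circular evidence, so it complements triangulation rather than replacing it.

```python
import random


def permutation_p_value(before, after, trials=10_000, seed=0):
    """Share of random relabelings whose gap in means is at least as
    large as the observed gap between `before` and `after`."""
    rng = random.Random(seed)
    observed = abs(sum(before) / len(before) - sum(after) / len(after))
    pooled = list(before) + list(after)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        a, b = pooled[:len(before)], pooled[len(before):]
        gap = abs(sum(a) / len(a) - sum(b) / len(b))
        # Small tolerance so floating-point rounding does not hide ties.
        if gap >= observed - 1e-12:
            hits += 1
    return hits / trials


# Hypothetical apple-eater shares for five years on either side of a dip.
before = [0.10, 0.11, 0.10, 0.09, 0.10]
after = [0.07, 0.06, 0.07, 0.06, 0.05]
print("p-value:", permutation_p_value(before, after))
```

A small value suggests the gap is unlikely to come from shuffling alone; it does not tell you why the gap exists.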

To convincingly make arguments from a historical data description, you must back it up using triangulation–approaching the problem from many angles. That triangulation may be computational, archival, archaeological, or however else you’re used to historying, but we’ll focus here on computational.

2. Computationally Tractable Questions

So you’ve got a historiographic agenda, and now you want to make it computationally tractable. Good luck! This is the hard part.

“Sparse areas relied less on social services.” “The infrastructure of science became less dependent on specific individuals over the course of the 17th century.” “T-Rex was a remarkable climber.” “Who benefited most from the power vacuum left by the assassination?” These hypotheses and questions do not, on their own, lend themselves to quantitative analysis.

Chief among the common difficulties of turning a historiographic agenda into a computationally tractable hypothesis is a lack of familiarity with computational methods. If you don’t know what a computer is good at, you can’t form an experiment to use one.

I said that history isn’t experimental, but I lied. Archival research can be an experiment if you go in with a hypothesis and a pre-conceived approach or set of criteria that would confirm it. Computational history, at this stage, is also experimental. It often works a little like this (but it may not): 7

  1. Set your agenda. Start with a hypothesis, historiographic framework, or question. For example, “The infrastructure of science became less dependent on specific individuals over the course of the 17th century.” (that question’s mine, don’t steal it.)
  2. Find testable hypotheses. Break it into many smaller statements that can be confirmed, denied, or quantitatively assessed. “If science depends less on specific individuals over the 17th century, the distribution of names mentioned in scholarly correspondence will flatten out. That is, in 1600 a few people will be mentioned frequently, whereas most will be mentioned infrequently; in 1700, the frequency of name mentions will be more evenly distributed across correspondence.” Or “If science depends less on specific individuals over the 17th century, when an important person died, it affected the scholarly network less in 1700 than in 1600.” (Notice in these two examples how finding evidence for the littler statements will corroborate the bigger hypothesis, and vice-versa.)
  3. Match hypotheses to approaches. Come up with methodological proxies, datasets, and/or statistical tests that could corroborate the littler statements. Be careful, thorough, and specific. For example, “In a network of 17th-century letter writers, if the removal of a central figure in 1700 increases the average path length of the network less than the removal of a central figure in 1600, central figures likely played less important structural roles by the end of the century. This will be most convincing if the effect of node removal decreases smoothly across the century.” (This is the step in which you need to come to the table with knowledge of different computational methods and what they do.)
  4. Specify proxies. List specific analytic approaches needed for the promising tests, and the data required to do them. For example, you need a list of senders and recipients of scholarly letters, roughly evenly distributed across time between 1600 and 1700, and densely-packed enough to perform network analysis. There could be a few different analytic approaches, including removing highly-central nodes and re-calculating average path length; employing measurements of attack tolerance; etc. Probably worth testing them all and seeing if each conforms to the pre-existing theory.
  5. Find data. Find pre-existing datasets that will fit your proxies, or estimate how long it will take to gather enough data yourself to reasonably approach your hypotheses. Opt for data that will work for as many approaches as possible. You may find some data that will suggest new hypotheses, and you’ll iterate back and forth between steps #3-#5 a few times.
  6. Collect data. Run experiments. Uh, yeah, just do those things. Easy as baking apple pie from scratch.
  7. Match experimental results to hypotheses. Here’s the fun part, you get to see how many of your predictions matched your results. Hopefully a bunch, but even if they didn’t, it’s an excuse to figure out why, and start the process anew. You can also start exploring the additional datasets to help you develop new questions. The astute may have noticed, this step brings us back to the first half of computational historiography: exploring data and seeing what you can find. 8
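
The node-removal proxy from steps #3 and #4 can be sketched roughly as follows. Everything here is an assumption made for illustration: the two networks are synthetic stand-ins for an early-century and a late-century correspondence network, "central" is taken to mean the writer with the most correspondents, and the helper names are hypothetical. A real analysis would build the graphs from sender and recipient lists and would likely try several centrality measures.

```python
from collections import deque


def average_path_length(adjacency):
    """Mean shortest-path length over all connected ordered pairs of nodes."""
    total, pairs = 0, 0
    for source in adjacency:
        distances = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distances:
                    distances[neighbor] = distances[node] + 1
                    queue.append(neighbor)
        total += sum(distances.values())
        pairs += len(distances) - 1
    return total / pairs if pairs else float("nan")


def without_node(adjacency, removed):
    """Copy of the network with one writer and all their letters removed."""
    return {n: nbrs - {removed} for n, nbrs in adjacency.items() if n != removed}


def removal_effect(adjacency):
    """Change in average path length after removing the best-connected writer."""
    central = max(adjacency, key=lambda n: len(adjacency[n]))
    return average_path_length(without_node(adjacency, central)) - average_path_length(adjacency)


def ring_with_hub(size, reach):
    """Synthetic network: `size` writers in a ring, each linked to the next
    `reach` writers, plus one hub who corresponds with everyone."""
    graph = {n: set() for n in range(size)}
    graph["hub"] = set()
    for n in range(size):
        for step in range(1, reach + 1):
            m = (n + step) % size
            graph[n].add(m)
            graph[m].add(n)
        graph[n].add("hub")
        graph["hub"].add(n)
    return graph


early = ring_with_hub(10, reach=1)  # sparse periphery, hub-dependent
late = ring_with_hub(10, reach=2)   # denser periphery, hub matters less
print("early effect:", round(removal_effect(early), 3))
print("late effect:", round(removal_effect(late), 3))
```

In this toy setup, removing the hub lengthens paths far more in the sparse network than in the dense one, which is the pattern the hypothesis predicts if science grew less dependent on specific individuals. Whether real letter data behave that way is exactly what the experiment is for.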

From here, it may be worthwhile to cycle back to the data exploration stage, then back here to computationally tractable hypothesis exploration, and so on ad infinitum.

By now, making meaning out of data probably feels impossible. I’m sorry. The process is much more fluid and intertwined than is easily unpacked in a blog post. The back-and-forth can take hours, days, months, or years.

But the important thing is, after you’ve gone back-and-forth a few times, you should have a combination of quantitative, archival, theoretical, and secondary support for a solidly historical argument.

Contexts of Discovery and Justification

Early 20th-century philosophy of science cared a lot about the distinction between the contexts of discovery and justification. Violently shortened, the context of discovery is how you reached your conclusion, and the context of justification is how you argue your point, regardless of the process that got you there.

I bring this up as a reminder that the two can be distinct. By the 1990s, quantitative historians who wanted to remain legible to their non-quantitative colleagues often saved the data analysis for an appendix, and even there the focus was on the actual experiments, not the long process of coming up with tests, re-testing, collecting more data, and so on.

The result of this cyclical computational historiography need not be (and rarely is, and perhaps can never be) a description of the process that led you to the evidence supporting your argument. While it’s a good idea to be clear about where your methods led you astray, the most legible result to historians will necessarily involve a narrative reconfiguration.

Causality and Truth

Small final notes on two big topics.

First, Causality. This approach won’t get you there. It’s hard to disentangle causality from correlation, but more importantly in this context, it’s hard to choose between competing causal explanations. The above process can lead you to plausible and corroborated hypotheses, but it cannot prove anything.

Consider this: “My hypothesis about apples predicts these 10 testable claims.” You test each claim, and each test agrees with your predictions. It’s a success, but a soft one; you’ve shown your hypothesis to be plausible given the evidence, but not inevitable. A dozen other equally sensible hypotheses could have produced the same 10 testable claims. You did not prove those hypotheses wrong, you just chose one model that happened to work. 9

Even if no alternate hypothesis presents itself, and all of your tests agree with your hypothesis, you still do not have causal proof. It may be that the proxies you chose to test your claims are bad ones, or incomplete, or your method has unseen holes. Causality is tricky, and in the humanities, proof especially so.

Which leads us to the next point: Truth. Even if somehow you devise the perfect process to find proof of a causal hypothesis, the causal description does not constitute capital-T Truth. There are many truths, coming from many perspectives, about the past, and they don’t need to agree with each other. Historians care not just about what happened, but how and why, and those hows and whys are driven by people. Messy, inconsistent people who believe many conflicting things within the span of a moment. When it comes to questions of society, even the most scientistic of scholars must come to terms with uncertainty and conflict, which after all are more causally central to the story of history than most clever narratives we might tell.

Notes:

  1. Also called digital history, and related to quantitative history and cliometrics in ways we don’t often like to admit.
  2. The other most prominent failure in computational history is our tendency to group things into finite discrete categories; in this case, a two-part list of failures.
  3. With some notable exceptions. Some historians simulate the past, others perform experiments on rates of material decay, or on the chemical composition of inks. It’s a big world out there.
  4. When I say fact, assume I add all the relevant post-modernist caveats of the contingency of objectivity etc. etc. Really I mean “matters of history that the volume of available evidence make difficult to dispute.”
  5. Ted Underwood and I have both talked about the exciting promise of incredibly low-hanging fruit in new approaches.
  6. OK in retrospect I should have used a more historically relevant example – I wasn’t expecting to push this example so far.
  7. If this seems overly scientistic, worry not! Experimental science is often defined by its recourse to rote procedure, which means pretty much any procedural explanation of research will resemble experimental science. There are many ways one can go about scalable reading / triangulation of computational historiography, not just the procedural steps #1-#7 above, but this is one of the easier approaches to explain. Soft falsification and hypothesis testing are plausible angles into computational history, but not necessary ones.
  8. A brief addendum to steps #6-#7: although I’d argue Null-Hypothesis Significance Testing or population-based statistical inferences may not be relevant to historiography, especially when it’s based in triangulation, they may be useful in certain cases. Without delving too deeply into the weeds, they can help you figure out the extent to which the effect you see may just be noise, not indicative of any particular trend. Statistical effect sizes also may be of use, helping you see whether the magnitude of your finding is big enough to have any appreciable role in the historical narrative.
  9. Shawn Graham and I wrote about this in relation to archaeology and simulation here, on the subject of underdetermination and abduction.

Lessons From Digital History’s Antecedents

The below is the transcript from my October 29 keynote presented to the Creativity and The City 1600-2000 conference in Amsterdam, titled “Punched-Card Humanities”. I survey historical approaches to quantitative history, how they relate to the nomothetic/idiographic divide, and discuss some lessons we can learn from past successes and failures. For ≈200 relevant references, see this Zotero folder.


Title Slide

I’m here to talk about Digital History, and what we can learn from its quantitative antecedents. If yesterday’s keynote was framing our mutual interest in the creative city, I hope mine will help frame our discussions around the bottom half of the poster; the eHumanities perspective.

Specifically, I’ve been delighted to see at this conference, we have a rich interplay between familiar historiographic and cultural approaches, and digital or eHumanities methods, all being brought to bear on the creative city. I want to take a moment to talk about where these two approaches meet.

Yesterday’s wonderful keynote brought up the complicated goal of using new digital methods to explore the creative city, without reducing the city to reductive indices. Are we living up to that goal? I hope a historical take on this question might help us move in this direction, that by learning from those historiographic moments when formal methods failed, we can do better this time.

Creativity Conference Theme

Digital History is different, we’re told. “New”. Many of us know historians who used computers in the 1960s, for things like demography or cliometrics, but what we do today is a different beast.

Commenting on these early punched-card historians, in 1999, Ed Ayers wrote, quote, “the first computer revolution largely failed.” The failure, Ayers claimed, was in part due to their statistical machinery not being up to the task of representing the nuances of human experience.

We see this rhetoric of newness or novelty crop up all the time. It cropped up a lot in pioneering digital history essays by Roy Rosenzweig and Dan Cohen in the 90s and 2000s, and we even see a touch of it, though tempered, in this conference’s theme.

In yesterday’s final discussion on uncertainty, Dorit Raines reminded us that the difference between quantitative history in the 70s and today’s Digital History is that today’s approaches broaden our sources, whereas early approaches narrowed them.

Slide (r)evolution

To say “we’re at a unique historical moment” is something common to pretty much everyone, everywhere, forever. And it’s always a little bit true, right?

It’s true that every historical moment is unique. Unprecedented. Digital History, with its unique combination of public humanities, media-rich interests, sophisticated machinery, and quantitative approaches, is pretty novel.

But as the saying goes, history never repeats itself, but it rhymes. Each thread making up Digital History has a long past, and a lot of the arguments for or against it have been made many times before. Novelty is a convenient illusion that helps us get funding.

Not coincidentally, it’s this tension I’ll highlight today: between revolution and evolution, between breaks and continuities, and between the historians who care more about what makes a moment unique, and those who care more about what connects humanity together.

To be clear, I’m operating on two levels here: the narrative and the metanarrative. The narrative is that the history of digital history is one of continuities and fractures; the metanarrative is that this very tension between uniqueness and self-similarity is what swings the pendulum between quantitative and qualitative historians.

Now, my claim that debates over continuity and discontinuity are a primary driver of the quantitative/qualitative divide comes a bit out of left field — I know — so let me back up a few hundred years and explain.

Chronology

Francis Bacon wrote that knowledge would be better understood if it were collected into orderly tables. His plea extended, of course, to historical knowledge, and inspired renewed interest in a genre already over a thousand years old: tabular chronology.

These chronologies were world histories, aligning the pasts of several regions which each reckoned the passage of time differently.

Isaac Newton inherited this tradition, and dabbled throughout his life in establishing a more accurate universal chronology, aligning Biblical history with Greek legends and Egyptian pharaohs.

Newton brought to history the same mind he brought to everything else: one of stars and calculations. Like his peers, Newton relied on historical accounts of astronomical observations to align simultaneous events across thousands of miles. Kepler and Scaliger, among others, also partook in this “scientific history”.

Where Newton departed from his contemporaries, however, was in his use of statistics for sorting out history. In the late 1500s, the average or arithmetic mean was popularized by astronomers as a way of smoothing out noisy measurements. Newton co-opted this method to help him estimate the length of royal reigns, and thus the ages of various dynasties and kingdoms.

On average, Newton figured, a king’s reign lasted 18-20 years. If the history books record 5 kings, that means the dynasty lasted between 90 and 100 years.
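
The arithmetic is simple enough to write down directly. The function name below is hypothetical, and the 18-to-20-year range is the figure quoted above.

```python
def dynasty_span(kings, reign_low=18, reign_high=20):
    """Estimated length of a dynasty in years, as a (low, high) pair,
    assuming every reign falls within the average range."""
    return kings * reign_low, kings * reign_high


print(dynasty_span(5))  # (90, 100)
```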

Newton was among the first to apply averages to fill in chronologies, though not the first to apply them to human activities. By the late 1600s, demographic statistics of contemporary life — of births, burials and the like — were becoming common. They were ways of revealing divinely ordered regularities.

Incidentally, this is an early example of our illustrious tradition of uncritically appropriating methods from the natural sciences. See? We’ve all done it, even Newton!  

Joking aside, this is an important point: statistical averages represented divine regularities. Human statistics began as a means to uncover universal truths, and they continue to be employed in that manner. More on that later, though.

Musgrave Quote

Newton’s method didn’t quite pass muster, and skepticism grew rapidly on the whole prospect of mathematical history.

Criticizing Newton in 1782, for example, Samuel Musgrave argued, in part, that there are no discernible universal laws of history operating in parallel to the universal laws of nature. Nature can be mathematized; people cannot.

Not everyone agreed. Francesco Algarotti passionately argued that Newton’s calculation of average reigns, the application of math to history, was one of his greatest achievements. Even Voltaire tried Newton’s method, aligning a Chinese chronology with Western dates using average length of reigns.

Nomothetic / Idiographic

Which brings us to the earlier continuity/discontinuity point: quantitative history stirs debate in part because it draws together two activities Immanuel Kant sets in opposition: the tendency to generalize, and the tendency to specify.

The tendency to generalize, later dubbed Nomothetic, often describes the sciences: extrapolating general laws from individual observations. Examples include the laws of gravity, the theory of evolution by natural selection, and so forth.

The tendency to specify, later dubbed Idiographic, describes, mostly, the humanities: understanding specific, contingent events in their own context and with awareness of subjective experiences. This could manifest as a microhistory of one parish in the French Revolution, a critical reading of Frankenstein focused on gender dynamics, and so forth.  

These two approaches aren’t mutually exclusive, and they frequently come in contact around scholarship of the past. Paleontologists, for example, apply general laws of biology and geology to tell the specific story of prehistoric life on Earth. Astronomers, similarly, combine natural laws and specific observations to trace the origins of our universe.

Historians have, with cyclically recurring intensity, engaged in similar efforts. One recent nomothetic example is that of cliodynamics: the practitioners use data and simulations to discern generalities such as why nations fail or what causes war. Recent idiographic historians associate more with the cultural and theoretical turns in historiography, often focusing on microhistories or the subjective experiences of historical actors.

Both tend to meet around quantitative history, but the conversation began well before the urge to quantify. They often fruitfully align and improve one another when working in concert; for example when the historian cites a common historical pattern in order to highlight and contextualize an event which deviates from it.

But more often, nomothetic and idiographic historians find themselves at odds. Newton extrapolated “laws” for the length of kings’ reigns, and was criticized for thinking mathematics had any place in the domain of the uniquely human. Newton’s contemporaries used human statistics to argue for divine regularities, and this was eventually criticized as encroaching on human agency, free will, and the uniqueness of subjective experience.

Bacon Taxonomy

I’ll highlight some moments in this debate, focusing on English-speaking historians, and will conclude with what we today might learn from the foibles of the quantitative historians who came before.

Let me reiterate, though, that quantitative history is not the same as nomothetic history, but they invite each other, so I shouldn’t be ahistorical by dividing them.

Take Henry Buckle, who in 1857 tried to bridge the two-culture divide posed by C.P. Snow a century later. He wanted to use statistics to find general laws of human progress, and apply those generalizations to the histories of specific nations.

Buckle was well-aware of historiography’s place between nomothetic and idiographic cultures, writing: “it is the business of the historian to mediate between these two parties, and reconcile their hostile pretensions by showing the point at which their respective studies ought to coalesce.”

In direct response, James Froude wrote that there can be no science of history. The whole idea of Science and History being related was nonsensical, like talking about the colour of sound. They simply do not connect.

This was a small exchange in a much larger Victorian debate pitting narrative history against a growing interest in scientific history. The latter rose on the coattails of growing popular interest in science, much like our debates today align with broader discussions around data science, computation, and the visible economic successes of startup culture.

This is, by the way, contemporaneous with something yesterday’s keynote highlighted: the 19th century drive to establish ‘urban laws’.

By now, we begin seeing historians leveraging public trust in scientific methods as a means for political control and pushing agendas. This happens in concert with the rise of punched cards and, eventually, computational history. Perhaps the best example of this historical moment comes from the American Census in the late 19th century.

19C Map

Briefly, a group of 19th century American historians, journalists, and census chiefs used statistics, historical atlases, and the machinery of the census bureau to publicly argue for the disintegration of the U.S. Western Frontier in the late 19th century.

These moves were, in part, made to consolidate power in the American West and wrest control from the native populations who still lived there. They accomplished this, in part, by publishing popular atlases showing that the western frontier was so fractured that it was difficult to maintain and defend. 1

The argument, it turns out, was pretty compelling.

Hollerith Cards

Part of what drove the statistical power and scientific legitimacy of these arguments was the new method, in 1890, of entering census data on punched cards and processing them in tabulating machines. The mechanism itself was wildly successful, and the inventor’s company wound up merging with a few others to become IBM. As was true of punched-card humanities projects through the time of Father Roberto Busa, this work was largely driven by women.

It’s worth pausing to remember that the history of punch card computing is also a history of the consolidation of government power. Seeing like a computer was, for decades, seeing like a state. And how we see influences what we see, what we care about, how we think.  

Recall the Ed Ayers quote I mentioned at the beginning of my talk. He said the statistical machinery of early quantitative historians could not represent the nuance of historical experience. That doesn’t just mean the math they used; it means the actual machinery involved.

See, one of the truly groundbreaking punch card technologies at the turn of the century was the card sorter. Each card could represent a person, or household, or whatever else, which is sort of legible one-at-a-time, but unmanageable in giant stacks.

Now, this is still well before “computers”, but machines were being developed which could sort these cards into one of twelve pockets based on which holes were punched. So, for example, if you had cards punched for people’s age, you could sort the stacks into 10 different pockets to break them up by age groups: 0-9, 10-19, 20-29, and so forth.

This turned out to be amazing for eyeball estimates. If your 20-29 pocket was twice as full as your 10-19 pocket after all the cards were sorted, you had a pretty good idea of the age distribution.
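To make the sorter's logic concrete, here is a minimal sketch in Python of dropping age cards into ten-year pockets and comparing how full they are. The ages are invented for illustration, and `sort_into_pockets` is a hypothetical name; real sorters worked on punched columns, not integers.

```python
from collections import Counter


def sort_into_pockets(ages, width=10):
    """Count cards per age pocket, labeled like '20-29'."""
    pockets = Counter()
    for age in ages:
        low = (age // width) * width  # bottom of the pocket this card falls into
        pockets[f"{low}-{low + width - 1}"] += 1
    return pockets


# One invented "card" per person, holding only an age.
ages = [3, 12, 15, 21, 22, 25, 28, 34]
pockets = sort_into_pockets(ages)
print(pockets["10-19"], pockets["20-29"])  # 2 4
```

Note what the pockets throw away: once a card lands in 20-29, the machine no longer knows whether it belonged to a 21-year-old or a 28-year-old.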

Over the next 50 years, this convenience would shape the social sciences. Consider demographics or marketing. Both developed in the shadow of punch cards, and both relied heavily on what’s called “segmentation”, the breaking of society into discrete categories based on easily punched attributes. Age ranges, racial background, etc. These would be used to, among other things, determine who was interested in what products.

They’d eventually use statistics on these segments to inform marketing strategies.

But, if you look at the statistical tests that already existed at the time, these segmentations weren’t always the best way to break up the data. For example, age flows smoothly between 0 and 100; you could easily contrive a statistical test to show that, as a person ages, she’s more likely to buy one product over another, over a set of smooth functions.

That’s not how it worked though. Age was, and often still is, chunked up into ten or so distinct ranges, and those segments were each analyzed individually, as though they were as distinct from one another as dogs and cats. That is, 0-9 is as related to 10-19 as it is to 80-89.

What we see here is the deep influence of technological affordances on scholarly practice, and it’s an issue we still face today, though in different form.

As historians began using punch cards and social statistics, they inherited, or appropriated, a structure developed for bureaucratic government processing, and were rightly soon criticized for its dehumanizing qualities.

Pearson Stats

Unsurprisingly, given this backdrop, historians in the first few decades of the 20th century often shied away from or rejected quantification.

The next wave of quantitative historians, who reached their height in the 1930s, approached the problem with more subtlety than the previous generations in the 1890s and 1860s.

Charles Beard’s famous Economic Interpretation of the Constitution of the United States used economic and demographic stats to argue that the US Constitution was economically motivated. Beard, however, did grasp the fundamental idiographic critique of quantitative history, claiming that history was, quote:

“beyond the reach of mathematics — which cannot assign meaningful values to the imponderables, immeasurables, and contingencies of history.”

The other frequent critique of quantitative history, still heard, is that it uncritically appropriates methods from stats and the sciences.

This also wasn’t entirely true. The slide behind me shows famed statistician Karl Pearson’s attempt to replicate the math of Isaac Newton that we saw earlier using more sophisticated techniques.

By the 1940s, Americans with graduate training in statistics like Ernest Rubin were actively engaging historians in their own journals, discussing how to carefully apply statistics to historical research.

On the other side of the channel, the French Annales historians were advocating longue durée history; a move away from biographies to prosopographies, from events to structures. In its own way, this was another historiography teetering on the edge between the nomothetic and idiographic, an approach that sought to uncover the rhymes of history.

Interest in quantitative approaches surged again in the late 1950s, led by a new wave of Annales historians like Fernand Braudel and American quantitative manifestos like those by Benson, Conrad, and Meyer.

William Aydelotte went so far as to point out that all historians implicitly quantify, when they use words like “many”, “average”, “representative”, or “growing” – and the question wasn’t can there be quantitative history, but when should formal quantitative methods be utilized?

By 1968, George Murphy, seeing the swell of interest, asked a very familiar question: why now? He asked why the 1960s were different from the 1860s or 1930s, why were they, in that historical moment, able to finally do it right? His answer was that it wasn’t just the new technologies, the huge datasets, the innovative methods: it was the zeitgeist. The 1960s was the right era for computational history, because it was the era of computation.

By the early 70s, there was a historian using a computer in every major history department. Quantitative history had finally grown into itself.

Popper Historicism

Of course, in retrospect, Murphy was wrong. Once the pendulum swung too far towards scientific history, theoretical objections began pushing it the other way.

In Poverty of Historicism, Popper rejected scientific history, but mostly as a means to reject historicism outright. Popper’s arguments represent an attack from outside the historiographic tradition, but one that eventually had significant purchase even among historians, as an indication of the failure of nomothetic approaches to culture. It is, to an extent, a return to Musgrave’s critique of Isaac Newton.

At the same time, we see growing criticism from historians themselves. Arthur Schlesinger famously wrote that “important questions are important precisely because they are not susceptible to quantitative answers.”

There was a converging consensus among English-speaking historians, as in the early 20th century, that quantification erased the essence of the humanities, that it smoothed over the very inequalities and historical contingencies we needed to highlight.

Barzun’s Clio

Jacques Barzun summed it up well, if scathingly, saying history ought to free us from the bonds of the machine, not feed us into it.

The skeptics prevailed, and the pendulum swung the other way. The post-structural, cultural, and literary-critical turns in historiography pivoted away from quantification and computation. The final nail was probably Fogel and Engerman’s 1974 Time on the Cross, which reduced American slavery to economic figures, and didn’t exactly treat the subject with nuance and care.

The cliometricians, demographers, and quantitative historians didn’t disappear after the cultural turn, but their numbers shrank, and they tended to find themselves in social science departments, or fled here to Europe, where social and economic historians were faring better.

Which brings us, 40 years on, to the middle of a new wave of quantitative or “formal method” history. Ed Ayers, like George Murphy before him, wrote, essentially, this time it’s different.

And he’s right, to a point. Many here today draw their roots not to the cliometricians, but to the very cultural historians who rejected quantification in the first place. Ours is a digital history steeped in the values of the cultural turn, that respects social justice and seeks to use our approaches to shine a light on the underrepresented and the historically contingent.

But that doesn’t stop a new wave of critiques that, if not repeating old arguments, certainly rhyme with them. Take Johanna Drucker’s recent call to rebrand data as capta, because when we treat observations objectively, as if they were the same as the phenomena observed, we collapse the critical distance between the world and our interpretation of it. And interpretation, Drucker contends, is the foundation on which humanistic knowledge is based.

Which is all to say, every swing of the pendulum between idiographic and nomothetic history was situated in its own historical moment. It’s not a clock’s pendulum, but Foucault’s pendulum, with each swing’s apex ending up slightly off from the last. The issues of chronology and astronomy are different from those of eugenics and manifest destiny, which are themselves different from the capitalist and dehumanizing tendencies of 1950s mainframes.

But they all rhyme. Quantitative history has failed many times, for many reasons, but there are a few threads that bind them which we can learn from — or, at least, a few recurring mistakes we can recognize in ourselves and try to avoid going forward.

We won’t, I suspect, stop the pendulum’s inevitable about-face, but at least we can continue our work with caution, respect, and care.

Which is to be Master?

The lesson I’d like to highlight may be summed up in one question, asked by Humpty Dumpty to Alice: which is to be master?

Over several hundred years of quantitative history, the advice of proponents and critics alike tends to align with this question. Indeed in 1956, R.G. Collingwood wrote specifically “statistical research is for the historian a good servant but a bad master,” referring to the fact that statistical historical patterns mean nothing without historical context.

Schlesinger, the guy I mentioned earlier who said important questions are important precisely because they can’t be quantified, later acknowledged that while quantitative methods can be useful, they’ll lead historians astray. Instead of tackling good questions, he said, historians will tackle easily quantifiable ones, and Schlesinger was uncomfortable with the tail wagging the dog.

Which is to be master – questions

I’ve found many ways in which historians have accidentally given over agency to their methods and machines over the years, but these five, I think, are the most relevant to our current moment.

Unfortunately, since we’re running out of time, you’ll just have to trust me that these are historically recurring.

Number 1 is the uncareful appropriation of statistical methods for historical uses. It controls us precisely because it offers us a black box whose output we don’t truly understand.

A common example I see these days is in network visualizations. People visualize nodes and edges using what are called force-directed layouts in Gephi, but they don’t exactly understand what those layouts mean. As these layouts were designed, physical proximity of nodes is not meant to represent relatedness, yet I’ve seen historians interpret two neighboring nodes as being related because of their visual adjacency.

This is bad. It’s false. But because we don’t quite understand what’s happening, we get lured by the black box into nonsensical interpretations.

The second way methods drive us is in our reliance on methodological imports. That is, we take the time to open the black box, but we only use methods that we learn from statisticians or scientists. Even when we fully understand the methods we import, if we’re bound to other people’s analytic machinery, we’re bound to their questions and biases.

Take the example I mentioned earlier, with demographic segmentation, punch card sorters, and their influence on social scientific statistics. The very mechanical affordances of early computers influenced the sort of questions people asked for decades: how do discrete groups of people react to the world in different ways, and how do they compare with one another?

The next thing to watch out for is naive scientism. Even if you know the assumptions of your methods, and you develop your own techniques for the problem at hand, you still can fall into the positivist trap that Johanna Drucker warns us about — collapsing the distance between what we observe and some underlying “truth”.

This is especially difficult when we’re dealing with “big data”. Once you’re working with so much material you couldn’t hope to read it all, it’s easy to be lured into forgetting the distance between operationalizations and what you actually intend to measure.

For instance, if I’m finding friendships in Early Modern Europe by looking for particular words being written in correspondences, I will completely miss the existence of friends who were neighbors, and thus had no reason to write letters for us to eventually read.

A fourth way we can be misled by quantitative methods is the ease with which they lend an air of false precision or false certainty.

This is the problem Matthew Lincoln and the other panelists brought up yesterday, where missing or uncertain data, once quantified, falsely appears precise enough to make comparisons.

I see this mistake crop up in early and recent quantitative histories alike; we measure, say, the changing rate of transnational shipments over time, and notice a positive trend. The problem is the positive difference is quite small, easily attributable to error, but because numbers are always precise, it still feels like we’re being more precise than doing a qualitative assessment. Even when it’s unwarranted.
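A small sketch may make the point tangible. The shipment counts below are entirely invented, and the band of two standard errors around the slope is a rough rule of thumb rather than a proper confidence interval; `slope_with_error` is a hypothetical helper written for this illustration.

```python
import math


def slope_with_error(xs, ys):
    """Ordinary least-squares slope and its standard error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Leftover variation the fitted line does not explain.
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_error


years = [1700, 1701, 1702, 1703, 1704, 1705]
shipments = [100, 108, 95, 104, 99, 107]  # invented counts, not real data

slope, err = slope_with_error(years, shipments)
low, high = slope - 2 * err, slope + 2 * err  # rough band, an assumption
print(slope > 0)       # the trend looks positive...
print(low < 0 < high)  # ...but the band still includes zero
```

The slope comes out positive, which is the number that ends up in the graph, while the band around it comfortably includes zero. Reporting only the first figure is how a trend easily attributable to error comes to feel precise.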

The last thing to watch out for, and maybe the most worrisome, is the blinders quantitative analysis places on historians who don’t engage in other historiographic methods. This has been the downfall of many waves of quantitative history in the past; the inability to care about or even see that which can’t be counted.

This was, in part, what led Time on the Cross to become the excuse to drive historians from cliometrics. The indicators of slavery that were measurable were sufficient to show it to have some semblance of economic success for black populations; but it was precisely those aspects of slavery they could not measure that were the most historically important.

So how do we regain mastery in light of these obstacles?

Which is to be master – answers

1. Uncareful Appropriation – Collaboration

Regarding the uncareful appropriation of methods, we can easily sidestep the issue of accidentally misusing a method by collaborating with someone who knows how the method works. This may require a translator; statisticians can as easily misunderstand historical problems as historians can misunderstand statistics.

Historians and statisticians can fruitfully collaborate, though, if they have someone in the middle trained to some extent in both — even if they’re not themselves experts. For what it’s worth, Dutch institutions seem to be ahead of the game in this respect, which is something that should be fostered.

2. Reliance on Imports – Statistical Training

Getting away from reliance on disciplinary imports may take some more work, because we ourselves must learn the approaches well enough to augment them, or create our own. Right now in DH this is often handled by summer institutes and workshop series, but I’d argue those are not sufficient here. We need to make room in our curricula for actual methods courses, or even degrees focused on methodology, in the same fashion as social scientists, if we want to start a robust practice of developing appropriate tools for our own research.

3. Naive Scientism – Humanities History

The spectre of naive scientism, I think, is one we need to be careful of, but we are also already well-equipped to deal with it. If we want to combat the uncareful use of proxies in digital history, we need only to teach the history of the humanities; why the cultural turn happened, what’s gone wrong with positivistic approaches to history in the past, etc.

Incidentally, I think this is something digital historians already guard well against, but it’s still worth keeping in mind and making sure we teach it. Particularly, digital historians need to remain aware of parallel approaches from the past, rather than tracing their background only to the textual work of people like Roberto Busa in Italy.

4. False Precision & Certainty – Simulation & Triangulation

False precision and false certainty have some shallow fixes, and some deep ones. In the short term, we need to be better about understanding things like confidence intervals and error bars, and use methods like what Matthew Lincoln highlighted yesterday.

In the long term, though, digital history would do well to adopt triangulation strategies to help mitigate these issues. That means trying to reach the same conclusion using multiple different methods in parallel, and seeing if they all agree. If they do, you can be more certain your results are something you can trust, and not just an accident of the method you happened to use.

5. Quantitative Blinders – Rejecting Digital History

Avoiding quantitative blinders – that is, the tendency to only care about what’s easily countable – is an easy fix, but I’m afraid to say it, because it might put me out of a job. We can’t call what we do digital history, or quantitative history, or cliometrics, or whatever else. We are, simply, historians.

Some of us use more quantitative methods, and some don’t, but if we’re not ultimately contributing to the same body of work, both sides will do themselves a disservice by not bringing every approach to bear in the wide range of interests historians ought to pursue.

Qualitative and idiographic historians will be stuck unable to deal with the deluge of material that can paint us a broader picture of history, and quantitative or nomothetic historians will lose sight of the very human irregularities that make history worth studying in the first place. We must work together.

If we don’t come together, we’re destined to remain punched-card humanists – that is, we will always be constrained and led by our methods, not by history.

Creativity Theme Again

Of course, this divide is a false one. There are no purely quantitative or purely qualitative studies; close-reading historians will continue to say things like “representative” or “increasing”, and digital historians won’t start publishing graphs with no interpretation.

Still, silos exist, and some of us have trouble leaving the comfort of our digital humanities conferences or our “traditional” history conferences.

That’s why this conference, I think, is so refreshing. It offers a great mix of both worlds, and I’m privileged and thankful to have been able to attend. While there are a lot of lessons we can still learn from those before us, from my vantage point, I think we’re on the right track, and I look forward to seeing more of those fruitful combinations over the course of today.

Thank you.

Notes:

  1. This account is influenced by some talks by Ben Schmidt. Any mistakes are from my own faulty memory, and not from his careful arguments.

“Digital History” Can Never Be New

If you claim computational approaches to history (“digital history”) let historians ask new types of questions, or that they offer new historical approaches to answering or exploring old questions, you are wrong. You’re not actually wrong, but you are institutionally wrong, which is maybe worse.

This is a problem, because rhetoric from practitioners (including me) is that we can bring some “new” to the table, and when we don’t, we’re called out for not doing so. The exchange might (but probably won’t) go like this:

Digital Historian: And this graph explains how velociraptors were of utmost importance to Victorian sensibilities.

Historian in Audience: But how is this telling us anything we haven’t already heard before? Didn’t John Hammond already make the same claim?

DH: That’s true, he did. One thing the graph shows, though, is that velociraptors in general tend to play much more unimportant roles across hundreds of years, which lends support to the Victorian thesis.

HiA: Yes, but the generalized argument doesn’t account for cultural differences across those times, so doesn’t meaningfully contribute to this (or any other) historical conversation.


New Questions

History (like any discipline) is made of people, and those people have Ideas about what does or doesn’t count as history (well, historiography, but that’s a long word so let’s ignore it). If you ask a new type of question or use a new approach, that new thing probably doesn’t fit historians’ Ideas about proper history.

Take culturomics. They make claims like this:

The age of peak celebrity has been consistent over time: about 75 years after birth. But the other parameters have been changing. Fame comes sooner and rises faster. Between the early 19th century and the mid-20th century, the age of initial celebrity declined from 43 to 29 years, and the doubling time fell from 8.1 to 3.3 years.

Historians saw those claims and asked “so what”? It’s not interesting or relevant according to the things historians usually consider interesting or relevant, and it’s problematic in ways historians find things problematic. For example, it ignores cultural differences, does not speak to actual human experiences, and has nothing of use to say about a particular historical moment.

It’s true. Culturomics-style questions do not fit well within a humanities paradigm (incommensurable, anyone?). By the standard measuring stick of what makes a good history project, culturomics does not measure up. A new type of question requires a new measuring stick; in this case, I think a good one for culturomics-style approaches is the extent to which they bridge individual experiences with large-scale social phenomena, or how well they are able to reconcile statistical social regularities with free or contingent choice.

The point, though, is a culturomics presentation would fit few of the boxes expected at a history conference, and so would be considered a failure. Rightly so, too—it’s a bad history presentation. But what culturomics is successfully doing is asking new types of questions, whether or not historians find them legitimate or interesting. Is it good culturomics?

To put too fine a point on it, since history is often a question-driven discipline, new types of questions that are too different from previous types are no longer legitimately within the discipline of history, even if they are intrinsically about human history and do not fit in any other discipline.

What’s more, new types of questions may appear simplistic by historians’ standards, because they fail at fulfilling even the most basic criteria usually measuring historical worth. It’s worth keeping in mind that, to most of the rest of the world, our historical work often fails at meeting their criteria for worth.

New Approaches

New approaches to old questions share a similar fate, but for different reasons. That is, if they are novel, they are not interesting, and if they are interesting, they are not novel.

Traditional historical questions are, let’s face it, not particularly new. Tautologically. Some old questions in my field are: what role did now-silent voices play in constructing knowledge-making instruments in 17th century astronomy? How did scholarship become institutionalized in the 18th century? Why was Isaac Newton so annoying?

My own research is an attempt to provide a broader view of those topics (at least, the first two) using computational means. Since my topical interest has a rich tradition among historians, it’s unlikely any of my historically-focused claims (for example, that scholarly institutions were built to replace the really complicated and precarious role people played in coordinating social networks) will be without precedent.

After decades, or even centuries, of historical work in this area, there will always be examples of historians already having made my claims. My contribution is the bolstering of a particular viewpoint, the expansion of its applicability, the reframing of a discussion. Ultimately, maybe, I convince the world that certain social network conditions play an important role in allowing scholarly activity to be much more successful at its intended goals. My contribution is not, however, a claim that is wholly without precedent.

But this is a problem, since DH rhetoric, even by practitioners, can understandably lead people to expect such novelty. Historians in particular are very good at fitting old patterns to new evidence. It’s what we’re trained to do.

Any historical claim (to an acceptable question within the historical paradigm) can easily be countered with “but we already knew that”. Either the question’s been around long enough that every plausible claim has been covered, or the new evidence or theory is similar enough to something pre-existing that it can be taken as precedent.

The most masterful recent discussion of this topic was Matthew Lincoln’s Confabulation in the humanities, where he shows how easy it is to make up evidence and get historians to agree that they already knew it was true.

To put too fine a point on it, new approaches to old historical questions are destined to produce results which conform to old approaches; or if they don’t, it’s easy enough to stretch the old & new theories together until they fit. New approaches to old questions will fail at producing completely surprising results; this is a bad standard for historical projects. If a novel methodology were to create truly unrecognizable results, it is unlikely those results would be recognized as “good history” within the current paradigm. That is, historians would struggle to care.

What Is This Beast?

What is this beast we call digital history? Boundary-drawing is a tried-and-true tradition in the humanities, digital or otherwise. It’s theoretically kind of stupid but practically incredibly important, since funding decisions, tenure cases, and similar career-altering forces are at play. If digital history is a type of history, it’s fundable as such, tenurable as such; if it isn’t, it ain’t. What’s more, if what culturomics researchers are doing is also history, their already-well-funded machine can start taking slices of the sad NEH pie.

Artist’s rendition of sad NEH pie.
So “what counts?” is unfortunately important to answer.

This discussion around what is “legitimate history research” is really important, but I’d like to table it for now, because it’s so often conflated with the discussion of what is “legitimate research” sans history. The former question easily overshadows the latter, since academics are mostly just schlubs trying to make a living.

For the last century or so, history and philosophy of science have been smooshed together in departments and conferences. It’s caused a lot of concern. Does history of science need philosophy of science? Does philosophy of science need history of science? What does it mean to combine the two? Is what comes out of the middle even useful?

Weirdly, the question sometimes comes down to “does history and philosophy of science even exist?”. It’s weird because people identify with that combined title, so I published a citation analysis in Erkenntnis a few years back that basically showed that, indeed, there is an area between the two communities, and indeed those people describe themselves as doing HPS, whatever that means to them.

Look! Right in the middle there, it’s history and philosophy of science.

I bring this up because digital history, as many of us practice it, leaves us floating somewhere between public engagement, social science, and history. Culturomics occupies a similar interstitial space, though inching closer to social physics and complex systems.

From this vantage point, we have a couple of options. We can say digital history is just history from a slightly different angle, and try to be evaluated by standard historical measuring sticks—which would make our work easily criticized as not particularly novel. Or we can say digital history is something new, occupying that in-between space—which could render the work unrecognizable to our usual communities.

The either/or proposition is, of course, ludicrous. The best work being done now skirts the line, offering something just novel enough to be surprising, but not so out of traditional historical bounds as to be grouped with culturomics. But I think we need to be more deliberate and organized in this practice, lest we want to be like History and Philosophy of Science, still dealing with basic questions of legitimacy fifty years down the line.

In the short term, this probably means trying not just to avoid the rhetoric of newness, but to actively curtail it. In the long term, it may mean allying with like-minded historians, social scientists, statistical physicists, and complexity scientists to build a new framework of legitimacy that recognizes the forms of knowledge we produce which don’t always align with historiographic standards. As Cassidy Sugimoto and I recently wrote, this often comes with journals, societies, and disciplinary realignment.

The least we can do is steer away from a novelty rhetoric, since what is novel often isn’t history, and what is history often isn’t novel.


“Branding” – An Addendum

After writing this post, I read Amardeep Singh’s call to, among other things, avoid branding:

Here’s a way of thinking that might get us past this muddle (and I think I agree with the authors that the hype around DH is a mistake): let’s stop branding our scholarship. We don’t need Next Big Things and we don’t need Academic Superstars, whether they are DH Superstars or Theory Superstars. What we do need is to find more democratic and inclusive ways of thinking about the value of scholarship and scholarly communities.

This is relevant here, and good, but tough to reconcile with the earlier post. In an ideal world, without disciplinary brandings, we can all try to be welcoming of works on their own merits, without relying on our preconceived disciplinary criteria. In the present condition, though, it’s tough to see such an environment forming. In that context, maybe a unified digital history “brand” is the best way to stay afloat. This would build barriers against whatever new thing comes along next, though, so it’s a tough question.

Connecting the Dots

This is the incredibly belated transcript of my HASTAC 2015 keynote. Many thanks to the organizers for inviting me, and to my fellow participants for all the wonderful discussions. The video and slides are also online. You can find citations to some of the historical illustrations and many of my intellectual inspirations here. What I said and what I wrote probably don’t align perfectly.

When you’re done reading this, you should read Roopika Risam’s closing keynote, which connects surprisingly well with this, though we did not plan it.


If you take a second to expand and disentangle “HASTAC”, you get a name of an organization that doubles as a fairly strong claim about the world: that Humanities, Arts, Science, and Technology are separate things, that they probably aren’t currently in alliance with one another, and that they ought to form an alliance.

This intention is reinforced in the theme of this year’s conference: “The Art and Science of Digital Humanities.” Here again we get the four pillars: humanities, arts, science, and technology. In fact, bear with me as I read from the CFP:

We welcome sessions that address, exemplify, and interrogate the interdisciplinary nature of DH work. HASTAC 2015 challenges participants to consider how the interplay of science, technology, social sciences, humanities, and arts are producing new forms of knowledge, disrupting older forms, challenging or reifying power relationships, among other possibilities.

Here again is that implicit message: disciplines are isolated, and their interplay can somehow influence power structures. As with a lot of digital humanities and cultural studies, there’s also a hint of activism: that building intentional bridges is a beneficial activity, and we’re organizing the community of HASTAC around this goal.


This is what I’ll be commenting on today. First, what does disciplinary isolation mean? I put this historically, and argue that we must frame disciplinary isolation in a rhetorical context.

This brings me to my second point about ontology. It turns out the way we talk about isolation is deeply related to the way we think about knowledge, the way we illustrate it, and ultimately the shape of knowledge itself. That’s ontology.

My third point brings us back to HASTAC: that we represent an intentional community, and this intent is to build bridges which positively affect the academy and the world.

I’ll connect these three strands by arguing that we need a map to build bridges, and we need to carefully think about the ontology of knowledge to draw that map. And once we have a map, we can use it to design a better territory.

In short, this plenary is a call-to-action. It’s my vocal support for an intentionally improved academy, my exploration of its historical and rhetorical underpinnings, and my suggestions for effecting positive change in the future.

Matt Might’s Illustrated Guide to the Ph.D.
Let’s begin at the beginning. With isolation.

Stop me if you’ve heard this one before:

Within this circle is the sum of all human knowledge. It’s nice, it’s enclosed, it’s bounded. It’s a comforting thought, that everything we’ve ever learned or created sits comfortably inside these boundaries.

This blue dot is you, when you’re born. It’s a beautiful baby picture. You’ve got the whole world ahead of you, an entire universe to learn, just waiting. You’re at the center because you have yet to reach your proverbial hand out in any direction and begin to learn.

Matt Might’s Illustrated Guide to the Ph.D.

But time passes and you grow. You go to high school, you take your liberal arts and sciences, and you slowly expand your circle into the great known. Rounding out your knowledge, as it were.

Then college happens! Oh, those heady days of youth. We all remember it, when the shape of our knowledge started leaning tumorously to one side. The ill-effects of specialization and declaring a major, I suspect.

As you complete a master’s degree, your specialty pulls your knowledge inexorably towards the edge of the circle of the known. You’re not a jack of all trades anymore. You’re an expert.

http://matt.might.net/articles/phd-school-in-pictures/
Matt Might’s Illustrated Guide to the Ph.D.

Then your PhD advisor yells at you to focus and get even smaller. So you complete your qualifying exams and reach the edge of what’s known. What lies beyond the circle? Let’s zoom in and see!

Matt Might’s Illustrated Guide to the Ph.D.

You’ve reached the edge. The end of the line. The sum of all human knowledge stops here. If you want to go further, you’ll need to come up with something new. So you start writing your dissertation.

That’s your PhD. Right there, at the end of the little arrow.

You did it. Congratulations!

You now know more about less than anybody else in the world. You made a dent in the circle, you pushed human knowledge out just a tiny bit further, and all it cost you was your mental health, thirty years of your life, and the promise of a certain future. …Yay?

Matt Might’s Illustrated Guide to the Ph.D.
So here’s the new world that you helped build, the new circle of knowledge. With everyone in this room, I bet we’ve managed to make a lot of dents. Maybe we’ve even managed to increase the circle’s radius a bit!

Now, what I just walked us all through is Matt Might’s illustrated guide to the Ph.D. It made the rounds on the internet a few years back, and it was pretty popular.

And, though I’m being snarky about it, it’s a pretty uplifting narrative. It provides that same dual feeling of insignificance and importance that you get when you stare at the Hubble Ultra Deep Field. You know the picture, right?

Hubble Ultra Deep Field

There are 10,000 galaxies on display here, each with a hundred billion stars. To think that we, humans, from our tiny vantage point on Earth, could see so far and so much because of the clever way we shape glass lenses? That’s really cool.

And saying that every pinprick of light we see is someone else’s PhD? Well, that’s a pretty fantastic metaphor. Makes getting the PhD seem worth it, right?

Dante and the Early Astronomers; M. A. Orr (Mrs. John Evershed), 1913

It kinda reminds me of the cosmological theories of some of our philosophical ancestors.

The cosmos (Greek for “Order”) consisted of concentric, perfectly layered spheres, with us at the very center.

The cosmos was bordered by celestial fire, the light from heaven, and stars were simply pin-pricks in a dark curtain which let the heavenly light shine through.

Flammarion

So, if we beat Matt Might’s PhD metaphor to death, each of our dissertations is poking holes in the cosmic curtain, letting the light of heaven shine through. And that’s a beautiful thought, right? Enough pinpricks, and we’ll all be bathed in light.

Expanding universe.

But I promised we’d talk about isolation, and even if we have to destroy this metaphor to get there, we’ll get there.

The universe is expanding. That circle of knowledge we’re pushing the boundaries of? It’s getting bigger too. And as it gets larger, things that were once close get further and further apart. You and I and Alpha Centauri were all neighbors at the big bang, but things have changed since then, and the star that was once our neighbor is now more than four light years away.

Atlas of Science, Katy Börner (2010).

In short, if we’re to take Matt Might’s PhD model as accurate, then the result of specialization is inexorable isolation. Let’s play this out.

Let’s say two thousand years ago, a white dude from Greece invented science. He wore a beard.

[Note for readers: the following narrative is intentionally awful. Read on and you’ll see why.]


He and his bearded friends created pretty much every discipline we’re familiar with at Western universities: biology, cosmology, linguistics, philosophy, administration, NCAA football, you name it.

Over time, as Ancient Greek beards finished their dissertations, the boundaries of science expanded in every direction. But the sum of human knowledge was still pretty small back then, so one beard could write many dissertations, and didn’t have to specialize in only one direction. Polymaths still roamed the earth.


Fast forward a thousand years or so. Human knowledge had expanded in the interim, and the first European universities branched into faculties: theology, law, medicine, arts.

Another few hundred years, and we’ve reached the first age of information overload. It’s barely possible to be a master of all things, and though we remember scholars and artists known for their amazing breadth, this breadth is becoming increasingly difficult to manage.

We begin to see the first published library catalogs, since the multitude of books required increasingly clever and systematic cataloging schemes. If you were to walk through Oxford in 1620, you’d see a set of newly-constructed doors with signs above them denoting their disciplinary uses: music, metaphysics, history, moral philosophy, and so on.

The encyclopedia of Diderot & d’Alembert

Time goes on a bit further, the circle of knowledge expands, and specialization eventually leads to fracturing.

We’ve reached the age of these massive hierarchical disciplinary schemes, with learning branching in every direction. Our little circle has become unmanageable.

A few more centuries pass. Some German universities perfect the art of specialization, and they pass it along to everyone else, including the American university system.

Within another 50 years, C.P. Snow famously invoked the “Two Cultures” of humanities and sciences.

And suddenly here we are


On the edge of our circle, pushing outward, with every new dissertation expanding our radius, and increasing the distance to our neighbors.

Basically, the inevitable growth of knowledge results in an equally inevitable isolation. This is the culmination of super-specialization: a world where the gulf between disciplines is impossible to traverse, filled with language barriers, value differences, and intellectual incommensurabilities. You name it.


By this point, 99% of the room is probably horrified. Maybe it’s by the prospect of an increasingly isolated academy. More likely the horror’s at my racist, sexist, whiggish, Eurocentric account of the history of science, or at my absurdly reductivist and genealogical account of the growth of knowledge.

This was intentional, and I hope you’ll forgive me, because I did it to prove a point: the power of visual rhetoric in shaping our thoughts. We use the word “imagine” to describe every act of internal creation, whether or not it conforms to the root word of “image”. In classical and medieval philosophy, thought itself was a visual process, and complex concepts were often illustrated visually in order to help students understand and remember. Ars memoriae, it was called.

And in ars memoriae, concepts were not only given visual form, they were given order. This order wasn’t merely a clever memorization technique, it was a reflection on underlying truths about the relationship between concepts. In a sense, visual representations helped bridge human thought with divine structure.

This is our entrance into ontology. We’ve essentially been talking about interdisciplinarity for two thousand years, and always alongside a visual rhetoric about the shape, or ontology, of knowledge. Over the next 10 minutes, I’ll trace the interwoven histories of ontology, illustrations, and rhetoric of interdisciplinarity. This will help contextualize our current moment, and the intention behind meeting at a conference like this one. It should, I hope, also inform how we design our community going forward.

Let’s take a look at some alternatives to the Matt Might PhD model.

Diagrams of Knowledge

Countless cultural and religious traditions associate knowledge with trees; indeed, in the Bible, the fruit of one tree is knowledge itself.

During the Roman Empire and the Middle Ages, the sturdy metaphor of trees provided a sense of lineage and order to the world that matched perfectly with the neatly structured cosmos of the time. Common figures of speech we use today like “the root of the problem” or “branches of knowledge” betray the strength with which we connected these structures to one another. Visual representations of knowledge, obviously, were also tree-like.

See, it’s impossible to differentiate the visual from the essential here. The visualization wasn’t a metaphor, it was an instantiation of essence. There are three important concepts that link knowledge to trees, which at that time were inseparable.

One: putting knowledge on a tree implied a certain genealogy of ideas. What we discovered and explored first eventually branched into more precise subdisciplines, and the history of those branches is represented on the tree. This is much like any family tree you or I would put together with our parents and grandparents and so forth. The tree literally shows the historical evolution of concepts.

Two: putting knowledge on a tree implied a specific hierarchy that would by the Enlightenment become entwined with how we understood the universe. Philosophy separates into the theoretical and the practical; basic math into geometry and arithmetic. This branching hierarchy gave an importance to the root of the tree, be that root physics or God or philosophy or man, and that importance decreased as you reached the further limbs. It also implied an order of necessity: the branches of math could not exist without the branch of philosophy they stemmed from. This is why people today still think things like “physics is the most important discipline.”

Three: As these trees were represented, there was no difference between the concept of a branch of knowledge, the branch of knowledge itself, and the object of study of that branch of knowledge. The relationship of physics to chemistry isn’t just genealogical or foundational; it’s actually transcendent. The conceptual separation of genealogy, ontology, and transcendence would not come until much later.

It took some time for the use of the branching tree as a metaphor for knowledge to take hold, competing against other visual and metaphorical representations, but once it did, it ruled victorious for centuries. The trees spread and grew until they collapsed under their own weight by the late nineteenth century, leaving a vacuum to be filled by faceted classification systems and sprawling network visualizations. The loss of a single root as the source of knowledge signaled an epistemic shift in how knowledge is understood, the implications of which are still unfolding in present-day discussions of interdisciplinarity.

By visualizing knowledge itself as a tree, our ancestors reinforced both an epistemology and a phenomenology of knowledge, ensuring that we would think of concepts as part of hierarchies and genealogies for hundreds of years. As we slowly moved away from strictly tree-based representations of knowledge in the last century, we have also moved away from the sense that knowledge forms a strict hierarchy. Instead, we now believe it to be a diffuse system of occasionally interconnected parts.

Of course, the divisions of concepts and bodies of study have no natural kind. There are many axes against which we may compare biology to literature, but even the notion of an axis of comparison implies a commonality against which the two are related which may not actually exist. Still, we’ve found the division of knowledge into subjects, disciplines, and fields a useful practice since before Aristotle. The metaphors we use for these divisions influence our understanding of knowledge itself: structured or diffuse; overlapping or separate; rooted or free; fractals or divisions; these metaphors inform how we think about thinking, and they lend themselves to visual representations which construct and reinforce our notions of the order of knowledge.

Arbor Scientiae, late thirteenth century, Ramon Llull.
Given all this, it should come as no surprise that medieval knowledge was shaped like a tree – God sat at the root, and the great branching of knowledge provided a transcendental order of things. Physics, ethics, and biology branched further and further until tiny subdisciplines sat at every leaf. One important aspect of these illustrations was unity – they were whole and complete, and even more, they were all connected. This mirrors pretty closely that circle from Matt Might.

Christophe de Savigny’s Tableaux: Accomplis de tous les arts liberaux, 1587

Speaking of that circle I had up earlier, many of these branching diagrams had a similar feature. Notice the circle encompassing this illustration, especially the one on the left here: it’s a chain. The chain locks the illustration down: it says, there are no more branches to grow.

This and similar illustrations were also notable for their placement. This was an index to a book, an early encyclopedia of sorts – you use the branches to help you navigate through descriptions of the branches of knowledge. How else should you organize a book of knowledge than by its natural structure?

Bacon’s Advancement of Learning

We start seeing some visual, rhetorical, and ontological changes by the time of Francis Bacon, who wrote “the distributions and partitions of knowledge are […] like branches of a tree that meet in a stem, which hath a dimension and quantity of entireness and continuance, before it come to discontinue and break itself into arms and boughs.”

The highly influential book broke the trends in three ways:

  1. It broke the “one root” model of knowledge.
  2. It shifted the system from closed to open, capable of growth and change.
  3. It detached natural knowledge from divine wisdom.

Bacon’s uprooting of knowledge, dividing it into history, poesy, and philosophy, each with its own root, was an intentional rhetorical strategy. He used it to argue that natural philosophy should be explored at the expense of poesy and history. Philosophy, what we now call science, was now a different kind of knowledge, worthier than the other two.

And doesn’t that feel a lot like today?

Bacon’s system also existed without an encompassing chain, embodying the idea that learning could be advanced; that the whole of knowledge could not be represented as an already-grown tree. There was no complete order of knowledge, because knowledge changes.

And, by being an imperfect, incomplete entity, without union, knowledge was notably separated from divine wisdom.

Kircher’s Philosophical tree representing all branches of knowledge, from Ars Magna Sciendi (1669), p. 251.

Of course, divinity and transcendence weren’t wholly exorcised from these ontological illustrations: Athanasius Kircher put God on the highest branch, feeding the tree’s growth. (Remember, from my earlier circle metaphor, the importance of poking holes in the fabric of the cosmos to let the light of heaven shine through?) Descartes as well continued to describe knowledge as a tree, whose roots were reliant on divine existence.

Chambers’ Cyclopædia

But even without the single trunk, without God, without unity, the metaphors were still ontologically essential, even into the 18th century. This early encyclopedia by Ephraim Chambers uses the tree as an index, and Chambers writes:

“the Origin and Derivation of the several Parts, and the relation in which [the disciplines] stand to their common Stock and to each other; will assist in restoring ‘em to their proper Places”

Their proper places. This order is still truth with a capital T.

The encyclopedia of Diderot & d’Alembert

It wasn’t until the mid-18th century, with Diderot and d’Alembert’s encyclopedia, that serious thinkers started actively disputing the idea that these trees were somehow indicative of the essence of knowledge. Even they couldn’t escape using trees, however, introducing their encyclopedia by saying “We have chosen a division which has appeared to us most nearly satisfactory for the encyclopedic arrangement of our knowledge and, at the same time, for its genealogical arrangement.”

Even if the tree wasn’t the essence of knowledge, it still represented possible truth about the genealogy of ideas. It took until a half century later, with the Encyclopedia Britannica, for the editors to do away with tree illustrations entirely and write that the world was “perpetually blended in almost every branch of human knowledge”. (Notice they still use the word branch.) By now, a philosophical trend that began with Bacon was taking form through the impossibility of organizing giant libraries and encyclopedias: that there was no unity of knowledge, no implicit order, and no viable hierarchy.

Banyan tree
It took another century to find a visual metaphor to replace the branching tree. Herbert Spencer wrote that the branches of knowledge “now and again re-unite […], they severally send off and receive connecting growths; and the intercommunion is ever becoming more frequent, more intricate, more widely ramified.” Classification theorist S.R. Ranganathan compared knowledge to the Banyan tree from his home country of India, whose roots grow both from the bottom up and from the top down.

Otlet 1937

The 20th century saw a wealth of new shapes of knowledge. Paul Otlet conceived a sort of universal network, connected through individuals’ thought processes. H.G. Wells shaped knowledge very similarly to Matt Might’s illustrated PhD from earlier: starting with a child’s experience of learning and branching out. These were both interesting developments, as they rhetorically placed the ontology of knowledge in the realm of the psychological or the social: driven by people rather than some underlying objective reality about conceptual relationships.

Porter’s 1939 Map of Physics
Around this time there was a flourishing of visual metaphors, to fill the vacuum left by the loss of the sturdy tree. There was, uncoincidentally, a flourishing of uses for these illustrations. Some, like this map, were educational and historical, teaching students how the history of physics split and recombined like water flowing through rivers and tributaries. Others, like the illustration to the right, showed how the conceptual relationships between knowledge domains differed from and overlapped with library classification schemes and literature finding aids.

Small & Garfield, 1985

By the 80s, we start seeing a slew of the illustrations we’re all familiar with: those sexy sexy network spaghetti-and-meatball graphs. We often use them to illustrate citation chains, and the relationship between academic disciplines. These graphs, so popular in the 21st century, go hand-in-hand with the ontological baggage we’re used to: that knowledge is complex, unrooted, interconnected, and co-constructed. This fits well with the current return to a concept we’d mostly left in the 19th century: that knowledge is a single, growing unit, that it’s consilient, that everyone is connected. It’s a return to the Republic of Letters from C.P. Snow’s split of the Two Cultures.

It also notably departs from genealogical, transcendental, and even conceptual discussions of knowledge. These networks, broadly construed, are social representations, and while those relationships may often align with conceptual ones, concepts are not what drive the connections.

Fürbringer’s Illustration of Bird Evolution, 1888

Interestingly, there is precedent for these sorts of illustrations in the history of evolutionary biology. In the late 19th century, illustrators and scientists began asking what it would look like if you took a slice from the evolutionary tree – or, what does the tree of life look like when you’re looking at it from the top-down?

What you get is a visual structure very similar to the network diagrams we’re now used to. And often, if you probe those making the modern visualizations, they will weave a story about the history of these networks that is reminiscent of branching evolutionary trees.

There’s another set of epistemological baggage that comes along with these spaghetti-and-meatball graphs. Ben Fry, a well-known researcher in information visualization, wrote:

“There is a tendency when using [networks] to become smitten with one’s own data. Even though a graph of a few hundred nodes quickly becomes unreadable, it is often satisfying for the creator because the resulting figure is elegant and complex and may be subjectively beautiful, and the notion that the creator’s data is ‘complex’ fits just fine with the creator’s own interpretation of it. Graphs have a tendency of making a data set look sophisticated and important, without having solved the problem of enlightening the viewer.”

Actually, were any of you here at last night’s Pink Floyd light show in the planetarium? They’re a lot like that. [Yes, readers, HASTAC put on a Pink Floyd light show.]

And this is where we are now.


Which brings us back to the outline, and HASTAC. Cathy Davidson has often described HASTAC as a social network, which is (at least on the web) always an intentionally-designed medium. Its design grants certain affordances to users: is it easier to communicate individually or in groups? What types of communities, events, or content are prioritized? These are design decisions that affect how the HASTAC community functions and interacts.

And the design decisions going into HASTAC are informed by its intent, so what is that intent? In their groundbreaking 2004 manifesto in the Chronicle, Cathy Davidson and David Goldberg wrote:

“We believe that a new configuration in the humanities must be championed to ensure their centrality to all intellectual enterprises in the university and, more generally, to understanding the human condition and thereby improving it; and that those intellectual changes must be supported by new institutional structures and values.”

This was a HASTAC rallying cry: how can the humanities constructively inform the world? Notice especially how they called for “New Institutional Structures.”

Remember earlier, how I talked about the problem of isolation? While my story about it was problematic, it doesn’t make disciplinary superspecialization any less real a problem. For all its talk of interdisciplinarity, academia is averse to synthesis on many fronts, superspecialization being just one of them. A dissertation based on synthesis, for example, is much less likely to get through a committee than a thorough single intellectual contribution to one specific field.

The academy is also weirdly averse to writing for public audiences. Popular books won’t get you tenure. But every discipline is a popular audience to most other disciplines: you wouldn’t talk to a chemist about history the same way you’d talk to a historian. Synthetic and semi-public work is exactly the sort of work that will help with HASTAC’s goal of a truly integrated and informed academy for social good, but the cards are stacked against it. Cathy and David hit the nail on the head when they target institutional structures as a critical point for improvement.

This is where design comes in.

Richmond, 1954

Recall again the theme this year: The Art and Science of Digital Humanities. I propose we take the next few days to think about how we can use art and science to make HASTAC even better at living up to its intent. That is, knowing what we do about collaboration, about visual rhetoric, about the academy, how can we design an intentional community to meet its goals? Perusing the program, it looks like most of us will already be discussing exactly this, but it’s useful to put a frame around it.

When we talk about structure and the social web, there are many great examples we may learn from. One such example is that of Tara McPherson and her colleagues, in designing the web publishing platform Scalar. As opposed to WordPress, its cousin in functionality, Scalar was designed with feminist and humanist principles in mind, allowing for more expressive, non-hierarchical “pathways” through content.

When talking of institutional, social, and web-based structures, we can also take lessons from history. In Early Modern Europe, the great network of information exchange known as the Republic of Letters was a shining example of the influence of media structures on innovation. Scholars would often communicate through “hubs”, which were personified in people nicknamed things like “the mailbox of Europe”. And they helped distribute new research incredibly efficiently through their vast web of social ties. These hubs were essential to what’s been called the scientific revolution, and without their structural role, it’s unlikely you’d see references to a scientific revolution in 17th-century Europe.

Similarly, at that time, the Atlantic slave trade was wreaking untold havoc on the world. For all the ills it caused, we at least can take some lessons from it in the intentional design of a scholarly network. There existed a rich exchange of medical knowledge between Africans and indigenous Americans that bypassed Europe entirely, taking an entirely different sort of route through early modern social networks.

If we take the present day, we see certain affordances of social networks similarly used to subvert or reconfigure power structures, as with the many revolutions in North Africa and the Middle East, or the current activist events taking place around police brutality and racism in the US. Similar tactics that piggy-back on network properties are used by governments to spread propaganda, ad agencies to spread viral videos, and so forth.

The question, then, is how we can intentionally design a community, using principles we learn from historical action, as well as modern network science, in order to subvert institutional structures in the manner raised by Cathy and David?

Certainly we also ought to take into account the research going into collaboration, teamwork, and group science. We’ve learned, for example, that teams with diverse backgrounds often come up with more creative solutions to tricky problems. We’ve learned that many small, agile groups often outperform large groups with the same number of people, and that informal discussion outside the work-space contributes in interesting ways to productivity. Many great lessons can be found in Michael Nielsen’s book, Reinventing Discovery.

We can use these historical and lab-based examples to inform the design of social networks. HASTAC already works towards this goal through its scholars program, but there are more steps that may be taken, such as strategically seeking out scholars from underrepresented parts of the network.

So this covers the science, but what about the art?

Well, I spent the entire middle half of this talk discussing how visual rhetoric is linked to ontological metaphors of knowledge. The tree metaphor of knowledge, for example, was so strongly held that it fooled Descartes into breaking his claims of mind-body dualism.

So here is where the artists in the room can also fruitfully contribute to the same goal: by literally designing a better infrastructure. Visually. Illustrations can be remarkably powerful drivers of reconceptualization, and we have the opportunity here to affect changes in the academy more broadly.

One of the great gifts of the social web, at least when it’s designed well, is its ability to let nodes on the farthest limbs of the network still wield remarkable influence over the whole structure. This is why viral videos, kickstarter projects, and cats playing pianos can become popular without “industry backing”. And the decisions we make in creating illustrations, in fostering online interactions, in designing social interfaces, can profoundly affect the way those interactions reinforce, subvert, or sidestep power structures.

So this is my call to the room: let’s revisit the discussion about designing the community we want to live in.

 

Thanks very much.

Culturomics 2: The Search for More Money

“God willing, we’ll all meet again in Spaceballs 2: The Search for More Money.” -Mel Brooks, Spaceballs, 1987

A long time ago in a galaxy far, far away (2012 CE, Indiana), I wrote a few blog posts explaining that, when writing history, it might be good to talk to historians (1,2,3). They were popular posts for the Irregular, and, inspired by Mel Brooks’ recent interest in making Spaceballs 2, I figured it was time for a sequel of my own. You know, for all the money this blog pulls in. 1


Two teams recently published very similar articles, attempting cultural comparison via a study of historical figures in different-language editions of Wikipedia. The first, by Gloor et al., is for a conference next week in Japan, and frames itself as cultural anthropology through the study of leadership networks. The second, by Eom et al. and just published in PLoS ONE, explores cross-cultural influence through historical figures who span different language editions of Wikipedia.

Before reading the reviews, keep in mind I’m not commenting on method or scientific contribution—just historical soundness. This often doesn’t align with the original authors’ intents, which is fine. My argument isn’t that these pieces fail at their goals (science is, after all, iterative), but that they would be markedly improved by adhering to the same standards of historical rigor as they adhere to in their home disciplines, which they could accomplish easily by collaborating with a historian.

The road goes both ways. If historians don’t want physicists and statisticians bulldozing through history, we ought to be open to collaborating with those who don’t have a firm grasp on modern historiography, but who nevertheless have passion, interest, and complementary skills. If the point is understanding people better, by whatever means relevant, we need to do it together.

Cultural Anthropology

“Cultural Anthropology Through the Lens of Wikipedia – A Comparison of Historical Leadership Networks in the English, Chinese, Japanese and German Wikipedia” by Gloor et al. analyzes “the historical networks of the World’s leaders since the beginning of written history, comparing them in the four different Wikipedias.”

Their method is simple (simple isn’t bad!): take each “people page” in Wikipedia, and create a network of people based on who else is linked within that page. For example, if Wikipedia’s article on Mozart links to Beethoven, a connection is drawn between them. Connections are only drawn between people whose lives overlap; for example, the Mozart (1756-1791) Wikipedia page also links to Chopin (1810-1849), but because they did not live concurrently, no connection is drawn.

Figure 1 from Gloor et al. (http://arxiv.org/ftp/arxiv/papers/1502/1502.05256.pdf)

A separate network is created for four different language editions of Wikipedia (English, Chinese, Japanese, German), because biographies in each edition are rarely exact translations, and often different people will be prominent within the same biography across all four languages. PageRank was calculated for all the people in the resulting networks, to get a sense of who the most central figures are according to the Wikipedia link structure.
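The procedure described above can be sketched in a few lines. This is a toy illustration, not Gloor et al.’s actual code: the link lists are invented (only the Mozart, Beethoven, and Chopin example comes from the description above), the links are assumed to be directed, and PageRank is computed with a plain power iteration using the conventional damping factor of 0.85, which may differ from whatever settings the authors used.

```python
# Toy data: lifespans are real, but the link lists are hypothetical.
LIFESPANS = {
    "Mozart": (1756, 1791),
    "Haydn": (1732, 1809),
    "Beethoven": (1770, 1827),
    "Chopin": (1810, 1849),
}
LINKS = {
    "Mozart": ["Haydn", "Beethoven", "Chopin"],
    "Haydn": ["Mozart", "Beethoven"],
    "Beethoven": ["Mozart", "Haydn"],
    "Chopin": ["Mozart", "Beethoven"],
}


def lives_overlap(a, b):
    """True if the two people were alive during at least one common year."""
    (a_birth, a_death), (b_birth, b_death) = LIFESPANS[a], LIFESPANS[b]
    return a_birth <= b_death and b_birth <= a_death


def build_network(links):
    """Keep only links between people pages whose lifespans overlap."""
    return {
        person: [t for t in targets if t in links and lives_overlap(person, t)]
        for person, targets in links.items()
    }


def pagerank(network, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over a dict of outgoing links."""
    n = len(network)
    rank = {person: 1.0 / n for person in network}
    for _ in range(iterations):
        new_rank = {person: (1.0 - damping) / n for person in network}
        for person, targets in network.items():
            if targets:
                share = damping * rank[person] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for other in network:
                    new_rank[other] += damping * rank[person] / n
        rank = new_rank
    return rank


network = build_network(LINKS)
ranks = pagerank(network)
for person, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{person}: {score:.3f}")
```

In this toy network the Mozart-to-Chopin link is dropped by the overlap rule, and the ranking says more about who gets linked to than about anyone’s historical weight, which is the concern raised in the critique that follows.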

“Who are the most important people of all times?” the authors ask, to which their data provides them an answer. 2 In China and Japan, they show, only warriors and politicians make the cut, whereas religious leaders, artists, and scientists made more of a mark on Germany and the English-speaking world. Historians and biographers wind up central too, given how often their names appear on the pages of famous contemporaries on whom they wrote.

Diversity is also a marked difference: 80% of the “top 50” people for the English Wikipedia were themselves non-English, whereas only 4% of the top people from the Chinese Wikipedia are not Chinese. The authors conclude that “probing the historical perspective of many different language-specific Wikipedias gives an X-ray view deep into the historical foundations of cultural understanding of different countries.”

Figure 3 from Gloor et al.

Small quibbles aside (e.g. their data include the year 0 BC, which doesn’t exist), the big issue here is the ease with which they claim these are the “most important” actors in history, and that these datasets provide an “X-ray” into the language cultures that produced them. This betrays the same naïve assumptions that plague much of culturomics research: that you can uncritically analyze convenient datasets as a proxy for analyzing larger cultural trends.

You can in fact analyze convenient datasets as a proxy for larger cultural trends; you just need some cultural awareness and a critical perspective.

In this case, several layers of assumptions are open for questioning, including:

  • Is the PageRank algorithm a good proxy for historical importance? (The answer turns out to be yes in some situations, but probably not this one.)
  • Is the link structure in Wikipedia a good proxy for historical dependency? (No, although it’s probably a decent proxy for current cultural popularity of historical figures, which would have been a better framing for this article. Better yet, these data can be used to explore the many well-known and unknown biases that pervade Wikipedia.)
  • Can differences across language editions of Wikipedia be explained by any factors besides cultural differences? (Yes. For example, editors of the German-language Wikipedia may be less likely to write a German biography if one already exists in English, given that ≈64% of Germany speaks English.)

These and other questions, unexplored in the article, make it difficult to take at face value that this study can reveal important historical actors or compare cultural norms of importance. Which is a shame, because simple datasets and approaches like this one can produce culturally and scientifically valid results that wind up being incredibly important. And the scholars working on the project are top-notch; they just don’t have all the necessary domain expertise to explore their data and questions.

Cultural Interactions

The great thing about PLoS is the quality control on its publications: there isn’t much. As long as primary research is presented, the methods are sound, the data are open, and the experiment is well-documented, you’re in.

It’s a great model: all reasonable work by reasonable people is published, and history decides whether an article is worthy of merit. Contrast this against the current model, where (let’s face it) everything gets published eventually anyway, it’s just a question of how many journal submissions and rounds of peer review you’re willing to sit through. Research sits for years waiting to be published, subject to the whims of random reviewers and editors who may hold long grudges, when it could be out there the minute it’s done, open to critique and improvement, and available to anyone to draw inspiration or to learn from someone’s mistakes.

“Interactions of Cultures and Top People of Wikipedia from Ranking of 24 Language Editions” by Eom et al. is a perfect example of this model. Do I consider it a paragon of cultural research? Obviously not, if I’m reviewing it here. Am I happy the authors published it, respectful of their attempt, and willing to use it to push forward our mutual goal of soundly-researched cultural understanding? Absolutely.

Eom et al.’s piece, similar to that of Gloor et al. above, uses links between Wikipedia people pages to rank historical figures and to make cultural comparisons. The article explores 24 different language editions of Wikipedia, and goes one step further, using the data to explore intercultural influence. Importantly, given that this is a journal-length article and not a paper from a conference proceeding like Gloor et al.’s, extra space and thought were clearly put into the cultural biases of Wikipedia across languages. That said, neither of the articles reviewed here includes any authors who identify themselves as historians or cultural experts.

This study collected data a bit differently from the last. Instead of a network connecting only those people whose lives overlapped, this network connected all pages within a single-language edition of Wikipedia, based only on links between articles. 3 They then ranked pages using a number of metrics, including but not limited to PageRank, and only then automatically extracted people to find who was the most prominent in each dataset.

In short, every Wikipedia article is linked in a network and ranked, after which all articles are culled except those about people. The authors explain: “On the basis of this data set we analyze spatial, temporal, and gender skewness in Wikipedia by analyzing birth place, birth date, and gender of the top ranked historical figures in Wikipedia.” By birth place, they mean the country currently occupying the location where a historical figure was born, such that Aristophanes, born in Byzantium 2,300 years ago, is considered Turkish for the purpose of this dataset. The authors note this can lead to cultural misattributions ≈3.5% of the time (e.g. Kant is categorized as Russian, having been born in a city now in Russian territory). They do not, however, call attention to the mutability of culture over time.

Table 2 from Eom et al.

It is unsurprising, though comforting, to note that the fairly different approach to measuring prominence yields many of the same top-10 results as Gloor’s piece: Shakespeare, Napoleon, Bush, Jesus, etc.

Analysis of the dataset resulted in several worthy conclusions:

  • Many of the “top” figures across all language editions hail from Western Europe or the U.S.
  • Language editions bias local heroes (half of top figures in Wikipedia English are from the U.S. and U.K.; half of those in Wikipedia Hindi are from India) and regional heroes (among Wikipedia Korean, many top figures are Chinese).
  • Top figures are distributed throughout time in a pattern you’d expect given global population growth, excepting periods representing foundations of modern cultures (religions, politics, and so forth).
  • The farther you go back in time, the less likely a top figure from a certain edition of Wikipedia is to have been born in that language’s region. That is, modern prominent figures in Wikipedia English are from the U.S. or the U.K., but the earlier you go, the less likely top figures are born in English-speaking regions. (I’d question this a bit, given cultural movement and mutability, but it’s still a result worth noting).
  • Women are consistently underrepresented in every measure and edition. More recent top people are more likely to be women than those from earlier years.

Figure 4 from Eom et al.

The article goes on to describe methods and results for tracking cultural influence, but this blog post is already tediously long, so I’ll leave that section out of this review.

There are many methodological limitations to their approach, but the authors are quick to notice and point them out. They mention that Linnaeus ranks so highly because “he laid the foundations for the modern biological naming scheme so that plenty of articles about animals, insects and plants point to the Wikipedia article about him.” This research was clearly approached with a critical eye toward methodology.
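The Linnaeus effect follows directly from ranking every article before culling down to people. A minimal sketch makes the mechanism visible. Everything here is an assumption for illustration: the link lists are invented, and a simple count of incoming links stands in for the several ranking metrics Eom et al. actually used.

```python
from collections import Counter

# Hypothetical toy edition of Wikipedia: article -> articles it links to.
LINKS = {
    "Carl Linnaeus": ["Botany"],
    "Botany": ["Carl Linnaeus"],
    "Honey bee": ["Carl Linnaeus"],
    "Gray wolf": ["Carl Linnaeus"],
    "Oak": ["Carl Linnaeus", "Botany"],
    "Napoleon": ["Josephine"],
    "Josephine": ["Napoleon"],
}
PEOPLE = {"Carl Linnaeus", "Napoleon", "Josephine"}


def inlink_counts(links):
    """Count incoming links per article (a stand-in for a real ranking metric)."""
    counts = Counter({article: 0 for article in links})
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


def top_people(links, people):
    """Rank every article first, then keep only the people."""
    ranked = sorted(inlink_counts(links).items(), key=lambda kv: (-kv[1], kv[0]))
    return [(article, score) for article, score in ranked if article in people]


def people_only(links, people):
    """The alternative: drop non-person articles before ranking."""
    return {
        article: [t for t in targets if t in people]
        for article, targets in links.items()
        if article in people
    }


print(top_people(LINKS, PEOPLE))
print(top_people(people_only(LINKS, PEOPLE), PEOPLE))
```

Ranked against the whole toy edition, Linnaeus comes first on the strength of the plant and animal pages pointing at him; ranked against people pages alone, he falls to last. Neither ordering is the correct one, they just measure different things.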

Eom et al. do not fare as well historically as methodologically; opportunities to frame claims more carefully, or to ask different sorts of questions, are overlooked. I mentioned earlier that the research assumes historical cultural consistency, but cultural currents intersect languages and geography at odd angles.

The fact that Wikipedia English draws significantly from other locations the earlier you look should come as no surprise. But, it’s unlikely English Wikipedians are simply looking to more historically diverse subjects; rather, the locus of some cultural current (Christianity, mathematics, political philosophy) has likely moved from one geographic region to another. This should be easy to test with their dataset by looking at geographic clustering and spread in any given year. It’d be nice to see them move in that direction next.
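As a rough idea of what such a test might look like, here is a sketch that bins top figures by century of birth and reports how concentrated each century is in a single region. The records are entirely made up, the field layout is my own assumption rather than the structure of Eom et al.’s dataset, and a real analysis would want actual coordinates and a proper measure of spread rather than this crude share.

```python
from collections import Counter, defaultdict

# Entirely made-up records: (label, birth year, present-day birth region).
FIGURES = [
    ("figure A", -350, "Greece"),
    ("figure B", -320, "Greece"),
    ("figure C", -310, "Egypt"),
    ("figure D", 1210, "Italy"),
    ("figure E", 1250, "France"),
    ("figure F", 1280, "England"),
    ("figure G", 1810, "England"),
    ("figure H", 1830, "England"),
    ("figure I", 1850, "United States"),
    ("figure J", 1870, "England"),
]


def century_of(year):
    """Start year of the 100-year bin containing `year` (e.g. 1830 -> 1800)."""
    return (year // 100) * 100


def regional_concentration(figures):
    """For each century, the most common birth region and its share of figures."""
    by_century = defaultdict(Counter)
    for _, year, region in figures:
        by_century[century_of(year)][region] += 1
    summary = {}
    for century, regions in sorted(by_century.items()):
        region, count = regions.most_common(1)[0]
        summary[century] = (region, count / sum(regions.values()))
    return summary


for century, (region, share) in regional_concentration(FIGURES).items():
    print(f"{century}: {region} ({share:.0%} of top figures)")
```

If the locus of a cultural current really does migrate, the dominant region should shift from one century to the next while the concentration stays fairly high, which is a different signature from editors simply sampling more widely.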

I do appreciate that they tried to validate their method by comparing their “top people” to lists other historians have put together. Unfortunately, the only non-Wikipedia-based comparison they make is to a book written by an astrophysicist and white separatist with no historical training: “To assess the alignment of our ranking with previous work by historians, we compare it with [Michael H.] Hart’s list of the top 100 people who, according to him, most influenced human history.”

Top People

Both articles claim that an algorithm analyzing Wikipedia networks can compare cultures and discover the most important historical actors, though neither defines what they mean by “important.” The claim rests on the notion that Wikipedia’s grand scale and scope smooths out enough authorial bias that analyses of Wikipedia can inductively lead to discoveries about Culture and History.

And critically approached, that notion is more plausible than historians might admit. These two reviewed articles, however, don’t bring that critique to the table. 4 In truth, the dataset and analysis let us look through a remarkably clear mirror into the cultures that created Wikipedia, the heroes they make, and the roots to which they feel most connected.

Usefully for historians, there is likely much overlap between history and the picture Wikipedia paints of it, but the nature of that overlap needs to be understood before we can use Wikipedia to aid our understanding of the past. Without that understanding, boldly inductive claims about History and Culture risk reinforcing the same systemic biases which we’ve slowly been trying to fix. I’m absolutely certain the authors don’t believe that only 5% of history’s most important figures were women, but the framing of the articles does nothing to disabuse readers of this notion.

Eom et al. themselves admit “[i]t is very difficult to describe history in an objective way,” which I imagine is a sentiment we can all get behind. They may find an easier path forward in the company of some historians.

Notes:

  1. net income: -$120/year.
  2. If you’re curious, the 10 most important people in the English-speaking world, in order, are George W. Bush, ol’ Willy Shakespeare, Sidney Lee, Jesus, Charles II, Aristotle, Napoleon, Muhammad, Charlemagne, and Plutarch.
  3. Download their data here.
  4. Actually the Eom et al. article does raise useful critiques, but mentioning them without addressing them doesn’t really help matters.

[Review] The Book of Trees, Manuel Lima

The first line on the first page of The Book of Trees is “This is the book I wish had been available when I was researching my previous book, Visual Complexity: Mapping Patterns of Information.” It’s funny, because this is also the book I wish had been available when I was researching my own project, Knowledge Uprooted. It took Alberto Cairo, reading over a draft of my article, to point out that Manuel Lima was working on a book-length version of a very similar project. If the book had come out a year ago, my own research might look very different. Lima’s book is beautifully designed, well-researched, and a delightful resource for anyone interested in visualizations of knowledge.

Tree of Consanguinity, ca. 1450-1510. Page 52.

Lima’s book is a history of hierarchical visualizations, most frequently as trees, and often representing branches of knowledge. He roots his narrative in trees themselves, describing how their symbolism has touched religions and cultures for millennia. The narrative weaves through Ancient Greece and Medieval Europe, makes a few stops outside of the West and winds its way to the present day. Subsequent chapters are divided into types of tree visualizations: figurative, vertical, horizontal, multidirectional, radial, hyperbolic, rectangular, voronoi, circular, sunbursts, and icicles. Each chapter presents a chronological set of beautiful examples embodying that type.

Biblical Genealogy, ca. 1060. Page 112.

Of course, any project with such a wide scope is bound to gloss over or inaccurately portray some of its historical content. I’d quibble, for example, with Lima’s suggestion that the use of these visual diagrams could be understood in the context of ars memorativa, a method for improving memory and understanding in the Middle Ages. Instead, I’d argue that the tradition stemmed from a more innate Aristotelian connection between thinking and seeing. Lima also argues that the scala naturae, depictions of entities on a natural order rising to God, is an obvious reflection on contemporary feudal stratification. The story is a bit more complex than that, with feudal stratification itself being concomitant to the medieval worldview of a natural order. In discussing Ramon Llull, Lima oddly writes “the notion of a unified trunk of science has remained to this day,” a claim which Lima himself shows isn’t exactly true in his earlier book, Visual Complexity. But this isn’t a book written by or for historians, and that’s okay—it’s accurate enough to get a good sense of the progression of trees.

The Blog Tree, 2012. Page 77.

Where the book shines is in its clear, well-cited, contextualized illustrations, which comprise the majority of its contents. Over a hundred illustrations pack the book, each with at least a paragraph of description and, in many cases, translation. This is a book for people passionate about visualizations, and interested in their history. There is not yet a book-length treatment for historians interested in this subject, though Murdoch’s Album of Science (1984) comes close. For those who want to delve even deeper into this history, I’ve compiled a 100+ reference bibliography that is freely available here.

Understanding Special Relativity through History and Triangles (pt. 1)

We interrupt this usually-DH blog because I got in a discussion about Special Relativity with a friend, and promised it was easily understood using only the math we use for triangles. But I’m a historian, so I can’t leave a good description alone without some background.

If you just want to learn how relativity works, skip ahead to the next post, Relativity Made Simple [Note! I haven’t written it yet; this is a two-part post. Stay tuned for the next section]; if you hate science and don’t want to know how the universe functions, but love history, read only this post. If you have a month of time to kill, just skip this post entirely and read through my 122-item relativity bibliography on Zotero. Everyone else, disregard this paragraph.

An Oddly Selective History of Relativity

This is not a history of how Einstein came up with his Theory of Special Relativity as laid out in Zur Elektrodynamik bewegter Körper in 1905. That history is filled with big words like aberration and electrodynamics, and equations with occult symbols. We don’t need to know that stuff. This is a history of how others understood relativity. Eventually, you’re going to understand relativity, but first I’m going to tell you how other people, much smarter than you, did not.

There’s an infamous (potentially mythical) story about how difficult it is to understand relativity: Arthur Eddington, a prominent astronomer, was asked whether it was true that only three people in the world understood relativity. After pausing for a moment, Eddington replied “I’m trying to think who the third person is!” This was about General Relativity, but it was also a joke: good scientists know relativity isn’t incredibly difficult to grasp, and even early on, lots of people could claim to understand it.

Good historians, however, know that’s not the whole story. It turns out a lot of people who thought they understood Einstein’s conceptions of relativity actually did not, including those who agreed with him. This, in part, is that story.

Relativity Before Einstein

Einstein’s special theory of relativity relied on two assumptions: (1) you can’t ever tell whether you’re standing still or moving at a constant velocity (or, in physics-speak, the laws of physics in any inertial reference frame are indistinguishable from one another), and (2) light always looks like it’s moving at the same speed (in physics-speak, the speed of light is always constant no matter the velocity of the emitting body nor that of the observer’s inertial reference frame). Let’s trace these concepts back.

Our story begins in the 14th century. William of Occam, famous for his razor, claimed motion was merely the location of a body and its successive positions over time; motion itself was in the mind. Because position was simply defined in terms of the bodies that surround it, this meant motion was relative. Occam’s student, Buridan, pushed that claim forward, saying “If anyone is moved in a ship and imagines that he is at rest, then, should he see another ship which is truly at rest, it will appear to him that the other ship is moved.”

Galileo’s relativity [via]. The site where this comes from is a little crazy, but the figure is still useful, so here it is.

The story moves forward at irregular speed (much like the speed of this blog, and the pacing of this post). Within a century, scholars introduced the concepts of an infinite universe without any center, nor any other ‘absolute’ location. Copernicus cleverly latched onto this relativistic thinking by showing that the math works just as well, if not better, when the Earth orbits the Sun, rather than vice versa. Galileo claimed there was no way, on the basis of mechanical experiments, to tell whether you were standing still or moving at a uniform speed.

For his part, Descartes disagreed, but did say that the only way one could discuss movement was relative to other objects. Christian Huygens took Descartes a step further, showing that there are no ‘privileged’ motions or speeds (that is, there is no intrinsic meaning of a universal ‘at rest’ – only ‘at rest’ relative to other bodies). Isaac Newton knew that it was impossible to measure something’s absolute velocity (rather than velocity relative to an observer), but still, like Descartes, supported the idea that there was an absolute space and absolute velocity – we just couldn’t measure it.

Let’s skip ahead some centuries. The year is 1893; the U.S. Supreme Court declared the tomato was a vegetable, Gandhi campaigned against segregation in South Africa, and the U.S. railroad industry bubble had just popped, forcing the government to bail out AIG for $85 billion. Or something. Also, by this point, most scientists thought light traveled in waves. Given that in order for something to travel in a wave, something has to be waving, scientists posited there was this luminiferous ether that pervaded the universe, allowing light to travel between stars and candles and those fish with the crazy headlights. It makes perfect sense. In order for sound waves to travel, they need air to travel through; in order for light waves to travel, they need the ether.

Ernst Mach, a philosopher read by many contemporaries (including Einstein), said that Newton and Descartes were wrong: absolute space and absolute motion are meaningless. It’s all relative, and only relative motion has any meaning. It is both physically impossible to measure an object’s “real” velocity, and also philosophically nonsensical. The ether, however, was useful. According to Mach and others, we could still measure something kind of like absolute position and velocity by measuring things in relationship to that all-pervasive ether. Presumably, the ether was just sitting still, doing whatever ether does, so we could use its stillness as a reference point and measure how fast things were going relative to it.

Well, in theory. Earth is hurtling through space, orbiting the sun at about 70,000 miles per hour, right? And it’s spinning too, at about a thousand miles an hour. But the ether is staying still. And light, supposedly, always travels at the same speed through the ether no matter what. So in theory, light should look like it’s moving a bit faster if we’re moving toward its source, relative to the ether, and a bit slower, if we’re moving away from it, relative to the ether. It’s just like if you’re in a train hurtling toward a baseball pitcher at 100 mph, and the pitcher throws a ball at you, also at 100 mph, in a futile attempt to stop the train. To you, the baseball will look like it’s going twice as fast, because you’re moving toward it.
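To put rough numbers on the baseball analogy, here is a small sketch of the pre-Einstein expectation, in which speeds simply add. The function name and the unit conversions are my own; the 70,000 mph figure is the approximate orbital speed quoted above, and the speed of light is the modern defined value.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458      # modern defined value, km/s
MPH_TO_KM_S = 1.609344 / 3600.0        # miles per hour -> kilometres per second


def closing_speed(observer_speed, object_speed):
    """Galilean (pre-Einstein) speed of an object approaching a moving observer head-on."""
    return observer_speed + object_speed


# The train and the pitcher: both speeds in mph.
baseball = closing_speed(100.0, 100.0)

# The same reasoning applied to light in a stationary ether.
earth_speed = 70_000 * MPH_TO_KM_S
toward_source = closing_speed(earth_speed, SPEED_OF_LIGHT_KM_S)
away_from_source = closing_speed(-earth_speed, SPEED_OF_LIGHT_KM_S)
fractional_shift = earth_speed / SPEED_OF_LIGHT_KM_S

print(f"baseball as seen from the train: {baseball:.0f} mph")
print(f"expected light speed, moving toward the source: {toward_source:.1f} km/s")
print(f"expected light speed, moving away from it: {away_from_source:.1f} km/s")
print(f"fractional difference from c: {fractional_shift:.1e}")
```

The baseball doubles its apparent speed, but the expected change for light is only about one part in ten thousand, which gives a sense of how delicate the instruments had to be.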

The earth moving through the ether. [via]

It turns out measuring the speed of light in relation to the ether was really difficult. A bunch of very clever people made a bunch of very clever instruments which really should have measured the speed of earth moving through the ether, based on small observed differences of the speed of light going in different directions, but the experiments always showed light moving at the same speed. Scientists figured this must mean the earth was actually exerting a pull on the ether in its vicinity, dragging it along with it as the earth hurtled through space, explaining why light seemed to be constant in both directions when measured on earth. They devised even cleverer experiments that would account for such an ether drag, but even those seemed to come up blank. Their instruments, it was decided, simply were not yet fine-tuned enough to measure such small variations in the speed of light.

Not so fast! shouted Lorentz, except he shouted it in Dutch. Lorentz used the new electromagnetic theory to suggest that the null results of the ether experiments were actually a result, not of the earth dragging the ether along behind it, but of physical objects compressing when they moved against the ether. The experiments weren’t showing any difference in the speed of light they sought because the measuring instruments themselves contracted to just the right length to perfectly offset the difference in the velocity of light, when measuring “into” the ether. The ether was literally squeezing the electrons in the meter stick together so it became a little shorter; short enough to inaccurately measure light’s speed. The set of equations used to describe this effect became known as Lorentz Transformations. One property of these transformations was that the physical contractions would, obviously, appear the same to any observer. No matter how fast you were going relative to your measuring device, if it were moving into the ether, you would see it contracting slightly to accommodate the measurement difference.
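For a sense of scale, the contraction in question can be computed from the standard textbook formula, in which a length shrinks by a factor of the square root of 1 − v²/c² along the direction of motion. This is the modern form rather than Lorentz’s own notation, and the 30 km/s figure is my rounding of Earth’s orbital speed.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792.458  # modern defined value, km/s


def contracted_length(rest_length, speed_km_s):
    """Length along the direction of motion, per the standard contraction formula."""
    beta = speed_km_s / SPEED_OF_LIGHT_KM_S
    return rest_length * math.sqrt(1.0 - beta ** 2)


# A one-metre stick carried along at roughly Earth's orbital speed.
stick = contracted_length(1.0, 30.0)
shrinkage = 1.0 - stick
print(f"a metre stick shortens by about {shrinkage:.1e} metres")
```

That works out to a few billionths of a metre, far too small to notice directly but just enough to cancel the expected difference in light’s travel time.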

Not so fast! shouted Poincaré, except he shouted it in French. This property of transformations to always appear the same, relative to the ether, was actually a problem. Remember that 500 years of physics that said there is no way to mechanically determine your absolute speed or absolute location in space? Yeah, so did Poincaré. He said the only way you could measure velocity or location was matter-to-matter, not matter-to-ether, so the Lorentz transformations didn’t fly.

It’s worth taking a brief aside to talk about the underpinnings of the theories of both Lorentz and Poincaré. Their theories were based on experimental evidence, which is to say, they based their reasoning about contraction on apparent experimental evidence of said contraction, and they based their theories of relativity on experimental evidence of motion being relative.

Einstein and Relativity

When Einstein hit the scene in 1905, he approached relativity a bit differently. Instead of trying to fit the apparent contraction of objects from the ether drift experiment to a particular theory, Einstein began with the assumption that light always appeared to move at the same rate, regardless of the relative velocity of the observer. The other assumption he began with was that there was no privileged frame of reference; no absolute space or velocity, only the movement of matter relative to other matter. I’ll work out the math later, but, unsurprisingly, it turned out that working out these assumptions led to exactly the same transformation equations as Lorentz came up with experimentally.

The math was the same. The difference was in the interpretation of the math. Einstein’s theory required no ether, but what’s more, it did not require any physical explanations at all. Because Einstein’s theory of special relativity rested on two postulates about measurement, the theory’s entire implications rested in its ability to affect how we measure or observe the universe. Thus, the interpretation of objects “contracting,” under Einstein’s theory, was that they were not contracting at all. Instead, objects merely appear as though they contract relative to the movement of the observer. Another result of these transformation equations is that, from the perspective of the observer, time appears to move slower or faster depending on the relative speed of what is being observed. Lorentz’s theory predicted the same time dilation effects, but he just chalked it up to a weird result of the math that didn’t actually manifest itself. In Einstein’s theory, however, weird temporal stretching effects were Actually What Was Going On.
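For reference, here is the transformation in its standard modern textbook form, not in the notation Lorentz or Einstein used. The symbols are my own assumptions for illustration: an observer moves at velocity v along the x axis, c is the speed of light, and primed quantities are what that moving observer measures.

```latex
\begin{aligned}
x' &= \gamma\,(x - vt), \\
t' &= \gamma\left(t - \frac{v x}{c^{2}}\right), \\
\gamma &= \frac{1}{\sqrt{1 - v^{2}/c^{2}}}, \\
L &= \frac{L_{0}}{\gamma} \quad \text{(length contraction)}, \\
\Delta t &= \gamma\,\Delta t_{0} \quad \text{(time dilation)}.
\end{aligned}
```

Here L_0 and Δt_0 are the length and duration measured at rest with respect to the object or clock. Both readings of "contraction" discussed above use these same equations; they differ only in what the equations are taken to describe.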

To reiterate: the math of Lorentz, Einstein, and Poincaré was (at least at this early stage) essentially equivalent. The upshot was that no experimental result could favor one theory over another. The observational predictions of each theory were exactly the same.

Relativity’s Supporters in America

I’m focusing on America here because it’s rarely focused on in the historiography, and it’s about time someone did. If I were being scholarly and citing my sources, this might actually be a novel contribution to historiography. Oh well, BLOG! All my primary sources are in that Zotero library I linked to earlier.

In 1910, Daniel Comstock wrote a popular account of the relativity of Lorentz and Einstein, to some extent conflating the two. He suggested that if Einstein’s postulates could be experimentally verified, his special theory of relativity would be true. “If either of these postulates be proved false in the future, then the structure erected can not be true in its present form. The question is, therefore, an experimental one.” Comstock’s statement betrays a misunderstanding of Einstein’s theory, though, because, at the time of that writing, there was no experimental difference between the two theories.

Gilbert Lewis and Richard Tolman presented a paper at the 1908 American Physical Society meeting in New York, where they described themselves as fully behind Einstein over Lorentz. Oddly, they considered Einstein’s theory to be correct, as opposed to Lorentz’s, because his postulates were “established on a pretty firm basis of experimental fact.” Which, to reiterate, couldn’t possibly have been a difference between Lorentz and Einstein. Odder still, they presented the theory not as one of physics or of measurement, but of psychology (a bit like 14th century Oresme). The two went on to separately write a few articles which supposedly experimentally confirmed the postulates of special relativity.

In fact, the few Americans who did seem to engage with the actual differences between Lorentz and Einstein did so primarily in critique. Louis More, a well-respected physicist from Cincinnati, labeled the difference as metaphysical and primarily useless. This American critique was fairly standard.

At the 1909 American Physical Society meeting in Boston, one physicist (Harold Wilson) claimed his experiments showed the difference between Einstein and Lorentz. One of the few truly theoretical American physicists, W.S. Franklin, was in attendance, and the lectures he saw inspired him to write a popular account of relativity in 1911; in it, he found no theoretical difference between Lorentz and Einstein. He tended to side theoretically with Einstein, but assumed Lorentz’s theory implied the same space and time dilation effects, which it did not.

Even this series of misunderstandings should be taken as shining examples in the context of an American approach to theoretical physics that was largely antagonistic, at times decrying theoretical differences entirely. At a symposium on Ether Theories at the 1911 APS, the presidential address by William Magie was largely about the uselessness of relativity because, according to him, physics should be a functional activity based in utility and experimentation. Joining Magie’s “side” in the debate were Michelson, Morley, and Arthur Gordon Webster, the co-founder of the American Physical Society. Of those at the meeting supporting relativity, Lewis was still convinced Einstein differed experimentally from Lorentz, and Franklin and Comstock each felt there was no substantive difference between the two. In 1912, Indiana University’s R.D. Carmichael stated Einstein’s postulates were “a direct generalization from experiment.” In short, the Americans were really focused on experiment.

Of Einstein’s theory, Louis More wrote in 1912:

Professor Einstein’s theory of Relativity [… is] proclaimed somewhat noisily to be the greatest revolution in scientific method since the time of Newton. That [it is] revolutionary there can be no doubt, in so far as [it] substitutes mathematical symbols as the basis of science and denies that any concrete experience underlies these symbols, thus replacing an objective by a subjective universe. The question remains whether this is a step forward or backward […] if there is here any revolution in thought, it is in reality a return to the scholastic methods of the Middle Ages.

More goes on to say how the “Anglo-Saxons” demand practical results, not the unfathomable theories of “the German mind.” Really, that quote about sums it up. By this point, the only Americans who even talked about relativity were the ones who trained in Germany.

I’ll end here, where most histories of the reception of relativity begin: the first Solvay Conference. It’s where this beautiful picture was taken.

First Solvay Conference. [via]
To sum up: in the seven years following Einstein’s publication, the only Americans who agreed with Einstein were ones who didn’t quite understand him. You, however, will understand it much better, if you only read the next post [coming this week!].

Do historians need scientists?

[edit: I’m realizing I didn’t make it clear in this post that I’m aware many historians consider themselves scientists, and that there’s plenty of scientific historical archaeology and anthropology. That’s exactly what I’m advocating there be more of, and more varied.]

Short Answer: Yes.

Less Snarky Answer: Historians need to be flexible to fresh methods, fresh perspectives, and fresh blood. Maybe not that last one, I guess, as it might invite vampires. Okay, I suppose this answer wasn’t actually less snarky.

Long Answer

The long answer is that historians don’t necessarily need scientists, but that we do need fresh scientific methods. Perhaps as an accident of our association with the ill-defined “humanities”, or as a result of our being placed in an entirely different culture (see: C.P. Snow), most historians seem fairly content with methods rooted in thinking about text and other archival evidence. This isn’t true of all historians, of course – there are economic historians who use statistics, historians of science who recreate old scientific experiments, classical historians who augment their research with archaeological findings, archival historians who use advanced ink analysis, and so forth. But it wouldn’t be stretching the truth to say that, for the most part, historiography is the practice of thinking cleverly about words to make more words.

I’ll argue here that our reliance on traditional methods (or maybe more accurately, our odd habit of rarely discussing method) is crippling historiography, and is making it increasingly likely that the most interesting and innovative historical work will come from non-historians. Sometimes these studies are ill-informed, especially when the authors decide not to collaborate with historians who know the subject, but to claim that a few ignorant claims about history negate the impact of these new insights is an exercise in pedantry.

In defending the humanities, we like to say that scientists and technologists with liberal arts backgrounds are more well-rounded, better citizens of the world, more able to contextualize their work. Non-humanists benefit from a liberal arts education in pretty much all the ways that are impossible to quantify (and thus, extremely difficult to defend against budget cuts). We argue this in the interest of rounding a person’s knowledge, to make them aware of their past, of their place in a society with staggering power imbalances and systemic biases.

Humanities departments should take a page from their own books. Sure, a few general ed requirements force some basic science and math… but I got an undergraduate history degree in a nice university, and I’m well aware how little STEM I actually needed to get through it. Our departments are just as guilty of narrowness as those of our STEM colleagues, and often because of it, we rely on applied mathematicians, statistical physicists, chemists, or computer scientists to do our innovative work for (or sometimes, thankfully, with) us.

Of course, there’s still lots of innovative work to be done from a textual perspective. I’m not downplaying that. Not everyone needs to use crazy physics/chemistry/computer science/etc. methods. But there’s a lot of low hanging fruit at the intersection of historiography and the natural sciences, and we’re not doing a great job of plucking it.

The story below is illustrative.

Gutenberg

Last night, Blaise Agüera y Arcas presented his research on Gutenberg to a packed house at our rare books library. He’s responsible for a lot of the cool things that have come out of Microsoft in the last few years, and just got a job at Google, where presumably he will continue to make cool things. Blaise has degrees in physics and applied mathematics. And, a decade ago, Blaise and historian/librarian Paul Needham sent ripples through the History of the Book community by showing that Gutenberg’s press did not work at all the way people expected.

It was generally assumed that Gutenberg employed a method called punchcutting in order to create a standard font. A letter carved into a metal rod (a “punch”) would be driven into a softer metal (a “matrix”) in order to create a mold. The mold would be filled with liquid metal which hardened to form a small block of a single letter (a “type”), which would then be loaded onto the press next to other letters, inked, and then impressed onto a page. Because the mold was metal, many duplicate “types” could be made of the same letter, thus allowing many uses of the same letter to appear identical on a single pressed page.

Punch matrix system. [via]
Type to be pressed. [via]
This process is what allowed all the duplicate letters to appear identical in Gutenberg’s published books. Except, of course, careful historians of early print noticed that letters weren’t, in fact, identical. In the 1980s, Paul Needham and a colleague attempted to produce an inventory of all the different versions of letters Gutenberg used, but they stopped after frequently finding 10 or more obviously distinct versions of the same letter.

Needham’s inventory of Gutenberg type. [via]
This was perplexing, but the subject was bracketed away for a while, until Blaise Agüera y Arcas came to Princeton and decided to work with Needham on the problem. Using extremely high-resolution imaging techniques, Blaise noted that there were in fact hundreds of versions of every letter. Not only that, there were actually variations and regularities in the smaller elements that made up letters. For example, an “n” was formed by two adjacent vertical lines, but occasionally the two vertical lines seem to have flipped places entirely. The extremely basic letter “i” itself had many variations, but within those variations, many odd self-similarities.

Variations in the letter “i” in Gutenberg’s type. [via]
Historians had, until this analysis, assumed most letter variations were due to wear of the type blocks. This analysis blew that hypothesis out of the water. These “i”s were clearly not all made in the same mold; but then, how had they been made? To answer this, they looked even closer at the individual letters.

 

Close up of Gutenberg letters, with light shining through page. [via]
It’s difficult to see at first glance, but they found something a bit surprising. The letters appeared to be formed of overlapping smaller parts: a vertical line, a diagonal box, and so forth. The below figure shows a good example of this. The glyphs on the bottom have a stem dipping below the bottom horizontal line, while the glyphs at the top do not.

Abbreviation of ‘per’. [via]
The conclusion Needham and Agüera y Arcas drew, eventually, was that the punchcutting method must not have been used for Gutenberg’s early material. Instead, a set of carved “strokes” were pushed into hard sand or soft clay, configured such that the strokes would align to form various letters, not unlike the formation of cuneiform. This mold would then be used to cast letters, creating the blocks we recognize from movable type. The catch is that this soft clay could only cast letters a few times before it became unusable and would need to be recreated. As Gutenberg needed multiple instances of individual letters per page, many of those letters would be cast from slightly different soft molds.

Low-Hanging Fruit

At the end of his talk, Blaise made an offhand comment: how is it that historians/bibliographers/librarians have been looking at these Gutenbergs for so long, discussing the triumph of their identical characters, and not noticed that the characters are anything but uniform? Or, of those who had noticed it, why hadn’t they raised any red flags?

The insights they produced weren’t staggering feats of technology. He used a nice camera, a light shining through the pages of an old manuscript, and a few simple image recognition and clustering algorithms. The clustering part could even have been done by hand, and actually had been, by Paul Needham. And yes, it’s true, everything is obvious in hindsight, but there were a lot of eyes on these bibles, and odds are if some of them had been historians who were trained in these techniques, this insight could have come sooner. Every year students do final projects and theses and dissertations, but what percent of those use techniques from outside historiography?

In short, there are a lot of very basic assumptions we make about the past that could probably be updated significantly if we had the right skillset, or knew how to collaborate with those who did. I think people like William Newman, who performs Newton’s alchemical experiments, are on the right track. As is Shawn Graham, who reanimates the trade networks of ancient Rome using agent-based simulations, or Devon Elliott, who creates computational and physical models of objects from the history of stage magic. Elliott’s models have shown that certain magic tricks couldn’t possibly have worked as they were described.

The challenge is how to encourage this willingness to reach outside traditional historiographic methods to learn about the past. Changing curricula to be more flexible is one way, but that is a slow and institutionally difficult process. Perhaps faculty could assign group projects to students taking their gen-ed history courses, encouraging disciplinary mixes and non-traditional methods. It’s an open question, and not an easy one, but it’s one we need to tackle.

Bridging Token and Type

There’s an oft-spoken and somewhat strawman tale of how the digital humanities is bridging C.P. Snow’s “Two Cultures” divide, between the sciences and the humanities. This story is sometimes true (it’s fun putting together Ocean’s Eleven-esque teams comprising every discipline needed to get the job done) and sometimes false (plenty of people on either side still view the other with skepticism), but as a historian of science, I don’t find the divide all that interesting. As Snow’s title suggests, this divide is first and foremost cultural. There’s another overlapping divide, a bit more epistemological, methodological, and ontological, which I’ll explore here. It’s the nomothetic (type) / idiographic (token) divide, and I’ll argue here that not only are its barriers falling, but also that the distinction itself is becoming less relevant.

Nomothetic (Greek for “establishing general laws”-ish) and Idiographic (Greek for “pertaining to the individual thing”-ish) approaches to knowledge have often split the sciences and the humanities. I’ll offload the hard work onto Wikipedia:

Nomothetic is based on what Kant described as a tendency to generalize, and is typical for the natural sciences. It describes the effort to derive laws that explain objective phenomena in general.

Idiographic is based on what Kant described as a tendency to specify, and is typical for the humanities. It describes the effort to understand the meaning of contingent, unique, and often subjective phenomena.

These words are long and annoying to keep retyping, and so in the longstanding humanistic tradition of using new words for words which already exist, henceforth I shall refer to nomothetic as type and idiographic as token. 1 I use these because a lot of my digital humanities readers will be familiar with their use in text mining. If you counted the number of unique words in a text, you’d be counting the number of types. If you counted the number of total words in a text, you’d be counting the number of tokens, because each token (word) is an individual instance of a type. You can think of a type as the platonic ideal of the word (notice the word typical?), floating out there in the ether, and every time it’s actually used, it’s one specific token of that general type.
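For readers who haven’t done any text mining, the counting described above can be sketched in a few lines of Python. The function name and the tokenizer are my own assumptions for illustration: the tokenizer is deliberately crude (lowercase everything, treat each run of letters or apostrophes as a word), and a real project would make more careful choices.

```python
import re
from collections import Counter


def count_types_and_tokens(text):
    """Return (number of types, number of tokens) for a text.

    Assumption: a "word" is any run of letters or apostrophes,
    compared case-insensitively. This is a toy tokenizer.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    # Each distinct word is a type; the counts record how many
    # tokens (individual occurrences) belong to each type.
    occurrences_per_type = Counter(tokens)
    return len(occurrences_per_type), len(tokens)


sample = "the cat saw the dog and the dog saw the cat"
n_types, n_tokens = count_types_and_tokens(sample)
print(n_types, n_tokens)  # prints "5 11"
```

The sample sentence has eleven tokens but only five types, because “the” alone accounts for four of the tokens.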

The Token/Type Distinction

Usually the natural and social sciences look for general principles or causal laws, of which the phenomena they observe are specific instances. A social scientist might note that every time a student buys a $500 textbook, they actively seek a publisher to punch, but when they purchase $20 textbooks, no such punching occurs. This leads to the discovery of a new law linking student violence with textbook prices. It’s worth noting that these laws can be, and often are, nuanced and carefully crafted, with an awareness that they are neither wholly deterministic nor ironclad.

[via]
The humanities (or at least history, which I’m more familiar with) are more interested in what happened than in what tends to happen. Without a doubt there are general theories involved, just as in the social sciences there are specific instances, but the intent is most often to flesh out details and create a particular internally consistent narrative. They look for tokens where the social scientists look for types. Another way to look at it is that the humanist wants to know what makes a thing unique, and the social scientist wants to know what makes a thing comparable.

It’s been noted these are fundamentally different goals. Indeed, how can you in the same research articulate the subjective contingency of an event while simultaneously using it to formulate some general law, applicable in all such cases? Rather than answer that question, it’s worth taking time to survey some recent research.

A recent digital humanities panel at MLA elicited responses by Ted Underwood and Haun Saussy, of which this post is in part itself a response. One of the papers at the panel, by Long and So, explored the extent to which haiku-esque poetry preceded what is commonly considered the beginning of haiku in America by about 20 years. They do this by teaching the computer the form of the haiku, and having it algorithmically explore earlier poetry looking for similarities. Saussy comments on this work:

[…] macroanalysis leads us to reconceive one of our founding distinctions, that between the individual work and the generality to which it belongs, the nation, context, period or movement. We differentiate ourselves from our social-science colleagues in that we are primarily interested in individual cases, not general trends. But given enough data, the individual appears as a correlation among multiple generalities.

One of the significant difficulties faced by digital humanists, and a driving force behind critics like Johanna Drucker, is the fundamental opposition between the traditional humanistic value of stressing subjectivity, uniqueness, and contingency, and the formal computational necessity of filling a database with hard decisions. A database, after all, requires you to make a series of binary choices in well-defined categories: is it or isn’t it an example of haiku? Is the author a man or a woman? Is there an author or isn’t there an author?

Underwood addresses this difficulty in his response:

Though we aspire to subtlety, in practice it’s hard to move from individual instances to groups without constructing something like the sovereign in the frontispiece for Hobbes’ Leviathan – a homogenous collection of instances composing a giant body with clear edges.

But he goes on to suggest that the initial constraint of the digital media may not be as difficult to overcome as it appears. Computers may even offer us a way to move beyond the categories we humanists use, like genre or period.

Aren’t computers all about “binary logic”? If I tell my computer that this poem both is and is not a haiku, won’t it probably start to sputter and emit smoke?

Well, maybe not. And actually I think this is a point that should be obvious but just happens to fall in a cultural blind spot right now. The whole point of quantification is to get beyond binary categories — to grapple with questions of degree that aren’t well-represented as yes-or-no questions. Classification algorithms, for instance, are actually very good at shades of gray; they can express predictions as degrees of probability and assign the same text different degrees of membership in as many overlapping categories as you like.
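Underwood’s point about degrees of membership can be made concrete with a toy sketch. Everything in it is hypothetical: the feature names, the weights, and the two categories are invented for illustration, and no real classifier was trained. It only shows the mechanism by which a score becomes a probability, and how one text can sit partly inside several overlapping categories at once.

```python
import math


def sigmoid(score):
    """Squash any real-valued score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def membership(features, category_weights):
    """Degree of membership of one text in one category.

    Each category is scored independently, so the degrees across
    categories need not sum to one: a poem can be mostly a haiku
    and also somewhat imagist.
    """
    score = sum(
        category_weights.get(name, 0.0) * value
        for name, value in features.items()
    )
    return sigmoid(score)


# Hypothetical features of a single poem (made up for illustration).
poem = {"about_17_syllables": 1.0, "season_word": 1.0, "end_rhyme": 0.0}

# Hypothetical weights, standing in for what a trained model would learn.
weights = {
    "haiku": {"about_17_syllables": 1.5, "season_word": 1.0, "end_rhyme": -2.0},
    "imagist": {"about_17_syllables": 0.2, "season_word": 0.3, "end_rhyme": -0.5},
}

for category, category_weights in weights.items():
    print(category, round(membership(poem, category_weights), 2))
```

Nothing sputters or emits smoke: the poem simply gets a number between 0 and 1 for each category.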

Here we begin to see how the questions asked of digital humanists (on the one side; computational social scientists are tackling these same problems) are forcing us to reconsider the divide between the general and the specific, as well as the meanings of categories and typologies we have traditionally taken for granted. However, this does not yet cut across the token/type divide: this has gotten us to the macro scale, but it does not address general principles or laws that might govern specific instances. Historical laws are a murky subject, prone to inducing fits of anti-deterministic rage. Complex Systems Science and the lessons we learn from Agent-Based Modeling, I think, offer us a way past that dilemma, but more on that later.

For now, let’s talk about influence. Or diffusion. Or intertextuality. 2 Matthew Jockers has been exploring these concepts, most recently in his book Macroanalysis. The undercurrent of his research (I think I’ve heard him call it his “dangerous idea”) is a thread of almost-determinism. It is the simple idea that an author’s environment influences her writing in profound and easy to measure ways. On its surface it seems fairly innocuous, but it’s tied into a decades-long argument about the role of choice, subjectivity, creativity, contingency, and determinism. One word that people have used to get around the debate is affordances, and it’s as good a word as any to invoke here. What Jockers has found is a set of environmental conditions which afford certain writing styles and subject matters to an author. It’s not that authors are predetermined to write certain things at certain times, but that a series of factors combine to make the conditions ripe for certain writing styles, genres, etc., and not for others. The history of science analog would be the idea that, had Einstein never existed, relativity and quantum physics would still have come about; perhaps not as quickly, and perhaps not from the same person or in the same form, but they were ideas whose time had come. The environment was primed for their eventual existence. 3

An example of shape affording certain actions by constraining possibilities and influencing people. [via]
It is here we see the digital humanities battling with the token/type distinction, and finding that distinction less relevant to its self-identification. It is no longer a question of whether one can impose or generalize laws on specific instances, because the axes of interest have changed. More and more, especially under the influence of new macroanalytic methodologies, we find that the specific and the general contextualize and augment each other.

The computational social sciences are converging on a similar shift. Jon Kleinberg likes to compare some old work by Stanley Milgram 4, where he had people draw maps of cities from memory, with digital city reconstruction projects which attempt to bridge the subjective and objective experiences of cities. The result in both cases is an attempt at something new: not quite objective, not quite subjective, and not quite intersubjective. It is a representation of collective individual experiences which in its whole has meaning, but also can be used to contextualize the specific. That these types of observations can often lead to shockingly accurate predictive “laws” isn’t really the point; they’re accidental results of an attempt to understand unique and contingent experiences at a grand scale. 5

Manhattan. Dots represent where people have taken pictures; blue dots are by locals, red by tourists, and yellow are uncertain. [via Eric Fischer]
It is no surprise that the token/type divide is woven into the subjective/objective divide. However, as Daston and Galison have pointed out, objectivity is not an ahistorical category. 6 It has a history, is only positively defined in relation to subjectivity, and neither was a particularly useful concept before the 19th century.

I would argue, as well, that the nomothetic and idiographic divide is one which is outliving its historical usefulness. Work from both the digital humanities and the computational social sciences is converging to a point where the objective and the subjective can peaceably coexist, where contingent experiences can be placed alongside general predictive principles without any cognitive dissonance, under a framework that allows both deterministic and creative elements. It is not that purely nomothetic or purely idiographic research will no longer exist, but that they no longer represent a binary category which can usefully differentiate research agendas. We still have Snow’s primary cultural distinctions, of course, and a bevy of disciplinary differences, but it will be interesting to see where this shift in axes takes us.

Notes:

  1. I am not the first to do this. Aviezer Tucker (2012) has a great chapter in The Oxford Handbook of Philosophy of Social Science, “Sciences of Historical Tokens and Theoretical Types: History and the Social Sciences” which introduces and historicizes the vocabulary nicely.
  2. Underwood’s post raises these points, as well.
  3. This has sometimes been referred to as environmental possibilism.
  4. Milgram, Stanley. 1976. “Psychological Maps of Paris.” In Environmental Psychology: People and Their Physical Settings, edited by Proshansky, Ittelson, and Rivlin, 104–124. New York.

    ———. 1982. “Cities as Social Representations.” In Social Representations, edited by R. Farr and S. Moscovici, 289–309.

  5. If you’re interested in more thoughts on this subject specifically, I wrote a bit about it in relation to single-authorship in the humanities here
  6. Daston, Lorraine, and Peter Galison. 2007. Objectivity. New York, NY: Zone Books.

Historians, Doctors, and their Absence

[Note: sorry for the lack of polish on the post compared to others. This was hastily written before a day of international travel. Take it with however many grains of salt seem appropriate under the circumstances.]

[Author’s note two: Whoops! Never included the link to the article. Here it is.]

Every once in a while, 1 a group of exceedingly clever mathematicians and physicists decide to do something exceedingly clever on something that has nothing to do with math or physics. This particular research project has to do with the 14th Century Black Death, resulting in such claims as the small-world network effect is a completely modern phenomenon, and “most social exchange among humans before the modern era took place via face-to-face interaction.”

The article itself is really cool. And really clever! I didn’t think of it, and I’m angry at myself for not thinking of it. They look at the empirical evidence of the spread of disease in the late middle ages, and note that the pattern of disease spread looked shockingly different than patterns of disease spread today. Epidemiologists have long known that today’s patterns of disease propagation are dependent on social networks, and so it’s not a huge leap to say that if earlier diseases spread differently, their networks must have been different too.

Don’t get me wrong, that’s really fantastic. I wish more people (read: me) would make observations like this. It’s the sort of observation that allows historians to infer facts about the past with reasonable certainty given tiny amounts of evidence. The problem is, the team had neither any doctors, nor any historians of the late middle ages, and it turned an otherwise great paper into a set of questionable conclusions.

Small world networks have a formal mathematical definition, which (essentially) states that no matter how big the population of the world gets, everyone is within a few degrees of separation from you. Everyone’s an acquaintance of an acquaintance of an acquaintance of an acquaintance. This non-intuitive fact is what drives the insane speeds of modern diseases; today, an epidemic can spread from Australia to every state in the U.S. in a matter of days. Due to this, disease spread maps are weirdly patchy, based more around how people travel than geographic features.
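To make the “few degrees of separation” idea concrete, here is a minimal sketch in plain Python. It is not the formal definition, and not anyone’s published model; the function names, the ring-shaped layout, and the numbers are my own assumptions. It builds a world where everyone knows only their nearest neighbours, then adds a handful of random long-range ties and compares the average number of steps between two people.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """n people in a ring, each tied to the k nearest people on either side."""
    graph = {node: set() for node in range(n)}
    for node in range(n):
        for step in range(1, k + 1):
            neighbour = (node + step) % n
            graph[node].add(neighbour)
            graph[neighbour].add(node)
    return graph


def add_shortcuts(graph, count, rng):
    """Return a copy of the graph with `count` new random long-range ties."""
    new_graph = {node: set(ties) for node, ties in graph.items()}
    nodes = list(new_graph)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in new_graph[a]:
            new_graph[a].add(b)
            new_graph[b].add(a)
            added += 1
    return new_graph


def average_path_length(graph):
    """Mean number of steps between every pair of connected people."""
    total_steps = 0
    pairs = 0
    for source in graph:
        # Breadth-first search gives the shortest step count to everyone.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in distance:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
        total_steps += sum(distance.values())
        pairs += len(distance) - 1
    return total_steps / pairs


local_world = ring_lattice(200, 2)
small_world = add_shortcuts(local_world, 20, random.Random(1))
print(average_path_length(local_world))
print(average_path_length(small_world))
```

In the purely local ring the average separation grows in proportion to the population; a few shortcuts are enough to pull it down sharply, which is the intuition behind the small world effect.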

Patchy h5n1 outbreak map.

The map of the spread of the Black Death in the 14th century looked very different. Instead of these patches, the disease appeared to spread in very deliberate waves, at a rate of about 2km/day.

Spread of the plague, via the original article.

How to reconcile these two maps? The solution, according to the network scientists, was to create a model of people interacting and spreading diseases across various distances and types of networks. Using the models, they show that in order to generate these wave patterns of disease spread, the physical contact network cannot be small world. From this, because they make the (uncited) claim that physical contact networks had to be a subset of social contact networks (entirely ignoring, say, correspondence), they conclude that the 14th century did not have small world social networks.
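The intuition behind that modeling result can be sketched with a toy simulation. To be clear, this is not the authors' model: it is a deliberately crude illustration in which people sit on a ring, every infected person infects all of their contacts each day, and nobody recovers or dies. The population size, the two hand-picked long-distance ties, and the function names are all assumptions of mine.

```python
def ring_contacts(n, long_links=()):
    """Ring of n people who each meet their two immediate neighbours,
    plus optional hand-picked long-distance contacts (hypothetical)."""
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for a, b in long_links:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def spread(adj, start, days):
    """Toy rule: each day every infected person infects all of their contacts."""
    infected = {start}
    for _ in range(days):
        infected |= {v for u in infected for v in adj[u]}
    return infected

def count_patches(infected, n):
    """Number of separate runs of infected people around the ring."""
    if len(infected) == n:
        return 1
    # Each run has exactly one member whose left-hand neighbour is healthy.
    return sum(1 for i in infected if (i - 1) % n not in infected)

n = 1000
# Local contact only: the infection advances as a single front.
wave = spread(ring_contacts(n), start=0, days=30)
# Same ring with two long-distance ties: the infection jumps ahead.
patchy = spread(ring_contacts(n, long_links=[(10, 400), (410, 700)]), start=0, days=30)

wave_patches = count_patches(wave, n)
patchy_patches = count_patches(patchy, n)

print(len(wave), wave_patches)      # 61 1
print(len(patchy), patchy_patches)  # 117 3
```

With only local contact the infected region stays one contiguous block that grows at a fixed rate, which is the wave pattern. Add even two long-distance ties and separate outbreaks appear far from the origin, which is the patchy pattern.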

There’s a lot to unpack here. First, their model does not take into account the fact that people, y’know, die after they get the plague. Their model assumes the infected have enough time and impetus to carry the disease as far as they could travel after becoming contagious. In the discussion, the authors do realize this is a stretch, but suggest that because people could, if they so chose, travel 40km/day, and the Black Death spread only 2km/day, this is not sufficient to explain the waves.

I am no plague historian, nor a doctor, but a brief trip on the google suggests that Black Death symptoms could manifest in hours, and that a swift death came only days after. It is, I think, unlikely that people would or could have traveled great distances after symptoms began to show.

More important to note, however, are the assumptions the authors make about social ties in the Middle Ages. They assume a social tie must be a physical one; they assume social ties are connected with mobility; and they assume social ties are constantly maintained. This is a bit before my period of research, but for only a hundred years later (still before the period the authors claim could have sustained small world networks), any early modern historian could tell you that communication was asynchronous and travel was ordered and infrequent.

Surprisingly, I actually believe the authors’ conclusions: that by the strict mathematical definition of small world networks, the “pre-modern” world might not have that feature. I do think distance and asynchronous communication prevented an entirely global 6-degree effect. That said, the assumptions they make about what a social tie is are entirely modern, which means their conclusion is essentially inevitable: historical figures did not maintain modern-style social connections, and thus metrics based on those types of connections should not apply. Taken in the social context of Europe in the late Middle Ages, however, I think the authors would find that the salient features of small world networks (short average path length and high clustering) exist in that world as well.

A second problem, and the reason I agree with the authors that there was not a global small world in the late 14th century, is that “global” is not an appropriate axis on which to measure “pre-modern” social networks. Today, we can reasonably say we all belong to a global population; at that point in time, before trade routes from Europe to the New World and because of other geographical and technological barriers, the world should instead be seen as a set of smaller, overlapping populations. My guess is that, for more reasonable definitions of populations for the time period, small world properties would continue to hold.

Notes:

  1. Every day? Every two days?