The Route of a Text Message

[Note: Find the more polished, professionally illustrated version of this piece at Motherboard|Vice!]

This is the third post in my full-stack dev (f-s d) series on the secret life of data. This installment is about a single text message: how it was typed, stored, sent, received, and displayed. I sprinkle in some history and context to break up the alphabet soup of protocols, but though the piece gets technical, it should all be easily understood.

The first two installments of this series are Cetus, about the propagation of errors in a 17th century spreadsheet, and Down the Rabbit Hole, about the insane lengths one occasionally needs to go through to track down the source of a dataset.

A Love Story

My leg involuntarily twitches with vibration—was it my phone, or just a phantom feeling?—and a quick inspection reveals a blinking blue notification. “I love you”, my wife texted me. I walk downstairs to wish her goodnight, because I know the difference between the message and the message, you know?

It’s a bit like encryption, or maybe steganography: anyone can see the text, but only I can decode the hidden data.

My translation, if we’re being honest, is just one extra link in a remarkably long chain of data events, all to send a message (“come downstairs and say goodnight”) in under five seconds across about 40 feet.

The message presumably began somewhere in my wife’s brain and somehow ended up in her thumbs, but that’s a signal for a different story. Ours begins as her thumb taps a translucent screen, one letter at a time, and ends as light strikes my retinas.

Through the Looking Glass

With each tap, a small electrical current passes from the screen to her hand. Because electricity flows easily through human bodies, sensors on the phone register a change in voltage wherever her thumb presses against the screen. But the world is messy, and the phone senses random fluctuations in voltage across the rest of the screen, too, so an algorithm determines the biggest, thumbiest-looking voltage fluctuations and assumes that’s where she intended to press.


Figure 0. Capacitive touch.

So she starts tap-tap-tapping on the keyboard, one letter at a time.

I-spacebar-l-o-v-e-spacebar-y-o-u.

She’s not a keyboard swiper (I am, but somehow she still types faster than me). The phone reliably records the (x,y) coordinates of each thumbprint and aligns it with the coordinates of each key on the screen. It’s harder than you think; sometimes her thumb slips, yet somehow the phone realizes she’s not trying to swipe, that it was just a messy press.

Deep in the metal guts of the device, an algorithm tests whether each thumb-shaped voltage disruption moves more than a certain number of pixels, called touch slop. If the movement is sufficiently small, the phone registers a keypress rather than a swipe.

Fig 1. Android’s code for detecting ‘touch slop’. Notice the developers had my wife’s gender in mind.
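The real check lives deep in Android's input code, but the idea fits in a few lines. Here's a minimal Python sketch of the same logic; the threshold value and function names are mine, not Android's:

```python
# Hypothetical sketch of the "touch slop" test: a touch that drifts less
# than the slop threshold counts as a tap, not a swipe.
TOUCH_SLOP_PX = 8  # illustrative; the real threshold is device-dependent

def is_tap(start, end, slop=TOUCH_SLOP_PX):
    """Return True if the touch moved too little to count as a swipe."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return dx * dx + dy * dy < slop * slop

print(is_tap((100, 200), (103, 202)))  # True: a messy press, still a keypress
print(is_tap((100, 200), (160, 205)))  # False: that's a swipe
```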

She finishes her message, a measly 10 characters of her allotted 160.

The allotment of 160 characters is a carefully chosen number, if you believe the legend: In 1984, German telephone engineer Friedhelm Hillebrand sat at his typewriter and wrote as many random sentences as came to his mind. His team then looked at postcards and telex messages, and noticed most fell below 160 characters. “Eureka!”, they presumably yelled in German, before setting the character limit of text messages in stone for the next three-plus decades.

Character Limits & Legends

Legends rarely tell the whole story, and the legend of SMS is no exception. Hillebrand and his team hoped to relay messages over a secondary channel that phones were already using to exchange basic information with their local stations.

Signalling System no. 7 (SS7) is a set of protocols cell phones use to stay in constant contact with their local tower; they need this continuous connection to know when to ring, to get basic location tracking, to check for voicemail, and to exchange other messages that don't rely on the internet. Since the protocol's creation in 1980, it has had a hard limit of 279 bytes of information. If Hillebrand wanted text messages to piggyback on the SS7 protocol, he had to deal with this pretty severe limit.

Normally, 279 bytes equals 279 characters. A byte is eight bits (each bit is a 0 or 1), and in common encodings, a single letter is equivalent to eight 0s and 1s in a row.

‘A’ is

0100 0001

‘B’ is

0100 0010

‘C’ is

0100 0011

and so on.
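You can generate those patterns yourself. In Python, ord() gives a character's code and format() renders it as eight bits:

```python
# Each character's 8-bit pattern, generated from its character code.
for ch in "ABC":
    print(ch, "is", format(ord(ch), "08b"))
```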

Unfortunately, getting messages across the SS7 protocol isn’t a simple matter of sending 2,232 (that’s 279 bytes at 8 bits each) 0s or 1s through radio signals from my phone to yours. Part of that 279-byte signal needs to contain your phone number, and part of it needs to contain my phone number. Part of it needs to let the cell tower know “hey, this is a message, not a call, don’t ring the phone!”.

By the time Hillebrand and his team finished cramming all the necessary contextual bits into the 279-byte signal, they were left with only enough space for 140 characters at 1 byte (8 bits) a piece, or 1,120 bits.

But what if they could encode a character in only 7 bits? At 7 bits per character, they could squeeze 160 (1,120 / 7 = 160) characters into each SMS, but those extra twenty characters demanded a sacrifice: fewer possible letters.

An 8-bit encoding allows 256 possible characters: lowercase ‘a’ takes up one possible space, uppercase ‘A’ another space, a period takes up a third space, an ‘@’ symbol takes up a fourth space, a line break takes up a fifth space, and so on up to 256. To squeeze an alphabet down to 7 bits, you need to remove some possible characters: the 1/2 symbol (½), the degree symbol (°), the pi symbol (π), and so on. But assuming people will never use those symbols in text messages (a poor assumption, to be sure), this allowed Hillebrand and his colleagues to stuff 160 characters into a 140-byte space, which in turn fit neatly into a 279-byte SS7 signal: exactly the number of characters they claim to have discovered was the perfect length of a message. (A bit like the miracle of Hanukkah, if you ask me.)

Fig 2. The GSM-7 character set.

So there my wife is, typing “I love you” into a text message, all the while the phone converts those letters into this 7-bit encoding scheme, called GSM-7.

“I” (notice it’s at the intersection of 4x and x9 above) =

49 

Spacebar (notice it’s at the intersection of 2x and x0 above) =

20 

“l” =

6C

“o” =

6F

and so on down the line.

In all, her slim message becomes:

49 20 6C 6F 76 65 20 79 6F 75 

(10 bytes combined). Each two-character code is a single byte written in hexadecimal, and together they spell “I love you”.

But this is actually not how the message is stored on her phone. The phone packs the 7-bit character codes tightly together, essentially borrowing the leftover bit at the end of every byte. The math is a bit more complicated than is worth getting into here, but the resulting message appears as

49 10 FB 6D 2F 83 F2 EF 3A 

(9 bytes in all) in her phone.
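The packing itself is mechanical enough to sketch in code. Below is a small Python function that packs 7-bit values into bytes, least significant bits first, in the style the GSM spec describes; for these ten characters the GSM-7 code points happen to match ASCII, so ord() can stand in for a full GSM-7 lookup table:

```python
def pack_gsm7(septets):
    """Pack a list of 7-bit values into bytes, low bits first (GSM style)."""
    out, carry, carry_bits = [], 0, 0
    for s in septets:
        carry |= s << carry_bits      # append 7 new bits above the leftovers
        carry_bits += 7
        while carry_bits >= 8:        # emit full bytes as they fill up
            out.append(carry & 0xFF)
            carry >>= 8
            carry_bits -= 8
    if carry_bits:                    # trailing partial byte
        out.append(carry & 0xFF)
    return bytes(out)

# For "I love you", the GSM-7 code points coincide with ASCII codes.
packed = pack_gsm7([ord(c) for c in "I love you"])
print(packed.hex(" ").upper())  # 49 10 FB 6D 2F 83 F2 EF 3A
```

Ten 7-bit characters come out as nine bytes, exactly the string stored on her phone.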

When my wife finally finishes her message (it takes only a few seconds), she presses ‘send’ and a host of tiny angels retrieve the encoded message, flutter their invisible wings the 40 feet up to the office, and place it gently into my phone. The process isn’t entirely frictionless, which is why my phone vibrates lightly upon delivery.

The so-called “telecommunication engineers” will tell you a different story, and for the sake of completeness I’ll relay it to you, but I wouldn’t trust them if I were you.

SIM-to-Send

The engineers would say that, when the phone senses voltage fluctuations over the ‘send’ button, it sends the encoded message to the SIM card (that tiny card your cell provider puts in your phone so it knows what your phone number is), and in the process it wraps it in all sorts of useful contextual data. By the time it reaches my wife’s SIM, it goes from a 140-byte message (just the text) to a 176-byte message (text + context).

The extra 36 bytes are used to encode all sorts of information, seen below.

Fig 3. Here, bytes are called octets (8 bits). Counting all possible bytes yields 174 (10+1+1+12+1+1+7+1+140). The other two bytes are reserved for some SIM card bookkeeping.

The first ten bytes are reserved for the telephone number (service center address, or SCA) of the SMS service center (SMSC), tasked with receiving, storing, forwarding, and delivering text messages. It’s essentially a switchboard: my wife’s phone sends out a signal to the local cell tower and gives it the number of the SMSC, which forwards her text message from the tower to the SMSC. The SMSC, which in our case is operated by AT&T, routes the text to the mobile station nearest to my phone. Because I’m sitting three rooms away from my wife, the text just bounces back to the same mobile station, and then to my phone.

Fig 4. SMS cellular network

The next byte (PDU-type) encodes some basic housekeeping on how the phone should interpret the message, including whether it was sent successfully, whether the carrier requests a status report, and (importantly) whether this is a single text or part of a string of connected messages.

The byte after the PDU-Type is the message reference (MR). It's a number between 0 and 255, and is essentially used as a short-term ID number to let the phone and the carrier know which text message they're dealing with. In my wife's case the number is set to 0, because her phone has its own message ID system independent of this particular file.

The next twelve bytes or so are reserved for the recipient's phone number, called the destination address (DA). With the exception of the 7-bit character encoding I mentioned earlier, which helps us stuff 160 letters into a 140-byte space, the phone number encoding is the stupidest, most confusing scheme you'll encounter in this SMS. It's called reverse nibble notation, and it reverses every other digit in a large number. (Get it? Part of a byte is a nibble, hahahahaha, nobody's laughing, engineers.)

My number, which is usually 1-352-537-8376, is logged in my wife’s phone as:

3125358773f6

The 1-3 is represented by

31

The 52 is represented by

25

The 53 is represented by

35

The 7-8 is represented by

87

The 37 is represented by

73

And the 6 is represented by…

f6

Where the fuck did the ‘f’ come from? It means it’s the end of the phone number, but for some awful reason (again, reverse nibble notation) it’s one character before the final digit.

It’s like pig latin for numbers.

tIs'l ki eip galit nof runbmre.s

But I’m not bitter.

[Edit: Sean Gies points out that reverse nibble notation is an inevitable artifact of representing 4-bit little-endian numbers in 8-bit chunks. That doesn’t invalidate the above description, but it does add some context for those who know what it means, and makes the decision seem more sensible.]
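If you'd rather see the notation than read about it, here's a minimal Python sketch of the encoder (the function name and padding logic are my reconstruction, not any official implementation):

```python
def reverse_nibble(number):
    """Encode a phone number in reverse nibble (semi-octet) notation."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) % 2:          # odd length: pad with the 'f' end marker
        digits += "f"
    # swap every pair of digits
    return "".join(digits[i + 1] + digits[i] for i in range(0, len(digits), 2))

print(reverse_nibble("1-352-537-8376"))  # 3125358773f6
```

The padding explains the misplaced 'f': it fills out the final nibble pair before the swap, which then drops it one character early.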

The Protocol Identifier (PID) byte is honestly, at this point, mostly wasted space. It takes about 40 possible values, and it tells the service provider how to route the message. A value of

22 

means my wife is sending “I love you” to a fax machine; a value of

24 

means she’s sending it to a voice line, somehow. Since she’s sending it as an SMS to my phone, which receives texts, the PID is set to

0

(Like every other text sent in the modern world.)

Fig 5. Possible PID Values

The next byte is the Data Coding Scheme (DCS, see this doc for details), which tells the carrier and the receiving phone which character encoding scheme was used. My wife used GSM-7, the 7-bit alphabet I mentioned above that allows her to stuff 160 letters into a 140-byte space, but you can easily imagine someone wanting to text in Chinese, or someone texting a complex math equation (ok, maybe you can’t easily imagine that, but a guy can dream, right?).

In my wife’s text, the DCS byte was set to

0

meaning she used a 7-bit alphabet, but she could have changed that value to use an 8- or 16-bit alphabet, which would allow many more possible characters but leave much less room to fit them. Incidentally, this is why when you text emoji to your friend, you have fewer characters to work with.

There’s also a little flag in the DCS byte that tells the phone whether to self-destruct the message after sending it, Mission Impossible style, so that’s neat.
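The arithmetic behind those shrinking limits is simple: 140 bytes of user data, divided by however many bits each character needs. A quick sketch:

```python
UD_BYTES = 140  # user-data space in a single SMS

def chars_per_sms(bits_per_char, ud_bytes=UD_BYTES):
    """How many characters fit in one SMS for a given alphabet width."""
    return ud_bytes * 8 // bits_per_char

for name, bits in [("GSM-7", 7), ("8-bit", 8), ("16-bit (UCS-2)", 16)]:
    print(f"{name}: {chars_per_sms(bits)} characters per message")
```

Hence 160 characters in GSM-7, 140 in an 8-bit alphabet, and only 70 once you're texting in a 16-bit alphabet.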

The validity period (VP) space can take up to seven bytes, and sends us into another aspect of how text messages actually work. Take another look at Figure 4, above. It’s okay, I’ll wait.

When my wife finally hits ‘send’, the text gets sent to the SMS Service Center (SMSC), which then routes the message to me. I’m upstairs and my phone is on, so I receive the text in a handful of seconds, but what if my phone were off? Surely my phone can’t accept a message when it’s not receiving any power, so the SMSC has to do something with the text.

If the SMSC can’t find my phone, my wife’s message will just bounce around in its system until the moment my phone reconnects, at which point it sends the text out immediately. I like to think of the SMSC continuously checking every online phone to see if it’s mine, like a puppy waiting for its human by the door: is that smell my human? No. Is that smell my human? No. Is this smell my human? YESYESJUMPNOW.

The validity period (VP) bytes tell the carrier how long the puppy will wait before it gets bored and finds a new home. It’s either a timestamp or a duration, and it basically says “if you don’t see the recipient phone pop online in the next however-many days, just don’t bother sending it.” The default validity period for a text is 10,080 minutes, which means if it takes me more than seven days to turn my phone back on, I’ll never receive her text.
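For the curious, a one-byte relative validity period decodes into minutes through a tiered scheme; the sketch below is my reading of the GSM rules, so treat it as illustrative rather than authoritative:

```python
def vp_relative_minutes(vp):
    """Decode a one-byte relative validity period into minutes."""
    if vp <= 143:
        return (vp + 1) * 5               # 5-minute steps, up to 12 hours
    if vp <= 167:
        return 12 * 60 + (vp - 143) * 30  # 30-minute steps, up to 24 hours
    if vp <= 196:
        return (vp - 166) * 24 * 60       # whole days, up to 30 days
    return (vp - 192) * 7 * 24 * 60       # whole weeks, up to 63 weeks

print(vp_relative_minutes(173))  # 10080 minutes: the seven-day default
```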

Because there’s often a lot of empty space in an SMS, a few bits here or there are dedicated to letting the phone and carrier know exactly which bytes are unused. If my wife’s SIM card expects a 176-byte SMS but, because she wrote an exceptionally short message, it only receives a 45-byte SMS, it may get confused and assume something broke along the way. The user data length (UDL) byte solves this problem: it relays exactly how many bytes the text in the text message actually takes up.

In the case of “I love you”, the UDL claims the subsequent message is 9 bytes. You’d expect it to be 10 bytes, one for each of the 10 characters in

I-spacebar-l-o-v-e-spacebar-y-o-u

but because each character is 7 bits rather than 8 bits (a full byte), we’re able to shave an extra byte off in the translation. That’s because 7 bits * 10 characters = 70 bits, divided by 8 (the number of bits in a byte) = 8.75 bytes, rounded up to 9 bytes.
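That rounding calculation generalizes to any message length; here's the same arithmetic as a small Python helper (the function is mine, not part of any SMS spec):

```python
import math

def user_data_bytes(n_chars, bits_per_char=7):
    """Bytes needed to pack n characters of the given bit width."""
    return math.ceil(n_chars * bits_per_char / 8)

print(user_data_bytes(len("I love you")))  # 10 chars * 7 bits = 70 bits -> 9 bytes
print(user_data_bytes(160))                # a full SMS: 160 chars -> 140 bytes
```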

Which brings us to the end of every SMS: the message itself, or the UD (User Data). The message can take up to 140 bytes, though as I just mentioned, “I love you” will pack into a measly 9. Amazing how much is packed into those 9 bytes—not just the message (my wife’s presumed love for me, which is already difficult enough to compress into 0s and 1s), but also the message (I need to come downstairs and wish her goodnight). Those bytes are:

49 10 FB 6D 2F 83 F2 EF 3A.

In all, then, this is the text message stored on my wife’s SIM card:

SCA[1-10]-PDU[1]-MR[1]-DA[1-12]-PID[1]-DCS[1]-VP[0, 1, or 7]-UDL[1]-UD[0-140]

00 - 11 - 00 - 07 31 25 35 87 73 F6 - ?? 00 ?? - ?? - 09 - 49 10 FB 6D 2F 83 F2 EF 3A

(Note: to get the full message, I need to do some more digging. Alas, you only see most of the message here, hence the ??s.)

Waves in the Æther

Somehow [he says in David Attenborough’s voice], the SMS must now begin its arduous journey from the SIM card to the nearest base station.  To do that, my wife’s phone must convert a string of 176 bytes to the 279 bytes readable by the SS7 protocol, convert those digital bytes to an analog radio signal, and then send those signals out into the æther at a frequency of somewhere between 800 and 2000 megahertz. That means each wave is between 6 and 14 inches from one peak to the next.

Fig 6. Wavelength

In order to efficiently send and receive signals, antennas should be no smaller than half the size of the radio waves they’re dealing with. If cell waves are 6 to 14 inches, their antennas need to be 3-7 inches. Now stop and think about the average height of a mobile phone, and why they never seem to get much smaller.
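The back-of-the-envelope math checks out; a quick sketch, assuming radio signals travel at the speed of light:

```python
C = 299_792_458          # speed of light, meters per second
METERS_PER_INCH = 0.0254

def wavelength_inches(freq_mhz):
    """Peak-to-peak wavelength, in inches, for a frequency in MHz."""
    return C / (freq_mhz * 1e6) / METERS_PER_INCH

for mhz in (800, 2000):
    w = wavelength_inches(mhz)
    print(f"{mhz} MHz: ~{w:.1f} inch waves, ~{w / 2:.1f} inch antenna minimum")
```

At 800 MHz the waves are roughly 14.8 inches peak to peak; at 2000 MHz, about 5.9 inches: right around the 6-to-14-inch range described above.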

Through some digital gymnastics that would take entirely too long to explain, suddenly my wife’s phone shoots a 279-byte information packet containing “I love you” at the speed of light in every direction, eventually fizzling into nothing after about 30 miles.

Well before getting that far, her signal strikes the AT&T HSPA Base Station ID199694204 LAC21767. This base transceiver station (BTS) is about 5 blocks from my favorite bakery in Hazelwood, La Gourmandine, and though I was able to find its general location using an android app called OpenSignal, the antenna is camouflaged beyond my ability to find it.

The really fascinating bit here is that it reaches the base transceiver station at all, given everything else going on. Not only is my wife texting me “I love you” in the 1,000ish MHz band of the electromagnetic spectrum; tens of thousands of other people are likely talking on the phone or texting within the 30-mile radius around my house, beyond which cell signals disintegrate. On top of that, a slew of radio and TV signals are jostling for attention in our immediate airspace, alongside visible light bouncing this way and that, to name a few of the many electromagnetic waves that seem like they ought to be getting in the way.

As Richard Feynman eloquently put it in 1983, it’s a bit like the cell tower is a little blind bug resting gently atop the water on one end of a pool, and based only on the frequency and direction of waves that cause it to bounce up and down, it’s able to reconstruct who’s swimming and where.

Feynman discussing waves.

In part due to the complexity of competing signals, each base transceiver station generally can’t handle more than 200 active users (using voice or data) at a time. So “I love you” pings my local base transceiver station, about half a mile away, and then shouts itself into the void in every direction until it fades into the noise of everyone else.

Switching

I’m pretty lucky, all things considered. Were my wife and I on different cell providers, or were we in different cities, the route of her message to me would be a good deal more circuitous.

My wife’s message is massaged into the 279-byte SS7 channel, and sent along to the local base transceiver station (BTS) near the bakery. From there, it gets routed to the base station controller (BSC), which is the brain of not just our antenna, but several other local antennas besides. The BSC flings the text to AT&T Pittsburgh’s mobile switching center (MSC), which relies on the text message’s SCA (remember the service center address embedded within every SMS? That’s where this comes in) to get it to the appropriate short message service center (SMSC).

This alphabet soup is easier to understand with the diagram from figure 7; I just described steps 1 and 3. If my wife were on a different carrier, we’d continue through steps 4-7, because that’s where the mobile carriers all talk to each other. The SMS has to go from the SMSC to a global switchboard and then potentially bounce around the world before finding its way to my phone.

Fig 7. SMS routed through a GSM network.

But she’s on AT&T and I’m on AT&T, and our phones are connected to the same tower, so after step 3 the 279-byte packet of love just does an about-face and returns through the same mobile service center, through the same base station, and now to my phone instead of hers. A trip of a few dozen miles in the blink of an eye.

Sent-to-SIM

Buzzzzz. My pocket vibrates. A notification lets me know an SMS has arrived through my nano-SIM card, a circuit board about the size of my pinky nail. Like Bilbo Baggins or any good adventurer, it changed a bit in its trip there and back again.

Fig 8. Received message, as opposed to sent message (figure 3).

Figure 8 shows the structure of the message “I love you” now stored on my phone. Comparing figures 3 and 8, we see a few differences. The SCA (phone number of the short message service center), the PDU (some mechanical housekeeping), the PID (phone-to-phone rather than, say, phone-to-fax), the DCS (character encoding scheme), the UDL (length of message), and the UD (the message itself) are all mostly the same, but the VP (the text’s expiration date), the MR (the text’s ID number), and the DA (my phone number) are missing.

Instead, on my phone, there are two new pieces of information: the OA (originating address, or my wife’s phone number), and the SCTS (service center time stamp, or when my wife sent the message).

My wife’s phone number is stored in the same annoying reverse nibble notation (like dyslexia but for computers) that my phone number was stored in on her phone, and the timestamp is stored in the same format as the expiration date was stored in on her phone.

These two information inversions make perfect contextual sense. Her phone needed to reach me by a certain time at a certain address, and I now need to know who sent the message and when. Without the home address, so to speak, I wouldn’t know whether the “I love you” came from my wife or my mother, and the difference would change my interpretation of the message fairly significantly.

Through a Glass Brightly

In much the same way that any computer translates a stream of bytes into a series of (x,y) coordinates with specific color assignments, my phone’s screen gets the signal to render

49 10 FB 6D 2F 83 F2 EF 3A

on the screen in front of me as “I love you” in backlit black-and-white. It’s an interesting process, but as it’s not particularly unique to smartphones, you’ll have to look it up elsewhere. Let’s instead focus on how those instructions become points of light.

The friendly marketers at Samsung call my screen a Super AMOLED (Active Matrix Organic Light-Emitting Diode) display, which is somehow both redundant and not particularly informative, so we’ll ignore unpacking the acronym as yet another distraction, and dive right into the technology.

There are about 330,000 tiny sources of light, or pixels, crammed inside each of my phone screen’s 13 square inches. For that many pixels, each needs to be about 45µm (micrometers) wide: thinner than a human hair. There are 4 million of ‘em in all, packed into the palm of my hand.

But you already know how screens work. You know that every point of light, like the Christian God or Musketeers (minus d’Artagnan), is always a three-for-one sort of deal. Red, green, and blue combine to form white light in a single pixel. Fiddle with the luminosity of each channel, and you get every color in the rainbow. And since 4 x 3 = 12, that’s 12 million tiny sources of light sitting innocently dormant behind my black mirror, waiting for me to press the power button to read my wife’s text.

Fig 9. The subpixel array of a Samsung OLED display.

Each pixel, as the acronym suggests, is an organic light-emitting diode. That’s fancy talk for an electricity sandwich:

Fig 10. An electricity sandwich.

The layers aren’t too important, beyond the fact that it’s a cathode plate (negatively charged), below a layer of organic molecules (remember back to high school: it’s just some atoms strung together with carbon), below an anode plate (positively charged).

When the phone wants the screen on, it sends electrons from the cathode plate to the anode plate. The sandwiched molecules intercept the energy, and in response they start emitting visible light, photons, up through the transparent anode, up through the screen, and into my waiting eyes.

Since each pixel is three points of light (red, green, and blue), there are actually three of these sandwiches per pixel. They’re all essentially the same, except the organic molecule is swapped out: poly(p-phenylene) for blue light, polythiophene for red light, and poly(p-phenylene vinylene) for green light. Because each is slightly different, they shine different colors when electrified.

(Fun side fact: blue subpixels burn out much faster, due to a process called “exciton-polaron annihilation”, which sounds really exciting, doesn’t it?)

All 4 million pixels are laid out on an indexed matrix. An index works in a computer much the same way it works in a book: when my phone wants a specific pixel to light a certain color, it looks that pixel up in the index, and then sends a signal to the address it finds. Let there be light, and there was light.
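A toy model of that lookup, assuming a flat row-major framebuffer with one brightness byte per pixel (the resolution and layout here are illustrative, not my phone's actual hardware):

```python
# Illustrative resolution: ~4 million pixels, not any particular phone's spec.
WIDTH, HEIGHT = 2560, 1600

def pixel_index(x, y, width=WIDTH):
    """Address of pixel (x, y) in a flat, row-major framebuffer."""
    return y * width + x

framebuffer = bytearray(WIDTH * HEIGHT)  # one brightness byte per pixel
framebuffer[pixel_index(10, 3)] = 255    # let there be light at (10, 3)
print(pixel_index(10, 3))  # 7690
```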

(Fun side fact: now you know what “Active Matrix Light-Emitting Diode” means, and you didn’t even try.)

My phone’s operating system interprets my wife’s text message, figures out the shape of each letter, and maps those shapes to the indexed matrix. It sends just the right electric pulses through the Super AMOLED screen to render those three little words that have launched ships and vanquished curses.

The great strangeness here is that my eyes never see “I love you” in bright OLED lights; it appears on the screen black-on-white. The phone creates the illusion of text through negative space, washing the screen white by setting every red, green, & blue subpixel to maximum brightness, then turning off the bits where letters should be. Its complexity is offensively mundane.

Fig 11. Negative space.

In displaying everything but my wife’s text message, and letting me read it in the gaps, my phone succinctly betrays the lie at the heart of the information age: that communication is simple. Speed and ease hide a mountain of mediation.

And that mediation isn’t just technical. My wife’s text wouldn’t have reached me had I not paid the phone bill on time, had there not been a small army of workers handling financial systems behind the scenes. Technicians keep the phone towers in working order, which they reach via a network of roads partially subsidized by federal taxes collected from hundreds of millions of Americans across 50 states. Because so many transactions still occur via mail, if the U.S. postal system collapsed tomorrow, my phone service would falter. Exploited factory workers in South America and Asia assembled parts in both our phones, and exhausted programmers renting expensive Silicon Valley closets are as-you-read-this pushing out code ensuring our phones communicate without interruption.

All of this underneath a 10-character text. A text which, let’s be honest, means much more than it says. My brain subconsciously peels back years of interactions with my wife to decode the message appearing on my phone, but between her and me there’s still a thicket of sociotechnical mediation, a stew of people and history and parts, that can never be untangled.

The Aftermath

So here I am, in the office late one Sunday night. “I love you,” my wife texted from the bedroom downstairs, before the message traversed 40 or so feet to my phone in a handful of seconds. I realize what it means: it’s time to wish her goodnight, and perhaps wrap up this essay. I tap away the last few words, now slightly more cognizant of the complex layering of miles, signals, years of history, and human sweat it took to keep my wife from having to shout upstairs that it’s about damn time I get some rest.

Thanks to Christopher Warren, Vika Zafrin, and Nechama Weingart for comments on earlier drafts.

The Center for Midnight

How many people do you need? Is an artistic movement only a movement as a collective? Can one person alone carry the melody?

Over the course of 12 hours between October 29th and October 31st, a pop-up writing collective of artists, scholars, and algorithms uncovered a fragmentary history of the Center for Midnight, an imagined artistic movement of the late twentieth century.

We named ourselves the Midnight Society, though our membership was as difficult to enumerate as our goals. About thirty participants wandered in and out of the STUDIO for Creative Inquiry at Carnegie Mellon University over those three evenings, contributing words or technical expertise or editorial opinions or Halloween candy, some for moments and others for hours. Members arrived from as far as the Atlantic and the Pacific, though the heart of our collective rested in Pittsburgh, the mind in Robin Sloan, and the words in a neural network taught to read by biographers and comedians.

RNN example

The Center for Midnight began, as so many things do, with a blank page and a blinking cursor.

When we first invited bestselling author and technologist Robin Sloan to Pittsburgh, we knew we wanted him for an extended artist’s residency, but we didn’t have an end goal in sight. His first book, Mr. Penumbra’s 24-Hour Bookstore, has been called a love letter to digital humanities (by me, among others), so you can see why a digital humanities center like ours would be interested in bringing him to town.

Inspiration came from his most recent experiments on human/computer collaborative writing. Sloan is developing a sort of cyborg text editor, an algorithmic cure for writer’s block, a machine that reads what you’ve written so far and offers a few words that might come next. It does so by reaching into its model of language, a recurrent neural network trained on whatever collection of text seems appropriate, and trying to find sensible endings to the sentence you began.

Together with the Frank-Ratchye STUDIO for Creative Inquiry and the Department of English, Carnegie Mellon University’s dSHARP Center for Innovative Digital Initiatives decided to invite Sloan to lead a three-day experiment of generative fiction. We would assemble a multi-talented team of artists and scholars from around Pittsburgh and elsewhere, connect them with Robin Sloan’s generative text editor, and attempt to assemble a readable short story in the space of 12 hours. The results exceeded our high expectations.

Center for Midnight

Before the workshop, we established a few ground rules. Participants would be capped at a dozen (we failed at keeping to that rule, to our benefit), would need to commit to being available every day (also failed, and also worked out fine), and would need to come with diverse skills and backgrounds (thankfully, finally, a success). More than four hours of writing a day would be a slog, and most people had daytime commitments, so we settled on 4-8pm, Monday through Wednesday, with copious food provided.

Inspired by David Markson’s The Last Novel, Robin decided to assemble the short story in 1-3 sentence snippets, which would allow people to contribute as much or as little as they were able. The story would be about a yet-unnamed artistic movement, so Robin pre-trained his recurrent neural network on the biographies of artists.

Day 1

When everyone arrived on the afternoon of October 29th, the house was surprisingly packed: well over the dozen people I’d hand-selected, with more trickling in as the night went on. I guess word had gotten out. Our temporary base, the STUDIO for Creative Inquiry, acts as a home to rotating students, artists, and other ne’er-do-wells; its residents filled out our rogues’ gallery.

We spent a good while introducing ourselves, which proved important. By the end of the workshop, though our numbers had thinned, we wound up leaning on each person’s interests and skills for the tasks required to finish the story.

Robin introduced the premise, that each of us would use the algorithmic collaboration tool to assemble snippets of text about some fictional artist or artistic theme, and dump our results into a collective google doc. We produced about 80 snippets in all, ranging from a handful of words to over 300, each appended with a brief process note from the author:

  • “I have no idea if Maabundas is a real word or if it was generated nonsense. Either way, it sounds cool.”
  • “I cackled.”
  • “Following up on my magical steer from a previous text chunk. This required a bit of guidance. I liked that it made me think of how I wanted certain bits to sound by generating text that I could respond to.”

Over the course of the night, we brainstormed other documents on which to train the neural network, and we settled on a bunch of biographies from the Harlem Renaissance, a corpus of stand-up comedy scripts, and the collected biographies of art collectors.

At the end of the evening, each of us picked a favorite line or phrase to share with the group, including:

  • “The institutionalized monks of Yann Hirsch”
  • “The Center for Midnight”
  • “The golden age of lithography”
  • “He wandered down to the beach, watching as the anti-capitalist, plant-themed novelist and short story writer wrestled with filmmaker Benjamin John O’Toole in a drunken bout of delirium.”

Stuffed with Mediterranean food and Halloween candy, we went our separate ways, while Robin continued to work.

Day 2

A 6×6 wall of seemingly blank post-its awaited us in the STUDIO. Each had on its sticky side a unique short instruction from Robin, defining a period, a subject, and a method: inception / artistic work / generate text; conflict / artist / mine text; development / relationship / generate text.

post-its

We also arrived to a one-page description, assembled by Robin from yesterday’s favorite lines:

THE CENTER FOR MIDNIGHT (1967-1978)
Methods: (primarily but not exclusively) lithography and embroidery
Obsessed with: the sea, aging, and time

Inception → Development → Conflict → Dissolution

Dramatis personae
Strongly consider mentioning one or more of these:
-Okyanica-La Trail
-Minerva Black
-The filmmaker Benjamin John O'Toole
-Territoria Migraine ← yep that's a name
-The institutionalized monks of Yann Hirsch

Today’s assignment was to uncover the Center for Midnight’s story, which began in 1967 (the average year from yesterday’s google doc) and ended in 1978 (the median year). We each took a sticky note in turn, read our instructions, and got to work writing about the artists, artworks, and relationships that circled the Center at every stage of its short life. When finished, we deposited the text in a new google doc, exchanged stickies, and started all over again.

The Midnight Society, as I started thinking of our team, wrote 4,300 words that day. We riffed off each other, taking narrative threads we saw being dropped in the google doc and weaving them through our own snippets of semi-generated prose.

While we wrote, we listened to a ghostly soundtrack of music generated by Robin, assembled from a neural network trained on an artist he wished had produced more music.

Today’s algorithmic collaboration felt a bit different, now that we’d expanded the corpus on which the model was trained to include art collectors, artists from the Harlem Renaissance, and stand-up comedians. It was a bizarre time.

The night wrapped up, again, with the eating of food and picking of favorites:

  • “She embroidered the ideas of Laura de Gioste on a seaside tree.”
  • “Many works found considerable readers in the airport, specifically the painting called Neue Big Chrome.”
  • “Minerva Black’s irreverent embroidery depicted classical Greek figures alongside high-tech imagery: Athena and her computer.”
  • “When he died, she is reported to have said, ‘He became a response to himself.’”

Day 3

The evening of Halloween, and only the most dedicated remained. About ten of us arrived to a soundtrack Robin had generated just that morning. The music was not unlike the calls of a dying caribou, and about as distressing, which if nothing else fit the holiday.

The Midnight Society

A new google doc awaited us, assembled from the words we’d contributed yesterday, though significantly reduced. Robin put order to our words, replaced a few proper nouns to solidify the narrative thread, and gave us some time to read what we’d written (or be impressed by this master author’s ability to give meaning to madness).

To polish the draft off, we marked the passages that confused or displeased us, and then each spent a while fixing the problem sections: making the narrative flow, removing tangents, and tightening the prose. On the final readthrough, we vetoed changes that needed vetoing, revived a few beloved but cut lines, and generally marveled at the readability of the final piece.

Somehow, amidst the chaos of machine prose and a barely coordinated, rotating group of amateurs, we assembled a story with a narrative arc, delicious prose, and a coherent (if strange) plot.

Aftermath

We can answer the question that drove our experimental workshop: can a dozen artists, technologists, and scholars collaborate with each other and with machines to produce a readable, interesting story in under 12 hours?

Yes, if guided by a professional cyborg author like Robin Sloan.

While I can’t speak for the others, I found this to be the most refreshing writing crucible I’ve yet experienced.

I rarely get the opportunity to write fiction, but when I do, it’s a one-way street. I can send words to an empty screen, but the screen never sends words back. Over these three nights, a combination of algorithms and compatriots sat behind my blank page, and we lobbed words back and forth as though the blinking cursor were a tennis net.

Robin Sloan’s algorithmic writing companion works an awful lot like Gmail’s new predictive sentence completion, just turned upside-down. It expands rather than constrains a text’s possible futures. Whether this bodes a new era of writing, I cannot say. The experience rhymes with Oulipo, but reads more accessibly. If ease and mass distribution are the tailwinds of 21st century change, perhaps the next decade will see the rise of a new sort of writing.

In the meantime, I’m scheming my next experiment.

Encouraging Misfits

tl;dr Academics’ individual policing of disciplinary boundaries at the expense of intellectual merit does a disservice to our global research community, which is already structured to reinforce disciplinarity at every stage. We should work harder to encourage research misfits to offset this structural pull.


The academic game is stacked to reinforce old community practices. PhDs aren’t only about specialization, but about teaching you to think, act, write, and cite like the discipline you’ll soon join. Tenure is about proving to your peers you are like them. Publishing and winning grants are as much about goodness of fit as about quality of work.

This isn’t bad. One of science’s most important features is that it’s often cumulative or at least agglomerative, that scientists don’t start from scratch with every new project, but build on each other’s work to construct an edifice that often resembles progress. The scientific pipeline uses PhDs, tenure, journals, and grants as built-in funnels, ensuring everyone is squeezed snugly inside the pipes at every stage of their career. It’s a clever institutional trick to keep science cumulative.

But the funnels work too well. Or at least, there’s no equally entrenched clever institutional mechanism for building new pipes, for allowing the development of new academic communities that break the mold. Publishing in established journals that enforce their community boundaries is necessary for your career; most of the world’s scholarly grant programs are earmarked for and evaluated by specific academic communities. It’s easy to be disciplinary, and hard to be a misfit.

To be sure, this is a known problem. Patches abound. Universities set aside funds for “interdisciplinary research” or “underfunded areas”; postdoc positions, centers, and antidisciplinary journals exist to encourage exactly the sort of weird research I’m claiming has little place in today’s university. These solutions are insufficient.

University or even external grant programs fostering “interdisciplinarity” for its own sake become mostly useless because of the laws of Goodhart & Campbell. They’re usually designed to bring disciplines together rather than to sidestep disciplinarity altogether; that approach, while admirable, is pretty easy to game, and often leads to awkward alliances of convenience.

Dramatic rendition of types of -disciplinarity from Lotrecchiano in 2010, shown here never actually getting outside disciplines.

Universities do a bit better in encouraging certain types of centers that, rather than being “interdisciplinary”, are focused on a specific goal, method, or topic that doesn’t align easily with the local department structure. A new pipe, to extend my earlier bad metaphor. The problems arise here because centers often lack the institutional benefits available to departments: they rely on soft money, don’t get kickback from grant overheads, don’t get money from cross-listed courses, and don’t get tenure lines. Antidisciplinary postdoc positions suffer a similar fate, allowing misfits to thrive for a year or so before having to go back on the job market to rinse & repeat.

In short, the overwhelming inertial force of academic institutions pulls towards disciplinarity despite frequent but half-assed or poorly-supported attempts to remedy the situation. Even when new disciplinary configurations break free of institutional inertia, presenting themselves as means to knowledge every bit as legitimate as traditional departments (chemistry, history, sociology, etc.), it can take decades for them to even be given the chance to fail.

It is perhaps unsurprising that the community which taught us about autopoiesis proved incapable of sustaining itself, though half a century on its influences are glaringly apparent and far-reaching across today’s research universities. I wonder if we reconfigured the organization of colleges and departments from scratch today, whether there would be more departments of environmental studies and fewer departments of [redacted] 1.

I bring this all up to raise awareness of the difficulty facing good work with no discernible home, and to advocate for some individual action which, though it won’t change the system overnight, will hopefully make the world a bit easier for those who deserve it.

It is this: relax the reflexive disciplinary boundary drawing, and foster programs or communities which celebrate misfits. I wrote a bit about this last year in the context of history and culturomics; historians clamored to show that culturomics was bad history, but culturomics never attempted to be good history—it attempted to be good culturomics. Though I’d argue it often failed at that as well, it should have been evaluated by its own criteria, not the criteria of some related but different discipline.

Some potential ways to move forward:

  • If you are reviewing for a journal or grant and the piece is great, but doesn’t quite fit, and you can’t think of a better home for it, urge the editor to let it in anyway.
  • If you’re a journal editor or grant program officer, be more flexible with submissions which don’t fit your mold but don’t have easy homes elsewhere.
  • If you control funds for research grants, earmark half your money for good work that lacks a home. Not “good work that lacks a home but still looks like the humanities”, or “good work that looks like economics but happens to involve a computer scientist and a biologist”, but truly homeless work. I realize this won’t happen, but if I’m advocating, I might as well advocate big!
  • If you are training graduate students, hiring faculty, or evaluating tenure cases, relax the boundary-drawing urge to say “her work is fascinating, but it’s not exactly our department.”
  • If you have administrative and financial power at a university, commit to supporting nondisciplinary centers and agendas with the creation of tenure lines, the allocation of course & indirect funds, and some of the security offered to departments.

Ultimately, we need clever systems to foster nondisciplinary thinking which are as robust as those systems that foster cumulative research. This problem is above my paygrade. In the meantime, though, we can at least avoid the urge to equate disciplinary fitness with intellectual quality.

Notes:

  1. You didn’t seriously expect me to name names, did you?

Argument Clinic

Zoe LeBlanc asked how basic statistics lead to a meaningful historical argument. A good discussion followed, worth reading, but since I couldn’t fit my response into tweets, I hoped to add a bit to the thread here on the irregular. I’m addressing only one tiny corner of her question, in a way that is peculiar to my own still-forming approach to computational history; I hope it will be of some use to those starting out.

In brief, I argue that one good approach to computational history cycles between data summaries and focused hypothesis exploration, driven by historiographic knowledge, in service to finding and supporting historically interesting agendas. There’s a lot of good computational history that doesn’t do this, and a lot of bad computational history that does, but this may be a helpful rubric to follow.

In the spirit of Monty Python, the below video has absolutely nothing to do with the discussion at hand.

Zoe’s question gets at the heart of one of the two most prominent failures of computational history in 2017 1: the inability to go beyond descriptive statistics into historical argument. 2 I’ve written before on one of the many reasons for this inability, but that’s not the subject of this post. This post covers some good practices in getting from statistics to arguments.

Describing the Past

Historians, for the most part, aren’t experimentalists. 3 Our goals vary, but they often include telling stories about the past that haven’t been told, by employing newly-discovered evidence, connecting events that seemed unrelated, or revisiting an old narrative with a fresh perspective.

Facts alone usually don’t cut it. We don’t care what Jane ate for breakfast without a so what. Maybe her breakfast choices say something interesting about her socioeconomic status, or about food culture, or about how her eating habits informed the way she lived. Alongside a fact, we want why or how it came to be, what it means, or its role in some larger or future trend. A sufficiently big and surprising fact may be worthy of note on its own (“Jane ate orphans for breakfast” or “The government did indeed collude with a foreign power”), but such surprising revelations are rare, not the only purpose for historians, and still beg for context.

Computational history has gotten away with a lot of context-free presentations of fact. 4 That’s great! It’s a sign there’s a lot we didn’t know that contemporary methods & data make easily visible. 5 Here’s an example of one of mine, showing that, despite evidence to the contrary, there is a thriving community at the intersection of history and philosophy of science:

My citation analysis showing a bridge between history & philosophy of science.

But, though we’re not running out of low-hanging fruit, the novelty of mere description is wearing thin. Knowing that a community exists between history & philosophy of science is not particularly interesting; knowing why it exists, what it changes, or whether it is less tenuous than any other disciplinary borderland are more interesting and historiographically recognizable questions.

Context is Key

So how to get from description to historical argument? Though there’s no right path, and the route depends on the type of claim, this post may offer some guidance. Before we get too far, though, a note:

Description has little meaning without context and comparison. The data may show that more people are eating apples for breakfast, but there’s a lot to unpack there before it can be meaningful, let alone relevant.

Line chart of # of people who eat apples over time.

It may be, for example, that the general population is growing just as quickly as the number of people who eat apples. If that’s the case, does it matter that apple-eaters themselves don’t seem to be making up any larger percent of the population?

Line chart of # of people who eat apples over time (left axis) compared to general population (right axis).

The answer for a historian is: of course it matters. If we were talking about casualties of war, or the number of cities in a country, rather than apples, a twofold increase in absolute value (rather than percentage of population) makes a huge difference. It’s more lives affected; it’s more infrastructure and resources for a growing nation.

But the nature of that difference changes when we know our subject of study matches population dynamics. If we’re looking at voting patterns across cities, and we notice population density correlates with party affiliation, we can use that as a launching point for so what. Perhaps sparser cities rely on fewer social services to run smoothly, leading the population to vote more conservative; perhaps past events pushed conservative families towards the outskirts; perhaps.

Without having a ground against which to contextualize our results, a base map like general population, the fact of which cities voted in which direction gives us little historical meat to chew on.

On the other hand, some surprising facts, when contextualized, leave us less surprised. A two-fold increase in apple eating across a decade is pretty surprising, until you realize it happened alongside a similar increase in population. The fact is suddenly less worthy of report by itself, though it may have implications for, say, the growth of the apple industry.

But Zoe asked about statistics, not counting, in finding meaning. I don’t want to divert this post into teaching stats, nor do I want to assume statistical knowledge, so I’ll opt for an incredibly simple metric: the ratio.

The illustration above shows an increase in both population and apple-eating, and eyeball estimates show them growing apace. If we divide the total population by the number of people eating apples, however, our story is complicated.

Line chart of # of people who eat apples over time (left axis) compared to general population (right axis). A thick blue line in the middle (left axis) shows the ratio between the two.

Though both population and apple-eating increase, in 1806 the population begins rising much more rapidly than the number of apple-eaters. 6 It is in this statistically-derived difference that the historian may find something worth exploring and explaining further.
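This sort of ratio check is simple enough to sketch in a few lines of Python; the yearly counts below are invented for illustration:

```python
# Invented yearly counts: both series rise, but population takes off
# around 1806 while apple-eating keeps its steady pace.
years = [1800, 1802, 1804, 1806, 1808, 1810]
population = [1000, 1100, 1210, 1460, 1760, 2120]
apple_eaters = [100, 110, 121, 133, 146, 161]

# The ratio exposes where the two series part ways.
ratios = [p / a for p, a in zip(population, apple_eaters)]
for year, ratio in zip(years, ratios):
    print(f"{year}: {ratio:.2f} people per apple-eater")
```

Up to 1804 the ratio holds steady; afterwards it climbs, and that climb, not either raw count, is what begs for historical explanation.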

There are many ways to compare and contextualize data, of which this is one. They aren’t worth enumerating, but the importance of contextualization is relevant to what comes next.

Question- and Data-Driven History

Computational historians like to talk about question-driven analysis. Computational history is best, we say, when it is led by a specific question or angle. The alternative is dumping a bunch of data into a statistics engine, describing it, finding something weird, and saying “oh, this looks interesting.”

When push comes to shove, most would agree the above dichotomy is false. Historical questions don’t pop out of thin air, but from a continuously shifting relationship with the past. We read primary and secondary sources, do some data entry, do some analysis, do some more reading, and through it all build up a knowledge-base and a set of expectations about the past. We also by this point have a set of claims we don’t quite agree with, or some secondary sources with stories that feel wrong or incomplete.

This is where the computational history practice begins: with a firm grasp of the history and historiography of a period, and a set of assumptions, questions, and mild disagreements.

From here, if you’re reading this blog post, you’re likely in one of two camps:

  1. You have a big dataset and don’t know what to do with it, or
  2. You have a historiographic agenda (a point to prove, a question to answer, etc.) that you don’t know how to make computationally tractable.

We’ll begin with #1.

1. I have data. Now what?

Congratulations, you have data!

Congratulations!

This is probably the thornier of the two positions, and the one more prone to results of mere description. You want to know how to turn your data into interesting history, but you may end up doing little more than enumerating the blades of grass on a field. To avoid that, you must embark on a process sometimes called scalable reading, or a special case of the hermeneutic circle.

You start, of course, with mere description. How many records are there? What are the values in each? Are there changes over time or place? Who is most central? Before you start quantifying the data, write down the answers you expect to these questions, with a bit of a causal explanation for each.

Now, barrage your dataset with visualizations and statistical tests to find out exactly what makes it up. See how the results align with the hypotheses you noted down. If you created the data yourself, one archival visit at a time, you won’t find a lot that surprises you. That’s alright. Be sure to take time to consider what’s missing from the dataset, due to archival lacunae, bias, etc.
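Even the standard library gets you through most of that first descriptive barrage. A sketch, with invented records standing in for an archival dataset:

```python
from collections import Counter

# Invented stand-ins for archival records of scholarly letters.
records = [
    {"year": 1601, "sender": "Mersenne", "place": "Paris"},
    {"year": 1601, "sender": "Peiresc", "place": "Aix"},
    {"year": 1605, "sender": "Mersenne", "place": "Paris"},
    {"year": 1605, "sender": "Mersenne", "place": "Paris"},
]

# How many records? What values appear? Changes over time? Who is central?
print("records:", len(records))
print("letters per year:", Counter(r["year"] for r in records))
print("most prolific sender:", Counter(r["sender"] for r in records).most_common(1))
```

Compare each printed answer against the hypotheses you wrote down beforehand; the mismatches are the interesting part.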

If any results surprise you, dig into the data to try to understand why. If none do, think about claims from secondary sources–do any contradict the data? Align with it?

This is also a good point to bring in contextualization. If you’re looking at the number of people doing something over time, try to compare your dataset to population dynamics. If you’re looking at word usage, find a way to compare your data to base frequencies of that word in similar collections. If you’re looking at social networks, compare them to random networks of similar size to see whether their average path length or degree distribution is surprising. Every unexpected result is an opportunity for exploration.
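The random-network comparison can be sketched with nothing but the standard library (a real analysis would more likely reach for networkx; the “observed” star network below is invented to make the contrast vivid):

```python
import random
from collections import deque
from itertools import combinations

def avg_path_length(edges, n):
    """Mean shortest-path length over reachable pairs, via BFS from each node."""
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    total = pairs = 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())  # dist[src] == 0 adds nothing
        pairs += len(dist) - 1       # pairs reachable from src
    return total / pairs

# Hypothetical "observed" network: one central correspondent (node 0)
# linked to 29 others -- a star, utterly dependent on its hub.
star = [(0, i) for i in range(1, 30)]

# Baseline: 20 random networks with the same node and edge counts.
random.seed(0)
all_pairs = list(combinations(range(30), 2))
baseline = [avg_path_length(random.sample(all_pairs, len(star)), 30)
            for _ in range(20)]

print(f"star:        {avg_path_length(star, 30):.2f}")
print(f"random mean: {sum(baseline) / len(baseline):.2f}")
```

The star’s average path length sits just under 2, since everyone is at most two hops apart through the hub; it’s the gap between observed and baseline, not either number alone, that’s worth explaining.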

Internal comparisons may also yield interesting points to pursue further, especially if you think your data are biased. Given a limited dataset of actors, their genders, their roles, and play titles, for example, you may not be able to make broad claims about which plays are more popular, but you could see how different roles are distributed across genders within the group.
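Here is what that internal comparison might look like on an invented cast list (the role types and counts are made up):

```python
from collections import Counter

# An invented cast list: (role type, performer gender).
cast = [("lead", "M"), ("lead", "M"), ("lead", "F"),
        ("comic", "M"), ("comic", "F"), ("comic", "F"),
        ("chorus", "F"), ("chorus", "F"), ("chorus", "M")]

by_gender = Counter(gender for _, gender in cast)

# Within each gender, what share of that gender's roles is each type?
for role in ("lead", "comic", "chorus"):
    shares = {g: sum(1 for r, gg in cast if r == role and gg == g) / by_gender[g]
              for g in ("M", "F")}
    print(role, shares)
```

No broad claim about the theatre at large follows, but within the dataset the within-gender shares are directly comparable.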

Internal comparisons could also be temporal. Given a dataset of occupations over time with a particular city, if you compare those numbers to population changes over time, you could find the moments where population and occupation dynamics part ways, and focus on those instances. Why, suddenly, are there more grocers?

The above boils down into two possible points of further research: deviations from expectation, or deviations from internal consistency.

Deviations from expectation–your own or that of some notable secondary source–can be particularly question-provoking. “Why didn’t this meet expectations” quickly becomes “what is wrong or incomplete about this common historical narrative?” From here, it’s useful to dig down into the points of data that exemplify such deviations, and see if you can figure out why and how they break from expectations.

Deviations from internal consistency–that is, when comparisons within the data wind up showing different trends–lead to positive rather than negative questions. Instead of “why is this theory wrong?”, you may ask, “why are these groups different?” or “why does this trend cease to keep pace with population during these decades?” Here you are asking specific questions that require new or shifted theories, whereas with deviations from expectations, you begin by seeing where existing narratives fail.

It’s worth reiterating that, in both scenarios, questions are drawn from deviations from some underlying theory.

In deviations from expectation, the underlying theory is what you bring to your data; you assume the data ought to look one way, but it doesn’t. You are coming with an internal, if not explicit, quantitative model of how the data ought to look.

In deviations from internal consistency, the data’s descriptive statistics provide the underlying theory against which there may be deviations. Apple-eaters deviating in number from population growth is only interesting if, at most points, apple-eaters grow evenly alongside population. That is, you assume general statistics should be the same between groups or over time, and if they are not, it is worthy of explanation.

This is an oversimplification, but a useful one. Undoubtedly, combinations of the two will arise: maybe you expect the differences between men and women in roles they play will be large, but it turns out they are small. This provides a deviation of both kinds, but no less legitimate for it. In this case, your recourse may be looking for other theatrical datasets to see if the gender dynamics play out the same across them, or if your data are somehow special and worthy of explanation outside the context of larger gender dynamics.

Which brings us, inexorably, to the cyclic process of computational history. Scalable reading. The hermeneutic circle. Whatever.

Point is, you’re at the point where some deviation or alignment seems worth explanation or exploration. You could stop here. You could present this trend, give a convincing causal just-so story of why it exists, and leave it at that. You will probably get published, since you’ve already gone farther than mere description, the trap of so much computational history.

But you shouldn’t stop here. You should take this opportunity to strengthen your story. Perhaps this is the point where you put your “traditional” historian’s cap back on, and go dust-diving for archival evidence to support your claims. I wouldn’t think less of you for it, but if you stop there, you’d only be reaping half the advantages of computational history.

In the example above, looking for other theatrical datasets to contextualize the gender results in your own hinted at the second half of the computational history research cycle: creating computationally tractable questions. Recall that this section described the first half: making sense of data. Although I presented the two as separate, they productively feed on one another.

Once you’ve gone through your data to find how it aligns with your or others’ preconceived notions of the past, or how by its own internal deviations it presents interesting dilemmas, you have found yourself in the second half of the cycle. You have questions or theories you want to ask of data, but you do not yet have the data or the statistics to explore them.

This seems counter-intuitive. Why not just use the data or statistics already gathered, sometimes painstakingly over several years? Because if you use the same data & stats to both generate and answer questions, your evidence is circular. Specifically, you risk making a scientistic claim of what could easily be a spurious trend. It may simply be that, by random chance, the breakfast record-keeper lost a bunch of records from 1806-1810, thus causing the decline seen in the population ratio.

To convincingly make arguments from a historical data description, you must back it up using triangulation–approaching the problem from many angles. That triangulation may be computational, archival, archaeological, or however else you’re used to historying, but we’ll focus here on computational.

2. Computationally Tractable Questions

So you’ve got a historiographic agenda, and now you want to make it computationally tractable. Good luck! This is the hard part.

Good luck!

“Sparse areas relied less on social services.” “The infrastructure of science became less dependent on specific individuals over the course of the 17th century.” “T-Rex was a remarkable climber.” “Who benefited most from the power vacuum left by the assassination?” These hypotheses and questions do not, on their own, lend themselves to quantitative analysis.

Chief among the common difficulties of turning a historiographic agenda into a computationally tractable hypothesis is a lack of familiarity with computational methods. If you don’t know what a computer is good at, you can’t form an experiment to use one.

I said that history isn’t experimental, but I lied. Archival research can be an experiment if you go in with a hypothesis and a pre-conceived approach or set of criteria that would confirm it. Computational history, at this stage, is also experimental. It often works a little like this (but it may not): 7

  1. Set your agenda. Start with a hypothesis, historiographic framework, or question. For example, “The infrastructure of science became less dependent on specific individuals over the course of the 17th century.” (that question’s mine, don’t steal it.)
  2. Find testable hypotheses. Break it into many smaller statements that can be confirmed, denied, or quantitatively assessed. “If science depends less on specific individuals over the 17th century, the distribution of names mentioned in scholarly correspondence will flatten out. That is, in 1600 a few people will be mentioned frequently, whereas most will be mentioned infrequently; in 1700, the frequency of name mentions will be more evenly distributed across correspondence.” Or “If science depends less on specific individuals over the 17th century, when an important person died, it affected the scholarly network less in 1700 than in 1600.” (Notice in these two examples how finding evidence for the littler statements will corroborate the bigger hypothesis, and vice-versa.)
  3. Match hypotheses to approaches. Come up with methodological proxies, datasets, and/or statistical tests that could corroborate the littler statements. Be careful, thorough, and specific. For example, “In a network of 17th-century letter writers, if the removal of a central figure in 1700 changes the average path length of the network less than the removal of a central figure in 1600, central figures likely played less important structural roles by the century’s end. This will be most convincing if the effect of node removal smoothly decreases across the century.” (This is the step in which you need to come to the table with knowledge of different computational methods and what they do.)
  4. Specify proxies. List specific analytic approaches needed for the promising tests, and the data required to do them. For example, you need a list of senders and recipients of scholarly letters, roughly evenly distributed across time between 1600 and 1700, and densely-packed enough to perform network analysis. There could be a few different analytic approaches, including removing highly-central nodes and re-calculating average path length; employing measurements of attack tolerance; etc. Probably worth testing them all and seeing if each result conforms to the pre-existing theory.
  5. Find data. Find pre-existing datasets that will fit your proxies, or estimate how long it will take to gather enough data yourself to reasonably approach your hypotheses. Opt for data that will work for as many approaches as possible. You may find some data that will suggest new hypotheses, and you’ll iterate back and forth between steps #3-#5 a few times.
  6. Collect data. Run experiments. Uh, yeah, just do those things. Easy as baking apple pie from scratch.
  7. Match experimental results to hypotheses. Here’s the fun part, you get to see how many of your predictions matched your results. Hopefully a bunch, but even if they didn’t, it’s an excuse to figure out why, and start the process anew. You can also start exploring the additional datasets to help you develop new questions. The astute may have noticed, this step brings us back to the first half of computational historiography: exploring data and seeing what you can find. 8
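To make step 2 concrete, here is one way the first testable hypothesis might be quantified: measure how unevenly name mentions are distributed with a Gini coefficient (0 means perfectly even; values near 1 mean a few names dominate). The mention counts below are invented:

```python
def gini(counts):
    """Gini coefficient of a list of non-negative counts."""
    xs = sorted(counts)
    n = len(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * sum(xs)) - (n + 1) / n

# Invented per-person mention counts in scholarly correspondence.
mentions_1600 = [120, 8, 5, 3, 2, 2, 1, 1, 1, 1]      # a few names dominate
mentions_1700 = [15, 14, 12, 11, 10, 10, 9, 8, 8, 7]  # much more even

print(f"1600: {gini(mentions_1600):.2f}")
print(f"1700: {gini(mentions_1700):.2f}")
```

If real correspondence data showed the 1700 distribution markedly flatter than the 1600 one, that would corroborate (not prove) the larger hypothesis about infrastructure and individuals.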

From here, it may be worthwhile to cycle back to the data exploration stage, then back here to computationally tractable hypothesis exploration, and so on ad infinitum.

By now, making meaning out of data probably feels impossible. I’m sorry. The process is much more fluid and intertwined than is easily unpacked in a blog post. The back-and-forth can take hours, days, months, or years.

But the important thing is, after you’ve gone back-and-forth a few times, you should have a combination of quantitative, archival, theoretical, and secondary support for a solidly historical argument.

Contexts of Discovery and Justification

Early 20th-century philosophy of science cared a lot about the distinction between the contexts of discovery and justification. Violently shortened, the context of discovery is how you reached your conclusion, and the context of justification is how you argue your point, regardless of the process that got you there.

I bring this up as a reminder that the two can be distinct. By the 1990s, quantitative historians who wanted to remain legible to their non-quantitative colleagues often saved the data analysis for an appendix, and even there the focus was on the actual experiments, not the long process of coming up with tests, re-testing, collecting more data, and so on.

The result of this cyclical computational historiography need not be (and rarely is, and perhaps can never be) a description of the process that led you to the evidence supporting your argument. While it’s a good idea to be clear about where your methods led you astray, the most legible result to historians will necessarily involve a narrative reconfiguration.

Causality and Truth

Small final notes on two big topics.

First, Causality. This approach won’t get you there. It’s hard to disentangle causality from correlation, but more importantly in this context, it’s hard to choose between competing causal explanations. The above process can lead you to plausible and corroborated hypotheses, but it cannot prove anything.

Consider this: “My hypothesis about apples predicts these 10 testable claims.” You test each claim, and each test agrees with your predictions. It’s a success, but a soft one; you’ve shown your hypothesis to be plausible given the evidence, but not inevitable. A dozen other equally sensible hypotheses could have produced the same 10 testable claims. You did not prove those hypotheses wrong, you just chose one model that happened to work. 9

Even if no alternate hypothesis presents itself, and all of your tests agree with your hypothesis, you still do not have causal proof. It may be that the proxies you chose to test your claims are bad ones, or incomplete, or your method has unseen holes. Causality is tricky, and in the humanities, proof especially so.

Which leads us to the next point: Truth. Even if somehow you devise the perfect process to find proof of a causal hypothesis, the causal description does not constitute capital-T Truth. There are many truths, coming from many perspectives, about the past, and they don’t need to agree with each other. Historians care not just about what happened, but how and why, and those hows and whys are driven by people. Messy, inconsistent people who believe many conflicting things within the span of a moment. When it comes to questions of society, even the most scientistic of scholars must come to terms with uncertainty and conflict, which after all are more causally central to the story of history than most clever narratives we might tell.

Notes:

  1. Also called digital history, and related to quantitative history and cliometrics in ways we don’t often like to admit.
  2. The other most prominent failure in computational history is our tendency to group things into finite discrete categories; in this case, a two-part list of failures.
  3. With some notable exceptions. Some historians simulate the past, others perform experiments on rates of material decay, or on the chemical composition of inks. It’s a big world out there.
  4. When I say fact, assume I add all the relevant post-modernist caveats of the contingency of objectivity etc. etc. Really I mean “matters of history that the volume of available evidence make difficult to dispute.”
  5. Ted Underwood and I have both talked about the exciting promise of incredibly low-hanging fruit in new approaches.
  6. OK in retrospect I should have used a more historically relevant example – I wasn’t expecting to push this example so far.
  7. If this seems overly scientistic, worry not! Experimental science is often defined by its recourse to rote procedure, which means pretty much any procedural explanation of research will resemble experimental science. There are many ways one can go about scalable reading / triangulation of computational historiography, not just the procedural steps #1-#7 above, but this is one of the easier approaches to explain. Soft falsification and hypothesis testing are plausible angles into computational history, but not necessary ones.
  8. A brief addendum to steps #6-#7: although I’d argue Null-Hypothesis Significance Testing or population-based statistical inferences may not be relevant to historiography, especially when it’s based in triangulation, they may be useful in certain cases. Without delving too deeply into the weeds, they can help you figure out the extent to which the effect you see may just be noise, not indicative of any particular trend. Statistical effect sizes also may be of use, helping you see whether the magnitude of your finding is big enough to have any appreciable role in the historical narrative.
  9. Shawn Graham and I wrote about this in relation to archaeology and simulation here, on the subject of underdetermination and abduction.

Teaching Yourself to Code in DH

tl;dr Book-length introductions to programming or analytic methods (math / statistics / etc.) aimed at or useful for humanists with limited coding experience.


I’m collecting programming & methodological textbooks for humanists as part of a reflective study on DH, but figured it’d also be useful for those interested in teaching themselves to code, or teachers who need a textbook for their class. Though I haven’t read them all yet, I’ve organized them into very imperfect categories and provided (hopefully) some useful comments. Short coding exercises, books that assume some pre-existing knowledge of coding, and theoretical introductions are not listed here.

Thanks to @Literature_Geek, @ProgHist, @heatherfro, @electricarchaeo, @digitaldante, @kintopp, @dmimno, & @collinj for their contributions to the growing list. In the interest of maintaining scope, not all of their suggestions appear below.

Historical Analysis

  • The Programming Historian, 1st edition (2007). William J. Turkel and Alan MacEachern.
    • An open access introduction to programming in Python. Mostly web scraping and basic text analysis. Probably best to look to newer resources, due to the date. Although it’s aimed at historians, the methods are broadly useful to all text-based DH.
  • The Programming Historian, 2nd edition (ongoing). Afanador-Llach, Maria José, Antonio Rojas Castro, Adam Crymble, Víctor Gayol, Fred Gibbs, Caleb McDaniel, Ian Milligan, Amanda Visconti, and Jeri Wieringa, eds.
    • Constantly updating lessons, ostensibly aimed at historians, but useful to all of DH. Includes introductions to web development, text analysis, GIS, network analysis, etc. in multiple programming languages. Not a monograph, and no real order.
  • Computational Historical Thinking with Applications in R (ongoing). Lincoln Mullen.
    • A series of lessons in R, still under development with quite a few chapters missing. Probably the only programming book aimed at historians that actually focuses on historical questions and approaches.
  • The Rubyist Historian (2004). Jason Heppler.
    • A short introduction to programming in Ruby. Again, ostensibly aimed at historians, but really just focused on the fundamentals of coding, and useful in that context.
  • Natural Language Processing for Historical Texts (2012). Michael Piotrowski.
    • About natural language processing, but not an introduction to coding. Instead, an introduction to the methodological approaches of natural language processing specific to historical texts (OCR, spelling normalization, choosing a corpus, part of speech tagging, etc.). Teaches a variety of tools and techniques.
  • The Historian’s Macroscope (2015). Graham, Milligan, & Weingart.
    • Okay I’m cheating a bit here! This isn’t teaching you to program, but Shawn, Ian, and I spent a while writing this intro to digital methods for historians, so I figured I’d sneak a link in.

Literary & Linguistic Analysis

  • Text Analysis with R for Students of Literature (2014). Matthew Jockers.
    • Step-by-step introduction to learning R, specifically focused on literary text analysis, both for close and distant reading, with primers on the statistical approaches being used. Includes approaches to, e.g., word frequency distribution, lexical variety, classification, and topic modeling.
  • The Art of Literary Text Analysis (ongoing). Stéfan Sinclair & Geoffrey Rockwell.
    • A growing, interactive textbook similar in scope to Jockers’ book (close & distant reading in literary analysis), but in Python rather than R. Heavily focused on the code itself, and includes such methods as topic modeling and sentiment analysis.
  • Statistics for Corpus Linguistics (1998). Michael Oakes.
    • Don’t know anything about this one, sorry!

General Digital Humanities

Many of the above books are focused on literary or historical analysis only in name, but are really useful for everyone in DH. The below are similar in scope, but don’t aim themselves at one particular group.

  • Humanities Data in R (2015). Lauren Tilton & Taylor Arnold.
    • General introduction to programming through R, and broadly focused on many approaches, including basic statistics, networks, maps, texts, and images. Teaches concepts and programmatic implementations.
  • Digital Research Methods with Mathematica (2015). William J. Turkel.
    • A Mathematica notebook (thus, not accessible unless you have an appropriate reader) teaching text, image, and geo-based analysis. Mathematica itself is an expensive piece of software without an institutional license, so this resource may be inaccessible to many learners. [NOTE: Arno Bosse wrote positive feedback on this textbook in a comment below.]
  • Exploratory Programming for the Arts and Humanities (2016). Nick Montfort.
    • An introduction to the fundamentals of programming specifically for the arts and humanities, using the languages Python and Processing, that goes through statistics, text, sound, animation, images, and so forth. Much more expansive than many other options listed here, but not as focused on the needs of text analysis (which is probably a good thing).
  • An Introduction to Text Analysis: A Coursebook (2016). Brandon Walsh & Sarah Horowitz.
    • A brief textbook with exercises and explanatory notes specific to text analysis for the study of literature and history. Not an introduction to programming, but covers some of the mathematical and methodological concepts used in these sorts of studies.
  • Python Programming for Humanists (ongoing). Folgert Karsdorp and Maarten van Gompel.
    • Interactive (Jupyter) notebooks teaching Python for statistical text analysis. Quite thorough, teaching methodological reasoning and examples, including quizzes and other lesson helpers, going from basic tokenization up through unsupervised learning, object-oriented programming, etc.
  • Technical Foundations of Informatics (2017). Michael Freeman and Joel Ross.
    • Teaches the start-to-finish skills needed to write code to work with data, from command line to markdown to github to R and ggplot2. Not aimed at humanists, but aimed at those with no prior technical experience.
  • Hacking the Humanities (2018). Paul Vierthaler.
    • A series of videos and github code snippets/examples for Vierthaler’s DH class at Leiden University (see the syllabus). General introduction to coding principles, followed by specific needs of digital humanists, like text analysis and web scraping.

Statistical Methods & Machine Learning

  • Statistics for the Humanities (2014). John Canning.
    • Not an introduction to coding of any sort, but a solid intro to statistics geared toward the sort of stats needed by humanists (archaeologists, literary theorists, philosophers, historians, etc.). Reading this should give you a solid foundation of statistical methods (sampling, confidence intervals, bias, etc.).
  • Data Mining: Practical Machine Learning Tools and Techniques, 4th edition (2016). Witten, Frank, Hall, & Pal.
    • A practical intro to machine learning in Weka, Java-based software for data mining and modeling. Not aimed at humanists, but legible to the dedicated amateur. It really gets into the weeds of how machine learning works.
  • Text Mining with R (2017). Julia Silge and David Robinson.
    • Introduction to text mining aimed at data scientists in the statistical programming language R. Some knowledge of R is expected; the authors suggest using R for Data Science (2016) by Grolemund & Wickham to get up to speed. This is for those interested in current data science coding best-practices, though it does not get as in-depth as some other texts focused on literary text analysis. Good as a solid base to learn from.
  • The Curious Journalist’s Guide to Data (2016). Jonathan Stray.
    • Not an intro to programming or math, but rather a good guide to quantitatively thinking through evidence and argument. Aimed at journalists, but of potential use to more empirically-minded humanists.
  • Six Septembers: Mathematics for the Humanist (2017). Patrick Juola & Stephen Ramsay.
    • Fantastic introduction to simple and advanced mathematics written by and for humanists. Approachable, prose-heavy, and grounded in humanities examples. Covers topics like algebra, calculus, statistics, differential equations. Definitely a foundations text, not an applications one.

Data Visualization, Web Development, & Related

  • The JavaScripting English Major (2017). Moacir P. de Sá Pereira.
    • “A 15-session course that teaches humanities undergraduates how to write their own web-based project using JavaScript.” Short textbook-style course with exercises that focuses particularly on web-mapping with Leaflet.
  • Data Visualization for Social Science: A practical introduction with R and ggplot2 (2017). Kieran Healy
    • A “hands-on introduction to the principles and practice of looking at and presenting data using R and ggplot” that introduces readers “to both the ideas and the methods of data visualization in a comprehensible and reproducible way”. Incredibly thorough, painstakingly annotated, and though not aimed directly at humanists, is close enough in scope to be more valuable than a general introduction to data science.
  • Interactive Information Visualization (2017). Michael Freeman.
    • Introduction to the skills, tools, and setup required to create interactive web visualizations, briefly covering everything from HTML to D3.js. Not aimed at the humanities, but aimed at those with no prior experience with code.
  • D3.js in Action, 2nd edition (2017). Elijah Meeks.
    • Introduction to programmatic, online data visualization in javascript and the library D3.js. Not aimed at the humanities, but written by a digital humanist; easy to read and follow. The great thing about D3 is it’s a library for visualizing something in whatever fashion you might imagine, so this is a good book for those who want to design their own visualizations rather than using off-the-shelf tools.
  • Drupal for Humanists (2016). Quinn Dombrowski.
    • Full-length introduction to Drupal, a web platform that allows you to build “environments for gathering, annotating, arranging, and presenting their research and supporting materials” on the web. Useful for those interested in getting started with the creation of web-based projects but who don’t want to dive head-first into from-scratch web development.
  • (Xe)LaTeX appliqué aux sciences humaines (2012). Maïeul Rouquette, Brendan Chabannes et Enimie Rouquette.
    • French introduction to LaTeX for humanists. LaTeX is the primary means scientists use to prepare documents (instead of MS Word or similar software), which allows for more sustainable, robust, and easily typeset scholarly publications. If humanists wish to publish in natural (or some social) science journals, this is an important skill.

Submissions to DH2017 (pt. 1)

Like many times before, I’m analyzing the international digital humanities conference, this time the 2017 conference in Montréal. The data I collect is available to any conference peer reviewer, though I do a bunch of scraping, cleaning, scrubbing, shampooing, anonymizing, etc. before posting these results.

This first post covers the basic landscape of submissions to next year’s conference: how many submissions there are, what they’re about, and so forth.

The analysis is opinionated and sprinkled with my own preliminary interpretations. If you disagree with something or want to see more, comment below, and I’ll try to address it in the inevitable follow-up. If you want the data, too bad—since it’s only available to reviewers, there’s an expectation of privacy. If you are sad for political or other reasons and live near me, I will bring you chocolate; if you are sad and do not live near me, you should move to Pittsburgh. We have chocolate.

Submission Numbers & Types

I’ll be honest, I was surprised by this year’s submission numbers. This will be the first ADHO conference held in North America since it was held in Nebraska in 2013, and I expected an influx of submissions from people who haven’t been able to travel off the continent for interim events. I expected the biggest submission pool yet.

Submissions per year by type.

What we see, instead, are fewer submissions than Kraków last year: 608 in all. The low number of submissions to Sydney was expected, given it was the first conference held outside Europe or North America, but this year’s numbers suggest the DH Hype Machine might be cooling somewhat, after five years of rapid growth.

Annual presentations at DH conferences, compared to growth of DHSI in Victoria, 1999-2015.

We need some more years and some more DH-Hype-Machine Indicators to be sure, but I reckon things are slowing down.

The conference offers five submission tracks: Long Paper, Short Paper, Poster, Panel, and (new this year) Virtual Short Paper. The distribution is pretty consistent with previous years, with the only deviation being in Sydney in 2015. Apparently Australians don’t like short papers or posters?

I’ll be interested to see how the “Virtual Short Paper” works out. Since authors need to decide on this format before submitting, it doesn’t allow the flexibility of seeing if funding will become available over the course of the year. Still, it’s a step in the right direction, and I hope it succeeds.

Co-Authorship

More of the same! If nothing else, we get points for consistency.

Percent of Co-Authorships

Same as it ever was, nearly half of all submissions are by a single author. I don’t know if that’s because humanists need to justify their presentations to hiring and tenure committees who only respect single authorship, or if we’re just used to working alone. A full 80% of submissions have three or fewer authors, suggesting large teams are still not the norm, or that we’re not crediting all of the labor that goes into DH projects with co-authorships. [Post-publication note: See Adam Crymble’s comment, below, for important context]

Language, Topic, & Discipline

Authors choose from several possible submission languages. This year, 557 submissions were received in English, 40 in French, 7 in Spanish, 3 in Italian, and 1 in German. That’s the easy part.

The Powers That Be decided to make my life harder by changing up the categories authors can choose from for 2017. Thanks, Diane, ADHO, or whoever decided this.

In previous years, authors chose any number of keywords from a controlled vocabulary of about 100 possible topics that applied to their submission. Among other purposes, it helped match authors with reviewers. The potential topic list was relatively static for many years, allowing me to analyze the change in interest in topics over time.

This year, they added, removed, and consolidated a bunch of topics, as well as divided the controlled vocabulary into “Topics” (like metadata, morphology, and machine translation) and “Disciplines” (like disability studies, archaeology, and law). This is ultimately good for the conference, but makes it difficult for me to compare this against earlier years, so I’m holding off on that until another post.

But I’m not bitter.

This year’s options are at the bottom of this post in the appendix. Words in red were added or modified this year, and the final list contains topics that used to exist but no longer do.

So let’s take a look at this year’s breakdown by discipline.

Disciplinary breakdown of submissions

Huh. “Computer science”—a topic which last year did not exist—represents nearly a third of submissions. I’m not sure how much this topic actually means anything. My guess is the majority of people using it are simply signifying the “digital” part of their “Digital Humanities” project, since the topic “Programming”—which existed in previous years but not this year—used to only connect to ~6% of submissions.

“Literary studies” represents 30% of all submissions, more than any previous year (usually around 20%), whereas “historical studies” has stayed stable with previous years, at around 20% of submissions. These two groups, however, can be pretty variable year-to-year, and I’m beginning to suspect that their use by authors is not consistent enough to take as meaningful. More on that in a later post.

That said, DH is clearly driven by lit, history, and library/information science. L/IS is a new and welcome category this year; I’ve always suspected that DHers are as much from L/IS as the humanities, and this lends evidence in that direction. Importantly, it also makes apparent a dearth in our disciplinary genealogies: when we trace the history of DH, we talk about the history of humanities computing, the history of the humanities, the history of computing, but rarely the history of L/IS.

I’ll have a more detailed breakdown later, but there were some surprises in my first impressions. “Film and Media Studies” is way up compared to previous years, as are other non-textual disciplines, which refreshingly shows (I hope) the rise of non-textual sources in DH. Finally. Gender studies and other identity- or intersectional-oriented submissions also seem to be on the rise (this may be an indication of US academic interests; we’ll need another few years to be sure).

If we now look at Topic choices (rather than Discipline choices, above), we see similar trends.

Topical distribution of submissions

Again, these are just first impressions, there’ll be more soon. Text is still the bread and butter of DH, but we see more non-textual methods being used than ever. Some of the old favorites of DH, like authorship attribution, are staying pretty steady against previous years, whereas others, like XML and encoding, seem to be decreasing in interest year after year.

One last note on Topics and Disciplines. There’s a list of discontinued topics at the bottom of the appendix. Most of them have simply been consolidated into other categories; however, one set is conspicuously absent: meta-discussions of DH. There are no longer categories for DH’s history, theory, how it’s taught, or its institutional support. These were pretty popular categories in previous years, and I’m not certain why they no longer exist. Perusing the submissions, there are certainly several that fall into these categories.

What’s Next

For Part 2 of this analysis, look forward to more thoughts on the topical breakdown of conference submissions; preliminary geographic and gender analysis of authors; and comparisons with previous years. After that, who knows? I take requests in the comments, but anyone who requests “Free Bird” is banned for life.

Appendix: Controlled Vocabulary

Words in red were added or modified this year, and the final list contains topics that used to exist but no longer do.

Topics

  • 3D Printing
  • agent modeling and simulation
  • archives, repositories, sustainability and preservation
  • audio, video, multimedia
  • authorship attribution / authority
  • bibliographic methods / textual studies
  • concording and indexing
  • content analysis
  • copyright, licensing, and Open Access
  • corpora and corpus activities
  • crowdsourcing
  • cultural and/or institutional infrastructure
  • data mining / text mining
  • data modeling and architecture including hypothesis-driven modeling
  • databases & dbms
  • digitisation – theory and practice
  • digitisation, resource creation, and discovery
  • diversity
  • encoding – theory and practice
  • games and meaningful play
  • geospatial analysis, interfaces & technology, spatio-temporal modeling/analysis & visualization
  • GLAM: galleries, libraries, archives, museums
  • hypertext
  • image processing
  • information architecture
  • information retrieval
  • interdisciplinary collaboration
  • interface & user experience design/publishing & delivery systems/user studies/user needs
  • internet / world wide web
  • knowledge representation
  • lexicography
  • linking and annotation
  • machine translation
  • metadata
  • mobile applications and mobile design
  • morphology
  • multilingual / multicultural approaches
  • natural language processing
  • networks, relationships, graphs
  • ontologies
  • project design, organization, management
  • query languages
  • scholarly editing
  • semantic analysis
  • semantic web
  • social media
  • software design and development
  • speech processing
  • standards and interoperability
  • stylistics and stylometry
  • teaching, pedagogy and curriculum
  • text analysis
  • text generation
  • universal/inclusive design
  • virtual and augmented reality
  • visualisation
  • xml

Disciplines

  • anthropology
  • archaeology
  • art history
  • asian studies
  • classical studies
  • computer science
  • creative and performing arts, including writing
  • cultural studies
  • design
  • disability studies
  • english studies
  • film and media studies
  • folklore and oral history
  • french studies
  • gender studies
  • geography
  • german studies
  • historical studies
  • italian studies
  • law
  • library & information science
  • linguistics
  • literary studies
  • medieval studies
  • music
  • near eastern studies
  • philology
  • philosophy
  • renaissance studies
  • rhetorical studies
  • sociology
  • spanish and spanish american studies
  • theology
  • translation studies

No Longer Exist

  • Digital Humanities – Facilities
  • Digital Humanities – Institutional Support
  • Digital Humanities – Multilinguality
  • Digital Humanities – Nature And Significance
  • Digital Humanities – Pedagogy And Curriculum
  • Genre-specific Studies: Prose, Poetry, Drama
  • History Of Humanities Computing/digital Humanities
  • Maps And Mapping
  • Media Studies
  • Other
  • Programming
  • Prosodic Studies
  • Publishing And Delivery Systems
  • Spatio-temporal Modeling, Analysis And Visualisation
  • User Studies / User Needs

Lessons From Digital History’s Antecedents

The below is the transcript from my October 29 keynote presented to the Creativity and The City 1600-2000 conference in Amsterdam, titled “Punched-Card Humanities”. I survey historical approaches to quantitative history, how they relate to the nomothetic/idiographic divide, and discuss some lessons we can learn from past successes and failures. For ≈200 relevant references, see this Zotero folder.


Title Slide

I’m here to talk about Digital History, and what we can learn from its quantitative antecedents. If yesterday’s keynote was framing our mutual interest in the creative city, I hope mine will help frame our discussions around the bottom half of the poster; the eHumanities perspective.

Specifically, I’ve been delighted to see at this conference a rich interplay between familiar historiographic and cultural approaches and digital or eHumanities methods, all being brought to bear on the creative city. I want to take a moment to talk about where these two approaches meet.

Yesterday’s wonderful keynote brought up the complicated goal of using new digital methods to explore the creative city without flattening it into reductive indices. Are we living up to that goal? I hope a historical take on this question might help us move in that direction: by learning from those historiographic moments when formal methods failed, we can do better this time.

Creativity Conference Theme

Digital History is different, we’re told. “New”. Many of us know historians who used computers in the 1960s, for things like demography or cliometrics, but what we do today is a different beast.

Commenting on these early punched-card historians, in 1999, Ed Ayers wrote, quote, “the first computer revolution largely failed.” The failure, Ayers claimed, was in part due to their statistical machinery not being up to the task of representing the nuances of human experience.

We see this rhetoric of newness or novelty crop up all the time. It cropped up a lot in pioneering digital history essays by Roy Rosenzweig and Dan Cohen in the 90s and 2000s, and we even see a touch of it, though tempered, in this conference’s theme.

In yesterday’s final discussion on uncertainty, Dorit Raines reminded us that the difference between quantitative history in the 70s and today’s Digital History is that today’s approaches broaden our sources, whereas early approaches narrowed them.

Slide (r)evolution

To say “we’re at a unique historical moment” is something common to pretty much everyone, everywhere, forever. And it’s always a little bit true, right?

It’s true that every historical moment is unique. Unprecedented. Digital History, with its unique combination of public humanities, media-rich interests, sophisticated machinery, and quantitative approaches, is pretty novel.

But as the saying goes, history never repeats itself, but it rhymes. Each thread making up Digital History has a long past, and a lot of the arguments for or against it have been made many times before. Novelty is a convenient illusion that helps us get funding.

Not coincidentally, it’s this tension I’ll highlight today: between revolution and evolution, between breaks and continuities, and between the historians who care more about what makes a moment unique, and those who care more about what connects humanity together.

To be clear, I’m operating on two levels here: the narrative and the metanarrative. The narrative is that the history of digital history is one of continuities and fractures; the metanarrative is that this very tension between uniqueness and self-similarity is what swings the pendulum between quantitative and qualitative historians.

Now, my claim that debates over continuity and discontinuity are a primary driver of the quantitative/qualitative divide comes a bit out of left field — I know — so let me back up a few hundred years and explain.

Chronology

Francis Bacon wrote that knowledge would be better understood if it were collected into orderly tables. His plea extended, of course, to historical knowledge, and inspired renewed interest in a genre already over a thousand years old: tabular chronology.

These chronologies were world histories, aligning the pasts of several regions which each reckoned the passage of time differently.

Isaac Newton inherited this tradition, and dabbled throughout his life in establishing a more accurate universal chronology, aligning Biblical history with Greek legends and Egyptian pharaohs.

Newton brought to history the same mind he brought to everything else: one of stars and calculations. Like his peers, Newton relied on historical accounts of astronomical observations to align simultaneous events across thousands of miles. Kepler and Scaliger, among others, also partook in this “scientific history”.

Where Newton departed from his contemporaries, however, was in his use of statistics for sorting out history. In the late 1500s, the average or arithmetic mean was popularized by astronomers as a way of smoothing out noisy measurements. Newton co-opted this method to help him estimate the length of royal reigns, and thus the ages of various dynasties and kingdoms.

On average, Newton figured, a king’s reign lasted 18-20 years. If the history books record 5 kings, that means the dynasty lasted between 90 and 100 years.
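Newton's arithmetic is simple enough to sketch in code. The following toy function is mine, not Newton's; only the 18-20 year average and the five-king example come from the text:

```python
# Newton's reign-averaging trick as a toy function (illustrative only;
# the 18-20 year average and the five-king example come from the text).
AVG_REIGN_LOW, AVG_REIGN_HIGH = 18, 20  # assumed average royal reign, in years

def dynasty_span(n_kings):
    """Estimate how long a dynasty lasted from its number of recorded kings."""
    return n_kings * AVG_REIGN_LOW, n_kings * AVG_REIGN_HIGH

low, high = dynasty_span(5)
print(f"5 recorded kings -> roughly {low}-{high} years")  # 90-100 years
```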

Newton was among the first to apply averages to fill in chronologies, though not the first to apply them to human activities. By the late 1600s, demographic statistics of contemporary life — of births, burials and the like — were becoming common. They were ways of revealing divinely ordered regularities.

Incidentally, this is an early example of our illustrious tradition of uncritically appropriating methods from the natural sciences. See? We’ve all done it, even Newton!  

Joking aside, this is an important point: statistical averages represented divine regularities. Human statistics began as a means to uncover universal truths, and they continue to be employed in that manner. More on that later, though.

Musgrave Quote

Newton’s method didn’t quite pass muster, and skepticism grew rapidly on the whole prospect of mathematical history.

Criticizing Newton in 1782, for example, Samuel Musgrave argued, in part, that there are no discernible universal laws of history operating in parallel to the universal laws of nature. Nature can be mathematized; people cannot.

Not everyone agreed. Francesco Algarotti passionately argued that Newton’s calculation of average reigns, the application of math to history, was one of his greatest achievements. Even Voltaire tried Newton’s method, aligning a Chinese chronology with Western dates using average length of reigns.

Nomothetic / Idiographic

Which brings us to the earlier continuity/discontinuity point: quantitative history stirs debate in part because it draws together two activities Immanuel Kant sets in opposition: the tendency to generalize, and the tendency to specify.

The tendency to generalize, later dubbed Nomothetic, often describes the sciences: extrapolating general laws from individual observations. Examples include the laws of gravity, the theory of evolution by natural selection, and so forth.

The tendency to specify, later dubbed Idiographic, describes, mostly, the humanities: understanding specific, contingent events in their own context and with awareness of subjective experiences. This could manifest as a microhistory of one parish in the French Revolution, a critical reading of Frankenstein focused on gender dynamics, and so forth.  

These two approaches aren’t mutually exclusive, and they frequently come in contact around scholarship of the past. Paleontologists, for example, apply general laws of biology and geology to tell the specific story of prehistoric life on Earth. Astronomers, similarly, combine natural laws and specific observations to trace the origins of our universe.

Historians have, with cyclically recurring intensity, engaged in similar efforts. One recent nomothetic example is cliodynamics, whose practitioners use data and simulations to discern generalities such as why nations fail or what causes war. Recent idiographic historians associate more with the cultural and theoretical turns in historiography, often focusing on microhistories or the subjective experiences of historical actors.

Both tend to meet around quantitative history, but the conversation began well before the urge to quantify. They often fruitfully align and improve one another when working in concert; for example when the historian cites a common historical pattern in order to highlight and contextualize an event which deviates from it.

But more often, nomothetic and idiographic historians find themselves at odds. Newton extrapolated “laws” for the length of kings, and was criticized for thinking mathematics had any place in the domain of the uniquely human. Newton’s contemporaries used human statistics to argue for divine regularities, and this was eventually criticized as encroaching on human agency, free will, and the uniqueness of subjective experience.

Bacon Taxonomy

I’ll highlight some moments in this debate, focusing on English-speaking historians, and will conclude with what we today might learn from foibles of the quantitative historians who came before.

Let me reiterate, though, that quantitative history is not the same as nomothetic history; but because the two invite each other, it would be ahistorical of me to divide them too neatly.

Take Henry Buckle, who in 1857 tried to bridge the two-culture divide posed by C.P. Snow a century later. He wanted to use statistics to find general laws of human progress, and apply those generalizations to the histories of specific nations.

Buckle was well-aware of historiography’s place between nomothetic and idiographic cultures, writing: “it is the business of the historian to mediate between these two parties, and reconcile their hostile pretensions by showing the point at which their respective studies ought to coalesce.”

In direct response, James Froude wrote that there can be no science of history. The whole idea of Science and History being related was nonsensical, like talking about the colour of sound. They simply do not connect.

This was a small exchange in a much larger Victorian debate pitting narrative history against a growing interest in scientific history. The latter rose on the coattails of growing popular interest in science, much like our debates today align with broader discussions around data science, computation, and the visible economic successes of startup culture.

This is, by the way, contemporaneous with something yesterday’s keynote highlighted: the 19th century drive to establish ‘urban laws’.

By now, we begin seeing historians leveraging public trust in scientific methods as a means for political control and pushing agendas. This happens in concert with the rise of punched cards and, eventually, computational history. Perhaps the best example of this historical moment comes from the American Census in the late 19th century.

19C Map

Briefly, a group of late 19th century American historians, journalists, and census chiefs used statistics, historical atlases, and the machinery of the census bureau to publicly argue for the disintegration of the U.S. Western Frontier.

These moves were, in part, made to consolidate power in the American West and wrest control from the native populations who still lived there. They accomplished this, in part, by publishing popular atlases showing that the western frontier was so fractured that it was difficult to maintain and defend. 1

The argument, it turns out, was pretty compelling.

Hollerith Cards

Part of what drove the statistical power and scientific legitimacy of these arguments was the new method, in 1890, of entering census data on punched cards and processing them in tabulating machines. The mechanism itself was wildly successful, and the inventor’s company wound up merging with a few others to become IBM. As was true of punched-card humanities projects through the time of Father Roberto Busa, this work was largely driven by women.

It’s worth pausing to remember that the history of punch card computing is also a history of the consolidation of government power. Seeing like a computer was, for decades, seeing like a state. And how we see influences what we see, what we care about, how we think.  

Recall the Ed Ayers quote I mentioned at the beginning of this talk. He said the statistical machinery of early quantitative historians could not represent the nuance of historical experience. That doesn’t just mean the math they used; it means the actual machinery involved.

See, one of the truly groundbreaking punch card technologies at the turn of the century was the card sorter. Each card could represent a person, a household, or whatever else: legible enough one at a time, but unmanageable in giant stacks.

Now, this is still well before “computers”, but machines were being developed which could sort these cards into one of twelve pockets based on which holes were punched. So, for example, if you had cards punched for people’s age, you could sort the stacks into 10 different pockets to break them up by age groups: 0-9, 10-19, 20-29, and so forth.

This turned out to be amazing for eyeball estimates. If your 20-29 pocket was twice as full as your 10-19 pocket after all the cards were sorted, you had a pretty good idea of the age distribution.
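The sorter’s pocket logic is easy to mimic. Here is a minimal sketch (the ages are invented) of binning records into decade pockets by one punched attribute, the way a tabulating-room clerk might eyeball a distribution:

```python
from collections import Counter

# Mimic a card sorter dropping cards into decade pockets by a single
# punched attribute. The ages below are invented for illustration.
ages = [4, 12, 15, 23, 25, 27, 28, 64, 81]

def pocket(age):
    """Return the decade pocket a card falls into: '0-9', '10-19', ..."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

pockets = Counter(pocket(a) for a in ages)
for label, count in sorted(pockets.items(), key=lambda kv: int(kv[0].split("-")[0])):
    print(f"{label:>6}: {'#' * count}")  # pocket fullness at a glance
```

Once sorted, comparing pocket fullness is the whole analysis; the continuous variable has disappeared into discrete bins.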

Over the next 50 years, this convenience would shape the social sciences. Consider demographics or marketing. Both developed in the shadow of punch cards, and both relied heavily on what’s called “segmentation”, the breaking of society into discrete categories based on easily punched attributes. Age ranges, racial background, etc. These would be used to, among other things, determine who was interested in what products.

They’d eventually use statistics on these segments to inform marketing strategies.

But, if you look at the statistical tests that already existed at the time, these segmentations weren’t always the best way to break up the data. For example, age flows smoothly between 0 and 100; you could easily contrive a statistical test showing that, as a person ages, she’s more likely to buy one product over another, modeled as a smooth function of age.

That’s not how it worked though. Age was, and often still is, chunked up into ten or so distinct ranges, and those segments were each analyzed individually, as though they were as distinct from one another as dogs and cats. That is, 0-9 is as related to 10-19 as it is to 80-89.

What we see here is the deep influence of technological affordances on scholarly practice, and it’s an issue we still face today, though in different form.

As historians began using punch cards and social statistics, they inherited, or appropriated, a structure developed for bureaucratic government processing, and were rightly soon criticized for its dehumanizing qualities.

Pearson Stats

Unsurprisingly, given this backdrop, historians in the first few decades of the 20th century often shied away from or rejected quantification.

The next wave of quantitative historians, who reached their height in the 1930s, approached the problem with more subtlety than the previous generations in the 1890s and 1860s.

Charles Beard’s famous Economic Interpretation of the Constitution of the United States used economic and demographic stats to argue that the US Constitution was economically motivated. Beard, however, did grasp the fundamental idiographic critique of quantitative history, claiming that history was, quote:

“beyond the reach of mathematics — which cannot assign meaningful values to the imponderables, immeasurables, and contingencies of history.”

The other frequent critique of quantitative history, still heard, is that it uncritically appropriates methods from stats and the sciences.

This also wasn’t entirely true. The slide behind me shows famed statistician Karl Pearson’s attempt to replicate the math of Isaac Newton that we saw earlier using more sophisticated techniques.

By the 1940s, Americans with graduate training in statistics like Ernest Rubin were actively engaging historians in their own journals, discussing how to carefully apply statistics to historical research.

On the other side of the channel, the French Annales historians were advocating longue durée history; a move away from biographies to prosopographies, from events to structures. In its own way, this was another historiography teetering on the edge between the nomothetic and idiographic, an approach that sought to uncover the rhymes of history.

Interest in quantitative approaches surged again in the late 1950s, led by a new wave of Annales historians like Fernand Braudel and American quantitative manifestos like those by Benson, Conrad, and Meyer.

William Aydelotte went so far as to point out that all historians implicitly quantify, when they use words like “many”, “average”, “representative”, or “growing” – and the question wasn’t can there be quantitative history, but when should formal quantitative methods be utilized?

By 1968, George Murphy, seeing the swell of interest, asked a very familiar question: why now? He asked why the 1960s were different from the 1860s or 1930s, why were they, in that historical moment, able to finally do it right? His answer was that it wasn’t just the new technologies, the huge datasets, the innovative methods: it was the zeitgeist. The 1960s was the right era for computational history, because it was the era of computation.

By the early 70s, there was a historian using a computer in every major history department. Quantitative history had finally grown into itself.

Popper Historicism

Of course, in retrospect, Murphy was wrong. Once the pendulum swung too far towards scientific history, theoretical objections began pushing it the other way.

In The Poverty of Historicism, Popper rejected scientific history, but mostly as a means to reject historicism outright. Popper’s arguments represent an attack from outside the historiographic tradition, but one that eventually had significant purchase even among historians, as an indication of the failure of nomothetic approaches to culture. It is, to an extent, a return to Musgrave’s critique of Isaac Newton.

At the same time, we see growing criticism from historians themselves. Arthur Schlesinger famously wrote that “important questions are important precisely because they are not susceptible to quantitative answers.”

There was a converging consensus among English-speaking historians, as in the early 20th century, that quantification erased the essence of the humanities, that it smoothed over the very inequalities and historical contingencies we needed to highlight.

Barzun's Clio

Jacques Barzun summed it up well, if scathingly, saying history ought to free us from the bonds of the machine, not feed us into it.

The skeptics prevailed, and the pendulum swung the other way. The post-structural, cultural, and literary-critical turns in historiography pivoted away from quantification and computation. The final nail was probably Fogel and Engerman’s 1974 Time on the Cross, which reduced the Atlantic slave trade to economic figures, and didn’t exactly treat the subject with nuance and care.

The cliometricians, demographers, and quantitative historians didn’t disappear after the cultural turn, but their numbers shrunk, and they tended to find themselves in social science departments, or fled here to Europe, where social and economic historians were faring better.

Which brings us, 40 years on, to the middle of a new wave of quantitative or “formal method” history. Ed Ayers, like George Murphy before him, wrote, essentially, this time it’s different.

And he’s right, to a point. Many here today draw their roots not to the cliometricians, but to the very cultural historians who rejected quantification in the first place. Ours is a digital history steeped in the values of the cultural turn, one that respects social justice and seeks to use our approaches to shine a light on the underrepresented and the historically contingent.

But that doesn’t stop a new wave of critiques that, if not repeating old arguments, certainly rhymes. Take Johanna Drucker’s recent call to rebrand data as capta: when we treat observations objectively, as if they were the same as the phenomena observed, we collapse the critical distance between the world and our interpretation of it. And interpretation, Drucker contends, is the foundation on which humanistic knowledge is based.

Which is all to say, every swing of the pendulum between idiographic and nomothetic history was situated in its own historical moment. It’s not a clock’s pendulum, but Foucault’s pendulum, with each swing’s apex ending up slightly off from the last. The issues of chronology and astronomy are different from those of eugenics and manifest destiny, which are themselves different from the capitalist and dehumanizing tendencies of 1950s mainframes.

But they all rhyme. Quantitative history has failed many times, for many reasons, but there are a few threads that bind them which we can learn from — or, at least, a few recurring mistakes we can recognize in ourselves and try to avoid going forward.

We won’t, I suspect, stop the pendulum’s inevitable about-face, but at least we can continue our work with caution, respect, and care.

Which is to be Master?

The lesson I’d like to highlight may be summed up in one question, asked by Humpty Dumpty to Alice: which is to be master?

Over several hundred years of quantitative history, the advice of proponents and critics alike tends to align with this question. Indeed, in 1946, R.G. Collingwood wrote specifically that “statistical research is for the historian a good servant but a bad master,” referring to the fact that statistical historical patterns mean nothing without historical context.

Schlesinger, who as I mentioned earlier said historical questions are important precisely because they can’t be quantified, later acknowledged that while quantitative methods can be useful, they lead historians astray: instead of tackling good questions, historians tackle easily quantifiable ones. Schlesinger was uncomfortable with the tail wagging the dog.

Which is to be master - questions

I’ve found many ways in which historians have accidentally given over agency to their methods and machines over the years, but these five, I think, are the most relevant to our current moment.

Unfortunately, since we’re running out of time, you’ll just have to trust me that these are historically recurring.

Number 1 is the uncareful appropriation of statistical methods for historical uses. It controls us precisely because it offers us a black box whose output we don’t truly understand.

A common example I see these days is in network visualizations. People visualize nodes and edges using what are called force-directed layouts in Gephi, but they don’t exactly understand what those layouts mean. As these layouts were designed, the physical proximity of nodes is not meant to represent relatedness, yet I’ve seen historians interpret two neighboring nodes as being related because of their visual adjacency.

This is bad. It’s false. But because we don’t quite understand what’s happening, we get lured by the black box into nonsensical interpretations.

The second way methods drive us is in our reliance on methodological imports. That is, we take the time to open the black box, but we only use methods that we learn from statisticians or scientists. Even when we fully understand the methods we import, if we’re bound to other people’s analytic machinery, we’re bound to their questions and biases.

Take the example I mentioned earlier, with demographic segmentation, punch card sorters, and their influence on social scientific statistics. The very mechanical affordances of early computers influenced the sort of questions people asked for decades: how do discrete groups of people react to the world in different ways, and how do they compare with one another?

The next thing to watch out for is naive scientism. Even if you know the assumptions of your methods, and you develop your own techniques for the problem at hand, you still can fall into the positivist trap that Johanna Drucker warns us about — collapsing the distance between what we observe and some underlying “truth”.

This is especially difficult when we’re dealing with “big data”. Once you’re working with so much material you couldn’t hope to read it all, it’s easy to be lured into forgetting the distance between operationalizations and what you actually intend to measure.

For instance, if I’m finding friendships in Early Modern Europe by looking for particular words being written in correspondences, I will completely miss the existence of friends who were neighbors, and thus had no reason to write letters for us to eventually read.
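As a toy sketch of this operationalization gap (the names, letters, and keyword list are all invented), a keyword proxy for friendship can only ever see relationships that left textual traces:

```python
# Hypothetical mini-corpus: inferring friendship from affectionate words
# in letters. Every name, letter, and keyword here is invented.
letters = {
    ("Anna", "Bram"): "My dearest friend, I send you warm greetings...",
    ("Anna", "Cees"): "Enclosed the invoice for last month's shipment...",
}
FRIEND_WORDS = {"dearest", "beloved", "friend"}

def looks_friendly(text):
    """Crude proxy: does the letter use any affectionate keyword?"""
    words = text.lower().split()
    return any(w.strip(".,") in FRIEND_WORDS for w in words)

inferred = {pair for pair, text in letters.items() if looks_friendly(text)}
print(inferred)  # Anna's neighbor, who never needed to write, can never appear
```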

A fourth way we can be misled by quantitative methods is the ease with which they lend an air of false precision or false certainty.

This is the problem Matthew Lincoln and the other panelists brought up yesterday, where missing or uncertain data, once quantified, falsely appears precise enough to make comparisons.

I see this mistake crop up in early and recent quantitative histories alike; we measure, say, the changing rate of transnational shipments over time, and notice a positive trend. The problem is that the positive difference is quite small, easily attributable to error, but because numbers always look precise, it feels like we’re being more rigorous than a qualitative assessment would be, even when that precision is unwarranted.
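As a toy illustration (every number invented), fitting a least-squares trend to essentially flat shipment counts still returns a positive slope; only comparing it to its standard error reveals that the "growth" is noise:

```python
import statistics

# Invented yearly shipment counts wobbling around a flat baseline.
years = list(range(10))
shipments = [101, 98, 103, 97, 102, 99, 104, 96, 103, 100]

n = len(years)
xbar, ybar = statistics.mean(years), statistics.mean(shipments)
sxx = sum((x - xbar) ** 2 for x in years)
# Ordinary least-squares slope and its standard error.
slope = sum((x - xbar) * (y - ybar) for x, y in zip(years, shipments)) / sxx
resid = [(y - ybar) - slope * (x - xbar) for x, y in zip(years, shipments)]
se = (sum(r * r for r in resid) / (n - 2) / sxx) ** 0.5

print(f"trend: {slope:+.3f} shipments/year (std. error {se:.3f})")
# The slope is positive but much smaller than twice its standard error,
# so the apparent growth is statistically indistinguishable from zero.
```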

The last thing to watch out for, and maybe the most worrisome, is the blinders quantitative analysis places on historians who don’t engage in other historiographic methods. This has been the downfall of many waves of quantitative history in the past; the inability to care about or even see that which can’t be counted.

This was, in part, what led Time on the Cross to become the excuse to drive historians from cliometrics. The indicators of slavery that were measurable were sufficient to paint it as having some semblance of economic success for black populations; but it was precisely those aspects of slavery they could not measure that were the most historically important.

So how do we regain mastery in light of these obstacles?

Which is to be master - answers

1. Uncareful Appropriation – Collaboration

Regarding the uncareful appropriation of methods, we can easily sidestep the issue of accidentally misusing a method by collaborating with someone who knows how the method works. This may require a translator; statisticians can as easily misunderstand historical problems as historians can misunderstand statistics.

Historians and statisticians can fruitfully collaborate, though, if they have someone in the middle trained to some extent in both — even if they’re not themselves experts. For what it’s worth, Dutch institutions seem to be ahead of the game in this respect, which is something that should be fostered.

2. Reliance on Imports – Statistical Training

Getting away from reliance on disciplinary imports may take some more work, because we ourselves must learn the approaches well enough to augment them, or create our own. Right now in DH this is often handled by summer institutes and workshop series, but I’d argue those are not sufficient here. We need to make room in our curricula for actual methods courses, or even degrees focused on methodology, in the same fashion as social scientists, if we want to start a robust practice of developing appropriate tools for our own research.

3. Naive Scientism – Humanities History

The spectre of naive scientism, I think, is one we need to be careful of, but we are also already well-equipped to deal with it. If we want to combat the uncareful use of proxies in digital history, we need only to teach the history of the humanities; why the cultural turn happened, what’s gone wrong with positivistic approaches to history in the past, etc.

Incidentally, I think this is something digital historians already guard well against, but it’s still worth keeping in mind and making sure we teach it. Particularly, digital historians need to remain aware of parallel approaches from the past, rather than tracing their background only to the textual work of people like Roberto Busa in Italy.

4. False Precision & Certainty – Simulation & Triangulation

False precision and false certainty have some shallow fixes, and some deep ones. In the short term, we need to be better about understanding things like confidence intervals and error bars, and use methods like what Matthew Lincoln highlighted yesterday.

In the long term, though, digital history would do well to adopt triangulation strategies to help mitigate against these issues. That means trying to reach the same conclusion using multiple different methods in parallel, and seeing if they all agree. If they do, you can be more certain your results are something you can trust, and not just an accident of the method you happened to use.

5. Quantitative Blinders – Rejecting Digital History

Avoiding quantitative blinders – that is, the tendency to only care about what’s easily countable – is an easy fix, but I’m afraid to say it, because it might put me out of a job. We can’t call what we do digital history, or quantitative history, or cliometrics, or whatever else. We are, simply, historians.

Some of us use more quantitative methods, and some don’t, but if we’re not ultimately contributing to the same body of work, both sides will do themselves a disservice by not bringing every approach to bear in the wide range of interests historians ought to pursue.

Qualitative and idiographic historians will be stuck unable to deal with the deluge of material that can paint us a broader picture of history, and quantitative or nomothetic historians will lose sight of the very human irregularities that make history worth studying in the first place. We must work together.

If we don’t come together, we’re destined to remain punched-card humanists – that is, we will always be constrained and led by our methods, not by history.

Creativity Theme Again

Of course, this divide is a false one. There are no purely quantitative or purely qualitative studies; close-reading historians will continue to say things like “representative” or “increasing”, and digital historians won’t start publishing graphs with no interpretation.

Still, silos exist, and some of us have trouble leaving the comfort of our digital humanities conferences or our “traditional” history conferences.

That’s why this conference, I think, is so refreshing. It offers a great mix of both worlds, and I’m privileged and thankful to have been able to attend. While there are a lot of lessons we can still learn from those before us, from my vantage point, I think we’re on the right track, and I look forward to seeing more of those fruitful combinations over the course of today.

Thank you.

Notes:

  1. This account is influenced by some talks by Ben Schmidt. Any mistakes are from my own faulty memory, and not from his careful arguments.

[f-s d] Cetus

Quoting Liz Losh, Jacqueline Wernimont tweeted that behind every visualization is a spreadsheet.

But what, I wondered, is behind every spreadsheet?

Space whales.

Okay, maybe space whales aren’t behind every spreadsheet, but they’re behind this one, dated 1662, notable for the gigantic nail it hammered into the coffin of our belief that heaven above is perfect and unchanging. The following post is the first in my new series full-stack dev (f-s d), where I explore the secret life of data. 1

Hevelius. Mercurius in Sole visus (1662).

The Princess Bride teaches us a good story involves “fencing, fighting, torture, revenge, giants, monsters, chases, escapes, true love, miracles”. In this story, Cetus, three of those play a prominent role: (red) giants, (sea) monsters, and (cosmic) miracles. Also Greek myths, interstellar explosions, beer-brewing astronomers, meticulous archivists, and top-secret digitization facilities. All together, they reveal how technologies, people, and stars aligned to stick this 350-year-old spreadsheet in your browser today.

The Sea

When Aethiopian queen Cassiopeia claimed herself more beautiful than all the sea nymphs, Poseidon was, let’s say, less than pleased. Mildly miffed. He maybe sent a sea monster named Cetus to destroy Aethiopia.

Because obviously the best way to stop a flood is to drown a princess, Queen Cassiopeia chained her daughter to the rocks as a sacrifice to Cetus. Thankfully the hero Perseus just happened to be passing through Aethiopia, returning home after beheading Medusa, that snake-haired woman whose eyes turned living creatures to stone. Perseus (depicted below as the world’s most boring 2-ball juggler) revealed Medusa’s severed head to Cetus, turning the sea monster to stone and saving the princess. And then they got married because traditional gender roles I guess?

Corinthian vase depicting Perseus, Andromeda and Ketos. [via]

Cetaceans, you may recall from grade school, are those giant carnivorous sea-mammals that Captain Ahab warned you about. Cetaceans, from Cetus. You may also remember we have a thing for naming star constellations and dividing the sky up into sections (see the Zodiac), and that we have a long history of comparing the sky to the ocean (see Carl Sagan or Star Trek IV).

It should come as no surprise, then, that we’ve designated a whole section of space as ‘The Sea’, home of Cetus (the whale), Aquarius (the God) and Eridanus (the water pouring from Aquarius’ vase, source of river floods), Pisces (two fish tied together by a rope, which makes total sense I promise), Delphinus (the dolphin), and Capricornus (the goat-fish. Listen, I didn’t make these up, okay?).

Jamieson’s Celestial Atlas, Plate 21 (1822). [via]

Jamieson’s Celestial Atlas, Plate 23 (1822). [via]

Ptolemy listed most of these constellations in his Almagest (ca. 150 A.D.), including Cetus, along with descriptions of over a thousand stars. Ptolemy’s model, with Earth at the center and the constellations just past Saturn, set the course of cosmology for over a thousand years.

Ptolemy’s Cosmos [by Robert A. Hatch]

In this cosmos, reigning in Western Europe for centuries past Copernicus’ death in 1543, the stars were fixed and motionless. There was no vacuum of space; every planet was embedded in a shell made of aether or quintessence (quint-essence, the fifth element), and each shell sat atop the next until reaching the celestial sphere. This last sphere held the stars, each one fixed to it as with a pushpin. Of course, all of it revolved around the earth.

The domain of heavenly spheres was assumed perfect in all sorts of ways. They slid across each other without friction, and the planets and stars were perfect spheres which could not change and were unmarred by inconsistencies. One reason it was so difficult for even “great thinkers” to believe the earth orbited the sun, rather than vice-versa, was because such a system would be at complete odds with how people knew physics to work. It would break gravity, break motion, and break the outer perfection of the cosmos, which was essential (…heh) 2 to our notions of, well, everything.

Which is why, when astronomers with their telescopes and their spreadsheets started systematically observing imperfections in planets and stars, lots of people didn’t believe them—even other astronomers. Over the course of centuries, though, these imperfections became impossible to ignore, and helped launch the earth in rotation ’round the sun.

This is the story of one such imperfection.

A Star is Born (and then dies)

Around 1296 A.D., over the course of half a year, a red giant star some 2 quadrillion miles away grew from 300 to 400 times the size of our sun. Over the next half year, the star shrunk back down to its previous size. Light from the star took 300 years to reach earth, eventually striking the retina of German pastor David Fabricius. It was very early Tuesday morning on August 13, 1596, and Pastor Fabricius was looking for Jupiter. 3

At that time of year, Jupiter would have been near the constellation Cetus (remember our sea monster?), but Fabricius noticed a nearby bright star (labeled ‘Mira’ in the below figure) which he did not remember from Ptolemy or Tycho Brahe’s star charts.

Mira Ceti and Jupiter. [via]
Spotting an unrecognized star wasn’t unusual, but one so bright in so common a constellation was certainly worthy of note. He wrote down some observations of the star throughout September and October, after which it seemed to have disappeared as suddenly as it appeared. The disappearance prompted Fabricius to write a letter about it to famed astronomer Tycho Brahe, who had described a similar appearing-then-disappearing star between 1572 and 1574. Brahe jotted Fabricius’ observations down in his journal. This sort of behavior, after all, was a bit shocking for a supposedly fixed and unchanging celestial sphere.

More shocking, however, was what happened 13 years later, on February 15, 1609. Once again searching for Jupiter, pastor Fabricius spotted another new star in the same spot as the last one. Tycho Brahe having recently died, Fabricius wrote a letter to his astronomical successor, Johannes Kepler, describing the miracle. This was unprecedented. No star had ever vanished and returned, and nobody knew what to make of it.

Unfortunately for Fabricius, nobody did make anything of it. His observations were either ignored or, occasionally, dismissed as an error. To add injury to insult, a local goose thief killed Fabricius with a shovel blow, thus ending his place in this star’s story, among other stories.

Mira Ceti

Three decades passed. On the winter solstice, 1638, Johannes Phocylides Holwarda prepared to view a lunar eclipse. He reported with excitement the star’s appearance and, by August 1639, its disappearance. The new star, Holwarda claimed, should be considered of the same class as the new stars of Brahe, Kepler, and Fabricius. As much a surprise to him as it had been to Fabricius, Holwarda saw the star again on November 7, 1639. Although he was not aware of it, his new star was the same as the one Fabricius spotted 30 years prior.

Two more decades passed before the new star in the neck of Cetus would be systematically sought and observed, this time by Johannes Hevelius: local politician, astronomer, and brewer of fine beers. By that time many had seen the star, but it was difficult to know whether it was the same celestial body, or even what was going on.

Hevelius brought everything together. He found recorded observations from Holwarda, Fabricius, and others, from today’s Netherlands to Germany to Poland, and realized these disparate observations were of the same star. Befitting its puzzling and seemingly miraculous nature, Hevelius dubbed the star Mira (miraculous) Ceti. The image below, from Hevelius’ Firmamentum Sobiescianum sive Uranographia (1687), depicts Mira Ceti as the bright star in the sea monster’s neck.

Hevelius. Firmamentum Sobiescianum sive Uranographia (1687).

Going further, from 1659 to 1683, Hevelius observed Mira Ceti in a more consistent fashion than any before him. There had been eleven recorded observations in the 65 years between Fabricius’ first sighting of the star and Hevelius’ undertaking; in the three decades that followed, Hevelius recorded 75 more. Oddly, while Hevelius was a remarkably meticulous observer, he insisted the star was inherently unpredictable, with no regularity in its reappearances or variable brightness.

Beginning shortly after Hevelius, the astronomer Ismaël Boulliau also undertook a thirty-year search for Mira Ceti. He even published a prediction that the star would go through its vanishing cycle every 332 days, which turned out to be remarkably accurate. As today’s astronomers note, Mira Ceti’s brightness increases and decreases by several orders of magnitude every 331 days, caused by an interplay between radiation pressure and gravity in the star’s gaseous exterior.
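A little arithmetic shows why Boulliau’s figure held up so well. The sketch below is purely illustrative (the anchor date and cycle count are hypothetical, not Boulliau’s actual data or method): it projects brightness maxima forward under his 332-day period and under the modern 331-day value, and tracks how the one-day-per-cycle discrepancy accumulates.

```python
from datetime import date, timedelta

BOULLIAU_PERIOD = 332  # Boulliau's published period, in days
MODERN_PERIOD = 331    # the modern value, in days

def predicted_maxima(first_max: date, period_days: int, cycles: int):
    """Project the dates of successive brightness maxima forward."""
    return [first_max + timedelta(days=period_days * n) for n in range(cycles)]

anchor = date(1660, 1, 1)  # hypothetical anchor date, for illustration only

boulliau = predicted_maxima(anchor, BOULLIAU_PERIOD, 11)
modern = predicted_maxima(anchor, MODERN_PERIOD, 11)

# After ten further cycles (roughly a decade), the two schedules
# have drifted apart by only ten days.
drift = (boulliau[-1] - modern[-1]).days
print(drift)  # 10
```

At one day of drift per cycle, even a decade of observations would put Boulliau’s predicted maxima only about ten days off the mark: close enough to reliably catch the star near peak brightness.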

Mira Ceti composite taken by NASA’s Galaxy Evolution Explorer. [via]
While of course Boulliau didn’t arrive at today’s explanation for Mira’s variability, his solution did require a rethinking of the fixity of stars, and eventually contributed to the notion that maybe the same physical laws that apply on Earth also rule the sun and stars.

Spreadsheet Errors

But we’re not here to talk about Boulliau, or Mira Ceti. We’re here to talk about this spreadsheet:

Hevelius. Mercurius in Sole visus (1662).

This snippet represents Hevelius’ attempt to systematically collect prior observations of Mira Ceti. Unreasonably meticulous readers of this post may note an inconsistency: I wrote that Johannes Phocylides Holwarda observed Mira Ceti on November 7th, 1639, yet Hevelius here shows Holwarda observing the star on December 7th, 1639, an entire month later. The little notes on the side are basically the observers saying: “wtf this star keeps reappearing???”

This mistake was not a simple printer’s error. It reappeared in Hevelius’ printed books three times: 1662, 1668, and 1685. This is an early example of what Raymond Panko and others call spreadsheet errors, which appear in nearly 90% of 21st century spreadsheets. Hand-entry is difficult, and mistakes are bound to happen. In this case, a game of telephone also played a part: Hevelius may have pulled some observations not directly from the original astronomers, but from the notes of Tycho Brahe and Johannes Kepler, to which he had access.

Unfortunately, with so few observations, and many of the early ones so sloppy, mistakes compound themselves. It’s difficult to predict a variable star’s periodicity when you don’t have the right dates of observation, which may have contributed to Hevelius’ continued insistence that Mira Ceti kept no regular schedule. The other contributing factor, of course, is that Hevelius worked without a telescope and under cloudy skies, and stars are hard to measure under even the best circumstances.
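To make that concrete, here is a small illustrative sketch of how a single transcription slip scrambles a period search. The epoch and the use of the modern 331-day period are my assumptions for illustration; Hevelius had neither. Folding each observation onto a trial period assigns it a phase (the fraction of the cycle elapsed), and a one-month date error shifts that phase by nearly a tenth of a cycle.

```python
from datetime import date

PERIOD = 331  # modern period of Mira Ceti, in days

def phase(obs: date, epoch: date, period: int = PERIOD) -> float:
    """Fraction of the cycle elapsed at an observation date."""
    return ((obs - epoch).days % period) / period

epoch = date(1596, 8, 13)  # Fabricius' first sighting, as an arbitrary zero-point

correct = date(1639, 11, 7)  # Holwarda's actual observation
shifted = date(1639, 12, 7)  # the date as printed by Hevelius

# A one-month transcription error moves the observation by
# 30/331, roughly 9% of a full cycle -- enough to keep sparse
# observations from lining up on any common period.
err = abs(phase(shifted, epoch) - phase(correct, epoch))
print(round(err, 2))  # 0.09
```

With only a handful of observations to work from, a ~9% phase error on even one of them is plenty to hide a regular 331-day rhythm, which fits with Hevelius’ insistence that the star kept no schedule.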

To Be Continued

Here ends the first half of Cetus. The second half will cover how Hevelius’ book was preserved, the labor behind its digitization, and a bit about the technologies involved in creating the image you see.

Early modern astronomy is a particularly good pre-digital subject for full-stack dev (f-s d), since it required vast international correspondence networks and distributed labor in order to succeed. Hevelius could not have created this table, compiled from the observations of several others, without access to cutting-edge astronomical instruments and the contemporary scholarly network.

You may ask why I included that whole section on Greek myths and Ptolemy’s constellations. Would as many early modern astronomers have noticed Mira Ceti had it not sat in the center of a familiar constellation, I wonder?

I promised this series will be about the secret life of data, answering the question of what’s behind a spreadsheet. Cetus is only the first story (well, second, I guess), but the idea is to upturn the iceberg underlying seemingly mundane datasets to reveal the complicated stories of their creation and usage. Stay tuned for future installments.

Notes:

  1. I’m retroactively adding my blog rant about data underlying an equality visualization to the f-s d series.
  2. this pun is only for historians of science
  3. Most of the historiography in this and the following section is summarized from Robert A. Hatch’s “Discovering Mira Ceti: Celestial Change and Cosmic Continuity”

Experience

Last week, I publicly outed myself as a non-tenure-track academic diagnosed on the autism spectrum, 1 hoping that doing so might help other struggling academics find solace knowing they are not alone. I was unprepared for the outpouring of private and public support. Friends, colleagues, and strangers thanked me for helping them feel a little less alone, which in turn helped me feel much less alone. Thank you all, deeply and sincerely.

In a similar spirit, for interested allies and struggling fellows, this post is about how my symptoms manifest in the academic world, and how I manage them. 2

Navigating the social world is tough—a fact that may surprise some of my friends and most of my colleagues. I do alright at conferences and in groups, when conversation is polite and skin-deep, but it requires careful concentration and a lot of smoke and mirrors. Inside, it feels like I’m translating from Turkish to Cantonese without knowing either language. Every time this is said, that is the appropriate reply, though I struggle to understand why. I just possess a translation book, and recite what is expected. Stimulus and response. This skill was only recently acquired.

Looking at the point between people’s eyes makes it appear as though I am making direct eye contact during conversations. Certain observations (“you look tired”) are apparently less well-received than others (“you look excited”), and I’ve mostly learned which are which.

After a long day keeping up this appearance, especially at conferences, I find a nice dark room and stay there. Sharing conference hotel rooms with fellow academics is never an option. Some strategies I figured out myself; others, like the eye contact trick, I built over extended discussions with an old girlfriend after she handed me a severely highlighted copy of The Partner’s Guide to Asperger Syndrome.

ADHD and Autism Spectrum Disorder are highly co-morbid, and I have been diagnosed with either or both by several independent professionals in the last twenty years. Working is hard, and often takes at least twice as much time for me as it does for the peers with whom I have discussed this. When interested in something, I lose myself entirely in it for hours on end, but a single break in concentration will leave me scrambling. It may take hours or days to return to a task, if I do at all. My best work is done in marathon, and work that takes longer than a few days may never get finished, or may drop in quality precipitously. Keeping the internet disconnected and my phone off during regular periods every day, locked in my windowless office, helps keep distractions at bay. But I have yet to discover a good strategy to manage long projects. A career in the book-driven humanities may have been a poor choice.

Paying bills on time, keeping schedules, and replying to emails are among the most stressful tasks in my life. When I don’t adequately handle all of these mundane tasks, it sets in motion a cycle of horror that paralyzes my ability to get anything done, until I eventually file for task bankruptcy and inevitably disappoint colleagues, friends, or creditors to whom action is owed. Poor time management and stress-cycles lead me to over-promise and under-deliver. On the bright side, I recently received help in strategies to improve that, and they work. Sometimes.

Friendships, surprisingly, are easy to maintain but difficult to nourish. My friends consider me trustworthy and willing to help (if not necessarily always dependable), but I lose track of friends or family who aren’t geographically close. Deeper emotional relationships are rare or, for swaths of my life, non-existent. I get no fits of anger or depression or elation or excitement. Indeed, my friends and family remark on how impossible it is to tell whether I like a gift they’ve given me.

People occasionally describe my actions as offensive, rude, or short, and I get frustrated trying to understand exactly why what I’m doing fits into those categories. Apparently, early in grad school, I had a bit of a reputation for asking obnoxious questions in lectures. But I don’t like upsetting people, and actively (maybe successfully?) try to curb these traits when they are pointed out.

Thankfully, academic life allows me the freedom to lock myself in a room and focus on a task. Using work as a coping mechanism for social difficulties may be unhealthy, but hey, at least I found a career that rewards my peculiarities.

My life is pretty great. I have good friends, a loving family, and hobbies that challenge me. As long as I maintain the proper controlled environment, my fixations and obsessions are a perfect complement to an academic career, especially in a culture that (unfortunately) rewards workaholism. The same tenacity often compensates for difficulties in navigating romantic relationships, of which I’ve had a few incredibly fulfilling and valuable ones over my life thus far.

Unfortunately, my experience on the autism spectrum is not shared by all academics. Some have enough difficulty managing the social world that they end up alienating colleagues who are on their tenure committees, to disastrous effect. From private conversations, it seems autistic women suffer more from this than men, as they are expected to perform more service work and to be more social. Supportive administrators can be vital in these situations, and autism-spectrum academics may want to negotiate accommodations for themselves as part of their hiring process.

Despite some frustrations, I have found my atypical way of interacting with the world to be a feature, not a bug. My atypicality presents as what used to be called Asperger Syndrome, and it is easier for me to interact with the world, and easier for the world to interact with me, than many other autistic individuals. That said, whether or not my friends and colleagues notice, I still struggle with many aspects common to those diagnosed on the autism spectrum: social-emotional difficulties, alexithymia, intensity of focus, hypersensitivity, system-oriented thinking, etc.

Relationships or friendships with someone on the spectrum can be tough, even with someone who doesn’t outwardly present common characteristics, like me. An old partner once vented her frustrations that she couldn’t turn to her friends for advice, because: “everyone just said Scott is so normal and I was thinking [no], he’s just very very good at passing [as socially aware].” Like many who grow up non-neurotypical, I learned a complex set of coping strategies to help me fit in and succeed in a neurotypical world. To concentrate on work, I create an office cave to shut out the world. I use a complicated set of journals, calendars, and apps to keep me on task and ensure I pay bills on time. To stay attentive, I sit at the front of a lecture hall—it even works, sometimes. Some ADHD symptoms are managed pharmacologically.

These strategies give me the 80% push I need to be a functioning member of society, to become someone who can sustain relationships, not get kicked out of his house for forgetting rent, and can almost finish a PhD. Almost. It’s not quite enough to prevent me from a dozen incompletes on my transcripts, but I make do. A host of unrealistically patient and caring friends, family, and colleagues helps. (If you’re someone to whom I still owe work, but am too scared to reply to because of how delinquent I am, thanks for understanding! waves and runs away). Caring allies help. A lot.

My life so far has been a series of successes and confusions. Not unlike anybody else’s life, I suppose. I occupy my own corner of weirdness, which is itself unique enough, but everyone has their own corner. I doubt my writing this will help anyone understand themselves any better, but hopefully it will help fellow academics feel a bit safer in their own weirdness. And if this essay helps our neurotypical colleagues be a bit more understanding of our struggles, and better-informed as allies, all the better.

Notes:

  1. The original article, Stigma, was written for the Conditionally Accepted column of Inside Higher Ed. Jeana Jorgensen, Eric Grollman and Sarah Bray provided invaluable feedback, and I wouldn’t have written it without them. They invited me to write this second article for Inside Higher Ed as well, which was my original intent. I wound up posting it on my blog instead because their posting schedule didn’t quite align with my writing schedule. This shouldn’t be counted as a negative reflection on the process of publishing with that fine establishment.
  2. Let me be clear: I know very little about autism, beyond that I have been diagnosed with it. I’m still learning a lot. This post is about me. Knowing other people face similar struggles has been profoundly helpful, regardless of what causes those struggles.

“Digital History” Can Never Be New

If you claim computational approaches to history (“digital history”) let historians ask new types of questions, or that they offer new historical approaches to answering or exploring old questions, you are wrong. You’re not actually wrong, but you are institutionally wrong, which is maybe worse.

This is a problem, because rhetoric from practitioners (including me) is that we can bring some “new” to the table, and when we don’t, we’re called out for not doing so. The exchange might (but probably won’t) go like this:

Digital Historian: And this graph explains how velociraptors were of utmost importance to Victorian sensibilities.

Historian in Audience: But how is this telling us anything we haven’t already heard before? Didn’t John Hammond already make the same claim?

DH: That’s true, he did. One thing the graph shows, though, is that velociraptors in general tended to play far less important roles across hundreds of years, which lends support to the Victorian thesis.

HiA: Yes, but the generalized argument doesn’t account for cultural differences across those times, so doesn’t meaningfully contribute to this (or any other) historical conversation.


New Questions

History (like any discipline) is made of people, and those people have Ideas about what does or doesn’t count as history (well, historiography, but that’s a long word so let’s ignore it). If you ask a new type of question or use a new approach, that new thing probably doesn’t fit historians’ Ideas about proper history.

Take culturomics. They make claims like this:

The age of peak celebrity has been consistent over time: about 75 years after birth. But the other parameters have been changing. Fame comes sooner and rises faster. Between the early 19th century and the mid-20th century, the age of initial celebrity declined from 43 to 29 years, and the doubling time fell from 8.1 to 3.3 years.

Historians saw those claims and asked “so what”? It’s not interesting or relevant according to the things historians usually consider interesting or relevant, and it’s problematic in ways historians find things problematic. For example, it ignores cultural differences, does not speak to actual human experiences, and has nothing of use to say about a particular historical moment.

It’s true. Culturomics-style questions do not fit well within a humanities paradigm (incommensurable, anyone?). By the standard measuring stick of what makes a good history project, culturomics does not measure up. A new type of question requires a new measuring stick; in this case, I think a good one for culturomics-style approaches is the extent to which they bridge individual experiences with large-scale social phenomena, or how well they are able to reconcile statistical social regularities with free or contingent choice.

The point, though, is a culturomics presentation would fit few of the boxes expected at a history conference, and so would be considered a failure. Rightly so, too—it’s a bad history presentation. But what culturomics is successfully doing is asking new types of questions, whether or not historians find them legitimate or interesting. Is it good culturomics?

To put too fine a point on it, since history is often a question-driven discipline, new types of questions that are too different from previous types are no longer legitimately within the discipline of history, even if they are intrinsically about human history and do not fit in any other discipline.

What’s more, new types of questions may appear simplistic by historians’ standards, because they fail at fulfilling even the most basic criteria usually measuring historical worth. It’s worth keeping in mind that, to most of the rest of the world, our historical work often fails at meeting their criteria for worth.

New Approaches

New approaches to old questions share a similar fate, but for different reasons. That is, if they are novel, they are not interesting, and if they are interesting, they are not novel.

Traditional historical questions are, let’s face it, not particularly new. Tautologically. Some old questions in my field are: what role did now-silent voices play in constructing knowledge-making instruments in 17th century astronomy? How did scholarship become institutionalized in the 18th century? Why was Isaac Newton so annoying?

My own research is an attempt to provide a broader view of those topics (at least, the first two) using computational means. Since my topical interest has a rich tradition among historians, it’s unlikely any of my historically-focused claims (for example, that scholarly institutions were built to replace the really complicated and precarious role people played in coordinating social networks) will be without precedent.

After decades, or even centuries, of historical work in this area, there will always be examples of historians already having made my claims. My contribution is the bolstering of a particular viewpoint, the expansion of its applicability, the reframing of a discussion. Ultimately, maybe, I convince the world that certain social network conditions play an important role in allowing scholarly activity to be much more successful at its intended goals. My contribution is not, however, a claim that is wholly without precedent.

But this is a problem, since DH rhetoric, even by practitioners, can understandably lead people to expect such novelty. Historians in particular are very good at fitting old patterns to new evidence. It’s what we’re trained to do.

Any historical claim (to an acceptable question within the historical paradigm) can easily be countered with “but we already knew that”. Either the question’s been around long enough that every plausible claim has been covered, or the new evidence or theory is similar enough to something pre-existing that it can be taken as precedent.

The most masterful recent discussion of this topic was Matthew Lincoln’s Confabulation in the humanities, where he shows how easy it is to make up evidence and get historians to agree that they already knew it was true.

To put too fine a point on it, new approaches to old historical questions are destined to produce results which conform to old approaches; or if they don’t, it’s easy enough to stretch the old & new theories together until they fit. New approaches to old questions will fail at producing completely surprising results; this is a bad standard for historical projects. If a novel methodology were to create truly unrecognizable results, it is unlikely those results would be recognized as “good history” within the current paradigm. That is, historians would struggle to care.

What Is This Beast?

What is this beast we call digital history? Boundary-drawing is a tried-and-true tradition in the humanities, digital or otherwise. It’s theoretically kind of stupid but practically incredibly important, since funding decisions, tenure cases, and similar career-altering forces are at play. If digital history is a type of history, it’s fundable as such, tenurable as such; if it isn’t, it ain’t. What’s more, if what culturomics researchers are doing is also history, their already-well-funded machine can start taking slices of the sad NEH pie.

Artist’s rendition of sad NEH pie. [via]
So “what counts?” is unfortunately important to answer.

This discussion around what is “legitimate history research” is really important, but I’d like to table it for now, because it’s so often conflated with the discussion of what is “legitimate research” sans history. The former question easily overshadows the latter, since academics are mostly just schlubs trying to make a living.

For the last century or so, history and philosophy of science have been smooshed together in departments and conferences. It’s caused a lot of concern. Does history of science need philosophy of science? Does philosophy of science need history of science? What does it mean to combine the two? Is what comes out of the middle even useful?

Weirdly, the question sometimes comes down to “does history and philosophy of science even exist?”. It’s weird because people identify with that combined title, so I published a citation analysis in Erkenntnis a few years back that basically showed that, indeed, there is an area between the two communities, and indeed those people describe themselves as doing HPS, whatever that means to them.

Look! Right in the middle there, it’s history and philosophy of science.

I bring this up because digital history, as many of us practice it, leaves us floating somewhere between public engagement, social science, and history. Culturomics occupies a similar interstitial space, though inching closer to social physics and complex systems.

From this vantage point, we have a couple of options. We can say digital history is just history from a slightly different angle, and try to be evaluated by standard historical measuring sticks—which would make our work easily criticized as not particularly novel. Or we can say digital history is something new, occupying that in-between space—which could render the work unrecognizable to our usual communities.

The either/or proposition is, of course, ludicrous. The best work being done now skirts the line, offering something just novel enough to be surprising, but not so far out of traditional historical bounds as to be grouped with culturomics. But I think we need to be more deliberate and organized in this practice, lest we end up like History and Philosophy of Science, still dealing with basic questions of legitimacy fifty years down the line.

In the short term, this probably means trying not just to avoid the rhetoric of newness, but to actively curtail it. In the long term, it may mean allying with like-minded historians, social scientists, statistical physicists, and complexity scientists to build a new framework of legitimacy that recognizes the forms of knowledge we produce which don’t always align with historiographic standards. As Cassidy Sugimoto and I recently wrote, this often comes with journals, societies, and disciplinary realignment.

The least we can do is steer away from a novelty rhetoric, since what is novel often isn’t history, and what is history often isn’t novel.


“Branding” – An Addendum

After writing this post, I read Amardeep Singh’s call to, among other things, avoid branding:

Here’s a way of thinking that might get us past this muddle (and I think I agree with the authors that the hype around DH is a mistake): let’s stop branding our scholarship. We don’t need Next Big Things and we don’t need Academic Superstars, whether they are DH Superstars or Theory Superstars. What we do need is to find more democratic and inclusive ways of thinking about the value of scholarship and scholarly communities.

This is relevant here, and good, but tough to reconcile with the earlier post. In an ideal world, without disciplinary brandings, we can all try to be welcoming of works on their own merits, without relying on our preconceived disciplinary criteria. In the present condition, though, it’s tough to see such an environment forming. In that context, maybe a unified digital history “brand” is the best way to stay afloat. This would build barriers against whatever new thing comes along next, though, so it’s a tough question.