The Turing Point

Below is some crazy, uninformed ramblings about the least-complex possible way to trick someone into thinking a computer is a human, for the purpose of history research. I’d love some genuine AI/Machine Intelligence researchers to point me to the actual discussions on the subject. These aren’t original thoughts; they spring from countless sci-fi novels and AI research from the ’70s-’90s. Humanists beware: this is super sci-fi speculative, but maybe an interesting thought experiment.


If someone’s chatting with a computer, but doesn’t realize her conversation partner isn’t human, that computer passes the Turing Test. Unrelatedly, if a robot or piece of art is just close enough to reality to be creepy, but not close enough to be convincingly real, it lies in the Uncanny Valley. I argue there is a useful concept in the simplest possible computer which is still convincingly human, and that computer sits at the Turing Point. 1 

Uncanny valley graph by Smurrayinchester, based on an image by Masahiro Mori and Karl MacDorman, CC BY-SA 3.0

Forgive my twisting Turing Tests and Uncanny Valleys away from their normal use, for the sake of outlining the Turing Point concept:

  • A human simulacrum is a simulation of a human, or some aspect of a human, in some medium, which is designed to be as-close-as-possible to that which is being modeled, within the scope of that medium.
  • A Turing Test winner is any human simulacrum which humans consistently mistake for the real thing.
  • An occupant of the Uncanny Valley is any human simulacrum which humans consistently doubt as representing a “real” human.
  • Between the Uncanny Valley and Turing Test winners lies the Turing Point, occupied by the least-sophisticated human simulacrum that can still consistently pass as human in a given medium. The Turing Point is less a single point than a hyperplane in a high-dimensional feature space: there are many points of entry at which a simulacrum can “phase-transition” from uncanny to convincing.

Extending the Turing Test

The classic Turing Test scenario is a text-only chatbot which must, in free conversation, be convincing enough for a human to think it is speaking with another human. A piece of software named Eugene Goostman sort-of passed this test in 2014, convincing a third of judges it was a 13-year-old Ukrainian boy.

There are many possible modes in which a computer can act convincingly human. It is easier to make a convincing simulacrum of a 13-year-old non-native English speaker who is confined to text messages than to make a convincing college professor, for example. Thus the former has a lower Turing Point than the latter.

Playing with the constraints of the medium will also affect the Turing Point threshold. The Turing Point for a flesh-covered robot is incredibly difficult to surpass, since so many little details (movement, design, voice quality, etc.) may place it into the Uncanny Valley. A piece of software posing as a Twitter user, however, would have a significantly easier time convincing fellow users it is human.

The Turing Point, then, is flexible to the medium in which the simulacrum intends to deceive, and the sort of human it simulates.

From Type to Token

Convincing the world a simulacrum is any old human is different from convincing the world it is some specific human. This is the token/type distinction: convincingly simulating a specific person (a token) is much more difficult than convincingly simulating any old person (a type).

Simulations of specific people are all over the place, even if they don’t intend to deceive. Several Twitter-bots exist as simulacra of Donald Trump, reading his tweets and creating new ones in a similar style. Perhaps echoing Poe’s Law, certain people’s styles, or certain types of media (e.g. Twitter), may provide such a low Turing Point that it is genuinely difficult to distinguish humans from machines.
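
Bots like these frequently rely on nothing fancier than a Markov chain trained on the target’s tweets. Here’s a minimal Python sketch of that general approach (not any particular bot’s code, and the two-tweet corpus is obviously a stand-in for the thousands a real bot would train on):

    import random
    from collections import defaultdict

    def build_chain(tweets, order=2):
        """Map each consecutive word-pair to the words that follow it."""
        chain = defaultdict(list)
        for tweet in tweets:
            words = tweet.split()
            for i in range(len(words) - order):
                chain[tuple(words[i:i + order])].append(words[i + order])
        return chain

    def generate(chain, max_words=25):
        """Random-walk the chain from a random starting pair."""
        key = random.choice(list(chain))
        output = list(key)
        while len(output) < max_words and key in chain:
            output.append(random.choice(chain[key]))
            key = tuple(output[-len(key):])
        return " ".join(output)

    # Stand-in corpus; a real bot would train on thousands of tweets.
    corpus = [
        "we are going to win so much you will get tired of winning",
        "we are going to make america great again believe me",
    ]
    print(generate(build_chain(corpus)))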

Put differently, the way some Turing Tests may be designed, humans could easily lose.

It’ll be useful to make up and define two terms here. I imagine the concepts already exist, but I couldn’t find them, so please comment if they do so I can use less stupid words:

  • A type-bot is a machine designed to represent something at the type level; for example, a bot that can be mistaken for some random human, but not for any specific human.
  • A token-bot is a machine designed to represent something at the token level; for example, a bot that can be mistaken for Donald Trump.

Replaying History

Using traces to recreate historical figures (or at least things they could have done) as token-bots is not uncommon. The most recent high-profile example of this is a project to create a new Rembrandt painting in the original style. Shawn Graham and I wrote an article on using simulations to create new plausible histories, among many other examples old and new.

This all got me thinking: if we reach the Turing Point for some social media personalities (that is, it becomes difficult to distinguish between their social media presence and a simulacrum of it), what’s to say we can’t reach it for an entire social media ecosystem? Can we take a snapshot of Twitter and project it several seconds/minutes/hours/days into the future, a bit like a meteorological model?

A few questions and obvious problems:

  • Much of Twitter’s dynamics are dependent upon exogenous forces: memes from other media, real world events, etc. Thus, no projection of Twitter alone would ever look like the real thing. One can, however, potentially use such a simulation to predict how certain types of events might affect the system.
  • This is way overkill, and impossibly computationally complex at this scale. You can simulate the dynamics of Twitter without simulating every individual user, because people on average act pretty systematically. That said, for the humanities-inclined, we may gain more insight from the ground-level of the system (individual agents) than macroscopic properties.
  • This is key. Would a set of plausibly-duplicate Twitter personalities on aggregate create a dynamic system that matches Twitter as an aggregate system? That is, just because the algorithms pass the Turing Test, because humans believe them to be humans, does that necessarily imply the algorithms have enough fidelity to accurately recreate the dynamics of a large scale social network? Or will small unnoticeable differences between the simulacrum and the original accrue atop each other, such that in aggregate they no longer act like a real social network?

The last point is, I think, a theoretically and methodologically fertile one for people working in DH, AI, and Cognitive Science: whether reducing the human-appreciable differences between machines and people is sufficient to simulate aggregate social behavior, or whether human-appreciability (i.e., the Turing Test) is a strict enough criterion for making accurate predictions about societies.

These points aside, if we ever do manage to simulate specific people (even in a very limited scope) as token-bots based on the traces they leave, it opens up interesting pedagogical and research opportunities for historians. Scott Enderle tweeted a great metaphor for this.

Imagine, as a student, being able to have a plausible discussion with Marie Curie, or sitting in an Enlightenment-era salon. 2 Or imagine, as a researcher (if individual Turing Point machines do aggregate well), being able to do well-grounded counterfactual history that works at the token level rather than at the type level.

Turing Point Simulations

Bringing this slightly back into the realm of the sane, the interesting thing here is the interplay between appreciability (a person’s ability to appreciate enough difference to notice something wrong with a simulacrum) and fidelity.

We can specifically design simulation conditions with incredibly low-threshold Turing Points, even for token-bots. That is to say, we can create a condition where the interactions are simple enough to make a bot that acts indistinguishably from the specific human it is simulating.

At the most extreme end, this is obviously pointless. If our system is one in which a person can only answer “yes” or “no” to pre-selected preference questions (“Do you like ice-cream?”), making a bot to simulate that person convincingly would be trivial.
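
To make the triviality concrete, the entire bot fits in a few lines of Python; the recorded answers below are hypothetical:

    # The recorded answers are hypothetical; the point is that the medium
    # admits no behavior beyond replaying recorded preferences.
    recorded_answers = {
        "Do you like ice-cream?": "yes",
        "Do you like mornings?": "no",
    }

    def token_bot(question):
        # By construction, indistinguishable from the person it simulates;
        # default to "no" for questions never asked of the original.
        return recorded_answers.get(question, "no")

    print(token_bot("Do you like ice-cream?"))   # -> yes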

Putting that aside (lest we get into questions of the Turing Point of a set of Turing Points), we can potentially design reasonably simplistic test scenarios that allow for an easy-to-reach Turing Point while still being historiographically or sociologically useful. It’s sort of a minimization problem: limit the burden of the simulation while maximizing the potential research benefit (but only if, as mentioned before, the difference between true fidelity and the ability to win a token-bot Turing Test is small enough to allow for generalization).

In short, the concept of a Turing Point can help us conceptualize and build token-simulacra that are useful for research or teaching. It helps us ask the question: what’s the least-complex-but-still-useful token-simulacra? It’s also kind-of maybe sort-of like Kolmogorov complexity for human appreciability of other humans: that is, the simplest possible representation of a human that is convincing to other humans.

I’ll end by saying, once again, I realize how insane this sounds, and how far-off. And also how much of an interloper I am in this space, having never so much as designed a bot. Still, as Bill Hart-Davidson wrote, the possibility seems more plausible than ever, even if not soon-to-come. I’m not even sure why I posted this on the Irregular, but it seemed like it’d be relevant enough to some regular readers’ interests to be worth spilling some ink.

Notes:

  1. The name itself is maybe too on-the-nose, being a pun on “turning point” and thus connected to the rhetoric of the singularity, but ¯\_(ツ)_/¯
  2. Yes yes I know, this is SecondLife all over again, but hopefully much more useful.

Who sits in the 41st chair?

tl;dr Rich-get-richer academic prestige in a scarce job market makes meritocracy impossible. Why some things get popular and others don’t. Also agent-based simulations.

Slightly longer tl;dr This post is about why academia isn’t a meritocracy, at no intentional fault of those in power who try to make it one. None of the ideas presented here are novel on their own, but I do intend this as a novel conceptual contribution in its connection of disparate threads. In particular, I suggest the predictability of research success in a scarce academic economy as a theoretical framework for exploring successes and failures in the history of science.

But mostly I just beat a “musical chairs” metaphor to death.

Positive Feedback

To the victor go the spoils, and to the spoiled go the victories. Think about it: the Yankees; Alexander the Great; Stanford University. Why do the Yankees have twice as many World Series appearances as their nearest competitors, how was Alex’s empire so fucking vast, and why does Stanford get all the cool grants?

The rich get richer. Enough World Series victories, and the Yankees get the reputation and funding to entice the best players. Ol’ Allie-G inherited an amazing army, was taught by Aristotle, and pretty much every place he conquered increased his military’s numbers. Stanford’s known for amazing tech innovation, so they get the funding, which means they can afford even more innovation, which means even more people think they’re worthy of funding, and so on down the line until Stanford and its neighbors (Google, Apple, etc.) destroy the local real estate market and then accidentally blow up the world.

Alexander’s Empire [via]
Okay, maybe I exaggerated that last bit.

Point is, power begets power. Scientists call this a positive feedback loop: when a thing’s size is exactly what makes it grow larger.

You’ve heard it firsthand when a microphoned singer walks too close to her speaker. First the mic picks up what’s already coming out of the speaker. The mic, doing its job, sends what it hears to an amplifier, which sends an even louder version to the very same speaker. The speaker replays a louder version of what it just produced, which is once again received by the microphone, until the sound feeds back onto itself enough times to produce the ear-shattering squeal fans of live music have come to dread. This is a positive feedback loop.

Feedback loop. [via]
Positive feedback loops are everywhere. They’re why the universe counts logarithmically rather than linearly, or why income inequality is so common in free market economies. Left to their own devices, the rich tend to get richer, since it’s easier to make money when you’ve already got some.

Science and academia are equally susceptible to positive feedback loops. Top scientists, the most well-funded research institutes, and world-famous research all got to where they are, in part, because of something called the Matthew Effect.

Matthew Effect

The Matthew Effect isn’t the reality TV show it sounds like.

For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken even that which he hath. —Matthew 25:29, King James Bible.

It’s the Biblical idea that the rich get richer, and it’s become a popular party trick among sociologists (yes, sociologists go to parties) describing how society works. In academia, the phrase is brought up alongside evidence that shows previous grant-recipients are more likely to receive new grants than their peers, and the more money a researcher has been awarded, the more they’re likely to get going forward.

The Matthew Effect is also employed metaphorically when it comes to citations. He who gets some citations will accrue more; she who has the most citations will accrue them exponentially faster. There are many correct explanations, but the simplest one will do here:

If Susan’s article on the danger of velociraptors is cited by 15 other articles, I am more likely to find it and cite her than another article on velociraptors containing the same information that has never been cited. That’s because when I’m reading research, I look at who’s being cited. The more Susan is cited, the more likely I’ll eventually come across her article and cite it myself, which in turn makes it that much more likely that someone else will find her article through my own citations. Continue ad nauseam.
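
If you doubt how little quality needs to be involved, here’s a minimal Python sketch of that feedback loop. Every parameter is made up and every paper is identical in quality; the skew comes entirely from the loop:

    import random

    def citation_world(n_papers=5000, cites_per_paper=5):
        """Each new paper cites earlier ones with probability proportional
        to citations already received (+1, so uncited papers stay findable)."""
        citations = [0]                              # paper 0 founds the field
        for _ in range(n_papers - 1):
            weights = [c + 1 for c in citations]
            refs = random.choices(range(len(citations)),
                                  weights=weights, k=cites_per_paper)
            for cited in refs:
                citations[cited] += 1
            citations.append(0)                      # the new paper, uncited so far
        return sorted(citations, reverse=True)

    ranked = citation_world()
    print("most-cited paper:", ranked[0])                  # a runaway superstar
    print("median paper:    ", ranked[len(ranked) // 2])   # the long tail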

Some of you are thinking this is stupid. Maybe it’s trivially correct, but missing the bigger picture: quality. What if Susan’s velociraptor research is simply better than the competing research, and that’s why it’s getting cited more?

Yes, that’s also an issue. Noticeably awful research simply won’t get much traction. 1 Let’s disqualify it from the citation game. The point is there is lots of great research out there, waiting to be read and built upon, and its quality isn’t the sole predictor of its eventual citation success.

In fact, quality is a mostly-necessary but completely insufficient indicator of research success. Superstar popularity of research depends much more on the citation effects I mentioned above – more citations begets even more. Previous success is the best predictor of future success, mostly independent of the quality of research being shared.

Example of positive feedback loops pushing some articles to citation stardom. [via]
This is all pretty hand-wavy. How do we know success is more important than quality in predicting success? Uh, basically because of Napster.

Popular Music

If VH1 were to produce a retrospective on the first decade of the 21st century, perhaps its two biggest subjects would be illegal music sharing and VH1’s I Love the 19xx… TV series. Napster came and went, followed by LimeWire, eDonkey2000, AudioGalaxy, and other services sued by Metallica. Well-known early internet memes like Hamster Dance and All Your Base Are Belong To Us spread through the web like socially transmitted diseases, and researchers found this the perfect opportunity to explore how popularity worked. Experimentally.

In 2006, a group of Columbia University social scientists designed a clever experiment to test why some songs became popular and others did not, relying on the public interest in online music sharing. They created a music downloading site which gathered 14,341 users, each one to become a participant in their social experiment.

The cleverness arose out of their experimental design, which allowed them to get past the pesky problem of history only ever happening once. It’s usually hard to learn why something became popular, because you don’t know what aspects of its popularity were simply random chance, and what aspects were genuine quality. If you could, say, just rerun the 1960s, changing a few small aspects here or there, would the Beatles still have been as successful? We can’t know, because the 1960s are pretty much stuck having happened as they did, and there’s not much we can do to change it. 2

But this music-sharing site could rerun history—or at least, it could run a few histories simultaneously. When they signed up, each of the site’s 14,341 users was randomly sorted into a group, and their group number determined how they were presented music. The music was intentionally obscure, so users wouldn’t have heard the bands before.

A user from the first group, upon logging in, would be shown songs in random order, and given the option to listen to a song, rate it 1-5, and download it. Users from group #2, instead, were shown the songs ranked in order of their popularity among other members of group #2. Group #3 users were shown a similar rank-order of popular songs, but this time determined by the songs’ popularity within group #3. So too for groups #4-#9. Every user could listen to, rate, and download music.

Essentially, the researchers put the participants into 9 different self-contained petri dishes, and waited to see which music would become most popular in each. Ranking and download popularity from group #1 was their control group, in that its members judged music on quality alone, without access to social influence. Members of groups #2-#9 could be influenced by what music was popular with their peers within the group. The same songs circulated in each petri dish, and each petri dish presented its own version of history.
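
Here’s a toy Python version of the design (not the study’s actual code or data, and every parameter except the 48 songs is invented): identical songs, separate self-contained histories, and popularity feeding back on itself in the influence worlds.

    import random

    N_SONGS, LISTENERS = 48, 2000        # the real study used 48 obscure songs
    quality = [random.random() for _ in range(N_SONGS)]   # fixed intrinsic appeal

    def run_world(social_influence):
        """One petri dish: listeners pick songs by quality alone (control),
        or by quality weighted by current popularity (influence worlds)."""
        downloads = [0] * N_SONGS
        for _ in range(LISTENERS):
            if social_influence:
                weights = [quality[s] * (downloads[s] + 1) for s in range(N_SONGS)]
            else:
                weights = quality
            pick = random.choices(range(N_SONGS), weights=weights)[0]
            downloads[pick] += 1
        return downloads

    quality_rank = sorted(range(N_SONGS), key=lambda s: -quality[s])
    for world in range(2, 10):           # worlds #2-#9, each its own history
        downloads = run_world(True)
        star = max(range(N_SONGS), key=lambda s: downloads[s])
        print(f"world #{world}: star is song {star} "
              f"(quality rank {quality_rank.index(star) + 1} of {N_SONGS})")

Run it a few times: each influence world tends to crown a different star, and often not the song at the top of the quality ranking.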

Music sharing site from Columbia study.

No superstar songs emerged out of the control group. Positive feedback loops weren’t built into the system, since popularity couldn’t beget more popularity if nobody saw what their peers were listening to. The other 8 musical petri dishes told a different story, however. Superstars emerged in each, but each group’s population of popular music was very different. A song’s popularity in each group was slightly related to its quality (as judged by ranking in the control group), but mostly it was social-influence-produced chaos. The authors put it this way:

In general, the “best” songs never do very badly, and the “worst” songs never do extremely well, but almost any other result is possible. —Salganik, Dodds, & Watts, 2006

These results became even more pronounced when the researchers increased the visibility of social popularity in the system. The rich got even richer still. A lot of it has to do with timing. In each group, the first few good songs to become popular are the ones that eventually do the best, simply by an accident of circumstance. The first few popular songs appear at the top of the list, for others to see, so they in turn become even more popular, and so on ad infinitum. The authors go on:

experts fail to predict success not because they are incompetent judges or misinformed about the preferences of others, but because when individual decisions are subject to social influence, markets do not simply aggregate pre-existing individual preferences.

In short, quality is a necessary but insufficient criterion for ultimate success. Social influence, timing, randomness, and other non-qualitative features of music are what turn a good piece of music into an off-the-charts hit.

Wait what about science?

Compare this to what makes a “well-respected” scientist: it ain’t all citations and social popularity, but they play a huge role. And as I described above, simply out of exposure-fueled propagation, the more citations someone accrues, the more citations they are likely to accrue, until we get a situation like the Yankees (40 World Series appearances, versus 20 appearances by the Giants) on our hands. Superstars are born who are miles beyond the majority of working researchers in terms of grants, awards, citations, etc. Social scientists call this preferential attachment.

Which is fine, I guess. Who cares if scientific popularity is so skewed, as long as good research is happening? Even if we take the Columbia social music experiment at face value as an exact analog for scientific success, we know that the most successful are always good scientists and the least successful are always bad ones, so what does it matter if variability within the ranks of the successful is so detached from quality?

Except, as anyone studying their #OccupyWallstreet knows, it ain’t that simple in a scarce economy. When the rich get richer, that money’s gotta come from somewhere. Like everything else (cf. the law of conservation of mass), academia is a (mostly) zero-sum game, and to the victors go the spoils. To the losers? Meh.

So let’s talk scarcity.

The 41st Chair

The same guy who introduced the concept of the Matthew Effect to scientific grants and citations, Robert K. Merton (…of Columbia University), also brought up “the 41st chair” in the same 1968 article.

Merton’s pretty great, so I’ll let him do the talking:

In science as in other institutional realms, a special problem in the workings of the reward system turns up when individuals or organizations take on the job of gauging and suitably rewarding lofty performance on behalf of a large community. Thus, that ultimate accolade in 20th-century science, the Nobel prize, is often assumed to mark off its recipients from all the other scientists of the time. Yet this assumption is at odds with the well-known fact that a good number of scientists who have not received the prize and will not receive it have contributed as much to the advancement of science as some of the recipients, or more.

This can be described as the phenomenon of “the 41st chair.” The derivation of this tag is clear enough. The French Academy, it will be remembered, decided early that only a cohort of 40 could qualify as members and so emerge as immortals. This limitation of numbers made inevitable, of course, the exclusion through the centuries of many talented individuals who have won their own immortality. The familiar list of occupants of this 41st chair includes Descartes, Pascal, Molière, Bayle, Rousseau, Saint-Simon, Diderot, Stendhal, Flaubert, Zola, and Proust.

[…]

But in greater part, the phenomenon of the 41st chair is an artifact of having a fixed number of places available at the summit of recognition. Moreover, when a particular generation is rich in achievements of a high order, it follows from the rule of fixed numbers that some men whose accomplishments rank as high as those actually given the award will be excluded from the honorific ranks. Indeed, their accomplishments sometimes far outrank those which, in a time of less creativity, proved enough to qualify men for this high order of recognition.

The Nobel prize retains its luster because errors of the first kind—where scientific work of dubious or inferior worth has been mistakenly honored—are uncommonly few. Yet limitations of the second kind cannot be avoided. The small number of awards means that, particularly in times of great scientific advance, there will be many occupants of the 41st chair (and, since the terms governing the award of the prize do not provide for posthumous recognition, permanent occupants of that chair).

Basically, the French Academy allowed only 40 members (chairs) at a time. We can be reasonably certain those members were pretty great, but we can’t be sure that equally great (or greater) candidates didn’t exist who simply never got the opportunity to participate, because none of the 40 members died in time.

These good-enough-to-be-members-but-weren’t were said to occupy the French Academy’s 41st chair, an inevitable outcome of a scarce economy (40 chairs) in which the potential beneficiaries far outnumber the goods available. The population occupying the 41st chair is huge, and growing, since the same number of chairs has existed since 1634, but the population of France has quadrupled in the intervening four centuries.

Returning to our question of “so what if rich-get-richer doesn’t stick the best people at the top, since at least we can assume the people at the top are all pretty good anyway?”, scarcity of chairs is the so-what.

Since faculty jobs are stagnating compared to adjunct work, yet new PhDs are being granted faster than new jobs become available, we are presented with the much-discussed crisis in higher education. Don’t worry, we’re told, academia is a meritocracy. With so few jobs, only the cream of the crop will get them. The best work will still be done, even in these hard times.

Recent Science PhD growth in the U.S. [via]
Unfortunately, as the Columbia social music study (among many others) showed, true meritocracies are impossible in complex social systems. Anyone who plays the academic game knows this already, and many are quick to point it out when they see people in much better jobs doing incredibly stupid things. What those who point out the falsity of meritocracy often get wrong, however, is intention: the idea that there is no meritocracy because those in power talk the meritocracy talk but don’t then walk the walk. I’ll show a bit later how, even if everyone is above board in trying to push the best people forward, occupants of the 41st chair will still often wind up being more deserving than those sitting in chairs 1-40.

For now, let’s start building a metaphor that we’ll eventually over-extend well beyond its usefulness. Remember that kids’ game Musical Chairs, where everyone’s dancing around a bunch of chairs while the music is playing, but as soon as the music stops everyone’s got to find a chair and sit down? The catch, of course, is that there are fewer chairs than people, so someone always loses when the music stops.

The academic meritocracy works a bit like this. It is meritocratic, to a point: you can’t even play the game without proving some worth. The price of admission is a Ph.D. (which, granted, is more an endurance test than an intelligence test, but academic success ain’t all smarts, y’know?), a research area at least a few people find interesting and believe you’d be able to do good work in, etc. It’s a pretty low meritocratic bar, one cleared by the roughly 50,000 people who earned U.S. doctorates in 2008 alone, but it’s a bar nonetheless. And everyone who clears it is your competition in Academic Musical Chairs.

Academic Musical Chairs

Time to invent a game! It’s called Academic Musical Chairs, the game where everything’s made up and the points don’t matter. It’s like Regular Musical Chairs, but more complicated (see Fig. 1). Also the game is fixed.

Figure 1: Academic Musical Chairs

See those 40 chairs in the middle green zone? People sitting in them are the winners. Once they’re seated they have what we call in the game “tenure”, and they don’t get up until they die or write something controversial on Twitter. Everyone bustling around them, the active players, are vying for seats while they wait for someone to die; they occupy the yellow zone we call “the 41st chair”. Those beyond that, in the red zone, can’t yet (or may never) afford the price of game admission; they don’t have a Ph.D., they already said something controversial on Twitter, etc. The unwashed masses, you know?

As the music plays, everyone in the 41st chair is walking around in a circle waiting for someone to die and the music to stop. When that happens, everyone rushes to the empty seat. A few invariably reach it simultaneously, until one out-muscles the others and sits down. The sitting winner gets tenure. The music starts again, and the line continues to orbit the circle.

If a player spends too long orbiting in the 41st chair, he is forced to resign. If a player runs out of money while orbiting, she is forced to resign. Other factors may force a player to resign, but they will never appear in the rulebook and will always be a surprise.

Now, some players are more talented than others, whether naturally or through intense training. The game calls this “academic merit”, but it translates here to increased speed and strength, which helps some players reach the empty chair when the music stops, even if they’re a bit further away. The strength certainly helps when competing with others who reach the chair at the same time.

A careful look at Figure 1 will reveal one other way players might increase their chances of success when the music stops. The 41st chair has certain internal shells, or rings, which act a bit like that fake model of an atom everyone learned in high-school chemistry. Players, of course, are the electrons.

Electron shells. [via]
You may remember that the further out the shell, the more electrons can occupy it (-ish): the first shell holds 2 electrons, the second holds 8, the third holds 18, the fourth holds 32, and so on. The same holds true for Academic Musical Chairs: the coveted interior ring only fits a handful of players; the second ring fits an order of magnitude more; the third ring an order of magnitude more than that; and so on.

Getting closer to the center isn’t easy, and it has very little to do with your “academic rigor”! Also, of course, the closer you are to the center, the easier it is to reach either the chair, or the next level (remember positive feedback loops?). Contrariwise, the further you are from the center, the less chance you have of ever reaching the core.

Many factors affect whether a player can proceed to the next ring while the music plays, and some factors actively count against a player. Old age and being a woman, for example, take away 1 point. Getting published or cited adds points, as does already being friends with someone sitting in a chair (the details of how many points each adds can be found in your rulebook). Obviously the closer you are to the center, the easier you can make friends with people in the green core, which will contribute to your score even further. Once your score is high enough, you proceed to the next-closest shell.

Hooray, someone died! Let’s watch what happens.

The music stops. The people in the innermost ring who have the luckiest timing (thus are closest to the empty chair) scramble for it, and a few even reach it. Some very well-timed players from the 2nd & 3rd shells also reach it, because their “academic merit” has lent them speed and strength to reach past their position. A struggle ensues. Miraculously, a pregnant black woman sits down (this almost never happens), though not without some bodily harm, and the music begins again.

Oh, and new shells keep getting tacked on as more players can afford the cost of admission to the yellow zone, though the green core remains the same size.

Bizarrely, this is far from the first game of this nature. A Spanish boardgame from 1587 called the Courtly Philosophy had players move figures around a board, inching closer to living a luxurious life in the shadow of a rich patron. Random chance ruled their progression—a roll of the dice—and occasionally they’d reach a tile that said things like: “Your patron dies, go back 5 squares”.

The courtier’s philosophy. [via]
But I digress. Let’s temporarily table the scarcity/41st-chair discussion and get back to the Matthew Effect.

The View From Inside

A friend recently came to me, excited but nervous about how well they were being treated by their department at the expense of their fellow students. “Is this what the Matthew Effect feels like?” they asked. Their question is the reason I’m writing this post, because I spent the next 24 hours scratching my head over “what does the Matthew Effect feel like?”.

I don’t know if anyone’s looked at the psychological effects of the Matthew Effect (if you know of any such research, please comment), but my guess is it encompasses two feelings: 1) impostor syndrome, and 2) hard work finally paying off.

Since almost anyone who reaps the benefits of the Matthew Effect in academia will be an intelligent, hard-working academic, a windfall of accruing success should feel like finally reaping the benefits one deserves. You probably realize that luck played a part, and that many of your harder-working, smarter friends have been equally unlucky, but there’s no doubt in your mind that, at least, your hard work is finally paying off and the academic community is beginning to recognize that fact. No matter how unfair it is that your great colleagues aren’t seeing the same success.

But here’s the thing. You know how in physics, gravity and acceleration feel equivalent? How, if you’re in a windowless box, you wouldn’t be able to tell the difference between being stationary on Earth and being pulled by a spaceship at 9.8 m/s² through deep space? Success from merit or from the Matthew Effect probably acts similarly, such that it’s impossible to tell one from the other from the inside.

Gravity vs. Acceleration. [via]
Incidentally, that’s why the last advice you ever want to take is someone telling you how to succeed from their own experience.

Success

Since we’ve seen that explosive success requires skill, quality, and intent but doesn’t depend on them, the most successful people are not necessarily in the best position to understand the reason for their own rise. Their strategies may have paid off, but so did timing, social network effects, and positive feedback loops. The question you should be asking is: why didn’t other people with the same strategies also succeed?

Keep this especially in mind if you’re a student and your tenured-professor advisor encourages you to seek an academic career. They may believe that giving you their strategies for success will help you succeed, when really they’re just handing you one of 50,000 admission tickets to Academic Musical Chairs.

Building a Meritocracy

I’m teetering well-past the edge of speculation here, but I assume the communities of entrenched academics encouraging undergraduates into a research career are the same communities assuming a meritocracy is at play, and are doing everything they can in hiring and tenure review to ensure a meritocratic playing field.

But even if gender bias did not exist, even if everyone responsible for decision-making genuinely wanted a meritocracy, even if the game weren’t rigged at many levels, the economy of scarcity (41st chair) combined with the Matthew Effect would ensure a true meritocracy would be impossible. There are only so many jobs, and hiring committees need to choose some selection criteria; those selection criteria will be subject to scarcity and rich-get-richer effects.

I won’t prove that point here, because original research is beyond the scope of this blog post, but I have a good idea of how to do it. In fact, after I finish writing this, I probably will go do just that. Instead, let me present very similar research, and explain how that method can be used to answer this question.

We want an answer to the question of whether positive feedback loops and a scarce economy are sufficient to prevent the possibility of a meritocracy. In 1971, Tom Schelling asked an unrelated question which he answered using a very relevant method: can racial segregation manifest in a community whose every actor is intent on not living a segregated life? Spoiler alert: yes.

He answered this question by simulating an artificial world—similar in spirit to the Columbia social music experiment, except instead of using real participants, he experimented on very simple rule-abiding game creatures of his own invention. A bit like having a computer play checkers against itself.

The experiment is simple enough: a bunch of creatures occupy a checker board, and like checker pieces, they’re red or black. Every turn, one creature has the opportunity to move randomly to another empty space on the board, and their decision to move is based on their comfort with their neighbors. Red pieces want red neighbors, and black pieces want black neighbors, and they keep moving randomly ’till they’re all comfortable. Unsurprisingly, segregated creature communities appear in short order.

What if our checker-creatures were more relaxed in their comforts? Say they’d be comfortable as long as they were in the majority; that is, at least 50% of their neighbors are the same color. Again, let the computer play itself for a while, and within a few cycles the checkerboard is once again almost completely segregated.

Schelling segregation. [via]
What if the checker pieces are positively excited about the prospect of a diverse neighborhood? We relax the criterion even more, so red checkers only move if fewer than a third of their neighbors are red (that is, they’re totally comfortable with 66% of their neighbors being black). Run the experiment again, and the checkerboard still breaks up into segregated communities.

Schelling’s claim wasn’t about how the world actually worked, but about the simplest conditions that could still produce segregation. In his fictional checkers-world, every piece could be generously interested in living in a diverse neighborhood, and yet the system still eventually resulted in segregation. This offered powerful support for the theory that racism could operate subtly, even if every actor were well-intended.
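
The model is simple enough to fit in a few dozen lines. Here’s a from-scratch Python sketch of that most generous variant, where a piece moves only if fewer than a third of its neighbors share its color (grid size, emptiness, and iteration count are all arbitrary):

    import random

    SIZE, THRESHOLD = 30, 1 / 3     # move only if < 1/3 of neighbors share my color

    def neighbors(grid, x, y):
        cells = [grid[(x + dx) % SIZE][(y + dy) % SIZE]
                 for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
        return [c for c in cells if c is not None]       # ignore empty squares

    def unhappy(grid, x, y):
        ns = neighbors(grid, x, y)
        return bool(ns) and sum(n == grid[x][y] for n in ns) / len(ns) < THRESHOLD

    # 10% empty squares; the rest red ("R") or black ("B") at random.
    grid = [[None if random.random() < 0.1 else random.choice("RB")
             for _ in range(SIZE)] for _ in range(SIZE)]

    for _ in range(200_000):        # let the board play itself for a while
        x, y = random.randrange(SIZE), random.randrange(SIZE)
        if grid[x][y] is not None and unhappy(grid, x, y):
            ex, ey = random.randrange(SIZE), random.randrange(SIZE)
            if grid[ex][ey] is None:                     # hop to a random empty square
                grid[ex][ey], grid[x][y] = grid[x][y], None

    shares = []
    for x in range(SIZE):
        for y in range(SIZE):
            ns = neighbors(grid, x, y)
            if grid[x][y] is not None and ns:
                shares.append(sum(n == grid[x][y] for n in ns) / len(ns))
    print("average same-color neighbor share: %.0f%%" % (100 * sum(shares) / len(shares)))

Starting from a same-color neighbor average of roughly 50%, the board typically settles well above 70%: segregation without a single segregationist.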

Vi Hart and Nicky Case created an interactive visualization/game that teaches Schelling’s segregation model perfectly. Go play it. Then come back. I’ll wait.


Such an experiment can be devised for our 41st-chair/positive-feedback system as well. We can even build a simulation whose rules match the Academic Musical Chairs I described above. All we need to do is show that a system in which both effects operate (a fact empirically proven time and again in academia) produces fundamental challenges for meritocracy. Such a model would show that simple meritocratic intent is insufficient to produce a meritocracy. Hulk-smashing the myth of the meritocracy seems fun; I think I’ll get started soon.
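
For the impatient, here’s a bare-bones Python sketch of the sort of simulation I have in mind. Every number is invented, and the scoring rule is a cartoon of the game above: merit plus accrued advantage plus luck wins the seat, and near-winners accrue advantage for the next round.

    import random

    def musical_chairs(players=5000, chairs=40, feedback=0.1, shortlist=100):
        """A seat opens; everyone's score is merit + accrued advantage + luck.
        The top scorer sits; near-winners accrue advantage (a better ring
        position, citations, friends in the green zone) for the next round."""
        merit = [random.gauss(0, 1) for _ in range(players)]
        advantage = [0.0] * players
        seated = set()
        for _ in range(chairs):                          # someone dies; music stops
            contenders = [p for p in range(players) if p not in seated]
            ranked = sorted(contenders, reverse=True,
                            key=lambda p: merit[p] + advantage[p] + random.gauss(0, 1))
            seated.add(ranked[0])
            for p in ranked[1:shortlist]:                # near-misses inch inward
                advantage[p] += feedback
        losers = [p for p in range(players) if p not in seated]
        print("mean merit of the seated:     %.2f" % (sum(merit[p] for p in seated) / chairs))
        print("best merit in the 41st chair: %.2f" % max(merit[p] for p in losers))

    musical_chairs()

On a typical run, the best player left standing out-merits the average person seated, which is the 41st chair in a nutshell.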

The Social Network

Our world ain’t that simple. For one, as seen in Academic Musical Chairs, your place in the social network influences your chances of success. A heavy-hitting advisor, an old-boys cohort, etc., all improve your starting position when you begin the game.

To put it more operationally, let’s go back to the Columbia social music experiment. Part of a song’s success was due to quality, but the stuff that made stars was much more contingent on chance timing followed by positive feedback loops. Two of the authors from the 2006 study wrote another in 2007, echoing this claim that good timing was more important than individual influence:

models of information cascades, as well as human subjects experiments that have been designed to test the models (Anderson and Holt 1997; Kubler and Weizsacker 2004), are explicitly constructed such that there is nothing special about those individuals, either in terms of their personal characteristics or in their ability to influence others. Thus, whatever influence these individuals exert on the collective outcome is an accidental consequence of their randomly assigned position in the queue.

These articles are part of a large literature in predicting popularity, viral hits, success, and so forth. There’s The Pulse of News in Social Media: Forecasting Popularity by Bandari, Asur, & Huberman, which showed that a top predictor of newspaper shares was the source rather than the content of an article, and that a major chunk of articles that do get shared never really make it to viral status. There’s Can Cascades be Predicted? by Cheng, Adamic, Dow, Kleinberg, and Leskovec (all-star cast if ever I saw one), which shows the remarkable reliance on timing and first impressions in predicting success, and also the reliance on social connectivity. That is, success travels faster through those who are well-connected (shocking, right?), and structural properties of the social network are important. This study by Susarla et al. also shows the importance of location in the social network in helping push those positive feedback loops, affecting the magnitude of success of YouTube video shares.

Twitter information cascade. [via]
Now, I know, social media success does not an academic career predict. The point here, instead, is to show that in each of these cases, before sharing occurs and without taking into account social media effects (that is, relying solely on the merit of the thing itself), success is predictable, but stardom is not.

Concluding, Finally

Relating this back to Academic Musical Chairs: it’s not too difficult to say whether someone will end up in the 41st chair, but it’s impossible to tell whether they’ll end up in seats 1-40 unless you keep an eye on how positive feedback loops are affecting their career.

In the academic world, there’s a fertile prediction market for Nobel Laureates. Social networks and Matthew Effect citation bursts are decent enough predictors, but what anyone who predicts any kind of success will tell you is that it’s much easier to predict the pool of recipients than it is to predict the winners.

Take economics. How many working economists are there? Tens of thousands, at least. But there’s this Econometric Society, which began naming Fellows in 1933 and had named 877 of them by 2011. And guess what: 60 of the 69 Nobel Laureates in Economics before 2011 were Fellows of the society. The other 817 Fellows are or were occupants of the 41st chair.

The point is (again, sorry), academic meritocracy is a myth. Merit is a price of admission to the game, but not a predictor of success in a scarce economy of jobs and resources. Once you pass the basic merit threshold and enter the 41st chair, forces having little to do with intellectual curiosity and rigor guide eventual success (ahem). Small positive biases like gender, well-connected advisors, early citations, lucky timing, etc. feed back into increasingly larger positive biases down the line. And since there are only so many faculty jobs out there, these feedback effects create a naturally imbalanced playing field. Sometimes Einsteins do make it into the middle ring, and sometimes they stay patent clerks. Or adjuncts, I guess. Those who do make it past the 41st chair are poorly-suited to tell you why, because by and large they employed the same strategies as everybody else.

Yep, Academic Musical Chairs (Figure 1, again)

And if these six thousand words weren’t enough to convince you, I leave you with this article and this tweet. Have a nice day!

Addendum for Historians

You thought I was done?

As a historian of science, this situation has some interesting repercussions for my research. Perhaps most importantly, it and related concepts from Complex Systems research offer a middle ground framework between environmental/contextual determinism (the world shapes us in fundamentally predictable ways) and individual historical agency (we possess the power to shape the world around us, making the world fundamentally unpredictable).

More concretely, it is historically fruitful to ask not simply what non-“scientific” strategies were employed by famous scientists to get ahead (see Biagioli’s Galileo, Courtier), but also what did or did not set those strategies apart from the masses of people we no longer remember. Galileo, Courtier provides a great example of what we historians can do on a larger scale: it traces Galileo’s machinations to wind up in the good graces of a wealthy patron, and how such a system affected his own research. Using recently-available data on early modern social and scholarly networks, as well as the beginnings of data on people’s activities, interests, practices, and productions, it should be possible to zoom out from Biagioli’s viewpoint and get a fairly sophisticated picture of trajectories and practices of people who weren’t Galileo.

This is all very preliminary, just publicly blogging whims, but I’d be fascinated by what a wide-angle (dare I say, macroscopic?) analysis of the 41st chair could tell us about how social and “scientific” practices shaped one another in the 16th and 17th centuries. I believe this would bear previously-impossible fruit, since grasping ten thousand tertiary actors at once is a fool’s errand for a lone historian, but a walk in the park for my laptop.

As this really is whim-blogging, I’d love to hear your thoughts.

Notes:

  1. Unless it’s really awful, but let’s avoid that discussion here.
  2. Short of a TARDIS.

Culturomics 2: The Search for More Money

“God willing, we’ll all meet again in Spaceballs 2: The Search for More Money.” -Mel Brooks, Spaceballs, 1987

A long time ago in a galaxy far, far away (2012 CE, Indiana), I wrote a few blog posts explaining that, when writing history, it might be good to talk to historians (1, 2, 3). They were popular posts for the Irregular, and inspired by Mel Brooks’ recent interest in making Spaceballs 2, I figured it was time for a sequel of my own. You know, for all the money this blog pulls in. 1


Two teams recently published very similar articles, attempting cultural comparison via a study of historical figures in different-language editions of Wikipedia. The first, by Gloor et al., is for a conference next week in Japan, and frames itself as cultural anthropology through the study of leadership networks. The second, by Eom et al. and just published in PLoS ONE, explores cross-cultural influence through historical figures who span different language editions of Wikipedia.

Before reading the reviews, keep in mind I’m not commenting on method or scientific contribution—just historical soundness. This often doesn’t align with the original authors’ intents, which is fine. My argument isn’t that these pieces fail at their goals (science is, after all, iterative), but that they would be markedly improved by adhering to the same standards of historical rigor as they adhere to in their home disciplines, which they could accomplish easily by collaborating with a historian.

The road goes both ways. If historians don’t want physicists and statisticians bulldozing through history, we ought to be open to collaborating with those who don’t have a firm grasp on modern historiography, but who nevertheless have passion, interest, and complementary skills. If the point is understanding people better, by whatever means relevant, we need to do it together.

Cultural Anthropology

“Cultural Anthropology Through the Lens of Wikipedia – A Comparison of Historical Leadership Networks in the English, Chinese, Japanese and German Wikipedia” by Gloor et al. analyzes “the historical networks of the World’s leaders since the beginning of written history, comparing them in the four different Wikipedias.”

Their method is simple (simple isn’t bad!): take each “people page” in Wikipedia, and create a network of people based on who else is linked within that page. For example, if Wikipedia’s article on Mozart links to Beethoven, a connection is drawn between them. Connections are only drawn between people whose lives overlap; for example, the Mozart (1756-1791) Wikipedia page also links to Chopin (1810-1849), but because they did not live concurrently, no connection is drawn.

Figure 1 from Gloor et al.

A separate network is created for four different language editions of Wikipedia (English, Chinese, Japanese, German), because biographies in each edition are rarely exact translations, and often different people will be prominent within the same biography across all four languages. PageRank was calculated for all the people in the resulting networks, to get a sense of who the most central figures are according to the Wikipedia link structure.
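
The pipeline is easy to sketch. Here’s a toy Python version using networkx, with a three-person stand-in for the millions of pages you’d actually pull from a Wikipedia dump; repeat once per language edition:

    import networkx as nx

    people = {                      # page title -> (birth year, death year)
        "Mozart":    (1756, 1791),
        "Beethoven": (1770, 1827),
        "Chopin":    (1810, 1849),
    }
    links = [("Mozart", "Beethoven"), ("Mozart", "Chopin")]   # intra-Wikipedia links

    def lives_overlap(a, b):
        (born_a, died_a), (born_b, died_b) = people[a], people[b]
        return born_a <= died_b and born_b <= died_a

    # Keep only links between people whose lives overlapped:
    # Mozart-Beethoven survives; Mozart-Chopin is dropped.
    G = nx.DiGraph([(a, b) for a, b in links if lives_overlap(a, b)])
    print(nx.pagerank(G))           # centrality as a proxy for "importance"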

“Who are the most important people of all times?” the authors ask, and their data provides them an answer. 2 In China and Japan, they show, only warriors and politicians make the cut, whereas religious leaders, artists, and scientists make more of a mark on Germany and the English-speaking world. Historians and biographers wind up central too, given how often their names appear on the pages of the famous contemporaries about whom they wrote.

Diversity is also a marked difference: 80% of the “top 50” people for the English Wikipedia were themselves non-English, whereas only 4% of the top people from the Chinese Wikipedia are not Chinese. The authors conclude that “probing the historical perspective of many different language-specific Wikipedias gives an X-ray view deep into the historical foundations of cultural understanding of different countries.”

Figure 3 from Gloor et al.

Small quibbles aside (e.g. their data include the year 0 BC, which doesn’t exist), the big issue here is the ease with which they claim these are the “most important” actors in history, and that these datasets provides an “X-ray” into the language cultures that produced them. This betrays the same naïve assumptions that plague much of culturomics research: that you can uncritically analyze convenient datasets as a proxy for analyzing larger cultural trends.

You can in fact analyze convenient datasets as a proxy for larger cultural trends, you just need some cultural awareness and a critical perspective.

In this case, several layers of assumptions are open for questioning, including:

  • Is the PageRank algorithm a good proxy for historical importance? (The answer turns out to be yes in some situations, but probably not this one.)
  • Is the link structure in Wikipedia a good proxy for historical dependency? (No, although it’s probably a decent proxy for current cultural popularity of historical figures, which would have been a better framing for this article. Better yet, these data can be used to explore the many well-known and unknown biases that pervade Wikipedia.)
  • Can differences across language editions of Wikipedia be explained by any factors besides cultural differences? (Yes. For example, editors of the German-language Wikipedia may be less likely to write a German biography if one already exists in English, given that ≈64% of Germany speaks English.)

These and other questions, unexplored in the article, make it difficult to take at face value the claim that this study can reveal important historical actors or compare cultural norms of importance. Which is a shame, because simple datasets and approaches like this one can produce culturally and scientifically valid results that wind up being incredibly important. And the scholars working on the project are top-notch; it’s just that they don’t have all the necessary domain expertise to explore their data and questions.

Cultural Interactions

The great thing about PLoS is the quality control on its publications: there isn’t much. As long as primary research is presented, the methods are sound, the data are open, and the experiment is well-documented, you’re in.

It’s a great model: all reasonable work by reasonable people is published, and history decides whether an article is worthy of merit. Contrast this against the current model, where (let’s face it) everything gets published eventually anyway, it’s just a question of how many journal submissions and rounds of peer review you’re willing to sit through. Research sits for years waiting to be published, subject to the whims of random reviewers and editors who may hold long grudges, when it could be out there the minute it’s done, open to critique and improvement, and available to anyone to draw inspiration or to learn from someone’s mistakes.

“Interactions of Cultures and Top People of Wikipedia from Ranking of 24 Language Editions” by Eom et al. is a perfect example of this model. Do I consider it a paragon of cultural research? Obviously not, if I’m reviewing it here. Am I happy the authors published it, respectful of their attempt, and willing to use it to push forward our mutual goal of soundly-researched cultural understanding? Absolutely.

Eom et al.’s piece, similar to that of Gloor et al. above, uses links between Wikipedia people pages to rank historical figures and to make cultural comparisons. The article explores 24 different language editions of Wikipedia, and goes one step further, using the data to explore intercultural influence. Importantly, given that this is a journal-length article and not a paper from conference proceedings like Gloor et al.’s, extra space and thought was clearly put into the cultural biases of Wikipedia across languages. That said, neither of the articles reviewed here include any authors who identify themselves as historians or cultural experts.

This study collected data a bit differently from the last. Instead of a network connecting only those people whose lives overlapped, this network connected all pages within a single-language edition of Wikipedia, based only on links between articles. 3 They then ranked pages using a number of metrics, including but not limited to PageRank, and only then automatically extracted people to find who was the most prominent in each dataset.

In short, every Wikipedia article is linked in a network and ranked, after which all articles are culled except those about people. The authors explain: “On the basis of this data set we analyze spatial, temporal, and gender skewness in Wikipedia by analyzing birth place, birth date, and gender of the top ranked historical figures in Wikipedia.” By birth place, they mean the country currently occupying the location where a historical figure was born, such that Aristophanes, born in Byzantium 2,300 years ago, is considered Turkish for the purpose of this dataset. The authors note this can lead to cultural misattributions ≈3.5% of the time (e.g. Kant is categorized as Russian, having been born in a city now in Russian territory). They do not, however, call attention to the mutability of culture over time.

Table 2 from Eom et al.

It is unsurprising, though comforting, to note that the fairly different approach to measuring prominence yields many of the same top-10 results as Gloor’s piece: Shakespeare, Napoleon, Bush, Jesus, etc.

Analysis of the dataset resulted in several worthy conclusions:

  • Many of the “top” figures across all language editions hail from Western Europe or the U.S.
  • Language editions favor local heroes (half of the top figures in Wikipedia English are from the U.S. and U.K.; half of those in Wikipedia Hindi are from India) and regional heroes (among Wikipedia Korean’s top figures, many are Chinese).
  • Top figures are distributed throughout time in a pattern you’d expect given global population growth, excepting periods representing foundations of modern cultures (religions, politics, and so forth).
  • The farther you go back in time, the less likely a top figure from a given edition of Wikipedia is to have been born in that language’s region. That is, modern prominent figures in Wikipedia English are from the U.S. or the U.K., but the earlier you go, the less likely top figures are to have been born in English-speaking regions. (I’d question this a bit, given cultural movement and mutability, but it’s still a result worth noting.)
  • Women are consistently underrepresented in every measure and edition. More recent top people are more likely to be women than those from earlier years.
Figure 4 from Eom et al.

The article goes on to describe methods and results for tracking cultural influence, but this blog post is already tediously long, so I’ll leave that section out of this review.

There are many methodological limitations to their approach, but the authors are quick to notice and point them out. They mention that Linnaeus ranks so highly because “he laid the foundations for the modern biological naming scheme so that plenty of articles about animals, insects and plants point to the Wikipedia article about him.” This research was clearly approached with a critical eye toward methodology.

Eom et al. do not fare as well historically as methodologically; opportunities to frame claims more carefully, or to ask different sorts of questions, are overlooked. I mentioned earlier that the research assumes historical cultural consistency, but cultural currents intersect languages and geography at odd angles.

The fact that Wikipedia English draws significantly from other locations the earlier you look should come as no surprise. But, it’s unlikely English Wikipedians are simply looking to more historically diverse subjects; rather, the locus of some cultural current (Christianity, mathematics, political philosophy) has likely moved from one geographic region to another. This should be easy to test with their dataset by looking at geographic clustering and spread in any given year. It’d be nice to see them move in that direction next.

I do appreciate that they tried to validate their method by comparing their “top people” to lists other historians have put together. Unfortunately, the only non-Wikipedia-based comparison they make is to a book written by an astrophysicist and white separatist with no historical training: “To assess the alignment of our ranking with previous work by historians, we compare it with [Michael H.] Hart’s list of the top 100 people who, according to him, most influenced human history.”

Top People

Both articles claim that an algorithm analyzing Wikipedia networks can compare cultures and discover the most important historical actors, though neither defines what it means by “important.” The claim rests on the notion that Wikipedia’s grand scale and scope smooth out enough authorial bias that analyses of Wikipedia can inductively lead to discoveries about Culture and History.

And critically approached, that notion is more plausible than historians might admit. These two reviewed articles, however, don’t bring that critique to the table. 4 In truth, the dataset and analysis let us look through a remarkably clear mirror into the cultures that created Wikipedia, the heroes they make, and the roots to which they feel most connected.

Usefully for historians, there is likely much overlap between history and the picture Wikipedia paints of it, but the nature of that overlap needs to be understood before we can use Wikipedia to aid our understanding of the past. Without that understanding, boldly inductive claims about History and Culture risk reinforcing the same systemic biases which we’ve slowly been trying to fix. I’m absolutely certain the authors don’t believe that only 5% of history’s most important figures were women, but the framing of the articles does nothing to disabuse readers of that notion.

Eom et al. themselves admit “[i]t is very difficult to describe history in an objective way,” which I imagine is a sentiment we can all get behind. They may find an easier path forward in the company of some historians.

Notes:

  1. net income: -$120/year.
  2. If you’re curious, the 10 most important people in the English-speaking world, in order, are George W. Bush, ol’ Willy Shakespeare, Sidney Lee, Jesus, Charles II, Aristotle, Napoleon, Muhammad, Charlemagne, and Plutarch.
  3. Download their data here.
  4. Actually the Eom et al. article does raise useful critiques, but mentioning them without addressing them doesn’t really help matters.

Historians, Doctors, and their Absence

[Note: sorry for the lack of polish on the post compared to others. This was hastily written before a day of international travel. Take it with however many grains of salt seem appropriate under the circumstances.]

[Author’s note two: Whoops! Never included the link to the article. Here it is.]

Every once in a while, 1 a group of exceedingly clever mathematicians and physicists decides to do something exceedingly clever in a field that has nothing to do with math or physics. This particular research project has to do with the 14th-century Black Death, resulting in such claims as that the small-world network effect is a completely modern phenomenon, and that “most social exchange among humans before the modern era took place via face-to-face interaction.”

The article itself is really cool. And really clever! I didn’t think of it, and I’m angry at myself for not thinking of it. They look at the empirical evidence of the spread of disease in the late middle ages, and note that the pattern of disease spread looked shockingly different from patterns of disease spread today. Epidemiologists have long known that today’s patterns of disease propagation are dependent on social networks, so it’s not a huge leap to say that if earlier diseases spread differently, their networks must have been different too.

Don’t get me wrong, that’s really fantastic. I wish more people (read: me) would make observations like this. It’s the sort of observation that allows historians to infer facts about the past with reasonable certainty given tiny amounts of evidence. The problem is, the team included neither doctors nor historians of the late middle ages, and that absence turned an otherwise great paper into a set of questionable conclusions.

Small world networks have a formal mathematical definition, which (essentially) states that no matter how big the population of the world gets, everyone remains within a few degrees of separation of everyone else: the average path length between people grows only logarithmically with the size of the network. Everyone’s an acquaintance of an acquaintance of an acquaintance of an acquaintance. This non-intuitive fact is what drives the insane speeds of modern diseases; today, an epidemic can spread from Australia to every state in the U.S. in a matter of days. Because of this, disease spread maps are weirdly patchy, based more around how people travel than around geographic features.
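If you want to see that slow logarithmic growth for yourself, here’s a little sketch (mine, not anyone’s canonical demonstration) using networkx’s Watts-Strogatz model: a ring of mostly local ties, plus a few random long-range shortcuts. The biggest graph takes a moment to run, and the distances are estimated by sampling:

```python
import random
import networkx as nx

def mean_distance(G, samples=100, seed=0):
    """Estimate average shortest path length by sampling source nodes."""
    rng = random.Random(seed)
    nodes = list(G)
    total, count = 0, 0
    for source in rng.sample(nodes, samples):
        lengths = nx.single_source_shortest_path_length(G, source)
        total += sum(lengths.values())
        count += len(lengths) - 1  # exclude the source itself
    return total / count

# Watts-Strogatz graphs: mostly local ties, plus a few long-range shortcuts.
for n in (1_000, 10_000, 100_000):
    G = nx.watts_strogatz_graph(n=n, k=10, p=0.1, seed=42)
    print(n, round(mean_distance(G), 2))
```

Make the network a hundred times bigger and the average distance barely budges; that is the small-world property in action.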

Patchy H5N1 outbreak map.

The map of the spread of the Black Death in the 14th century looked very different. Instead of these patches, the disease appeared to spread in very deliberate waves, at a rate of about 2 km/day.

Spread of the plague, via the original article.

How to reconcile these two maps? The solution, according to the network scientists, was to create a model of people interacting and spreading diseases across various distances and types of networks. Using the models, they show that in order to generate these wave patterns of disease spread, the physical contact network cannot be small world. From this, because they make the (uncited) claim that physical contact networks had to be a subset of social contact networks (entirely ignoring, say, correspondence), they conclude the 14th century did not have small world social networks.
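To make concrete what such a model involves, here’s a cartoon version of the idea (my own toy, emphatically not the authors’ actual model): seed an infection on a purely local contact network and on one with a few long-range ties, and compare a slow, steady wave with an explosive spread:

```python
import random
import networkx as nx

def spread(G, seed_node=0, p_infect=0.3, rng=random.Random(1)):
    """Discrete-time susceptible-infected spread; returns new infections per step."""
    infected = {seed_node}
    history = []
    while True:
        new = set()
        for node in infected:
            for neighbor in G[node]:
                if neighbor not in infected and rng.random() < p_infect:
                    new.add(neighbor)
        if not new:
            break
        infected |= new
        history.append(len(new))
    return history

local = nx.watts_strogatz_graph(2_000, 10, p=0.0, seed=3)      # purely local ties
shortcut = nx.watts_strogatz_graph(2_000, 10, p=0.05, seed=3)  # a few long-range ties

print("wave-like:", spread(local)[:10])     # a roughly steady front, like the plague map
print("explosive:", spread(shortcut)[:10])  # accelerating jumps, like the H5N1 map
```

On the purely local network the infection can only crawl along the ring, so the front advances at a bounded speed; the handful of shortcuts is all it takes to turn the wave into patches.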

There’s a lot to unpack here. First, their model does not take into account the fact that people, y’know, die after they get the plague. Their model assumes the infected have enough time and impetus to travel to carry the disease as far as they can after becoming contagious. In the discussion, the authors do admit this is a stretch, but suggest that because people could travel 40 km/day if they so chose, while the Black Death spread at only 2 km/day, this objection is not sufficient to explain the waves.

I am no plague historian, nor a doctor, but a brief trip on the google suggests that Black Death symptoms could manifest in hours, with a swift death coming only days after. It is, I think, unlikely that people would or could travel great distances after symptoms began to show.

More important to note, however, are the assumptions the authors make about social ties in the middle ages. They assume a social tie must be a physical one; they assume social ties are connected with mobility; and they assume social ties are constantly maintained. The plague is a bit before my period of research, but even a hundred years later (still before the period the authors claim could have sustained small world networks), any early modern historian could tell you that communication was asynchronous and travel was ordered and infrequent.

Surprisingly, I actually believe the authors’ conclusions: that by the strict mathematical definition of small world networks, the “pre-modern” world might not have that feature. I do think distance and asynchronous communication prevented an entirely global 6-degree effect. That said, the assumptions they make about what a social tie is are entirely modern, which means their conclusion is essentially inevitable: historical figures did not maintain modern-style social connections, and thus metrics based on those types of connections should not apply. Taken in the social context of Europe in the late middle ages, however, I think the authors would find that the salient features of small world networks (short average path length and high clustering) existed in that world as well.

A second problem, and the reason I agree with the authors that there was not a global small world in the late 14th century, is that “global” is not an appropriate axis on which to measure “pre-modern” social networks. Today, we can reasonably say we all belong to a global population; at that point in time, before trade routes from Europe to the New World, and given other geographical and technological barriers, the world should instead be seen as a set of smaller, overlapping populations. My guess is that, for more reasonable definitions of populations for the period, small world properties would continue to hold.

Notes:

  1. Every day? Every two days?

Predicting victors in an attention and feedback economy

This post is about computer models and how they relate to historical research, even though it might not seem like it at first. Or at second. Or third. But I encourage anyone who likes history and models to stick with it, because it gets to a distinction of model use that isn’t made frequently enough.

Music in a vacuum

Imagine yourself uninfluenced by the tastes of others: your friends, their friends, and everyone else. It’s an effort in absurdity, but try it, if only to pin down how their interests affect yours. Start with something simple, like music. If you want to find music you like, you might devise a program that downloads random songs from the internet and plays them back without revealing their genre or other relevant metadata, so you can select from that group to get an unbiased sample of songs you like. It’s a good first step, given that you normally find music by word-of-mouth, by seeing your friends’ last.fm playlists, by listening to what your local radio host thinks is good, and so forth. The music that hits your radar is determined by your social and technological environment, so the best way to break free from this stifling musical determinism is complete randomization.

So you listen to the songs for a while and rank them as best you can by quality, the best songs (Stairway to Heaven, Shine On You Crazy Diamond, I Need A Dollar) at the very top and the worst (Ice Ice Baby, Can’t Touch This, that Korean song that’s been all over the internet recently) down at the bottom of the list. You realize that your list may not be an objective measurement of quality, but it definitely represents a hierarchy of quality to you, which is real enough, and you’re sure that if your best friends from primary school tried the same exercise they’d come up with a fairly comparable order.

Friends don’t let friends share music. via.

Of course, the fact that your best friends would come up with a similar list (but school buddies from another time or place wouldn’t) reveals another social aspect of musical taste: there is no ground truth of objectively good or bad music. Musical tastes are (largely) socially constructed 1, which isn’t to say that there isn’t any real difference between good and bad music; it’s just that the evaluative criteria (which aspects of the music are important, and the definitions of ‘good’ and ‘bad’) are continuously being defined and redefined by your social environment. Alice Bell wrote the best short explanation I’ve read in a while on how something can be both real and socially constructed.

There you have it: other people influence what songs we listen to out of the set of good music that’s been recorded, and other people influence our criteria for defining good and bad music to begin with. This little thought experiment goes a surprisingly long way in explaining why computational models are pretty bad at predicting Nobel laureates, best-selling authors, box office winners, pop stars, and so forth. Each category is ostensibly a mark of quality, but is really more like a game of musical chairs masquerading as a meritocracy. 2

Sure, you (usually) need to pass a certain threshold of quality to enter the game, but once you’re there, whether or not you win is anybody’s guess. Winning is a game of chance with your generally equally-qualified peers competing for the same limited resource: membership in the elite. Merton (1968) compared this phenomenon to the French Academy’s “Forty-First Chair,” because while the Academy was limited to only forty members (‘chairs’), there were many more who were also worthy of a seat but didn’t get one when the music stopped: Descartes, Diderot, Pascal, Proust, and others. It was almost literally a game of musical chairs between great thinkers, much in the same way it is today in so many other elite groups.

Musical Chair. via.

Merton’s same 1968 paper described the mechanism that tends to pick the winners and losers, which he called the ‘Matthew Effect,’ but is also known as ‘Preferential Attachment,’ ‘Rich-Get-Richer,’ and all sorts of other names besides. The idea is that you need money to make money, and the more you’ve got the more you’ll get. In the music world, this manifests when a garage band gets a lucky break on some local radio station, which leads to their being heard by a big record label company who releases the band nationally, where they’re heard by even more people who tell their friends, who in turn tell their friends, and so on and so on until the record company gets rich, the band hits the top 40 charts, and the musicians find themselves desperate for a fix and asking for only blue skittles in their show riders. Okay, maybe they don’t all turn out that way, but if it sounds like a slippery slope it’s because it is one. In complex systems science, this is an example of a positive feedback loop, where what happens in the future is reliant upon and tends to compound what happens just before it. If you get a little fame, you’re more likely to get more, and with that you’re more likely to get even more, and so on until Lady Gaga and Mick Jagger.

Rishidev Chaudhuri does a great job explaining this with bunnies, showing that if 10% of rabbits reproduce a year, starting with a hundred, in a year there’d be 110, in two there’d be 121, in twenty-five there’d be over a thousand, and in a hundred years there’d be over a million rabbits. Feedback systems (so-named because past results feed back into future ones) multiply rather than add, with effects compounding exponentially. When books or articles are read, each new citation increases their chances of being read and cited again, until a few scholarly publications end up with thousands or hundreds of thousands of citations while most have only a handful.
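If you want to watch the Matthew Effect do its work, here’s a sketch of the citation version (a bare-bones rich-get-richer model of my own devising, not anyone’s published model):

```python
import random

def simulate_citations(papers=1_000, citations=20_000, seed=5):
    """Rich-get-richer: each new citation picks a paper with probability
    proportional to the citations it already has (plus one, so uncited
    papers still stand a chance of being found)."""
    rng = random.Random(seed)
    counts = [1] * papers  # the "+1 prior" for every paper
    for _ in range(citations):
        paper = rng.choices(range(papers), weights=counts)[0]
        counts[paper] += 1
    return sorted((c - 1 for c in counts), reverse=True)

citations = simulate_citations()
print("top five papers:", citations[:5])
print("median paper:", citations[len(citations) // 2])
```

Run it and you get the signature long tail: a handful of runaway hits, and a median paper with next to nothing, even though every paper started out identical.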

This effect holds true in Nobel prize-winning science, box office hits, music stars, and many other areas where it is hard to discern between popularity and quality, and the former tends to compound while exponentially increasing the perception of the latter. It’s why a group of musicians who are every bit as skilled as Pink Floyd wind up never selling outside their own city if they don’t get a lucky break, and why two equally impressive books might have such disproportionate citations. Add to that the limited quantity of ‘elite seats’ (Merton’s 40 chairs) and you get a situation where only a fraction of the deserving get the rewards, and sometimes the most deserving go unnoticed entirely.

Different musical worlds

But I promised to talk about computational models, contingency, and sensitivity to initial conditions, and I’ve covered none of that so far. Before I get to it, I’d like to talk about music a bit more, this time somewhat more empirically. Salganik, Dodds, and Watts (2006; 10.1126/science.1121066) recently performed a study of about 15,000 individuals that mapped pretty closely onto the social aspects of musical taste I described above. They bring up some literature suggesting popularity doesn’t directly and deterministically map onto musical proficiency; instead, while quality does play a role, much of the deciding force behind who gets fame is a stochastic (random) process driven by social interactivity. Unfortunately, because history only happened once, there’s no reliable way to replay time to see if the same musicians would reach fame the second time around.

Remember Napster? via.

Luckily Salganik, Dodds, and Watts are pretty clever, so they figured out how to make history happen a few times. They designed a music streaming site for teens which, unbeknownst to the teens but knownst to us, was not actually the same website for everyone who visited. The site asked users to listen to previously unknown songs and rate them, and then gave them the option to download the music. Some users were only given these options, with the music presented in no particular order; this was the control group. Other users, however, were presented with a different view. Besides the control group, there were eight other versions of the site, each identical at the outset but free to change depending on the actions of its members. Users were randomly assigned to reside in one of these eight ‘worlds,’ which they would come back to every time they logged in, and each of these worlds presented a list of the most downloaded songs within that world. That is, if Betty listened to a song in world 3, rated it five stars, and downloaded it, everyone in world 3 would now see that the song had been downloaded once, and if other users downloaded it within that world, the download count would tick up as expected.

The ratings assigned to each song in the control world, where download counts were not visible, were taken to be the independent measure of each song’s quality. As expected, in the eight social influence worlds the most popular songs were downloaded a lot more than the most popular songs in the control world, because of the positive feedback effect of people seeing highly downloaded songs and then listening to and downloading them as well, which in turn increased their popularity even more. It should also come as no surprise that the ‘best’ songs, according to their rating in the independent world, rarely did badly in the social worlds, and the ‘worst’ songs under the same criteria rarely did well. But the top songs differed from one social world to the next: the hugely popular hits, with orders of magnitude more downloads than everything else, were completely different in each world. Their study concludes

We conjecture, therefore, that experts fail to predict success not because they are incompetent judges or misinformed about the preferences of others, but because when individual decisions are subject to social influence, markets do not simply aggregate pre-existing individual preferences. In such a world, there are inherent limits on the predictability of outcomes, irrespective of how much skill or information one has.
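Before moving on, here’s a cartoon of the experiment’s logic (my own toy, with made-up parameters; the real design and numbers are in the paper): one world picks songs by quality alone, while eight others weight quality by current download counts.

```python
import random

def run_world(quality, listeners=5_000, social=True, seed=0):
    """Each listener downloads one song; with social influence on, the
    choice is weighted by quality times the current download count."""
    rng = random.Random(seed)
    downloads = [1] * len(quality)  # start at 1 so unseen songs can be found
    for _ in range(listeners):
        weights = [q * (d if social else 1) for q, d in zip(quality, downloads)]
        song = rng.choices(range(len(quality)), weights=weights)[0]
        downloads[song] += 1
    return downloads

rng = random.Random(42)
quality = [rng.uniform(0.1, 1.0) for _ in range(48)]  # 48 songs, as in the study

control = run_world(quality, social=False, seed=99)
print("control favorite:", max(range(48), key=control.__getitem__))
for world in range(8):
    downloads = run_world(quality, seed=world)
    print("world", world, "favorite:", max(range(48), key=downloads.__getitem__))
```

The control favorite tracks quality; the social worlds’ favorites wander from world to world, because whichever decent song gets downloaded early rides the feedback loop to the top.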

Contingency and sensitivity to initial conditions

In complex systems terminology, the above is an example of a system that is highly sensitive to initial conditions and to contingent (chance) events. It’s similar to that popular chaos theory claim that a butterfly flapping its wings in China can cause a hurricane years later over Florida. It’s not that one inevitably leads to the other; rather, positive feedback loops make it so that very small changes can quickly become huge causal factors in the system as their effects exponentially increase. The nearly-arbitrary decision of a famous author to cite one paper on computational linguistics over another, equally qualified one might be the impetus the first paper needs to shoot into stardom. The first songs randomly picked and downloaded in each social world of the music sharing site above greatly influenced the eventual winners of the popularity contest disguised as a quality rank.

Some systems are fairly inevitable in their outcomes. If you drop a two-ton stone from five hundred feet, it’s pretty easy to predict where it’ll fall, regardless of butterflies flapping their wings in China, or birds, or branches, or really anything else that might get in the way. The weight and density of the stone are overriding causal forces that pretty much cancel out the little jitters that push it one direction or another. Not so with a leaf; dropped from the same height, we can probably predict it won’t float into space or fall somewhere a few thousand miles away, but beyond that, prediction is really hard because the system is so sensitive to contingent events and initial conditions.

There does exist, however, a set of systems right at the sweet spot between those two extremes; stochastic enough that predicting exactly how it will turn out is impossible, but ordered enough that useful predictions and explanations can still be made. Thankfully for us, a lot of human activity falls in this class.

Tracking Hurricane Ike with models. Notice how short-term predictions are pretty accurate. (Click the image to watch this model animated). via.

Nate Silver, the expert behind the political prediction blog fivethirtyeight, published a book a few weeks ago called The Signal and the Noise: why so many predictions fail – but some don’t. Silver has an excellent track record of accurately predicting what large groups of people will do, although I bring him up here to discuss what his new book has to say about the weather. Weather predictions, according to Silver, are “highly vulnerable to inaccuracies in our data.” We understand physics and meteorology well enough that, if we had a powerful enough computer and precise data on environmental conditions all over the world, we could predict the weather with astounding precision. And indeed we do; the National Hurricane Center has become 350% more accurate in the last 25 years alone, giving people two- or three-day warnings of fairly exact storm locations. However, our data aren’t perfect, and slightly inaccurate or imprecise measurements abound. These small imprecisions can have huge repercussions in weather prediction models, with a few false measurements sometimes being enough to put a predicted storm tens or hundreds of miles off course.

To account for this, meteorologists introduce stochasticity into the models themselves. They run the same models tens, hundreds, or thousands of times, but each time they change the data slightly, accounting for where their measurements might be wrong. Run the model once pretending the wind was measured at one particular speed in one particular direction; run the model again with the wind at a slightly different speed and direction. Do this enough times, and you wind up with a multitude of predictions guessing the storm will go in different directions. “These small changes, introduced intentionally in order to represent the inherent uncertainty in the quality of the observational data, turn the deterministic forecast into a probabilistic one.” The most extreme predictions show the furthest a hurricane is likely to travel, but if most runs of the model have the hurricane staying within some small path, it’s a good bet that this is the path the storm will travel.
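In code, the trick is almost embarrassingly simple. Below is a toy of my own (a made-up two-variable “weather,” not any real meteorological model): run the same deterministic forecast a thousand times, jittering the initial measurements within their presumed error bars, and read off the spread of outcomes:

```python
import random

def forecast(position, velocity, steps=48):
    """A stand-in deterministic model: drift plus a mild feedback term."""
    for _ in range(steps):
        velocity += 0.02 * position  # feedback makes small input errors compound
        position += velocity
    return position

rng = random.Random(7)
ensemble = []
for _ in range(1_000):
    # Perturb the "measurements" within their presumed error bars...
    p0 = 1.0 + rng.gauss(0, 0.05)
    v0 = 0.1 + rng.gauss(0, 0.02)
    # ...and run the identical deterministic model on each perturbed input.
    ensemble.append(forecast(p0, v0))

ensemble.sort()
print("median track:", ensemble[500])
print("90% of runs fall between", ensemble[50], "and", ensemble[950])
```

The model itself never changes; only the inputs do, and the pile of runs is what turns a single deterministic answer into a probabilistic forecast.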

Silver uses a similar technique when predicting American elections. Various polls show different results from different places, so his models take this into account by running many times and then revealing the spread of possible outcomes; those outcomes which reveal themselves most often might be considered the most likely, but Silver is also careful to use the rest of the outcomes to show the uncertainty in his models and the spread of other plausible occurrences.

Going back to the music sharing site, while the sensitivity of the system would prevent us from exactly predicting the most-popular hits, the musical evaluations of the control world still give us a powerful predictive capacity. We can use those rankings to predict the set of most likely candidates to become hits in each of the worlds, and if we’re careful, all or most of the most-downloaded songs will have appeared in our list of possible candidates.

The payoff: simulating history

Simulating the plague in 19th century Canada. via.

So what do hurricanes, elections, and musical hits have to do with computer models and the humanities, specifically history? The fact of the matter is that a lot of models are abject failures when it comes to their intended use: predicting winners and losers. The best we can do in moderately sensitive systems that have difficult-to-predict positive feedback loops and limited winner space (the French Academy, Nobel laureates, etc.) is to find a large set of possible winners. We might be able to reduce that set so it has fairly accurate recall and moderate precision (out of a thousand candidates for 10 awards, we can pick 50, and 9 of the 10 actual winners were in our list of 50). This might not be great betting odds, but it opens the door for a type of history research that’s generally been consigned to the distant and somewhat distasteful realm of speculation. It is closely related to the (too-often scorned) realm of counterfactual history (What if the Battle of Gettysburg had been won by the other side? What if Hitler had never been born?), and is in fact driven by the ability to ask counterfactual questions.

The type of historiography of which I speak is the question of evolution vs. revolution; is history driven by individual, world-changing events and Great People, or is the steady flow of history predetermined, marching inevitably in some direction with the players just replaceable cogs in the machine? The dichotomy is certainly a false one, but it’s one that has bubbled underneath a great many historiographic debates for some time now. The beauty of historical stochastic models 3 is exactly their propensity to yield likely and unlikely paths, like the examples above. A well-modeled historical simulation 4 can be run many times; if only one or a few runs of the model reveal what we take as the historical past, then it’s likely that set of events was more akin to the ‘revolutionary’ take on historical changes. If the simulation takes the same course every time, regardless of the little jitters in preconditions, contingent occurrences, and exogenous events, then that bit of historical narrative is likely much closer to what we take as ‘inevitable.’
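Concretely, and with all the caveats of notes 3 and 4 in full force, the exercise would look something like the sketch below: an entirely made-up toy model, offered only to show the shape of the question, not any real historical simulation.

```python
import random

def run_history(seed, contingency=0.1):
    """Toy historical model: a sequence of 'events' where each outcome is
    mostly determined by accumulated structure, plus some chance."""
    rng = random.Random(seed)
    state = 0
    for _ in range(100):
        # structural pressure pushes one way; contingency occasionally wins
        push = 1 if state >= 0 else -1
        state += push if rng.random() > contingency else -push
    return state > 0  # did this run reach the "observed" outcome?

runs = [run_history(seed) for seed in range(10_000)]
share = sum(runs) / len(runs)
print(f"{share:.1%} of runs reach the 'observed' outcome")
```

Dial the contingency parameter up or down and that share swings between nearly 100% (the outcome looks inevitable) and a small fraction (the outcome looks revolutionary); the interesting historical work is in figuring out which regime a real episode actually sits in.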

Models have many uses, and though many human systems might not be terribly amenable to predictive modeling, it doesn’t mean there aren’t many other useful questions a model can help us answer. The balance between inevitability and contingency, evolution and revolution, is just one facet of history that computational models might help us explore.

Notes:

  1. Music has a biological aspect as well. Most cultures with music tend towards discrete pitches, discernible (discrete) rhythm, ‘octave’-type systems with relatively few notes looping back around, and so forth. This suggests we’re hard-wired to appreciate music within a certain set of constraints, much in the same way we’re hard-wired to see only certain wavelengths of light or to like the taste of certain foods over others (Peretz 2006; doi:10.1016/j.cognition.2005.11.004). These tendencies can certainly be overcome, but to suggest the pre-defined structure of our wet thought-machine plays no role in our musical preferences is about as far-fetched as suggesting it plays the only role.
  2. I must thank Miriam Posner for this wonderful turn of phrase.
  3. presuming the historical data and model specifications are even accurate, which is a whole different can of worms to be opened in a later post
  4. Seriously, see the last note, this is really hard to do. Maybe impossible. But this argument is just assuming it isn’t, for now.

Science Systems Engineering

Warning: This post is potentially evil, and definitely normative. While I am unsure whether what I describe below should be done, I’m becoming increasingly certain that it could be. Read with caution.

Complex Adaptive Systems

Science is a complex adaptive system. It is a constantly evolving network of people and ideas and artifacts which interact with and feed back on each other to produce this amorphous socio-intellectual entity we call science. Science is also a bunch of nested complex adaptive systems, some overlapping, and is itself part of many other systems besides.

The study of complex interactions is enjoying a boom period due to the facilitating power of the “information age.” Because any complex system, whether it be a social group or a pool of chemicals, can exist in almost innumerable states while comprising the same constituent parts, it requires massive computational power to comprehend all the many states a system might find itself in. From the other side, it takes a massive amount of data observation and collection to figure out what states systems eventually do find themselves in, and that knowledge of how complex systems play out in the real world relies on collective and automated data gathering. From seeing how complex systems work in reality, we can infer properties of their underlying mechanisms; by modeling those mechanisms and computing the many possibilities they might allow, we can learn more about ourselves and our place in the larger multisystem. 1

One of the surprising results of complexity theory is that seemingly isolated changes can produce rippling, massive effects throughout a system.  Only a decade after the removal of big herbivores like giraffes and elephants from an African savanna, a generally positive relationship between bugs and plants turned into an antagonistic one. Because the herbivores no longer grazed on certain trees, those trees began producing less nectar and fewer thorns, which in turn caused cascading repercussions throughout the ecosystem. Ultimately, the trees’ mortality rate doubled, and a variety of species were worse-off than they had been. 2 Similarly, the introduction of an invasive species can cause untold damage to an ecosystem, as has become abundantly clear in Florida 3 and around the world (the extinction of flightless birds in New Zealand springs to mind).

http://www.flickr.com/photos/arnolouise/3202569865/

Both evolutionary and complexity theories show that self-organizing systems evolve in such a way that they are self-sustaining and self-perpetuating. Often, within a given context or environment, the systems which are most resistant to attack, or the most adaptable to change, are the most likely to persist and grow. Because the entire environment evolves concurrently, small changes in one subsystem tend to propagate as small changes in many others. However, when the constraints of the environment change rapidly (like with the introduction of an asteroid and a cloud of sun-cloaking dust), when a new and sufficiently foreign system is introduced (land predators to New Zealand), or when an important subsystem is changed or removed (the loss of megafauna in Africa), devastating changes ripple outward.

An environmental ecosystem is one in which many smaller overlapping systems exist, and changes in the parts may change the whole; society can be described similarly. Students of history know that the effects of one event (a sinking ship, an assassination, a terrorist attack) can propagate through society for years or centuries to come. However, a system is not merely a slave to these single occurrences which cause Big Changes. The structure and history of a system imply certain stable, low energy states. We often anthropomorphize the tendency of systems to settle toward a stable mean, for example “nature abhors a vacuum.” This is just a manifestation of the second law of thermodynamics: entropy always increases, and systems naturally tend toward low energy states.

The systems of society are historically structured and constrained in such a way that certain changes would require very little energy (an assassination leading to war in a world already on the brink), whereas others would require quite a great deal (say, an attempt to cause war between Canada and the U.S.). It is a combination of the current structural state of a system and the interactions of its constituent parts that leads that system in one direction or another. Put simply, a society, its people, and its environment are responsible for its future. Not terribly surprising, I know, but the formal framework of complexity theory is a useful one for what is described below.

Metastability, via Wikipedia.

The above picture, from the Wikipedia article on metastability, provides an example of what’s described above. The ball is resting in a valley, a low energy state, and a small change may temporarily excite the system, but the ball eventually finds its way into the same, or another, low energy state. When the environment is stable, its subsystems tend to find comfortably stable niches as well. Of course, I’m not sure anyone would call society wholly stable…

Science as a System

Science (by which I mean wissenschaft, any systematic research) is part of society, and itself includes many constituent and overlapping parts. I recently argued, not without precedent, that the correspondence network between early modern Europeans facilitated the rapid growth of knowledge we like to call the Scientific Revolution. Further, that network was an inevitable outcome of socio/political/technological factors, including shrinking transportation costs, increasing political unrest leading to scholarly displacement, and, very simply, an increased interest in communicating once communication proved so fruitful. The state of the system affected the parts, the parts in turn affected the system, and a growing feedback loop led to the co-causal development of a massive communication network and a period of massively fruitful scholarly work.

Scientific Correspondence Network

Today and in the past, science is embedded in, and occasionally embodied by, the various organizational and communicative hierarchies its practitioners find themselves in. The people, ideas, and products of science feed back on one another. Scientists are perhaps more affected by their labs, by the process of publication, by the realities of funding, than they might admit. In return, the knowledge and ideas produced by science, the message, shape and constrain the medium in which they are propagated. I’ve often heard and read two opposing views: that knowledge is True and Right and unaffected by the various social goings-on of those who produce it, and that knowledge is Constructed and Meaningless outside of the social and linguistic system it resides in. The truth, I’m sure, is a complex tangle somewhere between the two, affected by both.

In either case, science does not take place in a vacuum. We do our work through various media and with various funds, in departments and networks and (sometimes) lab-coats, using a slew of carefully designed tools and a language that was not, in general, made for this purpose. In short, we and our work exist in a complex system.

Engineering the Academy

That system is changing. Michael Nielsen’s recent book 4 talks about the rise of citizen science, augmented intelligence, and collaborative systems not merely as ways to do what we’ve already done faster, but as new methods of discovery. The ability to coordinate on such a scale, and in such new ways, changes the game of science. It changes the system.

While many of these changes are happening automatically, in a self-organized sort of way, Nielsen suggests that we can learn from our past and from other successful collective ventures in order to make a “design science of collaboration.” That is, using what we know of how people work together best, of what spurs on the most inspired research and the most interesting results, we can design systems to facilitate collaboration and scientific research. In Nielsen’s case, he’s talking mostly about computer systems: how can we design a website or an algorithm or a technological artifact that will aid in scientific discovery, using the massive distributed power of the information age? One way Nielsen points out is “designed serendipity,” creating an environment where scientists are more likely to experience serendipitous occurrences, and thus more likely to come up with innovative and unexpected ideas.

Can we engineer science? http://www.flickr.com/photos/seattlemunicipalarchives/4818952324

In complexity terms, this idea is restructuring the system in such a way that the constituent parts or subsystems will be or do “better,” however we feel like defining better in this situation. It’s definitely not the first time an idea like this has been used. For example, science policy makers, government agencies, and funding bodies have long known that science will often go where the money is. If there is a lot of money available to research some particular problem, then that problem will tend to get researched. If the main funding bodies require funded research to become open access, by and large that will happen (see the NIH’s PubMed Central requirements).

There are innumerable ways to affect the system in a top-down way in order to shape its future. Terrence Deacon writes about how it is the constraints on a system which tend it toward some equilibrium state 5; by shaping the structure of the scientific system, we can predictably shape its direction. That is, we can artificially create a low energy state (say, open access due to policy and funding changes), and let the constituent parts find their way into that low energy state eventually, reaching equilibrium. I talked a bit more about this idea of constraints leading a system in a recent post.

As may be recalled from the discussion above, however, this is not the only way to affect a complex system. External structural changes are part of the story of how a system grows and shifts, but only a small part. Because of the series of interconnected feedback loops that embody a system’s complexity, small changes can (and often do) propagate up and change the system as a whole. Liu, Slotine, and Barabási recently began writing about the “controllability of complex networks 6,” suggesting ways in which changing or controlling constituent parts of a complex system can reliably and predictably change the entire system, perhaps leading it toward a new preferred low energy state. In this case, they were talking about the importance of well-connected hubs in a network; adding or removing them in certain areas can deeply affect the evolution of that network, no matter the constraints. Watts recounts a great example of how a small power outage rippled into a national disaster because just the right connections were overloaded and removed 7. The strategic introduction or removal of certain specific links in the scientific system may go far toward changing the system itself.
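A classic demonstration of why hubs matter so much (this is the old attack-tolerance result, not Liu et al.’s controllability math, which is more involved) takes only a few lines with networkx. Remove fifty nodes at random and a hub-heavy network barely notices; remove the fifty best-connected hubs and it starts to fall apart:

```python
import random
import networkx as nx

def giant_component(G):
    """Size of the largest connected component."""
    return len(max(nx.connected_components(G), key=len))

base = nx.barabasi_albert_graph(5_000, 2, seed=8)  # a hub-heavy network

# Remove fifty nodes at random...
rng = random.Random(8)
randomized = base.copy()
randomized.remove_nodes_from(rng.sample(list(base), 50))

# ...versus the fifty best-connected hubs.
hubs = sorted(base.degree, key=lambda pair: pair[1], reverse=True)[:50]
targeted = base.copy()
targeted.remove_nodes_from(node for node, degree in hubs)

print("intact:", giant_component(base))
print("random removal:", giant_component(randomized))
print("hub removal:", giant_component(targeted))
```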

Not only is science a complex adaptive system, it is a system which is becoming increasingly well-understood. A century of various science studies, combined with the recent appearance of giant swaths of data about science and scientists themselves, is beginning to allow us to learn the structure and mechanisms of the scientific system. We do not, and will never, know the most intricate details of that system; however, in many cases and for many changes, we only need to know the general properties of a system in order to change it in predictable ways. If society feels a certain state of science is better than others, either for the purpose of improved productivity or simply more control, we are beginning to see which levers we need to pull in order to enact those changes.

This is dangerous. We may be able to predict first order changes, but as they feed back onto second order, third order, and further-down-the-line changes, the system becomes more unpredictable. Changing one thing positively may affect other aspects in massively negative (and massively unpredictable) ways.

Generally, however, if humans can do something, we will. I predict the coming years will bring a more formal Science Systems Engineering, a specialty apart from science policy which will attempt to engineer the direction of scientific research from whatever angle possible. My first post on this blog concerned a concept I dubbed scientonomy, which was just yet another attempt at unifying everybody who studies science in a meta sort of way. In that vocabulary, then, this science systems engineering would be an applied scientonomy. We have countless experts in all aspects of how science works on a day-to-day basis from every angle; that expertise may soon become much more prominent in application.

It is my hope and belief that a more formalized way of discussing and engineering scientific endeavors, either on the large scale or the small, can lead to benefits to humankind in the long run. I share the optimism of Michael Nielsen in thinking that we can design ways to help the academy run more smoothly and to lead it toward a more thorough, nuanced, and interesting understanding of whatever it is being studied. However, I’m also aware of the dangers of this sort of approach, first and foremost being disagreement on what is “better” for science or society.

At this point, I’m just putting this idea out there to hear the thoughts of my readers. In my meatspace day-to-day interactions, I tend to be around experimental scientists and quantitative social scientists who in general love the above ideas, but at heart and on my blog I feel like a humanist, and these ideas worry me for all the obvious reasons (and even some of the more obscure ones). I’d love to get some input, especially from those who are terrified that somebody could even think this is possible.

Notes:

  1. I’m coining the term “multisystem” because ecosystem is insufficient, and I don’t know something better. By multisystem, I mean any system of systems; specifically here, the universe and how it evolves. If you’ve got a better term that invokes that concept, I’m all for using it. Cosmos comes to mind, but it no longer represents “order,” a series of interlocking systems, in the way it once did.
  2. Palmer, Todd M, Maureen L Stanton, Truman P Young, Jacob R Goheen, Robert M Pringle, and Richard Karban. 2008. “Breakdown of an Ant-Plant Mutualism Follows the Loss of Large Herbivores from an African Savanna.” Science 319 (5860) (January 11): 192–195. doi:10.1126/science.1151579.
  3. Gordon, Doria R. 1998. “Effects of Invasive, Non-Indigenous Plant Species on Ecosystem Processes: Lessons From Florida.” Ecological Applications 8 (4): 975–989. doi:10.1890/1051-0761(1998)008[0975:EOINIP]2.0.CO;2.
  4. Nielsen, Michael. Reinventing Discovery: The New Era of Networked Science. Princeton University Press, 2011.
  5. Deacon, Terrence W. “Emergence: The Hole at the Wheel’s Hub.” In The Re-Emergence of Emergence: The Emergentist Hypothesis from Science to Religion, edited by Philip Clayton and Paul Davies. Oxford University Press, USA, 2006.
  6. Liu, Yang-Yu, Jean-Jacques Slotine, and Albert-László Barabási. “Controllability of Complex Networks.” Nature 473, no. 7346 (May 12, 2011): 167–173.
  7. Watts, Duncan J. Six Degrees: The Science of a Connected Age. 1st ed. W. W. Norton & Company, 2003.

More heavy-handed culturomics

A few days ago, Gao, Hu, Mao, and Perc posted a preprint of their forthcoming article comparing social and natural phenomena. The authors, apparently all engineers and physicists, use the Google Ngrams data to come to the conclusion that “social and natural phenomena are governed by fundamentally different processes.” The take-home message is that words describing natural phenomena increase in frequency at regular, predictable rates, whereas the use of certain socially-oriented words changes in unpredictable ways. Unfortunately, the paper doesn’t necessarily differentiate between words and what they describe.

Specifically, the authors invoke random fractal theory (sort of a descendant of chaos theory) to find regular patterns in 1-grams. A 1-gram is just a single word, and this study looks at how the frequencies of certain words grow or shrink over time. A “Hurst parameter” is found for 24 words: a dozen pertaining to nature (earthquake, fire, etc.), and another dozen “social” words (war, unemployment, etc.). The Hurst parameter (H) is a number which, essentially, reveals whether or not a time series of data is correlated with itself. That is, given a set of observations over the last hundred years, autocorrelated data means the observation for this year will very likely follow a predictable trend from the past.

If H is between 0.5 and 1, the dataset has “long-term positive correlation,” which is roughly equivalent to saying that data from quite some time in the past still positively and noticeably affect data today. If H is under 0.5, data are negatively correlated with their past, suggesting that a high value in the past implies a low value in the future; and if H = 0.5, the data likely describe Brownian motion (they are random). H can exceed 1 as well, a point I’ll get to momentarily.
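If you want to play along at home, the classic way to estimate H is rescaled-range (R/S) analysis. The sketch below is a bare-bones version of that estimator, written from scratch; the paper’s own methods are fancier, so treat this as illustration, not replication:

```python
import math
import random

def hurst_rs(series, min_chunk=8):
    """Estimate the Hurst parameter with classic rescaled-range (R/S) analysis."""
    n = len(series)
    log_sizes, log_rs = [], []
    size = min_chunk
    while size <= n // 2:
        ratios = []
        for start in range(0, n - size + 1, size):
            chunk = series[start:start + size]
            mean = sum(chunk) / size
            deviations = [x - mean for x in chunk]
            # range of the cumulative departure from the mean
            cum, running = [], 0.0
            for d in deviations:
                running += d
                cum.append(running)
            r = max(cum) - min(cum)
            s = math.sqrt(sum(d * d for d in deviations) / size)
            if s > 0:
                ratios.append(r / s)
        if ratios:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(ratios) / len(ratios)))
        size *= 2
    # H is the slope of log(R/S) regressed against log(window size)
    mx = sum(log_sizes) / len(log_sizes)
    my = sum(log_rs) / len(log_rs)
    num = sum((x - mx) * (y - my) for x, y in zip(log_sizes, log_rs))
    den = sum((x - mx) ** 2 for x in log_sizes)
    return num / den

rng = random.Random(11)
white_noise = [rng.gauss(0, 1) for _ in range(4_096)]
print(round(hurst_rs(white_noise), 2))  # lands near 0.5 for uncorrelated data
```

Feed it a yearly frequency series for “earthquake” or “war” instead of white noise and you’d get estimates comparable in spirit, if not in rigor, to the numbers the authors report.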

The authors first looked at the frequency of 12 words describing natural phenomena between 1770 and 2007. In each case, H was between 0.5 and 1, suggesting long-term positive correlations in the use of the terms. That is, the use of the term “earthquake” does not fluctuate terribly wildly from year to year; looking at how frequently it was used in the past can reasonably predict how frequently it will be used in the future. The data have a long “memory.”

Natural 1-grams from Gao et al. (2012)

The paper then analyzed 12 words describing social phenomena, with very different results. According to the authors, “social phenomena, apart from rare exceptions, cannot be classified solely as processes with persistent-long range correlations.” For example, the use of the word “war” bursts around World War I and World War II; these are unpredictable moments in the discussion of social phenomena. The way “war” was used in the past was not a good predictor of how “war” would be used around 1915 and 1940, for obvious reasons.

Social 1-grams from Gao et al. (2012)

You may notice that, for many of the social terms, H is actually greater than 1, “which indicates that social phenomena are most likely to be either nonstationary, on-off intermittent, or Levy walk-like process.” Basically, the H parameter alone is not sufficient to describe what’s going on with the data. Nonstationary processes are, essentially, unpredictable. A stationary process can be random, but at least certain statistical properties of that randomness remain persistent; nonstationary processes don’t have those persistent statistical properties. The authors point out that not all social phenomena will have H > 1, citing famine, because it might relate to natural phenomena. They also point out that “the more the social phenomena can be considered recent (unemployment, recession, democracy), the higher their Hurst parameter is likely to be.”

In sum, they found that “The prevalence of long-term memory in natural phenomena [compels them] to conjecture that the long-range correlations in the usage frequency of the corresponding terms is predominantly driven by occurrences in nature of those phenomena,” whereas “it is clear that all these processes [describing social phenomena] are fundamentally different from those describing natural phenomena.” That the social phenomena follow different laws is not unexpected, they say, because they themselves are more complex; they rely on political, economic, and social forces, as well as natural phenomena.

While this paper is exceptionally interesting, and shows a very clever use of fairly basic data (24 one-dimensional variables, just looking at word use per year), it lacks the same sort of nuance also lacking in the original culturomics paper. Namely, it lacks the awareness that social and natural phenomena are not directly coupled with the words used to describe them, nor with the frequency with which those words are used. The paper suggests that natural and social phenomena are governed by different scaling laws when, realistically, it is the ways they are discussed, and how those discussions are published, which are governed by the varying scaling laws. Further, although they used words exemplifying the difference between “nature” and “society,” the two are not always so easily disentangled, either in language or in the underlying phenomena.

Perhaps the sorts of words used to describe social events change differently than the sorts used to describe natural events. Perhaps, because natural phenomena are often immediately felt across vast distances, whereas news of social phenomena can take some time to diffuse, the speed at which certain words spread may take very different forms. Discussions and word-usage are always embedded in a larger network. Also needing to be taken into account is who is discussing social vs. natural phenomena, and which is more likely to get published and preserved to eventually be scanned by Google Books.

Without a doubt the authors have noticed a very interesting trend, but rather than matching the phenomena directly to words, as they did, we should be using this sort of study to look at how language changes, how people change, and ultimately what relationship people have with the things they discuss and publish. At this point, the engineers and physicists still have a greater comfort with the statistical tools needed to fully utilize the Google Books corpus, but there are some humanists out there already doing absolutely fantastic quantitative work with similar data.

This paper, while impressive, is further proof that the quantitative study of culture should not be left to those with (apparently) little background in the subject. While it is not unlikely that different factors do, in fact, determine the course of natural disasters versus that of human interaction, this paper does not convincingly tease those apart. It may very well be that the language use is indicative of differences in underlying factors in the phenomena described, however no study is cited suggesting this to be the case. Claims like “social and natural phenomena are governed by fundamentally different processes,” given the above language data, could easily have been avoided, I think, with a short discussion between the authors and a humanist.

The Networked Structure of Scientific Growth

Well, it looks like Digital Humanities Now scooped me on posting my own article. As some of you may have read, I recently did not submit a paper on the Republic of Letters, opting instead to hold off until I could submit it to a journal which allowed authorial preprint distribution. Preprints are a vital part of rapid knowledge exchange in our ever-quickening world, and while some disciplines have embraced the preprint culture, many others have yet to. I’d love the humanities to embrace that practice, and in the spirit of being the change you want to see in the world, I’ve decided to post a preprint of my Republic of Letters paper, which I will be submitting to another journal in the near future. You can read the full first draft here.

The paper, briefly, is an attempt to contextualize the Republic of Letters and the Scientific Revolution using modern computational methodologies. It draws from secondary sources on the Republic of Letters itself, especially from my old mentor R.A. Hatch, some network analysis from sociology and statistical physics, modeling, human dynamics, and complexity theory. All of this is combined through datasets graciously donated by the Dutch Circulation of Knowledge group and Oxford’s Cultures of Knowledge project, totaling about 100,000 letters worth of metadata. Because it favors large scale quantitative analysis over an equally important close and qualitative analysis, the paper is a contribution to historiographic methodology rather than historical narrative; that is, it doesn’t say anything particularly novel about history, but it does offer a (fairly) new way of looking at and contextualizing it.

A visualization of the Dutch Republic of Letters using Sci2 & Gephi

At its core, the paper suggests that by looking at how scholarly networks naturally grow and connect, we as historians can have new ways to tease out what was contingent upon the period and situation. It turns out that social networks of a certain topology are basins of attraction similar to those I discussed in Flow and Empty Space. With enough time and any of a variety of facilitating social conditions and technologies, a network similar in shape and influence to the Republic of Letters will almost inevitably form. Armed with this knowledge, we as historians can move back to the microhistories and individuated primary materials to find exactly what those facilitating factors were, who played the key roles in the network, how the network may differ from what was expected, and so forth. Essentially, this method is one base map we can use to navigate and situate historical narrative.

Of course, I make no claims of this being the right way to look at history, or the only quantitative base map we can use. The important point is that it raises new kinds of questions and is one mechanism to facilitate the re-integration of the individual and the longue durée, the close and the distant reading.

The project casts a necessarily wide net. I do not yet, and probably could not ever, have mastery over each and every disciplinary pool I draw from. With that in mind, I welcome comments, suggestions, and criticisms from historians, network analysts, modelers, sociologists, and whomever else cares to weigh in. Whoever helps will get a gracious acknowledgement in the final version, good scholarly karma, and a cookie if we ever meet in person. The draft will be edited and submitted in the coming months, and if you have ideas, please post them in the comment section below. Also, if you use ideas from the paper, please cite it as an unpublished manuscript or, if it gets published, cite that version instead.

Flow and Empty Space

Thirty spokes unite in one nave and on that which is non-existent [on the hole in the nave] depends the wheel’s utility. Clay is moulded into a vessel and on that which is non-existent [on its hollowness] depends the vessel’s utility. By cutting out doors and windows we build a house and on that which is non-existent [on the empty space within] depends the house’s utility. Therefore, existence renders actual but non-existence renders useful.

-Laozi, Tao Te Ching, Suzuki Translation

(NOTE 1: Although it may not seem it from the introduction, this post is actually about humanities research, eventually. Stick with it and it may pay off!)

(NOTE 2: I’ve warned in the past about invoking concepts you know little about; let me be the first to say I know next to nothing about Eastern philosophy or t’ai chi ch’uan, though I do know a bit about emergence and a bit about juggling. This post uses the above concepts as helpful metaphors, fully apologizing to those who know a bit more about the concepts for the butchering of them that will likely ensue.)

The astute reader may have noticed that, besides being a sometimes-historian and a sometimes-data-scientist, the third role I often take on is that of a circus artist. Juggling and prop manipulation have been part of my life for over a decade now, and though I don’t perform as much as I used to, the feeling I get from practicing is still fairly essential in keeping me sane. What juggling provides me that I cannot get elsewhere is what prop manipulators generally call a state of “flow.”

Look! It's me in a candy store!

The concept draws from a positive psychology term developed by Mihály Csíkszentmihályi, and is roughly equivalent to being in “the zone.” Although I haven’t quite experienced it, this feeling apparently comes to programmers working late at night trying to solve a problem. It’s also been described by dancers, puzzle solvers, and pretty much anyone else who gets so into something they feel, if only for a short time, they have totally lost themselves in their activity. A fellow contact juggler, Richard Hartnell, recently filmed a fantastic video describing what flow means to him as a performer. I make no claims here to any meaning behind the flow state. The human brain is complex beyond my understanding, and though I do not ascribe any mystical properties to the experience, having felt “flow” so deeply, I can certainly see why some do treat it as a religious experience.

The most important contribution to my ability to experience this state while juggling was, oddly enough, a t’ai chi ch’uan course. Really, it was one concept from the course, called song kua, “relax the hips,” that truly opened up flow for me. It’s a complex concept, but the part I’d like to highlight here is the relationship between exertion and relaxation, between a push and a pull. When you move your body, that movement generally starts with an intention. I want my hand to move to the right, so I move it to the right. There is, however, another way to move parts of the body, and that is via relaxation. If I’m standing in a certain way and I relax my hip in one direction, my body will naturally shift in the opposite direction. My body gets pulled one way, rather than me pushing it to go there. In the circus arts, I can now quickly reach a flow state by creating a system between myself and whatever prop I’m using, and allowing the state of that system to pull me to the next state, rather than deliberately pushing myself and my prop where I intend them to go. It was, for me, a mind-blowing shift in perspective, and one that had absolutely nothing to do with my academic pursuits until last night, on a short plane ride back from Chicago APA.

In the past two weeks, I’ve been finishing up the first draft of a humanities paper that uses concepts from complex systems and network analysis. In it, I argue (among other things) that there are statistical regularities in human behavior, and that we as historians can use that backdrop as a context against which to study history, finding actions and events which deviate from the norm. Much recent research has gone into showing that people, on average, behave in certain ways, generally due to constraints placed on us by physics, biology, and society. This is not to say humans are inherently predictable – merely that there are boundaries beyond which certain actions are unlikely or even impossible given the constraints of our system. In the paper, I go on to suggest that the way we develop our social networks also exhibits regularities across history, and that deviations from those regularities, and the mechanisms by which they occur, are historically interesting.

Fast-forward to last night: on the plane ride home, I’m reading a fantastic essay by anthropologist Terrence W. Deacon about the emergence of self-organizing biological systems. 1 In the essay, Deacon attempts to explain why entropy seems to decrease enough to allow, well, Life, The Universe, and Everything, given the second law of thermodynamics. His answer is that there are basins of attraction in the dynamics of most processes which inherently and inevitably produce order. That is, as a chaotic system interacts with itself, there are dynamical states the system can inhabit which are inherently self-sustaining. After a chaotic system shuffles around for long enough, it will eventually and randomly reach a state within the basin of attraction of one of those self-sustaining states, and once it falls into that basin, the system will feed back on itself, remaining in its state and creating apparent order from chaos for a sustained period of time.
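For readers who, like me, find a simulation easier to grasp than a metaphor, here is a toy sketch of a basin of attraction. It assumes nothing from Deacon’s essay: just a noisy particle on a double-well potential, which wanders chaotically until it falls into one of two self-sustaining states and then tends to stay there.

```python
# Toy illustration of a basin of attraction: a particle on the
# double-well potential V(x) = (x^2 - 1)^2, jostled by random noise.
# Invented for illustration; not a model from Deacon's essay.
import random

def dV(x):
    # Gradient of V(x) = (x**2 - 1)**2
    return 4 * x * (x**2 - 1)

random.seed(0)
x, dt = 0.0, 0.01              # start on the unstable hump between the wells
for _ in range(20000):
    x += -dV(x) * dt + random.gauss(0, 0.03)  # drift downhill, plus noise

print(f"Settled near x = {x:.2f}")  # ends up close to -1 or +1
```

The two wells at x = -1 and x = +1 are the basins; which one the particle lands in is a matter of chance, but that it lands in one of them is close to inevitable.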

Deacon invokes a similar Tao Te Ching section to the one quoted above, suggesting that empty or negative space, if constrained properly and possessing the correct qualities, acts as a kind of potential energy. The walls of a clay pot are what allow it to be a clay pot, but its function rests in the constrained negative space bounded by those walls. In the universe, Deacon suggests, constraints are implicit and temporally sensitive; if only a few state structures are self-sustaining, those states, once reached, will naturally persist. This is similar to that basic tenet of natural selection: that which can persist tends to.

The example Deacon first uses is that of a whirlpool forming in the empty space behind a rock in a flowing river.

Consider a whirlpool, stably spinning behind a boulder in a stream. As moving water enters this location it is compensated for by a corresponding outflow. The presence of an obstruction imparts a lateral momentum to the molecules in the flow. The previous momentum is replaced by introducing a reverse momentum imparted to the water as it flows past the obstruction and rushes to fill the comparatively vacated region behind the rock. So not only must excess water move out of the local vicinity at a constant rate; these vectors of perturbed momentum must also be dissipated locally so that energy and water doesn’t build up. The spontaneous instabilities that result when an obstruction is introduced will effectively induce irregular patterns of build-up and dissipation of flow that ‘explore’ new possibilities, and the resulting dynamics tends toward the minimization of the constantly building instabilities. This ‘exploration’ is essentially the result of chaotic dynamics that are constantly self-undermining. To the extent that characteristics of component interactions or boundary conditions allow any degree of regularity to develop (e.g. circulation within a trailing eddy), these will come to dominate, because there are only a few causal architectures that are not self-undermining. This is also the case for semi-regular patterns (e.g. patterns of eddies that repeatedly form and disappear over time), which are just less self-undermining than other configurations.

The flow is not forced to form a whirlpool. This dynamical geometry is not ‘pushed’ into existence, so to speak, by specially designed barriers and guides to the flow. Rather, the system as a whole will tend to spend more time in this semi-regular behaviour because the dynamical geometry of the whirlpool affords one of the few ways that the constant instabilities can most consistently compensate for one another. [Deacon, 2009, emphasis added]

Self-Organizing System (http://www.flickr.com/photos/lapstrake/3164577339/)

Essentially, when lots of things interact at random, some self-organized constraints on their interactions allow order to arise from chaos. This order may be fleeting or persistent. Rather than coming from the designed constraint of a clay pot, the walls of a room, or spokes around a hub, the constraints on the system arise from the potential in the context of the interactions, and from the properties of the interacting objects themselves.

So what in the world does this have to do with the humanities?

My argument in the paper above was that people naturally interact in certain ways; there are certain basins of attraction, properties of societies that tend to self-organize and persist. These are stochastic regularities; people do not always interact in the same way, and societies do not come to the same end, nor meet their ends in the same fashion. However, there are properties which make certain forms of social organization more likely, and knowing how societies tend to form, historians can use that knowledge to frame questions and focus studies.
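As a hedged sketch of what using these regularities as a backdrop might look like in practice, the snippet below compares an “observed” network (the Les Misérables coappearance graph shipped with networkx, standing in for a historical correspondence network) against an ensemble of degree-preserving randomizations. Where the observed value falls far outside the ensemble’s range, something historically specific, rather than statistically generic, may be at work. All parameters here are invented for illustration:

```python
# Compare an observed network's clustering against a null ensemble of
# degree-preserving rewirings. A large gap flags structure that the
# degree sequence alone cannot explain. Illustrative parameters only.
import networkx as nx

observed = nx.les_miserables_graph()   # stand-in for a historical network
obs_clust = nx.average_clustering(observed)

null_values = []
for seed in range(50):
    null = nx.Graph(observed)          # copy, then shuffle edges in place
    nx.double_edge_swap(null, nswap=4 * null.number_of_edges(),
                        max_tries=10**6, seed=seed)
    null_values.append(nx.average_clustering(null))

print(f"Observed clustering: {obs_clust:.3f}; "
      f"null range: [{min(null_values):.3f}, {max(null_values):.3f}]")
```

The null model is the base map; the deviation is the invitation back into the archives.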

Explicit, data-driven models of the various mechanisms of human development and interaction will allow a more nuanced backdrop against which the actualities of the historical narrative can be studied. Elijah Meeks recently posted this about models:

[T]he beauty of a model is that all of these [historical] assumptions are formalized and embedded in the larger argument… That formalization can be challenged, extended, enhanced and amended [by more historical research]… Rather than a linear text narrative, the model itself is an argument.

It is striking how seemingly unrelated strands of my life came together last night: the pull and flow of juggling, the bounded ordering of emergent behaviors, and the regularities in human activities. Perhaps this is indicative of the consilience of human endeavors; perhaps it is simply the overactive pattern-recognition circuits in my brain doing what they do best. In any case, even if the relationships are merely loose metaphors, it seems clear that a richer understanding of complexity theory, modeling, and data-driven methods in the humanities, leading to a more nuanced and humanistic understanding of human dynamics, would benefit everyone. This understanding can help ground the study of history in the Age of Abundance. A balance can be drawn between the uniquely human and individual on one side and the statistically regular ordering of systems on the other; each side needs to be framed in terms of the other. Unfortunately, the public dialogue on this topic has thus far been dominated by applied mathematicians and statistical physicists who tend not to take into account the insights gained from centuries of qualitative humanistic inquiry. That probably means it’s our job to learn from them, because it seems unlikely that they will try to learn from us.

Notes:

  1. In The Re-Emergence of Emergence, edited by Philip Clayton & Paul Davies, 2009.