The Turing Point

Below are some crazy, uninformed ramblings about the least-complex possible way to trick someone into thinking a computer is a human, for the purpose of history research. I’d love some genuine AI/Machine Intelligence researchers to point me to the actual discussions on the subject. These aren’t original thoughts; they spring from countless sci-fi novels and AI research from the ’70s-’90s. Humanists beware: this is super sci-fi speculative, but maybe an interesting thought experiment.


If someone’s chatting with a computer, but doesn’t realize her conversation partner isn’t human, that computer passes the Turing Test. Unrelatedly, if a robot or piece of art is just close enough to reality to be creepy, but not close enough to be convincingly real, it lies in the Uncanny Valley. I argue there is a useful concept in the simplest possible computer which is still convincingly human, and that computer will be at the Turing Point. 1

By Smurrayinchester – self-made, based on image by Masahiro Mori and Karl MacDorman, CC BY-SA 3.0

Forgive my twisting Turing Tests and Uncanny Valleys away from their normal use, for the sake of outlining the Turing Point concept:

  • A human simulacrum is a simulation of a human, or some aspect of a human, in some medium, which is designed to be as-close-as-possible to that which is being modeled, within the scope of that medium.
  • A Turing Test winner is any human simulacrum which humans consistently mistake for the real thing.
  • An occupant of the Uncanny Valley is any human simulacrum which humans consistently doubt as representing a “real” human.
  • Between the Uncanny Valley and Turing Test winners lies the Turing Point, occupied by the least-sophisticated human simulacrum that can still consistently pass as human in a given medium. The Turing Point is a hyperplane in a hypercube, such that there are many points of entry for the simulacrum to “phase-transition” from uncanny to convincing.

Extending the Turing Test

The classic Turing Test scenario is a text-only chatbot which must, in free conversation, be convincing enough for a human to think it is speaking with another human. A piece of software named Eugene Goostman sort-of passed this test in 2014, convincing a third of judges it was a 13-year-old Ukrainian boy.

There are many possible modes in which a computer can act convincingly human. It is easier to make a convincing simulacrum of a 13-year-old non-native English speaker who is confined to text messages than to make a convincing college professor, for example. Thus the former has a lower Turing Point than the latter.

Playing with the constraints of the medium will also affect the Turing Point threshold. The Turing Point for a flesh-covered robot is incredibly difficult to surpass, since so many little details (movement, design, voice quality, etc.) may place it into the Uncanny Valley. A piece of software posing as a Twitter user, however, would have a significantly easier time convincing fellow users it is human.

The Turing Point, then, is flexible to the medium in which the simulacrum intends to deceive, and the sort of human it simulates.

From Type to Token

Convincing the world a simulacrum is any old human is different than convincing the world it is some specific human. This is the token/type distinction; convincingly simulating a specific person (token) is much more difficult than convincingly simulating any old person (type).

Simulations of specific people are all over the place, even if they don’t intend to deceive. Several Twitter-bots exist as simulacra of Donald Trump, reading his tweets and creating new ones in a similar style. In the spirit of Poe’s Law, certain people’s styles, or certain types of media (e.g. Twitter), may provide such a low Turing Point that it is genuinely difficult to distinguish humans from machines.

Put differently, the way some Turing Tests may be designed, humans could easily lose.

It’ll be useful to make up and define two terms here. I imagine the concepts already exist, but couldn’t find them, so please comment if they do so I can use less stupid words:

  • A type-bot is a machine designed to represent something at the type level. For example, a bot that can be mistaken for some random human, but not for any specific human.
  • A token-bot is a machine designed to represent something at the token level. For example, a bot that can be mistaken for Donald Trump.

Replaying History

Using traces to recreate historical figures (or at least things they could have done) as token-bots is not uncommon. The most recent high-profile example of this is a project to create a new Rembrandt painting in the original style. Shawn Graham and I wrote an article on using simulations to create new plausible histories, among many other examples old and new.

This all got me thinking, if we reach the Turing Point for some social media personalities (that is, it is difficult to distinguish between their social media presence, and a simulacrum of it), what’s to say we can’t reach it for an entire social media ecosystem? Can we take a snapshot of Twitter and project it several seconds/minutes/hours/days into the future, a bit like a meteorological model?

A few questions and obvious problems:

  • Much of Twitter’s dynamics are dependent upon exogenous forces: memes from other media, real world events, etc. Thus, no projection of Twitter alone would ever look like the real thing. One can, however, potentially use such a simulation to predict how certain types of events might affect the system.
  • This is way overkill, and impossibly computationally complex at this scale. You can simulate the dynamics of Twitter without simulating every individual user, because people on average act pretty systematically. That said, for the humanities-inclined, we may gain more insight from the ground-level of the system (individual agents) than macroscopic properties.
  • This is key. Would a set of plausibly-duplicate Twitter personalities on aggregate create a dynamic system that matches Twitter as an aggregate system? That is, just because the algorithms pass the Turing Test, because humans believe them to be humans, does that necessarily imply the algorithms have enough fidelity to accurately recreate the dynamics of a large scale social network? Or will small unnoticeable differences between the simulacrum and the original accrue atop each other, such that in aggregate they no longer act like a real social network?

The last point is, I think, a theoretically and methodologically fertile one for people working in DH, AI, and Cognitive Science: whether shrinking the human-appreciable differences between machines and people is sufficient to simulate aggregate social behavior, or whether human-appreciability (i.e., the Turing Test) is a strict enough criterion for making accurate predictions about societies.

These points aside, if we ever do manage to simulate specific people (even in a very limited scope) as token-bots based on the traces they leave, it opens up interesting pedagogical and research opportunities for historians. Scott Enderle tweeted a great metaphor for this:

Imagine, as a student, being able to have a plausible discussion with Marie Curie, or sitting in an Enlightenment-era salon. 2 Or imagine, as a researcher (if individual Turing Point machines do aggregate well), being able to do well-grounded counterfactual history that works at the token level rather than at the type level.

Turing Point Simulations

Bringing this slightly back into the realm of the sane, the interesting thing here is the interplay between appreciability (a person’s ability to appreciate enough difference to notice something wrong with a simulacrum) and fidelity.

We can specifically design simulation conditions with incredibly low-threshold Turing Points, even for token-bots. That is to say, we can create a condition where the interactions are simple enough to make a bot that acts indistinguishably from the specific human it is simulating.

At the most extreme end, this is obviously pointless. If our system is one in which a person can only answer “yes” or “no” to pre-selected preference questions (“Do you like ice-cream?”), making a bot to simulate that person convincingly would be trivial.

Putting that aside (lest we get into questions of the Turing Point of a set of Turing Points), we can potentially design reasonably simplistic test scenarios that would allow for an easy-to-reach Turing Point while still being historiographically or sociologically useful. It’s sort of a minimization problem in topological optimizations. Such a goal would limit the burden of the simulation while maximizing the potential research benefit (but only if, as mentioned before, the difference between true fidelity and the ability to win a token-bot Turing Test is small enough to allow for generalization).

In short, the concept of a Turing Point can help us conceptualize and build token-simulacra that are useful for research or teaching. It helps us ask the question: what’s the least-complex-but-still-useful token-simulacrum? It’s also kind-of maybe sort-of like Kolmogorov complexity for human appreciability of other humans: that is, the simplest possible representation of a human that is still convincing to other humans.

I’ll end by saying, once again, I realize how insane this sounds, and how far-off. And also how much of an interloper I am in this space, having never so much as designed a bot. Still, as Bill Hart-Davidson wrote,

the possibility seems more plausible than ever, even if not soon-to-come. I’m not even sure why I posted this on the Irregular, but it seemed like it’d be relevant enough to some regular readers’ interests to be worth spilling some ink.

Notes:

  1. The name itself is maybe too on-the-nose, being a pun on “turning point” and thus connected to the rhetoric of the singularity, but ¯\_(ツ)_/¯
  2. Yes yes I know, this is SecondLife all over again, but hopefully much more useful.

Who sits in the 41st chair?

tl;dr Rich-get-richer academic prestige in a scarce job market makes meritocracy impossible. Why some things get popular and others don’t. Also agent-based simulations.

Slightly longer tl;dr This post is about why academia isn’t a meritocracy, at no intentional fault of those in power who try to make it one. None of the presented ideas are novel on their own, but I do intend this as a novel conceptual contribution in its connection of disparate threads. In particular, I suggest the predictability of research success in a scarce academic economy as a theoretical framework for exploring successes and failures in the history of science.

But mostly I just beat a “musical chairs” metaphor to death.

Positive Feedback

To the victor go the spoils, and to the spoiled go the victories. Think about it: the Yankees; Alexander the Great; Stanford University. Why do the Yankees have twice as many World Series appearances as their nearest competitors, how was Alex’s empire so fucking vast, and why does Stanford get all the cool grants?

The rich get richer. Enough World Series victories, and the Yankees get the reputation and funding to entice the best players. Ol’ Allie-G inherited an amazing army, was taught by Aristotle, and pretty much every place he conquered increased his military’s numbers. Stanford’s known for amazing tech innovation, so they get the funding, which means they can afford even more innovation, which means even more people think they’re worthy of funding, and so on down the line until Stanford and its neighbors (Google, Apple, etc.) destroy the local real estate market and then accidentally blow up the world.

Alexander’s Empire [via]
Okay, maybe I exaggerated that last bit.

Point is, power begets power. Scientists call this a positive feedback loop: when a thing’s size is exactly what makes it grow larger.

You’ve heard it firsthand when a microphoned singer walks too close to her speaker. First the mic picks up what’s already coming out of the speaker. The mic, doing its job, sends what it hears to an amplifier, sending an even louder version to the very same speaker. The speaker replays a louder version of what it just produced, which is once again received by the microphone, until sound feeds back onto itself enough times to produce the ear-shattering squeal fans of live music have come to dread. This is a positive feedback loop.

Feedback loop. [via]
Positive feedback loops are everywhere. They’re why the universe counts logarithmically rather than linearly, or why income inequality is so common in free market economies. Left to their own devices, the rich tend to get richer, since it’s easier to make money when you’ve already got some.

Science and academia are equally susceptible to positive feedback loops. Top scientists, the most well-funded research institutes, and world-famous research all got to where they are, in part, because of something called the Matthew Effect.

Matthew Effect

The Matthew Effect isn’t the reality TV show it sounds like.

For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken even that which he hath. —Matthew 25:29, King James Bible.

It’s the Biblical idea that the rich get richer, and it’s become a popular party trick among sociologists (yes, sociologists go to parties) describing how society works. In academia, the phrase is brought up alongside evidence that shows previous grant-recipients are more likely to receive new grants than their peers, and the more money a researcher has been awarded, the more they’re likely to get going forward.

The Matthew Effect is also employed metaphorically, when it comes to citations. He who gets some citations will accrue more; she who has the most citations will accrue them exponentially faster. There are many correct explanations, but the simplest one will do here: 

If Susan’s article on the danger of velociraptors is cited by 15 other articles, I am more likely to find it and cite her than another article on velociraptors containing the same information that has never been cited. That’s because when I’m reading research, I look at who’s being cited. The more Susan is cited, the more likely I’ll eventually come across her article and cite it myself, which in turn makes it that much more likely that someone else will find her article through my own citations. Continue ad nauseam.
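To make that mechanism concrete, here's a minimal toy sketch in Python (entirely my own illustration, with made-up parameters, not anyone's actual citation data): each new paper cites one earlier paper with probability proportional to that paper's existing citations plus a small constant, so early luck compounds.

```python
import random

def simulate_citations(n_papers=1000, base_weight=1.0, seed=42):
    """Toy rich-get-richer model: each new paper cites one earlier paper,
    chosen with probability proportional to (its citations so far + base_weight)."""
    rng = random.Random(seed)
    citations = [0, 0]            # start with two uncited papers
    for _ in range(n_papers - 2):
        weights = [c + base_weight for c in citations]
        cited = rng.choices(range(len(citations)), weights=weights, k=1)[0]
        citations[cited] += 1     # the lucky paper gets one more citation
        citations.append(0)       # the new paper joins the pool, uncited
    return citations

counts = sorted(simulate_citations(), reverse=True)
print("top 5 citation counts:", counts[:5])
print("papers never cited:", sum(c == 0 for c in counts), "of", len(counts))
```

Run it a few times and the same shape appears: a few early papers rack up far more citations than the rest, while most papers sit uncited or nearly so, even though every paper entered the pool identical.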

Some of you are thinking this is stupid. Maybe it’s trivially correct, but missing the bigger picture: quality. What if Susan’s velociraptor research is simply better than the competing research, and that’s why it’s getting cited more?

Yes, that’s also an issue. Noticeably awful research simply won’t get much traction. 1 Let’s disqualify it from the citation game. The point is there is lots of great research out there, waiting to be read and built upon, and its quality isn’t the sole predictor of its eventual citation success.

In fact, quality is a mostly-necessary but completely insufficient indicator of research success. Superstar popularity of research depends much more on the citation effects I mentioned above – more citations beget even more. Previous success is the best predictor of future success, mostly independent of the quality of the research being shared.

Example of positive feedback loops pushing some articles to citation stardom. [via]
This is all pretty hand-wavy. How do we know success is more important than quality in predicting success? Uh, basically because of Napster.

Popular Music

If VH1 were to produce a retrospective on the first decade of the 21st century, perhaps its two biggest subjects would be illegal music sharing and VH1’s I Love the 19xx… TV series. Napster came and went, followed by LimeWire, eDonkey2000, AudioGalaxy, and other services sued by Metallica. Well-known early internet memes like Hamster Dance and All Your Base Are Belong To Us spread through the web like socially transmitted diseases, and researchers found this the perfect opportunity to explore how popularity worked. Experimentally.

In 2006, a group of Columbia University social scientists designed a clever experiment to test why some songs became popular and others did not, relying on the public interest in online music sharing. They created a music downloading site which gathered 14,341 users, each one to become a participant in their social experiment.

The cleverness arose out of their experimental design, which allowed them to get past the pesky problem of history only ever happening once. It’s usually hard to learn why something became popular, because you don’t know what aspects of its popularity were simply random chance, and what aspects were genuine quality. If you could, say, just rerun the 1960s, changing a few small aspects here or there, would the Beatles still have been as successful? We can’t know, because the 1960s are pretty much stuck having happened as they did, and there’s not much we can do to change it. 2

But this music-sharing site could rerun history—or at least, it could run a few histories simultaneously. When they signed up, each of the site’s 14,341 users was randomly sorted into one of several groups, and their group number determined how music was presented to them. The selection of music was intentionally obscure, so users wouldn’t have heard the bands before.

A user from the first group, upon logging in, would be shown songs in random order, and given the option to listen to a song, rate it 1-5, and download it. Users from group #2, instead, were shown the songs ranked in order of their popularity among other members of group #2. Group #3 users were shown a similar rank-order of popular songs, but this time determined by each song’s popularity within group #3. So too for groups #4-#9. Every user could listen to, rate, and download music.

Essentially, the researchers put the participants into 9 different self-contained petri dishes, and waited to see which music would become most popular in each. Ranking and download popularity from group #1 was their control group, in that its members judged music on quality alone, without access to social influence. Members of groups #2-#9 could be influenced by what music was popular with their peers within their group. The same songs circulated in each petri dish, and each petri dish presented its own version of history.
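For intuition only (this is my own toy approximation, not the study's actual mechanics or data), here's a Python sketch of that design: nine simulated worlds share the same songs with the same fixed "quality," but in eight of them a song's previous downloads also boost its chance of being picked.

```python
import random

N_SONGS, N_WORLDS, USERS_PER_WORLD = 48, 9, 1500

# The same songs, with the same fixed intrinsic "quality", circulate in every world.
_rng = random.Random(0)
QUALITY = [_rng.random() for _ in range(N_SONGS)]

def run_world(social_influence, seed):
    """One 'petri dish': users arrive one at a time and download one song each.
    In social-influence worlds, a song's previous downloads boost its odds of
    being picked, on top of its fixed quality."""
    rng = random.Random(seed)
    downloads = [0] * N_SONGS
    for _ in range(USERS_PER_WORLD):
        weights = [q + (d if social_influence else 0)
                   for q, d in zip(QUALITY, downloads)]
        pick = rng.choices(range(N_SONGS), weights=weights, k=1)[0]
        downloads[pick] += 1
    return downloads

# World 0 plays the role of the control group; worlds 1-8 see social information.
for world in range(N_WORLDS):
    downloads = run_world(social_influence=(world != 0), seed=world + 1)
    top = max(range(N_SONGS), key=downloads.__getitem__)
    print(f"world {world}: top song #{top} with {downloads[top]} downloads "
          f"({'influence' if world else 'control'})")
```

In the influence worlds, which song ends up on top typically varies from world to world, and its lead is far more lopsided than anything the control world produces — a crude cartoon of the paper’s finding.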

Music sharing site from Columbia study.

No superstar songs emerged out of the control group. Positive feedback loops weren’t built into the system, since popularity couldn’t beget more popularity if nobody saw what their peers were listening to. The other 8 musical petri dishes told a different story, however. Superstars emerged in each, but each group’s population of popular music was very different. A song’s popularity in each group was slightly related to its quality (as judged by ranking in the control group), but mostly it was social-influence-produced chaos. The authors put it this way:

In general, the “best” songs never do very badly, and the “worst” songs never do extremely well, but almost any other result is possible. —Salganik, Dodds, & Watts, 2006

These results became even more pronounced when the researchers increased the visibility of social popularity in the system. The rich got even richer still. A lot of it has to do with timing. In each group, the first few good songs to become popular are the ones that eventually do the best, simply by an accident of circumstance. The first few popular songs appear at the top of the list, for others to see, so they in turn become even more popular, and so on ad infinitum. The authors go on:

experts fail to predict success not because they are incompetent judges or misinformed about the preferences of others, but because when individual decisions are subject to social influence, markets do not simply aggregate pre-existing individual preferences.

In short, quality is a necessary but insufficient criterion for ultimate success. Social influence, timing, randomness, and other non-qualitative features of music are what turn a good piece of music into an off-the-charts hit.

Wait what about science?

Compare this to what makes a “well-respected” scientist: it ain’t all citations and social popularity, but they play a huge role. And as I described above, simply out of exposure-fueled propagation, the more citations someone accrues, the more citations they are likely to accrue, until we get a situation like the Yankees (40 World Series appearances, versus 20 appearances by the Giants) on our hands. Superstars are born, who are miles beyond the majority of working researchers in terms of grants, awards, citations, etc. Social scientists call this preferential attachment.

Which is fine, I guess. Who cares if scientific popularity is so skewed, as long as good research is happening? Even if we take the Columbia social music experiment at face value as an exact analog for scientific success, we know that the most successful are always good scientists, and the least successful are always bad ones, so what does it matter if variability within the ranks of the successful is so detached from quality?

Except, as anyone studying their #OccupyWallstreet knows, it ain’t that simple in a scarce economy. When the rich get richer, that money’s gotta come from somewhere. Like everything else (cf. the law of conservation of mass), academia is a (mostly) zero-sum game, and to the victors go the spoils. To the losers? Meh.

So let’s talk scarcity.

The 41st Chair

The same guy who introduced the concept of the Matthew Effect to scientific grants and citations, Robert K. Merton (…of Columbia University), also brought up “the 41st chair” in the same 1968 article.

Merton’s pretty great, so I’ll let him do the talking:

In science as in other institutional realms, a special problem in the workings of the reward system turns up when individuals or organizations take on the job of gauging and suitably rewarding lofty performance on behalf of a large community. Thus, that ultimate accolade in 20th-century science, the Nobel prize, is often assumed to mark off its recipients from all the other scientists of the time. Yet this assumption is at odds with the well-known fact that a good number of scientists who have not received the prize and will not receive it have contributed as much to the advancement of science as some of the recipients, or more.

This can be described as the phenomenon of “the 41st chair.” The derivation of this tag is clear enough. The French Academy, it will be remembered, decided early that only a cohort of 40 could qualify as members and so emerge as immortals. This limitation of numbers made inevitable, of course, the exclusion through the centuries of many talented individuals who have won their own immortality. The familiar list of occupants of this 41st chair includes Descartes, Pascal, Moliere, Bayle, Rousseau, Saint-Simon, Diderot, Stendahl, Flaubert, Zola, and Proust.

[…]

But in greater part, the phenomenon of the 41st chair is an artifact of having a fixed number of places available at the summit of recognition. Moreover, when a particular generation is rich in achievements of a high order, it follows from the rule of fixed numbers that some men whose accomplishments rank as high as those actually given the award will be excluded from the honorific ranks. Indeed, their accomplishments sometimes far outrank those which, in a time of less creativity, proved enough to qualify men for this high order of recognition.

The Nobel prize retains its luster because errors of the first kind—where scientific work of dubious or inferior worth has been mistakenly honored—are uncommonly few. Yet limitations of the second kind cannot be avoided. The small number of awards means that, particularly in times of great scientific advance, there will be many occupants of the 41st chair (and, since the terms governing the award of the prize do not provide for posthumous recognition, permanent occupants of that chair).

Basically, the French Academy allowed only 40 members (chairs) at a time. We can be reasonably certain those members were pretty great, but we can’t be sure that equally great—or greater—women existed who simply never got the opportunity to participate because none of the 40 members died in time.

These good-enough-to-be-members-but-weren’t were said to occupy the French Academy’s 41st chair, an inevitable outcome of a scarce economy (40 chairs) when the potential number of beneficiaries far outnumbers the goods available (40). The population occupying the 41st chair is huge, and growing, since the same number of chairs has existed since 1634, but the population of France has quadrupled in the intervening four centuries.

Returning to our question of “so what if rich-get-richer doesn’t stick the best people at the top, since at least we can assume the people at the top are all pretty good anyway?”, scarcity of chairs is the so-what.

Since faculty jobs are stagnating compared to adjunct work, yet new PhDs are being granted faster than new jobs become available, we are presented with the much-discussed crisis in higher education. Don’t worry, we’re told, academia is a meritocracy. With so few jobs, only the cream of the crop will get them. The best work will still be done, even in these hard times.

Recent Science PhD growth in the U.S. [via]
Unfortunately, as the Columbia social music study (among many other studies) showed, true meritocracies are impossible in complex social systems. Anyone who plays the academic game knows this already, and many are quick to point it out when they see people in much better jobs doing incredibly stupid things. What those who point out the falsity of meritocracy often get wrong, however, is intention: the idea that there is no meritocracy because those in power talk the meritocracy talk, but don’t then walk the walk. I’ll talk a bit later about how, even if everyone is above board in trying to push the best people forward, occupants of the 41st chair will still often wind up being more deserving than those sitting in chairs 1-40. But more on that later.

For now, let’s start building a metaphor that we’ll eventually over-extend well beyond its usefulness. Remember that kids’ game Musical Chairs, where everyone’s dancing around a bunch of chairs while the music is playing, but as soon as the music stops everyone’s got to find a chair and sit down? The catch, of course, is that there are fewer chairs than people, so someone always loses when the music stops.

The academic meritocracy works a bit like this. It is meritocratic, to a point: you can’t even play the game without proving some worth. The price of admission is a Ph.D. (which, granted, is more an endurance test than an intelligence test, but academic success ain’t all smarts, y’know?), a research area at least a few people find interesting and believe you’d do good work in, etc. It’s a pretty low meritocratic bar, since it described 50,000 people who graduated in the U.S. in 2008 alone, but it’s a bar nonetheless. And it’s your competition in Academic Musical Chairs.

Academic Musical Chairs

Time to invent a game! It’s called Academic Musical Chairs, the game where everything’s made up and the points don’t matter. It’s like Regular Musical Chairs, but more complicated (see Fig. 1). Also the game is fixed.

Figure 1: Academic Musical Chairs

See those 40 chairs in the middle green zone? People sitting in them are the winners. Once they’re seated they have what we call in the game “tenure”, and they don’t get up until they die or write something controversial on Twitter. Everyone bustling around them, the active players, is vying for seats while they wait for someone to die; they occupy the yellow zone we call “the 41st chair”. Those beyond that, in the red zone, can’t yet (or may never) afford the price of game admission; they don’t have a Ph.D., they already said something controversial on Twitter, etc. The unwashed masses, you know?

As the music plays, everyone in the 41st chair is walking around in a circle waiting for someone to die and the music to stop. When that happens, everyone rushes to the empty seat. A few invariably reach it simultaneously, until one out-muscles the others and sits down. The sitting winner gets tenure. The music starts again, and the line continues to orbit the circle.

If a player spends too long orbiting in the 41st chair, he is forced to resign. If a player runs out of money while orbiting, she is forced to resign. Other factors may force a player to resign, but they will never appear in the rulebook and will always be a surprise.

Now, some players are more talented than others, whether naturally or through intense training. The game calls this “academic merit”, but it translates here to increased speed and strength, which helps some players reach the empty chair when the music stops, even if they’re a bit further away. The strength certainly helps when competing with others who reach the chair at the same time.

A careful look at Figure 1 will reveal one other way players might increase their chances of success when the music stops. The 41st chair has certain internal shells, or rings, which act a bit like that fake model of an atom everyone learned in high-school chemistry. Players, of course, are the electrons.

Electron shells. [via]
You may remember that the further out the shell, the more electrons can occupy it (-ish): the first shell holds 2 electrons, the second holds 8, the third holds 18, the fourth holds 32, and so on. The same holds true for Academic Musical Chairs: the coveted interior ring only fits a handful of players; the second ring fits an order of magnitude more; the third ring an order of magnitude more than that, and so on.

Getting closer to the center isn’t easy, and it has very little to do with your “academic rigor”! Also, of course, the closer you are to the center, the easier it is to reach either the chair, or the next level (remember positive feedback loops?). Contrariwise, the further you are from the center, the less chance you have of ever reaching the core.

Many factors affect whether a player can proceed to the next ring while the music plays, and some factors actively count against a player. Old age and being a woman, for example, take away 1 point. Getting published or cited adds points, as does already being friends with someone sitting in a chair (the details of how many points each adds can be found in your rulebook). Obviously the closer you are to the center, the easier you can make friends with people in the green core, which will contribute to your score even further. Once your score is high enough, you proceed to the next-closest shell.

Hooray, someone died! Let’s watch what happens.

The music stops. The people in the innermost ring who have the luckiest timing (thus are closest to the empty chair) scramble for it, and a few even reach it. Some very well-timed players from the 2nd & 3rd shells also reach it, because their “academic merit” has lent them speed and strength to reach past their position. A struggle ensues. Miraculously, a pregnant black woman sits down (this almost never happens), though not without some bodily harm, and the music begins again.

Oh, and new shells keep getting tacked on as more players can afford the cost of admission to the yellow zone, though the green core remains the same size.

Bizarrely, this is far from the first game of this nature. A Spanish board game from 1587 called the Courtier’s Philosophy had players move figures around a board, inching closer to living a luxurious life in the shadow of a rich patron. Random chance ruled their progression—a roll of the dice—and occasionally they’d reach a tile that said things like: “Your patron dies, go back 5 squares”.

The courtier’s philosophy. [via]
But I digress. Let’s temporarily table the scarcity/41st-chair discussion and get back to the Matthew Effect.

The View From Inside

A friend recently came to me, excited but nervous about how well they were being treated by their department at the expense of their fellow students. “Is this what the Matthew Effect feels like?” they asked. Their question is the reason I’m writing this post, because I spent the next 24 hours scratching my head over “what does the Matthew Effect feel like?”.

I don’t know if anyone’s looked at the psychological effects of the Matthew Effect (if you know of such work, please comment!), but my guess is it encompasses two feelings: 1) impostor syndrome, and 2) hard work finally paying off.

Since almost anyone who reaps the benefits of the Matthew Effect in academia will be an intelligent, hard-working academic, a windfall of accruing success should feel like finally reaping the benefits one deserves. You probably realize that luck played a part, and that many of your harder-working, smarter friends have been equally unlucky, but there’s no doubt in your mind that, at least, your hard work is finally paying off and the academic community is beginning to recognize that fact. No matter how unfair it is that your great colleagues aren’t seeing the same success.

But here’s the thing. You know how, in physics, gravity and acceleration feel equivalent? How, if you’re in a windowless box, you wouldn’t be able to tell the difference between being stationary on Earth and being accelerated by a spaceship at 9.8 m/s² through deep space? Success from merit and success from the Matthew Effect probably act similarly, such that it’s impossible to tell one from the other from the inside.

Gravity vs. Acceleration. [via]
Incidentally, that’s why the last advice you ever want to take is someone telling you how to succeed from their own experience.

Success

Since we’ve seen that explosive success requires skill, quality, and intent, but doesn’t depend on them alone, the most successful people are not necessarily in the best position to understand the reason for their own rise. Their strategies may have paid off, but so did timing, social network effects, and positive feedback loops. The question you should be asking is, why didn’t other people with the same strategies also succeed?

Keep this especially in mind if you’re a student and your tenured-professor advisor has encouraged you to seek an academic career. They may believe that giving you their strategies for success will help you succeed, when really they’re just giving you one of 50,000 admission tickets to Academic Musical Chairs.

Building a Meritocracy

I’m teetering well-past the edge of speculation here, but I assume the communities of entrenched academics encouraging undergraduates into a research career are the same communities assuming a meritocracy is at play, and are doing everything they can in hiring and tenure review to ensure a meritocratic playing field.

But even if gender bias did not exist, even if everyone responsible for decision-making genuinely wanted a meritocracy, even if the game weren’t rigged at many levels, the economy of scarcity (41st chair) combined with the Matthew Effect would ensure a true meritocracy would be impossible. There are only so many jobs, and hiring committees need to choose some selection criteria; those selection criteria will be subject to scarcity and rich-get-richer effects.

I won’t prove that point here, because original research is beyond the scope of this blog post, but I have a good idea of how to do it. In fact, after I finish writing this, I probably will go do just that. Instead, let me present very similar research, and explain how that method can be used to answer this question.

We want an answer to the question of whether positive feedback loops and a scarce economy are sufficient to prevent the possibility of a meritocracy. In 1971, Tom Schelling asked an unrelated question which he answered using a very relevant method: can racial segregation manifest in a community whose every actor is intent on not living a segregated life? Spoiler alert: yes.

He answered this question by simulating an artificial world—similar in spirit to the Columbia social music experiment, except instead of using real participants, he experimented on very simple rule-abiding creatures of his own invention. A bit like having a computer play checkers against itself.

The experiment is simple enough: a bunch of creatures occupy a checker board, and like checker pieces, they’re red or black. Every turn, one creature has the opportunity to move randomly to another empty space on the board, and their decision to move is based on their comfort with their neighbors. Red pieces want red neighbors, and black pieces want black neighbors, and they keep moving randomly ’till they’re all comfortable. Unsurprisingly, segregated creature communities appear in short order.

What if our checker-creatures were more relaxed in their comforts? They’d be comfortable as long as they were in the majority; say, as long as at least 50% of their neighbors were the same color. Again, let the computer play itself for a while, and within a few cycles the checker board is once again almost completely segregated.

Schelling segregation. [via]
What if the checker pieces are excited about the prospect of a diverse neighborhood? We relax the criteria even more, so red checkers only move if fewer than a third of their neighbors are red (that is, they’re totally comfortable with 66% of their neighbors being black). If we run the experiment again, we see, again, the checker board break up into segregated communities.

Schelling’s claim wasn’t about how the world actually worked, but about the simplest conditions that could still explain segregation. In his fictional checkers-world, every piece could be genuinely interested in living in a diverse neighborhood, and yet the system still eventually resulted in segregation. This offered powerful support for the theory that racism could operate subtly, even if every actor were well-intended.
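Here’s a compact sketch of the model in Python, under my own simplifying assumptions (a wrapping grid, 10% empty cells, unhappy agents teleporting to a random empty cell); it is an illustration of the idea, not Schelling's original setup.

```python
import random

SIZE, EMPTY_FRAC, THRESHOLD, STEPS = 30, 0.1, 0.34, 200_000

def neighbors(grid, r, c):
    """Colors of the occupied cells among the eight surrounding cells (grid wraps)."""
    out = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == dc == 0:
                continue
            cell = grid[(r + dr) % SIZE][(c + dc) % SIZE]
            if cell is not None:
                out.append(cell)
    return out

def unhappy(grid, r, c):
    """Unhappy = fewer than THRESHOLD of my neighbors share my color."""
    around = neighbors(grid, r, c)
    return bool(around) and sum(n == grid[r][c] for n in around) / len(around) < THRESHOLD

random.seed(1)
cells = ["red", "black"] * int(SIZE * SIZE * (1 - EMPTY_FRAC) / 2)
cells += [None] * (SIZE * SIZE - len(cells))          # the empty squares
random.shuffle(cells)
grid = [cells[i * SIZE:(i + 1) * SIZE] for i in range(SIZE)]
empties = [(r, c) for r in range(SIZE) for c in range(SIZE) if grid[r][c] is None]

for _ in range(STEPS):
    r, c = random.randrange(SIZE), random.randrange(SIZE)
    if grid[r][c] is not None and unhappy(grid, r, c):
        k = random.randrange(len(empties))            # move to a random empty square
        er, ec = empties[k]
        grid[er][ec], grid[r][c] = grid[r][c], None
        empties[k] = (r, c)                           # the vacated square is now empty

fractions = []
for r in range(SIZE):
    for c in range(SIZE):
        if grid[r][c] is None:
            continue
        around = neighbors(grid, r, c)
        if around:
            fractions.append(sum(n == grid[r][c] for n in around) / len(around))
print("average same-color neighbor fraction:", round(sum(fractions) / len(fractions), 2))
```

With every agent demanding only that a third of its neighbors match it, the average same-color neighbor fraction typically ends up well above that threshold: segregation nobody asked for.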

Vi Hart and Nicky Case created an interactive visualization/game that teaches Schelling’s segregation model perfectly. Go play it. Then come back. I’ll wait.


Such an experiment can be devised for our 41st-chair/positive-feedback system as well. We can even build a simulation whose rules match the Academic Musical Chairs I described above. All we need to do is show that a system in which both effects operate (a fact empirically proven time and again in academia) produces fundamental challenges for meritocracy. Such a model would show that simple meritocratic intent is insufficient to produce a meritocracy. Hulk-smashing the myth of the meritocracy seems fun; I think I’ll get started soon.
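To gesture at what that might look like, here’s a bare-bones Python sketch (a toy of my own devising with arbitrary parameters, not a finished experiment): candidates have fixed “merit,” accumulate visibility through a rich-get-richer process, and a small fixed pool of chairs goes, one at a time, to whoever is most visible when a chair opens.

```python
import random

N_CANDIDATES, N_CHAIRS, ROUNDS = 1000, 40, 400

def run(feedback_strength, seed):
    """Toy model: a chair opens every tenth round and goes to the most visible
    unseated candidate. Visibility grows with fixed 'merit' plus a rich-get-richer
    bonus proportional to visibility already accrued."""
    rng = random.Random(seed)
    merit = [rng.random() for _ in range(N_CANDIDATES)]
    visibility = [0.0] * N_CANDIDATES
    seated = set()
    for t in range(ROUNDS):
        for i in range(N_CANDIDATES):
            if i not in seated:
                visibility[i] += merit[i] + feedback_strength * visibility[i] * rng.random()
        if t % 10 == 9 and len(seated) < N_CHAIRS:     # someone retires; a chair opens
            winner = max((i for i in range(N_CANDIDATES) if i not in seated),
                         key=visibility.__getitem__)
            seated.add(winner)
    # How meritocratic was the outcome? Overlap between the actual winners and the
    # counterfactual "top N_CHAIRS by merit alone".
    ideal = set(sorted(range(N_CANDIDATES), key=merit.__getitem__, reverse=True)[:N_CHAIRS])
    return len(seated & ideal) / N_CHAIRS

for strength in (0.0, 0.05, 0.15):
    overlap = sum(run(strength, seed=s) for s in range(10)) / 10
    print(f"feedback strength {strength}: {overlap:.0%} of chairs held by the top-40 by merit")
```

In this toy, with the feedback turned off the chairs go exactly to the top candidates by merit; turn it up and a growing share go to people who were merely good and lucky early — the gap between meritocratic intent and meritocratic outcome.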

The Social Network

Our world ain’t that simple. For one, as seen in Academic Musical Chairs, your place in the social network influences your chances of success. A heavy-hitting advisor, an old-boys cohort, etc., all improve your starting position when you begin the game.

To put it more operationally, let’s go back to the Columbia social music experiment. Part of a song’s success was due to quality, but the stuff that made stars was much more contingent on chance timing followed by positive feedback loops. Two of the authors from the 2006 study wrote another in 2007, echoing this claim that good timing was more important than individual influence:

models of information cascades, as well as human subjects experiments that have been designed to test the models (Anderson and Holt 1997; Kubler and Weizsacker 2004), are explicitly constructed such that there is nothing special about those individuals, either in terms of their personal characteristics or in their ability to influence others. Thus, whatever influence these individuals exert on the collective outcome is an accidental consequence of their randomly assigned position in the queue.

These articles are part of a large literature on predicting popularity, viral hits, success, and so forth. There’s The Pulse of News in Social Media: Forecasting Popularity by Bandari, Asur, & Huberman, which showed that a top predictor of newspaper shares was the source rather than the content of an article, and that a major chunk of articles that do get shared never really make it to viral status. There’s Can Cascades be Predicted? by Cheng, Adamic, Dow, Kleinberg, and Leskovec (an all-star cast if ever I saw one), which shows the remarkable reliance on timing & first impressions in predicting success, and also the reliance on social connectivity. That is, success travels faster through those who are well-connected (shocking, right?), and structural properties of the social network are important. This study by Susarla et al. also shows the importance of location in the social network in helping push those positive feedback loops, affecting the magnitude of success of YouTube video shares.

Twitter information cascade. [via]
Now, I know, social media success does not an academic career predict. The point here, instead, is to show that in each of these cases, before sharing occurs and not taking into account social media effects (that is, relying solely on the merit of the thing itself), success is predictable, but stardom is not.

Concluding, Finally

Relating it to Academic Musical Chairs, it’s not too difficult to say whether someone will end up in the 41st chair, but it’s impossible to tell whether they’ll end up in seats 1-40 until you keep an eye on how positive feedback loops are affecting their career.

In the academic world, there’s a fertile prediction market for Nobel Laureates. Social networks and Matthew Effect citation bursts are decent enough predictors, but what anyone who predicts any kind of success will tell you is that it’s much easier to predict the pool of recipients than it is to predict the winners.

Take Economics. How many working economists are there? Tens of thousands, at least. But there’s this Econometric Society which began naming Fellows in 1933, naming 877 Fellows by 2011. And guess what, 60 of 69 Nobel Laureates in Economics before 2011 were Fellows of the society. The other 817 members are or were occupants of the 41st chair.

The point is (again, sorry), academic meritocracy is a myth. Merit is a price of admission to the game, but not a predictor of success in a scarce economy of jobs and resources. Once you pass the basic merit threshold and enter the 41st chair, forces having little to do with intellectual curiosity and rigor guide eventual success (ahem). Small positive biases like gender, well-connected advisors, early citations, lucky timing, etc. feed back into increasingly larger positive biases down the line. And since there are only so many faculty jobs out there, these feedback effects create a naturally imbalanced playing field. Sometimes Einsteins do make it into the middle ring, and sometimes they stay patent clerks. Or adjuncts, I guess. Those who do make it past the 41st chair are poorly-suited to tell you why, because by and large they employed the same strategies as everybody else.

Yep, Academic Musical Chairs (Figure 1 again)

And if these six thousand words weren’t enough to convince you, I leave you with this article and this tweet. Have a nice day!

Addendum for Historians

You thought I was done?

As a historian of science, I find this situation has some interesting repercussions for my research. Perhaps most importantly, it and related concepts from Complex Systems research offer a middle-ground framework between environmental/contextual determinism (the world shapes us in fundamentally predictable ways) and individual historical agency (we possess the power to shape the world around us, making the world fundamentally unpredictable).

More concretely, it is historically fruitful to ask not simply what non-“scientific” strategies were employed by famous scientists to get ahead (see Biagioli’s Galileo, Courtier), but also what did or did not set those strategies apart from the masses of people we no longer remember. Galileo, Courtier provides a great example of what we historians can do on a larger scale: it traces Galileo’s machinations to wind up in the good graces of a wealthy patron, and how such a system affected his own research. Using recently-available data on early modern social and scholarly networks, as well as the beginnings of data on people’s activities, interests, practices, and productions, it should be possible to zoom out from Biagioli’s viewpoint and get a fairly sophisticated picture of trajectories and practices of people who weren’t Galileo.

This is all very preliminary, just publicly blogging whims, but I’d be fascinated by what a wide-angle (dare I say, macroscopic?) analysis of the 41st chair could tell us about how social and “scientific” practices shaped one another in the 16th and 17th centuries. I believe this would bear previously-impossible fruit, since a lone historian grasping ten thousand tertiary actors at once is a fool’s errand, but it’s a walk in the park for my laptop.

As this really is whim-blogging, I’d love to hear your thoughts.

Notes:

  1. Unless it’s really awful, but let’s avoid that discussion here.
  2. short of a TARDIS.

Networks Demystified 9: Bimodal Networks

What do you think, is a year long enough to wait between Networks Demystified posts? I don’t think so, which is why it’s been a year and a month. Welcome back! A recent twitter back-and-forth culminated in a request for a discussion of “bimodal networks”, and my Networks Demystified series seemed like a perfect place for just such a discussion.

What’s a bimodal network, you ask? (Go on, ask aloud at your desk. Nobody will look at you funny, this is the age of Siri!) A bimodal network is one which connects two varieties of things. It’s also called a bipartite, 2-partite, or 2-mode network. A network of authors connected to the papers they write is bimodal, as are networks of books to topics, and people to organizations they are affiliated with.

A bimodal network.

This is a bimodal network which connects people and the clubs they belong to. Alice is a member of the Network Club and the We Love History Society, Bob‘s in the Network Club and the No Adults Allowed Club, and Carol‘s in the No Adults Allowed Club.
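If you’d like to follow along in code, here’s the same toy network built in Python with networkx (just a sketch; the `bipartite` node attribute is networkx’s convention for tagging the two modes, and the later snippets in this post rebuild this same little graph so each stands alone):

```python
import networkx as nx
from networkx.algorithms import bipartite

B = nx.Graph()
B.add_nodes_from(["Alice", "Bob", "Carol"], bipartite=0)             # the people
B.add_nodes_from(["Network Club", "We Love History Society",
                  "No Adults Allowed Club"], bipartite=1)            # the clubs
B.add_edges_from([
    ("Alice", "Network Club"), ("Alice", "We Love History Society"),
    ("Bob", "Network Club"), ("Bob", "No Adults Allowed Club"),
    ("Carol", "No Adults Allowed Club"),
])

print(nx.is_bipartite(B))                                            # True
print(bipartite.is_bipartite_node_set(B, {"Alice", "Bob", "Carol"})) # True
```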

If this makes no sense, read my earlier Networks Demystified posts (the first two posts), or our Historian’s Macroscope chapter, for a primer on networks. If it does make sense, excellent! The rest of this post will hopefully take you out of your comfort zone, but remain understandable to someone who doesn’t speak math.

k-partite Networks & Projections

Bimodal networks are part of a larger class of k-partite networks. Unipartite/unimodal networks have only one type of node (remember, nodes are the stuff being connected by the edges), bipartite/bimodal networks have two types of nodes, tripartite/trimodal networks have three types of node, and so on to infinity.

The most common networks you’ll see being researched are unipartite. Who follows whom on Twitter? Who’s writing to whom in early modern Europe? What articles cite which other articles? All are examples of unipartite networks. It’s important to realize this isn’t necessarily determined by the dataset, but by the researcher doing the studying. For example, you can use the same organization affiliation dataset to create a unipartite network of who is in a club with whom, or a bipartite network of which person is affiliated with each organization.

The same dataset used to create a unipartite (left) and a bipartite (right) network.

The above illustration shows the same dataset used to create a unimodal and a bimodal network. The process of turning a pre-existing bimodal network into a unimodal network is called a bimodal projection. This process collapses one set of nodes into edges connecting the other set. In this case, because Alice and Bob are both members of the Network Club, the Network Club collapses into becoming an edge between those two people. The No Adults Allowed Club collapses into an edge between Bob and Carol. Because only Alice is a member of the We Love History Society, it does not collapse into an edge connecting any people.

You can also collapse the network in the opposite direction, connecting organizations who share people. No Adults Allowed and Network Club would share an edge (Bob), as would Network Club and We Love History Society (Alice).
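Here’s what both projections look like in code, using networkx’s bipartite projection functions on the toy club network (rebuilt so the snippet stands alone):

```python
import networkx as nx
from networkx.algorithms import bipartite

people = ["Alice", "Bob", "Carol"]
clubs = ["Network Club", "We Love History Society", "No Adults Allowed Club"]
B = nx.Graph()
B.add_nodes_from(people, bipartite=0)
B.add_nodes_from(clubs, bipartite=1)
B.add_edges_from([("Alice", "Network Club"), ("Alice", "We Love History Society"),
                  ("Bob", "Network Club"), ("Bob", "No Adults Allowed Club"),
                  ("Carol", "No Adults Allowed Club")])

# Collapse clubs into edges between the people who share them...
person_net = bipartite.projected_graph(B, people)
print(sorted(tuple(sorted(e)) for e in person_net.edges()))
# [('Alice', 'Bob'), ('Bob', 'Carol')]

# ...or collapse people into edges between the clubs that share them.
club_net = bipartite.projected_graph(B, clubs)
print(sorted(tuple(sorted(e)) for e in club_net.edges()))
# [('Network Club', 'No Adults Allowed Club'), ('Network Club', 'We Love History Society')]
```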

Why Bimodal Networks?

If the same dataset can be described with unimodal networks, which are less complex, why go to bi-, tri-, or multimodal? The answer to that is in your research question: different network representations suit different questions better.

Collaboration is a hot topic in bibliometrics. Who collaborates with whom? Why? Do your collaborators affect your future collaborations? Co-authorship networks are well-suited to some of these questions, since they directly connect collaborators who author a piece together. This is a unimodal network: I wrote The Historian’s Macroscope with Shawn Graham and Ian Milligan, so we draw an edge connecting each of us together.

Some of the more focused questions of collaboration, however, require a more nuanced view of the data. Let’s say you want to know how individual instances of collaboration affect individual research patterns going forward. In this case, you want to know more than the fact that I’ve co-authored two pieces with Shawn and Ian, and they’ve co-authored three pieces together.

For this added nuance, we can draw an edge from each of us to The Historian’s Macroscope (rather than to each other), then another set of edges to the piece we co-authored in The Programming Historian, and a last set of edges going from Shawn and Ian to the piece they wrote in the Journal of Digital Humanities. That’s three people nodes and three publication nodes.
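As a sketch of that added nuance in code (using shorthand stand-ins for the publication titles), the bimodal version keeps each publication as its own node, and a weighted projection recovers the pairwise “how many pieces together” counts whenever you want them back:

```python
import networkx as nx
from networkx.algorithms import bipartite

authors = ["Scott", "Shawn", "Ian"]
papers = ["Historian's Macroscope", "Programming Historian piece", "JDH piece"]

B = nx.Graph()
B.add_nodes_from(authors, bipartite=0)
B.add_nodes_from(papers, bipartite=1)
B.add_edges_from([
    ("Scott", "Historian's Macroscope"), ("Shawn", "Historian's Macroscope"),
    ("Ian", "Historian's Macroscope"),
    ("Scott", "Programming Historian piece"), ("Shawn", "Programming Historian piece"),
    ("Ian", "Programming Historian piece"),
    ("Shawn", "JDH piece"), ("Ian", "JDH piece"),
])

# The weighted projection counts how many papers each pair of authors shares.
coauthors = bipartite.weighted_projected_graph(B, authors)
for u, v, data in coauthors.edges(data=True):
    print(u, "&", v, "->", data["weight"], "co-authored pieces")
```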

Scott, Ian, and Shawn’s co-authorship network

Why Not Bimodal Networks?

Humanities data are often a rich array of node types: people, places, things, ideas, all connected to each other via a complex network. The trade-off is, the more complex and multimodal your dataset, the less you can reasonably do with it. This is one of the fundamental tensions between computational and traditional humanities. More categories lead to a richer understanding of the diversity of human experience, but are incredibly unhelpful when you want to count things.

Consider two pie-charts showing the religious makeup of the United States. The first chart groups together religions that fall under a similar umbrella, and the second does not. That is, the first chart groups religions like Calvinists and Lutherans together into the same pie slice (Protestants), and the second splits them into separate slices. The second, more complex chart obviously presents a richer picture of religious diversity in the United States, but it’s also significantly more difficult to read. It might trick you into thinking there are more Catholics than Protestants in the country, due to how the pie is split.

The same is true in network analysis. By creating a dataset with a hundred varieties of nodes, you lose your ability to see a bigger picture through meaningful aggregations.

Surely, you’re thinking, bimodal networks, with only two categories, should be fine! Wellllll, yes and no. You don’t bump into the same aggregation problem you do with very multimodal networks; instead, you bump into technical and mathematical issues. These issues are why I often warn non-technical researchers away from bimodal networks in research. They’re not theoretically unsound, they’re just difficult to work with properly unless you know what changes when you’re working with these complex networks.

The following section will discuss a few network metrics you may be familiar with, and what they mean for bimodal networks.

Network Metrics and Bimodality

The easiest thing to measure in a network is a node’s degree centrality. You’ll recall this is a measurement of how many edges are attached to a node, which gives a rough proxy for this concept we’ve come to call network “centrality”. It means different things depending on your data and your question: the most important or well-connected person in your social network; the point in the U.S. electrical grid which is most vulnerable to attack; the book that shares the most concepts with other books (the encyclopedia?); the city that the most traders pass through to get to their destination. These are all highly “central” in the networks they occupy.

A network with each node labeled with its degree centrality, via Wikipedia.

Degree centrality is the easiest such proxy to compute: how many connections does a node have? The idea is that nodes which are more highly connected are more central. The assumption only goes so far, and it’s easy to come up with nodes that are central but do not have a high degree, as with the network below.

The blue node is highly central, but only has a degree centrality of 3. [via]
That’s the thing with these metrics: if you know how they work, you know which networks they apply well to, and which they do not. If what you mean by “centrality” is “has more friends”, and we’re talking about a Facebook network, then degree centrality is a perfect metric for the job.

If what you mean is “an important stop for river trade”, and we’re talking about 12th century Russia, then degree centrality sucks. The below is an illustration of such a network by Pitts (1978):

Russian river trade routes. Numbers/nodes are cities, and edges are rivers between them.

Moscow is number 35, and pretty clearly the most central according to the above criteria (you’ll likely pass through it to reach other destinations). But it only has a degree centrality of four! Node 9 also has a degree centrality of four, but clearly doesn’t play as important a structural role as Moscow in this network.
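I can’t reproduce Pitts’s river network here, but a tiny made-up example shows the same mismatch: in the entirely hypothetical two-cluster network below, the “bridge” town ties for the lowest degree yet scores highest on betweenness centrality, the standard metric for the “you have to pass through it” sense of centrality the Moscow example is pointing at.

```python
import networkx as nx

# A hypothetical trade network: two clusters of towns joined by a single bridge town.
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("A", "C"), ("B", "C"), ("C", "Bridge"),   # western cluster
    ("Bridge", "D"), ("D", "E"), ("D", "F"), ("E", "F"),   # eastern cluster
])

degree = dict(G.degree())
betweenness = nx.betweenness_centrality(G)

for node in sorted(G, key=betweenness.get, reverse=True):
    print(f"{node:7s} degree={degree[node]}  betweenness={betweenness[node]:.2f}")
```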

We already see that depending on your question, your definitions, and your dataset, specific metrics will either be useful or not. Metrics may change meanings entirely from one network to the next – for example, looking at bimodal rather than unimodal networks.

Consider what degree centrality means for Alice, Bob, and Carol’s bimodal affiliation network above, where each is associated with a different set of clubs. Calculate the degree centralities in your head (hint: if you can’t, you haven’t learned what degree centrality means yet. Try again.).

Alice and Bob have a degree of 2, and Carol has a degree of 1. Is this saying anything about how central each is to the network? Not at all. Compare this to the unimodal projection, and you’ll see Bob is clearly the only structurally central actor in the network. In a bimodal network, degree centrality is nothing more than a count of affiliations with the other half of the network. It is much less likely to tell you something structurally useful than if you were looking at a unimodal network.
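You can check that arithmetic in code; a quick sketch comparing degrees in the bimodal network with degrees in its unimodal projection (same toy graph as before):

```python
import networkx as nx
from networkx.algorithms import bipartite

people = ["Alice", "Bob", "Carol"]
B = nx.Graph()
B.add_nodes_from(people, bipartite=0)
B.add_nodes_from(["Network Club", "We Love History Society", "No Adults Allowed Club"],
                 bipartite=1)
B.add_edges_from([("Alice", "Network Club"), ("Alice", "We Love History Society"),
                  ("Bob", "Network Club"), ("Bob", "No Adults Allowed Club"),
                  ("Carol", "No Adults Allowed Club")])

print({p: B.degree(p) for p in people})
# {'Alice': 2, 'Bob': 2, 'Carol': 1}  -- just counts of club memberships

person_net = bipartite.projected_graph(B, people)
print({p: person_net.degree(p) for p in people})
# {'Alice': 1, 'Bob': 2, 'Carol': 1}  -- Bob is the only person connected to both others
```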

Consider another common measurement: clustering coefficient. You’ll recall that a node’s local clustering coefficient is the extent to which its neighbors are neighbors to one another. If all my Facebook friends know each other, I have a high clustering coefficient; if none of them know each other, I have a low clustering coefficient. If all of a power plant’s neighbors directly connect to one another, it has a high clustering coefficient, and if they don’t, it has a low clustering coefficient.

Clustering coefficient, from largest to smallest. [via]
This measurement winds up being important for all sorts of reasons, but one way to interpret its meaning is as a proxy for the extent to which a node bridges diverse communities, the extent to which it is an important broker. In the 17th century, Henry Oldenburg was an important broker between disparate scholarly communities, in that he corresponded with people all across Europe, many of whom would never meet one another. The fact that they’d never meet is represented by the local clustering coefficient. It’s low, so we know his neighbors were unlikely to be neighbors of one another.

You can get creative (and network scientists often are) with what this metric means in the context of your own dataset. As long as you know how the algorithm works (taking the fraction of neighbors who are neighbors to one another), and the structural assumptions underlying your dataset, you can argue why clustering coefficient is a useful proxy for answering whatever question you’re asking.

Your argument may be pretty good, like if you say clustering coefficient is a decent (but not the best) proxy for revealing nodes that broker between disparate sections of a unimodal social network. Or your argument may be bad, like if you say clustering coefficient is a good proxy for organizational cohesion on the bimodal Alice, Bob, and Carol affiliation network above.

A careful look at the network, and a recollection of our earlier definition of clustering coefficient (the fraction of a node’s neighbors who are neighbors to one another), should reveal why this is a bad justification. Alice’s clustering coefficient is zero. As is Bob’s. As is the Network Club’s. Every node has a clustering coefficient of zero, because no node’s neighbors connect to each other. That’s just the nature of bimodal networks: they connect across modes, never within them. Alice can never connect directly with Bob, and the Network Club can never connect directly with the We Love History Society.

Bob’s neighbors (the organizations) can never be neighbors with each other, so his clustering coefficient, as we defined it, will always be zero.

In short, the simplest definition of clustering coefficient doesn’t work on bimodal networks. It’s obvious if you know how your network works and how clustering coefficient is calculated, but if you don’t think about it before you press the easy “clustering coefficient” button in Gephi, you’ll be led astray.
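
A quick sanity check on the toy affiliation network sketched earlier (hypothetical club names again) shows the collapse directly:

```python
# The standard local clustering coefficient on the toy affiliation
# network from above: every value collapses to zero, because no node's
# neighbors can ever be connected to one another across modes.
import networkx as nx

B = nx.Graph([
    ("Alice", "Club 1"), ("Alice", "Club 2"),
    ("Bob", "Club 2"), ("Bob", "Club 3"),
    ("Carol", "Club 3"),
])

print(nx.clustering(B))   # every node: 0.0, regardless of structure
```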

Gephi doesn’t know if your network is bimodal or unimodal or ∞modal. Gephi doesn’t care. Gephi just does what you tell it to. You want Gephi to tell you the degree centralities in a bimodal network? Here ya go! You want it to give you the local clustering coefficients of nodes in a bimodal network? Voila! Everything still works as though these metrics would produce meaningful, sensible results.

But they won’t be meaningful on your network. You need to be your own network’s sanity check, and not rely on software to tell you something’s a bad idea. Think about your network, think about your algorithm, and try to work through what an algorithm means in the context of your data.

Using Bimodal Networks

This doesn’t mean you should stop using bimodal networks. Most of the easy network software out there comes with algorithms made for unimodal networks, but other algorithms exist and are available for more complex networks. Occasionally, though by no means always, you can project your bimodal network to a unimodal network, as described above, and run your unimodal algorithms on that new projection.

There are a number of times when this doesn’t work well. At 2,300 words, this tutorial is already too long, so I’ll leave thinking through why as an exercise for the reader. It’s less complicated than you’d expect, if you have a pen and paper and know how fractions work.

The better solution, usually, is to use an algorithm meant for bi- or multimodal networks. Tore Opsahl has put together a good primer on the subject with regard to clustering coefficient (slightly mathy, but you can get through it with ample use of Wikipedia). He argues that projection isn’t an optimal solution, but gives a simple algorithm for finding bimodal clustering coefficients, and directions for doing so in R. Essentially the algorithm extends the reach of the clustering coefficient, asking whether a node’s neighbors two hops away can also reach one another in two hops. Put another way, I don’t want to know what clubs Bob belongs to, but rather whether Alice and Carol can also connect to one another through a club.

It’s a bit difficult to write without the use of formulae, but looking at the bimodal network and thinking about what clustering coefficient ought to mean should get you on the right track.
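
If you want a push-button-ish check while you work through Opsahl’s post, networkx ships a bipartite clustering coefficient (the Latapy et al. measure). To be clear, this is not Opsahl’s exact algorithm, and his R directions remain the guide to his version, but it rests on a similar idea of overlap among nodes two hops away, and it returns sensible non-zero values where the naive coefficient returned only zeros. A hedged sketch on the same toy affiliation network:

```python
# A hedged sketch: networkx's bipartite clustering coefficient (Latapy
# et al.'s measure, not Opsahl's exact algorithm) on the same toy
# affiliation network. Unlike the naive coefficient, it looks at overlap
# among nodes two hops away, so it no longer collapses to zero.
import networkx as nx
from networkx.algorithms import bipartite

B = nx.Graph([
    ("Alice", "Club 1"), ("Alice", "Club 2"),
    ("Bob", "Club 2"), ("Bob", "Club 3"),
    ("Carol", "Club 3"),
])

print(bipartite.clustering(B, nodes=["Alice", "Bob", "Carol"]))
```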

Bimodal networks aren’t an unsolved problem. If you search Google Scholar for bimodal, bipartite, and 2-mode networks, you’ll discover all sorts of clever methods for analyzing bimodal networks, including some great introductory texts by Borgatti and Everett.

The issue is that there aren’t easy solutions in platforms like Gephi, and that’s probably on us as Digital Humanists. I’ve found that DHers are much more likely to have bi- or multimodal datasets than most network researchers. If we want to be able to analyze them easily, we need to start developing our own Gephi plugins, or our own tools, to do so. Push-button solutions are great if you know what’s happening when you push the button.

So let this be an addendum to my previous warnings against using bimodal networks: by all means, use them, but make sure you really think about the algorithms and your data, and what any given metric might imply when run on your network specifically. There are all sorts of free resources online you can find by googling your favorite algorithm. Use them.


For more information, read up on specific algorithms, methods, interpretations, etc. for two-mode networks from Tore Opsahl.

 

The moral role of DH in a data-driven world

This is the transcript from my closing keynote address at the 2014 DH Forum in Lawrence, Kansas. It’s the result of my conflicted feelings on the recent Facebook emotional contagion controversy, and despite my earlier tweets, I conclude the study was important and valuable specifically because it was so controversial.

For the non-Digital-Humanities (DH) crowd, a quick glossary. Distant Reading is our new term for reading lots of books at once using computational assistance; Close Reading is the traditional term for reading one thing extremely minutely, exhaustively.


Networked Society

Distant reading is a powerful thing, an important force in the digital humanities. But so is close reading. Over the next 45 minutes, I’ll argue that distant reading occludes as much as it reveals, resulting in significant ethical breaches in our digital world. Network analysis and the humanities offer us a way out, a way to bridge personal stories with the big picture, and to bring a much-needed ethical eye to the modern world.

Today, by zooming in and out, from the distant to the close, I will outline how networks shape our world and our lives, and what we in this room can do to set a path going forward.

Let’s begin locally.

1. Pale Blue Dot


You are here. That’s a picture of Kansas, from four billion miles away.

In February 1990, after years of campaigning, Carl Sagan convinced NASA to turn the Voyager 1 spacecraft around to take a self-portrait of our home, the Earth. This is the most distant reading of humanity that has ever been produced.

I’d like to begin my keynote with Carl Sagan’s own words, his own distant reading of humanity. I’ll spare you my attempt at the accent:

Consider again that dot. That’s here. That’s home. That’s us. On it everyone you love, everyone you know, everyone you ever heard of, every human being who ever was, lived out their lives. The aggregate of our joy and suffering, thousands of confident religions, ideologies, and economic doctrines, every hunter and forager, every hero and coward, every creator and destroyer of civilization, every king and peasant, every young couple in love, every mother and father, hopeful child, inventor and explorer, every teacher of morals, every corrupt politician, every ‘superstar,’ every ‘supreme leader,’ every saint and sinner in the history of our species lived there – on a mote of dust suspended in a sunbeam.

What a lonely picture Carl Sagan paints. We live and die in isolation, alone in a vast cosmic darkness.

I don’t like this picture. From too great a distance, everything looks the same. Every great work of art, every bomb, every life is reduced to a single point. And our collective human experience loses all definition. If we want to know what makes us, us, we must move a little closer.

2. Black Rock City


We’ve zoomed into Black Rock City, more popularly known as Burning Man, a city of 70,000 people that exists for only a week in a Nevada desert, before disappearing back into the sand until the following year. Here life is apparent; the empty desert is juxtaposed against a network of camps and cars and avenues, forming a circle with some ritualistic structure at its center.

The success of Burning Man is contingent on collaboration and coordination; on the careful allocation of resources like water to keep its inhabitants safe; on the explicit planning of organizers to keep the city from descending into chaos year after year.

And the creation of order from chaos, the apparent reversal of entropy, is an essential feature of life. Organisms and societies function through the careful coordination and balance of their constituent parts. As these parts interact, patterns and behaviors emerge which take on a life of their own.

3. Complex Systems

Thus cells combine to form organs, organs to form animals, and animals to form flocks.

We call these networks of interactions complex systems, and we study complex systems using network analysis. Network analysis as a methodology takes as a given that nothing can be properly understood in total isolation. Carl Sagan’s pale blue dot, though poignant and beautiful, is too lonely and too distant to reveal anything of us creatures who inhabit it.

We are not alone.

4. Connecting the Dots

When looking outward rather than inward, we find we are surrounded on all sides by a hundred billion galaxies each with a hundred billion stars. And for as long as we can remember, when we’ve stared up into the night sky, we’ve connected the dots. We’ve drawn networks in the stars in order to make them feel more like us, more familiar, more comprehensible.

Nothing exists in isolation. We use networks to make sense of our place in the vast complex system that contains protons and trees and countries and galaxies. The beauty of network analysis is its ability to transcend differences in scale, such that there is a place for you and for me, and our pieces interact with other pieces to construct the society we occupy. Networks allow us to see the forest and the trees, to give definition to the microcosms and macrocosms which describe the world around us.

5. Networked World

Networks open up the world. Over the past four hundred years, the reach of the West extended across the globe, overtaking trade routes first created by eastern conquerors. From these explorations, we produced new medicines and technologies. Concomitant with this expansion came unfathomable genocide and a slave trade that spanned many continents and far too many centuries.

Despite the efforts of the Western World, it could only keep the effects of globalization to itself for so long. Roads can be traversed in either direction, and the network created by Western explorers, businesses, slave traders, and militaries eventually undermined or superseded the Western centers of power. In short order, the African slave trade in the Americas led to a rich exchange of knowledge of plants and medicines between Native Americans and Africans.

In Southern and Southeast Asia, trade routes set up by the Dutch East India Company unintentionally helped bolster economies and trade routes within Asia. Captains with the company, seeking extra profits, would illicitly trade goods between Asian cities. This created more tightly-knit internal cultural and economic networks than had existed before, and contributed to a global economy well beyond the reach of the Dutch East India Company.

In the 1960s, the U.S. military began funding what would later become the Internet, a global communication network which could transfer messages at unfathomable speeds. The infrastructure provided by this network would eventually become a tool for control and surveillance by governments around the world, as well as a distribution mechanism for fuel that could topple governments in the Middle East or spread state secrets in the United States. The very pervasiveness which makes the internet particularly effective in government surveillance is also what makes it especially dangerous to governments through sites like WikiLeaks.

In short, science and technology lay the groundwork for our networked world, and these networks can be great instruments of creation, or terrible conduits of destruction.

6. Macro Scale

So here we are, occupying this tiny mote of dust suspended in a sunbeam. In the grand scheme of things, how does any of this really matter? When we see ourselves from so great a distance, it’s as difficult to be enthralled by the Sistine Chapel as it is to be disgusted by the havoc we wreak upon our neighbors.

7. Meso Scale

But networks let us zoom in, they let us keep the global system in mind while examining the parts. Here, once again, we see Kansas, quite a bit closer than before. We see how we are situated in a national and international set of interconnections. These connections come in every form, from physical transportation to electronic communication. From this scale, wars and national borders are visible. Over time, cultural migration patterns and economic exchange become apparent. This scale shows us the networks which surround and are constructed by us.


And this is the scale which is seen by the NSA and the CIA, by Facebook and Google, by social scientists and internet engineers. Close enough to provide meaningful aggregations, but far enough that individual lives remain private and difficult to discern. This scale teaches us how epidemics spread, how minorities interact, how likely some city might be a target for the next big terrorist attack.

From here, though, it’s impossible to see the hundred hundred towns whose factories have closed down, leaving many unable to feed their families. It’s difficult to see the small but endless inequalities that leave women and minorities systematically underappreciated and exploited.

8. Micro Scale


We can zoom in further still, Lawrence, Kansas at a few hundred feet, and if we watch closely we can spot traffic patterns, couples holding hands, how the seasons affect people’s activities. This scale is better at betraying the features of communities than of societies.

But for tech companies, governments, and media distributors, it’s all-too-easy to miss the trees for the forest. When they look at the networks of our lives, they do so in aggregate. Indeed, privacy standards dictate that the individual be suppressed in favor of the community, of the statistical average that can deliver the right sort of advertisement to the right sort of customer, without ever learning the personal details of that customer.

This strange mix of individual personalization and impersonal aggregation drives quite a bit of the modern world. Carefully micro-targeted campaigning is credited with President Barack Obama’s recent presidential victories, driven by a hundred data scientists in an office in Chicago in lieu of thousands of door-to-door canvassers. Three hundred million individually crafted advertisements without ever having to look a voter in the face.

9. Target

And this mix of impersonal and individual is how Target makes its way into the wombs of its shoppers. We saw this play out a few years ago when a furious father went to complain to a Target store manager. Why, he asked the manager, is my high school daughter getting ads for maternity products in the mail? After returning home, the father spoke to his daughter and discovered she was, indeed, pregnant. How did this happen? How’d Target know?

It turns out, Target uses credit cards, phone numbers, and e-mail addresses to give every customer a unique ID. Target discovered a list of about 25 products that, when purchased in a certain sequence by a single customer, are pretty strong indicators of pregnancy. What’s more, the dates of those purchases can pretty accurately predict when the baby will be delivered. Unscented lotion, magnesium, cotton balls, and washcloths are all on that list.

When Target’s system learns one of its customers is probably pregnant, it does its best to profit from that pregnancy, sending appropriately timed coupons for diapers and bottles. This backfired, creeping out customers and invading their privacy, as with the angry father who didn’t know his daughter was pregnant. To remedy the situation, rather than ending the personalized advertising, Target began interspersing ads for unrelated products with the personalized ones, to trick customers into thinking the ads were random or general. All the while, a good portion of the coupons in the book were still targeted directly at those customers.

One Target executive told a New York Times reporter:

We found out that as long as a pregnant woman thinks she hasn’t been spied on, she’ll use the coupons. She just assumes that everyone else on her block got the same mailer for diapers and cribs. As long as we don’t spook her, it works.

The scheme did work, raising Target’s profits by billions of dollars by subtly matching their customers with coupons they were likely to use. 

10. Presidential Elections

Political campaigns have also enjoyed the successes of microtargeting. President Bush’s 2004 campaign pioneered this technique, targeting socially conservative Democratic voters in key states in order to either convince them not to vote, or to push them over the line to vote Republican. This strategy is credited with increasing the pro-Bush African American vote in Ohio from 9% in 2000 to 16% in 2004, appealing to anti-gay marriage sentiments and other conservative values.

The strategy is also celebrated for President Obama’s 2008 and especially 2012 campaigns, where his staff maintained a connected and thorough database of a large portion of American voters. They knew, for instance, that people who drink Dr. Pepper, watch the Golf Channel, drive a Land Rover, and eat at Cracker Barrel are both very likely to vote, and very unlikely to vote Democratic. These insights led to the right political ads, targeted exactly at those they were most likely to sway.

So what do these examples have to do with networks? These examples utilize, after all, the same sorts of statistical tools that have always been available to us, only with a bit more data and power to target individuals thrown in the mix.

It turns out that networks are the next logical step in the process of micronudging, the mass targeting of individuals based on their personal lives in order to influence them toward some specific action.

In 2010, a Facebook study, piggy-backing on social networks, influenced about 340,000 additional people to vote in the US mid-term elections. A team of social scientists at UCSD experimented on 61 million Facebook users in order to test the influence of social networks on political action.

A portion of American Facebook users who logged in on election day were given the ability to press an “I voted” button, which shared the fact that they voted with their friends. Facebook then presented users with pictures of their friends who voted, and it turned out that these messages increased voter turnout by about 0.4%. Further, those who saw that close friends had voted were more likely to go out and vote than those who had seen that distant friends voted. The study was framed as “voting contagion” – how well does the action of voting spread among close friends?

This large increase in voter turnout was prompted by a single message on Facebook spread among a relatively small subset of its users. Imagine that, instead of a research question, the study was driven by a particular political campaign. Or, instead, imagine that Facebook itself had some political agenda – it’s not too absurd a notion to imagine.

11. Blackout


In fact, on January 18, 2012, a great portion of the social web rallied under a single political agenda: an internet blackout. In protest of two proposed U.S. congressional laws that threatened freedom of speech on the Web, SOPA and PIPA, 115,000 websites voluntarily blacked out their homepages, replacing them with pleas to petition Congress to stop the bills.

Reddit, Wikipedia, Google, Mozilla, Twitter, Flickr, and others asked their users to petition Congress, and it worked. Over 3 million people emailed their congressional representatives directly, another million sent a pre-written message to Congress from the Electronic Frontier Foundation, a Google petition reached 4.5 million signatures, and lawmakers ultimately collected the names of over 14 million people who protested the bills. Unsurprisingly, the bills were never put up for a vote.

These techniques are increasingly being leveraged to influence consumers and voters into acting in line with whatever campaign is at hand. Social networks, and the social web especially, are becoming tools for advertisers and politicians.

12a. Facebook and Social Guessing

In 2010, Tim Tangherlini invited a few dozen computer scientists, social scientists, and humanists to a two-week intensive NEH-funded summer workshop on network analysis for the humanities. Math camp for nerds, we called it. The environment was electric with potential projects and collaborations, and I’d argue it was this workshop that really brought network analysis to the humanities in force.

During the course of the workshop, one speaker sticks out in my memory: a data scientist at Facebook. He reached the podium, like so many did during those two weeks, and described the amazing feats they were able to perform using basic linguistic and network analyses. We can accurately predict your gender and race, he claimed, regardless of whether you’ve told us. We can learn your political leanings, your sexuality, your favorite band.

Like most talks from computer scientists at the event, this one was meant to show off the power of large-scale network analysis when applied to people, and didn’t focus much on its applications. The speaker did note, however, that they used these measurements to advertise to their users effectively; electronics vendors could advertise to wealthy 20-somethings; politicians could target impoverished African Americans in key swing states.

Those were a few throw-away lines in the presentation, but the bulk of the ensuing questions revolved around them specifically. How can you do this without any sort of IRB oversight? What about the ethics of all this? The Facebook scientist’s responses were telling: we’re not doing research, we’re just running a business.

And of course, Facebook isn’t the only business doing this. The Twitter analytics dashboard allows you to see your male-to-female follower ratio, even though users are never asked their gender. Gender is guessed based on features of language and interactions, and they claim around 90% accuracy.

Google, when it targets ads towards you as a user, makes some predictions based on your search activity. Google guessed, without my telling it, that I am a 25-34 year old male who speaks English and is interested in, among other things, Air Travel, Physics, Comics, Outdoors, and Books. Pretty spot-on.

12b. Facebook and Emotional Contagion

And, as we saw with the Facebook voting study, social web services are not merely capable of learning about you; they are capable of influencing your actions. Recently, this ethical question has pushed its way into the public eye in the form of another Facebook study, this one about “emotional contagion.”

A team of researchers and Facebook data scientists collaborated to learn the extent to which emotions spread through a social network. They selectively filtered the messages seen by about 700,000 Facebook users, making sure that some users only saw emotionally positive posts by their friends, and others only saw emotionally negative posts. After some time passed, they showed that users who were presented with positive posts tended to post positive updates, and those presented with negative posts tended to post negative updates.

The study stirred up quite the controversy, and for a number of reasons. I’ll unpack a few of them:

First of all, there were worries about the ethics of consent. How could Facebook do an emotional study of 700,000 users without getting their consent, first? The EULA that everyone clicks through when signing up for Facebook only has one line saying that data may be used for research purposes, and even that line didn’t appear until several months after the study occurred.

A related issue raised was one of IRB approval: how could the editors at PNAS have approved the study given that the study took place under Facebook’s watch, without an external Institutional Review Board? Indeed, the university-affiliated researchers did not need to get approval, because the data were gathered before they ever touched the study. The counter-argument was that, well, Facebook conducts these sorts of studies all the time for the purposes of testing advertisements or interface changes, as does every other company, so what’s the problem?

A third issue discussed was one of repercussions: if the study showed that Facebook could genuinely influence people’s emotions, did anyone in the study physically harm themselves as a result of being shown a primarily negative newsfeed? Should Facebook be allowed to wield this kind of influence? Should they be required to disclose such information to their users?

The controversy spread far and wide, though I believe for the wrong reasons, which I’ll explain shortly. Social commentators decried the lack of consent, arguing that PNAS shouldn’t have published the paper without proper IRB approval. On the other side, social scientists argued the Facebook backlash was antiscience and would cause more harm than good. Both sides made valid points.

One well-known social scientist noted that the Age of Exploration, when scientists finally started exploring the further reaches of the Americas and Africa, was attacked by poets and philosophers and intellectuals as being dangerous and unethical. But, he argued, did not that exploration bring us new wonders? Miracle medicines and great insights about the world and our place in it?

I call bullshit. You’d be hard-pressed to find a period more rife with slavery and genocide and other horrible breaches of human decency than that Age of Exploration. We can’t sacrifice human decency in the name of progress. On the flip-side, though, we can’t sacrifice progress for the tiniest fears of misconduct. We must proceed with due diligence toward ethics without being crippled into inefficacy.

But this is all a red herring. The issue here isn’t whether and to what extent these activities are ethical science, but to what extent they are ethical period, and if they aren’t, what we should do about it. We can’t have one set of ethical standards for researchers, and another for businesses, but that’s what many of the arguments in recent months have boiled down to. Essentially, it was argued, Facebook does this all the time. It’s something called A/B testing: they make changes for some users and not others, and depending on how the users react, they change the site accordingly. It’s standard practice in web development.

13. An FDA/FTC for Data?

It is surprising, then, that the crux of the anger revolved around the published research. Not that Facebook shouldn’t do A/B testing, but that researchers shouldn’t be allowed to publish on it. This seems to be the exact opposite of what should be happening: if indeed every major web company practices these methods already, then scholarly research on how such practices can sway emotions or votes is exactly what we need. We must bring these practices to light, in ways the public can understand, and decide as a society whether they cross ethical boundaries. A similar discussion occurred during the early decades of the 20th century, when the FDA and FTC were formed, in part, to prevent false advertising of snake oils and foods and other products.

We are at the cusp of a new era. The mix of big data, social networks, media companies, content creators, government surveillance, corporate advertising, and ubiquitous computing is a perfect storm for intense influence both subtle and far-reaching. Algorithmic nudging has the power to sell products, win elections, topple governments, and oppress a people, depending on how it is wielded and by whom. We have seen this work from the bottom-up, in Occupy Wall Street, the revolutions in the Middle East, and the ALS Ice Bucket Challenge, and from the top-down in recent presidential campaigns, Facebook studies, and coordinated efforts to preserve net neutrality. And these have been works of non-experts: people new to this technology, scrambling in the dark to develop the methods as they are deployed. As we begin to learn more about network-based control and influence, these examples will multiply in number and audacity.

14. Surveillance

And this story leaves out one of the biggest players of all: government. When Edward Snowden leaked the details of classified NSA surveillance programs, the world was shocked at the government’s interest in and capacity for omniscience. Data scientists, on the other hand, were mostly surprised that people didn’t realize this was happening. If the technology is there, you can bet it will be used.

And so here, in the NSA’s $1.5 billion data center in Utah, are the private phone calls, parking receipts, emails, and Google searches of millions of American citizens. It stores a few exabytes of our data, over a billion gigabytes and roughly equivalent to a hundred thousand times the size of the Library of Congress. More than enough space, really.

The humanities have played some role in this complex machine. During the Cold War, the U.S. government covertly supported artists and authors to create cultural works which would spread American influence abroad and improve American sentiment at home.

Today the landscape looks a bit different. For the last few years DARPA, the research branch of the U.S. Department of Defense, has been funding research and hosting conferences in what they call “Narrative Networks.” Computer scientists, statisticians, linguists, folklorists, and literary scholars have come together to discuss how ideas spread and, possibly, how to inject certain sentiments within specific communities. It’s a bit like the science of memes, or of propaganda.

Beyond this initiative, DARPA funds have gone toward several humanities-supported projects to develop actionable plans for the U.S. military. One project, for example, creates as-complete-as-possible simulations of cultures overseas, which can model how groups might react to the dropping of bombs or the spread of propaganda. These models can be used to aid in the decision-making processes of officers making life-and-death decisions on behalf of troops, enemies, and foreign citizens. Unsurprisingly, these initiatives, as well as NSA surveillance at home, all rely heavily on network analysis.

In fact, when the news broke on the captures of Osama bin Laden and Saddam Hussein, and how they were discovered via network analysis, some of my family called me after reading the newspapers claiming “we finally understand what you do!” This wasn’t the reaction I was hoping for.

In short, the world is changing incredibly rapidly, in large part driven by the availability of data, network science and statistics, and the ever-increasing role of technology in our lives. Are these corporate, political, and grassroots efforts overstepping their bounds? We honestly don’t know. We are only beginning to have sustained, public discussions about the new role of technology in society, and the public rarely has enough access to information to make informed decisions. Meanwhile, media and web companies may be forgiven for overstepping ethical boundaries, as our culture hasn’t quite gotten around to drawing those boundaries yet.

15. The Humanities’ Place

This is where the humanities come in – not because we have some monopoly on ethics (goodness knows the way we treat our adjuncts is proof we do not) – but because we are uniquely suited to the small scale. To close reading. While what often sets the digital humanities apart from its analog counterpart is the distant reading, the macroanalysis, what sets us all apart is our unwillingness to stray too far from the source. We intersperse the distant with the close, attempting to reintroduce the individual into the aggregate.

Network analysis, not coincidentally, is particularly suited to this endeavor. While recent efforts in sociophysics have stressed the importance of the grand scale, let us not forget that network theory was built on the tiniest of pieces in psychology and sociology, used as a tool to explore individuals and their personal relationships. In the intervening years, all manner of methods have been created to bridge macro and micro, from Granovetter’s theory of weak ties to Milgram’s of Small Worlds, and the way in which people navigate the networks they find themselves in. Networks work at every scale, situating the macro against the meso against the micro.

But we find ourselves in a world that does not adequately utilize this feature of networks, and is increasingly making decisions based on convenience and money and politics and power without taking the human factor into consideration. And it’s not particularly surprising: it’s easy, in the world of exabytes of data, to lose the trees for the forest.

This is not a humanities problem. It is not a network scientist problem. It is not a question of the ethics of research, but of the ethics of everyday life. Everyone is a network scientist. From Twitter users to newscasters, the boundary between people who consume and people who are aware of and influence the global social network is blurring, and we need to deal with that. We must collaborate with industries, governments, and publics to become ethical stewards of this networked world we find ourselves in.

16. Big and Small

Your challenge, as researchers on the forefront of network analysis and the humanities, is to tie the very distant to the very close. To do the research and outreach that is needed to make companies, governments, and the public aware of how perturbations of the great mobile that is our society affect each individual piece.

We have a number of routes available to us, in this respect. The first is in basic research: the sort that got those Facebook study authors in such hot water. We need to learn and communicate the ways in which pervasive surveillance and algorithmic influence can affect people’s lives and steer societies.

A second path towards influencing an international discussion is in the development of new methods that highlight the place of the individual in the larger network. We seem to have a critical mass of humanists collaborating with or becoming computer scientists, and this presents a perfect opportunity to create algorithms which highlight a node’s uniqueness, rather than its similarity.

Another step to take is one of public engagement that extends beyond the academy, and takes place online, in newspapers or essays, in interviews, in the creation of tools or museum exhibits. The MIT Media Lab, for example, created a tool after the Snowden leaks that allows users to download their email metadata to reveal the networks they form. The tool was a fantastic example of a way to show the public exactly what “simply metadata” can reveal about a person, and its viral spread was a testament to its effectiveness. Mike Widner of Stanford called for exactly this sort of engagement from digital humanists a few years ago, and it is remarkable how little that call has been heeded.

Pedagogy is a fourth option. While people cry that the humanities are dying, every student in the country will have taken many humanities-oriented courses by the time they graduate. These courses, ostensibly, teach them about what it means to be human in our complex world. Alongside the history, the literature, the art, let’s teach what it means to be part of a global network, constantly contributing to and being affected by its shadow.

With luck, reconnecting the big with the small will hasten a national discussion of the ethical norms of big data and network analysis. This could result in new government regulatory agencies, ethical standards for media companies, or changes in the ways people interact with and behave on the social web.

17. Going Forward

When you zoom out far enough, everything looks the same. Occupy Wall Street; Ferguson Riots; the ALS Ice Bucket Challenge; the Iranian Revolution. They’re all just grassroots contagion effects across a social network. Rhetorically, presenting everything as a massive network is the same as photographing the earth from four billion miles: beautiful, sobering, and homogenizing. I challenge you to compare network visualizations of Ferguson Tweets with the ALS Ice Bucket Challenge, and see if you can make out any differences. I couldn’t. We need to zoom in to make meaning.

The challenge of network analysis in the humanities is to bring our close reading perspectives to the distant view, so media companies and governments don’t see everyone as just some statistic, some statistical blip floating on this pale blue dot.

I will end as I began, with a quote from Carl Sagan, reflecting on a time gone by but every bit as relevant for the moment we face today:

I know that science and technology are not just cornucopias pouring good deeds out into the world. Scientists not only conceived nuclear weapons; they also took political leaders by the lapels, arguing that their nation — whichever it happened to be — had to have one first. … There’s a reason people are nervous about science and technology. And so the image of the mad scientist haunts our world—from Dr. Faust to Dr. Frankenstein to Dr. Strangelove to the white-coated loonies of Saturday morning children’s television. (All this doesn’t inspire budding scientists.) But there’s no way back. We can’t just conclude that science puts too much power into the hands of morally feeble technologists or corrupt, power-crazed politicians and decide to get rid of it. Advances in medicine and agriculture have saved more lives than have been lost in all the wars in history. Advances in transportation, communication, and entertainment have transformed the world. The sword of science is double-edged. Rather, its awesome power forces on all of us, including politicians, a new responsibility — more attention to the long-term consequences of technology, a global and transgenerational perspective, an incentive to avoid easy appeals to nationalism and chauvinism. Mistakes are becoming too expensive.

Let us take Carl Sagan’s advice to heart. Amidst cries from commentators on the irrelevance of the humanities, it seems there is a large void which we are both well-suited and morally bound to fill. This is the path forward.

Thank you.


Thanks to Nickoal Eichmann and Elijah Meeks for editing & inspiration.

Networks Demystified 8: When Networks are Inappropriate

A few hundred years ago, I promised to talk about when not to use networks, or when networks are used improperly. With The Historian’s Macroscope in the works, I’ve decided to finally start answering that question, and this Networks Demystified is my first attempt at doing so. If you’re new here, this is part of an annoyingly long series (1 network basics, 2 degree, 3 power laws, 4 co-citation analysis, 5 communities and PageRank, 6 this space left intentionally blank, 7 co-citation analysis II). I’ve issued a lot of vague words of caution without doing a great job of explaining them, so here is the first substantive part of that explanation.

Networks are great. They allow you to do things like understand the role of postal routes in the circulation of knowledge in early modern Europe, or of the spread of the black death in the middle ages, or the diminishing importance of family ties in later Chinese governments. They’re versatile, useful, and pretty easy in today’s software environment. And they’re sexy, to boot. I mean, have you seen this visualization of curved lines connecting U.S. cities? I don’t even know what it’s supposed to represent, but it sure looks pretty enough to fund!

A really pretty network visualization. [via]

So what could possibly dissuade you from using a specific network, or the concept of networks in general? A lot of things, it turns out, and even a big subset of things that belong only to historians. I won’t cover all of them here, but I will mention a few big ones.

An Issue of Memory Loss

Okay, I lied about not knowing what the above network visualization represents. It turns out it’s a network of U.S. air travel pathways; if a plane goes from one city to another, an edge connects the two cities together. Pretty straightforward. And pretty useful, too, if you want to model something like the spread of an epidemic. You can easily see how someone with the newest designer virus flying into Texas might infect half-a-dozen people at the airport, who would in turn travel to other airports, and quickly infect most parts of the country with major airports. Transportation networks like this are often used by the CDC for just such a purpose, to determine what areas might need assistance/quarantine/etc.

The problem is that, although such a network might be useful for epidemiology, it’s not terribly useful for other seemingly intuitive questions. Take migration patterns: you want to know how people travel. I’ll give you another flight map that’s a bit easier to read.

Flight patterns over U.S. [via]
The first thing people tend to do when getting their hands on a new juicy network dataset is to throw it into their favorite software suite (say, Gephi) and run a bunch of analyses on it. Of those, people really like things like PageRank or Betweenness Centrality, which can give the researcher a sense of important nodes in the network based on how central they are; in this case, how many flights have to go through a particular city in order to get where they eventually intend to go.

Let’s look at Las Vegas. By anyone’s estimation it’s pretty important; well-connected to cities both near and far, and pretty central in the southwest. If I want to go from Denver to Los Angeles and a direct flight isn’t possible, Las Vegas seems to be the way to go. If we also had road networks, train networks, cell-phone networks, email networks, and so forth all overlaid on top of this one, looking at how cities interact with each other, we might be able to begin to extrapolate other information like how rumors spread, or where important trade hubs are.

Here’s the problem: network structures are deceitful. They come with a few basic assumptions that are very helpful in certain areas, but extremely dangerous in others, and they are the reason why you shouldn’t analyze a network without thinking through what you’re implying by fitting your data to the standard network model. In this case, the assumption to watch out for is what’s known as a lack of memory.

The basic networks you learn about, with nodes and edges and maybe some attributes, embed no information on how those networks are generally traversed. They have no memories. For the purposes of disease tracking, this is just fine: all epidemiologists generally need to know is whether two people might accidentally happen to find themselves in the same place at the same time, and where they individually go from there. The structure of the network is enough to track the spread of a disease.

For tracking how people move, or how information spreads, or where goods travel, structure alone is rarely enough. It turns out that Las Vegas is basically a sink, not a hub, in the world of airline travel. People who travel there tend to stay for a few days before traveling back home. The fact that it happens to sit between Colorado and California is meaningless, because people tend not to go through Vegas to get from one to the other, even though, individually, people from both states travel there with some frequency.

If the network had a memory to it, if it somehow knew not just that a lot of flights tended to go between Colorado and Vegas and between LA and Vegas, but also that the people who went to Vegas returned to where they came from, then you’d be able to see that Vegas isn’t the same sort of hub that, say, Atlanta is. Travel involving Vegas tends to be to or from, rather than through. In truth, all cities have their own unique profiles, and some may be extremely central to the network without necessarily being centrally important in questions about that network (like human travel patterns).
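
Here’s a minimal sketch of that blind spot, on a hypothetical mini flight network rather than real route data. Structurally, Vegas sits on most shortest paths, so betweenness centrality puts it at the top; nothing in the bare node-and-edge structure records that real itineraries tend to end there rather than pass through.

```python
# A hypothetical mini flight network (not real route data). Betweenness
# centrality only sees structure: Vegas sits on most shortest paths, so
# it scores highest, even though actual trips are to-or-from, not through.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Denver", "Las Vegas"), ("Los Angeles", "Las Vegas"),
    ("Phoenix", "Las Vegas"), ("Salt Lake City", "Las Vegas"),
    ("Denver", "Chicago"), ("Chicago", "Atlanta"),
    ("Atlanta", "Los Angeles"),
])

print(nx.betweenness_centrality(G))
# The memoryless structure can't distinguish a sink from a hub;
# that context has to come from the researcher.
```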

The same might be true of letter-writing networks in early modern Europe, my research area of choice. We often find people cropping up as extremely central, connecting very important figures whom we did not previously realize were connected, only to find out later that, well, it’s not exactly what we thought. This new central figure, we’ll call him John Smith, happened to be the cousin of an important statesman, the neighbor of a famous philosopher, and the one-time business partner of some lawyer. None of the three ever communicated with John about any of the others, and though he was structurally central in the network, he was no one of any historical note. Because the network has no memory, no record that information never flowed through John but only to or from him, my centrality measurements can be far off the mark.

It turns out that in letter-writing networks, people have separate spheres: they tend to write about family with family members, their governmental posts with other officials, and their philosophies with other philosophers. The overarching structure we see obscures partitions between communities that seem otherwise closely-knit. When researching with networks, especially going from the visualization to the analysis phase, it’s important to keep in mind what the algorithms you use do, and what assumptions they and your network structure embed in the evidence they provide.

Sometimes, the only network you have might be the wrong network for the job. I have a lot of peers (me included) who try to understand the intellectual landscape of early modern Europe using correspondence networks, but this is a poor proxy indeed for what we are trying to understand. Because of the spurious structural connections, like that of our illustrious John Smith, early modern networks give us a sense of unity that might not have been present at the time.

And because we’re only looking on one axis (letters), we get an inflated sense of the importance of spatial distance in early modern intellectual networks. Best friends never wrote to each other; they lived in the same city and drank in the same pubs; they could just meet on a sunny afternoon if they had anything important to say. Distant letters were important, but our networks obscure the equally important local scholarly communities.

If there’s a moral to the story, it’s that there are many networks that can connect the same group of nodes, and many questions that can be asked of any given network, but before trying to use networks to study history, you should be careful to make sure the questions match the network.

Multimodality

As humanists asking humanistic questions, our networks tend to be more complex than the sort originally explored in network science. We don’t just have people connected to people or websites to websites, we’ve got people connected to institutions to authored works to ideas to whatever else, and we want to know how they all fit together. Cue the multimodal network, or a network that includes several types of nodes (people, books, places, etc.).

I’m going to pick on Elijah Meeks’ map of the DH2011 conference, because I know he didn’t actually use it to commit the sins I’m going to discuss. His network connected participants in the conference with their institutional affiliations and the submissions they worked on together.

Part of Elijah Meeks’ map of DH2011. [via]
From a humanistic perspective, and especially from a Latourian one, these multimodal networks make a lot of sense. There are obviously complex relationships between many varieties of entities, and the promise of networks is to help us understand these relationships. The issue here, however, is that many of the most common metrics you’ll find in tools like Gephi were not created for multimodal networks, and many of the basic assumptions of network research need to be re-aligned in light of this type of use.

Let’s take the local clustering coefficient as an example. It’s a measurement often used to see if a particular node spans several communities, and it’s calculated by seeing how many of a node’s connections are connected to each other. More concretely, if all of my friends were friends with one another, I would have a high local clustering coefficient; if, however, my friends tended not to be friends with one another, and I was the only person in common between them, my local clustering coefficient would be quite low. I’d be the bridge holding the disparate communities together.

If you study the DH2011 network, the problem should become clear: local clustering coefficient is meaningless in multimodal networks. If people are connected to institutions and conference submissions, but not to one another, then everyone must have the same local clustering coefficient: zero. Nobody’s immediate connections are connected to each other, by definition in this type of network.
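
A quick check on a made-up network shaped like the DH2011 one (the names below are hypothetical, not Meeks’ actual data) makes the point:

```python
# People connect only to institutions and submissions, never to each
# other, so every local clustering coefficient is zero by construction.
import networkx as nx

G = nx.Graph([
    ("Person A", "University X"), ("Person B", "University X"),
    ("Person A", "Submission 1"), ("Person B", "Submission 1"),
    ("Person C", "University Y"), ("Person C", "Submission 1"),
])

print(nx.clustering(G))   # all zeros, for every node type
```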

Local clustering coefficient is an extreme example, but many of the common metrics break down or mean something different when multiple node-types are introduced to the network. People are coming up with ways to handle these networks, but the methods haven’t yet made their way into popular software. Yet another reason that a researcher should have a sense of how the algorithms work and how they might interact with their own data.

No Network Zone

The previous examples pointed out when networks might be used inappropriately, but there are also times when there is no appropriate use for a network. This isn’t so much based on data (most data can become a network if you torture them enough), but on research questions. Networks seem to occupy a similar place in the humanities as power laws do in computational social sciences: they tend to crop up everywhere regardless of whether they actually add anything informative. I’m not in the business of calling out poor uses of networks, but a good rule of thumb on whether you should include a network in your poster or paper is to ask yourself whether its inclusion adds anything that your narrative doesn’t.

Alternatively, it’s also not uncommon to see over-explanations of networks, especially network visualizations. A narrative description isn’t always the best tool for conveying information to an audience; just as you wouldn’t want to see a table of temperatures over time when a simple line chart would do, you don’t want a two-page description of communities in a network when a simple visualization would do.

This post is a bit less concise and purposeful than the others in this series, but stay-tuned for a revamped (and hopefully better) version to show up in The Historian’s Macroscope. In the meantime, as always, comments are welcome and loved and will confer good luck on all those who write them.

Historians, Doctors, and their Absence

[Note: sorry for the lack of polish on the post compared to others. This was hastily written before a day of international travel. Take it with however many grains of salt seem appropriate under the circumstances.]

[Author’s note two: Whoops! Never included the link to the article. Here it is.]

Every once in a while, 1 a group of exceedingly clever mathematicians and physicists decide to do something exceedingly clever on something that has nothing to do with math or physics. This particular research project has to do with the 14th Century Black Death, resulting in such claims as the small-world network effect is a completely modern phenomenon, and “most social exchange among humans before the modern era took place via face-to-face interaction.”

The article itself is really cool. And really clever! I didn’t think of it, and I’m angry at myself for not thinking of it. They look at the empirical evidence of the spread of disease in the late middle ages, and note that the pattern of disease spread looked shockingly different than patterns of disease spread today. Epidemiologists have long known that today’s patterns of disease propagation are dependent on social networks, and so it’s not a huge leap to say that if earlier diseases spread differently, their networks must have been different too.

Don’t get me wrong, that’s really fantastic. I wish more people (read: me) would make observations like this. It’s the sort of observation that allows historians to infer facts about the past with reasonable certainty given tiny amounts of evidence. The problem is, the team had neither any doctors, nor any historians of the late middle ages, and it turned an otherwise great paper into a set of questionable conclusions.

Small world networks have a formal mathematical definition, which (essentially) states that no matter how big the population of the world gets, everyone is within a few degrees of separation from you. Everyone’s an acquaintance of an acquaintance of an acquaintance of an acquaintance. This non-intuitive fact is what drives the insane speed at which modern diseases spread; today, an epidemic can travel from Australia to every state in the U.S. in a matter of days. Because of this, disease spread maps are weirdly patchy, based more on how people travel than on geographic features.
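
For a concrete sense of what that formal definition implies, here’s a minimal sketch using the classic Watts-Strogatz model in Python with networkx. It’s a modern toy, not a reconstruction of any historical network: rewiring a small fraction of a ring lattice’s edges collapses the average path length while clustering stays high, which is the small-world signature.

```python
# A minimal sketch of the small-world signature (a modern toy model,
# not a reconstruction of any historical network): a little random
# rewiring collapses path lengths while clustering stays high.
import networkx as nx

lattice = nx.connected_watts_strogatz_graph(n=1000, k=10, p=0.0, seed=42)
rewired = nx.connected_watts_strogatz_graph(n=1000, k=10, p=0.1, seed=42)

for name, G in [("ring lattice", lattice), ("small world", rewired)]:
    print(name,
          "avg path length:", round(nx.average_shortest_path_length(G), 1),
          "avg clustering:", round(nx.average_clustering(G), 2))
```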

Patchy h5n1 outbreak map.

The map of the spread of black death in the 14th century looked very different. Instead of these patches, the disease appeared to spread in very deliberate waves, at a rate of about 2km/day.

Spread of the plague, via the original article.

How to reconcile these two maps? The solution, according to the network scientists, was to create a model of people interacting and spreading diseases across various distances and types of networks. Using the models, they show that in order to generate these wave patterns of disease spread, the physical contact network cannot be small world. From this, and from the (uncited) claim that physical contact networks had to be a subset of social contact networks (entirely ignoring, say, correspondence), they conclude that the 14th century did not have small world social networks.

There’s a lot to unpack here. First, their model does not take into account the fact that people, y’know, die after they get the plague. Their model assumes the infected have enough time and impetus to travel, carrying the disease as far as they can after becoming contagious. In the discussion, the authors do acknowledge this is a stretch, but suggest that because people could, if they chose, travel 40km/day while the black death spread at only 2km/day, this alone is not sufficient to explain the waves.

I am no plague historian, nor a doctor, but a brief trip on the google suggests that black death symptoms could manifest in hours, and a swift death comes only days after. It is, I think, unlikely that people would or could be traveling great distances after symptoms began to show.

More important to note, however, are the assumptions the authors make about social ties in the middle ages. They assume a social tie must be a physical one; they assume social ties are connected with mobility; and they assume social ties are constantly maintained. This is a bit before my period of research, but only a hundred years later (still before the period the authors claim could have sustained small world networks), any early modern historian could tell you that communication was asynchronous and travel was ordered and infrequent.

Surprisingly, I actually believe the authors’ conclusions: that by the strict mathematical definition of small world networks, the “pre-modern” world might not have that feature. I do think distance and asynchronous communication prevented an entirely global 6-degree effect. That said, the assumptions they make about what a social tie is are entirely modern, which means their conclusion is essentially inevitable: historical figures did not maintain modern-style social connections, and thus metrics based on those types of connections should not apply. Taken in the social context of Europe in the late middle ages, however, I think the authors would find that the salient features of small world networks (short average path length and high clustering) exist in that world as well.

A second problem, and the reason I agree with the authors that there was not a global small world in the late 14th century, is that “global” is not an appropriate axis on which to measure “pre-modern” social networks. Today, we can reasonably say we all belong to a global population; at that point in time, before trade routes from Europe to the New World and because of other geographical and technological barriers, the world should instead be seen as a set of smaller, overlapping populations. My guess is that, for more reasonable definitions of populations for the time period, small world properties would continue to hold.

Notes:

  1. Every day? Every two days?

Networks Demystified 7: Doing Co-Citation Analyses

So this is awkward. I’ve published Networks Demystified 7: Doing Co-Citation Analyses before Networks Demystified 6: Organizing Your Twitter Lists. What depraved lunatic would do such a thing? The kind of depraved lunatic that is teaching this very subject twice in the next two weeks: deal with it, you’ll get your twitterstructions soon, internet. In the meantime, enjoy the irregular nature of the scottbot irregular.

And this is part 7 of my increasingly inaccurately named trilogy of instructional network analysis posts (1 network basics, 2 degree, 3 power laws, 4 co-citation analysis, 5 communities and PageRank, 6 this space left intentionally blank). I’m covering how to actually do citation analyses, so it’s a continuation of part 4 of the series. If you want to know what citation analysis is and why to do it, as well as a laundry list of previous examples in the humanities and social sciences, go read that post. If you want to just finally be able to analyze citations, like you’ve always dreamed, read on. 1

You’re going to need two things for these instructions: The Sci2 Tool, and either a subscription to the multi-gazillion dollar ISI Web of Science database, or this sample dataset. The Sci2 (Science of Science) Tool is a fairly buggy program (I’m allowed to say that because I’m kinda off-and-on the development team and I wrote half the user manual) that specializes in ingesting data of various formats and turning them into networks for analysis and visualization. It’s a good tool to use before you run to Gephi to make your networks pretty, and has a growing list of available plugins. If you already have the Sci2 Tool, download it again, because there’s a new version and it doesn’t auto-update. Go download it. It’s 80mb, I’ll wait.

Once you’ve registered for (not my decision, don’t blame me!) and downloaded the tool, extract the zip folder wherever you want, no install necessary. The first thing to do is increase the amount of memory available to the program, assuming you have at least a gig of RAM on your computer. We’re going to be doing some intensive analysis, so you’ll need the extra space. Edit sci2.ini; on Windows, that can be done by right-clicking on the file and selecting ‘edit’; on Mac, I dunno, elbow-click and press ‘CHANGO’? I have no idea how things work on Macs. (Sorry Mac-folk! We’ve actually documented in more detail how to increase memory – on both Windows and Mac – here)

Once you’re editing the file, you’ll see a nigh-unintelligible string of letters and numbers that ends in “-Xmx350m”. Assuming you have more than a gig of RAM on your computer, change that to “-Xmx1000m”. If you don’t have more RAM, really, you should go get some. Or use only a quarter of the dataset provided. Save the file and close the text editor.

Run Sci2.exe. We didn’t pay Microsoft to register the app, so if you’re on Windows, you may get an OHMYGODWARNING sign. Click ‘run anyway’ and safely let my team’s software hack your computer and use it to send pictures of cats to famous network scientists. (No, we’ll be good, promise). You’ll get to a screen remarkably like Figure 7. Leave it open, and if you’re at an institution that pays ISI Web of Science the big bucks, head there now. Otherwise ignore this and just download the sample dataset.

Downloading Data

I’m a historian of science, so let’s look for history of science articles. Search for ‘Isis‘ as a ‘Publication Name’ from the drop-down menu (see Figure 1) and notice that, as of 9/23/2013, there are 14,858 results (see Figure 2).

Figure 1: Searching for Isis as the name of a publication.
Figure 2: Isis periodical search results.

This is a list of every publication in the journal ISIS. Each individual record includes bibliographic material, abstract, and the list of references that are cited in the article. To get a reasonable dataset to work with, we’re going to download every article ever published in ISIS, of which there are 1,189. The rest of the records are book reviews, notes, etc. Select only the articles by clicking the checkbox next to ‘articles’ on the left side of the results screen and clicking ‘refine’.

The next step is to download all the records. This web service limits you to 500 records per download, so you’re going to need to download 3 separate files (records 1-500, 501-1000, and 1001-1189) and combine them together, which is a fairly complicated step, so pay close attention. There’s a little “Send to:” drop-down menu at the top of the search results (Figure 3). Click it, and click ‘Other File Formats’.

Figure 3: Saving Web of Science records.

At the pop-up box, check the radio box for records 1 to 500 and enter those numbers, change the record content to ‘Full Record and Cited References’, and change the file format to ‘Plain Text’ (Figure 4). Save the file somewhere you’ll be able to find it. Do this twice more, changing the numbers to 501-1000 and 1001-1189, saving these files as well.

Figure 4: Parameters for downloading Web of Science files.

You’ll end up with three files, possibly named: savedrecs.txt, savedrecs(1).txt, and savedrecs(2).txt. If you open one up (Figure 5), you’ll see that each individual article gets its own several-dozen lines, and includes information like author, title, keywords, abstract, and (importantly in our case) cited references.

Figure 5: An example ISIS record.
Figure 6: The end of an ISIS record file.

You’ll also notice (Figures 5 & 6) that the first two lines and the last line of every file are special header and footer lines. If we want to merge the three files so that the Sci2 Tool can understand them, we have to delete the footer of the first file, the header and footer of the second file, and the header of the last file, so that the new text file has only one header at the beginning, one footer at the end, and none in between. Those of you who are familiar enough with a text editor (and let’s be honest, that should be everyone reading this) can go ahead and combine the three files into one huge file with only one header and footer. If you’re feeling lazy, just download it here.
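If you’d rather script the merge than fiddle with a text editor, a minimal Python sketch might look like the following. The filenames (savedrecs.txt and friends) and the output name are my own assumptions; rename them to match whatever Web of Science gave you. It simply keeps the first file’s two header lines and the last file’s footer line, stripping them everywhere else, exactly as described above.

```python
# A minimal merge sketch. Filenames and output name are assumptions.
filenames = ["savedrecs.txt", "savedrecs(1).txt", "savedrecs(2).txt"]

merged = []
for i, name in enumerate(filenames):
    with open(name, encoding="utf-8-sig") as f:
        lines = f.read().splitlines()
    while lines and not lines[-1].strip():
        lines.pop()                      # drop any trailing blank lines
    if i > 0:
        lines = lines[2:]                # drop the two header lines of later files
    if i < len(filenames) - 1:
        lines = lines[:-1]               # drop the footer line of earlier files
    merged.extend(lines)

with open("isis_all_records.txt", "w", encoding="utf-8") as out:
    out.write("\n".join(merged) + "\n")
```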

Creating a Citation Network

Now open the Sci2 Tool (Figure 7) and go to File->Load in the drop-down menu. Find your super file with all of ISIS and open it, loading it as an ‘ISI flat format’ file (Figure 8).

Figure 7: The Sci2 Tool.
Figure 8: Loading a file as an ISI flat format file.

If all goes correctly, two new files should appear in the Data Manager, the pane on the right-hand side of the software. I’ll take a bit of a detour here to explain the Sci2 Tool.

The main ‘Console’ pane on the top-left will include a complete log of your workflow, including all the various algorithms you use, what settings and parameters you use with them, and how to cite the various ones you use. When you close the program, a copy of the text in the ‘Console’ pane will save itself as a log file in the program directory so you can go back to it later and see exactly what you did.

The ‘Scheduler’ pane on the bottom is just that: it shows you what algorithms are currently running and what already ran. You can safely ignore it.

Along with the drop-down menus at the top, the already-mentioned ‘Data Manager’ pane on the right is where you’ll be spending most of your time. Every time you load a file, it will appear in the data manager. Every time you run an algorithm on or manipulate that file in some way, a copy of it with the new changes will appear hierarchically nested below the original file. This is so that, if you make a mistake, want to use an earlier version of the file, or want to run a different set of analyses, you can still do so. You can right-click on files in the data manager to view or save them in various file formats. It is important to make sure that the appropriate file is selected in the data manager when you run an analysis, as it’s easy to accidentally run an algorithm on some other random data file.

With that in mind, once your file is loaded, make sure to select (by left-clicking) the ‘1189 Unique ISI Records’ data file in the data manager. If you right-click and view the file, it should open up in Excel (Figure 9) or whatever your default *.csv viewer is, and you’ll see that the previous text file has been converted to a spreadsheet. You can look through it to see what the data look like.

Figure 9: All of the ISIS History of Science journal articles as a csv.

When you’re done ogling at all the pretty data, close the spreadsheet and go back to the tool. Making sure the ‘1189 Unique ISI Records’ file is selected, go to ‘Data Preparation -> Extract Paper Citation Network’ in the drop-down menu.

Voilà! You now have a history of science citation network. The algorithm spits out two files: ‘Extracted paper-citation network’, which is the network file itself, and ‘Paper information’, which is a spreadsheet that includes all the nodes in the network (in this case, articles that either were published in ISIS or are cited by them). It includes a ‘localCitationCount’ column, which tells you how frequently a work is cited within the dataset (Shapin’s Leviathan and the Air Pump is cited 16 times, you’ll see if you open up the file), and a ‘globalCitationCount’ column, which is how many times ISI Web of Science thinks the article has been cited overall, not just within the dataset (Merton’s “The Matthew effect in science II” is cited 183 times overall). ‘globalCitationCount’ statistics are of course only available for the records you downloaded, so you have them for ISIS-published articles, but not for the other records.

Select ‘Extracted paper-citation network’ in the data manager. From the drop-down menu, run ‘Analysis -> Networks -> Network Analysis Toolkit (NAT)’. It’s a good idea to run this on any network you have, just to see the basic statistics of what you’re working with. The details will appear in the console window (Figure 10).

Figure 10: Network analysis toolkit output on the ISIS citation network.

There are a few things worth noting right away. The first is that there are 52,479 nodes; that means our adorable little dataset of 1,189 articles actually referenced over 50,000 other works between them. The second fact worth noting is that there are 54,915 directed edges, which is the total number of direct citations in the dataset, or roughly 46 references per article. One directed edge is a citation from a citing node (an ISIS article) to a cited node (either an ISIS article, or a book, or whatever the author decides to reference).

The last bit worth pointing out is the number of weakly connected components, and the size of the largest connected component. Each weakly connected component is a chunk of the network connected by citation chains: if articles A and B are the only articles which cite article C, if article C cites nothing else, and if A and B are uncited by any other articles, the three together make a weakly connected component. As soon as another citation link comes from or to them, that new article becomes part of the component. In our case, the biggest component is 46,971 nodes, which means that most of the nodes in the network are connected to each other. That’s important: it means history of science as represented by ISIS is relatively cohesive. There are 215 weakly connected components in all, small islands that are disconnected from the mainland.
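If you like double-checking this sort of output in code, here is a minimal sketch of the same ideas using the networkx library, with a made-up handful of papers standing in for the real dataset. Note that I draw the arrows from citing to cited here; see the warning below about Sci2’s own convention.

```python
import networkx as nx

# Made-up stand-in for a parsed citation dataset:
# each citing article maps to the list of works it references.
references = {
    "Article A": ["Article C", "Book X"],
    "Article B": ["Article C"],
    "Article D": ["Book Y"],
}

# One directed edge per citation, pointing from the citing work to the cited work.
G = nx.DiGraph()
for citing, cited_works in references.items():
    for cited in cited_works:
        G.add_edge(citing, cited)

print(G.number_of_nodes(), "nodes and", G.number_of_edges(), "directed edges")

# Weakly connected components ignore edge direction: A, B, C, and Book X
# form one island; D and Book Y form another.
components = list(nx.weakly_connected_components(G))
print(len(components), "weakly connected components;",
      "largest has", max(len(c) for c in components), "nodes")
```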

If you have Gephi installed, you can visualize the network by selecting ‘Extracted paper-citation network’ in the data manager and clicking ‘Visualization -> Networks -> Gephi’, though what you do from there is beyond the scope of these instructions. It also probably won’t make a heck of a lot of sense: there aren’t many situations where visualizing a citation network is actually useful. A citation network is what’s called a Directed Acyclic Graph, and those are generally the most visually boring graphs around (don’t cite me on this).

I do have a very important warning. You can tell it’s important because it’s bold. The Sci2 Tool was made by my advisor Katy Börner as a tool for people with research similar to her own, whose interests lie in modeling and predicting the spread of information on a network. As such, the direction of the citation edges created by the tool is opposite what many expect. They go from the cited source to the citing source, because the idea is that that’s the direction information flows, rather than from the citing source to the cited source. As a historian, I’m more interested in considering the network in the reverse direction: citing to cited, as that gives more agency to the author. More details in the footnote. 2
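If you end up pulling the network into something like networkx rather than editing the file by hand (the footnote mentions spreadsheets and regular expressions), flipping every edge is a one-liner. A tiny sketch, with made-up node names:

```python
import networkx as nx

# Two citations in Sci2's convention: edges run from the cited work
# to the article that cites it.
G = nx.DiGraph([("Cited Book", "Citing Article"),
                ("Cited Paper", "Citing Article")])

# Reverse every edge so arrows run citing -> cited instead.
G = G.reverse(copy=True)
print(list(G.edges()))   # edges now point from 'Citing Article' outward
```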

Great, now that that’s out of the way, let’s get to the more interesting analyses. Select ‘Extracted paper-citation network’ in the data manager and run ‘Data Preparation -> Extract Document Co-Citation Network’. And then wait. Have you waited for a while? Good, wait some more. This is a process. And 50,000 articles is a lot of articles. While you’re waiting, re-read Networks Demystified 4: Co-Citation Analysis to get an idea of what it is you’re doing and why you want to do it.

Okay, we’re done (assuming you increased the allotted memory to the tool like we discussed earlier). You’re now presented with the ‘Co-citation Similarity Network’ in the data manager, and you should, once again, run ‘Analysis -> Networks -> Network Analysis Toolkit (NAT)’ on it. This, too, will take some time, and you’ll see why shortly.

Figure 11: Network analysis toolkit of the ISIS co-citation network.

Notice that while there are the same number of nodes (citing or cited articles) as before, 52,479, the number of edges went from 54,915 to 2,160,275, a 40x increase. Why? Because every time two articles are cited together, they get an edge between them and, according to the ‘Average degree’ in the console pane, each article or book is cited alongside an average of 82 other works.
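Conceptually (this is my own sketch, not Sci2’s actual code), the bookkeeping behind a document co-citation network is just this: for every bibliography, every pair of works in it gets its shared count bumped by one.

```python
from collections import Counter
from itertools import combinations

# Made-up stand-in: each citing article maps to its list of references.
references = {
    "Article A": ["Shapin 1985", "Merton 1988", "Kuhn 1962"],
    "Article B": ["Shapin 1985", "Kuhn 1962"],
    "Article C": ["Merton 1988", "Kuhn 1962"],
}

# Count how often each pair of works appears in the same bibliography.
cocitations = Counter()
for cited_works in references.values():
    for pair in combinations(sorted(set(cited_works)), 2):
        cocitations[pair] += 1

for (a, b), weight in cocitations.most_common():
    print(f"{a} -- {b}: cited together {weight} time(s)")
```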

In order to make the analysis and visualization of this network easier, we’re going to significantly cut its size. Recall that document co-citation networks connect documents that are cited alongside each other, and that the weight of that connection increases the more often the two documents appear together in a bibliography. What we’re going to do here is drastically reduce the network’s size by deleting any edge between documents unless they’ve been cited together more than once. Select ‘Co-citation Similarity Network’ and run ‘Preprocessing -> Networks -> Extract Edges Above or Below Value’. Use the default settings (Figure 12).

Note that when you’re doing a scholarly citation analysis, cutting all the edges below a certain value (called ‘thresholding’) is usually a bad idea unless you know exactly how it will affect your study. We’re doing it here to make the walkthrough easier.

Figure 12: Extracting edges to reduce the size of the network.

Run ‘Analysis -> Networks -> Network Analysis Toolkit (NAT)’ on the new ‘Edges above 1 by weight’ dataset, and note that the network has been reduced from two million edges to three thousand edges, a much more manageable number for our purposes. You’ll also see that there are 51,313 isolated nodes: nodes that are no longer connected to the network because we cut so many edges in our mindless rampage. Who cares about them? Let’s delete them too! Select ‘Edges above 1 by weight’ and run ‘Preprocessing -> Networks -> Delete Isolates’, and watch as fifty thousand precious history of science citations vanish in a puff of metadata. Gone.
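In code, the two pruning steps we just ran (cut the weight-1 edges, then delete the isolates) would look roughly like this networkx sketch; it’s the same idea, not Sci2’s actual implementation, and the works listed are just the toy examples from earlier.

```python
import networkx as nx

# A small weighted co-citation network, weights as counted in the earlier sketch.
C = nx.Graph()
C.add_weighted_edges_from([
    ("Kuhn 1962", "Shapin 1985", 2),
    ("Kuhn 1962", "Merton 1988", 2),
    ("Merton 1988", "Shapin 1985", 1),
    ("Shapin 1985", "Latour 1987", 1),   # a work co-cited only once
])

# 'Extract Edges Above or Below Value': drop edges with weight 1.
weak = [(a, b) for a, b, w in C.edges(data="weight") if w <= 1]
C.remove_edges_from(weak)

# 'Delete Isolates': drop nodes left with no edges at all.
C.remove_nodes_from(list(nx.isolates(C)))

print(C.number_of_nodes(), "nodes and", C.number_of_edges(), "edges remain")
# Latour 1987 vanishes in a puff of metadata; the other three works survive.
```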

If you run the Network Analysis Toolkit on the new network, you’ll see that we’re left with a small co-citation net of 1,166 documents and 3,344 co-citations between them. The average degree tells us that each document is connected to, on average, 6 other documents, and that the largest connected component contains 476 documents.

So now’s the moment of truth, the time to visualize all your hard work. If you know how to use Gephi, and have it installed, select ‘With isolates removed’ in the data manager and run ‘Visualization -> Networks -> Gephi’. If you don’t, run ‘Visualization -> Networks -> GUESS’ instead, and give it a minute to load. You will be presented with this stunning work of art vaguely reminiscent of last night’s spaghetti and meatball dinner (Figure 13).

Figure 13: GUESS in all its glory.

Fear not! The first step to prettifying the network is to run ‘Layout -> GEM’ and then ‘Layout -> Bin Pack’. Better already, right? Then you can make edits using the graph modifier below (or using python commands in the interpreter), but the friendly folks at my lab have put together a script for you that will do that automatically. Run ‘Script -> Run Script’.

When you do, you will be presented with a godawful java applet that automatically sticks you in some horrible temp directory that you have to find your way out of. In the ‘Look In:’ navigation drop-down, find your way back to your desktop or your documents directory and then find wherever you installed the Sci2 Tool. In the Sci2 directory, there’s a folder called ‘scripts’, and in the ‘scripts’ folder, there’s a ‘GUESS’ folder, and in the ‘GUESS’ folder you will find the holy grail. Select ‘reference-co-occurrence-nw.py’ and press ‘open’.

Magic! Your document co-citation network is now all green and pretty, and you can zoom in and out using either the +/- button on the left, or using your mouse wheel and clicking and dragging on the network itself. It’ll look a bit like Figure 14.

Figure 14: Co-Citation network in GUESS.

If you feel more dangerous and cool, you can try visualizing the same network in Gephi, and it might come out something like Figure 15.

Figure 15: Gephi’s document co-citation network, with nodes sized by how frequently they’re cited in ISIS. Click to enlarge.

That’s it! You’ve co-cited a dataset. I hope you feel proud of yourself, because you should. And all without breaking a sweat. If you want (and you should want), you can save your results by right clicking the various files in the data manager you want to save. I’d recommend saving the most recent file, ‘With isolates removed’, and saving it as an NWB file, which is fairly easy to read and is the Sci2 Tool’s native format.

Stay tuned for the paradoxically earlier-numbered Networks Demystified 6, on organizing your twitter feed.

Notes:

  1. Part 4 also links to a few great tutorials on how to do this with programming, but if you don’t know the first thing about programming, start here instead.
  2. Those of you who know network basics, keep this in mind when running your analyses: PageRank, In & Out Degree, etc., may be opposite of what you expect, with the papers that cite the most sources as those with the highest In-Degree and PageRank. If this is opposite your workflow, you can fairly easily change the data by hand in a spreadsheet editor or with regular expressions.

Networks Demystified 5: Communities, PageRank, and Sampling Caveats

The fifth and sixth (coming soon…) installment of Networks Demystified will be a bit more applied than the previous bunch (1 network basics, 2 degree, 3 power laws, 4 co-citation analysis). Like many of my recent posts, this one is in response to a Twitter conversation:

If you follow a lot of people on Twitter (Michael follows over a thousand), getting a grasp of them all and organizing them can be tough. Luckily network analysis can greatly ease the task of organizing twitter follows, and this and next post will teach you how to do that using NodeXL, a plugin for Microsoft Excel that (unfortunately) only works on Windows. It’s super easy, though, so if you have access to a Windows machine with Office installed, it’s worth trying it out despite the platform limitations.

This installment will explain the concept of modularity for group detection in networks, as well as why certain metrics like centrality should be avoided when using certain kinds of datasets. I’m going to be as gentle as I can be on the math, so this tutorial is probably best-suited for those just learning network techniques, but will fall short for those hoping for more detailed or specific information.

Next installment, Networks Demystified 6, will include the actual step-by-step instructions of how to run these analyses using NodeXL. I’m posting the description first, because I strongly believe you should learn the concepts before applying the techniques. At least that’s the theory: actually I’m posting this first because Twitter is rate-limiting the download of my follower/followee network, and I’m impatient and want to post this right away.

Modularity / Community Detection

Modularity is a technique for finding which groups of nodes in a network are more similar to each other than to other groups; it lets you spot communities.

It is unfortunate (for me) that modularity is one of the more popular forms of community detection, because it also happens to be one of the methods that is more difficult to explain without lots of strange symbols, which I’m trying to avoid. First off, the modularity technique is not one simple algorithm so much as it is a conceptual framework for thinking about communities in networks. The modularity you run in Gephi is different from the modularity in NodeXL, because there’s more than one way to write the concept into an algorithm, and they’re not all exactly the same.

Randomness

But to describe modularity itself, let’s take a brief detour through random-network lane. Randomization is a popular tool among network scientists, statisticians, and late 20th century avant-garde music composers for a variety of reasons. Suppose you’re having a high-stakes coin-flip contest with your friend, who winds up beating you 68/32. Before you run away crying that your friend cheated, because a fair coin should always land 50/50, remember that the universe is a random place. The 68/32 score could’ve appeared by chance alone, so you write up a quick computer program to flip a thousand coins a hundred times each, and if, in those thousand computational coin-flip experiments, a decent number come up around 68/32, you can reasonably assume your friend didn’t cheat.
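That quick computer program really is quick; here is one way you might write it (a sketch, with the numbers pulled from the story above):

```python
import random

flips_per_contest = 100    # the length of your contest with your friend
contests = 1000            # how many simulated fair-coin contests to run
friend_score = 68          # the suspicious result

as_lopsided = 0
for _ in range(contests):
    wins = sum(random.random() < 0.5 for _ in range(flips_per_contest))
    if wins >= friend_score:
        as_lopsided += 1

# If fair coins produce 68+ wins reasonably often, your friend may just be lucky.
print(f"{as_lopsided} of {contests} fair contests were at least as lopsided as 68/32")
```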

The use of a simulated random result to see if what you’ve noticed is surprising (or, sometimes, significant) is quite common. I used it on the Irregular when reviewing Matthew Jockers’ Macroanalysis, shown in the graphic halfway down the page and reproduced here. I asked, in an extremely simplistic way, whether the trends Jockers saw over time were plausible by creating four dummy universes where randomness ruled, to see if his results could be attributable to chance alone. By comparing his data to my fake data, I concluded that some of his results were probably very accurate, and some of them might have just been chance.

This example chart compares a potential “real” underlying publication rate against several simulated potential sample datasets Jockers might have, created by multiplying the “real” dataset by some random number between 0 and 1.

Network analysts use the same sort of technique all the time. Do you want to know if it’s surprising that some actress is only six degrees away from Kevin Bacon (or anybody else in the network)? Generate a bunch of random networks with the same number of nodes (actors) and edges (connections between them if they star in a movie together), and see if, in most cases, you can get from any one actor to any other in only six hops. Odds are you could; that’s just how random networks work.
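Here is what generating those random networks might look like in code, a sketch using networkx with made-up sizes standing in for the real actor network:

```python
import networkx as nx

n_actors, n_costar_edges = 2000, 20000   # made-up sizes; use your network's real counts

separations = []
for _ in range(10):                      # a handful of random networks shows the pattern
    R = nx.gnm_random_graph(n_actors, n_costar_edges)
    # Measure path lengths on the largest connected component, where they're defined.
    giant = R.subgraph(max(nx.connected_components(R), key=len))
    separations.append(nx.average_shortest_path_length(giant))

print("average degrees of separation in the random networks:",
      round(sum(separations) / len(separations), 2))
```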

What’s surprising is that in these, as well as most other social networks, people tend to be much more tightly clustered together than expected from a random network. They form little groups and cliques. It is significantly unlikely that in such cliquish networks, where the same groups of actors tend to appear with each other constantly, everyone would still be only six degrees away from one another. It’s commonly known that social networks organize into what are called small worlds, where people tend to be much more closely connected to one another than one would expect given such tight cliques. This is the power of random networks: they help pick out the unusual.

Modularity Explained

Which brings us back to modularity. With some careful thinking, one could come up with a quick solution for finding communities in networks: find clusters of nodes that have more internal edges between them than external edges to other groups.

What network communities should look like. [via]
There’s a lurking problem with this idea, though. If you were just counting the number of in-group connections vs. out-group connections, you could come up with an optimal solution very quickly if you say the entire network is one community: voila! no outgoing connections, and lots of internal connections. If instead you say in advance that you want two communities, or you only want communities of a certain size, it mitigates the problem somewhat, but then you’re stuck with needing to set the number of communities beforehand, which is a difficult constraint if you’re not sure what that number should be.

 

The key is randomness. You want to find communities of nodes for which there are more internal links than you would expect given that the graph was random, and fewer external links than you would expect given the graph was random. Mark Newman defines modularity as: “the number of edges falling within groups minus the expected number in an equivalent network with edges placed at random.”
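For those who do want the strange symbols, Newman’s definition for an undirected network is usually written as

Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)

where A_ij is 1 if nodes i and j share an edge (and 0 otherwise), k_i is the degree of node i, m is the total number of edges in the network, and δ(c_i, c_j) is 1 when nodes i and j are assigned to the same community. The A_ij term counts the edges that actually fall within groups; the k_i k_j / 2m term is the number you would expect by chance, given each node’s degree.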

Modularity is thus a network-level measurement, and it can change based on what communities you choose in your network. For example, in the figure above, most of the edges in the network are within the Freakish Grey Blobs (hereafter FGBs), and within the FGBs the edges are very dense. In that case, we would expect the modularity to be quite high. However, imagine we drew the FGBs around different nodes in the network instead: if we made four FGBs instead of three, splitting the left group into two, we’d find that a larger fraction of the edges are falling outside of groups, thus decreasing the overall network’s modularity score.

Similarly, let’s say we made two FGBs instead of three. We merge the two groups on the right into one supergroup (group 2), and leave the group on the left (group 1) the same. What would happen to the modularity? In that case, group 2 is now less dense (defining density as the number of edges within the group compared to the total possible number of edges within it), so it looks a bit more like what we’d expect from a random network, and the overall network’s modularity score would (again) decrease slightly.

That’s modularity in a nutshell. The method of finding the appropriate groupings in a network varies, but essentially, all the algorithms keep drawing FGBs around different groups of nodes until the overall modularity score of the network is as high as possible. Find the right configuration of FGBs such that the modularity score is very high, and then label the nodes in each separate FGB as their own community. In the figure above, there are three communities, and your favorite network analysis software will label them as such.
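You’ll almost never implement that maximization yourself; most network packages ship with a modularity-maximizing community finder. A minimal sketch in Python’s networkx (not the exact algorithm Gephi or NodeXL uses, but the same idea):

```python
import networkx as nx
from networkx.algorithms import community

# A toy network: two tight cliques (Freakish Grey Blobs) joined by one bridge edge.
G = nx.Graph([(1, 2), (1, 3), (2, 3),    # first FGB
              (4, 5), (4, 6), (5, 6),    # second FGB
              (3, 4)])                   # the bridge between them

# Greedily merge groups until the modularity score stops improving.
groups = community.greedy_modularity_communities(G)
print([sorted(g) for g in groups])       # the two cliques come out as the two communities

# The modularity score for that particular set of FGBs.
print(round(community.modularity(G, groups), 3))
```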

Some metrics to avoid (with caveats)

There’s a stubbornly persistent desire, when analyzing a tasty new network dataset, to just run every algorithm in the box and see what comes up. PageRank and centrality? Sure! Clustering? Sounds great! Unfortunately, each algorithm makes certain underlying assumptions about the data, and our twitter network breaks many of those assumptions.

The most important one worth mentioning is that we’ve already sinned. Remember how we plan on calculating modularity, and remember how I defined it earlier? Nothing was mentioned about whether or not the edges were directed. Asymmetrical edges (like asymmetries between follower and followee) are not understood by the modularity algorithm we described, which assumes there is no difference between a follower, a followee, or a reciprocal connection of both. Running modularity on a directed network is, in general, a bad idea: in most networks, the direction of an edge is very important for determining community involvement. We can safely ignore this issue here, as we’re dealing with the fairly low-stakes problem of letting the computer help us organize our twitter network, but in publications or higher-stakes circumstances, this is something to avoid unless you’ve thought through the implications very carefully.

A network metric that might seem more appropriate to the forthcoming twitter dataset, PageRank, is similarly inadequate without a few key changes. As I haven’t demystified PageRank yet, here’s a short description, with the promise to expand on it later.

PageRank is Google’s algorithm for ranking websites in their search results, and it’s inspired by citation analysis, but it turns out to be useful in various other circumstances. There are two ways to explain the algorithm, both equally accurate. The first has to do with probability: what is the probability that, if someone just starts clicking links on the web at random, they’ll eventually land on your website? The higher the chance that someone clicking links at random will reach your site, the higher your PageRank.

PageRank’s other definition makes a bit more ‘on-the-ground’ sense; given a large, directed network (like websites linking to other websites), those sites that are very popular can determine another site’s score by whether or not they link to it. Say a really famous website, like BBC, links to your site; you get lots of points. If Sam’s New England Crab Shack & Duck Farm links to your site, however, you won’t get many points. Seemingly paradoxically, the more points your website has, the more points you can give to sites that you link to. Sites that get linked to a lot are considered reputable, and in turn they link to other sites and pass that reputation along. But, the clever bit is that your site can only pass a fraction of its reputation along based on how many other sites it links to, thus if your site only links to the Scottbot Irregular, the Irregular will get lots of points from it, but if it links to ten sites including the Irregular, my site would only get a tenth of the potential points.
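Here is the idea in miniature, using networkx’s built-in PageRank and a made-up handful of websites (the site names are just the examples from above):

```python
import networkx as nx

# Directed edges point from the linking site to the site being linked to.
web = nx.DiGraph([
    ("BBC", "Scottbot Irregular"),
    ("BBC", "Some Other Blog"),
    ("Sam's Crab Shack & Duck Farm", "Scottbot Irregular"),
    ("Some Other Blog", "BBC"),
    ("Scottbot Irregular", "BBC"),
])

# Reputation flows along links; sites linked to by reputable sites score higher.
ranks = nx.pagerank(web)
for site, score in sorted(ranks.items(), key=lambda pair: -pair[1]):
    print(f"{site}: {score:.3f}")
```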

How PageRank works(-ish). Those sites which have more points in turn confer more points to others. [via]
This generalizes pretty easily to all sorts of networks including, as it happens, twitter follow networks. Those who are followed by lots of people are scored highly; if one of those highly scoring individuals follows only a select few, that select few will also receive a significant increase in rank. When a user is followed by many other users with very high scores, that user is scored the highest of them all. PageRank, then, is a neat way of looking at who has the power in a twitter network. Those at the top are those who even the relatively popular find interesting and worth following.

 

Which brings us to this, the network we’re creating to organize our twitter neighborhood. The network type is right: a directed, unweighted network. The algorithm will work fine. It will tell you, for example, that you are (or are nearly) the most popular person in your twitter neighborhood. And why wouldn’t it? Most of the people in your neighborhood follow you, or follow people who follow you, so the math is inevitable.

And the problem is obvious. Your sampling strategy (the criteria you used to gather your data) inherently biases this particular network metric, and most other metrics within the same family. You’ve used what’s called snowball sampling, so-named because your sample snowballs into a huge network in relatively short order, starting from a single person: you. It’s you, then those you follow, then those they follow, and so forth. You are inevitably at the center of your snowball, and the various network centrality measurements will react accordingly.

Well, you might ask, what if you just ignore yourself when looking at the network? Nope. Because PageRank (among other algorithms) takes everyone’s score into account when calculating others’ scores; even if you close your eyes whenever your name pops up, your presence will still exert an invisible influence on the network. In the case of PageRank, because your score is so high, you’ll be conferring a much higher score to (potentially) otherwise unpopular people you happen to follow.

The short-term solution is to remove yourself from the network before you run any of your analyses. This actually still isn’t perfect, for reasons I don’t feel like getting into because the post is already too long, but it will give at least a better idea of PageRank centrality within your twitter neighborhood.
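Concretely, pulling yourself out before running anything is a single line; here is a sketch with a toy follow network, where “me” stands in for your own handle:

```python
import networkx as nx
from networkx.algorithms import community

# Toy follow network: an edge runs from a follower to the account they follow.
follows = nx.DiGraph([
    ("alice", "me"), ("me", "alice"),
    ("bob", "me"), ("me", "bob"),
    ("carol", "me"), ("carol", "alice"),
    ("alice", "bob"),
])

# Remove yourself before computing anything, so your outsized score
# doesn't leak into everyone else's.
neighborhood = follows.copy()
neighborhood.remove_node("me")

print(nx.pagerank(neighborhood))
print([sorted(g) for g in
       community.greedy_modularity_communities(neighborhood.to_undirected())])
```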

While you’re at it, you should also remove yourself before running community detection. You might be the connection that bridges two otherwise disconnected communities, and since for the purposes of this exercise you’re trying to organize people separately from your own influence on them, running modularity on the network without you in it will likely give you a better sense of your neighborhood.

Continuing

Stay tuned for the next exciting installment of Networks Demystified, wherein I’ll give step-by-step instructions on how to actually do the things I’ve described using NodeXL. If you want a head start, go ahead and download it and start playing with it.

An experiment in communal editing: Finding the history & philosophy of science.

After my last post about co-citation analysis, the author of one of the papers I was responding to, K. Brad Wray, generously commented and suggested I write up and publish the results and send them off to Erkenntnis, the same journal in which he published his results. That sounded like a great idea, so I am.

Because so many good ideas have come from comments on this blog, I’d like to try opening my first draft to communal commenting. For those who aren’t familiar with google docs (anyone? Bueller?), you can comment by selecting text and either hitting ctrl-alt-m, or going to the Insert menu and clicking ‘Comment’.

The paper is about the relationship between history of science and philosophy of science, and draws both from the blog post and from this page with additional visualizations. There is also an appendix (pdf, sorry) with details of data collection and some more interesting results for the HPS buffs. If you like history of science, philosophy of science, or citation analysis, I’d love to see your comments! If you have any general comments that don’t refer to a specific part of the text, just post them in the blog comments below.

This is a bit longer form than the usual blog, so who knows if it will inspire much interaction, but it’s worth a shot. Anyone who is signed in so I can see their name will get credit in the acknowledgements.

Finding the History and Philosophy of Science (earlier draft)  ← draft 1, thanks for your comments.

Finding the History and Philosophy of Science (current draft) ← comment here!

 

Networks Demystified 4: Co-Citation Analysis

This installment of Networks Demystified is the first one that’s actually applied. A few days ago, a discussion arose over twitter involving citation networks, and this post fills the dual purpose of continuing that discussion, and teaching a bit about basic citation analysis. If you’re looking for the very basics of networks, see part 1 and part 2. Part 3 is a warning for anyone who feels the urge to say “power law.” To recap: nodes are the dots/points in the network, edges are the lines/arrows/connections.

Understanding Sociology, Philosophy, and Literary Theory using One Easy Method™!

The growing availability of humanities and social science (HSS) citation data in databases like ISI’s Web of Science (warning: GIANT paywall. Good luck getting access if your university doesn’t subscribe.) has led to a groundswell of recent blog activity in the area, mostly by the humanists and social scientists themselves. Which is a good thing, because citation analyses of HSS will happen whether we’re involved in doing them or not, so if humanists start becoming familiar with the methods, at least we can begin getting humanistically informed citation analyses of our own data.

The size of ISI’s Web of Science paywall. You shall not pass. [via]
This is a sort of weird post. It’s about history and philosophy of science, by way of social history, by way of literary theory, by way of philosophy, by way of sociology. About this time last year, Dan Wang asked the question Is There a Canon in Economic Sociology (pdf)? Wang was searching for a set of core texts for economic sociology, using a set of 52 syllabi on the subject. It’s a reasonable first pass at the question, counting how often each article appears in the syllabi (plus some more complex measurements) as well as how often individual authors appear. Those numbers are used to support the hypothesis that there is a strongly present canon, both of authors and individual articles, in economic sociology. This is an example of an extremely simple bimodal network analysis, where there are two varieties of node: syllabi and articles. Each syllabus cites multiple articles, and several of those articles are cited by multiple syllabi. The top part of Figure 1 is what this would look like in a basic network representation.

Figure 1: basic bimodal network (top) and the resulting co-citation network (bottom). [Via Mark Newman, PNAS]
Wang was also curious how instructors felt these articles fit together, so he used a common method called co-citation analysis to answer the question. The idea is that if two articles are cited in the same syllabus, they are probably related, so they get an edge drawn between them. He further restricted his analysis so that articles had to appear together in the same class session, rather than just the same syllabus, to be considered related to each other. What results is a new network (Figure 1, bottom) of article similarity based on how frequently they appear together (how frequently they are cited by the same source). In Figure 1, you can see that because article H and article F are both cited in class session 3, they get an edge drawn between them.

A further restriction was then placed on the network, what’s called a threshold. Two articles would only get an edge drawn between them if they were cited by at least 2 different class sessions (threshold = 2). The resulting economic sociology syllabus co-citation network looked like Figure 2, pulled from the original article. From this picture, one can begin to develop a clear sense of the demarcations of subjects and areas within economic sociology, thus splitting the canon into its constituent parts.

Figure 2: Co-citation network in economic sociology. Edge thickness represents how often articles appear together in syllabi, and node size is based on a measure of centrality. [via]
In short order, Kieran Healy blogged a reply to this study, providing his own interpretations of the graph and what the various clusters represented. Remember Healy’s name, as it’s important later in the story. Two days after Healy’s blog post, Neal Caren took inspiration and created a co-citation analysis of sociology more broadly–not just economic sociology–using data he downloaded from ISI’s Web of Science (remember the giant paywall from before?). Instead of using syllabi, Caren looked at articles found in American Journal of Sociology, American Sociological Review, Social Forces and Social Problems since 2008. Web of Science gave him a list of every citation from every article in those journals, and he performed the same sort of co-citation analysis as Dan Wang did with syllabi, but at a much larger scale.

Because the dataset Caren used was so much larger, he had to enforce much stricter thresholds to keep the visualization manageable. Whereas Wang’s graph showed all articles, and connected them if they appeared together in more than 2 class sessions, Caren’s graph only connected articles which were cited together more than 4 times (threshold = 4). Further, a cited article wouldn’t even appear on the network visualization unless the article itself had been cited 8 or more times, thus reducing the amount of articles appearing on the visualization overall. The final network had 397 nodes (articles) and 1,597 edges (connections between articles). He also used a popular community detection algorithm to color the different article nodes based on which other articles they were most related to. Figure 3 shows the resulting network, and clicking on it will lead to an interactive version.

Figure 3: Neal Caren’s sociology co-citation analysis. Click the picture to see the interactive version. [via]
Caren adds a bit of contextual description in his blog post, explaining what the various clusters represent and why this visualization is a valid and useful one for the field of sociology. Notably, at the end of the post, he shares his raw data, a python script for analyzing it, and all the code for visualizing the network and making it interactive and pretty.

Jump forward a year. Kieran Healy, the one who wrote the original post inspiring Neal Caren’s, decides to try his own hand at a citation analysis using some of the code and methods that Neal Caren had posted about. Healy’s blog post, created just a few days ago, looks at the field of philosophy through the now-familiar co-citation analysis. Healy’s analysis covers 20 years of four major philosophy journals, consisting of 2,200 articles. Together, these articles make over 34,000 citations, though many of those citations point to the same already-cited works. Healy writes:

The more often any single paper is cited, the more important it’s likely to be. But the more often any two papers are cited together, the more likely they are to be part of some research question or ongoing problem or conversation topic within the discipline.

With a dataset this large, the resulting co-citation network wound up having over a million edges, or connections between co-cited articles. Healy decided to focus only on the 500 most highly-cited items in the journals (not the best practice for a co-citation analysis, but I’ll address that in a later post), so only articles that had been cited more than 10 times within the four-journal dataset are present in the network. Figure 4 shows the resulting network, which, like Figure 3, can be clicked on to reach the interactive version.

Figure 4: Kieran Healy’s co-citation analysis of four philosophy journals. Click for interactivity. [via]
The post goes on to provide a fairly thorough and interesting analysis of the various communities formed by article clusters, thus giving a description of the general philosophy landscape as it currently stands. The next day, Healy posted a follow-up delving further into citations of philosopher David Lewis, and citation frequencies by gender. Going through the most highly cited 500 or so philosophy articles by hand, Healy finds that 3.6% of the articles are written by women; 6.3% are written by David Lewis; the overwhelming majority are written by white men. It’s not lost on me that the overwhelming majority of people doing these citation analyses are also white men – someone please help change that? Healy posted a second follow-up a few days later, worth reading, on his reasoning behind which journals he used and why he looked at citations in general. He concludes “The 1990s were not the 1950s. And yet essentially none of the women from this cohort are cited in the conversation with anything close to the same frequency, despite working in comparable areas, publishing in comparable venues, and even in many cases having jobs at comparable departments.”

Mere days after Healy’s articles, Jonathan Goodwin became inspired, using the same code Healy and Caren used to perform a co-citation analysis of Literary Theory Journals. He began by concluding that these co-citation analyses were much more useful (better) than his previous attempts at direct citation analysis. About four decades of bibliometric research backs up Goodwin’s claim. Figure 5 shows Goodwin’s Literary Theory co-citation network, drawn from five journals and clickable for the interactive version, where he adds a bit of code so that the user can decide for herself what threshold to use for cutting off co-citation weights. Goodwin describes the code to create the effect on his github account. In a follow-up post, directly inspired by Healy’s, Goodwin looks at citations to women in literary theory. His results? When a feminist theory journal is included, 8 of the top 30 authors are women (27%); when that journal is not included, only 2 of the top 30 authors are women (7%).

Figure 5: Goodwin’s literary theory co-citation network. [via]

At the Speed of Blog

Just after these blog posts were published, a quick twitter exchange between Jonathan Goodwin, John Theibault, and myself (part of it readable here) spurred Goodwin, in the space of 20 minutes, to download, prepare, and visualize the co-citation data of four social history journals over 40 years. He used ISI Web of Science data, Neal Caren’s code, a bit of his own, and a few other bits of open script which he generously cites and links to. All of this is to highlight not only the phenomenal speed of research when unencumbered by the traditional research process, but also the ease with which these sorts of analyses can be accomplished. Most of this is done using some (fairly simple) programming, but there are just-as-easy solutions if you don’t know how to or don’t care to code–one specifically which I’ll mention later, the Sci2 Tool. From data to visualization can take a matter of minutes; a first pass at interpretation won’t take much longer. These are fast analyses, pretty useful for getting a general overview of some discipline, and they can provide quite a bit of material for deeper analysis.

The social history dataset is now sitting on Goodwin’s blog just waiting to be interpreted by the right expert. If you or anyone you know is familiar with social history, take a stab at figuring out what the analysis reveals, and then let us all know in a blog post of your own. I’ll be posting a little more about it soon as well, though I’m no expert in the discipline. Also, if you’re interested in citation analysis in the humanities, and you’ll be at DH2013 in Nebraska, I’ll be chairing a session all about citations in the humanities featuring an impressive lineup of scholars. Come join us and bring questions, July 17th at 10:30am.

Discovering History and Philosophy of Science

Before I wrap up, it’s worth mentioning that in one of Kieran Healy’s blog posts, he thanks Brad Wray for pointing out some corrections in the dataset. Brad Wray is one of the few people to have published a recent philosophy citation analysis in a philosophy journal. Wray is a top-notch philosopher, but his citation analysis (Philosophy of Science: What are the Key Journals in the Field?, Erkenntnis, May 2010 72:3, paywalled) falls a bit short of the mark, and as this is an instructional piece on co-citation analysis, it’s worth taking some time here to explore why.

Wray’s article’s thesis is that “there is little evidence that there is such a field as the history and philosophy of science (HPS). Rather, philosophy of science is most properly conceived of as a sub-field of philosophy.” He arrives at this conclusion via a citation analysis of three well-respected monographs, A Companion to the Philosophy of Science, The Routledge Companion to Philosophy of Science, and The Philosophy of Science edited by David Papineau, in total comprising 149 articles. Wray then counts how many times major journals are cited within each article, and shows that in most cases, the most frequently cited journals across the board are strict philosophy of science journals.

The data used to support Wray’s thesis–that there is no such field as history & philosophy of science (HPS)–is this coarse-level journal citation data. No history of science journal is listed in the top 10-15 journals cited by the three monographs, and HPS journals appear, but very infrequently. Of the evidence, Wray writes “if there were such a field as history and philosophy of science, one would expect scholars in that field to be citing publications in the leading history of science journal. But, it appears that philosophy of science is largely independent of the history of science.”

It is curious that Wray would suggest that total citations from strict philosophy of science companions can be used as evidence of whether a related but distinct field, HPS, actually exists; low citation counts from philosophy of science to history of science are, in his argument, that evidence. Instead, a more nuanced approach to this problem would be similar to the approach above: co-citation analysis. Perhaps HPS can be found by analyzing citations from journals which are ostensibly HPS, rather than analyzing three focused philosophy of science monographs. If a cluster of articles should appear in a co-citation analysis, this would be strong evidence that such a discipline currently exists among citing articles. If such a cluster does not appear, this would not be evidence of the non-existence of HPS (absence of evidence ≠ evidence of absence), but rather that the dataset or the analysis type is not suited to finding whatever HPS might be. A more thorough analysis would be required to actually disprove the existence of HPS, although one imagines it would be difficult explaining that disproof to the people who think that’s what they are.

With this in mind, I decided to perform the same sort of co-citation analysis as Dan Wang, Kieran Healy, Neal Caren, and Jonathan Goodwin, and see what could be found. I drew from 15 journals classified in ISI’s Web of Science as “History & Philosophy of Science” (British Journal for the Philosophy of Science, Journal of Philosophy, Synthese, Philosophy of Science, Studies in History and Philosophy of Science, Annals of Science, Archive for History of Exact Sciences, British Journal for the History of Science, Historical Studies in the Natural Sciences, History and Philosophy of the Life Sciences, History of Science, Isis, Journal for the History of Astronomy, Osiris, Social Studies of Science, Studies in History and Philosophy of Modern Physics, and Technology and Culture). In all I collected 12,510 articles dating from 1956, with over 300,000 citations between them. For the sake of not overheating my laptop, I decided to restrict my analysis to only those articles within the dataset; that is, if any article from any of the 15 journals cited any other article from one of the 15 journals, it was included in the analysis.

I also changed my unit of analysis from the article to the author. I didn’t want to see how often two articles were cited by some third article; I wanted to see how often two authors were cited together within some article. The resulting co-citation analysis gives author-author pairs rather than article-article pairs, like the examples above. In all, there were 7,449 authors in the dataset, and 10,775 connections between author pairs; I did not threshold edges, so some authors in the network were cited together only once, and some as many as 60 times. To perform the analysis I used the Science of Science (Sci2) Tool, no programming required (full advertisement disclosure: I’m on the development team), and some co-authors and I have written up how to do a similar analysis in the documentation tutorials.

The resulting author co-citation network, in Figure 6, reveals two fairly distinct clusters of authors. You can click the image to enlarge, but I’ve zoomed in on the two communities, one primarily history of science, the other primarily philosophy of science. At first glance, Wray’s hypothesis appears to be corroborated by the visualization; there’s not much in the way of a central cluster between the two. That said, a closer look at the middle, Figure 7, highlights a group of people who either have considered themselves within HPS, or whom others have considered HPS.

Figure 6: Author co-citation network of 15 history & philosophy of science journals. Two authors are connected if they are cited together in some article, and connected more strongly if they are cited together frequently. Click to enlarge. [via me!]
Figure 7: Author co-citation analysis of history and philosophy of science journals, zoomed in on the area between history and philosophy, with authors highlighted who might be considered HPS. Click to enlarge.

Figures 6 & 7 don’t prove anything, but they do suggest that within citation patterns, history of science and philosophy of science are clearly more cohesive than some combined HPS might be. Figure 7 suggests there might be more to the story, and what is needed in the next step to try to pin down HPS–if indeed it exists as some sort of cohesive unit–is to find articles that specifically self-identify as HPS, and through their citation and language patterns, try to see what they have in common with and what separates them from the larger community. A more thorough set of analytics, visualizations, and tables, which I’ll explain further at some point, can be found here (apologies for the pdf, this was originally made in preparation for another project).

The reason I bring up this example is not to disparage Wray, whose work did a good job of finding the key journals in philosophy of science, but to argue that we as humanists need to make sure the methods we borrow match the questions we ask. Co-citation analysis happens to be a pretty good method for exploring the question Wray asked in his thesis, but there are many more situations where it wouldn’t be particularly useful. The recent influx of blog posts on the subject, and the upcoming DH2013 session, is exciting, because it means humanists are beginning to take citation analysis seriously and are exploring the various situations in which its methods are appropriate. I look forward to seeing what comes out of the Social History data analysis, as well as future directions this research will take.