#humnets preview

Last year, Tim Tangherlini and his magical crew of folkloric imps and applied mathematicians put together a most fantastic and exhausting workshop on networks and network analysis in the humanities. We called it #humnets for short. The workshop (one of the oh-so-fantastic ODH Summer Institutes) spanned two weeks, bringing together forward-thinking humanists and Big Deals in network science and computer science. Now, a year and a half later, we’re all reuniting (bouncing back?) at UCLA to show off all the fantastic network-y humanist-y projects we’ve come up with in the interim.

As of a few weeks ago, I was all set to present my findings from analyzing and modeling the correspondence networks of early-modern scholars. Unfortunately (for me, but perhaps fortunately for everyone else), some new data came in that Changed Everything and invalidated many of my conclusions. I was faced with a dilemma; present my research as it was before I learned about the new data (after all, it was still a good example of using networks in the humanities), or retool everything to fit the new data.

Unfortunately, there was no time to do the latter, and doing the former felt icky and dishonest. In keeping with Tony Beaver’s statement at UCLA last year (“Everything you can do I can do meta,”) I ultimately decided to present a paper on precisely the problem that foiled my presentation: systematic bias. Biases need not be an issue of methodology; you can do everything right methodologically, you can design a perfect experiment, and a systematic bias can still thwart the accuracy of a project. The bias can be due to the available observable data itself (external selection bias), it may be due to how we as researchers decide to collect that data (sample selection bias), or it may be how we decide to use the data we’ve collected (confirmation bias).

There is a small-but-growing precedent of literature on the effects of bias on network analysis. I’ll refer to it briefly in my talk at UCLA, but below is a list of the best references I’ve found on the matter. Most of them deal with sample selection bias, and none of them deal with the humanities.

For those of you who’ve read this far, congratulations! Here’s a preview of my Friday presentation (I’ll post the notes on Friday).



Effects of bias on network analysis condensed bibliography:

  • Achlioptas, Dimitris, Aaron Clauset, David Kempe, and Cristopher Moore. 2005. On the bias of traceroute sampling. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, 694. ACM Press. doi:10.1145/1060590.1060693. http://dl.acm.org/citation.cfm?id=1060693.
  • ———. 2009. “On the bias of traceroute sampling.” Journal of the ACM 56 (June 1): 1-28. doi:10.1145/1538902.1538905.
  • Costenbader, Elizabeth, and Thomas W Valente. 2003. “The stability of centrality measures when networks are sampled.” Social Networks 25 (4) (October): 283-307. doi:10.1016/S0378-8733(03)00012-1.
  • Gjoka, M., M. Kurant, C. T Butts, and A. Markopoulou. 2010. Walking in Facebook: A Case Study of Unbiased Sampling of OSNs. In 2010 Proceedings IEEE INFOCOM, 1-9. IEEE, March 14. doi:10.1109/INFCOM.2010.5462078.
  • Gjoka, Minas, Maciej Kurant, Carter T Butts, and Athina Markopoulou. 2011. “Practical Recommendations on Crawling Online Social Networks.” IEEE Journal on Selected Areas in Communications 29 (9) (October): 1872-1892. doi:10.1109/JSAC.2011.111011.
  • Golub, B., and M. O. Jackson. 2010. “From the Cover: Using selection bias to explain the observed structure of Internet diffusions.” Proceedings of the National Academy of Sciences 107 (June 3): 10833-10836. doi:10.1073/pnas.1000814107.
  • Henzinger, Monika R., Allan Heydon, Michael Mitzenmacher, and Marc Najork. 2000. “On near-uniform URL sampling.” Computer Networks 33 (1-6) (June): 295-308. doi:10.1016/S1389-1286(00)00055-4.
  • Kim, P.-J., and H. Jeong. 2007. “Reliability of rank order in sampled networks.” The European Physical Journal B 55 (February 7): 109-114. doi:10.1140/epjb/e2007-00033-7.
  • Kurant, Maciej, Athina Markopoulou, and P. Thiran. 2010. On the bias of BFS (Breadth First Search). In Teletraffic Congress (ITC), 2010 22nd International, 1-8. IEEE, September 7. doi:10.1109/ITC.2010.5608727.
  • Lakhina, Anukool, John W. Byers, Mark Crovella, and Peng Xie. 2003. Sampling biases in IP topology measurements. In INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE Computer and Communications. IEEE Societies, 1:332- 341 vol.1. IEEE, April 30. doi:10.1109/INFCOM.2003.1208685.
  • Latapy, Matthieu, and Clemence Magnien. 2008. Complex Network Measurements: Estimating the Relevance of Observed Properties. In IEEE INFOCOM 2008. The 27th Conference on Computer Communications, 1660-1668. IEEE, April 13. doi:10.1109/INFOCOM.2008.227.
  • Maiya, Arun S. 2011. Sampling and Inference in Complex Networks. Chicago: University of Illinois at Chicago, April. http://arun.maiya.net/papers/asmthesis.pdf.
  • Pedarsani, Pedram, Daniel R. Figueiredo, and Matthias Grossglauser. 2008. Densification arising from sampling fixed graphs. In Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 205. ACM Press. doi:10.1145/1375457.1375481. http://portal.acm.org/citation.cfm?doid=1375457.1375481.
  • Stumpf, Michael P. H., Carsten Wiuf, and Robert M. May. 2005. “Subnets of scale-free networks are not scale-free: Sampling properties of networks.” Proceedings of the National Academy of Sciences of the United States of America 102 (12) (March 22): 4221 -4224. doi:10.1073/pnas.0501179102.
  • Stutzbach, Daniel, Reza Rejaie, Nick Duffield, Subhabrata Sen, and Walter Willinger. 2009. “On Unbiased Sampling for Unstructured Peer-to-Peer Networks.” IEEE/ACM Transactions on Networking 17 (2) (April): 377-390. doi:10.1109/TNET.2008.2001730.


Effects of selection bias on historical/sociological research condensed bibliography:

  • Berk, Richard A. 1983. “An Introduction to Sample Selection Bias in Sociological Data.” American Sociological Review 48 (3) (June 1): 386-398. doi:10.2307/2095230.
  • Bryant, Joseph M. 1994. “Evidence and Explanation in History and Sociology: Critical Reflections on Goldthorpe’s Critique of Historical Sociology.” The British Journal of Sociology 45 (1) (March 1): 3-19. doi:10.2307/591521.
  • ———. 2000. “On sources and narratives in historical social science: a realist critique of positivist and postmodernist epistemologies.” The British Journal of Sociology 51 (3) (September 1): 489-523. doi:10.1111/j.1468-4446.2000.00489.x.
  • Duncan Baretta, Silvio R., John Markoff, and Gilbert Shapiro. 1987. “The selective Transmission of Historical Documents: The Case of the Parish Cahiers of 1789.” Histoire & Mesure 2: 115-172. doi:10.3406/hism.1987.1328.
  • Goldthorpe, John H. 1991. “The Uses of History in Sociology: Reflections on Some Recent Tendencies.” The British Journal of Sociology 42 (2) (June 1): 211-230. doi:10.2307/590368.
  • ———. 1994. “The Uses of History in Sociology: A Reply.” The British Journal of Sociology 45 (1) (March 1): 55-77. doi:10.2307/591525.
  • Jensen, Richard. 1984. “Review: Ethnometrics.” Journal of American Ethnic History 3 (2) (April 1): 67-73.
  • Kosso, Peter. 2009. Philosophy of Historiography. In A Companion to the Philosophy of History and Historiography, 7-25. http://onlinelibrary.wiley.com/doi/10.1002/9781444304916.ch2/summary.
  • Kreuzer, Marcus. 2010. “Historical Knowledge and Quantitative Analysis: The Case of the Origins of Proportional Representation.” American Political Science Review 104 (02): 369-392. doi:10.1017/S0003055410000122.
  • Lang, Gladys Engel, and Kurt Lang. 1988. “Recognition and Renown: The Survival of Artistic Reputation.” American Journal of Sociology 94 (1) (July 1): 79-109.
  • Lustick, Ian S. 1996. “History, Historiography, and Political Science: Multiple Historical Records and the Problem of Selection Bias.” The American Political Science Review 90 (3): 605-618. doi:10.2307/2082612.
  • Mariampolski, Hyman, and Dana C. Hughes. 1978. “The Use of Personal Documents in Historical Sociology.” The American Sociologist 13 (2) (May 1): 104-113.
  • Murphey, Murray G. 1973. Our Knowledge of the Historical Past. Macmillan Pub Co, January.
  • Murphey, Murray G. 1994. Philosophical foundations of historical knowledge. State Univ of New York Pr, July.
  • Rubin, Ernest. 1943. “The Place of Statistical Methods in Modern Historiography.” American Journal of Economics and Sociology 2 (2) (January 1): 193-210.
  • Schatzki, Theodore. 2006. “On Studying the Past Scientifically∗.” Inquiry 49 (4) (August): 380-399. doi:10.1080/00201740600831505.
  • Wellman, Barry, and Charles Wetherell. 1996. “Social network analysis of historical communities: Some questions from the present for the past.” The History of the Family 1 (1): 97-121. doi:10.1016/S1081-602X(96)90022-6.


Welcome to the scottbot irregular. As the title suggests, this blog will (I think) remain irregular, both in content and in timing. It will probably be host to news and musings about new scientific discoveries I find sexy or alarming, discussions of exciting happenings in the world of history of science, information science, or digital humanities, and meta-discussions or critiques of scientific methodologies and computational humanities methodologies. Also my preliminary research. Also whatever else I feel like. As for the irregular timing, if you want to keep up with the blog, it’d probably be best to just subscribe via RSS.


or Yet Another New Name.

Scientonomy. n.
1. The scientific study of science and scientists, especially their interactions, creative activities, and specific objects of research.
2. A system of knowledge or beliefs about science, broadly construed.

I hope science to be taken in its broader sense, like the German’s wissenschaft, described by Wikipedia as “any study or science that involves systematic research and teaching.” This extends scientonomy to the study of most subjects taught in academia, and many that exist well outside of it. Also, it’s worth noting that “the scientific study of…” should also be taken as wissenschaft; that is, using more than just natural science methodologies to study science. This includes methods from the humanities.

Science comes from a Latin word meaning to know,” and it is knowledge and its creation and assorted practices I wish to explore. The suffix -nomy is ancient Greek, meaning law, custom, arrangement, or system of rules. They come from two different languages; deal with it. I would use episteme rather than scientia, however its connotations are too loaded, and it is too separate from its brother techne, to be useful for my purposes.

It is important that I use the root science, as this project does not seek to understand knowledge in a vacuum, or various possibilities of how knowledge and knowledge creation may work, but rather how  humanity has actually practiced scientific creation and distribution, and the associations and repercussions those practices have had (and gleaned from) the world at large.

The suffix -onomy is the natural choice for two reasons. First, scientonomy could be an unobtrusive measurement in the same way astronomy is. That is, the act of collecting and analyzing scientonomic data in a way that does not intrude on the science and scientists themselves, from a distance and using their traces, much like the way astronomers view their subjects from a distance without direct experimentation. This in no way means scientonomy would make no mark on science; indeed, much like astronomy helped pave the way for the space program and allowed us to put footprints on the moon, scientonomy has the power to greatly affect the objects of its study.

Boyack, Klavans, and others

Like scientometrics, from which springs the dreaded h-index and other terrifying ways of measuring scientific output, scientonomy wields a dangerous weapon: the power to positively or negatively affect the scientific process. Scientonomy should be cautious, but not lame; we should work to improve the rate and process of scientific discovery and dissemination, we just need to be extremely careful about it.

The second reason for –onomy is a bit sillier, and possibly somewhat self-serving. All the other good names were taken, and already mean slightly different things. We already have Science of Science (Burnet, 1774; Fichte, 1808; Ossowska & Ossowski 1935; Goldsmith 1966) which is actually pretty close to what I’m doing, but not a terribly catchy name; Scientometrics (Price, 1963) which focuses a bit too much on communicative traces at the expense of, say, philosophical accounts; Scientosophy (Goldsmith 1966; Konner, 2007) which sounds too much like science as philosophy; Scientography (Goldsmith, 1966; Vladutz, before 1977; Garfield, 1986) which deals mostly with maps; Scientopograhy (Schubert & Braun, 1996) which focuses on geographic/scientific relations; as well as meta-scientific catch-alls like STS, HPS, Sociology of Science, etc. which all have their own associated practices, all of which have a place in scientonomy. There’s also Scientology, which I won’t even bother getting into here, and (hopefully) has no place in scientonomy.