Rippling o’er the Wave

The inimitable Elijah Meeks recently shared his reasoning behind joining Google+ over Twitter or Facebook. “G+ seems to be self-consciously a network graph that happens to let one connect and keep in touch.” For those who haven’t made the jump, Google+ feels like a contact list on steroids; it lets you add contacts, organize them into different (often overlapping) “circles,” and then share materials with those circles, video chat, send messages, and so forth. By linking your pre-existing public Google profile (and rolling in old features like Buzz and Google Reader), Google has essentially socialized web presences rather than “web presencifying” the social space.

It’s a wishy-washy distinction, and not entirely true, but it feels true enough that many who never worried about social networking sites are going to Google+. This is also one of the big distinctions between Google+ and the loved-but-lost Google Wave, which was ultrasocial but also ultraprivate; Wave was not an extended Twitter, but an extended AIM or Gmail — really some Frankenstein of the two. It wasn’t about presence and extending contacts, but about chatting in private.

True to Google form, they’ve already realized the potential of sharing in this semi-public space. If Twitter weren’t so minimalistic, they too would have caught on early. Yesterday, via G+ itself, Ripples rippled through the social space. Google+ Ripples describes itself as “a way to visualize the impact of any public post.” This link [1] shows the “ripples” of Ripples itself [2], or the propagation of news of Ripples through the G+ space.

Ripples does a great job of invoking the very circles used to organize contacts. Nested circles show subsequent generations of the shared post, and in most cases nested circles also represent followers of the most recent root node. Below the graph, G+ displays the posting frequency over time and allows the user to rewind the clock, watching how the network grew. Hidden at the bottom of the page are the people with the most public reshares (“influencers”); basic network statistics (average path length, which is not terribly meaningful in this situation; the longest chain; and shares per hour); and the languages of the reshared posts. You can also read the reshares themselves on the right side of the screen, which immediately moved this from my mental “toy” box to the “research tool” box.

Make no mistake, this is a research tool. Aside from the lack of permanent links or the ability to export the data into some manipulable file [3], this is a perfect example of information propagation done well. When doing similar research on Twitter, one often requires API-programming prowess to get even this far; in G+, it’s as simple as copying a link. By making information propagation across a network something sexy, interesting, and easily accessible to everyone, Google is making diffusion processes part of the common vernacular. For this, I give Google +1.
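
To make the point concrete, here is a minimal sketch (in Python, with networkx) of the sorts of statistics Ripples reports, computed from a hypothetical export of reshare events. The (resharer, source, hour) format is invented purely for illustration; as noted below, Ripples offers no such export yet.

```python
# A sketch of the statistics Ripples displays, computed from a
# HYPOTHETICAL export of reshare events. The (resharer, reshared_from,
# hour) format is invented for illustration; no such export exists yet.
from collections import Counter
import networkx as nx

reshares = [
    ("alice", "original", 0), ("bob", "original", 1),
    ("carol", "alice", 2), ("dave", "carol", 5),
]

G = nx.DiGraph()
for resharer, source, hour in reshares:
    G.add_edge(source, resharer)

# Longest chain: the deepest run of reshares from the original post.
longest_chain = nx.dag_longest_path_length(G)

# Average path length from the original post to each resharer.
depths = nx.shortest_path_length(G, source="original")
avg_path = sum(depths.values()) / (len(depths) - 1)  # exclude the root

# Shares per hour, for the frequency plot below the graph.
per_hour = Counter(hour for _, _, hour in reshares)

# "Influencers": the accounts whose shares were themselves most reshared.
influencers = sorted(G.out_degree(), key=lambda kv: kv[1], reverse=True)

print(longest_chain, avg_path, per_hour, influencers[:3])
```

Nothing here is sophisticated, which is precisely the point: once the data are in hand, the analysis Ripples performs takes a dozen lines.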

Notes:

  1. One feature I would like is the ability to freeze Ripples links; the linked content will change as more people share the initial post, which is potentially problematic.
  2. Anything you can do I can do meta.
  3. Which will be necessary for this to go from “research tool” to “actually used research tool.”

Psychology of Science as a New Subdiscipline in Psychology

Feist, G. J. 2011. “Psychology of Science as a New Subdiscipline in Psychology.” Current Directions in Psychological Science 20 (5) (October): 330-334. doi:10.1177/0963721411418471.

Gregory Feist, a psychologist from San Jose State University, recently wrote a review of the past decade of findings in the psychology of science. He sets the discipline apart from history, philosophy, anthropology, and sociology of science, defining the psychology of science as “the scientific study of scientific thought and behavior,” both implicit and explicit, in children and adults.

Some interesting results covered in the paper:

  • “People pay more attention to evidence when it concerns plausible theories than when it concerns implausible ones.”
  • “Babies as young as 8 months of age understand probability… children as young as 4 years old can correctly draw causal inferences from bar graphs.” (I’m not sure how much I believe that last one – can grown scientists correctly draw causal inferences from bar graphs?)
  • “children, adolescents, and nonscientist adults use different criteria when evaluating explanations and evidence, they are not very good at separating belief from fact (theory and evidence), and they persistently give their beliefs as evidence for their beliefs.”
  • “one reason for the inability to distinguish theory from evidence is the belief that knowledge is certain and absolute—that is, either right or wrong”
  • “scientists use anomalies and unexpected findings as sources for new theories and experiments and that analogy is very important in generating hypotheses and interpreting results”
  • “the personality traits that make scientific interest more likely are high conscientiousness and low openness, whereas the traits that make scientific creativity more likely are high openness, low conscientiousness, and high confidence.”
  • “scientists are less prone to mental health difficulties than are other creative people,” although “It may be that science tends to weed out those with mental health problems in a way that art, music, and poetry do not.”

It is somewhat surprising that Feist doesn’t mention the older use of “psychology of science,” which largely surrounded Reichenbach’s (1938) context distinctions, as echoed by the Vienna Circle and many others. The context of discovery (rather than the context of justification) deals with the question, as Salmon (1963) put it: “When a statement has been made, … how did it come to be thought of?” Barry F. Singer (1971) wrote “Toward a Psychology of Science,” in which he quoted S.S. Stevens (1936, 1939) on the subject of a scientific psychology of science.

It is exciting that the psychology of science is picking up again as an interesting object of study, although it would have been nice for Feist to cite someone earlier than 1996 when discussing this “new subdiscipline in psychology.”

#humnets paper/review

UCLA’s Networks and Network Analysis for the Humanities this past weekend did not fail to impress. Tim Tangherlini and his mathemagical imps returned in true form, organizing an impressively realized (and predictably jam-packed) conference that left the participants excited, exhausted, enlightened, and unanimously shouting for more next year (and the year after, and the year after that, and the year after that…). I cannot thank the ODH enough for facilitating this and similar events.

Some particular highlights included Graham Sack’s exceptionally robust comparative analysis of a few hundred early English novels (watch out for him, he’s going to be a Heavy Hitter), Sarah Horowitz’s really convincing use of epistolary network analysis to show how women (specifically salonnières) held together the fabric of French high society, Rob Nelson’s further work on the always impressive Mining the Dispatch, Peter Leonard’s thoughtful and important discussion of combining text and network analysis (hint: visuals are the way to go), Jon Kleinberg’s super fantastic wonderful keynote lecture, Glen Worthey’s inspiring talk about not needing All Of It, Russell Horton’s rhymes, Song Chen’s rigorous analysis of early Asian family ties, and, well, everyone else’s everything else.

Especially interesting were the discussions, raised most pointedly by Kleinberg and Hoyt Long, about what exactly we are looking at when we construct these networks. The union of so many subjective experiences surely is not the objective truth, but neither is it a proxy of objective truth – what, then, is it? I’m inclined to say that this Big Data aggregated from individual experiences gives us a baseline subjective reality with local basins of attraction; that is, the trends we see measure how likely a certain person is to experience the world in a certain way, given where in the network/world they reside. More thought and research must go into what this Big Data means globally and locally, and that work will surely reveal very interesting results.


My talk on bias also seemed to stir some discussion. I gave up counting how many participants looked at me during their presentations and said “and of course the data is biased, but this is preliminary, and this is what I came up with and what justifies that conclusion.” Of course the issues I raised were not new; everybody in attendance was already aware of them. What I hoped my presentation would inspire, and it seems to have succeeded, was the open discussion of data biases, and of the constraints they put on conclusions, in the context of presenting those conclusions.

Some of us were joking that the issue of bias means “you don’t know, you can’t ever know what you don’t know, and you should just give up now.” This is exactly the opposite of the point. As long as we’re open and honest about what we do not or cannot know, we can make claims around those gaps, inferring and guessing where we need to, and let the reader decide whether our careful analysis and historical inferences are sufficient to support the conclusions we draw. Honesty is more important than completeness or unshakable proof; indeed, neither of those is yet possible in most of what we study.


There was some twittertalk surrounding my presentation, so here are my draft/notes for anyone interested.


#humnets preview

Last year, Tim Tangherlini and his magical crew of folkloric imps and applied mathematicians put together a most fantastic and exhausting workshop on networks and network analysis in the humanities. We called it #humnets for short. The workshop (one of the oh-so-fantastic ODH Summer Institutes) spanned two weeks, bringing together forward-thinking humanists and Big Deals in network science and computer science. Now, a year and a half later, we’re all reuniting (bouncing back?) at UCLA to show off all the fantastic network-y humanist-y projects we’ve come up with in the interim.

As of a few weeks ago, I was all set to present my findings from analyzing and modeling the correspondence networks of early-modern scholars. Unfortunately (for me, but perhaps fortunately for everyone else), some new data came in that Changed Everything and invalidated many of my conclusions. I was faced with a dilemma: present my research as it was before I learned about the new data (after all, it was still a good example of using networks in the humanities), or retool everything to fit the new data.

Unfortunately, there was no time to do the latter, and doing the former felt icky and dishonest. In keeping with Tony Beaver’s statement at UCLA last year (“Everything you can do I can do meta”), I ultimately decided to present a paper on precisely the problem that foiled my presentation: systematic bias. Biases need not be an issue of methodology; you can do everything right methodologically, you can design a perfect experiment, and a systematic bias can still thwart the accuracy of a project. The bias can come from the available observable data itself (external selection bias), from how we as researchers decide to collect that data (sample selection bias), or from how we decide to use the data we’ve collected (confirmation bias).

There is a small-but-growing body of literature on the effects of bias on network analysis. I’ll refer to it briefly in my talk at UCLA, but below is a list of the best references I’ve found on the matter. Most of them deal with sample selection bias, and none of them deal with the humanities.
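
To give a feel for what that literature demonstrates, here is a toy sketch (my own, in Python with networkx, not code from any of the papers) of the breadth-first-search bias Kurant et al. discuss: a partial BFS crawl reaches hubs early, so the average degree estimated from the crawl runs systematically high. The graph model and sample sizes are arbitrary choices for illustration.

```python
# Toy illustration of sample selection bias in network analysis: a
# breadth-first crawl over-samples high-degree nodes, inflating the
# estimated average degree. Graph model and sizes are arbitrary.
from collections import deque
import random

import networkx as nx

random.seed(42)
G = nx.barabasi_albert_graph(2000, 2)  # a scale-free-ish test network

def bfs_sample(G, start, n_nodes):
    """Collect the first n_nodes reached by a breadth-first crawl."""
    seen, queue = {start}, deque([start])
    while queue and len(seen) < n_nodes:
        for nbr in G.neighbors(queue.popleft()):
            if nbr not in seen and len(seen) < n_nodes:
                seen.add(nbr)
                queue.append(nbr)
    return seen

sample = bfs_sample(G, random.choice(list(G)), 200)

true_avg = sum(d for _, d in G.degree()) / G.number_of_nodes()
sampled_avg = sum(d for _, d in G.degree(sample)) / len(sample)

print(f"true average degree: {true_avg:.2f}")
print(f"BFS-crawl estimate:  {sampled_avg:.2f}")  # systematically higher
```

Swapping a different sampling strategy into bfs_sample makes it easy to see how strongly the resulting estimates depend on how the data were gathered, which is the whole problem in a nutshell.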

For those of you who’ve read this far, congratulations! Here’s a preview of my Friday presentation (I’ll post the notes on Friday).


——–

Effects of bias on network analysis (condensed bibliography):

  • Achlioptas, Dimitris, Aaron Clauset, David Kempe, and Cristopher Moore. 2005. On the bias of traceroute sampling. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, 694. ACM Press. doi:10.1145/1060590.1060693.
  • ———. 2009. “On the bias of traceroute sampling.” Journal of the ACM 56 (June 1): 1-28. doi:10.1145/1538902.1538905.
  • Costenbader, Elizabeth, and Thomas W Valente. 2003. “The stability of centrality measures when networks are sampled.” Social Networks 25 (4) (October): 283-307. doi:10.1016/S0378-8733(03)00012-1.
  • Gjoka, M., M. Kurant, C. T. Butts, and A. Markopoulou. 2010. Walking in Facebook: A Case Study of Unbiased Sampling of OSNs. In 2010 Proceedings IEEE INFOCOM, 1-9. IEEE, March 14. doi:10.1109/INFCOM.2010.5462078.
  • Gjoka, Minas, Maciej Kurant, Carter T Butts, and Athina Markopoulou. 2011. “Practical Recommendations on Crawling Online Social Networks.” IEEE Journal on Selected Areas in Communications 29 (9) (October): 1872-1892. doi:10.1109/JSAC.2011.111011.
  • Golub, B., and M. O. Jackson. 2010. “Using selection bias to explain the observed structure of Internet diffusions.” Proceedings of the National Academy of Sciences 107 (June 3): 10833-10836. doi:10.1073/pnas.1000814107.
  • Henzinger, Monika R., Allan Heydon, Michael Mitzenmacher, and Marc Najork. 2000. “On near-uniform URL sampling.” Computer Networks 33 (1-6) (June): 295-308. doi:10.1016/S1389-1286(00)00055-4.
  • Kim, P.-J., and H. Jeong. 2007. “Reliability of rank order in sampled networks.” The European Physical Journal B 55 (February 7): 109-114. doi:10.1140/epjb/e2007-00033-7.
  • Kurant, Maciej, Athina Markopoulou, and P. Thiran. 2010. On the bias of BFS (Breadth First Search). In Teletraffic Congress (ITC), 2010 22nd International, 1-8. IEEE, September 7. doi:10.1109/ITC.2010.5608727.
  • Lakhina, Anukool, John W. Byers, Mark Crovella, and Peng Xie. 2003. Sampling biases in IP topology measurements. In INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE Computer and Communications Societies, 1:332-341. IEEE, April 30. doi:10.1109/INFCOM.2003.1208685.
  • Latapy, Matthieu, and Clemence Magnien. 2008. Complex Network Measurements: Estimating the Relevance of Observed Properties. In IEEE INFOCOM 2008. The 27th Conference on Computer Communications, 1660-1668. IEEE, April 13. doi:10.1109/INFOCOM.2008.227.
  • Maiya, Arun S. 2011. Sampling and Inference in Complex Networks. PhD thesis, University of Illinois at Chicago, April. http://arun.maiya.net/papers/asmthesis.pdf.
  • Pedarsani, Pedram, Daniel R. Figueiredo, and Matthias Grossglauser. 2008. Densification arising from sampling fixed graphs. In Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 205. ACM Press. doi:10.1145/1375457.1375481.
  • Stumpf, Michael P. H., Carsten Wiuf, and Robert M. May. 2005. “Subnets of scale-free networks are not scale-free: Sampling properties of networks.” Proceedings of the National Academy of Sciences of the United States of America 102 (12) (March 22): 4221-4224. doi:10.1073/pnas.0501179102.
  • Stutzbach, Daniel, Reza Rejaie, Nick Duffield, Subhabrata Sen, and Walter Willinger. 2009. “On Unbiased Sampling for Unstructured Peer-to-Peer Networks.” IEEE/ACM Transactions on Networking 17 (2) (April): 377-390. doi:10.1109/TNET.2008.2001730.

——–

Effects of selection bias on historical/sociological research (condensed bibliography):

  • Berk, Richard A. 1983. “An Introduction to Sample Selection Bias in Sociological Data.” American Sociological Review 48 (3) (June 1): 386-398. doi:10.2307/2095230.
  • Bryant, Joseph M. 1994. “Evidence and Explanation in History and Sociology: Critical Reflections on Goldthorpe’s Critique of Historical Sociology.” The British Journal of Sociology 45 (1) (March 1): 3-19. doi:10.2307/591521.
  • ———. 2000. “On sources and narratives in historical social science: a realist critique of positivist and postmodernist epistemologies.” The British Journal of Sociology 51 (3) (September 1): 489-523. doi:10.1111/j.1468-4446.2000.00489.x.
  • Duncan Baretta, Silvio R., John Markoff, and Gilbert Shapiro. 1987. “The Selective Transmission of Historical Documents: The Case of the Parish Cahiers of 1789.” Histoire & Mesure 2: 115-172. doi:10.3406/hism.1987.1328.
  • Goldthorpe, John H. 1991. “The Uses of History in Sociology: Reflections on Some Recent Tendencies.” The British Journal of Sociology 42 (2) (June 1): 211-230. doi:10.2307/590368.
  • ———. 1994. “The Uses of History in Sociology: A Reply.” The British Journal of Sociology 45 (1) (March 1): 55-77. doi:10.2307/591525.
  • Jensen, Richard. 1984. “Review: Ethnometrics.” Journal of American Ethnic History 3 (2) (April 1): 67-73.
  • Kosso, Peter. 2009. Philosophy of Historiography. In A Companion to the Philosophy of History and Historiography, 7-25. http://onlinelibrary.wiley.com/doi/10.1002/9781444304916.ch2/summary.
  • Kreuzer, Marcus. 2010. “Historical Knowledge and Quantitative Analysis: The Case of the Origins of Proportional Representation.” American Political Science Review 104 (02): 369-392. doi:10.1017/S0003055410000122.
  • Lang, Gladys Engel, and Kurt Lang. 1988. “Recognition and Renown: The Survival of Artistic Reputation.” American Journal of Sociology 94 (1) (July 1): 79-109.
  • Lustick, Ian S. 1996. “History, Historiography, and Political Science: Multiple Historical Records and the Problem of Selection Bias.” The American Political Science Review 90 (3): 605-618. doi:10.2307/2082612.
  • Mariampolski, Hyman, and Dana C. Hughes. 1978. “The Use of Personal Documents in Historical Sociology.” The American Sociologist 13 (2) (May 1): 104-113.
  • Murphey, Murray G. 1973. Our Knowledge of the Historical Past. Macmillan, January.
  • ———. 1994. Philosophical Foundations of Historical Knowledge. State University of New York Press, July.
  • Rubin, Ernest. 1943. “The Place of Statistical Methods in Modern Historiography.” American Journal of Economics and Sociology 2 (2) (January 1): 193-210.
  • Schatzki, Theodore. 2006. “On Studying the Past Scientifically.” Inquiry 49 (4) (August): 380-399. doi:10.1080/00201740600831505.
  • Wellman, Barry, and Charles Wetherell. 1996. “Social network analysis of historical communities: Some questions from the present for the past.” The History of the Family 1 (1): 97-121. doi:10.1016/S1081-602X(96)90022-6.

Welcome!

Welcome to the scottbot irregular. As the title suggests, this blog will (I think) remain irregular, both in content and in timing. It will probably be host to news and musings about new scientific discoveries I find sexy or alarming, discussions of exciting happenings in the world of history of science, information science, or digital humanities, and meta-discussions or critiques of scientific methodologies and computational humanities methodologies. Also my preliminary research. Also whatever else I feel like. As for the irregular timing, if you want to keep up with the blog, it’d probably be best to just subscribe via RSS.

Scientonomy

or Yet Another New Name.

Scientonomy. n.
1. The scientific study of science and scientists, especially their interactions, creative activities, and specific objects of research.
2. A system of knowledge or beliefs about science, broadly construed.

I intend science to be taken in its broader sense, like the German Wissenschaft, described by Wikipedia as “any study or science that involves systematic research and teaching.” This extends scientonomy to the study of most subjects taught in academia, and many that exist well outside of it. It is also worth noting that “the scientific study of…” should be taken in the spirit of Wissenschaft as well; that is, using more than just natural-science methodologies to study science. This includes methods from the humanities.

Science comes from a Latin word meaning “to know,” and it is knowledge and its creation and assorted practices I wish to explore. The suffix -onomy traces to ancient Greek, meaning law, custom, arrangement, or system of rules. The two roots come from different languages; deal with it. I would use episteme rather than scientia; however, its connotations are too loaded, and it is too separate from its brother techne, to be useful for my purposes.

It is important that I use the root science, as this project does not seek to understand knowledge in a vacuum, or the various possibilities of how knowledge and knowledge creation may work, but rather how humanity has actually practiced scientific creation and distribution, and the associations and repercussions those practices have had on (and gleaned from) the world at large.

The suffix -onomy is the natural choice for two reasons. First, scientonomy could be an unobtrusive measurement in the same way astronomy is. That is, scientonomic data can be collected and analyzed in a way that does not intrude on the science and scientists themselves: from a distance, using their traces, much as astronomers view their subjects from a distance without direct experimentation. This in no way means scientonomy would make no mark on science; indeed, much as astronomy helped pave the way for the space program and allowed us to put footprints on the moon, scientonomy has the power to greatly affect the objects of its study.

[Figure: Boyack, Klavans, and others]

Like scientometrics, from which springs the dreaded h-index and other terrifying ways of measuring scientific output, scientonomy wields a dangerous weapon: the power to positively or negatively affect the scientific process. Scientonomy should be cautious, but not lame; we should work to improve the rate and process of scientific discovery and dissemination, but we need to be extremely careful about it.

The second reason for -onomy is a bit sillier, and possibly somewhat self-serving. All the other good names were taken, and already mean slightly different things. We already have Science of Science (Burnet, 1774; Fichte, 1808; Ossowska & Ossowski 1935; Goldsmith 1966), which is actually pretty close to what I’m doing, but not a terribly catchy name; Scientometrics (Price, 1963), which focuses a bit too much on communicative traces at the expense of, say, philosophical accounts; Scientosophy (Goldsmith 1966; Konner, 2007), which sounds too much like science as philosophy; Scientography (Goldsmith, 1966; Vladutz, before 1977; Garfield, 1986), which deals mostly with maps; Scientopography (Schubert & Braun, 1996), which focuses on geographic/scientific relations; as well as meta-scientific catch-alls like STS, HPS, Sociology of Science, etc., which all have their own associated practices, all of which have a place in scientonomy. There’s also Scientology, which I won’t even bother getting into here, and which (hopefully) has no place in scientonomy.