Google Maps for the Ancient World

The title of this post comes from an oft-quoted passage of my previous one describing ORBIS, a scholarly argument cleverly disguised as a web tool. Of ORBIS I wrote:

…given any two cities in the ancient world, it returns the fastest, cheapest, or shortest route between them, given the month, the mode of transportation, and various other options. It’s Google Maps for the ancient world, complete with the “Avoid Highways” feature.

In writing that review, I neglected to mention the many fantastic resources out there that already map the ancient world, including the Digital Atlas of Roman and Medieval Civilization and PLEIADES, a gazetteer and graph of ancient places. The most impressive full-featured online GIS application I’ve seen is called Antiquity À-la-carte, shown below. The classicists have once again proven themselves to be at the bleeding edge of technology. When they keep developing cool toys like these, I sometimes regret being an early modernist. Sometimes.

Antiquity À-la-carte, a GIS application for the ancient western world.

The cool toy I speak of now is of course no toy, but a serious scholarly endeavor which will doubtless set the bar for future online historical maps. In many ways, the Digital Map of the Roman Empire offers less than the sites I listed above. It doesn’t allow you to turn on or off particular layers, and it certainly doesn’t include all of the information the others have. In this case, however, less is more. It’s a really easy-to-use map of the ancient world, online. That’s it. It doesn’t tell you how to get from point A to point B, it doesn’t allow you to see the location of shipwrecks or the borders of countries at different time periods; it’s just a base map, depicting the Greek and Roman world in its entirety, asking the world to do with it what it will.

Johan Ahlfeldt wrote about his creation:

The aim of my work with Pelagios has been to create a static (non-layered) map of the ancient places in the Pleiades dataset with the capacity to serve as a background layer to online mapping applications of the Ancient World. Because it is based on ancient settlements and uses ancient placenames, our map presents a visualisation more tailored to archaeological and historical research, for which modern mapping interfaces, such as Google Maps, are hardly appropriate; it even includes non-settlement data such as the Roman roads network, some aqueducts and defence walls (limes, city walls). Thus, for example, the tiles can be used as a background layer to display the occurrence of find-spots, archaeological sites, etc., thereby creating new opportunities to put data of these kinds in their historical context.

The ancient base map.

As I wrote last year, accurate base maps are extremely important for contextualizing research. With this underneath, for example, ORBIS could provide a much richer experience of the ancient world. What’s more, the PELAGIOS group has opened up the map with a CC-BY license, allowing anyone to build on it so long as they include proper scholarly attribution. It can be used with OpenLayers, Google, and Bing maps, so anybody who already has these systems in place can easily swap out the map tiles with these historical ones. Johan’s post includes examples of all of these implemented.
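For readers wondering what "swapping out the map tiles" amounts to, here is a minimal sketch in Python. It assumes the historical tiles follow the common XYZ ("slippy map") scheme in Web Mercator, which is what most web mapping libraries request by default. The URL template is a made-up placeholder, not the real address of the tiles, and the function names are mine; Johan's post is the place to find the actual endpoint and the worked examples.

```python
import math

# Hypothetical placeholder: replace with the real tile template from Johan's post.
TILE_URL_TEMPLATE = "https://example.org/ancient-tiles/{z}/{x}/{y}.png"


def lonlat_to_tile(lon, lat, zoom):
    """Convert longitude/latitude (degrees) to XYZ tile indices at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the map edge still land on a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_url(lon, lat, zoom, template=TILE_URL_TEMPLATE):
    """Build the URL of the tile that covers a given point."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return template.format(z=zoom, x=x, y=y)


# Approximate coordinates for Rome; only the template changes between base maps.
print(tile_url(12.4964, 41.9028, 6))
```

The point of the sketch is that the tile arithmetic is identical whatever the base map; an application that already draws modern tiles only needs a different template string to draw ancient ones.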

My posts are usually long and rambling, but I’ll keep this one short and to the point, much like the tool I’m reviewing. This is the first easily mashable base map of the ancient world, and for that it is awesome. Go explore!

ORBIS: The next step in Digital Humanities

Every once in a while, a new project comes around bearing a message loud and clear: this is a sign of things to come. ORBIS, the Stanford Geospatial Network Model of the Roman World, is one such project.

ORBIS was created by Walter Scheidel, Elijah Meeks, and a host of others. At the very beginning, I should point out I am not a classicist. The below review is of the nature rather than the content of ORBIS as a scholarly product.

Roman Travel Network

ORBIS is many things but, most simply, it is an interface allowing researchers to experience the geography of the Roman world from an ancient perspective. The executive summary: given any two cities in the ancient world, it returns the fastest, cheapest, or shortest route between them, given the month, the mode of transportation, and various other options. It’s Google Maps for the ancient world, complete with the “Avoid Highways” feature.

I was among the lucky few to see an early version of the tool, and after sending back an informal review, Elijah Meeks invited me to review the site publicly via my blog. The first section explains what I feel is the most important contribution of ORBIS to the Digital Humanities; it is a reflexive tool that allows the humanist to engage with the process as well as the product. I then highlight some of the cool features, and finally list some rough edges and desiderata for future iterations or similar projects.

Tool As Argument

ORBIS is an exceptionally well-made and useful tool, but it is not the tool itself that makes it stand out. Walter Scheidel and Elijah Meeks could have posted the automated map portion of the site by itself, and it would have garnered deserved praise, but they went well beyond that goal; they made a reflexive tool.

ORBIS is among the first digital scholarly tools for the humanities (that I have encountered) that really lives up to the name “digital scholarly tool for the humanities.” Beyond being a simple tool, ORBIS is an explicit and transparent argument, a way of presenting research that also happens to allow, by its very existence, further research to be done. It is a map that allows the user to engage in the process of map-making, and a presentation of a process that allows the user to make and explore in ways the initial creators could not have foreseen. Of course, as with any project there are a few rough edges and desired features, which I’ll get into further down below.

Elevation data to help model the difficulty in getting from one place to another.

Along with the map, the Makers of this project (by which I mean authors, developers, data gatherers, …) present a fairly interactive documentary of the map-making process, including historical accounts, data sources, algorithmic explanations, visual aids, downloadable data, and a forthcoming API. They built an explicit model of the ancient world, taking into account roads and rivers, oceans and coastlines, weather and geographic features, various modes of transportation for civilian and military purposes, and put it all together so any researcher can sit down and figure out how long it would have taken, or how expensive it would have been, to travel between 751 locations in the ancient Roman world. Rather than asking us to trust that their data are accurate, the makers revealed their model – their underlying argument – for critique and extension.

Exploring the Ancient World

The ORBIS model includes 751 sites covering about 4 million square miles of ancient space, including over 50,000 miles of road or desert tracks, nearly 20,000 miles of navigable rivers and canals, and almost 1,000 sea routes between sea ports. As I mentioned earlier, the model works like Google Maps; given two locations, it tells you the cheapest, shortest, or fastest route between them. These calculations take into account the time-of-year and usual weather, elevation changes between sites, fourteen modes of travel (ox cart, foot, army on march, camel caravan, etc.), river travel (including extra difficulty moving upstream), etc.
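To make the "cheapest, shortest, or fastest" idea concrete, here is a toy sketch in Python. It is not the ORBIS model or its code: the network, distances, durations, and prices below are all invented, and the function names are my own. It only illustrates the general technique of running a shortest-path search (Dijkstra's algorithm here) over the same network with different edge weights, optionally leaving out a mode of travel. Seasonal effects are not modelled.

```python
import heapq

# Invented toy network, NOT ORBIS data.
# Each edge: (place_a, place_b, mode, km, days, cost)
EDGES = [
    ("Roma", "Ostia", "river", 30, 1.0, 5.0),
    ("Ostia", "Carthago", "sea", 600, 4.0, 8.0),
    ("Roma", "Puteoli", "road", 200, 7.0, 30.0),
    ("Puteoli", "Carthago", "sea", 380, 3.0, 6.0),
    ("Ostia", "Syracusae", "sea", 500, 5.0, 3.0),
    ("Syracusae", "Carthago", "sea", 300, 3.0, 2.0),
]


def build_graph(edges, exclude_modes=()):
    """Turn the edge list into an undirected adjacency map, skipping excluded modes."""
    graph = {}
    for a, b, mode, km, days, cost in edges:
        if mode in exclude_modes:
            continue
        weights = {"km": km, "days": days, "cost": cost}
        graph.setdefault(a, []).append((b, weights))
        graph.setdefault(b, []).append((a, weights))
    return graph


def best_route(edges, start, goal, priority="days", exclude_modes=()):
    """Dijkstra's algorithm; `priority` picks which weight to minimise.

    Returns (total, path), or None if no route exists.
    """
    graph = build_graph(edges, exclude_modes)
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, weights in graph.get(node, []):
            if neighbour not in settled:
                heapq.heappush(
                    queue, (total + weights[priority], neighbour, path + [neighbour])
                )
    return None


for priority in ("km", "days", "cost"):
    print(priority, best_route(EDGES, "Roma", "Carthago", priority))
```

In this toy network the three priorities pick three different routes, which is the behaviour described above: the same pair of cities can have a different "best" route depending on what the traveller cares about.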

The ORBIS Interface

Another exciting feature on ORBIS is the distance cartogram. This visualization reveals the impact of travel speed and transport prices on overall connectivity; it allows the researcher to see how far other cities were with respect to a certain core city (for instance Constantinople) from the perspective of cost and travel time rather than mere geographical distance. This feature brings the researcher closer to the actual ancient Roman experience. A larger insight is revealed when taking a “distant reading” approach to the cartogram: “Distance cartograms show that due to massive cost differences between aquatic and terrestrial modes of transport, peripheries were far more remote from the center in terms of price than in terms of time.”

Constantinople Cartogram

Desiderata

ORBIS is a big step forward in designing digital scholarly objects for the digital humanities. It is a tool that is both useful and reflexive, offering engagement with both process and product. It also exemplifies an increasingly popular mode of scholarly communication: the published online object. Because the mode is still (even after decades of online DH projects) not quite solidified, ORBIS lacks a few of the basic features of common scholarly communication, and by straddling both the new and the old, ORBIS doesn’t quite live up to the best qualities of either digital or analog publication.

First of all, although their team sent a preliminary version of the site out to many people, it never went through any formal review process. Readers of this blog will know that I am no advocate of traditional publication systems or the antiquated marriage of publication and peer-review, but at this point it is worth noting that ORBIS (to my knowledge) has only been reviewed informally, by sympathetic reviewers like myself. Perhaps this means that adoption of the tool should be approached with greater caution until it is more formally reviewed by a post-publication periodical like the Journal of Digital Humanities.

That being said, the site does try to remain true to humanistic and traditional publication roots. A paper version is in the works, and it is written such that we researchers can engage in the process of the tool. Unfortunately, it perhaps stays a bit too true to the paper model. The site is designed to read top-to-bottom, left-to-right, and none of the internal references to other sections include links to aid in navigation. Further, if the intent is to simultaneously allow exploration of the tool and its creation, the design does not realize this goal. The map appears at “the end” of the site, all the way on the right, and because of the layout, it is impossible to view it alongside the text describing it without opening a new window. There is quite a bit of white space to the right of the text on my wide-screen monitor – perhaps a smaller version of the tool can be embedded in that space.

One of the strengths of the project is the explicit nature of its creation. Data can be downloaded, and the sources, provenance, algorithms, and technologies are clearly stated. The model as an argument is, in short, visible and comprehensible even to those with little prior knowledge on these technologies. What this does is bridge the gap between code and humanistic inquiry, adding levels of model explication and tool-use between them. ORBIS is by no means the first project to make the creation of a tool explicit, but usually that explication is simply a public posting of the code and some limited comments or descriptions of how that code works. Unfortunately, although ORBIS does include a better bridge to explicate its argument, it does not offer the code. It’s a bit like David Copperfield explaining how he made the Statue of Liberty disappear; the explanation would certainly be helpful, but if he really wanted other people to be able to create similar illusions, he’d offer up the materials as well. (Alright, the metaphor doesn’t completely work, but stick with it.) The digital humanities seems finally to be getting into code sharing, and this is a good thing. The cost for sharing code is essentially free (although there’s a much greater price for sharing good code – all the extra time spent marking it up and making it pretty), and the benefits should go without saying: More things like ORBIS, much faster. Better tools built collectively and suiting all our individual needs.

The last, most important, and most difficult of my desires deals with uncertainty. There’s been a lot of talk about data uncertainty in the humanities lately, not least from Stanford, the home university of ORBIS. It’s a difficult problem to solve, but presented as it is, the ORBIS project lends itself to the varieties of critiques common in the work of Johanna Drucker and others. How do you know that these were the shortest routes? What about missing information? What about the fact that every bit of travel was its own experience, with different human and environmental factors playing in, perhaps delays for sick relatives or mutineering seamen? These questions are swept under the rug when ORBIS presents one route and one set of numbers per query: here, this is the fastest route, these are the cities, this is how much it would cost. The visualization and end-products create an illusion of certainty in the data, although in the text, the makers are quick to point out that a researcher should not take it as certain. One solution, and this extends to all data-driven DH projects, is to model uncertainty in the data from the ground up. How much more certain is one route than another? How certain are you of the weather in one location compared to the weather elsewhere? This sort of information flows naturally into models of Bayesian data analysis, and would allow ORBIS to deliver a list of credible routes, revealing which parts of those routes are more or less certain, and including other information like the probability of a ship being lost at sea on a particular route. Of course, data uncertainty is only part of the problem, and this would only be a partial solution.
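As a very rough illustration of what "a list of credible routes" might look like, here is a small Python sketch. It is deliberately simpler than a full Bayesian model: it is a plain Monte Carlo simulation that gives each leg of a journey a range of plausible durations and reports how often each route comes out fastest. The routes, the (low, most likely, high) durations, and the function name are all invented for the example and have nothing to do with the actual ORBIS data.

```python
import random

# Invented (low, most_likely, high) travel times in days for each leg of a route.
ROUTES = {
    "via Ostia (sea)": [(0.5, 1.0, 2.0), (3.0, 4.0, 12.0)],
    "via Puteoli (road + sea)": [(6.0, 7.0, 8.0), (2.0, 3.0, 5.0)],
}


def simulate(routes, trials=10000, seed=0):
    """Sample every leg from a triangular distribution and summarise each route."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    wins = {name: 0 for name in routes}
    totals = {name: [] for name in routes}
    for _ in range(trials):
        sampled = {}
        for name, legs in routes.items():
            # random.triangular takes (low, high, mode) in that order.
            sampled[name] = sum(rng.triangular(lo, hi, mode) for lo, mode, hi in legs)
            totals[name].append(sampled[name])
        wins[min(sampled, key=sampled.get)] += 1
    summary = {}
    for name in routes:
        days = sorted(totals[name])
        summary[name] = {
            "p_fastest": wins[name] / trials,
            "median_days": days[len(days) // 2],
            "interval_90": (days[int(0.05 * trials)], days[int(0.95 * trials)]),
        }
    return summary


for name, stats in simulate(ROUTES).items():
    print(name, stats)
```

Instead of one route and one number, the output is a probability that each route is the fastest plus a range of plausible journey times, which is closer to the kind of answer I am asking for. A serious version would need distributions grounded in the sources rather than numbers I made up.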

This isn’t the place to detail exactly how uncertainty should be modeled in the data, and exactly what ought to be done with it, but the fact is there is already rich knowledge in the model and in the data available dealing with the uncertainty of travel, but that information disappears as soon as it is presented in the map interface. If ORBIS represents the next step in humanities tool production, it doesn’t quite (yet) live up to the promise of humanities data analysis, impressive as their analysis is. There is still not yet a clear enough representation of uncertainty and interpretation to reach that goal. To be fair, I’ve yet to see a single project living up to that promise at anything close to large-scale; the tools just haven’t been developed yet. Perhaps that promise is impossible at large scale, although I certainly hope that is not the case.

The View From Here

Despite my long list of rough edges and desiderata, I still stand by my statement that this tool is an exemplar of a shift in digital humanities projects. The tool itself is profoundly impressive and will prove useful for a variety of research, but what stands out from the humanities standpoint is the explicit nature of the ORBIS underbelly. It blurs the line between tool and argument. There are other profoundly impressive and useful tools out there (topic modeling comes to mind). However, with topic modeling, the assumptions are still obscure to the unfamiliar, despite my own best efforts and the even better efforts of others. This is because the software topic modeling is packaged with, the software we use to run the analyses, does not simultaneously engage in the process of its own creation in the way that ORBIS does. Going forward, I predict the most used (or at least the most useful) digital tools for humanists will include that engagement, rather than existing as black boxes out of which results spring forth, fully armed and ready to battle as Athena from Zeus’s forehead. ORBIS is by no means the first to attempt such a feat but, I think, it is as-yet the most successful.

 

Contextualizing networks with maps

Last post, I talked about combining textual and network analysis. Both are becoming standard tools in the methodological toolkit of the digital humanist, sitting next to GIS in what seems to be becoming the Big Three in computational humanities.

Data as Context, Data as Contextualized

Humanists are starkly aware that no particular aspect of a subject sits in a vacuum; context is key. A network on its own is a set of meaningless relationships without a knowledge of what travels through and across it, what entities make it up, and how that network interacts with the larger world. The network must be contextualized by the content. Conversely, the networks in which people and processes are situated deeply affect those entities: medium shapes message and topology shapes influence. The content must be contextualized by the network.

At the risk of the iPhonification of methodologies [1], textual, network, and geographic analysis may be combined with each other and traditional humanities research so that they might all inform one another. That last post on textual and network analysis was missing one key component for digital humanities: the humanities. Combining textual and network analysis with traditional humanities research (rather than merely using the humanities to inform text and network analysis, or vice-versa) promises to transform the sorts of questions asked and projects undertaken in Academia at large.

Just as networks can be used to contextualize text (and vice-versa), the same can be said of networks and maps (or texts and maps for that matter, or all three, but I’ll leave those for later posts). Now, instead of starting with the maps we all know and love, we’ll start by jumping into the deep end by discussing maps as any sort of representative landscape in which a network can be situated. In fact, I’m going to start off by using the network as a map against which certain relational properties can be overlaid. That is, I’m starting by using a map to contextualize a network, rather than the more intuitive other way around.

Using Maps to Contextualize a Network

The base map we’re discussing here is a map of science. They’ve made their rounds, so you’ve probably seen one, but just in case you haven’t, here’s a brief description: some researchers (in this case Kevin Boyack and Richard Klavans) take tons of information from scholarly databases (in this case the Science Citation Index Expanded and the Social Science Citation Index) and create a network diagram from some set of metrics (in this case, citation similarity). They call this network representation a Map of Science.

Base Map of Science built by Boyack and Klavans from 2002 SCIE and SSCI data.

We can debate about the merits of these maps until we’re blue in the face, but let’s avoid that for now. To my mind, the maps are useful, interesting, and incomplete, and the map-makers are generally well-aware of their deficiencies. The point here is that it is a map: a landscape against which one can situate oneself, and with which one may be able to find paths and understand the lay of the land.

NSF Funding Profile

In Boyack, Börner [2], and Klavans (2007), the three authors set out to use the map of science to explore the evolution of chemistry research. The purpose of the paper doesn’t really matter here, though; what matters is the idea of overlaying information atop a base network map.

NIH Funding Profile

The images above are the funding profiles of the NIH (National Institutes of Health) and NSF (National Science Foundation). The authors collected publication information attached to all the grants funded by the NSF and NIH and looked at how those publications cited one another. The orange edges show connections between disciplines on the map of science that were more prevalent within the context of a particular funding agency than they were compared to the entire map of science. Boyack, Börner [3], and Klavans created a map and used it to contextualize certain funding agencies. They and other parties have since used such maps to contextualize universities, authors, disciplines, and other publication groups.

From Network Maps to Geographic Maps

Of course, the Where’s The Beef™ section of this post has yet to be discussed, with the beef in this case being geography. How can we use existing topography to contextualize network topology? Network space rarely corresponds to geographic place; however, neither of them alone can ever fully represent the landscape within which we are situated. A purely geographic map of ancient Rome would not accurately represent the world in which the ancient Romans lived, as it does not take into account the shortening of distances through well-trod trade routes.

Roman Network by Elijah Meeks, nodes laid out geographically

Enter Stanford DH ninja Elijah Meeks. In two recent posts, Elijah discussed the topology/topography divide. In the first, he created a network layout algorithm which took a network with nodes originally placed in their geographic coordinates, and then distorted the network visualization to emphasize network distance. The visualization above shows the network laid out geographically. The one below shows the Imperial Roman trade routes with network distances emphasized. As Elijah says, “everything of the same color in the above map is the same network distance from Rome.”
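To give a feel for the idea of distorting a geographic layout by network distance, here is a crude Python sketch. It is emphatically not Elijah's algorithm or his Gephi plugin, just a stand-in under simple assumptions: network distance is counted in hops using a breadth-first search, each node keeps its geographic bearing from Rome, and its distance from Rome is replaced by its hop count. The road list and coordinates are invented.

```python
import math
from collections import deque

# Invented toy road network and planar coordinates, for illustration only.
ROADS = [
    ("Roma", "Capua"),
    ("Capua", "Brundisium"),
    ("Roma", "Ariminum"),
    ("Ariminum", "Mediolanum"),
    ("Mediolanum", "Lugdunum"),
]
COORDS = {
    "Roma": (0.0, 0.0),
    "Capua": (1.5, -1.0),
    "Brundisium": (4.0, -2.0),
    "Ariminum": (0.2, 2.0),
    "Mediolanum": (-2.5, 4.0),
    "Lugdunum": (-6.0, 4.5),
}


def build_adjacency(edges):
    """Undirected adjacency map from a list of (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def network_distance(graph, source):
    """Breadth-first search: number of hops from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def distort_layout(coords, dist, source, ring_spacing=1.0):
    """Keep each node's bearing from the source, but place it on ring number `hops`."""
    sx, sy = coords[source]
    layout = {}
    for name, hops in dist.items():
        x, y = coords[name]
        bearing = math.atan2(y - sy, x - sx)
        radius = hops * ring_spacing
        layout[name] = (sx + radius * math.cos(bearing), sy + radius * math.sin(bearing))
    return layout


hops = network_distance(build_adjacency(ROADS), "Roma")
print(hops)
print(distort_layout(COORDS, hops, "Roma"))
```

Every node with the same hop count ends up on the same ring around Rome, which is the toy equivalent of "everything of the same color is the same network distance from Rome."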

Roman Network by Elijah Meeks, nodes laid out geographically and by network distance.

Of course, the savvy reader has probably observed that this does not take everything into account. These are only land routes; what about the sea?

Elijah’s second post addressed just that, impressively applying GIS techniques to determine the likely route ships took to get from one port to another. This technique drives home the point he was trying to make about transitioning from network topology to network topography. The picture below, incidentally, is Elijah’s re-rendering of the last visualization taking into account both land and sea routes. As you can see, the distance from any city to any other has decreased significantly, even taking into account his network-distance algorithm.

Roman Network by Elijah Meeks, nodes laid out using geography and network distance, taking into account two varieties of routes.

The above network visualization combines geography, two types of transportation routes, and network science to provide a more nuanced at-a-glance view of the Imperial Roman landscape. The work he highlighted in his post transitioning from topology to topography in edge shapes is also of utmost importance; however, that topic will need to wait for another post.

The Republic of Letters (A Brief Interlude)

Elijah was also involved in another Stanford-based project, one very dear to my heart, Mapping the Republic of Letters. Much of my own research has dealt with the Republic of Letters, especially my time spent under Bob Hatch, and Paula Findlen, Dan Edelstein, and Nicole Coleman at Stanford have been heading up an impressive project on that very subject. I’ll go into more details about the Republic in another post (I know, promises promises), but for now the important thing to look at is their interface for navigating the Republic.

Stanford’s Mapping the Republic of Letters

The team has gone well beyond the interface that currently faces the public; however, even the original map is an important step. Overlaid against a map of Europe are the correspondences of many early modern scholars. The flow of information is apparent temporally, spatially, and through the network topology of the Republic itself. Now as any good explorer knows, no map is any substitute for a thorough knowledge of the land itself; instead, it is to be used for finding unexplored areas and for synthesizing information at a large scale. For contextualizing.

If you’ll allow me a brief diversion, I’d like to talk about tools for making these sorts of maps, now that we’re on the subject of letters. Elijah’s post on visualizing network distance included a plugin for Gephi to emphasize network distance. Gephi’s a great tool for making really pretty network visualizations, and it also comes with a small but potent handful of network analysis algorithms.

I’m on the development team of another program, the Sci² Tool, which shares a lot of Gephi’s functionality, although it has a much wider scope and includes algorithms for textual, geographic, and statistical analysis, as well as a somewhat broader range of network analysis algorithms.

This is by no means a suggestion to use Sci² over Gephi; they both have their strengths and weaknesses. Gephi is dead simple to use, produces the most beautiful graphs on the market, and is all-around fantastic software. They both excel in different areas, and by using them (and other tools!) together, it is possible to create maps combining geographic and network features without ever having to resort to programming.

The Correspondence of Hugo Grotius

The above image was generated by combining the Sci² Tool with Gephi. It is the correspondence network of Hugo Grotius, a dataset I worked on while at Huygens ING in The Hague. They are a great group, and another team doing fantastic Republic of Letters research, and they provided this letters dataset. We just developed this particular functionality in Sci² yesterday, so it will take a bit of time before we work out the bugs and release it publicly; however, as soon as it is released I’ll be sure to post a full tutorial on how to make maps like the one above.

This ends the public service announcement.

Moving Forward

These maps are not without their critics. Especially prevalent were questions along the lines of “But how is this showing me anything I didn’t already know?” or “All of this is just an artefact of population densities and standard trade routes – what are these maps telling us about the Republic of Letters?” These are legitimate critiques; however, as mentioned before, these maps are still useful for at-a-glance synthesis of large scales of information, or learning something new about areas one is not yet an expert in. Another problem has been that the lines on the map don’t represent actual travel routes; those sorts of problems are beginning to be addressed by the type of work Elijah Meeks and other GIS researchers are doing.

To tackle the suggestion that these are merely representing population data, I would like to propose what I believe to be a novel idea. I haven’t published on this yet, and I’m not trying to claim scholarly territory here, but I would ask that if this idea inspires research of your own, please cite this blog post or my publication on the subject, whenever it comes out.

We have a lot of data. Of course it doesn’t feel like we have enough, and it never will, but we have a lot of data. We can use what we have, for example collecting all the correspondences from early modern Europe, and place them on a map like this one. The more data we have, the smaller time slices we can have in our maps. We create a base map that is a combination of geographic properties, statistical location properties, and network properties.

Start with a map of the world. To account for population or related correlations, do something similar to what Elijah did in this post, encoding population information (or average number of publications per city, or whatever else you’d like to account for) into the map. On top of that, place the biggest network of whatever it is that you’re looking at that you can find. Scholarly communication, citations, whatever. It’s your big Map of YourFavoriteThingHere. All of these together are your base map.

Atop that, place whatever or whomever you are studying. The correspondence of Grotius can be put on this map, like the NIH was overlaid atop the Map of Science, and areas would light up and become larger if they are surprising against the base map. Are there more letters between Paris and The Hague in the Grotius dataset than one would expect if the dataset were just randomly plucked from the whole Republic of Letters? If so, make that line brighter and thicker.
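Here is one way the "more letters than one would expect" comparison could be sketched in Python. Everything in it is an assumption for illustration: the letter counts are invented (they are not from the Grotius dataset or any real corpus), the function name is mine, and the measure is the simplest one I can think of, the ratio of observed letters to the number expected if the smaller dataset were a random sample of the base map.

```python
# Invented letter counts between city pairs, NOT real data.
BASE_MAP = {  # the whole (hypothetical) Republic of Letters
    ("Paris", "The Hague"): 400,
    ("Paris", "London"): 1200,
    ("Leiden", "The Hague"): 400,
}
ONE_CORRESPONDENT = {  # the (hypothetical) dataset being contextualized
    ("Paris", "The Hague"): 60,
    ("Paris", "London"): 30,
    ("Leiden", "The Hague"): 10,
}


def surprise(base, subset):
    """Observed / expected letters per city pair.

    Expected = what the pair would get if `subset` were a random sample
    of `base`. A ratio above 1 means "more than expected".
    """
    base_total = sum(base.values())
    subset_total = sum(subset.values())
    ratios = {}
    for pair, base_count in base.items():
        if base_count == 0:
            continue  # no baseline to compare against
        expected = subset_total * base_count / base_total
        ratios[pair] = subset.get(pair, 0) / expected
    return ratios


for pair, ratio in surprise(BASE_MAP, ONE_CORRESPONDENT).items():
    print(pair, round(ratio, 2))
```

A pair with a ratio well above 1 would get the brighter, thicker line. A raw ratio is noisy when counts are small, so a real version would probably want some kind of significance test before anything lights up.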

By combining geography, point statistics, and networks, we can create base maps against which we can contextualize whatever we happen to be studying. This is just one possible combination; base maps can be created from any of a myriad of sources of data. The important thing is that we, as humanists, ought to be able to contextualize our data in the same way that we always have. Now that we’re working with a lot more of it, we’re going to need help in those contextualizations. Base maps are one solution.

It’s worth pointing out one major problem with base maps: bias. Until recently, those Maps of Science making their way around the blogosphere represented the humanities as a small island off the coast of social sciences, if they showed them at all. This is because the primary publication venues of the arts and humanities were not represented in the datasets used to create these science maps. We must watch out for similar biases when constructing our own base maps; however, the problem is significantly more difficult for historical datasets because the underrepresented are too dead to speak up. For a brief discussion of historical biases, you can read my UCLA presentation here.

Notes:

  1. putting every tool imaginable in one box and using them all at once
  2. Full disclosure: she’s my advisor. She’s also awesome. Hi Katy!
  3. Hi again, Katy!