Curating Plague Data

How can a CDH Dataset Curation Grant help you with your research? Merle Eisenberg received a Dataset Curation Grant for his project “The Justinianic Plague and the End of Antiquity.” Merle defended his dissertation in Princeton's Department of History in summer 2018 and was a Postgraduate Research Associate (PGRA) affiliated with the CDH in fall 2018.

My interdisciplinary environmental digital humanities project, The Justinianic Plague and the End of Antiquity, will offer the first standardized, open-access repository of data on the Justinianic Plague (c. 541-750). The database aims to analyze, contextualize, and provide commentary on the approximately forty written primary sources on the Justinianic Plague, which between them discuss the outbreak several hundred times in four major languages (Latin, Greek, Arabic, and Syriac). My project employs graduate students with linguistic and historical training in one of these languages to create the dataset. After its completion, the collected data will be published online for scholars to access and download for research.

Let’s look at an example of my data. One of the main sources for the outbreak of plague in western Europe is Bishop Gregory of Tours’s famous Histories, written in Latin at the end of the sixth century. Gregory talks about war, love, and religion, among many other topics, but also mentions ten plague outbreaks between 543 and 590 CE. Each outbreak has to be entered separately with all the information someone might want to know. Here is one short example from the 590 outbreak:

[Image: Gregory of Tours original, from the published version of the text]

My curation work has entailed capturing bibliographic information about the text, the location of the plague, a transcription of the original text, translations, commentary, and a set of keywords that I have defined. The structured data, captured in a Google Sheet, looks like this:

[Image: Gregory of Tours plague data, with bibliographic and textual data, in structured form]
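
For readers who prefer code to spreadsheets, the fields described above could be modeled roughly as follows. This is only a minimal sketch, not the project's actual schema: the class name `PlagueReference`, the field names, the `to_csv` helper, and every value in angle brackets are hypothetical placeholders chosen for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class PlagueReference:
    """One mention of a plague outbreak in a primary source (hypothetical schema)."""
    author: str
    work: str
    language: str
    outbreak_year: int  # year CE of the outbreak being described
    locations: List[str] = field(default_factory=list)
    original_text: str = ""
    translation: str = ""
    commentary: str = ""
    keywords: List[str] = field(default_factory=list)

    def to_row(self) -> Dict[str, str]:
        """Flatten the record into one spreadsheet row; lists are joined with '; '."""
        row = asdict(self)
        row["locations"] = "; ".join(self.locations)
        row["keywords"] = "; ".join(self.keywords)
        return {key: str(value) for key, value in row.items()}


def to_csv(records: List[PlagueReference]) -> str:
    """Serialize records to CSV text with a header row, one row per mention."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].to_row().keys()))
    writer.writeheader()
    for record in records:
        writer.writerow(record.to_row())
    return buffer.getvalue()


# Placeholder values only: the real transcription, translation, locations,
# and keywords live in the project's sheet and are not reproduced here.
example = PlagueReference(
    author="Gregory of Tours",
    work="Histories",
    language="Latin",
    outbreak_year=590,
    locations=["<city 1>", "<city 2>"],
    original_text="<Latin transcription>",
    translation="<English translation>",
    commentary="<editorial commentary>",
    keywords=["<keyword>"],
)

print(to_csv([example]))
```

Keeping one record per mention, rather than one per outbreak, means the commentary and keywords stay attached to the specific passage they describe.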

One huge problem in plague research is that scholars (both historians and scientists) tend to ignore the broader context (provided above) and instead compile an exhaustive list of plague outbreaks, as if they were all equivalent. Yet this example shows that, once the context is taken into account, Gregory knew little about how many people died, where the plague spread beyond two cities, and how long it lasted. Did 5%, 15%, or 50% of people die in this outbreak?

The dataset curation grant is part of my larger project entitled The Making of a Pandemic: Plague, Myth, and the End of Antiquity, which is part of Princeton’s Climate Change and History Research Initiative. Once the database is usable by academics, the next step is to create a map-based interface aimed at educating the broader public, particularly schoolchildren, teachers, and introductory university classes. While the Black Death (c. 1347-18th c.) has been incorporated into university and secondary school curricula, the Justinianic Plague is virtually unknown, since there are no online resources. A digital map component would allow an even broader audience to use the data.

With the help of CDH seminars and advice, I found my dataset curation grant helpful for this project and, perhaps more importantly, useful for thinking about how to collect, process, and manage large amounts of data. My field, the history of late antiquity, does not often work with large datasets, but, along with a few colleagues, I’ve used the ideas behind this project to ask new questions about the plague.

The dataset curation grant for The Justinianic Plague and the End of Antiquity project has also taught me quite a bit about using data. First, there is no “silver bullet” that solves all data management questions. Every choice you make has positives and negatives, but you have to make a decision. Second, whatever choices you eventually make, record them! Record them early, record them always. You must tell anyone using your data how and why you made these choices, so they can make their own informed decisions in the future.

I have been extremely fortunate to have a dataset curation grant, which has both expanded my intellectual horizons and helped me learn from experts how to apply data skills rigorously.

