Meet our Dataset Curation Grant Winners 2019-2020

17 April 2019

Each year, the CDH awards several Dataset Curation Grants to Princeton researchers who are experimenting with humanities data. These grants provide training in the tools and techniques of data curation as well as funding to cover project costs and salaries for student collaborators. We are excited to announce our grantees for the 2019-20 academic year. Their projects express in different ways how new forms of argument, interpretation, and description can emerge from making, using, and maintaining data in the humanities.

Dmitri Tymoczko (Professor, Department of Music): Curating analyses of the complete Beethoven quartets; Adding clef and pitch information to Thomas Victoria’s complete works

Dmitri Tymoczko, Director of the Graduate Program in Composition, will shed light on the musical languages of two historical composers. In the first project, Professor Tymoczko will work with two undergraduate research assistants to edit an annotated corpus of the complete Beethoven string quartets. A dataset first assembled by a group at the École Polytechnique Fédérale de Lausanne (available on GitHub) provides indications of chord and key across 16 quartets and 70 movements, comprising approximately 37,000 individual chords. Tymoczko and his team will clean and adapt this corpus in order to develop a web application that will let nonprogrammers work with the data. In the second project, Tymoczko will edit a digital edition of the complete works of Renaissance composer Tomás Luis de Victoria (1548–1611). The task of this edition is to restore the specific “pitch information” of Victoria’s original scores by using a program to infer pitch from key signature, clef choice, and final note.

Anna M. Shields (Professor, Department of East Asian Studies): Final Data Cleaning, Name-tagging, and Geotagging for Data in the Tang History Database

In their modern print editions, the Old Tang History (945 CE) and New Tang History (1060 CE) comprise over 6,000 pages chronicling the politics, art, science, economics, and natural history of the Tang Dynasty (618-907). In the Spring 2018 semester, Professors Wen Xin and Anna M. Shields began designing the Tang History Database, an online, open-access edition of these monumental documents, and received their first CDH Dataset Curation Grant for 2018-2019. This summer, with the help of a second CDH Dataset Curation Grant, a team of five graduate students will finish verifying digital transcriptions of the text. The team will then begin name-tagging people listed in the Histories using the Chinese Biographical Database (CBDB) and geotagging locations with data from the Chinese Historical Geographic Information System (CHGIS) database. This work will create a richly interlinked text that will allow scholars to attend on the macro level to the patterns of difference between the two texts, and to traverse on the semantic level the links between people and places in Tang China.

Joshua Kotin (Associate Professor, Department of English): The Shakespeare and Company Project

Professor Joshua Kotin and project manager Cate Mahoney (graduate student, English) will continue their work on the Shakespeare and Company Project, an interactive website exploring the lending library at the center of expatriate life in Paris from 1919 to 1941. In 2015, researchers from the English Department and designers and developers from the CDH began digitizing and transcribing the borrowing records of approximately 560 members for the whole span of the lending library’s existence. Altogether, this project will provide a portrait of how the library was used by some of the most important writers of the period, from Gertrude Stein to Jacques Lacan, from James Joyce to Simone de Beauvoir. The general public will be able to browse a catalog of the 9,000 books in circulation at Shakespeare and Company, and researchers will be able to analyze a dataset of 22,000 discrete borrowing events. This year’s Dataset Curation Grant will cover final preparation of the data ahead of a soft launch of the project website during the 2019-2020 academic year.

Angela N. H. Creager (Thomas M. Siebel Professor in the History of Science, Department of History): The Making (and Unmaking) of Environmental Carcinogens, 1960–2000

By the late 1960s, scientists understood cancer as the product of genetic damage caused by environmental agents. As a result, the US government began regarding cancer from chemicals as a key public health problem. However, efforts to address cancer by regulating industrial chemicals had waned by the 1980s, as both scientific research and public policy shifted towards attributing human cancer to lifestyle (especially diet and smoking). Biochemist and historian of science Angela Creager is building a dataset that will explore how environmental carcinogens evolved as a category of science and regulation among academic scientists, industry leaders, politicians, government officials, and journalists. Working with a team of graduate and undergraduate students, Professor Creager will comb through online repositories containing newspapers, corporate documents, scientific research, and government white papers in order to track changes in cancer research, government policy, and public understanding. The resulting dataset will serve as an important resource for quantitative and qualitative analyses of our evolving notions of cancer.