The following datasets were produced through research partnerships and dataset curation grants (now data fellowships) at the CDH. Each is published in the open-access repository that best fits the needs of the project, such as Zenodo, Figshare, or Dataverse. Each is assigned a Digital Object Identifier (DOI), a unique string of characters, issued by an international registration agency, that provides a persistent link to a resource across the internet. When publishing our data, we work hard to credit everyone who contributed to its production.

As the CDH continues to publish humanities datasets, we will add them here.

Princeton Ethiopian, Eritrean, and Egyptian Miracles of Mary (PEMM) Project

Cite the dataset:

Belcher, Wendy Laura, Evgeniia Lambrinaki, William F. Macomber, Jeremy R. Brown, Mehari Worku, Dawit Muluneh, Blaine Kebede, Henok Alem, Rebecca Sutton Koeser, Nicholas Budak, Jean Bauer, Getatchew Haile, Asmalu Tefere, Tasfa Gabra Selasse, Tasfa Giyorgis, Gabra Selasse Berhan, Steve Delamarter, Ekaterina Pukhovaia, Dorothea Reule, Taylor Eggan, Bret Windhauser, Rebecca Munson, Solomon Gebreyes, Eyob Derillo, Alessandro Bausi, Vitagrazia Pisani, Kevin McElwee, Stephen Parkinson, Mihret Melaku, Tariku Abas Sherif, Beimnet Beyene Kassaye, Annabel S. Lemma, Tsega-ab Hailemichael, Chiara Lombardi, Ellen Perleberg, Lauren D. Johnson, Sana Khan, Jason O. Seavey, Leia R. Walker, Nati Arbelaez Solano, Daniel Somwaru, Mika J. Hyman, Grace Matthews, Allie V. Mangel, Ellen Li, Elliot Galvis. Princeton Ethiopian, Eritrean, and Egyptian Miracles of Mary (PEMM) Project. Zenodo. August 3, 2022.

The Princeton Ethiopian, Eritrean, and Egyptian Miracles of Mary digital humanities project (PEMM) is a comprehensive resource for the miracle stories about the Virgin Mary told in Ethiopia, Eritrea, and Egypt and preserved in Gəˁəz parchment manuscripts between 1300 and the present. Directed by Prof. Wendy Laura Belcher and managed by Evgeniia Lambrinaki, PEMM was launched in March 2018, using as its base the miracle story identifications William F. Macomber made in the 1980s. The PEMM 1.0 dataset covers 953 identified stories (called Canonical Stories); 491 fully cataloged manuscripts (in Gəˁəz and a few in Arabic) (called Manuscripts); 38,836 stories documented in those manuscripts (called Story Instances); 20,001 typed Gəˁəz incipits (unique first lines) for those stories; and 2,011 paintings with 3,489 scenes in 229 manuscripts (called Paintings). The manuscripts come from 79 repositories and libraries around the world (called Collections), and the stories were composed in Ethiopia, Eritrea, and Egypt (and probably Nubia, although this is not confirmed), as well as Europe and the Levant (called Story Origins).

Shakespeare and Company Project

Cite the dataset:

Kotin, Joshua, Rebecca Sutton Koeser, Carl Adair, Serena Alagappan, Paige Allen, Jean Bauer, Oliver J. Browne, Nick Budak, Harriet Calver, Jin Chow, Ian Davis, Gissoo Doroudian, Currie Engel, Violet Gautreau, Alex Gjaja, Elspeth A. Green, Isaac Hart, Benjamin Hicks, Madeleine E. Joelson, Carolyn Kelly, Sara Krolewski, Xinyi Li, Ellie Maag, Elizabeth Macksey, Cate Mahoney, Francesca Mancino, Jesse D. McCarthy, Mary Naydan, Sally Root, Isabel Ruehl, Sylvie Thode, Katherine Vandermel, Camey VanSant, and Clifford E. Wulfman. Shakespeare and Company Project Dataset: Lending Library Members, Books, Events. Version 1.1. January 2021. Distributed by DataSpace, Princeton University.

A website and online research tool for exploring materials from the Sylvia Beach Papers, housed in Special Collections at Princeton University Library. Beach was the owner of Shakespeare and Company, the iconic bookshop and lending library in interwar Paris that counted among its members James Joyce, Gertrude Stein, Ernest Hemingway, and other prominent writers and intellectuals. The Shakespeare and Company Project shows what lending library members read and where they lived. The Project makes three datasets available to download in CSV and JSON formats. The datasets provide information about lending library members; the books that circulated in the lending library; and lending library events, including borrows, purchases, memberships, and renewals. For more details, see the data export page on the Project site.

Princeton Prosody Archive

Cite the dataset:

Brogan, T. V. F., Meredith Martin, and Meagan Wilson. "Princeton Prosody Archive Dataset Generated from T. V. F. Brogan's Original Bibliography." Zenodo. June 25, 2019. doi:10.5281/zenodo.3255052.

An incomplete yet full-text searchable database of thousands of digitized prosodic works published between 1569 and 1923. It collects historical documents and highlights discourses about the study of language, the study of poetry, and where and how these intersect and diverge. The PPA makes several arguments and welcomes new scholarship based on the work it gathers. Some of our initial questions include: What if we began to understand poetics in all of its historical, linguistic, and educational valences? What if literary concepts such as meter and rhythm are historically contingent and fundamentally unstable? What might scholars who practice distant reading of the novel learn from a collection of materials pertaining to the study and philosophy of poetry?

Derrida’s Margins

Cite the dataset:

Chenoweth, Katie, Rebecca Sutton Koeser, Alexander Baron-Raiffe, Renée Altergott, Chad Córdova, Austin Hancock, Chloé Vettier, Jean Bauer, Benjamin Hicks, Nick Budak, and Kevin McElwee. 2021. Derrida's Margins Datasets. Version 1.1. October 2021. Distributed by DataSpace, Princeton University.

A website and online research tool for annotations from the Library of Jacques Derrida, housed at Princeton University Library (PUL). Jacques Derrida is one of the major figures of twentieth-century thought, and his library, which bears the traces of decades of close reading, represents a major intellectual archive. The first phase of the project focused on annotations related to Derrida’s landmark 1967 work De la grammatologie (Of Grammatology).