Humanities Data
Providing resources and education around the development and use of data for, in, and about the humanities
Approaching humanities sources as data is a key component of digital humanities work. At the CDH, we focus on critically engaging with the concepts and methods in data science as they apply to the creation and analysis of humanities data.
We grapple with the technical necessity of cleaning, reducing, and normalizing data in ways that don't hide diversity and nuance. How much messiness should we maintain in our data, and how can we do so? How do we want to invite other scholars to view our data: as the polished result of an investigation or argument, or as an experimental lens on our subject matter, a sandbox that we invite others to play in?
These are the questions negotiated by datasets produced at the CDH, each of which imagines different forms of argumentation and different kinds of stories that can be told in humanities scholarship. These are also the questions we encourage others to ask through our programming and resources.
Datasets published by the CDH
Princeton Prosody Archive
Inviting users to rethink poetry's past through a collection of historical prosodic works
Shakespeare and Company Project
Recreating the world of the Lost Generation in interwar Paris
Derrida’s Margins
An online research tool for the philosopher’s annotations that provides a behind-the-scenes look at his reading practices and the philosophy of deconstruction
Princeton Ethiopian Miracles of Mary Project
Folklore about How the Virgin Mary Helps Believers in Ethiopian Literature and Art
Resources for finding humanities data
Resource | Description |
---|---|
CDH maintained list of humanities datasets | |
Peer-reviewed forum for reports on the curation and publication of new datasets | |
UC Berkeley Library's Text Mining & Computational Text Analysis | Web portal with tutorials, sources, etc. |
Collection of data analysis notebooks, curated by Quinn Dombrowski | |
Datasets distinctive or unique to Rutgers | |
Matthew Lavin's list of datasets | |
List maintained by Melanie Walsh, including example uses and tutorials for each dataset | |
List of text corpora | |
Data-focused programming and curriculum
Humanities + Data Science Institute
A five-day intensive faculty seminar to explore the conceptual, practical and ethical aspects of data science