This workshop will interrogate the implicit and sometimes problematic decisions that researchers and data managers make when cleaning humanities data. We will consider the differences between “clean” and “raw” data, weighing the costs and benefits of different degrees of data processing and highlighting some reasons why researchers might choose to embrace the original “messiness” of their source materials. We will also learn about some of the legal and ethical considerations prompting the removal or obfuscation of personal information from datasets, as well as practical steps researchers can take to manage collections of texts, records, images, and objects that are in various states of “cleanliness.”
This is the fourth workshop in the Humanities Data Workshop Series. All are welcome to register, whether or not they attended the prior sessions or plan to attend the rest of the series.
Humanities Data Workshop Series Overview:
A joint initiative of the Center for Digital Humanities (CDH) and Princeton Research Data Service (PRDS), this workshop series will explore what “data” means in the context of humanities scholarship and provide an introduction to key techniques and analytical considerations for data-curious faculty, early-career researchers, graduate students, and Library staff. Over the course of six workshops (three per semester), participants will learn about the animating methods and questions that go into finding, structuring, cleaning, and preserving data in humanities contexts. Sessions will use case studies from a range of disciplines and will pay particular attention to the interpretative and ethical decisions involved in creating datasets from objects of humanities research. Stay tuned for announcements on upcoming workshops in the series.
To request disability-related accommodations for this event, please contact firstname.lastname@example.org at least 3 working days in advance.