Data Cleaning @ Stokes Library



Oct 03 11:00 am – 12:20 pm
Stokes Library
Wallace Hall, Lower Level

Do you have messy data?

Is the mess getting in the way of your analysis?

Does Excel crash whenever you open *that* file?

Don’t despair! Help is on the way. The Center for Digital Humanities and the Stokes Library are hosting a workshop to help you get past the mess in your data set and on to the analysis and visualizations you actually want to be doing. We will be using OpenRefine, an open-source power tool for data cleaning.

This is a hands-on workshop, so please bring your own laptop with OpenRefine pre-installed (the software is free to download). If you have difficulty installing OpenRefine, come 30 minutes early and someone will help you get up and running.

A sample (messy) data set will be provided, but participants are encouraged to bring their own data sets for consultation.

Workshop co-leads:

Jean Bauer is the Research Director of the Center for Digital Humanities at Princeton, where she sets the research direction of the CDH and is responsible for developing innovative digital humanities projects and diffusing digital humanities tools and methods into the Princeton curriculum. Through a combination of formal training and curiosity, she is an early American historian, database designer, and photographer. Jean earned her PhD in History from the University of Virginia, where she designed and built The Early American Foreign Service Database for analyzing the US foreign service from 1775 to 1825 as an information network.

Seth Porter is the Head of Princeton’s Stokes Library for Public & International Affairs and Population Research. Seth holds a Master of Arts in Public Affairs from the University of Alabama, a Bachelor of Arts in History from the University of Wyoming, and a Master of Library and Information Science from San Jose State University. He is currently working on a Ph.D. at the University of Georgia. His scholarly interests include teaching and learning, program evaluation, and project management.

This event is co-organized by the Princeton University Library.