Upcoming Events

Next semester's events are being scheduled. Check back later or view Summer 2020 events.

Past Events

Working Group

Public Humanities Working Group

May 14 12:00–1:00 PM

The second and final session in the Spring 2020 Public Humanities Working Group series. Participants are welcome to bring their own lunch to this virtual meeting.

Join us for monthly conversations with faculty, graduate students, and staff from across the university to think together about our shared humanistic work and its larger public implications outside of university life, and to consider more broadly the value and relevance of the humanities in our present moment.

To RSVP and receive our common readings, please contact Kate Thorpe, kthorpe@princeton.edu.

Speakers:

Martha A. Sandweiss
Professor of History
Princeton University

Jim Casey
Postdoctoral Research Associate
Center for Digital Humanities
Princeton University

Julia Grummitt
Doctoral Candidate in History
Princeton University

Elena M’Bouroukounda
Master’s Candidate in Architecture
Princeton University

Read about Council working groups here.

RSVP required

Reading Group

Newspaper Navigator: Reimagining Digitized Newspapers with Machine Learning

May 15 11:30 AM–12:30 PM

To register for this event, visit the event listing on the PUL website.

Presented by Ben Lee, a 2020 Innovator-in-Residence at the Library of Congress

Ben writes: The 16 million digitized historic newspaper pages within Chronicling America, a joint initiative by the Library of Congress and the NEH, represent an incredibly rich resource for a wide range of users. Historians, journalists, genealogists, students, and members of the American public explore the collection regularly via keyword search. But how do we navigate the abundant visual content? Newspaper Navigator is a project that I am currently carrying out while an Innovator-in-Residence at the Library of Congress, in collaboration with Library of Congress Labs, the National Digital Newspaper Program, and my PhD advisor, Professor Daniel Weld, at the University of Washington. Newspaper Navigator consists of two parts. The first is to extract headlines, images, illustrations, maps, comics, and editorial cartoons from millions of newspaper pages by training an image recognition model on thousands of crowdsourced annotations collected by the Library of Congress’s Beyond Words initiative. The second part of Newspaper Navigator is to reimagine how we can navigate this wealth of visual content through an exploratory search interface, enabling users to define queries for concepts of their own choosing (which I refer to as “open faceted search”).

In this talk, I will share my current progress with Newspaper Navigator, including running the visual content recognition pipeline at scale. I will also discuss how this project, including the resulting datasets and search interface, can contribute to both computer science research and research within digital humanities.

Read more about the Newspaper Navigator project.

This event is part of the 2019-20 Collections as Data Discussion Series.

Working Group

South Asia Digital Humanities: Chai and Chaat

June 10 11:00 AM–12:00 PM

The South Asia DH Working Group invites you to a Zoom gathering of South Asia DH enthusiasts to hear lightning talks about current projects and opportunities at Princeton and to exchange ideas about areas of DH interest. Grab some chai, make a snack, and chat (chaat) with us!

Hosted by Amna Qayyum (PhD Candidate, History), Ellen Ambrosone (South Asian Studies Librarian) and Wafa Fatima Isfahani (Special Collections Assistant, NEC).

Register here: https://tinyurl.com/y9a4yjg5

Workshop

Introduction to HTRC for Text and Data Mining

July 7 10:00–11:30 AM

This virtual four-workshop series will allow attendees to gain experience with tools and data from the HathiTrust Research Center (HTRC). The Research Center facilitates text and data mining uses of the HathiTrust corpus. HathiTrust is both a partnership of research libraries and a digital library containing 17.3 million items digitized at the partner libraries. HTRC tools and data range from off-the-shelf options to more advanced offerings for experienced scholars.

The workshops will be held via Zoom and will include a mix of hands-on activities, discussion, and presentation. We will use breakout rooms to support the hands-on activities. You will not be required to install any software to participate in the workshops. The workshops are open to faculty, graduate students, postdoctoral researchers, librarians, and other academic staff.

Librarians who attend all four workshops will be invited to join a cohort of other librarians who are teaching with and about the Research Center. This cohort has access to additional support from HTRC, further training opportunities, and a community of their peers who are interested in HTRC. 

In this first of four workshops, we will explore the basics of HathiTrust as a data source and how to use HTRC as a resource for text and data mining. The workshop will address the various tools and services of the Research Center, and options for accessing text data from HathiTrust for text analysis research. The session will be helpful for those who want a general overview, or who want a solid foundation for the other workshops in the series.

Co-sponsored by the Center for Digital Humanities and the Princeton Research Data Service

To request disability-related accommodations for this event, please contact pulcomm@princeton.edu at least 3 working days in advance. 

Workshop

HTRC Extracted Features Dataset

July 8 10:00–11:30 AM

In this second of four workshops, we will introduce you to the Extracted Features data model and the kinds of research it enables. HTRC recently released an updated version of the Extracted Features dataset (v.2.0) that includes 17+ million files, with each file representing a volume in the HathiTrust Digital Library. The Extracted Features files contain metadata about the volumes, as well as tokens (words), parts of speech, and their per-page counts. The dataset can be used for text analysis projects where the algorithm expects access to the words and word counts in a volume, such as topic modeling or certain kinds of machine learning projects. This session will include a hands-on activity using the dataset.
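
As a rough illustration of the kind of input such algorithms expect, per-page token counts like those described above can be collapsed into a volume-level bag of words. The Python sketch below uses a made-up, simplified page structure (`pages`) and a hypothetical helper (`volume_token_counts`); it is not the actual Extracted Features file schema or an HTRC API.

```python
from collections import Counter

# Hypothetical, simplified stand-in for one volume's per-page features.
# Each page maps a token to {part-of-speech tag: count on that page}.
pages = [
    {"whale": {"NN": 2}, "sea": {"NN": 1}},
    {"whale": {"NN": 1}, "sail": {"VB": 1, "NN": 1}},
]


def volume_token_counts(pages):
    """Sum per-page counts (across all POS tags) into one volume-level Counter."""
    totals = Counter()
    for page in pages:
        for token, pos_counts in page.items():
            totals[token] += sum(pos_counts.values())
    return totals


counts = volume_token_counts(pages)
```

A volume-level count of this shape is the sort of word-frequency input that a topic model or a simple classifier would typically consume.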

Co-sponsored by the Center for Digital Humanities and the Princeton Research Data Service

To request disability-related accommodations for this event, please contact pulcomm@princeton.edu at least 3 working days in advance. 

Workshop

HTRC Data Capsules Environment

July 9 10:00–11:30 AM

In this third of four workshops, we will introduce you to the HTRC’s capsule environment and how it can be used by intermediate and advanced researchers. An HTRC Data Capsule is a virtual machine with special security settings that allows researchers to access text data from HathiTrust, analyze it using the text and data mining methods of their choice, and then export only the results of their analysis. This session will include a hands-on activity using an HTRC Data Capsule.

Prerequisites: either the “Introduction to HTRC for Text and Data Mining” workshop, or some previous experience with HathiTrust or HTRC.

Co-sponsored by the Center for Digital Humanities and the Princeton Research Data Service

To request disability-related accommodations for this event, please contact pulcomm@princeton.edu at least 3 working days in advance. 
