The second and final session in the Spring 2020 Public Humanities Working Group series. Participants are welcome to bring their own lunch to this virtual meeting.
Join us for monthly conversations with faculty, graduate students, and staff from across the university to think together about our shared humanistic work and its larger public implications beyond university life, and to consider more broadly the value and relevance of the humanities in our present moment.
Presented by Ben Lee, a 2020 Innovator-in-Residence at the Library of Congress
Ben writes: The 16 million digitized historic newspaper pages within Chronicling America, a joint initiative by the Library of Congress and the NEH, represent an incredibly rich resource for a wide range of users. Historians, journalists, genealogists, students, and members of the American public explore the collection regularly via keyword search. But how do we navigate the abundant visual content? Newspaper Navigator is a project that I am currently carrying out while an Innovator-in-Residence at the Library of Congress, in collaboration with Library of Congress Labs, the National Digital Newspaper Program, and my PhD advisor, Professor Daniel Weld, at the University of Washington. Newspaper Navigator consists of two parts. The first is to extract headlines, images, illustrations, maps, comics, and editorial cartoons from millions of newspaper pages by training an image recognition model on thousands of crowdsourced annotations collected by the Library of Congress’s Beyond Words initiative. The second part of Newspaper Navigator is to reimagine how we can navigate this wealth of visual content through an exploratory search interface, enabling users to define queries for concepts of their own choosing (which I refer to as “open faceted search”).
In this talk, I will share my current progress with Newspaper Navigator, including running the visual content recognition pipeline at scale. I will also discuss how this project, including the resulting datasets and search interface, can contribute to both computer science research and research within digital humanities.