New Languages for NLP: Building Linguistic Diversity in the Digital Humanities Conference


May 11, 1:30 pm – May 12, 6:00 pm

Participants from the Center for Digital Humanities New Languages for NLP Institute will share results, challenges, and lessons learned from a year spent learning how to create linguistic data and train NLP models for under-resourced languages.

The panels will take place at the CDH on B Floor of Firestone; see below for keynote locations. Registration is required to attend the panels, whether in person or on Zoom.

Wednesday, May 11

1:00-2:30 pm EDT. Challenges in the Development of NLP Resources for New Languages: Case Studies from Kannada, Quechua, and Russian (panel)

  • Speakers: Katherine Bowers (University of British Columbia), John Hale (University of Georgia), Kate Holland (University of Toronto), Chad Howe (University of Georgia), and Jajwalya Karajgikar (University of Pennsylvania)
  • In-person registration or Zoom registration

2:45-4:15 pm EDT. Right to Left and Back: NLP for Ottoman Turkish, Yiddish and Classical Arabic (panel)

  • Speakers: Ephraim Berkovitch (ZipRecruiter), Maroussia Bednarkiewicz (Eberhard Karls Universität), Irene Kirchner (Georgetown University), Sinai Rusinek (University of Haifa), and Romain Thurin (University of Notre Dame)
  • Moderator: Christiane Fellbaum (Princeton University)
  • In-person registration or Zoom registration

4:30-6:00 pm EDT. David Bamman (UC-Berkeley), “Representation in Literary NLP” (keynote)

  • Live-streamed. No registration required
  • Julis Romo Rabinowitz Building 399

Thursday, May 12

1:00-2:30 pm EDT. Creating Annotated Corpora for Yoruba, Efik and Tigrinya (panel; Institute participants only)

  • Speakers: Cameron Gibson (CUNY Graduate Center), Utitofon Inyang (UC-Riverside), Temitayo Olatoye (University of Eastern Finland), and Aidan Malanoski (CUNY Graduate Center)

2:45-4:00 pm EDT. East Asian Historical Language Models: Beyond the Mainstream (panel)

  • Speakers: Nick Budak (Stanford University), Alíz Horváth (Eötvös Loránd University), and Gian Rominger (Princeton University)
  • Moderator: Anna Shields (Princeton University)
  • In-person registration or Zoom registration

4:30-6:00 pm EDT. Ines Montani (Explosion AI), “Solutions for Advanced NLP for Diverse Languages” (keynote)

  • Live-streamed. No registration required
  • Computer Science Building 104

This event is part of the New Languages for NLP: Building Linguistic Diversity in the Digital Humanities Institute, hosted by the Center for Digital Humanities in partnership with DARIAH-EU, with generous grant support from the National Endowment for the Humanities.
