May 11–12: New Languages for NLP Conference

1 May 2022

Participants from the New Languages for NLP Institute will share results, challenges and lessons learned while training NLP models for under-resourced languages.


Updated 5/5

Join us in person or virtually on May 11 and 12 for the culmination of the year-long New Languages for NLP: Building Linguistic Diversity in the Digital Humanities Institute, hosted by the Center for Digital Humanities in partnership with DARIAH-EU.

Researchers from the language teams will gather to share results and challenges from creating linguistic data and training NLP models for under-resourced languages.

The conference will feature keynotes by David Bamman (UC-Berkeley) and Ines Montani (Explosion AI).

Three of the four panels and both keynotes are open to the public. For the panels, registration is required for both in-person and Zoom options. Registration is not required for the keynotes.

The panels will take place at the CDH on B Floor of Firestone Library; see below for keynote locations.

Wednesday, May 11

1:00-2:30 pm EDT. Challenges in the Development of NLP Resources for New Languages: Case Studies from Kannada, Quechua, and Russian (panel)

  • Speakers: Katherine Bowers (University of British Columbia), John Hale (University of Georgia), Kate Holland (University of Toronto), Chad Howe (University of Georgia), and Jajwalya Karajgikar (University of Pennsylvania)
  • In-person registration or Zoom registration

2:45-4:15 pm EDT. Right to Left and Back: NLP for Ottoman Turkish, Yiddish, and Classical Arabic (panel)

  • Speakers: Ephraim Berkovitch (ZipRecruiter), Maroussia Bednarkiewicz (Eberhard Karls Universität), Irene Kirchner (Georgetown University), Sinai Rusinek (University of Haifa), and Romain Thurin (University of Notre Dame)
  • Moderator: Christiane Fellbaum (Princeton University)
  • In-person registration or Zoom registration

4:30-6:00 pm EDT. David Bamman (UC-Berkeley), “Representation in Literary NLP” (keynote)

  • Live-streamed. No registration required.
  • Julis Romo Rabinowitz Building 399

Thursday, May 12

1:00-2:30 pm EDT. Creating Annotated Corpora for Yoruba, Efik, and Tigrinya (panel; Institute participants only)

  • Speakers: Cameron Gibson (CUNY Graduate Center), Utitofon Inyang (UC-Riverside), Temitayo Olatoye (University of Eastern Finland), and Aidan Malanoski (CUNY Graduate Center)

2:45-4:00 pm EDT. East Asian Historical Language Models: Beyond the Mainstream (panel)

  • Speakers: Nick Budak (Stanford University), Alíz Horváth (Eötvös Loránd University), and Gian Rominger (Princeton University)
  • Moderator: Anna Shields (Princeton University)
  • In-person registration or Zoom registration

4:30-6:00 pm EDT. Ines Montani (Explosion AI), “Solutions for Advanced NLP for Diverse Languages” (keynote)

  • Live-streamed. No registration required.
  • Computer Science Building 104

This event is part of the New Languages for NLP: Building Linguistic Diversity in the Digital Humanities Institute, in partnership with DARIAH-EU and generously supported by a grant from the National Endowment for the Humanities. Any views, findings, conclusions, or recommendations expressed in this blog post do not necessarily represent those of the National Endowment for the Humanities.
