BEGIN:VCALENDAR
BEGIN:VEVENT
SUMMARY:SCOOP: Source Codes of the Past: Launching an international ATR/HT
 R Network for Manuscript Analysis
DTSTART:20250612T130000Z
DTEND:20250613T223000Z
UID:https://cdh.princeton.edu/events/2025/06/scoop-source-codes-of-the-pas
 t-launching-an-international-atrhtr-network-for-manuscript-analysis/
DESCRIPTION:This workshop is by invitation only. For more information\,
  please get in touch with croughan@princeton.edu\,
  hreimitz@princeton.edu\, or lucia.waldschuetz@princeton.edu.\n\n
 Organized by:\n
 Institute for Advanced Study (IAS)\, Princeton\n
 Center for Digital Humanities (CDH)\, Princeton University\n
 Manuscripts\, Rare Books\, and Archival Studies (MARBAS)\, Princeton
  University\n
 Humanities Initiative\, Princeton University\n
 Digital Lab\, Institute for Medieval Research (IMAFO)\, Austrian Academy
  of Sciences\n\n
 While recent years have seen many significant developments in ATR/HTR
  (automatic text recognition\, handwritten text recognition)\, work
  remains to be done in adapting these technologies to different
  scripts\, textual traditions\, and manuscript structures\, especially
  for low-resource languages and materials. Additionally\, the
  integration of ATR with dataset curation\, text re-use analysis\,
  editorial workflow automation\, and other methodologies is
  transforming the way researchers engage with manuscript sources. This
  workshop will bring together humanities and social science scholars\,
  software engineers\, and machine learning researchers so that
  technological and humanistic expertise can inform one another. The
  workshop will also lay the groundwork and define the agenda for a
  second SCOOP exchange meeting in Vienna in the summer of 2026\, to be
  hosted by the Austrian Academy of Sciences and the University of
  Vienna.\n\n
 Sponsored by:\n
 Center for Collaborative History\, Princeton University\n
 Department of Classics\, Princeton University\n
 Seeger Center for Hellenic Studies\, Princeton University\, with the
  support of the Stanley J. Seeger Hellenic Fund\n
 Program in Medieval Studies\, Princeton University\n
 Committee for the Study of Late Antiquity\, Princeton University\n\n
 Program Schedule\n\n
 Thursday\, June 12th\n\n
 9:15-10:30\n
 Session 1a: HTR Technology Development\n
 Moderator: Martin Roček (Charles University\, Prague\, Institute for
  Medieval Research\, Austrian Academy of Sciences\, Vienna)\n
 Tobias Hodel (University of Bern) “Building General Models:
  Approaches\, State-of-the-Art\, and Challenges”\n
 Achim Rabus (University of Freiburg) “Pragmatic HTR: Smart Models\,
  Synthetic Data\, and Navigating the Performance-Usability
  Landscape”\n
 Benjamin Kiessling (Paris Sciences et Lettres University) “Large
  Multilingual ATR Models and Humanities Practice - Conflicts and
  Pathways”\n\n
 10:45-12:00\n
 Session 1b: HTR Technology Development\n
 Moderator: Martin Roček (Charles University\, Prague\, Institute for
  Medieval Research\, Austrian Academy of Sciences\, Vienna)\n
 Matthew Miller (University of Maryland) “Approaches to Open Source\,
  Large-scale Arabic-script Text Recognition”\n
 Andrew Janco (Princeton University) & Ann Farnsworth-Alvear
  (University of Pennsylvania) “Auto-Cataloging Research Materials with
  'Small' Vision-Language Models”\n
 John Pavlopoulos (Athens University of Economics and Business\, and
  Archimedes\, Athena Research Center\, Greece) “Learning to Adapt:
  Addressing Character Frequency Distribution Shifts in HTR”\n\n
 1:00-2:30\n
 Session 2: Document or Handwriting Classification\n
 Moderator: Tobias Hodel (University of Bern)\n
 Aaron Hershkowitz (Institute for Advanced Study) & Nicholas Howe
  (Smith College) “Classifying Squeezes: Experiments in HTR for Greek
  Epigraphy”\n
 Sebastian Sobecki (University of Toronto) “Communities of Practice:
  Scripts\, Scribes\, and the Production of Literature in London\,
  1377-1471”\n
 Serena Ammirati (Università Roma Tre) & Paolo Merialdo (Università
  Roma Tre) “Explanatio manifesta: towards high-level explanations of
  medieval handwriting identification systems”\n
 Isabelle Marthot-Santaniello (University of Basel) & Giuseppe De
  Gregorio (University of Basel) “Comparing Alphas: Detection and
  Recognition of Ancient Greek Characters on Papyri and their
  Applications in Digital Paleography”\n\n
 2:45-4:30\n
 Session 3: HTR Methodological Challenges\n
 Moderator: Christine Roughan (Princeton University)\n
 Alexandra Gillespie (University of Toronto) “What is a Book in the Age
  of Machine Learning?”\n
 Bernhard Bauer (University of Graz) “HTR and Early Medieval
  Multilingual Glosses: Establishing the GlossIT Corpus”\n
 Anna Michalcová (Charles University Prague\, Czech Language Institute
  at the Czech Academy of Sciences\, Institute for Medieval Research\,
  Austrian Academy of Sciences\, Vienna) “Orthographic Variability as
  HTR Challenge: Insights from Medieval Czech Manuscripts”\n
 Maria Konstantinidou (Democritus University of Thrace) “First Pass at
  the Unsung: HTR for Byzantine Music Notation”\n
 Jan Odstrčilík (Institute for Medieval Research\, Austrian Academy of
  Sciences\, Vienna) “Different Transcription Conventions for Various
  Languages in ATR: The Case of Latin-Czech Medieval Sermons”\n\n
 4:45-6:15\n
 Session 4: Language Challenges\n
 Moderator: Anna Michalcová (Charles University Prague\, Czech Language
  Institute at the Czech Academy of Sciences\, Institute for Medieval
  Research\, Austrian Academy of Sciences\, Vienna)\n
 George Kiraz (Beth Mardutho: The Syriac Institute\, and Institute for
  Advanced Study) “Challenges in Building Syriac OCR: HTR Models for
  Syriac”\n
 Jajwalya Karajgikar (University of Pennsylvania) “An Overview of HTR
  for South Asian Manuscripts”\n
 Ajay Rao (University of Toronto) & Sloane Geddes (University of
  Toronto) “Opportunities and Obstacles: Deploying Escriptorium in the
  HTR of Early Modern Sanskrit Manuscripts”\n
 Osama Eshera (University of Maryland) “From Script to Structure: Open
  Problems in the Automatic Analysis of Islamic Manuscripts”\n\n
 Friday\, June 13th\n\n
 9:00-10:45\n
 Session 5: Datasets and Institutions\n
 Moderator: Jan Odstrčilík (Institute for Medieval Research\, Austrian
  Academy of Sciences\, Vienna)\n
 Alix Chagué (Inria\, Paris\, and Université de Montréal)
  “HTR-United schema for dataset descriptions”\n
 Jessie Dummer (University of Pennsylvania) “Collections as Data at
  Penn Libraries and Beyond”\n
 Thibault Clérice (ALMAnaCH\, Inria\, Paris) “From a post-doc about
  old French to a 200k lines dataset in 10 different languages:
  building CATMuS Medieval”\n
 Seth Kulick (Linguistic Data Consortium\, University of Pennsylvania)
  “Linguistic Data Consortium Pilot Project: Motivations and
  Design”\n
 Christine Roughan (Princeton University) “Integrating ATR Software
  with University HPC Infrastructure”\n\n
 11:00-12:30\n
 Session 6: Leveraging Outputs: Text Reuse\, NLP\, and more\n
 Moderator: Tim Geelhaar (Goethe University Frankfurt)\n
 William Mattingly (Yale University) “Semantic Searching with Vector
  Databases and their Applications in Quote Identification”\n
 David Smith (Khoury College of Computer Science\, Northeastern
  University) “Textual Criticism as Language Modeling: From
  Transcription to Collation and Back Again”\n
 Martin Roček (Charles University\, Prague\, Institute for Medieval
  Research\, Austrian Academy of Sciences\, Vienna) “Enhancing Sentence
  Similarity Search with S-BERT: A Semantic Approach”\n
 Seth Kulick (Linguistic Data Consortium\, University of Pennsylvania)
  “Orthographic variation and post-OCR correction for Yiddish”\n\n
 2:00-3:30\n
 Plenary Discussion: Future of HTR\n\n
 4:00-5:30\n
 Organizational session\; planning for the 2026 meeting\n\n
 https://cdh.princeton.edu/events/2025/06/scoop-source-c
 odes-of-the-past-launching-an-international-atrhtr-network-for-manuscript-
 analysis/
END:VEVENT
END:VCALENDAR
