Text Technologies for Manuscript Cultures
Using emerging technologies to transform research, teaching and understanding of pre-modern evidence

This research group explores how machine learning and AI approaches can be used to analyze the textual culture of the pre-modern world.
Launched in 2022 with the Machine Learning and Future of Philology Symposium, this group is a collaboration between CDH and MARBAS (Manuscript, Rare Book and Archive Studies).
We are particularly interested in handwritten text recognition (HTR) techniques for scripts and languages such as Arabic, Hebrew, Syriac, Latin, and Ancient and Byzantine Greek. We run workshops and host events to build skills and foster a dynamic network of scholars at Princeton and beyond.
Key Princeton partners include the Geniza Lab, the Logion Project, and faculty and students from History, Medieval Studies, and the Seeger Center for Hellenic Studies.
The group welcomes Princeton affiliates from all disciplines who are interested in exploring technologies for manuscripts, early print and other pre-modern evidence. Please contact CDH postdoc Christine Roughan for more information or to get involved.
Related projects
Bringing HTR to the HPC
Customizing the eScriptorium HTR software for use on Princeton high-performance computing hardware

Princeton Open HTR Initiative
Establishing research infrastructure to support the use of HTR at Princeton for manuscripts and archival documents in a variety of languages and scripts

Segmenting Paratextual Material in Arabic Scientific Manuscripts
Computational methods for classifying and analyzing visual aspects of the manuscript folio

Related events
SCOOP: Source Codes of the Past: Launching an international ATR/HTR Network for Manuscript Analysis
