The First Age of Lexicons

Lexicon Production and Information Management in Early China

Computer Vision
History
woodshaving

Qiran Jin (East Asian Studies) sought to “reconstruct” a corpus from two millennia ago: thousands of wood manuscripts and “fragmented shavings” from Han China.

The fragments are “relics of student writing practices with a lost primer at the time,” Qiran explained. “In this practice, students would hold a multifaceted stick in one hand and practice writing Chinese characters with a brush in the other. After each practice session, they would scrape off the writing with a knife, creating shavings as they repeated the process.”

The result was multiple layers of writing.

Over the course of the semester, Qiran used stylometry tools, such as “principal component analysis (PCA), Bootstrap consensus trees and BERT,” to try to determine the content of both the primer and “the multi-layers of the fragmented manuscripts.”
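
To give a rough sense of what the PCA step in a stylometric workflow can look like, here is a minimal sketch in Python. It is an illustration only, not Qiran's actual pipeline: the fragment strings are toy placeholders rather than real transcriptions, and the helper names (frequency_matrix, first_principal_component) are hypothetical. Each fragment is turned into a vector of relative character frequencies, and the fragments are then projected onto the first principal component, found here by power iteration on the covariance matrix.

```python
from collections import Counter
import math

# Toy placeholder "fragments" (assumption: NOT real manuscript data).
fragments = ["aaabbc", "aabbbc", "cddddd", "ccdddd"]
vocab = sorted(set("".join(fragments)))


def frequency_matrix(texts, vocab):
    """Return one row of relative character frequencies per text."""
    rows = []
    for text in texts:
        counts = Counter(text)
        total = sum(counts[ch] for ch in vocab) or 1
        rows.append([counts[ch] / total for ch in vocab])
    return rows


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    Uses power iteration on the sample covariance matrix, which is
    enough for a small illustrative example.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Non-uniform start vector: a uniform one lies in the null space,
    # because every row of relative frequencies sums to 1.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


rows = frequency_matrix(fragments, vocab)
direction, scores = first_principal_component(rows)

for text, score in zip(fragments, scores):
    print(f"{text}: {score:+.3f}")
```

In this toy setup, fragments with similar character profiles land on the same side of the first component. In practice one would more likely rely on an established library for PCA and work from far richer features than raw character counts.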

Team

Graduate Fellow

Grants

2025

Graduate Fellowship