Meet our new RSE, Hao Tan!
8 August 2025
Hao joined the CDH in June.
Earlier this summer, we welcomed Hao Tan as our new research software engineer (RSE)! We asked about her research interests and her hopes for her role at Princeton.
Your graduate work is in both Computer and Information Technology and East Asian Languages and Civilizations. What drew you to those two disciplines?
I’ve always been fascinated by how a shift in perspective can reveal something we thought we already knew. Computer science—especially machine learning—gives us tools to uncover patterns invisible to human eyes, whether mapping social networks in a corpus or tracing how metaphors travel across centuries. East Asian Studies, rooted in the culture I grew up in, reminds me that every data point represents a human choice shaped by power, memory, and place. It draws me to quieter, alternative narratives—women’s voices, local chronicles, micro-histories—that often slip through the cracks. Studying both disciplines lets me ask two essential questions: “What does the data show?” and “Why was it written that way?” This conversation between method and meaning is what draws me to digital humanities.
Tell us a little about your research interests.
I have three major areas of research interest:
- Applying AI to support humanities research. In the era of large language models (LLMs), there's significant potential to integrate these tools into humanities research. I’m interested in how we can leverage the power of AI to help develop pipelines, workflows, and methods that extract, curate, and interpret data at scale while preserving the interpretive flexibility that humanists value.
- Cultural and literary analysis related to East Asia. I’m particularly drawn to using Natural Language Processing (NLP) to examine how historical narratives are constructed, remembered, and sometimes challenged. I use digital humanities methods both to uncover societal trends at a quantitative scale and to identify textual details for close reading.
- Leveraging computer vision in art historical research. My graduate work on East Asian wall paintings sparked my curiosity about how vision techniques can help analyze visual culture—from motif tracking and style analysis to cross-cultural influence detection. While this area remains experimental and faces challenges like copyright restrictions and computing resource limitations, I’m excited by its potential and eager to connect with others exploring similar paths.
What are you looking forward to in your new role? What kind of projects do you plan to work on as an RSE at the CDH?
I’m most excited about CDH’s collaborative research partnerships, where I work as an RSE with Princeton faculty and scholars, from historians and musicologists to literary critics, to turn their research questions from puzzles on a whiteboard into elegant, sustainable code, data, and insight.
Concretely, I plan to focus on three overlapping strands of work. First, research partnerships: building bespoke pipelines (cleaning messy archives, fine-tuning language models, designing faceted search) so faculty can ask questions they couldn’t pose before. Second, language diversity: extending tools beyond English-centric assumptions. My background in East Asian studies and NLP will help us support right-to-left scripts, character-based languages, and mixed historical orthographies, and turn those requirements into reusable components for other non-Latin projects. Third, lightweight reproducibility: not only building web applications that host and present data, but also delivering interactive notebooks, data pipelines, and clear documentation so scholars can rerun, tweak, and cite every step without wrestling with infrastructure.
In short, I’m looking forward to being a bridge—between disciplines, between human interpretation and computational scale, and between Princeton researchers and the wider DH community.
What advice do you have for students with an interest in digital humanities?
From my own (still-evolving) experience, two things have helped.
- Try something unfamiliar and cutting-edge. If you’re comfortable with text, grab a handful of images or a short audio clip and see what you can do with a vision or speech library. When the material changes, the questions, and often the insights, change with it. Stepping outside your usual toolkit keeps the work fresh and reminds you how much there is to learn. Feel free to try out the latest model or method, just enough to see what it might add.
- Keep the experiments small and low-stakes. Quick notebooks that never leave your laptop are fine; each one teaches a workflow or library you can reuse later. The more prototypes you build and discard, the quicker you’ll recognize what really works—and the easier it is to share clean, reproducible code when it matters.