Making Research Easier to Save: A Guide to Zotero Integration for Academic Websites

Research

11 November 2025

Authors

Screenshot of a Zotero library showing a list of items with columns for title, creator, and date.

A screenshot of the CDH Zotero Library

Zotero is one of the most popular tools for managing academic references, and for good reason. With its free browser extension, the Zotero Connector, you can save citations from journal articles, books, and websites with a single click. It automatically captures titles, authors, publication dates, and other details, so you can focus on your research instead of typing out citations by hand.

But if you’ve ever tried to save a citation from an academic website or a historical resource site, you may have noticed it doesn’t always work as expected. Sometimes the citation is missing key details, or the Zotero Connector doesn’t detect anything at all. This isn’t a flaw in Zotero itself — it’s because Zotero relies on websites to expose their metadata in a way it can understand.

At the CDH, we wanted to make it easier for scholars to save and cite our work. That meant improving how metadata is displayed on our websites — including the CDH Blog and the Princeton Prosody Archive (PPA) — so that Zotero could pick it up cleanly. Better metadata doesn’t just help Zotero; it also improves search engine visibility and benefits anyone trying to reuse our content. This became my first task after joining the CDH, and it was a perfect way to get familiar with our codebases, development cycle, and, most importantly, the persistence and problem-solving mindset needed for this kind of work.

To get Zotero to pick up their data correctly, many large scholarly sites, such as arXiv and Google Scholar, rely on custom “translators”: site-specific scripts that tell Zotero how to read their pages. But writing and maintaining a translator can be time-consuming. Working with our Lead RSE, Rebecca, I found a much lighter-weight solution: exposing the right metadata in the right format, COinS, so that Zotero can scrape it without a custom script.

Before diving deeper into the approaches we explored and the experiments we ran, here’s the quick takeaway: despite being hard for humans to read, COinS is one of the most effective ways to implement Zotero integration on a website. It has a key advantage over many other formats: it can expose genre data, allowing Zotero to detect the correct item type, which most of the other methods we tried could not do.

Experiments and Results

We didn’t know from the start that COinS would turn out to be the best option for our sites: it is an aging technology, about two decades old, and few people still use it. It is so poorly maintained that we could only access its documentation through the Wayback Machine. As a result, the search for the best Zotero integration turned into a series of rabbit holes, and Rebecca and I ran many experiments with different approaches along the way. In this section, we share those experiments and what we learned, partly to save you the time we spent testing and partly to show the trade-offs between the methods.

We began with Zotero’s own documentation on exposing metadata, which outlines several supported formats. From there, we experimented with five different approaches:

  1. Highwire metadata tags
  2. Schema.org JSON-LD
  3. COinS
  4. unAPI + MARC
  5. unAPI + Dublin Core

We started testing these formats on the CDH Blog. In this first stage, we tried three methods: Highwire metadata tags, Schema.org JSON-LD, and COinS.

Highwire metadata tags were developed by HighWire Press in the late 1990s for scholarly publishers to describe article-level details in HTML headers. They’re essentially <meta> tags that list key fields like title, author, publication date, journal name, and volume/issue. They’re easy to add to any web page, human-readable, and widely recognized by tools like Zotero. In our experiments, we found that Highwire tags can help Zotero detect some resource types: for example, adding a journal title tag (citation_journal_title) prompts Zotero to classify the item as a journal article. However, the format doesn’t cover all content types, and in our case it couldn’t represent the “blog post” type we wanted. As a result, even if a page was clearly a blog post, Zotero would still save it generically as a “web page”. Today, if you view a CDH Blog post with the Zotero Connector installed, the Connector icon changes to a “blog post” icon; with Highwire metadata tags alone, it would remain the generic page icon, even if all other citation fields were correct.
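As a rough illustration of what these tags look like, here is a minimal Python sketch that renders Highwire-style <meta> tags for a post. The citation_* names are the commonly used Highwire tag names; the helper name highwire_meta_tags and the sample values are made up for this example and are not taken from the CDH codebase.

```python
from html import escape


def highwire_meta_tags(title, authors, date, journal_title=None):
    """Render Highwire-style <meta> tags as a list of HTML strings.

    Hypothetical helper for illustration only.
    """
    fields = [("citation_title", title)]
    # One citation_author tag per author, in order.
    fields += [("citation_author", name) for name in authors]
    fields.append(("citation_publication_date", date))
    if journal_title:
        fields.append(("citation_journal_title", journal_title))
    return [
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in fields
    ]


# Made-up sample values.
tags = highwire_meta_tags(
    title="An Example Post",
    authors=["Ada Example", "Ben Sample"],
    date="2025/11/11",
)
print("\n".join(tags))
```

Note that nothing in this set of fields says “blog post”, which is the gap described above.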

Schema.org JSON-LD is a linked-data standard backed by Google, Bing, and other search engines. JSON-LD (“JavaScript Object Notation for Linked Data”) allows you to embed rich structured data in a <script> tag, making it easy for machines to read without cluttering the HTML. It’s widely used across the web, including in academic publishing, and is the structured-data format that Google Search recommends. Zotero’s documentation doesn’t list JSON-LD as supported, but there have been multiple GitHub issues and forum threads requesting it, with the most recent activity in 2023. Because JSON-LD is so widely adopted and useful for other metadata consumers, we were hopeful that Zotero might support it as well. We tested it with Zotero 7.0.17 and the latest Chrome Connector, but found that Zotero still ignores JSON-LD entirely. As a technical solution, this method is highly desirable, and we hope to hear from the Zotero team about their plans for JSON-LD support.
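For comparison, a JSON-LD block for a blog post might be generated along these lines. This is a minimal sketch: the helper name blog_post_jsonld, the choice of Schema.org properties, and the sample values are assumptions for illustration, not the markup we tested.

```python
import json


def blog_post_jsonld(title, authors, date_published, url):
    """Return a <script> element holding Schema.org BlogPosting metadata.

    Hypothetical helper for illustration only.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": title,
        "author": [{"@type": "Person", "name": name} for name in authors],
        "datePublished": date_published,
        "url": url,
    }
    # Escape "</" so the JSON cannot close the <script> element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Made-up sample values.
snippet = blog_post_jsonld(
    title="An Example Post",
    authors=["Ada Example"],
    date_published="2025-11-11",
    url="https://example.org/blog/an-example-post/",
)
print(snippet)
```

The type is stated explicitly (here BlogPosting), which is what makes the format attractive even though Zotero did not read it in our tests.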

COinS (ContextObjects in Spans) is a method from the mid-2000s for embedding OpenURL ContextObjects directly in HTML using <span> elements. It looks like a block of URL-encoded text, which is not very readable to humans, and, as mentioned earlier, it is no longer actively maintained. The good part is that it is very lightweight, almost a “single line” solution, and Zotero supports it natively. It is straightforward to implement: you generate the encoded string and place it in the title attribute of an empty <span> with the class Z3988. And unlike Highwire tags, we happily discovered, COinS can encode detailed type information through its rft.genre key, allowing Zotero to correctly identify a blog post, journal article, book, or other genre.
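To make the “single line” idea concrete, here is a minimal Python sketch that builds one COinS span. The post does not list the exact keys our sites emit, so this example assumes the standard OpenURL book format with a genre of bookitem; the helper name coins_span and the sample values are hypothetical.

```python
from html import escape
from urllib.parse import quote, urlencode


def coins_span(fields):
    """Return a COinS <span> for a list of (key, value) pairs.

    Hypothetical helper for illustration only.
    """
    pairs = [("ctx_ver", "Z39.88-2004")] + list(fields)
    # urlencode percent-encodes each value and joins the pairs with "&";
    # escape() then turns "&" into "&amp;" for use inside an attribute.
    context_object = urlencode(pairs, quote_via=quote)
    return f'<span class="Z3988" title="{escape(context_object, quote=True)}"></span>'


# Made-up sample values using the OpenURL book format.
span = coins_span([
    ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
    ("rft.genre", "bookitem"),  # a chapter or section within a book
    ("rft.atitle", "An Example Chapter"),
    ("rft.btitle", "An Example Book"),
    ("rft.au", "Ada Example"),
    ("rft.date", "1850"),
])
print(span)
```

The span is empty and invisible to readers; all of the metadata lives in its title attribute. How a given genre value maps to an item type is decided by the tool reading the span, so it is worth checking the result in Zotero rather than assuming it.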

After implementing COinS on the CDH Blog, we tested it carefully and verified that it provided all the features we needed and that the exported citation data was accurate and complete. Having seen COinS work so well for the blog, we moved on to the Princeton Prosody Archive (PPA). Here, we faced a challenge the blog didn’t have: the PPA includes a search results page, and we wanted researchers to be able to export multiple items at once.

Screenshot of a Zotero desktop library window open in front of a web page, showing a saved article titled “Checking in with Ed Baring: Motivation and lessons behind citing Marx.”

Exporting a CDH blog post to Zotero with a single click.

Our existing solution on the PPA for bulk export used MARC + unAPI. MARC (“Machine-Readable Cataloging”) is a metadata standard developed by the Library of Congress for library catalogs. It’s very precise and can represent almost any bibliographic record — but it’s verbose and hard to read. unAPI is a simple web protocol that lets a page advertise that an item is available in multiple metadata formats, so a tool like Zotero can fetch the preferred one. For PPA, we work with data from both HathiTrust and Gale. HathiTrust serves MARC records directly via API, so that data is retrieved when requested; for Gale, we cache MARC files in a pairtree directory structure. This dual approach had three major drawbacks: first, it was slow overall, since each bulk export required multiple unAPI calls per item; second, adding new collections meant additional development work to incorporate new MARC logic or cache more files; lastly, it only worked for full works and not articles/excerpts.
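The request overhead can be sketched as follows. This assumes a client that first asks the unAPI server which formats are available for an item (?id=...) and then fetches the record in one of them (?id=...&format=...); the server URL, item identifiers, and helper name are made up for illustration, and a real client may issue a different number of requests.

```python
from urllib.parse import urlencode


def unapi_request_urls(server_url, item_ids, fmt):
    """List the URLs a unAPI client might fetch to export the given items.

    Hypothetical helper for illustration only; assumes the client asks for
    the available formats of each item before requesting the record itself.
    """
    urls = []
    for item_id in item_ids:
        # 1. Which formats are available for this item?
        urls.append(f"{server_url}?{urlencode({'id': item_id})}")
        # 2. Fetch the record in the chosen format.
        urls.append(f"{server_url}?{urlencode({'id': item_id, 'format': fmt})}")
    return urls


# Made-up server URL and identifiers.
urls = unapi_request_urls(
    "https://example.org/unapi/",
    ["item-001", "item-002", "item-003"],
    "marc",
)
for url in urls:
    print(url)
print(len(urls), "requests for 3 items")
```

Under these assumptions the number of requests grows with the number of selected items, and each record request may in turn mean an upstream API call or a read from the file cache on the server.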

To improve on that, we attempted a new solution using Dublin Core + unAPI. Dublin Core is a much simpler metadata schema, developed in the 1990s by the Dublin Core Metadata Initiative to describe resources across the web. It defines just 15 core elements, such as title, creator, date, and type, and is easy to embed in HTML or serve through unAPI. It’s very human-readable and straightforward to implement. However, we discovered that its type field is too broad: it uses only general categories like “Text,” “Image,” or “MovingImage,” and can’t specify subtypes such as “Book,” “Book Section,” or “Journal Article.” As a result, it failed for the same reason as Highwire metadata tags: Zotero could not detect the specific item type.
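The loss of detail is easy to see in a small sketch. The set below lists the terms of the DCMI Type Vocabulary; the mapping from item types to their closest DCMI term is our own illustration, with item type names chosen for the example, not something defined by Dublin Core or Zotero.

```python
# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# Hypothetical mapping from the item types we cared about
# to the closest available DCMI type.
CLOSEST_DCMI_TYPE = {
    "book": "Text",
    "bookSection": "Text",
    "journalArticle": "Text",
    "blogPost": "Text",
}

# Every value must be a real DCMI term.
assert set(CLOSEST_DCMI_TYPE.values()) <= DCMI_TYPES

distinct_types = set(CLOSEST_DCMI_TYPE.values())
print(f"{len(CLOSEST_DCMI_TYPE)} item types collapse into {sorted(distinct_types)}")
```

Once everything has been flattened to the same broad term, a consumer has no way to tell a book from a book section.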

Screenshot of the Princeton Prosody Archive website with a dialog box prompting the user to select items to add to a Zotero library.

PPA batch export demo

In the end, we returned to COinS. To enable Zotero to save multiple items from the PPA search results page, we simply embedded a COinS <span> for each search result. Despite being hard for humans to read, COinS delivered the correct item type, worked reliably for both single pages and bulk exports, and avoided the complexity and performance problems of MARC and unAPI. With COinS, the site supports reference export without any additional HTTP requests, storage, or dependencies, and it is lightweight enough that a bulk export completes within a second of a single click.
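The per-result embedding can be sketched like this. It is a simplified, standalone Python version: in a Django project the same logic would more naturally live in a template or template tag, and the record fields, helper names, and choice of the OpenURL book format here are assumptions for illustration, not the PPA implementation.

```python
from html import escape
from urllib.parse import quote, urlencode


def coins_span(fields):
    """Return one COinS <span> for a list of (key, value) pairs."""
    pairs = [("ctx_ver", "Z39.88-2004")] + list(fields)
    context_object = urlencode(pairs, quote_via=quote)
    return f'<span class="Z3988" title="{escape(context_object, quote=True)}"></span>'


def render_results(results):
    """Render search results as an HTML list with a COinS span per item."""
    items = []
    for result in results:
        span = coins_span([
            ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
            ("rft.genre", "book"),
            ("rft.btitle", result["title"]),
            ("rft.au", result["author"]),
            ("rft.date", result["year"]),
        ])
        items.append(f"<li>{escape(result['title'])} {span}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Made-up records standing in for a page of search results.
results = [
    {"title": "First Example Work", "author": "Ada Example", "year": "1850"},
    {"title": "Second Example Work", "author": "Ben Sample", "year": "1872"},
]
page = render_results(results)
print(page)
```

Because every span is already in the page, a tool that reads COinS can offer all of the results for selection without contacting the server again.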

Building COinS for Zotero Integration

Although COinS turned out to be the best approach, it’s an older standard, and there aren’t many up-to-date tutorials for it. So we prepared a separate technical guide for developers on how to implement COinS step by step, using examples from our Django/Wagtail projects at the CDH.

Wrapping Up

Improving Zotero integration for the CDH Blog and PPA turned out to be both a technical challenge and a practical win for our users. On the technical side, we tested multiple metadata approaches, learned their strengths and weaknesses, and ultimately settled on COinS as the most reliable, low-maintenance solution; we have also shared our process so other developers can benefit. On the user side, researchers can now save our blog posts and PPA resources to Zotero with the correct item type and complete metadata, without extra cleanup.

Although COinS is an older standard, it remains highly effective for this purpose, and its simplicity makes it easy to maintain across projects. What seemed like a simple task at first required more research, problem-solving, and iteration than expected, very much in the spirit of digital humanities projects. Better metadata is more than just a technical upgrade: it’s part of making research more accessible, discoverable, and connected. Whether you’re working on a small blog or a large digital archive, giving your readers that one-click “Save to Zotero” experience is a small change that can make a big difference.

If you try this approach or adapt it for your own project, I’d love to hear how it works for you - and if you’ve discovered a better method, I’d be just as interested to learn from your experience.

This work was a true collaboration with CDH Lead RSE Rebecca Koeser, who dove deep into every step of the process — from combing through Zotero’s documentation to running countless experiments.

Technical guide for developers: