SIC's spring cleaning: The challenges to "neutral" data
15 October 2015
The Blue Mountain bibliographic editing team has recently finished revising metadata for the French modernist magazine SIC (1916-1919), one of the 34 avant-garde periodicals in the Blue Mountain Project digital archive. This process has revealed the dynamic and fluid life of periodicals, the developments of modernist art and literature, and a slight tension between the ideal of generating “neutral” data and our scholarly urge to interpret. We are creating this metadata to provide information that is machine-actionable; that is, we are encoding it so that a computer program can harvest information about SIC and render it in a usable form. For example, if a researcher wants to track the price of the magazine over time, a few lines of script can read our metadata and harvest the price data, and the researcher can then use visualization software to render this information in a graph. If Blue Mountain is to provide a spring of clean data, the encoded metadata serves as the wellspring.
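To make the price-tracking scenario concrete, here is a minimal sketch in Python. It is only an illustration: the record layout (an `issue` element with a `date` attribute and a `price` child) is a hypothetical simplification, not Blue Mountain's actual encoding, and the prices are placeholder values rather than real SIC prices.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified records. The element and attribute names are
# illustrative assumptions, and the prices are placeholders, not SIC data.
SAMPLE_RECORDS = [
    '<issue date="1916-02"><price unit="centimes">20</price></issue>',
    '<issue date="1916-01"><price unit="centimes">20</price></issue>',
    '<issue date="1917-01"><price unit="centimes">40</price></issue>',
    '<issue date="1917-02"></issue>',  # no price recorded for this issue
]


def harvest_prices(records):
    """Return (date, price) pairs sorted by date, skipping issues with no price."""
    pairs = []
    for xml_text in records:
        issue = ET.fromstring(xml_text)
        price = issue.find("price")
        if price is None or not (price.text or "").strip():
            continue  # nothing to harvest for this issue
        pairs.append((issue.get("date"), float(price.text)))
    return sorted(pairs)


price_series = harvest_prices(SAMPLE_RECORDS)
for date, price in price_series:
    print(date, price)
```

The resulting list of date and price pairs could then be passed to whatever plotting or visualization tool the researcher prefers.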
We are editing our metadata within two frameworks that are actively maintained and used in the library and digital humanities (DH) communities: the Metadata Object Description Schema (MODS) and the Text Encoding Initiative (TEI). Each unit of information is given an “element” wrapper, which is a way of categorizing the specific information we want to encode. We create a MODS record for each issue, capturing the bibliographic and physical details of the magazine: title, date, publication information, types of contributions, author names, library call number, and so on. Each issue also has a TEI record, which describes its publication history and material features.
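As a rough illustration of what an “element” wrapper looks like, the following Python sketch builds a tiny MODS-style fragment. The element names `mods`, `titleInfo`, `title`, `originInfo`, and `dateIssued` come from the MODS schema, but the fragment is a hypothetical minimum for demonstration and not an actual Blue Mountain record; the date is that of Vol. 1, Issue 2 of SIC.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def q(tag):
    """Qualify a tag name with the MODS namespace."""
    return f"{{{MODS_NS}}}{tag}"


# Each piece of information sits inside an element that says what it is.
record = ET.Element(q("mods"))
title_info = ET.SubElement(record, q("titleInfo"))
ET.SubElement(title_info, q("title")).text = "SIC"
origin_info = ET.SubElement(record, q("originInfo"))
ET.SubElement(origin_info, q("dateIssued")).text = "1916-02"

mods_xml = ET.tostring(record, encoding="unicode")
print(mods_xml)
```

Because the title and the issue date are wrapped in differently named elements, a program can ask for one or the other without having to guess which string is which.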
The <msDesc> (meaning “manuscript description”) section of the TEI header, which derives its language from the manuscript tradition, provides the set of elements for encoding paper type, binding, page layout, typography, decoration, illustration, and price. But often, aspects of the magazine are not so easy to identify or to fit into a single available element tag. For instance, what is the mark shown below: a section divider, a decorative woodcut, a stray blob of toner? Or is it a combination of all three? Furthermore, we have to determine what level of detail is both attainable and most useful for scholars. Is describing at a granular level feasible (we must encode around 2,500 Blue Mountain magazine issues…)?
(SIC, Vol. 1, Issue 2. February 1916. Page 7 [leaf 4R])
Even something as seemingly straightforward as page numbering has proven complex: the issues of SIC are mostly unpaginated, with the exception of a few towards the end of the run. What, if anything, does this indicate about the SIC editors’ (or printers’) ideas about reading sequence, order, and citation? Was leaving pages unnumbered an intentional choice, or just happenstance? Even the issues that have page numbers vary: for instance, the first two pages of Volume 4, Issue 39 are unnumbered, and the third page is numbered “282.” How, as creators of a digital resource, do we provide consistent pagination in the metadata when the physical copies of the magazine aren’t consistent themselves? For encoding “extent,” we eventually decided to borrow from the manuscript tradition and call each printed “page” one side of a “leaf.” A typical issue of SIC has eight pages: four leaves, each with a recto and a verso side. For the occasional paginated issues, we continue to encode the recto-verso designation and indicate the printed pagination as well. It may seem archaic to rely on the language of manuscript description to describe the features of small-press avant-garde journals, but it was the best choice for ensuring a controlled method of content attribution.
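The leaf convention can be expressed as a small calculation. The Python sketch below is an illustration of the logic only, not the project's own tooling, and the function names are hypothetical: it maps a page's position in the issue to a leaf number plus recto (R) or verso (V), and carries along a printed page number where one exists.

```python
def leaf_designation(position):
    """Map a 1-based page position to a leaf label such as '4R' or '4V'."""
    if position < 1:
        raise ValueError("page positions start at 1")
    leaf = (position + 1) // 2  # two pages per leaf
    side = "R" if position % 2 == 1 else "V"  # odd positions are rectos
    return f"{leaf}{side}"


def describe_pages(page_count, printed_numbers=None):
    """Pair each leaf label with its printed page number, or None if unnumbered."""
    printed_numbers = printed_numbers or {}
    return [
        (leaf_designation(position), printed_numbers.get(position))
        for position in range(1, page_count + 1)
    ]


# A typical unpaginated eight-page issue: four leaves, recto and verso.
typical_issue = describe_pages(8)

# The first three pages of Volume 4, Issue 39: only the third is numbered.
issue_39_opening = describe_pages(3, {3: "282"})

print(typical_issue)
print(issue_39_opening)
```

Under this scheme the seventh page of an issue is leaf 4R, whether or not the page carries a printed number of its own.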
Creating enriched, detailed MODS and TEI records for historical periodicals is a formidable challenge. The conventions of cataloging and descriptive bibliography are generally geared toward monographs. Moreover, the experimental nature of late nineteenth- and early twentieth-century magazine culture, underscored by the evolving technological, cultural, and political realities from which these magazines emerged, often makes Blue Mountain magazines a moving target for strict, defined categorization. The process of metadata encoding and bibliographic research has created opportunities for lively discussion about our project and material overall, and continues to make us aware of the tension between the goal of creating “neutral” data and the necessity of interpretation.