- Dmitri Tymoczko
There are two major kinds of computational musical analysis: those using raw musical data (i.e., scores) and those using annotated data (scores along with indications of chord and key). Annotated datasets provide more power and detail but are labor-intensive to produce. Recently, a team based at the École Polytechnique Fédérale de Lausanne (EPFL) generated a major annotated corpus (scores and annotations) of the complete Beethoven string quartets: 16 quartets in 70 movements, comprising approximately 37,000 individual chords and 435 pages of music.
This dataset promises to be a major source of information about the syntax of classical-era tonality in general and the specifics of Beethoven’s language in particular. It will provide an unprecedented pedagogical resource, allowing teachers and students to search for arbitrary harmonic and melodic patterns. And it promises to help resolve major theoretical questions, such as the degree to which classical progressions obey “local” (or chord-to-chord) harmonic laws. Unfortunately, the current dataset is in many ways unreliable. The project team, in cooperation with Markus Neuwirth, the principal investigator at Lausanne, will clean the data to make it more consistent and reliable.
Curation work will include (a) fixing the scores themselves, particularly with respect to the vexed issue of measure numbers, partial measures, and repeats; (b) improving the analyses; (c) adapting my existing computer programs so that they work with this music, allowing for more systematic proofreading and analysis; and (d) developing a no-frills web application that will let nonprogrammers use the data.
CDH Grant History
- 2019–2020 Dataset Curation