Sonic Patterns

My computer skills, alas, are far more primitive than those of my collaborators Kris, Jordan, and David. This is partly why I am so excited to be part of this project: so that I can learn something from them! This is also why my first blog post will be decidedly less technical than theirs. I approach this project primarily from a music-analytical point of view, with a set of questions that I think corpus analysis can help us to answer.

How do composers respond to the sounds of words when they set those words to music? And how do listeners respond to the conjoining of these sound worlds when they hear songs? What can we learn about the interaction of text and music if we pay as much attention to how words sound as to what they mean?

I ask these questions as a music theorist and also as a singer. Singers are trained to be sensitive to the sounds of words—they take diction classes, they think about how to pronounce and emphasize vowels and consonants, they see language as something physical as well as semantic. Generally speaking, however, song analysts are more attentive to the meaning of language than they are to its materiality; experts at dealing with the sonic patterns of music, they are less accustomed to applying this level of scrutiny to the sonic patterns of language.

In my own work on art song, I have tried to enhance my skills at analyzing these linguistic patterns. (My most recent effort is an article called “The Fourth Dimension of a Song,” which appears in the latest issue of Music Theory Spectrum.) Improving my poetic “aural skills” has required a lot of work, but it has paid off.

It has also required a lot of hunting and pecking—flipping through scores, singing and playing through songs, listening for those striking moments where a composer seems to be emphasizing or exaggerating or transforming the “music” of poetry.

The benefit of a large corpus-based endeavor like The Lieder Project is that it will allow me and my colleagues to find those moments with greater ease, discovering patterns that we might never have imagined otherwise. My hope is that it might allow others to do the same, so that they can pursue their own questions about how speech sounds and musical sounds relate.


Processing IPA Unicode data with Python

One of the main challenges I anticipated for this project was dealing with our phonetic data. Vocalists typically use the International Phonetic Alphabet (IPA) to guide their pronunciation while singing in a non-native language, and there are many sources of IPA transcriptions of art song texts, so it seemed like a natural place to start. However, my software coding experience has been limited to the processing of numerical data and plain text, and IPA involves a number of "special characters." I thought it would be a big challenge for my initial coding effort.

However, it turned out to be fairly simple. I write my code in the Python scripting language, which offers good support for Unicode text. We also found a Unicode font designed specifically for IPA. Putting these two together has made the analysis of IPA text fairly straightforward.

First, here is a sample German poem, "Nacht und Träume," and its IPA transcription:

Heil'ge Nacht, du sinkest nieder;
Nieder wallen auch die Träume
Wie dein Mondlicht durch die Räume,
Durch der Menschen stille Brust.
Die belauschen sie mit Lust;
Rufen, wenn der Tag erwacht:
Kehre wieder, heil'ge Nacht!
Holde Träume, kehret wieder!

ha:Ilgə naχt du zIŋkəst nidəʁ
nidəʁ val:lən a:ʊχ di trɔ:ymə
vi da:In montlIçt dʊɾç di ɾɔ:ymə
dʊɾç deʁ mɛnʃən ʃtIl:lɛ bɾʊst
di bɛla:ʊʃən zi mIt lʊst
ɾufən vɛn deʁ tak ɛɾvaχt
keɾɛ vidəʁ ha:Ilgə naχt
hɔldə tɾɔ:ymə keɾət vidəʁ

We began by making a plain text file containing the IPA transcription. Then we used Python's codecs framework to import the text in a usable format.

import codecs

# Read the transcription as UTF-8 and strip the trailing newline from each line
content = [line.rstrip('\n') for line in codecs.open('NachtUndTraume.txt', encoding='utf-8')]
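In Python 3, the built-in open function accepts an encoding argument directly, so the same read can be written without codecs. The following is only a minimal sketch under that assumption: the helper name read_ipa_lines and the file name sample_ipa.txt are hypothetical, and the throwaway file exists only so the example runs on its own.

```python
import os
import tempfile

def read_ipa_lines(path):
    """Read a UTF-8 text file and return its lines without trailing newlines."""
    with open(path, encoding='utf-8') as handle:
        return [line.rstrip('\n') for line in handle]

# Demonstration with a throwaway file holding the first two transcribed lines.
sample = 'ha:Ilgə naχt du zIŋkəst nidəʁ\nnidəʁ val:lən a:ʊχ di trɔ:ymə\n'
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'sample_ipa.txt')  # hypothetical file name
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write(sample)
    content = read_ipa_lines(path)

print(len(content), content[0])
```

The with block closes the file as soon as the lines have been read, which matters little for one poem but helps when looping over a whole corpus.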

Analyzing the text takes a little more work, but it's still fairly simple. For example, one thing we're looking at is the relative occurrence of different vowel types, and how that changes from poem to poem, stanza to stanza, and line to line. That analysis begins with categorizing the vowels in the poem: open, open-mid, close-mid, close, neutral. To do this, we use a Python dictionary, but we have to work with the underlying Unicode code points to make it work. Using the chart provided with the IPA Keyboard Layout, we identified the Unicode designation for each IPA character. Then we used those to set up the dictionary.

phonemeCategory = {
    'a': 'open',
    u'\u0061': 'open',      # U+0061 is plain 'a', the same key as above
    'e': 'closeMid',
    u'\u025b': 'openMid',   # ɛ
    u'\u0259': 'neutral',   # ə
    'i': 'close',
    'I': 'close',           # ɪ, typed as capital I in the transcription
    'o': 'closeMid',
    u'\u0254': 'openMid',   # ɔ
    u'\u00f8': 'closeMid',  # ø
    u'\u0153': 'openMid',   # œ
    'y': 'close',
    u'\u028f': 'close',     # ʏ
    'u': 'close',
    u'\u028a': 'close',     # ʊ
}

Note that for regular Roman characters, we can simply type the character. Only the "special characters" need the full Unicode treatment.
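If no chart is at hand, Python can report the code point and escape for a symbol. This is a minimal sketch, assuming Python 3 and the standard unicodedata module; the helper name describe is hypothetical and not part of our project code.

```python
import unicodedata

def describe(ch):
    """Return the code point, the Python escape, and the Unicode name of one character."""
    return 'U+%04X' % ord(ch), '\\u%04x' % ord(ch), unicodedata.name(ch)

# A plain Roman letter followed by four IPA "special characters" (ɛ ə ɔ ʊ).
for ch in 'a\u025b\u0259\u0254\u028a':
    print(ch, *describe(ch))
```

The escape in the second position is the form used for the "special characters" in a dictionary like the one above.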

With this dictionary defined, we can simply ask the category of each phoneme and use the usual tools to calculate probabilities, make comparisons, etc.
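As a rough illustration of that lookup-and-count step, here is a minimal sketch, not our actual project code. It assumes Python 3, an abbreviated dictionary under the hypothetical name VOWEL_CATEGORY, and that the capital I in the transcription stands for a close vowel. It treats every vowel character separately, so a diphthong counts as two vowels.

```python
from collections import Counter

# Hypothetical, abbreviated vowel dictionary for illustration only.
VOWEL_CATEGORY = {
    'a': 'open',
    'e': 'closeMid',
    '\u025b': 'openMid',   # ɛ
    '\u0259': 'neutral',   # ə
    'i': 'close',
    'I': 'close',          # assumption: capital I stands for a close vowel
    'o': 'closeMid',
    '\u0254': 'openMid',   # ɔ
    'y': 'close',
    'u': 'close',
    '\u028a': 'close',     # ʊ
}

def vowel_profile(line):
    """Return the relative frequency of each vowel category in one IPA line."""
    counts = Counter(VOWEL_CATEGORY[ch] for ch in line if ch in VOWEL_CATEGORY)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}

# Fifth line of the transcription ("Die belauschen sie mit Lust").
line = 'di bɛla:ʊʃən zi mIt lʊst'
print(vowel_profile(line))
```

Running the same function over every line of a poem gives one profile per line, which can then be compared line-to-line or averaged per stanza.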

Once we had an IPA-friendly Unicode font, processing the IPA text became very simple.

Entering that IPA text is another story...


Melody, Harmony, and Trinket...Oh My!

I have been working on the musical transcriptions and harmonic analysis of "Nacht und Träume" and the first few songs from Die schöne Müllerin. I've been using Trinket tinyNotation to encode the vocal melody for these songs. Shown below is the vocal melody for "Nacht und Träume."



This program has been very simple and easy to use. However, its simplicity brings limitations. In "Wohin?" from Die schöne Müllerin, Schubert uses grace notes and turns to color the notes, and Trinket offers no way to input them. What I have done is encode them rhythmically based on singing interpretations. The obvious problem is that these interpretations vary from performer to performer. When I talked with Kris, he showed me a way to enter them as grace notes, although Trinket will not recognize the notation. My next step with Die schöne Müllerin is to go back and correct these passages.



I send the notation like this (see above photo) to Kris. Each line break denotes a separation between phrases in the music. When I was asked to separate the lines where the phrases started and stopped, I wasn't sure whether to separate based on the musical phrase or the text phrasing (based on the German poems). In many instances, the musical phrases start and end in the same places as the text, but that is not always the case. I decided to break up the phrases based on text phrasing (where line breaks occur in the poems) so we can compare the music directly to the text. Then, I write out a harmonic analysis: simply the starting and ending key of each phrase. This analysis is not encoded; I just have it on paper for later reference.

The phrase delineation brought up another problem: the discrepancy between poetic and musical text. I have 10 phrases. If you look at our poem analysis, there are only eight phrases. In Schubert's song, he repeats the lines "Die belauschen sie mit Lust" and "Holde Träume, kehret wieder." Beyond changing the number of phrases, Schubert also chooses to repeat certain words. This means the words per line will differ from poem to musical text. For instance, the line "Durch der Menschen stille Brust" reads "Durch der Menschen stille, stille Brust" in Schubert's song. Kris noted that we may want to start encoding both the poem "Nacht und Träume" and Schubert's musical text. As we begin to analyze the patterns in text and music, it is going to be important to note these differences, and I am interested to see how we will treat these discrepancies.
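One possible way to flag such repetitions automatically is to align each poem line with its sung counterpart word by word. This is only a sketch of the idea, assuming Python 3 and the standard difflib module; the helper names words and added_words are hypothetical, and we have not yet settled on how the project will handle this.

```python
from difflib import SequenceMatcher

def words(text):
    """Split a line into lowercase words with surrounding punctuation removed."""
    return [w.strip('.,;:!?').lower() for w in text.split()]

def added_words(poem_line, sung_line):
    """Return the words of the sung line that have no aligned partner in the poem line."""
    poem, sung = words(poem_line), words(sung_line)
    matcher = SequenceMatcher(a=poem, b=sung, autojunk=False)
    extra = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # 'insert' and 'replace' spans hold sung words that the poem line lacks.
        if tag in ('insert', 'replace'):
            extra.extend(sung[j1:j2])
    return extra

poem_line = 'Durch der Menschen stille Brust'
sung_line = 'Durch der Menschen stille, stille Brust'
print(added_words(poem_line, sung_line))
```

For the pair above, the only unaligned word is the repeated "stille". Whole repeated lines would need a similar alignment one level up, between the list of poem lines and the list of sung phrases.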


A big thank you to UROP (Undergraduate Research Opportunities Program) for their support. 


Check out this article from CU–Boulder on The Lieder Project.


The Lieder Project's first poem, initial code, and analytical output are now on GitHub. Check it out! But be warned—it is very much a work-in-progress. (Explanatory blog post(s) coming soon.)


Getting Started – The Lieder Project

The Lieder Project will chronicle a research project that Stephen Rodgers (music theorist from the University of Oregon), Jordan Pyle (CU–Boulder student), David Lonowski (CU–Boulder student), and I will be undertaking beginning this summer. Recently Stephen has been researching the relationship between the sounds of poems and the structures of the music to which they are set. Our team will be collaborating on a corpus-study expansion of this project. The initial goal is to find out whether the patterns Stephen has discovered in songs like "Nacht und Träume" by Schubert are characteristic of a broader repertoire, or whether his findings represent isolated incidents. We're also thinking through possible expansion to other genres, such as pop/rock "diva songs," to see if similar patterns emerge. (For example, in the case of "diva songs," I'm interested in seeing if songs written for specific singers make particular use of phonemes such as "open" vowels in prominent parts of the song, such as the chorus, the longest notes, and the highest notes, in order to maximize vocal quality at key musical moments.)

At this point, we are encoding a set of nineteenth-century Romantic German poems in plain text using the IPA Unicode fonts and keyboard from SIL International. We will also be encoding the melodies of the settings of these poems to music by Romantic composers using Trinket-flavored TinyNotation (see and music21 for more information). We're still working on how we'll be encoding harmony, form, and other musical parameters that may be helpful to analyze. Finally, we'll be using original Python scripts (possibly supported by music21) to analyze patterns in the text and music, and of course their relationship. 

I'm excited about this project, and about the insights and community feedback we'll receive as we blog the process. We'll try to keep a steady stream of posts and examples coming throughout the summer! We'll also be publishing many of our materials to GitHub, so that others can make use of our IPA text transcriptions, musical transcriptions, and Python code.

We are thankful to UROP (the Undergraduate Research Opportunities Program) at CU–Boulder for their financial support of Jordan and David's work on this project this summer.