Analyzing rhyme networks (From rhymes to networks 6)


For this final post of my little series on rhyme networks, I set myself the ambitious goal of providing concrete examples of how rhyme networks for languages other than Chinese can be analyzed. Unfortunately, I have to admit that this goal turned out to be a bit too ambitious. Although I managed to create a first corpus of annotated German rhymes, I am still not entirely sure how to construct rhyme networks from this corpus. And even once this problem is solved pragmatically, I have realized that the question of how to analyze the resulting rhyme network data is far less straightforward than I originally thought.

I will nevertheless try to end this series by providing a detailed description of how a preliminary rhyme network of the German poetry collection can be analyzed. Since these initial ideas for analysis are still rather preliminary, I hope that they can be sufficiently enhanced in the near future.

Constructing directed rhyme networks

I mentioned in last month's post that it is not ideal to count, as rhyming with each other, all words that are assigned to the same rhyme cluster in a given stanza of a given poem, since this means that one has to normalize the weights of the edges when constructing the rhyme network afterwards (List 2016). I also mentioned the personal communication with Aison Bu, who shared the idea of counting only those rhymes that are somehow close to each other in a stanza.

During this month, I finally found time to think about how to account for this idea in practice, and I came up with a procedure that essentially yields a directed network. In this procedure, we first extract all of the rhyme words in a given stanza in the order of their appearance. We then proceed from the first rhyme word and iterate over the rest of the rhyme words until we find a match. Having found a match, we interrupt the loop and add a directed edge to our rhyme network, which goes from the first rhyme word to its first match. We then delete the first rhyme word from the list and proceed again.
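In plain Python, the procedure can be sketched as follows. The input format (pairs of rhyme word and annotated rhyme group) and the group labels are simplified assumptions for illustration, not the actual format of my annotation:

```python
from collections import Counter

def directed_rhyme_edges(stanza_rhymes):
    """Link each rhyme word to the first later word in the same
    rhyme group, proceeding in order of appearance."""
    edges = Counter()
    words = list(stanza_rhymes)
    while words:
        first_word, first_group = words.pop(0)
        for word, group in words:
            if group == first_group:
                edges[first_word, word] += 1
                break  # only the first match counts
    return edges

# toy stanza with two interleaved rhyme groups
stanza = [("Wind", "f"), ("Haus", "g"), ("sind", "f"),
          ("Maus", "g"), ("Kind", "f")]
print(directed_rhyme_edges(stanza))
# Counter({('Wind', 'sind'): 1, ('Haus', 'Maus'): 1, ('sind', 'Kind'): 1})
```

Note that Wind and Kind are not linked directly, since sind intervenes: their connection is only indirect, via the shared neighbor.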

This procedure yields a directed, weighted rhyme network. At first sight, one may not see any specific advantages in the directionality of the network, but in my opinion it does not necessarily hurt; and it is straightforward to convert the network into an undirected one, by simply ignoring the directions of the edges and collapsing those that run in both directions between a given pair of rhyme words.
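A minimal sketch of this conversion, assuming the directed edge weights are stored in a Counter keyed by (source, target) pairs:

```python
from collections import Counter

def to_undirected(directed_edges):
    """Collapse directed edges by sorting each node pair, so that
    (a, b) and (b, a) fall together and their weights are summed."""
    undirected = Counter()
    for (source, target), weight in directed_edges.items():
        undirected[tuple(sorted([source, target]))] += weight
    return undirected

# toy example with one bidirectional rhyme pair
directed = Counter({("leben", "geben"): 9, ("geben", "leben"): 9})
print(to_undirected(directed))  # Counter({('geben', 'leben'): 18})
```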

Handling complex rhymes

In last month's blog post, I also mentioned the problem of handling rhymes that stretch across more than one word. While these are properly annotated (in my opinion), I had problems handling them in the rhyme network I presented last week. We find similar problems when working with certain rhymes involving words with more than one syllable. As an example, consider the following words which are all taken from the song Cruisen, and which I further represent in syllabified form in phonetic transcription.

Rhyme Word    Stressed Syllable    Unstressed Syllable
Tube          tuː                  bə
Bude          buː                  də
Gurke         guɐ                  kə
hupe          huː                  pə
Kurve         kuɐ                  və
Schurke       ʃuɐ                  kə
Punkte        puŋ                  ktə

These words do not rhyme according to traditional poetry rules (where unstressed syllables following stressed syllables need to be identical), but they do reflect a common rhyme tendency in German Hip Hop, where rhyme practice has been evolving lately. In order to properly account for this, I assigned both the first and the second syllable of the words to their own rhyme group (one stressed syllable rhyme and one unstressed syllable rhyme).

When constructing the rhyme network, however, the separation into two rhyme groups turned out to no longer make much sense, since the rhymes occur at a sub-morphemic level, where the parts do not themselves express a meaning anymore. To cope with this, I modified the network code slightly, treating only those words as rhyming with each other which show identical rhyme groups in all of their syllables.
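This stricter criterion is easy to express in code: two words count as rhyming only if the tuples of rhyme groups assigned to their syllables are identical. The group labels below are invented for illustration:

```python
def rhymes_with(groups_a, groups_b):
    """Each argument is the tuple of rhyme groups assigned to the
    syllables of one word, in order of the syllables."""
    return groups_a == groups_b

# same stressed-syllable group AND same unstressed-syllable group
assert rhymes_with(("a", "b"), ("a", "b"))
# stressed syllables in different groups: no edge in the network
assert not rhymes_with(("c", "b"), ("a", "b"))
```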

Infomap communities and connected components

Having constructed the rhyme network in this new way, we can start with some preliminary analyses. As a first step, it is useful to check the general characteristics of the network. When using the new approach for network construction, along with the correction for complex rhymes reported above, the network consists of 3,104 nodes, which together occur 7,707 times. The network itself is only sparsely connected, being separated into 840 connected components.
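The number of connected components can be checked without any graph library; the following sketch uses a breadth-first search over a toy edge list (the edges are invented for illustration):

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Count connected components via breadth-first search."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), 0
    for node in adjacency:
        if node not in seen:
            components += 1
            queue = deque([node])
            while queue:
                current = queue.popleft()
                if current not in seen:
                    seen.add(current)
                    queue.extend(adjacency[current] - seen)
    return components

toy_edges = [("sein", "lein"), ("nein", "sein"), ("haus", "aus")]
print(connected_components(toy_edges))  # 2
```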

As a first and very straightforward analysis, I used the Infomap algorithm (Rosvall and Bergstrom 2008) to see whether the connected components could be split any further. This analysis resulted in 932 communities, indicating that quite a few of the larger connected components in the rhyme network seem to show an additional community structure.

Unfortunately, I have not had time for a complete review of all of the communities, but when checking a few of the larger connected components that were further separated into several communities, it seemed that most of these cases are due to very infrequent rhymes that are licensed only in very specific situations. As an example, consider the figure below, in which a larger connected component is shown along with the three communities identified by the Infomap algorithm.

The three communities, marked by the color of the nodes in the network, reflect three basic German rhyme patterns, which we can label -ung, -um, and -und. Transitions between the communities are sparse, although they are surely licensed by the phonetic similarity of the rhyme patterns, since these share the same main vowel and differ only in their finals, which all show a nasal component. The Infomap analysis wrongly assigns the nodes rum and krumm to the -und pattern but, given how sparse the graph is (with weights of only one occurrence for all of the edges), it is not surprising that this can happen. Both instances where edges connect the communities are rhymes occurring in the same Hip Hop lyrics, from the song Geschichten aus der Nachbarschaft, as can be seen from the following annotated line of the song.

Judging from a quick eyeballing of the data, most of the communities that further split the connected components of the network reflect groups of very closely rhyming words (usually corresponding to what one might call perfect rhymes). Links between communities reflect either possible similarities between the rhyme words represented by the communities, or direct errors introduced by my encoding.

Unfortunately, I could not find time to further elaborate on this analysis. What would be interesting to do, for example, would be a phonetic alignment analysis of the communities, with the goal of identifying the most general sound sequence that might represent a given community. It would also help to measure to what degree transitions between communities conform to these patterns, or to what degree individual words might reflect the communities' consensus rhyming more or less closely.

But even the brief analysis here has shown me that, first, there are still many errors in my annotation, and, second, the Infomap algorithm for community detection seems to work just as well with German rhyme data as it works on Chinese rhyme data.

Frequent rhyme pairs and promiscuous rhyme words

As a last example of how rhyme networks can be analyzed, I want to have a look at frequently recurring patterns in the current poetry collection. A very simple first test we can do in this regard is to look at the edges with the highest weights in our networks. Poets typically try to be very original in their work, since nothing in literature is considered as boring as repetition. Nevertheless, since the pool of words from which poets can choose when creating their poems is, by nature, limited, there are always patterns that are used more frequently than others.
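With the edge weights of the network stored in a Python Counter, this test amounts to a single call to most_common. The toy weights below mirror a few values from the German data:

```python
from collections import Counter

# hypothetical excerpt of the weighted rhyme network
edge_weights = Counter({("sein", "lein"): 10, ("aus", "haus"): 10,
                        ("zeit", "keit"): 9, ("nur", "tur"): 7})
# the highest-weighted edges are the most frequent rhyme pairs
for pair, weight in edge_weights.most_common(3):
    print(pair, weight)
```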

The following table shows those directed rhymes that occur most frequently in the German poetry database.

Rhyme Part A    Rhyme Part B    No. of Poems
sein            lein            10
aus             haus            10
haus            aus             9
triebe          liebe           9
leben           geben           9
geben           leben           9
zeit            keit            9
nein            sein            8
wieder          lieder          7
nur             tur             7

This collection may not tell you too much if you are not a native speaker of German. But if you are, then you will easily see that most of these rhymes are very common, involving either very common words (sein "to be"), or suffixes that frequently recur in different words of the German lexicon (-lein either as diminutive suffix or as part of allein "alone"). We also find the very sad match of liebe (Liebe "love") and triebe (Triebe "urges"), which is mostly thanks to the poems of Rainer Maria Rilke (1875-1926), who wrote a lot about "love", and had the same problem as most German poets: there are not many words that rhyme nicely with Liebe (the only other candidates I know of would be bliebe "would stay" and Hiebe "strokes or blows").

As a last example, we can consider promiscuous rhyme words, that is, rhyme words that tend to be reused in many poems with many other words as partners. The following table shows the top ten in terms of rhyme promiscuity in the German poetry dataset.

Rhyme Part    Rhyme Partners    Occurrences
sein          14                87
ein           9                 34
bei           9                 36
sagen         8                 19
leben         8                 39
schein        8                 26
mehr          8                 25
nicht         8
zeit          8                 36
welt          7                 32

Here, I find it rather interesting that we find so many words rhyming with -ein in this short list. However, when checking the community of -ein, we can see that there is, indeed, a rather large number of words from which one can choose (including basic words like Bein "leg", Schein "shine", and Stein "stone"). Additionally, there is a large number of verbs ending in -eien that are traditionally shortened in colloquial speech (compare the node schreien "to scream").
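For completeness, here is a minimal sketch of how such promiscuity scores can be computed from an edge list: for each word, we count its distinct rhyme partners (the toy edges are invented for illustration):

```python
from collections import defaultdict

def promiscuity(edges):
    """Count the number of distinct rhyme partners per word."""
    partners = defaultdict(set)
    for a, b in edges:
        partners[a].add(b)
        partners[b].add(a)
    return {word: len(p) for word, p in partners.items()}

toy_edges = [("sein", "lein"), ("sein", "nein"), ("sein", "Bein"),
             ("haus", "aus")]
print(promiscuity(toy_edges)["sein"])  # 3
```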

Concluding remarks

When I started this series on rhyme networks, I was hoping to achieve more in the six months that I had ahead. In the light of my initial hopes, the analyses I have shown here are somewhat disappointing. However, even if I could not keep the promises I made to myself, I have learned a lot during these months, and I remain optimistic that many of the still untackled problems can be solved in the near future. What today's analysis has specifically shown to me, however, is that more data will be needed, since the network produced from the small collection of 300 German poems is clearly too small to serve for a fully fledged analysis of rhymes in German poetry.  

References

List, Johann-Mattis (2016) Using network models to analyze Old Chinese rhyme data. Bulletin of Chinese Linguistics 9.2: 218-241.

Rosvall, M. and Bergstrom, C. T. (2008) Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences 105.4: 1118-1123.

Data and Code 

Data and code are available in the form of a GitHub Gist.


Constructing rhyme networks (From rhymes to networks 5)


As the summer is now coming to its end, so too is this little series on rhyme networks. We have only two more blog posts to go, with this one discussing the construction of rhyme networks, and then one more post in September, discussing how rhyme networks can be analyzed.

A preliminary annotated collection of rhymed poetry in German

While my original plan was to have all of Goethe's Faust annotated by the end of this series, so that I could illustrate how to make rhyme analyses with a large dataset of rhyme patterns in a language other than Chinese, I now have to admit that this plan was way too ambitious.

Nevertheless, I have managed to assemble a larger collection of German rhymes from various pieces of literature, ranging from boring love poems to recent examples of German Hip-Hop; and all of the rhymes have been manually annotated by myself during recent months.

This little corpus currently consists of 336 German "œuvres" (the data collection itself has more poems and songs from different languages), which make up a total of 1,544 stanzas (deliberately excluding the refrains in songs). There are 3,950 words that rhyme in this collection; and together they occur 5,438 times in a total of 49,797 words written by 72 different authors. The following table summarizes major features of the German part of the database.

Aspect           Score
components       994
authors          72
poems            336
stanzas          1544
lines            8340
rhyme words      3950
words rhyming    5438
words total      49797

The whole collection, which is currently available under the working title "AntRhyme: Annotated Rhyme Database", can be inspected online at https://digling.org/rhyant/, but due to copyright restrictions for texts from recent pop songs, not all of the poems can be displayed. In order to share the annotated rhymes along with the initial Python code that I wrote for this post, I have therefore created a version in which only the annotated rhyme words are provided, along with dummy words in which each character was replaced by a miscellaneous symbol. As a result, the song "Griechischer Wein" ("Greek wine") by Udo Jürgens from 1974 now looks as shown in the following figure.


Modeling rhymes with networks

As far as Chinese rhyme networks were concerned, I have always given the impression (and also truly thought this myself) that the reconstruction of a rhyme network is something rather trivial. Given a stanza in a given poem, all one has to do is to model the rhyme words in the stanza as nodes in the network, and then add connections for all of the words that rhyme with each other according to the annotation.

While I still think that this simple rhyme network model is a very good starting point, there are certain non-trivial aspects that one needs to consider carefully when working with this kind of rhyme network. First, there is the question of weighting. In the first study that I devoted to Old Chinese poetry (List 2016), I weighted the nodes by counting their appearances, and I also weighted the edges by first counting how often they occurred. I then normalized this score in order to obtain a more balanced weighting. The normalization would first count each rhyme pair only once, even if the same word occurred more than once in the same stanza, and then apply a formula based on the number of words rhyming with each other within the same stanza (see ibid. 228 for details).

However, in the meantime, a young scholar, Aison Bu, has suggested an even better way of counting rhymes, in an email conversation with me. [The pandemic prevented us from meeting in person at a conference in early April, so we could never follow this up.] Since rhyming is essentially linear, my original counting of all rhymes that are assigned to the same rhyme partition in a given stanza may be misleading. Instead, Aison suggested counting only adjacent rhymes.

To provide a concrete example, consider the third stanza in the song "Griechischer Wein" by Udo Jürgens (shown above). Here, we have the rhyme group labeled as f, which occurs three times in the data, with the rhyme words Wind (wind), sind (they are), and Kind (child). The normalization procedure that I proposed in the study from 2016 would now construct a network in which all three words rhyme with each other. To normalize the edge weights, each individual edge weight would be modified by the factor 1 / (G-1), where G is the number of rhymes in the rhyme group in the stanza (3 in this case, as we have three words rhyming with each other). Aison's rhyme network construction, however, would only add two edges, one for Wind and sind, and one for sind and Kind, as they immediately follow each other in the verse. A specific normalization of the edge weights would not be needed in this case.
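The two counting schemes can be contrasted directly in code for this stanza. The sketch below reproduces the all-pairs construction with its 1 / (G - 1) normalization, next to the adjacency-based counting:

```python
from itertools import combinations

group = ["Wind", "sind", "Kind"]  # rhyme group f in the stanza

# all-pairs scheme with normalized weights (List 2016)
G = len(group)
normalized = {pair: 1 / (G - 1) for pair in combinations(group, 2)}

# Aison's scheme: only immediately consecutive rhymes, weight 1
adjacent = {(a, b): 1 for a, b in zip(group, group[1:])}

print(normalized)
# {('Wind', 'sind'): 0.5, ('Wind', 'Kind'): 0.5, ('sind', 'Kind'): 0.5}
print(adjacent)
# {('Wind', 'sind'): 1, ('sind', 'Kind'): 1}
```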

A first rhyme network

Unfortunately, I have not had time so far to test Aison's idea, to draw only edges for adjacent rhymes when constructing rhyme networks. However, with the data for more than 300 German poems and songs assembled, I have had enough time to construct a first and very simple network of German rhyme data.

For this network, I disregarded all normalization issues, and just added an edge for each pair of words that would have been assigned to the same rhyme group in my rhyme annotation. This network resulted in a rather sparse collection of 994 connected components. This is in strong contrast to the Chinese poems I have analyzed in the past (List 2016, List 2020), which were all very close to small-world networks, with one huge connected component, and very few additional components. However, it would be too early to conclude that German rhyme networks are fundamentally different from Chinese ones, given that the data may just be too sparse for this kind of experiment.

At this stage of the analysis, it is therefore important to carefully inspect the networks, in order to explore to what degree the network modeling or the data annotation could be further improved. When looking at the largest connected component, shown in the following figure, for example, it is clear that typical rhyme groups that we would expect to find separated in rhyme dictionaries do cluster together. We find -aut on the left, -aus and -auf on the right, with the word auch (also) as a very central rhyme word, as well as Frau (woman).




While these words can be defended as rhymes, given that they share the diphthong au, we also find some strange matches. Among these is the cluster with -ut on the bottom left, which links via Mut (courage) to Bauch (belly) and resolut (straightforward). Another example is the link between Frau and trauern (mourn). The former link is due to an annotation error in the poem "Freundesbrief an einen Melancholischen" ("Friendly letter to a melancholic") by Otto Julius Bierbaum (1921), where I wrongly annotated Bauch and auch as rhyming with resolut and Mut.

However, the second example is due to a modeling problem with rhymes that encompass more than one word. This pattern is very frequent in Hip-Hop texts, and I have not yet found a good way of handling it. In the case of Frau rhyming with trauern, the original text rhymes trauern with Frau an, the latter being a part of the sentence "schaut euch diese Frau an" ("look at this woman"). Since my conversion of the text to rhyme networks considers only the first part of multi-word rhymes as the word in question, it obviously displays the rhyme mistakenly; it is also shown in its original form in the figure below.


Conclusion

The initial construction of German rhyme networks which I have presented in this post has shown some potential problems in the conversion of rhyme judgments to rhyme networks. First, we have to reckon with certain errors in the annotation (which seem to be inevitable when doing things manually). Second, certain aspects of the annotation, especially rhymes stretching over more than one word, need to be handled more carefully. Third, assuming that poetry is spoken, and spoken texts are realized in linear form, it may be useful to reconsider the current rhyme network construction, by which edges are added for all possible combinations of rhyme words occurring in the same rhyme group. For the final post in this series next month, I hope that I will find time to address all of these problems in a satisfying way.

References

List, Johann-Mattis (2016) Using network models to analyze Old Chinese rhyme data. Bulletin of Chinese Linguistics 9.2: 218-241.

List, Johann-Mattis (2020) Improving data handling and analysis in the study of rhyme patterns. Cahiers de Linguistique Asie Orientale 49.1: 43-57.

For those of you interested in data and code that I used in this study, you can find them in this GitHub Gist.

Automated detection of rhymes in texts (From rhymes to networks 4)


Having discussed how to annotate rhymes in last month's blog post, we can now discuss the automated detection of rhymes. I am fascinated by this topic, although I have not managed to find a proper approach yet. What fascinates me more, however, is how easily the problem is misunderstood. I have witnessed this a couple of times in discussions with colleagues. When mentioning my wish to create a magic algorithm that does the rhyme annotation for me, so that I no longer need to do it manually, nobody seems to agree with me that the problem is not trivial.

On the contrary, the problem seems to be so easy that it should have been solved a couple of years ago already. One typical answer is that I should just turn to artificial intelligence and neural networks, whatever this means in concrete terms, and that these would certainly outperform any algorithm that was proposed in the past. Another typical answer, which is slightly more subtle, assumes that some kind of phonetic comparison should easily reveal what we are dealing with.

Unfortunately, none of these approaches work. So, instead of presenting a magic algorithm that works, I will use this post to try and explain why I think that the problem of rhyme detection is far less trivial than people seem to think.

Defining the problem of automated rhyme detection

Before we can discuss potential solutions to rhyme detection, we need to define the problem. If we think of a rhyme annotation model that allows us to annotate rhymes at the level of specific word parts (not restricted to entire words), the most general rhyme detection problem can be presented as follows:
Given a rhyme corpus that is divided into poems, with poems divided into stanzas, and stanzas being divided into lines, find all of the word parts that clearly rhyme with each other within each stanza within each poem within the corpus.
With respect to machine learning strategies, we can further distinguish supervised versus unsupervised learning. While supervised learning for the rhyme detection problem would build on a large annotated rhyme corpus, in order to infer the best strategies to identify words that rhyme and words that do not rhyme, unsupervised approaches would not require any training data at all.

With respect to the application target, we should further specify whether we want our approach to work for a multilingual sample or just a single language. If we want the method to work on a truly multilingual (that is: cross-linguistic) basis, we would probably need to require a unified transcription for speech sounds as input. It is already obvious that, although the annotation schema I presented last month is quite general, it would not work, for example, for those languages with writing systems that are not spelled from left to right, not to speak of writing systems that are not alphabetic.

Why rhyme detection is difficult

It is obvious that the most general problem for rhyme detection would be the cross-linguistic unsupervised detection of rhymes within a corpus of poetry. Developing systems for monolingual rhyme detection seems to be a bit trivial, given that one could just assemble a big list of words that rhyme in a given language, and then find where they occur in a given corpus. However, given that the goal of poetry is also to avoid "boring" rhymes, and come up with creative surprises, it may turn out to be less trivial than it seems at first sight.

As an example, consider the following refrain from a recent hip-hop song by German comedian Carolin Kebekus, in which the text rhymes Gemeinden (communities) with vereinen (to unite), as well as Mädchen (girl) with Päpstin (female pope) (the video has English subtitles for those who are interested in the text but do not speak German).

Figure 1: Rhyme example from a recent German hip-hop song.

While one could argue about whether these words qualify as proper rhymes and were intended as such, I am quite convinced that the words were chosen for their near-rhyme similarity, and I am also convinced that most native speakers of German listening to the song will understand the intended rhyme here. Both rhymes are not perfect, but they are close enough, and they are beyond doubt creative and unexpected; it is extremely unlikely that one could find them in any German rhyme book. This example shows that humans, in their creative treatment of language, are constantly searching for similarities that have not been used by others before. This leads to a situation where we cannot simply use a static look-up table of licensed rhyme words to solve the problem of rhyme detection for a particular language.

What we need instead is some way to estimate the phonetic similarity of word parts, in order to check whether they could rhyme or not. However, since languages may have different rhyme rules, these similarities would have to be adjusted for each language. While phonetic similarity can be measured fairly well with the help of alignment algorithms applied to phonetic transcriptions, what counts as being similar may differ from language to language, and rhyme usually reflects local similarity of words.
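As a very rough illustration of what such a check might look like, the following sketch compares only the final symbols of two words with a plain edit distance. This is not a serious rhyme detector: real alignments would operate on phonetic transcriptions with language-specific weights, and the window of four final symbols is an arbitrary assumption.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    table = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
             for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            table[i][j] = min(table[i - 1][j] + 1,       # deletion
                              table[i][j - 1] + 1,       # insertion
                              table[i - 1][j - 1]
                              + (a[i - 1] != b[j - 1]))  # substitution
    return table[len(a)][len(b)]

def ending_similarity(word_a, word_b, n=4):
    """Compare only the last n symbols, where rhyme lives."""
    return edit_distance(word_a[-n:], word_b[-n:])

print(ending_similarity("vereinen", "Gemeinden"))  # 2
```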

Since rhyme is closely accompanied by rhythm and word or phrase stress, we would also need this information to be supplied with the original transcriptions. All in all, working on a general method for rhyme detection seems like a hell of an enterprise, specifically as long as we lack any datasets that we could use for testing and training.

Less interesting sub-problems and proposed solutions

While, to the best of my knowledge, nobody has ever tried to propose a solution for the general problem of rhyme detection as I outlined it above, there are some studies in which a sub-problem of rhyme detection has been tackled. This sub-problem can be presented as follows:
Given a rhyme corpus of poems that are divided into stanzas, which are themselves divided into lines, try to find the rhyme schemas underlying each stanza.
This problem, which has often been called rhyme scheme discovery, has been addressed by at least three approaches that I have been able to find. Reddy and Knight (2011) employ basic assumptions about the repetition of rhyme pairs in order to create an unsupervised method based on expectation maximization. Addanki and Wu (2013) test the usefulness of Hidden Markov Models for unsupervised rhyme scheme detection. Haider and Kuhn (2018) use Siamese Recurrent Networks for a supervised approach to the same problem. Additionally, Plecháč (2018) proposes a modification of the algorithm by Reddy and Knight, and tests it on three languages (English, Czech, and French).

One could go into the details, and discuss the advantages and disadvantages of these approaches. However, in my opinion it is much more important to emphasize the fundamental difference between the task of rhyme scheme detection and the problem of general rhyme detection, as I have outlined it above. Rhyme scheme detection does not seek to explain rhyme in terms of partial word similarity, but rather assumes that a general overarching structure (in terms of rhyme schemas) underlies all kinds of rhymed poetry.

There are immediate consequences to assuming that rhymed poetry needs to be organized by rhyme schemes. First, the underlying model does not accept rhymes that occur in any other place than the end of a given line, which is problematic, specifically when dealing with more recent genres like hip-hop. Second, if one assumes that rhyme scheme structure dominates rhymed poetry, the model does not accept any immediate, more spontaneous forms of rhyming, which, however, frequently occur in human language (compare the famous examples in political speech, discussed by Jakobson 1958).

Concentrating on rhyme schemes, instead of rhyme word detection, has immediate consequences for the algorithms. First, the methods need to be applied to "normal" poetry, given that any form of poetry that evades the strict dominance of rhyme schemes cannot be characterized properly by the underlying rhyme model. Second, all that the methods need as input are the words occurring at the end of a line, since these are the only ones that can rhyme (and the test datasets are all constructed in this way alone). Third, the methods are all trained in such a way that they need to identify rhymes in a text, so that they cannot be used to test whether a given text collection rhymes or not.

Outlook

In this post, I have tried to present what I consider to be the "ultimate" problem of rhyme detection, a problem that I consider to be the "general" rhyme detection problem in computational approaches to literature. In contrast, I think that the problem of detecting only rhyme schemes is much less interesting than the general rhyme detection problem. The focus on rhyme schemes, instead of focusing on the actual words that rhyme, reflects a certain lack of knowledge regarding the huge variation by which people rhyme words across different languages, cultures, styles, and epochs.

If all poetry followed the same rhyme schemes, then we would not need any rhyme detection methods at all. Think of Shakespeare's 154 sonnets, all coded in the same rhyme schema: no algorithm would be needed to detect the rhyme schema, as we already know it beforehand — for a perfect supervised method, it would be enough to pass the algorithm the line numbers and the resulting schema.

The picture changes, however, when working with different styles, especially those representing an emerging rather than an established tradition of poetry. Rhyme schemes in the most ancient Chinese inscriptions, for example, are far less fixed (Behr 2008). In modern hip-hop lyrics, which also represent a tradition that has only recently emerged, it does not make real sense to talk about rhyme schemes either, as can be easily seen from the following excerpt of Akhenaton's Mes soleils et mes lunes, which I have tried to annotate to the best of my knowledge.

Figure 2: First stanza from Akhenaton's Mes soleils et mes lunes

Surprisingly, both Haider and Kuhn (2018) and Addanki and Wu (2013) explicitly test their methods on hip-hop corpora. They interpret them as normal poems, extract the rhyme words, and classify them line by line. I would be curious what these methods would yield if they were fed non-rhyming text passages. For me, the ability of an algorithm to distinguish rhyming from non-rhyming texts is one of the crucial tests of its suitability. We do not need approaches that confirm what we already know.

Ultimately, we hope to find methods for rhyme detection that could actively help us to learn something about the difference between conscious rhyming versus word similarities by chance. But, given the huge differences in rhyming practice across languages and cultures, it is not clear if we will ever arrive at this point.

References

Addanki, Karteek and Wu, Dekai (2013) Unsupervised rhyme scheme identification in Hip Hop lyrics using Hidden Markov Models. In: Statistical Language and Speech Processing, pp. 39-50.

Behr, Wolfgang (2008) Reimende Bronzeinschriften und die Entstehung der Chinesischen Endreimdichtung. Bochum: Projekt Verlag.

Haider, Thomas and Kuhn, Jonas (2018) Supervised rhyme detection with Siamese recurrent networks. In: Proceedings of Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 81-86.

Jakobson, Roman (1958) Typological studies and their contribution to historical comparative linguistics. In: Proceedings of the Eighth International Congress of Linguistics, pp. 17-35.

Plecháč, Petr (2018) A collocation-driven method of discovering rhymes (in Czech, English, and French poetry). In: Masako Fidler and Václav Cvrček (eds.) Taming the Corpus: From Inflection and Lexis to Interpretation. Cham: Springer, pp. 79-95.

Automated detection of rhymes in texts (From rhymes to networks 4)


Having discussed how to annotate rhymes in last month's blog post, we can now discuss the automated detection of rhymes. I am fascinated by this topic, although I have not managed to find a proper approach yet. What fascinates me more, however, is how easily the problem is misunderstood. I have witnessed this a couple of times in discussions with colleagues. When mentioning my wish to create a magic algorithm that does the rhyme annotation for me, so that I no longer need to do it manually, nobody seems to agree with me that the problem is not trivial.

On the contrary, the problem seems to be so easy that it should have been solved a couple of years ago already. One typical answer is that I should just turn to artificial intelligence and neural networks, whatever that means in concrete terms, and that they would certainly outperform any algorithm proposed in the past. Another typical answer, which is slightly more subtle, assumes that some kind of phonetic comparison should easily reveal what we are dealing with.

Unfortunately, none of these approaches work. So, instead of presenting a magic algorithm that works, I will use this post to try and explain why I think that the problem of rhyme detection is far less trivial than people seem to think.

Defining the problem of automated rhyme detection

Before we can discuss potential solutions to rhyme detection, we need to define the problem. If we think of a rhyme annotation model that allows us to annotate rhymes at the level of specific word parts (not restricted to entire words), the most general rhyme detection problem can be presented as follows:
Given a rhyme corpus that is divided into poems, with poems divided into stanzas, and stanzas being divided into lines, find all of the word parts that clearly rhyme with each other within each stanza within each poem within the corpus.
With respect to machine learning strategies, we can further distinguish supervised versus unsupervised learning. While supervised learning for the rhyme detection problem would build on a large annotated rhyme corpus, in order to infer the best strategies to identify words that rhyme and words that do not rhyme, unsupervised approaches would not require any training data at all.

With respect to the application target, we should further specify whether we want our approach to work for a multilingual sample or just a single language. If we want the method to work on a truly multilingual (that is: cross-linguistic) basis, we would probably need to require a unified transcription for speech sounds as input. It is already obvious that, although the annotation schema I presented last month is quite general, it would not work for languages whose writing systems are not written from left to right, for example, not to speak of writing systems that are not alphabetic.

Why rhyme detection is difficult

It is obvious that the most general problem for rhyme detection would be the cross-linguistic unsupervised detection of rhymes within a corpus of poetry. Developing systems for monolingual rhyme detection may seem rather trivial, given that one could just assemble a big list of words that rhyme in a given language, and then find where they occur in a given corpus. However, given that the goal of poetry is also to avoid "boring" rhymes and to come up with creative surprises, the task may turn out to be less trivial than it seems at first sight.

As an example, consider the following refrain from a recent hip-hop song by German comedian Carolin Kebekus, in which the text rhymes Gemeinden (communities) with vereinen (unite), as well as Mädchen (girl) with Päpstin (female pope) (the video has English subtitles for those who are interested in the text but do not speak German).

Figure 1: Rhyme example from a recent German hip-hop song.

While one could argue about whether those words qualify as proper rhymes and were intended as such, I am quite convinced that the words were chosen for their near-rhyme similarity, and I am also convinced that most native speakers of German listening to the song will understand the intended rhyme here. Both rhymes are not perfect, but they are close enough, and they are beyond doubt creative and unexpected — it is extremely unlikely that one could find them in any German rhyme book. This example shows that the creative human treatment of language constantly searches for similarities that have not been used by others before. As a result, we cannot simply use a static look-up table of licensed rhyme words to solve the problem of rhyme detection for a particular language.

What we need instead is some way to estimate the phonetic similarity of word parts, in order to check whether they could rhyme or not. However, since languages may have different rhyme rules, these similarities would have to be adjusted for each language. While phonetic similarity can be measured fairly well with the help of alignment algorithms applied to phonetic transcriptions, what counts as similar may differ from language to language, and rhyme usually reflects local similarity of words.
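Lacking such language-specific measures, one can at least sketch what the basic check could look like. The following toy code (my own; not a real phonetic alignment) compares word endings in plain orthography with Python's built-in sequence matcher; a proper solution would align phonetic transcriptions and weight the stressed rhyme nucleus.

```python
from difflib import SequenceMatcher

def ending_similarity(word_a, word_b, size=4):
    """Compare the last `size` letters of two words.

    A crude stand-in for phonetic alignment: it works on
    orthography, so it misses cases where spelling and
    pronunciation diverge.
    """
    tail_a = word_a[-size:].lower()
    tail_b = word_b[-size:].lower()
    return SequenceMatcher(None, tail_a, tail_b).ratio()

# The near-rhyme from the Kebekus example scores clearly higher
# than a non-rhyming word pair.
print(ending_similarity("Gemeinden", "vereinen"))  # 0.75
print(ending_similarity("Gemeinden", "Balkon"))    # 0.25
```

Even this crude measure separates the near-rhyme from the non-rhyme; the hard part, as argued above, is that the threshold for "close enough" would have to be calibrated separately for every language and tradition.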

Since rhyme is closely accompanied by rhythm and word or phrase stress, we would also need this information to be supplied along with the original transcriptions. All in all, working on a general method for rhyme detection seems like a hell of an enterprise, specifically as long as we lack any datasets that we could use for training and testing.

Less interesting sub-problems and proposed solutions

While, to the best of my knowledge, nobody has ever tried to propose a solution for the general problem of rhyme detection as I outlined it above, there are some studies in which a sub-problem of rhyme detection has been tackled. This sub-problem can be presented as follows:
Given a rhyme corpus of poems that are divided into stanzas, which are themselves divided into lines, try to find the rhyme schemas underlying each stanza.
This problem, which has often been called rhyme scheme discovery, has been addressed in at least three approaches that I have been able to find. Reddy and Knight (2011) employ basic assumptions about the repetition of rhyme pairs in order to create an unsupervised method based on expectation maximization. Addanki and Wu (2013) test the usefulness of Hidden Markov Models for unsupervised rhyme scheme detection. Haider and Kuhn (2018) use Siamese recurrent networks for a supervised approach to the same problem. Additionally, Plecháč (2018) proposes a modification of the algorithm by Reddy and Knight, and tests it on three languages (English, Czech, and French).
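To make this sub-problem concrete, here is a deliberately naive scheme finder of my own (the function name and the ending length of three letters are arbitrary assumptions): it labels line-final words by their literal endings, which is exactly the kind of baseline that the statistical methods above are meant to improve upon.

```python
def naive_scheme(lines, size=3):
    """Assign rhyme-scheme letters to a stanza by comparing
    the last `size` letters of each line-final word."""
    finals = [line.split()[-1].strip(".,!?;:").lower() for line in lines]
    scheme, seen = [], {}
    for word in finals:
        key = word[-size:]
        if key not in seen:
            seen[key] = chr(ord("a") + len(seen))  # next free letter
        scheme.append(seen[key])
    return "".join(scheme)

stanza = [
    "Vor seinem Löwengarten,",
    "Das Kampfspiel zu erwarten,",
    "Saß König Franz,",
    "Und um ihn die Großen der Krone,",
    "Und rings auf hohem Balkone",
    "Die Damen in schönem Kranz.",
]
print(naive_scheme(stanza))  # aabccb
```

On a classical stanza like this one by Schiller, literal ending comparison already recovers the schema aabccb; on creative near-rhymes like those in Figure 1, however, it fails, which is why learned similarity measures are needed.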

One could go into the details, and discuss the advantages and disadvantages of these approaches. However, in my opinion it is much more important to emphasize the fundamental difference between the task of rhyme scheme detection and the problem of general rhyme detection, as I have outlined it above. Rhyme scheme detection does not seek to explain rhyme in terms of partial word similarity, but rather assumes that a general overarching structure (in terms of rhyme schemas) underlies all kinds of rhymed poetry.

There are immediate consequences to assuming that rhymed poetry needs to be organized by rhyme schemes. First, the underlying model does not accept rhymes that occur in any other place than the end of a given line, which is problematic, specifically when dealing with more recent genres like hip-hop. Second, if one assumes that rhyme scheme structure dominates rhymed poetry, the model does not accept any immediate, more spontaneous forms of rhyming, which, however, frequently occur in human language (compare the famous examples in political speech, discussed by Jakobson 1958).

Concentrating on rhyme schemes, instead of rhyme word detection, has immediate consequences for the algorithms. First, the methods need to be applied to "normal" poetry, given that any form of poetry that evades the strict dominance of rhyme schemes cannot be characterized properly by the underlying rhyme model. Second, all that the methods need as input are the words occurring at the end of a line, since these are the only ones that can rhyme (and the test datasets are all constructed in this way alone). Third, the methods are all trained in such a way that they need to identify rhymes in a text, so that they cannot be used to test whether a given text collection rhymes or not.

Outlook

In this post, I have tried to present what I consider to be the "ultimate" problem of rhyme detection, a problem that I consider to be the "general" rhyme detection problem in computational approaches to literature. In contrast, I think that the problem of detecting only rhyme schemes is much less interesting than the general rhyme detection problem. The focus on rhyme schemes, instead of focusing on the actual words that rhyme, reflects a certain lack of knowledge regarding the huge variation by which people rhyme words across different languages, cultures, styles, and epochs.

If all poetry followed the same rhyme schemes, then we would not need any rhyme detection methods at all. Think of Shakespeare's 154 sonnets, all coded in the same rhyme schema: no algorithm would be needed to detect the rhyme schema, as we already know it beforehand — for a perfect supervised method, it would be enough to pass the algorithm the line numbers and the resulting schema.
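The point about a perfect supervised method can be made explicit in a few lines of code (a sketch; the names are mine, and the scheme string is the standard pattern of the English sonnet, three quatrains plus a couplet):

```python
# Standard English (Shakespearean) sonnet scheme: ABAB CDCD EFEF GG
SONNET_SCHEME = "ababcdcdefefgg"

def rhyme_label(line_number):
    """Return the rhyme letter for a 1-based line number."""
    return SONNET_SCHEME[line_number - 1]

print(rhyme_label(1), rhyme_label(3))    # a a  (lines 1 and 3 rhyme)
print(rhyme_label(13), rhyme_label(14))  # g g  (the closing couplet)
```

No learning, and indeed no text, is involved: the line number alone determines the rhyme letter, which is why such corpora cannot tell us anything about detection methods.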

The picture changes, however, when working with different styles, especially those representing an emerging rather than an established tradition of poetry. Rhyme schemes in the most ancient Chinese inscriptions, for example, are far less fixed (Behr 2008). In modern hip-hop lyrics, which also represent a tradition that has only recently emerged, it does not make real sense to talk about rhyme schemes either, as can be easily seen from the following excerpt of Akhenaton's Mes soleils et mes lunes, which I have tried to annotate to the best of my knowledge.

Figure 2: First stanza from Akhenaton's Mes soleils et mes lunes

Surprisingly, both Haider and Kuhn (2018) and Addanki and Wu (2013) explicitly test their methods on hip-hop corpora. They treat the lyrics as normal poems, extract the rhyme words, and classify them line by line. I would be curious what these methods would yield if they were fed non-rhyming text passages. For me, the ability of an algorithm to distinguish rhyming from non-rhyming texts is one of the crucial tests of its suitability. We do not need approaches that confirm what we already know.

Ultimately, we hope to find methods for rhyme detection that could actively help us to learn something about the difference between conscious rhyming versus word similarities by chance. But, given the huge differences in rhyming practice across languages and cultures, it is not clear if we will ever arrive at this point.

References

Addanki, Karteek and Wu, Dekai (2013) Unsupervised rhyme scheme identification in Hip Hop lyrics using Hidden Markov Models. In: Statistical Language and Speech Processing, pp. 39-50.

Behr, Wolfgang (2008) Reimende Bronzeinschriften und die Entstehung der Chinesischen Endreimdichtung. Bochum: Projekt Verlag.

Haider, Thomas and Kuhn, Jonas (2018) Supervised rhyme detection with Siamese recurrent networks. In: Proceedings of Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 81-86.

Jakobson, Roman (1958) Typological studies and their contribution to historical comparative linguistics. In: Proceedings of the Eighth International Congress of Linguistics, pp. 17-35.

Plecháč, Petr (2018) A collocation-driven method of discovering rhymes (in Czech, English, and French poetry). In: Masako Fidler and Václav Cvrček (eds.) Taming the Corpus: From Inflection and Lexis to Interpretation. Cham: Springer, pp. 79-95.

Reddy, Sravana and Knight, Kevin (2011) Unsupervised discovery of rhyme schemes. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 77-82.

Annotating rhymes in texts (From rhymes to networks 3)


Having discussed some general aspects of rhyming in a couple of different languages in last month's blog post, I devote this third post in the series to the question of how rhyme can be annotated. Annotation plays a crucial role in almost all fields of linguistics. The main idea is to add value to a given resource (Milà-Garcia 2018). What value we add to resources can differ widely, but as far as textual resources are concerned, we can say that the information that we add can usually not be extracted automatically from the resource.

In our case, the information we want to explicitly add to rhyme texts or rhyme corpora is the rhyme relations between words. Retrieving this information may be trivial, as in the case of Shakespeare's Sonnets, where we know the rhyme schema in advance, but it is considerably complicated when working with other, less strict types of rhyming.

One usually distinguishes two basic types of annotation: inline and stand-off (Eckart 2012). For inline annotation, we add our information directly into our textual resource, while stand-off annotation creates an index over the resource, and then adds the information in a separate resource that refers to the index of the original text.

Both methods have their pros and cons. Stand-off annotation often seems to provide a cleaner solution (as one never knows how much a manual annotation added into a text might modify the text involuntarily). However, inline annotation has, in my experience, the advantage of allowing for a much faster annotation process, at least as long as the annotation has to be done in text files directly, without interfaces that could help to assist in the annotation process.
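A toy example may help to see the difference between the two annotation types (the concrete representations below are my own ad-hoc choices, not an established standard):

```python
# Inline annotation: the rhyme label sits inside the text itself.
inline = "Morning has [a]broken like the first morning"

# Stand-off annotation: the text stays untouched, and a separate
# record points into it, here by word index.
text = "Morning has broken like the first morning"
standoff = [{"word_index": 2, "part": "broken", "rhyme": "a"}]

# The stand-off record can be verified against the untouched text,
# which is the "cleaner" property mentioned above.
words = text.split()
for record in standoff:
    assert words[record["word_index"]] == record["part"]
```

The inline version is faster to type, while the stand-off version keeps the source text pristine and lets one check annotations mechanically against it.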

Overview of existing annotation practice

If we look at different practices that have been used to annotate rhymes in collections of poetry, we will find quite a variety of techniques that have been used so far.

Wáng (1980), for example, uses an inline annotation style in his corpus of the rhymes in the Book of Odes, as illustrated in the following example taken from List et al. (2019). In this annotation, rhyme words are indirectly annotated by providing reconstructed readings for the Chinese characters, which are supposed to reflect the original pronunciation. Whenever two rhyme words share the same main vowel, the author judged them to have rhymed in the original text.

Annotation in Wáng (1980)

Baxter (1992) uses a stand-off annotation, which is shown (again taken from List et al. 2019) in the following table. An advantage of Baxter's annotation is that it allows him to provide multiple layers of information for each rhyme word. A disadvantage is that a clear index to the words in the poem is lacking. While this is not entirely problematic, since it is usually easy to identify which words are in rhyme position, it is not entirely "safe", from an annotation point-of-view, as it may still create ambiguities.

Annotation in Baxter (1992)

In a study of automated rhyme word detection, Haider and Kuhn (2018) use annotated rhyme datasets from a variety of German styles (Hip Hop, contemporary lyrics, and more ancient lyrics). To annotate the data, they use the standard format of the Text Encoding Initiative, which is based essentially on XML. Unfortunately, however, they do not provide tags for each word that rhymes, but instead only add an attribute to each stanza, indicating the rhyme schema, as can be seen in the example below:
<lg rhyme="aabccb" type="stanza">
<l>Vor seinem Löwengarten,</l>
<l>Das Kampfspiel zu erwarten,</l>
<l>Saß König Franz,</l>
<l>Und um ihn die Großen der Krone,</l>
<l>Und rings auf hohem Balkone</l>
<l>Die Damen in schönem Kranz.</l>
</lg>
The drawback of this annotation style is that it places the annotation where it does not belong, assuming that a poem only rhymes the words that appear at the end of a line, and that there are no exceptions.

For French, I found an interesting website called métrique en ligne, offering a large number of phonetically analyzed texts in French. They offer a rhyme analysis in an interactive fashion: one can have a look at a poem in raw form and then see which parts of the words appear in rhyme relation. A screenshot of the website (with the poem "Les Phares" from Charles Baudelaire) illustrates this annotation:



It is very nice that the project offers the rhyme annotation in such a clear form, annotating explicitly those parts of the words (albeit in orthography) that are supposed to be responsible for the rhyming. However, the annotation has a clear drawback, in that it provides rhyme annotation only on the level of the stanza, although we know well that quite a few poems have recurring rhymes that are reused across many stanzas, and we would like to acknowledge that in our annotation.

The most complete annotation of poetry that I have found so far is "MCFlow: A Digital Corpus of Rap Transcriptions" (Condit-Schultz 2017). The goal of the annotation was not to annotate rhyme in the first instance, but to provide a corpus that also takes the musical and rhythmic aspects of rap into account. As a result, it offers annotations along seven major aspects: rhythm, stress, tone, break, rhyme, pronunciation, and the lyrics themselves. The rhyme annotation itself is provided for each syllable (the texts are all syllabified), with capital letters indicating stressed, and lowercase letters indicating unstressed, syllables. Rhyme units (usually, but not necessarily, words) are marked by brackets. The following figure from Condit-Schultz (2017) illustrates this schema.

Annotation of rhymes by Condit-Schultz (2017)

What I do not entirely understand is the motivation for annotating unstressed syllables with the same (lowercase) letters as the stressed ones in a rhyme sequence. Given that the information about stress is generally available from the annotation, it seems redundant to add it, and it is not clear to me what purpose it serves, specifically because unstressed syllables do not necessarily rhyme in rhyme sequences. But apart from this, I find the information that this annotation schema provides quite convincing, although I find the format difficult to parse computationally; and I also imagine that it is quite difficult to annotate manually.

Initial reflections on rhyme annotation

When dealing with annotation schemas and trying to develop a framework for annotation, it is always useful to recall the Zen of Python, especially the first seven lines:
  • Beautiful is better than ugly.
  • Explicit is better than implicit.
  • Simple is better than complex.
  • Complex is better than complicated.
  • Flat is better than nested.
  • Sparse is better than dense.
  • Readability counts.
What I think we can extract from these seven lines are the following basic rules for an initial annotation schema for rhyme data.
  • First, ideally, we want an annotation schema that gives us the same look and feel that we know when reading a poem. This does not mean we need to store the full annotation in this schema, but for a quick editing of rhyme relations, such an annotation schema has many advantages.
  • Second, in order to maintain explicitness, all rhymes should be treated as rhyming globally inside a poem — we should never restrict annotation of rhymes to a single stanza, and we should also avoid brackets to mark rhyming sequences, as there are other ways to assign words to units.
  • Third, we should be explicit enough to show which parts of a word rhyme but, for now, I think it is not necessary to annotate all syllables at the same time. Since this would cost a lot of time, and specifically since syllabification differs from language to language, it seems better to add this information later on a language-specific basis, semi-automatically. Since many words repeat across poems, one can design a lookup-table to syllabify a word much more easily from a corpus that has been assembled, than adding the information when preparing each poem.
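The lookup-table idea from the last point can be sketched in a few lines (the table entries below are hypothetical examples, not taken from a real corpus):

```python
# Hypothetical syllabification lookup, filled once per corpus
# rather than while annotating each poem:
SYLLABLES = {
    "broken": ["bro", "ken"],
    "morning": ["mor", "ning"],
}

def syllabify(word):
    """Return the known syllabification of a word, falling back
    to the whole word when it has not been annotated yet."""
    return SYLLABLES.get(word.lower(), [word])

print(syllabify("broken"))  # ['bro', 'ken']
print(syllabify("grass"))   # ['grass']
```

Since words repeat heavily across poems, each table entry pays off many times, and the fallback makes the gaps in the table easy to find and fill semi-automatically.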

Towards a Standardized Annotation of Rhyme Data

Last year, we proposed an annotation schema for rhyme annotation (List et al. 2019). Our basic idea was inspired by tabular formats. These are used in linguistic software packages dealing with problems in computational historical linguistics, such as LingPy. They are also used as the backbone of the Cross-Linguistic Data Formats Initiative (Forkel et al. 2018), which uses tabular formats in combination with metadata in order to render linguistic datasets (wordlists, information on structural features) cross-linguistically comparable. Essentially, the format can be seen as a stand-off annotation, where the original data are not modified directly. While our basic format was rather powerful with respect to what can be annotated, it is also very difficult to code data in this format, at least in the absence of a proper annotation tool.

At the same time, to ease the initial preparation of annotated rhyme data conforming to these standards, we proposed an intermediate format, in which a poem was provided just in text form, with minimal markup for metadata, and in which rhymes could be annotated inline. As an example, consider the first two stanzas of the poem "Morning has broken" by Eleanor Farjeon (1881-1965):
@ANNOTATOR: Mattis
@CREATED: 2020-06-26 06:09:04
@TITLE: Morning has broken
@AUTHOR: Eleanor Farjeon
@BIODATE: 1881-1965
@YEAR: before 1965
@MODIFIED: 2020-06-26 06:09:46
@LANGUAGE: English

Morning has [a]broken like the first morning
Blackbird has [a]spoken like the first [b]bird
Praise for the [c]singing, praise for the morning
Praise for them [c]springing fresh from the [b]Word

Sweet the rain's [e]new_[f]fall, sunlit from heaven
Like the first [e]dew_[f]fall on the first [g]grass
Praise for the [d]sweet[h]ness of the wet garden
Sprung in com[d]plete[h]ness where His feet [g]pass
As you can see from this example, we start with some metadata (which is more or less free-form, consisting of statements of the form @key: value), and then render the stanzas, line by line, separating stanzas by one blank line. Rhymes are annotated by enclosing rhyme labels in square brackets before the part of the word responsible for the rhyme. If wanted, one can annotate rhymes for each syllable, as done in the rhyme words [d]sweet[h]ness and com[d]plete[h]ness, but one can also annotate the rhyme as a whole, as done in the rhyme words [a]broken and [a]spoken.
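For illustration, here is a small parser sketch for such annotated lines (the regex and function are my own, not part of the proposed standard); it splits every word into pairs of rhyme label and word part, with None marking unannotated material, and it also handles multi-label words like [d]sweet[h]ness:

```python
import re

# One capturing group, so re.split interleaves labels and word parts.
TAG = re.compile(r"\[([a-z+]+)\]")

def parse_line(line):
    """Parse one annotated line into a list of words, each word
    being a list of (rhyme_label, word_part) pairs."""
    parsed = []
    for word in line.split():
        pieces = TAG.split(word)  # e.g. ["", "a", "spoken"]
        parts = []
        if pieces[0]:  # material before the first tag is unannotated
            parts.append((None, pieces[0]))
        for label, chunk in zip(pieces[1::2], pieces[2::2]):
            parts.append((label, chunk))
        parsed.append(parts)
    return parsed

line = "Blackbird has [a]spoken like the first [b]bird"
for word in parse_line(line):
    print(word)
```

Because the label pattern also matches the plus sign, the same parser covers the "empty" rhyme symbol discussed further below, and the underscore joining rhyme units simply stays part of the word part.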

In order to assign words to rhyme units, an underscore can be used to indicate that two orthographic words are perceived as one unit in the rhyme, as is the case for [e]new_[f]fall rhyming with [e]dew_[f]fall. Furthermore, if a stanza reappears throughout a poem or song in the form of a refrain, this can be indicated by adding two spaces before all lines of the stanza.

Comments can be added by beginning a line with the hash symbol #, as shown in this small excerpt of Bob Dylan's "Sad-Eyed Lady of the Lowlands".
# [Verse 1]
With your mercury mouth in the missionary [c]times
And your eyes like smoke and your prayers like [c]rhymes
And your silver cross, and your voice like [c]chimes
Oh, who do they think could [i]bury_[j]you?
With your pockets well protected at [e]last
And your streetcar visions which ya' place on the [e]grass
And your flesh like silk, and your face like [e]glass
Who could they get to [i]carry_[j]you?

# [Chorus]
Sad-eyed lady of the lowlands
Where the sad-eyed prophet say that no man [a]comes
My warehouse eyes, my Arabian [a]drums
Should I put them by your [b]gate
Or, sad-eyed lady, should I [b]wait?
When testing this framework on many different kinds of poems from different languages and styles, I realized that the greedy rhyme annotation that I used (one places the rhyme tag before a word, and all letters that follow are considered to belong to that rhyme tag) has a disadvantage in those situations where syllables in multi-syllabic rhyme units essentially do not rhyme. As an example, consider the following lines from Eminem's "Not Afraid":
I'ma be what I set out to be, 
without a doubt, undoubtedly
And all those who look down on me,
I'm tearin' down your balcony
Here, the author plays with rhymes centering around the words out to be, undoubtedly, down on me, and balcony. Condit-Schultz has annotated the rhymes as follows (I use the rhyme schema inline for simplicity):
I'ma D|be what I set (C|out c|to D|be), 
wi(C|thout c|a) (C|doubt, c|un)(C|doub.c|ted.D|ly)
And all those who look (C|down c|on D|me),
I'm tearin' C|down your (C|bal.c|co.D|ny)
In my opinion, however, the parts annotated with c by Condit-Schultz do not really rhyme in these lines, they are mere fillers for the rhythm, while the most important rhyme parts, which are also perceived as such, are the stressed syllables with the main vowel ou. To mark that a syllable is not really rhyming, but also in order to mark the border of a rhyme (and thus allow indication that only the first syllable of a word rhymes with another word), I therefore decided to introduce a specific "empty" rhyme symbol, which is now represented by a plus. My annotation of the lines thus looks as follows:
I'ma be what I set [h]out_[+]to_[e]be, 
wi[h]thout a [h]doubt, un[h]doub[+]ted[e]ly
And all those who look [h]down_[d]on_[e]me
I'm tearin' down your bal[d]co[e]ny

An Interactive Tool for Rhyme Annotation

While I consider the inline annotation format now rather complete (with all of the limitations resulting from inline annotation), I realized, when trying to annotate poems using the format, that it is not fun to edit text files in this way. I am not talking about small edits, like one stanza, or typing in some metadata — annotating a whole rap song can become very tedious and even problematic, as one may easily forget which rhyme tags one has already used, overlook which words have been annotated as rhyming, or forget brackets and the like.

As a result, I decided to write an interactive rhyme annotation tool that supports the inline-annotation format and can be edited both in the text and interactively at the same time. This is a bit similar to the text processing programs in blogging software, which allow writing both in the HTML source and in a more convenient version that shows you what you will get.

The following screenshot from the database, for example, shows how the rhymes in Shakespeare's Sonnet 98 are visually rendered.

Visual display of Shakespeare's Sonnet 98

This tool is now already available online. I call it RhyAnT, which is short for Rhyme Annotation Tool. I have been using it in combination with a small server, to populate a first database with rhymes in different languages, which already contains more than 350 annotated poems. This database can be accessed and inspected by everybody interested, at AntRhyme; but copyrighted texts from modern songs can — unfortunately — not be rendered yet (as I am not sure how many I would be allowed to share).

I do not want to claim that I am gifted as a designer (I am surely not), and it is possible that there are better ways to implement the whole interface. However, I find it important to note that the format itself, with the coloring of rhyme words, has dramatically increased my efficiency at annotating rhyme data, and also my accuracy in spotting similarities.

Annotating the same poem with RhyAnT, the interactive rhyme annotator

The above screenshot shows how I can edit the poem via my edit access to the database. Alternatively, one can paste the text into the publicly accessible interface of the RhyAnT tool, edit the data there, and then copy-paste the result to store it. In this form, the interface can already be used by anybody who wants to annotate rhymes in their work.

Outlook

The current annotation framework that I have illustrated here cannot do everything, specifically because it does not allow for multi-layered annotation (Bański and Witt 2019: 230f), which would allow us to add pronunciation, rhythm, and many other aspects beyond rhyming alone. However, I hope that many of these aspects can later be added quickly, by creating lookup tables and processing the annotated corpus automatically. Following the Zen of Python, this seems to be much simpler than investing a lot of time in the creation of a highly annotated dataset that would discourage working with the data from the beginning.

References

Bański, Piotr and Witt, Andreas (2019) Modeling and annotating complex data structures. In: Julia Flanders and Fotis Jannidis (eds) The Shape of Data in the Digital Humanities: Modeling Texts and Text-based Resources. Oxford and New York: Routledge, pp. 217-235.

Baxter, William H. (1992) A Handbook of Old Chinese Phonology. Berlin: de Gruyter.

Condit-Schultz, Nathaniel (2017) MCFlow: A digital corpus of rap transcriptions. Empirical Musicology Review 11.2: 124-147.

Eckart, Kerstin (2012) Resource annotations. In: Clarin-D, AP 5 (ed.) Berlin: DWDS, pp. 30-42.

Forkel, Robert and List, Johann-Mattis and Greenhill, Simon J. and Rzymski, Christoph and Bank, Sebastian and Cysouw, Michael and Hammarström, Harald and Haspelmath, Martin and Kaiping, Gereon A. and Gray, Russell D. (2018) Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics. Scientific Data 5.180205: 1-10.

Haider, Thomas and Kuhn, Jonas (2018) Supervised rhyme detection with Siamese recurrent networks. In: Proceedings of Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 81-86.

List, Johann-Mattis and Nathan W. Hill and Christopher J. Foster (2019) Towards a standardized annotation of rhyme judgments in Chinese historical phonology (and beyond). Journal of Language Relationship 17.1: 26-43.

Milà‐Garcia, Alba (2018) Pragmatic annotation for a multi-layered analysis of speech acts: a methodological proposal. Corpus Pragmatics 2.1: 265-287.

Wáng, Lì 王力 (2006) Hànyǔ shǐgǎo 漢語史稿 [History of the Chinese language]. Běijīng 北京: Zhōnghuá Shūjú 中华书局.

General remarks on rhyming (From rhymes to networks 2)


In this month's post, I want to provide some general remarks on rhyming and rhyme practice. I hope that they will help lay the foundations for tackling the problem of rhyme annotation in the next post. Ideally, I should provide a maximally unbiased overview that takes all languages and cultures into account. However, since this would be an impossible task at this time (at least for myself), I hope that I can, instead, look at the phenomenon from a viewpoint that is a bit broader than the naive prescriptive accounts of rhyming with which teachers mentally torture young school kids.

What is a rhyme?

It is not easy to give an exact and exhaustive definition of rhyme. As a starting point, one can have a look at Wikipedia, where we find the following definition:
A rhyme is a repetition of similar sounds (usually, exactly the same sound) in the final stressed syllables and any following syllables of two or more words. Most often, this kind of perfect rhyming is consciously used for effect in the final positions of lines of poems and songs. Wikipedia: s. v. "Rhyme", accessed on 21.05.2020
This definition is a good starting point, but it does not apply to rhyming in general, but rather to rhyming in English as a specific language. While stress, for example, seems to play an important role in English rhyming, we don't find stress being used in a similar way in Chinese, so if we tie a definition of rhyming to stress, we exclude all of those languages in which stress plays a minor role or no role at all.

Furthermore, the notion of similar and identical sounds is also problematic from a cross-linguistic perspective on rhyming. It is true that rhyming requires some degree of similarity of sounds, but where the boundaries are placed, and how the similarity is defined in the end, can differ from language to language and from tradition to tradition. Thus, while in German poetry it is fine to rhyme words like Mai [mai] and neu [noi], it is questionable whether English speakers would ever think that words like joy could form a rhyme with rye. Irish seems to be an extreme case of very complex rules underlying what counts as a rhyme, where consonants are clustered into certain classes (b, d, g, or ph, f, th, ch) that are defined to rhyme with each other (provided the vowels also rhyme); as a result, words like oba and foda are judged to be good rhymes (Cuív 1966).
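The Irish rule lends itself to a tiny sketch of such class-based rhyming (my own toy code; the table contains only the two classes just mentioned, and everything else is passed through unchanged):

```python
# Toy model of class-based rhyme, after the Irish example:
# consonants in the same class count as equivalent for rhyming.
RHYME_CLASSES = {
    "b": 1, "d": 1, "g": 1,             # one class of stops
    "ph": 2, "f": 2, "th": 2, "ch": 2,  # one class of fricatives
}

def rhyme_key(ending):
    """Replace each consonant (or digraph) by its class number,
    keeping vowels and unknown letters as they are."""
    key, i = [], 0
    while i < len(ending):
        for length in (2, 1):  # try digraphs like 'th' first
            seg = ending[i:i + length]
            if seg in RHYME_CLASSES:
                key.append(RHYME_CLASSES[seg])
                i += length
                break
        else:
            key.append(ending[i])  # vowel or unclassified letter
            i += 1
    return key

# oba and foda count as rhymes: b and d fall into the same class.
print(rhyme_key("oba") == rhyme_key("oda"))  # True
```

Under this model, the endings oba and oda both map to the key ['o', 1, 'a'], which is why oba and foda pass as good rhymes once the onset f is set aside.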

When looking at philological descriptions of rhyme traditions of individual languages, we often find a distinction between perfect rhymes on the one hand and imperfect rhymes on the other. But what counts as perfect or imperfect often differs from language to language. Thus, while French largely accepts the rhyming of words that sound identical, this is considered less satisfactory in English and German, and studies seem to have confirmed that speakers of French and English indeed differ in their intuitions about rhyme in this regard (Wagner and McCurdy 2010).

Peust (2014) discusses rhyme practices across several languages and epochs, suggesting that similarity in rhyming is based on some sort of rhyme phonology that would account for the differences in rhyme judgments across languages. While the ordinary phonology of a language is a classical device in linguistics for determining those sounds that are perceived as being distinctive in a given language, rhyme phonology can achieve the same for rhyming in individual languages.

While this idea has some appeal at first sight, given that the differences in rhyme practice across languages often follow very specific rules, I am afraid it may be too restrictive. Instead, I rather prefer to see rhyming as a continuum, in which a well-defined core of perfect rhymes is surrounded by various instances of less perfect rhymes, with language-specific patterns of variation that one would still have to compare in detail.

Beyond perfection

If we accept that all languages have some notion of a perfect rhyme, which they distinguish from less perfect rhymes that will nevertheless still be accepted as rhymes, it is useful to have a quick look at differences in deviation from the perfect. German, for example, is often cited as a language in which vowel differences in rhymes are treated rather loosely; and, indeed, we find that diphthongs like the above-mentioned [ai] and [oi] are perceived as rhyming well by most German speakers. In popular songs, however, we find additional deviations from the perceived norm, which are usually not discussed in philological descriptions of German rhyming. Thus, in the famous German Schlager Griechischer Wein by Udo Jürgens (1934-2014), we find the following introductory lines:
Es war schon dunkel, als ich durch Vorstadtstrassen heimwärts ging.
Da war ein Wirtshaus, aus dem das Licht noch auf den Gehsteig schien.
[Translation: It was already dark, when I went through the streets outside of the city. There was a pub which still emitted light that was shining on the street.]
There is no doubt that the artist intended these two lines to rhyme, given that the song follows a strict AABCCB rhyme scheme. So, in this particular case, the artist judged that rhyming ging [gɪŋ] with schien [ʃiːn] would be better than not attempting a rhyme at all; and this shows that it is difficult to assume one strict notion of rhyme phonology that could guide all of the decisions that humans make when they create poems.

More extreme cases of permissive rhyming can be found in some traditions of English poetry, including Hip Hop (of course), but also the work of Bob Dylan, who does not have a problem rhyming time with fine, used with refused, or own with home, as in Like a Rolling Stone. In Spanish, where we also find a distinction between perfect (rima consonante) and imperfect rhyming (rima asonante), basically all that needs to coincide are the vowels, which allows Silvio Rodríguez to rhyme amor with canción in Te doy una canción.

While most languages coincide on the notion of perfect rhymes (notwithstanding certain differences due to general differences in their phonology), the interesting aspects for rhyming are those where they allow for imperfection. Given that rhyming seems to be something that reflects, at least to some extent, a general linguistic competence of the native speakers, a comparison of the practices across languages and cultures may help to shed light on general questions in linguistics.

Rhyming is linear

When discussing with colleagues the idea of making annotated rhyme corpora, I was repeatedly pointed to the worst cases, which I would never be able to capture. This is typical for linguists, who tend to see the complexities before they see what's simple, and who often prefer to not even try to tackle a problem before they feel they have understood all the sub-problems that could arise from the potential solution they might want to try.

One of the worst cases, when we developed our first annotation format as presented last year (List et al. 2019), was the problem of intransitive rhyming. The idea behind this is that imperfect rhyming may lead to a situation where one word rhymes with a word that follows, and this again rhymes with a word that follows that, but the first and the third would never really rhyme themselves. We find this clearly stated in Zwicky (1976: 677):
Imperfect rhymes can also be linked in a chain: X is rhymed (imperfectly) with Y, and Y with Z, so that X and Z may count as rhymes thanks to the mediation of Y, even when X and Z satisfy neither the feature nor the subsequence principle.
Intransitive rhyming is, indeed, a problem for annotation, since it would require that we think of very complex annotation schemas in which we assign words to individual rhyme chains instead of just assigning them to the same group of rhymes in a poem or a song. However, one thing that I realized afterwards, and which one should never forget, is: rhyming is linear. Rhyming proceeds in a chain: we first hear one line, then we hear another line, and so on, so that each line is based on a succession of words that we all hear through time.
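If one takes this linearity seriously, chain-based linking can be sketched in a few lines. The following is an illustration of the idea, not the annotation format of List et al. (2019): each rhyme word is linked only to the first sufficiently similar word that follows it, so Zwicky-style chains emerge without any global rhyme groups. The similarity measure is a deliberately crude placeholder.

```python
# Sketch: chain-based rhyme linking. Each rhyme word is linked only to
# the first sufficiently similar word that follows it, so intransitive
# chains (X-Y, Y-Z, but no direct X-Z link) arise naturally.

def rhyme_similarity(w1, w2):
    """Toy measure: length of the shared word-final substring."""
    n = 0
    for a, b in zip(reversed(w1), reversed(w2)):
        if a != b:
            break
        n += 1
    return n

def rhyme_chain_links(rhyme_words, threshold=2):
    """Link each word to its first sufficiently similar successor."""
    links = []
    for i, word in enumerate(rhyme_words):
        for later in rhyme_words[i + 1:]:
            if rhyme_similarity(word, later) >= threshold:
                links.append((word, later))
                break  # only the nearest match counts
    return links

print(rhyme_chain_links(["ging", "ring", "sing"]))
# [('ging', 'ring'), ('ring', 'sing')]
```

Because each word only looks forward to its nearest match, the output is a set of directed links ordered in time, which is exactly the chain structure that Zwicky describes.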

It is just as the famous Ferdinand de Saussure (1857-1913) said about the linguistic sign and its material representation, which can be measured in a single dimension ("c'est une ligne", Saussure 1916: 103). Since we perceive poetry and songs in a linear fashion, we should not be surprised that the major attention we give to a rhyme when perceiving it is on those words that are not too far away from each other in their temporal arrangement.

The same holds accordingly for the concrete comparison of words that rhyme: since words are sequences of sounds, the similarity of rhyme words is a similarity of sequences. This means we can make use of the typical methods for automated and computer-assisted sequence comparison in historical linguistics, which have been developed during the past twenty years (see the overview in List 2014), when trying to analyze rhyming across different languages and traditions.
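As a baseline for such sequence comparison, even the plain Levenshtein distance already gives a gradient notion of rhyme word similarity. The methods discussed in List (2014) refine this with weighted alignments over sound classes; the sketch below does not attempt that.

```python
# Sketch: comparing rhyme words as sequences with plain edit distance
# (Wagner-Fischer dynamic programming). Real methods for historical
# linguistics use weighted alignments over sound classes instead.

def edit_distance(s, t):
    """Classic Levenshtein distance between two sequences."""
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

# Dylan's imperfect rhyme differs in two segments:
print(edit_distance("time", "fine"))  # 2
```

A refinement in the spirit of rhyme comparison would weight mismatches near the end of the word more heavily than those at the beginning, since it is the word-final material that carries the rhyme.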

Conclusion

When writing this post, I realized that I still feel like I am swimming in an ocean of ignorance when it comes to rhyming and rhyming practices, and how to compare them in a way that takes linguistic aspects into account. I hope that I can make up for this in the follow-up post, where I will introduce my first solutions for a consistent annotation of poetry. By then, I also hope it will become clearer why I give so much importance to the notion of imperfect rhymes, and the emphasis on the linearity of rhyming.

References

Ó Cuív, Brian (1966) The phonetic basis of Classical Modern Irish rhyme. Ériu 20: 94-103.

List, Johann-Mattis (2014) Sequence Comparison in Historical Linguistics. Düsseldorf: Düsseldorf University Press.

List, Johann-Mattis and Nathan W. Hill and Christopher J. Foster (2019) Towards a standardized annotation of rhyme judgments in Chinese historical phonology (and beyond). Journal of Language Relationship 17.1: 26-43.

Peust, Carsten (2014) Parametric variation of end rhyme across languages. In: Grossmann et al. Egyptian-Coptic Linguistics in Typological Perspective. Berlin: Mouton de Gruyter, pp. 341-385.

de Saussure, Ferdinand (1916) Cours de linguistique générale. Lausanne: Payot.

Wagner, M. and McCurdy, K. (2010) Poetic rhyme reflects cross-linguistic differences in information structure. Cognition 117.2: 166-175.

Zwicky, Arnold (1976) Well, this rock and roll has got to stop. Junior's head is hard as a rock. In: Papers from the Twelfth Regional Meeting of the Chicago Linguistic Society, pp. 676-697.

From rhymes to networks (A new blog series in six steps)


Whenever one feels stuck in solving a particular problem, it is useful to split this problem into parts, in order to identify exactly where the problems are. The problem that is vexing me at the moment is how to construct a network of rhymes from a set of annotated poems, either by one and the same author, or by many authors who wrote during the same epoch in a certain country using a certain language.

For me, a rhyme network is a network in which words (or parts of words) occur as nodes, and weighted links between the nodes indicate how often the linked words have been found to rhyme in a given corpus.
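In the simplest case, such a network can be built by counting pairwise co-occurrences within annotated rhyme groups. The sketch below uses plain counts as edge weights; List (2016) additionally normalizes these weights by rhyme-group size, which is omitted here for simplicity.

```python
from itertools import combinations
from collections import Counter

# Sketch: build a weighted rhyme network from annotated rhyme groups.
# Every pair of words in the same group gets an edge; the weight counts
# how often the pair co-occurs across the corpus. (List 2016 normalizes
# these counts by group size, which is left out here.)

def build_rhyme_network(rhyme_groups):
    """rhyme_groups: iterable of lists of words that rhyme together."""
    edges = Counter()
    for group in rhyme_groups:
        # sort so each undirected edge has one canonical key
        for w1, w2 in combinations(sorted(set(group)), 2):
            edges[(w1, w2)] += 1
    return edges

stanzas = [
    ["ging", "schien"],  # Griechischer Wein, imperfect rhyme
    ["time", "fine"],    # Like a Rolling Stone
    ["time", "fine"],
]
network = build_rhyme_network(stanzas)
print(network[("fine", "time")])  # the pair rhymed twice -> 2
```

The resulting edge list can be fed directly into any graph library for visualization or analysis; only the weighting scheme (raw counts here) is the contentious design decision.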

An example

As an example, the following figure illustrates this idea for the case of two Chinese poems, where the rhyme words represented by Chinese characters are linked to form a network (taken from List 2016).


Figure 1: Constructing a network of rhymes in Chinese poetry (List 2016)

One may think that it is silly to make a network from rhymes. However, experiments on Chinese rhyme networks (of which I have reported in the past) have proven to be quite interesting, specifically because they almost always show one large connected component. I find this fascinating, since I would have expected that we would see multiple connected components, representing very distinct rhymes.

It is obvious that some writers don't have a good feeling for rhymes and fail royally when they try to do it — this happens across all languages and cultures in which rhyming plays a role. However, it was much less obvious to me that rhyming can be seen to form at least some kind of a continuum, as you can see from the rhyme networks that we have constructed from Chinese poetry (again) in the past (taken from List et al. 2017).
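Whether a rhyme network forms one large connected component, as in the Chinese experiments, can be checked with a simple traversal over the list of rhyming pairs. A minimal breadth-first sketch:

```python
from collections import defaultdict, deque

# Sketch: find the connected components of a rhyme network given as a
# plain list of rhyming word pairs, using breadth-first search.

def connected_components(edges):
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        queue, component = deque([node]), set()
        while queue:
            current = queue.popleft()
            if current in component:
                continue
            component.add(current)
            queue.extend(graph[current] - component)
        seen |= component
        components.append(component)
    return components

# Two unrelated rhyme classes -> two components
edges = [("ging", "schien"), ("schien", "ring"), ("time", "fine")]
print(len(connected_components(edges)))  # 2
```

Applied to a real corpus, the interesting question is not just the number of components but their size distribution: one giant component with a few stragglers is exactly the pattern reported for the Chinese data.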


Figure 2: A complete rhyme network of poems in the Book of Odes (ca. 1000 BC, List et al. 2017)

The current problem

My problem now is that I do not know how to do the same for rhyme collections in other languages. During recent months, I have thought a lot about the problem of constructing rhyme networks for languages such as English or German. However, I always came to a point where I felt stuck, realizing that I actually did not know at all how to deal with this.

I thought, first, that I could write one blog post listing the problems; but the more I thought about it, the more I realized that there were so many problems that I could barely do it in one blog post. So, I decided that I could just do another series of blog posts (after the nice experience from the series on open problems in computational historical linguistics I posted last year), but this time devoted solely to the question of how one can get from rhymes to networks.

So for the next six months, I will discuss the four major issues that keep me from presenting German or English rhyme networks here and now. I hope that at the end of this discussion I may even have solved the problem, so that I will then be able to present a first rhyme network of Goethe, Shakespeare, or Bob Dylan. (I would not do Eminem, as the rhymes are quite complex, and tedious to annotate).

Summary of the series

Before we can start to think about the modeling of rhyme patterns in rhymed verse, we need to think about the problem in general, and discuss how rhyming shows up in different languages. So, I will start the series with the problem of rhyming in general, by discussing how languages rhyme, where these practices differ, and what we can learn from these differences. Having looked into this, we can think about ways of annotating rhymes in texts in order to acquire a first corpus of examples. So, the following post will deal with the problems that we encounter when trying to annotate the rhyme words that we identify in poetry collections.

If one knows how to annotate something, one will sooner or later get impatient, and long for faster ways to do these boring tasks. Since this also holds for the manual annotation of rhyme collections (which we need for our rhyme networks), it is obvious to think about automated ways of finding rhymes in corpora — that is, to think about the inference of rhyme patterns, which can also be done semi-automatically, of course. So the major problems related to automated rhyme detection will be discussed in a separate post.

Once this is worked out, and one has a reasonably large corpus of rhyme patterns, one wants to analyze it — and the way I want to analyze annotated rhyme corpora is with the help of network models. But, as I mentioned before, I realized that I was stuck when I started to think about rhyme networks of German and English (which are relatively easy languages, one should think). So, it will be important to discuss clearly what seems to be the best way to construct rhyme networks as a first step of analysis. This will therefore be dealt with in a separate blogpost. In a final post, I then plan to tackle the second analysis step, by discussing very briefly what one can do with rhyme networks.

All in all, this makes for six posts (including this one); so we will be busy for the next six months, thinking about rhymes and poetry, which is probably not the worst thing one can do. I hope, but I cannot promise at this point, that this gives me enough time to stick to my ambitious annotation goals, and then present you with a real rhyme network of some poetry collection, other than the Chinese ones I already published in the past.

References

List, Johann-Mattis, Pathmanathan, Jananan Sylvestre, Hill, Nathan W., Bapteste, Eric, Lopez, Philippe (2017) Vowel purity and rhyme evidence in Old Chinese reconstruction. Lingua Sinica 3.1: 1-17.

List, Johann-Mattis (2016) Using network models to analyze Old Chinese rhyme data. Bulletin of Chinese Linguistics 9.2: 218-241.

Evolution unchained: The development of person names and the limits of sequences


What do person names like Jack and Hans have in common, and what unites Joe and Pepe? Both name pairs go back to a common ancestor. For Jack and Hans, this would be John (ultimately going back to Iōánnēs in Greek), and for Joe and Pepe, this would be Josef (originally from Hebrew). Given the striking dissimilarity of the names in their current form, the pathways of change by which they have evolved into their current shape are quite complicated.

While the German name Hans can easily be shown to be a short form of the German variant Johannes, the evolution of Jack is more complicated. First (at least this is what people on Wikipedia suppose), Iōánnēs becomes John in English, similar to the process that transformed German Johannes into Hans. Then, in an older form of English, a diminutive was built for John, which yielded the form Jenkin, with the diminutive suffix -kin that has a homologous counterpart in German -chen (which can be attached to Hans as well, yielding Hänschen). Etymologically, Jack is little Johnny.

While Joe in English is a shortening of Josef, the development of Pepe is again a bit more complex. First, we find the form Giuseppe as an Italian counterpart of Josef. How this form then yielded Pepe as a diminutive is not completely clear to me; but since we find the pe in the Italian form, we can think of a process by which Giuseppe becomes Giuseppepe, leaving Pepe after the deletion of the initial two syllables.

The complexity of person-name evolution

Even from these two examples alone, we can already see that the evolution of person names can easily become quite complex. If all words in all spoken languages in the world evolved in the same way in which our person names evolve, we would have a big problem in historical linguistics, since the amount of speculation in our etymologies would drastically increase.

When comparing etymologically related words from different languages, we generally assume that they show regular correspondences among their sound segments. This presupposes that there is still enough sound material that reflects these correspondences, allowing us to detect and assess them. But since the evolution of person names rarely consists of the regular modification of sounds, but rather results in the deletion, reduplication, and rearrangement of whole word parts, there is rarely enough left in the end that could be used as the basis for a classical sequence comparison.

With the name Tina in German being the short form of Bettina, Christina, and at times even Katharina, and with Bettina itself going back to Elisabeth, and with Tina becoming Tinchen, Tinka, or Tine, we face an almost insurmountable challenge when trying to model the complexity of the various patterns by which names can change.

Modeling word derivation with directed networks

That words do not evolve solely by the alternation of sounds, but also by different forms of derivation, is nothing new for historical linguistics. We face the problem, for example, when looking for etymologically related words in the basic lexicon of phylogenetically related languages. However, these phenomena can be easily investigated by enhanced means of annotation. The evolution of person names, on the other hand, presents us with larger challenges.

While working as a research fellow in France in 2015-2016, I had the time to develop a small tool that allows us to represent derivational relations between related words with the help of a directed network, and thus to model these relations in a rough way. In such a graph, the words are the nodes of the network, with directed edges drawn from the assumed ancestral word forms to their descendants. This tool, which I called DeriViz, is still available online, and makes it possible to visualize network relations between words.

I have now conducted a small experiment with this tool, by taking name variants of Elisabeth, as they are listed in Wikipedia, and trying to model them in a directed network, along with intermediate stages. You can do this easily yourself, by copying the network that I have constructed in text form below, and pasting it into the field for data entry at the DeriViz-Homepage. The network will be visualized when you press on the OK button; and you can play with it by dragging it around.
Elisabeth → BETT
BETT → Betty
BETT → Bettina
BETT → Bettine
BETT → Betsi
Elisabeth → ELISABETH
ELISABETH → Elise
ELISABETH → Elsbeth
ELISABETH → Else
ELISABETH → Elina
Elisabeth → ILSA
ILSA → Ilsa
ILSA → Ilse
Elisabeth → Isabella
Elisabeth → LISA
LISA → Lieschen
LISA → Liese
LISA → Liesel
LISA → Lis
LISA → Lisa
LISA → Lisbeth
LISA → Lisette
LISA → Lise
LISA → Liesl
Elisabeth → LILA
LISA → Lila
LISA → Liliane
LISA → Lilian
LISA → Lilli
Elisabeth → Sisi
I intentionally reduced the amount of data here, in order to make sure that the graphic can still be inspected. But it is clear that even this simple model, which assumes unique ancestor-descendant relations among all of the derived person names, is stretched to its limits when applied to names as productive as Elisabeth, at least as far as the visualization is concerned.
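The arrow notation above is also easy to process further. As a small illustrative sketch (not part of DeriViz itself), one can parse it into a directed graph and list all names reachable from a given ancestor:

```python
from collections import defaultdict

# Sketch: parse the "ancestor → descendant" text format shown above
# into a directed graph and collect all descendants of a node.

def parse_derivation(text):
    graph = defaultdict(list)
    for line in text.strip().splitlines():
        ancestor, _, descendant = line.partition("→")
        graph[ancestor.strip()].append(descendant.strip())
    return graph

def descendants(graph, node):
    """All names reachable from `node` by following derivations."""
    result, stack = [], [node]
    while stack:
        for child in graph.get(stack.pop(), []):
            result.append(child)
            stack.append(child)
    return result

graph = parse_derivation("""
Elisabeth → LISA
LISA → Lisa
LISA → Lisbeth
Elisabeth → ILSA
ILSA → Ilse
""")
print(sorted(descendants(graph, "Elisabeth")))
```

The traversal assumes the derivation graph is acyclic, which holds for ancestor-descendant relations; modeling different kinds of processes would, as noted below, require typed edges rather than this single edge relation.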

Derivation network of names derived from Elisabeth


If you now imagine that there are various processes that turn an ancestral name into a descendant name, and that one would ideally want to model the differences between these processes as well, one can see easily that it is indeed not a trivial problem to model the evolution of person names (and we are not even speaking of inferring any of these relations).

How names evolve

Names evolve in various ways along different dimensions. With respect to their primary function, or their use, we find, among others, nicknames. Formally, nicknames are often a short form of an original name; but depending on the community of speakers, it is also possible that there is a formal procedure by which a nickname can be derived from a base name. Thus, every speaker of Russian should know that Jekaterina can be turned into Katerina, which can be turned into Katja, which can be turned into Katjuscha, or, in the case of a vocative, into Katj. Once the primary function of a name changes, its form usually also changes, as we can see in many examples.

But the form can also change when a name crosses language borders. If you go with your name into another country, and the speakers have problems pronouncing certain sounds that occur in your name, it is very likely that they will adjust your name's pronunciation to the phonetic needs of their own language, and modify it. Names cross language borders very quickly, since we tend not to leave them at home when visiting or migrating to foreign countries. As a result, a great deal of the diversity of person names observed today is due to the migration of names across the world's larger linguistic communities.

How we change names when building short forms or nicknames, or when trying to adapt a name to a given target language, depends on the structure of the language. The most important part is the phonology of the language in which the change happens. For example, when transferring a name from one language to another, if the new language lacks some of the sounds in the original name, speakers will replace them with those sounds that they perceive to be closest to the missing ones.

But the modification is not restricted to the replacement of sounds. My own given name, Mattis, for example, usually has the stress on the first syllable, but in France, most people tend to call me Matisse, with the accent on the second syllable, reflecting the general tendency to stress the last syllable of a word in French. In Russian, on the other hand, Mattis could be perfectly pronounced, but since people do not know the name, they often confuse it with its variant Matthias, which then sounds like Matjes when pronounced in Russian (which is the name for soused herring in Germany). There are more extreme cases; and both English and German speakers are also good at drastically adjusting foreign names to the needs of their mother tongues.

It would be nice if it was possible to investigate the huge diversity in the evolution of person names more systematically. In principle, this should be possible. I think, starting from directed networks is definitely a good idea; but it would probably have to be extended by distinguishing different types of graph edges. Even if a given selection may not handle all of the processes known to us, it might help to collect some primary data in the first place.

With a large enough set of well-annotated data, on the other hand, one might start to look into the development of algorithms that could infer derivation relationships between person names; or one could analyze the data and search for the most frequent processes of person name evolution. Many more analyses might be possible. One could see to which degree the processes differ across languages, or how names migrate from one language to another across times, usage types, and maybe even across fashions.

Outlook

I assume that the result of such a collection would be interesting not only for couples who are about to replicate themselves, but also for historical research and research in the field of cultural evolution. Whether such a collection will ever exist, however, seems questionable. The problem is that there are not enough scholars in the world who would be interested in this topic, as one can see from the very small number of studies that have been devoted to the problem up to now (as one of the few exceptions known to me, compare the nice overview of person name classification by Handschuh 2019). I myself would not be able to help in this endeavour, given that I lack the scholarly competence for investigating name evolution. But I would surely like to investigate and inspect the results, if they ever become available.

Reference

Handschuh, Corinna (2019) The classification of names. A crosslinguistic study of sex-specific forms, classifiers, and gender marking on personal names. STUF — Language Typology and Universals 72.4: 539-572.

How should one study language evolution?

This is a joint post by Justin Power, Guido Grimm, and Johann-Mattis List.

Like in biology, we have two basic possibilities for studying how languages evolve:
  • We set up a list of universal comparanda. These should occur in all languages and show a high enough degree of variation that we can use them as indicators of how languages have evolved;
  • We create individual lists of comparanda. These are specific for certain language groups that we want to study.
Universal comparanda

While most studies would probably aim to employ a set of universal comparanda, the practice often requires a compromise solution in which some non-universal characteristics are added. This holds, for example, for the idea of a core genome in biology, which ends up being so small in overlap across all living species that it makes little sense to compute phylogenies based on it, except for closely related species (Dagan and Martin 2006). Another example is the all-inclusive matrices that are used to establish evolutionary relationships of extinct animals, characterized by high levels of missing data (e.g. Tschopp et al. 2015; Hartman et al. 2019). The same holds for historical linguistics, with the idea of a basic lexicon or basic vocabulary, represented by a list of basic concepts that are supposed to be expressed by simple words in every human language (Swadesh 1955), given that the number of concepts represented by simple words shared across all human languages is extremely small (Hoijer 1956).

Figure 1: All humans have hands and arms but some words for ‘hands’ and ‘arms’ address different things (see our previous post "How languages loose body parts").


Apart from the problem that basic vocabulary concepts occurring in all languages may be extremely limited, test items need to fulfill additional characteristics that may not be easy to find, in order to be useful for phylogenetic studies. They should, for example, be rather resistant to processes of lateral transfer or borrowing in linguistics. They should preferably be subject to neutral evolution, since selective pressure may lead to parallel but phylogenetically independent processes (in biology known as convergent evolution) that are difficult to distinguish and can increase the amount of noise in the data (homoplasy).

Selective pressure, as we might find, for example, in a specific association between certain concepts and certain sounds across a large phylogenetically independent sample of human languages, is rarely considered to be a big problem in historical linguistics studies dealing with the evolution of spoken languages (see Blasi et al. 2016 for an exception). In sign language evolution, however, the problem may be more acute because of a similar iconic motivation of many lexical signs in phylogenetically independent sign languages (Guerra Currie et al. 2002), as well as the representation of concepts such as body parts and pronouns using indexical signs with similar forms. This latter characteristic of all known sign languages has led to the design of a basic vocabulary list that differs from those traditionally used in the historical linguistics of spoken languages (Woodward 1993); and we know of only one proposal attempting to address the problem of iconicity in sign languages for phylogenetic research (Parkhurst and Parkhurst 2003).

Figure 2: Basic processes in the evolution of languages, spoken or signed  (see our previous post How languages loose body parts).

All in all, it seems that there may be no complete solution for a list of lexical comparanda for all human languages, including sign languages, given the complexities of lexical semantics, the high variability in expression among the languages of the world (see Hymes 1960 for a detailed discussion on this problem), and the problems related to selective pressures highlighted above. Scholars have proposed alternative features for comparing languages, such as grammatical properties (Longobardi et al. 2015) or other "structural" features (Szeto et al. 2018), but these are either even more problematic for historical language comparison—given that it is never clear if these alternative features have evolved independently or due to common inheritance—or they are again based on a targeted selection for a certain group of languages in a certain region.

Targeted comparanda

If there is no universal list of features that can be used to study how languages have evolved, we have to resort to the second possibility mentioned above, by creating targeted lists of comparanda for the specific language groups whose evolution we want to study. When doing so, it is best to aim at a high degree of universality in the list of comparanda, even if one knows that complete universality cannot be achieved. This practice helps to compare a given study with alternative studies; it may also help colleagues to recycle the data, at least in part, or to merge datasets for combined analyses, if similar comparanda have been published for other languages.

But there are cases where this is not possible, especially when conducting studies where no previous data have been published, and rigorous methods for historical language comparison have yet to be established. Sign languages can, again, be seen as a good example for this case. So far, few phylogenetic studies have addressed sign language evolution, and none have supplied the data used in putting forward an evolutionary hypothesis. Furthermore, because the field lacks unified techniques for the transcription of signs, it is extremely difficult to collect lexical data for a large number of sign languages from comparable glossaries, wordlists, and dictionaries, the three primary sources, apart from fieldwork, that spoken language linguists would use in order to start a new data collection. We are aware of one comparative database with basic vocabulary for sign languages that is currently being built (Yu et al. 2018), and that may represent lexical items in a way that can be compared efficiently, but these data have not yet been made available to other researchers.

Sign languages

When Justin Power approached Mattis about three years ago, asking if he wanted to collaborate on a study relating to sign language evolution, we quickly realized that it would be infeasible to gather enough lexical data for a first study. Tiago Tresoldi, a post-doc in our group, suggested the idea of starting with sign language manual alphabets instead. From the start, it was clear that these manual alphabets might have certain disadvantages — because they are used to represent written letters of a different language, they may constitute a set of features evolving independently from the sign language itself.

Figure 3: Processes shaping manual alphabets. The evolution of signed concepts may be affected by the same, leading to congruent patterns, or different processes, leading to incongruent differentiation patterns (see our previous post: Stacking networks based on sign language manual alphabets).

But on the other hand, the data had many advantages. First, a sufficient number of examples for various European sign languages were available in online databases that could be transcribed in a uniform way. Second, the comparison itself was facilitated, since in most cases there was no ambiguity about which “concepts” to compare, in contrast to what one would encounter in a comparison of lexical entries. For example, an “a” is an “a” in all languages. Third, it turned out that for quite a few languages, historical manual alphabets could be added to the sample. This point was very important for our study. Given that scholars still have limited knowledge regarding the details of sign change in sign language evolution, it is of great importance to compare sources of the same variety, or those assumed to be the same, across time—just as spoken language linguists compared Latin with Spanish and Italian in order to study how sounds change over time. And finally, manual alphabets in fact constitute an integrated part of many sign languages that may, for example, contribute to the forms of lexical signs, making the idea more plausible that an understanding of the evolution of manual alphabets could be informative about the evolution of sign languages as a whole.

Figure 4: Early evolution of handshapes used to sign ‘g’ (see our previous post: Character cliques and networks – mapping haplotypes of manual alphabets).

Guido later joined our team, providing the expertise to analyze the data with network methods that do not assume tree-like evolution a priori. We therefore thought that we had done a rather good job when our pilot study on the evolution of sign language manual alphabets, titled Evolutionary Dynamics in the Dispersal of Sign Languages, finally appeared last month (Power et al. 2020). We identified six basic lineages from which the manual alphabets of the 40 contemporary sign languages developed. The term "lineage" was deliberately chosen in this context, since it was unclear whether the evolution of the manual alphabets should be seen as representative of the evolution of the sign languages as a whole. We also avoided the term "family", because we were wary of making potentially unwarranted assumptions about sign language evolution based on theories in historical linguistics.

Figure 5: The all-inclusive Neighbor-net (taken from Power et al. 2020).

While the study was positively received by the popular media, and even made it onto the front page of the Süddeutsche Zeitung (one of the largest daily newspapers in Germany), there were also misrepresentations of our results in some media channels. The Daily Mail (in the UK), in particular, invented the claim that all human sign languages have evolved from five European lineages. Of course, our study never said this, nor could it have, since only European sign languages were included in our sample. (We included three manual alphabets representing Arabic-based scripts from Afghan, Jordanian, and Pakistan Sign Languages, where there was some indication that these may have been informed by European sources.)

Study of phylogenetics

While we share our colleagues’ distaste for the Daily Mail’s likely purposeful misrepresentation (which, unfortunately, may in the end have achieved its purpose as clickbait), some colleagues went a bit further. One critique raised in reaction to the Daily Mail piece was that our title opens the door to misinterpretation: because we had only investigated manual alphabets, we cannot, so the argument goes, say anything about the "evolutionary dynamics of sign languages".

While the title does not mention manual alphabets, it should be clear that any study on evolution is based on a certain amount of reduction. Where and how this reduction takes place is usually explained in the studies. Many debates in historical linguistics of spoken languages have centered around the question of what data are representative enough to study what scholars perceive as the "overall evolution" of languages; and scholars are far from having reached a communis opinio in this regard. At this point, we simply cannot answer the question of whether manual alphabets provide clues about sign language evolution that contrast with the languages’ "general" evolution, as expressed, for example, in selecting and comparing 100 or 200 words of basic vocabulary. We suspect that this may, indeed, be the case for some sign languages, but we simply lack the comparative data to make any claims in this respect.

Figure 6: Evolution doesn’t mean every feature has to follow the same path: a synopsis of molecular phylogenies inferred for oaks, Quercus, and their relatives, Fagaceae (upcoming post on Res.I.P.) While nuclear differentiation matches phenotypic evolution and the fossil record (likely monophyla in bold font), the evolution of the plastome is partly decoupled (gray shaded: paraphyletic clades). Likewise, we can expect that different parts of languages, such as manual alphabets vs. core “lingome” of sign languages, may indicate different relationships.

The philosophical question, however, goes much deeper, to the "nature" of language: What constitutes a language? What do all languages have in common? How do languages change? What are the best ways to study how languages evolve?

One approach to answering these questions is to compare collectible features of languages ("traits" in biology), and to study how they evolve. As the field develops, we may find that the evolution of a manual alphabet does not completely coincide with the evolution of the lexicon or grammar of a sign language. But would it follow from such a result that we have learned nothing about the evolution of sign languages?

There is a helpful analogy in biology: we know that different parts of the genetic code can follow different evolutionary trajectories; we also know that phenotype-based phylogenetic trees sometimes conflict with those based on genotypes. But this understanding does not stop biologists from putting forward evolutionary hypotheses for extinct organisms, where only one set of data is available (phenotypes; Tree of Life). Furthermore, such conflicting results may lead to a more comprehensive understanding of how a species has evolved.

Figure 7: A likely case of convergence: the sign for “г” in Russian and Greek Sign Language, visually depicting the letter (see our previous post Untangling vertical and horizontal processes in the evolution of handshapes). Complementing studies of signed concepts may reveal less obvious cases of convergence (or borrowing).


Because we felt the need to further clarify the intentions of our study, and to answer some of the criticism raised about it on Twitter, we decided to prepare a short series of blog posts devoted to the general question of "How should one study language evolution?" (or, more generally: "How should one study evolution?"). We hope to take some of the heat out of the discussion that evolved on Twitter, by inviting those who raised critiques of our study to answer our posts in the form of comments here, or in their own blog posts.

The current blog post can thus be understood as an opening for more thoughts and, hopefully, more fruitful discussions around the question of how language evolution should be studied.

In that context, feel free to post any questions and critiques you may have about our study below, and we will aim to pick those up in future posts.

References

Blasi, Damián E. and Wichmann, Søren and Hammarström, Harald and Stadler, Peter and Christiansen, Morten H. (2016) Sound–meaning association biases evidenced across thousands of languages. Proceedings of the National Academy of Sciences of the United States of America 113.39: 10818-10823.

Dagan, Tal and Martin, William (2006) The tree of one percent. Genome Biology 7.118: 1-7.

Guerra Currie, Anne-Marie P. and Meier, Richard P. and Walters, Keith (2002) A cross-linguistic examination of the lexicons of four signed languages. In R. P. Meier, K. Cormier, & D. Quinto-Pozos (Eds.), Modality and Structure in Signed and Spoken Languages (pp.224-236). Cambridge University Press.

Hoijer, Harry (1956) Lexicostatistics: a critique. Language 32.1: 49-60.

Hymes, D. H. (1960) Lexicostatistics so far. Current Anthropology 1.1: 3-44.

Longobardi, Giuseppe and Ghirotto, Silvia and Guardiano, Cristina and Tassi, Francesca and Benazzo, Andrea and Ceolin, Andrea and Barbujani, Guido (2015) Across language families: Genome diversity mirrors linguistic variation within Europe. American Journal of Physical Anthropology 157.4: 630-640.

Parkhurst, Stephen and Parkhurst, Dianne (2003) Lexical comparisons of signed languages and the effects of iconicity. Working Papers of the Summer Institute of Linguistics, University of North Dakota Session, vol. 47.

Power, Justin M. and Grimm, Guido and List, Johann-Mattis (2020) Evolutionary dynamics in the dispersal of sign languages. Royal Society Open Science 7.1: 1-30. DOI: 10.1098/rsos.191100

Swadesh, Morris (1955) Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics 21.2: 121-137.

Szeto, Pui Yiu and Ansaldo, Umberto and Matthews, Steven (2018) Typological variation across Mandarin dialects: An areal perspective with a quantitative approach. Linguistic Typology 22.2: 233-275.

Woodward, James (1993) Lexical evidence for the existence of South Asian and East Asian sign language families. Journal of Asian Pacific Communication 4.2: 91-107.

From words to deeds?


If you want to annoy a linguist, there are three easy ways to do so: ask them how many languages they speak; ask them for their opinion on the German spelling reform; or ask them whether it is true that the Eskimo language has 50 words for snow. What these three questions have in common is that they all touch upon issues in linguistics that are so big that they give us a headache whenever we are reminded of them.

The first question, about a linguist's language talents, touches upon the conviction of quite a few linguists that one does not need to study many languages in order to practice linguistics. One language is usually enough; and even if that language is only English, this may still be sufficient (at least according to some fanatics who practice syntax). To put it differently: knowing only one language does not prevent a linguist from making claims about the evolution of whole language families. Knowing how to describe a language, or how to compare several languages, does not necessarily require anyone to be able to speak them. After all, mathematicians also pride themselves on not being able to calculate.

The second question, regarding the German spelling reform, recalls the last time German linguists failed royally at proving the importance of their work to the broader public. The problem was that the German spelling reform, the first after some 100 years of linguistic peace, was carried out mostly without any linguistic input. Those who commented on it were, instead, novelists, poets and journalists, usually somewhat older, who felt that the reform had been proposed mainly in order to annoy them personally. At the same time, and this was perhaps no coincidence, more and more institutes for comparative linguistics disappeared from German universities. The reason, again, was that the field had not succeeded in explaining its importance to the public. Yet historical language comparison can indeed be important when discussing the reform of a writing system used by millions of people, not least because the investigation of historically evolving linguistic systems is one of the specialties of historical-comparative linguistics. This was completely ignored back then.

The last question concerns the almost ancient debate about the hypothesis commonly attributed to Edward Sapir (1884-1939) and Benjamin Lee Whorf (1897-1941). In its strong form (Whorf 1950), this hypothesis says that speaking influences thinking to such an extent that we might, for example, have developed a different kind of relativity theory in physics if we had practiced our science in languages other than English, French, and German. Given that Eskimo languages are said to have some 50 different words for snow (as people keep repeating), it should be clear enough that those speaking an Eskimo language must think completely differently from those of us who are slowly forgetting what snow is in the first place.

The latter concept leads to an interesting use of networks, which I will discuss here.

Words versus deeds

The hypothesis of Sapir and Whorf annoys many linguists (including myself), because it has long since been disproved, at least in its strong, naive form. It was disproved by linguistic data, not by arguments; and the data were the very data that Whorf had used to prove his point in the first place. However, although there is little evidence for the hypothesis in its strong form, people keep repeating it, especially in non-linguistic circles, where it is often instrumentalized.

Whether we can find evidence for a weak form of the hypothesis — which would say that speaking has some influence on thinking — is another question, which is, however, difficult to answer. It may well be that our thoughts are channeled to some degree by the material we use to express them. When we distinguish color shades such as light blue and dark blue by distinct words, such as goluboj and sinij in Russian or celeste and azul in Spanish, it may be that we develop different thoughts when somebody talks about blue cheese, which is called dark blue cheese in Spanish (queso azul).

But this does not mean that somebody who speaks English would never know that there is some difference between light and dark blue, just because the language does not primarily make the distinction between the two color tones. It is possible that the stricter distinction in Russian and Spanish triggers an increased attention among speakers, but we do not know how large the underlying effect is in the end, and how many people would be affected by it.

Particular languages are thus neither a template nor a mirror of human thinking — they do not necessarily channel our thoughts, and may only provide small hints as to how we perceive things around us. For example, if a language expresses different concepts, such as "arm" and "hand", with the same word, this may be a hint that "arm" and "hand" are not that different from each other, or that they belong together functionally in some sense, which is why we may perceive them as a unit. This is the case in Russian, where we find only one expression, ruka, for both concepts. In daily conversations, this works pretty well, and there are rarely situations in which Russian speakers fail to understand each other due to ambiguity, since most of the time the context in which people speak disambiguates what they want to express well enough.
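Computationally, spotting such cases in a word list is straightforward: invert the list so that each word form maps to the set of concepts it expresses; any form linked to more than one concept is a candidate colexification. A minimal sketch in Python (the toy word list is invented for illustration, although ruka and noga are genuine Russian examples):

```python
from collections import defaultdict

def find_colexifications(wordlist):
    """Map each word form to the concepts it expresses; forms
    covering more than one concept are candidate colexifications."""
    form_to_concepts = defaultdict(set)
    for concept, form in wordlist:
        form_to_concepts[form].add(concept)
    return {form: concepts
            for form, concepts in form_to_concepts.items()
            if len(concepts) > 1}

# toy Russian word list as (concept, form) pairs
russian = [
    ("ARM", "ruka"), ("HAND", "ruka"),
    ("FOOT", "noga"), ("LEG", "noga"),
    ("HEAD", "golova"),
]
print({form: sorted(c) for form, c in find_colexifications(russian).items()})
# {'ruka': ['ARM', 'HAND'], 'noga': ['FOOT', 'LEG']}
```

Whether a hit reflects genuine polysemy or mere homophony cannot, of course, be decided by the inversion itself; that distinction requires the cross-linguistic aggregation discussed below.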

Colexification network with the central concept "MIND" and the geographical distribution of languages colexifying "MIND" and "BRAIN"

These colexifications, as we now call the phenomenon (François 2008), occur frequently in the languages of the world. This is due to the polysemy of many of the words we use: words rarely denote a single concept alone, but often denote several similar concepts at the same time. On the other hand, we also encounter identical word forms in the same language that express completely different things, resulting from coincidental processes by which originally different pronunciations came to sound alike (called convergence, in biology). Those colexifications that are not coincidental but result from polysemy are the most interesting ones for linguists, not least because the words are related by network graphs, not trees (as shown above). When assembled in large enough numbers, across a sufficiently large sample of languages, they may allow us some interesting insights into human cognition.

The procedure for mining these insights from cross-linguistic data has already been discussed in a previous blog post, from 2018. The main idea is to collect colexifications for as many concepts and languages as possible, in order to construct a colexification network, in which each concept is represented by a node, and weighted links between the nodes represent how often each colexification between the linked concepts occurs; that is, they represent how often we find a language that expresses the two linked concepts with the same word.
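The construction just described can be sketched in a few lines of Python. The mini-dataset below is invented for illustration (the real CLICS database aggregates hundreds of curated word lists); each concept pair expressed by one word form in a language contributes one count to the corresponding edge weight:

```python
from collections import Counter
from itertools import combinations

def colexification_network(languages):
    """Count, across languages, how often each pair of concepts is
    expressed by the same word form.

    `languages` maps a language name to a list of (concept, form)
    pairs. Returns a Counter keyed by sorted concept pairs: the
    weighted edges of the colexification network."""
    edges = Counter()
    for wordlist in languages.values():
        # invert the word list: form -> set of concepts it expresses
        form_to_concepts = {}
        for concept, form in wordlist:
            form_to_concepts.setdefault(form, set()).add(concept)
        # every form expressing several concepts adds one count per pair
        for concepts in form_to_concepts.values():
            for pair in combinations(sorted(concepts), 2):
                edges[pair] += 1
    return edges

# invented mini-data for three languages
data = {
    "Russian": [("ARM", "ruka"), ("HAND", "ruka"),
                ("LEG", "noga"), ("FOOT", "noga")],
    "English": [("ARM", "arm"), ("HAND", "hand"),
                ("LEG", "leg"), ("FOOT", "foot")],
    "Bulgarian": [("ARM", "ruka"), ("HAND", "ruka")],
}
print(colexification_network(data))
# Counter({('ARM', 'HAND'): 2, ('FOOT', 'LEG'): 1})
```

Counting each form only once per language, as done here, is one pragmatic choice; normalizing by the number of word lists per language family would be a natural refinement before comparing across families.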

Having proposed a first update of our Database of Cross-Linguistic Colexifications (CLICS) back in 2018, we have now been able to further increase the data. With this third installment of the database, we could double the number of language varieties, from 1,200 to 2,400. In addition, we could enhance the workflows that we use to aggregate data from different sources, in a rigorously reproducible way (Rzymski et al. 2020).

Current work

Even more interesting than the data themselves, however, is a study initiated by colleagues in psychology at the University of North Carolina, which was recently published after more than two years of intensive collaboration (Jackson et al. 2019). In this study, the colexifications for emotion concepts, such as "love", "pity", "surprise", and "fear", were assembled, and the resulting networks were statistically compared across different language families. The surprising result was that the structures of the networks differed quite considerably from each other (an effect that we could not find for color concepts derived from the same data). Some language families, for example, tend to colexify "surprise" and "fear (fright)" (see our subgraph for "surprised"), while others colexify "love" and "pity" (see the subgraph for "pity").

Not all aspects of the network structures differed, however. An additional analysis involving informants showed that especially the criterion of valence (that is, whether something is perceived as negative or positive) played an important role in shaping the structure of the networks; and similar effects could be found for the degree of arousal.
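One simple way to quantify the role of valence in such a network is to check how often colexification links stay within one valence class, that is, connect two positive or two negative concepts. The sketch below uses only the standard library; the toy edge list and valence labels are invented for illustration and are not the data of Jackson et al., who used more elaborate statistics:

```python
def same_valence_fraction(edges, valence):
    """Fraction of colexification links that connect two emotion
    concepts of the same valence (both positive or both negative)."""
    same = sum(1 for a, b in edges if valence[a] == valence[b])
    return same / len(edges)

# invented valence labels and colexification links for illustration
valence = {"LOVE": "+", "HOPE": "+", "SURPRISE": "+",
           "PITY": "-", "FEAR": "-", "GRIEF": "-"}
edges = [("LOVE", "HOPE"), ("FEAR", "GRIEF"), ("PITY", "GRIEF"),
         ("SURPRISE", "FEAR"), ("LOVE", "PITY")]
print(same_valence_fraction(edges, valence))  # 0.6
```

A value well above what one would expect from shuffling the valence labels would indicate that valence structures the network, in line with the finding reported above.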

These results show that the way in which we express emotion concepts in our languages is, on the one hand, strongly influenced by cultural factors, while on the other hand there are some cognitive aspects that seem to be reflected similarly across all languages.

What we cannot conclude from the results, however, is that those who speak languages in which "pity" and "love" are represented by the same word will not know the difference between the two emotions. Here again, it is important to emphasize what I mentioned above with respect to color terms: if a particular distinction is not present in a given language, this does not mean that the speakers do not know the difference.

It may be tempting to dig out the old hypothesis of Sapir and Whorf in the context of the study on emotions; but the results do not, by any means, provide evidence that our thinking is directly shaped and restricted by the languages we speak. Many factors influence how we think. Language is one aspect among many others. Instead of focusing too much on the question as to which languages we speak, we may want to focus on how we speak the language in which we want to express our thoughts.

References

François, Alexandre (2008) Semantic maps and the typology of colexification: intertwining polysemous networks across languages. In: Vanhove, Martine (ed.): From polysemy to semantic change. Amsterdam: Benjamins, pp. 163-215.

Jackson, Joshua Conrad and Watts, Joseph and Henry, Teague R. and List, Johann-Mattis and Mucha, Peter J. and Forkel, Robert and Greenhill, Simon J. and Lindquist, Kristen (2019) Emotion semantics show both cultural variation and universal structure. Science 366.6472: 1517-1522. DOI: 10.1126/science.aaw8160

Rzymski, Christoph and Tresoldi, Tiago and Greenhill, Simon and Wu, Mei-Shin and Schweikhard, Nathanael E. and Koptjevskaja-Tamm, Maria and Gast, Volker and Bodt, Timotheus A. and Hantgan, Abbie and Kaiping, Gereon A. and Chang, Sophie and Lai, Yunfan and Morozova, Natalia and Arjava, Heini and Hübler, Nataliia and Koile, Ezequiel and Pepper, Steve and Proos, Mariann and Van Epps, Briana and Blanco, Ingrid and Hundt, Carolin and Monakhov, Sergei and Pianykh, Kristina and Ramesh, Sallona and Gray, Russell D. and Forkel, Robert and List, Johann-Mattis (2020) The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies. Scientific Data 7.13: 1-12. DOI: 10.1038/s41597-019-0341-x

Whorf, Benjamin Lee (1950) An American Indian Model of the Universe. International Journal of American Linguistics 16.2: 67-72.