How should one study language evolution?

This is a joint post by Justin Power, Guido Grimm, and Johann-Mattis List.

As in biology, we have two basic possibilities for studying how languages evolve:
  • We set up a list of universal comparanda. These should occur in all languages and show a high enough degree of variation that we can use them as indicators of how languages have evolved;
  • We create individual lists of comparanda. These are specific for certain language groups that we want to study.
Universal comparanda

While most studies would probably aim to employ a set of universal comparanda, the practice often requires a compromise solution in which some non-universal characteristics are added. This holds, for example, for the idea of a core genome in biology, which ends up being so small in overlap across all living species that it makes little sense to compute phylogenies based on it, except for closely related species (Dagan and Martin 2006). Another example is the all-inclusive matrices that are used to establish evolutionary relationships of extinct animals characterized by high levels of missing data (eg. Tschopp et al. 2015; Hartman et al. 2019). The same holds for historical linguistics, with the idea of a basic lexicon or basic vocabulary, represented by a list of basic concepts that are supposed to be expressed by simple words in every human language (Swadesh 1955), given that the number of concepts represented by simple words shared across all human languages is extremely small (Hoijer 1956).

Figure 1: All humans have hands and arms but some words for ‘hands’ and ‘arms’ address different things (see our previous post "How languages lose body parts").


Apart from the problem that basic vocabulary concepts occurring in all languages may be extremely limited, test items need to fulfill additional characteristics, which may not be easy to find, in order to be useful for phylogenetic studies. They should, for example, be rather resistant to processes of lateral transfer (called borrowing in linguistics). They should preferably be subject to neutral evolution, since selective pressure may lead to parallel but phylogenetically independent processes (in biology known as convergent evolution) that are difficult to distinguish and can increase the amount of noise in the data (homoplasy).

Selective pressure, as we might find, for example, in a specific association between certain concepts and certain sounds across a large phylogenetically independent sample of human languages, is rarely considered to be a big problem in historical linguistics studies dealing with the evolution of spoken languages (see Blasi et al. 2016 for an exception). In sign language evolution, however, the problem may be more acute because of a similar iconic motivation of many lexical signs in phylogenetically independent sign languages (Guerra Currie et al. 2002), as well as the representation of concepts such as body parts and pronouns using indexical signs with similar forms. This latter characteristic of all known sign languages has led to the design of a basic vocabulary list that differs from those traditionally used in the historical linguistics of spoken languages (Woodward 1993); and we know of only one proposal attempting to address the problem of iconicity in sign languages for phylogenetic research (Parkhurst and Parkhurst 2003).

Figure 2: Basic processes in the evolution of languages, spoken or signed (see our previous post How languages lose body parts).

All in all, it seems that there may be no complete solution for a list of lexical comparanda for all human languages, including sign languages, given the complexities of lexical semantics, the high variability in expression among the languages of the world (see Hymes 1960 for a detailed discussion on this problem), and the problems related to selective pressures highlighted above. Scholars have proposed alternative features for comparing languages, such as grammatical properties (Longobardi et al. 2015) or other "structural" features (Szeto et al. 2018), but these are either even more problematic for historical language comparison—given that it is never clear if these alternative features have evolved independently or due to common inheritance—or they are again based on a targeted selection for a certain group of languages in a certain region.

Targeted comparanda

If there is no universal list of features that can be used to study how languages have evolved, we have to resort to the second possibility mentioned above, by creating targeted lists of comparanda for the specific language groups whose evolution we want to study. When doing so, it is best to aim at a high degree of universality in the list of comparanda, even if one knows that complete universality cannot be achieved. This practice helps to compare a given study with alternative studies; it may also help colleagues to recycle the data, at least in part, or to merge datasets for combined analyses, if similar comparanda have been published for other languages.

But there are cases where this is not possible, especially when conducting studies where no previous data have been published, and rigorous methods for historical language comparison have yet to be established. Sign languages can, again, be seen as a good example for this case. So far, few phylogenetic studies have addressed sign language evolution, and none have supplied the data used in putting forward an evolutionary hypothesis. Furthermore, because the field lacks unified techniques for the transcription of signs, it is extremely difficult to collect lexical data for a large number of sign languages from comparable glossaries, wordlists, and dictionaries, the three primary sources, apart from fieldwork, that spoken language linguists would use in order to start a new data collection. We are aware of one comparative database with basic vocabulary for sign languages that is currently being built (Yu et al. 2018), and that may represent lexical items in a way that can be compared efficiently, but these data have not yet been made available to other researchers.

Sign languages

When Justin Power approached Mattis about three years ago, asking if he wanted to collaborate on a study relating to sign language evolution, we quickly realized that it would be infeasible to gather enough lexical data for a first study. Tiago Tresoldi, a post-doc in our group, suggested the idea of starting with sign language manual alphabets instead. From the start, it was clear that these manual alphabets might have certain disadvantages — because they are used to represent written letters of a different language, they may constitute a set of features evolving independently from the sign language itself.

Figure 3: Processes shaping manual alphabets. The evolution of signed concepts may be affected by the same, leading to congruent patterns, or different processes, leading to incongruent differentiation patterns (see our previous post: Stacking networks based on sign language manual alphabets).

But on the other hand, the data had many advantages. First, a sufficient number of examples for various European sign languages were available in online databases that could be transcribed in a uniform way. Second, the comparison itself was facilitated, since in most cases there was no ambiguity about which “concepts” to compare, in contrast to what one would encounter in a comparison of lexical entries. For example, an “a” is an “a” in all languages. Third, it turned out that for quite a few languages, historical manual alphabets could be added to the sample. This point was very important for our study. Given that scholars still have limited knowledge regarding the details of sign change in sign language evolution, it is of great importance to compare sources of the same variety, or those assumed to be the same, across time—just as spoken language linguists compared Latin with Spanish and Italian in order to study how sounds change over time. And finally, manual alphabets in fact constitute an integrated part of many sign languages that may, for example, contribute to the forms of lexical signs, making the idea more plausible that an understanding of the evolution of manual alphabets could be informative about the evolution of sign languages as a whole.

Figure 4: Early evolution of handshapes used to sign ‘g’ (see our previous post: Character cliques and networks – mapping haplotypes of manual alphabets).

Guido later joined our team, providing the expertise to analyze the data with network methods that do not assume tree-like evolution a priori. We therefore thought that we had done a rather good job when our pilot study on the evolution of sign language manual alphabets, titled Evolutionary Dynamics in the Dispersal of Sign Languages, finally appeared last month (Power et al. 2020). We identified six basic lineages from which the manual alphabets of the 40 contemporary sign languages developed. The term "lineage" was deliberately chosen in this context, since it was unclear whether the evolution of the manual alphabets should be seen as representative of the evolution of the sign languages as a whole. We also avoided the term "family", because we were wary of making potentially unwarranted assumptions about sign language evolution based on theories in historical linguistics.

Figure 5: The all-inclusive Neighbor-net (taken from Power et al. 2020).

While the study was positively received by the popular media, and even made it onto the title page of the Süddeutsche Zeitung (one of the largest daily newspapers in Germany), there were also misrepresentations of our results in some media channels. The Daily Mail (in the UK), in particular, invented the claim that all human sign languages have evolved from five European lineages. Of course, our study never said this, nor could it have, since only European sign languages were included in our sample. (We included three manual alphabets representing Arabic-based scripts from Afghan, Jordanian, and Pakistan Sign Languages, where there was some indication that these may have been informed by European sources.)

Study of phylogenetics

While we share our colleagues’ distaste for the Daily Mail’s likely purposeful misrepresentation (in the end, unfortunately, it may have achieved its purpose as click bait), some colleagues went a bit further. One critique that came up in reaction to the Daily Mail piece was that our title opens the door to misinterpretation, because we had only investigated manual alphabets and, hence, cannot say anything about the "evolutionary dynamics of sign languages".

While the title does not mention manual alphabets, it should be clear that any study on evolution is based on a certain amount of reduction. Where and how this reduction takes place is usually explained in the studies. Many debates in historical linguistics of spoken languages have centered around the question of what data are representative enough to study what scholars perceive as the "overall evolution" of languages; and scholars are far from having reached a communis opinio in this regard. At this point, we simply cannot answer the question of whether manual alphabets provide clues about sign language evolution that contrast with the languages’ "general" evolution, as expressed, for example, in selecting and comparing 100 or 200 words of basic vocabulary. We suspect that this may, indeed, be the case for some sign languages, but we simply lack the comparative data to make any claims in this respect.

Figure 6: Evolution doesn’t mean every feature has to follow the same path: a synopsis of molecular phylogenies inferred for oaks, Quercus, and their relatives, Fagaceae (upcoming post on Res.I.P.). While nuclear differentiation matches phenotypic evolution and the fossil record (likely monophyla in bold font), the evolution of the plastome is partly decoupled (gray shaded: paraphyletic clades). Likewise, we can expect that different parts of languages, such as manual alphabets vs. core “lingome” of sign languages, may indicate different relationships.

The philosophical question, however, goes much deeper, to the "nature" of language: What constitutes a language? What do all languages have in common? How do languages change? What are the best ways to study how languages evolve?

One approach to answering these questions is to compare collectible features of languages ("traits" in biology), and to study how they evolve. As the field develops, we may find that the evolution of a manual alphabet does not completely coincide with the evolution of the lexicon or grammar of a sign language. But would it follow from such a result that we have learned nothing about the evolution of sign languages?

There is a helpful analogy in biology: we know that different parts of the genetic code can follow different evolutionary trajectories; we also know that phenotype-based phylogenetic trees sometimes conflict with those based on genotypes. But this understanding does not stop biologists from putting forward evolutionary hypotheses for extinct organisms, where only one set of data is available (phenotypes; Tree of Life). Furthermore, such conflicting results may lead to a more comprehensive understanding of how a species has evolved.

Figure 7: A likely case of convergence: the sign for “г” in Russian and Greek Sign Language, visually depicting the letter (see our previous post Untangling vertical and horizontal processes in the evolution of handshapes). Complementing studies of signed concepts may reveal less obvious cases of convergence (or borrowing).


Because we felt the need to further clarify the intentions of our study, and to answer some of the criticism raised about the study on Twitter, we decided to prepare a short series of blog posts devoted to the general question of "How should one study language evolution" (or more generally: "How should one study evolution?"). We hope to take some of the heat out of the discussion that evolved on Twitter, by inviting those who raised critiques about our study to answer our posts in the form of comments here, or in their own blog posts.

The current blog post can thus be understood as an opening for more thoughts and, hopefully, more fruitful discussions around the question of how language evolution should be studied.

In that context, feel free to post any questions and critiques you may have about our study below, and we will aim to pick those up in future posts.

References

Blasi, Damián E. and Wichmann, Søren and Hammarström, Harald and Stadler, Peter and Christiansen, Morten H. (2016) Sound–meaning association biases evidenced across thousands of languages. Proceedings of the National Academy of Sciences of the United States of America 113.39: 10818-10823.

Dagan, Tal and Martin, William (2006) The tree of one percent. Genome Biology 7.118: 1-7.

Guerra Currie, Anne-Marie P. and Meier, Richard P. and Walters, Keith (2002) A cross-linguistic examination of the lexicons of four signed languages. In R. P. Meier, K. Cormier, & D. Quinto-Pozos (Eds.), Modality and Structure in Signed and Spoken Languages (pp.224-236). Cambridge University Press.

Hoijer, Harry (1956) Lexicostatistics: a critique. Language 32.1: 49-60.

Hymes, D. H. (1960) Lexicostatistics so far. Current Anthropology 1.1: 3-44.

Longobardi, Giuseppe and Ghirotto, Silva and Guardiano, Cristina and Tassi, Francesca and Benazzo, Andrea and Ceolin, Andrea and Barbujani, Guido (2015) Across language families: Genome diversity mirrors linguistic variation within Europe. American Journal of Physical Anthropology 157.4: 630-640.

Parkhurst, Stephen and Parkhurst, Dianne (2003) Lexical comparisons of signed languages and the effects of iconicity. Working Papers of the Summer Institute of Linguistics, University of North Dakota Session, vol. 47.

Power, Justin M. and Grimm, Guido and List, Johann-Mattis (2020) Evolutionary dynamics in the dispersal of sign languages. Royal Society Open Science 7.1: 1-30. DOI: 10.1098/rsos.191100

Swadesh, Morris (1955) Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics 21.2: 121-137.

Szeto, Pui Yiu and Ansaldo, Umberto and Matthews, Steven (2018) Typological variation across Mandarin dialects: An areal perspective with a quantitative approach. Linguistic Typology 22.2: 233-275.

Woodward, James (1993) Lexical evidence for the existence of South Asian and East Asian sign language families. Journal of Asian Pacific Communication 4.2: 91-107.

Untangling vertical and horizontal processes in the evolution of handshapes

Justin Power


[This is a guest-post by Justin Power, and the 3rd part of our miniseries on sign language manual alphabets]

In Guido’s most recent post in this miniseries on manual alphabet evolution in sign languages, he discussed the role of character mapping on networks in phylogenetic inference. He pointed out how we used this approach to infer evolutionary pathways of languages, and why this step in exploratory data analysis is important, given the complexity of the underlying signal in this data set.

In this new post, I take up the topic of hand-shape evolution in more detail, explaining some of the complexities involved in studying sign language evolution. I will specifically look at how we can identify both vertical and horizontal processes in the evolution of hand-shapes.

Introduction

We know very little about how signs and hand-shapes actually evolve. There have been a few studies — most of them from decades ago — comparing American Sign Language in videos and dictionaries from the early 20th century with then contemporary forms (Frishberg 1975; Battison et al. 1975). One study in particular argued that, as a sign language emerges in a community of signers, crystallizing into a stable linguistic system, the signs evolve in a quasi-teleological way from earlier, more gesture- or pantomime-like forms to more language-like forms, cutting similar evolutionary pathways leading to more constraints on articulation and to general systematization.

But what happens (in this story) once sign languages become linguistic systems? Do they continue evolving, as happens in spoken languages? If yes, how? Investigating these kinds of questions was one of my motivations for tracking down historical examples of manual alphabets for over a dozen sign languages. The pay-off (besides the thrill of the treasure hunt) is that, by tracing hand-shapes through historical examples and comparing them with contemporary sign languages, we can infer the vertical and horizontal evolutionary processes affecting sign languages and hand-shape forms.

Vertical and horizontal aspects of hand-shape evolution

Consider part of the Neighbor-net from our paper (see Part 1) including the Austrian-origin and Russian groups in the figure below. Russian 1835 is the earliest manual alphabet in our sample published in Russia (St. Petersburg); and Danish 1808, in the Danish subgroup, was published in Copenhagen.


While the two manual alphabets are found in different neighborhoods in the graph, they share a number of hand-shapes, some of which were (and still are) shared widely throughout Europe, for reasons that we discuss in the main paper.

One such hand-shape represents the Latin / Cyrillic letter "A" in both Danish 1808 and Russian 1835, as illustrated in the timeline here.


Note the position of the thumbs at the bottom of the figure: in both early examples, the thumb is adjacent to the bent index finger. In an example from Danish SL in 1907 (and subsequently in 1926 and 1967), the position of the thumb has shifted across the index finger. For Russian SL, too, the position of the thumb in the contemporary hand-shape representing the Cyrillic letter A has crept across the index finger to the front of the fist (the hand-shape in the figure is my attempt to reproduce the source; see here for the real thing).

There are two points to note here in connection with evolutionary processes. First, these changes in thumb position appear to have a vertical aspect: as signers in a community used these hand-shapes and transmitted them to later generations, they also modified the forms in subtle ways, perhaps unconsciously in a process with analogies to sound change in spoken language.

Second, the changes also include a horizontal aspect: the forms evolved in similar ways, as the two signing communities converged on the same shape (apparently) independently, possibly due to similar articulatory or perceptual pressures. The horizontal aspect of this process contributes to signal incompatibility in the dataset underlying the network: the more convergence there is, the less tree-like (in this case, the more spiderweb-like) the Neighbor-net will be.
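To make the link between convergence and signal incompatibility a bit more concrete, here is a minimal sketch of the standard pairwise compatibility check for binary characters: two characters can sit on the same tree only if at least one of the four possible state combinations (00, 01, 10, 11) is missing across the taxa. The taxa and character states below are invented for illustration and are not taken from our matrix.

```python
def compatible(char_a, char_b):
    """Return True if two binary (0/1) characters can be explained by one tree.

    The characters are compatible when fewer than four of the state
    combinations (0,0), (0,1), (1,0), (1,1) are observed across the taxa.
    """
    observed = set(zip(char_a, char_b))
    return len(observed) < 4


# Hypothetical states for four taxa T1..T4 (invented, not real data).
inherited = [1, 1, 0, 0]   # groups T1 with T2
nested = [1, 0, 0, 0]      # unique to T1, fits inside the T1+T2 group
convergent = [1, 0, 1, 0]  # groups T1 with T3, cutting across T1+T2

print(compatible(inherited, nested))
print(compatible(inherited, convergent))
```

Each incompatible pair of this kind is what a splits graph has to display as a box rather than a single branch, which is why more convergence makes the graph look less like a tree.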

Convergence

In addition to the preceding example, a typical case of convergence can be seen in the independent creation of similar hand-shapes to represent the Greek and Cyrillic letter "Г".


Beginning again with the main Neighbor-net in the figure immediately above, we see that Russian 1835 and contemporary Greek SL are found in different neighborhoods, with Greek in the French-origin group. The two languages, however, share the Г-representing hand-shape (the Russian form is from Fleri 1835, while the Greek form is, again, my own hand; see here for the real one). Because Greek SL is the only language in the French-origin group to share this hand-shape with the Russian group, there is a clear suggestion of a horizontal process that resulted in similar hand-shapes across unrelated languages. The most likely processes here are convergence due to the independent creation of iconic representations of the written letter; or lateral transfer — called borrowing in linguistics — via some historical instance of contact between signers of the two languages. [My intuition is for the former explanation.]

Borrowing

The final example deals with a clear case of borrowing. The figure below shows the time- / taxon-filtered Neighbor-net, including historical manual alphabets up to about 1840 (see Part 2), but only annotated with the relevant languages.


The two earliest manual alphabets in our dataset were published in Madrid in 1593 (de Yebra) and 1620 (Bonet). In neither case do we see any trace of a hand-shape representing the letter "W", which was not needed to represent these Latin alphabets. Later, too, manual alphabets published in Spain in 1815, 1845, and 1859 still did not include the letter "W". In contrast, in Austrian 1786 and French 1800 (as well as other languages), hand-shape forms representing the letter W are found in the earliest examples we have for those languages. Some 160–230 years later, however, we find similar forms for "W" in contemporary Austrian, French and Spanish SLs. We deduce that contemporary Spanish SL did not inherit the "W" hand-shape from the 19th century Spanish manual alphabets. Instead, the hand-shape may have been borrowed from some other language, possibly French SL given its influence on deaf education in Europe, or possibly later from the International Sign manual alphabet (also part of the French-origin Group).

Conclusion

As these examples show, there are different types of horizontal processes contributing to conflicting signal in the data set. Using the splits network graphs together with historical examples of manual alphabets, we can untangle the horizontal signal in many cases. The approach has also given us some insight into the evolutionary processes contributing to the diversity of contemporary sign languages, a topic that we plan to investigate more fully.

Cited literature, further reading and data
  • Battison, Robin, Harry Markowicz, & James Woodward (1975) A good rule of thumb: Variable phonology in American Sign Language. In Ralph W. Fasold & Roger W. Shuy (eds.), Analyzing Variation in Language: Papers from the Second Colloquium on New Ways of Analyzing Variation, Part 3, pp. 291–302. Washington D.C.: Georgetown University Press.
  • Bonet, Juan Pablo (1620) Reduction de las letras y arte para enseñar a ablar los mudos. Madrid: Francisco Abarca de Angulo.
  • Fleri, Viktor I. (1835) Глухонемые, рассматриваемые в отношении к их состоянию и к способам образования, самым свойственным их при. St. Petersburg: Типография А. Плюшара.
  • Frishberg, Nancy (1975) Arbitrariness and iconicity: Historical change in American Sign Language. Language 51(3): 696–719.
  • Yebra, Melchor de (1593) Libro llamado Refugium Infirmorum: Muy util y prouechoso para todo genero de gente : En el qual se contienen muchos auisos espirituales para socorro de los afligidos enfermos, y para ayudar à bien morir a los que estan en lo ultimo de su vida ; con un Alfabeto de S. Buenauentura para hablar por la mano. Madrid: Luys Sa[n]chez.
A comprehensive reference list can be found in our pre-print at Humanities Commons. The raw data and analysis files are available via GitHub.


Character cliques and networks – mapping haplotypes of manual alphabets


[This post is the second part of our miniseries on the origin and evolution of sign language manual alphabets]

One aspect of exploratory data analysis (EDA) is for us to try to understand how our data relate to our inference(s). This is especially important when the signal from our data is increasingly complex. Sign language manual alphabets are such a case.

In our first post about sign language manual alphabets, I introduced the principal networks that we used to classify sign languages. Here, I'll describe our character mapping procedure and why we did it as part of our EDA framework, in order to establish scenarios for the origin and evolution of sign languages.

Characters and mapping

We encoded each hand-shape used to signify a certain concept, such as the letters included in the standard Latin alphabet "a", "b", "c", ..., "x", "y", "z", as a binary sequence – the presence or absence of a certain COGID (we will explain and discuss this in a later post). These binary sequences can be seen as an analogy of the genetic code, as a sort of 'linguistic haplotype', and their evolution can be mapped onto a network based on the entire dataset.
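As a rough illustration of this kind of encoding, the following sketch turns a table of cognate-set IDs into a presence/absence matrix with one column per (concept, COGID) pair. It is only a toy under stated assumptions: the alphabet names, the COGID numbers and the `binarize` helper are all hypothetical, and missing entries are simply scored as absent, which a real analysis would probably want to treat as missing data instead.

```python
from typing import Dict, List, Tuple

# Hypothetical COGIDs per alphabet and concept (invented numbers and names).
cogids: Dict[str, Dict[str, int]] = {
    "Alphabet X": {"a": 1, "g": 10},
    "Alphabet Y": {"a": 1, "g": 11},
    "Alphabet Z": {"a": 2, "g": 11},
}


def binarize(
    table: Dict[str, Dict[str, int]]
) -> Tuple[List[Tuple[str, int]], Dict[str, List[int]]]:
    """Build a presence/absence matrix with one column per (concept, COGID)."""
    columns = sorted(
        {(concept, cid) for row in table.values() for concept, cid in row.items()}
    )
    rows = {
        taxon: [1 if row.get(concept) == cid else 0 for concept, cid in columns]
        for taxon, row in table.items()
    }
    return columns, rows


columns, rows = binarize(cogids)
print(columns)
for taxon, bits in rows.items():
    print(taxon, bits)
```

Each row is then the binary sequence for one alphabet, and the slice of a row belonging to one concept is what we call the haplotype for that concept.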

For instance, our matrix has three binaries (haplotypes) for the concept [g] in the oldest set of sign languages (pre-1840), two of which can be found in the earliest alphabets in our dataset: those of Yebra 1593 and Bonet 1620. Russian 1835, the oldest Cyrillic alphabet, uses a somewhat different hand-shape for its counterpart of the Latin "g", the Cyrillic "г".

For the concept [g], we thus have three taxon cliques, each defined by a distinct binary/haplotype: the 'Yebra haplotype', the 'Bonet haplotype', and the 'Cyrillic haplotype'.
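In code, a taxon clique is nothing more than the set of alphabets that carry the same haplotype for a concept. The sketch below groups the three alphabets named above by their [g] haplotype; the other alphabets of the pre-1840 sample are left out, and the `taxon_cliques` helper is a hypothetical name used only for this illustration.

```python
from collections import defaultdict

# Haplotype labels for the concept [g], restricted to the three alphabets
# named in the text (the remaining alphabets of the sample are omitted).
g_haplotypes = {
    "Yebra 1593": "Yebra",
    "Bonet 1620": "Bonet",
    "Russian 1835": "Cyrillic",
}


def taxon_cliques(haplotype_of):
    """Group taxa into cliques, one clique per distinct haplotype."""
    cliques = defaultdict(set)
    for taxon, haplotype in haplotype_of.items():
        cliques[haplotype].add(taxon)
    return dict(cliques)


cliques = taxon_cliques(g_haplotypes)
for haplotype, members in sorted(cliques.items()):
    print(haplotype, sorted(members))
```

With the full sample, each clique would contain every alphabet sharing that hand-shape, and it is these sets that get mapped onto the network.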

By mapping these haplotypes on the network, as shown in the next figure, we can see that there is a small edge bundle reflecting the basic split between the Yebra and Bonet haplotypes.

Hand-shape drawings are taken from the original manuscripts.

We can also see that the Russian haplotype either evolved from the Yebra haplotype kept in the older Austrian-origin Group, ie. is an adaptation of the Yebra haplotype, or is a genuinely new invention (note the similarity of the Russian hand-shape with the letter г).

We repeated this procedure for all 26 concepts of the standard Latin alphabet, to get an idea of how often the encoded linguistic haplotypes fit with the overall pattern visualized in the inferred Neighbor-nets (ie. the neighborhoods as defined by edge bundles). This is shown in the next figure.

The arrows indicate inferred evolutionary processes (replacement or invention).

Using this network mapping (which, in principle, uses the logic of parsimony/median networks), we can make direct inferences about the general mode of evolution.

For instance, even though Russian 1835 uses a different set of hand-shapes (ie. is defined by partly unique haplotypes), the hand-shapes for the concepts [p] and [z] are exclusively shared with the Austrian-origin Group. The biological equivalent would be: the 'Austrian haplotypes' are a uniquely shared derived feature reflecting a putative common origin of the Austrian and Russian lineages — ie a potential linguistic synapomorphy. We also can see that all haplotypes shared by Russian and all ([a][c][f][r][u][y]) or part ([b][e][i][k][n][o][x]) of the French-origin Group, an alternative source that may have inspired this early Cyrillic alphabet, lack this quality.
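A simple way to screen for such exclusively shared hand-shapes is sketched below. The taxa, group membership and COGIDs are invented placeholders, and `exclusively_shared` is a hypothetical helper; note also that exclusive sharing on its own only flags candidates, since deciding whether a shared hand-shape is derived still requires the mapping on the network.

```python
def exclusively_shared(cogids, focal, group):
    """List concepts whose focal hand-shape occurs in `group` and nowhere else."""
    group = set(group)
    hits = []
    for concept, cid in cogids[focal].items():
        carriers = {t for t, row in cogids.items() if row.get(concept) == cid}
        others = carriers - {focal}
        # Shared with at least one group member, and with no taxon outside it.
        if others and others <= group:
            hits.append(concept)
    return sorted(hits)


# Hypothetical COGIDs (invented for illustration, not from our matrix).
toy = {
    "focal": {"p": 1, "a": 5, "z": 3},
    "group1_a": {"p": 1, "a": 5, "z": 3},
    "group1_b": {"p": 1, "a": 6, "z": 4},
    "group2_a": {"p": 2, "a": 5, "z": 9},
}

shared = exclusively_shared(toy, "focal", ["group1_a", "group1_b"])
print(shared)
```

In this toy, [a] drops out because the same hand-shape also occurs in the second group, which mirrors the situation of the haplotypes that Russian 1835 shares with the French-origin Group.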

We can also make inferences about:
  1. which hand-shape is the original one (O);
  2. lineage-specific / diagnostic hand-shapes, eg. At. = Austrian, Da. = Danish (using two letter abbreviations);
  3. which hand-shapes are shared but apparently derived, eg. At.-Fr. are hand-shapes / haplotypes shared by members of the Austrian- and French-origin groups not found in the Yebra or Bonet alphabets — C stands for cosmopolitan, non-original handshapes common in various lineages, including British-origin Group, and D represents derived but rare hand-shapes without any clear lineage-affiliation; and
  4. alphabet-unique hand-shapes (ie. those that represent a linguistic autapomorphy).
In addition, we can explore certain details, including patterns (character-based taxon cliques) that are at odds with the overall reconstruction. The latter are to be expected, because the graph is planar (2-dimensional) but the processes that shaped sign alphabets are likely to be multi-dimensional. For instance, our networks failed to resolve the affinity of the contemporary Norwegian Sign Language, the reason for which can be seen in the following character map.


Note the position of Norwegian 1955, which is still part of the Austrian-origin Group (like older manual alphabets used in the late 19th century in Norway). However, it is already influenced by international standardization: eg. concepts [k], [p], and [z] use(d) French hand-shapes. Hence, Norwegian 1955 shares quite a high number of lineage-diagnostic hand-shapes with Danish 1967 and the Icelandic Sign Language. These, and others, were further replaced in its contemporary counterpart (Norwegian SL) by hand-shapes borrowed from various lineages (eg. [c], [f] from the nearly extinct Austrian-origin Group, [p] from the Russian Group, [k] same as in the Spanish Group), as well as by unique hand-shapes, including hand-shapes evolved from earlier forms or those that have been genuinely invented.

Why we map character evolution along networks

In many cases, we have only one set of data from which to draw our conclusions, based on the graph(s) we infer. We cannot test to which degree our data (the way we scored the differentiation patterns) and inferences are systematically biased. Thus, we want to explore which aspects of our inference are supported by character splits, and establish taxon cliques and evolutionary pathways for the characters (scored traits). Lacking an independent source of data, the latter would involve circular reasoning, ie. mapping the traits along a tree derived from those same traits.

By inferring a tree, we crystallize one pattern dimension out of the data, although more often than not this will be a compromise from multidimensional signals. A network, such as a Neighbor-net, has two dimensions, and hence our mapping can consider two alternatives at the same time; this enables us to make a choice, if we have to. Another practical advantage of a Neighbor-net is that it is quick to infer, so that we can easily reduce the data set and use a more focused graph for the map.

In cases where 2-dimensional graphs don't suffice, there are still Consensus networks, which would allow mapping character evolution based on a sample of many alternative trees.

We could even eliminate the circular reasoning while maintaining a relatively stable inference framework. Deleting a character or several characters (or recoding them: see eg. Should we try to infer trees on tree-unlikely matrices?) can easily lead to a new tree topology, although it has less effect on the structure of a Neighbor-net. When we need to worry about circular reasoning for mapping a certain concept, or two concepts that may have interacted, we just base our Neighbor-net on a distance matrix calculated from a reduced character matrix, and then map only those concepts not considered for the inference.
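The reduction step can be sketched as follows: drop all binary columns belonging to the concept(s) we want to map, and compute pairwise distances from what is left. The matrix below is a toy, and the normalized Hamming distance is an assumption made for this example rather than a statement about the distance used in our analysis; `reduced_distances` is likewise a hypothetical helper.

```python
def reduced_distances(rows, columns, exclude_concepts=()):
    """Pairwise normalized Hamming distances, ignoring the excluded concepts."""
    keep = [i for i, (concept, _) in enumerate(columns)
            if concept not in exclude_concepts]
    taxa = sorted(rows)
    dist = {}
    for s in taxa:
        for t in taxa:
            diffs = sum(rows[s][i] != rows[t][i] for i in keep)
            dist[(s, t)] = diffs / len(keep) if keep else 0.0
    return dist


# Toy binary matrix (invented): one column per (concept, COGID) pair.
columns = [("a", 1), ("a", 2), ("g", 10), ("g", 11)]
rows = {
    "X": [1, 0, 1, 0],
    "Y": [1, 0, 0, 1],
    "Z": [0, 1, 0, 1],
}

full = reduced_distances(rows, columns)
without_g = reduced_distances(rows, columns, exclude_concepts={"g"})
print(full[("X", "Y")], without_g[("X", "Y")])
```

The reduced matrix would then be passed to a Neighbor-net implementation, and only the excluded concept (here [g]) would be mapped onto the resulting graph.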
