Historical data visualization panel

Manuel Lima hosted a free online panel on historical data visualization with Michael Friendly and Sandra Rendgen. It already happened, but you can listen to the archived version:

Human beings have been involved in the visual representation of information for thousands of years. While some books on Data Visualization go as far back as the 18th century, to what’s considered to be the golden age of information graphics, the history of the practice is much deeper. The participants on this panel have spent years exploring key characters and major contributions to the field of Data Visualization over many centuries. We will be discussing ancient visual metaphors, the challenges of doing research in this area, what we can learn from the past, and many other topics.

Disgraced Korea scholar, formerly of Columbia, loses paper for plagiarism

A former historian at Columbia University who resigned last year in the wake of a plagiarism scandal involving his award-winning book on North Korea has lost a 2005 paper for misusing his sources. In 2017, Charles Armstrong, once a leading figure in Korean scholarship, returned the 2014 John King Fairbank Prize from the American Historical …

Data shelf life

Stephen M. Stigler argues that data have a limited shelf life. The abstract:

Data, unlike some wines, do not improve with age. The contrary view, that data are immortal, a view that may underlie the often-observed tendency to recycle old examples in texts and presentations, is illustrated with three classical examples and rebutted by further examination. Some general lessons for data science are noted, as well as some history of statistical worries about the effect of data selection on induction and related themes in recent histories of science.

In a nutshell, while data itself doesn’t change, everything around it — the people who collected the data, the things that the data is about, and where the data came from — changes over time.

Some hitherto unknown genealogical trees of music

In last week's post, I discussed Petter Hellström's recent doctoral thesis: Trees of Knowledge: Science and the Shape of Genealogy. In this thesis he discusses three "genealogical trees" in detail. Augustin Augier’s tree of plant families and Félix Gallet’s family tree of languages have already been covered in this blog (you can look them up using the Search box, to the right), but Henri Montan Berton’s family tree of chords has not.

Indeed, the historical literature at large has pretty much ignored the idea of a genealogical tree being associated with music. Nevertheless, the tree itself is explicitly labeled a Genealogical Tree of Chords. This tree, and its predecessor by François Guillaume Vial, thus deserve examination.

Henri Montan Berton (1767–1844) is well known within the history of music, and his tree was published as an independent broadsheet in two (almost identical) editions, in c. 1807 and 1815. It seems to have been produced as a teaching tool, as indeed were the trees of Augier and Gallet. As Petter Hellström notes, for these authors "genealogy did not necessarily involve chronology or change ... the introduction of family trees into secular knowledge production had more to do with the needs of information management, visualisation and communication".

Berton himself states (translated from the French):
In composing the Genealogical Tree, one has had the intention to present to the eye, at a single glance, the reunion of the great family of Chords, and to demonstrate to the eye that there is only one Primordial [Chord], and that it is the source of all Harmonies.
At the base of the tree is a fundamental bass note along with its 12th and 17th major — this was the harmonic series in 18th century music theory. From here the tree produces 8 branches above, each labeled (at the bottom) with a musical chord, and with another 20 chords labeled further up the branches (all highlighted by arrows at the left). The main trunk (denoted A) is labeled Perfect or Constant Chord. The eight branches are intended to show the relationships between "8 fundamental chords [bottom arrow] and 20 inverted chords [the upper arrows]".

The tree thus displays the harmonic relationships among the chords, rather than any sort of chronological development. It was devised as an aid to learning the fundamentals of music composition.

Berton was not the first to use this idea within music theory. Four decades earlier, in 1766, François Guillaume Vial (1725–?) had produced another broadsheet, this time labeled Genealogical Tree of Harmony.

Like Berton's tree, this is not about chronology, but is about "family relationships" in a different sense. Moreover, in this instance the branching aspect of the tree is abandoned, and the tree foliage is simply festooned with medallions, labeled with chords — it is the different sections of the tree's crown that show relationships, not different branches.

The objective here was to illustrate "the most natural order of harmonic modulation", once again devised as a teaching tool. The two compass roses at the bottom left and right show the circle of fifths (left), guiding horizontal modulation among the chords, and the circle of thirds (right), guiding vertical modulation among the chords.

Vial himself states (translated from the French):
This Genealogical Tree simplifies and allows those who are capable of intonation [to practice] the art of preluding not only on a leading note, but even to change between the most desired modulations of any instrument.
Hellström traces these uses of the "family tree" metaphor in music back to Jean-Philippe Rameau (1683–1764), an influential music theorist. Thus, he concludes that we should:
read the trees of Vial and Berton as graphical codifications of an already established metaphor and manner of thinking about harmony, especially as both authors were informed by Rameau in their understanding of harmony in the first place.
In constructing their respective tree diagrams, Berton and Vial both seized upon an already existing metaphor and made it visible on paper. Their trees are not 'genealogical' in the sense that they charted family history or cross-generational relationships, they are 'genealogical' in the sense that they depict presumably natural, organic relationships, in which every part has its place in the whole, and where every part can be referred back to a common source or root.
These trees do not, therefore, fit into the usual history of genealogical trees, as this blog recognizes them, denoting a chronological history. They would, however, fit neatly into the post on Relationship trees drawn like real trees.

The early beginnings of visual thinking

Visualization is a relatively new field. Sort of. The increased availability of data has pushed visualization forward in more recent years, but its roots go back centuries. Michael Friendly and Howard Wainer rewind back to the second half of the 1800s, looking at the rise of visual thinking.

On the first construction of the periodic table of elements:

On February 17, 1869, right after breakfast, and with a train to catch later that morning, Mendeleev set to work organizing the elements with his cards. He carried on for three days and nights, forgetting the train and continually arranging and rearranging the cards in various sequences until he noticed some gaps in the order of atomic mass. He later recalled, “I saw in a dream, a table, where all the elements fell into place as required. Awakening, I immediately wrote it down on a piece of paper.” (Strathern, 2000) He named his discovery the “periodic table of the elements.”

I sometimes wonder what they will say about current visualization work a couple of centuries from now. At what point will the historians say, “This is when visualization crashed and burned, never to be seen again.” Or, maybe it’ll go the other way: “This is when everyone understood and communicated with data, and visualization was the vehicle to do it.”

A recent thesis about Trees of Knowledge

Recently, Petter Hellström successfully defended his doctoral thesis:
Trees of Knowledge: Science and the Shape of Genealogy
Department of the History of Science and Ideas
Uppsala University, Sweden
The thesis itself is obviously of great interest to readers of this blog. It is not currently online, but you can obtain a printed or electronic copy by contacting:

Here is the abstract:
This study investigates early employments of family trees in the modern sciences, in order to historicise their iconic status and now established uses, notably in evolutionary biology and linguistics. Moving beyond disciplinary accounts to consider the wider cultural background, it examines how early uses within the sciences transformed family trees as a format of visual representation, as well as the meanings invested in them.
Historical writing about trees in the modern sciences is heavily tilted towards evolutionary biology, especially the iconic diagrams associated with Darwinism. Trees of Knowledge shifts the focus to France in the wake of the Revolution, when family trees were first put to use in a number of disparate academic fields. Through three case studies drawn from across the disciplines, it investigates the simultaneous appearance of trees in natural history, language studies, and music theory. Augustin Augier’s tree of plant families, Félix Gallet’s family tree of dead and living languages, and Henri Montan Berton’s family tree of chords served diverse ends, yet all exploited the familiar shape of genealogy.
While outlining how genealogical trees once constituted a more general resource in scholarly knowledge production — employed primarily as pedagogical tools — this study argues that family trees entered the modern sciences independently of the evolutionary theories they were later made to illustrate. The trees from post-revolutionary France occasionally charted development over time, yet more often they served to visualise organic hierarchy and perfect order. In bringing this neglected history to light, Trees of Knowledge provides not only a rich account of the rise of tree thinking in the modern sciences, but also a pragmatic methodology for approaching the dynamic interplay of metaphor, visual representation, and knowledge production in the history of science.
The trees of Augier and Gallet have been covered in this blog, but that of Berton has not. I will discuss it in the next post.

Where are we, 60 years after Hennig?

Phylogenetic analysis is common in the modern study of evolutionary biology, and yet it often seems to be a poorly understood tool. Indeed, it seems to often be seen as nothing more than a tool, and one for which one does not need much expertise.

For example, we do not need to spend much time on Twitter to realize that many evolutionary biologists do not understand even the most basic things about the difference between taxa and characters. Taxa are often referred to as "primitive", particularly by people studying the so-called Origin of Life. However, taxa themselves cannot be either primitive or derived; instead, they are composed of mixtures of primitive and derived characters — they have derived characters relative to their ancestors and primitive ones compared to their descendants.

The logical relationship between common ancestors and monophyletic / paraphyletic groups is also apparently unknown to many evolutionary biologists. There is endless debate about whether the Last Universal Common Ancestor was a Bacterium or an Archaean when, of course, it cannot be either. That is, we sample contemporary organisms for analysis, which come from particular taxonomic groupings, and from these data we infer hypothetical ancestors. However, those ancestors cannot be part of the same taxonomic group as their descendants unless that taxonomic group is monophyletic.
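The point about ancestors and monophyletic groups can be checked mechanically on a toy tree. The following is only a minimal sketch: the tree, the node names, and the helper functions are all hypothetical and invented for illustration, not taken from any real dataset or library. It shows that the common ancestor of two groups sits outside both of them, and that a set of tips is monophyletic only if it contains every tip descended from its most recent common ancestor.

```python
# Hypothetical toy tree written as child -> parent (all names are invented).
PARENT = {
    "root": None,          # stands in for a hypothetical universal ancestor
    "A": "root",           # inferred ancestor of tips a1 and a2
    "B": "root",           # inferred ancestor of tips b1 and b2
    "a1": "A", "a2": "A",
    "b1": "B", "b2": "B",
}

def ancestors(node):
    """Path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def tips_below(node):
    """All tips (nodes without children) at or below node."""
    children = [c for c, p in PARENT.items() if p == node]
    if not children:
        return {node}
    found = set()
    for child in children:
        found |= tips_below(child)
    return found

def mrca(tips):
    """Most recent common ancestor of a non-empty collection of tips."""
    paths = [ancestors(t) for t in tips]
    shared = set(paths[0]).intersection(*paths[1:])
    # Walking upward from a tip, the first shared node is the most recent one.
    for node in paths[0]:
        if node in shared:
            return node

def is_monophyletic(tips):
    """True if tips are exactly the tips descended from their MRCA."""
    return tips_below(mrca(tips)) == set(tips)

def clade_contains(tips, node):
    """True if node sits at or below the MRCA of tips."""
    return mrca(tips) in ancestors(node)

print(is_monophyletic({"a1", "a2"}))          # group A on its own
print(is_monophyletic({"a1", "b1"}))          # leaves out a2 and b2
print(clade_contains({"a1", "a2"}, "root"))   # is the root inside group A?
```

In this sketch the root is the ancestor of both groups, so asking whether it "is" an A or a B has no sensible answer: it lies outside each of the two clades.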

This is all basic stuff, first expounded in the 1950s by Willi Hennig. So, why do so many people apparently still not know any of this 60 years later? I suspect that somewhere along the line the molecular geneticists got the idea that Hennig was part of Parsimony Analysis, and since they adopted Likelihood Analysis instead, they concluded that he is irrelevant.

However, Hennigian Logic underlies all phylogenetic analyses, of whatever mathematical ilk. All such analyses are based on the search for unique shared derived characters, which is the only basis on which we can objectively produce a rooted phylogenetic tree or network.

In the molecular world, many analysis techniques are based on analyzing the similarity of the taxa. However, similarity is only relevant if it is based on shared derived characters — if it is based on shared primitive characters then it cannot reliably detect phylogenetic history. This was Hennig's basic insight, and it is as true today as it was 60 years ago.

The confusing thing here is that most similarity among taxa will be based on both primitive and derived characters. This means that some of the analysis output reflects phylogenetic history and some does not. The further we go back in evolutionary time, the more likely it is that similarity reflects shared primitive characters rather than shared derived characters. This simple limitation seems to be poorly understood by evolutionary biologists.
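A small worked example may make the distinction concrete. This is a sketch under stated assumptions: the three taxa and their character states are invented, and character polarity (0 = primitive, 1 = derived) is simply assumed to be known in advance. The code splits the raw similarity between two taxa into the part due to shared derived states and the part due to shared primitive states.

```python
# Hypothetical character matrix: 0 = primitive state, 1 = derived state.
# Polarity is assumed to be known already; taxa and values are invented.
CHARACTERS = {
    "taxon_x": [1, 1, 0, 0, 0, 0, 0, 0],
    "taxon_y": [1, 1, 1, 1, 1, 1, 0, 0],  # shares two derived states with x
    "taxon_z": [0, 0, 0, 0, 0, 0, 0, 1],  # mostly retains primitive states
}

def shared_states(a, b):
    """Return (shared derived, shared primitive) counts for two taxa."""
    pairs = list(zip(CHARACTERS[a], CHARACTERS[b]))
    derived = sum(1 for s, t in pairs if s == 1 and t == 1)
    primitive = sum(1 for s, t in pairs if s == 0 and t == 0)
    return derived, primitive

def overall_similarity(a, b):
    """Raw similarity: number of characters with matching states."""
    return sum(shared_states(a, b))

for pair in [("taxon_x", "taxon_y"), ("taxon_x", "taxon_z")]:
    derived, primitive = shared_states(*pair)
    print(pair, "derived:", derived, "primitive:", primitive,
          "overall:", overall_similarity(*pair))
```

In this invented matrix, taxon_x is more similar overall to taxon_z than to taxon_y, yet every one of the x/z matches is a shared primitive state. Only the x/y pair shares derived states, which is the kind of evidence Hennig's argument treats as phylogenetically informative.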

Perhaps it would be a good idea if university courses in molecular evolutionary biology actually taught phylogenetics as a topic of its own, rather than as an incidental tool for studying evolution. After all, there is more to getting a scientific answer than feeding data into a computer program.

Obviously, I may be wrong in painting my picture with such a broad brush. If so, then it must be that the people I have described seem to have gathered on Twitter, like birds of a feather.

And yet, I see the same thing in the literature, as well. Consider this recent paper:
A polyploid admixed origin of beer yeasts derived from European and Asian wine populations. Justin C. Fay, Ping Liu, Giang T. Ong, Maitreya J. Dunham, Gareth A. Cromie, Eric W. Jeffery, Catherine L. Ludlow, Aimée M. Dudley. 2019. PLoS Biology 17(3): e3000147.
This seems to be quite an interesting study of a reticulate evolutionary history involving budding yeasts, from which the authors conclude that:
The four beer populations are most closely related to the Europe/wine population. However, the admixture graph also showed strong support for two episodes of gene flow into the beer lineages resulting in 40% to 42% admixture with the Asia/sake population.

However, they then undo all of their good work with this sentence:
The inferred admixture graph grouped the four beer populations together, with the lager and two ale populations being derived from the lineage leading to the Beer/baking population.
Nonsense! Neither lineage derives from the other, but instead they both derive from a common ancestor. This is like saying that I derive from the lineage leading to my younger brother, when in fact we both derive from the same parents. I doubt that the authors believe the latter idea, so why do they apparently believe the former?

That is a little test that you can all use when writing about phylogenetics. If your words don't make sense for a family history, then they don't make sense for phylogenetics either.

Is racism Christian?

I was taught that racism developed out of Johannes Blumenbach’s Anthropological Treatises in the late eighteenth century, specifically his doctoral thesis On the Natural Variety…

What is R, what it was, and what it will become

Roger Peng provides a lesson on the roots of R and how it got to where it is now:

Chambers was referring to the difficulty in naming and characterizing the S system. Is it a programming language? An environment? A statistical package? Eventually, it seems they settled on “quantitative programming environment”, or in other words, “it’s all the things.” Ironically, for a statistical environment, the first two versions did not contain much in the way of specific statistical capabilities. In addition to a more full-featured statistical modeling system, versions 3 and 4 of the language added the class/methods system for programming (outlined in Chambers’ Programming with Data).

I’m starting to feel my age, as some of the “history” feels more like recent experience.

You can also watch Peng’s keynote in the video version.

My father on D-Day: 75 years ago

Today is the 75th anniversary of D-Day—the day British, Canadian, and American troops landed on the beaches of Normandy.1

For us baby boomers it always meant a day of special significance for our parents. In my case, it was my father who took part in the invasion. That's him on the right as he looked in 1944. He was an RAF pilot flying rocket-firing Typhoons in close support of the ground troops. His missions were limited to quick strikes and reconnaissance during the first few days of the invasion because Normandy was at the limit of their range from southern England. During the second week of the invasion (June 14th) his squadron landed in Crepon, Normandy, and things became very hectic from then on, with several close support missions every day [see Hawker Hurricanes and Typhoons in World War II].

I have my father's log book and here are the pages from June 1944 (below). The red letters on June 6 say "DER TAG." It was his way of announcing D-Day. On the right it says "Followed SQN across channel. Saw hundreds of ships ... jumped by 190s. LONG AWAITED 2nd FRONT IS HERE." Later that day they shot up German vehicles south-east of Caen where there was heavy fighting by British and Canadian troops. The next few weeks saw several sorties over the allied lines. These were mostly attack missions using rockets to shoot up German tanks, vehicles, and trains.

The photograph on the right shows a crew loading rockets onto a Typhoon based just a few kilometers from the landing beaches in Normandy. You can see from the newspaper clipping in my father's log book that his squadron was especially interested in destroying German headquarters units, and they almost got Rommel. It was another RAF squadron that wounded Rommel on July 17th.

The colorized photo on the left is my father in his Typhoon.

The log book entry (above) for June 10th says, "Wizard show. Recco area at 2000' south west of Caen F/S Moore and self destroyed 2 flak trucks, 2 arm'd trucks, and 1 arm'd command vehicle, Every vehicle left burning but one. Must have been a divisional headquarters? No casualties."

Here's another description of that rocket-firing Typhoon raid [Air Power Over the Normandy Beaches and Beyond].
Intelligence information from ULTRA set up a particularly effective air strike on June 10. German message traffic had given away the location of the headquarters of Panzergruppe West on June 9, and the next evening a mixed force of forty rocket-armed Typhoons and sixty-one Mitchells from 2 TAF struck at the headquarters, located in the Chateau of La Caine, killing the unit's chief of staff and many of its personnel and destroying fully 75 percent of its communications equipment as well as numerous vehicles. At a most critical point in the Normandy battle, then, the Panzer group, which served as a vital nexus between operating armored forces, was knocked out of the command, control, and communications loop; indeed, it had to return to Paris to be reconstituted before resuming its duties a month later.

My father was awarded the Distinguished Flying Cross (DFC) for his efforts during the war.

(This article was first posted on June 6, 2014.)

1. The British landed at Sword Beach and Gold Beach, the Canadians at Juno Beach, and American troops landed at Omaha and Utah Beaches.