10th Anniversary Video Series: Shaking Things Up

Posted May 18, 2017 by PLOS ONE Editors in 10th Anniversary

As PLOS ONE celebrates its tenth birthday, we take a

Blog anniversary: 5 years


The first post was put up on this blog on Saturday, February 25 2012, which makes today the fifth anniversary.

First blog header

By my reckoning, this is the 469th blog post, not all of them written by me, of course; but this makes an average of one post for every 3.9 of the 1,827 days. I have never counted the number of actual words, but if I had ever contemplated that number then I probably would never have started.
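
As a quick check of that arithmetic, here is a minimal Python sketch. The two dates and the post count come from the text above; the variable names are purely illustrative.

```python
from datetime import date

# Dates and post count as stated in the post; the names are illustrative only.
first_post = date(2012, 2, 25)
anniversary = date(2017, 2, 25)
n_posts = 469

# Days elapsed between the first post and the fifth anniversary
# (the span includes the leap days of 2012 and 2016).
n_days = (anniversary - first_post).days

# Average number of days per post.
days_per_post = n_days / n_posts

print(n_days, round(days_per_post, 1))
```

The two leap days are what push the total past 5 x 365 = 1,825.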

Second blog header

It is rather tricky to estimate the readership, because of the number of blog hits that clearly come from robots. However, even trying to take that into account, I get an estimate just short of 500,000 pageviews over the 5 years.

Third blog header

So, thanks to everyone for dropping by. If you ever feel inclined to re-read any of the old posts, then they are grouped roughly by topic in the "Pages" at the top of the right-hand column.

Multiple sequence alignment


Following a previous post on Multiple sequence alignment, celebration of the 20th anniversary of my first publication in the alignment field continues, with a new publication:

  • Morrison DA, Morgan MJ, Kelchner SA (2015) Molecular homology and multiple sequence alignment: an analysis of concepts and practice. Australian Systematic Botany 28: 46-62.

This paper places sequence alignment within the larger picture of detecting homologies in molecular data, emphasizing the hierarchical nature of homologies. Surprisingly, this relationship has not been emphasized before. It also points out why nucleotide alignments are a unique form of homology assessment, even within this framework. Indeed, the only genotypic data are nucleotides, since everything else is an expression of the nucleotide sequences, rather than being inherited.

The article is Open Access.


Multiple sequence alignment


I started actively working on phylogenetic networks more than 10 years ago, when I gave a talk at the Phylogenetic Combinatorics and Applications meeting in Uppsala in July 2004.

However, before I started working on networks I had for several years been working on multiple sequence alignment methodology, and I still do. This work is also of direct relevance to network construction, of course, since faulty alignments will generate conflicting signals that can confound the biological signals that alone should appear in the network.

This year marks the 20th anniversary of my first publication in the alignment field (see the list appended below). To celebrate this I have some review / commentary articles planned. The first of these has now appeared online, and I would like to draw it to your attention:
  • Morrison DA (2015) Is multiple sequence alignment an art or a science? Systematic Botany 40: 14-26.
This paper relates current sequence alignment procedures to homology assessments as they are practiced for other data. Most algorithms can be seen as implementing only one of the several criteria that are used to identify homologies, which is inadequate. Suggestions are made for improving this situation.


There will also be a couple of upcoming blog posts canvassing a few issues that I see as important for the future development of alignment methods.

Previous Publications

Theory

Ellis J, Morrison DA (1995) Effects of sequence alignment on the phylogeny of Sarcocystis deduced from 18S rDNA sequences. Parasitology Research 81: 696-699.

Morrison DA, Ellis JT (1997) Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of Apicomplexa. Molecular Biology and Evolution 14: 428-441. [This has been the most cited of these publications, surprising me by still getting cited about once per month]

Morrison DA (2006) Multiple sequence alignment for phylogenetic purposes. Australian Systematic Botany 19: 479-539.

Morrison DA (2009) A framework for phylogenetic sequence alignment. Plant Systematics and Evolution 282: 127-149. [This was actually accepted for publication in 2007]

Morrison DA (2009) Why would phylogeneticists ignore computerized sequence alignment? Systematic Biology 58: 150-158.

Morrison DA (2010) [Book review of] ‘Sequence Alignment: Methods, Models, Concepts, and Strategies’. Systematic Biology 59: 363-365.

Empirical examples

Mugridge NB, Morrison DA, Johnson AM, Luton K, Dubey JP, Votypka J, Tenter AM (1999) Phylogenetic relationships of the genus Frenkelia: a review of its history and new knowledge gained from comparison of large subunit ribosomal RNA gene sequences. International Journal for Parasitology 29: 957-972.

Mugridge NB, Morrison DA, Heckeroth AR, Johnson AM, Tenter AM (1999) Phylogenetic analysis based on full-length large subunit ribosomal RNA gene sequence comparison reveals that Neospora caninum is more closely related to Hammondia heydorni than to Toxoplasma gondii. International Journal for Parasitology 29: 1545-1556.

Mugridge NB, Morrison DA, Jäkel T, Heckeroth AR, Tenter AM, Johnson AM (2000) Effects of sequence alignment and structural domains of ribosomal DNA on phylogeny reconstruction for the protozoan family Sarcocystidae. Molecular Biology and Evolution 17: 1842-1853.

Beebe NW, Cooper RD, Morrison DA, Ellis JT (2000) Subset partitioning of the ribosomal DNA small subunit and its effects on the phylogeny of the Anopheles punctulatus group. Insect Molecular Biology 9: 515-520.

Beebe NW, Cooper RD, Morrison DA, Ellis JT (2000) A phylogenetic study of the Anopheles punctulatus group of malaria vectors comparing rDNA sequence alignments derived from the mitochondrial and nuclear small ribosomal subunits. Molecular Phylogenetics and Evolution 17: 430-436.

Three years of network blogging


Today is the third anniversary of starting this blog, and this is post number 325. Thanks to all of our visitors over the past three years — we hope that the next year will be as productive as this past one has been.

I have summarized here some of the accumulated data, in order to document at least some of the productivity.

As of this morning, there have been 238,613 pageviews, with a median of 192 per day. The blog has continued to grow in popularity, with a median of 70 pageviews per day in the first year, 189 per day in the second year, and 353 per day in the third year. The range of pageviews was 172-1148 per day during this past year. The daily pattern for the three years is shown in the first graph.

Line graph of the number of pageviews through time, up to today.
The largest values are off the graph. The green line is the half-way mark.
The inset shows the mean (blue) and standard deviation of the daily number of pageviews.

There are a few general patterns in the data, the most obvious one being the day of the week, as shown in the inset of the above graph. The posts have usually been on Mondays and Wednesdays, and these two days have had the greatest mean number of pageviews.

Some of the more obvious dips include times such as Christmas - New Year; and the biggest peaks are associated with mentions of particular blog posts on popular sites.

Unfortunately, the data are also seriously skewed by visits from troll sites. These have been particularly from the Ukraine, which is solely responsible for the peak between days 900 and 1000. The smaller following peak represents visits from Taiwan.

The posts themselves have varied greatly in popularity, as shown in the next graph. It is actually a bit tricky to assign pageviews to particular posts, because visits to the blog's homepage are not attributed by the counter to any specific post. Since the current two posts are the ones that appear on the homepage, these posts are under-counted until they move off the homepage (after which they can be accessed only by a direct visit to their own pages, and thus always get counted). On average, 30% of the blog's pageviews are to the homepage, rather than to a specific post page, and so there is considerable under-counting.

Scatterplot of post pageviews through time, up to last week; the line is the median.
Note the log scale, and that the values are under-counted (see the text).

It is good to note that the most popular posts were scattered throughout the years. Keeping in mind the initial under-counting, the top collection of posts (with counted pageviews) have been:

Post   Pageviews   Title
129    8,347       The Music Genome Project is no such thing
42     5,271       Charles Darwin's unpublished tree sketches
172    5,052       The acoustics of the Sydney Opera House
10     3,954       Why do we still use trees for the dog genealogy?
181    3,644       How do we interpret a rooted haplotype network?
73     2,398       Carnival of Evolution, Number 52
58     2,077       Who published the first phylogenetic tree?
188    2,037       Phylogenetics with SpongeBob
146    2,011       Charles Darwin's family pedigree network
98     1,951       Faux phylogenies
49     1,870       Evolutionary trees: old wine in new bottles?
29     1,756       Network analysis of scotch whiskies
8      1,747       Tattoo Monday

This list is not very different to the same time last year. Posts 129 (which is linked in Wikipedia) and 172 continue to receive visitors almost every day.

The audience for the blog continues to be firmly in the USA. Based on the number of pageviews, the visitor data are:

Country              Pageviews
United States        40.3%
France               6.8%
Ukraine [spurious]   5.1%
Germany              5.0%
United Kingdom       4.7%
Russia               3.1%
Canada               1.8%
Australia            1.6%
China                1.0%
Turkey               0.7%

Finally, if anyone wants to contribute, then we welcome guest bloggers. This is a good forum to try out all of your half-baked ideas, in order to get some feedback, as well as to raise issues that have not yet received any discussion in the literature. If nothing else, it is a good place to be dogmatic without interference from a referee!

ADA Anniversary: Including People With Disabilities in Public Health

 

July 26th marks the 24th anniversary of the landmark Americans with Disabilities Act (ADA), a civil rights law that strengthens the inclusion of people with disabilities.    

Anyone can have a disability and a disability can occur at any point in a person’s life. An estimated 37 million [1] to 57 million [2] people are living with a disability in the U.S. and many people will experience a disability some time during the course of their life. When the ADA was enacted on July 26, 1990, its stated goals were to promote equal opportunity, full participation, independent living and economic self-sufficiency. [3]

Disability and Health Data System (DHDS) Interactive Comparison Map

CDC has embraced the spirit of the ADA. In 2010, CDC Director Dr. Thomas R. Frieden established an initiative to serve the health needs of people with disability in the United States. He later stated in 2012, “If we’re not inclusive we end up with two huge problems: one is that we’re unjust to the population and the second is that we’re not being as effective as we could be as organizations; we’re not taking advantage of everyone’s capacity.” [4]

People with disabilities need public health programs and health care services for the same reasons anyone does—to be well, active, and a part of the community.  Unfortunately, major health gaps exist between adults with and without disabilities on leading indicators of health.  For example:  

  • 30.3% of adults with disabilities reported they currently smoked cigarettes every day or some days, compared to 16.7% of adults without disabilities. (Disability and Health Data System (DHDS), 2012 data.)
  • 38.4% of adults with disabilities were obese, based on body mass index (BMI) calculated from self-reported weight and height (kg/m²), compared to 24.4% of adults without disabilities. (Disability and Health Data System (DHDS), 2012 data.)
  • 42.7% of adults with disabilities reported sufficient aerobic physical activity, compared to 54.5% of adults without disabilities. (Disability and Health Data System (DHDS), 2011 data.)

CDC’s Division of Human Development and Disability at the National Center on Birth Defects and Developmental Disabilities (NCBDDD) recognizes ADA as a platform for the inclusion of people with disabilities in federal efforts related to health and health care.   

NCBDDD developed Disability and Health Data System (DHDS) to provide public health programs with instant access to national and state-level health and demographic data on adults with disabilities.  

Man talking with a doctor

Jerry is a 53-year-old father of four children. Jerry has also had a disability for over 35 years. In 1976, Jerry was hit by a drunk driver. The accident left him a partial paraplegic. Jerry’s life is not defined by his disability. He lives his life just as anyone without a disability would. Jerry states, "I don't expect the world to revolve around us. I will adapt, just make it so I can adapt."

 Supporting State and National Disability and Health Programs  

Currently, 18 state-based disability and health programs are supported by NCBDDD to make sure that individuals with disabilities are included in ongoing disease prevention, health promotion, and emergency response activities within the state. NCBDDD also partners with five National Public Health Practice and Resource Centers (NPHPRC) to improve the lives of individuals living with disabilities by promoting health information, education, consultation and inclusion of health care professionals, people with disabilities, caregivers, media, researchers, policymakers and the public.  

Moving Forward  

Visit CDC’s website for more information on CDC’s work to include people with disabilities in public health.

Map of United States with CDC funded state disability and health programs

18 CDC-funded State Disability and Health Programs

References:

  1. U.S. Census Bureau; American Community Survey, 2011 American Community Survey 1-Year Estimates, Table S1810; generated by Michael H. Fox.
  2. Brault MW. Americans with disabilities: 2010. Washington, DC: US Census Bureau; 2012.
  3. State of Georgia’s ADA Coordinator’s Office. History and Spirit Behind the ADA.
  4. Public Health Matters Blog. Grand Rounds: People with Disabilities and Public Health.

Fireworks erupt in honor of the coronation of Iran’s Shah,…



Fireworks erupt in honor of the coronation of Iran’s Shah, March 1968.

Today is National Geographic Found’s one-year anniversary.

Here’s to a wonderful year filled with incredible and inspiring photographs from the past.

Photograph by Winfield Parks, National Geographic

Two years of network blogging


Today is the second anniversary of starting this blog, and this is post number 222. Thanks to all of our visitors over the past two years — we hope that the next year will be as productive as this past one has been.

I have summarized here some of the accumulated data, in order to document at least some of the productivity.

As of this morning, there have been 104,211 pageviews, with a median of 129 per day. The blog has continued to grow in popularity, with a median of 70 pageviews per day in the first year and 189 per day in the second year. The range of pageviews was 69-812 per day during this past year, and 3-667 the previous year. The daily pattern for the two years is shown in the first graph.

Line graph of the number of pageviews through time, up to today.
The largest values are off the graph. The green line is the half-way mark.
The inset shows the mean (blue) and standard deviation of the daily number of pageviews.

The erratic nature of the daily variation is apparently all too typical of blogs, and there appears to be no good explanation for it.  So, we might take this as a good example of the stochastic nature of the web.

There are a few general patterns in the data, the most obvious one being the day of the week, as shown in the inset of the above graph. The posts have usually been on Mondays and Wednesdays, and these two days have had the greatest mean number of pageviews.

Some of the more obvious dips include times such as Christmas - New Year; and the biggest peaks are associated with mentions of particular blog posts on popular sites. There also continue to be a few instances of "rogue" visits. These tend to be visits from sites such as Referer and Vampirestat.

The posts themselves have varied greatly in popularity, as shown in the next graph. It is actually a bit tricky to assign pageviews to particular posts, because visits to the blog's homepage are not attributed by the counter to any specific post. Since the current two posts are the ones that appear on the homepage, these posts are under-counted until they move off the homepage (after which they can be accessed only by a direct visit to their own pages, and thus always get counted). On average, 30% of the blog's pageviews are to the homepage, rather than to a specific post page, and so there is considerable under-counting.

Scatterplot of post pageviews through time, up to last week; the line is the median.
Note the log scale, and that the values are under-counted (see the text).

It is good to note that the most popular posts were scattered throughout the two years. Keeping in mind the initial under-counting, the top collection of posts (with counted pageviews) have been:

Post   Pageviews   Title
129    4,552       The Music Genome Project is no such thing
42     3,100       Charles Darwin's unpublished tree sketches
73     1,964       Carnival of Evolution, Number 52
172    1,891       The acoustics of the Sydney Opera House
10     1,641       Why do we still use trees for the dog genealogy?
98     1,451       Faux phylogenies
58     1,359       Who published the first phylogenetic tree?
49     1,352       Evolutionary trees: old wine in new bottles?
29     1,298       Network analysis of scotch whiskies
19     1,247       Tattoo Monday IV
67     1,178       Metaphors for evolutionary relationships
188    1,088       Phylogenetics with SpongeBob
8      1,051       Tattoo Monday

This is quite a different list to the same time last year. Posts 129, 42 and 172 continue to receive visitors almost every day.

The audience for the blog continues to be firmly in the USA. Based on the number of pageviews, the visitor data are:

Country          Pageviews
United States    41.1%
United Kingdom   5.6%
Germany          4.9%
France           3.8%
Russia           3.3%
Canada           2.7%
Australia        2.1%
China            1.4%
Brazil           1.0%
Poland           0.8%

You will note that this list is dominated by English-speaking countries. The blog does have a link to Google Translate to help other people, but it is clear that the audience is made up almost entirely of those people who are comfortable with English (or Australian, at any rate).

Finally, if anyone wants to contribute, then we welcome guest bloggers. This is a good forum to try out all of your half-baked ideas, in order to get some feedback, as well as to raise issues that have not yet received any discussion in the literature. If nothing else, it is a good place to be dogmatic without interference from a referee!

One year of network blogging


Today is the first anniversary of starting this blog, and this is post number 120. So, a big thank you to all of our visitors over the past year. We hope that the next year will be as productive as this past one has been.

We have summarized here some of the accumulated data, in order to document at least some of the productivity.

As of this morning, there have been 29,316 pageviews, for a median of 70 per day, but with a range of 3-667 pageviews. The daily pattern for the year is shown in the first graph.

Line graph of pageviews through time, up to today.
The largest value (Day 224) is off the graph.

The erratic nature of the daily variation is apparently all too typical of blogs, and there appears to be no good explanation for it. So, we might take this as a good example of the stochastic nature of the web. Nevertheless, there are general patterns detectable. For example, the steady rise from one third of the way through the year is very gratifying, although the slight dip right at the end is less so. The recent mean pageview data are:

Period                   Mean pageviews per day
October – November       90
December                 130
Christmas – New Year     90
January – mid February   130
late February            90

Some of the sharp peaks in the graph were due to various identifiable events, including the email announcing the existence of the blog, the addition of the blog to the Systematic Biology homepage, the mention of the blog in some posts at the Scientopia blog, and the mention of some of the posts in the monthly Carnival of Evolution blog roundup.

The biggest peak (which goes off the graph) was due to hosting an edition of the Carnival of Evolution, which generated an extra 2,000 pageviews. There were also unexpected Twitter announcements for particular posts, including the fourth Tattoo post (which got picked up when it happened to go out on April Fool's Day) and the one on Scotch Whiskies, which is apparently a topic of widespread interest.

There are also other general patterns in the data, the most obvious one being the day of the week, as shown in the second graph. The posts have usually been on Mondays and Wednesdays, and these two days have had the greatest mean number of pageviews (84 and 90, respectively). The other weekdays have had somewhat less (Tuesday 82, Thursday 75, Friday 65), and the weekend even fewer (Saturday 50, Sunday 63).
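
For readers who want to produce the same kind of day-of-the-week summary from their own blog counter, a minimal Python sketch follows. It assumes the data are available as (date, pageviews) pairs; the function name and the sample numbers are hypothetical and are not the blog's real data.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def mean_pageviews_by_weekday(daily_counts):
    """Return {weekday name: mean pageviews} for (date, count) pairs."""
    by_day = defaultdict(list)
    for day, count in daily_counts:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        by_day[WEEKDAYS[day.weekday()]].append(count)
    return {name: mean(counts) for name, counts in by_day.items()}

# Hypothetical sample, chosen only to show the grouping.
sample = [
    (date(2013, 2, 18), 80),   # Monday
    (date(2013, 2, 19), 70),   # Tuesday
    (date(2013, 2, 20), 95),   # Wednesday
    (date(2013, 2, 25), 90),   # Monday
]
result = mean_pageviews_by_weekday(sample)
print(result)
```

Swapping `mean` for `median` from the same module would give a summary that is less sensitive to the occasional very large peak.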

Boxplot of the daily pageviews, up to last Friday.
The largest value has been excluded.

There were also a few instances of what appear to be "rogue" visits during late December and early January. These involved an almost instantaneous addition of c.100 pageviews, without obvious explanation, which presumably came from bots examining the blog. They occurred once the blog reached 100 posts, which may not be coincidental.

The posts themselves have varied greatly in popularity, as shown in the next graph. It is actually a bit tricky to assign pageviews to particular posts, because visits to the blog's homepage are not attributed by the counter to any specific post. Since the current two posts are the ones that appear on the homepage, these posts are under-counted until they move off the homepage (after which they can be accessed only by a direct visit to their own pages, and thus always get counted). On average, 33% of the blog's pageviews are to the homepage, rather than to a specific post page, and so there is considerable under-counting.

Scatterplot of post pageviews through time, up to today; the line is the median.
Note the log scale, and that the values are under-counted (see the text).

The fact that 33% of the blog's pageviews are to the homepage means that one-third of the visitors are reading the blog as the posts are posted, while two-thirds are visiting via web searches and external links. So, we do have a regular readership, as well as having itinerant visitors.
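
The homepage share is simply the part of the total that the counter does not attribute to any post page. A minimal Python sketch of that calculation is given here, assuming only a total count and a list of per-post counts are available; the function name and the numbers are hypothetical, chosen to illustrate a one-third split.

```python
def homepage_share(total_pageviews, post_pageviews):
    """Fraction of pageviews not attributed to any specific post page."""
    attributed = sum(post_pageviews)
    if total_pageviews <= 0 or attributed > total_pageviews:
        raise ValueError("totals are inconsistent")
    return 1 - attributed / total_pageviews

# Hypothetical counts, not the blog's real data.
share = homepage_share(3000, [1200, 500, 300])
print(f"{share:.0%}")
```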

It is good to note that the most popular posts were scattered throughout the year. Keeping in mind the under-counting, the top collection of posts (with counted pageviews) have been:

Post   Pageviews   Title
73     1,559       Carnival of Evolution Number 52
42     1,302       Charles Darwin's unpublished tree sketches
19     737         Tattoo Monday IV
49     687         Evolutionary trees: old wine in new bottles?
10     666         Why do we still use trees for the dog genealogy?
58     606         Who published the first phylogenetic tree?
98     600         Faux phylogenies
26     429         Steven Jay Gould was wrong
67     420         Metaphors for evolutionary relationships
17     415         Tattoo Monday III
29     414         Network analysis of scotch whiskies
2      403         The first phylogenetic network (1755)
35     394         Tattoo Monday V

This blog has two possible uses: (i) providing an outlet for commentaries and ideas by professionals; and (ii) advertising phylogenetic networks to a wider audience. It has turned out that the latter posts have appeared mostly on Mondays and the former mostly on Wednesdays. Furthermore, it seems reasonable for the latter posts to have fewer pageviews, since the expected audience is much smaller (or "more select", as we prefer to see it).

There have been five main types of posts:

(i) Discussions of methodology
These are the mainstay of the blog for those who are professionally interested in phylogenetic networks. A wide range of topics have been discussed, and there is plenty more that can be said.

If anyone wants to contribute to this part of the blog, then we welcome guest bloggers. This is a good forum to try out all of your half-baked ideas, in order to get some feedback, as well as to raise issues that have not yet received any discussion in the literature. If nothing else, it is a good place to be dogmatic without interference from a referee!

As a blogger, you are very likely to get feedback from people, even if they do not leave comments on the blog itself. Professionals do not yet seem to be very used to writing blog comments, but they will send you an email.

(ii) Explanations
There are all sorts of things that seem obvious to professionals but which are obscure to non-experts. These posts are designed to redress this situation, so that there is somewhere on the web for people to go when they want to find out. They seem to have been rather popular posts.

(iii) Data analyses
The EDA analyses are intended to illustrate the usefulness of networks as data summaries (as opposed to their use for strictly evolutionary analyses). In particular, choosing datasets outside science advertises the potential uses of scientific data analysis to a wider public. Networks provide a valuable way of visualizing a table of numbers -- so, any time you see such a table you should be tempted to find out whether a network will help people to picture what it says. Most of the analyses have proved quite popular in terms of pageviews, but there has been little feedback about whether the public understands any of it.

(iv) Historical commentaries
These have usually been among the most popular posts with visitors. They simply involve bits of information that have accumulated through time, and the blog seems to be a good place to put them. They often involve phylogenetic trees, rather than networks, but that is only because trees have been used more often and thus have more history. Mind you, you have to have a good title in order to attract the public's attention!

(v) Miscellaneous
These are uncategorizable posts, which just consist of things that relate in some way to phylogenetic analysis, however peripherally. There are almost no other phylogenetics blogs on the web, and so there is no other obvious outlet for this information. The most popular of these posts have been the ones compiling the various pictures of phylogenetic tattoos that are lying around the web -- these are the most common Google search hits to the blog, along with the first compilation of Darwin's unpublished tree sketches.

Along with these posts, we have also started compiling a list of datasets that will be useful for evaluating network algorithms. Such datasets, where biologists seem to have an independently validated idea about the phylogenetic pattern, are hard to come by, and so it is worthwhile to make them available at a centralized location. A blog page is as good as anywhere else for this purpose, and the number of visits to this page is quite steady. Contributions of datasets are always welcome.

Finally, the audience for the blog has been, not unexpectedly, firmly in the USA. Based on the number of pageviews, the data are:

Country          Pageviews
United States    37.4%
United Kingdom   6.6%
Germany          5.3%
Russia           4.7%
Canada           4.0%
France           2.7%
Australia        2.3%
New Zealand      1.7%
Netherlands      1.6%
Sweden           1.5%

You will note that this list is dominated by English-speaking countries. The blog does have a link to Google Translate to help other people, but it is clear that the audience is made up almost entirely of those people who are comfortable with English (or Australian, at any rate).