Your place in the vaccine line

Using estimates from the Surgo Foundation and Ariadne Labs, Stuart A. Thompson for NYT Opinion shows how many people are in front of you to get the coronavirus vaccine. Just enter your age, whether you’re an essential worker, and the county you live in to get an idea of where you stand.


Scented candle reviews on Amazon and Covid-19

Prompted by a tweet about scented candles without smell and Covid-19, Kate Petrova plotted Amazon reviews for scented and unscented candles over time. Notice the downward trend for scented candles after the first confirmed case for Covid-19.

Interesting if true. I’m imagining a bunch of people opening their new scented candles, taking a big whiff, and not smelling anything.

But I wonder if there are outside forces (a.k.a. confounding factors) at work here. For example, Petrova only looked at reviews for the “top 3” scented candles. What do we see with other candles? Maybe a higher demand for scented candles from more people staying at home put a strain on the manufacturer. Maybe there was a shortage of some scented ingredient, which led to less potent candles. Maybe new scented candles customers have unrealistic expectations of what candles smell like.

I don’t know.

Maybe the decreasing average review really is related to Covid-19 symptoms.

Petrova put up the code and data, in case you want to dig into it.


Mapping 250,000 people

As we’ve talked about before, it can be hard to really understand the scale of big numbers. So when we hear that over 250,000 people died because of the coronavirus, it can be hard to conceptualize that number in our heads. Lauren Tierney and Tim Meko for The Washington Post provide a point of comparison by highlighting counties that have populations under 250,000.

Whole counties, or whole clusters of counties, that would be completely wiped out.

It’s a lot.


Where there are hospital staff shortages

Reporting for NPR, Sean McMinn and Selena Simmons-Duffin on staffing shortages:

On data availability:

This is the first time the federal agency has released this data, which includes limited reports going back to summer. The federal government consistently started collecting this data in July. After months of steadily trending upward, the number of hospitals reporting shortages crossed 1,000 this month and has stayed above since.

The data, however, are still incomplete. Not all hospitals that report daily status COVID-19 updates to HHS are reporting their staffing situations, so it’s impossible to tell for sure how much these numbers have increased.

The first time.

It was back in March, a few lifetimes ago, when we were talking about flattening the curve so that hospitals could provide care to those who needed it. This federal dataset is just coming out now in November? Obscene.


Why small gatherings can be dangerous too

A small gathering of 10 people or fewer can seem like a low-risk activity, and at the individual level, it’s lower risk than going to a big birthday party. But when a lot of people everywhere are gathering, small or large, the collective risk goes up. For FiveThirtyEight, Maggie Koerth and Elena Mejía illustrate the reasoning.
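To put rough numbers on the individual-versus-collective point, here is a minimal Python sketch. It is not the FiveThirtyEight model. It assumes, purely for illustration, that each attendee is independently infectious with the same probability (a hypothetical prevalence), so the chance that at least one attendee is infectious is 1 - (1 - p)^n. The function names and the example inputs are made up.

```python
def prob_at_least_one_infectious(prevalence, group_size):
    """Chance that at least one of `group_size` attendees is infectious.

    Assumes each attendee is infectious independently with probability
    `prevalence` (a simplifying assumption, not a fitted model).
    """
    return 1 - (1 - prevalence) ** group_size


def expected_gatherings_with_case(prevalence, group_size, n_gatherings):
    """Expected number of gatherings that include an infectious attendee."""
    return n_gatherings * prob_at_least_one_infectious(prevalence, group_size)


# Hypothetical inputs: 1% prevalence, gatherings of 10 people.
one_gathering = prob_at_least_one_infectious(0.01, 10)
many_gatherings = expected_gatherings_with_case(0.01, 10, 1000)
print(f"One gathering of 10: {one_gathering:.3f}")
print(f"Expected gatherings with a case, out of 1,000: {many_gatherings:.1f}")
```

Under these assumptions, a single gathering of 10 carries a modest chance of including someone infectious, but across a thousand such gatherings the expected number with a case is far from negligible.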

The collective part is where many seem to get tripped up. “Flattening the curve” only works when everyone works together. Lower your risk, and you lower the collective risk. You’re helping others. You’re helping those you care about.

Then, collectively, we all get out of this mess.


State restrictions and hospitalizations

The University of Oxford’s Blavatnik School of Government defined an index to track containment measures for the coronavirus. For The New York Times, Lauren Leatherby and Rich Harris plotted the index against cases and hospitalizations:

When cases first peaked in the United States in the spring, there was no clear correlation between containment strategies and case counts, because most states enacted similar lockdown policies at the same time. And in New York and some other states, “those lockdowns came too late to prevent a big outbreak, because that’s where the virus hit first,” said Thomas Hale, associate professor of global public policy at the Blavatnik School of Government, who leads the Oxford tracking effort.

A relationship between policies and the outbreak’s severity has become more clear as the pandemic has progressed.

States with more restrictions tend to have lower rates.

From these plots, it seems clear what we need to do. But I think most people have made up their minds already, and the interpretation of the data leads people to different conclusions.

With the holidays coming up, I just hope you lean towards clarity.


Rooted phylogenetic networks for coronaviruses

In a previous post, Guido constructed trees for coronaviruses in the SARS group to search for evidence of recombination. He also constructed unrooted data-display networks using SplitsTree. Here, we discuss our attempts to construct rooted genealogical phylogenetic networks for the same dataset [6] but with some modifications.

In particular, we deleted some sequences, giving a smaller data set with only 12 taxa. These taxa include, next to SARS-CoV-2 (the virus causing COVID-19) and SARS-CoV (responsible for the SARS epidemic in 2002/2003), the viruses MP789 and PCoV_GX-P1E sampled from Malayan pangolins from two different Chinese provinces and several viruses found in different bat species in the horseshoe bat genus (Rhinolophus), all from China.

This research was done by Rosanne Wallin, an MSc student at VU Amsterdam and UvA. Her full thesis as well as all data and results can be found on GitHub.

The first algorithm we applied to this data set was the TreeChild Algorithm [1], which is one of the methods that take a number of discordant (rooted, binary) trees as input and find a rooted network containing each input tree, minimizing the number of reticulate events in the network. To filter out some noise, we contracted some poorly-supported branches and then resolved multifurcations consistently across the trees (using a tool within the TreeChild Algorithm). This gave the network below. Note that the method is restricted to so-called tree-child networks, meaning that certain complex scenarios are excluded (where a network node only has reticulate children). Also note that this is not necessarily the only optimal tree-child network and not all topological differences can be distinguished based on the trees [5].

Figure 1: Phylogenetic network constructed by the Tree-Child algorithm (blocks_A_len0.01_supp70).
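To make the tree-child restriction mentioned above concrete, here is a minimal Python sketch that checks whether a rooted network, given as a list of directed (parent, child) edges, is tree-child, i.e., whether every internal node has at least one child that is not a reticulation. This is only an illustrative check on toy input, not the TreeChild Algorithm [1] itself. The function name and the node labels are hypothetical and do not correspond to the taxa in Figure 1.

```python
from collections import Counter, defaultdict


def is_tree_child(edges):
    """Return True if the rooted network given by (parent, child) edges
    is tree-child: every internal node has at least one child that is
    not a reticulation (a node with more than one parent)."""
    children = defaultdict(list)
    indegree = Counter()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    def is_reticulation(node):
        return indegree[node] > 1

    # Only nodes with children (internal nodes) appear as keys here.
    return all(
        any(not is_reticulation(child) for child in kids)
        for kids in children.values()
    )


# Toy network with one reticulation "h" (parents "a" and "b").
# Both "a" and "b" also have a non-reticulate child, so it is tree-child.
tree_child_net = [
    ("root", "a"), ("root", "b"),
    ("a", "x"), ("a", "h"),
    ("b", "h"), ("b", "y"),
    ("h", "z"),
]

# Toy network where "a" and "b" only have reticulate children ("h1", "h2"),
# which is exactly the kind of scenario the restriction excludes.
not_tree_child_net = [
    ("root", "a"), ("root", "b"),
    ("a", "h1"), ("a", "h2"),
    ("b", "h1"), ("b", "h2"),
    ("h1", "x"), ("h2", "y"),
]

print(is_tree_child(tree_child_net))
print(is_tree_child(not_tree_child_net))
```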

The network shows no reticulation in the SARS-CoV-2 clade (the bottom four taxa) and puts SARS-CoV-2 right next to RaTG13. Furthermore, it shows a reticulation between an ancestor of HKU3-1 and a common ancestor of SARS-CoV-2 and RaTG13 leading to bat-SL-CoVZC45. However, it cannot identify exactly which common ancestor of SARS-CoV-2 and RaTG13 is the parent, which is why multiple branches (in red) lead into this reticulation. All these observations are consistent with previous research [2].

Importantly, we cannot directly conclude that each reticulation corresponds to a recombination event. See Table 2.1 of David’s book [10] for a nice overview of possible causes of reticulation. Nevertheless, based on [2], it does look like at least the reticulation leading to bat-SL-CoVZC45 corresponds to a recombination event.

The second algorithm we applied was TriLoNet [3], which constructs a rooted network directly from sequence data. It is restricted to so-called level-1 networks, meaning that it cannot construct overlapping cycles. This method produced the network below.

Figure 2: Phylogenetic network constructed by TriLoNet.

At first sight, the network may look a bit different from the previous one (Figure 1). However, note that the three observations above also hold for this second network. Moreover, the SARS-CoV-2 clade is identical in both networks. This network contains only one reticulation, which is most likely due to the level-1 restriction.

Nevertheless, we can still use this method to find more putative recombination events. To do so, we simply exclude the recombinant bat-SL-CoVZC45 from the analysis and rerun the algorithm. This gives the following network.

Figure 3: Phylogenetic network constructed by TriLoNet, after omitting bat-SL-CoVZC45.

We have now found a second putative recombination event with Rf1 as recombinant. Note that this is also consistent with the network in Figure 1. On the other hand, also note that the branching order in the SARS-CoV clade (the bottom 7 taxa in Figure 3) has changed a bit. This could mean that more recombination events are present in the SARS-CoV clade, as we also see in Figure 1.

One interesting follow-up question is whether the two (or more) networks produced by TriLoNet can be combined into a single higher-level network, in order to show multiple reticulations simultaneously (see [4] for an algorithm that could be useful).

Another interesting observation from these networks is that there is no sign of recombination involving the pangolin coronaviruses MP789 and PCoV_GX-P1E. It rather looks like these viruses evolved from common ancestors of SARS-CoV-2 and RaTG13, but it is important to note that we cannot exclude a recombination event on the basis of these networks. The relationship between SARS-CoV-2 and pangolin coronaviruses is still being debated in the literature [2,7,8,9].

We noticed some limitations of the algorithms during this study. Firstly, the depicted networks are purely topological, i.e., the branch lengths do not represent anything. Adapting these algorithms to take branch length information into account could possibly improve their accuracy for this data set, since the extant taxa have precise time stamps and, for recent divergence events, these times can be estimated quite accurately (see [2]).

Another limitation is that we had to remove several taxa from the original data set [6] before the TreeChild algorithm could find a solution. By removing taxa, we reduced the number of reticulations needed to display the trees, making the TreeChild algorithm run in reasonable time. We made sure to include a diverse set of taxa (based on their pairwise distances [6]) to represent as much of the subgenus as possible. 
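As an illustration of what picking a diverse subset from pairwise distances can look like, here is a minimal Python sketch of a greedy max-min (farthest-point) selection. This is an assumption for illustration only: the post does not say how the taxa were actually chosen, and the function name, taxon labels and distance values below are all hypothetical.

```python
def select_diverse_taxa(distances, k, start):
    """Greedily pick k taxa that are spread out in terms of distance.

    `distances` is a symmetric dict of dicts: distances[a][b] is the
    pairwise distance between taxa a and b. Starting from `start`, each
    step adds the taxon whose smallest distance to the already selected
    taxa is largest.
    """
    if k > len(distances):
        raise ValueError("k cannot exceed the number of taxa")
    selected = [start]
    while len(selected) < k:
        candidates = [t for t in distances if t not in selected]
        best = max(
            candidates,
            key=lambda t: min(distances[t][s] for s in selected),
        )
        selected.append(best)
    return selected


# Hypothetical distances between four made-up taxa.
toy_distances = {
    "A": {"B": 0.10, "C": 0.50, "D": 0.90},
    "B": {"A": 0.10, "C": 0.45, "D": 0.85},
    "C": {"A": 0.50, "B": 0.45, "D": 0.40},
    "D": {"A": 0.90, "B": 0.85, "C": 0.40},
}

print(select_diverse_taxa(toy_distances, k=3, start="A"))
```

In the toy example, "B" is skipped because it sits very close to "A", which is the kind of redundancy a diversity-based selection tries to avoid.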

Rosanne also tried several other algorithms and taxon selections, and used trees based on genes rather than on fixed-length blocks (which is what we used above, following Guido’s post); see her thesis on GitHub.

Conclusion
Although rooted phylogenetic network methods are often limited in the number of taxa that can be analysed and/or the complexity of the networks that can be constructed, we have seen that these methods can be useful for constructing hypothetical evolutionary histories. Moreover, although the constructed networks are not identical, we have seen that they share certain key properties, which are also consistent with previous research.  

Rosanne Wallin, Leo van Iersel, Mark Jones, Steven Kelk and Leen Stougie


[1] Leo van Iersel, Remie Janssen, Mark Jones, Yukihiro Murakami and Norbert Zeh. A Practical Fixed-Parameter Algorithm for Constructing Tree-Child Networks from Multiple Binary Trees. arXiv:1907.08474 [cs.DM] (2019).

[2] Maciej F. Boni, Philippe Lemey, Xiaowei Jiang, Tommy Tsan-Yuk Lam, Blair W. Perry, Todd A. Castoe, Andrew Rambaut and David L. Robertson. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic. Nat Microbiol 5, 1408–1417 (2020). https://doi.org/10.1038/s41564-020-0771-4

[3] James Oldman, Taoyang Wu, Leo van Iersel and Vincent Moulton. TriLoNet: Piecing together small networks to reconstruct reticulate evolutionary histories. Molecular Biology and Evolution, 33 (8): 2151-2162 (2016). http://dx.doi.org/10.1093/molbev/msw068 (postprint)

[4] Yukihiro Murakami, Leo van Iersel, Remie Janssen, Mark Jones and Vincent Moulton. Reconstructing Tree-Child Networks from Reticulate-Edge-Deleted Subnetworks. Bulletin of Mathematical Biology, 81(10):3823–3863 (2019).

[5] Fabio Pardi and Celine Scornavacca. Reconstructible phylogenetic networks: do not distinguish the indistinguishable. PLoS Comput Biol, 11(4), e1004135 (2015).

[6] Grimm, Guido; Morrison, David (2020): Harvest and phylogenetic network analysis of SARS virus genomes (CoV-1 and CoV-2). figshare. Dataset. https://doi.org/10.6084/m9.figshare.12046581.v3

[7] Lam, Tommy Tsan-Yuk, Marcus Ho-Hin Shum, Hua-Chen Zhu, Yi-Gang Tong, Xue-Bing Ni, Yun-Shi Liao, Wei Wei, et al. Identifying SARS-CoV-2 Related Coronaviruses in Malayan Pangolins. Nature, 583, 282–285 (2020). https://doi.org/10.1038/s41586-020-2169-0

[8] Wang, Hongru, Lenore Pipes, and Rasmus Nielsen. Synonymous Mutations and the Molecular Evolution of SARS-Cov-2 Origins. [Preprint] Evolutionary Biology, April 21, 2020. https://doi.org/10.1101/2020.04.20.052019

[9] Li, Xiaojun, Elena E. Giorgi, Manukumar Honnayakanahalli Marichannegowda, Brian Foley, Chuan Xiao, Xiang-Peng Kong, Yue Chen, S. Gnanakaran, Bette Korber, and Feng Gao. Emergence of SARS-CoV-2 through Recombination and Strong Purifying Selection. Science Advances, Vol. 6, no. 27 (2020). https://doi.org/10.1126/sciadv.abb9153 

[10] David Morrison, Introduction to Phylogenetic Networks. RJR Productions, Uppsala, Sweden (2011). http://www.rjr-productions.org/Networks/index.html


Coronavirus cases rising in prisons

Coronavirus cases are rising (again), including among prisoners and prison staff. The Marshall Project has been tracking cases since March and provides a state-by-state rundown:

New infections this week rose sharply to their highest level since the start of the pandemic, far outpacing the previous peak in early August. Iowa, Michigan and the federal prison system each saw more than 1,000 prisoners test positive this week, while Texas prisons surpassed 2,000 new cases.


Estimate your Covid-19 risk, given location and activities

The microCOVID Project provides a calculator that lets you put in where you are and various activities to estimate your risk:

This is a project to quantitatively estimate the COVID risk to you from your ordinary daily activities. We trawled the scientific literature for data about the likelihood of getting COVID from different situations, and combined the data into a model that people can use. We estimate COVID risk in units of microCOVIDs, where 1 microCOVID = a one-in-a-million chance of getting COVID.
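The unit itself is easy to work with. Here is a minimal Python sketch of the conversion, using only the definition quoted above (1 microCOVID = a one-in-a-million chance). The second function, which combines several activities by treating them as independent risks, is my own simplifying assumption and not necessarily how the project's calculator does it. All names and example values are hypothetical.

```python
def microcovids_to_probability(microcovids):
    """Convert microCOVIDs to a probability (1 microCOVID = 1 in a million)."""
    return microcovids / 1_000_000


def combined_probability(activity_microcovids):
    """Chance of getting COVID from at least one of several activities.

    Assumes the activities are independent risks (an illustrative
    assumption, not the project's documented method).
    """
    p_none = 1.0
    for m in activity_microcovids:
        p_none *= 1 - microcovids_to_probability(m)
    return 1 - p_none


# Hypothetical activities rated at 200, 50 and 1,000 microCOVIDs.
print(microcovids_to_probability(200))
print(combined_probability([200, 50, 1000]))
```

For small risks like these, the combined probability is very close to simply adding up the microCOVIDs, which is part of what makes the unit convenient.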


Spike past 100k Covid-19 cases in a day

Meanwhile… based on estimates from The COVID Tracking Project, the United States had an all-time high for daily counts yesterday, at 103,087. And 1,116 people died.
