Possible coronavirus deaths compared against other causes

Based on estimates from public health researcher James Lawler, The Upshot shows the range of possible coronavirus deaths under varying infection and fatality rates. Adjust the sliders to see how the death count (over a year) compares against other major causes of death:

Dr. Lawler’s estimate, 480,000 deaths, is higher than the number who die in a year from dementia, emphysema, stroke or diabetes. There are only two causes of death that kill more Americans: cancer, which kills just under 600,000 in a year, and heart disease, which kills around 650,000.

A coronavirus death toll near the top of the C.D.C. range (1.7 million) would mean more deaths from the disease than the number of Americans typically killed by cancer and heart disease put together.
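For a rough sense of the arithmetic behind the sliders, here's a minimal sketch. It assumes the death count is simply population times infection rate times fatality rate, which may not match how The Upshot built their interactive. The population figure and the example slider settings are my own placeholders. Only the cancer and heart disease counts come from the quoted text.

```python
# Annual U.S. deaths as quoted in the text above (rounded).
CANCER_DEATHS = 600_000         # "just under 600,000"
HEART_DISEASE_DEATHS = 650_000  # "around 650,000"

# Assumption: a rounded U.S. population, not a figure from the article.
US_POPULATION = 330_000_000


def estimated_deaths(infection_rate, fatality_rate, population=US_POPULATION):
    """Deaths = population x share infected x share of the infected who die."""
    if not (0 <= infection_rate <= 1 and 0 <= fatality_rate <= 1):
        raise ValueError("rates must be between 0 and 1")
    return population * infection_rate * fatality_rate


def compare(deaths):
    """Describe where a death count falls relative to the two leading causes."""
    if deaths > CANCER_DEATHS + HEART_DISEASE_DEATHS:
        return "more than cancer and heart disease combined"
    if deaths > HEART_DISEASE_DEATHS:
        return "more than heart disease"
    if deaths > CANCER_DEATHS:
        return "more than cancer"
    return "fewer than cancer or heart disease"


if __name__ == "__main__":
    # Hypothetical slider settings, chosen only for illustration.
    for infection_rate, fatality_rate in [(0.30, 0.005), (0.50, 0.01)]:
        deaths = estimated_deaths(infection_rate, fatality_rate)
        print(f"{infection_rate:.0%} infected, {fatality_rate:.1%} fatal: "
              f"{deaths:,.0f} deaths, {compare(deaths)}")
```

Small changes in either rate move the total a lot, which is the point of the sliders.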

Can we all agree now that brushing off coronavirus by citing annual flu numbers is a bad comparison? The most worrisome part of the data we have is the uncertainty, along with the wide range of possibilities that comes out of it.


All data is wrong

Vicki Boykis riffs on the George Box quote, “All models are wrong, some are useful.”:

The point is that, whatever data you dig into, at any given point in time, that looks solid on the surface, will be a complete mess underneath, plagued by undefined values, faulty studies, small sample problems, plagiarism, and all of the rest of the beautiful mess that is human life.

Just as all deep learning NLP models are really grad students reading phone books, if you dig deep enough, you’ll get to a place where your number is wrong or calculated differently than you’ve assumed.

I think of statistics as uncertainty management. It’s about estimates and figuring out how much you can trust them. Working with data is rarely about getting an exact truth.


Super Tuesday simulator

With Super Tuesday on the way, there’s still a lot of uncertainty about what’s going to happen. FiveThirtyEight has their forecast, but even with results expressed as odds and probabilities, the outcome almost seems static and concrete. So FiveThirtyEight has a different way of poking at their forecast. Pick the winners in each state, note how the conditional probabilities change as you go, and see what might happen in the rest of the primary given your picks.


Statistical uncertainty as certainty

Mark Rober, who is having a good run of science and engineering videos on YouTube, posted a short note on how he embraces statistical uncertainty:

As humans we are really good at using hindsight bias to convince ourselves we are more in control of things than we really are. For example, if you give 1024 people a coin and give them 10 tries to get as many tails as possible, it’s a statistical certainty that one of them will flip 10 tails in a row (and some unlucky chap will get 10 heads in a row). And yet at that point the media will swoop in and analyze his wrist motion and dissect his training regime and he’ll write books about his life story and how it all prepared him for that moment of greatness. Pretty much all situations in life are a roll of the dice. You can/should do as much as possible to weight the dice but there is always a dice roll.

[…]

I always do everything I can to stack the dice in my favor but truly internalizing that some big part of what happens is out of my control gives me permission to just feel grateful for the experiences I’ve had and not beat myself up when things don’t go as I hoped. I can still feel happy about life even if the views aren’t what they used to be and at the same time I get to feel stoked for the person that will inevitably take my place… just hopefully later than sooner ;)
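A quick check on the coin example in the quote, as a sketch that assumes fair coins and independent flippers. With 1,024 people and 10 flips each, the expected number of all-tails runs is exactly one, but the chance that at least one person actually gets there is closer to 63 percent than to certainty. The broader point about luck still holds. The function names here are just for illustration.

```python
import random


def prob_at_least_one_streak(people=1024, flips=10):
    """Exact chance that at least one person flips all tails."""
    p_single = 0.5 ** flips            # one person gets all tails
    return 1 - (1 - p_single) ** people


def simulate(people=1024, flips=10, trials=2000, seed=0):
    """Estimate the same chance by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each person's flips are `flips` random bits; 0 means all tails.
        if any(rng.getrandbits(flips) == 0 for _ in range(people)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"exact:     {prob_at_least_one_streak():.3f}")  # about 0.632
    print(f"simulated: {simulate():.3f}")
```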


Useful and not so useful Statistics

Hannah Fry, for The New Yorker, describes the puzzle of using statistics to analyze general patterns that then inform decisions about individuals:

There is so much that, on an individual level, we don’t know: why some people can smoke and avoid lung cancer; why one identical twin will remain healthy while the other develops a disease like A.L.S.; why some otherwise similar children flourish at school while others flounder. Despite the grand promises of Big Data, uncertainty remains so abundant that specific human lives remain boundlessly unpredictable. Perhaps the most successful prediction engine of the Big Data era, at least in financial terms, is the Amazon recommendation algorithm. It’s a gigantic statistical machine worth a huge sum to the company. Also, it’s wrong most of the time.

Be sure to read this one. I especially liked the examples used to explain statistical concepts that sometimes feel mechanical in stat 101.


Gallery of uncertainty visualization methods

It must be uncertainty month and nobody told me. For Scientific American, Jessica Hullman briefly describes her research in uncertainty visualization with a gallery of options from worst to best.


What that hurricane map means

For The New York Times, Alberto Cairo and Tala Schlossberg explain the cone of uncertainty we often see in the news when a hurricane approaches. People often misinterpret the graphic:

The cone graphic is deceptively simple. That becomes a liability if people believe they’re out of harm’s way when they aren’t. As with many charts, it’s risky to assume we can interpret a hurricane map correctly with just a glance. Graphics like these need to be read closely and carefully. Only then can we grasp what they’re really saying.

Depict uncertainty more clearly, and people will better understand the probabilities and confidence intervals.


xkcd and the needle of probability

xkcd referenced the ever-so-loved forecasting needle. I’m so not gonna look at it this year. Maybe.


Hotter days where you were born

It’s getting hotter around the world. The New York Times zooms in on your hometown to show the average number of “very hot days” (at least 90 degrees) since you were born and then the projected count over the next decades. Then you zoom out to see how that relates to the rest of the world.

I’ve always found it interesting that visualization and analysis are typically “overview first, then details on demand”, whereas storytelling more often goes the opposite direction. Focus on an individual data point first and then zoom out after.


Needle of uncertainty

The Upshot has used a needle to show shifts in their live election forecasts, because many readers don’t understand probability. Nate Cohn and Josh Katz:

This was evident before the result of the 2016 election, and as a result we tried something new: a jitter, where the needle quivered to reflect the uncertainty around the forecast. Although many readers disliked it, the jitter reflected an earnest attempt to give tangible meaning to abstract probabilities. Nonetheless, we turned the jitter off for all of our 2017 forecasts.

Tonight, readers will have the option to turn the jitter off. We expect that some readers will opt to do so, but remember this: Switching it off only hides the uncertainty — it doesn’t make it go away.

Read the whole thing for why the needle, what the needle means, and how The Upshot is using it.

As much as I hated what the needle showed me the first time I saw it, I’ve grown to appreciate the uncertainty it represents.
