Introduction to Probability for Data Science, a free book

Introduction to Probability for Data Science is a free-to-download book by Purdue statistics professor Stanley H. Chan:

We need a book that balances the theory and practice. We need a book that provides insights and not just theorems and proofs. We need a book that motivates the students, telling them why probability is so essential to their work. We need a book that highlights the impacts of the subject. From over half a decade of teaching the course, I have distilled what I believe to be the core of probabilistic methods. I put the book in the context of data science, to emphasize the inseparability between data (computing) and probability (theory) in our time.

Download a free PDF copy or buy a physical copy.


Odds of winning the big Mega Millions prize

With tonight’s Mega Millions jackpot estimated at $1.28 billion, you might be wondering what the odds of winning are, even if you know the chances are super slim for an individual. (On the other hand, the more tickets purchased overall, the greater the chances that someone in the country wins.) For The Washington Post, Bonnie Berkowitz and Shelly Tan made a playful quiz to test your perception of 1 in 302.6 million.
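For a sense of where 1 in 302.6 million comes from, here is a minimal sketch. It assumes the game format of five white balls drawn from 70 plus one Mega Ball drawn from 25, which the post does not spell out, and it treats every ticket as an independent random pick, which is a simplification. The function names are made up for illustration.

```python
from math import comb

# Assumed game format (not stated in the post): choose 5 of 70 white balls
# plus 1 of 25 Mega Balls.
WHITE_BALLS, WHITE_PICKS, MEGA_BALLS = 70, 5, 25


def jackpot_combinations():
    """Number of distinct tickets under the assumed format."""
    return comb(WHITE_BALLS, WHITE_PICKS) * MEGA_BALLS


def prob_at_least_one_winner(tickets_sold):
    """Chance that at least one ticket hits the jackpot,
    assuming every ticket is an independent random pick."""
    p = 1 / jackpot_combinations()
    return 1 - (1 - p) ** tickets_sold


print(f"1 in {jackpot_combinations():,}")
for sold in (1, 100_000_000, 300_000_000):
    print(f"{sold:>11,} tickets: {prob_at_least_one_winner(sold):.4f}")
```

One ticket barely moves the needle, but the chance that someone wins climbs quickly as total sales approach the number of possible combinations.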


Calculating win probabilities

Zack Capozzi, for USA Lacrosse Magazine, explains how he calculates win probabilities pre-game and during games. On interpretation, which could easily apply to other sports and all forecasts:

But interpretation here matters quite a bit. And this is frustrating for some people, but that 61 percent should be interpreted as: “if these teams played 100 times, we would expect Marquette to win 61 of those games.” It definitely does not mean that the model is 61 percent confident that Marquette will win.

This is a bit odd, but this also means that if the Win Probability model gives Team A a 90% chance to beat Team B, there is nothing wrong with the model if Team B ends up winning the game. The issue would arise if, out of 100 90-percent win probability games, the favorite wasn’t winning around 90 of those games. When the model says 90 percent, you want it to mean 90 percent.
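The check described in the quote can be sketched as a calibration table: group games by forecast, then compare the average forecast in each group with how often the favorite actually won. This is not Capozzi's code. The data below is simulated, and the function name and bin count are assumptions for illustration.

```python
import random


def calibration_table(predictions, outcomes, n_bins=10):
    """Group forecasts into equal-width bins and return, per non-empty bin,
    (mean forecast, observed win rate, number of games)."""
    bins = [[] for _ in range(n_bins)]
    for p, won in zip(predictions, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, won))
    table = []
    for games in bins:
        if games:
            mean_forecast = sum(p for p, _ in games) / len(games)
            win_rate = sum(won for _, won in games) / len(games)
            table.append((mean_forecast, win_rate, len(games)))
    return table


# Simulated games from a perfectly calibrated (hypothetical) model:
# each game is won with exactly its forecast probability.
rng = random.Random(0)
predictions = [rng.random() for _ in range(20_000)]
outcomes = [rng.random() < p for p in predictions]

for mean_forecast, win_rate, n in calibration_table(predictions, outcomes):
    print(f"forecast {mean_forecast:.2f}  observed {win_rate:.2f}  games {n}")
```

For a well-calibrated model the two columns track each other closely. A single upset in a 90 percent game says little; a 90 percent bin that wins only 70 percent of the time says a lot.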

I wonder how many people incorrectly interpret the probability as “61 percent confident”. I bet a lot.

I do know that ever since the Golden State Warriors lost to the Cleveland Cavaliers in the 2016 NBA Finals — while holding a 90-something percent win projection by FiveThirtyEight — I stopped paying attention to win probability. But learning more about the calculation made it more interesting.


Looking for similar NBA games, based on win probability time series

Inpredictable, a sports analytics site by Michael Beuoy, tracks win probabilities of NBA games going back to the 1996-97 season. When a team is up by a lot, its probability of winning is high, and the reverse holds for the trailing team. So for each game, you have a minute-by-minute time series of win probability.

Beuoy added a new feature that looks for games with similar patterns a.k.a. “Dopplegamers”.
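One simple way to find games with similar patterns is to measure the distance between two win probability series and pick the closest. The post does not say how Beuoy does it, so the sketch below swaps in a plain root-mean-square distance as an assumption. The function names and the tiny series are made up.

```python
def series_distance(a, b):
    """Root-mean-square difference between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def most_similar(target, candidates):
    """Return the name of the candidate series closest to the target."""
    return min(candidates, key=lambda name: series_distance(target, candidates[name]))


# Hypothetical home-team win probabilities at four points in a game.
games = {
    "blowout": [0.5, 0.7, 0.9, 1.0],
    "comeback": [0.5, 0.2, 0.4, 1.0],
}
print(most_similar([0.5, 0.65, 0.85, 1.0], games))
```

Real games would have a point per minute and might need alignment for overtime, which this sketch ignores.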


Probability you will break up with your partner

Rosenfeld, et al. from Stanford University ran a survey in 2009 for a study on How Couples Meet and Stay Together. Dan Kopf and Youyou Zhou for Quartz used this dataset to estimate the probability that you will break up with your partner, given a few bits of information about your current relationship.

The Stanford data page says a 2017 release is on the way. I’m curious what, if anything, has changed in relationships between 2009 and now.


Nearly impossible to predict mass shootings with current data


Even if there were a statistical model that predicted a mass shooter with 99 percent accuracy, that still leaves a lot of false positives. And when you’re dealing with individuals on a scale of millions, that’s a big deal. Brian Resnick and Javier Zarracina for Vox break down the simple math with a cartoon.
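The arithmetic behind the false-positive problem is short enough to sketch. The numbers here are hypothetical and not taken from the Vox piece: a population of one million people, 10 of whom are actual threats, and "99 percent accuracy" read as 99 percent sensitivity and 99 percent specificity.

```python
def flagged_counts(population, actual_positives, sensitivity, specificity):
    """Expected true positives, false positives, and the share of flagged
    people who are actual positives (positive predictive value)."""
    negatives = population - actual_positives
    true_pos = actual_positives * sensitivity
    false_pos = negatives * (1 - specificity)
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


# Hypothetical inputs, chosen only to show the scale of the problem.
true_pos, false_pos, ppv = flagged_counts(
    population=1_000_000, actual_positives=10,
    sensitivity=0.99, specificity=0.99,
)
print(f"correctly flagged: {true_pos:.1f}")
print(f"wrongly flagged:   {false_pos:.1f}")
print(f"share of flagged people who are real threats: {ppv:.4%}")
```

Under these assumptions roughly ten thousand people are flagged by mistake for every ten flagged correctly, so fewer than 1 in 1,000 flags points at a real threat.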


What probability means in different fields


Statistically, probability ranges from 0 to 1 — impossible to definitely without a doubt. Math with Bad Drawings characterized what those values mean in various fields of expertise. This amuses me.


The Price is Right winner and cancer survivor calculates the odds

Elisa Long, a professor in Decisions, Operations, and Technology Management at the University of California, Los Angeles, was diagnosed with breast cancer. The Price is Right films a breast cancer awareness episode every August. Long wanted to get on that show. So she watched episodes during her 6-hour chemotherapy sessions to familiarize herself with games and rules, and most importantly, to maximize her odds of winning.

Long describes her thought process and probability calculations on her way to surviving cancer and winning it all on The Price is Right.

My goal in going on "The Price Is Right" was to play the best I possibly could given tremendous uncertainty about the outcome. The same was true for my breast cancer. The stakes were just higher.

Ah, the uncertainty of life.
