Introduction to Deep Learning

Sebastian Raschka made 170 videos on deep learning, and you can watch all of the lessons now:

I just sat down this morning and organized all deep learning related videos I recorded in 2021. I am sure this will be a useful reference for my future self, but I am also hoping it might be useful for one or the other person out there.

It’s split into 19 lessons over five parts: introduction, mathematical foundations, neural networks, deep learning for computer vision, and generative models. Might be useful, even if you just want to learn more about what machine learning is.


Introduction to Modern Statistics

Introduction to Modern Statistics by Mine Cetinkaya-Rundel and Johanna Hardin is a free-to-download book:

Introduction to Modern Statistics is a re-imagining of a previous title, Introduction to Statistics with Randomization and Simulation book. The new book puts a heavy emphasis on exploratory data analysis (specifically exploring multivariate relationships using visualization, summarization, and descriptive models) and provides a thorough discussion of simulation-based inference using randomization and bootstrapping, followed by a presentation of the related Central Limit Theorem based approaches.

Read it in the browser or buy a print version. A good deal either way.


Billionaire tax rates

ProPublica obtained billionaires’ tax returns from an anonymous source. Combining the data with Forbes’ billionaire wealth estimates, ProPublica calculated a “true tax rate” for America’s 25 richest people:

The results are stark. According to Forbes, those 25 people saw their worth rise a collective $401 billion from 2014 to 2018. They paid a total of $13.6 billion in federal income taxes in those five years, the IRS data shows. That’s a staggering sum, but it amounts to a true tax rate of only 3.4%.

It’s a completely different picture for middle-class Americans, for example, wage earners in their early 40s who have amassed a typical amount of wealth for people their age. From 2014 to 2018, such households saw their net worth expand by about $65,000 after taxes on average, mostly due to the rise in value of their homes. But because the vast bulk of their earnings were salaries, their tax bills were almost as much, nearly $62,000, over that five-year period.

As you might guess, a lot of the disparity has to do with wealth held in unrealized capital gains. The other part is how the ultrawealthy still pay for everything while most of their money sits in investments, and how that factors into deductions.
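
The arithmetic behind the quoted 3.4% is just taxes paid divided by wealth growth over the same period. Here is a minimal sketch in Python using the figures quoted above. The function name `true_tax_rate` is made up for illustration, and applying the same ratio to the middle-class figures is my own rough comparison, not a number ProPublica states in the excerpt.

```python
# Figures quoted above from ProPublica, 2014 to 2018.
TOP_25_WEALTH_GROWTH = 401e9   # collective rise in net worth (Forbes estimates)
TOP_25_INCOME_TAXES = 13.6e9   # federal income taxes paid (IRS data)

# Approximate middle-class figures from the same excerpt
# ("about $65,000" and "nearly $62,000"), so treat these as rough.
MIDDLE_WEALTH_GROWTH = 65_000
MIDDLE_TAXES = 62_000


def true_tax_rate(taxes_paid, wealth_growth):
    """Taxes paid as a share of wealth growth over the same period."""
    return taxes_paid / wealth_growth


top_25_rate = true_tax_rate(TOP_25_INCOME_TAXES, TOP_25_WEALTH_GROWTH)
middle_ratio = true_tax_rate(MIDDLE_TAXES, MIDDLE_WEALTH_GROWTH)

print(f"Top 25: {top_25_rate:.1%}")          # Top 25: 3.4%
print(f"Middle class: {middle_ratio:.1%}")   # roughly 95%, given the rounded inputs
```

The middle-class ratio is only as good as the rounded inputs, but it shows the gap: taxes almost equal to wealth growth on one side, a few percent of it on the other.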


Converting Minecraft worlds to photorealistic ones using neural networks

Researchers from NVIDIA and Cornell University made GANcraft:

GANcraft aims at solving the world-to-world translation problem. Given a semantically labeled block world such as those from the popular game Minecraft, GANcraft is able to convert it to a new world which shares the same layout but with added photorealism. The new world can then be rendered from arbitrary viewpoints to produce images and videos that are both view-consistent and photorealistic.

This is impressive, but what amazes me more is that Minecraft is still very much a thing after all these years.


When you don’t own your face

For The New York Times, Kashmir Hill describes the implications of facial recognition becoming a thing that everyone just has:

Retail chains that get their hands on technology like this could try to use it to more effectively blacklist shoplifters, a use Rite Aid has already piloted (but abandoned). In recent years, surveillance companies casually rolled out automated license-plate readers that track cars’ locations, which are frequently used to solve crimes; such companies could easily add face reading as a feature. The advertising industry that tracks your every movement online would be able to do so in the real world: That scene from “Minority Report” in which Tom Cruise’s character flees through a shopping mall of targeted pop-up ads — “John Anderton, you could use a Guinness right about now!” — could be our future.

No thank you.


Public agencies using facial recognition software without oversight

An anonymous source supplied BuzzFeed News with usage data from Clearview AI, the facial recognition service that was banned by many police departments nationwide. Many agencies still used and/or tried it:

The data, provided by a source who declined to be named for fear of retribution, has limitations. When asked about it in March of this year, Clearview AI did not confirm or dispute its authenticity. Some 335 public entities in the dataset confirmed to BuzzFeed News that their employees had tested or worked with the software, while 210 organizations denied any use. Most entities — 1,161 — did not respond to questions about whether they had used it.

Still, the data indicates that Clearview has broadly distributed its facial recognition software to federal agencies and police departments nationwide, offering the app to thousands of police officers and government employees, who at times used it without training or oversight. Often, agencies that acknowledged their employees had used the software confirmed it happened without the knowledge of their superiors, let alone the public they serve.

BuzzFeed News also made a searchable table so you can see if your local agencies are on the list.


Technopolitics of the U.S. census

Dan Bouk and Danah Boyd wrote an essay on the data infrastructure and politics behind the decennial census:

Like all infrastructures, the U.S. decennial census typically lives in the obscurity afforded by technical complexity. It goes unnoticed outside of the small group of people who take pride in being called “census nerds.” It rumbles on, essentially invisible even to those who are counted. (Every 10 years, scores of people who answered the census forget they have done so and then insist that the count must have been plagued by errors since it had missed them, even though it had not.) Almost no one notices the processes that produce census data—unless something goes terribly wrong. Susan Leigh Star and Karen Ruhleder argue that this is a defining aspect of infrastructure: it “becomes visible upon breakdown.” In this paper, we unspool the stories of some technical disputes that have from time to time made visible the guts of the census infrastructure and consider some techniques that have been employed to maintain the illusion of a simple, certain count.

This process, whether we know what’s going on or not, in turn affects voices and democracy across the country. So it’s kind of important.


The Data Journalism Handbook

The Data Journalism Handbook: Towards a Critical Data Practice now has a second edition, updated from the original 2012 edition:

The Data Journalism Handbook: Towards a Critical Data Practice provides a rich and panoramic introduction to data journalism, combining both critical reflection and practical insight. It offers a diverse collection of perspectives on how data journalism is done around the world and the broader consequences of datafication in the news, serving as both a textbook and a sourcebook for this emerging field. With more than 50 chapters from leading researchers and practitioners of data journalism, it explores the work needed to render technologies and data productive for journalistic purposes.

Download the digital version for free or buy a physical copy.


Teaching statistical models with wine tasting

For The Pudding, Lars Verspohl provides an introduction to statistical models disguised as a lesson on finding good wine. Start with a definition of wine, which becomes a way to describe it with numbers. Define what makes a wine good. Find the wines that come closest to that definition.
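
Those three steps can be sketched in a few lines of Python. This is not Verspohl's method or data, only one simple way to read the idea: the wines, the three features, the 0 to 10 scale, and the "good wine" profile are all made up for illustration, and closeness is measured with plain Euclidean distance.

```python
import math

# Step 1: describe each wine with numbers (hypothetical values, 0 to 10 scale).
wines = {
    "Wine A": {"sweetness": 2.0, "acidity": 6.0, "body": 7.0},
    "Wine B": {"sweetness": 7.0, "acidity": 3.0, "body": 4.0},
    "Wine C": {"sweetness": 3.0, "acidity": 5.0, "body": 6.0},
}

# Step 2: define what makes a wine good (an assumed target profile).
good_wine = {"sweetness": 3.0, "acidity": 5.5, "body": 6.5}


def distance(wine, target):
    """Euclidean distance between a wine and the target profile."""
    return math.sqrt(sum((wine[k] - target[k]) ** 2 for k in target))


# Step 3: rank wines by how close they come to the definition of good.
ranked = sorted(wines, key=lambda name: distance(wines[name], good_wine))

for name in ranked:
    print(f"{name}: {distance(wines[name], good_wine):.2f}")
```

With these made-up numbers, Wine C lands closest to the target and Wine B lands farthest. Change the definition of good and the ranking changes with it, which is the point of the lesson.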


Statistical limits

Reviewing Deborah Stone’s Counting and Tim Harford’s The Data Detective, Hannah Fry discusses the usefulness of data and its limitations for The New Yorker:

Numbers are a poor substitute for the richness and color of the real world. It might seem odd that a professional mathematician (like me) or economist (like Harford) would work to convince you of this fact. But to recognize the limitations of a data-driven view of reality is not to downplay its might. It’s possible for two things to be true: for numbers to come up short before the nuances of reality, while also being the most powerful instrument we have when it comes to understanding that reality.

This builds on Fry’s similarly themed article from a couple of years ago, as well as her book Hello World.

Data is limited, and the better we understand those limitations, the better use we can get out of what’s there.
