Examining the dataset driving machine learning systems

For Knowing Machines, an ongoing research project that examines the innards of machine learning systems, Christo Buschek and Jer Thorp turn their attention to LAION-5B. The large image dataset is used to train various systems, so it’s worth figuring out where it comes from and what it represents.

As artists, academics, practitioners, or as journalists, dataset investigation is one of the few tools we have available to gain insight and understanding into the most complex systems ever conceived by humans.

This is why advocating for dataset transparency is so important if AI systems are ever going to be accountable for their impacts in the world.

If articles covering similar themes have confused you or felt too handwavy, this one might clear things up. It describes the system and the steps more concretely, so you finish with a better idea of how these systems can end up with weird output.


National identity stereotypes through generative AI

For Rest of World, Victoria Turk breaks down bias in generative AI in the context of national identity.

Bias in AI image generators is a tough problem to fix. After all, the uniformity in their output is largely down to the fundamental way in which these tools work. The AI systems look for patterns in the data on which they’re trained, often discarding outliers in favor of producing a result that stays closer to dominant trends. They’re designed to mimic what has come before, not create diversity.

“These models are purely associative machines,” Pruthi said. He gave the example of a football: An AI system may learn to associate footballs with a green field, and so produce images of footballs on grass.

Between this convergence to stereotypes and the forced diversity from Google’s Gemini, has anyone tried coupling models with demographic data to find a place in between?
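
The “purely associative” behavior in the quote is easy to reproduce in miniature. Here is a toy sketch, with made-up caption counts and nothing like the internals of an actual diffusion model, showing how always returning the dominant association discards the outliers entirely.

```python
from collections import Counter

# Made-up training captions pairing a subject with a background,
# mirroring the football example: 90% grass, 10% everything else.
captions = (
    ["football on a green field"] * 90
    + ["football on a sandy beach"] * 7
    + ["football on a snowy street"] * 3
)

# Count which background co-occurs with "football".
backgrounds = Counter(caption.split("on a ")[1] for caption in captions)

# An "associative machine" that always returns the modal pattern will
# produce grass every single time, no matter how often you sample it.
top_background, count = backgrounds.most_common(1)[0]
print(top_background)             # green field
print(count / len(captions))      # 0.9, the dominant trend wins

# Sampling in proportion to the counts would at least surface the other
# 10 percent of backgrounds; argmax-style decoding never does.
```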


Racial bias in OpenAI GPT resume rankings

AI is finding its way into the HR workflow to sift through resumes. This seems like a decent idea on the surface, until you realize that the models the AI is built on lean toward certain demographics. For Bloomberg, Leon Yin, Davey Alba, and Leonardo Nicoletti experimented with OpenAI’s GPT models and found a bias:

When asked to rank those resumes 1,000 times, GPT 3.5 — the most broadly-used version of the model — favored names from some demographics more often than others, to an extent that would fail benchmarks used to assess job discrimination against protected groups. While this test is a simplified version of a typical HR workflow, it isolated names as a source of bias in GPT that could affect hiring decisions. The interviews and experiment show that using generative AI for recruiting and hiring poses a serious risk for automated discrimination at scale.

Yeah, that sounds about right.
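
For a concrete sense of what an experiment like this looks like, here is a rough sketch of a name-swap audit. The resume text, names, prompt wording, and repetition count are placeholders of my own, not Bloomberg’s actual setup, and it assumes the official OpenAI Python client with an API key in the environment.

```python
# Rough sketch of a name-swap audit in the spirit of the Bloomberg test.
# Resume text, names, prompt, and repetition count are placeholders;
# a real audit would use far more names and runs per demographic group.
from collections import Counter
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RESUME = "Financial analyst, 5 years experience, CFA, SQL, Excel."
NAMES = ["Emily Walsh", "Lakisha Washington", "Wei Chen", "Juan Hernandez"]

def rank_once() -> str:
    """Ask the model to pick the top candidate from identical resumes."""
    listing = "\n".join(f"{name}: {RESUME}" for name in NAMES)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Rank these candidates for a financial analyst role "
                       "and reply with only the top candidate's name.\n" + listing,
        }],
    )
    return response.choices[0].message.content.strip()

# With identical qualifications, the top pick should come out roughly
# uniform; a heavy skew toward some names is the signal being measured.
tally = Counter(rank_once() for _ in range(100))
print(tally)
```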


Coin flips might tend towards the same side they started

The classic coin flip is treated as a fair way to make decisions, assuming an even chance for heads or tails on each flip. However, František Bartoš was curious and recruited friends and colleagues to record over 350,000 flips. There appeared to be a slight bias.

For Scientific American, Shi En Kim reports:

The flipped coins, according to findings in a preprint study posted on arXiv.org, landed with the same side facing upward as before the toss 50.8 percent of the time. The large number of throws allows statisticians to conclude that the nearly 1 percent bias isn’t a fluke. “We can be quite sure there is a bias in coin flips after this data set,” Bartoš says.

There is probably more than one caveat here: even though there were a lot of flips, they came from only 48 people, and the bias varied across flippers.
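
Still, the arithmetic behind “isn’t a fluke” is straightforward to check. A quick binomial test, using an approximate flip count and ignoring the per-flipper variation, shows how unlikely a 50.8 percent same-side rate would be if flips were truly 50/50.

```python
# How surprising is 50.8% same-side in roughly 350,000 flips if the
# true probability were exactly 0.5? (Flip count is approximate and
# the variation across individual flippers is ignored here.)
from scipy.stats import binomtest

n_flips = 350_000
same_side = round(0.508 * n_flips)

result = binomtest(same_side, n=n_flips, p=0.5, alternative="greater")
print(f"observed rate: {same_side / n_flips:.3f}")
print(f"p-value under a fair 50/50 assumption: {result.pvalue:.2e}")
# The p-value is vanishingly small, which is why the authors can say
# the roughly 0.8-point bias is unlikely to be a statistical fluke.
```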

Of course, if you’re trying to get a call in your favor, maybe try to catch a glimpse of which side is up and choose accordingly. Couldn’t hurt.


Building fair algorithms

Emma Pierson and Kowe Kadoma, for Fred Hutchinson Cancer Center, offer a short, free course on Coursera covering practical steps for building fair algorithms:

Algorithms increasingly help make high-stakes decisions in healthcare, criminal justice, hiring, and other important areas. This makes it essential that these algorithms be fair, but recent years have shown the many ways algorithms can have biases by age, gender, nationality, race, and other attributes. This course will teach you ten practical principles for designing fair algorithms. It will emphasize real-world relevance via concrete takeaways from case studies of modern algorithms, including those in criminal justice, healthcare, and large language models like ChatGPT. You will come away with an understanding of the basic rules to follow when trying to design fair algorithms, and assess algorithms for fairness.

It’s geared for beginners and no coding is required.


Demonstration of bias in AI-generated images

For The Washington Post, Nitasha Tiku, Kevin Schaul and Szu Yu Chen demonstrate how AI generators lead to biased images. The systems use data slurped up from the internet to guess what pixels to show based on the text (i.e., a prompt) that you provide. So the images are often the result of calculations that favor the most common patterns in the source data rather than a representation of the real world.

To most people, the bias probably seems harmless with an assumption that the systems will improve. And that might be the case. But just you wait until an AI chart generator, based on the inputs of visualization critiques scraped from the internets, only produces bar charts with obscene amounts of white space no matter what you try. Then you’ll be sorry you didn’t care sooner.


Flawed Rotten Tomatoes ratings

Rotten Tomatoes aggregates movie reviews to spit out a freshness score for each film. There’s a problem though. For Vulture, Lane Brown reports on the flawed system:

But despite Rotten Tomatoes’ reputed importance, it’s worth a reminder: Its math stinks. Scores are calculated by classifying each review as either positive or negative and then dividing the number of positives by the total. That’s the whole formula. Every review carries the same weight whether it runs in a major newspaper or a Substack with a dozen subscribers.

If a review straddles positive and negative, too bad. “I read some reviews of my own films where the writer might say that he doesn’t think that I pull something off, but, boy, is it interesting in the way that I don’t pull it off,” says Schrader, a former critic. “To me, that’s a good review, but it would count as negative on Rotten Tomatoes.”

Studios have of course learned how to game the system, not to mention most of the site is now owned by movie ticket seller Fandango.
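
The formula in the quote really is that simple. A minimal sketch, with made-up review labels:

```python
# The Tomatometer as described in the quote: positives divided by total,
# with every review weighted equally. Review labels here are made up.
def tomatometer(reviews: list[bool]) -> float:
    """Each review is reduced to True (fresh) or False (rotten)."""
    return sum(reviews) / len(reviews)

# A rave in a major paper and a pan on a tiny blog cancel out exactly,
# and a mixed "interesting failure" review still has to land on one side.
reviews = [True, True, False, True, False]
print(f"{tomatometer(reviews):.0%}")  # 60%
```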


Generative AI exaggerates stereotypes

Perhaps to no one’s surprise, generative artificial intelligence models carry bias rooted in the data that drive them and in the choices of the people who design the systems. For Bloomberg, Leonardo Nicoletti and Dina Bass examined the extent of that bias through the lens of Stable Diffusion.

Most of the piece is a rundown of what Stable Diffusion shows, but the biggest tell is the chart that compares the generated images against reality.
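
The comparison behind that chart boils down to simple ratios: the share of a group in the generated images versus its share in the real world. A sketch with made-up numbers, not Bloomberg’s figures:

```python
# Compare each group's share of generated images against a real-world
# reference share. All numbers below are invented for illustration.
generated_share = {"group A": 0.80, "group B": 0.15, "group C": 0.05}
real_world_share = {"group A": 0.55, "group B": 0.30, "group C": 0.15}

for group, gen in generated_share.items():
    real = real_world_share[group]
    ratio = gen / real
    label = "over-represented" if ratio > 1 else "under-represented"
    print(f"{group}: generated {gen:.0%} vs. real {real:.0%} "
          f"({label}, x{ratio:.1f})")
```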


AI and the American smile

Jenka Gurfinkel discusses the appearance of the American smile in AI-generated images and its implications for interpreting data:

Every American knows to say “cheese” when taking a photo, and, therefore, so does the AI when generating new images based on the pattern established by previous ones. But it wasn’t always like this. More than a century after the first photograph was captured, a reference to “cheesing” for photos first appeared in a local Texas newspaper in 1943. “Need To Put On A Smile?” the headline asked, “Here’s How: Say ‘Cheese.’” The article quoted former U.S. ambassador Joseph E. Davies who explained that this influencer photo hack would be “Guaranteed to make you look pleasant no matter what you’re thinking […] it’s an automatic smile.” Davies served as ambassador under Franklin D. Roosevelt to the U.S.S.R.

My natural face is generously non-smiley, so this resonated.


Bias in AI-generated images

Lensa is an app that lets you retouch photos, and it recently added a feature that uses Stable Diffusion to generate AI-assisted portraits. While fun for some, the feature reveals biases in the underlying dataset. Melissa Heikkilä, for MIT Technology Review, describes problematic biases towards sexualized images for some groups:

Lensa generates its avatars using Stable Diffusion, an open-source AI model that generates images based on text prompts. Stable Diffusion is built using LAION-5B, a massive open-source data set that has been compiled by scraping images off the internet.

And because the internet is overflowing with images of naked or barely dressed women, and pictures reflecting sexist, racist stereotypes, the data set is also skewed toward these kinds of images.

This leads to AI models that sexualize women regardless of whether they want to be depicted that way, Caliskan says—especially women with identities that have been historically disadvantaged.
