Regulating deepfakes

It continues to get easier to take someone’s face and put that person in compromising situations. For The Markup, Mariel Padilla reports on states trying to catch up with the fast-developing technology.

Carrie Goldberg, a lawyer who has represented victims of nonconsensual porn (commonly referred to as revenge porn) for more than a decade, said she only recently started hearing from victims of computer-generated images.

“My firm has been seeing victims of deepfakes for probably about five years now, and it’s mostly been celebrities,” Goldberg said. “Now, it’s becoming children doing it to children to be mean. It’s probably really underreported because victims might not know that there’s legal recourse, and it’s not entirely clear in all cases whether there is.”

The internet is going to get very weird and very confusing, especially for those who can’t fathom how a photo, a video, or audio could be fake when it seems so real. Scammers’ imaginations must be running wild these days.


OpenAI previews voice synthesis

OpenAI previewed Voice Engine, a model that generates synthetic speech mimicking a person's voice from just a 15-second audio sample:

We first developed Voice Engine in late 2022, and have used it to power the preset voices available in the text-to-speech API as well as ChatGPT Voice and Read Aloud. At the same time, we are taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse. We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities. Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.

They describe worthwhile use cases, such as language translation and giving a voice to people who are non-verbal, but oh boy, the authenticity of online things is going to get tricky very soon.


Examining the dataset driving machine learning systems

For Knowing Machines, an ongoing research project that examines the innards of machine learning systems, Christo Buschek and Jer Thorp turn attention to LAION-5B. The large image dataset is used to train various systems, so it’s worth figuring out where the dataset comes from and what it represents.

As artists, academics, practitioners, or as journalists, dataset investigation is one of the few tools we have available to gain insight and understanding into the most complex systems ever conceived by humans.

This is why advocating for dataset transparency is so important if AI systems are ever going to be accountable for their impacts in the world.

If articles covering similar themes have left you confused or seemed too handwavy, this one might clear things up. It describes the system and its steps more concretely, so you finish with a better idea of how these systems can end up producing weird output.


National identity stereotypes through generative AI

For Rest of World, Victoria Turk breaks down bias in generative AI in the context of national identity.

Bias in AI image generators is a tough problem to fix. After all, the uniformity in their output is largely down to the fundamental way in which these tools work. The AI systems look for patterns in the data on which they’re trained, often discarding outliers in favor of producing a result that stays closer to dominant trends. They’re designed to mimic what has come before, not create diversity.

“These models are purely associative machines,” Pruthi said. He gave the example of a football: An AI system may learn to associate footballs with a green field, and so produce images of footballs on grass.
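Pruthi's "associative machines" point can be sketched with a toy co-occurrence count. The captions below are made up for illustration, but the mechanism is the same: the dominant context in the training data wins, and outliers barely register.

```python
from collections import Counter

# Toy caption set: most training images of footballs show grass,
# with a single outlier showing a street
captions = [
    "football on grass", "football on grass", "football on grass",
    "football on grass", "football on street",
]

# Count which context words co-occur with "football"
context = Counter()
for cap in captions:
    words = cap.split()
    if "football" in words:
        context.update(w for w in words if w not in ("football", "on"))

# A generator that samples by association would overwhelmingly pick "grass"
print(context.most_common(1))
```

A real image generator learns associations in a high-dimensional embedding space rather than by counting words, but the convergence toward the most frequent pattern works the same way.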

Between this convergence to stereotypes and the forced diversity from Google’s Gemini, has anyone tried coupling models with demographic data to find a place in between?


Flipbook Experiment, like the Telephone game but visual

This looks fun. The Pudding is running an experiment that functions like a visual version of Telephone. In Telephone, the first person whispers a message to their neighbor and the message is passed along until you end with a message that is completely different. Instead of a message, you have a sketch that each new person traces.

I traced something around frame 200 and the sketch looked like a scribble already. I’m curious where this ends.
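A hand-wavy way to see why frame 200 already looks like a scribble: if each tracer introduces a small random error, the errors accumulate like a random walk, so the expected drift from the original sketch grows with roughly the square root of the number of generations. A toy one-dimensional sketch:

```python
import random

rng = random.Random(42)  # fixed seed so the run is reproducible
position = 0.0           # stand-in for one point of the original sketch
history = [position]

for generation in range(200):
    position += rng.gauss(0, 1)  # each tracer nudges the point a little
    history.append(position)

# After 200 generations the point has typically wandered
# on the order of sqrt(200), roughly 14 units, from where it began
print(f"drift after 200 traces: {abs(position):.1f}")
```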


Language-based AI to chat with her dead husband

For the past few years, Laurie Anderson has been using an AI chatbot to talk to her husband, who died in 2013. For the Guardian, Walter Marsh reports:

In one experiment, they fed a vast cache of Reed’s writing, songs and interviews into the machine. A decade after his death, the resulting algorithm lets Anderson type in prompts before an AI Reed begins “riffing” written responses back to her, in prose and verse.

“I’m totally 100%, sadly addicted to this,” she laughs. “I still am, after all this time. I kind of literally just can’t stop doing it, and my friends just can’t stand it – ‘You’re not doing that again are you?’

“I mean, I really do not think I’m talking to my dead husband and writing songs with him – I really don’t. But people have styles, and they can be replicated.”

One part of me feels like this isn’t the way to preserve a memory of someone who is gone, but the other part feels that I would do the same thing if I were in her situation and had the opportunity.

See also the man who trained an AI chatbot with old texts from his dead fiancee.


Love: math or magic?

This American Life tells tales as old as time:

When it comes to finding love, there seems to be two schools of thought on the best way to go about it. One says, wait for that lightning-strike magic. The other says, make a calculation and choose the best option available. Who has it right?

Spoiler alert: there is a mix of practicality and feel. They each inform the other.


DNA face to facial recognition in attempt to find suspect

In an effort to find a suspect in a 1990 murder, police in 2017 requested to run a 3-D rendering of a face, generated from crime-scene DNA, through facial recognition software. For Wired, Dhruv Mehrotra reports:

The detective’s request to run a DNA-generated estimation of a suspect’s face through facial recognition tech has not previously been reported. Found in a trove of hacked police records published by the transparency collective Distributed Denial of Secrets, it appears to be the first known instance of a police department attempting to use facial recognition on a face algorithmically generated from crime-scene DNA.

This seems like a natural progression, but it should be easy to see how pairing the two technologies could cause all sorts of issues when someone's face is poorly reconstructed and then misclassified by facial recognition. What's the confidence interval equivalent for a face?
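To make the compounding-error worry concrete, here's a toy sketch. The vectors and the four-dimensional space are made up for illustration; real face-recognition systems compare learned embeddings with hundreds of dimensions. The point is that once the DNA-generated face differs from the true face, a database hit can end up closer to the reconstruction than the real person is.

```python
import math

def cosine_similarity(a, b):
    """Standard cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, purely illustrative
true_face = [0.9, 0.1, 0.3, 0.7]
dna_render = [0.6, 0.4, 0.5, 0.5]    # error introduced by the DNA reconstruction
database_hit = [0.5, 0.5, 0.6, 0.4]  # unrelated face the render happens to resemble

# The unrelated face scores higher against the render than the true face does
print(cosine_similarity(true_face, dna_render))
print(cosine_similarity(dna_render, database_hit))
```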


Coin flips might tend toward the same side they started on

The classic coin flip is treated as a fair way to make decisions, assuming an even chance for heads or tails on each flip. However, František Bartoš was curious and recruited friends and colleagues to record over 350,000 flips. There appeared to be a slight bias.

For Scientific American, Shi En Kim reports:

The flipped coins, according to findings in a preprint study posted on arXiv.org, landed with the same side facing upward as before the toss 50.8 percent of the time. The large number of throws allows statisticians to conclude that the nearly 1 percent bias isn’t a fluke. “We can be quite sure there is a bias in coin flips after this data set,” Bartoš says.

There is probably more than one caveat here: even though there were a lot of flips, they came from only 48 people, and the bias varied across flippers.
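A quick back-of-the-envelope check of the "isn't a fluke" claim, assuming roughly 350,000 flips and a fair-coin null of 50 percent: the observed 50.8 percent sits about nine standard errors away from fair, which is why the sample size matters so much.

```python
import math

n = 350_000      # approximate number of recorded flips
p_hat = 0.508    # observed same-side proportion
p_null = 0.5     # fair-coin expectation

# Standard error of a proportion under the null, then a z-score
se = math.sqrt(p_null * (1 - p_null) / n)
z = (p_hat - p_null) / se

# Two-sided p-value via the normal approximation
p_value = math.erfc(z / math.sqrt(2))

print(f"z = {z:.1f}, p = {p_value:.2e}")
```

With only a few thousand flips, the same 0.8-point deviation would be indistinguishable from noise; at 350,000 it's overwhelming.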

Of course, if you’re trying to get a call in your favor, maybe try to catch a glimpse of which side is up and choose accordingly. Couldn’t hurt.


AI-based things in 2023

There were many AI-based things in 2023. Simon Willison outlined what we learned over the year:

The most surprising thing we’ve learned about LLMs this year is that they’re actually quite easy to build.

Intuitively, one would expect that systems this powerful would take millions of lines of complex code. Instead, it turns out a few hundred lines of Python is genuinely enough to train a basic version!

What matters most is the training data. You need a lot of data to make these things work, and the quantity and quality of the training data appears to be the most important factor in how good the resulting model is.
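Willison's "few hundred lines of Python" claim is about real transformer training loops, but the spirit scales down even further: a character-level bigram model, the simplest possible "language model," fits in a couple dozen lines and makes it obvious why the training data is everything the model knows. A minimal sketch:

```python
import random
from collections import defaultdict

def train_bigram(text):
    """Count character-to-next-character transitions in the training text."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, nxt in zip(text, text[1:]):
        counts[current][nxt] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Sample characters one at a time, weighted by observed transitions."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        nexts = counts.get(out[-1])
        if not nexts:
            break  # no observed continuation for this character
        chars = list(nexts)
        weights = [nexts[c] for c in chars]
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

# The model can only ever echo patterns present in its training data
corpus = "the quick brown fox jumps over the lazy dog " * 10
model = train_bigram(corpus)
print(generate(model, "t", 30))
```

An LLM replaces the count table with a neural network over long contexts, but the dependence on training data is the same: feed it a different corpus and you get a different model.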
