OpenAI previews voice synthesis

OpenAI previewed Voice Engine, a model to generate voices that mimic, using just a 15-second audio sample:

We first developed Voice Engine in late 2022, and have used it to power the preset voices available in the text-to-speech API as well as ChatGPT Voice and Read Aloud. At the same time, we are taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse. We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities. Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.

They provide worthwhile use cases, such as language translation and providing a voice to those who are non-verbal, but oh boy, the authenticity of online things is going to get tricky very soon.

Tags: , , ,

Racial bias in OpenAI GPT resume rankings

AI is finding its way into the HR workflow to sift through resumes. This seems like a decent idea on the surface, until you realize that the models that the AI is built on lean more towards certain demographics. For Bloomberg, Leon Yin, Davey Alba, and Leonardo Nicoletti experimented with the OpenAI GPT showing a bias:

When asked to rank those resumes 1,000 times, GPT 3.5 — the most broadly-used version of the model — favored names from some demographics more often than others, to an extent that would fail benchmarks used to assess job discrimination against protected groups. While this test is a simplified version of a typical HR workflow, it isolated names as a source of bias in GPT that could affect hiring decisions. The interviews and experiment show that using generative AI for recruiting and hiring poses a serious risk for automated discrimination at scale.

Yeah, that sounds about right.

Tags: , , , ,

News organizations blocking OpenAI

Ben Welsh has a running list of the news organizations blocking OpenAI crawlers:

In total, 532 of 1,147 news publishers surveyed by the homepages.news archive have instructed OpenAI, Google AI or the non-profit Common Crawl to stop scanning their sites, which amounts to 46.4% of the sample.

The three organizations systematically crawl web sites to gather the information that fuels generative chatbots like OpenAI’s ChatGPT and Google’s Bard. Publishers can request that their content be excluded by opting out via the robots.txt convention.

On the web, it used to be that you would write or make something and there would be a link to the thing. Other websites could link to the thing, and people would go to the place with the thing. With this recent AI wave, a lot of the thing ends up elsewhere and no one sees the original place.

Fun times ahead.

Tags: , ,

Manual removal of harmful text to train AI models

AI training data comes from the internet, and as we know but maybe forget sometimes, there are harmful areas that are terrible for people. For Time, Billy Perrigo reports on how OpenAI outsourced a firm to label such data, which required people to read disturbing text:

To build that safety system, OpenAI took a leaf out of the playbook of social media companies like Facebook, who had already shown it was possible to build AIs that could detect toxic language like hate speech to help remove it from their platforms. The premise was simple: feed an AI with labeled examples of violence, hate speech, and sexual abuse, and that tool could learn to detect those forms of toxicity in the wild. That detector would be built into ChatGPT to check whether it was echoing the toxicity of its training data, and filter it out before it ever reached the user. It could also help scrub toxic text from the training datasets of future AI models.

To get those labels, OpenAI sent tens of thousands of snippets of text to an outsourcing firm in Kenya, beginning in November 2021. Much of that text appeared to have been pulled from the darkest recesses of the internet.

Tags: , , , ,

Data visualization(-ish) in the style of famous artists

DALL-E is an AI system from OpenAI that creates images from text. You can enter very random things and get very real-looking output. So of course someone entered “data visualization in the style of insert-anything-here” for a wide array of inspiration. I’m partial to the bar chart made out of cake.

Tags: , ,

Neural network creates images from text

OpenAI trained a neural network that they call DALL·E with a dataset of text and image pairs. So now the neural network can take text input and output random combinations of descriptors and objects, like a purse in the style of Rubik’s cube or a teapot imitating Pikachu.

Tags: , , ,

Neural network generates convincing songs by famous singers

Jukebox from OpenAI is a generative model that makes music in the same styles as many artists you’ll probably recognize:

To train this model, we crawled the web to curate a new dataset of 1.2 million songs (600,000 of which are in English), paired with the corresponding lyrics and metadata from LyricWiki. The metadata includes artist, album genre, and year of the songs, along with common moods or playlist keywords associated with each song. We train on 32-bit, 44.1 kHz raw audio, and perform data augmentation by randomly downmixing the right and left channels to produce mono audio.

A lot of the time, generative music sounds artificial and mechanical, but these results are pretty convincing. I mean you can still tell it’s not from the artist, but many of the examples are listenable.

OpenAI also published the code.

Tags: , , ,