Meta Develops AI Speech Tool Voicebox, Holds Off Release Due to Misuse Concerns

Meta, a leading name in the tech industry, has made a significant leap in artificial intelligence (AI) by developing Voicebox, an advanced tool capable of generating lifelike speech.

Despite the tool’s promise, the company has chosen not to release it for now, citing concerns about misuse.

Voicebox

Announced on Friday, June 16, 2023, Voicebox can create convincing voice dialogue, opening up a range of possibilities, from enhancing communication across languages to delivering lifelike character dialogue in video games.

Unusually for a speech model, Voicebox can handle speech-generation tasks it was not specifically trained for.

All it requires is some text input and a short audio clip, which it then uses to generate entirely new speech in the voice of the source audio.

Introducing Voicebox, a new breakthrough generative speech system based on Flow Matching, a new method proposed by Meta AI. It can synthesize speech across six languages, perform noise removal, edit content, transfer audio style & more.

More details on this work & examples ⬇️
— Meta AI (@MetaAI) June 16, 2023

In a departure from traditional AI speech tools, Voicebox learns directly from raw audio and its corresponding transcription, eliminating the need for task-specific training on carefully curated datasets.

Like other generative AI work, Voicebox is able to create high-quality outputs from scratch or modify samples, but instead of images/video, it produces high-quality audio.

Unlike autoregressive models, it can modify any part of a given sample — not just the end of a clip.
— Meta AI (@MetaAI) June 16, 2023

Moreover, this impressive tool can produce audio in six languages – English, French, German, Spanish, Polish, and Portuguese – offering a realistic representation of natural human speech.

Potential Misuse and Meta’s Precautionary Approach

While Voicebox opens up exciting possibilities, Meta is fully aware of the potential misuse of such a tool.

The AI tool could be misused to create ‘deepfake’ dialogues, replicating the voices of public figures or celebrities in an unethical manner.

To counter this risk, Meta has developed AI classifiers, akin to spam filters, that can differentiate between human speech and speech generated by Voicebox.

The company is advocating for transparency in AI development, coupled with a firm commitment to responsible use. As part of this commitment, Meta has no current plans to make Voicebox publicly available, emphasizing the need to balance openness with responsibility.

Instead of launching a functional tool, Meta is offering audio samples and a research paper to help researchers understand its potential and work towards responsible use.

Global Concerns Over AI Misuse

Rapid advances in AI are causing concern among world leaders and international bodies, including the United Nations (UN).

As a recent UN report highlighted, deepfakes have been used in scams and to spread hate and misinformation online.

AI tools like Voicebox offer numerous possibilities, but they also underscore the importance of cautious development and responsible use.

As the field of AI continues to advance, these concerns will remain paramount.