VALL-E X: The Most Dangerous Scammy AI Voice Cloning Tool Now Open Source

An open-source implementation of Microsoft’s VALL-E X zero-shot TTS model has been released, giving users hands-on access to advanced text-to-speech synthesis and voice cloning. The release builds on Microsoft’s original research paper, which was published without the code or pre-trained models needed to try the system. With this release, developers can experiment with the model directly.

VALL-E X is a multilingual text-to-speech model introduced by Microsoft. To make it usable in practice, the team behind the open-source implementation reproduced the paper’s results and trained its own VALL-E X model. That model is now publicly available, so a much wider audience can try the technology.

Among its capabilities, VALL-E X supports Chinese and Japanese in addition to English, and it is described as performing strongly in all three languages.

VALL-E X clones a voice from a voice prompt, which can be built from another person’s voice, a character’s voice, or the user’s own. A speech sample of 3 to 10 seconds, together with its transcript, is all that is needed to create one. A graphical interface makes voice cloning and multilingual speech synthesis easier to use.
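The prompt requirements above (3 to 10 seconds of speech plus a transcript) can be checked before a clip is used. The following is a minimal sketch using only Python's standard library; the helper names `wav_duration_seconds` and `is_usable_prompt` are hypothetical and are not part of VALL-E X, and the sketch assumes the sample is an uncompressed WAV file.

```python
import wave

MIN_PROMPT_SECONDS = 3.0   # lower bound quoted in the article
MAX_PROMPT_SECONDS = 10.0  # upper bound quoted in the article


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_usable_prompt(duration_seconds: float, transcript: str) -> bool:
    """Hypothetical check: clip length in range and a non-empty transcript."""
    in_range = MIN_PROMPT_SECONDS <= duration_seconds <= MAX_PROMPT_SECONDS
    return in_range and bool(transcript.strip())
```

A caller would pass the result of `wav_duration_seconds("sample.wav")` and the matching transcript to `is_usable_prompt` before handing the clip to the cloning tool; the file name here is only an illustration.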

VALLE-X: Speak Foreign Languages with Your Own Voice

In this demo, a Chinese speaker's voice is captured and modified to speak English using a baseline sample, reflecting what the Chinese voice would sound like in English.

Demo page: https://t.co/ppTVWgYRQi pic.twitter.com/xcIVtCdKB3

— AI Breakfast (@AiBreakfast) March 8, 2023

Notably, VALL-E X runs on both CPU and GPU (PyTorch 2.0+, CUDA 11.7, and CUDA 12.0). A GPU with 6 GB of VRAM is sufficient to run the model without offloading.

VALL-E X is also presented as offering several advantages over the Bark model.

As for VRAM, a GPU with 6 GB is enough to run VALL-E X. For longer text, however, the combined length of the audio prompt and the generated audio must stay below 22 seconds.
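The 22-second limit is a simple budget shared between the prompt and the output, so a longer prompt leaves less room for generated speech. Here is a small sketch of that arithmetic; the function names are hypothetical and the only figure taken from the article is the 22-second total.

```python
MAX_TOTAL_SECONDS = 22.0  # combined prompt + output limit quoted in the article


def remaining_generation_seconds(prompt_seconds: float,
                                 max_total_seconds: float = MAX_TOTAL_SECONDS) -> float:
    """Seconds of generated audio that still fit alongside the prompt."""
    if prompt_seconds < 0:
        raise ValueError("prompt_seconds must be non-negative")
    return max(0.0, max_total_seconds - prompt_seconds)


def fits_budget(prompt_seconds: float,
                generated_seconds: float,
                max_total_seconds: float = MAX_TOTAL_SECONDS) -> bool:
    """True when prompt plus generated audio stays strictly below the limit."""
    return prompt_seconds + generated_seconds < max_total_seconds
```

For instance, a 10-second prompt leaves 12 seconds for generated audio, which is one reason to keep the prompt near the short end of the 3 to 10 second range when synthesizing longer sentences.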

The open-source implementation of VALL-E X is released under the MIT License, opening multilingual text-to-speech synthesis and voice cloning to wider experimentation.

Source: mPost
