OpenAI unveils ‘Voice Engine’: Mimics human speech with just 15-second audio samples

OpenAI introduces Voice Engine, a groundbreaking tool that can replicate voices from just a 15-second sample, raising concerns over potential misuse.

By: MD IJAJ KHAN
| Updated on: Mar 30 2024, 13:20 IST
OpenAI unveils 'Voice Engine': A remarkable tool cloning voices with just a 15-second audio snippet. (AP)

OpenAI, known for AI tools such as its video generator Sora, has introduced 'Voice Engine,' a voice cloning tool. The audio model can replicate the nuances of human speech, including intonation and distinctive speech patterns, from just a 15-second sample of the original voice. Despite the anticipation, OpenAI is holding back a wide release, citing concerns over potential misuse and the proliferation of fake content online.

Remarkable Efficiency and Precision

"Incredibly, our Voice Engine can craft emotive and lifelike voices using just a single 15-second sample," the company stated in a recent blog post.

OpenAI's Voice Engine Versus Industry Standards

In contrast, existing AI voice platforms like ElevenLabs typically require longer samples, with their instant voice cloning tool necessitating at least one minute of audio for operation. For optimal results, approximately 10 minutes of continuous speech are recommended, particularly for professional-grade services.

OpenAI showcased the capabilities of Voice Engine through various demonstrations, including a poignant example where the voice of a young patient, who had lost much of her speaking ability due to a brain tumour, was replicated using an older recording from a school project. The technology enabled her to communicate using her own voice, a feat made possible through collaboration with Lifespan, a nonprofit associated with Brown University's medical school.

Moreover, OpenAI revealed partnerships with organisations like HeyGen, demonstrating how Voice Engine facilitates natural-sounding translations of speech from one language to another.

According to OpenAI, Voice Engine was first developed in late 2022 and already powers the preset voices available in OpenAI's text-to-speech API, as well as ChatGPT's Voice and Read Aloud features. With the voice cloning capability, the company is proceeding with caution before a wider release.

First Published Date: 30 Mar, 13:19 IST