Meta unveils speech-to-text, text-to-speech AI models for over 1,100 languages; even shares open source data

Meta has unveiled its speech-to-text, text-to-speech AI models for over 1,100 languages.

By: HT TECH
Updated on: May 23 2023, 20:30 IST
Meta says it will make the models open source, allowing developers to freely make new speech apps. (REUTERS)

Tech majors are locked in a fierce fight to deliver useful artificial intelligence (AI)-boosted products to users. While everyone knows about OpenAI's ChatGPT and Google's Bard, little of the kind had been available from Facebook co-founder Mark Zuckerberg's Meta Platforms. Until today, that is. The company has launched speech-to-text and text-to-speech AI models for over 1,100 languages, and the best part is that they are not linked to ChatGPT. Check out the Massively Multilingual Speech (MMS) project.

The biggest takeaway is that Meta has made the models and code open source, which could send the number of speech apps created across the world skyrocketing.

If all goes well in the real world, the usefulness of this is clear from Meta's statement: "Existing speech recognition models only cover approximately 100 languages — a fraction of the 7,000+ known languages spoken on the planet."

Data Crunching

Now, good machine-learning models require large amounts of labeled data — in this case, many thousands of hours of audio, along with transcriptions. For most languages, this data simply does not exist.

However, Meta has overcome that through its MMS project, which combined wav2vec 2.0, its pioneering work in self-supervised learning, and a new dataset that provides labeled data for over 1,100 languages and unlabeled data for nearly 4,000 languages.

Patting itself on the back, Meta said in a statement, "Our results show that the Massively Multilingual Speech models outperform existing models and cover 10 times as many languages."

It also revealed that, "Today, we are publicly sharing our models and code so that others in the research community can build upon our work. Through this work, we hope to make a small contribution to preserve the incredible language diversity of the world."

How Meta did it

The MMS project's first job was to collect audio data for thousands of languages, but the largest existing speech datasets covered at most 100 languages. The challenge was overcome by "turning to religious texts, such as the Bible, that have been translated in many different languages and whose translations have been widely studied for text-based language translation research".

The MMS project even created a dataset of readings of the New Testament in over 1,100 languages.

Sensing that the idea could be taken much further, the project also considered unlabeled recordings of various other Christian religious readings. This increased the number of languages available to over 4,000.

Bias, what bias?

Even though the data is from a specific domain, its biases do not seem to have entered the system. Although this text is often read by male speakers, Meta's analysis showed that its MMS models perform equally well for male and female voices.

And, importantly, though the content of the audio recordings is religious, MMS analysis shows that this does not overly bias the model to produce more religious language.

Meta credits this success to the use of the Connectionist Temporal Classification (CTC) approach, which it describes as far more constrained than large language models (LLMs) or sequence-to-sequence models for speech recognition.
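To give a feel for why CTC is so constrained, here is a minimal sketch of the final decoding step in a generic CTC system: the model emits one label per audio frame, and the decoder merges repeated labels and drops a special "blank" symbol. This is an illustration of the general technique, not Meta's code; the function name and the use of "_" as the blank symbol are assumptions made for the example.

```python
from itertools import groupby

# Hypothetical blank symbol; real systems reserve a dedicated token ID.
BLANK = "_"

def ctc_greedy_collapse(frame_labels):
    """Collapse per-frame labels into text using the standard CTC rule.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove blank symbols.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != BLANK)

# The blank between the two "l" runs is what preserves the double letter.
frames = list("hh_e_ll_lloo")
print(ctc_greedy_collapse(frames))  # prints "hello"
```

Because the output can only be built from what is heard frame by frame, a decoder of this kind has little room to invent fluent text that is not in the audio.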

How it was made usable

Meta preprocessed the data to make it usable by machine learning algorithms by training an alignment model on existing data in over 100 languages.

To reduce the error rate, Meta said, "We applied multiple rounds of this process and performed a final cross-validation filtering step based on model accuracy to remove potentially misaligned data."

Results obtained

Meta trained multilingual speech recognition models on over 1,100 languages. The consequence of this was explained by Meta in this way, "As the number of languages increases, performance does decrease, but only very slightly: Moving from 61 to 1,107 languages increases the character error rate by only about 0.4 percent but increases the language coverage by over 18 times."
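For readers unfamiliar with the metric, the sketch below shows one common way a character error rate is computed: the edit distance between the reference transcript and the model's output, divided by the length of the reference. The function names and sample strings are assumptions for illustration, not Meta's evaluation code. The last lines check the coverage figure using only the two language counts in the quote.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (free if characters match)
            ))
        prev = curr
    return prev[-1]

def character_error_rate(reference, hypothesis):
    """Edit distance divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)

# Toy example: 3 edits against a 6-character reference.
print(character_error_rate("kitten", "sitting"))  # prints 0.5

# Coverage ratio from the figures quoted above.
coverage_ratio = 1107 / 61
print(round(coverage_ratio, 1))  # prints 18.1
```

Passing lists of words instead of strings to `edit_distance` gives the word-level count that a word error rate is built on.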

MMS vs OpenAI Whisper

In a like-for-like comparison with Whisper, Meta said that models trained on the Massively Multilingual Speech data achieve half the word error rate while covering 11 times more languages.

First Published Date: 23 May, 20:20 IST
