Apple AI models trained with YouTube content of MrBeast, MKBHD, PewDiePie and others without permission

Apple's recently unveiled AI model OpenELM was also trained via content from videos of famous YouTubers.

By: AYUSHMANN CHAWLA
Updated on: Jul 18 2024, 07:09 IST
Apple revealed upcoming iPhone and MacBook features in June, a few weeks after it unveiled the OpenELM AI model, which was trained using content from popular YouTubers. (AP)

Apple, Nvidia, Salesforce and a few other big tech companies have been accused of training their AI models on YouTube videos from famous creators. According to a report by Wired, the tech giants used subtitle files that a non-profit had downloaded from over 170,000 videos by popular creators, including MrBeast, Marques Brownlee (MKBHD), PewDiePie, John Oliver and Jimmy Kimmel, without their consent. Subtitle files are effectively transcripts of the video content. While many may see this as a violation of privacy and of YouTube’s rules, it also raises a major concern about potential copyright violation.

How Apple, Nvidia got the data

The report says an investigation by Proof News found that several tech giants used the subtitles of thousands of YouTube videos to train AI. YouTube has a policy that does not allow anyone to harvest material from its platform without permission. However, the big tech players reportedly sourced the data from EleutherAI, a platform that says it helps small developers and academics train AI models. The data extracted by EleutherAI appears to have also been used by companies such as Apple and Nvidia.

A research paper by EleutherAI states that its collection of datasets, called the Pile, is open and accessible to anyone with enough computing power and storage space to use it. The research paper and posts from big tech companies also show how these firms, valued in the hundreds of billions and trillions of dollars, used the Pile to train AI. Documents also show that Apple used EleutherAI’s Pile to train its high-profile model OpenELM, which debuted in April.

Is Apple responsible for the violation?

It is worth noting that YouTube’s terms and conditions were broken not by Apple but by EleutherAI, which sourced the data from the Google-owned video streaming platform and distributed it to numerous developers via the Pile. This is not the first instance of data being sourced illegally to train AI systems. AI chatbots can often be seen plagiarizing entire passages of text when asked about niche topics.

First Published Date: 17 Jul, 13:43 IST