AI model trained to learn through a child's eyes and ears in new research

By: PTI
| Updated on: Feb 03 2024, 12:22 IST
The AI model was trained to learn words and concepts through the eyes and ears of a single child. (REUTERS)

In new research, an AI model was trained to learn words and concepts through the eyes and ears of a single child, using headcam video recorded from when the child was six months old until their second birthday.

Researchers showed that the artificial intelligence (AI) model could learn a substantial number of words and concepts using limited slices of what the child experienced. Even though the video captured only one per cent of the child's waking hours, they said that was enough for genuine language learning.

"By using AI models to study the real language-learning problem faced by children, we can address classic debates about what ingredients children need to learn words - whether they need language-specific biases, innate knowledge, or just associative learning to get going," said Brenden Lake, an assistant professor in NYU's Center for Data Science and Department of Psychology and senior author of the study published in the journal Science.

To develop the model, the researchers first analysed a child's learning process captured on first-person video, recorded weekly with a light, head-mounted camera from the age of six months to 25 months.

The team found that the more than 60 hours of footage contained roughly a quarter of a million word instances - the number of words communicated, many of them repeatedly - each linked with video frames of what the child saw as those words were spoken.

The footage also included a wide range of different activities across development, including mealtimes, reading books, and the child playing, the team said.

The researchers then trained a multimodal neural network with two separate modules - one that took in single frames of the video and another that took in the transcribed form of the speech directed at the child.

These modules were combined and trained using an algorithm called contrastive learning, which aims to learn by making associations in the input data, they said.
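The idea can be illustrated with a minimal sketch. The report does not give the exact loss function, so the code below assumes a common symmetric contrastive loss computed on cosine similarities; the function names, the temperature value and the toy two-dimensional embeddings are all hypothetical. Each video frame should score highest against the utterance it was recorded with and lower against every other utterance in the batch.

```python
import math

def normalise(v):
    # Scale a non-zero vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_entropy_on_diagonal(logits):
    # Mean of -log softmax(row i)[i]: the correct match for row i is column i.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def contrastive_loss(frame_embs, utterance_embs, temperature=0.1):
    # Assumed loss: symmetric cross-entropy over frame-to-utterance similarities.
    # frame_embs[i] and utterance_embs[i] are taken to come from the same moment.
    frames = [normalise(v) for v in frame_embs]
    utts = [normalise(v) for v in utterance_embs]
    logits = [[dot(f, u) / temperature for u in utts] for f in frames]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(transposed))

# Toy embeddings, made up purely for illustration.
frames = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
misaligned = [aligned[1], aligned[2], aligned[0]]  # same utterances, wrong pairing

loss_aligned = contrastive_loss(frames, aligned)
loss_misaligned = contrastive_loss(frames, misaligned)
```

In this sketch the loss is lower when frames are paired with their own utterances than when the pairing is scrambled, which is the signal training would use to pull matching words and images together.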

For instance, they explained, when a parent said something in the child's view, some of the words used were likely to refer to something the child could see, which meant that comprehension was instilled by linking visual and linguistic cues.

"This provides the model a clue as to which words should be associated with which objects," said Wai Keen Vong, a research scientist at NYU's Center for Data Science.

"Combining these cues is what enables contrastive learning to gradually determine which words belong with which visuals and to capture the learning of a child's first words," said Vong.

After training the model, the team tested it by presenting the model with the target word and an array of four different image options, and asking it to select the image that matched the target word.
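A test of this kind can be sketched as follows, assuming the model picks the image whose embedding is most similar to the word's embedding. The report does not say how the choice was scored, so cosine similarity, the function names and the toy vectors are illustrative assumptions. With four options, random guessing would be right about a quarter of the time.

```python
import math

def cosine(a, b):
    # Cosine similarity between two non-zero vectors.
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

def choose_image(word_emb, image_embs):
    # Return the index of the candidate image most similar to the target word.
    scores = [cosine(word_emb, img) for img in image_embs]
    return max(range(len(scores)), key=scores.__getitem__)

def accuracy(trials):
    # Each trial is (word embedding, four image embeddings, index of the correct image).
    correct = sum(choose_image(word, images) == target
                  for word, images, target in trials)
    return correct / len(trials)

# Toy trials with made-up embeddings, for illustration only.
trials = [
    ([1.0, 0.0], [[0.0, 1.0], [0.9, 0.2], [-1.0, 0.0], [0.0, -1.0]], 1),
    ([0.0, 1.0], [[0.1, 0.9], [1.0, 0.0], [-1.0, 0.0], [0.0, -1.0]], 0),
]
score = accuracy(trials)
chance = 1 / 4  # expected accuracy when guessing among four images
```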

The model was able to learn a "substantial" number of the words and concepts present in the child's everyday experience, the researchers said.

Further, the model was able to generalise some of the words it learned to visual instances different from those it saw in its training data.

This, the researchers said, reflected an aspect of generalisation also seen in children when they are studied in a lab.

First Published Date: 03 Feb, 12:22 IST