Google’s AI Chatbot is trained by Humans who say they’re overworked, underpaid and frustrated

Internal documents show complex instructions for chatbot feedback that workers are asked to complete in minutes.

By:BLOOMBERG
| Updated on: Jul 13 2023, 07:27 IST

Google Pixel 7a launched: 5 points to know-Price, Camera, Battery and more

Google Bard — 1/5 Google Pixel 7a Price and Colours: Launched at the Google I/O event, Google Pixel 7a is priced at Rs. 43999 and will be made available in three colour options namely- Charcoal, Sea, and Snow. (Shaurya / HT Tech)

image caption — 2/5 Google Pixel 7a Chipset and Display: The Pixel 7a runs on the Google Tensor G2 chipset, the same chip that's in Pixel 7 and Pixel 7 Pro. Coming to the display, the phone gets a 6.1-inch FHD+ OLED Always-on display with up to 90 Hz refresh rate, making the phone quick in response. According to the company, the display is scratch resistant, featuring Corning Gorilla Glass and it is made with recycled aluminium, glass and plastic. Pixel 7a can handle water and dust with IP67 protection too. (Shaurya / HT Tech)

Google's Bard artificial intelligence chatbot will answer a question about how many pandas live in zoos quickly, and with a surfeit of confidence.

Ensuring that the response is well-sourced and based on evidence, however, falls to thousands of outside contractors from companies including Appen Ltd. and Accenture Plc, who can make as little as $14 an hour and labor with minimal training under frenzied deadlines, according to several contractors, who declined to be named for fear of losing their jobs.

You may be interested in

MobilesTablets Laptops

25% OFF

37% OFF

7% OFF

The contractors are the invisible backend of the generative AI boom that's hyped to change everything. Chatbots like Bard use computer intelligence to respond almost instantly to a range of queries spanning all of human knowledge and creativity. But to improve those responses so they can be reliably delivered again and again, tech companies rely on actual people who review the answers, provide feedback on mistakes and weed out any inklings of bias.

Also read

Looking for a smartphone? To check mobile finder click here.

It's an increasingly thankless job. Six current Google contract workers said that as the company entered a AI arms race with rival OpenAI over the past year, the size of their workload and complexity of their tasks increased. Without specific expertise, they were trusted to assess answers in subjects ranging from medication doses to state laws. Documents shared with Bloomberg show convoluted instructions that workers must apply to tasks with deadlines for auditing answers that can be as short as three minutes.

“As it stands right now, people are scared, stressed, underpaid, don't know what's going on,” said one of the contractors. “And that culture of fear is not conducive to getting the quality and the teamwork that you want out of all of us.”

Google has positioned its AI products as public resources in health, education and everyday life. But privately and publicly, the contractors have raised concerns about their working conditions, which they say hurt the quality of what users see. One Google contract staffer who works for Appen said in a letter to Congress in May that the speed at which they are required to review content could lead to Bard becoming a “faulty” and “dangerous” product.

Google has made AI a major priority across the company, rushing to infuse the new technology into its flagship products after the launch of OpenAI's ChatGPT in November. In May, at the company's annual I/O developers conference, Google opened up Bard to 180 countries and territories and unveiled experimental AI features in marquee products like search, email and Google Docs. Google positions itself as superior to the competition because of its access to “the breadth of the world's knowledge.”

“We undertake extensive work to build our AI products responsibly, including rigorous testing, training, and feedback processes we've honed for years to emphasize factuality and reduce biases,” Google, owned by Alphabet Inc., said in a statement. The company said it isn't only relying on the raters to improve the AI, and that there are a number of other methods for improving its accuracy and quality.

To prepare for the public using these products, workers said they started getting AI-related tasks as far back as January. One trainer, employed by Appen, was recently asked to compare two answers providing information about the latest news on Florida's ban on gender-affirming care, rating the responses by helpfulness and relevance. Workers are also frequently asked to determine whether the AI model's answers contain verifiable evidence. Raters are asked to decide whether a response is helpful based on six-point guidelines that include analyzing answers for things like specificity, freshness of information and coherence.

They are also asked to make sure the responses don't “contain harmful, offensive, or overly sexual content,” and don't “contain inaccurate, deceptive, or misleading information.” Surveying the AI's responses for misleading content should be “based on your current knowledge or quick web search,” the guidelines say. “You do not need to perform a rigorous fact check” when assessing the answers for helpfulness.

The example answer to “Who is Michael Jackson?” included an inaccuracy about the singer starring in the movie “Moonwalker” — which the AI said was released in 1983. The movie actually came out in 1988. “While verifiably incorrect,” the guidelines state, “this fact is minor in the context of answering the question, ‘Who is Michael Jackson?'”

Even if the inaccuracy seems small, “it is still troubling that the chatbot is getting main facts wrong,” said Alex Hanna, the director of research at the Distributed AI Research Institute and a former Google AI ethicist. “It seems like that's a recipe to exacerbate the way these tools will look like they're giving details that are correct, but are not,” she said.

Raters say they are assessing high-stakes topics for Google's AI products. One of the examples in the instructions, for instance, talks about evidence that a rater could use to determine the right dosages for a medication to treat high blood pressure, called Lisinopril.

Google said that some workers concerned about accuracy of content may not have been training specifically for accuracy, but for tone, presentation and other attributes it tests. “Ratings are deliberately performed on a sliding scale to get more precise feedback to improve these models,” the company said. “Such ratings don't directly impact the output of our models and they are by no means the only way we promote accuracy.”

Read the contract staffers' instructions for training Google's generative AI here:

Ed Stackhouse, the Appen worker who sent the letter to Congress, said in an interview that contract staffers were being asked to do AI labeling work on Google's products “because we're indispensable to AI as far as this training.” But he and other workers said they appeared to be graded for their work in mysterious, automated ways. They have no way to communicate with Google directly, besides providing feedback in a “comments” entry on each individual task. And they have to move fast. “We're getting flagged by a type of AI telling us not to take our time on the AI,” Stackhouse added.

Google disputed the workers' description of being automatically flagged by AI for exceeding time targets. At the same time, the company said that Appen is responsible for all performance reviews for employees. Appen did not respond to requests for comment. A spokesperson for Accenture said the company does not comment on client work.

Other technology companies training AI products also hire human contractors to improve them. In January, Time reported that laborers in Kenya, paid $2 an hour, had worked to make ChatGPT less toxic. Other tech giants, including Meta Platforms Inc., Amazon.com Inc. and Apple Inc. make use of subcontracted staff to moderate social network content and product reviews, and to provide technical support and customer service.

“If you want to ask, what is the secret sauce of Bard and ChatGPT? It's all of the internet. And it's all of this labeled data that these labelers create,” said Laura Edelson, a computer scientist at New York University. “It's worth remembering that these systems are not the work of magicians — they are the work of thousands of people and their low-paid labor.”

Google said in a statement that it “is simply not the employer of any of these workers. Our suppliers, as the employers, determine their working conditions, including pay and benefits, hours and tasks assigned, and employment changes – not Google.”

Staffers said they had encountered bestiality, war footage, child pornography and hate speech as part of their routine work assessing the quality of Google products and services. While some workers, like those reporting to Accenture, do have health care benefits, most only have minimal “counseling service” options that allow workers to phone a hotline for mental health advice, according to an internal website explaining some contractor benefits.

For Google's Bard project, Accenture workers were asked to write creative responses for the AI chatbot, employees said. They answered prompts on the chatbot — one day they could be writing a poem about dragons in Shakespearean style, for instance, and another day they could be debugging computer programming code. Their job was to file as many creative responses to the prompts as possible each work day, according to people familiar with the matter, who declined to be named because they weren't authorized to discuss internal processes.

For a short period, the workers were reassigned to review obscene, graphic and offensive prompts, they said. After one worker filed an HR complaint with Accenture, the project was abruptly terminated for the US team, though some of the writers' counterparts in Manila continued to work on Bard.

The jobs have little security. Last month, half a dozen Google contract staffers working for Appen received a note from management, saying their positions had been eliminated “due to business conditions.” The firings felt abrupt, the workers said, because they had just received several emails offering them bonuses to work longer hours training AI products. The six fired workers filed a complaint to the National Labor Relations Board in June. They alleged they were illegally terminated for organizing, because of Stackhouse's letter to Congress. Before the end of the month, they were reinstated to their jobs.

Google said the dispute was a matter between the workers and Appen, and that they “respect the labor rights of Appen employees to join a union.” Appen didn't respond to questions about its workers organizing.

Emily Bender, a professor of computational linguistics at the University of Washington, said the work of these contract staffers at Google and other technology platforms is “a labor exploitation story,” pointing to their precarious job security and how some of these kinds of workers are paid well below a living wage. “Playing with one of these systems, and saying you're doing it just for fun — maybe it feels less fun, if you think about what it's taken to create and the human impact of that,” Bender said.

The contract staffers said they have never received any direct communication from Google about their new AI-related work — it all gets filtered through their employer. They said they don't know where the AI-generated responses they see are coming from, nor where their feedback goes. In the absence of this information, and with the ever-changing nature of their jobs, workers worry that they're helping to create a bad product.

Some of the answers they encounter can be bizarre. In response to the prompt, “Suggest the best words I can make with the letters: k, e, g, a, o, g, w,” one answer generated by the AI listed 43 possible words, starting with suggestion No. 1: “wagon.” Suggestions 2 through 43, meanwhile, repeated the word “WOKE” over and over.

In another task, a rater was presented with a lengthy answer that began with, “As of my knowledge cutoff in September 2021.” That response is associated with OpenAI's large language model, called GPT-4. Though Google said that Bard “is not trained on any data from ShareGPT or ChatGPT,” raters have wondered why such phrasing appears in their tasks.

Bender said it makes little sense for large tech corporations to be encouraging people to ask an AI chatbot questions on such a broad range of topics, and to be presenting them as “everything machines.”

“Why should the same machine that is able to give you the weather forecast in Florida also be able to give you advice about medication doses?” she asked. “The people behind the machine who are tasked with making it be somewhat less terrible in some of those circumstances have an impossible job.”

Catch all the Latest Tech News, Mobile News, Laptop News, Gaming news, Wearables News , How To News, also keep up with us on Whatsapp channel,Twitter, Facebook, Google News, and Instagram. For our latest videos, subscribe to our YouTube channel.

First Published Date: 13 Jul, 07:27 IST

Tags: google artificial intelligence

NEXT ARTICLE BEGINS

Best Deals For You

Air purifiers to buy in India for healthy and clean air- Here are top 5 picks

Trending Gadgets

Mobiles Laptops Tablets

Google’s AI Chatbot is trained by Humans who say they’re overworked, underpaid and frustrated

Internal documents show complex instructions for chatbot feedback that workers are asked to complete in minutes.

You may be interested in

Tips & Tricks

iPhone 16 series, OnePlus 13, and other 5 flagship smartphones to launch in 2024

iPhone users will be able to transcribe voice recordings with iOS 18: Here is how it works

Protect your Aadhaar Card: How to check, lock, and report misuse effectively online

Wondering if your iPhone has hidden apps? Know how to find and manage them easily

Editor’s Pick

Trending Stories

iPhone SE 4 launch still months away, powerful mid-ranger likely to arrive in…

Apple’s ‘Glowtime’ Event on 9 September: These products, including iPhone SE 4, are not expected to launch

iPhone 16 Pro must improve in these 3 areas—And I say this after using iPhone 15 Pro for almost a year

Aadhaar Card Update for free online: Act before September 14 to avoid future fees

Anil Kapoor featured in TIME's 100 Most Influential People in AI cover, but Sam Altman misses out: Here’s why

Gaming

GTA 6: Take-Two CEO addresses AI, violence concerns, development timeline, and release speculations

GTA 6 remains on track for 2025 release as Take-Two reports strong earnings

GTA 6 gameplay leaks: Explorable buildings, destructibility, and more coming in fall 2025

Good news, PS5 owners: Sony is giving 5 free days of PS Plus after weekend outage

PlayStation network outage affects millions of GTA Online players across PS4 and PS5

Best Deals For You

Air purifiers to buy in India for healthy and clean air- Here are top 5 picks

5 best smartphones for your eyes: Xiaomi 13, Honor 90 to Motorola Edge Plus, check list

Top 10 smartwatch brands: Leading the market with innovation

Japanese toilets in India: TOTO washlet starting price, features and all details to know

Amazon Diwali Sale 2024: Get up to 40% off on ASUS Vivobook S 16 OLED to Lenovo Yoga Slim 6 and more laptops

Trending News