SAN FRANCISCO (Reuters) - Alphabet Inc’s (GOOGL.O) Google in May introduced a slick feature for Gmail that automatically completes sentences for users as they type. Tap out “I love” and Gmail might propose “you” or “it.”
But users are out of luck if the object of their affection is “him” or “her.”
Google’s technology will not suggest gender-based pronouns because the risk is too high that its “Smart Compose” technology might predict someone’s sex or gender identity incorrectly and offend users, product leaders revealed to Reuters in interviews.
Gmail product manager Paul Lambert said a company research scientist discovered the problem in January when he typed “I am meeting an investor next week,” and Smart Compose suggested a possible follow-up question: “Do you want to meet him?” instead of “her.”
Consumers have become accustomed to embarrassing gaffes from autocorrect on smartphones. But Google refused to take chances at a time when gender issues are reshaping politics and society, and critics are scrutinizing potential biases in artificial intelligence like never before.
“Not all ‘screw ups’ are equal,” Lambert said. Gender is “a big, big thing” to get wrong.
Getting Smart Compose right could be good for business. Demonstrating that Google understands the nuances of AI better than competitors is part of the company’s strategy to build affinity for its brand and attract customers to its AI-powered cloud computing tools, advertising services and hardware.
Gmail has 1.5 billion users, and Lambert said Smart Compose assists on 11 percent of messages worldwide sent from Gmail.com, where the feature first launched.
Smart Compose is an example of what AI developers call natural language generation (NLG), in which computers learn to write sentences by studying patterns and relationships between words in literature, emails and web pages.
A system shown billions of human sentences becomes adept at completing common phrases but is limited by generalities. Men have long dominated fields such as finance and science, for example, so the technology would conclude from the data that an investor or engineer is “he” or “him.” The issue trips up nearly every major tech company.
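A toy illustration of how such a skew could arise: the sketch below counts which pronoun appears alongside a noun in a tiny made-up corpus and predicts the most frequent one. The corpus, the function names and the counting approach are all assumptions for illustration. This is not how Smart Compose is built; it only shows how a frequency-driven predictor inherits whatever imbalance is in its training text.

```python
from collections import Counter, defaultdict

# Hypothetical corpus; the 3-to-1 skew toward "him" is invented for illustration.
CORPUS = [
    "i met an investor and thanked him",
    "the investor said we should call him",
    "an investor wrote so i emailed him",
    "the investor asked us to meet her",
]

PRONOUNS = ("him", "her")


def train(sentences):
    """Count how often each pronoun appears in a sentence with each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for pronoun in PRONOUNS:
            if pronoun in words:
                for word in words:
                    if word not in PRONOUNS:
                        counts[word][pronoun] += 1
    return counts


def predict_pronoun(counts, noun):
    """Return the pronoun seen most often with the noun, or None if unseen."""
    if noun not in counts:
        return None
    return counts[noun].most_common(1)[0][0]


model = train(CORPUS)
print(predict_pronoun(model, "investor"))
```

Because three of the four example sentences pair "investor" with "him," the predictor picks "him" every time, regardless of who the investor in a new message actually is.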
Lambert said the Smart Compose team of about 15 engineers and designers tried several workarounds, but none proved bias-free or worthwhile. They decided the best solution was the strictest one: Limit coverage. The gendered pronoun ban affects fewer than 1 percent of cases where Smart Compose would propose something, Lambert said.
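The "limit coverage" approach can be pictured as a filter that drops any candidate suggestion containing a gendered pronoun. The sketch below is a minimal illustration under stated assumptions: the pronoun list, the function names and the sample candidates are hypothetical, and Google has not published how its filter works.

```python
import re

# Assumed block list for illustration; the actual list used by Google is not public.
GENDERED_PRONOUNS = {"he", "him", "his", "she", "her", "hers", "himself", "herself"}


def is_allowed(suggestion):
    """Return True if the suggestion contains no gendered pronoun as a whole word."""
    words = re.findall(r"[a-z]+", suggestion.lower())
    return not any(word in GENDERED_PRONOUNS for word in words)


def filter_suggestions(candidates):
    """Keep only the candidates that pass the pronoun check, preserving order."""
    return [c for c in candidates if is_allowed(c)]


candidates = [
    "Do you want to meet him?",
    "Do you want to meet them?",
    "Sounds good",
]
print(filter_suggestions(candidates))
```

The check works on whole words, so a word such as "here" or "them" is not blocked merely because it contains "her" or "he." The trade-off is the one Lambert describes: the system stays silent in some cases where a suggestion would have been correct.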
“The only reliable technique we have is to be conservative,” said Prabhakar Raghavan, who oversaw engineering of Gmail and other services until a recent promotion.
Google’s decision to play it safe on gender follows some high-profile embarrassments for the company’s predictive technologies.
The company apologized in 2015 when the image recognition feature of its photo service labeled a black couple as gorillas. In 2016, Google altered its search engine’s autocomplete function after it suggested the anti-Semitic query “are jews evil” when users sought information about Jews.
Google has banned expletives and racial slurs from its predictive technologies, as well as mentions of its business rivals or tragic events.
The company’s new policy banning gendered pronouns also affected the list of possible responses in Google’s Smart Reply. That service allows users to respond instantly to text messages and emails with short phrases such as “sounds good.”
Google uses tests developed by its AI ethics team to uncover new biases. A spam and abuse team pokes at systems, trying to find “juicy” gaffes by thinking as hackers or journalists might, Lambert said.
Workers outside the United States look for local cultural issues. Smart Compose will soon work in four other languages: Spanish, Portuguese, Italian and French.
“You need a lot of human oversight,” said engineering leader Raghavan, because “in each language, the net of inappropriateness has to cover something different.”
Google is not the only tech company wrestling with the gender-based pronoun problem.
Agolo, a New York startup that has received investment from Thomson Reuters, uses AI to summarize business documents.
Its technology cannot reliably determine in some documents which pronoun goes with which name. So the summary pulls several sentences to give users more context, said Mohamed AlTantawy, Agolo’s chief technology officer.
He said longer copy is better than missing details. “The smallest mistakes will make people lose confidence,” AlTantawy said. “People want 100 percent correct.”
Yet, imperfections remain. Predictive keyboard tools developed by Google and Apple Inc (AAPL.O) propose the gendered “policeman” to complete “police” and “salesman” for “sales.”
Type the neutral Turkish phrase “one is a soldier” into Google Translate and it spits out “he’s a soldier” in English. So do translation tools from Alibaba (BABA.N) and Microsoft Corp (MSFT.O). Amazon.com Inc (AMZN.O) opts for “she” for the same phrase on its translation service for cloud computing customers.
AI experts have called on the companies to display a disclaimer and multiple possible translations.
Microsoft’s LinkedIn said it avoids gendered pronouns in its year-old predictive messaging tool, Smart Replies, to ward off potential blunders.
Alibaba and Amazon did not respond to requests to comment.
Warnings and limitations like those in Smart Compose remain the most-used countermeasures in complex systems, said John Hegele, integration engineer at Durham, North Carolina-based Automated Insights Inc, which generates news articles from statistics.
“The end goal is a fully machine-generated system where it magically knows what to write,” Hegele said. “There’s been a ton of advances made but we’re not there yet.”
Reporting by Paresh Dave; Editing by Greg Mitchell and Marla Dickerson