Men often judge women by their looks. It turns out computers do too.
When US and European researchers submitted images of members of Congress to Google's cloud image recognition service, photos of women received three times as many annotations related to physical appearance as photos of men did. The top labels for men were "official" and "businessman"; for women they were "smile" and "chin".
“This leads to women receiving a stereotype with a lower status: women are there to look pretty and men are business leaders,” says Carsten Schwemmer, a postdoctoral fellow at the GESIS Leibniz Institute for the Social Sciences in Cologne. He worked on the study, published last week, with researchers from New York University, American University, University College Dublin, the University of Michigan, and the nonprofit California YIMBY.
The researchers ran their image recognition test against Google's AI image service and those of its rivals Amazon and Microsoft. Crowdworkers were paid to review the labels those services applied to official photos of lawmakers and to images those lawmakers had tweeted.
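A test like this can be reproduced in a few lines of code. Below is a minimal sketch using Google's Cloud Vision Python client (version 2 or later), assuming valid credentials are configured; the filename is a placeholder:

```python
# Minimal sketch of a label-detection request to Google's Cloud Vision API.
# Assumes the google-cloud-vision client library (v2+) and configured credentials.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# "lawmaker.jpg" is a placeholder for an official portrait.
with open("lawmaker.jpg", "rb") as f:
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)

# Each annotation pairs a label ("smile", "businessperson", ...) with a confidence score.
for label in response.label_annotations:
    print(f"{label.description}: {label.score:.2f}")
```

Tallying how often appearance-related labels come back for photos of women versus men is then a simple counting exercise.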
Google's AI image recognition service saw male lawmakers like Senator Steve Daines as businesspeople, but tagged lawmakers like Lucille Roybal-Allard with terms related to their looks.
The AI services generally saw things that human reviewers could also see in the photos. But they tended to notice different things about women and men, with women far more likely to be characterized by their looks. Female legislators were often labeled "girl" and "beauty". The services also had a tendency not to see women at all, failing to detect them more often than they failed to detect men.
The study adds to evidence that algorithms do not see the world with mathematical detachment but instead tend to replicate or even amplify historical cultural biases. It was inspired in part by a 2018 project called Gender Shades, which showed that Microsoft's and IBM's AI cloud services were highly accurate at identifying the gender of white men but highly inaccurate at identifying the gender of Black women.
The new study was published last week, but the researchers had collected the data from the AI services in 2018. Experiments by WIRED using official photos of 10 men and 10 women from the California State Senate suggest the study's findings still hold.
Amazon's image recognition service, Rekognition, tagged images of some female California senators, including Ling Ling Chang, a Republican, as "girl" or "child," but did not use similar terms for male lawmakers.
Wired Staff via Amazon
All 20 lawmakers smile in their official photos. Google's top suggested labels noted a smile for only one of the men but for seven of the women. The company's AI vision service labeled all 10 men "businessman," often also "official" or "employee." Only five of the women senators received one or more of those terms. The women also received appearance-related labels such as "skin," "hairstyle," and "neck" that were not applied to men.
The services from Amazon and Microsoft appeared to show less overt bias, although Amazon was more than 99 percent confident that two of the 10 women senators were either a "girl" or a "child." It did not suggest that any of the 10 men were minors. Microsoft's service identified the gender of all the men but only eight of the women; it labeled one woman a man and tagged no gender for another.
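Confidence figures like those come straight from the services' APIs. As a rough illustration, Rekognition's label detection can be queried with the boto3 library along the following lines; the filename and confidence threshold are illustrative, and AWS credentials are assumed to be configured:

```python
# Sketch of a label request to Amazon Rekognition via boto3.
# Filename and threshold are illustrative; AWS credentials are assumed.
import boto3

client = boto3.client("rekognition")

with open("senator.jpg", "rb") as f:
    response = client.detect_labels(
        Image={"Bytes": f.read()},
        MinConfidence=80,  # only return labels with at least 80 percent confidence
    )

# Labels such as "Girl" or "Child" arrive with a confidence percentage attached.
for label in response["Labels"]:
    print(f"{label['Name']}: {label['Confidence']:.1f}%")
```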
Google turned off gender detection in its AI vision service earlier this year, saying that gender cannot be inferred from a person's appearance. Tracy Frey, executive director of responsible AI at Google's cloud division, says the company continues to work on reducing bias and welcomes outside input. "We always strive to get better and continue to work with outside stakeholders – such as academic researchers – to advance our work in this area," she says. Amazon and Microsoft declined to comment; both companies' services recognize gender only as binary.
The US-European study was inspired in part by what happened when the researchers fed Google's vision service a striking, award-winning image from Texas of a Honduran toddler crying as a US Border Patrol agent detained her mother. Google's AI suggested labels including "fun," with a score of 77 percent, higher than the 52 percent it assigned to the label "kid." WIRED got the same suggestion after uploading the image to Google's service on Wednesday.
Schwemmer and his colleagues began experimenting with Google's service in hopes that it could help them measure patterns in how people use images to talk about politics online. The gender bias he later helped uncover in the image services convinced him that the technology is not ready for researchers to use this way, and that companies relying on such services could face unwelcome consequences. "You could get a completely wrong picture of reality," he says. A company using a skewed AI service to organize a large photo collection might inadvertently obscure businesswomen, indexing them instead by their smiles.
When this picture was named World Press Photo of the Year in 2019, a judge noted that it showed "psychological violence." Google's image algorithms rated it "fun."
Wired Staff via Google
Previous research has found that prominent datasets of labeled photos used to train vision algorithms carried significant gender biases, for example showing women cooking and men shooting. The skew appeared to come in part from researchers collecting their images online, where the available photos reflect societal biases, for example by offering many more examples of businessmen than businesswomen. Machine learning software trained on those datasets was found to amplify the biases in the underlying photo collections.
Schwemmer believes biased training data could explain the bias the new study found in the tech giants' AI services, but it's impossible to know without full access to their systems.
Diagnosing and correcting deficiencies and biases in AI systems has become an important research topic in recent years. People instantly absorb subtle context in an image, while AI software stays narrowly focused on patterns of pixels, a gap that can lead to misunderstandings. The problem has grown more pressing as algorithms get better at processing images. "Now they're being used everywhere," says Olga Russakovsky, an assistant professor at Princeton. "So we'd better make sure they are doing the right things in the world and that there are no unintended downstream consequences."
An academic study and tests by WIRED found that Google's image recognition service often tags female legislators, like California state Senator Cathleen Galgiani, with terms related to their appearance …
Wired Staff via Google
… but sees male legislators like her colleague Jim Beall as businessmen and elders.
Wired Staff via Google
One approach to the problem is to improve the training data, which can be the root cause of biased machine learning systems. Russakovsky is part of a Princeton team working on a tool called REVISE that can automatically flag some of the biases baked into a collection of images, including along geographic and gender lines.
When the researchers applied the tool to Open Images, a collection of 9 million photos maintained by Google, they found that men were tagged more often than women in outdoor scenes and on sports fields. Men labeled "sports uniform" were mostly photographed outdoors playing sports like baseball, while women in sports uniforms were indoors playing basketball or wearing bathing suits. The Princeton team suggested adding more images showing women outdoors, including playing sports.
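REVISE itself is more elaborate, but the core measurement can be sketched simply: count how often each contextual label co-occurs with images tagged as men versus women, and flag large imbalances. The snippet below is an illustrative stand-in, not REVISE's actual code, and the data format is assumed:

```python
# Illustrative sketch (not REVISE's actual code) of flagging gender imbalance
# in a labeled image collection. Assumes each record carries a perceived-gender
# tag and the scene/context labels attached to the image.
from collections import Counter

dataset = [
    {"gender": "male", "labels": ["outdoor", "sports uniform", "baseball"]},
    {"gender": "female", "labels": ["indoor", "sports uniform", "basketball"]},
    # ... thousands more records in a real audit
]

counts = {"male": Counter(), "female": Counter()}
for record in dataset:
    counts[record["gender"]].update(record["labels"])

# Flag labels whose usage is heavily skewed toward one gender.
for label in set(counts["male"]) | set(counts["female"]):
    m, f = counts["male"][label], counts["female"][label]
    ratio = (m + 1) / (f + 1)  # add-one smoothing avoids division by zero
    if ratio > 3 or ratio < 1 / 3:
        print(f"'{label}' skews {m} male vs. {f} female images")
```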
Google and its AI rivals are themselves major contributors to research on fairness and bias in AI. That includes working on the idea of standardized ways to communicate the limitations and contents of AI software and datasets to developers – something like an AI nutrition label.
Google developed a format called "model cards" and has released cards for the face- and object-detection components of its cloud vision service. One claims Google's face detector works roughly the same for different genders, but it does not mention other possible forms that gender bias in AI could take.
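The format amounts to structured documentation. As a rough illustration of the idea, a model card can be sketched as a simple record like the one below; the field names follow the general model cards proposal, and all values are placeholders rather than Google's actual figures:

```python
# Illustrative sketch of a model card as structured data.
# Field names follow the general "model cards" idea; all values are placeholders.
model_card = {
    "model_details": {"name": "example-face-detector", "version": "1.0"},
    "intended_use": "Detect the presence and position of faces in photos.",
    "limitations": "Not intended to infer identity or demographic attributes.",
    "evaluation": {
        # Disaggregated metrics are the heart of a model card:
        # performance is reported per group, not just overall.
        "precision_by_group": {"group_a": 0.97, "group_b": 0.96},
        "recall_by_group": {"group_a": 0.95, "group_b": 0.94},
    },
}

for section, content in model_card.items():
    print(section, "->", content)
```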
This story originally appeared on wired.com.