Is AI Racist? It's Complicated

Artificial Intelligence was the top buzzword of 2023, and for good reason. While the likes of GPT-4, Gemini, and Dall-E stole most of the limelight, significant developments were also made in other domains. Multiple research papers highlighted how AI tools such as PathAI have assisted cancer diagnosis; AtomWise sped up the identification of therapeutic materials, the discovery of new materials got supercharged, and trading decisions also got a speed boost with a little assistance from AI. Drug discovery is at the cusp of a quantum leap, and material science is also looking at a similar shift.

Climate AI and Verdigris are working in the field of climate change and energy assessments, while solutions like DarkTrace offer real-time cybersecurity threat detection and mitigation. We even had AI helping finish a Beatles song, artists like Grimes accepting the dark reality of synthetic music, and an AI-first wearable like the Humane Pin arrived on the scene to steal the thunder from smartphones. With an impact of such high magnitude and diversity, alarms are being raised about its impact on human job security.

The likes of Elon Musk are a warning about its potential for civilizational destruction. But before the world goes full throttle on AI deployment, experts have also warned about the inherent biased AI behavior, in general. Even the most sophisticated tools like ChatGPT have a history of hallucinations, which means they cook up facts and present them as reality. But an AI model is nothing without its foundation data, which dictates whether an AI is racist or otherwise. So far, race-sensitive bias in AI has been documented widely, with little concrete solution coming out of the debate.

A history of failures

Not too long ago, Microsoft released an AI chatbot on Twitter called Tay but quickly had to pull it because it started spewing racist garbage. Microsoft apologized for the fiasco, but that was in 2015, back when transformer models powering generative AI were far, far away. In 2021, the Allen Institute created Delphi AI and tasked it with giving ethical advice, but it soon started suggesting to people that murder, genocide, and white supremacy are acceptable. Then came OpenAI, claiming to have built the world's most advanced language model and gave us ChatGPT.

Repeated research proved it's not immune to racist tendencies. But the repercussions of AI's racial bias have already started leaching into the lives of people. In 2020, a Black individual was charged by cops in Michigan for theft, but the arrest came courtesy of a botched facial recognition check. It was later revealed that the AI program behind the facial recognition system was predominantly trained on white faces. Interestingly, experts had already warned about the implications. From loan sanctioning and hiring to renting and public housing, AI systems have consistently demonstrated race bias.

Early in 2023, I wrote about a ChatGPT blunder. When the ChatGPT mania was at its peak, someone prompted the AI model to tell jokes about Hindu deities. The AI model faltered at one particular name, and that was because of tonal variations in Indian languages, which, when translated to English, produce two different spellings of the same name. But the joke ended up making prime-time news headlines, stirring nothing short of a culture war in the world's most populous country, branding AI tools like ChatGPT as a conspiracy against a religion followed by over a billion people. By the way, that's not the first ChatGPT failure.

What does research say about AI race bias?

In 2022, collaborative research published by experts at Johns Hopkins University revealed that robots trained using a well-known AI model exhibited race and gender bias, all because the model itself was flawed. "While many marginalized groups are not included in our study, the assumption should be that any such robotics system will be unsafe for marginalized groups until proven otherwise," one of the researchers pointed out. In his book "Hidden in White Sight," which involves research across multiple domains where AI is employed, IBM Engineer Calvin D Lawrence found evidence of how AI bias has become a societal problem.

But even if the training dataset is diverse, algorithmic bias can persist, as documented in this paper published in Science. An expansive report (PDF) courtesy of the U.S. Department of Commerce's National Institute of Standards and Technology notes that societal factors are often overlooked, but the government is working to address it all. Interestingly, OpenAI chief Sam Altman told Rest of World that AI tools will eventually be able to fix the bias problem on their own.

Experts note that we need to address the bias problem through rigorous quality testing of AI systems, full transparency of datasets, easy opt-out routes, and an in-built 'right to be forgotten' mechanism for users. They also note that people should be able to easily check what data is held against their names and be given clear access to recourse if the data is inaccurate. But that needs to be done concurrently with deep testing. The stress should also be on finding vulnerabilities that can break these models, a common strategy that has managed to break past the guardrails with something as simple as making an AI program repeat the same word.

It's as much as about data as it's about inclusion

When it comes to building an AI model, every company boasts about its size and the parameters it can handle. Some of the biggest names in the game — OpenAI, Google, Meta, Microsoft, and Anthropic — regularly throw around figures like billions and petabytes. But it's not just about the size of the training data that matters here. It's also about quality and diversity, a process that needs careful human intervention and expert steering. Then there's the issue of contextual awareness because simply filtering out problematic keywords can lead to the exclusion of crucial historical context.

Experts often flag that data is largely responsible for biased racist and culturally insensitive tropes in AI tools. Programs that crawl the web and gobble petabytes of information are like babies, lacking an understanding of what data is good or bad, which, when uncorrected, strengthens harmful stereotypes in the output. "These kinds of stereotypes are quite deeply ingrained in the algorithms in very complicated ways," Stanford professor James Zou told CBC News.

However, the aspect of human intervention is vital and demands diversity as well. "AI bias also occurs because of who is in the room formulating and framing the problem," Dr. Alex Hanna, Director of Research at the Distributed AI Research Institute, said in a Harvard interview. Right now, that's a huge problem. For a person unaware of gender, race, and cultural bias tropes, it would be nigh impossible for them to train a model that won't produce biased responses. Then, there's the question of how much AI aligns with a company's corporate goals.

The intention conflict

Google's firing of Dr. Timnit Gebru, co-lead of its ethical AI team, drew widespread attention. MIT Technology Review got a hold of Gebru's research paper, which led to her ouster. Notably, the paper highlighted the risks of racial bias in AI models as well as mitigation strategies. It's worth noting here that Google's AI team came up with the concept of transformers, which laid the foundations of generative AI products such as ChatGPT, Dall-E 3, and more. Google lost the leadership to the likes of OpenAI, but in the past few months, its work on models like Gemini and PaLM 2 has put the company right at the summit of AI innovation.

Now, let's talk about OpenAI. The company — with the backing of heavyweights like Elon Musk, Peter Thiel, and Reid Hoffman — started as a non-profit. OpenAI went by a charter that says it will guide the company to act in humanity's best interests. "Our primary fiduciary duty is to humanity," it claims. Musk left OpenAI soon after but voiced concern about the company chasing a for-profit model. "That profit motivation can be potentially dangerous," he said earlier this year. Meanwhile, OpenAI moved ahead with a subscription tool and plug-ins covering its text, image, and voice models.

This profit-chasing caused an internal stir and CEO Sam Altman's firing at OpenAI. It won't be the only example where ethics contradicted corporate goals. StabilityAI also put aside ethics and landed itself a plagiarism lawsuit from Getty. Some of the solutions to fix the racial bias problem don't even sound like solutions at all. Anthropic, one of the biggest players, recently published a research paper. It essentially told users to tell the AI not to give them unbiased answers.

How to fix AI's bias problem?

The research community is divided on how best to tackle the problem of racial bias in AI. Joy Boulamwini, founder of the Algorithmic Justice League, found "large gender and racial bias in AI systems" hawked by giants like Microsoft, IBM, and Amazon. In a thoughtful Time article, Boulamwini mentioned that the racial bias problem can be addressed by "enabling marginalized communities to engage in the development and governance of AI."

Then, there is a group that wants a pause on out-of-control development and deployment of AI tools until existing problems like bias are addressed. The likes of Elon Musk and Apple co-founder Steve Wozniak recently signed an open letter asking for a pause on development on any model smarter than GPT-4, asking for more research into the real risks posed by AI tools before we move further ahead.

A few take a rather hard stance, vouching for AI inclusion in our lives only when they can be trusted. "We should not cede control of essential civic functions to these tech systems, nor should we claim they are 'better' or 'more innovative' until and unless those technical systems work for every person regardless of skin color, class, age, gender, and ability," says Meredith Broussard, an expert on the topic of bias in tech and an associate professor at the New York University.

Can regulations solve the issue of AI racial bias?

A lot of experts out there have made it clear that only with proper regulations and oversight can the biggest players in the AI game be forced to make their tools more inclusive. The EU's AI Act, which received a provisional nod in December 2023, is being seen as a landmark move in the right direction. In an interview, EU's competition commissioner Margrethe Vestager told the BBC, "AI risks are more that people will be discriminated [against], they will not be seen as who they are."

The EU's white paper on regulating AI also talks about how real biases in human society can seep into AI development and the tools making their way to the public domain, posing a great risk of discrimination. However, experts tell Politico that policymakers should first fix the inherent racial bias at systemic failures affecting the bloc's citizens, of which there is a well-documented history. Ignoring the latter aspect will cascade its way into the future. No such proposals talking about tight rules against AI race bias have been discussed in major Asian markets like China and India so far.

In the U.S., experts have pinned their hopes on the Standard for AI Safety and Security proposal, which was greenlit via an executive order by President Biden in October this year. Highlighting the risk of bias in AI tools, it asks for developing principles and standards that companies need to follow to mitigate the risks of discrimination. It tasks the National Institute of Standards and Technology to put in place dedicated teams that will test and analyze AI tools comprehensively ahead of their public release and commercial deployment. It remains to be seen when and how it is put into action.