Why Are Some AI Systems Inadvertently Racist?

by Was Rahman, author of AI and Machine Learning

In July 2020, the UK government withdrew a system that used Artificial Intelligence (AI) to screen visa applications, after claims it was automatically rejecting applicants of certain nationalities. This was the latest high-profile example of apparently “racist” AI, such as the Microsoft chatbot that made anti-Semitic remarks, or the Google online photo service that labelled African-American faces as gorillas.

To understand why these happen, we need to first understand how AI makes decisions such as rejecting applications or composing responses to questions. This doesn’t need technical knowledge, only familiarity with AI’s underlying concepts. In the case of inadvertent racism, three concepts are key: data, algorithms and machine learning. 

The Importance of Data 

AI works by finding patterns in data, making inferences relating to its purpose, and applying them to new data. For example, recruitment AI analyses data about employees, identifying characteristics of successful ones. It then searches for these characteristics in candidates. 

This process of finding patterns in data is fundamental to AI. It relates to inadvertent racism because if the data containing these patterns is biased, an AI system built by analysing it could contain the same biases. For example, if most of the firm’s offices are in India, most employees will be Indian. Therefore a list of the most successful employees may contain mostly Indians - but of course it doesn’t follow that being Indian is a sign of a successful employee. 
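This can be sketched in a few lines of Python. Everything below is invented for illustration (the data, the "pattern-finder", the scoring function): it shows how a naive system that simply counts patterns in skewed historical data ends up rewarding nationality itself.

```python
from collections import Counter

# Hypothetical, deliberately skewed historical data: most offices are
# in India, so most "successful" employees happen to be Indian.
successful_employees = [
    {"nationality": "Indian"}, {"nationality": "Indian"},
    {"nationality": "Indian"}, {"nationality": "Indian"},
    {"nationality": "British"},
]

# A naive pattern-finder: how often does each nationality appear
# among successful employees?
counts = Counter(e["nationality"] for e in successful_employees)
total = sum(counts.values())
success_pattern = {nat: n / total for nat, n in counts.items()}

# Scoring new candidates with this "pattern" rewards being Indian,
# even though nationality says nothing about ability.
def score(candidate):
    return success_pattern.get(candidate["nationality"], 0.0)

print(score({"nationality": "Indian"}))   # 0.8
print(score({"nationality": "British"}))  # 0.2
```

The system has faithfully learned a real pattern in the data; the problem is that the pattern reflects where the offices are, not who the good employees are.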

This level of bias is straightforward to remove, as long as the AI team is aware of it - for example by ensuring equal representation of nationalities. However, other biases in data may be less straightforward to identify and easier to overlook. Spotting these requires a human as well as a statistical understanding of data, and falls within the realm of Data Science. 

A critical role of Data Scientists is investigating, understanding and preparing data, so that the possibility of data bias is removed. 

What Algorithms Do 

“Algorithm” refers to the logic used in AI systems to make decisions and draw conclusions from data. For example, AI surveillance uses at least two algorithms: spotting human faces in images; and matching those faces against a database, such as employees authorised to enter a building. 

Algorithms work by performing complex mathematical and statistical operations on data to: 

• detect patterns in the data; 

• attribute meaning to those patterns by comparison with other patterns; and 

• make decisions or draw conclusions to help solve a problem or perform a function. 

For example, an AI recruitment system might use algorithms to extract relevant information from candidates’ applications, highlighting those with characteristics that indicate a strong fit. The algorithms will contain mathematical and statistical representations of those characteristics. 

The relationship to possible racism lies in potentially biased decision-making by an algorithm, even using unbiased data. Algorithms typically generate a result based on many factors and steps, so an undetected bias in one of these may only create a small bias in the overall AI system. This may not even be apparent in the overall results initially. 

For example, a hypothetical Indian IT firm’s AI recruitment system might include “number of languages spoken” as a factor in finding good candidates, because the firm considers language proficiency a potential indicator of programming skill. When first used, it may be that the algorithm finds good candidates effectively, so is considered a success. However, over time such a system could turn out to have an inadvertent bias against British applicants, albeit a small one. 

The reason this might happen lies in the fact that Indian applicants may be statistically more likely to speak two or three languages (English, Hindi and a State language), whereas British applicants might be more likely to speak only one. Thus, using language proficiency as a factor in the algorithm may inadvertently create a small pro-Indian hiring bias that’s not initially apparent. 
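A minimal sketch of such a scoring algorithm (the weights, factors and candidate data are all hypothetical) shows why the bias is easy to miss: each factor contributes only a small amount to the overall score.

```python
# Hypothetical factor weights in a candidate-scoring algorithm.
# The languages_spoken weight looks innocuous on its own.
WEIGHTS = {
    "years_experience": 0.5,
    "degree_level": 0.3,
    "languages_spoken": 0.2,   # the inadvertently biased factor
}

def score(candidate):
    return sum(WEIGHTS[k] * candidate[k] for k in WEIGHTS)

# Two otherwise identical candidates; only languages spoken differs.
indian_applicant  = {"years_experience": 5, "degree_level": 2,
                     "languages_spoken": 3}
british_applicant = {"years_experience": 5, "degree_level": 2,
                     "languages_spoken": 1}

print(score(indian_applicant))   # 3.7
print(score(british_applicant))  # 3.3
```

The gap per candidate is small, and nothing in the code mentions nationality - but applied to thousands of applications, the languages factor systematically favours one group over another.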

So it’s not just data that must be unbiased, but also the algorithms using it. Such biases may only become apparent during use, especially small ones that only arise over time and use. Again, the role of data scientists is to prevent such issues from the outset. 

How Machine Learning “Improves” AI Results Over Time 

Machine Learning is a form of AI that improves its own results over time. It does this by trying out algorithm changes, incorporating those that improve the quality of its results. Those changes are made automatically, continually optimising the overall results of the AI system against a specific measure of success. An example could be a system that uses AI to decide loan applications and uses machine learning to continually reduce the loan default rate by tweaking the algorithm it uses to approve or reject applications. 
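The idea of automatically keeping changes that improve a success measure can be sketched as a crude optimisation loop. The data and the approval algorithm below are invented for illustration; real machine learning is far more sophisticated, but the principle is the same.

```python
import random

random.seed(0)

# Synthetic application data: higher credit scores default less often.
applications = [
    {"score": s, "defaulted": random.random() < (1 - s)}
    for s in [random.random() for _ in range(1000)]
]

def default_rate(threshold, apps):
    """Fraction of approved loans that default, for a given
    approval threshold."""
    approved = [a for a in apps if a["score"] >= threshold]
    if not approved:
        return 0.0
    return sum(a["defaulted"] for a in approved) / len(approved)

# A crude "machine learning" loop: try small tweaks to the algorithm's
# approval threshold and keep any change that lowers the default rate.
threshold = 0.5
best = default_rate(threshold, applications)
for _ in range(100):
    tweak = threshold + random.uniform(-0.05, 0.05)
    rate = default_rate(tweak, applications)
    if rate < best:
        threshold, best = tweak, rate
```

The loop only "sees" the default rate. If a tweak that lowers it also happens to disadvantage a particular group, nothing in the loop will notice.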

The link to inadvertent racism is that if a machine learning system makes a change that improves results but happens to be racist, the racism won’t become apparent until it’s reflected and spotted in the overall results. In the loan application system above, data scientists may well have ensured race isn’t an explicit loan approval factor by removing any data about applicant nationality. 

However, if over time the system finds that applicants from certain pin (or zip or post) codes default more often than others, it may adjust its algorithm to approve fewer applications from those areas. What a human would realise, but an AI system may miss, is that those applications come from more socially deprived areas, with higher proportions of BAME (Black, Asian and Minority Ethnic) residents. 
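A sketch of how such a proxy variable sneaks in (the postcodes, rates and cut-off are all hypothetical): race is entirely absent from the data, yet penalising high-default postcodes still penalises the people who live in them.

```python
# Race has been removed from the data, but postcode remains.
# Hypothetical observed default rates by postcode area:
default_rate_by_postcode = {"AREA_A": 0.02, "AREA_B": 0.09}

def approve(application, max_default_rate=0.05):
    """Reject applicants from areas whose loans historically
    default too often."""
    area_rate = default_rate_by_postcode[application["postcode"]]
    return area_rate <= max_default_rate

print(approve({"postcode": "AREA_A"}))  # True
print(approve({"postcode": "AREA_B"}))  # False
```

If AREA_B is a more deprived area with a higher proportion of BAME residents, the system now rejects them more often - without ever having seen race in its data.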

Obviously, this adds complexity as well as delicacy to the situation, and requires much more consideration than simply optimising loan approvals to minimise default rates. But machine learning isn’t yet designed - or even able - to apply the judgement needed for such situations. 

The examples in this article have been deliberately simplified, to ensure clarity of the underlying concepts. In practice, the issues described would be picked up and prevented by any experienced data scientist or AI team. However, less obvious versions of the same kind of problem do get missed, which is ultimately why some AI systems are inadvertently racist. 

AI is all around, often without us realising. With its significant benefits come difficulties and dilemmas, such as those above. You can learn about AI in everyday life, how it works, and what we can do about its challenges, in my new book, AI and Machine Learning.

