This Monday saw Syndeo CTO Alan Beck speak about ‘Bias in AI’ at the London Chatbot and Voice Meetup, hosted by Vodafone and Accenture UK. It was an excellent event with plenty of discussion on conversational AI, including other talks from Nat Walker of Vodafone and Rui Teimao of Accenture Song.
Alan kicked off his talk by sharing a riddle with the audience, highlighting unconscious bias, and setting the scene for his presentation. He then went on to speak about what bias in AI is, how it happens, how to mitigate bias in AI and how to ‘be better.’
WHAT IS BIAS IN AI AND HOW DOES IT HAPPEN?
In the presentation clip below Alan shares a contrived example of bias in AI.
Bias – Representational
Following this example, Alan explored historical and societal bias. He talked about how historically nurses tended to be female while doctors tended to be male. Therefore, if you trained an AI solely on historical data, you would have an AI that associates nurses with females and doctors with males. When used in a hiring scenario there would clearly be bias in the AI as it would hire females as nurses and males as doctors.
Alan then asked, “What if, after hiring decisions were made, the AI was retrained on past hiring decisions?” The bias already present in its hiring decisions is then fed back in as training data, exacerbating the problem. The AI may identify features that are associated with bad hires, but if all hires were female, then only females would be hired going forward regardless. This also leads to “bias magnification,” said Alan.
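The feedback loop Alan describes can be sketched in a few lines of Python. This is a toy illustration rather than anything shown in the talk: the function name `retrain_loop`, the 70% starting share and the `sharpen` parameter (which assumes the model leans a little harder towards the majority pattern than the data strictly warrants) are all assumptions made for the example.

```python
def retrain_loop(female_share, rounds, sharpen=2.0):
    """Track the share of female nurse hires in the training data.

    Each round the model "hires" in line with what it has seen, but
    over-weights the majority pattern (sharpen > 1 is an assumption),
    and those hires become the next round's training data.
    """
    history = [female_share]
    for _ in range(rounds):
        female_weight = female_share ** sharpen
        male_weight = (1.0 - female_share) ** sharpen
        female_share = female_weight / (female_weight + male_weight)
        history.append(female_share)
    return history


# Start from historical data in which 70% of nurse hires were female.
history = retrain_loop(0.7, 4)
for round_number, share in enumerate(history):
    print(f"round {round_number}: {share:.1%} of nurse hires are female")
```

With these made-up numbers a 70/30 split drifts towards 100/0 within a few rounds, while a perfectly balanced 50/50 start stays balanced. The skew grows only because the model's own output is recycled as its input.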
Bias – Correlated Features
Staying with the hiring context, Alan described the AI then being used to hire programmers. The developers realized it wouldn’t hire any females because of the bias in the training data. In an attempt to remove this bias, they removed all mention of gender from the applications. However, the AI can identify correlated features, so it latched on to hobbies, namely sports. The sports people play can differ between males and females (netball, for example), so the AI once again made biased hiring decisions. Alan walked through examples of the AI’s ‘thought process’, illustrating how bias can persist through proxy variables and correlated features even when the primary variable has been removed.
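To make the proxy-variable point concrete, here is a minimal sketch in Python. The records, the hobby names and the 50% threshold are invented for illustration and are not taken from the talk. The "model" never sees the gender column, only the hobby, yet its decisions still split along gender lines.

```python
from collections import defaultdict

# Hypothetical historical applications: (gender, hobby, hired)
HISTORY = [
    ("M", "rugby", True), ("M", "rugby", True), ("M", "rugby", False),
    ("M", "chess", True),
    ("F", "netball", False), ("F", "netball", False), ("F", "netball", True),
    ("F", "chess", True),
]


def rate_by(records, key, outcome):
    """Return the share of positive outcomes for each value of key."""
    counts = defaultdict(lambda: [0, 0])  # value -> [positives, total]
    for record in records:
        counts[key(record)][0] += int(outcome(record))
        counts[key(record)][1] += 1
    return {value: pos / total for value, (pos, total) in counts.items()}


# "Training" with gender removed: the model only learns a score per hobby.
hobby_score = rate_by(HISTORY, key=lambda r: r[1], outcome=lambda r: r[2])


def predict(hobby):
    """Hire when the hobby's historical hire rate is at least 50%."""
    return hobby_score[hobby] >= 0.5


# Audit step: gender is used only to check the results, never to predict.
predicted_by_gender = rate_by(
    HISTORY, key=lambda r: r[0], outcome=lambda r: predict(r[1])
)
print(predicted_by_gender)
```

In this toy data every male applicant is predicted as a hire, against one in four female applicants, even though gender was never an input. The hobby column carried the bias through.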
Bias – Sampling and Labelling
Alan talked about how the internet does not offer a truly representative picture of reality, stating that coverage of world events on the internet will focus on certain areas. Referring to the image below, he pointed out that there is a much higher proportion of green circles than blue triangles, much higher in fact than there would be in reality.
If an AI is trained on internet news, especially politics, it is likely to be strongly biased towards American politics, and this may be particularly pronounced at certain times of the year (or of the four-year election cycle). He outlined how, because of this, our AI may take on biased characteristics of those cultures. Alan highlighted that, “with labelling bias, even if we had a more curated unbiased sample, if our labeler is biased, we suffer similar problems.”
Exploring how you might mitigate bias in AI, Alan discussed:
He said, “Understand the level of bias in various forms that exist in external data sources (as well as the overall quality of the data). Decide whether the data can be cleaned, screened or curated in any way to minimize bias. It may need to be accepted that the data (and source) are not fit for the purpose you need it for.”
Alan stressed, “Care is needed to ensure that bias while mitigated in one area is not introduced in another due to the unintentional subconscious bias of the adjuster (or confirmation bias).”
Alan added, “If you have a depth of understanding of the data, consider augmentation to find balance and reduce bias. If you have an overwhelming sampling of data that is causing a skew in the data predictions, could data generation help mitigate?”
The same care is needed with augmentation, ensuring the adjuster does not introduce pre-conceived notions that don’t necessarily align with reality.
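One very simple form of rebalancing is to oversample the under-represented group until the groups are the same size. The sketch below assumes that approach. It duplicates existing records rather than generating new ones, and the function name `oversample` and the circle/triangle labels are illustrative only.

```python
import random
from collections import Counter


def oversample(records, label_of, seed=0):
    """Duplicate records from smaller groups until every group is equal in size."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    groups = {}
    for record in records:
        groups.setdefault(label_of(record), []).append(record)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Top up the smaller groups with randomly chosen duplicates.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


skewed = ["circle"] * 8 + ["triangle"] * 2
balanced = oversample(skewed, label_of=lambda record: record)
print(Counter(skewed), Counter(balanced))
```

Duplicating records evens out the counts but adds no new information, and deciding which groups to balance on is itself a human judgment, so the caution above about the adjuster's own preconceptions still applies.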
Alan outlined that model testing should be part of the development and delivery process and that this is one obvious step to attempt to reduce bias.
However, he highlighted that testing shows the presence of defects, not their absence, and that you cannot perform a selection of tests and assume that your model is free from bias.
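As an example of the kind of check that could sit in a test suite, the sketch below compares selection rates between groups. The function names and the sample decisions are assumptions for illustration, and the choice of metric is ours rather than something prescribed in the talk.

```python
def selection_rates(decisions):
    """decisions: iterable of (group, selected) pairs. Returns rate per group."""
    totals, selected = {}, {}
    for group, chosen in decisions:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(chosen)
    return {group: selected[group] / totals[group] for group in totals}


def parity_ratio(decisions):
    """Lowest group selection rate divided by the highest (1.0 means equal)."""
    rates = selection_rates(decisions)
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Hypothetical model output on a held-out test set.
decisions = [
    ("F", True), ("F", False), ("F", False), ("F", False),
    ("M", True), ("M", True), ("M", False), ("M", False),
]
print(selection_rates(decisions), parity_ratio(decisions))
```

A low ratio flags a problem worth investigating. A high ratio on one metric and one test set does not show the model is unbiased, which is exactly the limitation Alan points out.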
Alan talked about adding a person dip into the process flow of AI decision making. “An unusual way to form the sentence, born from the Syndeo ‘Dip in Dip out’ human intervention functionality within the Syndeo platform”, said Alan.
He explained that this can happen as part of testing if you have an appropriate R&D pipeline. It can also be part of a temporary (or permanent) production rollout. Some or all decisions can be passed through a human screening process and accepted, adjusted or rejected. Adjustments and rejections can later be reviewed by others for model updates.
“Such updates should not be automatically fed into the system without review by an appropriate selection of domain knowledge experts”, advised Alan.
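The screening flow described above might look something like the following sketch. To be clear, this is not the Syndeo platform's implementation: the class name `HumanScreen`, the verdict strings and the reviewer callback are all hypothetical. The point it illustrates is that adjustments and rejections are logged for later expert review instead of being fed straight back into the model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class HumanScreen:
    # reviewer(case, model_decision) -> (verdict, replacement), where verdict
    # is "accept", "adjust" or "reject" (hypothetical convention).
    reviewer: Callable
    review_log: list = field(default_factory=list)

    def decide(self, case, model_decision):
        verdict, replacement = self.reviewer(case, model_decision)
        if verdict == "accept":
            return model_decision
        # Keep a record for domain experts; nothing is retrained automatically.
        self.review_log.append((case, model_decision, verdict, replacement))
        return replacement if verdict == "adjust" else None


def example_reviewer(case, model_decision):
    """Stand-in for a person: overrides one case, rejects another."""
    if case == "applicant-2":
        return ("adjust", "interview")
    if case == "applicant-3":
        return ("reject", None)
    return ("accept", None)


screen = HumanScreen(reviewer=example_reviewer)
results = [screen.decide(f"applicant-{n}", "decline") for n in (1, 2, 3)]
print(results, len(screen.review_log))
```

Here the first decision passes through unchanged, the second is adjusted and the third is rejected, leaving two entries in the log for reviewers to consider before any model update.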
Word Embedding – 3D Visualization
Discussing how you might mitigate bias in AI further, Alan spoke about Word Embedding and 3D Visualization. This involves taking words which computers don’t really understand and converting them to vectors in a multi-dimensional space (watch presentation clip below to find out more).
According to Alan, a common Word Embedding implementation may have between 100 and 300 dimensions. Trying to even understand a fourth dimension is difficult for the mind, given the fact that we have grown up in a ‘3D space’.
Alan explained that normalization can be done through manual configuration to fix the bias. Although this approach is not using machine learning to fix the bias it is still extremely useful as it can be done very quickly and with little data.
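One well-known way to do this kind of manual fix is to pick a 'bias direction' in the embedding space and remove that component from a hand-picked list of words. The sketch below assumes that approach, which may differ from what Alan demonstrated. The three-dimensional vectors are invented toy values (real embeddings have hundreds of dimensions), and the word list is the manual configuration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def neutralize(vector, direction):
    """Remove the component of vector that lies along direction."""
    scale = dot(vector, direction) / dot(direction, direction)
    return [v - scale * d for v, d in zip(vector, direction)]


# Toy embeddings (made-up values, 3 dimensions for readability).
embeddings = {
    "he": [1.0, 0.0, 0.0],
    "she": [-1.0, 0.0, 0.0],
    "doctor": [0.5, 1.0, 0.0],
    "nurse": [-0.5, 1.0, 0.2],
}

# The bias direction is estimated from a word pair that defines it.
gender_direction = [a - b for a, b in zip(embeddings["he"], embeddings["she"])]

# Manual configuration: words that should carry no gender component.
NEUTRAL_WORDS = ["doctor", "nurse"]

debiased = dict(embeddings)
for word in NEUTRAL_WORDS:
    debiased[word] = neutralize(embeddings[word], gender_direction)

for word in NEUTRAL_WORDS:
    print(word, dot(embeddings[word], gender_direction),
          dot(debiased[word], gender_direction))
```

After the adjustment, 'doctor' and 'nurse' no longer lean towards 'he' or 'she' along that direction, while the gendered words themselves are left untouched. No training is involved, which is why it can be done quickly and with little data.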
Sharing an analogy, Alan said, “Machines make decisions based on the algorithm (the mind) and the data (the experience). We can clearly see through simple examples how bias impacts AI decisions. People are the same. How a person thinks and the conclusions they draw depends largely (though not completely) on their experience.”
Alan went on to say that people may have beliefs and views about others that may not be right or reasonable upon reflection and that this is known as ‘unconscious bias.’ Unconscious bias is something everyone has. “It would be impossible not to have it unless you have experienced everything and have had an even distribution of life. No person has and no person ever will,” Alan explained.
In his final remarks, Alan offered some advice to the audience on what they can do:
- Be aware that you will have biases and understand the impact they might have on others.
- Think about how your decisions and thoughts may be impacted by poor sampling (a single bad interaction with someone in a given demographic) or feedback loops and magnification (passing that poor interaction forward to that demographic).
In closing, Alan reminded everyone to be thankful that “we can be made aware of our biases, whilst machines cannot… YET.”