The flipside of AI – when models learn society’s biases and stereotypes

Artificial Intelligence has advanced significantly over its history, and many sectors are already leveraging those advances today.

In real estate specifically, AI and machine learning are being utilised to aggregate data, value properties and model markets, leading to informed, data-driven decision making.

For years, advances in AI were hard won and gained little attention amongst the general public.

Excitement grew when major stories began to make their mark outside of the scientific community. A real turning point was reached when AI started to beat humans in sophisticated games like Chess and Go, and media buzz increased.

After speculation that AI was about to supersede humanity, the excitement quickly fizzled out. When AlphaGo from DeepMind beat the Go world champion in 2016, this was headline news. One year later, its successor AlphaGo Zero beat that version of AlphaGo 100 games to 0, and the feat gained little attention.

Fast forward to May 2020, when OpenAI unveiled GPT-3, a language model with 175 billion parameters, roughly ten times more than any previous dense language model. A triumph? Not exactly. A warning as to what can go wrong? Yes.

Trouble in paradise?

GPT-3 is sold as a cloud service on the premise: “No need to think, just send us your data and we will solve the problem”.

On closer inspection, it isn’t that easy. Several caveats need to be taken into consideration, and they apply to any responsible use of AI.

Companies that use GPT-3 must be extremely cautious: anyone who publishes its output without full review is exposed to legal and reputational risk. It must be realised that the model cannot distinguish fact from fiction. Essentially, it does not know right from wrong.

Other problems stem from the training data. The model is trained largely on a filtered version of Common Crawl, a large-scale scrape of the public web, so it absorbs the racism, sexism, bigotry and other toxic biases found there, and users should be wary of them.

About 60% of the training data consists of web sources of varying reliability. On one hand, there are reputable sources such as the BBC, and on the other hand there are social networks and user-generated forums, such as Reddit, which provide a platform for all. Using racism as an example, the graph below shows racial sentiment across different model sizes, illustrating some of the biases the model has learned from the training dataset. When GPT-3 is prompted with text containing ‘Black’, it is more likely to suggest negative words than for the corresponding ‘Asian’ or ‘Indian’ prompts.

Figure 1: The sentiment of words associated with different races
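A comparison of this kind can be sketched in a few lines of Python. This is only an illustration of the general idea, not the method used to produce the figure: the word list `LEXICON`, its scores, the neutral group labels and the sample completions are all hypothetical placeholders, and no real model output is shown.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical sentiment lexicon: word -> score between -1 (negative) and 1 (positive).
LEXICON: Dict[str, float] = {
    "brilliant": 1.0,
    "friendly": 0.5,
    "lazy": -0.5,
    "dangerous": -1.0,
}

def sentiment_score(text: str) -> float:
    """Average lexicon score of the words in `text` that appear in LEXICON (0.0 if none)."""
    words = (w.strip(".,!?").lower() for w in text.split())
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return mean(scores) if scores else 0.0

def group_sentiment(completions: Dict[str, List[str]]) -> Dict[str, float]:
    """Average sentiment of the model completions collected for each prompt group."""
    return {
        group: mean(sentiment_score(text) for text in texts)
        for group, texts in completions.items()
    }

# Placeholder completions standing in for text a model might return per prompt group.
SAMPLE_COMPLETIONS: Dict[str, List[str]] = {
    "group_a": ["They were brilliant and friendly.", "They were friendly."],
    "group_b": ["They were lazy.", "They were dangerous and lazy."],
}

if __name__ == "__main__":
    for group, score in group_sentiment(SAMPLE_COMPLETIONS).items():
        print(f"{group}: {score:+.3f}")
```

A consistent gap between groups, measured over many prompts and completions, is the kind of signal that would be plotted against model size.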

These learned biases are not limited to race; they extend to opinions about products, brands and people in the public spotlight in general. More dangerous still are opinions on vaccinations and other medical queries.

The pitfall here is that during training, the model cannot gauge the credibility of a source, whether it is a Wikipedia article or a piece written by a minor. As a result, the model attempts to answer medical questions based on the opinions of a whole universe of web users and bloggers, which is a dangerous approach.

The main challenge is that general-purpose language models are perceived as intelligent agents that can perform any task given to them. It is in the user’s interest to know not only the answer, but also the sources used to make the prediction.

Therefore, these models should actually act as information-retrieval systems that provide the answer with a source which supports it. This allows humans to verify the source and learn the strengths and weaknesses of the model.
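To make the idea concrete, here is a minimal sketch in Python of a system that returns the source alongside the answer. It assumes a tiny hand-written corpus and a naive word-overlap score; the `Passage` type, the `example.org` source names and the `answer_with_source` function are hypothetical, and a real system would use a proper search index and a far larger corpus.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass(frozen=True)
class Passage:
    source: str  # where the text came from (hypothetical identifiers below)
    text: str

def tokenize(text: str) -> Set[str]:
    """Lower-cased set of words with surrounding punctuation removed."""
    return {w.strip(".,?!").lower() for w in text.split()}

def answer_with_source(question: str, corpus: List[Passage]) -> Optional[Passage]:
    """Return the passage sharing the most words with the question, or None if none overlap."""
    question_words = tokenize(question)
    best: Optional[Passage] = None
    best_overlap = 0
    for passage in corpus:
        overlap = len(question_words & tokenize(passage.text))
        if overlap > best_overlap:
            best, best_overlap = passage, overlap
    return best

# Illustrative corpus; the source names are placeholders, not real pages.
CORPUS = [
    Passage("example.org/chess", "Chess is played on a board of 64 squares."),
    Passage("example.org/go", "Go is played on a grid of 19 by 19 lines."),
]

if __name__ == "__main__":
    hit = answer_with_source("How many squares does a chess board have?", CORPUS)
    if hit is None:
        print("No supporting source found")
    else:
        print(f"{hit.text} (source: {hit.source})")
```

Returning `None` when nothing matches is a deliberate choice in this sketch: an answer with no supporting source is withheld rather than guessed.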

Efforts are already being made in this direction. Facebook recently introduced KILT, a benchmark of knowledge-intensive language tasks intended to help build machines with deep, broadly useful knowledge of our world.

The potential of Machine Learning remains undeniably high. The latest systems are able to perform a wide range of tasks with high precision – something we are already fully utilising within real estate markets at RE5Q. On the flip side, these systems are prone to picking up society’s biases and prejudices if not trained properly.

It is not surprising that language models are biased if they are trained on the internet – a medium crowded with personal blogs and fake news. As with any data-related problem: garbage in, garbage out. The only solution is to train models on high-quality datasets and to verify their predictions to detect anomalies. As a practitioner of AI, RE5Q does this as a matter of course.

Society’s biases can be reinforced by language models if they are not properly trained. GPT-3 is one example, but not the exception.