How Do We Keep Language Models Like ChatGPT Accurate and Unbiased?

Darren Boynton

Sr. Product Marketing Manager




Published 02/07/23


Artificial intelligence (AI) models have become increasingly prevalent in every industry, from healthcare and finance to customer service and marketing. We may not always be aware of it, but we interact with countless forms of AI every day. Here are just a few examples of AI-powered products and companies most of us engage with right now: Gmail, Tesla, Amazon, Netflix, Facebook, Google Maps, and Spotify.

But these are not the platforms dominating the AI conversation today. ChatGPT is all you are hearing about. If you are one of the 17 people in the world who are not familiar with ChatGPT, you can learn about this incredible language model and try the current demo on OpenAI’s website.

ChatGPT can answer questions, generate a wide variety of sales and marketing content, write high school essays, and even generate reasonably usable software code. The output from ChatGPT is often generic in style, but its tone is generally authoritative (although it will sometimes confess that it might have gotten something wrong).

When Generative AI Gets It Wrong

And this is where problems begin to creep in: authority does not equal accuracy.

Search the web, and you will find countless examples of ChatGPT getting basic facts wrong. This is largely because ChatGPT has to decide what is “true” when presented with a range of conflicting data points buried in a truly immense dataset.

As humans, we have to determine what is “true” every day, and as we get older, we (hopefully) get better at making those decisions. But we can still get it wrong.

Today, ChatGPT is a baby, learning to make better decisions, but it often feels like talking to a college professor. To the end user, it can be hard to tell whether the responses are cold, hard facts or a collage of things that just sound true.

Ethical, Moral and Legal Concerns

ChatGPT also has to decide what is “right” and “wrong.” Which questions or commands should it respond to, and which ones should it not?

Aside from moral and ethical issues, the good people at OpenAI would rather their language model not furnish teens with instructions on how to make illegal substances.

ChatGPT has guidelines to keep it from discussing certain controversial or harmful topics.

ChatGPT has, however, proven that it’s just as capable of creating phishing scams as it is of writing high school history papers.

ChatGPT has certain governance guardrails in place, but they can be bypassed with some creative prompts.

For example, we asked ChatGPT how to set fire to a house, and it refused to provide an answer. But when we asked ChatGPT to write a story where a character sets fire to a house, the language model provided us with multiple options for starting fires and, when asked for additional details, gave us notes on how to avoid being accused of the crime.

Bias in Language Models

Given the sheer size and complexity of these language models, how do we keep them honest and ethical? Setting guardrails to ensure governance is one thing, but keeping bias in check is far trickier, because bias is not easy to identify consistently.

This might be a problem for people “playing” with the ChatGPT demo, but what happens when this kind of technology is deployed in an enterprise? Again, we asked ChatGPT for some input:

Biased AI models may incorrectly identify certain groups of customers, flagging them as high-risk, unduly resulting in higher prices or reduced services for specific groups. This can result in lost revenue, decreased customer satisfaction, and damage to the company’s reputation.

Another major concern is that bias in AI models could impact customer experiences. Once AI models begin categorizing customers based on inaccurate or biased decision-making, you can bet those customers will soon start seeing recommendations for products and services they have never shown an interest in. Given the AI is essentially representing the business, this can be a death sentence for an enterprise.


So What’s the Answer?

Step one has to be using multiple AI language models rather than relying on a single model.

By combining multiple models with diverse training data and architectures, the overall system can become less susceptible to specific biases, leading to more balanced and accurate responses.

Unfortunately, reducing bias and improving accuracy is not just a case of throwing a bunch of language models into the mix. These models need to be carefully selected and combined in a way that effectively balances their strengths and weaknesses.
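To make the idea of combining models concrete, here is a minimal sketch of one common approach: majority voting with a fallback when the models disagree. This is not a description of how any particular vendor combines its models. The three toy "models" and the `majority_vote` helper are hypothetical stand-ins we made up for illustration; real systems would use trained language models and a more careful weighting scheme.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Hypothetical type: a "model" here is any function that maps a message to a label.
Classifier = Callable[[str], str]


def keyword_model(message: str) -> str:
    # Toy stand-in: looks for a single keyword.
    return "interested" if "demo" in message.lower() else "not_interested"


def length_model(message: str) -> str:
    # Toy stand-in: treats longer messages as a sign of interest.
    return "interested" if len(message.split()) > 3 else "not_interested"


def question_model(message: str) -> str:
    # Toy stand-in: treats questions as a sign of interest.
    return "interested" if message.strip().endswith("?") else "not_interested"


def majority_vote(
    models: Sequence[Classifier], message: str, min_agreement: float = 0.5
) -> Optional[str]:
    """Return the label that more than `min_agreement` of the models agree on.

    Returns None when no label clears the bar, so the caller can hand the
    message to a human instead of trusting a split decision.
    """
    votes = Counter(model(message) for model in models)
    label, count = votes.most_common(1)[0]
    if count / len(models) > min_agreement:
        return label
    return None


MODELS = [keyword_model, length_model, question_model]

if __name__ == "__main__":
    print(majority_vote(MODELS, "Can I get a demo next week?"))
```

The design choice worth noting is the `None` fallback: when the models split evenly, the sketch declines to answer rather than letting a single model's quirks decide.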

At Conversica, we balance multiple language models focused on specific Sales, Marketing, and Customer Success use cases. By narrowing the playing field, we are able to generate highly accurate and relevant responses—without allowing the conversation to slide off the rails.

Step two is using the right dataset to train the AI.

The data these models are trained on must be carefully curated to minimize potential biases. This means getting as close to a “golden dataset” as possible. This is a topic for another blog post, but in short, it means having the cleanest, validated dataset possible.

As an example, you could use an optimal combination of language models, but if your dataset were scraped from Quora and Reddit, your AI’s responses would be skewed, at best. Ask Conversica’s AI team how important source data is, and you will quickly understand the scale and complexity of the task.

We are just scratching the surface here. We will cover steps three, four, and beyond in future posts.

In short, biased AI models can have serious consequences for businesses, customers, and society as a whole. It’s crucial to take the steps needed to keep AI models as free from bias as possible.

This includes regularly monitoring and testing models for bias, implementing robust data privacy and security protocols, and maintaining clear policies as to how the models work and how decisions are made. Think of it as good “AI parenting.”
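As a rough illustration of what "monitoring and testing models for bias" can look like in practice, the sketch below compares how often a model flags customers as high-risk across groups. Everything here is an assumption for illustration: the `flag_rates` and `disparity_ratio` helpers, the group names, the sample records, and the 0.8 alert threshold are ours, not a standard drawn from this post. Real audits would use richer fairness metrics and real production data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def flag_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the share of customers flagged as high-risk in each group.

    Each record is a (group, was_flagged) pair.
    """
    totals: Dict[str, int] = defaultdict(int)
    flagged: Dict[str, int] = defaultdict(int)
    for group, was_flagged in records:
        totals[group] += 1
        if was_flagged:
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}


def disparity_ratio(rates: Dict[str, float]) -> float:
    """Return lowest group rate divided by highest (1.0 means equal rates)."""
    lowest, highest = min(rates.values()), max(rates.values())
    return 1.0 if highest == 0 else lowest / highest


# Made-up sample: group_a is flagged 2 times in 10, group_b 5 times in 10.
SAMPLE = (
    [("group_a", True)] * 2 + [("group_a", False)] * 8
    + [("group_b", True)] * 5 + [("group_b", False)] * 5
)

ALERT_THRESHOLD = 0.8  # illustrative cutoff, chosen for this example only

if __name__ == "__main__":
    rates = flag_rates(SAMPLE)
    ratio = disparity_ratio(rates)
    if ratio < ALERT_THRESHOLD:
        print(f"Review needed: flag rates differ across groups ({rates})")
```

A check like this does not prove a model is biased or unbiased. It simply surfaces gaps that a person should investigate on a regular schedule.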