monkeyjump labs


Human-in-the-loop: avoiding the negative consequences of AI hallucination

Making the most of the human-AI relationship

Written by Jonathan Dexter

AI has created some truly amazing tools, revolutionizing industries and simplifying tasks across the board.

But they are by no means bulletproof.

Many of us have already discovered that AI tools can lie to us - often, and with a confidence that puts some conmen to shame. The term “AI hallucination” describes errors in which an AI model such as GPT-4 produces output that sounds plausible but is wrong. LLMs operate on patterns, not facts, so the room for error is uncomfortably large. This means that if action is taken purely on the output of an LLM, that action will inevitably be wrong more often than we'd like. This problem demands human intervention to protect the integrity of the outcome, which is what the term “human-in-the-loop” refers to.

Of course, it is highly advantageous for a human to give AI the space and data to streamline processes and improve efficiencies. Who doesn’t want to enlist the help of external data and information to best support and elevate their work? But carelessness on any level when it comes to the human-AI relationship can prove catastrophic for individuals and/or the companies they work for.

AI outputs require human oversight to discern the nuances or inaccuracies that show up.

Where (and when) do humans come in?

Contrary to what many people may assume, avoiding the consequences of AI hallucinations means involving humans from the very beginning.

Let's first imagine building an AI tool where, upon a given output (for example, detecting the words "create a calendar event"), we perform that action: we create a calendar event. This is "giving AI a tool", albeit in a very naive way. More capable approaches include MRKL, ReAct, and others that enable the model to "reason" about the action to take next.

If we just let the tool take action based on the model's output, even with these advanced techniques, it will sometimes get it wrong.

So, if we build our system to do just that, we get something like this:
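A minimal sketch of that naive approach, assuming a keyword match on the model's text is all that stands between the output and the action. The `Calendar` class and `naive_agent` function are hypothetical stand-ins, not a real calendar API:

```python
from dataclasses import dataclass, field


@dataclass
class Calendar:
    """Hypothetical stand-in for a real calendar service."""
    events: list[str] = field(default_factory=list)

    def create_event(self, title: str) -> None:
        self.events.append(title)


def naive_agent(model_output: str, calendar: Calendar) -> bool:
    """Act directly on the model's text, with no human check in between."""
    if "create a calendar event" in model_output.lower():
        calendar.create_event(model_output)
        return True
    return False


calendar = Calendar()

# Intended trigger: the action fires, as designed.
naive_agent("Create a calendar event for Friday's demo", calendar)

# Unintended trigger: the model is only explaining itself, but the phrase
# still matches, so an unwanted event is created anyway.
naive_agent("I was not asked to create a calendar event, so I did nothing", calendar)

print(len(calendar.events))
```

Both calls create an event, even though only the first one should have. Nothing in this design gives a person the chance to catch the second.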

Starting with the setup, some techniques enable AI to utilize tools. For example, ReAct prompts a model to alternate between reasoning about a task and calling a tool, but the tools it can call, and the examples that show it how to use them, still have to be set up with human input first. Python is a common programming language for building websites and software applications that use machine learning, but those capabilities also require human intervention to get right.

Before any action is taken, we recommend humans be involved in the data collection, validation, and training process.

What a Human-in-the-loop process looks like

To better understand at what point humans should get involved, let's get a little more specific. Say a business development representative wants to summarize content from a video call and create a HubSpot or Salesforce deal from the recording.

There are multiple steps to take to make this possible:

  • Recording the video call
  • Converting the audio to text
  • Summarizing the text
  • Converting the summary (or full text) to a structured deal or opportunity record
  • Submitting the structured deal to our CRM

Before that final step, it's important to enable an expert to review the output.

This is the "human-in-the-loop" pattern — a pattern that places a human step before the system takes some destructive or state-changing action.

We would modify the workflow with one additional step:
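As a rough sketch, assuming the earlier steps have already produced a draft deal, the extra step can be a review function that sits between the draft and the CRM. The `Deal` fields, the `run_pipeline` function, and the reviewer callbacks are illustrative assumptions; in a real product the review would be a screen where an expert edits or rejects the draft:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Deal:
    """Structured record proposed by the AI steps (fields are illustrative)."""
    name: str
    amount: float
    summary: str


def run_pipeline(
    draft: Deal,
    review: Callable[[Deal], Optional[Deal]],
    submit: Callable[[Deal], None],
) -> bool:
    """Submit the deal only if the reviewer returns an approved version.

    `review` returns the (possibly corrected) deal, or None to reject it.
    """
    approved = review(draft)
    if approved is None:
        return False  # nothing reaches the CRM
    submit(approved)
    return True


crm: list[Deal] = []  # stand-in for HubSpot or Salesforce


def reviewer_fixes_amount(deal: Deal) -> Optional[Deal]:
    # The expert spots an extra zero and corrects it before approving.
    return Deal(deal.name, 12000.0, deal.summary)


def reviewer_rejects(deal: Deal) -> Optional[Deal]:
    return None


draft = Deal("Acme renewal", 120000.0, "Customer wants to renew for a year.")
run_pipeline(draft, reviewer_fixes_amount, crm.append)
run_pipeline(draft, reviewer_rejects, crm.append)
print(crm)
```

Only the corrected deal ends up in `crm`. The rejected draft never reaches the state-changing step, which is the point of the pattern.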

Why it's so important

Again, AI makes mistakes quite frequently, and it can only provide information from within its own boundaries. Because it lacks an understanding of nuances and exceptions, it is common to encounter situations where AI suggests books that simply don't exist or provides code that doesn't work at all. AI is a fantastic tool that can supercharge various tasks, but it's vital to remember that the results it produces must always be double-checked by an expert.

For instance, in the world of health care, radiologists often use AI to speed up their work, but they still need to review the AI-generated findings. This is crucial not only for quality assurance but also because of legal and liability concerns. These real risks and liability concerns will undoubtedly require a human to be involved — at least for the foreseeable future.

Embracing human-in-the-loop confidence

In many organizations, some tasks involve summarizing information, coordinating efforts, or planning activities. These areas are ripe for the integration of AI, which can streamline processes and assist in decision-making, leaving the final judgments to the true experts.

It's important to recognize that these AI tools are designed to enable and accelerate human capabilities, not to replace them entirely. To identify opportunities for improving your team or product, start with these steps:

  1. Pinpoint pain points, particularly at the intersections of different systems, such as connecting Salesforce to a custom ERP system, and at process boundaries, such as when the strategy team communicates Q4 goals to product teams.
  2. Consider employing a Build/Buy/AI Evaluation Framework to determine the best approach.
  3. If you opt for AI, such as a large language model, clearly define the inputs and outputs of the system.
  4. Develop the AI workflow, and plan for a Human-in-the-loop (HIL) step before implementing any changes to data.
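
One way to approach step 3 is to write the expected output down as a type and reject anything that does not fit it before a reviewer ever sees it. This is a minimal sketch that assumes the model has been asked to reply with a JSON object; the `DealOutput` fields and the `parse_deal` function are hypothetical:

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DealOutput:
    """The agreed output contract (fields are illustrative)."""
    name: str
    amount: float


def parse_deal(raw: str) -> DealOutput:
    """Parse model output, raising ValueError if it breaks the contract."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("name")
    amount = data.get("amount")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("'name' must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ValueError("'amount' must be a number")
    if amount < 0:
        raise ValueError("'amount' must not be negative")
    return DealOutput(name=name.strip(), amount=float(amount))


good = parse_deal('{"name": "Acme renewal", "amount": 12000}')
print(good)
```

A check like this only catches output that is malformed. A well-formed deal with the wrong amount still passes, which is why the human review in step 4 remains necessary.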

We believe that when you embrace human-in-the-loop confidence, AI can become one of your most powerful tools and most effective assets. We also believe that it’s an impactful way to satisfy your customers, stand out from your competitors, and level up your business.

Throughout this journey, MJL is here to assist you at every step.

We recognize that one of the most significant challenges facing companies today is the desire to innovate, coupled with the uncertainty about which AI approach best suits their business needs.

Our process at MJL is to take an honest look at a company's needs and at how to reach its end goal quickly with our MVP-to-Scale process (we build in weeks, not months).

We’ve created a white paper that breaks down two Conversational AI models that are used by many companies, helping you to better understand the type of AI that is available to you right now.

Download our white paper today!