Researchers devise a way to detect artificial intelligence hallucinations

Researchers at the University of Oxford in the UK have announced a method for detecting hallucinations in artificial intelligence systems and large language models, especially the cases the team calls “confabulations”, in which incorrect and arbitrary information is generated. According to the team, the method can estimate the likelihood of a prompt producing a confabulation and alert people to how reliable the AI-generated output is.

The results of the system developed by the Oxford research team were published on Wednesday (19) in the scientific journal Nature.

The problem with artificial intelligence hallucinations

Artificial intelligence hallucinations are a critical issue when creating content with large language models (LLMs), because they undermine confidence in the generated answers and can mislead people.

Hallucinations are particularly worrying when they produce convincing but wrong answers, whether the result of errors in the data used to train the model or of a systematic failure in inference.

“Responding unreliably or without the necessary information hinders the adoption of AI in many areas, with problems including fabricated legal precedents, false facts in news articles and even danger to human life in medical fields such as radiology,” the researchers justify in the article.

How to detect hallucinations

The method developed at the University of Oxford uses probabilistic tools to measure the “semantic entropy” of responses generated by models. This semantic entropy is calculated as an estimate of uncertainties at the meaning level, in contrast to previous methods that measure only lexical or syntactic differences.

The technique involves sampling several possible answers to a question and grouping them into clusters of similar meaning, using an LLM and natural language inference (NLI) tools. If one answer has the same meaning as another, the two are placed in the same semantic group.
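
To make the clustering step concrete, here is a minimal sketch of the grouping logic, not the authors' implementation: sampled answers whose meanings entail each other in both directions land in the same cluster. The `entails` function is a crude placeholder based on word-set comparison; in the published method that judgement comes from an LLM or an NLI model.

```python
import string

def entails(premise: str, hypothesis: str) -> bool:
    # Placeholder for an NLI model: treat two answers as mutually entailing
    # if they contain the same words after lowercasing and stripping punctuation.
    def words(s: str) -> set[str]:
        return set(s.lower().translate(str.maketrans("", "", string.punctuation)).split())
    return words(hypothesis) <= words(premise)

def semantic_clusters(answers: list[str]) -> list[list[str]]:
    # Greedily assign each sampled answer to an existing cluster whose
    # representative it bidirectionally entails, or open a new cluster.
    clusters: list[list[str]] = []
    for answer in answers:
        for cluster in clusters:
            representative = cluster[0]
            if entails(representative, answer) and entails(answer, representative):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters

samples = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "France's capital is Lyon.",
]
print(semantic_clusters(samples))  # two semantic groups: "Paris" vs. "Lyon"
```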

However, when a language model has too little background information to give a confident answer, it tends to generate a higher proportion of answers, and therefore of clusters, with different meanings, even if they use a similar set of words.

In this case, there is a higher level of “semantic entropy,” indicating a greater likelihood that the model is generating incoherent or unfounded responses.

“To detect confabulations, we use probabilistic tools to define and then measure the semantic entropy of the content generated by an LLM, an entropy calculated over the meanings of sentences. High entropy corresponds to high uncertainty, so semantic entropy is a way of estimating semantic uncertainty,” the researchers explain.
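
As a rough illustration of the calculation described in that quote, the sketch below estimates each meaning's probability from how many of the sampled answers fall into its cluster and then takes the Shannon entropy over those probabilities. This is a simplified assumption; the paper also describes variants that weight clusters by the model's own token probabilities.

```python
import math

def semantic_entropy(clusters: list[list[str]]) -> float:
    # Probability of each meaning ~ share of samples in its cluster;
    # entropy is then computed over meanings, not over exact word strings.
    total = sum(len(cluster) for cluster in clusters)
    probabilities = [len(cluster) / total for cluster in clusters]
    return -sum(p * math.log(p) for p in probabilities)

# All samples agree -> one cluster -> entropy 0 (consistent, likely reliable).
print(semantic_entropy([["Paris"] * 10]))
# Samples split across several meanings -> higher entropy -> likely confabulation.
print(semantic_entropy([["Paris"] * 4, ["Lyon"] * 3, ["Berlin"] * 3]))
```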

Method results

The results presented in the article show that detecting confabulations through semantic entropy is effective across different language models and domains.

The methodology was evaluated on various datasets, such as TriviaQA (trivia questions), SQuAD (general knowledge questions), BioASQ (biomedical sciences) and NQ-Open (open-ended questions based on Google queries), as well as on identifying hallucinations in mathematical problems and in generating biographies.

Tests were performed with Llama 2 (from Meta), Mistral (from the French company Mistral AI) and Falcon (from the Technology Innovation Institute in Abu Dhabi). Popular solutions such as ChatGPT (from OpenAI) and Gemini (from Google) were not part of the research.

The Oxford approach also has the advantage of being unsupervised, meaning it does not require labeled examples of hallucinations to train the system. This makes the method more adaptable to new situations and less dependent on specific patterns of AI hallucinations.

The team says the methodology can help address one of the biggest problems with artificial intelligence models. “We showed that semantic entropy can be used to predict many incorrect answers and to improve accuracy by rejecting answers to questions the model was unsure about,” the research team explains.
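
The “reject when unsure” idea from that quote can be pictured with a small hypothetical wrapper: sample several answers, score their semantic entropy, and abstain above a threshold. The threshold value and the helper callables here are illustrative assumptions, not part of the published method.

```python
from typing import Callable

ENTROPY_THRESHOLD = 0.8  # assumed cutoff for illustration; in practice it would be tuned

def answer_or_abstain(
    question: str,
    sample_answers: Callable[[str], list[str]],
    entropy_of: Callable[[list[str]], float],
) -> str:
    # Draw several answers to the same question and only respond when
    # they broadly agree in meaning (low semantic entropy).
    answers = sample_answers(question)
    if entropy_of(answers) > ENTROPY_THRESHOLD:
        return "Abstaining: the sampled answers disagree too much in meaning."
    return answers[0]  # samples are consistent; return one of them

# Toy usage with stand-in callables:
print(answer_or_abstain(
    "What is the capital of France?",
    sample_answers=lambda q: ["Paris.", "Paris.", "Paris."],
    entropy_of=lambda answers: 0.0,
))
```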

The method is expected to help increase the robustness and reliability of results obtained through artificial intelligence.
