Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Neural Machine Translation (NMT) is the standard for machine translation applications.
NMT models can produce severely pathological translations, known as hallucinations.
It is important to implement strategies to guarantee proper functioning.
This paper addresses the problem of hallucination detection in NMT.
Hallucinations exhibit encoder-decoder attention patterns that are different from good quality translations.
An optimal transport formulation is proposed to detect hallucinations.
The detector outperforms previous model-based detectors and is competitive with detectors that use large models.

Paper Content

Introduction

Neural machine translation (NMT) is the mainstream method for automatic translation
Hallucinations are severely pathological translations that are detached from the source sequence content
There is a need to develop security mechanisms to address this issue
Leveraging the encoder-decoder attention mechanism to develop an on-the-fly hallucination detector
Anomaly detection with an optimal transport (OT) formulation to find translations with source attention mass distributions that are highly distant from those of good translations

Background

Neural machine translation with transformer models

Autoregressive NMT model M defines a probability distribution over an output space of hypotheses conditioned on a source sequence.
Model is parameterized by an encoder-decoder transformer model with a set of learned weights.
Multi-head encoder-decoder attention mechanism computes a distribution over all source sentence words for each generation step.
Scaled dot-product attention is computed using queries, keys, and values matrices.
Multi-head attention is used, with a representation for each head computed by invoking Equation 1 in parallel.
Attention maps are obtained by averaging across all heads of the last layer of the decoder module.
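
For reference, the scaled dot-product attention that "Equation 1" points to is usually written as below, followed by the head-averaging step. This is the standard Transformer formulation, not a quotation from the paper. The symbols $d_k$ (key dimension), $H$ (number of heads), $A^{(h)}$ (attention weights of head $h$ in the last decoder layer) and $\Omega$ (the averaged attention map) are notational assumptions introduced here.

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left( \frac{Q K^{\top}}{\sqrt{d_k}} \right) V
\qquad
A^{(h)} = \mathrm{softmax}\!\left( \frac{Q_h K_h^{\top}}{\sqrt{d_k}} \right),
\quad
\Omega = \frac{1}{H} \sum_{h=1}^{H} A^{(h)}
```

Under this notation, each row of $\Omega$ is a distribution over source tokens for one generation step.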

Optimal transport problem and Wasserstein distance

The Wasserstein distance is a measure of the distance between two probability distributions.
The Wasserstein-1 distance, also known as Earth Mover's Distance (EMD), is obtained with the choice $c(u, v) = \|u - v\|_1$.
The EMD represents the minimum amount of “work” required to transform one pile into the other, where the work is defined as the amount of mass moved multiplied by the distance it is moved.
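
The "piles" picture can be made concrete with a minimal sketch. It assumes a simplified setting that the summary does not spell out: both piles are histograms over the same unit-spaced bins, and moving mass from bin i to bin j costs |i - j|. The function name `emd_1d` is hypothetical.

```python
from itertools import accumulate


def emd_1d(a, b):
    """Earth Mover's Distance between two histograms on the same unit-spaced bins.

    Assumption: moving mass from bin i to bin j costs |i - j|.
    """
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(a) - sum(b)) > 1e-9:
        raise ValueError("histograms must carry the same total mass")
    # The running surplus after bin i is the mass that has to cross the
    # boundary between bin i and bin i + 1; each crossing costs one unit.
    surplus = accumulate(x - y for x, y in zip(a, b))
    return sum(abs(s) for s in surplus)


if __name__ == "__main__":
    # One unit of mass moved across two bins: work = 1 * 2.
    print(emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))
```

In one dimension the optimal plan never needs to be searched for explicitly, which is why a cumulative sum is enough here. General cost matrices require an optimal transport solver.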

Hallucinations in NMT

Hallucinations lie at the extreme end of NMT pathologies
Hallucinations contain content detached from the source sentence
Hallucinations can be categorized as largely fluent detached hallucinations or oscillatory hallucinations
Detached hallucinations can be split according to the severity of detachment into strongly detached and fully detached
Oscillatory hallucinations contain erroneous repetitions of words and phrases

Detection of hallucinations in NMT

Categorization of on-the-fly detectors

On-the-fly hallucination detectors detect hallucinations without reference translations
Previous work on on-the-fly detection of hallucinations in NMT has focused on two categories of detectors: external and model-based
External detectors use large language models trained on millions or billions of samples
Model-based detectors only require access to the NMT model and leverage internal features
Model-based detectors can be predictive of hallucinations and outperform quality estimation models
This work proposes a new model-based detector that outperforms all previously proposed model-based detectors

Problem statement

Model-based detectors require obtaining internal features from a model M.
A scoring function $s_M$ and a threshold $\tau$ are used to build a binary rule $g_M$.
If $s_M$ is an anomaly score, $g_M(x) = 0$ implies a 'normal' translation and $g_M(x) = 1$ implies a 'hallucination' translation.
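
Written out, the binary rule described above takes roughly the following form. The indicator notation and the strict inequality are assumptions made for illustration; the summary only states that the rule is built from a scoring function and a threshold.

```latex
g_M(x) = \mathbb{1}\left[ s_M(x) > \tau \right] =
\begin{cases}
1 & \text{if } s_M(x) > \tau \quad \text{(flagged as hallucination)} \\
0 & \text{otherwise} \quad \text{(treated as normal)}
\end{cases}
```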

Unsupervised hallucination detection with optimal transport

Anomalous attention maps have been connected to the hallucinatory mode in several works
Our method uses the Wasserstein distance to estimate the cost of transforming a translation source mass distribution into a reference distribution
The higher the cost of transformation, the more distant and anomalous the attention of the translation is
We rely on the generated translation and its source mass distribution to decide whether the translation is a hallucination or not
We compute an anomaly score by measuring the Wasserstein distance between the source mass distribution and a reference distribution
In Wass-to-Unif, the reference distribution is the uniform distribution
Wass-to-Unif uses the 0/1 cost function
In Wass-to-Data, we use a held-out dataset to construct a set of held-out source attention distributions
We apply a length filter to construct the sample reference set
We compute pairwise Wasserstein-1 distances between the source mass distribution and each element of the reference set
We obtain the anomaly score by averaging the bottom-k (k smallest) distances
Wass-to-Data thus evaluates how anomalous a given translation is compared to the data distribution
We combine Wass-to-Unif and Wass-to-Data into a single detector
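
The two scores named above can be sketched in plain Python under several assumptions that are not confirmed by this summary: the source mass distribution is taken as the column-wise mean of the attention map; optimal transport with a 0/1 cost between distributions on the same support reduces to the total variation distance; the Wasserstein-1 distance treats the entries of each distribution as equally weighted samples on the real line, so that distributions of different lengths can be compared; and the length filter is a relative window. All function and parameter names (`source_mass`, `wass_to_unif`, `wass_to_data`, `k`, `max_len_ratio`) are hypothetical.

```python
from bisect import bisect_right


def source_mass(attention_map):
    """Average attention each source token receives (rows = generation steps)."""
    steps = len(attention_map)
    return [sum(col) / steps for col in zip(*attention_map)]


def wass_to_unif(mass):
    """Distance to the uniform distribution under a 0/1 cost.

    With a 0/1 cost on a shared support, the optimal transport cost
    equals the total variation distance.
    """
    n = len(mass)
    return 0.5 * sum(abs(p - 1.0 / n) for p in mass)


def wasserstein_1(u, v):
    """Wasserstein-1 distance between two empirical 1-D measures (assumption)."""
    u_sorted, v_sorted = sorted(u), sorted(v)
    points = sorted(u_sorted + v_sorted)
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def wass_to_data(mass, reference_set, k=4, max_len_ratio=0.1):
    """Mean of the k smallest distances to length-filtered reference distributions."""
    window = max_len_ratio * len(mass)  # hypothetical form of the length filter
    candidates = [r for r in reference_set if abs(len(r) - len(mass)) <= window]
    if not candidates:
        raise ValueError("no reference distribution passes the length filter")
    nearest = sorted(wasserstein_1(mass, r) for r in candidates)[:k]
    return sum(nearest) / len(nearest)


if __name__ == "__main__":
    attn = [[0.5, 0.5], [1.0, 0.0]]  # toy attention map: 2 steps, 2 source tokens
    mass = source_mass(attn)
    print(wass_to_unif(mass))
    print(wass_to_data(mass, [[0.5, 0.5], [1.0, 0.0]], k=1))
```

Both functions return larger values for attention that sits further from the chosen reference, matching the reading that a higher transport cost means a more anomalous translation.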

Model and data

Dataset of 3415 translations for WMT'18 DE-EN news translation data released by Guerreiro et al. (2022)
Structured annotations on critical errors and hallucinations
This is the only available dataset with real hallucinations produced naturally by an NMT model
Experiments use same Transformer model that generated translations in dataset

Baseline detectors

Attn-ign-SRC is a method that computes the proportion of source words with a total incoming attention mass lower than a threshold
Seq-Logprob is a method that computes the length-normalised sequence log-probability of the generated translation
CometKiwi is a reference-free model trained on nearly one million direct assessment annotations
LaBSE is a method that leverages cross-lingual sentence representations for the source sequence and translation
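
The two model-based baselines above are simple enough to sketch directly from their descriptions. The function names, the input layout (a list of per-token log-probabilities; an attention map with one row per generation step) and the default threshold value are illustrative assumptions, not details taken from the paper.

```python
def seq_logprob(token_logprobs):
    """Length-normalised sequence log-probability of a generated translation."""
    if not token_logprobs:
        raise ValueError("translation must contain at least one token")
    return sum(token_logprobs) / len(token_logprobs)


def attn_ign_src(attention_map, threshold=0.2):
    """Proportion of source words whose total incoming attention is below a threshold.

    attention_map: rows are generation steps, columns are source words.
    threshold: hypothetical default, chosen only for illustration.
    """
    incoming = [sum(col) for col in zip(*attention_map)]
    ignored = sum(1 for total in incoming if total < threshold)
    return ignored / len(incoming)


if __name__ == "__main__":
    print(seq_logprob([-1.0, -3.0]))
    print(attn_ign_src([[0.9, 0.1, 0.0], [0.8, 0.1, 0.1]], threshold=0.25))
```

Note that the two scores point in opposite directions: a lower Seq-Logprob indicates lower model confidence, while a higher Attn-ign-SRC indicates that more of the source is being ignored.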

Evaluation metrics

Use AUROC and FPR@90TPR to evaluate performance of detectors
Threshold-independent evaluation metrics provide comprehensive view of performance without being influenced by threshold choice
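
As a rough illustration of the two metrics, the sketch below computes them from anomaly scores of hallucinated (positive) and non-hallucinated (negative) translations. It assumes higher scores mean "more anomalous" and that a translation is flagged when its score is at or above the threshold. The function names are hypothetical, and the pairwise AUROC is written for clarity, not speed.

```python
import math


def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def fpr_at_tpr(pos_scores, neg_scores, target_tpr=0.9):
    """False positive rate at the highest threshold that reaches the target TPR."""
    ranked = sorted(pos_scores, reverse=True)
    # Number of positives that must be flagged to reach the target TPR.
    needed = max(math.ceil(target_tpr * len(ranked) - 1e-12), 1)
    tau = ranked[needed - 1]
    false_positives = sum(1 for n in neg_scores if n >= tau)
    return false_positives / len(neg_scores)


if __name__ == "__main__":
    pos = [0.9, 0.8, 0.7, 0.2]
    neg = [0.1, 0.3, 0.4, 0.6]
    print(auroc(pos, neg), fpr_at_tpr(pos, neg))
```

For AUROC higher is better, while for FPR@90TPR lower is better, since it measures how many good translations are wrongly flagged when 90% of hallucinations are caught.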

Implementation details

Used library POT: Python Optimal Transport (Flamary et al., 2021)

Results

Performance on on-the-fly detection

Wass-Combo is the best model-based detector
Wass-Combo outperforms most other methods
Data proximity is helpful to detect hallucinations
Combining model-based and data-driven methods brings further performance improvements
LaBSE outperforms the state-of-the-art quality estimation system CometKiwi

Do detectors specialize in different types of hallucinations?

Performance of different detectors for different types of hallucinations is analyzed
Fully detached hallucinations are easy to detect
Combining methods can be helpful
Strongly detached hallucinations are the hardest to detect
Oscillatory hallucinations are difficult to detect with model-based detectors
Combining methods can detect oscillatory hallucinations
Constructing reference set with translations of any quality improves performance
Length-filtering the distributions in reference set boosts performance significantly

Conclusions

Propose a novel plug-in model-based detector for hallucinations in NMT
Detector aims to find translations with source attention mass distribution that are distant from good quality translations
Outperforms all previous model-based detectors
Competitive with detectors that use large, pre-trained models
Does not require any training data
Can be easily deployed in real-world scenarios
Training quality estimation models with more negative examples can improve ability to penalize hallucinations
Ablations on Wass-to-Data and Wass-Combo for all relevant hyperparameters
Qualitative analysis on fixed-threshold scenario
Not able to detect fully detached hallucinations
Struggles with oscillatory hallucinations
Implemented binary heuristic to detect oscillatory hallucinations
