Abstract

  • Recent work has shown that adversarial poisoning of input contexts can cause large drops in accuracy for production systems.
  • Little to no work has proposed methods to defend against these attacks.
  • A new method is proposed that uses query augmentation to search for a diverse set of retrieved passages.
  • A novel confidence method is designed to compare the predicted answer to its appearance in the retrieved contexts.
  • This method allows for a simple but effective way to defend against poisoning attacks.
  • Gains of 5-20% exact match across varying levels of data poisoning are achieved.

Paper Content

Introduction

  • Open-domain question answering (ODQA) is the task of answering a given question based on evidence from a large corpus of documents
  • Recent work in ODQA has resulted in many well-curated datasets
  • Malicious actors can affect articles input to an ODQA system
  • Recent work has recognized the potential for bad actors to influence automated knowledge-intensive NLP systems
  • The problem of defending against data poisoning attacks is still understudied
  • A query augmentation scheme is proposed to gather a larger, more diverse set of passages
  • A new confidence method is proposed to decide when to use the newly gathered contexts versus the original ones

Experimental details

  • Simulate realistic misinformation attacks on Wikipedia
  • Poison entire articles at a time
  • Experimented with poisoning top/random contexts, similar results

Data

  • Natural Questions and TriviaQA are popular datasets for open domain question answering
  • Natural Questions dataset was gathered from real-user queries on Google Search
  • TriviaQA dataset was collected by scraping question and answer pairs from trivia websites
  • Data poisoning is simulated using code from Longpre et al. (2021)
  • Data poisoning uses answers to questions to suggest an entity of the same type to replace the correct answer
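
The substitution step described above can be sketched as follows. This is a minimal illustration, not the code from Longpre et al. (2021): the entity pool, the type labels, and the function name `poison_passage` are all hypothetical, and the answer's entity type is assumed to be known in advance.

```python
import random
import re

# Hypothetical pool of replacement entities, grouped by entity type.
ENTITY_POOL = {
    "PERSON": ["Marie Curie", "Alan Turing", "Ada Lovelace"],
    "LOCATION": ["Paris", "Nairobi", "Lima"],
}

def poison_passage(passage, answer, answer_type, rng):
    """Replace every mention of `answer` with another entity of the same type."""
    candidates = [e for e in ENTITY_POOL[answer_type] if e.lower() != answer.lower()]
    substitute = rng.choice(candidates)
    pattern = re.compile(re.escape(answer), flags=re.IGNORECASE)
    # A callable replacement avoids any backslash interpretation in `substitute`.
    poisoned = pattern.sub(lambda _match: substitute, passage)
    return poisoned, substitute

rng = random.Random(0)
passage = "Alan Turing proposed the imitation game. Alan Turing worked at Bletchley Park."
poisoned, substitute = poison_passage(passage, "Alan Turing", "PERSON", rng)
print(poisoned)
```

After poisoning, the passage still reads fluently but supports the wrong answer, which is what makes this kind of attack hard for a reader model to detect.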

Models

  • FiD (Fusion-in-Decoder) is an encoder-decoder architecture that generates an answer given retrieved evidence
  • FiD uses the DPR (Dense Passage Retrieval) bi-encoder architecture for retrieval, which embeds each document and each query as its own single dense vector
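
Bi-encoder retrieval of this kind ranks passages by the inner product between the query vector and each passage vector. The sketch below shows only that ranking step. The three-dimensional vectors are hand-written stand-ins for encoder outputs, and `retrieve` is a hypothetical helper name, not part of the DPR codebase.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def retrieve(query_vec, passage_vecs, k):
    """Return the ids of the k passages with the largest inner product with the query."""
    ranked = sorted(
        passage_vecs.items(),
        key=lambda item: dot(query_vec, item[1]),
        reverse=True,
    )
    return [passage_id for passage_id, _ in ranked[:k]]

# Toy embeddings (assumed values); a real system would obtain these from trained encoders.
passage_vecs = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.1, 0.8, 0.1],
    "p3": [0.7, 0.2, 0.1],
}
query_vec = [1.0, 0.0, 0.0]
top = retrieve(query_vec, passage_vecs, k=2)
print(top)
```

Because each passage is reduced to one vector ahead of time, a poisoned passage that stays topically similar to the question keeps a high retrieval score.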

Metrics and hyperparameters

  • Following previous work in question answering, Exact Match (EM) is used as the evaluation metric
  • Data was split into validation and test sets, taking 50% of the data from Longpre et al. (2021) for each split
  • Experiments were run on a cluster of V100 GPUs, with each job running on a 4 to 8 GPU node
  • Models were used as provided by the original authors with default retriever hyperparameters
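
For reference, Exact Match can be computed as in the sketch below. The normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace) follows the convention commonly used in QA evaluation scripts. That the paper uses exactly this normalization is an assumption, and the function names are illustrative.

```python
import re
import string

_PUNCTUATION = set(string.punctuation)

def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold_answers):
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)

def em_score(predictions, references):
    """Percentage of predictions that exactly match one of their references."""
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)

print(em_score(["The Eiffel Tower!", "London"], [["Eiffel Tower"], ["Berlin"]]))
```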

Query augmentation

  • Query augmentation is a traditional information retrieval technique
  • Recently, neural models have been used to generate query expansions
  • Confidence and calibration of QA models have been studied to reflect correct answer rate
  • Answer redundancy has been studied in other NLP contexts

Confidence from answer redundancy

  • CAR is a novel method for measuring ODQA confidence
  • CAR measures how often the predicted answer appears in the retrieved contexts
  • CAR is used as a signal for downstream calibration efforts
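
A minimal sketch of the idea behind CAR is to count how many retrieved contexts mention the predicted answer. The details here are assumptions made for illustration: the text normalization, the whole-word matching, and counting contexts rather than individual mentions may all differ from the paper's implementation.

```python
import string

_PUNCT = set(string.punctuation)

def _normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in _PUNCT)
    return " ".join(text.split())

def answer_redundancy(predicted_answer, contexts):
    """Number of retrieved contexts that contain the predicted answer as whole words."""
    answer = _normalize(predicted_answer)
    if not answer:
        return 0
    # Pad with spaces so that "par" does not match inside "paris".
    needle = f" {answer} "
    return sum(1 for context in contexts if needle in f" {_normalize(context)} ")

contexts = [
    "Paris is the capital of France.",
    "The capital is Paris, on the Seine.",
    "Lyon is a city in France.",
]
print(answer_redundancy("Paris", contexts))
```

A higher count suggests the answer is supported by several independent passages, while a count near zero suggests the model may be relying on a single, possibly poisoned, context.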

Answer resolution

  • Answer redundancy confidence helps determine when the model is confident in the answer it predicts for a given query
  • Four strategies for combining more than one question and passage set are explored: the baseline, randomly picking a new augmented question, a majority vote over the augmented questions’ predictions, and answer redundancy
  • Three different data type settings: original contexts with original/new questions, new augmented questions and contexts, original question with new contexts
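
The four resolution strategies listed above can be sketched as small selection functions over a list of candidate predictions. The candidate structure (a dict with `query`, `answer`, and `redundancy` keys, with the original question first) is hypothetical, and picking the highest redundancy score is one plausible reading of the answer redundancy strategy rather than the paper's exact rule.

```python
import random
from collections import Counter

def resolve_baseline(candidates):
    """Keep the prediction for the original question (assumed to be first)."""
    return candidates[0]["answer"]

def resolve_random(candidates, rng):
    """Pick the prediction of one randomly chosen augmented question."""
    return rng.choice(candidates[1:])["answer"]

def resolve_majority(candidates):
    """Pick the most frequent prediction among the augmented questions."""
    votes = Counter(c["answer"] for c in candidates[1:])
    return votes.most_common(1)[0][0]

def resolve_redundancy(candidates):
    """Pick the prediction with the highest redundancy score; ties keep the earliest."""
    return max(candidates, key=lambda c: c["redundancy"])["answer"]

# Toy candidates (assumed values): the original question was misled by a poisoned context.
candidates = [
    {"query": "original", "answer": "Lyon", "redundancy": 1},
    {"query": "augmented 1", "answer": "Paris", "redundancy": 4},
    {"query": "augmented 2", "answer": "Paris", "redundancy": 3},
    {"query": "augmented 3", "answer": "Nice", "redundancy": 0},
]
print(resolve_baseline(candidates), resolve_majority(candidates), resolve_redundancy(candidates))
```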

Results

  • Results for the FiD model are used to highlight the key findings
  • As poisoned data increases, model performance decreases
  • Majority vote performs worse than original question
  • Original question with new contexts performs best
  • Gains of 5-20% EM across datasets
  • Even one augmented query provides gains over baseline
  • Some edge cases are missed by the original poisoning method

Conclusion

  • Data poisoning can be used to attack open-domain question answering systems
  • Two novel methods are proposed to defend against data poisoning attacks: query augmentation and answer redundancy
  • Performance improvement of almost 20 points in exact match
  • Focused on TriviaQA and Natural Questions datasets
  • Defense strategy depends on finding alternate sources of information outside of the poisoned contexts
  • Data poisoning attacks have a long history in NLP
  • Related work focuses on making harder questions rather than simulating a real attack
  • Results show that as the number of augmented queries increases, so does the performance