Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Recent work has shown that adversarial poisoning of input contexts can cause large drops in accuracy for production systems.
- Little to no work has proposed methods to defend against these attacks.
- A new method is proposed that uses query augmentation to search for a diverse set of retrieved passages.
- A novel confidence method is designed to compare the predicted answer to its appearance in the retrieved contexts.
- This method allows for a simple but effective way to defend against poisoning attacks.
- Gains of 5-20% exact match across varying levels of data poisoning are achieved.
Paper Content
Introduction
- ODQA is the task of answering a given question based on evidence from a large corpus of documents
- Recent work in ODQA has resulted in many well-curated datasets
- Malicious actors can affect articles input to an ODQA system
- Recent work has recognized the potential for bad actors to influence automated knowledge-intensive NLP systems
- Problem of defending against data poisoning attacks is still understudied
- Proposed query augmentation scheme to gather a larger set of diverse passages
- Proposed a new confidence method to decide when to use the newly gathered contexts vs the original
Experimental details
- Simulate realistic misinformation attacks on Wikipedia
- Poison entire articles at a time
- Experimented with poisoning top/random contexts, similar results
Data
- Natural Questions and TriviaQA are popular datasets for open domain question answering
- Natural Questions dataset was gathered from real-user queries on Google Search
- TriviaQA dataset was collected by scraping question and answer pairs from trivia websites
- Data poisoning is simulated using code from Longpre et al. (2021)
- Data poisoning uses answers to questions to suggest an entity of the same type to replace the correct answer
Models
- FiD is an encoder-decoder architecture that generates an answer given retrieved evidence
- FiD uses the DPR bi-encoder architecture for retrieval, embedding documents and queries into a single dense vector
Metrics and hyperparameters
- Previous work in question answering used Exact Match (EM)
- Data was split into validation and test sets, taking 50% of the data from Longpre et al. (2021) for each split
- Experiments were run on a cluster of V100 GPUs, with each job running on a 4 to 8 GPU node
- Models were used as provided by the original authors with default retriever hyperparameters
Query augmentation
- Query augmentation is a traditional information retrieval technique
- Recently, neural models have been used to generate query expansions
- Confidence and calibration of QA models have been studied to reflect correct answer rate
- Answer redundancy has been studied in other NLP contexts
Confidence from answer redundancy
- CAR is a novel method for measuring ODQA confidence
- CAR measures how often the predicted answer appears in the retrieved contexts
- CAR is used as a signal for downstream calibration efforts
Answer resolution
- Answer redundancy confidence helps to know when a query is confident about its predicted answer
- Strategies used to explore combining more than one question and passage set: baseline, randomly pick a new augmented question, majority vote of augmented question’s predictions, answer redundancy
- Three different data type settings: original contexts with original/new questions, new augmented questions and contexts, original question with new contexts
Results
- FiD model highlights key findings
- As poisoned data increases, model performance decreases
- Majority vote performs worse than original question
- Original question with new contexts performs best
- Gains of 5-20% EM across datasets
- Even one augmented query provides gains over baseline
- Edge cases missed by original poisoning method
Conclusion
- Data poisoning attacks can be used to attack open-domain question answering systems
- Two novel methods proposed to defend against data poisoning attacks: query augmentation and answer redundancy
- Performance improvement of almost 20 points in exact match
- Focused on TriviaQA and Natural Questions datasets
- Defense strategy depends on finding alternate sources of information outside of the poisoned contexts
- Data poisoning attacks have a long history in NLP
- Related work focuses on making harder questions rather than simulating a real attack
- Results show that as the number of augmented queries increases, so does the performance