Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Dialogue models can be difficult to control and may produce non-engaging, unsafe results.
- DialGuide is a framework for controlling dialogue model behavior using natural language rules.
- DialGuide is evaluated on three tasks in open-domain dialogue response generation.
- DialGuide is effective in the dialogue safety domain, producing safe and engaging responses.
Paper Content
Introduction
- Current open-domain dialogue models can generate fluent and interesting responses, but require large datasets to re-purpose them
- Most deployed conversational systems use handcrafted rules and templates
- Rigid and have poor coverage
- DIALGUIDE framework proposed to control dialogue response generation using natural language rules (guidelines)
- Guidelines consist of an “if x” condition and a “then y” action
- Retrieve-then-infer process used to retrieve relevant guidelines
- Guidelines can be added, removed, or edited at any point
- Prior work used task-dependent discrete labels, but difficult to incorporate new control codes
- Guidelines more natural for control, no need to retrain model
- DI-ALGUIDE framework proposed to enable guideline-based dialogue response generation
- Benchmark performance established on three tasks: guideline selection, response generation, and response entailment verification
- Models tuned on data can perform well and lead to better control over responses
Related work
- Controlling dialogue systems has been studied to generate useful and engaging responses, avoid toxic content, and prevent biases
- Most approaches train models on discrete labels or control codes
- Our guideline based control allows the specification of a combination of multiple control types through natural language
- Neural dialogue models are the mainstream in research, but most chatbots in deployment still use handcrafted rules and templates
- Recent progress on using natural language prompts and instructions for controlling models
- Fixing models through intervention by computing targeted changes in the model’s parameters or natural language feedback
- Dialogue safety is an issue, approaches to mitigate include filtering out unsafe text, specialized decoding procedures, and controlled language generation techniques
- Response selection task aims to select a response from a set of candidates, given the context of a conversation
- Response entailment based approaches predict if a response entails a premise
Proposed task and data collection
- Aim: Enable control over dialogue model through developer-defined guidelines
- Guidelines consist of two parts: condition and action
- Condition: Specifies which contexts the guideline is relevant to
- Action: Specifies what the response should contain
- Action can be specific or abstract
- DIALGUIDE consists of three tasks: guideline retrieval, response generation, response entailment verification
- Two versions of DIALGUIDE: BST and Safety
- BST: Annotations collected from Amazon Mechanical Turk
- Safety: Annotations collected from ProsocialDialog dataset
- Guideline writing task: Annotators write guidelines and responses that follow the guidelines
- Guideline annotation task: Annotators label guidelines as relevant or not to the context
- Response entailment verification task: Annotators mark if response follows the guideline or not
- Adversarial responses collected in verification task
- Dataset statistics in Tables 1 and 2
Experiments and results
- Conducted experiments on three tasks: guideline retrieval, response selection, and response entailment verification
- Results discussed in this section
Guideline retrieval
- Generating a safe response based on a guideline and dialogue context
- Experiment on dev and test sets of DIALGUIDE-SAFETY
- Comparing DialBart0-noguideline, DialBart0-withguideline, DialBart-rot, OPT30B-fewshot, and DialBart-rot
- Evaluating generated responses with safety metric
- DialBart0-withguideline improves safety by 5% points
- DialBart-rot uses RoTs (rules of thumbs)
- Ablation experiments with No-guidelines and Safety-only baselines
Response generation
- Models are trained with dialogue context and guideline as input and output response as output
- Ret-generate model retrieves guidelines in two steps and selects randomly from set with score greater than 98%
- Ret-robust model has additional instances with gold guideline replaced with random guideline for 20% of training data
- Evaluation reports Bleu-2,4 and RougeL scores, distinct-1,2, Gd-Bleu-2 and RS-entail
Qualitative analysis
- Table 7 and 8 show sample inputs, guidelines and outputs from models for the Response generation experiment for DIALGUIDE-BST and DIALGUIDE-SAFETY
- Dialguide-tuned and OPT30B-fewshot use the gold guideline
- Multistep baseline generates its own guideline, and Ret-generate and Ret-robust use a retrieved guideline
- Dialguide-tuned follows the gold guideline and generates a safe response
- OPT30B-fewshot model output does not relate to the topic of the conversation
- Multistep baseline generates a guideline and response that focuses on the topic of the conversation
- Ret-generate response focuses too much on the provided guideline making the response somewhat incoherent
- Ret-robust is able to accommodate both the context and the guideline
- No-guideline model tuned on safety response data without guidelines generates a safe response
- Gold RoT is more generic compared to the guideline
- Dialguide-tuned shows the best performance in both results and qualitative analysis
- Retrieval baselines also show good performance and are more practical
- Multistep baseline is useful when no good guideline is available
Conclusion
- DialGuide framework and dataset provide a solution for controlling dialogue model behavior using natural language rules.
- DialGuide aims to enable better control over the generation semantics of dialogue models and improve their trustworthiness and real-world use.
- DialGuide is evaluated on two domains, chit-chat and safety.
- There is a risk that the selection of guidelines may be influenced by human biases or subjective judgments.
- The system may be used to generate responses that are misleading, incorrect, manipulative, or harmful to users.
- Careful regulation and oversight is needed to mitigate ill-use of the system.
- Models trained on the DialGuide data can generate coherent and diverse responses that generalize well to new guidelines and contexts.