Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Dialogue models can be difficult to control and may produce non-engaging, unsafe results.
DialGuide is a framework for controlling dialogue model behavior using natural language rules.
DialGuide is evaluated on three tasks in open-domain dialogue response generation.
DialGuide is effective in the dialogue safety domain, producing safe and engaging responses.

Current open-domain dialogue models can generate fluent and interesting responses, but repurposing them requires large datasets
Most deployed conversational systems use handcrafted rules and templates, which are rigid and have poor coverage
DIALGUIDE framework proposed to control dialogue response generation using natural language rules (guidelines)
Guidelines consist of an “if x” condition and a “then y” action
Retrieve-then-infer process used to retrieve relevant guidelines
Guidelines can be added, removed, or edited at any point
Prior work used task-dependent discrete labels, which make it difficult to incorporate new control codes
Guidelines are a more natural form of control and do not require retraining the model
DIALGUIDE framework proposed to enable guideline-based dialogue response generation
Benchmark performance established on three tasks: guideline selection, response generation, and response entailment verification
Models tuned on data can perform well and lead to better control over responses
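The retrieve-then-infer idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Guideline` class, the word-overlap scorer, the 0.5 threshold, and the input format in `build_model_input` are all hypothetical stand-ins for a learned retriever and generator, and the example guidelines are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guideline:
    condition: str  # the "if x" part: which contexts the guideline applies to
    action: str     # the "then y" part: what the response should do


def overlap_score(context: str, condition: str) -> float:
    """Toy relevance score: fraction of condition words found in the context.

    Assumption: stands in for a learned retriever; not the paper's model.
    """
    context_words = set(context.lower().split())
    condition_words = set(condition.lower().split())
    if not condition_words:
        return 0.0
    return len(context_words & condition_words) / len(condition_words)


def retrieve(context, guidelines, threshold=0.5):
    """Return guidelines scoring at or above the threshold, best first."""
    scored = [(overlap_score(context, g.condition), g) for g in guidelines]
    relevant = [(s, g) for s, g in scored if s >= threshold]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [g for _, g in relevant]


def build_model_input(context, guideline):
    """Hypothetical input format for a guideline-conditioned generator."""
    return f"context: {context} guideline: {guideline.action}"


# Invented example guidelines and context.
guidelines = [
    Guideline("talking about travel plans", "suggest a destination"),
    Guideline("feeling sad or lonely", "respond with empathy"),
]
context = "i am talking to my sister about our travel plans"

relevant = retrieve(context, guidelines)

# Guidelines are plain data, so adding one requires no retraining.
guidelines.append(Guideline("talking about a sister", "ask about the sister"))
relevant_after_edit = retrieve(context, guidelines)
```

Because the guidelines live in an ordinary list outside the model, adding, removing, or editing one changes what is retrieved on the next turn without touching model weights.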

Controlling dialogue systems has been studied to generate useful and engaging responses, avoid toxic content, and prevent biases
Most approaches train models on discrete labels or control codes
Our guideline based control allows the specification of a combination of multiple control types through natural language
Neural dialogue models are the mainstream in research, but most chatbots in deployment still use handcrafted rules and templates
Recent work has made progress on using natural language prompts and instructions for controlling models
Other work fixes models through intervention, either by computing targeted changes to the model’s parameters or through natural language feedback
Dialogue safety is an issue; approaches to mitigate it include filtering out unsafe text, specialized decoding procedures, and controlled language generation techniques
Response selection task aims to select a response from a set of candidates, given the context of a conversation
Response entailment based approaches predict if a response entails a premise

Aim: Enable control over dialogue model through developer-defined guidelines
Guidelines consist of two parts: condition and action
Condition: Specifies which contexts the guideline is relevant to
Action: Specifies what the response should contain
Action can be specific or abstract
DIALGUIDE consists of three tasks: guideline retrieval, response generation, response entailment verification
Two versions of DIALGUIDE: BST and Safety
BST: Annotations collected from Amazon Mechanical Turk
Safety: Annotations collected from ProsocialDialog dataset
Guideline writing task: Annotators write guidelines and responses that follow the guidelines
Guideline annotation task: Annotators label guidelines as relevant or not to the context
Response entailment verification task: Annotators mark if response follows the guideline or not
Adversarial responses collected in verification task
Dataset statistics in Tables 1 and 2
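One way to picture the annotations described above is as a single record per example, with one field per annotation task. The sketch below is an assumption about how such data could be organized, not the released dataset schema; the class names, field names, and the sample dialogue are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Guideline:
    condition: str  # which contexts the guideline is relevant to
    action: str     # what the response should contain


@dataclass
class AnnotatedExample:
    context: List[str]                # dialogue turns so far
    guideline: Guideline
    response: str
    guideline_relevant: bool          # label from the guideline annotation task
    response_follows_guideline: bool  # label from the entailment verification task


# Invented examples: one response that follows the guideline and one
# adversarial response that stays on topic but ignores the action.
guideline = Guideline(
    condition="the user talks about their pet",
    action="ask what kind of pet they have",
)
context = ["I just got home and my pet was so happy to see me."]

examples = [
    AnnotatedExample(context, guideline, "That is sweet! What kind of pet do you have?", True, True),
    AnnotatedExample(context, guideline, "Pets are great. I like hiking.", True, False),
]


def entailment_pairs(records):
    """Build (guideline action, response, label) triples for entailment verification."""
    return [(r.guideline.action, r.response, r.response_follows_guideline) for r in records]


pairs = entailment_pairs(examples)
```

Keeping the relevance label and the entailment label on the same record makes it easy to derive training data for each of the three tasks from one annotated pool.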

Conducted experiments on three tasks: guideline retrieval, response generation, and response entailment verification
Results discussed in this section

Generating a safe response based on a guideline and dialogue context
Experiment on dev and test sets of DIALGUIDE-SAFETY
Comparing DialBart0-noguideline, DialBart0-withguideline, OPT30B-fewshot, and DialBart-rot
Evaluating generated responses with safety metric
DialBart0-withguideline improves safety by 5 percentage points
DialBart-rot uses RoTs (rules of thumb)
Ablation experiments with No-guidelines and Safety-only baselines

Models are trained with the dialogue context and guideline as input and the response as output
Ret-generate model retrieves guidelines in two steps and selects randomly from set with score greater than 98%
Ret-robust model has additional instances with gold guideline replaced with random guideline for 20% of training data
Evaluation reports Bleu-2, Bleu-4, and RougeL scores, distinct-1 and distinct-2, Gd-Bleu-2, and RS-entail
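The Ret-generate selection step and the Ret-robust augmentation described above can be sketched as below. This is a hedged illustration only: the function names and the dictionary layout are hypothetical, the two-step retrieval itself is not shown (scores are taken as given), and keeping the original response as the target for the noisy copies is an assumption.

```python
import random


def select_guideline(scored_guidelines, threshold=0.98, rng=None):
    """Pick uniformly at random among guidelines scoring above the threshold.

    `scored_guidelines` is a list of (guideline, score) pairs produced by
    some retriever. Returns None when no guideline clears the threshold.
    """
    if rng is None:
        rng = random.Random()
    candidates = [g for g, score in scored_guidelines if score > threshold]
    if not candidates:
        return None
    return rng.choice(candidates)


def add_noisy_instances(examples, guideline_pool, fraction=0.2, rng=None):
    """Return the examples plus extra copies with the gold guideline swapped.

    For `fraction` of the training examples, an additional instance is
    created whose guideline is a random, different one from the pool.
    Assumption: the target response is left unchanged in the copy.
    """
    if rng is None:
        rng = random.Random()
    n_noisy = int(len(examples) * fraction)
    noisy = []
    for ex in rng.sample(examples, n_noisy):
        alternatives = [g for g in guideline_pool if g != ex["guideline"]]
        if alternatives:
            noisy.append({**ex, "guideline": rng.choice(alternatives)})
    return examples + noisy


# Invented toy data for illustration.
pool = ["guideline A", "guideline B", "guideline C"]
train = [
    {"context": f"context {i}", "guideline": pool[i % 3], "response": f"response {i}"}
    for i in range(10)
]
augmented = add_noisy_instances(train, pool, fraction=0.2, rng=random.Random(0))
chosen = select_guideline([("guideline A", 0.99), ("guideline B", 0.50)])
```

Training on some mismatched guidelines is one plausible way to make a generator less dependent on the retriever being right, which fits the behavior attributed to Ret-robust in the qualitative analysis.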

Tables 7 and 8 show sample inputs, guidelines, and outputs from models for the response generation experiment on DIALGUIDE-BST and DIALGUIDE-SAFETY
Dialguide-tuned and OPT30B-fewshot use the gold guideline
Multistep baseline generates its own guideline, and Ret-generate and Ret-robust use a retrieved guideline
Dialguide-tuned follows the gold guideline and generates a safe response
OPT30B-fewshot model output does not relate to the topic of the conversation
Multistep baseline generates a guideline and response that focuses on the topic of the conversation
Ret-generate response focuses too much on the provided guideline, making the response somewhat incoherent
Ret-robust is able to accommodate both the context and the guideline
No-guideline model tuned on safety response data without guidelines generates a safe response
Gold RoT is more generic compared to the guideline
Dialguide-tuned shows the best performance in both results and qualitative analysis
Retrieval baselines also show good performance and are more practical
Multistep baseline is useful when no good guideline is available

DialGuide framework and dataset provide a solution for controlling dialogue model behavior using natural language rules.
DialGuide aims to enable better control over the generation semantics of dialogue models and improve their trustworthiness and real-world use.
DialGuide is evaluated on two domains, chit-chat and safety.
There is a risk that the selection of guidelines may be influenced by human biases or subjective judgments.
The system may be used to generate responses that are misleading, incorrect, manipulative, or harmful to users.
Careful regulation and oversight are needed to mitigate misuse of the system.
Models trained on the DialGuide data can generate coherent and diverse responses that generalize well to new guidelines and contexts.