Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Character-level tasks are challenging for models based in subword tokenization.
Geiger et al. (2021) method is adapted to operate on type-level variables over characters.
Character-level tasks are tested that vary in their dependence on meaning and sequence-level context.
Simple character-level tokenization approaches perform best on purely form-based tasks.
Our method is superior for more complex tasks that blend form, meaning, and context.
Subword-based models have human-interpretable internal representations of characters.

Paper Content

Introduction

Natural language tasks can be described in terms of character-level manipulations
Current models tokenize inputs and outputs into words and subword units
A framework is developed to push subword-based models to encode character-level information
A suite of character-level evaluation tasks is introduced
Pure character-level modeling is superior for tasks involving only meaning or only context
Subword tokenization models are superior for tasks involving both meaning and context
Type-level IIT leads to subword-based models with human-interpretable internal representations of characters

Subword and character modeling

Subword-based models tokenize inputs into words and word pieces
Character-level models represent inputs as character sequences
Character-level models have not been widely used for large language models
Recent character-level large language models have been successful on standard benchmarks
Hybrid character-level and subword models have been created

Character manipulation tasks

Character manipulation tasks are becoming more common in language model evaluations
Recent efforts have focused on linguistic phenomena that depend on character-level manipulations
Examples include digit tokenization, creative blends, puns, and wordplay in crossword clues
RoBERTa can correctly spell more than one-third of tested words, but is still far from reliable
CharacterBERT is not notably better at the task
Subword models are wrong about 10% of the time when mapping from tokens to characters

Intervention-based training methods

Core technique is based on Interchange Intervention Method (IIT)
IIT allows neural model to conform to high-level causal model while still learning from data
IIT belongs to family of causal abstraction techniques
IIT used to obtain human-interpretable explanations of complex neural networks
Key innovation of IIT is to extend explanation techniques to model training
Technical innovation is applying IIT to different variables of same type

Character-level manipulation tasks

Tests models along aspects of form, meaning, and context
Covers all combinations of values for character manipulation tasks

Character reversal

Character Reversal task takes in a string and reverses the characters
Inputs and outputs do not need to be valid English words
Dataset consists of 20K/2K train/dev examples
Evaluate generalization capacity of models with two test splits of 4K/1K
In-Vocab split has source tokens in training vocab, Out-Of-Vocab split has source tokens out of training vocab

Unit conversion

Unit Conversion task involves decimal shifting to convert a number from one unit to another
Task involves form and context, but not meaning
Dataset consists of 20K/2K train/dev examples with 300 and 500 unique subword pieces
Units include large number numerals and length units
IV and OOV splits defined

Unscramble

Unscramble task takes a random permutation of a word and outputs the unscrambled word
Does not constrain the first or last letter of the permutations
Involves meaning, as models need to recognize the sequence of characters in the output as valid English words
Dataset constructed from 30K English words
Dataset consists of 100K train examples and 30K dev examples
IV and OOV splits defined

Single word spelling correction

Single Word Spelling Correction task takes a word with a spelling error and outputs the correct word
Synthetic errors include swapping two adjacent characters, substituting a character with its neighbors on the keyboard, deleting a character, and repeating a character
Involves semantics because the correction needs to create an attested English word
Dataset consists of 100K training examples with 3 corruptions per word and 30K dev examples
Evaluate on 6K real spelling errors collected by Belinkov and Bisk (2018)

Spelling correction with context

Spelling Correction with Context adds contextual aspects to the single-word spelling correction task.
Context can be important in spelling correction as some errors have multiple potential corrections.
Train dataset samples 30K sentences containing the target word from the Wikipedia corpus.
Test sets have two variants: ‘Dependent’ and ‘Independent’.
‘Dependent’ condition has multiple potential corrections and the correct one depends on the context.
‘Independent’ condition has only one valid correction, bringing it closer to Single Word Spelling Correction.

Word search

Word Search task is adapted from a popular puzzle
Players find hidden words in a letter grid that match a theme
Synthetic puzzles are generated with a structure definition
Task is to find a substring that, when reversed, is defined by definition
Reversed words are used to avoid confound of subword tokenization
Definitions or names of WordNet Synsets are used to define words
Train dataset has 30K examples with two valid English words
Test sets have two variations to understand how models use form, meaning, and context
O+P test scenario is the hardest as it requires reasoning about the meaning of the full paraphrase and character-level reasoning

Character-level interventions

Character-level tokenization methods provide models with direct evidence for learning about characters.
Subword tokenization methods do not provide such direct evidence.
Interchange Intervention Training (IIT) is a method that seeks to address this shortcoming.
IIT has three core steps: defining a causal model, aligning a variable with a set of neurons, and training the neural model to make predictions.
Type-level IIT allows interchange interventions to target any two variables of the same type.

Causal models for characters

IIT requires a high-level causal model to be defined.
The Character Reversal task is used to illustrate this.
The model creates intermediate variables for each character and outputs them in reverse order.
Interventions are performed using pairs of examples.

Aligning the causal and neural models

Step 2 of IIT is to define an alignment of variables in the causal model with sets of neurons in the target neural model.
Figure 2b summarizes the alignment of character variables V1, V2, V3 to the first-layer hidden states of the Transformer Encoder.

Training with character interventions

IIT has three steps: model training, standard training objective, and counterfactual objective.
Model training involves intervening on the causal model and neural model.
The intervention assigns the value of one variable to another, and the model is pushed to localize information about the variables.

Handling out-of-vocab items

Procedure does not provide solution for unseen vocabulary items
Interpretable character representations learned with Type-level IIT provide way to resolve problem
Substitute unseen token with sequence of seen tokens
Populate each position with averaged representation of each character in unseen vocab

Experiments

Compared intervention-based approach to three groups of tokenization approaches
Evaluated how each method generalizes with respect to form, meaning, and context

Baselines

Fine-tuned T5-small model for subword-based models
Experimented with in-context learning using GPT-3
Set tokenizer to split input/output into characters for T5-small
Inserted hyphen/space between input/output characters for GPT-3
Fine-tuned ByT5-small model for character-level models

Intervention-based models

Applied character intervention-based method to T5-small model
Coefficients λ 1 and λ 2 set to 1

Evaluation

Test sets described in Section 3
Tasks 1-4 evaluated on In-Vocab (IV) and Out-Of-Vocab (OOV) splits
Task 4 evaluated on “Real” split with natural spelling errors
Task 5 evaluated on “Independent” and “Dependent” splits
Task 6 evaluated on “P”, “O” and “P+O” splits
Metrics: sequence-level accuracy
Unscramble task: anagrams, valid English words, not identical to input
Single Word Spelling Correction: valid English words, synthetic error rules
Decoding: T5-based models use greedy decoding algorithm, IIT models use average pooled character representations

Results

Character-level models are best for purely form-based tasks
IIT boosts performance in OOV setting
Character-level models start losing advantage when meaning aspect is added
Char-S model (character-level inputs, subword outputs) is best overall
Subword models are far behind, but IIT helps them
Character-level inputs/outputs outperform subword inputs/outputs for Reversal task
Character-level outputs consistently underperform subword outputs for tasks with meaning aspect
Subword-based models have advantage when form alone is insufficient to determine output
Subword+IIT models improve accuracy on all splits for word search task

Analysis and discussion

Error analysis on word search

Analyzed performance on Word Search task
Defined two new metrics: Character Match and Synonym Match
Character Match measures percentage of predictions that are a substring of the reversed charstring
Synonym Match measures percentage of predictions that are semantically relevant to the definition
Subword-based models biased towards using meaning and context, character-level models biased towards using form
Subword models appear to be viable, Sub-word+IIT is the best variant

Visualizing internal character representations

Subword+IIT models embed accurate, coherent representations of characters
Subword baseline model does not show evidence of internal character-level representations

Conclusion

Character tasks are difficult for large language models that use subword tokenization
Character-level interventions can help with this limitation
We introduced Type-level IIT to train networks to represent characters
We introduced a suite of tasks to assess models on character-level tasks
Subword+IIT models are superior for tasks that blend form, meaning, and context

Link to paper#

Abstract#

Paper Content#

Introduction#

Related work#

Subword and character modeling#

Character manipulation tasks#

Intervention-based training methods#

Character-level manipulation tasks#

Character reversal#

Unit conversion#

Unscramble#

Single word spelling correction#

Spelling correction with context#

Word search#

Character-level interventions#

Causal models for characters#

Aligning the causal and neural models#

Training with character interventions#

Handling out-of-vocab items#

Experiments#

Baselines#

Intervention-based models#

Evaluation#

Results#

Analysis and discussion#

Error analysis on word search#

Visualizing internal character representations#

Conclusion#

Link to paper

Abstract

Paper Content

Introduction

Related work

Subword and character modeling

Character manipulation tasks

Intervention-based training methods

Character-level manipulation tasks

Character reversal

Unit conversion

Unscramble

Single word spelling correction

Spelling correction with context

Word search

Character-level interventions

Causal models for characters

Aligning the causal and neural models

Training with character interventions

Handling out-of-vocab items

Experiments

Baselines

Intervention-based models

Evaluation

Results

Analysis and discussion

Error analysis on word search

Visualizing internal character representations

Conclusion