Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Character-level tasks are challenging for models based in subword tokenization.
  • Geiger et al. (2021) method is adapted to operate on type-level variables over characters.
  • Character-level tasks are tested that vary in their dependence on meaning and sequence-level context.
  • Simple character-level tokenization approaches perform best on purely form-based tasks.
  • Our method is superior for more complex tasks that blend form, meaning, and context.
  • Subword-based models have human-interpretable internal representations of characters.

Paper Content

Introduction

  • Natural language tasks can be described in terms of character-level manipulations
  • Current models tokenize inputs and outputs into words and subword units
  • A framework is developed to push subword-based models to encode character-level information
  • A suite of character-level evaluation tasks is introduced
  • Pure character-level modeling is superior for tasks involving only meaning or only context
  • Subword tokenization models are superior for tasks involving both meaning and context
  • Type-level IIT leads to subword-based models with human-interpretable internal representations of characters

Subword and character modeling

  • Subword-based models tokenize inputs into words and word pieces
  • Character-level models represent inputs as character sequences
  • Character-level models have not been widely used for large language models
  • Recent character-level large language models have been successful on standard benchmarks
  • Hybrid character-level and subword models have been created

Character manipulation tasks

  • Character manipulation tasks are becoming more common in language model evaluations
  • Recent efforts have focused on linguistic phenomena that depend on character-level manipulations
  • Examples include digit tokenization, creative blends, puns, and wordplay in crossword clues
  • RoBERTa can correctly spell more than one-third of tested words, but is still far from reliable
  • CharacterBERT is not notably better at the task
  • Subword models are wrong about 10% of the time when mapping from tokens to characters

Intervention-based training methods

  • Core technique is based on Interchange Intervention Method (IIT)
  • IIT allows neural model to conform to high-level causal model while still learning from data
  • IIT belongs to family of causal abstraction techniques
  • IIT used to obtain human-interpretable explanations of complex neural networks
  • Key innovation of IIT is to extend explanation techniques to model training
  • Technical innovation is applying IIT to different variables of same type

Character-level manipulation tasks

  • Tests models along aspects of form, meaning, and context
  • Covers all combinations of values for character manipulation tasks

Character reversal

  • Character Reversal task takes in a string and reverses the characters
  • Inputs and outputs do not need to be valid English words
  • Dataset consists of 20K/2K train/dev examples
  • Evaluate generalization capacity of models with two test splits of 4K/1K
  • In-Vocab split has source tokens in training vocab, Out-Of-Vocab split has source tokens out of training vocab

Unit conversion

  • Unit Conversion task involves decimal shifting to convert a number from one unit to another
  • Task involves form and context, but not meaning
  • Dataset consists of 20K/2K train/dev examples with 300 and 500 unique subword pieces
  • Units include large number numerals and length units
  • IV and OOV splits defined

Unscramble

  • Unscramble task takes a random permutation of a word and outputs the unscrambled word
  • Does not constrain the first or last letter of the permutations
  • Involves meaning, as models need to recognize the sequence of characters in the output as valid English words
  • Dataset constructed from 30K English words
  • Dataset consists of 100K train examples and 30K dev examples
  • IV and OOV splits defined

Single word spelling correction

  • Single Word Spelling Correction task takes a word with a spelling error and outputs the correct word
  • Synthetic errors include swapping two adjacent characters, substituting a character with its neighbors on the keyboard, deleting a character, and repeating a character
  • Involves semantics because the correction needs to create an attested English word
  • Dataset consists of 100K training examples with 3 corruptions per word and 30K dev examples
  • Evaluate on 6K real spelling errors collected by Belinkov and Bisk (2018)

Spelling correction with context

  • Spelling Correction with Context adds contextual aspects to the single-word spelling correction task.
  • Context can be important in spelling correction as some errors have multiple potential corrections.
  • Train dataset samples 30K sentences containing the target word from the Wikipedia corpus.
  • Test sets have two variants: ‘Dependent’ and ‘Independent’.
  • ‘Dependent’ condition has multiple potential corrections and the correct one depends on the context.
  • ‘Independent’ condition has only one valid correction, bringing it closer to Single Word Spelling Correction.
  • Word Search task is adapted from a popular puzzle
  • Players find hidden words in a letter grid that match a theme
  • Synthetic puzzles are generated with a structure definition
  • Task is to find a substring that, when reversed, is defined by definition
  • Reversed words are used to avoid confound of subword tokenization
  • Definitions or names of WordNet Synsets are used to define words
  • Train dataset has 30K examples with two valid English words
  • Test sets have two variations to understand how models use form, meaning, and context
  • O+P test scenario is the hardest as it requires reasoning about the meaning of the full paraphrase and character-level reasoning

Character-level interventions

  • Character-level tokenization methods provide models with direct evidence for learning about characters.
  • Subword tokenization methods do not provide such direct evidence.
  • Interchange Intervention Training (IIT) is a method that seeks to address this shortcoming.
  • IIT has three core steps: defining a causal model, aligning a variable with a set of neurons, and training the neural model to make predictions.
  • Type-level IIT allows interchange interventions to target any two variables of the same type.

Causal models for characters

  • IIT requires a high-level causal model to be defined.
  • The Character Reversal task is used to illustrate this.
  • The model creates intermediate variables for each character and outputs them in reverse order.
  • Interventions are performed using pairs of examples.

Aligning the causal and neural models

  • Step 2 of IIT is to define an alignment of variables in the causal model with sets of neurons in the target neural model.
  • Figure 2b summarizes the alignment of character variables V1, V2, V3 to the first-layer hidden states of the Transformer Encoder.

Training with character interventions

  • IIT has three steps: model training, standard training objective, and counterfactual objective.
  • Model training involves intervening on the causal model and neural model.
  • The intervention assigns the value of one variable to another, and the model is pushed to localize information about the variables.

Handling out-of-vocab items

  • Procedure does not provide solution for unseen vocabulary items
  • Interpretable character representations learned with Type-level IIT provide way to resolve problem
  • Substitute unseen token with sequence of seen tokens
  • Populate each position with averaged representation of each character in unseen vocab

Experiments

  • Compared intervention-based approach to three groups of tokenization approaches
  • Evaluated how each method generalizes with respect to form, meaning, and context

Baselines

  • Fine-tuned T5-small model for subword-based models
  • Experimented with in-context learning using GPT-3
  • Set tokenizer to split input/output into characters for T5-small
  • Inserted hyphen/space between input/output characters for GPT-3
  • Fine-tuned ByT5-small model for character-level models

Intervention-based models

  • Applied character intervention-based method to T5-small model
  • Coefficients λ 1 and λ 2 set to 1

Evaluation

  • Test sets described in Section 3
  • Tasks 1-4 evaluated on In-Vocab (IV) and Out-Of-Vocab (OOV) splits
  • Task 4 evaluated on “Real” split with natural spelling errors
  • Task 5 evaluated on “Independent” and “Dependent” splits
  • Task 6 evaluated on “P”, “O” and “P+O” splits
  • Metrics: sequence-level accuracy
  • Unscramble task: anagrams, valid English words, not identical to input
  • Single Word Spelling Correction: valid English words, synthetic error rules
  • Decoding: T5-based models use greedy decoding algorithm, IIT models use average pooled character representations

Results

  • Character-level models are best for purely form-based tasks
  • IIT boosts performance in OOV setting
  • Character-level models start losing advantage when meaning aspect is added
  • Char-S model (character-level inputs, subword outputs) is best overall
  • Subword models are far behind, but IIT helps them
  • Character-level inputs/outputs outperform subword inputs/outputs for Reversal task
  • Character-level outputs consistently underperform subword outputs for tasks with meaning aspect
  • Subword-based models have advantage when form alone is insufficient to determine output
  • Subword+IIT models improve accuracy on all splits for word search task

Analysis and discussion

  • Analyzed performance on Word Search task
  • Defined two new metrics: Character Match and Synonym Match
  • Character Match measures percentage of predictions that are a substring of the reversed charstring
  • Synonym Match measures percentage of predictions that are semantically relevant to the definition
  • Subword-based models biased towards using meaning and context, character-level models biased towards using form
  • Subword models appear to be viable, Sub-word+IIT is the best variant

Visualizing internal character representations

  • Subword+IIT models embed accurate, coherent representations of characters
  • Subword baseline model does not show evidence of internal character-level representations

Conclusion

  • Character tasks are difficult for large language models that use subword tokenization
  • Character-level interventions can help with this limitation
  • We introduced Type-level IIT to train networks to represent characters
  • We introduced a suite of tasks to assess models on character-level tasks
  • Subword+IIT models are superior for tasks that blend form, meaning, and context