Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Investigates brain regions involved in syntactic and semantic processing during speech comprehension
  • Trained lexical and supra-lexical language models on text corpus with selectively removed information
  • Assessed models’ ability to predict fMRI signal of humans listening to naturalistic text
  • Found asymmetry between left and right hemispheres in sensitivity to syntactic and semantic variables

Paper Content

Introduction

  • Understanding the neural bases of language processing has been a main research effort in the neuroimaging community for decades
  • One open question is whether semantic and syntactic information are encoded and processed jointly or separately in the brain
  • Early lesion studies suggested syntactic processing takes place in specialized brain regions
  • Neuroimaging studies and simulation work have provided support for a modular view of language processing
  • An opposing view has argued that semantics and syntax are processed in a common distributed language processing system
  • Neuroimaging studies have used controlled experimental paradigms and naturalistic paradigms
  • Neural language models have been increasingly employed in the analysis of data collected from ecological paradigms
  • A central puzzle remains in the field: distributed networks for naturalistic stimuli vs. localized activations for experimental paradigms
  • A new approach has been proposed to tackle the questions of syntactic vs. semantic processing and contextual integration

Results

Dissociation of syntactic and semantic information in embeddings

  • Models trained on intact texts had high performance on both tasks
  • Models trained on syntactic features performed well on syntax decoding task, but near chance-level on semantic decoding task
  • Models trained on semantic features had good performance on semantic decoding task, but poorer performance on syntax decoding task

Correlations of fmri data with syntactic and semantic embeddings

  • Objective was to evaluate how well embeddings fit fMRI signal
  • Increase in R score when embeddings appended to baseline model
  • Maps reveal semantic and syntactic feature-derived embeddings explain signal in bilateral brain regions
  • Vast network of regions modulated by semantic and syntactic information
  • Different R score distribution profiles in different regions

Regions best fitted by semantic or syntactic embeddings

  • Neural models were used to predict fMRI activity
  • Brain maps of correlation between encoding models’ predictions and fMRI time-series were computed
  • GPT-2 model was tested on input sequences of bounded context length
  • Syntax and semantics modulate activations in both hemispheres
  • Overlap between semantic and syntactic peak regions is stronger in right hemisphere
  • Left hemisphere has distinct peak regions for syntax and semantics
  • Right hemisphere has shared peak regions for syntax and semantics

Gradient of sensitivity to syntax or semantics

  • Brain regions sensitive to both syntax and semantics
  • Specificity index reflects ratio between R scores derived from semantic and syntactic embeddings
  • Voxels more sensitive to syntax located in anterior Temporal Lobes, STG, SMA, MFG and IFG
  • Voxels more sensitive to semantics located in pMTG, TPJ/AG, IFS, SFS and Precuneus

Unique contributions of syntax and semantics

  • Quantified brain signal explained by information encoded in various embeddings
  • Analyzed additional information brought by each embedding
  • Evaluated correlations uniquely explained by semantic and syntactic embeddings
  • Estimated Pearson correlation explained by individual feature space
  • Assessed correlation explained by concatenation of embeddings
  • Syntactic embeddings explained localized brain regions
  • Semantic embeddings explained wide network of brain regions

Synergy between syntax and semantics

  • GloVe embeddings did not show any significant effect.
  • GPT-2 embeddings showed significant effects in most of the brain, with higher effects in semantic peak regions.

Integration of contextual information

  • GPT-2 embeddings elicit stronger R scores than GloVe
  • Syntax led to significant differences in frontal and temporo-parietal regions
  • Semantics led to significant differences in Precuneus, STS and posterior STG
  • Bilateral network of frontal and temporo-parietal regions modulated by short context
  • Left hemisphere preferentially involved in short-range context integration

Discussion

  • Language comprehension in humans is a complex process
  • Neurolinguistics is trying to discover how the brain implements these processes
  • A lot of attention has been devoted to the syntactic and semantic components
  • It is still debated if these components are implemented in distinct or identical regions
  • A meta analysis of the literature was done using the Neurosynth database
  • Recent studies have presented naturalistic texts to participants and used Artificial Neural Language Models to analyze brain activations

Models trained on semantic and syntactic features fit brain activity in a widely distributed network, but with varying relative degrees.

  • Trained models captured brain activity in a large extended language network
  • This network goes beyond the core language network
  • Areas included in the network are homologous areas in the right hemisphere, the dorsal prefrontal regions, both on the lateral and medial surfaces, as well as in the inferior parietal, Precuneus and posterior Cingulate
  • These regions are part of the Default Mode Network
  • They are relevant in language and high-level cognition
  • Syntactic and semantic features modulate activity throughout the language network
  • Syntax is more sensitive in the STG and anterior temporal lobe, whereas the parietal regions are more sensitive to semantics
  • Syntactic and semantic features are not perfectly orthogonal
  • “Pure” semantic features modulate activity in a wide set of brain regions
  • “Pure” effect of syntax “shrinks” to the STG and aTL, the IFG and the pre-SMA

The contributions of the right hemisphere.

  • Right hemisphere has linguistic abilities
  • Brain imaging studies confirm right hemisphere involvement in higher-level language tasks
  • Right hemisphere apt at recognizing distant relations between words
  • Long-range context effects observed in right hemisphere
  • Left hemisphere more involved in processing local semantic/syntactic information, right hemisphere integrates both information at larger time-scale

Syntax drives the integration of contextual information.

  • GloVe, a purely lexical model, does not benefit from being trained on the integral text.
  • GPT-2 shows clear synergetic effects of syntactic and semantic information.
  • Syntactic information drives the semantic interpretation at the sentence level.

Limitations of our study

  • Syntax and semantics are not perfectly dissociated
  • Removing function words impacts supra-lexical semantics
  • Removing pronouns removes arguments of some verbs
  • Prosody may have confounding effects on the results

Conclusion

  • State-of-the-art Natural Language Processing models can generate flawless grammatical text
  • Using them to fit brain data is common, despite their low biological plausibility
  • Restricting information provided to the model during training can be used to show which brain areas encode this information
  • Information-restricted models are powerful and flexible tools to probe the brain
  • 4.4GB of text used for training, 1.1GB for validation
  • Created two information-restricted datasets: semantic and syntactic
  • GloVe and GPT-2 trained on the three datasets
  • GPT-2 modified to remove residual syntax when trained on semantic features
  • Stimulus used to obtain activations from humans and NLP models was The Little Prince novella
  • Embedding matrices built by concatenating words embeddings
  • Two decoding tasks: syntax and semantic
  • fMRI data of 51 English speaking participants used
  • Global brain mask of 26,164 voxels at 4x4x4mm resolution
  • Metaanalysis of brain regions using Neurosynth database