Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Investigates brain regions involved in syntactic and semantic processing during speech comprehension
Trained lexical and supra-lexical language models on text corpus with selectively removed information
Assessed models’ ability to predict fMRI signal of humans listening to naturalistic text
Found asymmetry between left and right hemispheres in sensitivity to syntactic and semantic variables

Paper Content

Introduction

Understanding the neural bases of language processing has been a main research effort in the neuroimaging community for decades
One open question is whether semantic and syntactic information are encoded and processed jointly or separately in the brain
Early lesion studies suggested syntactic processing takes place in specialized brain regions
Neuroimaging studies and simulation work have provided support for a modular view of language processing
An opposing view has argued that semantics and syntax are processed in a common distributed language processing system
Neuroimaging studies have used controlled experimental paradigms and naturalistic paradigms
Neural language models have been increasingly employed in the analysis of data collected from ecological paradigms
A central puzzle remains in the field: distributed networks for naturalistic stimuli vs. localized activations for experimental paradigms
A new approach has been proposed to tackle the questions of syntactic vs. semantic processing and contextual integration

Results

Dissociation of syntactic and semantic information in embeddings

Models trained on intact texts had high performance on both tasks
Models trained on syntactic features performed well on syntax decoding task, but near chance-level on semantic decoding task
Models trained on semantic features had good performance on semantic decoding task, but poorer performance on syntax decoding task

Correlations of fmri data with syntactic and semantic embeddings

Objective was to evaluate how well embeddings fit fMRI signal
Increase in R score when embeddings appended to baseline model
Maps reveal semantic and syntactic feature-derived embeddings explain signal in bilateral brain regions
Vast network of regions modulated by semantic and syntactic information
Different R score distribution profiles in different regions

Regions best fitted by semantic or syntactic embeddings

Neural models were used to predict fMRI activity
Brain maps of correlation between encoding models’ predictions and fMRI time-series were computed
GPT-2 model was tested on input sequences of bounded context length
Syntax and semantics modulate activations in both hemispheres
Overlap between semantic and syntactic peak regions is stronger in right hemisphere
Left hemisphere has distinct peak regions for syntax and semantics
Right hemisphere has shared peak regions for syntax and semantics

Gradient of sensitivity to syntax or semantics

Brain regions sensitive to both syntax and semantics
Specificity index reflects ratio between R scores derived from semantic and syntactic embeddings
Voxels more sensitive to syntax located in anterior Temporal Lobes, STG, SMA, MFG and IFG
Voxels more sensitive to semantics located in pMTG, TPJ/AG, IFS, SFS and Precuneus

Unique contributions of syntax and semantics

Quantified brain signal explained by information encoded in various embeddings
Analyzed additional information brought by each embedding
Evaluated correlations uniquely explained by semantic and syntactic embeddings
Estimated Pearson correlation explained by individual feature space
Assessed correlation explained by concatenation of embeddings
Syntactic embeddings explained localized brain regions
Semantic embeddings explained wide network of brain regions

Synergy between syntax and semantics

GloVe embeddings did not show any significant effect.
GPT-2 embeddings showed significant effects in most of the brain, with higher effects in semantic peak regions.

Integration of contextual information

GPT-2 embeddings elicit stronger R scores than GloVe
Syntax led to significant differences in frontal and temporo-parietal regions
Semantics led to significant differences in Precuneus, STS and posterior STG
Bilateral network of frontal and temporo-parietal regions modulated by short context
Left hemisphere preferentially involved in short-range context integration

Discussion

Language comprehension in humans is a complex process
Neurolinguistics is trying to discover how the brain implements these processes
A lot of attention has been devoted to the syntactic and semantic components
It is still debated if these components are implemented in distinct or identical regions
A meta analysis of the literature was done using the Neurosynth database
Recent studies have presented naturalistic texts to participants and used Artificial Neural Language Models to analyze brain activations

Models trained on semantic and syntactic features fit brain activity in a widely distributed network, but with varying relative degrees.

Trained models captured brain activity in a large extended language network
This network goes beyond the core language network
Areas included in the network are homologous areas in the right hemisphere, the dorsal prefrontal regions, both on the lateral and medial surfaces, as well as in the inferior parietal, Precuneus and posterior Cingulate
These regions are part of the Default Mode Network
They are relevant in language and high-level cognition
Syntactic and semantic features modulate activity throughout the language network
Syntax is more sensitive in the STG and anterior temporal lobe, whereas the parietal regions are more sensitive to semantics
Syntactic and semantic features are not perfectly orthogonal
“Pure” semantic features modulate activity in a wide set of brain regions
“Pure” effect of syntax “shrinks” to the STG and aTL, the IFG and the pre-SMA

The contributions of the right hemisphere.

Right hemisphere has linguistic abilities
Brain imaging studies confirm right hemisphere involvement in higher-level language tasks
Right hemisphere apt at recognizing distant relations between words
Long-range context effects observed in right hemisphere
Left hemisphere more involved in processing local semantic/syntactic information, right hemisphere integrates both information at larger time-scale

Syntax drives the integration of contextual information.

GloVe, a purely lexical model, does not benefit from being trained on the integral text.
GPT-2 shows clear synergetic effects of syntactic and semantic information.
Syntactic information drives the semantic interpretation at the sentence level.

Limitations of our study

Syntax and semantics are not perfectly dissociated
Removing function words impacts supra-lexical semantics
Removing pronouns removes arguments of some verbs
Prosody may have confounding effects on the results

Conclusion

State-of-the-art Natural Language Processing models can generate flawless grammatical text
Using them to fit brain data is common, despite their low biological plausibility
Restricting information provided to the model during training can be used to show which brain areas encode this information
Information-restricted models are powerful and flexible tools to probe the brain
4.4GB of text used for training, 1.1GB for validation
Created two information-restricted datasets: semantic and syntactic
GloVe and GPT-2 trained on the three datasets
GPT-2 modified to remove residual syntax when trained on semantic features
Stimulus used to obtain activations from humans and NLP models was The Little Prince novella
Embedding matrices built by concatenating words embeddings
Two decoding tasks: syntax and semantic
fMRI data of 51 English speaking participants used
Global brain mask of 26,164 voxels at 4x4x4mm resolution
Metaanalysis of brain regions using Neurosynth database

Link to paper#

Abstract#

Paper Content#

Introduction#

Results#

Dissociation of syntactic and semantic information in embeddings#

Correlations of fmri data with syntactic and semantic embeddings#

Regions best fitted by semantic or syntactic embeddings#

Gradient of sensitivity to syntax or semantics#

Unique contributions of syntax and semantics#

Synergy between syntax and semantics#

Integration of contextual information#

Discussion#

Models trained on semantic and syntactic features fit brain activity in a widely distributed network, but with varying relative degrees.#

The contributions of the right hemisphere.#

Syntax drives the integration of contextual information.#

Limitations of our study#

Conclusion#

Link to paper

Abstract

Paper Content

Introduction

Results

Dissociation of syntactic and semantic information in embeddings

Correlations of fmri data with syntactic and semantic embeddings

Regions best fitted by semantic or syntactic embeddings

Gradient of sensitivity to syntax or semantics

Unique contributions of syntax and semantics

Synergy between syntax and semantics

Integration of contextual information

Discussion

Models trained on semantic and syntactic features fit brain activity in a widely distributed network, but with varying relative degrees.

The contributions of the right hemisphere.

Syntax drives the integration of contextual information.

Limitations of our study

Conclusion