Paper Content

Introduction

  • Goal: a compositional framework in which distributional meanings and linguistic structure interact meaningfully and transparently
  • Flexible enough to accommodate compositionality beyond Frege’s bottom-up notion
  • Local meanings of words informed by global textual context
  • Modern machine learning methods involve top-down meaning flows
  • DisCoCat proposed in 2008, enjoyed empirical support
  • DisCoCirc proposed as an improvement of DisCoCat
  • Produce text meanings from sentence meanings
  • Meanings of words may change as text progresses
  • Space of sentence meanings is left unspecified in DisCoCat
  • In DisCoCirc, the sentence space is made up from the relevant nouns
  • Text circuits are two-dimensional
  • Hybrid grammar for text
  • Phrase structure, pronominal links, phrase scope, transformational grammar rules
  • Text diagrams from text with hybrid grammar
  • Text circuits are generative for text
  • Text circuits characterised as the essential connectedness of meaning
  • Text circuits conservatively extended to accommodate more features of language
  • Parsing, expanding fragment of language, relationships between text circuits and other grammatical and semantic formalisms
  • Distillation of text circuits from text
  • Text circuits eliminate grammatical bureaucracy
  • Model meanings in circuits with vector spaces, linear maps and tensor product
  • Human language is a one-dimensional vehicle for higher-dimensional content
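
As a rough illustration of the vector-space modelling mentioned above, the sketch below treats each noun wire as a small vector, two wires side by side as their tensor product, and a two-wire gate as a linear map (a matrix) on that product space. The dimensions, the numbers, and the names `alice`, `bob` and `gate` are all made up for illustration and are not taken from the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def tensor(u: Vector, v: Vector) -> Vector:
    """Tensor (Kronecker) product of two vectors."""
    return [a * b for a in u for b in v]


def apply(m: Matrix, v: Vector) -> Vector:
    """Apply a linear map, given as a matrix, to a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


# Illustrative 2-dimensional noun states (made-up numbers).
alice = [1.0, 0.0]
bob = [0.0, 1.0]

# A hypothetical two-wire gate: a 4x4 matrix on the tensor product space.
# This particular matrix exchanges the contents of the two wires.
gate = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

joint = tensor(alice, bob)   # state of the two wires before the gate
after = apply(gate, joint)   # state of the two wires after the gate
```

A learned gate would replace the hand-written matrix; the point of the sketch is only that parallel wires combine by tensor product and gates act as linear maps.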

A hybrid grammar for text

  • Introducing a hybrid grammar that is generative and captures linguistic connectedness
  • Developed in three steps: context-sensitive grammar for simple sentences, pronominal links to identify recurring nouns and pronouns, and rules to fuse together simple sentences
  • Uses ideas from Chomsky’s transformational phrase structure grammars, Lambek’s pregroups, discourse representation theory, and dependency grammars
  • Covers only a fragment of language: some grammatical phenomena and patterns are omitted
  • Combines grammar and meaning for efficient tools

Preliminaries

  • String diagrams are a graphical mathematical framework for composing input-output boxes.
  • Boxes can be composed in parallel or sequentially.
  • Symbolic labels for intermediate symbols are replaced by colored edges.
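
The two ways of composing boxes can be sketched in a few lines of Python. This is a minimal toy under stated assumptions, not the paper's formalism: the `Box` class, its `then` and `beside` methods, and the wire type `"N"` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """An input-output box: a name plus typed input and output wires."""
    name: str
    dom: Tuple[str, ...]  # input wire types, left to right
    cod: Tuple[str, ...]  # output wire types, left to right

    def then(self, other: "Box") -> "Box":
        """Sequential composition: feed this box's outputs into `other`."""
        if self.cod != other.dom:
            raise TypeError(f"cannot plug {self.cod} into {other.dom}")
        return Box(f"({self.name} ; {other.name})", self.dom, other.cod)

    def beside(self, other: "Box") -> "Box":
        """Parallel composition: place the two boxes side by side."""
        return Box(f"({self.name} | {other.name})",
                   self.dom + other.dom, self.cod + other.cod)


# Hypothetical example boxes; the wire type "N" (noun) is illustrative.
happy = Box("happy", ("N",), ("N",))
runs = Box("runs", ("N",), ("N",))
likes = Box("likes", ("N", "N"), ("N", "N"))

side_by_side = happy.beside(runs)     # two wires in, two wires out
pipeline = side_by_side.then(likes)   # followed by a two-wire box
```

Sequential composition is only defined when the wire types match, which is why `then` checks the types while `beside` does not need to.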

Simple sentence structure

  • Introducing a phrase structure grammar
  • Intransitive and transitive verbs have different tree-fragments
  • Adpositions can be added to the right of intransitive and transitive verb-phrases
  • The grammar of this section is referred to as the “hybrid grammar”, or just the “grammar”
  • Sentences with more than one verb are called compound sentences
  • Pronominal links identify nouns from different sentences
  • Relative pronouns can be used to fuse sentences together
  • Subject relative pronouns replace the subject noun of a parse tree
  • Object relative pronouns come after the first occurrence of a noun
  • Reflexive pronouns can be used in conjunction with relative pronouns

Compound sentence structure II: phrase scope

  • Verbs with sentential complement are their own grammatical class of verb
  • Phrase scope is used to eliminate ambiguity

Text diagrams

  • Text diagrams are related to dependency grammars and link grammars.
  • Text diagrams are lighter than dependency grammars as they do not specify as many grammatical relationships.
  • Text diagrams are related to pregroup grammars, context-free grammars, and combinatory categorial grammars.
  • Text diagrams are a type-restricted analogue of transformational grammars.

Simple sentences as text diagrams

  • Replace S with sentence-dependent number of NP wires
  • Remove S types
  • Preserve number of input and output noun wires
  • Alter rules to take same number of NP types as input
  • Arrange diagram so that matching input and output NP wires align

Rewriting text diagrams

  • Relative pronouns can be transformed using Chomskian phrase-structure transformations
  • Diagram rewrites can be used to replace diagrams with the same input and output wires
  • Pronominal links take a different shape in text diagrams than in grammar
  • Wires link output and input noun wires in a chain
  • Link-elimination rewrites can cause pieces to vanish
  • Wires and labels may cross one another in text diagrams
  • Phrase scoped regions act as planar obstacles in text diagrams
  • Verbs with sentential complements and conjunctions require special diagram pieces
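
The effect of pronominal links, which chain together noun wires that refer to the same thing, can be imitated with a small union-find structure. This is a sketch under assumptions rather than the paper's rewrite procedure: the function `merge_wires`, the `(sentence index, word)` encoding of mentions, and the example text are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Mention = Tuple[int, str]   # (sentence index, word), an assumed encoding


def merge_wires(mentions: Iterable[Mention],
                links: Iterable[Tuple[Mention, Mention]]) -> List[List[Mention]]:
    """Group mentions into wires: linked mentions end up on the same wire."""
    parent: Dict[Mention, Mention] = {m: m for m in mentions}

    def find(x: Mention) -> Mention:
        # Follow parent pointers to the representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(b)] = find(a)

    groups: Dict[Mention, List[Mention]] = {}
    for m in parent:
        groups.setdefault(find(m), []).append(m)
    return list(groups.values())


# Invented text: "Alice sees Bob. She waves. Bob greets her."
mentions = [(0, "Alice"), (0, "Bob"), (1, "She"), (2, "Bob"), (2, "her")]
links = [((0, "Alice"), (1, "She")),
         ((1, "She"), (2, "her")),
         ((0, "Bob"), (2, "Bob"))]
wires = merge_wires(mentions, links)
```

Five mentions collapse onto two wires here, one per discourse participant, which mirrors how link elimination leaves a single wire for each recurring noun.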

Definition

  • Wires represent nouns
  • Gates represent adjectives, intransitive verbs, and transitive verbs
  • Boxes with holes represent adverbs
  • Adpositions modify verbs by adding another noun-wire
  • Families of boxes accommodate input circuits of all sizes
  • Conjunctions are boxes that take two circuits
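
The ingredients listed above can be mocked up as plain data. In this minimal sketch, which makes no claim to match the paper's definitions in detail, a gate is a label plus the noun wires it acts on, an adverb is a wrapper around a gate, and an adposition wraps a gate while attaching one more noun wire. The names `Gate`, `adverb`, `adposition` and `history`, and the example text, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Gate:
    """A gate with a label, acting on an ordered tuple of noun wires."""
    label: str
    wires: Tuple[str, ...]


def adverb(label: str, inner: Gate) -> Gate:
    """A box with one hole: wraps a gate and keeps its wires unchanged."""
    return Gate(f"{label}[{inner.label}]", inner.wires)


def adposition(label: str, inner: Gate, noun: str) -> Gate:
    """Wraps a verb gate and attaches one extra noun wire to it."""
    return Gate(f"{label}[{inner.label}]", inner.wires + (noun,))


def history(circuit: List[Gate]) -> Dict[str, List[str]]:
    """For each noun wire, list the gates it passes through, in order."""
    seen: Dict[str, List[str]] = {}
    for g in circuit:
        for w in g.wires:
            seen.setdefault(w, []).append(g.label)
    return seen


# Toy text (invented): "Alice is happy. Alice quickly runs to Bob."
circuit = [
    Gate("happy", ("Alice",)),
    adposition("to", adverb("quickly", Gate("runs", ("Alice",))), "Bob"),
]
wires = history(circuit)
```

Reading off `wires["Alice"]` gives the sequence of gates that update the meaning of that noun as the text progresses.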

Mathematical results

Main text circuit theorem

  • The translation rules of Section 5.3 associate any text generated by the grammar with a text circuit.
  • All text circuits can be obtained using the translation rules.

Refinements, extensions, conventions

  • Convention 5.2 clarifies what counts as a text circuit
  • Refinement 5.3 introduces a special verb EXISTS for unconnected wires
  • Lemma 5.15 uses a generic conjunction [&] to express textual elements
  • Refinement 5.5 uses a generic contentless complementiser [THAT]
  • Definition of text circuits extended to include reflexive pronoun boxes
  • Convention 5.6 introduces syntactic sugar for reflexive pronoun boxes
  • Reflexive pronoun boxes can be split and eliminated

Text diagrams to text circuits

  • Lemma 5.8 and Corollary 5.9 constructively organise rewrites of Refinement 5.7 into a function from text diagrams to text diagrams with no pronominal links
  • Lemma 5.11 introduces new rewrite rules and ancilla types to constructively constitute a function from text diagrams with no pronominal links nor phrase scoping to a normal form that corresponds to a single circuit gate
  • Lemma 5.12 extends Lemma 5.11 to account for text diagrams with phrase scope
  • Lemma 5.12 proves that every text diagram without pronominal links can be viewed as a unique text circuit
  • Lemma 5.12 introduces ancillas and rewrites to obtain a normal form for ADV and ADP nodes and labels contracted to their parent V node
  • Lemma 5.12 introduces the ADV-gather, ADV-assoc, ADP-ancilla, adp-gather, adp-assoc, and adp-ADV-order rules to re-express multiple ADV and ADP nodes on the same wire
  • Lemma 5.12 introduces the conjugate box with overlapping arguments as syntactic sugar standing in for a ‘normal’ CNJ box inside reflexive pronoun boxes

Proof of theorem

  • Translation procedure from text to circuit yields a function T → C
  • Function is surjective
  • Lemma 5.12 handles text diagrams without pronominal links
  • Transformations eliminate pronominal links
  • Reflexive pronoun boxes addressed by Convention 5.6 and Refinement 5.7
  • Lemma 5.8 provides normal form for reflexive pronoun boxes
  • Text circuit captures essential connectivity of meaning
  • Text Circuit Thesis: equal circuits stand for equal text meanings
  • Recipe for engineering extensions of grammar to accommodate grammatical phenomena
  • Examples of passive voice, possessive pronouns, and adjectivalisation of verbs
  • Dependency grammar, typelogical grammar, discourse representation theory used as starting point for obtaining text circuits
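
One way to make "equal circuits" concrete in code is to compare circuits up to reordering of gates that share no wires. The sketch below is not the paper's construction; it is a standard layering trick in the spirit of the Foata normal form from trace theory, and the function name `layers` and the example sentences are hypothetical.

```python
from typing import Dict, List, Tuple

Gate = Tuple[str, Tuple[str, ...]]   # (label, wires the gate acts on)


def layers(circuit: List[Gate]) -> List[List[Gate]]:
    """Push every gate to the earliest layer its wires allow, then sort layers."""
    depth: Dict[str, int] = {}       # wire -> index of its next free layer
    out: List[List[Gate]] = []
    for label, wires in circuit:
        level = max((depth.get(w, 0) for w in wires), default=0)
        if level == len(out):
            out.append([])
        out[level].append((label, wires))
        for w in wires:
            depth[w] = level + 1
    return [sorted(layer) for layer in out]


# Invented texts. The first two differ only in sentence order.
text_a = [("happy", ("Alice",)), ("drunk", ("Bob",)), ("likes", ("Alice", "Bob"))]
text_b = [("drunk", ("Bob",)), ("happy", ("Alice",)), ("likes", ("Alice", "Bob"))]
text_c = [("likes", ("Alice", "Bob")), ("happy", ("Alice",)), ("drunk", ("Bob",))]

same = layers(text_a) == layers(text_b)        # independent gates commute
different = layers(text_a) != layers(text_c)   # order on a shared wire matters
```

Under this toy notion of equality, many different gate orderings collapse to one circuit, which gives a feel for why a translation from text to circuits need not be injective.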

Our hybrid grammar for text

  • Pronominal link data is needed to move from single sentences to text
  • Hybrid grammar belongs to the class of grammars subsumed by linear context-free rewrite systems
  • Composition functions of linear context-free rewrite systems resemble the gates of text circuits
  • Parsing text in the hybrid grammar takes polynomial time if the algorithm for determining pronominal links is ‘effectively computable’

Text circuits

  • Text circuits closely align with the dynamic semantics perspective
  • Text circuits correspond to discourse referents of DRT
  • Text circuits handle adverbs as boxes
  • Text circuits are an extension of discourse theory
  • Text circuits can be seen as graph models
  • Text circuits are compatible with Montague semantics
  • Text circuits capture the process of reading and understanding text
  • Text circuits allow wires to twist past each other
  • Text circuits replace pronominal links with their text diagram counterparts
  • Text circuits allow phrase scope as phrase bubbles
  • Text circuits allow sentence composition within phrase scope
  • Text circuits treat reflexive pronoun boxes as if noun labels duplicate
  • Text circuits allow adjectivalisation of verbs by gerund -ING
  • Text circuits allow possessive pronouns