Paper Content
Introduction
- Goal: a compositional framework in which distributional meanings and linguistic structure interact meaningfully and transparently
- Flexible enough to accommodate compositionality beyond Frege’s bottom-up notion
- Local meanings of words informed by global textual context
- Modern machine learning methods involve top-down meaning flows
- DisCoCat proposed in 2008, enjoyed empirical support
- DisCoCirc proposed as an improvement of DisCoCat
- Produce text meanings from sentence meanings
- Meanings of words may change as text progresses
- Space of sentence meanings is left unspecified in DisCoCat
- In DisCoCirc, the space of sentence meanings is made up from the relevant nouns
- Text circuits are two-dimensional
- Hybrid grammar for text
- Phrase structure, pronominal links, phrase scope, transformational grammar rules
- Text diagrams from text with hybrid grammar
- Text circuits generative for text
- Characterise text circuits as essential meaning connectedness
- Text circuits conservatively extended to accommodate more features of language
- Parsing, expanding fragment of language, relationships between text circuits and other grammatical and semantic formalisms
- Distillation of text circuits from text
- Text circuits eliminate grammatical bureaucracy
- Model meanings in circuits with vector spaces, linear maps and tensor product
- Human language one-dimensional vehicle for higher-dimensional content
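The vector-space reading of circuits mentioned above can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions: the two-dimensional noun spaces, the example vectors, and the matrix chosen for `LOVES` are all hypothetical and carry no linguistic content. It only shows that a two-noun gate is a linear map on the tensor product of the two noun spaces.

```python
def kron(u, v):
    """Tensor (Kronecker) product of two vectors given as lists."""
    return [a * b for a in u for b in v]

def apply_gate(matrix, vector):
    """Apply a linear map (list of rows) to a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

# Hypothetical noun states in a 2-dimensional space.
alice = [1, 0]
bob = [0, 1]

# Hypothetical transitive-verb gate: a 4x4 linear map on the joint space.
LOVES = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]

joint = kron(alice, bob)          # parallel composition of the two wires
after = apply_gate(LOVES, joint)  # the gate updates both nouns together
```

Sequential composition of gates corresponds to applying one linear map after another, and placing wires side by side corresponds to the tensor product.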
A hybrid grammar for text
- Introducing a hybrid grammar that is generative and captures linguistic connectedness
- Developed in three steps: a context-sensitive grammar for simple sentences, pronominal links to identify recurring nouns and pronouns, and rules to fuse together simple sentences
- Uses ideas from Chomsky’s transformational phrase structure grammars, Lambek’s pregroups, discourse representation theory, and dependency grammars
- Does not deal with some grammatical phenomena, omits certain grammatical patterns, and only deals with part of language
- Combines grammar and meaning for efficient tools
Preliminaries
- String diagrams are a graphical mathematical framework for composing input-output boxes.
- Boxes can be composed in parallel or in sequence.
- Symbolic labels for intermediate symbols are replaced by colored edges.
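The two kinds of composition can be sketched in a few lines. This is an illustrative sketch, not the paper's formalism: the `Box` class and its `then` and `tensor` methods are hypothetical names, and wire types are plain strings standing in for the coloured edges.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Box:
    """An input-output box: a name plus typed input and output wires."""
    name: str
    dom: Tuple[str, ...]  # input wire types
    cod: Tuple[str, ...]  # output wire types

    def then(self, other: "Box") -> "Box":
        """Sequential composition: outputs of self feed the inputs of other."""
        if self.cod != other.dom:
            raise TypeError(f"{self.name} outputs {self.cod}, "
                            f"but {other.name} expects {other.dom}")
        return Box(f"({self.name} ; {other.name})", self.dom, other.cod)

    def tensor(self, other: "Box") -> "Box":
        """Parallel composition: the two boxes are placed side by side."""
        return Box(f"({self.name} | {other.name})",
                   self.dom + other.dom, self.cod + other.cod)

f = Box("f", ("A",), ("B",))
g = Box("g", ("B",), ("C",))
h = Box("h", ("X",), ("Y",))

seq = f.then(g)    # one box from A to C
par = f.tensor(h)  # one box from (A, X) to (B, Y)
```

Sequential composition is only defined when the wire types match, which is why `then` checks them, while parallel composition is always defined.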
Simple sentence structure
- Introducing a phrase structure grammar
- Intransitive and transitive verbs have different tree-fragments
- Adpositions can be added to the right of intransitive and transitive verb-phrases
- The grammar of this section is referred to as the “hybrid grammar” or just the “grammar”
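The shape of simple sentences described above can be sketched as a toy recogniser. This is a deliberately flattened approximation, not the paper's context-sensitive grammar: the lexicon, the tag names, and the function `parse_simple` are all hypothetical, and only the patterns noun + intransitive verb and noun + transitive verb + noun, each optionally followed by adposition + noun, are covered.

```python
# Hypothetical toy lexicon mapping words to grammatical tags.
LEXICON = {
    "Alice": "N", "Bob": "N", "park": "N",
    "runs": "IV", "sees": "TV", "in": "ADP",
}

def parse_simple(sentence):
    """Return (verb kind, number of adpositional phrases) or None."""
    tags = [LEXICON.get(word) for word in sentence.split()]
    adpositions = 0
    # Adpositional phrases attach to the right of the verb phrase.
    while len(tags) > 2 and tags[-2:] == ["ADP", "N"]:
        tags = tags[:-2]
        adpositions += 1
    if tags == ["N", "IV"]:
        return ("intransitive", adpositions)
    if tags == ["N", "TV", "N"]:
        return ("transitive", adpositions)
    return None
```

For example, `parse_simple("Alice sees Bob in park")` recognises a transitive sentence with one adpositional phrase, while `parse_simple("runs Alice")` is rejected.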
Compound sentence structure i: pronominal links
- Sentences with more than one verb are called compound sentences
- Pronominal links identify nouns from different sentences
- Relative pronouns can be used to fuse sentences together
- Subject relative pronouns replace the subject noun of a parse tree
- Object relative pronouns come after the first occurrence of a noun
- Reflexive pronouns can be used in conjunction with relative pronouns
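The role of pronominal links can be illustrated by resolving pronouns to the nouns they refer to. This is a minimal sketch assuming the link data is supplied from outside, for instance by a coreference step: the sentence encoding, the `links` dictionary, and the function `resolve` are hypothetical.

```python
# Each simple sentence is a verb with an ordered list of noun or pronoun slots.
text = [
    ("SEES", ["Alice", "Bob"]),
    ("LIKES", ["she", "him"]),
]

# Hypothetical link data: (sentence index, slot index) -> noun referred to.
links = {(1, 0): "Alice", (1, 1): "Bob"}

def resolve(text, links):
    """Replace every linked pronoun slot by the noun it is linked to."""
    resolved = []
    for i, (verb, slots) in enumerate(text):
        nouns = tuple(links.get((i, j), word) for j, word in enumerate(slots))
        resolved.append((verb, nouns))
    return resolved

resolved = resolve(text, links)
```

After resolution both sentences mention the same two nouns, so they can later act on the same pair of noun wires.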
Compound sentence structure ii: phrase scope
- Verbs with sentential complements form their own grammatical class
- Phrase scope is used to eliminate ambiguity
Text diagrams
- Text diagrams are related to dependency grammars and link grammars.
- Text diagrams are lighter than dependency grammars as they do not specify as many grammatical relationships.
- Text diagrams are related to pregroup grammars, context-free grammars, and combinatory categorial grammars.
- Text diagrams are a type-restricted analogue of transformational grammars.
Simple sentences as text diagrams
- Replace S with sentence-dependent number of NP wires
- Remove S types
- Preserve number of input and output noun wires
- Alter rules to take same number of NP types as input
- Arrange diagram so that matching input and output NP wires align
Rewriting text diagrams
- Relative pronouns can be transformed using Chomskian phrase-structure transformations
- Diagram rewrites can be used to replace diagrams with the same input and output wires
Pronominal links in text diagrams
- Pronominal links take a different shape in text diagrams than in grammar
- Wires link output and input noun wires in a chain
- Link-elimination rewrites can cause pieces to vanish
- Wires and labels may cross one another in text diagrams
- Phrase scoped regions act as planar obstacles in text diagrams
- Verbs with sentential complements and conjunctions require special diagram pieces
Definition
- Wires represent nouns
- Gates represent adjectives, intransitive verbs, and transitive verbs
- Boxes with holes represent adverbs
- Adpositions modify verbs by adding another noun-wire
- Families of boxes accommodate input circuits of all sizes
- Conjunctions are boxes that take two circuits
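The ingredients listed above can be sketched as a small data structure. This is an illustrative encoding under stated assumptions, not the paper's definition: the class names `Gate` and `HoleBox`, the `extra_wires` field, and the example sentence are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Gate:
    """An adjective or verb acting on one or more noun wires."""
    label: str
    wires: Tuple[str, ...]

@dataclass(frozen=True)
class HoleBox:
    """A box with a hole: it wraps an inner gate or another box.

    An adverb adds no wires, while an adposition adds one more noun wire.
    """
    label: str
    inner: Union["Gate", "HoleBox"]
    extra_wires: Tuple[str, ...] = ()

    @property
    def wires(self) -> Tuple[str, ...]:
        return self.inner.wires + self.extra_wires

# "Alice runs quickly towards Bob", encoded from the inside out.
runs = Gate("RUNS", ("Alice",))
quickly = HoleBox("QUICKLY", runs)               # adverb: same wires
towards = HoleBox("TOWARDS", quickly, ("Bob",))  # adposition: one more wire
```

Nesting the boxes mirrors the idea that adverbs and adpositions modify a verb rather than acting on nouns directly.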
Mathematical results
Main text circuit theorem
- The translation rules of Section 5.3 associate any text generated by the grammar with a text circuit.
- All text circuits can be obtained using the translation rules.
Refinements, extensions, conventions
- Convention 5.2 clarifies what counts as a text circuit
- Refinement 5.3 introduces a special verb EXISTS for unconnected wires
- Lemma 5.15 uses a generic conjunction [&] to express textual elements
- Refinement 5.5 uses a generic contentless complementiser [THAT]
- Definition of text circuits extended to include reflexive pronoun boxes
- Convention 5.6 introduces syntactic sugar for reflexive pronoun boxes
- Reflexive pronoun boxes can be split and eliminated
Text diagrams to text circuits
- Lemma 5.8 and Corollary 5.9 constructively organise rewrites of Refinement 5.7 into a function from text diagrams to text diagrams with no pronominal links
- Lemma 5.11 introduces new rewrite rules and ancilla types to constructively constitute a function from text diagrams with no pronominal links nor phrase scoping to a normal form that corresponds to a single circuit gate
- Lemma 5.12 extends Lemma 5.11 to account for text diagrams with phrase scope
- Lemma 5.12 proves that every text diagram without pronominal links can be viewed as a unique text circuit
- Lemma 5.12 introduces ancillas and rewrites to obtain a normal form for ADV and ADP nodes and labels contracted to their parent V node
- Lemma 5.12 introduces the ADV-gather, ADV-assoc, ADP-ancilla, adp-gather, adp-assoc, and adp-ADV-order rules to re-express multiple ADV and ADP nodes on the same wire
- Lemma 5.12 introduces the conjugate box with overlapping arguments as syntactic sugar standing in for a 'normal' CNJ box inside reflexive pronoun boxes
Proof of theorem
- Translation procedure from text to circuit yields a function T → C
- Function is surjective
- Lemma 5.12 handles text diagrams without pronominal links
- Transformations eliminate pronominal links
- Reflexive pronoun boxes addressed by Convention 5.6 and Refinement 5.7
- Lemma 5.8 provides normal form for reflexive pronoun boxes
- Text circuit captures essential connectivity of meaning
- Text Circuit Thesis: equal circuits stand for equal text meanings
- Recipe for engineering extensions of grammar to accommodate grammatical phenomena
- Examples of passive voice, possessive pronouns, and adjectivalisation of verbs
- Dependency grammar, typelogical grammar, and discourse representation theory can be used as starting points for obtaining text circuits
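The Text Circuit Thesis can be illustrated by comparing circuits rather than word order. The sketch below assumes a simplified notion of circuit equality in which gates acting on disjoint wires may slide past each other. The gate-list encoding and the function `normal_form` are hypothetical, and a full treatment would need further equations.

```python
def normal_form(gates):
    """Canonical layering of a gate list, up to commuting independent gates.

    Each gate is (label, wires). A gate lands one layer below the deepest
    earlier gate that shares a wire with it; layers are then sorted.
    """
    depth = {}   # wire -> layer of the last gate placed on that wire
    layers = {}  # layer number -> gates in that layer
    for label, wires in gates:
        layer = 1 + max((depth.get(w, 0) for w in wires), default=0)
        for w in wires:
            depth[w] = layer
        layers.setdefault(layer, []).append((label, wires))
    return tuple(tuple(sorted(layers[k])) for k in sorted(layers))

# "Alice runs. Bob sleeps. Alice loves Bob."
text_a = [("RUNS", ("Alice",)), ("SLEEPS", ("Bob",)), ("LOVES", ("Alice", "Bob"))]
# "Bob sleeps. Alice runs. Alice loves Bob."
text_b = [("SLEEPS", ("Bob",)), ("RUNS", ("Alice",)), ("LOVES", ("Alice", "Bob"))]
# "Alice loves Bob. Alice runs. Bob sleeps."
text_c = [("LOVES", ("Alice", "Bob")), ("RUNS", ("Alice",)), ("SLEEPS", ("Bob",))]

same = normal_form(text_a) == normal_form(text_b)
different = normal_form(text_a) != normal_form(text_c)
```

Under this assumption the first two texts give the same circuit, since the two one-noun gates are independent, while the third does not, because the two-noun gate shares wires with both of the others.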
Our hybrid grammar for text
- Pronominal link data is needed to move from single sentences to text
- Hybrid grammar belongs to the class of grammars subsumed by linear context-free rewrite systems
- Composition functions of linear context-free rewrite systems resemble the gates of text circuits
- Parsing text in the hybrid grammar is polynomial time if the algorithm for determining pronominal links is 'effectively computable'
Text circuits
- Text circuits closely align with the dynamic semantics perspective
- Wires of text circuits correspond to the discourse referents of DRT
- Text circuits handle adverbs as boxes
- Text circuits are an extension of discourse theory
- Text circuits can be seen as graph models
- Text circuits are compatible with Montague semantics
- Text circuits capture the process of reading and understanding text
- Text circuits allow wires to twist past each other
- Text circuits replace pronominal links with their text diagram counterparts
- Text circuits allow phrase scope as phrase bubbles
- Text circuits allow sentence composition within phrase scope
- Text circuits treat reflexive pronoun boxes as if noun labels duplicate
- Text circuits allow adjectivalisation of verbs by gerund -ING
- Text circuits allow possessive pronouns