Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Formalization of Knowledge Graphs as a new class of graphs
- Double-permutation equivariance for KG representations
- Structural representation of relations allows neural networks to perform complex logical reasoning tasks
- General blueprint for equivariant representations
- GNN-based double-permutation equivariant neural architecture achieves 100% Hits@10 test accuracy
Paper Content
Introduction
- Knowledge graphs are structured representations of facts in the form of triplets.
- Triplets consist of two entities connected by a relation.
- Knowledge graphs are often incomplete.
- The task of predicting missing relations is widely studied.
- Graph Neural Network methods can be adapted for link prediction.
- Knowledge graphs can be treated as attributed graphs.
- Some knowledge graphs belong to a new class of graphs with equivariant node and pairwise representations.
- Experiments are conducted on inductive and synthetic tasks.
Theory review: what are attributed graphs? an exchangeability perspective
- Multigraphs are sequences of edges with N ≥ 2 vertices and R ≥ 1 relation attributes.
- Node attributes are defined as a special type of edge reserved for self-loops.
- Predictions should be invariant to the permutation of node ids.
- GNNs are permutation-equivariant representation functions.
- Link prediction is better served by equivariant pairwise representations.
- Functions and neural networks should be invariant to the action of any permutation.
Brief related work: knowledge graphs as attributed multigraphs
- Knowledge Graphs (KGs) are a key ingredient in successful search engines
- KGs are sometimes described as attributed (multi)graphs
- KGs can be treated as sequences (non-exchangeable) or as graphs (exchangeable)
- State-of-the-art methods for KG completion treat KGs as attributed (multi)graphs
- Proposal: Define some KGs as double exchangeable attributed (multi)graphs
- KGs are defined as µ(A G ) = µ(A H ) for any isomorphic KGs A G and A H
- KG isomorphism is defined as two multigraphs A G and A H being isomorphic if there exists a bijection φ : V (N ) → V (N ) and a bijection τ : R (R) → R (R) such that A G i,r,j = A H φ(i),τ(r),φ(j)
- Invariant KG representations are defined as any (statistical) loss function over a knowledge graph A train G must be the same over any isomorphic KGs A H KG A train G
- Invariant triplet representation for KGs is defined as Γ tri ((i, r, j), A)
- Most-expressive invariant triplet representation is defined as Γ tri ((i, r, j), A)
- Double equivariant representation is defined as Γ gra (A)
- Theorem 4.8 connects Definitions 4.5 and 4.7
Consequences of invariant predictors in kgs
- Two KG completion tasks are impossible for standard KG completion methods.
- A fictional alien civilization is used as an example.
- All edge attributes in the example are unique.
- Invariant KG representations can solve the tasks.
- The tasks can be solved by paying attention to the structural relations between nodes and their relations, not their absolute ids.
- A uniform distribution is predicted over the remaining relations in the training data.
Connection to learning logical rules
- Definition 4.9 is a generalization of first order logic clauses
- Definition 4.9 allows relations in Horn clauses to be universally quantified
- Definition 4.9 does not require Horn clauses to form a path in the KG
- Theorem 4.10 states that a set of UQER Horn clauses can predict the same positive triplets as a triplet predictor
Inductive double-exchangeable neural architecture for kgs
- Proposed model to learn invariant triplet representation for KGs
- Double equivariant function can be used to obtain invariant triplet representation
- Inductive structural doubly-exchangeable architecture proposed to learn double equivariant functions
- Double equivariance described as a graph equivariance and set equivariance
- Linear double-equivariant layer composed of Siamese layer and GNN layers
- Max aggregator used to determine if pair of nodes is connected
- Neural network for KGs defined by linear double equivariant layers and non-polynomial activation functions
Implementation considerations
- Computationally and memory-wise expensive to use most-expressive pairwise representations for 2
- Propose inductive structural doubly-exchangeable architecture (IS-DEA) to trade-off expressivity for speed
- IS-DEA performs vertex message passing through two learnable functions, such as MLPs, recursively over T layers
- Initializing node representations with no node attributes
- Concatenate node representations with distance between nodes in triplet representation
- Triplet representation is an invariant triplet representation
- Use negative sampling and cross-entropy loss for model optimization
Experiments
- Evaluated IS-DEA on two synthetic tasks and two inductive knowledge graph completion datasets
- WN18RR-v1 has an easy task
- Reported mean performance over 5 runs, small variance
Synthetic experiments
- Proposed two challenging family tree completion tasks to verify the benefits of the model
- Model is insensitive to relation identity and can automatically generalize to new nodes and relations
- Model does not need to learn parameters for each relation, allowing it to inductively infer over a KG with new and more relations
Real-world knowledge graphs
- There are no real-world benchmarks where training and test KGs have distinct nodes and relations.
- Experiments are limited to small-scale KGs.
- IS-DEA results are invariant to the permutation of relations in test.
- IS-DEA obtains a perfect score on the key metric Hits@10.
- IS-DEA has the same poor pre-processing scalability as GraIL.
Conclusions
- Introduced the concept of double exchangeable attributed graphs as a formal model for KGs
- Showed that double symmetries (node and relation ids) impose structural rule learning in KGs
- Introduced a blueprint for double equivariant neural network architectures for KGs
- Showed this architecture can learn logical rules that standard KG methods cannot
- Experiments showed that even a simple double exchangeable architecture (IS-DEA) achieves promising results in inductive KG completion tasks
- Factorization-based methods use two learnable embedding matrices to score combinations of relations and entities
- GNN-based models use an attributed graph neural network on an attributed enclosed subgraph
- NodePiece augments node representation with shortest distances to several anchors
- Neural LP and DRUM extract all paths from i to j within G (i,j) and use a recurrent neural network
- GraIL and NBFNet assign node attributes and fill H (R) to corresponding edge as edge attributes
- IS-DEA uses a double symmetric graph neural network
- Family Diagram 1 (FD-1) is based on two logic chain rules
- Family Diagram 2 (FD-2) is an extension of FD-1 with two binary trees