Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

State-of-the-art causal models assume inherent node factors that govern link formation.
Link formation can be path-dependent, meaning the outcome of link interventions depends on existing links.
Existing causal methods are impractical in these scenarios.
This work develops the first causal model capable of dealing with path dependencies in link prediction.

Paper Content

Introduction

Charles Spearman published The Abilities of Man in 1927, which described mathematical tools to uncover latent common factors of intelligence.
Spearman’s work started a revolution that gave us matrix and tensor factorizations, PCA, and ICA.
Spearman warned against interpreting the factors of subject i as innate rather than acquired abilities.
Two competing causal hypotheses exist to describe link formation between young children and their abilities: path-dependent and innate factors.
Current causal models for link prediction operate under the assumption of innate factors.
Causal lifting is an invariance property of causal models that is able to serve link prediction under interventions in both path-dependent and innate factor models.

A family of causal link prediction tasks

Observe graph data at pre-trial time t0 from unknown causal graph generation process
Denote observed graph at pre-trial time by G (t0) with corresponding adjacency matrix a (t0)
Probe into relation of (I, J) ∼ µ(G (t0) )
Probe induces intervention in graph formation process
Task is to predict what would have happened had we probed into relation of different pair (U, V ) ∼ µ(G (t0) )
Challenges include unknown data generating process, spillover and carryover effects
Introduce concept of causal lifting and causal link prediction learning task
Introduce three situations where theoretical results can be leveraged
Invariant pairwise embedding methods outperform node embedding ones in causal link prediction tasks

Existing link prediction literature

Link prediction has been studied since the early 1900s
Link prediction is either treated as an associational or causal task
Self-supervised learning task is used to predict true edges and nonedges in the adjacency matrix
Matrix factorization and GNN node embeddings are used to encode associations between graph topology and node/edge attributes
Model-based methods fit a random graph model to the observed graph
Mechanistic methods assume links are more likely created by nodes with many common neighbors
Inner factors literature merges matrix factorization and interventional data
Recommendations as treatments literature evaluates the effectiveness of a recommendation algorithm
Causal effect estimation on networks estimates the difference in potential outcomes of two treatments given in a network

Causal lifting

Causal lifting is a causal extension to the associational definition of lifting in probabilistic inference
Definition 1 is used to avoid unnecessary computations in probabilistic inference algorithms
Definition 2 and 3 extend lifting to interventional and counterfactual distributions
Interventional lifting can play a key role in identifying causal link prediction tasks

Interventional lifting and link prediction

Goal is to show how interventional lifting can serve as an identification strategy and experimental design tool for causal link prediction task
Main result (Theorem 1) presented in context of idealized single probe task
Sections 4.2 and 4.3 provide more details behind concepts used in Theorem 1
Sections 4.4 and 4.5 show extra causal mechanism assumptions to learn general task of Equation (5) with multiple probes

Causal link prediction with single probe lifting

Interventional lifting can allow us to answer counterfactual query of Equation (2).
Two node pairs (i, j), (u, v) from G (t0) are isomorphic if they are structurally indistinguishable.
Under certain conditions, interventional lifting can be obtained.
The observed graph G (t0) is sufficiently expressive of the causal link formation process.
Theorem 1 shows that if (u, v) is in (i, j)’s orbit in G (t0), the probe outcome Y uv is the same.
The invariances in causal mechanisms ensure interventional lifting.
Corollary 1 provides an identification and estimation procedure for causal link prediction.
The hidden parameters associated with the orbit O (t0) ij of a node pair (i, j) in G (t0) d-separates the probe from the generating process of G (t0).
Theorem 1 relies on a universal class of SCMs for graphs and a few symmetry conditions in their causal mechanisms.

A universal family of causal models for graphs

SCM generates edges and nonedges sequentially
SCM decides which pair of nodes will have an edge or nonedge
SCM can generate attributed graphs
SCM takes sequence of all previously generated edges and nonedges as input

Mechanism invariances sufficient for interventional lifting in link prediction

X is invariant to the SCM intermediate states between the time the intervention probe is performed and the instant before we see its effect
X is invariant to the order in which edges and nonedges have been generated
X is invariant to which pairs of nodes were generated as non-links or were not generated at all
X is invariant to permutations of the node identifiers
Outcome of the probe is invariant to whatever happened between the observation of the graph and the probe
Outcome of the probe is invariant to whether an observed non-link is indeed a non-link generated by the causal model or a node pair not-yet-executed
Exogenous variables between node pairs in an SCM are independent and identical

Lifting in causal link prediction with multiple probes

We can identify symmetric causal link prediction queries under certain conditions.
Corollary 1 shows how a probe in (i, j) can be used to estimate a counterfactual query on a pair (u, v) in (i, j)’s orbit.
We can extend our causal assumptions to learn to answer counterfactual queries about any pair using multiple probes as training data.
Non-interfering probes is an invariance condition depending on both the causal mechanisms and on the pairs (I (tm) , J (tm) )’s we probe at each time t m .
Non-interference tends to be satisfied with higher probability if the probes are performed in pairs far away in the graph.
We can use multiple probes to estimate counterfactual queries, but are still restricted to probing in the orbit of the first queried pair.
We can sample pairs according to an arbitrary policy µ(G (t0) ) and learn a model capable of predicting our counterfactual query for any (U, V ) ∼ µ(G (t0) ).
We need two extra assumptions: i.i.d. exogenous variables and shared parameters of the probes’ distributions across all orbits.
We can approximate the equation with a MAP estimate and a prior P (W Γ ).

Causal lifting induces a supervised learning solution to causal link prediction

Proposition 1 describes when it is possible to estimate a causal link prediction task with multiple probes in different orbits.
A supervised learning-based approach is used, with probed pairs and their observed outcomes as labels.
Prediction is given by a graph embedding model.
Equation (13) estimates parameters and relies on assumptions from Proposition 1.

Graph embeddings for causal link prediction

Node embedding methods build a graph by computing the node embeddings of the two nodes in the represented pair and merging them with a binding function.
Structural (joint) pairwise embeddings have lower bias than structural node embeddings and represent the causal structure of the task.

Structural (joint) pairwise embeddings are the ideal graph embeddings for causal link prediction tasks

Proposition 1 and Figure 3 show that the natural representation for the causal link prediction task is a most-expressive pairwise representation.
Theorem 1(iv) states that most-expressive graph models capture all causal invariances in the task.
Structural node embeddings are a popular attempt to design structural representations of node pairs.
Structural joint pairwise embeddings are defined to distinguish structural representations acting jointly on the node pair from structural representations acting separately on its nodes.
Structural pairwise embeddings capture properties such as distances and common neighbors which play a central role in link formation mechanisms.

Structural node embeddings are undesirable for causal link prediction due to model bias

Joint embeddings and Graph Neural Networks (GNNs) are permutation-invariant node embeddings
Structural node embeddings encompass GNNs and classical structural node roles
Structural pairwise embeddings have lower bias than structural node embeddings
Pairwise symmetric graphs are important in link prediction tasks
Theorem 1 states when structural pairwise embeddings have lower bias than structural node embeddings
Positional node embeddings are samples from an equivariant distribution
Strictly positional node embeddings assign distinct representations to every node in a symmetric graph
Matrix factorization and other factor models reflect some notion of node distances in the graph
SVD and neural matrix factorization graph embeddings are strictly positional
SVD assigns the same embedding to a node pair if they are isomorphic and have the exact same neighborhood
Strictly positional node embeddings assume an incorrect causal structure for the causal link prediction task

Experimental results

Evaluated theoretical findings on identifying and estimating counterfactual query from Equation (5)
Investigated accuracy of positional node embeddings, structural node embeddings, and pairwise invariant embeddings
Investigated three questions: (Q1) How do structural node embeddings perform when test set contains tuples that are nodewise isomorphic to training? (Q2) How do positional node embeddings perform when test set contains tuples that did not participate in training? (Q3) How do positional and structural embedding solutions perform in real-world scenarios?
Evaluated Q1 and Q2 in knowledge base completion task, Q2 in covariance matrix estimation task, and Q2 and Q3 in two user-item recommendation tasks
Used Nonnegative Matrix Factorization (NMF), SVD, positional GCN, classical GCN, UniMP, TransE, DistMult, ComplEX, SEAL, and Neo-GNNs

Impact on knowledge graph queries

Task is causal link prediction in knowledge graphs
Constructed knowledge base of 100 non-isomorphic family trees
Dataset contains 28 relation types
Goal is to predict other family relations from parentOf relations
Model satisfies conditions from Theorem 1 and Proposition 1
30% of family trees contain two isomorphic subtrees
Structural node embeddings assign same representation to isomorphic nodes
Positional node embeddings can distinguish relations but may not make same prediction for isomorphic pairs
Structural pairwise embeddings do not suffer from these problems
Figure 5 confirms that structural pairwise embeddings perform better than positional node embeddings and structural node embeddings

Impact on covariance matrix estimation

Task is mostly unexplored in causal link prediction literature
Consider observing samples from a joint distribution (e.g. list of measurements from a patient)
Construct estimated covariance matrix for attribute variables
Counterfactual query: what would have been the re-estimated values for entries of attributes not collected?
Experiments are made simultaneously, samples are i.i.d.
Assumption needed is that exogenous variables from patients are i.i.d.
Evaluated methods: Label-GCN, positional node embeddings, structural node embeddings

Impact on recommender systems

Two user-item recommendation datasets used: Amazon Electronics and Last FM
Graph built from user-item interactions between 11/24/2015 and 12/24/2015 in AE and between 2007 and 2013 in LFM
Counterfactual problem: what would other users consume if exposed to items?
Male users probed in both datasets
Interactions between 12/24/2015 and 12/31/2015 for AE and in 2014 for LFM
Nonedge probe examples selected by sampling nodes uniformly at random
Unclear whether causal modeling assumptions hold
Node embedding methods doomed to failure
Structural pairwise embeddings present better performance
GCN achieves reasonable performance in AE dataset

Link to paper#

Abstract#

Paper Content#

Introduction#

A family of causal link prediction tasks#

Existing link prediction literature#

Causal lifting#

Interventional lifting and link prediction#

Causal link prediction with single probe lifting#

A universal family of causal models for graphs#

Mechanism invariances sufficient for interventional lifting in link prediction#

Lifting in causal link prediction with multiple probes#

Causal lifting induces a supervised learning solution to causal link prediction#

Graph embeddings for causal link prediction#

Structural (joint) pairwise embeddings are the ideal graph embeddings for causal link prediction tasks#

Structural node embeddings are undesirable for causal link prediction due to model bias#

Experimental results#

Impact on knowledge graph queries#

Impact on covariance matrix estimation#

Impact on recommender systems#

Link to paper

Abstract

Paper Content

Introduction

A family of causal link prediction tasks

Existing link prediction literature

Causal lifting

Interventional lifting and link prediction

Causal link prediction with single probe lifting

A universal family of causal models for graphs

Mechanism invariances sufficient for interventional lifting in link prediction

Lifting in causal link prediction with multiple probes

Causal lifting induces a supervised learning solution to causal link prediction

Graph embeddings for causal link prediction

Structural (joint) pairwise embeddings are the ideal graph embeddings for causal link prediction tasks

Structural node embeddings are undesirable for causal link prediction due to model bias

Experimental results

Impact on knowledge graph queries

Impact on covariance matrix estimation

Impact on recommender systems