Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

The unit selection problem aims to identify objects (units) that exhibit a desired behavior when subjected to stimuli
Existing work focuses on bounding a specific class of objective functions
Proposed algorithm for finding optimal units given a broad class of causal objective functions and a fully specified structural causal model
Unit selection under this class of objective functions is $\text{NP}^\text{PP}$-complete
Treewidth-based complexity bounds on proposed algorithm

Theory of causality based on two parallel hierarchies: information and reasoning
Three levels of reasoning: associational, interventional and counterfactual
Knowledge encoded as associational, causal and functional models
Unit selection problem: selecting customers to target with an encouragement offer
Four types of customers: responders, always-takers, always-deniers, contrarians
Benefit function to score customers and identify most promising ones
Contrast with classical loss functions
Structured units: decisions, policies, people, situations, regions, activities
Fully specified SCM to obtain point values for any causal objective function
Computational problem of finding units that optimize causal objective functions
Exact algorithm to solve unit optimization problem: Reverse-MAP
Complexity of algorithm characterized by treewidth
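
The benefit function over the four customer types can be illustrated with a minimal sketch. The weights, the segment names, and all probabilities below are made-up assumptions for illustration and do not come from the paper; the sketch only shows the idea of scoring each candidate unit by a weighted sum of response-type probabilities and picking the best one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseProfile:
    """Probabilities of the four response types for one candidate unit."""
    responder: float      # buys only if encouraged
    always_taker: float   # buys regardless of the offer
    always_denier: float  # never buys
    contrarian: float     # buys only if NOT encouraged


def benefit(p, w_resp=1.0, w_taker=-0.5, w_denier=0.0, w_contra=-1.0):
    """Linear benefit: weighted sum of the four response-type probabilities.

    The default weights are hypothetical; in practice they would encode
    the gain or cost of targeting each type of customer.
    """
    return (w_resp * p.responder
            + w_taker * p.always_taker
            + w_denier * p.always_denier
            + w_contra * p.contrarian)


# Hypothetical candidate units (all numbers invented for illustration).
candidates = {
    "segment_a": ResponseProfile(0.50, 0.30, 0.15, 0.05),
    "segment_b": ResponseProfile(0.30, 0.10, 0.55, 0.05),
    "segment_c": ResponseProfile(0.40, 0.50, 0.05, 0.05),
}

scores = {name: benefit(p) for name, p in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Unlike a classical loss function, the score depends on counterfactual quantities (how the same customer would behave with and without the offer), which is why a fully specified SCM is needed to obtain point values.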

Causal objective functions involve observational, interventional or counterfactual probabilities
Goal is to find objects (units) that optimize the function
Linear combination of counterfactual probabilities
Unit variables are exogenous in the SCM
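
As a rough sketch of how such an objective can be evaluated when the SCM is fully specified, the toy model below enumerates a single exogenous noise variable to obtain counterfactual probabilities for each unit. The structural equation `f_Y`, the distribution `P_N`, and the weights are hypothetical choices made for this example, not taken from the paper.

```python
# Hypothetical toy SCM:
#   U : unit variable (exogenous), values 0 or 1
#   N : exogenous noise variable with a known distribution
#   X : binary treatment (1 = encouraged, 0 = not encouraged)
#   Y : binary outcome, Y = f_Y(X, U, N)
P_N = {0: 0.6, 1: 0.3, 2: 0.1}  # assumed distribution of N
UNITS = [0, 1]


def f_Y(x, u, n):
    """Structural equation for Y (invented for illustration)."""
    if u == 0:
        # n=0: responder, n=1: always-taker, n=2: always-denier
        return {0: x, 1: 1, 2: 0}[n]
    # u=1 -> n=0: always-taker, n=1: responder, n=2: contrarian
    return {0: 1, 1: x, 2: 1 - x}[n]


def counterfactual_prob(u, y_treated, y_untreated):
    """Pr(Y_{x=1} = y_treated, Y_{x=0} = y_untreated | u) by enumerating N."""
    return sum(p for n, p in P_N.items()
               if f_Y(1, u, n) == y_treated and f_Y(0, u, n) == y_untreated)


# Hypothetical weights, keyed by (outcome if treated, outcome if untreated).
WEIGHTS = {(1, 0): 1.0,    # responder
           (1, 1): -0.5,   # always-taker
           (0, 0): 0.0,    # always-denier
           (0, 1): -1.0}   # contrarian


def objective(u):
    """Linear combination of counterfactual probabilities for unit u."""
    return sum(w * counterfactual_prob(u, yt, yu)
               for (yt, yu), w in WEIGHTS.items())


best_unit = max(UNITS, key=objective)
print(best_unit, [round(objective(u), 3) for u in UNITS])
```

Enumerating all exogenous variables in this way is exponential in their number, which is what motivates the elimination-based algorithm discussed later.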

Unit selection is $\text{NP}^\text{PP}$-complete for the class of causal objective functions given in Equation (1).
Unit selection is NP-complete when unit variables correspond to all exogenous variables in the SCM.
We can evaluate the objective $L(u)$ by evaluating a single observational probability involving the unit variables $U$.
We can optimize the objective function $L(u)$ on an SCM $G$ by computing the instantiation $\operatorname{argmax}_u \Pr(y, w \mid x, v, e, u)$ on an objective model $G'$.
D-Reverse-MAP is $\text{NP}^\text{PP}$-complete.
D-Reverse-MAP is NP-complete if its target variables are all the SCM root variables.
Unit selection is NP-complete when the unit variables are all the SCM exogenous (root) variables.
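
To make the Reverse-MAP query concrete, here is a brute-force sketch on a small, hypothetical network. It reads the query as finding the instantiation u that maximizes Pr(e1 | u, e2), in line with the argmax expression above, where the target variables appear after the conditioning bar. The network structure, the CPT values, and all function names are assumptions for illustration only.

```python
from itertools import product

# Hypothetical CPTs over binary variables.
# U1, U2: root (target) variables; H: hidden; E1, E2: evidence variables.
P_U1 = {0: 0.9, 1: 0.1}
P_U2 = {0: 0.5, 1: 0.5}
P_H = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}    # P_H[u1][h]
P_E2 = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}}   # P_E2[h][e2]
P_E1 = {(0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.6, 1: 0.4},
        (1, 0): {0: 0.4, 1: 0.6}, (1, 1): {0: 0.2, 1: 0.8}}  # P_E1[(h, u2)][e1]


def joint(u1, u2, h, e1, e2):
    """Joint probability: the product of all CPT entries."""
    return (P_U1[u1] * P_U2[u2] * P_H[u1][h]
            * P_E2[h][e2] * P_E1[(h, u2)][e1])


def marginal(u1, u2, e2, e1=None):
    """Pr(u1, u2, e2) if e1 is None, else Pr(u1, u2, e1, e2)."""
    e1_values = (0, 1) if e1 is None else (e1,)
    return sum(joint(u1, u2, h, v, e2) for h in (0, 1) for v in e1_values)


def reverse_map(e1, e2):
    """Brute-force argmax over u of Pr(e1 | u, e2)."""
    best_u, best_p = None, -1.0
    for u1, u2 in product((0, 1), repeat=2):
        p = marginal(u1, u2, e2, e1) / marginal(u1, u2, e2)
        if p > best_p:
            best_u, best_p = (u1, u2), p
    return best_u, best_p


best_u, best_p = reverse_map(e1=1, e2=1)
print(best_u, round(best_p, 4))
```

Because the target variables are conditioned on, their priors cancel in the ratio, which is one way this query differs from ordinary MAP.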

VE algorithm for MAP uses factors to map variables to non-negative numbers
SCM distribution is given by multiplying all factors
MAP probability is given by maximizing out target variables from a factor
Naive evaluation of MAP probability has complexity of O(n exp(n))
VE algorithm for MAP has complexity of O(n exp(w)) where w is the width of the used elimination order
VE algorithm for Reverse-MAP runs two passes of elimination
First pass sums out variables under evidence e1, e2
Second pass sums out variables under evidence e2
Divide the factor from the first pass by the factor from the second pass, then maximize out the target variables, to obtain the Reverse-MAP probability
Complexity of Reverse-MAP VE is O(n exp(w))
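
The two-pass scheme can be sketched with a small factor implementation. This is a simplified illustration under stated assumptions (binary variables, a hypothetical five-variable network, a hand-picked elimination order), not the paper's implementation; the names `Factor`, `combine`, `eliminate`, `restrict`, and `sum_out_pass` are invented for this example.

```python
from itertools import product


class Factor:
    """Maps instantiations of binary variables to non-negative numbers."""
    def __init__(self, variables, table):
        self.vars = tuple(variables)
        self.table = dict(table)  # keys: tuples of 0/1 aligned with self.vars

    def value(self, assignment):
        return self.table[tuple(assignment[v] for v in self.vars)]


def combine(f, g, op):
    """Pointwise combination (product or quotient) of two factors."""
    variables = f.vars + tuple(v for v in g.vars if v not in f.vars)
    table = {}
    for values in product((0, 1), repeat=len(variables)):
        a = dict(zip(variables, values))
        table[values] = op(f.value(a), g.value(a))
    return Factor(variables, table)


def eliminate(f, var, op):
    """Remove var from f by summing (op=sum) or maximizing (op=max)."""
    i = f.vars.index(var)
    groups = {}
    for values, p in f.table.items():
        groups.setdefault(values[:i] + values[i + 1:], []).append(p)
    return Factor(f.vars[:i] + f.vars[i + 1:],
                  {k: op(v) for k, v in groups.items()})


def restrict(f, evidence):
    """Zero out rows that contradict the evidence."""
    return Factor(f.vars, {
        vals: p if all(evidence.get(v, x) == x
                       for v, x in zip(f.vars, vals)) else 0.0
        for vals, p in f.table.items()})


def sum_out_pass(factors, order, evidence):
    """One elimination pass: sum out `order` under `evidence`."""
    factors = [restrict(f, evidence) for f in factors]
    for var in order:
        touching = [f for f in factors if var in f.vars]
        if not touching:
            continue
        factors = [f for f in factors if var not in f.vars]
        prod = touching[0]
        for f in touching[1:]:
            prod = combine(prod, f, lambda a, b: a * b)
        factors.append(eliminate(prod, var, sum))
    result = factors[0]
    for f in factors[1:]:
        result = combine(result, f, lambda a, b: a * b)
    return result


# Hypothetical network: U1 -> H, H -> E2, (H, U2) -> E1.
cpts = [
    Factor(("U1",), {(0,): 0.9, (1,): 0.1}),
    Factor(("U2",), {(0,): 0.5, (1,): 0.5}),
    Factor(("U1", "H"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.7}),
    Factor(("H", "E2"), {(0, 0): 0.6, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.5}),
    Factor(("H", "U2", "E1"), {(0, 0, 0): 0.9, (0, 0, 1): 0.1,
                               (0, 1, 0): 0.6, (0, 1, 1): 0.4,
                               (1, 0, 0): 0.4, (1, 0, 1): 0.6,
                               (1, 1, 0): 0.2, (1, 1, 1): 0.8}),
]
order = ["H", "E1", "E2"]  # non-target variables, summed out first

f1 = sum_out_pass(cpts, order, {"E1": 1, "E2": 1})  # Pr(U1, U2, e1, e2)
f2 = sum_out_pass(cpts, order, {"E2": 1})           # Pr(U1, U2, e2)
ratio = combine(f1, f2, lambda a, b: a / b if b > 0 else 0.0)

# Maximize out the target variables to get the Reverse-MAP probability.
maxed = ratio
for u in ("U1", "U2"):
    maxed = eliminate(maxed, u, max)
best_prob = maxed.table[()]
best_units = dict(zip(ratio.vars, max(ratio.table, key=ratio.table.get)))
print(best_units, round(best_prob, 4))
```

The sketch does not track the width of the elimination order; in a real implementation the size of the largest intermediate factor is what the O(n exp(w)) bound refers to.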

RMAP VE is expected to be more expensive on an objective model than an SCM
Treewidth is used to analyze elimination algorithms
An elimination order for an objective model can be constructed from an SCM
Theorem 14 provides a bound on the width of an elimination order for an objective model
Corollary 15 states that the treewidth of an objective model is less than or equal to 3 times the treewidth of an SCM
U-constrained elimination orders must place the mixture variable before unit variables
Theorem 17 provides a bound on the U-constrained treewidth of an objective model
Corollary 18 states that the U-constrained treewidth of an objective model is less than or equal to the maximum of 3 times the treewidth of an SCM and the number of unit variables
The bound on U-constrained treewidth can be tighter depending on the objective function properties
An experiment was conducted to compare the complexities of MAP VE, RMAP VE and a brute-force method
The gap between the complexities of MAP VE and RMAP VE narrows as the size of the problem grows
The gap between the complexities of RMAP VE and the brute-force method widens as the size of the problem grows
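
The role of constrained elimination orders can be illustrated by computing the width of an order on a small undirected graph. The sketch below uses the standard definition (the largest number of neighbors a variable has at the moment it is eliminated) and a hypothetical star-shaped graph; it assumes, as in constrained orders for MAP, that the unit variables are eliminated last. The function name `order_width` and the graph are inventions for this example.

```python
def order_width(edges, order):
    """Width of an elimination order on an undirected graph.

    Eliminating a variable connects all of its current neighbors to each
    other; the width is the largest neighbor count seen along the way.
    """
    adj = {v: set() for v in order}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    width = 0
    for v in order:
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        for a in nbrs:
            adj[a].discard(v)
            adj[a] |= nbrs - {a}  # fill-in edges among the neighbors
    return width


# Hypothetical star graph: one hidden variable H linked to three unit variables.
edges = [("H", "U1"), ("H", "U2"), ("H", "U3")]

unconstrained = order_width(edges, ["U1", "U2", "U3", "H"])
u_constrained = order_width(edges, ["H", "U1", "U2", "U3"])  # units last
print(unconstrained, u_constrained)
```

On this graph the unconstrained order has width 1, while forcing the unit variables to the end raises the width to 3, the number of unit variables. This is the kind of situation in which a constrained width can be much larger than the plain treewidth.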

Studied unit selection problem in a computational setting
Assumed a fully specified structural causal model
Computed point values of causal objective functions
Unit selection problem with this class of objective functions is $\text{NP}^\text{PP}$-complete
Identified an intuitive condition under which it is NP-complete
Provided an exact algorithm for the unit selection problem
Characterized complexity in terms of treewidth
Defined a new inference problem, Reverse-MAP, which is also $\text{NP}^\text{PP}$-complete
Lemma 20 complements Theorem 13
Lemma 22 concerns the augmentation of an SCM
Theorem 17 holds for augmented objective model
Lemmas 20 and 22 used in proof
Time complexity of RMAP VE is $O(n_1 \cdot \exp(w_1))$
Brute-force method time complexity is $O(n_2 \cdot \exp(w_2))$
Class of problems for which U-constrained treewidth is no smaller than number of unit variables
MAP VE and RMAP VE must be exponential in number of unit variables
Baseline method can be significantly worse