Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • AI models lack understanding of cause-effect relationships in the real world.
  • AI models do not generalize to unseen data, produce unfair results, and are difficult to interpret.
  • Causal modeling and inference methods have been developed to improve trustworthiness of AI models.

Paper Content

Causality and robustness

  • Pre-processing methods use causality to create data augmentations
  • Adversarial examples are artificially perturbed input values that can fool machine learning models
  • Data augmentation methods use causal graphs to motivate data augmentation
  • Problem abstraction methods simplify the problem for machine learning agents
  • In-processing methods use causality-aware optimization objectives or architecture design choices
  • Post-processing methods alter predictions or enable causality-informed model selection
  • Causal models can help prevent attacks that expose users’ private information

Causality and privacy

  • Pre-processing data augmentation to reduce heterogeneity across user data distribution
  • In-processing using invariant risk minimization to defend against membership inference and property inference attacks
  • Post-processing using test data specific normalization to improve generalization and provide better privacy guarantees
  • Evaluation of success of membership inference attack measured in terms of accuracy and advantage
  • Causal models used to improve privacy by reducing overfitting and providing better differential privacy guarantees
  • Ex-ante impact assessment to identify potential negative effects before deployment
  • Ex-post impact assessment to identify potential negative effects after deployment

Ex-ante impact assessment

  • Ex-ante impact assessment predicts risks and impacts of proposed systems
  • Used to assess environmental, financial, social, and human rights ramifications
  • Environmental impact assessment uses deductive and inductive causal inference
  • Social and fiscal impact assessment uses deductive and inductive methods to discover causal relationships
  • Economic impact assessment looks at effects of introducing new economic policies or changing existing ones

Ex-post impact assessment

  • Ex-ante impact assessments are limited and often cannot identify all risks and impacts
  • Ex-post impact assessment is used to detect risks and impacts on the go
  • Ex-ante assessments have clear guidelines and metrics for specific types of impacts
  • Ex-post assessments are broader and need to define what constitutes an impact in real-time
  • Causal inference is used to tackle various categories of risks and impacts
  • Temporal and long-term effects can be seen in real-world systems
  • Causality can help assess systems in real-time and find out elements responsible
  • Causality can help identify root cause of system failure and system misuse
  • Strategic risks and effects can be analyzed with causality
  • Causal methods have been demonstrated to be advantageous in healthcare

Causality in healthcare through scm framework

  • SCMs are used in healthcare and personalized medicine
  • Causality is used to explain outcomes of medical models
  • Causality is used to discover causal relationships in medical imaging
  • Causality is used to repurpose drugs for new diseases
  • Causality is used to identify causal factors of clinical conditions
  • Causality is used to make AI algorithms fair and robust

Causality in healthcare through the po framework

  • PO framework is commonly used in medical field
  • Provides methods to conduct causal analysis from a statistical perspective
  • Removes selection bias in historical data
  • Shi and Norgeot review different research works to estimate treatment effects
  • Used to test if a drug is beneficial or harmful
  • Graham et al. used propensity score matching to examine risk of death in elderly patients
  • Ozer et al. used propensity score matching to investigate benefits of chemotherapy
  • Friedrich and Friede used several propensity score-based methods and other approaches
  • Ziff et al. used PO framework-based analysis to evaluate safety and efficacy of drug
  • Privacy is a main foundation for a trustworthy AI system
  • Generated a new large synthetic dataset to imitate real-world data distributions and preserve individual patients’ privacy
  • Identifiability metric estimates probability that an individual is identifiable
  • Surveyed causal modeling and reasoning tools for enhancing trustworthy aspects of AI models
  • Curated list of datasets used for recent Causal ML publications
  • Overview of useful causal and non-causal tools and packages
  • Overview of publicly available real-world datasets
  • Benchmarks and packages for Causal Machine Learning
  • Well-established tools to compare to non-causal machine learning
  • CANDLE, MIND, MovieLens, Netflix Prize, WMT 14, OpenSubtitles, LAMA, ImageNet, Adult (Census Income), Human Activity Recognition, Yelp, Amazon (Product) Data, Sangiovese Grapes, WikiText-2, Jigsaw Toxicity Detection, RTGender, CrowS-Pairs, Professions, WinoBias, Winogender Schemas, English UD Treebank, Gender-Neutral GloVe Word Embeddings, Biographies