Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- AI tools for healthcare have sparked debate around adoption of the technology.
- Explainable AI (XAI) is seen as a way to make AI devices more transparent and trustworthy.
- Some have expressed concerns about the reliability of XAI techniques, particularly feature attribution methods.
- Feature importance can be used reliably when low-level features come with clear semantics, as in tabular data such as Electronic Health Records (EHRs).
Paper Content
Introduction
- As Artificial Intelligence (AI) models have grown in use and complexity, interest in explainable AI (XAI) has surged.
- XAI is particularly important in safety-critical domains such as healthcare.
- XAI has already been used to improve diagnosis and prognosis of diseases.
- XAI techniques can be grouped along two axes: local vs. global, and model-specific vs. model-agnostic.
- Feature attribution methods are popular XAI techniques, which assign a measure of how much each feature contributes to the model output.
- Despite enthusiasm for XAI, there is no consensus on its reliability.
- Feature attribution methods can be unreliable due to a lack of semantic match between explanations and human understanding.
- Semantic match can be obtained reliably for data types with clear semantics, such as tabular data.
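The idea of assigning each feature a measure of its contribution can be sketched on tabular data. The example below is a minimal illustration, not the paper's method: it assumes a hypothetical linear risk score, and the feature names, weights, and baseline values are all invented for illustration. For a linear model, the contribution of a feature relative to a baseline patient is simply its weight times the difference from the baseline.

```python
# Hypothetical features, weights, and bias; none of these values come from the paper.
FEATURES = ["age_years", "systolic_bp_mmhg", "heart_rate_bpm"]
WEIGHTS = {"age_years": 0.03, "systolic_bp_mmhg": 0.02, "heart_rate_bpm": 0.01}
BIAS = -5.0


def risk_score(patient):
    """Toy linear model: bias plus a weighted sum of the tabular features."""
    return BIAS + sum(WEIGHTS[name] * patient[name] for name in FEATURES)


def attribute(patient, baseline):
    """Per-feature contribution relative to a baseline: weight * (value - baseline value)."""
    return {
        name: WEIGHTS[name] * (patient[name] - baseline[name])
        for name in FEATURES
    }


# Assumed reference patient and patient of interest (illustrative values only).
baseline = {"age_years": 50, "systolic_bp_mmhg": 120, "heart_rate_bpm": 70}
patient = {"age_years": 70, "systolic_bp_mmhg": 160, "heart_rate_bpm": 90}

attributions = attribute(patient, baseline)

# For a linear model the attributions add up to the change in the output.
total_change = risk_score(patient) - risk_score(baseline)

for name in FEATURES:
    print(f"{name}: {attributions[name]:+.2f}")
print(f"sum of attributions: {sum(attributions.values()):.2f}")
print(f"change in score:     {total_change:.2f}")
```

Because each feature here has a name and a unit, a reader can check each attribution against their own understanding, which is the situation the summary describes for tabular data.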
The criticism of local feature attribution methods
- For image data, feature attribution methods typically present their output as heatmaps or colored overlays
- Intuitively, highlighted regions comprise pixels which were considered ‘important’ by the model
- What look like plausible explanations at first may turn out to be ungrounded or spurious
- Humans are unable to attribute meaning to a sub-symbolic encoding of information
- Need a systematic way to translate sub-symbolic representations to human-understandable ones
- Overlaying the heatmap on an image encourages us to use our visual intuition as the translation, but this is ill-advised
- Feature attribution methods may be misleading and bring no clear added value
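To make the points above concrete, here is a minimal sketch of how a pixel-level heatmap can be produced. It uses occlusion (blanking one pixel at a time and measuring the change in output), which is only one of many attribution techniques and is chosen here for simplicity, not because the paper singles it out. The `toy_model` function and the 4x4 image are hypothetical.

```python
def toy_model(image):
    """Hypothetical 'model': average brightness of the top-left 2x2 patch."""
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0


def occlusion_heatmap(model, image, fill=0.0):
    """Score each pixel by how much the output drops when that pixel is blanked."""
    base = model(image)
    heatmap = []
    for r, row in enumerate(image):
        heat_row = []
        for c, _ in enumerate(row):
            occluded = [list(rw) for rw in image]  # copy so the input is untouched
            occluded[r][c] = fill
            heat_row.append(base - model(occluded))
        heatmap.append(heat_row)
    return heatmap


# Illustrative 4x4 grayscale image with values in [0, 1].
image = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

heatmap = occlusion_heatmap(toy_model, image)
for row in heatmap:
    print(" ".join(f"{value:.3f}" for value in row))
```

The result is a grid of numbers indexed only by pixel position. Nothing in it says what the highlighted pixels depict, so any meaning has to be supplied by the viewer, which is the gap the criticism targets.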
Distinguishing low- and high-level features
- Images are unstructured data
- High-level features in images can be attributed meaning
- Structured data has clear meaning for low-level features
- High-level features in structured data behave similarly to high-level features in images
- Extracting information from a heatmap requires checking it against a semantic match diagram
Saving feature importance for low-level features
- Low-level and high-level features can be distinguished to understand when semantic match works and when it fails.
- Post-hoc local feature attribution can be used on low-level features when they have a predefined translation.
- Semantic match allows users to engage with explanations and decide if they are agreeable.
- High-level features can be highlighted in image data, but without semantic match, users cannot trust the machine’s internal representation.
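A predefined translation for low-level features can be pictured as a lookup table from each feature to a human-readable label and unit. The sketch below is an illustration under stated assumptions: the `TRANSLATION` table, the patient values, and the attribution scores are all hypothetical, and the attributions are taken as given rather than computed by any particular method. Features without a translation are rejected rather than displayed.

```python
# Hypothetical translation table: feature name -> (human-readable label, unit).
TRANSLATION = {
    "systolic_bp_mmhg": ("systolic blood pressure", "mmHg"),
    "age_years": ("age", "years"),
    "heart_rate_bpm": ("heart rate", "bpm"),
}


def explain(patient, attributions, translation):
    """Render attributions as sentences, largest absolute contribution first."""
    lines = []
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    for name, score in ranked:
        if name not in translation:
            # No predefined meaning: refuse to present the feature as an explanation.
            raise KeyError(f"no translation defined for feature {name!r}")
        label, unit = translation[name]
        if score > 0:
            effect = f"raises the score by {score:.2f}"
        elif score < 0:
            effect = f"lowers the score by {abs(score):.2f}"
        else:
            effect = "does not change the score"
        lines.append(f"{label} = {patient[name]} {unit} {effect}")
    return lines


# Illustrative inputs; the attribution scores are assumed, not computed here.
patient = {"age_years": 70, "systolic_bp_mmhg": 160, "heart_rate_bpm": 60}
attributions = {"age_years": 0.6, "systolic_bp_mmhg": 0.8, "heart_rate_bpm": -0.1}

lines = explain(patient, attributions, TRANSLATION)
for line in lines:
    print(line)
```

Each output line is a statement a user can agree or disagree with, for example by judging whether a systolic blood pressure of 160 mmHg should raise the score.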
Discussion
- Reviewed the reliability problem of feature attribution methods
- Proposed diagnosing the issue with a semantic match diagram
- Without clear meaning and translation, semantic match cannot be obtained for high-level feature importance
- Current methods for feature attribution may not be appropriate for unstructured data
- Structured data may still benefit from feature attribution
- Humans need to exercise oversight to spot failure modes of ML applications
- Explanations can still fail to deliver on their promise
- Need to build explanations in the clinician’s language