Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Conformal prediction is a tool for uncertainty quantification
It produces valid prediction intervals with finite-sample guarantees
Self-supervised learning can be used to improve the quality of conformal regressors
Self-supervised pretext tasks can improve the adaptability of conformal intervals
We use self-supervised error as an additional feature to estimate nonconformity scores
We demonstrate the benefit of the additional information using synthetic and real data

Paper Content

Introduction

Machine Learning research is focused on minimizing a model’s predictive error on unseen data.
Conformal prediction provides finite-sample, frequentist guarantees on the marginal coverage of prediction intervals.
Locally adaptive conformal prediction aims to provide wider intervals for challenging samples and narrower intervals for easier samples.
Self-Supervised Conformal Prediction (SSCP) uses information from self-supervised tasks to improve prediction intervals.
SSCP does not impact the theoretical guarantees of conformal prediction.
SSCP is the first work to integrate and examine the benefit of self-supervised errors to improve conformal prediction.

Related work

Standard conformal prediction provides marginal coverage
Exact conditional coverage cannot be guaranteed in finite samples
Locally adaptive conformal prediction uses an independent model to induce adaptiveness
Self-supervised learning has been used to improve performance of downstream models

Background: conformal prediction

Inductive conformal prediction (ICP) and conformal residual fitting (CRF) are the two methods on which the proposed framework builds.
The supervised learning setting involves features X and labels Y.
The goal is to learn a prediction interval Ĉ such that the coverage requirement (Equation 1 in the paper) holds.
Labeled data D_labeled is used to train the model.
The label y_i is a scalar.
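
The ICP procedure referred to above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a miscoverage level alpha, uses the absolute residual as the nonconformity score, and replaces the trained predictive model with a fixed stand-in function `predict`. The toy data generator `make_data` and all other names are hypothetical.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the order statistic to use
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]


def make_data(n, rng):
    """Toy regression data: y = 2x + Gaussian noise (an assumption for this demo)."""
    xs = [rng.uniform(0, 1) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 1)) for x in xs]


def predict(x):
    """Stand-in for a predictive model that has already been trained."""
    return 2.0 * x


rng = random.Random(0)
calib, test = make_data(500, rng), make_data(2000, rng)

# Nonconformity score: absolute residual on the held-out calibration set.
q = conformal_quantile([abs(y - predict(x)) for x, y in calib], alpha=0.1)

# Every test point gets the same half-width q, so the intervals are not adaptive.
intervals = [(predict(x) - q, predict(x) + q) for x, _ in test]
coverage = sum(lo <= y <= hi for (lo, hi), (_, y) in zip(intervals, test)) / len(test)
widths = [hi - lo for lo, hi in intervals]
print(f"coverage={coverage:.3f}, width range={min(widths):.3f}..{max(widths):.3f}")
```

The constant width printed at the end is the limitation that the locally adaptive methods in the following sections address.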

Conformal residual fitting (CRF)

ICP prediction intervals have the same width for all instances
They therefore do not reflect the difficulty of individual samples
CRF proposes to produce locally adaptive intervals
A normalized nonconformity function is used to enable locally adaptive intervals
A disjoint dataset is used to train the conformal normalization model
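
A normalized nonconformity score of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the normalization model is a one-dimensional least-squares line fitted to absolute residuals, the predictive model is a fixed stand-in, and the heteroscedastic toy data, the positivity floor of 1e-3, and all names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """1-D ordinary least squares; returns the fitted function x -> a + b*x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: my + b * (x - mx)


def make_data(n, rng):
    """Heteroscedastic toy data: noise grows with x (an assumption for this demo)."""
    xs = [rng.uniform(0, 1) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 0.2 + x)) for x in xs]


def predict(x):
    """Stand-in for the trained predictive model."""
    return 2.0 * x


rng = random.Random(0)
resid_split, calib, test = (make_data(n, rng) for n in (500, 500, 2000))

# Normalization model: regress |residual| on x, using data disjoint from calibration.
sigma_fit = fit_line([x for x, _ in resid_split],
                     [abs(y - predict(x)) for x, y in resid_split])
sigma = lambda x: max(sigma_fit(x), 1e-3)  # floor keeps the scale strictly positive

# Normalized nonconformity scores and their finite-sample-corrected quantile.
alpha = 0.1
scores = sorted(abs(y - predict(x)) / sigma(x) for x, y in calib)
q = scores[math.ceil((len(scores) + 1) * (1 - alpha)) - 1]


def interval(x):
    """Interval whose half-width scales with the estimated difficulty sigma(x)."""
    half = q * sigma(x)
    return predict(x) - half, predict(x) + half


coverage = sum(interval(x)[0] <= y <= interval(x)[1] for x, y in test) / len(test)
width = lambda x: interval(x)[1] - interval(x)[0]
print(f"coverage={coverage:.3f}, width(0.1)={width(0.1):.2f}, width(0.9)={width(0.9):.2f}")
```

Because the noise in this toy example grows with x, the interval at x = 0.9 should come out wider than the one at x = 0.1, while coverage stays near the target level.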

Method: self-supervised conformal prediction

Train predictive model to solve regression task
Train self-supervised model on top of predictive model
Train conformal normalizer to learn residuals of predictive model
Apply CRF using predictive, self-supervised and residual models

Data splits and assumptions

Predictive and self-supervised models are trained on a mixture of labeled and unlabeled data
Residual model should be trained on disjoint data from the predictive model
Calibration and testing data are exchangeable
Model performance is affected by divergence between data distributions
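
One way to obtain disjoint splits of the kind listed above is a single random shuffle followed by cuts. The sketch below is illustrative only: the split names and proportions are hypothetical, since this summary does not state the ones used in the paper. For i.i.d. data, a random shuffle also makes the calibration and test portions exchangeable.

```python
import random


def split_indices(n, fractions, seed=0):
    """Shuffle range(n) and cut it into disjoint index lists, one per named split."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    names = list(fractions)
    splits, start = {}, 0
    for i, name in enumerate(names):
        # The last split takes whatever remains so that no index is dropped.
        stop = n if i == len(names) - 1 else start + int(fractions[name] * n)
        splits[name] = idx[start:stop]
        start = stop
    return splits


# Hypothetical proportions chosen for illustration.
splits = split_indices(
    1000, {"train": 0.5, "residual": 0.2, "calibration": 0.2, "test": 0.1}
)
print({name: len(ix) for name, ix in splits.items()})
```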

Conformal normalizer phase (s2)

The conformal normalization model is responsible for the adaptiveness of the prediction intervals.
Self-supervised error is used as an additional feature to the model.
Self-supervised error can be computed at both training and test time.
Random forest and neural network are used in the experiments.
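
The idea of feeding a self-supervised error to the normalization model can be sketched on toy data. Everything here is an assumption made for illustration: the pretext task reconstructs each extra feature from the first one with a linear fit (standing in for an autoencoder or similar), the normalizer is a linear least-squares model rather than a random forest or neural network, the predictive model is a fixed stand-in, and the latent difficulty `d` exists only in the synthetic generator.

```python
import math
import random

K = 4  # number of extra features that the pretext task reconstructs


def fit_linear(rows, targets):
    """Least squares with intercept via the normal equations (fine for a few features)."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * t for x, t in zip(X, targets))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(p):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    w = [A[i][p] for i in range(p)]
    return lambda r: w[0] + sum(wi * ri for wi, ri in zip(w[1:], r))


def make_data(n, rng):
    """Toy data: a hidden difficulty d scales both feature noise and label noise."""
    data = []
    for _ in range(n):
        d, x0 = rng.uniform(0, 1), rng.uniform(0, 1)
        extra = [x0 + 0.5 * d * rng.gauss(0, 1) for _ in range(K)]
        data.append(([x0] + extra, 2.0 * x0 + d * rng.gauss(0, 1), d))
    return data


def predict(x):
    """Stand-in for the trained predictive model."""
    return 2.0 * x[0]


rng = random.Random(0)
pretext, resid_split, calib, test = (make_data(n, rng) for n in (500, 500, 500, 2000))

# Pretext task (labels ignored): reconstruct each extra feature from x[0].
recon = [fit_linear([[x[0]] for x, _, _ in pretext], [x[j] for x, _, _ in pretext])
         for j in range(1, K + 1)]


def ss_error(x):
    """Mean absolute reconstruction error; computable without the label y."""
    return sum(abs(x[j] - g([x[0]])) for j, g in enumerate(recon, start=1)) / K


def conformalize(features, alpha=0.1, floor=0.05):
    """Fit a normalizer on features(x), calibrate it, and return x -> half-width."""
    fit = fit_linear([features(x) for x, _, _ in resid_split],
                     [abs(y - predict(x)) for x, y, _ in resid_split])
    sigma = lambda x: max(fit(features(x)), floor)
    scores = sorted(abs(y - predict(x)) / sigma(x) for x, y, _ in calib)
    q = scores[math.ceil((len(scores) + 1) * (1 - alpha)) - 1]
    return lambda x: q * sigma(x)


def evaluate(half_width):
    """Empirical coverage and mean interval width on the test split."""
    cov = sum(abs(y - predict(x)) <= half_width(x) for x, y, _ in test) / len(test)
    return cov, sum(2 * half_width(x) for x, _, _ in test) / len(test)


base = conformalize(lambda x: x)                  # normalizer sees x only
sscp = conformalize(lambda x: x + [ss_error(x)])  # x plus the self-supervised error
cov_base, width_base = evaluate(base)
cov_sscp, width_sscp = evaluate(sscp)
hard = [2 * sscp(x) for x, _, d in test if d >= 0.5]
easy = [2 * sscp(x) for x, _, d in test if d < 0.5]
print(f"x only:       coverage={cov_base:.3f}, mean width={width_base:.2f}")
print(f"x + ss error: coverage={cov_sscp:.3f}, mean width={width_sscp:.2f}")
print(f"with ss error: mean width hard={sum(hard) / len(hard):.2f}, "
      f"easy={sum(easy) / len(easy):.2f}")
```

In this construction the reconstruction error tracks the hidden difficulty, so the second normalizer should give wider intervals to hard samples and narrower ones to easy samples. Whether the average width also shrinks depends on how informative the pretext error is.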

Conformalization phase (s3)

Train predictive model f
Train self-supervised model f_ss
Train residual model σ
Apply calibration procedure to obtain the (1 − α)-quantile of the calibration nonconformity scores
For new test example x, obtain adaptive and valid intervals
The complete SSCP framework is summarized in Algorithm 1 of the paper
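
The calibration and interval-construction steps above can be written out as follows. The notation is an assumption, since this summary does not reproduce the paper's equations: α is the miscoverage level, n is the calibration set size, e^ss(x) is the self-supervised error of x, and σ takes both x and e^ss(x) as input. Any smoothing constant that the paper may add to σ is omitted.

```latex
% Normalized nonconformity score on each calibration point
s_i = \frac{\lvert y_i - f(x_i) \rvert}{\sigma\big(x_i,\, e^{ss}(x_i)\big)},
\qquad i = 1, \dots, n

% Calibrated threshold: finite-sample-corrected (1 - \alpha) quantile
\hat{q} = \text{the } \lceil (n+1)(1-\alpha) \rceil \text{-th smallest value among } s_1, \dots, s_n

% Prediction interval for a new test point x
\hat{C}(x) = \Big[\, f(x) - \hat{q}\,\sigma\big(x, e^{ss}(x)\big),\;
                     f(x) + \hat{q}\,\sigma\big(x, e^{ss}(x)\big) \Big]
```

The interval width is 2 q̂ σ(x, e^ss(x)), so all of the adaptivity comes from σ, while q̂ is a single constant shared by every test point.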

Other locally adaptive methods

The SSCP framework focuses on CRF for locally adaptive intervals
Conformalized quantile regression (CQR) is a powerful alternative to CRF
Appendix C.3 describes the adaptation needed to integrate CQR into the SSCP framework
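
For reference, the standard CQR construction is given below. This is the usual form of the method, not the SSCP-specific adaptation of Appendix C.3, which this summary does not reproduce. The symbols are assumptions: q̂_lo and q̂_hi are fitted lower and upper quantile regressors, and Q is the ⌈(n+1)(1−α)⌉-th smallest calibration score.

```latex
% CQR nonconformity score: signed distance of y_i outside the quantile band
s_i = \max\big\{\, \hat{q}_{lo}(x_i) - y_i,\;\; y_i - \hat{q}_{hi}(x_i) \,\big\},
\qquad i = 1, \dots, n

% Calibrated interval: widen (or shrink, if Q < 0) the band by Q on both sides
\hat{C}(x) = \big[\, \hat{q}_{lo}(x) - Q,\;\; \hat{q}_{hi}(x) + Q \,\big]
```

Unlike CRF, the adaptivity here comes from the quantile regressors themselves rather than from a separate residual model.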

Remark on SSCP and coverage guarantees

Conformal prediction provides theoretical guarantees on the validity of coverage of the prediction intervals.
SSCP inherits these same guarantees.
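
The guarantee in question is the standard split-conformal bound shown below, written in assumed notation (α is the miscoverage level, n is the calibration set size, and (X_{n+1}, Y_{n+1}) is a new test point). The lower bound requires only that calibration and test points are exchangeable. The upper bound additionally requires the nonconformity scores to be almost surely distinct.

```latex
1 - \alpha \;\le\; \mathbb{P}\big( Y_{n+1} \in \hat{C}(X_{n+1}) \big) \;\le\; 1 - \alpha + \frac{1}{n+1}
```

A plausible reading of why SSCP inherits this bound: the self-supervised error is computed from the features alone, and the predictive, self-supervised, and residual models are all fitted on data disjoint from the calibration set. The nonconformity score is therefore a fixed function applied to exchangeable points, which is all the argument needs.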

Experiments

Synthetic example to show why adding self-supervised loss improves prediction intervals
Performance of SSCP on real-world datasets
Quality of prediction intervals assessed using commonly used metrics
Locally adaptive conformal prediction used as baseline
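
Two metrics that are commonly used to assess prediction intervals are empirical coverage and average width. The sketch below shows how they can be computed. It is illustrative only: this summary does not list the exact metrics reported in the paper, and the function names and example values are hypothetical.

```python
def coverage(intervals, ys):
    """Fraction of targets that fall inside their interval (bounds inclusive)."""
    return sum(lo <= y <= hi for (lo, hi), y in zip(intervals, ys)) / len(ys)


def mean_width(intervals):
    """Average interval width; lower is better at equal coverage."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


intervals = [(0.0, 2.0), (1.0, 2.0), (-1.0, 1.0), (3.0, 6.0)]
ys = [1.5, 2.5, 0.0, 3.0]
print(coverage(intervals, ys), mean_width(intervals))  # prints: 0.75 2.0
```

Width is only meaningful alongside coverage: a method can always produce narrower intervals by covering fewer targets.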

Synthetic demonstration

Synthetic dataset used to illustrate method
Self-supervision used to improve predictive model and conformal normalization model
Synthetic residuals generated from hypothetical predictive model
Inputs and residuals generated as a function of single, uniformly distributed latent dimension
Autoencoder used as self-supervised task
Comparing conformal prediction intervals with and without self-supervised feature
Self-supervised feature results in more adaptive intervals
Self-supervised loss better models heteroscedasticity in predictive model’s residuals

Real data

Self-supervision can be used to improve the quality of conformal intervals.
Self-supervision can be used with only labeled data, by ignoring the label.
Self-supervision with labeled and unlabeled data improves the width of prediction intervals.
The benefit of unlabeled data decreases as the proportion of labeled data increases.
Different pretext tasks may provide superior signal to the normalization model.

Insights

SSCP improves prediction interval width compared to CRF
SSCP helps most on samples with largest interval width
SSCP helps in sparser regions of PC1 space
SSCP improves the conformal normalization model by roughly 14% and 5% in Fig. 7(a) and (b), respectively

Discussion

Improving the quality of prediction intervals is important for reliable uncertainty quantification
Self-supervised learning can be used to improve conformal prediction intervals
Self-supervision assists in more challenging and sparser regions
Future research should explore alternative self-supervised approaches, new conformal-aware self-supervised tasks, and the use of self-supervision with other data modalities

Appendix C: additional experiments

Goal of experiment: Understand benefit of unlabeled data to SSCP
Analysis: Unlabeled data reduces average interval width most when the labeled dataset is small
Goal: Assess value of alternative sources of signal
Analysis: Isolation forest and self-supervised signal provide best intervals
Goal: Examine VIME and AE as self-supervised tasks
Analysis: VIME provides greater performance improvements
Goal: Deep-dive to understand impact of self-supervised task
Analysis: Self-supervision helps to improve prediction intervals on most uncertain examples
Goal: Assess robustness of intervals
Analysis: SSCP has net gain over CRF across all datasets
