Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Uncertainty quantification is necessary for safe deployment of deep neural networks.
  • Conformal prediction is a framework for uncertainty quantification of deep models.
  • Neighborhood Conformal Prediction (NCP) is a novel algorithm to improve the efficiency of uncertainty quantification.
  • NCP uses the learned representation of the neural network to create adaptive prediction sets.
  • Experiments show that NCP leads to significant reduction in prediction set size.

Paper Content

Introduction

  • Recent advances in deep learning have allowed for models with high accuracy.
  • To safely deploy these models in high-stake applications, uncertainty quantification is needed.
  • Prediction sets allow doctors to rule out harmful diagnoses.
  • Conformal prediction provides formal guarantees for a user-specified coverage.
  • UQ from CP is adaptive and reflects the difficulty of testing inputs.
  • Two key steps in CP: prediction and calibration.
  • Trade-off between coverage and efficiency.
  • Neighborhood Conformal Prediction (NCP) proposed to improve efficiency.
  • NCP uses localizer and weighting function from deep classifiers.
  • Theoretical analysis of NCP to characterize necessary conditions for reduced prediction set size.
  • Experimental evaluation of NCP on classification benchmarks.

Background and problem setup

  • Problem of uncertainty quantification (UQ) of pre-trained deep models for classification tasks
  • Conformity scores based on ordered probabilities proposed
  • Conformity scoring function of regularized APS defined
  • High-level goal of paper is to study provable methods to improve standard CP framework to achieve high-efficiency
  • Neighborhood CP algorithm proposed and studied to assign higher importance to calibration examples in local neighborhood of given testing example

Neighborhood conformal prediction

  • NCP algorithm improves the efficiency of CP
  • NCP uses a non-conformity measure function to determine a quantile
  • NCP assigns importance weights to conformal scores on calibration examples
  • NCP uses a ball-based localizer function to define the neighborhood of a testing example
  • NCP uses a weighting function to assign importance weights to k nearest neighbors
  • NCP uses a pre-trained deep model to learn input representation
  • NCP ablation experiments compare NCP with a variant that includes all calibration examples in the neighborhood

Theoretical analysis

  • NCP and CP are compared in terms of efficiency
  • Efficiency is related to the size of the prediction set
  • Quantile value is used as a measure of efficiency
  • CP has a constant quantile for all data samples, while NCP has an adaptive quantile
  • Data samples are classified into C classes
  • Data samples are drawn from distribution D X
  • Neighborhood of data sample is defined
  • Quantiles are assumed to be σ-concentrated on R * B ∩ X c
  • Robust set is lower bounded for each class
  • Assumption is made that any quantile t satisfies σ-concentration on R * B ∩ X c
  • NCP gives smaller expected quantile than CP
  • NCP and CP are connected by class-wise NCP
  • Alignment between quantiles and distribution is key to improved efficiency

Experiments and results

  • NCP is a wrapper algorithm that can reduce prediction set size
  • We compare NCP to prior CP methods and other baselines
  • We use Adam optimizer and train for 500 epochs
  • We compare NCP to APS, RAPS, and a naive baseline

Results and discussion

  • NCP reduces prediction set size for classification and regression tasks
  • NCP is a wrapper algorithm that uses a better conformity scoring function
  • NCP tightens the gap between actual coverage and desired coverage
  • Naive baseline achieves desired coverage but with larger prediction set
  • NCP is complementary to APS and RAPS

Summary

  • NCP algorithm improves uncertainty quantification of pre-trained deep classifiers
  • NCP assigns importance weights proportional to distance of k nearest neighbors
  • NCP reduces prediction set size compared to standard CP framework
  • Experimental results demonstrate reduction in prediction set size
  • Artificial class-wise NCP algorithm solves problem
  • Class-wise NCP requires smaller quantile than CP to achieve same significance level
  • Class-wise NCP coverage probability lower bounded by 1-α
  • NCP requires smaller quantile than class-wise NCP
  • NCP more efficient than CP if inequality holds

Additional experiments

  • Ablation results show that NCP reduces prediction set size as clustering property of learned representation improves
  • Figure 1 (a) and (b) show that prediction set size from NCP reduces over CP as clustering property of learned representation improves
  • Figure 1 (c) and (d) show t-SNE visualization of raw image pixels and learned input representation from ResNet50 on CIFAR10 data, demonstrating that learned representation exhibits significantly better clustering property
  • Mathematical notations used in paper include D tr, D cal, F θ, UQ, C(x), Cov, Eff
  • Datasets used include BlogFeedback, Gas turbine, Facebook_i, MEPS_i, Concrete
  • Ablation results comparing NCP and NCP variant using all calibration examples as neighborhood (NCP-All) show that NCP reduces mean prediction set size
  • Results justify use of learned representation to define neighborhood and weighting function for NCP algorithm
  • Silhouette score used to measure clustering coefficient for ImageNet-val data
  • Runtime comparison shows two variants of NCP algorithm: NCP with standard KNN search algorithm and NCP using Locality Sensitive Hashing (LSH) based KNN search