Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Predictive Interval (PI) given by Conformal Prediction (CP) may not reflect the uncertainty of a given model.
  • We propose using a Quantile Regression Forest (QRF) to learn the distribution of nonconformity scores and utilizing the QRF’s weights to assign more importance to samples with residuals similar to the test point.
  • This approach results in PI lengths that are more aligned with the model’s uncertainty.
  • Our approach enjoys an assumption-free finite sample marginal and training-conditional coverage.
  • Experiments on simulated and real-world data demonstrate significant improvements compared to existing methods.

Paper Content

  • Use absolute residual as nonconformity score
  • Construct adaptive PIs by focusing on estimated residuals
  • LCP and SLCP use kernel-based weights to approximate conditional c.d.f
  • LCP and SLCP have limitations
  • Replace NW estimator with QRF algorithm
  • QRF algorithm has advantages over NW estimator
  • Calibration step to achieve training-conditional coverage
  • Asymptotic conditional coverage under suitable conditions
  • Outperforms competitors LCP and SLCP

Random forest localizer

  • Random Forest Localizer is used to construct adaptive PI that depends on the test point X n+1
  • RF algorithm partitions the input space by recursively splitting the data
  • Weights of each calibration sample for X n+1 are determined by the number of times it appears in the leaves of the trees where X n+1 falls
  • RF is grown as an ensemble of k trees based on random node and split point selection
  • Random Forests can be used to estimate more complex quantities
  • Quantile Regression Forests use the same weights as Random Forests to approximate the c.d.f F (y|x)
  • Localized Random Forest is used to approximate the estimated residuals V |X = x
  • PI is calibrated using the Localized Conformal Prediction (LCP) framework
  • LCP framework is used to select an appropriate level α to the quantile used in the PI to ensure marginal coverage at level 1 − α

Lcp-rf

  • LCP framework of (Guan, 2022) with Random Forest localizer is described
  • Calibration approach guarantees training-conditional coverage
  • Weights of RF are used to improve LCP calibration process
  • Proofs of theorems and lemmas are in the appendix
  • Lemma 4.1 shows how to achieve marginal coverage by selecting level α of the quantile of the localizer
  • Theorem 4.2 shows that the resulting PI has marginal coverage
  • Lemma 4.3 describes an algorithm to compute the largest accepted value v

Training-conditional coverage for lcp-rf

  • Consider training-conditional coverage for LCP-RF
  • Use two-step approach to ensure coverage
  • Split calibration set into two sets
  • Train Quantile Regression Forest on one set
  • Compute PI for observations in second set using LCP-RF
  • Add correction term to ensure PAC coverage

Clustering using the weights of lcp-rf

  • Random Forest Localizer offers faster computation of PIs and more adaptive PIs than traditional kernel-based localizers
  • Weights of Random Forest Localizer are sparse
  • We can group similar observations together before applying calibration steps
  • We can view weights of Random Forest as a transition matrix or weighted adjacency matrix
  • We can group observations that are connected to each other and separate observations that are not connected
  • We can apply calibration steps separately on each group
  • We can regroup calibration observations by (non-overlapping) communities using the weights
  • We can get marginal/PAC coverage by applying calibration step conditionally on the groups

Asymptotic conditional coverage

  • LCP-RF is a computer science paper that studies conditional coverage
  • Assumptions 5.1-5.3 are necessary to get uniform convergence of the RF estimator
  • Assumption 5.2 allows for control of the approximation error of the RF estimator
  • Assumption 5.3 means that the cells should contain a sufficiently large number of points
  • Theorem 5.4 states that the selected α(v) when V n+1 = v given by the LCP-RF converges to 1 − α

Experiments

  • Evaluated performance of 3 proposed methods against competitors
  • Used original implementations of SLCP and LCP
  • Tested on simulated data and 4 real-world datasets
  • Used mean and quantile scores to measure nonconformity
  • Used Random Forest as mean estimate
  • Compared PI of each method to oracle PI
  • Our methods outperformed competitors in terms of uncertainty fidelity and adaptiveness of lengths
  • SPLIT-G improved PI of split-CP

Conclusion

  • Reweighting strategy based on Random Forest can improve PI
  • PI more similar to traditional statistics
  • Lemma 4.1 shows how to achieve marginal coverage
  • Lemma A.1 shows how to remove dependence on unknown residuals
  • Theorem 4.2 shows how to correct LCP-RF to have training-conditional coverage
  • Theorem 5.4 shows asymptotic conditional coverage of LCP-RF
  • Honest Forest used as theoretical surrogate
  • Lemma D.2 allows for control of weights of Honest Forest
  • Theorem D.1 shows Random Forest Localizer has coverage rate of 1-α
  • Figure 1 shows PI of SLCP, LCP and LCP-RF
  • Figure 2 shows PI lengths and errors of different methods
  • Figure 3 compares SPLIT-G and other methods