Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Predictive Interval (PI) given by Conformal Prediction (CP) may not reflect the uncertainty of a given model.
- We propose using a Quantile Regression Forest (QRF) to learn the distribution of nonconformity scores and utilizing the QRF’s weights to assign more importance to samples with residuals similar to the test point.
- This approach results in PI lengths that are more aligned with the model’s uncertainty.
- Our approach enjoys an assumption-free finite sample marginal and training-conditional coverage.
- Experiments on simulated and real-world data demonstrate significant improvements compared to existing methods.
Paper Content
Related works and contributions
- Use absolute residual as nonconformity score
- Construct adaptive PIs by focusing on estimated residuals
- LCP and SLCP use kernel-based weights to approximate the conditional c.d.f.
- LCP and SLCP have limitations
- Replace the Nadaraya-Watson (NW) estimator with the QRF algorithm
- QRF algorithm has advantages over NW estimator
- Calibration step to achieve training-conditional coverage
- Asymptotic conditional coverage under suitable conditions
- Outperforms competitors LCP and SLCP
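As a point of reference for the methods listed above, here is a minimal sketch of plain split conformal prediction with the absolute residual as nonconformity score. It is not the paper's code: the function name `split_conformal_interval` and its arguments are hypothetical, and the predictions are assumed to come from any already-fitted mean estimator.

```python
import math


def split_conformal_interval(cal_y, cal_pred, test_pred, alpha=0.1):
    """Return (lower, upper) for one test prediction via split conformal prediction."""
    # Nonconformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - p) for y, p in zip(cal_y, cal_pred))
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this level: the interval is unbounded.
        return (-math.inf, math.inf)
    q = scores[rank - 1]
    # The same half-width q is used for every test point (no adaptivity).
    return (test_pred - q, test_pred + q)
```

Because the half-width is a single global quantile, every test point receives an interval of the same length, which is the lack of adaptivity that localized methods try to address.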
Random forest localizer
- Random Forest Localizer is used to construct an adaptive PI that depends on the test point X_{n+1}
- RF algorithm partitions the input space by recursively splitting the data
- The weight of each calibration sample for X_{n+1} is determined by the number of times it appears in the leaves of the trees where X_{n+1} falls
- RF is grown as an ensemble of k trees based on random node and split point selection
- Random Forests can be used to estimate more complex quantities
- Quantile Regression Forests use the same weights as Random Forests to approximate the c.d.f. F(y|x)
- Localized Random Forest is used to approximate the conditional distribution of the estimated residuals V | X = x
- PI is calibrated using the Localized Conformal Prediction (LCP) framework
- LCP framework is used to select an adjusted level for the quantile used in the PI, so that marginal coverage at level 1 − α is ensured
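The weighting idea above can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: leaf memberships are passed in as plain lists (`cal_leaves[i][t]` is the leaf of calibration sample `i` in tree `t`), the per-leaf normalisation follows the usual Quantile Regression Forest convention, and the function names are hypothetical. The sketch also omits the LCP level adjustment, so on its own it carries no coverage guarantee.

```python
def forest_weights(cal_leaves, test_leaves):
    """Weight of each calibration sample for one test point."""
    n, k = len(cal_leaves), len(test_leaves)
    weights = [0.0] * n
    for t in range(k):
        # Calibration samples sharing the test point's leaf in tree t.
        same_leaf = [i for i in range(n) if cal_leaves[i][t] == test_leaves[t]]
        for i in same_leaf:
            # Each tree spreads a mass of 1/k evenly over its matching samples.
            weights[i] += 1.0 / (k * len(same_leaf))
    return weights


def weighted_quantile(values, weights, level):
    """Smallest value whose cumulative normalised weight reaches `level`."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("all weights are zero")
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        # Small tolerance guards against floating-point round-off.
        if cumulative >= level * total - 1e-12:
            return value
    return max(values)
```

A localized interval for the test point would then be its prediction plus or minus `weighted_quantile(residuals, weights, level)`, with `level` chosen by the calibration step described in the next section.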
LCP-RF
- LCP framework of (Guan, 2022) with Random Forest localizer is described
- Calibration approach guarantees training-conditional coverage
- Weights of RF are used to improve LCP calibration process
- Proofs of theorems and lemmas are in the appendix
- Lemma 4.1 shows how to achieve marginal coverage by selecting level α of the quantile of the localizer
- Theorem 4.2 shows that the resulting PI has marginal coverage
- Lemma 4.3 describes an algorithm to compute the largest accepted value v
Training-conditional coverage for LCP-RF
- Consider training-conditional coverage for LCP-RF
- Use two-step approach to ensure coverage
- Split calibration set into two sets
- Train Quantile Regression Forest on one set
- Compute PI for observations in second set using LCP-RF
- Add correction term to ensure PAC coverage
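The two-step procedure above can be outlined roughly as below. The summary does not state the paper's correction term, so the Hoeffding-type term used here (the standard one for training-conditional coverage of split conformal prediction) is an assumption, as are the helper names `split_calibration` and `pac_adjusted_level` and the confidence parameter `delta`.

```python
import math
import random


def split_calibration(indices, frac=0.5, seed=0):
    """Randomly split calibration indices into two disjoint sets."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * frac)
    # First part: fit the Quantile Regression Forest on residuals.
    # Second part: compute and calibrate the intervals.
    return shuffled[:cut], shuffled[cut:]


def pac_adjusted_level(alpha, delta, n_second):
    """Miscoverage level tightened by a Hoeffding-type term (assumed form)."""
    correction = math.sqrt(math.log(1.0 / delta) / (2.0 * n_second))
    # A level of 0 means the second set is too small for a useful interval.
    return max(alpha - correction, 0.0)
```

Running the calibration at the tightened level is meant to give coverage of at least 1 − α with probability at least 1 − δ over the draw of the calibration data, at the price of slightly wider intervals.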
Clustering using the weights of LCP-RF
- Random Forest Localizer offers faster computation of PIs and more adaptive PIs than traditional kernel-based localizers
- Weights of Random Forest Localizer are sparse
- We can group similar observations together before applying calibration steps
- We can view weights of Random Forest as a transition matrix or weighted adjacency matrix
- We can group observations that are connected to each other and separate observations that are not connected
- We can apply calibration steps separately on each group
- We can regroup calibration observations by (non-overlapping) communities using the weights
- We can get marginal/PAC coverage by applying calibration step conditionally on the groups
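One simple way to form such groups is to take the connected components of the graph whose adjacency matrix is the weight matrix. The sketch below does this with a union-find structure; it is a simplification chosen for illustration (the paper speaks of communities, which may involve a more refined detection method), and the function name and `threshold` parameter are hypothetical.

```python
def weight_communities(weight_matrix, threshold=0.0):
    """Group observations linked by a weight above `threshold`."""
    n = len(weight_matrix)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(n):
            if i != j and weight_matrix[i][j] > threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Because the forest weights are sparse, many entries are exactly zero, so the groups are typically much smaller than the full calibration set and each can be calibrated separately.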
Asymptotic conditional coverage
- This section studies the asymptotic conditional coverage of LCP-RF
- Assumptions 5.1-5.3 are necessary to get uniform convergence of the RF estimator
- Assumption 5.2 allows for control of the approximation error of the RF estimator
- Assumption 5.3 means that the cells should contain a sufficiently large number of points
- Theorem 5.4 states that the level α(v) selected by LCP-RF when V_{n+1} = v converges to 1 − α
Experiments
- Evaluated performance of 3 proposed methods against competitors
- Used original implementations of SLCP and LCP
- Tested on simulated data and 4 real-world datasets
- Used mean and quantile scores to measure nonconformity
- Used Random Forest as mean estimate
- Compared PI of each method to oracle PI
- Our methods outperformed competitors in terms of uncertainty fidelity and adaptiveness of lengths
- SPLIT-G improved PI of split-CP
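Comparisons of this kind usually start from two basic quantities, the empirical coverage and the mean interval length on a test set. The helper below is a generic sketch of those two metrics, not the paper's evaluation code; its name and arguments are hypothetical.

```python
def coverage_and_mean_length(y_true, lowers, uppers):
    """Empirical coverage and mean length of a set of prediction intervals."""
    n = len(y_true)
    # Count test responses that fall inside their interval.
    covered = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lowers, uppers))
    mean_length = sum(hi - lo for lo, hi in zip(lowers, uppers)) / n
    return covered / n, mean_length
```

For example, `coverage_and_mean_length([1, 2, 3, 4], [0, 0, 4, 3], [2, 1, 5, 5])` counts two covered points out of four and averages interval lengths 2, 1, 1 and 2.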
Conclusion
- Reweighting strategy based on Random Forest can improve PI
- The resulting PIs behave more like those of traditional statistics, with lengths that reflect the model's uncertainty
- Lemma 4.1 shows how to achieve marginal coverage
- Lemma A.1 shows how to remove dependence on unknown residuals
- Theorem 4.2 shows how to correct LCP-RF to have training-conditional coverage
- Theorem 5.4 shows asymptotic conditional coverage of LCP-RF
- Honest Forest used as theoretical surrogate
- Lemma D.2 allows for control of weights of Honest Forest
- Theorem D.1 shows Random Forest Localizer has coverage rate of 1-α
- Figure 1 shows PI of SLCP, LCP and LCP-RF
- Figure 2 shows PI lengths and errors of different methods
- Figure 3 compares SPLIT-G and other methods