Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Testing conditions can affect the reliability of black-box learned components in robot autonomy.
Coping with out-of-distribution (OOD) data is an important challenge for trustworthy learning-enabled open-world autonomy.
This paper aims to demystify OOD data and its associated challenges in the context of data-driven robotic systems.
Reasoning about the overall system-level competence of a robot in OOD conditions is important.

Paper Content

I. introduction

Machine learning systems are being used in robot autonomy stacks.
ML models are used to estimate and forecast the state of the environment.
ML models can behave unreliably on data that is dissimilar from the training data.
Coping with out-of-distribution inputs is a key challenge for reliable and safe open-world autonomy.

Ii. running examples

Autonomous Drone Delivery Service uses ML in its autonomy stack
Robotic Manipulators Assisting in the Home use RL in a controlled environment
Autonomous Drone Delivery Service must detect and manage OOD inputs
Robotic Manipulators Assisting in the Home must perform reliably in OOD test environments

Iii. what makes data out-of-distribution?

ML pipelines produce models that generalize well to test data
When models fail to generalize, it is attributed to “OOD data”
OOD data is data from a different distribution than the training data
Distributional shifts occur when test data is from a different distribution than the training data
Functional uncertainty occurs when the model is uncertain of its predictions, even when the test data is from the same distribution as the training data

Iv. trends in ood in machine learning

OOD problem is an open challenge in ML community
State-of-the-art models are sensitive to distributional shifts
Classical and core formulations and techniques from ML literature guide approach to tackling OOD challenge

A. coping with distributional shift

Standard ML techniques assume P test = P train
Major line of ML research aims to relax this assumption
Domain generalization considers capacity of model trained on P train to generalize to P test
Research direction aims to improve distributional robustness
Complementary research direction targets root cause of poor generalization from a causal inference perspective
Domain adaptation leverages training dataset and test inputs to optimize model on P test
Domain adaptation often yields drastic performance improvements with simple algorithms

B. assessing functional uncertainty

Domain adaptation and generalization focus on methods to select or improve a learned model
Anomaly detection considers predicting if an individual input is dissimilar to training data
Predictive uncertainty measures confidence in model predictions
Calibration algorithms and design choices encourage high predictive uncertainty on anomalous inputs
Bayesian ML allows us to quantify epistemic uncertainty by incorporating prior beliefs

C. evaluation

Researchers have developed benchmark datasets to evaluate OOD performance
OOD test sets can include synthetic corruptions and naturally occurring distribution shifts
Recent datasets emphasize tasks relevant to robotics
Datasets provide an intuitive foothold to develop algorithms to isolate reliability problems rooted in OOD data

V. open challenges for ood in robotics

Robotics is focused on building systems that work in the real world.
Reasoning about the reliability and competence of ML-enabled autonomous systems when they use learned models in a feedback loop over time.
System-level perspectives present unique challenges for the robotics community related to detecting, responding to, and improving OOD closed-loop performance.
Three different timescales of data-driven robotic systems with distinct OOD challenges.
Open research questions towards autonomous systems that leverage ML while being robust to OOD conditions.

A. real-time ml-enabled decision making

Need to reason about downstream impact of OOD inputs on decision-making system in real-time
Need to construct safeguards to ensure inference errors don’t lead to system failure
Need to monitor competence of full decision-making stack on individual inputs at test time

Rq 2 (ood aware decision making). can we design decision-making systems compatible with runtime monitors robust to high functional uncertainty?

Robot must choose an action even if runtime monitors suggest model is operating OOD
Design systems to assess and account for model uncertainties
Fallback strategies may need to rely on redundancy or alternate sources of information
Runtime monitors can flag when model outputs are unreliable

B. episodic closed-loop interaction

Learning-enabled robots actively interact with their environment to perform tasks
Reliable robotic systems should reason about the influence of OOD conditions on the closed-loop decision-making system
Temporally Correlated OOD events should be accounted for
Mitigating Distributional Shifts should be considered
Externally shifting conditions can degrade the perception system
Exploiting the drone’s ability to control shifts is important
Quickly adapting to shifted or evolving conditions is necessary

C. data lifecycle

Data collected during operation can be used to improve system performance
Data collected during different episodes of robot execution can be grouped into different operational domains
Data collected during operation should be used to efficiently improve models
Data collected during operation can be used to mitigate influence of OOD conditions
Training data should match test conditions
Need to increase diversity of data
Need to efficiently collect and label data

Vi. conclusion

Robotics requires a system-level perspective on the OOD problem.
Investigate how OOD data impacts the reliability of the autonomy stack.

Link to paper#

Abstract#

Paper Content#

I. introduction#

Ii. running examples#

Iii. what makes data out-of-distribution?#

Iv. trends in ood in machine learning#

A. coping with distributional shift#

B. assessing functional uncertainty#

C. evaluation#

V. open challenges for ood in robotics#

A. real-time ml-enabled decision making#

Rq 2 (ood aware decision making). can we design decision-making systems compatible with runtime monitors robust to high functional uncertainty?#

B. episodic closed-loop interaction#

C. data lifecycle#

Vi. conclusion#

Link to paper

Abstract

Paper Content

I. introduction

Ii. running examples

Iii. what makes data out-of-distribution?

Iv. trends in ood in machine learning

A. coping with distributional shift

B. assessing functional uncertainty

C. evaluation

V. open challenges for ood in robotics

A. real-time ml-enabled decision making

Rq 2 (ood aware decision making). can we design decision-making systems compatible with runtime monitors robust to high functional uncertainty?

B. episodic closed-loop interaction

C. data lifecycle

Vi. conclusion