Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Proposes a fully differentiable $\nabla$-RANSAC
  • Predicts inlier probabilities of input data points
  • Exploits predictions in a guided sampler
  • Estimates model parameters and quality while propagating gradients
  • Random sampler based on Gumbel Softmax sampler
  • Model quality function marginalizes over scores from all models
  • Unlocks end-to-end training of geometric estimation pipelines
  • Trained with LoFTR to find reliable correspondences
  • Tested on real-world datasets for fundamental and essential matrix estimation
  • Superior to state-of-the-art in terms of accuracy and speed

Paper Content

Introduction

  • Direct optimization on test-time evaluation metric has been beneficial for deep learning in vision tasks
  • Training model directly on evaluation metric is infeasible when metric is nondifferentiable, so training with a surrogate of the metric is used
  • Examples of surrogate losses include average precision and recall@k for image retrieval, perceptual loss for image compression, intersection-over-union loss for object detection, and edit distance loss for scene text recognition
  • RANSAC is widely used for robust estimation in vision pipelines
  • RANSAC variants have been proposed to improve components of the original algorithm
  • ∇-RANSAC is proposed to make RANSAC end-to-end differentiable
  • ∇-RANSAC allows robust estimators to use test-time evaluation metrics to optimize end-to-end training
  • ∇-RANSAC is trained with a detector-free feature matcher, LoFTR, to improve accuracy

Fully differentiable ransac

  • Input is set of tentative point correspondences with extra info from detector and matcher
  • Consensus learning via pruning block from recent [89]
  • ∇-RANSAC is an iterative random sampling of m data points
  • Sampler is Gumbel Softmax Sampler using input probabilities as guidance
  • Differentiable minimal solver estimates model parameters from drawn sample
  • Model quality is computed in a supervised way using ground truth

Gumbel softmax sampler

  • ∇-RANSAC requires sampling m data points from a set of n total samples.
  • Sampling distribution is either governed by importance scores or follows a uniform distribution.
  • Standard sampling operation is either non-differentiable or has sparse gradients.
  • Gumbel-Softmax is extended with the straight through trick to make sampling differentiable.

Differentiable minimal solver

  • Minimal solvers are a part of RANSAC-like hypothesize-and-verify approaches
  • Estimate model parameters from a minimal set of data points
  • Most minimal solvers are differentiable
  • Fundamental and essential matrix estimation are two utmost important problems
  • 8PC and 7PC solvers have a degeneracy when points stem from a close-to-planar underlying 3D structure
  • 5PC solvers are used in practical applications
  • Most minimal solvers return multiple solutions
  • Best solution is selected based on evaluation metric
  • Source codes will be made publicly available

Trainable quality function

  • RANSAC calculates the quality of an estimated model as the number of inliers
  • Other algorithms have improved RANSAC’s performance by better modelling noise
  • Some works use soft probabilistic hypothesis selection
  • Other methods combine classification loss with regression and geometry-induced losses

Training and testing details

  • Input of ∇-RANSAC is a set of correspondences obtained by any feature detector and matcher
  • Number of matches is fixed to 2000, best 2000 chosen based on matching score
  • Missing values filled with zeros if fewer correspondences
  • Local and global features extracted from correspondences by consensus learning block
  • Weights initialized with 1000 epoch-long procedure to minimize Kullback-Leibler divergence
  • Gradient clipping used to avoid exploding gradients and accelerate convergence
  • Training pipeline implemented in PyTorch
  • Inference uses state-of-the-art components and Gumbel Softmax Sampler
  • MAGSAC++ model quality function used to select best model
  • Inner RANSAC-based local optimization and Levenberg-Marquardt numerical optimization used to improve accuracy
  • Testing algorithm implemented in C++

Experimental results

  • Tested epipolar geometry estimation on 13 scenes from the CVPR IMW 2020 PhotoTourism benchmark
  • Trained and validated on St. Peter’s Square with 4950 image pairs
  • Compared ∇-RANSAC to classical robust estimators and state-of-the-art learning-based methods
  • Retrained NG-RANSAC, CLNet, and OANet on same data
  • Used SNN ratio, feature scales and orientations as learnable side-information
  • Pre-filtered correspondences by SNN ratio threshold of 0.8
  • Used 0.75 pixels as inlier-outlier threshold for robust estimators

Fundamental matrix estimation

Essential matrix estimation

  • Evaluated method for E estimation with same train, validation, and test scenes as F estimation
  • Used differentiable 5PC algorithm when training on essential matrix estimation
  • Trained end-to-end for 10 epochs with weight initialization and gradient clipping
  • Iteration number fixed to 100
  • Evaluated estimated essential matrix by decomposing E matrix to rotation and translation, calculating errors R and t, and reporting maximum max(R, t)
  • Calculated Area Under the Recall curve (AUC) thresholded at 5•, 10• and 20•
  • Highest AUC scores achieved by ∇-RANSAC
  • Highest AUC scores achieved by five-point algorithm, confirming necessity of using better minimal solvers than 8PC algorithm
  • Weight initialization with Kullback-Leibler divergence improves accuracy
  • Sampling weights used Sacre Coeur as test set
  • Epipolar error leads to best results
  • ∇-RANSAC can be used to improve end-to-end feature matching approaches
  • Best results observed with setup where only LoFTR model is trained and ∇-RANSAC is kept frozen with pre-trained weights
  • ∇-RANSAC leads to most accurate fundamental and essential matrices compared to state-of-the-art robust estimators
  • Code repository and trained model will be made public