Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Simultaneous odometry and mapping using LiDAR data is important for mobile systems to achieve full autonomy in large-scale environments.
  • Most existing LiDAR-based methods prioritize tracking quality over reconstruction quality.
  • A novel NeRF-based LiDAR odometry and mapping approach is proposed, consisting of three modules.
  • The approach requires no pre-training and exhibits strong generalization abilities.
  • Extensive evaluations demonstrate state-of-the-art odometry and mapping performance.

Paper Content

Introduction

  • Simultaneous odometry and mapping is important for autonomous mobile systems
  • LiDAR sensors are used for odometry and mapping due to their ability to provide precise range measurements
  • Current LiDAR odometry and mapping algorithms prioritize tracking quality over dense reconstruction quality
  • Research on deep-learning based algorithms for LiDAR odometry and mapping is scarce
  • Proposed NeRF-LOAM accurately estimates poses of mobile system and reconstructs dense mesh map of outdoor large-scale environment
  • Contributions of the work are: 1) first neural implicit odometry and mapping method for large-scale environments using LiDAR data, 2) novel neural SDF module combined with dynamic generation and a key-scan refinement strategy, 3) online joint optimization that is pre-training free and generalizes well in different environments
  • Odometry and mapping in outdoor large-scale environments using LiDAR data has been studied for a long time
  • Iterative closest point (ICP) algorithm is used to align consecutive point clouds and calculate relative transformation
  • Point-to-edge and point-to-plane distances are used in the ICP error to achieve accurate odometry estimates
  • Learning-based methods on LiDAR odometry are popular
  • 3D scene is represented by surfels, occupancy grids, triangle meshes, and polynomial representations
  • Neural implicit representation is used for novel view synthesis and SLAM
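
As a small illustration of the point-to-plane distance mentioned above, the sketch below computes the signed residual that a point-to-plane ICP variant would minimize for one correspondence. The function name and the toy values are hypothetical, and the sketch assumes the source point has already been transformed by the current pose estimate and that the normal has unit length.

```python
def point_to_plane_residual(source_point, target_point, target_normal):
    """Signed distance from source_point to the plane through target_point.

    Assumes target_normal is a unit vector and source_point is already
    expressed in the target frame (hypothetical helper, not from the paper).
    """
    diff = [s - t for s, t in zip(source_point, target_point)]
    return sum(d * n for d, n in zip(diff, target_normal))


# Toy correspondence: a point 0.5 m above a horizontal plane through the origin.
residual = point_to_plane_residual((1.0, 2.0, 0.5), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(residual)
```

Sliding the source point along the plane leaves the residual unchanged, which is why this error tolerates sparse sampling on flat surfaces better than a plain point-to-point distance.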

Our neural SDF

  • Novel neural SDF module introduced
  • Octree structure adopted to divide the scene into leaf nodes, which hold the basic scene units (voxels)
  • N-dimension embedding at each vertex
  • SDF values inferred from embeddings through neural network
  • Treat environments differently when optimizing SDF values
  • Rays and points sampling used to optimize pose and voxel embeddings
  • Neural SDF value approximated by trilinear interpolation of voxel embeddings
  • Free space loss used to remove dynamic objects
  • SDF loss used to supervise SDF estimates
  • Eikonal loss used to keep the SDF differentiable with a gradient norm of one within the truncation area
  • NeRF-LOAM framework outputs poses of each scan and reconstructed mesh map
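
The trilinear interpolation of voxel embeddings can be sketched as follows. This is a minimal illustration under stated assumptions: the eight corner embeddings are held in a dictionary keyed by corner offsets, the query position is given in local voxel coordinates in [0, 1], and the decoder network that turns the interpolated feature into an SDF value is omitted. All names are hypothetical.

```python
def trilinear_interpolate(corner_embeddings, local_xyz):
    """Blend the eight corner embeddings of one voxel.

    corner_embeddings: dict mapping (i, j, k) with i, j, k in {0, 1}
                       to a list of floats (one embedding per corner).
    local_xyz: query position inside the voxel, each coordinate in [0, 1].
    """
    x, y, z = local_xyz
    dim = len(corner_embeddings[(0, 0, 0)])
    out = [0.0] * dim
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                # Weight of a corner is the product of per-axis linear weights.
                w = ((x if i else 1.0 - x)
                     * (y if j else 1.0 - y)
                     * (z if k else 1.0 - z))
                emb = corner_embeddings[(i, j, k)]
                for d in range(dim):
                    out[d] += w * emb[d]
    return out


# Toy embeddings: each corner stores its own offset plus a constant channel.
corners = {
    (i, j, k): [float(i), float(j), float(k), 1.0]
    for i in (0, 1) for j in (0, 1) for k in (0, 1)
}
feature = trilinear_interpolate(corners, (0.25, 0.5, 0.75))
print(feature)
```

Because the weights sum to one and vary linearly along each axis, the toy embedding reproduces the query position in its first three channels, which is a quick sanity check for the weights.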

Overview

  • Takes LiDAR stream as input and outputs 3D reconstructed mesh with poses
  • Neural odometry estimates 6-DoF pose for each scan
  • Neural mapping transforms point cloud into world coordinate system
  • Key-scan buffer maintains long-term map consistency and enhances mapping quality
  • Key scans are used to refine odometry and map results
  • 3D mesh is reconstructed using the marching cubes method
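
The step of moving a scan into the world coordinate system with an estimated pose can be sketched as below. This is only an illustration: the pose is assumed to be a 4x4 homogeneous matrix, and the helper `make_pose`, which builds a pose from a yaw angle and a translation, is a hypothetical convenience rather than part of the summarized method.

```python
import math


def make_pose(yaw, translation):
    """Build a 4x4 homogeneous pose from a yaw angle (radians) and a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_points(pose, points):
    """Apply p_world = R * p_sensor + t to every point of a scan."""
    world = []
    for x, y, z in points:
        world.append(tuple(
            pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
            for r in range(3)
        ))
    return world


# A sensor rotated 90 degrees about z and located at (10, 0, 1).
pose = make_pose(math.pi / 2.0, (10.0, 0.0, 1.0))
world_points = transform_points(pose, [(1.0, 0.0, 0.0)])
print(world_points)
```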

Neural mapping

  • Octree-based approach used to partition scene
  • Estimated pose used to convert points into world coordinate system
  • New voxels added to octree with corresponding embeddings
  • 3D voxel coordinates encoded into unique scalar value (Morton code)
  • Efficient and scalable method for generating voxel embeddings dynamically
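
A minimal sketch of Morton encoding and on-demand embedding allocation is given below. Several details are assumptions made for illustration: the bit-interleaving loop, the index offset used to keep voxel indices non-negative, and the use of a Python dictionary holding one zero-initialized vector per voxel (the summarized method attaches embeddings to voxel vertices). The voxel size of 0.2 m and the embedding length of 16 follow the experimental setup reported in this summary.

```python
import math

VOXEL_SIZE = 0.2   # metres, as in the experimental setup
EMBED_DIM = 16     # embedding length, as in the experimental setup
OFFSET = 1 << 20   # hypothetical shift so voxel indices stay non-negative


def morton_encode(ix, iy, iz, bits=21):
    """Interleave the bits of three non-negative integers into one scalar."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def voxel_key(point):
    """Map a world-frame point to the Morton code of its containing voxel."""
    ix, iy, iz = (int(math.floor(c / VOXEL_SIZE)) + OFFSET for c in point)
    return morton_encode(ix, iy, iz)


def allocate(table, points):
    """Create embeddings for voxels seen for the first time; return how many."""
    added = 0
    for p in points:
        key = voxel_key(p)
        if key not in table:
            table[key] = [0.0] * EMBED_DIM
            added += 1
    return added


table = {}
# The first two points share a voxel; the third falls into its +x neighbour.
new_voxels = allocate(table, [(0.05, 0.05, 0.05), (0.1, 0.1, 0.1), (0.25, 0.05, 0.05)])
print(new_voxels, len(table))
```

Using a single scalar key keeps lookup and insertion constant-time on average, and the count of newly allocated voxels is available as a by-product of each call.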

Mesh reconstruction

  • Key-scan buffer is used to improve mapping quality and prevent forgetting of the first K scans.
  • Key-scan is added to the buffer if the number of newly added voxels exceeds a threshold or the distance between the current scan and the last key-scan is large.
  • Refinement process is improved by only including rays or LiDAR points within a truncation distance based on the point density.
  • Final mesh is obtained via marching cubes.
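
The key-scan selection rule described above can be sketched as a simple predicate. The threshold values used here are placeholders chosen for the example, not values from the paper, and the function and parameter names are hypothetical.

```python
import math


def is_key_scan(num_new_voxels, position, last_key_position,
                voxel_threshold, distance_threshold):
    """Return True if the current scan should be added to the key-scan buffer.

    A scan qualifies when it adds many new voxels or when the sensor has
    moved far from the position of the last key-scan.
    """
    moved = math.dist(position, last_key_position)
    return num_new_voxels > voxel_threshold or moved > distance_threshold


# Placeholder thresholds for illustration only.
VOXEL_THRESHOLD = 50
DISTANCE_THRESHOLD = 5.0

# Few new voxels, but the sensor has travelled 6 m since the last key-scan.
selected = is_key_scan(10, (6.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                       VOXEL_THRESHOLD, DISTANCE_THRESHOLD)
print(selected)
```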

Experiments

Experimental setup

  • Evaluate method and compare to SOTA using 3 publicly available outdoor LiDAR datasets
  • Use MaiCity and Newer College datasets to compare odometry and mapping results with provided ground truth
  • Use KITTI odometry dataset to present odometry accuracy and qualitative mapping results
  • Evaluate odometry accuracy using RMSE of ATEs and mapping accuracy using accuracy, completion, Chamfer-L1 distance, and F-score
  • MLP with 2 FC layers, each with 256 hidden units
  • Voxel embeddings length of 16 with voxel size 0.2 m
  • Step size ratio 0.2 for odometry and 0.5 for mapping, truncation distance Tr = 0.3 m
  • Use seminal work of [16] to distinguish ground from LiDAR points
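
The evaluation metrics listed above can be sketched on small point sets as follows. This is a hedged illustration using definitions that are common in the reconstruction literature, which may differ in detail from the paper's evaluation code: accuracy is the mean distance from reconstructed points to the reference, completion is the mean distance in the other direction, Chamfer-L1 is their average, and the F-score is the harmonic mean of precision and recall at a distance threshold. The ATE computation assumes the two trajectories are already aligned and time-synchronized. Nearest neighbours are found by brute force, so this only suits toy inputs.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of translational errors between aligned, paired trajectories."""
    squared = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbour in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def mapping_metrics(reconstructed, reference, threshold):
    """Accuracy, completion, Chamfer-L1 and F-score under the stated definitions."""
    d_rec = nearest_distances(reconstructed, reference)
    d_ref = nearest_distances(reference, reconstructed)
    accuracy = sum(d_rec) / len(d_rec)
    completion = sum(d_ref) / len(d_ref)
    precision = sum(d <= threshold for d in d_rec) / len(d_rec)
    recall = sum(d <= threshold for d in d_ref) / len(d_ref)
    total = precision + recall
    f_score = 2.0 * precision * recall / total if total > 0 else 0.0
    return {
        "accuracy": accuracy,
        "completion": completion,
        "chamfer_l1": 0.5 * (accuracy + completion),
        "f_score": f_score,
    }


rmse = ate_rmse([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)])
metrics = mapping_metrics(
    reconstructed=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    reference=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    threshold=0.1,
)
print(rmse, metrics)
```

In the toy case the reconstruction is exact where it exists but misses one reference point, so accuracy is perfect while completion and recall are penalized.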

Simultaneous odometry and mapping results

  • Our method combined with KISS-ICP outperforms existing SOTA methods on the MaiCity dataset
  • Our method has comparable quality in the Newer College dataset
  • Our method effectively removes artifacts and produces a smoother mapping result
  • Our method outperforms Puma in almost all metrics

Mapping quality

  • Ground truth poses are used to reconstruct the mesh map of the environment.
  • Our approach outperforms two baseline methods in terms of pure mapping ability.
  • Error maps demonstrate the greater accuracy of our reconstruction.

Odometry evaluation

  • Odometry quality influences mapping quality
  • Results of odometry compared to non-learning and learning-based methods
  • Our method achieves comparable results to other methods on the synthetic MaiCity and KITTI09 datasets
  • Our method achieves the best performance on the Newer College dataset
  • Our method does not require pre-training and exhibits strong generalization ability

Ablation study

  • Comparing the method with and without ground separation shows that ground separation improves odometry and mapping accuracy
  • Without ground separation, odometry accuracy declines and trajectory diverges
  • Key-scan refinement improves mapping results and produces smoother and more complete results
  • Varying voxel size affects mapping quality, memory consumption, and processing time

Conclusion

  • Novel approach for simultaneous odometry and mapping using neural implicit representation with 3D LiDAR data
  • NeRF-LOAM network tackles incremental LiDAR inputs in outdoor large-scale environments
  • Uses voxel embeddings to record geometrical structure and avoids any pretraining
  • Dynamic embedding generation for fast query and allocation
  • Experiments conducted on simulated and real-world datasets
  • Reconstructs higher-quality 3D mesh maps compared to other learning-based or non-learning-based methods
  • Estimates accurate pose and generalizes well without any offline pre-training
  • Limitation: cannot currently operate in real-time
  • Future work: optimize code in C++, combine with loop closures
  • Societal Impacts: provides accurate trajectories and reconstructs dense environmental awareness map
  • Ablation study of designs on MaiCity and Newer College datasets
  • Influence of network architecture and embedding length
  • Effect of voxel size on processing time, accuracy distance, and memory consumption
  • Odometry evaluation on KITTI dataset
  • Qualitative results of odometry on KITTI dataset
  • Ablation study for ground separation and key-scan refinement
  • Reconstruction and odometry results on KITTI07