Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Simultaneous odometry and mapping using LiDAR data is important for mobile systems to achieve full autonomy in large-scale environments.
- Most existing LiDAR-based methods prioritize tracking quality over reconstruction quality.
- A novel NeRF-based LiDAR odometry and mapping approach is proposed, consisting of three modules.
- The approach requires no pre-training and exhibits strong generalization abilities.
- Extensive evaluations demonstrate state-of-the-art odometry and mapping performance.
Paper Content
Introduction
- Simultaneous odometry and mapping is important for autonomous mobile systems
- LiDAR sensors are used for odometry and mapping due to their ability to provide precise range measurements
- Current LiDAR odometry and mapping algorithms prioritize tracking quality over dense reconstruction quality
- Research on deep-learning based algorithms for LiDAR odometry and mapping is scarce
- Proposed NeRF-LOAM accurately estimates poses of mobile system and reconstructs dense mesh map of outdoor large-scale environment
- Contributions of the work are: 1) the first neural implicit odometry and mapping method for large-scale environments using LiDAR data, 2) a novel neural SDF module combined with dynamic voxel embedding generation and a key-scan refinement strategy, 3) online joint optimization that needs no pre-training and generalizes well across different environments
Related work
- Odometry and mapping in outdoor large-scale environments using LiDAR data has been studied for a long time
- Iterative closest point (ICP) algorithm is used to align consecutive point clouds and calculate relative transformation
- Point-to-edge and point-to-plane distances are used to optimize the ICP error and achieve accurate odometry estimates
- Learning-based methods on LiDAR odometry are popular
- 3D scene is represented by surfels, occupancy grids, triangle meshes, and polynomial representations
- Neural implicit representation is used for novel view synthesis and SLAM
Our neural SDF
- Novel neural SDF module introduced
- Octree structure adopted to divide the scene into leaf nodes, whose voxels serve as the basic scene units
- N-dimensional embedding stored at each voxel vertex
- SDF values inferred from embeddings through neural network
- Treat environments differently when optimizing SDF values
- Rays and points sampling used to optimize pose and voxel embeddings
- Neural SDF value approximated by trilinear interpolation of voxel embeddings
- Free space loss used to remove dynamic objects
- SDF loss used to supervise SDF estimates
- Eikonal loss used to keep the SDF differentiable and its gradient norm equal to one within the truncation area
- NeRF-LOAM framework outputs poses of each scan and reconstructed mesh map
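The trilinear interpolation step in the list above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `trilinear_interpolate`, the `(2, 2, 2, N)` corner layout, and the random embeddings are assumptions made for the example. It blends the eight vertex embeddings of one voxel into a single embedding at a query point; in the described method, such an interpolated embedding is what the network decodes into an SDF value.

```python
import numpy as np

def trilinear_interpolate(corner_embeddings, local_xyz):
    """Interpolate an embedding at a point inside one voxel.

    corner_embeddings: array of shape (2, 2, 2, N); entry [i, j, k] is the
        embedding at the voxel corner with offsets (i, j, k) along x, y, z.
    local_xyz: position inside the voxel, each coordinate in [0, 1].
    """
    c = np.asarray(corner_embeddings, dtype=float)
    x, y, z = np.asarray(local_xyz, dtype=float)
    # Collapse one axis at a time: x first, then y, then z.
    c = c[0] * (1.0 - x) + c[1] * x      # shape (2, 2, N)
    c = c[0] * (1.0 - y) + c[1] * y      # shape (2, N)
    return c[0] * (1.0 - z) + c[1] * z   # shape (N,)

# Hypothetical data: random embeddings of length 16 (the length quoted in the setup).
rng = np.random.default_rng(0)
corners = rng.normal(size=(2, 2, 2, 16))
center = trilinear_interpolate(corners, (0.5, 0.5, 0.5))
corner = trilinear_interpolate(corners, (1.0, 0.0, 1.0))
```

At the voxel center the result is the plain average of the eight corner embeddings, and at a corner it reproduces that corner's embedding exactly, which is a quick sanity check for any implementation.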
Overview
- Takes LiDAR stream as input and outputs 3D reconstructed mesh with poses
- Neural odometry estimates a 6-DoF pose for each scan
- Neural mapping transforms point cloud into world coordinate system
- Key-scan buffer maintains long-term map consistency and enhances mapping quality
- Key scans are used to refine odometry and map results
- 3D mesh is reconstructed using the marching cubes method
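The step that moves a scan into the world coordinate system can be illustrated with a small sketch. It assumes the estimated 6-DoF pose is given as a 4x4 homogeneous matrix mapping sensor coordinates to world coordinates; the function name `transform_to_world` and the example pose are hypothetical.

```python
import numpy as np

def transform_to_world(points_sensor, pose_world_sensor):
    """Apply a 4x4 homogeneous pose to an (M, 3) array of sensor-frame points."""
    points_sensor = np.asarray(points_sensor, dtype=float)
    rotation = pose_world_sensor[:3, :3]
    translation = pose_world_sensor[:3, 3]
    # Row-vector form of R @ p + t for every point.
    return points_sensor @ rotation.T + translation

# Example pose (assumed): 90 degree yaw plus a translation of (1, 2, 0).
pose = np.array([
    [0.0, -1.0, 0.0, 1.0],
    [1.0,  0.0, 0.0, 2.0],
    [0.0,  0.0, 1.0, 0.0],
    [0.0,  0.0, 0.0, 1.0],
])
scan = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.5]])
world_points = transform_to_world(scan, pose)
```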
Neural mapping
- Octree-based approach used to partition scene
- Estimated pose used to convert points into world coordinate system
- New voxels added to octree with corresponding embeddings
- 3D voxel coordinates encoded into unique scalar value (Morton code)
- Efficient and scalable method for generating voxel embeddings dynamically
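A Morton code interleaves the bits of the three integer voxel coordinates into one scalar, so a voxel can be looked up with a single key. The sketch below is an assumption-laden illustration rather than the paper's code: it uses a plain Python dictionary as the lookup table, requires non-negative voxel indices (hence the `origin` offset), and the names `morton_encode`, `morton_decode`, and `allocate_voxels` are hypothetical. It registers one slot per voxel for brevity and leaves out the per-vertex embeddings themselves.

```python
import numpy as np

VOXEL_SIZE = 0.2  # metres, the voxel size quoted in the experimental setup

def morton_encode(ix, iy, iz, bits=21):
    """Interleave the bits of three non-negative integers into one scalar key."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code

def morton_decode(code, bits=21):
    """Recover the three voxel indices from a Morton key."""
    ix = iy = iz = 0
    for b in range(bits):
        ix |= ((code >> (3 * b)) & 1) << b
        iy |= ((code >> (3 * b + 1)) & 1) << b
        iz |= ((code >> (3 * b + 2)) & 1) << b
    return ix, iy, iz

def allocate_voxels(points_world, table, origin):
    """Register unseen voxels in `table` (key -> slot id); return how many were added."""
    idx = np.floor((np.asarray(points_world) - origin) / VOXEL_SIZE).astype(int)
    added = 0
    for ix, iy, iz in np.unique(idx, axis=0):
        key = morton_encode(int(ix), int(iy), int(iz))
        if key not in table:
            table[key] = len(table)  # slot id that could index an embedding array
            added += 1
    return added

table = {}
points = np.array([[0.05, 0.05, 0.05], [0.15, 0.10, 0.00], [0.45, 0.00, 0.00]])
first = allocate_voxels(points, table, origin=np.zeros(3))   # two new voxels
second = allocate_voxels(points, table, origin=np.zeros(3))  # nothing new
```

The count returned by `allocate_voxels` is the kind of "newly added voxels" figure that a key-scan selection rule could consume.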
Mesh reconstruction
- Key-scan buffer is used to improve mapping quality and prevent forgetting of the first K scans.
- Key-scan is added to the buffer if the number of newly added voxels exceeds a threshold or the distance between the current scan and the last key-scan is large.
- Refinement process is improved by only including rays or LiDAR points within a truncation distance based on the point density.
- Final mesh is obtained via marching cubes.
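The key-scan selection rule above can be written as a short predicate. This is a sketch under stated assumptions: poses are 4x4 matrices, distance is measured between scan positions, and both threshold values in the example are illustrative placeholders, since the summary does not give them.

```python
import numpy as np

def is_key_scan(num_new_voxels, pose, last_key_pose,
                voxel_threshold, distance_threshold):
    """Return True if the scan should be added to the key-scan buffer."""
    travelled = np.linalg.norm(pose[:3, 3] - last_key_pose[:3, 3])
    return bool(num_new_voxels > voxel_threshold or travelled > distance_threshold)

last_key = np.eye(4)
current = np.eye(4)
current[:3, 3] = [3.0, 4.0, 0.0]  # 5 m from the last key-scan

# Illustrative thresholds only (not from the paper).
far_enough = is_key_scan(10, current, last_key, voxel_threshold=500, distance_threshold=2.0)
many_voxels = is_key_scan(800, last_key, last_key, voxel_threshold=500, distance_threshold=2.0)
neither = is_key_scan(10, last_key, last_key, voxel_threshold=500, distance_threshold=2.0)
```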
Experiments
Experimental setup
- Evaluate method and compare to SOTA using 3 publicly available outdoor LiDAR datasets
- Use MaiCity and Newer College datasets to compare odometry and mapping results with provided ground truth
- Use KITTI odometry dataset to present odometry accuracy and qualitative mapping results
- Evaluate odometry accuracy using RMSE of ATEs and mapping accuracy using accuracy, completion, Chamfer-L1 distance, and F-score
- MLP with 2 FC layers, each with 256 hidden units
- Voxel embeddings length of 16 with voxel size 0.2 m
- Step size ratio 0.2 for odometry and 0.5 for mapping, truncation distance T_r = 0.3 m
- Use the method of [16] to separate ground points from the remaining LiDAR points
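The evaluation metrics named above can be sketched on point sets as follows. The definitions used here follow common practice and are assumptions, since the summary does not spell them out: accuracy is the mean distance from reconstructed points to the ground truth, completion is the reverse, Chamfer-L1 is their average, and F-score is the harmonic mean of precision and recall at a distance threshold. The ATE sketch assumes both trajectories are already aligned and time-synchronized. All function names are hypothetical, and the brute-force nearest-neighbour search is only suitable for small examples.

```python
import numpy as np

def nearest_distances(src, dst):
    """Distance from every point in src to its nearest neighbour in dst."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)

def mapping_metrics(pred, gt, threshold):
    """Accuracy, completion, Chamfer-L1 and F-score between two point sets."""
    d_pred_to_gt = nearest_distances(pred, gt)
    d_gt_to_pred = nearest_distances(gt, pred)
    accuracy = d_pred_to_gt.mean()
    completion = d_gt_to_pred.mean()
    precision = (d_pred_to_gt < threshold).mean()
    recall = (d_gt_to_pred < threshold).mean()
    denom = precision + recall
    f_score = 0.0 if denom == 0 else 2.0 * precision * recall / denom
    return {
        "accuracy": accuracy,
        "completion": completion,
        "chamfer_l1": 0.5 * (accuracy + completion),
        "f_score": f_score,
    }

def ate_rmse(est_positions, gt_positions):
    """RMSE of absolute trajectory error for aligned (M, 3) position arrays."""
    err = np.linalg.norm(np.asarray(est_positions) - np.asarray(gt_positions), axis=1)
    return float(np.sqrt((err ** 2).mean()))

# Toy data: two ground-truth points, three reconstructed points (one outlier).
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
pred = np.array([[0.0, 0.0, 0.1], [1.0, 0.0, 0.1], [5.0, 0.0, 0.0]])
metrics = mapping_metrics(pred, gt, threshold=0.2)
rmse = ate_rmse([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
```

In the toy data the outlier hurts accuracy and precision but not completion or recall, which shows why the metrics are reported together.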
Simultaneous odometry & mapping results
- Our method combined with KISS-ICP outperforms existing SOTA methods on the MaiCity dataset
- Our method has comparable quality in the Newer College dataset
- Our method effectively removes artifacts and produces a smoother mapping result
- Our method outperforms Puma in almost all metrics
Mapping quality
- Ground truth poses are used to reconstruct the mesh map of the environment.
- Our approach outperforms two baseline methods in terms of pure mapping ability.
- Error maps demonstrate the greater accuracy of our reconstruction.
Odometry evaluation
- Odometry quality influences mapping quality
- Results of odometry compared to non-learning and learning-based methods
- Our method achieves comparable results to other methods on the synthetic MaiCity and KITTI09 datasets
- Our method achieves the best performance on the Newer College dataset
- Our method does not require pre-training and exhibits strong generalization ability
Ablation study
- Comparing the method with and without ground separation shows that ground separation improves odometry and mapping accuracy
- Without ground separation, odometry accuracy declines and trajectory diverges
- Key-scan refinement improves mapping results and produces smoother and more complete results
- Varying voxel size affects mapping quality, memory consumption, and processing time
Conclusion
- Novel approach for simultaneous odometry and mapping using neural implicit representation with 3D LiDAR data
- NeRF-LOAM network tackles incremental LiDAR inputs in outdoor large-scale environments
- Uses voxel embeddings to record geometrical structure and avoids any pretraining
- Dynamic embedding generation for fast query and allocation
- Experiments conducted on simulated and real-world datasets
- Reconstructs higher-quality 3D mesh maps compared to other learning-based or non-learning-based methods
- Estimates accurate pose and generalizes well without any offline pre-training
- Limitation: cannot currently operate in real-time
- Future work: optimize code in C++, combine with loop closures
- Societal Impacts: provides accurate trajectories and reconstructs dense environmental awareness map
- Ablation study of designs on MaiCity and Newer College datasets
- Influence of network architecture and embedding length
- Effect of voxel size on processing time, accuracy distance, and memory consumption
- Odometry evaluation on KITTI dataset
- Qualitative results of odometry on KITTI dataset
- Ablation study for ground separation and key-scan refinement
- Reconstruction and odometry results on KITTI07