Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Traditional mapping methods have difficulty balancing memory consumption and accuracy.
- This paper proposes a 3D LiDAR-based mapping method that stores learnable features in an octree-based hierarchical structure.
- These features are optimized directly from 3D measurements using a binary cross-entropy loss.
- The mapping system is designed to prevent catastrophic forgetting.
- Experiments show that the proposed method is more accurate, complete, and memory-efficient than current methods.
Paper Content
I. Introduction
- Localization and navigation in large-scale outdoor scenes are common tasks for mobile robots
- An accurate, dense 3D map of the environment is necessary
- Current large-scale mapping methods use spatial grids or tree structures
- Neural network-based representations are becoming popular
- Little work has applied neural representations to LiDAR data
- SHINE-Mapping is a novel approach for large-scale incremental 3D mapping
- SHINE-Mapping uses an octree-based sparse data structure and a shared shallow MLP
- SHINE-Mapping is more accurate and complete than non-learning-based mapping methods
II. Related Work
- 3D LiDAR point clouds are used for environment mapping, localization, navigation, visualization, and augmented reality
- Common representations for environment mapping include surfels, triangle meshes, and octree-based occupancy maps
- Volumetric integration methods were popularized by Newcombe et al.
- Neural representations, like NeRF, are used for novel view synthesis
- Implicit representations represent the environment via multilayer perceptrons
- Incremental mapping with implicit representation is a continual learning problem
- The memory cost of dense voxel structures is reduced by using an octree-based sparse feature grid
- Feature update regularization is used to achieve incremental mapping with limited memory
III. Our Approach: SHINE-Mapping
- Proposed framework for large-scale 3D mapping
- Takes point clouds from a range sensor (e.g., LiDAR) with known poses as input
- Uses learnable octree-based hierarchical feature grids and an MLP decoder to represent the signed distance function (SDF) of the environment
- Optimizes local feature vectors online to capture local geometry
- Generates explicit geometric representation (triangle mesh) for visualization and evaluation
A. Implicit Neural Map Representation
- Our implicit map representation stores spatially located features in the 3D world.
- We use a neural network to infer SDF values from these features.
- We combine features from H different resolutions.
- We use an octree-based map representation and hash tables to store features.
- We use Morton codes to quickly find features of upper levels.
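The Morton-code lookup and multi-resolution combination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `morton_encode`, `parent_code`, `insert_voxel`, and `combined_feature` are hypothetical, one feature vector is stored per voxel for simplicity, and features from different levels are combined by summation as an assumed choice.

```python
from typing import Dict, List


def morton_encode(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the bits of integer voxel coordinates into a single key."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def parent_code(code: int) -> int:
    """Dropping the lowest 3 bits yields the Morton code of the parent node."""
    return code >> 3


# Assumed layout: one hash table (dict) per level, index 0 being the finest.
NUM_LEVELS = 3
FEATURE_DIM = 4
feature_tables: List[Dict[int, List[float]]] = [dict() for _ in range(NUM_LEVELS)]


def insert_voxel(x: int, y: int, z: int, init: float = 0.0) -> None:
    """Allocate a feature for the voxel and for each of its ancestors."""
    code = morton_encode(x, y, z)
    for table in feature_tables:
        table.setdefault(code, [init] * FEATURE_DIM)
        code = parent_code(code)


def combined_feature(x: int, y: int, z: int) -> List[float]:
    """Sum the features found at every level (summation is an assumption)."""
    code = morton_encode(x, y, z)
    total = [0.0] * FEATURE_DIM
    for table in feature_tables:
        feat = table.get(code)
        if feat is not None:
            total = [t + f for t, f in zip(total, feat)]
        code = parent_code(code)
    return total
```

Because a parent's key is just a bit shift of the child's key, moving up the hierarchy needs no tree traversal, only one hash lookup per level.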
B. Training Pairs and Loss Function
- Range sensors such as LiDARs provide accurate range measurements.
- Training pairs are obtained by sampling points along the ray and using the signed distance from the sampled point to the beam endpoint as the supervision signal.
- The regions of interest are the values close to zero as they define the surfaces.
- Binary cross entropy (BCE) is used as the loss function, with an Eikonal term added to encourage accurate signed distance values.
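The sampling and loss described above can be sketched roughly as follows. Several details are assumptions made for illustration only: the function names, the truncation band `trunc`, the scale `sigma`, the sign convention (positive in front of the surface), and the sigmoid used to map signed distances into (0, 1) before applying BCE. The Eikonal term is omitted because it requires gradients of the network output.

```python
import math
import random


def sample_training_pairs(origin, endpoint, n_samples=5, trunc=0.3, seed=0):
    """Sample points along one beam near its endpoint.

    Each point is labelled with its signed distance to the endpoint measured
    along the ray (assumed positive in front of the surface, negative behind).
    """
    rng = random.Random(seed)
    direction = [e - o for o, e in zip(origin, endpoint)]
    ray_length = math.sqrt(sum(c * c for c in direction))
    unit = [c / ray_length for c in direction]
    pairs = []
    for _ in range(n_samples):
        d = rng.uniform(-trunc, trunc)      # signed distance label
        depth = ray_length - d              # distance of the sample from the sensor
        point = [o + depth * u for o, u in zip(origin, unit)]
        pairs.append((point, d))
    return pairs


def soft_occupancy(sdf, sigma=0.05):
    """Map a signed distance to (0, 1); values near 0.5 lie close to the surface."""
    z = max(min(sdf / sigma, 50.0), -50.0)  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(z))


def bce_loss(pred_sdf, label_sdf, sigma=0.05, eps=1e-7):
    """Binary cross entropy between the mapped prediction and the mapped label."""
    o = min(max(soft_occupancy(pred_sdf, sigma), eps), 1.0 - eps)
    l = soft_occupancy(label_sdf, sigma)
    return -(l * math.log(o) + (1.0 - l) * math.log(1.0 - o))
```

Passing both values through a steep sigmoid concentrates the loss on samples whose signed distance is near zero, which matches the stated interest in the region around the surface.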
C. Incremental Mapping Without Forgetting
- Catastrophic forgetting happens when using feature grid-based implicit incremental mapping
- Network focuses on reducing loss generated in current area, not previous area
- As grid size increases, forgetting problem becomes more severe
- Solution is to limit update direction of local feature vector
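- Regularization term added to loss function penalizes changes to features that matter for previously mapped areas; the importance weights are capped to prevent gradient explosion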
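One way to picture the regularization described above is an importance-weighted penalty on feature changes, in the spirit of regularization-based continual learning. The sketch below is an assumption about the general shape of such a term, not the paper's exact formulation; the function names, the `cap` value, and `reg_weight` are all hypothetical.

```python
def regularization_loss(features, prev_features, importance):
    """Penalize moving each feature away from its previously learned value,
    scaled by how important that feature was for earlier data."""
    return sum(
        w * (f - p) ** 2
        for f, p, w in zip(features, prev_features, importance)
    )


def update_importance(importance, grad_magnitudes, cap=100.0):
    """Accumulate importance from gradient magnitudes, capped at an upper bound
    so that the penalty (and its gradient) cannot grow without limit."""
    return [min(w + g, cap) for w, g in zip(importance, grad_magnitudes)]


def total_loss(data_loss, features, prev_features, importance, reg_weight=1.0):
    """Data term on the current scan plus the weighted regularization term."""
    return data_loss + reg_weight * regularization_loss(
        features, prev_features, importance
    )
```

Features with high importance are effectively frozen, while features in newly observed regions, whose importance is still low, remain free to adapt to the current scan.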
IV. Experimental Evaluation
- Incremental and scalable 3D mapping system
- Uses sparse hierarchical feature grid and neural decoder
A. Experimental Setup
- Evaluated model on two publicly available outdoor LiDAR datasets with near-ground-truth meshes
- One dataset is a sequence of simulated LiDAR scans, the other is a non-simulated dataset with cm-level measurement noise and motion distortion
- Evaluated mapping accuracy, completeness, and memory efficiency
- Compared results to previous methods
- Validated scalability for incremental mapping
- Showcased high-fidelity 3D reconstruction indoors
B. Mapping Quality
- Evaluated mapping quality in terms of accuracy and completeness
- Compared approach against three other mapping systems
- Also implemented a differentiable-rendering variant for comparison
- Used reconstruction metrics to assess results
- Results show superiority of approach in terms of accuracy and completeness
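For readers unfamiliar with these metrics, the sketch below uses a common definition: accuracy is the mean distance from each reconstructed point to its nearest ground-truth point, and completeness is the mean distance in the opposite direction. This definition and the function names are assumptions for illustration; the summary does not specify the exact metrics or sampling used in the paper. Brute-force nearest-neighbour search is used only to keep the example short.

```python
import math


def nearest_distance(p, cloud):
    """Distance from point p to the closest point in cloud (brute force)."""
    return min(math.dist(p, q) for q in cloud)


def accuracy(reconstructed, ground_truth):
    """Mean distance from reconstructed points to the ground truth (lower is better)."""
    return sum(nearest_distance(p, ground_truth) for p in reconstructed) / len(reconstructed)


def completeness(reconstructed, ground_truth):
    """Mean distance from ground-truth points to the reconstruction (lower is better)."""
    return sum(nearest_distance(q, reconstructed) for q in ground_truth) / len(ground_truth)


# Toy example: the reconstruction is close to the surface but misses one point.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
rec = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
acc = accuracy(rec, gt)
comp = completeness(rec, gt)
```

In the toy example the reconstruction scores well on accuracy but worse on completeness, because the ground-truth point at x = 2 has no nearby reconstructed point.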
C. Memory Efficiency
- Our method creates maps with smaller memory usage than Voxblox and VDB Fusion
- Our method maintains good mapping quality even with lower feature grid resolution, while Voxblox and VDB Fusion have significantly increased mapping error
D. Scalable Incremental Mapping
- SHINE-Mapping can scale to larger environments
- Used KITTI dataset to showcase this
- Reconstructed a driving sequence over 4 km
- Qualitative comparison between incremental mapping with and without feature update regularization
E. Indoor Mapping and Filling Occluded Areas
- Presented a novel approach to large-scale 3D SDF mapping using range sensors
- Used an octree-based implicit representation consisting of features stored in hash tables
- Network and features can be learned end-to-end from range data
- Evaluated on simulated and real-world datasets
- Advantages over current state-of-the-art mapping systems
- More accurate and complete 3D reconstruction with lower map memory than compared methods
- Can provide a reasonable guess about the structure for regions not covered by the sensor