Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- NCD is a task of learning a model to segment unlabelled classes using only labelled classes
- No work exists for 3D point cloud data
- This paper advances the state of the art on point cloud data analysis in four directions
- Presents a new method for NCD based on online clustering
- Introduces a new evaluation protocol to assess the performance of NCD for point cloud semantic segmentation
Paper Content
Introduction
- Humans can organize new visual knowledge into groups
- Machines cannot do this without supervision
- Novel Class Discovery (NCD) is the task of classifying unlabelled samples into different classes
- NCD has been explored in 2D image domain for classification and semantic segmentation
- NCD for 3D data is different because one point cloud can contain more than one novel class
- NCD for 3D semantic segmentation is explored in this paper
- A new method for NCD is presented, called NOPS (NOvel Point Segmentation)
- A new evaluation protocol is introduced to assess the performance of NCD for 3D semantic segmentation
Related work
- Point cloud semantic segmentation can be performed at the point level, on range view maps, and by voxelising the input points
- Point-level networks process the input without intermediate representations, examples include PointNet, PointNet++, RandLA-Net, and KPConv
- Range view architectures and voxel-based approaches are more computationally efficient than point-level networks
- Novel class discovery is explored for 2D classification and 2D segmentation
- NCD is more complex than standard semi-supervised learning
- NOPS tackles the problem of NCD in 3D point cloud semantic segmentation
- NOPS produces two augmented views that are processed with the same deep neural network
- Sinkhorn-Knopp algorithm is used to obtain pseudo-labels
- Network is trained by minimising the optimisation objective function through a swapped prediction task based on the computed pseudo-labels
Problem formulation
- X is a dataset of 3D point clouds captured in different scenes.
- X is composed of a base set Xb and a novel set Xn.
- Semantic categories present in the point clouds are Cb and Cn, where Cb is the set of base classes and Cn is the set of novel classes.
- Each X is composed of 3D points with coordinates and semantic classes.
- Aim is to design a computational approach to segment all points of a given point cloud.
Online pseudo-labelling
- Pseudo-labelling is the assignment of novel points to class-prototypes learnt during training
- We aim to find an assignment that partitions the points equally across the prototypes
- We maximise the similarity between the features of the new points and the learned prototypes
- We formulate the transportation polytope such that the optimisation is performed online
- We introduce a linear decay of the pseudo-labels during training
- We use multiple novel class segmentation heads to optimise the feature space
Class-balanced queuing
- Soft pseudo-labelling produces an equipartite matching between novel points and class centroids.
- When dealing with 3D data, batches may contain novel classes with different cardinalities.
- To mitigate potential class imbalance, a queue of features from previous iterations is used.
Uncertainty-aware training and queuing
- Proposed selection of novel points for training with fewer but more reliable pseudo-labels
- Adaptive threshold based on class probabilities within each batch
- Selection strategy extracts novel points with greatest class probability, computes threshold as p-th percentile of class probabilities, and retains novel points with class probability above threshold
Optimisation objective
- Optimize f Θ using weighted Cross Entropy objective
- Formulate swapped prediction task based on pseudo-labels
- Generate two different augmentations of X
- Predict novel pseudo-labels of X and X
- Enforce prediction consistency between swapped pseudo-labels
- Use separate segmentation heads for base and novel classes
Adapting ncd for 2d images to 3d
- Adapted Zhao et al. [41] method for NCD for 2D semantic segmentation (EUMS) to 3D data
- EUMS uses two assumptions: I) novel classes belong to foreground and II) each image can contain at most one novel class
- With 3D point clouds, no concept of foreground and background
- Adaptation designed to discover classes of all unlabelled points
- Subsampling of points necessary to fit data in RAM
- Cluster prototypes computed, hard pseudo-labels produced
- Pseudo-label of each point propagated to nearest neighbour in coordinate space
- Overclustering and entropy-based modelling implemented to boost results
Experimental results
Experimental setup
- Evaluated approach on SemanticKITTI and SemanticPOSS datasets
- Created four splits for each dataset
- Measured performance using mean Intersection over Union (mIoU)
- Implemented network based on MinkowskiUNet-34C
- Trained network for 10 epochs with SGD optimizer
Quantitative analysis
- NOPS outperforms EUMS † on 3 out of 4 splits on SemanticPOSS
- NOPS outperforms EUMS † on all 4 splits on SemanticKITTI
- NOPS outperforms EUMS † in terms of computational time
- NOPS requires less memory and a lower computational time than EUMS †
Qualitative analysis
- NOPS and EUMS † are tested on SemanticPOSS and SemaniticKITTI
- NOPS shows better segmentation capabilities for novel classes
- EUMS nearly fails in correctly recognizing novel objects
- NOPS is composed of 7 versions, with and without pre-training
- Pre-trained approaches generally underperform trained-from-scratch counterparts on novel classes
- Performance depends on number of points and difficulty of novel classes
- NOPS outperforms compared baselines by a large margin
- Limitations include prior knowledge on number of novel classes and loss used to handle class unbalancing
- Evaluation protocol introduced to assess performance of NCD for point cloud segmentation
- EUMS † adapted for 3D point cloud data with changes to original implementation
- Splits of SemanticKITTI and SemanticPOSS selected based on balancing novel classes and including semantic relationships between base and novel classes
- Additional qualitative results show NOPS outperforms EUMS †