Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Clustering data close to a union of low-dimensional manifolds is a problem in machine learning.
- Low-rank and sparse priors have been studied for linear subspaces.
- Real-world datasets cannot be approximated by linear subspaces.
- Works have proposed to identify the manifolds by learning a feature map.
- This paper proposes to simultaneously perform clustering and learn a union-of-subspace representation.
- Experiments show the proposed method is accurate and scalable.
Paper Content
Introduction 1.motivation and contributions
- Clustering is a problem in machine learning to group data into clusters
- Classical k-means clustering and its variants can find cluster centroids and assign membership
- Subspace clustering methods are designed to cluster data that lie close to a union of different low-dimensional linear/affine subspaces
- Subspace clustering methods often have theoretical guarantees of correct clustering
- Subspace clustering methods rely on the assumption that each cluster can be approximated by a linear/affine subspace, which is not always valid
- Hand-designing an appropriate feature embedding (or kernel) such as polynomial or exponential mappings can be used
- Treating a local neighborhood of the manifold approximately as a linear subspace can also be used
- Numerous works propose to learn an appropriate linear embedding of the data via deep networks and then perform subspace clustering
- Learning a representation from multi-modal data is a topic of its own interest
- Ideal properties of the learned representation are between-cluster discrimination and within-cluster diversity
- Training with the cross-entropy classification objective fails to achieve the second property
- Maximal Coding Rate Reduction (MCR 2 ) can achieve the two ideal properties
- This paper proposes to simultaneously cluster the data and learn a union-of-orthogonal-subspace representation via MCR 2
We conduct experiments on simulation and cifar10
- Problem 1: Unsupervised Manifold Linearizing and Clustering: Simultaneously cluster samples and learn a linear representation for manifolds
- Maximal Coding Rate Reduction (MCR 2 ) designed to learn ideal representations in the supervised case
- MCR 2 clustering objective proposed to solve Problem 1
- Doubly Stochastic Subspace Clustering: Affinity matrix Π learned to cluster points
- Manifold Linearizing and Clustering (MLC): MCR 2 clustering objective with doubly stochastic constraints on the affinity Π
- Parameterizing Π via a Neural Network: Stochastic gradient descent used to maintain memory and computational complexity
- Comparison with NMCE: MLC yields higher clustering accuracy than NMCE
- Parameterizing Z θ : Existing network architecture used as backbone
- Parameterizing Π θ : Inner product kernel of the output of the cluster head C θ followed by a doubly stochastic projection