Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Clustering data that lies close to a union of low-dimensional manifolds is a fundamental problem in machine learning.
  • Low-rank and sparse priors have been studied for linear subspaces.
  • Many real-world datasets cannot be well approximated by linear subspaces.
  • Prior works have proposed to identify the manifolds by learning a feature map.
  • This paper proposes to simultaneously perform clustering and learn a union-of-subspace representation.
  • Experiments show the proposed method is accurate and scalable.

Paper Content

Introduction: Motivation and Contributions

  • Clustering is the machine learning problem of grouping data into clusters
  • Classical k-means clustering and its variants can find cluster centroids and assign membership
  • Subspace clustering methods are designed to cluster data that lie close to a union of different low-dimensional linear/affine subspaces
  • Subspace clustering methods often have theoretical guarantees of correct clustering
  • Subspace clustering methods rely on the assumption that each cluster can be approximated by a linear/affine subspace, which is not always valid
  • One remedy is to hand-design an appropriate feature embedding (or kernel), such as a polynomial or exponential mapping
  • Another remedy is to treat a local neighborhood of the manifold approximately as a linear subspace
  • Numerous works propose to learn an appropriate embedding of the data via deep networks and then perform subspace clustering in the embedded space
  • Learning a representation from multi-modal data is a topic of independent interest
  • Ideal properties of the learned representation are between-cluster discrimination and within-cluster diversity
  • Training with the cross-entropy classification objective fails to achieve the second property (within-cluster diversity)
  • Maximal Coding Rate Reduction (MCR²) can achieve both ideal properties
  • This paper proposes to simultaneously cluster the data and learn a union-of-orthogonal-subspace representation via MCR²
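
To make the MCR² bullets above more concrete, here is a minimal NumPy sketch of the coding rate reduction in its commonly stated supervised form, where cluster labels are known. The function names, the distortion value `eps=0.5`, and the toy data are assumptions made for illustration; they are not taken from the paper, and the paper's clustering objective replaces the hard labels with a learned membership.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z Z^T) for Z of shape (d, n)."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T))
    return 0.5 * logdet

def coding_rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole set minus the size-weighted rates of each cluster."""
    n = Z.shape[1]
    within = 0.0
    for k in np.unique(labels):
        Zk = Z[:, labels == k]
        within += (Zk.shape[1] / n) * coding_rate(Zk, eps)
    return coding_rate(Z, eps) - within

def unit_columns(X):
    """Normalize every column (sample) to unit Euclidean norm."""
    return X / np.linalg.norm(X, axis=0, keepdims=True)

# Toy data (hypothetical): two clusters of 50 samples each in R^4.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
A = rng.normal(size=(2, 50))
B = rng.normal(size=(2, 50))
zeros = np.zeros((2, 50))

# Clusters in orthogonal 2-D subspaces vs. both clusters in the same subspace.
Z_orthogonal = unit_columns(np.block([[A, zeros], [zeros, B]]))
Z_collapsed = unit_columns(np.block([[A, B], [zeros, zeros]]))

dr_orthogonal = coding_rate_reduction(Z_orthogonal, labels)
dr_collapsed = coding_rate_reduction(Z_collapsed, labels)
print(dr_orthogonal, dr_collapsed)
```

In this toy setup the representation whose clusters occupy orthogonal subspaces scores a larger rate reduction than the one where both clusters share a subspace, which is the between-cluster discrimination the bullets describe. Within each cluster, spreading samples across the subspace rather than collapsing them to a point keeps the overall rate high, which corresponds to within-cluster diversity.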

We conduct experiments on simulated data and CIFAR-10

  • Problem 1: Unsupervised Manifold Linearizing and Clustering: Simultaneously cluster samples and learn a linear representation for manifolds
  • Maximal Coding Rate Reduction (MCR²) designed to learn ideal representations in the supervised case
  • MCR² clustering objective proposed to solve Problem 1
  • Doubly Stochastic Subspace Clustering: Affinity matrix Π learned to cluster points
  • Manifold Linearizing and Clustering (MLC): MCR² clustering objective with doubly stochastic constraints on the affinity Π
  • Parameterizing Π via a Neural Network: Allows mini-batch stochastic gradient descent, keeping memory and computational cost manageable
  • Comparison with NMCE: MLC yields higher clustering accuracy than NMCE
  • Parameterizing Z_θ : Existing network architecture used as backbone
  • Parameterizing Π_θ : Inner product kernel of the output of the cluster head C_θ followed by a doubly stochastic projection
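
The last bullet can be illustrated with a short NumPy sketch: form the inner-product kernel of some cluster-head outputs, then push it toward a doubly stochastic matrix. This summary does not specify the projection, so the sketch swaps in plain Sinkhorn-Knopp row and column normalization of an exponentiated kernel as a stand-in. The random matrix `C`, the temperature `gamma`, and the iteration count are all hypothetical choices, not values from the paper.

```python
import numpy as np

def sinkhorn(K, n_iters=500):
    """Alternately normalize rows and columns of a positive matrix (Sinkhorn-Knopp)."""
    P = K.copy()
    for _ in range(n_iters):
        P /= P.sum(axis=1, keepdims=True)  # make each row sum to 1
        P /= P.sum(axis=0, keepdims=True)  # make each column sum to 1
    return P

rng = np.random.default_rng(0)

# Hypothetical cluster-head outputs: one unit-norm 8-dim column per sample (6 samples).
C = rng.normal(size=(8, 6))
C /= np.linalg.norm(C, axis=0, keepdims=True)

gamma = 1.0                   # assumed temperature, not from the paper
K = np.exp(C.T @ C / gamma)   # strictly positive inner-product kernel, shape (6, 6)
Pi = sinkhorn(K)              # approximately doubly stochastic affinity

print(Pi.sum(axis=0))
print(Pi.sum(axis=1))
```

Each entry of `Pi` can be read as a soft affinity between two samples. Because every row and column sums to (approximately) one, no single sample can dominate the affinity, which is one plausible motivation for the doubly stochastic constraint mentioned above.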