Abstract

  • Blind source separation (BSS) is the problem of recovering an unobserved signal from its mixture.
  • A framework is presented for analyzing violations of statistical prior assumptions and quantifying their impact on the recovery of the signal.
  • The behaviour of a generic BSS-solution is analysed in terms of explicit continuity guarantees with respect to an informative topology.
  • This approach allows for a flexible and convenient quantification of general model uncertainty scenarios.
  • Novel theoretical guarantees are demonstrated for a number of statistical applications.

Paper Content

Introduction

  • Problem of Blind Source Separation: Inversion of unknown function given image of argument
  • Range of applications: Medical imaging, neuroscience, biology, finance, astronomy, physics, engineering
  • Independent Component Analysis (ICA): Linear relation between observed mixture and hidden source, source modelled as multidimensional random variable with independent components
  • Example 1.1: Recover individual contribution of each speaker from recordings
  • Model assumptions: Linearity and independence of voices
  • Violations of assumptions: Interactive nature of conversation, non-linear relation between recordings and voices, voices not iid in time, physical propagation and recording subject to noise and corruption
  • Central question: How do violations of assumptions affect ICA performance?
  • Robustness theory: Flexible and comprehensive robustness theory for Blind Source Separation and Independent Component Analysis
  • Contributions: General and flexible topological notion of statistical robustness, ICA-topology from premetric topology on space of laws of stochastic processes, generic blind-inversion map with unified stability guarantees

The problem of blind source separation

  • Blind source separation is an inverse problem with the objective of recovering an unobserved signal from its mixture
  • The problem of blind source separation is asymmetric between the observable and its cause
  • The BSS paradigm upgrades the data from X to (X, I)
  • The maximal solution X_I, obtained from X given I, consists of the quasi-sources of X given I
  • The task of blind inversion seeks to find an identifiability assumption I that is strong enough to guarantee that the source S can be recovered from its mixture up to a prescribed accuracy
  • The Problem of Blind Source Separation consists of an abstract part and a procedural part
  • A solution (I, Φ) is robust if the maps (S, f) ↦ Φ(f(S)) and (S, f) ↦ dist(Φ(f(S)), S) are continuous on I

Independent component analysis.

  • The Problem of Independent Component Analysis (ICA) is a special instance of the generic BSS problem
  • ICA requires a source signal S and a linearity condition on the mixing (the mixing transformation lies in GL_d)
  • The accuracy of the ICA solution is quantified by a ‘deviance function’
  • Robust ICA is defined as a solution (I, Φ) to the ICA problem with a topology on the causal space C that supports the notion of stability (10)
  • The topology on C should be coarse enough to capture relevant violations of I
  • The topology on C is studied in the next section
  • The topology on C allows the BSS-robustness (17) of the ICA solution to be quantified
  • A number of application-relevant corollaries to this robustness are presented in Section 6
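
The summary does not reproduce the deviance function (15), so the following is only a stand-in that illustrates how recovery accuracy can be scored when the mixing matrix is identifiable at best up to scaling and permutation. It uses the classical Amari-type index, which is an assumption made here for illustration and not the paper's definition; the name `amari_index` is hypothetical.

```python
import numpy as np

def amari_index(B, A):
    """Amari-type index of P = B @ A (stand-in for a deviance function).

    For invertible P the index is zero exactly when P is a scaled
    permutation matrix, i.e. when B inverts A up to the usual ICA
    ambiguities. Larger values indicate a worse unmixing.
    """
    P = np.abs(np.asarray(B, float) @ np.asarray(A, float))
    d = P.shape[0]
    # For each row / column: mass outside the dominant entry, relative to it.
    row = (P / P.max(axis=1, keepdims=True)).sum(axis=1) - 1.0
    col = (P / P.max(axis=0, keepdims=True)).sum(axis=0) - 1.0
    return (row.sum() + col.sum()) / (2.0 * d * (d - 1))

# Hypothetical 2x2 mixing matrix and an unmixing matrix that is exact
# up to a sign flip, a rescaling and a swap of the two components.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B_good = np.array([[0.0, -2.0], [0.5, 0.0]]) @ np.linalg.inv(A)
score_good = amari_index(B_good, A)     # numerically zero
score_bad = amari_index(np.eye(2), A)   # strictly positive
```

A score of this kind is invariant under exactly the ambiguities that ICA cannot resolve, which is why it is a natural quantity to track when studying stability.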

Coordinatisable signals and their convergences

  • Introduce coordinates on M1 in the form of moment-like statistics
  • Regularity classes within Cd for signals to be contained
  • Subspace of Cd is an expressive and convenient class
  • Space D is left invariant under C1,1 action
  • Topological regularity properties of C1 make it a convenient sample space
  • Lemma 3.1 proves C1 is a Borel subset of (Cd, ‖·‖∞) and a separable Banach space
  • Discrete time models are special cases of formalism
  • Signature moments of υ coincide with classical multivariate moments

Signature moments.

  • Definition of coordinates of a signal in D
  • Key idea: each sample of signal has concise non-local description in terms of iterated-integral statistics
  • Signature coefficients of x combine to form signature of x
  • Signature characterises path of x
  • Signature-based description of path can be transferred from C1 to signals in D
  • Signature moments of signal in D introduced in Definition 3.6
  • Signature moments characterise signal uniquely
  • Signature moments capture pushforward-action on D
  • Signature moments share equivariance property with classical signal statistics
  • Signature moments capture non-local statistical effects
  • Signature moments carry more information per matrix than classical statistics
  • Signature moments provide excellent basis for constructing robustness topology
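
The iterated-integral statistics mentioned above can be made concrete for sampled data. The sketch below computes the first two signature levels of a piecewise-linear path and averages them over a collection of sample paths. It assumes that a signature moment is the expectation of a signature coefficient (Definition 3.6 is not reproduced in this summary), truncates at level 2, and uses hypothetical function names.

```python
import numpy as np

def signature_level_1_2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path: array of shape (n + 1, d) holding the sampled points.
    Level 1 is the total increment. Level 2 has entries
    S[i, j] = integral over s < t of dx^i_s dx^j_t.
    """
    path = np.asarray(path, float)
    inc = np.diff(path, axis=0)        # segment increments, shape (n, d)
    before = path[:-1] - path[0]       # increment accumulated before each segment
    level1 = path[-1] - path[0]
    # Cross terms between earlier and later segments, plus the
    # contribution 0.5 * inc (x) inc of each straight segment.
    level2 = before.T @ inc + 0.5 * inc.T @ inc
    return level1, level2

def signature_moments(paths):
    """Sample averages of the level-1 and level-2 signature coefficients."""
    sigs = [signature_level_1_2(p) for p in paths]
    m1 = np.mean([s[0] for s in sigs], axis=0)
    m2 = np.mean([s[1] for s in sigs], axis=0)
    return m1, m2

# Path that moves right, then up. The antisymmetric part of level 2
# is the Levy area, which records the order of the two moves.
l1, l2 = signature_level_1_2([[0, 0], [1, 0], [1, 1]])
levy_area = 0.5 * (l2[0, 1] - l2[1, 0])
```

The level-2 matrix is what makes the description non-local: its antisymmetric part depends on the order in which the components move, which no pointwise (single-time) statistic can detect.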

Convergence of signals.

  • Introduction of a natural notion of p-variation-graded weak convergence on D
  • Classical weak-convergence on D is at most gradually stronger than the topology of the p-variation-graded weak convergence
  • Different topologies on C1 may induce different sets of (bounded) continuous functions
  • Lemma 3.3 implies that the signal spaces are related to the notion of weak convergence
  • Canonically defines a unique topology on D called the p-weak topology
  • Any q-weak topology for 1 ≤ q < 2 is called a weak’-topology
  • The p-weak topology is stronger than the q-weak topology
  • The p-weak topology is gradually stronger than the classical weak topology
  • The norms ‖·‖_{p-var} and ‖·‖_∞ are asymptotically equivalent in the limit p → ∞
  • The classical weak topology is equivalent to the q-weak topology on D^p_K for any q > p
  • Real-world signal processing systems are subject to capacity constraints
  • Combining Sections 3.1 and 3.3 into an informative robustness topology
  • Linking the convergence of signals to the convergence of their moments
  • Imposing a growth condition to make integral statistics continuous
  • U is uniformly signature integrable if U is uniformly signature integrable of order 3
  • U is uniformly signature integrable of order m if its elements attain with uniformly lower probability
  • Moment coordinates are continuous with respect to any weak’-topology
  • U is core integrable if it satisfies (39) for m = 3

An ICA-tailored topology on causal space

  • Describe an informative and explicitly computable premetric topology on causal space C
  • Relates to the coarseness requirement
  • Supports the robustness for a generic ICA-solution
  • Systematic, divergence-like quantification of the ‘distance’ between two signals
  • Topological landscape of statistical dependence
  • Algebraically and analytically flexible
  • Sensitive to spatial (non)linear transformations and intrinsic statistical variations
  • Linked to the ICA-specific identifiability structure
  • Extends premetric signal topology to a coarse topology on the whole causal space C
  • Lower-order arrangement captures aspects of a signal most expressive of any linear action
  • Coredinate matrices describe statistical dependence between components of a signal
  • Distance function implemented by premetric topology
  • IC-defect of a mean-stationary signal
  • IC-defect of a non mean-stationary signal
  • Maximum mean discrepancy over finite set of test functions
  • IC-defect quantitatively controls blind identifiability
  • Moduli of continuity for IC-defect relate to premetric topology and uniform topology
  • Class I* of ICA-identifiable signals in D

Orthogonal signals.

  • General signals are not identifiable from their blind linear mixtures, but recovery is possible if components are mutually independent or pairwise uncorrelated.
  • Coordinates of such signals take a simple algebraic form.
  • A process is a product signal if its components are mutually independent.
  • Coordinates of a mean-stationary product signal are all diagonal.

A coarse topology on the space of causes.

  • Signal topology extended to an ‘identifiability’ topology on the causal space C
  • Robustness property (17) can be analysed
  • Topology supported on ‘regular’ subspaces E of C
  • Factor S endowed with the τ_δ-induced subspace topology
  • Premetric d defined on subspace (57)
  • Topology relates to the ‘identifiability controlling’ defect δ_⊥⊥ of a signal
  • Premetric space (E, d) is the domain for robustness (17)
  • Topology complemented into a topology on C
  • For any given (µ, f) ∈ B_r there exists a unique μ ∈ D
  • Lemma 4.13 shows how topology relates to defect δ_⊥⊥
  • Taking S = Ḋ, estimates (62) improve
  • Relation between source and observable can often be modelled as exactly linear

Robust independent component analysis

  • Introduce a quantifiably robust statistical procedure for recovering a matrix from its action on a nonorthogonal source signal
  • Based on the classical idea of jointly diagonalising a set of derived equivariant matrix statistics
  • Link inversion procedure to causal topology to obtain an identifiability theorem with stability bounds
  • Main result is a readily implementable new ICA-solution with strong and general robustness guarantees
  • ICA-Inversion from Coredinates: two linearly related signals ζ and χ in Ḋ
  • Problem of ICA is to recover the inverse A⁻¹ from the input χ
  • Coredinates of χ can be used to efficiently relate the action of A to some relevant statistical properties of ζ
  • Matrix R exists and is unique, with C_{R·χ} = R C_χ R = I
  • Central components for inversion procedure are the normalised statistics for θ ∈ GL_d
  • Statistics are scale-invariant for any χ ∈ D
  • Optimization: recover inverse of hidden mixing transformation as minimizers of a specially constructed cost function
  • Cost function is minimal over the approximate ‘joint diagonalisers’ of the matrices
  • Compactified subdomain for optimisation of φ_χ is the superset of O_d
  • Identifiability map returns invertible transformations for its input observables

Blind inversion for (non-)orthogonal sources.

  • Main goal is to prove blind inversion of signals is robust
  • Blind inversion is quantified by defect δ_⊥⊥(ζ)
  • Stability analysis of relation between Φ(χ) and true inverse A⁻¹
  • Constants and auxiliary functions introduced
  • Source signals lie within set R ≡ R(χ)
  • ε_0 := q_0/(1 + q_0) > 0 defined via (73)
  • c_1 := 2 d k_d r_0 and c_2 := √d κ_2(B_R) c_1
  • Theorem 5.8 is cornerstone for robustness result
  • Statement (88) guarantees that for each B ∈ Φ(χ) there is 1−δ for the deviance (15)
  • Proof of (88) combines standard arguments from matrix analysis
  • Remark 5.9 revisits usual case (68) with δ_⊥⊥(S) ≤ ε_0
  • Joint-diagonalisation approach outlined in Section E.2

A robust ICA-inversion.

  • Inversion map Φ from (84) can be extended to an ICA-solution (I, Φ)
  • ICA-solution is robust in the sense of (10) and Definition 2.10
  • ICA-solution is defined by (89) and (94)
  • The set I of identifiable causes belonging to the ICA-solution is far from empty
  • Inversion map (12) associated with M-estimator (92) parameterises subset of maximal solution (7)
  • Set I of identifiable causes is exhausted by regular sublevel sets (93) related to condition numbers of cause
  • Set I is exhausted by (94) with prior condition bound κ_0 < ∞
  • Theorem 5.12 asserts ICA-solution (I, Φ) is robust in sense of (10) and (17)
  • Theorem 5.12 provides explicit moduli for underlying continuities
  • Constants c_i = c_i(µ, A, c_S, κ_0) > 0, i = 0, 1, 2, 3, 4, can be read off in explicit form