Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Blind source separation (BSS) is a problem of recovering an unobserved signal from its mixture.
- A framework is presented for analyzing violations of statistical prior assumptions and quantifying their impact on the recovery of the signal.
- The behaviour of a generic BSS-solution is analysed in terms of explicit continuity guarantees with respect to an informative topology.
- This approach allows for a flexible and convenient quantification of general model uncertainty scenarios.
- Novel theoretical guarantees are demonstrated for a number of statistical applications.
Paper Content
Introduction
- Problem of Blind Source Separation: Inversion of unknown function given image of argument
- Range of applications: Medical imaging, neuroscience, biology, finance, astronomy, physics, engineering
- Independent Component Analysis (ICA): Linear relation between hidden relation and source, source modelled as multidimensional random variable with independent components
- Example 1.1: Recover individual contribution of each speaker from recordings
- Model assumptions: Linearity and independence of voices
- Violations of assumptions: Interactive nature of conversation, non-linear relation between recordings and voices, voices not iid in time, physical propagation and recording subject to noise and corruption
- Central question: How do violations of assumptions affect ICA performance?
- Robustness theory: Flexible and comprehensive robustness theory for Blind Source Separation and Independent Component Analysis
- Contributions: General and flexible topological notion of statistical robustness, ICA-topology from premetric topology on space of laws of stochastic processes, generic blind-inversion map with unified stability guarantees
The problem of blind source separation
- Blind source separation is an inverse problem with the objective of recovering an unobserved signal from its mixture
- The problem of blind source separation is asymmetric between the observable and its cause
- The BSS paradigm upgrades the data from X to (X, I)
- The maximal solution X I from X given I is the quasi sources of X given I
- The task of blind inversion seeks to find an identifiability assumption I that is strong enough to guarantee that the source S can be recovered from its mixture up to an accuracy of •
- The Problem of Blind Source Separation consists of an abstract part and a procedural part
- A solution (I, Φ) is robust if the maps (S, f) → Φ(f(S)) and (S, f) → dist Φ(f(S)), S are continuous on I
Independent component analysis.
- The Problem of Independent Component Analysis (ICA) is a special instance of a generic problem ( )
- ICA requires a source signal S and a linearity condition on the mixing (GL d )
- The accuracy of the ICA solution is quantified by a ‘deviance function’
- Robust ICA is defined as a solution (I , Φ) to the ICA problem with a topology on the causal space C that supports the notion of stability (10)
- The topology on C should be coarse enough to capture relevant violations of I
- The topology on C is studied in the next section
- The topology on C allows the BSS-robustness (17) of the ICA solution to be quantified
- A number of application-relevant corollaries to this robustness are presented in Section 6
Coordinatisable signals and their convergences
- Introduce coordinates on M1 in the form of moment-like statistics
- Regularity classes within Cd for signals to be contained
- Subspace of Cd is an expressive and convenient class
- Space D is left invariant under C1,1 action
- Topological regularity properties of C1 make it a convenient sample space
- Lemma 3.1 proves C1 is a Borel subset of (Cd, •∞) and a separable Banach space
- Discrete time models are special cases of formalism
- Signature moments of υ coincide with classical multivariate moments
Signature moments.
- Definition of coordinates of a signal in D
- Key idea: each sample of signal has concise non-local description in terms of iterated-integral statistics
- Signature coefficients of x combine to form signature of x
- Signature characterises path of x
- Signature-based description of path can be transferred from C1 to signals in D
- Signature moments of signal in D introduced in Definition 3.6
- Signature moments characterise signal uniquely
- Signature moments capture pushforward-action on D
- Signature moments share equivariance property with classical signal statistics
- Signature moments capture non-local statistical effects
- Signature moments carry more information per matrix than classical statistics
- Signature moments provide excellent basis for constructing robustness topology
Convergence of signals.
- Introduction of a natural notion of p-variation-graded weak convergence on D
- Classical weak-convergence on D is at most gradually stronger than the topology of the p-variation-graded weak convergence
- Different topologies on C1 may induce different sets of (bounded) continuous functions
- Lemma 3.3 implies that the signal spaces are related to the notion of weak convergence
- Canonically defines a unique topology on D called the p-weak topology
- Any q-weak topology for 1 ≤ q < 2 is called a weak’-topology
- The p-weak topology is stronger than the q-weak topology
- The p-weak topology is gradually stronger than the classical weak topology
- The norms • p-var and • ∞ are asymptotically equivalent in the limit p → ∞
- The classical weak topology is equivalent to the q-weak topology on D p K for any q > p
- Real-world signal processing systems are subject to capacity constraints
- Combining Sections 3.1 and 3.3 into an informative robustness topology
- Linking the convergence of signals to the convergence of their moments
- Imposing a growth condition to make integral statistics continuous
- U is uniformly signature integrable if U is uniformly signature integrable of order 3
- U is uniformly signature integrable of order m if its elements attain with uniformly lower probability
- Moment coordinates are continuous with respect to any weak’-topology
- U is core integrable if it satisfies (39) for m = 3
An ica-tailored topology on causal space
- Describe an informative and explicitly computable premetric topology on causal space C
- Relates to the coarseness requirement
- Supports the robustness for a generic ICA-solution
- Systematic, divergence-like quantification of the ‘distance’ between two signals
- Topological landscape of statistical dependence
- Algebraically and analytically flexible
- Sensitive to spatial (non)linear transformations and intrinsic statistical variations
- Linked to the ICA-specific identifiability structure
- Extends premetric signal topology to a coarse topology on the whole causal space C
- Lower-order arrangement captures aspects of a signal most expressive of any linear action
- Coredinate matrices describe statistical dependence between components of a signal
- Distance function implemented by premetric topology
- IC-defect of a mean-stationary signal
- IC-defect of a non mean-stationary signal
- Maximum mean discrepancy over finite set of test functions
- IC-defect quantitatively controls blind identifiability
- Moduli of continuity for IC-defect relate to premetric topology and uniform topology
- Class I* of ICA-identifiable signals in D
Orthogonal signals.
- General signals are not identifiable from their blind linear mixtures, but recovery is possible if components are mutually independent or pairwise uncorrelated.
- Coordinates of such signals take a simple algebraic form.
- A process is a product signal if its components are mutually independent.
- Coordinates of a mean-stationary product signal are all diagonal.
A coarse topology on the space of causes.
- Signal topology extended to an ‘identifiability’ topology on the causal space C
- Robustness property (17) can be analysed
- Topology supported on ‘regular’ subspaces E of C
- Factor S endowed with the τ δ -induced subspace topology
- Premetric d defined on subspace (57)
- Topology relates to the ‘identifiability controlling’ defect δ ⊥ ⊥ of a signal
- Premetric space (E, d) is the domain for robustness (17)
- Topology complemented into a topology on C
- For any given (µ, f ) ∈ B r there exists a unique μ ∈ D
- Lemma 4.13 shows how topology relates to defect δ ⊥ ⊥
- Taking S = Ḋ, estimates (62) improve
- Relation between source and observable can often be modelled as exactly linear
Robust independent component analysis
- Introduce a quantifiably robust statistical procedure for recovering a matrix from its action on a nonorthogonal source signal
- Based on the classical idea of jointly diagonalising a set of derived equivariant matrix statistics
- Link inversion procedure to causal topology to obtain an identifiability theorem with stability bounds
- Main result is a readily implementable new ICA-solution with strong and general robustness guarantees
- ICA-Inversion from Coredinates: two linearly related signals ζ and χ in Ḋ
- Problem of ICA is to recover the inverse A −1 from the input χ
- Coredinates of χ can be used to efficiently relate the action of A to some relevant statistical properties of ζ
- Matrix R exists and is unique, with C R•χ = RC χ R = I
- Central components for inversion procedure are the normalised statistics for θ ∈ GL d
- Statistics are scale-invariant for any χ ∈ D
- Optimization: recover inverse of hidden mixing transformation as minimizers of a specially constructed cost function
- Cost function is minimal over the approximate ‘joint diagonalisers’ of the matrices
- Compactified subdomain for optimisation of φ χ is the superset of O d
- Identifiability map returns invertible transformations for its input observables
Blind inversion for (non-)orthogonal sources.
- Main goal is to prove blind inversion of signals is robust
- Blind inversion is quantified by defect δ ⊥ ⊥ (ζ)
- Stability analysis of relation between Φ(χ) and true inverse A −1
- Constants and auxiliary functions introduced
- Source signals lie within set R ≡ R (χ)
- ε 0 := q 0 /(1 + q 0 ) > 0 defined via (73)
- c 1 := 2dk d r 0 and c 2 := √ dκ 2 (B R )c 1
- Theorem 5.8 is cornerstone for robustness result
- Statement (88) guarantees that for each B ∈ Φ(χ) there is 1−δ for the deviance (15)
- Proof of (88) combines standard arguments from matrix analysis
- Remark 5.9 revisits usual case (68) with δ ⊥ ⊥ (S) ≤ ε 0
- Joint-diagonalisation approach outlined in Section E.2
A robust ica-inversion.
- Inversion map Φ from (84) can be extended to an ICA-solution (I , Φ)
- ICA-solution is robust in the sense of (10) and Definition 2.10
- ICA-solution is defined by (89) and (94)
- ICA-solution is an identifiable cause that is far from empty
- Inversion map (12) associated with M-estimator (92) parameterises subset of maximal solution (7)
- Set I of identifiable causes is exhausted by regular sublevel sets (93) related to condition numbers of cause
- Set I is exhausted by (94) with prior condition bound κ 0 < ∞
- Theorem 5.12 asserts ICA-solution (I , Φ) is robust in sense of (10) and (17)
- Theorem 5.12 provides explicit moduli for underlying continuities
- Constants ci = ci (µ , A, c S , κ0 ) > 0, i = 0, 1, 2, 3, 4, can be read off in explicit form