Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Langevin diffusions are rapidly convergent under certain conditions.
Vemapala and Wibisono (2019) and Chewi et al. (2022) established results for log-Sobolev and Poincaré inequalities respectively.
This paper goes beyond Poincaré inequalities and establishes upper and lower bounds for Langevin diffusions and LMC under weak Poincaré inequalities.
Results explicitly quantify the effect of the initializer on the performance of the LMC algorithm.
Three-step phase transition is unavoidable, as demonstrated by lower bounds.

Problem of sampling from a target probability density using Langevin Monte Carlo (LMC) algorithm
LMC iterations based on discretizing a stochastic differential equation (SDE)
Non-asymptotic convergence of LMC studied extensively for log-concave and smooth targets
Functional inequality based conditions allow for multi-modality in target density
Logarithmic Sobolev inequality (LSI) and Lata la-Oleszkiewicz inequality (LOI) can cover range of tail behavior
Research program initiated by [VW19] to provide convergence guarantees for LMC when target density satisfies functional inequality and smoothness condition
[CEL + 22] extended framework and proved LSI can be replaced with LOI
[VW19] and [CEL + 22] provide state of the art guarantees under minimal set of conditions for LMC
Aim to complete program initiated by [VW19] and push convergence analysis of LMC to its limits
Study behavior of LMC for potentials that satisfy a family of weak-Poincaré inequalities (WPI)
WPI virtually any target density satisfies such an inequality
Establish non-asymptotic convergence guarantees for LMC and Langevin diffusion in Rényi divergence
Prove WPIs with explicit dimension dependence for two model examples of heavy-tailed distributions
Establish lower bounds for complexity of LMC and Langevin diffusion in Rényi divergence
Lower bounds indicate slow start behavior with worse dependence on initial divergence
Exponential dependence on initial error for LMC and diffusion unavoidable for Cauchy-type targets

We consider a class of functional inequalities introduced by [RW01]
[RW01] showed that a certain measure satisfies a WPI with certain parameters
The WPI reduces to a classical Poincaré inequality in certain cases
The tail properties of the distribution are captured by a function that determines the convergence rate of LMC
We present our convergence guarantees under the generic condition (WPI)
We use Rényi divergence as a measure of distance between two probability distributions
Rényi divergence is related to KL divergence, L ∞ -norm, and χ 2 divergence
A bound in Rényi divergence can be translated to a bound in W 2 under certain conditions

Convergence of Langevin diffusion is known under variance metric or χ2 divergence
Theorem 2 characterizes convergence in Rényi divergence
Initial error must satisfy Rq’ (ρ 0 π) < ∞ for some q’ > q
Classical convergence results require R∞ (ρ 0 π) < ∞
Proposition 3 converts WPI with Φ(•) = Osc(f ) 2 to Φ’ such that β’ (r) ≤ β WPI (r)
π cannot satisfy a WPI with Φ = Φ’u for u = 2

Sampling from Cauchy-type measures can be done with convergence guarantees for LMC in Rényi divergence of any finite order.
A potential is analyzed as a substitute for xα since the latter does not have continuous gradients.
Proposition 5 presents the β WPI estimate for this potential.
Corollary 6 establishes convergence guarantees for LMC and the Langevin diffusion.
Corollary 8 presents a rate for LMC and the Langevin diffusion for Cauchy-type measures.
Initializing with an isotropic Gaussian with an appropriately scaled variance can reduce the complexity of sampling from Cauchy-type measures.

LMC has worse dependence on initial divergence when target has heavier tails
Sharp transition at Cauchy-type logarithmic tails, in which case initial error becomes exponential
Method for developing lower bounds for LMC convergence rate
Notation of complexity introduced
Strategy for obtaining lower bounds when initial error is large
Variational representations of divergences used to obtain lower bounds

Assumption of α ∈ [0, 2] for sufficiently large x
Dependence on initial error deteriorates as α → 0
Three-step phase transition:
Lower bounds reproduce dependence of ∆ 0 known in Gaussian setting
Lower bounds for smooth ∇V with ∇V (0) = 0 and ∇V (x) x α−1 for large x
LMC exhibits slow start behavior in heavy-tailed settings
Step size needs to be small enough for discretization to not harm convergence
Corollary 12: exponential dependence on initial error unavoidable unless good initialization available

We provided convergence guarantees for LMC and Langevin diffusion for target distributions.
We obtained guarantees demonstrating that targets with heavier tails lead to a worse dependence on the initial error.
The dependence on initial error is a polynomial of order (2−α)2 2α for α > 0, with a phase transition at α = 0.
We established lower bounds under generic tail growth conditions that asserted such dependence on the initial error is unavoidable.
We left the stability of fixed step size LMC in the number of iterations under heavy-tailed targets as an open direction for future research.
We need to show Osc ρt π q/2 2 ≤ ρ0 π q L ∞ (π).
We provided WPI estimates for our model examples.
We used Lemma 15 to establish WPIs with suitable dimension dependencies.
We used Lemma 17 to bound the weights in Lemma 17.
We used Lemma 18 to obtain a WPI for π α.
We used Lemma 19 to lower bound the Rényi divergence between ρ and π.
We used Lemma 20 to lower bound the decay rate of the second moment for the Langevin diffusion and LMC.
We used Lemma 21 to control the Rényi or KL divergence using the variance of an isotropic Gaussian initialization.