Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Early-stopping strategies on gradient-descent methods for linear regression are studied for robustness to adversarial attacks.
  • Early-stopped GD is optimally robust against Euclidean-norm adversarial attacks.
  • GD can converge to non-robust models in the case of classification.
  • A GD scheme on a transformation of the data adapted to the attack is proposed to handle any Mahalanobis attack.
  • A simple and tractable estimator is designed with optimal adversarial risk.

Paper Content

Introduction

  • Machine learning models are vulnerable to small changes known as adversarial examples.
  • Adversarial training can reduce vulnerability, but safety-critical applications remain at risk.
  • Classification is well-understood, but regression is relatively understudied.
  • Xing et al. proposed a two-stage estimator and proved its consistency.
  • This paper considers linear regression under arbitrary norm attacks and analyzes the robustness of gradient-descent.
  • A variant of gradient-descent is proposed to understand the effect of early-stopping strategies.
  • A generic algorithm is designed to produce a robust predictor against any norm-attacks.

Summary of main contributions

  • Early-stopped GD achieves near-optimal adversarial risk in case of Euclidean attacks
  • Early-stopped GD can be arbitrarily sub-optimal in the non-Euclidean case
  • GD+ achieves near-optimal adversarial risk in the case of Mahalanobis norm attacks
  • GD+ is near-optimally robust under p-norm attacks when features are uncorrelated
  • Active area of research in theoretical understanding of adversarial examples
  • Good accuracy implies poor robustness
  • High-dimensional data distributions have concentration property
  • Tradeoffs in Gaussian mixture classification problems
  • Natural images are well-separated
  • Established results show gap between learning in ordinary and adversarial settings
  • Dynamics of linear classification on separable data
  • Tradeoffs between ordinary and adversarial risk in linear regression
  • Tradeoffs between interpolation, normal risk, and adversarial risk for finite-width overparameterized networks
  • Minimax estimation of linear models under adversarial attacks in Euclidean norm

Outline of manuscript

  • Problem setup and definitions presented in Section 2
  • Adversarial risk of GD in infinite-sample regime analyzed in Section 4
  • GD+ proposed in Section 5 and its adversarial risk studied
  • Two-stage estimator proposed in Section 6 with optimal adversarial risk and additive statistical estimation error due to finite samples

Preliminaries

  • [d] denotes the set of integers from 1 to d inclusive
  • The maximum/minimum of two real numbers a and b is denoted a ∨ b/a ∧ b
  • Operator norm of a matrix A is denoted A op
  • Mahalanobis norm of w induced by S is defined by w S := √ w Sw
  • M d (R) and S ++ d (R) denote the set of d × d matrices and positivedefinite matrices
  • Given p ∈ [1, ∞], its harmonic conjugate is denoted q(p)
  • Dual norm of a given norm • is denoted •
  • Unit-ball for a norm • is denoted B d •
  • Absolute constants c 0 , α and β are defined

Problem setup

  • Vector w 0 is in R d
  • Positive definite matrix Σ of size d
  • Dataset D n of size n
  • Features x i i.i.d ∼ N (0, Σ)
  • Labels y i i.i.d ∼ N (0, σ 2 )
  • Distribution of features is centered multivariate Gaussian with covariance matrix Σ
  • w 0 is generative model
  • σ ≥ 0 measures size of noise
  • n is sample size
  • d is input-dimension
  • Infinite-data regime n = ∞

Adversarial robustness risk

  • Linear model w ∈ R d can be attacked by swapping a clean test point with a corrupted version
  • Attack is constrained to be small, enforced by demanding δ ≤ r
  • Adversarial risk measures performance of linear model under attack
  • Aim is to minimize adversarial risk E (w, w 0 , r)
  • Lemma 2.1 states that for any w ∈ R d and r ≥ 0, it holds that E (w, w 0 , r) ≥ c 0 r2
  • Aim is to obtain robust predictor to adversarial attacks by minimizing adversarial risk

A proxy for adversarial risk

  • Lemma 3.1 allows us to replace the adversarial risk functional E with a more tractable proxy E.
  • By minimizing the proxy function, we can get roughly 90% optimality in the adversarial risk.

Gradient-descent for linear regression

  • Gradient descent can be used to minimize ordinary risk
  • Continuous-time gradient descent is an infinitely small step-size limit of discrete-time gradient descent
  • Early-stopped gradient descent predictors can be used to minimize adversarial risk

Euclidean attacks: almost-optimal robustness

  • Euclidean attacks can make the generative model w0 non-robust
  • As the attack strength increases, the adversarial risk of w0 becomes arbitrarily large
  • Applying a GD scheme until convergence may lead to predictors that are not robust to adversarial attacks
  • GD early-stopped at time t has the same adversarial risk as a ridge estimator with regularization parameter λ ∝ 1/t
  • In the isotropic case, GD achieves the exact optimal adversarial risk

Mahalanobis attacks: sub-optimality of gradient-descent

  • GD fails to be adversarially robust under Mahalanobis attacks
  • Proposed modified version of GD can handle Mahalanobis attacks and more general ones

An adapted gradient-descent: gd+

  • GD+ is a modified version of the GD scheme
  • GD+ is an almost optimally robust predictor for Mahalanobis attacks
  • GD+ applies feature-dependent gradient steps determined by M
  • Early-stopped GD+ is optimally robust up to an absolute constant
  • GD+ is able to obtain near-optimality under B-norm attack

Robustness under general attacks

  • Goal is to provide control of adversarial risk of GD+
  • Define condition number of matrix w.r.t attacker’s norm
  • Show general upper-bound of modified GD scheme
  • Special case recovers result of Proposition 5.1
  • Minimizing quantity is hard due to arbitrary choice of norm

A sufficient condition for optimality

  • Consider the general case of an arbitrary norm-attack with dual
  • Focus on a specific path induced GD+
  • Data is normalized and the path is a uniform shrinkage of the generative model
  • Optimal adversarial risk is given by a condition
  • Subgradient of the condition can be chosen
  • GD+ is near-optimal for any p-norm attack when the norms are aligned
  • A two-stage estimator is proposed to reach near-optimality for general attacks
  • An illustration of the result is provided in Figure 4

Efficient algorithms for attacks in general norms

  • Propose a simple tractable estimator with optimal adversarial risk
  • Drop assumption of infinite training data
  • Estimators are functions of finite-training dataset

A two-stage estimator and its statistical analysis

  • Algorithm 1 is a two-stage estimator for minimizing the adversarial risk proxy.
  • Stage 1 of Algorithm 1 computes consistent estimators for w0 and Σ from the data Dn.

Consistency of proposed two-stage estimator

  • Algorithm 1 is a two-stage estimator for a generative model w0
  • Theorem 6.1 states that Algorithm 1 is robust-optimal up to a multiplicative factor and an additive term
  • If the covariance matrix Σ is known, the statistical error of Algorithm 1 is dominated by the error associated with estimating the generative model w0
  • The error term is further bounded by σ2s log(ed/s) n
  • Theorem 6.1 recovers the adversarial risk-consistency result established in (Xing et al., 2021) as a special case

Algorithm 1: primal-dual algorithm

  • Algorithm 1 is proposed to compute the second stage of an estimator
  • Algorithm 2 is a Primal-Dual algorithm which implements Stage 2 of Algorithm 1
  • Algorithm 3 is a Non-uniform soft-thresholding which implements Stage 2 of Algorithm 1 for diagonal covariance matrix under ∞-attacks
  • Algorithm 2 converges to a stationary point at an ergodic rate O(1/t)
  • Algorithm 3 suppresses weak components of w0 and boosts strong components
  • Algorithm 3 returns w(t) which minimizes the adversarial risk of w(t)

Concluding remarks

  • Gradient-descent (GD) can achieve optimal adversarial risk up to an absolute constant.
  • GD+ with feature-dependent learning rates can succeed to achieve the optimal adversarial risk.
  • The covariance structure of the features and the norm used to measure the strength of the attack affect GD and GD+.
  • A two-stage estimator is proposed which achieves optimal adversarial risk in the population regime, up to within a constant factor.
  • The proposed estimator adapts to attacks with general norms and to the covariance structure of the features.
  • Experiments verify theoretical results, showing that GD is optimally robust and GD+ can reach the optimal adversarial risk.
  • The uniform shrinkage strategy is optimally robust when the Condition 5.1 is satisfied.
  • The proposed algorithm is compared to the Two-Stage Estimator Proposed in Xing et al. (2021).
  • Adversarial risk improves with number of samples.
  • Lemma 2.1 and Lemma 3.1 are proved.
  • Proposition 4.2 and Proposition 4.3 are proved.
  • Proposition C.1 is proved.