Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Proposed a novel Bayesian inference framework for distributed differentially private linear regression
Data is split between multiple parties who share summary statistics in a privacy-preserving way
Developed a novel generative statistical model for privately shared statistics
Bayesian estimation of regression coefficients is conducted using Markov chain Monte Carlo algorithms
Also provided a fast version to perform Bayesian estimation in one iteration
Proposed methods have computational advantages over competitors
Numerical results on real and simulated data demonstrate well-rounded estimation and prediction

Paper Content

Introduction

Linear regression is a fundamental statistical method for modelling the dependency of a response on a set of features
Many researchers have been working on linear regression since the 19th century
Differential privacy is arguably the most widely used formal definition of data privacy
There is a growing interest in differentially private linear regression
General-purpose Bayesian differentially private estimation methods can be used in regression problems
Hierarchical model for privatised data and Bayesian estimation for the model parameters
Differential privacy mechanisms for posterior sampling and linear regression
General-purpose differentially private Markov chain Monte Carlo (MCMC) algorithms can be applied to regression
Perturbing polynomial objective functions with privacy-preserving noise
Perturbation of summary statistics
Point estimation of the linear regression parameters
Confidence intervals for the coefficients of linear regression
Rates of convergence for parameter estimation with differential privacy
Distributed setting where the total dataset is shared among multiple parties
Adding noise to summary statistics of linear regression
Fast Bayesian estimation methods
MCMC algorithms for iterative sampling from posterior distributions

Differential privacy

Differential privacy is a property of a randomised algorithm that takes in sensitive data and returns a random output.
The amount of privacy is determined by two parameters, ε and δ; smaller values mean stronger privacy.
Noise-adding mechanisms are used to preserve privacy, with the Gaussian mechanism being a popular one.
This paper focuses on (ε, δ)-DP and the Gaussian mechanism to generate noisy observations.
The paper presents a hierarchical model for differentially private distributed linear regression.
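
To make the Gaussian mechanism concrete, here is a minimal sketch in Python. It uses the textbook calibration σ = Δ·sqrt(2 ln(1.25/δ))/ε, which holds for 0 < ε < 1; the paper may calibrate its noise differently. The function name `gaussian_mechanism` and the argument `l2_sensitivity` are illustrative and do not come from the paper.

```python
import numpy as np

def gaussian_mechanism(value, l2_sensitivity, epsilon, delta, rng):
    """Add Gaussian noise to `value` for (epsilon, delta)-DP.

    Assumption: classical calibration, valid only for 0 < epsilon < 1.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("need 0 < epsilon < 1 and 0 < delta < 1")
    # Noise standard deviation grows with the sensitivity and shrinks with epsilon.
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    value = np.asarray(value, dtype=float)
    noisy = value + rng.normal(0.0, sigma, size=value.shape)
    return noisy, sigma

rng = np.random.default_rng(0)
noisy, sigma = gaussian_mechanism([1.0, 2.0], l2_sensitivity=1.0,
                                  epsilon=0.5, delta=1e-5, rng=rng)
```

The sensitivity has to be bounded for the guarantee to hold, which in practice means the data must be bounded or clipped before the statistic is computed.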

Basic model and privacy setup

We have a sequence of random variables (x_i, y_i)
We consider normal linear regression to model the dependency between x_i and y_i
We assume the feature vectors x_i are i.i.d. with a normal distribution
We define summary statistics of X and y, namely S = XᵀX and z = Xᵀy
We assume a setup where S and z are privately released
We set up a hierarchical model to enable Bayesian inference of θ
We use the exact conditional distribution p(z | S, θ, σ²)
Our model has a different hierarchical structure and requires less privacy-preserving noise
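
The exact conditional distribution mentioned above can be derived in a few lines from the normal linear model. The derivation below is a sketch under stated assumptions: S = XᵀX and z = Xᵀy are taken as the summary statistics, σ_y² denotes the regression noise variance (written σ² above), and σ_z² is an assumed symbol for the variance of the Gaussian privacy noise added to z.

```latex
\begin{align*}
y &= X\theta + e, \qquad e \sim \mathcal{N}(0, \sigma_y^2 I_n), \\
z &= X^\top y = S\theta + X^\top e, \qquad S = X^\top X, \\
X^\top e \mid X &\sim \mathcal{N}(0, \sigma_y^2 S), \\
z \mid S, \theta, \sigma_y^2 &\sim \mathcal{N}(S\theta, \sigma_y^2 S), \\
\hat z = z + v, \; v \sim \mathcal{N}(0, \sigma_z^2 I_d)
  \;&\Longrightarrow\;
  \hat z \mid S, \theta, \sigma_y^2 \sim \mathcal{N}(S\theta, \sigma_y^2 S + \sigma_z^2 I_d).
\end{align*}
```

Because the covariance of z given S is known exactly, the model does not need a normal approximation for z.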

Distributed setting

Model extended to distributed setting
Data shared among J ≥ 1 data holders
Each data holder shares summary statistics with privacy-preserving noise
Hierarchical structure of model specified for normally distributed x_i's
Node-specific observations more informative on θ than aggregate versions
Partitioning of data relevant to data privacy applications outside distributed learning framework
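
The sharing step can be sketched as follows, assuming each data holder releases S_j = X_jᵀX_j and z_j = X_jᵀy_j with Gaussian noise. The noise standard deviations `sigma_s` and `sigma_z` are taken as given here (in practice they would come from the privacy calibration), and the function names are hypothetical.

```python
import numpy as np

def party_release(X_j, y_j, sigma_s, sigma_z, rng):
    """One data holder's noisy summary statistics (illustrative sketch)."""
    d = X_j.shape[1]
    G = rng.normal(0.0, sigma_s, size=(d, d))
    # Symmetric noise matrix, so the released S_hat stays symmetric.
    M = np.triu(G) + np.triu(G, k=1).T
    S_hat = X_j.T @ X_j + M
    z_hat = X_j.T @ y_j + rng.normal(0.0, sigma_z, size=d)
    return S_hat, z_hat

def distributed_release(X, y, J, sigma_s, sigma_z, rng):
    """Split the rows among J holders at random and collect their releases."""
    shards = np.array_split(rng.permutation(len(y)), J)
    return [party_release(X[idx], y[idx], sigma_s, sigma_z, rng) for idx in shards]

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = X @ np.array([1.0, -1.0]) + 0.1 * rng.normal(size=30)
releases = distributed_release(X, y, J=3, sigma_s=0.5, sigma_z=0.5, rng=rng)
```

Summing the J releases would give an aggregate whose noise variance is J times that of a single release, which is one way to read the remark that node-specific observations are more informative on θ than aggregates.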

Algorithms for Bayesian inference

Bayesian inference targets the posterior distribution of latent variables
Present several Bayesian inference algorithms for hierarchical model
Two cases considered: normal and non-normal P_x
MCMC algorithm and closed form solution for posterior of θ developed

Normally distributed features

MCMC algorithm presented for Bayesian inference for differentially private distributed linear regression model
Latent variables involved: θ, Σ_x, σ_y², S_{1:J}, z_{1:J}
Poor convergence due to high posterior correlation between θ and z_{1:J}
Reduced model with θ, Σ_x, σ_y² as latent variables
Closed-form full conditional distributions for θ and Σ_x
Metropolis-Hastings moves to update S_{1:J} and σ_y²
Wishart distribution used to update S_j
Adaptive MCMC framework used to target acceptance rate of 0.2
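
The idea of tuning a proposal to hit a target acceptance rate can be illustrated with a generic random-walk Metropolis sampler. This is a sketch of a standard Robbins-Monro scale adaptation, not the paper's update for S_j or σ_y²; the function name, the 1/sqrt(t) step size and the toy Gaussian target are all assumptions made for illustration.

```python
import numpy as np

def adaptive_rw_metropolis(log_target, x0, n_iter, target_rate=0.2, rng=None):
    """Random-walk Metropolis whose proposal scale adapts to `target_rate`."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    log_p = log_target(x)
    log_scale = 0.0
    samples = np.empty((n_iter, x.size))
    n_accept = 0
    for t in range(1, n_iter + 1):
        proposal = x + np.exp(log_scale) * rng.standard_normal(x.size)
        log_p_prop = log_target(proposal)
        accept_prob = np.exp(min(0.0, log_p_prop - log_p))
        if rng.uniform() < accept_prob:
            x, log_p = proposal, log_p_prop
            n_accept += 1
        # Grow the scale when accepting too often, shrink it otherwise;
        # the step size decays so the adaptation dies out over time.
        log_scale += (accept_prob - target_rate) / np.sqrt(t)
        samples[t - 1] = x
    return samples, n_accept / n_iter

samples, rate = adaptive_rw_metropolis(
    lambda x: -0.5 * np.sum(x ** 2),  # toy target: standard normal
    x0=np.zeros(2), n_iter=5000, rng=np.random.default_rng(1))
```

Adapting the logarithm of the scale keeps the proposal standard deviation positive without any extra constraint.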

Features with a general distribution

Normality assumption for x_i's may not be adequate for some data sets
Updating S_j's can be a bottleneck in terms of computation time and convergence
Algorithms provide accurate estimations even for normally distributed features
Estimate S_j's from the beginning and fix them during inference procedure
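
Once the S_j's are fixed, a Gaussian prior on θ gives a Gaussian posterior in one pass. The sketch below assumes that each noisy ẑ_j is distributed as N(S_j θ, σ_y² S_j + σ_z² I), that the prior is θ ~ N(0, C), and that σ_y² is also held fixed. The paper's fast method may treat σ_y² and the estimate of S_j differently. The eigenvalue clipping in `nearest_psd` is one simple way to obtain a valid S_j from a noisy release and is an assumption here, as are all function names.

```python
import numpy as np

def nearest_psd(A):
    """Symmetrise A and clip negative eigenvalues to zero."""
    A = 0.5 * (A + A.T)
    w, Q = np.linalg.eigh(A)
    return (Q * np.clip(w, 0.0, None)) @ Q.T

def theta_posterior_fixed_S(S_hats, z_hats, sigma_y2, sigma_z2, prior_cov):
    """Posterior mean and covariance of theta with the S_j's held fixed."""
    d = prior_cov.shape[0]
    precision = np.linalg.inv(prior_cov)
    shift = np.zeros(d)
    for S_hat, z_hat in zip(S_hats, z_hats):
        S = nearest_psd(S_hat)
        V = sigma_y2 * S + sigma_z2 * np.eye(d)   # covariance of z_hat given theta
        precision += S.T @ np.linalg.solve(V, S)  # S' V^{-1} S
        shift += S.T @ np.linalg.solve(V, z_hat)  # S' V^{-1} z_hat
    cov = np.linalg.inv(precision)
    return cov @ shift, cov
```

With a single party and no privacy noise, this reduces to the usual conjugate posterior for Bayesian linear regression with known noise variance, which is a convenient sanity check.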

Extensions

Variants of methodology mentioned in Appendix B
Average feature vectors in X and corresponding response variables in y to make them approximately normal
Details of this approach in Appendix B.1
If the features are normally distributed but the data are not centred, an intercept parameter must be included and the hierarchical model modified accordingly (Appendix B.2)
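
The averaging idea can be sketched as below. The group size `k`, the function name, and the choice to drop leftover rows are assumptions; the details of Appendix B.1 (for example any rescaling of the averages) are not reproduced here. Averaging preserves the linear relation, since the mean of k responses equals the mean feature vector times θ plus averaged noise with variance σ_y²/k.

```python
import numpy as np

def average_in_groups(X, y, k):
    """Average consecutive groups of k rows of X and the matching entries of y."""
    n = (len(y) // k) * k  # drop rows that do not fill a whole group
    X_bar = X[:n].reshape(-1, k, X.shape[1]).mean(axis=1)
    y_bar = y[:n].reshape(-1, k).mean(axis=1)
    return X_bar, y_bar

X = np.arange(12, dtype=float).reshape(6, 2)
y = np.arange(6, dtype=float)
X_bar, y_bar = average_in_groups(X, y, k=2)
```

Larger groups make the averaged features closer to normal by the central limit theorem, at the price of fewer effective observations.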

Numerical experiments

MCMC-normalX, MCMC-fixedS, and Bayes-fixedS-fast algorithms are evaluated numerically
Compared to adaSSP of Wang (2018) and MCMC method of Bernstein and Sheldon (2019)
Extensions of adaSSP and MCMC-B&S for J ≥ 1 implemented
Model in Bernstein and Sheldon (2019) generalised for J ≥ 1
Code to replicate experiments available at given URL

Experiments with simulated data

Considered two different configurations for problem size
Generated data with certain parameters
Used same parameters for inference
Evaluated methods at different combinations of J and ε
Ran MCMC algorithms for 10,000 iterations
Looked at mean squared errors of estimates and predictions
MCMC-fixedS and Bayes-fixedS-fast outperformed adaSSP and MCMC-B&S
MCMC-normalX better at d = 2, MCMC-B&S better at d = 5
All methods improved as ε grows
Compared computation times of MCMC algorithms
MCMC-B&S slowed down by its O(d^6) complexity

Experiments with real data

Used four different data sets from UCI Machine Learning Repository
Disregarded columns with string data or key values
Took the rightmost column as the response y
80% of data used for training, 20% for testing
Average prediction performances presented in Table 1
MCMC-fixedS and Bayes-fixedS-fast most stable
MCMC-fixedS and Bayes-fixedS-fast beat adaSSP and MCMC-B&S when J > 1

Conclusion

Propose a novel Bayesian inference framework for a differentially private distributed linear regression setting
Exploit the conditional structure between the summary statistics of linear regression
Numerical experiments show proposed methods are competitive with state-of-the-art alternatives
Room for improvement of MCMC-normalX
Full Conditional Distribution of Σ_x and θ
Acceptance Ratio for the MH Update of S_j and σ_y²
Extensions mentioned in Section 4.4 indicate potential future directions
Extension of Bernstein and Sheldon (2019) suited to observations
Model includes b = x, Σx, and S0
Extension of adaSSP (Wang, 2018) for J ≥ 1
Calculate D × 1 mean vector and D × D covariance matrix
