Differential privacy is a cryptographically motivated approach to privacy that has become a very active field of research over the last decade in theoretical computer science and machine learning. In this paradigm we assume there is a trusted curator who holds the data of individuals in a database, and the goal of privacy is to protect individual data while still allowing statistical analysis of the database as a whole. In this setting we introduce a general framework for parametric inference with differential privacy guarantees. We first obtain differentially private estimators based on bounded-influence M-estimators, using their gross-error sensitivity to calibrate a noise term that is added to them in order to ensure privacy. We then show how a similar construction yields differentially private test statistics analogous to the Wald, score and likelihood ratio tests. We provide statistical guarantees for all our proposals via an asymptotic analysis. An interesting consequence of our results is to further clarify the connection between differential privacy and robust statistics. In particular, we demonstrate that differential privacy is a weaker requirement than infinitesimal robustness, and we show that robust M-estimators can be easily randomized in order to guarantee both differential privacy and robustness to contaminated data. We illustrate our results on both simulated and real data.
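As a rough illustration of the noise-calibration idea described in the abstract, the following Python sketch randomizes a Huber location M-estimator by adding Gaussian noise whose scale is proportional to an empirical gross-error sensitivity divided by the sample size. Everything specific here is an assumption made for illustration and is not taken from the talk: the function names, the Huber tuning constant c = 1.345, the known unit scale of the data, and in particular the noise constant sqrt(2 log(1.25/delta)) / epsilon, which is borrowed from the standard Gaussian mechanism. The sketch does not by itself establish a privacy guarantee; the calibration that does is the subject of the talk.

```python
import numpy as np


def huber_psi(r, c):
    """Huber score function: identity near zero, clipped at +/- c."""
    return np.clip(r, -c, c)


def huber_location(x, c=1.345, tol=1e-10, max_iter=200):
    """Huber location M-estimate, assuming the data have known unit scale.

    Solves mean(psi(x - theta)) = 0 by a simple fixed-point iteration
    started at the sample median.
    """
    theta = float(np.median(x))
    for _ in range(max_iter):
        step = float(np.mean(huber_psi(x - theta, c)))
        theta += step
        if abs(step) < tol:
            break
    return theta


def empirical_gross_error_sensitivity(x, theta, c=1.345):
    """Empirical sup-norm of the influence function of the Huber estimator.

    The influence function is psi(x - theta) / E[psi'(x - theta)], so its
    supremum is c divided by the fraction of unclipped residuals.
    """
    frac_unclipped = np.mean(np.abs(x - theta) <= c)
    # Guard against division by zero in degenerate samples.
    return c / max(frac_unclipped, 1.0 / len(x))


def private_huber_location(x, epsilon, delta, c=1.345, rng=None):
    """Huber estimate plus Gaussian noise scaled by gamma / n.

    The constant sqrt(2 log(1.25 / delta)) / epsilon is an illustrative
    assumption (standard Gaussian-mechanism form), not the talk's calibration.
    Returns the noisy estimate and the empirical gross-error sensitivity.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    theta = huber_location(x, c)
    gamma = empirical_gross_error_sensitivity(x, theta, c)
    sigma = gamma * np.sqrt(2.0 * np.log(1.25 / delta)) / (epsilon * len(x))
    return theta + sigma * rng.standard_normal(), gamma
```

Because the noise scale shrinks at rate 1/n while the bounded influence function caps the effect of any single observation, the same estimator remains stable when a small fraction of the sample is replaced by outliers, which is the link between privacy and robustness that the abstract highlights.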
Marco Avella-Medina is an assistant professor in the Department of Statistics at Columbia University. His research interests include robust statistics, high-dimensional statistics and statistical machine learning. Before joining Columbia he was a postdoctoral fellow at MIT, where he was affiliated with the Sloan School of Management and the Statistics and Data Science Center. He holds a B.A. degree in Economics, an M.Sc. degree in Statistics and a Ph.D. in Statistics, all from the University of Geneva, Switzerland.