Name: Model Selection for Machine Learning Estimation of Doubly Robust Functionals
Start: 2019-10-31T19:30:00
End: 2019-10-31T20:30:00

Submitted by lfb109 on Thu, 10/24/2019 - 13:51

Model Selection for Machine Learning Estimation of Doubly Robust Functionals

Presented By

Eric J. Tchetgen Tchetgen, The Wharton School, University of Pennsylvania

Details

Start Date: Thu, Oct 31, 2019, 3:30 PM

End Date: Thu, Oct 31, 2019, 4:30 PM

Location

201 Thomas Building

While model selection is a well-studied topic in parametric and nonparametric regression and density estimation, model selection of possibly high-dimensional nuisance parameters in semiparametric problems is far less developed. This paper proposes a new model selection framework for making inferences about a finite-dimensional functional defined on a semiparametric model when the latter admits a doubly robust estimating function. The class of such doubly robust functionals is quite large and includes estimation of pathwise differentiable functionals when data are missing at random and in causal inference problems under unconfoundedness conditions. Under double robustness, the estimated functional should incur no bias if either of two nuisance parameters is evaluated at the truth while the other spans a large collection of possibly incorrect candidate models. Our approach introduces a novel minimax pseudo-risk criterion for the functional of primary interest that embodies this double robustness property and thus may be used to select the candidate model that is nearest to fulfilling this property even when all models are wrong. We establish an oracle property for a multi-fold cross-validation scheme of the new model selection criterion, which states that our empirical criterion performs nearly as well as that of an oracle with a priori knowledge of the pseudo-risk for each candidate model. We also describe a smooth approximation to the selection criterion which allows for valid post-selection inference. Finally, we apply the approach to perform model selection of a semiparametric estimator of the average treatment effect, given an ensemble of candidate machine learning methods to account for confounding, in a study of right heart catheterization in the intensive care unit of critically ill patients.
This is joint work with Yifan Cui.
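
For readers unfamiliar with double robustness, the property mentioned in the abstract can be illustrated with the standard augmented inverse probability weighted (AIPW) estimator of the average treatment effect. The sketch below is only an illustration under assumed conditions: the simulated data-generating process, the function names (aipw_ate, simulate), and the deliberately wrong nuisance models are all hypothetical, and it does not implement the minimax pseudo-risk selection criterion presented in the talk.

```python
import numpy as np


def aipw_ate(y, a, pi_hat, mu1_hat, mu0_hat):
    """Standard AIPW (doubly robust) estimate of the average treatment effect.

    pi_hat  : fitted propensity scores P(A = 1 | X)
    mu1_hat : fitted outcome regression E[Y | A = 1, X]
    mu0_hat : fitted outcome regression E[Y | A = 0, X]
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    treated = mu1_hat + a * (y - mu1_hat) / pi_hat
    control = mu0_hat + (1.0 - a) * (y - mu0_hat) / (1.0 - pi_hat)
    return float(np.mean(treated - control))


def simulate(n=200_000, seed=0):
    """Hypothetical data: one confounder x, true treatment effect equal to 2."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    pi_true = 1.0 / (1.0 + np.exp(-x))      # true propensity score
    a = rng.binomial(1, pi_true)
    y = 2.0 * a + x + rng.normal(size=n)    # outcome confounded through x
    return x, a, y, pi_true


x, a, y, pi_true = simulate()

# Correct nuisance functions (known here only because the data are simulated).
mu1_true, mu0_true = 2.0 + x, x

# Deliberately misspecified nuisance functions that ignore the confounder.
pi_wrong = np.full_like(x, 0.5)
mu1_wrong = np.zeros_like(x)
mu0_wrong = np.zeros_like(x)

ate_outcome_right = aipw_ate(y, a, pi_wrong, mu1_true, mu0_true)
ate_propensity_right = aipw_ate(y, a, pi_true, mu1_wrong, mu0_wrong)
ate_both_wrong = aipw_ate(y, a, pi_wrong, mu1_wrong, mu0_wrong)

print("outcome model correct only:   ", round(ate_outcome_right, 3))
print("propensity model correct only:", round(ate_propensity_right, 3))
print("both models wrong:            ", round(ate_both_wrong, 3))
```

In this simulation the first two estimates should land close to the true effect of 2, while the estimate with both nuisance models wrong should be visibly biased. That third case, where every candidate model may be wrong, is the setting the talk's selection criterion is designed for.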