How many samples are needed to train a deep neural network?

Abstract:

Even though neural networks have become standard tools in many areas, many important statistical questions remain open. This paper studies the question of how much data is needed to train a ReLU feed-forward neural network. Our theoretical and empirical results suggest that the generalization error of ReLU feed-forward neural networks scales at the rate 1/√n in the sample size n, rather than at the "parametric rate" 1/n that traditional statistical theories might suggest. Thus, broadly speaking, our results underpin the common belief that neural networks need "many" training samples. Along the way, we also establish new technical insights, such as the first lower bounds on the entropy of ReLU feed-forward networks.
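
As a rough illustration of the claimed scaling (not the paper's experiments), the sketch below trains small ReLU networks on a synthetic regression task for several sample sizes n and estimates the empirical decay exponent of the test error: an exponent near -0.5 matches the 1/√n rate, while -1 would match the parametric 1/n rate. The data-generating function, network width, and use of scikit-learn's MLPRegressor are assumptions made for illustration only.

# Minimal sketch: estimate how fast test error decays with sample size n
# for small ReLU feed-forward networks (illustrative assumptions throughout).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def make_data(n, d=10, noise=0.1):
    # Hypothetical data-generating process: a simple ReLU-type target plus noise.
    X = rng.normal(size=(n, d))
    y = np.maximum(X[:, 0] + 0.5 * X[:, 1], 0.0) + noise * rng.normal(size=n)
    return X, y

sample_sizes = [200, 400, 800, 1600, 3200]
# Noiseless test targets so the measured error approximates the excess risk.
X_test, y_test = make_data(20000, noise=0.0)

errors = []
for n in sample_sizes:
    X_train, y_train = make_data(n)
    net = MLPRegressor(hidden_layer_sizes=(64, 64), activation="relu",
                       max_iter=2000, random_state=0)
    net.fit(X_train, y_train)
    errors.append(np.mean((net.predict(X_test) - y_test) ** 2))

# Fit log(error) ~ a + b * log(n); the slope b is the empirical decay exponent.
b = np.polyfit(np.log(sample_sizes), np.log(errors), 1)[0]
print(f"estimated decay exponent: {b:.2f}")

The fitted exponent b can then be compared against -0.5 (the 1/√n rate) and -1 (the 1/n rate); its exact value will depend on the architecture, optimizer, and noise level, so this is only a qualitative check of the scaling behavior.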
 

 

Presented By: Pegah Golestaneh