The role of over-parametrization in generalization of neural networks

Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, Nathan Srebro

Research output: Contribution to conference › Paper

Abstract

Despite existing work on ensuring generalization of neural networks in terms of scale-sensitive complexity measures, such as norms, margin, and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities, resulting in a tighter generalization bound for two-layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes (within the range reported in the experiments), and could partly explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.
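For reference, the two technical objects named in the abstract have standard textbook definitions; the following is a minimal sketch in our own notation, not quoted from the paper (the paper's unit-wise capacity measure itself is not reproduced here). A two-layer ReLU network of hidden width h computes

\[ f_{W,v}(x) \;=\; \sum_{j=1}^{h} v_j \,\big[\langle w_j, x \rangle\big]_+ , \qquad [z]_+ = \max(z, 0), \]

and the empirical Rademacher complexity of a function class \(\mathcal{F}\) on a sample \(S = (x_1, \dots, x_n)\), with i.i.d. uniform random signs \(\sigma_i \in \{\pm 1\}\), is

\[ \widehat{\mathfrak{R}}_S(\mathcal{F}) \;=\; \mathbb{E}_{\sigma}\Big[ \sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i) \Big]. \]

Bounds of the kind described in the abstract control the gap between training and test error via this quantity (up to margin and confidence terms); the paper's claim is a tighter such upper bound for two-layer ReLU networks based on unit-wise capacities, together with a matching Rademacher complexity lower bound.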

Original language: English (US)
State: Published - Jan 1, 2019
Event: 7th International Conference on Learning Representations, ICLR 2019 - New Orleans, United States
Duration: May 6, 2019 - May 9, 2019

Conference

Conference: 7th International Conference on Learning Representations, ICLR 2019
Country: United States
City: New Orleans
Period: 5/6/19 - 5/9/19

ASJC Scopus subject areas

  • Education
  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Cite this

Neyshabur, B., Li, Z., Bhojanapalli, S., LeCun, Y., & Srebro, N. (2019). The role of over-parametrization in generalization of neural networks. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.

The role of over-parametrization in generalization of neural networks. / Neyshabur, Behnam; Li, Zhiyuan; Bhojanapalli, Srinadh; LeCun, Yann; Srebro, Nathan.

2019. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.

Research output: Contribution to conference › Paper

Neyshabur, B, Li, Z, Bhojanapalli, S, LeCun, Y & Srebro, N 2019, 'The role of over-parametrization in generalization of neural networks', Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States, 5/6/19 - 5/9/19.
Neyshabur B, Li Z, Bhojanapalli S, LeCun Y, Srebro N. The role of over-parametrization in generalization of neural networks. 2019. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.
Neyshabur, Behnam ; Li, Zhiyuan ; Bhojanapalli, Srinadh ; LeCun, Yann ; Srebro, Nathan. / The role of over-parametrization in generalization of neural networks. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.
@conference{65e684e59417479dbe3fd97c22618b2c,
title = "The role of over-parametrization in generalization of neural networks",
abstract = "Despite existing work on ensuring generalization of neural networks in terms of scale-sensitive complexity measures, such as norms, margin, and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities, resulting in a tighter generalization bound for two-layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes (within the range reported in the experiments), and could partly explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.",
author = "Behnam Neyshabur and Zhiyuan Li and Srinadh Bhojanapalli and Yann LeCun and Nathan Srebro",
year = "2019",
month = "1",
day = "1",
language = "English (US)",
note = "7th International Conference on Learning Representations, ICLR 2019 ; Conference date: 06-05-2019 Through 09-05-2019",

}

TY - CONF

T1 - The role of over-parametrization in generalization of neural networks

AU - Neyshabur, Behnam

AU - Li, Zhiyuan

AU - Bhojanapalli, Srinadh

AU - LeCun, Yann

AU - Srebro, Nathan

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Despite existing work on ensuring generalization of neural networks in terms of scale-sensitive complexity measures, such as norms, margin, and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities, resulting in a tighter generalization bound for two-layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes (within the range reported in the experiments), and could partly explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.

AB - Despite existing work on ensuring generalization of neural networks in terms of scale-sensitive complexity measures, such as norms, margin, and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities, resulting in a tighter generalization bound for two-layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes (within the range reported in the experiments), and could partly explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.

UR - http://www.scopus.com/inward/record.url?scp=85071150465&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071150465&partnerID=8YFLogxK

M3 - Paper

ER -