Learned-norm pooling for deep feedforward and recurrent neural networks

Caglar Gulcehre, Kyunghyun Cho, Razvan Pascanu, Yoshua Bengio

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper we propose and investigate a novel nonlinear unit, called the Lp unit, for deep neural networks. The proposed Lp unit receives signals from several projections of a subset of units in the layer below and computes a normalized Lp norm. We note two interesting interpretations of the Lp unit. First, the proposed unit can be understood as a generalization of a number of conventional pooling operators, such as average, root-mean-square and max pooling, that are widely used in, for instance, convolutional neural networks (CNN), HMAX models and neocognitrons. Furthermore, the Lp unit is, to a certain degree, similar to the recently proposed maxout unit [13], which achieved state-of-the-art object recognition results on a number of benchmark datasets. Second, we provide a geometric interpretation of the activation function, based on which we argue that the Lp unit is more efficient at representing complex, nonlinear separating boundaries. Each Lp unit defines a superelliptic boundary, with its exact shape determined by the order p. We claim that this makes it possible to model arbitrarily shaped, curved boundaries more efficiently by combining a few Lp units of different orders, which justifies learning a distinct order p for each unit in the model. We empirically evaluate the proposed Lp units on a number of datasets and show that multilayer perceptrons (MLP) built from Lp units achieve state-of-the-art results on several benchmark datasets. Furthermore, we evaluate the proposed Lp unit on the recently proposed deep recurrent neural networks (RNN).
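
To make the pooling mechanism concrete, the following is a minimal NumPy sketch of an Lp pooling layer: each output unit takes the normalized Lp norm of a small group of affine projections of the layer below, and the order p is kept at or above 1 by parameterizing it as 1 + softplus(rho) so it can be learned by gradient descent. The shapes, the grouping into consecutive pools and the softplus parameterization are illustrative assumptions, not necessarily the exact formulation used in the paper.

import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def lp_unit(x, W, b, rho, pool_size, eps=1e-8):
    # x         : (batch, n_in) activations from the layer below
    # W, b      : projection weights (n_in, n_proj) and biases (n_proj,)
    # rho       : (n_units,) unconstrained parameters; p = 1 + softplus(rho) >= 1
    # pool_size : number of consecutive projections pooled into each Lp unit
    z = x @ W + b                                   # affine projections of the layer below
    batch, n_proj = z.shape
    n_units = n_proj // pool_size
    z = z.reshape(batch, n_units, pool_size)
    p = 1.0 + softplus(rho)                         # learned order, one per unit
    # Normalized Lp norm over each pool: (mean_i |z_i|^p)^(1/p)
    pooled = np.mean(np.abs(z) ** p[None, :, None], axis=2) + eps
    return pooled ** (1.0 / p[None, :])

# Limiting cases recovered by fixing p (up to the 1/N normalization):
#   p = 1         -> average of absolute values (average pooling of rectified inputs)
#   p = 2         -> root-mean-square pooling
#   p -> infinity -> max of absolute values (max pooling)

# Example: pool 4 projections into each of 2 units for a batch of 3 inputs.
# x = np.random.randn(3, 5); W = np.random.randn(5, 8); b = np.zeros(8)
# y = lp_unit(x, W, b, rho=np.zeros(2), pool_size=4)   # y has shape (3, 2)

Because p enters the forward computation differentiably, it can be trained jointly with W and b by backpropagation, which is the sense in which the pooling norm is "learned" rather than fixed in advance.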

Original language: English (US)
Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2014, Proceedings
Publisher: Springer Verlag
Pages: 530-546
Number of pages: 17
Volume: 8724 LNAI
Edition: PART 1
ISBN (Print): 978-3-662-44847-2
DOI: 10.1007/978-3-662-44848-9_34
State: Published - 2014
Event: European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2014 - Nancy, France
Duration: Sep 15 2014 - Sep 19 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 8724 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2014
Country: France
City: Nancy
Period: 9/15/14 - 9/19/14

Keywords

  • deep learning
  • multilayer perceptron

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Gulcehre, C., Cho, K., Pascanu, R., & Bengio, Y. (2014). Learned-norm pooling for deep feedforward and recurrent neural networks. In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2014, Proceedings (PART 1 ed., Vol. 8724 LNAI, pp. 530-546). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8724 LNAI, No. PART 1). Springer Verlag. https://doi.org/10.1007/978-3-662-44848-9_34

@inproceedings{34012d808c814b5593264c1037ed3dc6,
title = "Learned-norm pooling for deep feedforward and recurrent neural networks",
keywords = "deep learning, multilayer perceptron",
author = "Caglar Gulcehre and Kyunghyun Cho and Razvan Pascanu and Yoshua Bengio",
year = "2014",
doi = "10.1007/978-3-662-44848-9_34",
language = "English (US)",
isbn = "9783662448472",
volume = "8724 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
number = "PART 1",
pages = "530--546",
booktitle = "Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2014, Proceedings",
edition = "PART 1",

}
