Proximity-graph instance-based learning, support vector machines, and high dimensionality: An empirical comparison

Godfried Toussaint, Constantin Berzan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Previous experiments with low-dimensional data sets have shown that Gabriel graph methods for instance-based learning are among the best machine learning algorithms for pattern classification applications. However, as the dimensionality of the data grows large, all data points in the training set tend to become Gabriel neighbors of each other, bringing the efficacy of this method into question. Indeed, it has been conjectured that for high-dimensional data, proximity graph methods that use sparser graphs, such as relative neighborhood graphs (RNG) and minimum spanning trees (MST), would have to be employed in order to maintain their privileged status. Here the performance of proximity graph methods for instance-based learning that employ Gabriel graphs, relative neighborhood graphs, and minimum spanning trees is compared experimentally on high-dimensional data sets. These methods are also compared empirically against the traditional k-NN rule and support vector machines (SVMs), the leading competitors of proximity graph methods.
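
The densification effect described in the abstract can be illustrated with a minimal sketch. It relies on the standard definition of the Gabriel graph: two points are Gabriel neighbors when no third point lies strictly inside the ball whose diameter is the segment joining them. The function names (gabriel_edges, gabriel_edge_fraction) and the use of uniformly random points are assumptions made for illustration; they are not the data sets or the experimental procedure of the paper.

```python
import random
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gabriel_edges(points):
    """Return the index pairs (i, j), i < j, that are Gabriel neighbors.

    A third point k lies strictly inside the ball with diameter (i, j)
    exactly when d(i,k)^2 + d(j,k)^2 < d(i,j)^2, so the pair is an edge
    when that inequality fails for every other point.
    Brute force, O(n^3) distance evaluations; fine for small n only.
    """
    n = len(points)
    edges = []
    for i, j in combinations(range(n), 2):
        d_ij = sq_dist(points[i], points[j])
        if all(
            sq_dist(points[i], points[k]) + sq_dist(points[j], points[k]) >= d_ij
            for k in range(n)
            if k != i and k != j
        ):
            edges.append((i, j))
    return edges


def gabriel_edge_fraction(n_points, dim, seed=0):
    """Fraction of all point pairs that are Gabriel neighbors.

    Points are drawn uniformly from the unit cube (an illustrative
    assumption, not the paper's data).
    """
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    return len(gabriel_edges(pts)) / (n_points * (n_points - 1) / 2)


if __name__ == "__main__":
    # Print the edge fraction for a few dimensions to observe the trend.
    for dim in (2, 10, 50):
        print(dim, round(gabriel_edge_fraction(40, dim), 3))
```

Running the script prints the fraction of point pairs joined by a Gabriel edge for each dimension. On uniformly random points this fraction should rise toward 1 as the dimension grows, which is the behavior that motivates the comparison with the sparser RNG and MST.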

Original language: English (US)
Title of host publication: Machine Learning and Data Mining in Pattern Recognition - 8th International Conference, MLDM 2012, Proceedings
Pages: 222-236
Number of pages: 15
DOIs: https://doi.org/10.1007/978-3-642-31537-4_18
State: Published - Aug 17 2012
Event: 8th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2012 - Berlin, Germany
Duration: Jul 13 2012 to Jul 20 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7376 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2012
Country: Germany
City: Berlin
Period: 7/13/12 to 7/20/12

Keywords

  • Gabriel graph
  • Instance-based learning
  • machine learning
  • minimum spanning tree (MST)
  • proximity graphs
  • relative neighborhood graph (RNG)
  • sequential minimal optimization (SMO)
  • support vector machines (SVM)

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Toussaint, G., & Berzan, C. (2012). Proximity-graph instance-based learning, support vector machines, and high dimensionality: An empirical comparison. In Machine Learning and Data Mining in Pattern Recognition - 8th International Conference, MLDM 2012, Proceedings (pp. 222-236). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7376 LNAI). https://doi.org/10.1007/978-3-642-31537-4_18

@inproceedings{a479cd8b59c54a60bfe5ac7d98543b41,
title = "Proximity-graph instance-based learning, support vector machines, and high dimensionality: An empirical comparison",
abstract = "Previous experiments with low-dimensional data sets have shown that Gabriel graph methods for instance-based learning are among the best machine learning algorithms for pattern classification applications. However, as the dimensionality of the data grows large, all data points in the training set tend to become Gabriel neighbors of each other, bringing the efficacy of this method into question. Indeed, it has been conjectured that for high-dimensional data, proximity graph methods that use sparser graphs, such as relative neighborhood graphs (RNG) and minimum spanning trees (MST), would have to be employed in order to maintain their privileged status. Here the performance of proximity graph methods for instance-based learning that employ Gabriel graphs, relative neighborhood graphs, and minimum spanning trees is compared experimentally on high-dimensional data sets. These methods are also compared empirically against the traditional k-NN rule and support vector machines (SVMs), the leading competitors of proximity graph methods.",
keywords = "Gabriel graph, Instance-based learning, machine learning, minimum spanning tree (MST), proximity graphs, relative neighborhood graph (RNG), sequential minimal optimization (SMO), support vector machines (SVM)",
author = "Godfried Toussaint and Constantin Berzan",
year = "2012",
month = "8",
day = "17",
doi = "10.1007/978-3-642-31537-4_18",
language = "English (US)",
isbn = "9783642315367",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "222--236",
booktitle = "Machine Learning and Data Mining in Pattern Recognition - 8th International Conference, MLDM 2012, Proceedings",

}

TY - GEN

T1 - Proximity-graph instance-based learning, support vector machines, and high dimensionality

T2 - An empirical comparison

AU - Toussaint, Godfried

AU - Berzan, Constantin

PY - 2012/8/17

Y1 - 2012/8/17

AB - Previous experiments with low-dimensional data sets have shown that Gabriel graph methods for instance-based learning are among the best machine learning algorithms for pattern classification applications. However, as the dimensionality of the data grows large, all data points in the training set tend to become Gabriel neighbors of each other, bringing the efficacy of this method into question. Indeed, it has been conjectured that for high-dimensional data, proximity graph methods that use sparser graphs, such as relative neighborhood graphs (RNG) and minimum spanning trees (MST), would have to be employed in order to maintain their privileged status. Here the performance of proximity graph methods for instance-based learning that employ Gabriel graphs, relative neighborhood graphs, and minimum spanning trees is compared experimentally on high-dimensional data sets. These methods are also compared empirically against the traditional k-NN rule and support vector machines (SVMs), the leading competitors of proximity graph methods.

KW - Gabriel graph

KW - Instance-based learning

KW - machine learning

KW - minimum spanning tree (MST)

KW - proximity graphs

KW - relative neighborhood graph (RNG)

KW - sequential minimal optimization (SMO)

KW - support vector machines (SVM)

UR - http://www.scopus.com/inward/record.url?scp=84864922736&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84864922736&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-31537-4_18

DO - 10.1007/978-3-642-31537-4_18

M3 - Conference contribution

AN - SCOPUS:84864922736

SN - 9783642315367

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 222

EP - 236

BT - Machine Learning and Data Mining in Pattern Recognition - 8th International Conference, MLDM 2012, Proceedings

ER -