Geometric decision rules for instance-based learning problems

Binay Bhattacharya, Kaustav Mukherjee, Godfried Toussaint

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In the typical nonparametric approach to classification in instance-based learning and data mining, random data (the training set of patterns) are collected and used to design a decision rule (classifier). One of the best-known such rules is the k-nearest-neighbor decision rule (also known as lazy learning), in which an unknown pattern is classified into the majority class among its k nearest neighbors in the training set. This rule gives low error rates when the training set is large. In practice, however, it is desirable to store as little of the training data as possible without sacrificing performance. It is well known that thinning (condensing) the training set with the Gabriel proximity graph is a viable partial solution to the problem, but this raises the problem of efficiently computing the Gabriel graph of large training data sets in high-dimensional spaces. In this paper we report on a new approach to the instance-based learning problem. The approach combines five tools: first, editing the data with Wilson-Gabriel editing to smooth the decision boundary; second, applying Gabriel thinning to the edited set; third, filtering this output with the ICF algorithm of Brighton and Mellish; fourth, using the Gabriel-neighbor decision rule to classify new incoming queries; and fifth, using a new data structure that allows efficient computation of approximate Gabriel graphs in high-dimensional spaces. Extensive experiments suggest that our approach is the best on the market.
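A minimal sketch of the Gabriel-neighbor test and the Gabriel-neighbor decision rule mentioned in the abstract: two points are Gabriel neighbors iff no third point lies strictly inside the ball whose diameter is the segment joining them. This brute-force version (all names are illustrative, not from the paper) costs O(n) per candidate pair; the data structure the paper proposes exists precisely to avoid that cost for large, high-dimensional training sets.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def is_gabriel_neighbor(p, q, points):
    """p and q are Gabriel neighbors iff no other point r lies strictly
    inside the diametral ball of segment pq, i.e. iff
    d(p,r)^2 + d(q,r)^2 >= d(p,q)^2 for every other r."""
    d_pq = dist2(p, q)
    for r in points:
        if r == p or r == q:
            continue
        if dist2(p, r) + dist2(q, r) < d_pq:
            return False  # r blocks the pair: it lies inside the ball
    return True

def gabriel_classify(query, training):
    """Gabriel-neighbor decision rule: classify `query` into the majority
    class among its Gabriel neighbors in `training`, a list of
    (point, label) pairs."""
    pts = [p for p, _ in training]
    votes = {}
    for p, label in training:
        if is_gabriel_neighbor(query, p, pts):
            votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

Because distant training points are "blocked" by closer ones, the rule adapts its neighborhood to the local density of the data, with no parameter k to tune.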

Original language: English (US)
Title of host publication: Pattern Recognition and Machine Intelligence - First International Conference, PReMI 2005, Proceedings
Pages: 60-69
Number of pages: 10
DOI: 10.1007/11590316_9
State: Published - Dec 1 2005
Event: 1st International Conference on Pattern Recognition and Machine Intelligence, PReMI 2005 - Kolkata, India
Duration: Dec 20 2005 - Dec 22 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3776 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Fingerprint

Instance-based learning · Decision rules · Data mining · Data structures · Classifiers · Nearest neighbor · Thinning · Proximity graphs · Graph theory · High-dimensional spaces · Error rate · Filtering · Training sets

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Bhattacharya, B., Mukherjee, K., & Toussaint, G. (2005). Geometric decision rules for instance-based learning problems. In Pattern Recognition and Machine Intelligence - First International Conference, PReMI 2005, Proceedings (pp. 60-69). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3776 LNCS). https://doi.org/10.1007/11590316_9
