Speeding up support vector machines: Probabilistic versus nearest neighbour methods for condensing training data

Moïri Gamboni, Abhijai Garg, Oleg Grishin, Seung Man Oh, Francis Sowani, Anthony Spalvieri-Kruse, Godfried T. Toussaint, Lingliang Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Several methods for reducing the running time of support vector machines (SVMs) are compared in terms of speed-up factor and classification accuracy using seven large real-world datasets obtained from the UCI Machine Learning Repository. All the methods tested are based on reducing the size of the training data that is then fed to the SVM. Two probabilistic methods that run in linear time with respect to the size of the training data are investigated: blind random sampling and a new method for guided random sampling (Gaussian Condensing). These methods are compared with k-Nearest Neighbour methods for reducing the size of the training set and for smoothing the decision boundary. For all the datasets tested, blind random sampling gave the best results for speeding up SVMs without significantly sacrificing classification accuracy.
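As a rough illustration of the two families of condensing methods named in the abstract, the following minimal sketch shows blind random sampling and one k-Nearest Neighbour method, Wilson editing (listed in the keywords), in its textbook form. It is not the authors' implementation: the function names `blind_random_sample` and `wilson_edit`, the `fraction`, `seed`, and `k` parameters, and the use of squared Euclidean distance are all assumptions made for this example. Gaussian Condensing is not sketched because the abstract does not describe how it works.

```python
import random
from collections import Counter


def blind_random_sample(X, y, fraction, seed=0):
    """Keep a uniformly random fraction of the training pairs.

    Hypothetical helper: the cost grows at most linearly with len(X).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    n_keep = max(1, round(fraction * len(X)))
    keep = rng.sample(range(len(X)), n_keep)  # indices drawn without replacement
    return [X[i] for i in keep], [y[i] for i in keep]


def wilson_edit(X, y, k=3):
    """Textbook Wilson editing (assumed form): drop every point whose label
    disagrees with the majority label of its k nearest neighbours.

    Brute-force neighbour search, so this is quadratic in len(X).
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    keep = []
    for i, xi in enumerate(X):
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sq_dist(xi, X[j]),
        )
        votes = Counter(y[j] for j in others[:k])
        # On a tied vote, most_common returns the label seen first,
        # i.e. the label of the nearer neighbour.
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]
```

Either function returns a smaller `(X, y)` pair that could then be passed to any SVM trainer in place of the full training set. The sketch also makes the cost difference visible: the random sampler never inspects the feature values, whereas the neighbour-based editor compares every point with every other point.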

Original language: English (US)
Title of host publication: ICPRAM 2014 - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods
Publisher: SciTePress
Pages: 364-371
Number of pages: 8
ISBN (Print): 9789897580185
DOIs: https://doi.org/10.5220/0004927003640371
State: Published - Jan 1 2014
Event: 3rd International Conference on Pattern Recognition Applications and Methods, ICPRAM 2014 - Angers, Loire Valley, France
Duration: Mar 6 2014 → Mar 8 2014

Publication series

Name: ICPRAM 2014 - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods

Other

Other: 3rd International Conference on Pattern Recognition Applications and Methods, ICPRAM 2014
Country: France
City: Angers, Loire Valley
Period: 3/6/14 → 3/8/14

Keywords

  • Blind random sampling
  • Data mining
  • Gaussian condensing
  • Guided random sampling
  • K-nearest neighbour methods
  • Machine learning
  • SMO
  • Support vector machines
  • Training data condensation
  • Wilson editing

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

Cite this

Gamboni, M., Garg, A., Grishin, O., Oh, S. M., Sowani, F., Spalvieri-Kruse, A., Toussaint, G. T., & Zhang, L. (2014). Speeding up support vector machines: Probabilistic versus nearest neighbour methods for condensing training data. In ICPRAM 2014 - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods (pp. 364-371). (ICPRAM 2014 - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods). SciTePress. https://doi.org/10.5220/0004927003640371