Large-scale SVD and manifold learning

Ameet Talwalkar, Sanjiv Kumar, Mehryar Mohri, Henry Rowley

Research output: Contribution to journal › Article

Abstract

This paper examines the efficacy of sampling-based low-rank approximation techniques when applied to large dense kernel matrices. We analyze two common approximate singular value decomposition techniques, namely the Nyström and Column sampling methods. We present a theoretical comparison between these two methods, provide novel insights regarding their suitability for various tasks, and present experimental results that support our theory. Our results illustrate the relative strengths of each method. We next examine the performance of these two techniques on the large-scale task of extracting low-dimensional manifold structure given millions of high-dimensional face images. We address the computational challenges of non-linear dimensionality reduction via Isomap and Laplacian Eigenmaps, using a graph containing about 18 million nodes and 65 million edges. We present extensive experiments on learning low-dimensional embeddings for two large face data sets: CMU-PIE (35 thousand faces) and a web data set (18 million faces). Our comparisons show that the Nyström approximation is superior to the Column sampling method for this task. Furthermore, approximate Isomap tends to perform better than Laplacian Eigenmaps on both clustering and classification with the labeled CMU-PIE data set.
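The Nyström method discussed in the abstract can be sketched in a few lines: sample l columns of a symmetric positive semidefinite kernel matrix K, form the sampled columns C and the intersection block W, and approximate K ≈ C W⁺ Cᵀ. The kernel, data, and sample size below are illustrative stand-ins, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Small RBF kernel matrix as a stand-in for a "large dense kernel matrix".
X = rng.standard_normal((200, 5))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq / (2 * X.shape[1]))  # SPSD by construction

# Nystrom approximation: sample l columns uniformly without replacement,
# then K ~= C @ pinv(W) @ C.T, where W is the sampled-by-sampled block.
l = 40
idx = rng.choice(K.shape[0], size=l, replace=False)
C = K[:, idx]
W = K[np.ix_(idx, idx)]
K_nys = C @ np.linalg.pinv(W) @ C.T

# Relative Frobenius-norm error of the rank-l approximation.
err = np.linalg.norm(K - K_nys) / np.linalg.norm(K)
```

One property worth noting: on the sampled block itself the approximation is exact, since K_nys restricted to the sampled rows and columns equals W W⁺ W = W. The Column sampling method compared in the paper instead uses the SVD of C directly.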

Original language: English (US)
Pages (from-to): 3129-3152
Number of pages: 24
Journal: Journal of Machine Learning Research
Volume: 14
State: Published - Oct 2013


Keywords

  • Large-scale matrix factorization
  • Low-rank approximation
  • Manifold learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Cite this

Talwalkar, A., Kumar, S., Mohri, M., & Rowley, H. (2013). Large-scale SVD and manifold learning. Journal of Machine Learning Research, 14, 3129-3152.

@article{bd52cd13bcfe419ca4a2e4fdb111f7d3,
title = "Large-scale SVD and manifold learning",
abstract = "This paper examines the efficacy of sampling-based low-rank approximation techniques when applied to large dense kernel matrices. We analyze two common approximate singular value decomposition techniques, namely the Nyström and Column sampling methods. We present a theoretical comparison between these two methods, provide novel insights regarding their suitability for various tasks, and present experimental results that support our theory. Our results illustrate the relative strengths of each method. We next examine the performance of these two techniques on the large-scale task of extracting low-dimensional manifold structure given millions of high-dimensional face images. We address the computational challenges of non-linear dimensionality reduction via Isomap and Laplacian Eigenmaps, using a graph containing about 18 million nodes and 65 million edges. We present extensive experiments on learning low-dimensional embeddings for two large face data sets: CMU-PIE (35 thousand faces) and a web data set (18 million faces). Our comparisons show that the Nyström approximation is superior to the Column sampling method for this task. Furthermore, approximate Isomap tends to perform better than Laplacian Eigenmaps on both clustering and classification with the labeled CMU-PIE data set.",
keywords = "Large-scale matrix factorization, Low-rank approximation, Manifold learning",
author = "Ameet Talwalkar and Sanjiv Kumar and Mehryar Mohri and Henry Rowley",
year = "2013",
month = "10",
language = "English (US)",
volume = "14",
pages = "3129--3152",
journal = "Journal of Machine Learning Research",
issn = "1532-4435",
publisher = "Microtome Publishing",

}
