On transductive regression

Corinna Cortes, Mehryar Mohri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In many modern large-scale learning applications, the amount of unlabeled data far exceeds that of labeled data. A common instance of this problem is the transductive setting where the unlabeled test points are known to the learning algorithm. This paper presents a study of regression problems in that setting. It presents explicit VC-dimension error bounds for transductive regression that hold for all bounded loss functions and coincide with the tight classification bounds of Vapnik when applied to classification. It also presents a new transductive regression algorithm inspired by our bound that admits a primal and kernelized closed-form solution and deals efficiently with large amounts of unlabeled data. The algorithm exploits the position of unlabeled points to locally estimate their labels and then uses a global optimization to ensure robust predictions. Our study also includes the results of experiments with several publicly available regression data sets with up to 20,000 unlabeled examples. The comparison with other transductive regression algorithms shows that it performs well and that it can scale to large data sets.
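The abstract describes a two-stage approach: locally estimate labels for the known unlabeled test points, then run a global optimization with a kernelized closed-form solution. The sketch below is not the authors' algorithm, only a simplified illustration of that two-stage idea under assumed choices: local estimates via a k-nearest-neighbor average over the labeled points, and a global fit via kernel ridge regression (which admits a closed-form solution) on the pooled labeled and pseudo-labeled data. Function names, the RBF kernel, and all parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF (Gaussian) kernel between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def transductive_regression(X_l, y_l, X_u, k=3, lam=1e-2, gamma=1.0):
    """Illustrative two-stage sketch (not the paper's exact method):
    (1) locally estimate labels of the unlabeled points from their
        k nearest labeled neighbors;
    (2) fit a global kernel ridge regression, which has a closed-form
        solution, on labeled + pseudo-labeled points together."""
    # Stage 1: local label estimates via k-NN averaging over labeled data.
    d = ((X_u[:, None, :] - X_l[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d, axis=1)[:, :k]
    y_u = y_l[nn].mean(axis=1)

    # Stage 2: global closed-form fit on the pooled point set.
    X = np.vstack([X_l, X_u])
    y = np.concatenate([y_l, y_u])
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)

    # Predictions at the unlabeled (test) points.
    return rbf_kernel(X_u, X, gamma) @ alpha
```

The transductive element is that `X_u` is available at training time, so its geometry shapes both the local estimates and the global fit; the paper's actual algorithm derives its objective from the stated VC-dimension bound rather than from this heuristic pairing.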

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 19 - Proceedings of the 2006 Conference
Pages: 305-312
Number of pages: 8
State: Published - 2007
Event: 20th Annual Conference on Neural Information Processing Systems, NIPS 2006 - Vancouver, BC, Canada
Duration: Dec 4, 2006 - Dec 7, 2006



ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Cortes, C., & Mohri, M. (2007). On transductive regression. In Advances in Neural Information Processing Systems 19 - Proceedings of the 2006 Conference (pp. 305-312)

@inproceedings{c4094a868cc04ed7915b1628a1944b37,
title = "On transductive regression",
abstract = "In many modern large-scale learning applications, the amount of unlabeled data far exceeds that of labeled data. A common instance of this problem is the transductive setting where the unlabeled test points are known to the learning algorithm. This paper presents a study of regression problems in that setting. It presents explicit VC-dimension error bounds for transductive regression that hold for all bounded loss functions and coincide with the tight classification bounds of Vapnik when applied to classification. It also presents a new transductive regression algorithm inspired by our bound that admits a primal and kernelized closed-form solution and deals efficiently with large amounts of unlabeled data. The algorithm exploits the position of unlabeled points to locally estimate their labels and then uses a global optimization to ensure robust predictions. Our study also includes the results of experiments with several publicly available regression data sets with up to 20,000 unlabeled examples. The comparison with other transductive regression algorithms shows that it performs well and that it can scale to large data sets.",
author = "Corinna Cortes and Mehryar Mohri",
year = "2007",
language = "English (US)",
isbn = "9780262195683",
pages = "305--312",
booktitle = "Advances in Neural Information Processing Systems 19 - Proceedings of the 2006 Conference",
}
