Improving image classification with location context

Kevin Tang, Manohar Paluri, Li Fei-Fei, Robert Fergus, Lubomir Bourdev

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the widespread availability of cellphones and cameras that have GPS capabilities, it is common for images being uploaded to the Internet today to have GPS coordinates associated with them. In addition to research that tries to predict GPS coordinates from visual features, this also opens up the door to problems that are conditioned on the availability of GPS coordinates. In this work, we tackle the problem of performing image classification with location context, in which we are given the GPS coordinates for images in both the train and test phases. We explore different ways of encoding and extracting features from the GPS coordinates, and show how to naturally incorporate these features into a Convolutional Neural Network (CNN), the current state-of-the-art for most image classification and recognition problems. We also show how it is possible to simultaneously learn the optimal pooling radii for a subset of our features within the CNN framework. To evaluate our model and to help promote research in this area, we identify a set of location-sensitive concepts and annotate a subset of the Yahoo Flickr Creative Commons 100M dataset that has GPS coordinates with these concepts, which we make publicly available. By leveraging location context, we are able to achieve almost a 7% gain in mean average precision.
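The abstract does not specify how the GPS features are encoded or how they enter the network, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a simple encoding (latitude/longitude mapped to a 3-D unit vector) that is concatenated onto an image feature vector, and it shows how mean average precision, the metric reported above, is commonly computed. The names `encode_gps`, `fuse`, `average_precision`, and `mean_average_precision` are hypothetical and introduced here for illustration.

```python
import math


def encode_gps(lat_deg, lon_deg):
    """Map latitude/longitude (degrees) to a 3-D unit vector.

    Assumption: this encoding is illustrative only; it avoids the
    wrap-around discontinuity of raw longitude at +/-180 degrees.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    return [
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    ]


def fuse(image_features, lat_deg, lon_deg):
    """Concatenate image features with the encoded location (assumed fusion)."""
    return list(image_features) + encode_gps(lat_deg, lon_deg)


def average_precision(scores, labels):
    """Average precision for one concept.

    scores: classifier confidences, one per image.
    labels: 1 if the image contains the concept, else 0.
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this positive
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_concept):
    """Mean of per-concept average precision.

    per_concept: list of (scores, labels) pairs, one pair per concept.
    """
    return sum(average_precision(s, l) for s, l in per_concept) / len(per_concept)


if __name__ == "__main__":
    fused = fuse([0.5, 0.2], 37.77, -122.42)  # hypothetical feature vector
    print(len(fused))
    print(mean_average_precision([([0.9, 0.8, 0.7], [1, 0, 1])]))
```

In this sketch the fused vector would feed whatever classifier follows; comparing `mean_average_precision` with and without the location part is one way to measure a gain of the kind the abstract reports.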

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1008-1016
Number of pages: 9
Volume: 11-18-December-2015
ISBN (Electronic): 9781467383912
DOIs: https://doi.org/10.1109/ICCV.2015.121
State: Published - Feb 17 2016
Event: 15th IEEE International Conference on Computer Vision, ICCV 2015 - Santiago, Chile
Duration: Dec 11 2015 to Dec 18 2015

Other

Other: 15th IEEE International Conference on Computer Vision, ICCV 2015
Country: Chile
City: Santiago
Period: 12/11/15 to 12/18/15

Fingerprint

Image classification
Global positioning system
Availability
Neural networks
Image recognition
Cameras
Internet

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Tang, K., Paluri, M., Fei-Fei, L., Fergus, R., & Bourdev, L. (2016). Improving image classification with location context. In Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015 (Vol. 11-18-December-2015, pp. 1008-1016). [7410478] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCV.2015.121

Improving image classification with location context. / Tang, Kevin; Paluri, Manohar; Fei-Fei, Li; Fergus, Robert; Bourdev, Lubomir.

Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015. Vol. 11-18-December-2015 Institute of Electrical and Electronics Engineers Inc., 2016. p. 1008-1016 7410478.

Tang, K, Paluri, M, Fei-Fei, L, Fergus, R & Bourdev, L 2016, Improving image classification with location context. in Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015. vol. 11-18-December-2015, 7410478, Institute of Electrical and Electronics Engineers Inc., pp. 1008-1016, 15th IEEE International Conference on Computer Vision, ICCV 2015, Santiago, Chile, 12/11/15. https://doi.org/10.1109/ICCV.2015.121
Tang K, Paluri M, Fei-Fei L, Fergus R, Bourdev L. Improving image classification with location context. In Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015. Vol. 11-18-December-2015. Institute of Electrical and Electronics Engineers Inc. 2016. p. 1008-1016. 7410478 https://doi.org/10.1109/ICCV.2015.121
Tang, Kevin ; Paluri, Manohar ; Fei-Fei, Li ; Fergus, Robert ; Bourdev, Lubomir. / Improving image classification with location context. Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015. Vol. 11-18-December-2015 Institute of Electrical and Electronics Engineers Inc., 2016. pp. 1008-1016
@inproceedings{61f02a155a41419d8d999c4cd9b89c17,
title = "Improving image classification with location context",
abstract = "With the widespread availability of cellphones and cameras that have GPS capabilities, it is common for images being uploaded to the Internet today to have GPS coordinates associated with them. In addition to research that tries to predict GPS coordinates from visual features, this also opens up the door to problems that are conditioned on the availability of GPS coordinates. In this work, we tackle the problem of performing image classification with location context, in which we are given the GPS coordinates for images in both the train and test phases. We explore different ways of encoding and extracting features from the GPS coordinates, and show how to naturally incorporate these features into a Convolutional Neural Network (CNN), the current state-of-the-art for most image classification and recognition problems. We also show how it is possible to simultaneously learn the optimal pooling radii for a subset of our features within the CNN framework. To evaluate our model and to help promote research in this area, we identify a set of location-sensitive concepts and annotate a subset of the Yahoo Flickr Creative Commons 100M dataset that has GPS coordinates with these concepts, which we make publicly available. By leveraging location context, we are able to achieve almost a 7{\%} gain in mean average precision.",
author = "Kevin Tang and Manohar Paluri and Li Fei-Fei and Robert Fergus and Lubomir Bourdev",
year = "2016",
month = "2",
day = "17",
doi = "10.1109/ICCV.2015.121",
language = "English (US)",
volume = "11-18-December-2015",
pages = "1008--1016",
booktitle = "Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - Improving image classification with location context

AU - Tang, Kevin

AU - Paluri, Manohar

AU - Fei-Fei, Li

AU - Fergus, Robert

AU - Bourdev, Lubomir

PY - 2016/2/17

Y1 - 2016/2/17

AB - With the widespread availability of cellphones and cameras that have GPS capabilities, it is common for images being uploaded to the Internet today to have GPS coordinates associated with them. In addition to research that tries to predict GPS coordinates from visual features, this also opens up the door to problems that are conditioned on the availability of GPS coordinates. In this work, we tackle the problem of performing image classification with location context, in which we are given the GPS coordinates for images in both the train and test phases. We explore different ways of encoding and extracting features from the GPS coordinates, and show how to naturally incorporate these features into a Convolutional Neural Network (CNN), the current state-of-the-art for most image classification and recognition problems. We also show how it is possible to simultaneously learn the optimal pooling radii for a subset of our features within the CNN framework. To evaluate our model and to help promote research in this area, we identify a set of location-sensitive concepts and annotate a subset of the Yahoo Flickr Creative Commons 100M dataset that has GPS coordinates with these concepts, which we make publicly available. By leveraging location context, we are able to achieve almost a 7% gain in mean average precision.

UR - http://www.scopus.com/inward/record.url?scp=84973861940&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84973861940&partnerID=8YFLogxK

U2 - 10.1109/ICCV.2015.121

DO - 10.1109/ICCV.2015.121

M3 - Conference contribution

VL - 11-18-December-2015

SP - 1008

EP - 1016

BT - Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015

PB - Institute of Electrical and Electronics Engineers Inc.

ER -