A GPU-based index to support interactive spatio-temporal queries over historical data

Harish Doraiswamy, Huy T. Vo, Claudio T. Silva, Juliana Freire

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

There are increasing volumes of spatio-temporal data from various sources such as sensors, social networks and urban environments. Analysis of such data requires flexible exploration and visualizations, but queries that span multiple geographical regions over multiple time slices are expensive to compute, making it challenging to attain interactive speeds for large data sets. In this paper, we propose a new indexing scheme that makes use of modern GPUs to efficiently support spatio-temporal queries over point data. The index covers multiple dimensions, thus allowing simultaneous filtering of spatial and temporal attributes. It uses a block-based storage structure to speed up OLAP-type queries over historical data, and supports query processing over in-memory and disk-resident data. We present different query execution algorithms that we designed to allow the index to be used in different hardware configurations, including CPU-only, GPU-only, and a combination of CPU and GPU. To demonstrate the effectiveness of our techniques, we implemented them on top of MongoDB and performed an experimental evaluation using two real-world data sets: New York City's (NYC) taxi data - consisting of over 868 million taxi trips spanning a period of five years, and Twitter posts - over 1.1 billion tweets collected over a period of 14 months. Our results show that our GPU-based index obtains interactive, sub-second response times for queries over large data sets and leads to at least two orders of magnitude speedup over spatial indexes implemented in existing open-source and commercial database systems.

Original language: English (US)
Title of host publication: 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1086-1097
Number of pages: 12
ISBN (Electronic): 9781509020195
DOIs: https://doi.org/10.1109/ICDE.2016.7498315
State: Published - Jun 22 2016
Event: 32nd IEEE International Conference on Data Engineering, ICDE 2016 - Helsinki, Finland
Duration: May 16 2016 to May 20 2016

Other

Other: 32nd IEEE International Conference on Data Engineering, ICDE 2016
Country: Finland
City: Helsinki
Period: 5/16/16 to 5/20/16

Fingerprint

Program processors
Geographical regions
Query processing
Computer hardware
Visualization
Data storage equipment
Graphics processing unit
Query
Sensors
Indexing
Urban environment
Sensor
Social networks
Twitter
Residents
Online analytical processing
Data base
Network environment
Evaluation
Open source

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Computer Graphics and Computer-Aided Design
  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management

Cite this

Doraiswamy, H., Vo, H. T., Silva, C. T., & Freire, J. (2016). A GPU-based index to support interactive spatio-temporal queries over historical data. In 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016 (pp. 1086-1097). [7498315] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDE.2016.7498315

@inproceedings{709e075c3f1e4e8dbb560baaccdaf359,
title = "A GPU-based index to support interactive spatio-temporal queries over historical data",
author = "Harish Doraiswamy and Vo, {Huy T.} and Silva, {Claudio T.} and Juliana Freire",
year = "2016",
month = "6",
day = "22",
doi = "10.1109/ICDE.2016.7498315",
language = "English (US)",
pages = "1086--1097",
booktitle = "2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - A GPU-based index to support interactive spatio-temporal queries over historical data

AU - Doraiswamy, Harish

AU - Vo, Huy T.

AU - Silva, Claudio T.

AU - Freire, Juliana

PY - 2016/6/22

Y1 - 2016/6/22

UR - http://www.scopus.com/inward/record.url?scp=84980340152&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84980340152&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2016.7498315

DO - 10.1109/ICDE.2016.7498315

M3 - Conference contribution

AN - SCOPUS:84980340152

SP - 1086

EP - 1097

BT - 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -