Predicting taxi demand at high spatial resolution

Approaching the limit of predictability

Kai Zhao, Denis Khryashchev, Juliana Freire, Claudio Silva, Huy Vo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In big cities, taxi service is imbalanced. In some areas, passengers wait too long for a taxi, while in others, many taxis roam without passengers. Knowledge of where a taxi will become available can help us solve the taxi demand imbalance problem. In this paper, we employ a holistic approach to predict taxi demand at high spatial resolution. We showcase our techniques using two real-world data sets, yellow cabs and Uber trips in New York City, and perform an evaluation over 9,940 building blocks in Manhattan. Our approach consists of two key steps. First, we use entropy and the temporal correlation of human mobility to measure the demand uncertainty at the building block level. Second, to identify which predictive algorithm can approach the theoretical maximum predictability, we implement and compare three predictors: the Markov predictor (a probability-based predictive algorithm), the Lempel-Ziv-Welch predictor (a sequence-based predictive algorithm), and the Neural Network predictor (a predictive algorithm that uses machine learning). The results show that predictability varies by building block and, on average, the theoretical maximum predictability can be as high as 83%. The performance of the predictors also varies: the Neural Network predictor provides better accuracy for blocks with low predictability, and the Markov predictor provides better accuracy for blocks with high predictability. In blocks with high maximum predictability, the Markov predictor is able to predict the taxi demand with 89% accuracy, 11% better than the Neural Network predictor, while requiring only 0.03% of the computation time. These findings indicate that the maximum predictability can be a good metric for selecting prediction algorithms.
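The two steps described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not give the entropy estimator or the Markov order, so the sketch assumes a first-order conditional entropy as the measure of temporally correlated uncertainty, Fano's inequality as the link between entropy and maximum predictability (a common choice in limits-of-predictability analyses), and a first-order online Markov predictor. The function names and the toy demand sequence are hypothetical.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(seq):
    """Entropy in bits of the next symbol given the current one."""
    pair_counts = Counter(zip(seq, seq[1:]))
    prev_counts = Counter(seq[:-1])
    n = len(seq) - 1
    return -sum(
        (c / n) * math.log2(c / prev_counts[a])
        for (a, _), c in pair_counts.items()
    )


def max_predictability(entropy, n_states, tol=1e-9):
    """Solve S = H(p) + (1 - p) * log2(N - 1) for p by bisection (Fano)."""
    if n_states < 2:
        return 1.0  # a single state is always predictable

    def fano(p):
        h = 0.0
        if 0.0 < p < 1.0:
            h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return h + (1 - p) * math.log2(n_states - 1)

    # fano decreases from log2(N) at p = 1/N to 0 at p = 1
    lo, hi = 1.0 / n_states, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano(mid) > entropy:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def markov_accuracy(seq):
    """Online first-order Markov predictor.

    Predicts the most frequent successor of the current state seen so far
    and scores only the steps where a prediction was possible.
    """
    transitions = defaultdict(Counter)
    correct = 0
    total = 0
    for prev, nxt in zip(seq, seq[1:]):
        if transitions[prev]:
            guess = transitions[prev].most_common(1)[0][0]
            correct += guess == nxt
            total += 1
        transitions[prev][nxt] += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical discretized demand levels for one block (0 = low, 2 = high)
    demand = [0, 1, 2] * 20
    s = conditional_entropy(demand)
    bound = max_predictability(s, len(set(demand)))
    print(f"conditional entropy: {s:.3f} bits")
    print(f"maximum predictability (Fano bound): {bound:.3f}")
    print(f"Markov predictor accuracy: {markov_accuracy(demand):.3f}")
```

On this perfectly periodic toy sequence the conditional entropy is zero, so the bound is essentially 1 and the Markov predictor reaches it once every transition has been seen. Real block-level demand is noisier, which is where comparing a predictor's accuracy against the bound becomes informative.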

Original language: English (US)
Title of host publication: Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 833-842
Number of pages: 10
ISBN (Electronic): 9781467390040
DOI: https://doi.org/10.1109/BigData.2016.7840676
State: Published - Feb 2 2017
Event: 4th IEEE International Conference on Big Data, Big Data 2016 - Washington, United States
Duration: Dec 5 2016 → Dec 8 2016

Other

Other: 4th IEEE International Conference on Big Data, Big Data 2016
Country: United States
City: Washington
Period: 12/5/16 → 12/8/16

Fingerprint

Neural networks
Learning systems
Entropy
Uncertainty

Keywords

  • human mobility
  • limit of predictability
  • predictive algorithm
  • spatiotemporal data
  • taxi demand prediction

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Hardware and Architecture

Cite this

Zhao, K., Khryashchev, D., Freire, J., Silva, C., & Vo, H. (2017). Predicting taxi demand at high spatial resolution: Approaching the limit of predictability. In Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016 (pp. 833-842). [7840676] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2016.7840676

