Multi-Armed Bandit On-Time Arrival Algorithms for Sequential Reliable Route Selection under Uncertainty

Jinkai Zhou, Xuebo Lai, Joseph Ying Jun Chow

Research output: Contribution to journalArticle

Abstract

Traditionally vehicles act only as servers in transporting passengers and goods. With increasing sensor equipment in vehicles, including automated vehicles, there is a need to test algorithms that consider the dual role of vehicles as both servers and sensors. The paper formulates a sequential route selection problem as a shortest path problem with on-time arrival reliability under a multi-armed bandit setting, a type of reinforcement learning model. A decision-maker has to make a finite set of decisions sequentially on departure time and path between a fixed origin-destination pair such that on-time reliability is maximized while travel time is minimized. The upper confidence bound algorithm is extended to handle this problem. Several tests are conducted. First, simulated data successfully verifies the method, then a real-data scenario is constructed of a hotel shuttle service from midtown Manhattan in New York City providing hourly access to John F. Kennedy International Airport. Results suggest that route selection with multi-armed bandit learning algorithms can be effective but neglecting passenger scheduling constraints can have negative effects on on-time arrival reliability by as much as 4.8% and combined reliability and travel time by 66.1%.

Original languageEnglish (US)
JournalTransportation Research Record
DOIs
StatePublished - Jan 1 2019

Fingerprint

Travel time
Servers
Hotels
Sensors
Reinforcement learning
Airports
Learning algorithms
Scheduling
Uncertainty

ASJC Scopus subject areas

  • Civil and Structural Engineering
  • Mechanical Engineering

Cite this

@article{a6f1822e149248f491832cf33a04756d,
title = "Multi-Armed Bandit On-Time Arrival Algorithms for Sequential Reliable Route Selection under Uncertainty",
abstract = "Traditionally vehicles act only as servers in transporting passengers and goods. With increasing sensor equipment in vehicles, including automated vehicles, there is a need to test algorithms that consider the dual role of vehicles as both servers and sensors. The paper formulates a sequential route selection problem as a shortest path problem with on-time arrival reliability under a multi-armed bandit setting, a type of reinforcement learning model. A decision-maker has to make a finite set of decisions sequentially on departure time and path between a fixed origin-destination pair such that on-time reliability is maximized while travel time is minimized. The upper confidence bound algorithm is extended to handle this problem. Several tests are conducted. First, simulated data successfully verifies the method, then a real-data scenario is constructed of a hotel shuttle service from midtown Manhattan in New York City providing hourly access to John F. Kennedy International Airport. Results suggest that route selection with multi-armed bandit learning algorithms can be effective but neglecting passenger scheduling constraints can have negative effects on on-time arrival reliability by as much as 4.8{\%} and combined reliability and travel time by 66.1{\%}.",
author = "Jinkai Zhou and Xuebo Lai and Chow, {Joseph Ying Jun}",
year = "2019",
month = "1",
day = "1",
doi = "10.1177/0361198119850457",
language = "English (US)",
journal = "Transportation Research Record",
issn = "0361-1981",
publisher = "US National Research Council",

}

TY - JOUR

T1 - Multi-Armed Bandit On-Time Arrival Algorithms for Sequential Reliable Route Selection under Uncertainty

AU - Zhou, Jinkai

AU - Lai, Xuebo

AU - Chow, Joseph Ying Jun

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Traditionally vehicles act only as servers in transporting passengers and goods. With increasing sensor equipment in vehicles, including automated vehicles, there is a need to test algorithms that consider the dual role of vehicles as both servers and sensors. The paper formulates a sequential route selection problem as a shortest path problem with on-time arrival reliability under a multi-armed bandit setting, a type of reinforcement learning model. A decision-maker has to make a finite set of decisions sequentially on departure time and path between a fixed origin-destination pair such that on-time reliability is maximized while travel time is minimized. The upper confidence bound algorithm is extended to handle this problem. Several tests are conducted. First, simulated data successfully verifies the method, then a real-data scenario is constructed of a hotel shuttle service from midtown Manhattan in New York City providing hourly access to John F. Kennedy International Airport. Results suggest that route selection with multi-armed bandit learning algorithms can be effective but neglecting passenger scheduling constraints can have negative effects on on-time arrival reliability by as much as 4.8% and combined reliability and travel time by 66.1%.

AB - Traditionally vehicles act only as servers in transporting passengers and goods. With increasing sensor equipment in vehicles, including automated vehicles, there is a need to test algorithms that consider the dual role of vehicles as both servers and sensors. The paper formulates a sequential route selection problem as a shortest path problem with on-time arrival reliability under a multi-armed bandit setting, a type of reinforcement learning model. A decision-maker has to make a finite set of decisions sequentially on departure time and path between a fixed origin-destination pair such that on-time reliability is maximized while travel time is minimized. The upper confidence bound algorithm is extended to handle this problem. Several tests are conducted. First, simulated data successfully verifies the method, then a real-data scenario is constructed of a hotel shuttle service from midtown Manhattan in New York City providing hourly access to John F. Kennedy International Airport. Results suggest that route selection with multi-armed bandit learning algorithms can be effective but neglecting passenger scheduling constraints can have negative effects on on-time arrival reliability by as much as 4.8% and combined reliability and travel time by 66.1%.

UR - http://www.scopus.com/inward/record.url?scp=85067847194&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067847194&partnerID=8YFLogxK

U2 - 10.1177/0361198119850457

DO - 10.1177/0361198119850457

M3 - Article

AN - SCOPUS:85067847194

JO - Transportation Research Record

JF - Transportation Research Record

SN - 0361-1981

ER -