Zooming in on NYC taxi data with portal

Julia Stoyanovich, Matthew Gilbride, Vera Z. Mott

    Research output: Contribution to journal > Conference article

    Abstract

    In this paper we develop a methodology for analyzing transportation data at different levels of temporal and spatial granularity, and apply it to the TLC Trip Record Dataset, made publicly available by the NYC Taxi & Limousine Commission. This data is naturally represented by a set of trajectories, annotated with time and with additional information such as passenger count and cost. We analyze TLC data to identify hotspots, which point to a lack of convenient public transportation options, and popular routes, which motivate ride-sharing solutions or the addition of a bus route. Our methodology is based on an open-source system called Portal, which supports an algebraic query language for analyzing evolving property graphs. Portal is implemented as an Apache Spark library and is interoperable with other Spark libraries such as Spark SQL, which we also use in our analysis.
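
The idea of viewing the same trips at different temporal and spatial granularities can be illustrated with a minimal sketch. It does not use Portal or Spark, and it is not the authors' method: the trip records, the zone-to-borough grouping, and the function names `time_bucket`, `pickup_counts`, and `hotspots` are all hypothetical, chosen only to show how coarsening the time window and the spatial unit changes which pickup locations stand out.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip records (not TLC data):
# (pickup time, pickup zone, borough, passenger count).
TRIPS = [
    ("2018-01-01 08:05", "Midtown", "Manhattan", 1),
    ("2018-01-01 08:40", "Midtown", "Manhattan", 2),
    ("2018-01-01 09:10", "Astoria", "Queens", 1),
    ("2018-01-01 17:30", "Midtown", "Manhattan", 1),
    ("2018-01-02 08:15", "Astoria", "Queens", 3),
]


def time_bucket(stamp, granularity):
    """Truncate a timestamp string to the requested temporal granularity."""
    t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
    if granularity == "hour":
        return t.strftime("%Y-%m-%d %H:00")
    if granularity == "day":
        return t.strftime("%Y-%m-%d")
    raise ValueError(f"unknown temporal granularity: {granularity}")


def pickup_counts(trips, temporal="hour", spatial="zone"):
    """Count pickups per (time bucket, place) at the chosen granularities."""
    if spatial not in ("zone", "borough"):
        raise ValueError(f"unknown spatial granularity: {spatial}")
    counts = Counter()
    for stamp, zone, borough, _passengers in trips:
        place = zone if spatial == "zone" else borough
        counts[(time_bucket(stamp, temporal), place)] += 1
    return counts


def hotspots(trips, temporal, spatial, k=1):
    """Return the k busiest (time bucket, place) pairs with their counts."""
    return pickup_counts(trips, temporal, spatial).most_common(k)


# Fine-grained view: hourly windows, individual zones.
fine = pickup_counts(TRIPS, "hour", "zone")
# Zoomed-out view: daily windows, whole boroughs.
coarse = pickup_counts(TRIPS, "day", "borough")
```

Calling `hotspots(TRIPS, "day", "borough")` on this toy data ranks the daily, borough-level buckets, while `hotspots(TRIPS, "hour", "zone")` ranks hourly, zone-level ones. A real analysis would run comparable aggregations over the full dataset rather than an in-memory list.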

    Original language: English (US)
    Journal: CEUR Workshop Proceedings
    Volume: 2247
    State: Published - Jan 1 2018
    Event: 2018 Poster Track of the Workshop on Big Social Data and Urban Computing, BiDU-PS 2018 - Rio de Janeiro, Brazil
    Duration: Aug 31 2018 → …

    ASJC Scopus subject areas

    • Computer Science (all)

    Cite this

    Stoyanovich, J., Gilbride, M., & Mott, V. Z. (2018). Zooming in on NYC taxi data with portal. CEUR Workshop Proceedings, 2247.

    @article{f12e6c47de594c9fbc8e2e8fdd80eaa7,
    title = "Zooming in on NYC taxi data with portal",
    author = "Julia Stoyanovich and Matthew Gilbride and Mott, {Vera Z.}",
    year = "2018",
    month = "1",
    day = "1",
    language = "English (US)",
    volume = "2247",
    journal = "CEUR Workshop Proceedings",
    issn = "1613-0073",
    publisher = "CEUR-WS",

    }

    TY - JOUR

    T1 - Zooming in on NYC taxi data with portal

    AU - Stoyanovich, Julia

    AU - Gilbride, Matthew

    AU - Mott, Vera Z.

    PY - 2018/1/1

    Y1 - 2018/1/1

    UR - http://www.scopus.com/inward/record.url?scp=85057539830&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85057539830&partnerID=8YFLogxK

    M3 - Conference article

    VL - 2247

    JO - CEUR Workshop Proceedings

    JF - CEUR Workshop Proceedings

    SN - 1613-0073

    ER -