An urban data profiler

Daniel Castellani Ribeiro, Huy T. Vo, Juliana Freire, Cláudio T. Silva

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Large volumes of urban data are being made available through a variety of open portals. Besides promoting transparency, these data can bring benefits to government, science, citizens and industry. It is no longer a fantasy to ask "if you could know anything about a city, what do you want to know" and to ponder what could be done with that information. However, the great number and variety of datasets creates a new challenge: how to find relevant datasets. While existing portals provide search interfaces, these are often limited to keyword searches over the limited metadata associated with each dataset, for example, attribute names and textual description. In this paper, we present a new tool, UrbanProfiler, that automatically extracts detailed information from datasets. This information includes attribute types, value distributions, and geographical information, which can be used to support complex search queries as well as visualizations that help users explore and obtain insight into the contents of a data collection. Besides describing the tool and its implementation, we present case studies that illustrate how the tool was used to explore a large open urban data repository.

Original language: English (US)
Title of host publication: WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web
Publisher: Association for Computing Machinery, Inc
Pages: 1389-1394
Number of pages: 6
ISBN (Print): 9781450334730
DOI: https://doi.org/10.1145/2740908.2742135
State: Published - May 18 2015
Event: 24th International Conference on World Wide Web, WWW 2015 - Florence, Italy
Duration: May 18 2015 to May 22 2015

Other

Other: 24th International Conference on World Wide Web, WWW 2015
Country: Italy
City: Florence
Period: 5/18/15 to 5/22/15

Fingerprint

Metadata
Transparency
Visualization
Industry

Keywords

  • Automatic Type Detection
  • Dataset Analysis
  • Metadata Extraction

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software

Cite this

Ribeiro, D. C., Vo, H. T., Freire, J., & Silva, C. T. (2015). An urban data profiler. In WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web (pp. 1389-1394). Association for Computing Machinery, Inc. https://doi.org/10.1145/2740908.2742135

An urban data profiler. / Ribeiro, Daniel Castellani; Vo, Huy T.; Freire, Juliana; Silva, Cláudio T.

WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web. Association for Computing Machinery, Inc, 2015. p. 1389-1394.

Ribeiro, DC, Vo, HT, Freire, J & Silva, CT 2015, An urban data profiler. in WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web. Association for Computing Machinery, Inc, pp. 1389-1394, 24th International Conference on World Wide Web, WWW 2015, Florence, Italy, 5/18/15. https://doi.org/10.1145/2740908.2742135
Ribeiro DC, Vo HT, Freire J, Silva CT. An urban data profiler. In WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web. Association for Computing Machinery, Inc. 2015. p. 1389-1394 https://doi.org/10.1145/2740908.2742135
Ribeiro, Daniel Castellani ; Vo, Huy T. ; Freire, Juliana ; Silva, Cláudio T. / An urban data profiler. WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web. Association for Computing Machinery, Inc, 2015. pp. 1389-1394
@inproceedings{9895d49c1d6d4b0297cb833daeb4b90e,
title = "An urban data profiler",
abstract = "Large volumes of urban data are being made available through a variety of open portals. Besides promoting transparency, these data can bring benefits to government, science, citizens and industry. It is no longer a fantasy to ask {"}if you could know anything about a city, what do you want to know{"} and to ponder what could be done with that information. However, the great number and variety of datasets creates a new challenge: how to find relevant datasets. While existing portals provide search interfaces, these are often limited to keyword searches over the limited metadata associated with each dataset, for example, attribute names and textual description. In this paper, we present a new tool, UrbanProfiler, that automatically extracts detailed information from datasets. This information includes attribute types, value distributions, and geographical information, which can be used to support complex search queries as well as visualizations that help users explore and obtain insight into the contents of a data collection. Besides describing the tool and its implementation, we present case studies that illustrate how the tool was used to explore a large open urban data repository.",
keywords = "Automatic Type Detection, Dataset Analysis, Metadata Extraction",
author = "Ribeiro, {Daniel Castellani} and Vo, {Huy T.} and Juliana Freire and Silva, {Cl{\'a}udio T.}",
year = "2015",
month = "5",
day = "18",
doi = "10.1145/2740908.2742135",
language = "English (US)",
isbn = "9781450334730",
pages = "1389--1394",
booktitle = "WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web",
publisher = "Association for Computing Machinery, Inc",

}

TY - GEN

T1 - An urban data profiler

AU - Ribeiro, Daniel Castellani

AU - Vo, Huy T.

AU - Freire, Juliana

AU - Silva, Cláudio T.

PY - 2015/5/18

Y1 - 2015/5/18

AB - Large volumes of urban data are being made available through a variety of open portals. Besides promoting transparency, these data can bring benefits to government, science, citizens and industry. It is no longer a fantasy to ask "if you could know anything about a city, what do you want to know" and to ponder what could be done with that information. However, the great number and variety of datasets creates a new challenge: how to find relevant datasets. While existing portals provide search interfaces, these are often limited to keyword searches over the limited metadata associated with each dataset, for example, attribute names and textual description. In this paper, we present a new tool, UrbanProfiler, that automatically extracts detailed information from datasets. This information includes attribute types, value distributions, and geographical information, which can be used to support complex search queries as well as visualizations that help users explore and obtain insight into the contents of a data collection. Besides describing the tool and its implementation, we present case studies that illustrate how the tool was used to explore a large open urban data repository.

KW - Automatic Type Detection

KW - Dataset Analysis

KW - Metadata Extraction

UR - http://www.scopus.com/inward/record.url?scp=84968546467&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84968546467&partnerID=8YFLogxK

U2 - 10.1145/2740908.2742135

DO - 10.1145/2740908.2742135

M3 - Conference contribution

AN - SCOPUS:84968546467

SN - 9781450334730

SP - 1389

EP - 1394

BT - WWW 2015 Companion - Proceedings of the 24th International Conference on World Wide Web

PB - Association for Computing Machinery, Inc

ER -