Data debugging and exploration with Vizier

Mike Brachmann, Carlos Bautista, Sonia Castelo, Su Feng, Juliana Freire, Boris Glavic, Oliver Kennedy, Heiko Müller, Rémi Rampin, William Spoth, Ying Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present Vizier, a multi-modal data exploration and debugging tool. The system supports a wide range of operations by seamlessly integrating Python, SQL, and automated data curation and debugging methods. Using Spark as an execution backend, Vizier handles large datasets in multiple formats. Ease of use is attained through integration of a notebook with a spreadsheet-style interface and with visualizations that guide and support the user in the loop. In addition, native support for provenance and versioning enables collaboration and uncertainty management. In this demonstration we will illustrate the diverse features of the system using several realistic data science tasks based on real data.

Original language: English (US)
Title of host publication: SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 1877-1880
Number of pages: 4
ISBN (Electronic): 9781450356435
State: Published - Jun 25 2019
Event: 2019 International Conference on Management of Data, SIGMOD 2019 - Amsterdam, Netherlands
Duration: Jun 30 2019 to Jul 5 2019

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: 2019 International Conference on Management of Data, SIGMOD 2019
Country: Netherlands
City: Amsterdam
Period: 6/30/19 to 7/5/19

Fingerprint

Spreadsheets
Electric sparks
Demonstrations
Visualization
Uncertainty

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Brachmann, M., Bautista, C., Castelo, S., Feng, S., Freire, J., Glavic, B., ... Yang, Y. (2019). Data debugging and exploration with Vizier. In SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data (pp. 1877-1880). (Proceedings of the ACM SIGMOD International Conference on Management of Data). Association for Computing Machinery.

@inproceedings{23564af1d4584701a24dd33dee9a2366,
title = "Data debugging and exploration with {V}izier",
abstract = "We present Vizier, a multi-modal data exploration and debugging tool. The system supports a wide range of operations by seamlessly integrating Python, SQL, and automated data curation and debugging methods. Using Spark as an execution backend, Vizier handles large datasets in multiple formats. Ease of use is attained through integration of a notebook with a spreadsheet-style interface and with visualizations that guide and support the user in the loop. In addition, native support for provenance and versioning enables collaboration and uncertainty management. In this demonstration we will illustrate the diverse features of the system using several realistic data science tasks based on real data.",
author = "Mike Brachmann and Carlos Bautista and Sonia Castelo and Su Feng and Juliana Freire and Boris Glavic and Oliver Kennedy and Heiko M{\"u}ller and R{\'e}mi Rampin and William Spoth and Ying Yang",
year = "2019",
month = "6",
day = "25",
language = "English (US)",
series = "Proceedings of the ACM SIGMOD International Conference on Management of Data",
publisher = "Association for Computing Machinery",
pages = "1877--1880",
booktitle = "SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data",

}

TY  - GEN
T1  - Data debugging and exploration with Vizier
AU  - Brachmann, Mike
AU  - Bautista, Carlos
AU  - Castelo, Sonia
AU  - Feng, Su
AU  - Freire, Juliana
AU  - Glavic, Boris
AU  - Kennedy, Oliver
AU  - Müller, Heiko
AU  - Rampin, Rémi
AU  - Spoth, William
AU  - Yang, Ying
PY  - 2019/6/25
Y1  - 2019/6/25
AB  - We present Vizier, a multi-modal data exploration and debugging tool. The system supports a wide range of operations by seamlessly integrating Python, SQL, and automated data curation and debugging methods. Using Spark as an execution backend, Vizier handles large datasets in multiple formats. Ease of use is attained through integration of a notebook with a spreadsheet-style interface and with visualizations that guide and support the user in the loop. In addition, native support for provenance and versioning enables collaboration and uncertainty management. In this demonstration we will illustrate the diverse features of the system using several realistic data science tasks based on real data.
UR  - http://www.scopus.com/inward/record.url?scp=85069450009&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85069450009&partnerID=8YFLogxK
M3  - Conference contribution
T3  - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP  - 1877
EP  - 1880
BT  - SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
PB  - Association for Computing Machinery
ER  -