From XML Schema to relations

A cost-based approach to XML storage

Philip Bohannon, Juliana Freire, Prasan Roy, Jérôme Siméon

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

As Web applications manipulate an increasing amount of XML, there is a growing interest in storing XML data in relational databases. Due to the mismatch between the complexity of XML's tree structure and the simplicity of flat relational tables, there are many ways to store the same document in an RDBMS, and a number of heuristic techniques have been proposed. These techniques typically define fixed mappings and do not take application characteristics into account. However, a fixed mapping is unlikely to work well for all possible applications. In contrast, LegoDB is a cost-based XML storage mapping engine that explores a space of possible XML-to-relational mappings and selects the best mapping for a given application. LegoDB leverages current XML and relational technologies: 1) it models the target application with an XML Schema, XML data statistics, and an XQuery workload; 2) the space of configurations is generated through XML-Schema rewritings; and 3) the best among the derived configurations is selected using cost estimates obtained through a standard relational optimizer. In this paper, we describe the LegoDB storage engine and provide experimental results that demonstrate the effectiveness of this approach.
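The three steps listed in the abstract (model the application, generate configurations by rewriting, pick the cheapest by estimated cost) can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a configuration is reduced to the set of elements inlined into their parent's table, the only "rewriting" is flipping one element between inlined and outlined, the cost formula is a made-up stand-in for a real relational optimizer's estimate, and greedy hill-climbing is just one plausible search strategy (the abstract does not name one).

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical model: a configuration is the set of schema elements that are
# inlined into their parent's table; all other elements get their own table.
Config = FrozenSet[str]
Query = Tuple[int, Set[str]]  # (weight in the workload, elements accessed)

JOIN_COST = 10   # made-up cost per row for joining an outlined element
WIDTH_COST = 1   # made-up cost per row for carrying an unused inlined element


def estimate_cost(config: Config, workload: List[Query], stats: Dict[str, int]) -> int:
    """Toy stand-in for an optimizer estimate: weighted sum over the workload."""
    total = 0
    for weight, accessed in workload:
        # Accessed elements stored in their own table require a join.
        joins = sum(JOIN_COST * stats[e] for e in accessed if e not in config)
        # Inlined elements the query does not need still widen the scanned rows.
        extra_width = sum(WIDTH_COST * stats[e] for e in config if e not in accessed)
        total += weight * (joins + extra_width)
    return total


def neighbours(config: Config, elements: List[str]) -> List[Config]:
    """One toy rewriting step: flip a single element between inlined and outlined."""
    return [config ^ frozenset({e}) for e in elements]


def choose_mapping(
    elements: List[str], workload: List[Query], stats: Dict[str, int]
) -> Tuple[Config, int]:
    """Greedy search: move to the cheapest neighbour while it strictly improves."""
    current: Config = frozenset()  # start with every element outlined
    current_cost = estimate_cost(current, workload, stats)
    while True:
        candidates = neighbours(current, elements)
        if not candidates:
            return current, current_cost
        best = min(candidates, key=lambda c: estimate_cost(c, workload, stats))
        best_cost = estimate_cost(best, workload, stats)
        if best_cost >= current_cost:
            return current, current_cost
        current, current_cost = best, best_cost


if __name__ == "__main__":
    # Invented statistics and workload: titles are read often, reviews rarely.
    stats = {"title": 100, "review": 1000}
    workload = [(10, {"title"}), (1, {"review"})]
    print(choose_mapping(["title", "review"], workload, stats))
```

With this invented workload, the frequently accessed element ends up inlined while the large, rarely accessed one stays in its own table. A different workload over the same schema can yield a different choice, which is the point the abstract makes against fixed mappings.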

Original language: English (US)
Title of host publication: Proceedings - International Conference on Data Engineering
Editors: R Agrawal, K Dittrich, A Ngu
Pages: 64-75
Number of pages: 12
State: Published - 2002
Event: 18th International Conference on Data Engineering - San Jose, CA, United States
Duration: Feb 26 2002 to Mar 1 2002

Fingerprint

XML
Costs
Engines
World Wide Web
Statistics

ASJC Scopus subject areas

  • Software
  • Engineering(all)
  • Engineering (miscellaneous)

Cite this

Bohannon, P., Freire, J., Roy, P., & Siméon, J. (2002). From XML Schema to relations: A cost-based approach to XML storage. In R. Agrawal, K. Dittrich, & A. Ngu (Eds.), Proceedings - International Conference on Data Engineering (pp. 64-75).

@inproceedings{7b2280e575aa43eba2282305f37b0b9d,
title = "From XML Schema to relations: A cost-based approach to XML storage",
author = "Philip Bohannon and Juliana Freire and Prasan Roy and J{\'e}r{\^o}me Sim{\'e}on",
year = "2002",
language = "English (US)",
pages = "64--75",
editor = "R Agrawal and K Dittrich and A Ngu",
booktitle = "Proceedings - International Conference on Data Engineering",

}

TY - GEN
T1 - From XML Schema to relations
T2 - A cost-based approach to XML storage
AU - Bohannon, Philip
AU - Freire, Juliana
AU - Roy, Prasan
AU - Siméon, Jérôme
PY - 2002
Y1 - 2002

N2 - As Web applications manipulate an increasing amount of XML, there is a growing interest in storing XML data in relational databases. Due to the mismatch between the complexity of XML's tree structure and the simplicity of flat relational tables, there are many ways to store the same document in an RDBMS, and a number of heuristic techniques have been proposed. These techniques typically define fixed mappings and do not take application characteristics into account. However, a fixed mapping is unlikely to work well for all possible applications. In contrast, LegoDB is a cost-based XML storage mapping engine that explores a space of possible XML-to-relational mappings and selects the best mapping for a given application. LegoDB leverages current XML and relational technologies: 1) it models the target application with an XML Schema, XML data statistics, and an XQuery workload; 2) the space of configurations is generated through XML-Schema rewritings; and 3) the best among the derived configurations is selected using cost estimates obtained through a standard relational optimizer. In this paper, we describe the LegoDB storage engine and provide experimental results that demonstrate the effectiveness of this approach.

UR - http://www.scopus.com/inward/record.url?scp=0036206114&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036206114&partnerID=8YFLogxK
M3 - Conference contribution
SP - 64
EP - 75
BT - Proceedings - International Conference on Data Engineering
A2 - Agrawal, R
A2 - Dittrich, K
A2 - Ngu, A
ER -