### Abstract

Partitioning a multi-dimensional data set into rectangular partitions subject to certain constraints is an important problem that arises in many database applications, including histogram-based selectivity estimation, load balancing, and construction of index structures. While provably optimal and efficient algorithms exist for partitioning one-dimensional data, the multi-dimensional problem has received less attention, except for a few special cases. As a result, the heuristic partitioning techniques that are used in practice are not well understood, and come with no guarantees on the quality of the solution. In this paper, we present algorithmic and complexity-theoretic results for the fundamental problem of partitioning a two-dimensional array into rectangular tiles of arbitrary size in a way that minimizes the number of tiles required to satisfy a given constraint. Our main results are approximation algorithms for several partitioning problems that provably approximate the optimal solutions within small constant factors, and that run in linear or close to linear time. We also establish the NP-hardness of several partitioning problems; it is therefore unlikely that there are efficient, i.e., polynomial-time, algorithms for solving these problems exactly. We also discuss a few applications in which partitioning problems arise. One of the applications is the problem of constructing multi-dimensional histograms. Our results, for example, give an efficient algorithm to construct the V-Optimal histograms, which are known to be the most accurate histograms in several selectivity estimation problems. Our algorithms are the first to provide guaranteed bounds on the quality of the solution.
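
To make the one-dimensional baseline mentioned in the abstract concrete, the following is a minimal sketch of the standard textbook dynamic program for a one-dimensional V-Optimal histogram, which splits a sequence into contiguous buckets so that the total within-bucket sum of squared errors is minimal. It is not the paper's two-dimensional approximation algorithm; the function name `v_optimal_1d` and its parameters are assumptions made for illustration only.

```python
from itertools import accumulate


def v_optimal_1d(values, num_buckets):
    """Illustrative 1D V-Optimal histogram via dynamic programming.

    Returns (min_sse, buckets), where buckets is a list of half-open
    index ranges (start, end) covering values[0:len(values)].
    Hypothetical helper, not taken from the paper.
    """
    n = len(values)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(values)")

    # Prefix sums of values and of squared values give O(1) bucket error.
    prefix = [0.0] + list(accumulate(values))
    prefix_sq = [0.0] + list(accumulate(v * v for v in values))

    def sse(i, j):
        # Sum of squared deviations of values[i:j] from their mean.
        s = prefix[j] - prefix[i]
        sq = prefix_sq[j] - prefix_sq[i]
        return sq - s * s / (j - i)

    inf = float("inf")
    # best[k][j]: minimal error when values[:j] is split into k buckets.
    best = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    # cut[k][j]: start index of the last bucket in that optimal split.
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    best[0][0] = 0.0

    for k in range(1, num_buckets + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                cost = best[k - 1][i] + sse(i, j)
                if cost < best[k][j]:
                    best[k][j] = cost
                    cut[k][j] = i

    # Walk the cut table backwards to recover the bucket boundaries.
    buckets = []
    j = n
    for k in range(num_buckets, 0, -1):
        i = cut[k][j]
        buckets.append((i, j))
        j = i
    buckets.reverse()
    return best[num_buckets][n], buckets
```

For example, `v_optimal_1d([1, 1, 1, 10, 10, 10], 2)` places the single boundary between the two runs of equal values. The sketch runs in O(n² · B) time for n values and B buckets, and it fixes the bucket count and minimizes error, whereas the abstract's two-dimensional formulation minimizes the number of tiles subject to a constraint.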

Original language | English (US) |
---|---|
Title of host publication | Database Theory - ICDT 1999 - 7th International Conference, Proceedings |
Publisher | Springer Verlag |
Pages | 236-256 |
Number of pages | 21 |
Volume | 1540 |
ISBN (Print) | 3540654526, 9783540654520 |
State | Published - 1998 |
Event | 7th International Conference on Database Theory, ICDT 1999 - Jerusalem, Israel; Duration: Jan 10 1999 → Jan 12 1999 |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 1540 |
ISSN (Print) | 03029743 |
ISSN (Electronic) | 16113349 |

### Other

Other | 7th International Conference on Database Theory, ICDT 1999 |
---|---|
Country | Israel |
City | Jerusalem |
Period | 1/10/99 → 1/12/99 |

### ASJC Scopus subject areas

- Computer Science (all)
- Theoretical Computer Science

### Cite this

*Database Theory - ICDT 1999 - 7th International Conference, Proceedings* (Vol. 1540, pp. 236-256). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1540). Springer Verlag.

**On rectangular partitionings in two dimensions: Algorithms, complexity, and applications.** / Muthukrishnan, S.; Poosala, Viswanath; Suel, Torsten.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

*Database Theory - ICDT 1999 - 7th International Conference, Proceedings.* vol. 1540, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 1540, Springer Verlag, pp. 236-256, 7th International Conference on Database Theory, ICDT 1999, Jerusalem, Israel, 1/10/99.

TY - GEN

T1 - On rectangular partitionings in two dimensions

T2 - Algorithms, complexity, and applications

AU - Muthukrishnan, S.

AU - Poosala, Viswanath

AU - Suel, Torsten

PY - 1998

Y1 - 1998

UR - http://www.scopus.com/inward/record.url?scp=18144380057&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=18144380057&partnerID=8YFLogxK

M3 - Conference contribution

SN - 3540654526

SN - 9783540654520

VL - 1540

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 236

EP - 256

BT - Database Theory - ICDT 1999 - 7th International Conference, Proceedings

PB - Springer Verlag

ER -