Alphabet partitioning techniques for semiadaptive Huffman coding of large alphabets

Dan Chen, Yi Jen Chiang, Nasir Memon, Xiaolin Wu

Research output: Contribution to journal (Article)

Abstract

Practical applications that employ entropy coding for large alphabets often partition the alphabet set into two or more layers, and encode each symbol by using some suitable prefix coding for each layer. In this paper, we formulate the problem of finding an alphabet partitioning for the design of a two-layer semiadaptive code as an optimization problem, and give a solution based on dynamic programming. However, the complexity of the dynamic programming approach can be quite prohibitive for a long sequence and a very large alphabet size. Hence, we also give a simple greedy heuristic algorithm whose running time is linear in the length of the input sequence, irrespective of the underlying alphabet size. Although our dynamic programming and greedy algorithms do not provide a globally optimal solution for the alphabet partitioning problem, experimental results demonstrate that superior prefix coding schemes for large alphabets can be designed using our new approach.
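
To make the two-layer idea concrete, the following Python sketch computes how many bits a sequence would need under one possible two-layer scheme. It is an illustration under stated assumptions, not the algorithm from the paper: the alphabet is assumed to be integers split into contiguous intervals, the first layer is a Huffman code over interval (group) indices with frequencies measured from the sequence itself, the second layer is a fixed-length offset inside the group, and the cost of transmitting the code table is ignored. The names `huffman_code_lengths`, `two_layer_cost`, and `boundaries` are hypothetical.

```python
import heapq
from bisect import bisect_right
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {key: code length in bits} for a Huffman code over `freqs`."""
    lengths = dict.fromkeys(freqs, 0)
    # Heap entries: (weight, tie-breaker, keys contained in this subtree).
    heap = [(w, i, [k]) for i, (k, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, keys1 = heapq.heappop(heap)
        w2, _, keys2 = heapq.heappop(heap)
        # Every key under the merged node moves one level deeper.
        for k in keys1 + keys2:
            lengths[k] += 1
        heapq.heappush(heap, (w1 + w2, tie, keys1 + keys2))
        tie += 1
    # With a single group the loop never runs, so its length stays 0:
    # the decoder already knows which group every symbol is in.
    return lengths


def two_layer_cost(sequence, boundaries):
    """Total bits for `sequence` when group i covers
    [boundaries[i], boundaries[i + 1]) (assumed partition format)."""
    if any(not boundaries[0] <= s < boundaries[-1] for s in sequence):
        raise ValueError("symbol outside the partitioned alphabet")
    # Layer 1: which group each symbol falls into.
    groups = [bisect_right(boundaries, s) - 1 for s in sequence]
    lengths = huffman_code_lengths(Counter(groups))
    total = 0
    for g in groups:
        size = boundaries[g + 1] - boundaries[g]
        # Layer 2: fixed-length offset, ceil(log2(size)) bits.
        offset_bits = (size - 1).bit_length()
        total += lengths[g] + offset_bits
    return total


if __name__ == "__main__":
    data = [0, 1, 0, 2, 5, 0, 1, 200, 3, 0]
    print(two_layer_cost(data, [0, 4, 16, 256]))  # three groups
    print(two_layer_cost(data, [0, 256]))         # one group: flat 8-bit code
```

A partitioning search such as the dynamic programming or greedy approach described above would, in this simplified setting, look for the `boundaries` that minimize a cost of this kind; the exact cost function used in the paper may differ.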

Original language: English (US)
Pages (from-to): 436-443
Number of pages: 8
Journal: IEEE Transactions on Communications
Volume: 55
Issue number: 3
DOI: 10.1109/TCOMM.2006.888894
State: Published - Mar 2007

Fingerprint

Dynamic programming
Heuristic algorithms
Entropy

Keywords

  • Data compression
  • Dynamic programming
  • Greedy heuristic
  • Large alphabet partitioning
  • Two-layer semiadaptive coding

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Networks and Communications

Cite this

Alphabet partitioning techniques for semiadaptive Huffman coding of large alphabets. / Chen, Dan; Chiang, Yi Jen; Memon, Nasir; Wu, Xiaolin.

In: IEEE Transactions on Communications, Vol. 55, No. 3, 03.2007, p. 436-443.

@article{0435396aba174c4a9a34a7aade621c22,
title = "Alphabet partitioning techniques for semiadaptive Huffman coding of large alphabets",
abstract = "Practical applications that employ entropy coding for large alphabets often partition the alphabet set into two or more layers, and encode each symbol by using some suitable prefix coding for each layer. In this paper, we formulate the problem of finding an alphabet partitioning for the design of a two-layer semiadaptive code as an optimization problem, and give a solution based on dynamic programming. However, the complexity of the dynamic programming approach can be quite prohibitive for a long sequence and a very large alphabet size. Hence, we also give a simple greedy heuristic algorithm whose running time is linear in the length of the input sequence, irrespective of the underlying alphabet size. Although our dynamic programming and greedy algorithms do not provide a globally optimal solution for the alphabet partitioning problem, experimental results demonstrate that superior prefix coding schemes for large alphabets can be designed using our new approach.",
keywords = "Data compression, Dynamic programming, Greedy heuristic, Large alphabet partitioning, Two-layer semiadaptive coding",
author = "Chen, Dan and Chiang, {Yi Jen} and Memon, Nasir and Wu, Xiaolin",
year = "2007",
month = "3",
doi = "10.1109/TCOMM.2006.888894",
language = "English (US)",
volume = "55",
pages = "436--443",
journal = "IEEE Transactions on Communications",
issn = "0090-6778",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",
}

TY - JOUR

T1 - Alphabet partitioning techniques for semiadaptive Huffman coding of large alphabets

AU - Chen, Dan

AU - Chiang, Yi Jen

AU - Memon, Nasir

AU - Wu, Xiaolin

PY - 2007/3

Y1 - 2007/3

AB - Practical applications that employ entropy coding for large alphabets often partition the alphabet set into two or more layers, and encode each symbol by using some suitable prefix coding for each layer. In this paper, we formulate the problem of finding an alphabet partitioning for the design of a two-layer semiadaptive code as an optimization problem, and give a solution based on dynamic programming. However, the complexity of the dynamic programming approach can be quite prohibitive for a long sequence and a very large alphabet size. Hence, we also give a simple greedy heuristic algorithm whose running time is linear in the length of the input sequence, irrespective of the underlying alphabet size. Although our dynamic programming and greedy algorithms do not provide a globally optimal solution for the alphabet partitioning problem, experimental results demonstrate that superior prefix coding schemes for large alphabets can be designed using our new approach.

KW - Data compression

KW - Dynamic programming

KW - Greedy heuristic

KW - Large alphabet partitioning

KW - Two-layer semiadaptive coding

UR - http://www.scopus.com/inward/record.url?scp=34147149767&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34147149767&partnerID=8YFLogxK

U2 - 10.1109/TCOMM.2006.888894

DO - 10.1109/TCOMM.2006.888894

M3 - Article

VL - 55

SP - 436

EP - 443

JO - IEEE Transactions on Communications

JF - IEEE Transactions on Communications

SN - 0090-6778

IS - 3

ER -