DjVu: Analyzing and compressing scanned documents for Internet distribution

Patrick Haffner, Léon Bottou, Paul G. Howard, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

DjVu is an image compression technique specifically geared towards the compression of scanned documents in color at high resolution. Typical color magazine pages scanned at 300 dpi are compressed to between 40 and 80 kBytes, or 5 to 10 times smaller than with JPEG for a similar level of subjective quality. The foreground layer, which contains the text and drawings and requires high spatial resolution, is separated from the background layer, which contains pictures and backgrounds and requires less resolution. The foreground is compressed with a bi-tonal image compression technique that takes advantage of character shape similarities. The background is compressed with a new progressive, wavelet-based compression method. A real-time, memory-efficient version of the decoder is available as a plug-in for popular Web browsers.
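The layered model described in the abstract can be illustrated with a minimal sketch. This is not the DjVu decoder: it assumes grayscale pixels, a single foreground color, and nearest-neighbor upsampling in place of wavelet decoding. The names `upsample_nearest` and `composite` and the sample data are hypothetical, chosen only for illustration.

```python
def upsample_nearest(image, factor):
    """Enlarge a 2-D grid by repeating each pixel `factor` times in both directions."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def composite(mask, foreground_color, background_lowres, factor):
    """Rebuild a page from a full-resolution bi-tonal mask and a low-resolution background.

    Assumption: one foreground color for the whole page; a real codec would
    also decode per-region foreground colors and a wavelet-coded background.
    """
    background = upsample_nearest(background_lowres, factor)
    return [
        [foreground_color if m else bg for m, bg in zip(mask_row, bg_row)]
        for mask_row, bg_row in zip(mask, background)
    ]


# Full-resolution mask: 1 marks text/drawing pixels (foreground layer).
mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]
# Background stored at half the resolution in each direction.
background_lowres = [
    [200, 220],
    [180, 160],
]
page = composite(mask, 0, background_lowres, 2)
```

The sketch shows why the split saves space: the background carries a quarter of the samples here, while sharp text edges are preserved by the full-resolution mask.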

Original language: English (US)
Title of host publication: Proceedings of the 5th International Conference on Document Analysis and Recognition, ICDAR 1999
Publisher: IEEE Computer Society
Pages: 629-632
Number of pages: 4
ISBN (Electronic): 0769503187
DOIs: https://doi.org/10.1109/ICDAR.1999.791865
State: Published - Jan 1 1999
Event: 5th International Conference on Document Analysis and Recognition, ICDAR 1999 - Bangalore, India
Duration: Sep 20 1999 to Sep 22 1999

Other

Other: 5th International Conference on Document Analysis and Recognition, ICDAR 1999
Country: India
City: Bangalore
Period: 9/20/99 to 9/22/99

Fingerprint

Image compression
Internet
Color
Web browsers
Data storage equipment

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

Cite this

Haffner, P., Bottou, L., Howard, P. G., & LeCun, Y. (1999). DjVu: Analyzing and compressing scanned documents for Internet distribution. In Proceedings of the 5th International Conference on Document Analysis and Recognition, ICDAR 1999 (pp. 629-632). [791865] IEEE Computer Society. https://doi.org/10.1109/ICDAR.1999.791865

@inproceedings{3462fa436809495080d87afde136e670,
title = "DjVu: Analyzing and compressing scanned documents for Internet distribution",
author = "Patrick Haffner and L{\'e}on Bottou and Howard, {Paul G.} and Yann LeCun",
year = "1999",
month = "1",
day = "1",
doi = "10.1109/ICDAR.1999.791865",
language = "English (US)",
pages = "629--632",
booktitle = "Proceedings of the 5th International Conference on Document Analysis and Recognition, ICDAR 1999",
publisher = "IEEE Computer Society",
address = "United States",

}

TY - GEN
T1 - DjVu
T2 - Analyzing and compressing scanned documents for Internet distribution
AU - Haffner, Patrick
AU - Bottou, Léon
AU - Howard, Paul G.
AU - LeCun, Yann
PY - 1999/1/1
Y1 - 1999/1/1

N2 - DjVu is an image compression technique specifically geared towards the compression of scanned documents in color at high resolution. Typical color magazine pages scanned at 300 dpi are compressed to between 40 and 80 kBytes, or 5 to 10 times smaller than with JPEG for a similar level of subjective quality. The foreground layer, which contains the text and drawings and requires high spatial resolution, is separated from the background layer, which contains pictures and backgrounds and requires less resolution. The foreground is compressed with a bi-tonal image compression technique that takes advantage of character shape similarities. The background is compressed with a new progressive, wavelet-based compression method. A real-time, memory-efficient version of the decoder is available as a plug-in for popular Web browsers.

UR - http://www.scopus.com/inward/record.url?scp=85038128172&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85038128172&partnerID=8YFLogxK
U2 - 10.1109/ICDAR.1999.791865
DO - 10.1109/ICDAR.1999.791865
M3 - Conference contribution
SP - 629
EP - 632
BT - Proceedings of the 5th International Conference on Document Analysis and Recognition, ICDAR 1999
PB - IEEE Computer Society
ER -