Color documents on the web with DjVu

Patrick Haffner, Yann LeCun, Leon Bottou, Paul Howard, Pascal Vincent, Bill Riemers

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

We present a new image compression technique called "DjVu" that is specifically geared towards the compression of scanned documents in color at high resolution. With DjVu, a magazine page in color at 300 dpi typically occupies between 40 KB and 80 KB, approximately 5 to 10 times better than JPEG for a similar level of readability. Using a combination of Hidden Markov Model techniques and MDL-driven heuristics, DjVu first classifies each pixel in the image as either foreground (text, drawings) or background (pictures, photos, paper texture). The pixel categories form a bitonal image which is compressed using a pattern matching technique that takes advantage of the similarities between character shapes. A progressive, wavelet-based compression technique, combined with a masking algorithm, is then used to compress the foreground and background images at lower resolutions while minimizing the number of bits spent on the pixels that are not visible in the foreground and background planes. Encoders, decoders, and real-time, memory-efficient plug-ins for various web browsers are available for all the major platforms.
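The layered split described in the abstract can be illustrated with a toy sketch. This is not the authors' encoder: a fixed luminance threshold stands in for the Hidden Markov Model and MDL-driven classifier, plain block averaging stands in for the wavelet coder, and the names `split_layers`, `reconstruct`, `threshold`, and `factor` are hypothetical. It only shows how a bitonal mask plus two lower-resolution layers can represent a page, and how blocks with no visible pixels in a layer can be left empty.

```python
def split_layers(image, threshold=128, factor=2):
    """Split a grayscale page (list of rows of ints 0..255) into three parts.

    Assumption: a simple threshold replaces the paper's classifier, and
    block averaging replaces its wavelet-based compression.
    """
    height, width = len(image), len(image[0])
    # Bitonal mask: 1 marks foreground (dark ink), 0 marks background.
    mask = [[1 if px < threshold else 0 for px in row] for row in image]

    def downsample(select):
        """Average, per block, only the pixels the mask assigns to this layer."""
        layer = []
        for top in range(0, height, factor):
            row = []
            for left in range(0, width, factor):
                visible = [
                    image[y][x]
                    for y in range(top, min(top + factor, height))
                    for x in range(left, min(left + factor, width))
                    if mask[y][x] == select
                ]
                # None marks a block that is never visible in this layer,
                # so a coder would not need to spend bits on it.
                row.append(sum(visible) // len(visible) if visible else None)
            layer.append(row)
        return layer

    return mask, downsample(1), downsample(0)


def reconstruct(mask, foreground, background, factor=2):
    """Rebuild the page: each mask bit picks which low-resolution layer to show."""
    return [
        [
            (foreground if bit else background)[y // factor][x // factor]
            for x, bit in enumerate(row)
        ]
        for y, row in enumerate(mask)
    ]
```

Calling `split_layers(page)` on a small grayscale page returns the mask and the two layers at half resolution, and `reconstruct` reassembles an approximation of the page from them. In this sketch the mask stays at full resolution, which mirrors the idea that sharp character shapes are carried by the bitonal image while the color layers can be coarser.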

Original language: English (US)
Title of host publication: IEEE International Conference on Image Processing
Publisher: IEEE
Pages: 239-243
Number of pages: 5
Volume: 1
State: Published - 1999
Event: International Conference on Image Processing (ICIP'99) - Kobe, Japan
Duration: Oct 24 1999 to Oct 28 1999

Fingerprint

Pixels
Color
Web browsers
Pattern matching
Hidden Markov models
Image compression
Textures
Data storage equipment

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Haffner, P., LeCun, Y., Bottou, L., Howard, P., Vincent, P., & Riemers, B. (1999). Color documents on the web with DjVu. In IEEE International Conference on Image Processing (Vol. 1, pp. 239-243). IEEE.

@inproceedings{01203d85eb0641e09b52e9638c08e4e0,
title = "Color documents on the web with DjVu",
abstract = "We present a new image compression technique called `DjVu' that is specifically geared towards the compression of scanned documents in color at high resolution. With DjVu, a magazine page in color at 300 dpi typically occupies between 40 KB and 80 KB, approximately 5 to 10 times better than JPEG for a similar level of readability. Using a combination of Hidden Markov Model techniques and MDL-driven heuristics, DjVu first classifies each pixel in the image as either foreground (text, drawings) or background (pictures, photos, paper texture). The pixel categories form a bitonal image which is compressed using a pattern matching technique that takes advantage of the similarities between character shapes. A progressive, wavelet-based compression technique, combined with a masking algorithm, is then used to compress the foreground and background images at lower resolutions while minimizing the number of bits spent on the pixels that are not visible in the foreground and background planes. Encoders, decoders, and real-time, memory-efficient plug-ins for various web browsers are available for all the major platforms.",
author = "Patrick Haffner and Yann LeCun and Leon Bottou and Paul Howard and Pascal Vincent and Bill Riemers",
year = "1999",
language = "English (US)",
volume = "1",
pages = "239--243",
booktitle = "IEEE International Conference on Image Processing",
publisher = "IEEE",

}

TY - GEN

T1 - Color documents on the web with DjVu

AU - Haffner, Patrick

AU - LeCun, Yann

AU - Bottou, Leon

AU - Howard, Paul

AU - Vincent, Pascal

AU - Riemers, Bill

PY - 1999

Y1 - 1999

AB - We present a new image compression technique called "DjVu" that is specifically geared towards the compression of scanned documents in color at high resolution. With DjVu, a magazine page in color at 300 dpi typically occupies between 40 KB and 80 KB, approximately 5 to 10 times better than JPEG for a similar level of readability. Using a combination of Hidden Markov Model techniques and MDL-driven heuristics, DjVu first classifies each pixel in the image as either foreground (text, drawings) or background (pictures, photos, paper texture). The pixel categories form a bitonal image which is compressed using a pattern matching technique that takes advantage of the similarities between character shapes. A progressive, wavelet-based compression technique, combined with a masking algorithm, is then used to compress the foreground and background images at lower resolutions while minimizing the number of bits spent on the pixels that are not visible in the foreground and background planes. Encoders, decoders, and real-time, memory-efficient plug-ins for various web browsers are available for all the major platforms.

UR - http://www.scopus.com/inward/record.url?scp=0033317014&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0033317014&partnerID=8YFLogxK

M3 - Conference contribution

VL - 1

SP - 239

EP - 243

BT - IEEE International Conference on Image Processing

PB - IEEE

ER -