Intelligible encoding of ASL image sequences at extremely low information rates

George Sperling, Michael Landy, Yoav Cohen, M. Pavel

Research output: Contribution to journal › Article

Abstract

American Sign Language (ASL) is a gestural language used by the hearing impaired. This paper describes experimental tests with deaf subjects that compared the most effective known methods of creating extremely compressed ASL images. The minimum requirements for intelligibility were determined for three basically different kinds of transformations: (1) gray-scale transformations that subsample the images in space and time; (2) two-level intensity quantization that converts the gray-scale image into a black-and-white approximation; (3) transformations that convert the images into black-and-white outline drawings (cartoons). In Experiment 1, five subjects made quality ratings of 81 kinds of images that varied in spatial resolution, frame rate, and type of transformation. The most promising image size was 96 × 64 pixels (height × width). The 17 most promising image transformations were selected for formal intelligibility testing: 38 deaf subjects viewed 87 ASL sequences 1-2 s long of each transformation. The most effective code for gray-scale images is an analog raster code, which can produce images with 0.86 normalized intelligibility (I) at a bandwidth of 2,880 Hz and therefore is transmittable on ordinary 3 kHz telephone circuits. For the binary images, a number of coding schemes are described and compared, the most efficient being an extension of the quadtree method, here termed binquad coding, which yielded I = 0.68 at 7,500 bits per second (bps). For cartoons, an even more efficient polygonal transformation into connected straight line segments is proposed, together with a vectorgraph code yielding, for example, I = 0.56 at 3,900 bps and I = 0.70 at 6,000 bps. Polygonally transformed cartoons offer the possibility of telephonic ASL communication at 4,800 bps. Several combinations of binary image transformations and encoding schemes offer I > 80% at 9,600 bps.
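
The bit rates quoted in the abstract are easier to judge against the raw cost of an uncoded binary frame. The following is a minimal sketch of that arithmetic, not the paper's coding method: the 96 × 64 image size and the 7,500 and 9,600 bps rates come from the abstract, while the frame rates and the function names are assumptions introduced here for illustration.

```python
"""Raw bit budget for binary 96 x 64 frames versus a target channel rate.

The image size and the 7,500 and 9,600 bps rates come from the abstract.
The frame rates below are hypothetical; the abstract does not state them.
"""

HEIGHT, WIDTH = 96, 64  # pixels, height x width, as reported in the abstract


def raw_bits_per_frame(height: int = HEIGHT, width: int = WIDTH,
                       bits_per_pixel: int = 1) -> int:
    """Bits needed to send one frame with no compression at all."""
    return height * width * bits_per_pixel


def required_compression(frame_rate_hz: float, channel_bps: float) -> float:
    """Ratio of the raw binary bit rate to the channel bit rate."""
    return raw_bits_per_frame() * frame_rate_hz / channel_bps


if __name__ == "__main__":
    print(f"raw bits per binary frame: {raw_bits_per_frame()}")
    for fps in (10, 15, 30):  # assumed frame rates, for illustration only
        for bps in (7_500, 9_600):
            ratio = required_compression(fps, bps)
            print(f"{fps:>2} frames/s at {bps} bps -> {ratio:.1f}:1")
```

One uncoded binary frame takes 6,144 bits. At an assumed 15 frames/s that is 92,160 bps, so fitting the stream into 7,500 bps would need roughly a 12:1 reduction from coding, subsampling in time, or both.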

Original language: English (US)
Pages (from-to): 335-391
Number of pages: 57
Journal: Computer Vision, Graphics, and Image Processing
Volume: 31
Issue number: 3
DOIs: 10.1016/0734-189X(85)90034-9
State: Published - 1985

Fingerprint

Binary images
Telephone circuits
Audition
Pixels
rate
Bandwidth
Communication
Testing
hearing
raster
Experiments
pixel
spatial resolution
communication
code
experiment
method

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Intelligible encoding of ASL image sequences at extremely low information rates. / Sperling, George; Landy, Michael; Cohen, Yoav; Pavel, M.

In: Computer Vision, Graphics, and Image Processing, Vol. 31, No. 3, 1985, p. 335-391.

@article{9b543cbb27324994b417856e3b1d0a07,
title = "Intelligible encoding of {ASL} image sequences at extremely low information rates",
abstract = "American Sign Language (ASL) is a gestural language used by the hearing impaired. This paper describes experimental tests with deaf subjects that compared the most effective known methods of creating extremely compressed ASL images. The minimum requirements for intelligibility were determined for three basically different kinds of transformations: (1) gray-scale transformations that subsample the images in space and time; (2) two-level intensity quantization that converts the gray-scale image into a black-and-white approximation; (3) transformations that convert the images into black-and-white outline drawings (cartoons). In Experiment 1, five subjects made quality ratings of 81 kinds of images that varied in spatial resolution, frame rate, and type of transformation. The most promising image size was 96 × 64 pixels (height × width). The 17 most promising image transformations were selected for formal intelligibility testing: 38 deaf subjects viewed 87 ASL sequences 1-2 s long of each transformation. The most effective code for gray-scale images is an analog raster code, which can produce images with 0.86 normalized intelligibility (I) at a bandwidth of 2,880 Hz and therefore is transmittable on ordinary 3 kHz telephone circuits. For the binary images, a number of coding schemes are described and compared, the most efficient being an extension of the quadtree method, here termed binquad coding, which yielded I = 0.68 at 7,500 bits per second (bps). For cartoons, an even more efficient polygonal transformation into connected straight line segments is proposed, together with a vectorgraph code yielding, for example, I = 0.56 at 3,900 bps and I = 0.70 at 6,000 bps. Polygonally transformed cartoons offer the possibility of telephonic ASL communication at 4,800 bps. Several combinations of binary image transformations and encoding schemes offer I > 80{\%} at 9,600 bps.",
author = "George Sperling and Michael Landy and Yoav Cohen and M. Pavel",
year = "1985",
doi = "10.1016/0734-189X(85)90034-9",
language = "English (US)",
volume = "31",
pages = "335--391",
journal = "Computer Vision, Graphics, and Image Processing",
issn = "0734-189X",
publisher = "Academic Press Inc.",
number = "3",

}

TY - JOUR

T1 - Intelligible encoding of ASL image sequences at extremely low information rates

AU - Sperling, George

AU - Landy, Michael

AU - Cohen, Yoav

AU - Pavel, M.

PY - 1985

Y1 - 1985

N2 - American Sign Language (ASL) is a gestural language used by the hearing impaired. This paper describes experimental tests with deaf subjects that compared the most effective known methods of creating extremely compressed ASL images. The minimum requirements for intelligibility were determined for three basically different kinds of transformations: (1) gray-scale transformations that subsample the images in space and time; (2) two-level intensity quantization that converts the gray-scale image into a black-and-white approximation; (3) transformations that convert the images into black-and-white outline drawings (cartoons). In Experiment 1, five subjects made quality ratings of 81 kinds of images that varied in spatial resolution, frame rate, and type of transformation. The most promising image size was 96 × 64 pixels (height × width). The 17 most promising image transformations were selected for formal intelligibility testing: 38 deaf subjects viewed 87 ASL sequences 1-2 s long of each transformation. The most effective code for gray-scale images is an analog raster code, which can produce images with 0.86 normalized intelligibility (I) at a bandwidth of 2,880 Hz and therefore is transmittable on ordinary 3 kHz telephone circuits. For the binary images, a number of coding schemes are described and compared, the most efficient being an extension of the quadtree method, here termed binquad coding, which yielded I = 0.68 at 7,500 bits per second (bps). For cartoons, an even more efficient polygonal transformation into connected straight line segments is proposed, together with a vectorgraph code yielding, for example, I = 0.56 at 3,900 bps and I = 0.70 at 6,000 bps. Polygonally transformed cartoons offer the possibility of telephonic ASL communication at 4,800 bps. Several combinations of binary image transformations and encoding schemes offer I > 80% at 9,600 bps.

AB - American Sign Language (ASL) is a gestural language used by the hearing impaired. This paper describes experimental tests with deaf subjects that compared the most effective known methods of creating extremely compressed ASL images. The minimum requirements for intelligibility were determined for three basically different kinds of transformations: (1) gray-scale transformations that subsample the images in space and time; (2) two-level intensity quantization that converts the gray-scale image into a black-and-white approximation; (3) transformations that convert the images into black-and-white outline drawings (cartoons). In Experiment 1, five subjects made quality ratings of 81 kinds of images that varied in spatial resolution, frame rate, and type of transformation. The most promising image size was 96 × 64 pixels (height × width). The 17 most promising image transformations were selected for formal intelligibility testing: 38 deaf subjects viewed 87 ASL sequences 1-2 s long of each transformation. The most effective code for gray-scale images is an analog raster code, which can produce images with 0.86 normalized intelligibility (I) at a bandwidth of 2,880 Hz and therefore is transmittable on ordinary 3 kHz telephone circuits. For the binary images, a number of coding schemes are described and compared, the most efficient being an extension of the quadtree method, here termed binquad coding, which yielded I = 0.68 at 7,500 bits per second (bps). For cartoons, an even more efficient polygonal transformation into connected straight line segments is proposed, together with a vectorgraph code yielding, for example, I = 0.56 at 3,900 bps and I = 0.70 at 6,000 bps. Polygonally transformed cartoons offer the possibility of telephonic ASL communication at 4,800 bps. Several combinations of binary image transformations and encoding schemes offer I > 80% at 9,600 bps.

UR - http://www.scopus.com/inward/record.url?scp=0022130687&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0022130687&partnerID=8YFLogxK

U2 - 10.1016/0734-189X(85)90034-9

DO - 10.1016/0734-189X(85)90034-9

M3 - Article

VL - 31

SP - 335

EP - 391

JO - Computer Vision, Graphics, and Image Processing

JF - Computer Vision, Graphics, and Image Processing

SN - 0734-189X

IS - 3

ER -