Scanning the Technology on the Applications of Multimedia Processing to Communications

Richard V. Cox, Barry G. Haskell, Yann LeCun, Behzad Shahraray, Lawrence Rabiner

Research output: Contribution to journal (Article)

Abstract

The challenge of multimedia processing is to provide services that seamlessly integrate text, sound, image, and video information, and to do so in a way that preserves the ease of use and interactivity of conventional plain old telephone service (POTS) telephony, irrespective of the bandwidth or means of access of the connection to the service. To achieve this goal, a number of technological problems must be considered, including:

  • compression and coding of multimedia signals, including algorithmic issues, standards issues, and transmission issues;
  • synthesis and recognition of multimedia signals, including speech, images, handwriting, and text;
  • organization, storage, and retrieval of multimedia signals, including the appropriate method and speed of delivery (e.g., streaming versus full downloading), resolution (including layering or embedded versions of the signal), and quality of service, i.e., perceived quality of the resulting signal;
  • access methods to the multimedia signal (i.e., matching the user to the machine), including spoken natural language interfaces, agent interfaces, and media conversion tools;
  • searching (i.e., based on machine intelligence) by text, speech, and image queries;
  • browsing (i.e., based on human intelligence) by accessing the text, by voice, or by indexed images.

In each of these areas, a great deal of progress has been made in the past few years, driven in part by the relentless growth in multimedia personal computers and in part by the promise of broadband access from the home and from wireless connections. Standards have also played a key role in driving new multimedia services, both on the POTS network and on the Internet. The purpose of this paper is to review the status of the technology in each of the areas listed above and to illustrate current capabilities by describing several multimedia applications that have been implemented at AT&T Labs over the past several years.

Original language: English (US)
Pages (from-to): 755-823
Number of pages: 69
Journal: Proceedings of the IEEE
Volume: 86
Issue number: 5
DOI: 10.1109/5.664272
State: Published - 1998

Fingerprint

Scanning
Telephone
Communication
Processing
Multimedia services
Personal computers
Quality of service
Acoustic waves
Internet
Bandwidth

Keywords

  • AAC
  • Access
  • Agents
  • Audio coding
  • Cable modems
  • Communications networks
  • Content-based video sampling
  • Document compression

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Scanning the Technology on the Applications of Multimedia Processing to Communications. / Cox, Richard V.; Haskell, Barry G.; LeCun, Yann; Shahraray, Behzad; Rabiner, Lawrence.

In: Proceedings of the IEEE, Vol. 86, No. 5, 1998, p. 755-823.

@article{549a828870aa4d5a851d46d8cce237ac,
title = "Scanning the Technology on the Applications of Multimedia Processing to Communications",
keywords = "AAC, Access, Agents, Audio coding, Cable modems, Communications networks, Content-based video sampling, Document compression",
author = "Cox, {Richard V.} and Haskell, {Barry G.} and LeCun, Yann and Shahraray, Behzad and Rabiner, Lawrence",
year = "1998",
doi = "10.1109/5.664272",
language = "English (US)",
volume = "86",
pages = "755--823",
journal = "Proceedings of the IEEE",
issn = "0018-9219",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "5",

}

TY - JOUR

T1 - Scanning the Technology on the Applications of Multimedia Processing to Communications

AU - Cox, Richard V.

AU - Haskell, Barry G.

AU - LeCun, Yann

AU - Shahraray, Behzad

AU - Rabiner, Lawrence

PY - 1998

Y1 - 1998

KW - AAC

KW - Access

KW - Agents

KW - Audio coding

KW - Cable modems

KW - Communications networks

KW - Content-based video sampling

KW - Document compression

UR - http://www.scopus.com/inward/record.url?scp=0032074814&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032074814&partnerID=8YFLogxK

U2 - 10.1109/5.664272

DO - 10.1109/5.664272

M3 - Article

VL - 86

SP - 755

EP - 823

JO - Proceedings of the IEEE

JF - Proceedings of the IEEE

SN - 0018-9219

IS - 5

ER -