Audio content authentication based on psycho-acoustic model

Regunathan Radhakrishnan, Nasir Memon

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The goal of audio content authentication techniques is to separate malicious manipulations from allowed signal processing operations such as compression and filtering. The key difference between the two is that allowed signal processing operations tend to preserve the perceptual content of the underlying audio signal. Hence, to separate malicious operations from allowed operations, a content authentication procedure should be based on a model that approximates human perception of audio. In this paper, we propose an audio content authentication technique based on a feature that remains invariant across perceptually similar audio signals, namely the masking curve. We also evaluate the performance of this technique by embedding a hash based on the masking curve into the audio signal using an existing transparent and robust data hiding technique. At the receiver, the embedded content-based hash is extracted from the audio and compared with the hash bits calculated at the receiver. The correlation between calculated and extracted hash bits degrades gracefully with the perceived quality of the received audio. This implies that the authentication threshold can be adapted to the level of perceptual quality required at the receiver. Experimental results show that this content-based hash is able to differentiate allowed signal processing operations such as MP3 compression from certain malicious operations that modify the perceptual content of the audio.
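
The comparison step at the receiver can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the correlation is computed, so the normalized bit correlation, the function names `bit_correlation` and `is_authentic`, the example bit strings, and the threshold values are all assumptions made for illustration. Computing the masking curve and embedding or extracting the hash are outside the scope of this sketch.

```python
from typing import Sequence


def bit_correlation(calculated: Sequence[int], extracted: Sequence[int]) -> float:
    """Normalized correlation in [-1, 1] between two equal-length bit sequences.

    Assumption: bits are mapped from {0, 1} to {-1, +1}, so each matching
    position contributes +1 and each mismatch contributes -1.
    """
    if len(calculated) != len(extracted):
        raise ValueError("hash lengths differ")
    if not calculated:
        raise ValueError("hashes are empty")
    total = sum((2 * a - 1) * (2 * b - 1) for a, b in zip(calculated, extracted))
    return total / len(calculated)


def is_authentic(calculated: Sequence[int], extracted: Sequence[int],
                 threshold: float = 0.8) -> bool:
    """Accept the audio if the correlation reaches the chosen threshold.

    The threshold is a hypothetical tuning parameter; a lower value tolerates
    more perceptual degradation at the receiver.
    """
    return bit_correlation(calculated, extracted) >= threshold


# Hypothetical 8-bit hashes, chosen only to exercise the functions.
calc = [1, 0, 1, 1, 0, 0, 1, 0]    # hash calculated at the receiver
mild = [1, 0, 1, 1, 0, 0, 1, 1]    # extracted hash, one bit differs
heavy = [0, 1, 0, 1, 1, 0, 0, 0]   # extracted hash, five bits differ

print(bit_correlation(calc, mild))    # 0.75
print(bit_correlation(calc, heavy))   # -0.25
print(is_authentic(calc, mild, threshold=0.7))   # True
print(is_authentic(calc, heavy, threshold=0.7))  # False
```

Lowering or raising `threshold` corresponds to the adaptation the abstract describes: a receiver that demands higher perceptual quality would require a correlation closer to 1 before accepting the audio.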

Original language: English (US)
Title of host publication: Proceedings of SPIE - The International Society for Optical Engineering
Editors: E.J. Delp III, P.W. Wong
Pages: 110-117
Number of pages: 8
Volume: 4675
DOI: https://doi.org/10.1117/12.465266
State: Published - 2002
Event: Security and Watermarking of Multimedia Contents IV - San Jose, CA, United States
Duration: Jan 21 2002 to Jan 24 2002

Fingerprint

Authentication
Acoustics
Signal processing
audio signals
masking
receivers
audio data
curves
embedding
manipulators
thresholds

Keywords

  • MP-3 Compression
  • Multimedia Content Authentication
  • Psycho-acoustic model

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Condensed Matter Physics

Cite this

Radhakrishnan, R., & Memon, N. (2002). Audio content authentication based on psycho-acoustic model. In E. J. Delp III, & P. W. Wong (Eds.), Proceedings of SPIE - The International Society for Optical Engineering (Vol. 4675, pp. 110-117) https://doi.org/10.1117/12.465266

@inproceedings{dc50da29305840aea0d382deea06f91b,
title = "Audio content authentication based on psycho-acoustic model",
keywords = "MP-3 Compression, Multimedia Content Authentication, Psycho-acoustic model",
author = "Regunathan Radhakrishnan and Nasir Memon",
year = "2002",
doi = "10.1117/12.465266",
language = "English (US)",
volume = "4675",
pages = "110--117",
editor = "{Delp III}, E.J. and P.W. Wong",
booktitle = "Proceedings of SPIE - The International Society for Optical Engineering",

}

TY - GEN

T1 - Audio content authentication based on psycho-acoustic model

AU - Radhakrishnan, Regunathan

AU - Memon, Nasir

PY - 2002

Y1 - 2002

KW - MP-3 Compression

KW - Multimedia Content Authentication

KW - Psycho-acoustic model

UR - http://www.scopus.com/inward/record.url?scp=0036029470&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036029470&partnerID=8YFLogxK

U2 - 10.1117/12.465266

DO - 10.1117/12.465266

M3 - Conference contribution

AN - SCOPUS:0036029470

VL - 4675

SP - 110

EP - 117

BT - Proceedings of SPIE - The International Society for Optical Engineering

A2 - Delp III, E.J.

A2 - Wong, P.W.

ER -