Abstract
We introduce a real-time capable algorithm that estimates the long-term signal-to-noise ratio (SNR) of speech in multi-talker babble noise. In real-time applications, long-term SNR is calculated over a sufficiently long moving frame of the noisy speech ending at the current time. The algorithm estimates the long-term SNR in real time by averaging “speech-likeness” values of multiple consecutive short frames of the noisy speech, which collectively form a long frame of adaptive length. The algorithm is calibrated to be insensitive to short-term fluctuations and transient changes in speech or noise level. However, it responds quickly to non-transient changes in long-term SNR by adjusting the duration of the long frame over which the long-term SNR is measured. This ability comes from an event detector combined with an adaptive frame duration: the event detector identifies non-transient changes in the long-term SNR and optimizes the duration of the long frame accordingly. The algorithm was trained and tested on randomly generated speech samples corrupted with multi-talker babble. In addition to providing an adaptive long-term SNR estimate in a dynamic noisy situation, the algorithm outperforms existing overall SNR estimation methods in multi-talker babble over a wide range of talker counts and SNRs, according to the evaluation results. Its relatively low computational cost and its ability to update the estimated long-term SNR several times per second make the algorithm suitable for real-time speech processing applications.
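The abstract does not specify the speech-likeness feature, the event detector, or the mapping from averaged values to an SNR in dB, so the following is only a minimal sketch of the general idea of averaging per-short-frame scores over a long frame whose length shrinks after a sustained change. The class name `AdaptiveLongFrameAverager`, its parameters (`max_frames`, `probe_frames`, `threshold`), and the mean-shift test are all hypothetical choices made for illustration, not the published ALTIS design.

```python
from collections import deque


class AdaptiveLongFrameAverager:
    """Running mean of short-frame scores over a long frame of adaptive length.

    Hypothetical illustration only: the change test and all parameter
    values are assumptions, not taken from the ALTIS paper.
    """

    def __init__(self, max_frames=200, probe_frames=10, threshold=0.15):
        self.max_frames = max_frames      # upper bound on the long-frame length
        self.probe_frames = probe_frames  # recent short frames used by the event test
        self.threshold = threshold        # mean shift treated as a non-transient change
        self.scores = deque()             # scores currently inside the long frame

    def update(self, score):
        """Add one short-frame score and return the current long-frame mean."""
        self.scores.append(score)
        if len(self.scores) > self.max_frames:
            self.scores.popleft()

        # Event test (assumed): compare the mean of the most recent block
        # with the mean of everything older in the long frame.
        if len(self.scores) >= 2 * self.probe_frames:
            values = list(self.scores)
            recent = values[-self.probe_frames:]
            older = values[:-self.probe_frames]
            recent_mean = sum(recent) / len(recent)
            older_mean = sum(older) / len(older)
            if abs(recent_mean - older_mean) > self.threshold:
                # Sustained shift: shorten the long frame to the recent block.
                self.scores = deque(recent)

        return sum(self.scores) / len(self.scores)


if __name__ == "__main__":
    averager = AdaptiveLongFrameAverager()
    # Synthetic scores: a steady level followed by a sustained jump.
    for s in [0.5] * 100 + [0.9] * 100:
        estimate = averager.update(s)
    print(f"estimate after sustained change: {estimate:.3f}")
```

Because the event test averages over a block of recent short frames, a single outlying frame moves the block mean only slightly and leaves the long frame untouched, while a sustained shift crosses the threshold within a few frames and discards the stale history. A fixed-length average over the same 200 synthetic frames would instead settle halfway between the two levels.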
Original language | English (US) |
---|---|
Pages (from-to) | 231-246 |
Number of pages | 16 |
Journal | Computer Speech and Language |
Volume | 58 |
DOIs | 10.1016/j.csl.2019.05.001 |
State | Published - Nov 1 2019 |
Keywords
- Adaptive SNR
- Long-term SNR
- Multi-talker babble
- Real-time SNR
- Signal-to-noise ratio
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Human-Computer Interaction
Cite this
ALTIS: A new algorithm for adaptive long-term SNR estimation in multi-talker babble. / Soleymani, Roozbeh; Selesnick, Ivan; Landsberger, David M.
In: Computer Speech and Language, Vol. 58, 01.11.2019, p. 231-246.
Research output: Contribution to journal › Article
TY - JOUR
T1 - ALTIS
T2 - A new algorithm for adaptive long-term SNR estimation in multi-talker babble
AU - Soleymani, Roozbeh
AU - Selesnick, Ivan
AU - Landsberger, David M.
PY - 2019/11/1
Y1 - 2019/11/1
AB - We introduce a real-time capable algorithm which estimates the long-term signal to noise ratio (SNR) of the speech in multi-talker babble noise. In real-time applications, long-term SNR is calculated over a sufficiently long moving frame of the noisy speech ending at the current time. The algorithm performs the real-time long-term SNR estimation by averaging “speech-likeness” values of multiple consecutive short-frames of the noisy speech which collectively form a long-frame with an adaptive length. The algorithm is calibrated to be insensitive to short-term fluctuations and transient changes in speech or noise level. However, it quickly responds to non-transient changes in long-term SNR by adjusting the duration of the long-frame on which the long-term SNR is measured. This ability is obtained by employing an event detector and adaptive frame duration. The event detector identifies non-transient changes of the long-term SNR and optimizes the duration of the long-frame accordingly. The algorithm was trained and tested for randomly generated speech samples corrupted with multi-talker babble. In addition to its ability to provide an adaptive long-term SNR estimation in a dynamic noisy situation, the evaluation results show that the algorithm outperforms the existing overall SNR estimation methods in multi-talker babble over a wide range of number of talkers and SNRs. The relatively low computational cost and the ability to update the estimated long-term SNR several times per second make this algorithm capable of operating in real-time speech processing applications.
KW - Adaptive SNR
KW - Long-term SNR
KW - Multi-talker babble
KW - Real-time SNR
KW - Signal-to-noise ratio
UR - http://www.scopus.com/inward/record.url?scp=85065893448&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065893448&partnerID=8YFLogxK
U2 - 10.1016/j.csl.2019.05.001
DO - 10.1016/j.csl.2019.05.001
M3 - Article
AN - SCOPUS:85065893448
VL - 58
SP - 231
EP - 246
JO - Computer Speech and Language
JF - Computer Speech and Language
SN - 0885-2308
ER -