Academic Journal

Extended VTS for Noise-Robust Speech Recognition.

Bibliographic Details
Title: Extended VTS for Noise-Robust Speech Recognition.
Authors: van Dalen, Rogier C., Gales, Mark J. F.
Source: IEEE Transactions on Audio, Speech & Language Processing; Apr 2011, Vol. 19 Issue 4, p733-743, 11p
Subject Terms: VIENNA Test System, AUTOMATIC speech recognition, ROBUST control, SOUND measurement, APPROXIMATION theory, LINEAR systems, PARAMETER estimation
Abstract: Model compensation is a standard way of improving the robustness of speech recognition systems to noise. A number of popular schemes are based on vector Taylor series (VTS) compensation, which uses a linear approximation to represent the influence of noise on the clean speech. To compensate the dynamic parameters, the continuous time approximation is often used. This approximation uses a point estimate of the gradient, which fails to take into account that dynamic coefficients are a function of a number of consecutive static coefficients. In this paper, the accuracy of dynamic parameter compensation is improved by representing the dynamic features as a linear transformation of a window of static features. A modified version of VTS compensation is applied to the distribution of the window of static features and, importantly, their correlations. These compensated distributions are then transformed to distributions over standard static and dynamic features. With this improved approximation, it is also possible to obtain full-covariance corrupted speech distributions. This addresses the correlation changes that occur in noise. The proposed scheme outperformed the standard VTS scheme by 10% to 20% relative on a range of tasks. [ABSTRACT FROM PUBLISHER]
Copyright of IEEE Transactions on Audio, Speech & Language Processing is the property of IEEE and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
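Illustrative note: the abstract's central step, treating dynamic features as a linear transformation of a window of static features, can be sketched roughly as follows. This is not the authors' implementation. The function names, the window half-width of 2, and the regression-style delta weights are assumptions chosen for illustration, and the VTS compensation of the window distribution itself is not shown. The sketch only shows how a Gaussian over a stacked window of statics maps to a Gaussian over static and delta features.

```python
import numpy as np


def delta_weights(half_width=2):
    """Regression-style delta weights w_k for k = -K..K (assumed form)."""
    k = np.arange(-half_width, half_width + 1)
    return k / (2.0 * np.sum(np.arange(1, half_width + 1) ** 2))


def projection_matrix(dim, half_width=2):
    """Matrix D mapping a stacked window [x_{t-K}; ...; x_{t+K}] to [static; delta]."""
    w = delta_weights(half_width)
    centre = np.zeros_like(w)
    centre[half_width] = 1.0            # picks out the middle frame x_t
    eye = np.eye(dim)
    static_rows = np.kron(centre[None, :], eye)   # shape (dim, n * dim)
    delta_rows = np.kron(w[None, :], eye)         # shape (dim, n * dim)
    return np.vstack([static_rows, delta_rows])


def transform_gaussian(mean_window, cov_window, D):
    """Push a Gaussian over the window through the linear map D."""
    return D @ mean_window, D @ cov_window @ D.T


# Hypothetical usage: 2-dimensional statics, window of 5 frames.
D = projection_matrix(dim=2, half_width=2)
mean_window = np.concatenate([k * np.array([1.0, 2.0]) for k in range(-2, 3)])
cov_window = np.eye(10)
mean_sd, cov_sd = transform_gaussian(mean_window, cov_window, D)
```

Because the map is linear, any correlations between frames in `cov_window` carry through to `cov_sd`, which is why a full covariance over static and dynamic features can be obtained this way.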
Database: Complementary Index
Description
ISSN: 1558-7916
DOI: 10.1109/TASL.2010.2061226