Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages

Bibliographic Details
Title: Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages
Authors: Anoop C. S., Prathosh A. P., A. G. Ramakrishnan
Publication Year: 2021
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Description: Building an automatic speech recognition (ASR) system from scratch requires a large amount of annotated speech data, which is difficult to collect in many languages. However, there are cases where the low-resource language shares a common acoustic space with a high-resource language having enough annotated data to build an ASR. In such cases, we show that the domain-independent acoustic models learned from the high-resource language through unsupervised domain adaptation (UDA) schemes can enhance the performance of the ASR in the low-resource language. We use the specific example of Hindi in the source domain and Sanskrit in the target domain. We explore two architectures: i) domain adversarial training using gradient reversal layer (GRL) and ii) domain separation networks (DSN). The GRL and DSN architectures give absolute improvements of 6.71% and 7.32%, respectively, in word error rate over the baseline deep neural network model when trained on just 5.5 hours of data in the target domain. We also show that choosing a proper language (Telugu) in the source domain can bring further improvement. The results suggest that UDA schemes can be helpful in the development of ASR systems for low-resource languages, mitigating the hassle of collecting large amounts of annotated speech data.
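The gradient reversal layer mentioned in the abstract is the core mechanism of domain adversarial training: it acts as the identity in the forward pass but negates (and scales) gradients in the backward pass, so the shared feature extractor is trained to *confuse* the domain classifier. A minimal sketch of that behavior, assuming a hypothetical `GradientReversalLayer` class and scaling factor `lam` (this is not the paper's implementation, just an illustration of the trick):

```python
class GradientReversalLayer:
    """Identity in the forward pass; multiplies gradients by -lam in the
    backward pass (the gradient reversal trick used in domain adversarial
    training). Pure-Python sketch for illustration only."""

    def __init__(self, lam: float = 1.0):
        self.lam = lam  # trade-off between task loss and domain confusion

    def forward(self, x):
        # Features pass through unchanged to the domain classifier.
        return x

    def backward(self, grad_out):
        # The domain classifier's gradient is reversed before reaching the
        # shared feature extractor, pushing it toward domain-invariant
        # representations.
        return [-self.lam * g for g in grad_out]
```

In a real acoustic model this layer would sit between the shared encoder and the domain-discrimination head, while the senone/phoneme classification head receives unmodified gradients.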
Comment: Submitted to ASRU 2021
Document Type: Working Paper
Open Access: http://arxiv.org/abs/2109.05494
Accession Number: edsarx.2109.05494
Database: arXiv