Improving Low-Resource Question Answering using Active Learning in Multiple Stages

Bibliographic details
Title: Improving Low-Resource Question Answering using Active Learning in Multiple Stages
Authors: Schmidt, Maximilian; Bartezzaghi, Andrea; Bogojeska, Jasmina; Malossi, A. Cristiano I.; Vu, Thang
Publication info: arXiv, 2022.
Publication year: 2022
Subject terms: FOS: Computer and information sciences, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, I.2.7, Computation and Language (cs.CL)
Description: Neural approaches have become very popular in the domain of Question Answering; however, they require a large amount of annotated data. Furthermore, they often yield very good performance but only in the domain they were trained on. In this work we propose a novel approach that combines data augmentation via question-answer generation with Active Learning to improve performance in low-resource settings, where the target domains are diverse in terms of difficulty and similarity to the source domain. We also investigate Active Learning for question answering in different stages, overall reducing the annotation effort of humans. For this purpose, we consider target domains in realistic settings, with an extremely low amount of annotated samples but with many unlabeled documents, which we assume can be obtained with little effort. Additionally, we assume a sufficient amount of labeled data from the source domain is available. We perform extensive experiments to find the best setup for incorporating domain experts. Our findings show that our novel approach, where humans are incorporated as early as possible in the process, boosts performance in the low-resource, domain-specific setting, allowing for low-labeling-effort question answering systems in new, specialized domains. They further demonstrate how human annotation affects the performance of QA depending on the stage at which it is performed.
Comment: 16 pages, 8 figures
DOI: 10.48550/arxiv.2211.14880
Open access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::597db9ea5214a7dc353f5fb1802ce17fTest
Rights: OPEN
Accession number: edsair.doi.dedup.....597db9ea5214a7dc353f5fb1802ce17f
Database: OpenAIRE