Towards Standard Criteria for human evaluation of Chatbots: A Survey

Bibliographic Details
Title: Towards Standard Criteria for human evaluation of Chatbots: A Survey
Authors: Liang, Hongru; Li, Huaqing
Publication Year: 2021
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language
Description: Human evaluation is becoming a necessity to test the performance of Chatbots. However, off-the-shelf settings suffer from severe reliability and replication issues, partly because of the extremely high diversity of criteria. It is high time to come up with standard criteria and exact definitions. To this end, we conduct a thorough investigation of 105 papers involving human evaluation for Chatbots. Deriving from this, we propose five standard criteria along with precise definitions.
Document Type: Working Paper
Open Access: http://arxiv.org/abs/2105.11197
Accession Number: edsarx.2105.11197
Database: arXiv