Can time dependencies and ensemble classification improve content-free dialogue segmentation?

Bibliographic details
Title: Can time dependencies and ensemble classification improve content-free dialogue segmentation?
Authors: Jing Su, Saturnino Luz
Source: 2013 IEEE 4th International Conference on Cognitive Infocommunications (CogInfoCom).
Publication information: IEEE, 2013.
Publication year: 2013
Subject terms: Conditional random field, Boosting (machine learning), Computer science, business.industry, Bayesian probability, Pattern recognition, Bayes classifier, Machine learning, computer.software_genre, Naive Bayes classifier, ComputingMethodologies_PATTERNRECOGNITION, False positive paradox, Segmentation, Artificial intelligence, business, Classifier (UML), computer
Description: We present an extended study of content-free topic segmentation of conversational (meeting) data based on classification of vocalization events. In previous work, content-free topic segmentation achieved good accuracy through a modified naive Bayes classifier and vocalization horizon features. In this study, we attempted to improve on those results by incorporating time (sequential) dependency information into the topic boundary detection process through the use of conditional random fields and ensemble classifiers. We expected that incorporating such information would help reduce the number of false positives generated by the naive Bayes method. In addition to the usual Pk and WindowDiff (WD) metrics, we introduce a new performance metric to account for the under-detection bias of the segmentation task. Although a boosting model showed fairly good performance using a simple base classifier and limited contextual features, the more elaborate methods still trailed the Bayesian method.
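The description names the Pk and WindowDiff (WD) metrics without defining them. The sketch below follows their commonly stated definitions from the segmentation literature, not necessarily the exact variant used in the paper. The boundary-list representation, the function names, and the default window size (half the mean reference segment length) are assumptions made for illustration. The additional metric introduced by the authors is not specified in this record and is not reproduced here.

```python
from itertools import accumulate


def segment_ids(boundaries):
    """Map boundary indicators to a segment index per unit.

    Assumed representation: boundaries[j] == 1 means a topic boundary
    falls between unit j and unit j + 1 (units could be vocalization events).
    """
    return [0] + list(accumulate(boundaries))


def default_window(reference):
    """Half the mean reference segment length, at least 1 (a common convention)."""
    n_units = len(reference) + 1
    n_segments = sum(reference) + 1
    return max(1, round(n_units / n_segments / 2))


def _prepare(reference, hypothesis, k):
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    if k is None:
        k = default_window(reference)
    ref, hyp = segment_ids(reference), segment_ids(hypothesis)
    windows = len(ref) - k
    if windows <= 0:
        raise ValueError("window size k is too large for this sequence")
    return ref, hyp, k, windows


def pk(reference, hypothesis, k=None):
    """Fraction of windows where reference and hypothesis disagree on
    whether units i and i + k belong to the same segment."""
    ref, hyp, k, windows = _prepare(reference, hypothesis, k)
    errors = sum(
        (ref[i] == ref[i + k]) != (hyp[i] == hyp[i + k])
        for i in range(windows)
    )
    return errors / windows


def window_diff(reference, hypothesis, k=None):
    """Fraction of windows where the number of boundaries between
    units i and i + k differs between reference and hypothesis."""
    ref, hyp, k, windows = _prepare(reference, hypothesis, k)
    errors = sum(
        (ref[i + k] - ref[i]) != (hyp[i + k] - hyp[i])
        for i in range(windows)
    )
    return errors / windows
```

Both functions return 0 for a perfect match, and lower values are better. WindowDiff compares boundary counts rather than same-segment agreement, so it also penalizes a spurious boundary placed next to a true one, which Pk can miss.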
Open access: https://explore.openaire.eu/search/publication?articleId=doi_________::85cac3158331a937122a71fe87afb88b
https://doi.org/10.1109/coginfocom.2013.6719238
Accession number: edsair.doi...........85cac3158331a937122a71fe87afb88b
Database: OpenAIRE