Detection of Novel Social Bots by Ensembles of Specialized Classifiers

Bibliographic Details
Title: Detection of Novel Social Bots by Ensembles of Specialized Classifiers
Authors: Sayyadiharikandeh, Mohsen; Varol, Onur; Yang, Kai-Cheng; Flammini, Alessandro; Menczer, Filippo
Source: Proc. 29th ACM International Conference on Information and Knowledge Management (CIKM), pages 2725-2732, 2020
Publication Year: 2020
Collection: Computer Science
Subject Terms: Computer Science - Social and Information Networks, Computer Science - Information Retrieval, Computer Science - Machine Learning
Description: Malicious actors create inauthentic social media accounts controlled in part by algorithms, known as social bots, to disseminate misinformation and agitate online discussion. While researchers have developed sophisticated methods to detect abuse, novel bots with diverse behaviors evade detection. We show that different types of bots are characterized by different behavioral features. As a result, supervised learning techniques suffer severe performance deterioration when attempting to detect behaviors not observed in the training data. Moreover, tuning these models to recognize novel bots requires retraining with a significant amount of new annotations, which are expensive to obtain. To address these issues, we propose a new supervised learning method that trains classifiers specialized for each class of bots and combines their decisions through the maximum rule. The ensemble of specialized classifiers (ESC) can better generalize, leading to an average improvement of 56% in F1 score for unseen accounts across datasets. Furthermore, novel bot behaviors are learned with fewer labeled examples during retraining. We deployed ESC in the newest version of Botometer, a popular tool to detect social bots in the wild, with a cross-validation AUC of 0.99.
Comment: 8 pages, 10 figures, Accepted to CIKM'20
Document Type: Working Paper
DOI: 10.1145/3340531.3412698
Open Access: http://arxiv.org/abs/2006.06867
Accession Number: edsarx.2006.06867
Database: arXiv
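
Illustrative note: the description above says that ESC trains one classifier per class of bots and combines their decisions through the maximum rule. The record gives no implementation details, so the following is only a minimal sketch of a maximum-rule combiner, not the authors' code. The function max_rule_score, the two toy scorers, and their feature meanings are all hypothetical; in practice each scorer would be a trained model that returns a bot score in [0, 1].

```python
from typing import Callable, Dict, Sequence, Tuple

# Assumption: a "specialized classifier" is any function that maps a
# feature vector to a bot score in [0, 1]. Real models would be trained.
Scorer = Callable[[Sequence[float]], float]


def max_rule_score(
    scorers: Dict[str, Scorer], features: Sequence[float]
) -> Tuple[str, float]:
    """Return the name and score of the most confident specialized scorer."""
    if not scorers:
        raise ValueError("need at least one specialized classifier")
    scores = {name: scorer(features) for name, scorer in scorers.items()}
    best = max(scores, key=scores.get)  # maximum rule: keep the highest score
    return best, scores[best]


# Hypothetical toy scorers standing in for trained per-class models.
def spam_bot_scorer(x: Sequence[float]) -> float:
    return min(1.0, x[0] / 100.0)  # x[0]: made-up "posts per day" feature


def fake_follower_scorer(x: Sequence[float]) -> float:
    return min(1.0, x[1] / 50.0)  # x[1]: made-up "following ratio" feature


ensemble = {"spam_bot": spam_bot_scorer, "fake_follower": fake_follower_scorer}
label, score = max_rule_score(ensemble, [20.0, 45.0])
print(label, score)
```

Under this sketch, supporting a new class of bots would mean adding one more entry to the ensemble dictionary rather than retraining a single monolithic model, which is one way to read the retraining claim in the description.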