Using textual features for the detection of vandalism in Wikipedia: a comparative approach in low-resource language sections
Title: | Using textual features for the detection of vandalism in Wikipedia: a comparative approach in low-resource language sections |
---|---|
Authors: | Mentor Hamiti, Arsim Susuri, Agni Dika |
Source: | Volume: 5, Issue: 1, pp. 80-87, PressAcademia Procedia |
Publication Info: | Pressacademia, 2017. |
Publication Year: | 2017 |
Subject Terms: | Information retrieval, Bosnian, business.industry, Computer science, Comparative method, Low resource, Beşeri Bilimler, Ortak Disiplinler, Humanities, Multidisciplinary, computer.software_genre, GeneralLiterature_MISCELLANEOUS, language.human_language, Set (abstract data type), Wikipedia, textual features, low-resource languages, vandalism, Simple (abstract algebra), language, Artificial intelligence, Detection rate, business, computer, Natural language processing |
Description: | This study investigates the impact of textual features on the detection of vandalism across low-resource language sections of Wikipedia. Building on textual features used in previous research, we propose new features that help machine learning-based text classifiers distinguish vandalism more reliably and improve vandalism detection rates across languages. These features also enable us to compare the contributions of bots against vandalism, highlighting the differences between bots and editors with regard to vandalism detection. We propose a new set of efficient, language-independent features whose performance is similar to that of previous sets. Three Wikipedia sections are used for this purpose: Simple English (simple), Albanian (sq), and Bosnian (bs). We show that our set of textual features achieves similar and, in some cases, better vandalism detection rates across languages than previous research. |
File Description: | application/pdf |
ISSN: | 2146-7943 2459-0762 |
Open Access: | https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d7b6018ab9a0c017c168e87d01f0c7ab https://doi.org/10.17261/pressacademia.2017.575 |
Rights: | OPEN |
Accession Number: | edsair.doi.dedup.....d7b6018ab9a0c017c168e87d01f0c7ab |
Database: | OpenAIRE |