A Sparse-Modeling based approach for Class-Specific feature selection

Bibliographic details
Title: A Sparse-Modeling based approach for Class-Specific feature selection
Authors: Nardone, Davide, Ciaramella, Angelo, Staiano, Antonino
Publisher information: PeerJ
Publication year: 2019
Collection: PeerJ (E-Journal - via CrossRef)
Description: In this work, we propose a novel feature selection framework, called Sparse-Modeling Based Approach for Class-Specific Feature Selection (SMBA-CSFS), that simultaneously exploits the ideas of sparse modeling and class-specific feature selection. Feature selection plays a key role in several fields (e.g., computational biology): it makes it possible to work with models that have fewer variables, which in turn are easier to explain, provide valuable insight into the importance of each variable's role, and may speed up experimental validation. Unfortunately, as the no free lunch theorems also suggest, no approach in the literature is best suited to detecting the optimal feature subset for building a final model, so the problem remains a challenge. The proposed feature selection procedure follows a two-step approach: (a) a sparse modeling-based learning technique is first used to find the best subset of features for each class of a training set; (b) the discovered feature subsets are then fed to a class-specific feature selection scheme in order to assess the effectiveness of the selected features in classification tasks. To this end, an ensemble of classifiers is built, where each classifier is trained on its own feature subset discovered in the previous phase, and a proper decision rule is adopted to compute the ensemble responses. To evaluate the performance of the proposed method, extensive experiments were performed on publicly available datasets, in particular from the computational biology field, where feature selection is indispensable: acute lymphoblastic leukemia and acute myeloid leukemia, human carcinomas, human lung carcinomas, diffuse large B-cell lymphoma, and malignant glioma. SMBA-CSFS is able to identify the most representative features that maximize classification accuracy. With the top 20 and 80 features, SMBA-CSFS exhibits promising performance compared to its competitors from the literature on all considered datasets, especially those ...
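The two-step procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sparse-modeling step or the decision rule, so this sketch assumes a simple standardized mean difference as a stand-in feature score, a nearest-centroid one-vs-rest classifier per class, and a largest-margin rule for the ensemble. All function names and the synthetic data are hypothetical.

```python
import random
from statistics import mean, pstdev


def rank_features_for_class(X, y, cls, k):
    """Return the k features that best separate `cls` from all other classes.

    ASSUMPTION: a standardized mean difference stands in for the
    sparse-modeling step, which the abstract does not specify.
    """
    scores = []
    for j in range(len(X[0])):
        inside = [row[j] for row, label in zip(X, y) if label == cls]
        outside = [row[j] for row, label in zip(X, y) if label != cls]
        spread = pstdev(inside) + pstdev(outside) + 1e-12
        scores.append(abs(mean(inside) - mean(outside)) / spread)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [mean(col) for col in zip(*rows)]


def fit_ensemble(X, y, k):
    """Step (a) + (b): one feature subset and one classifier per class."""
    model = {}
    for cls in sorted(set(y)):
        feats = rank_features_for_class(X, y, cls, k)
        pos = [[row[j] for j in feats] for row, label in zip(X, y) if label == cls]
        neg = [[row[j] for j in feats] for row, label in zip(X, y) if label != cls]
        # Each ensemble member only ever sees its own class-specific features.
        model[cls] = (feats, centroid(pos), centroid(neg))
    return model


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def predict(model, row):
    """ASSUMED decision rule: pick the class whose member has the largest margin."""
    def margin(cls):
        feats, c_pos, c_neg = model[cls]
        x = [row[j] for j in feats]
        return sq_dist(x, c_neg) - sq_dist(x, c_pos)
    return max(model, key=margin)


# Synthetic data (hypothetical): 3 classes, 6 features; feature `cls` is
# shifted for class `cls`, features 3-5 are pure noise.
random.seed(0)
X, y = [], []
for cls in range(3):
    for _ in range(20):
        row = [random.gauss(0.0, 1.0) for _ in range(6)]
        row[cls] += 5.0
        X.append(row)
        y.append(cls)

model = fit_ensemble(X, y, k=1)
selected = {cls: feats for cls, (feats, _, _) in model.items()}
print(selected)
```

Because each ensemble member uses a different feature subset, raw margins are only roughly comparable across members; a real decision rule would likely need calibrated or normalized scores.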
Document type: other/unknown material
Language: unknown
DOI: 10.7287/peerj.preprints.27740
Availability: https://doi.org/10.7287/peerj.preprints.27740
https://peerj.com/preprints/27740.pdf
https://peerj.com/preprints/27740.xml
https://peerj.com/preprints/27740.html
Rights: http://creativecommons.org/licenses/by/4.0/
Accession number: edsbas.E493DB8
Database: BASE