Synthetic oversampling based decision support framework to solve class imbalance problem in smoking cessation program

Bibliographic details
Title: Synthetic oversampling based decision support framework to solve class imbalance problem in smoking cessation program
Authors: Davagdorj, Khishigsuren, Lee, Jong Seol, Park, Kwang Ho, Huy, Pham Van, Ryu, Keun Ho
Contributors: College of Science and Engineering (理工學院), Database and Bioinformatics Laboratory, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, South Korea, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam, Department of Computer Science, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, South Korea
Publisher information: Chaoyang University of Technology, College of Science and Engineering (朝陽科技大學理工學院)
Publication year: 2020
Collection: Chaoyang University of Technology Institutional Repository (CYUTIR)
Subject terms: Smoking cessation, Risk factor analysis, Class imbalance, Synthetic minority oversampling, Machine learning classifiers
Description: Smoking is one of the significant avoidable risk factors for premature death. Most smokers make multiple quit attempts during their lifetime, but overcoming smoking dependence is not easy and many quit attempts eventually fail. Predicting the likelihood of success in a smoking cessation program is necessary for public health. In recent years, a small number of decision support systems based on machine learning techniques have been developed for smoking cessation. However, the class imbalance problem is increasingly recognized as a serious issue in real-world applications. Therefore, this paper presents a decision support framework based on the synthetic minority over-sampling technique (SMOTE) to predict the success of a smoking cessation program using the Korea National Health and Nutrition Examination Survey (KNHANES) dataset. The experiments were carried out as follows: (I) unnecessary instances and variables were eliminated; (II) three variations of SMOTE were employed; (III) prediction models were constructed. Finally, the prediction models were compared to obtain the best model. The experimental results showed that SMOTE improved the prediction performance of machine learning classifiers across the evaluation metrics. Moreover, the Random Forest (RF) and Naïve Bayes (NB) classifiers based on regular SMOTE were determined to be the best prediction models on the real-world smoking cessation dataset. Consequently, the decision support framework can interpret the important risk factors of smoking cessation using multivariate regression analysis.
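Illustrative note: the oversampling idea named in the description can be sketched in a few lines. The block below is a minimal sketch of the basic SMOTE interpolation step only, assuming Euclidean distance and numeric features. It does not reproduce the three SMOTE variations, the classifiers, or the KNHANES data used in the paper; the function name `smote`, its parameters, and the toy data are hypothetical and chosen for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data, made up for illustration (not taken from KNHANES).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(3.0 + 0.1 * i, 3.0 - 0.05 * i) for i in range(10)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(balanced_minority), len(majority))
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies, which is the property that distinguishes this approach from simply duplicating records.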
Document type: other/unknown material
File description: 744679 bytes; application/pdf
Language: English
ISSN: 1727-2394
Relation: International Journal of Applied Science and Engineering 17(3), p.223-235; 國際應用科學與工程學刊 17(3), p.223-235; http://ir.lib.cyut.edu.tw:8080/handle/310901800/38176; http://ir.lib.cyut.edu.tw:8080/bitstream/310901800/38176/1/1.pdf
DOI: 10.6703/IJASE.202009_17(3).223
Availability: https://doi.org/10.6703/IJASE.202009_17(3).223
http://ir.lib.cyut.edu.tw:8080/handle/310901800/38176
http://ir.lib.cyut.edu.tw:8080/bitstream/310901800/38176/1/1.pdf
Accession number: edsbas.EE92D262
Database: BASE