دورية أكاديمية

The clustering-based case-based reasoning for imbalanced business failure prediction: a hybrid approach through integrating unsupervised process with supervised process

التفاصيل البيبلوغرافية
العنوان: The clustering-based case-based reasoning for imbalanced business failure prediction: a hybrid approach through integrating unsupervised process with supervised process
المؤلفون: Hui Li, Jun-Ling Yu, Le-An Yu, Jie Sun
المجموعة: RePEc (Research Papers in Economics)
الوصف: Case-based reasoning (CBR) is one of the main forecasting methods in business forecasting, which performs well in prediction and holds the ability of giving explanations for the results. In business failure prediction (BFP), the number of failed enterprises is relatively small, compared with the number of non-failed ones. However, the loss is huge when an enterprise fails. Therefore, it is necessary to develop methods (trained on imbalanced samples) which forecast well for this small proportion of failed enterprises and performs accurately on total accuracy meanwhile. Commonly used methods constructed on the assumption of balanced samples do not perform well in predicting minority samples on imbalanced samples consisting of the minority/failed enterprises and the majority/non-failed ones. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to enhance the efficiency of retrieving information from both minority and majority in CBR. In CBCBR, various case classes are firstly generated through hierarchical clustering inside stored experienced cases, and class centres are calculated out by integrating cases information in the same clustered class. When predicting the label of a target case, its nearest clustered case class is firstly retrieved by ranking similarities between the target case and each clustered case class centre. Then, nearest neighbours of the target case in the determined clustered case class are retrieved. Finally, labels of the nearest experienced cases are used in prediction. In the empirical experiment with two imbalanced samples from China, the performance of CBCBR was compared with the classical CBR, a support vector machine, a logistic regression and a multi-variant discriminate analysis. The results show that compared with the other four methods, CBCBR performed significantly better in terms of sensitivity for identifying the minority samples and generated high total accuracy ...
نوع الوثيقة: article in journal/newspaper
اللغة: unknown
العلاقة: http://hdl.handle.net/10.1080/00207721.2012.748105Test
DOI: 10.1080/00207721.2012.748105
رقم الانضمام: edsbas.30CF9E16
قاعدة البيانات: BASE