Journal article
Combining weighted SMOTE with ensemble learning for the class-imbalanced prediction of small business credit risk
Complex and Intelligent Systems, Volume: 9, Issue: 4, Pages: 3559 - 3579
Swansea University Author: Mohammad Abedin
PDF | Version of Record
© The Author(s) 2021. Distributed under the terms of a Creative Commons Attribution 4.0 License (CC BY 4.0).
DOI (Published version): 10.1007/s40747-021-00614-4
Published in: | Complex and Intelligent Systems |
---|---|
ISSN: | 2199-4536 2198-6053 |
Published: | Springer Science and Business Media LLC, 2023 |
URI: | https://cronfa.swan.ac.uk/Record/cronfa64260 |
Abstract: | In small business credit risk assessment, the default and nondefault classes are highly imbalanced. To overcome this problem, this study proposes an extended ensemble approach rooted in the weighted synthetic minority oversampling technique (WSMOTE), called WSMOTE-ensemble. The proposed ensemble classifier hybridizes WSMOTE and Bagging with sampling composite mixtures to guarantee the robustness and variability of the generated synthetic instances and thus minimize the small business class-skewed constraints linked to default and nondefault instances. The original small business dataset used in this study was drawn from 3111 records from a Chinese commercial bank. Through a thorough experimental study of extensively skewed data-modeling scenarios, a multilevel experimental setting was established for a rare event domain. Based on appropriate evaluation measures, this study finds that the random forest classifier used in the WSMOTE-ensemble model provides a good trade-off between performance on the default class and performance on the nondefault class. The ensemble solution improved the accuracy of the minority class by 15.16% in comparison with its competitors. This study also shows that sampling methods outperform nonsampling algorithms. With these contributions, this study fills a noteworthy knowledge gap and adds several unique insights regarding the prediction of small business credit risk. |
Keywords: | Small business, Credit risk, Imbalanced data, Oversampling, Weighted SMOTE, Ensemble learning |
College: | Faculty of Humanities and Social Sciences |
Funders: | This work has been supported by the Key Projects of National Natural Science Foundation of China (71731003 and 71431002), the General Projects of National Natural Science Foundation of China (71471027 and 71873103), the National Social Science Foundation of China (16BTJ017), the Youth Project of National Natural Science Foundation of China (71601041), the scientific research project of the Czech Sciences Foundation Grant (19-15498S), the Aderi Intelligent Technology (Xiamen) Co and Bank of Dalian as well as Postal Savings Bank of China. |
Issue: | 4 |
Start Page: | 3559 |
End Page: | 3579 |
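
The abstract describes pairing SMOTE-style oversampling with Bagging. As a rough illustration only, the sketch below shows how one balanced bagging round might be assembled. It uses plain (unweighted) SMOTE interpolation because this record does not specify the WSMOTE weighting scheme, and the function names `smote_oversample` and `balanced_bag` are hypothetical, not taken from the article.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Plain SMOTE (assumption: unweighted; the article's WSMOTE weights
    are not given in this record). Each synthetic point lies on the
    segment between a minority point and one of its k nearest
    minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation position between base and mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def balanced_bag(majority, minority, k=3, seed=0):
    """One bagging round (hypothetical helper): bootstrap the majority
    class, then add synthetic minority points until both classes match."""
    rng = random.Random(seed)
    maj = [rng.choice(majority) for _ in majority]  # sample with replacement
    n_new = max(0, len(maj) - len(minority))
    synth = smote_oversample(minority, n_new, k=k, seed=seed)
    X = maj + list(minority) + synth
    y = [0] * len(maj) + [1] * (len(minority) + len(synth))
    return X, y
```

In an ensemble, `balanced_bag` would be called once per base learner with a different `seed`, each learner (for example a random forest) would be fitted on its own `(X, y)`, and the predictions would be combined by voting.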