Purpose This study analyzes the relationship between consumption patterns and default risk among financially vulnerable households in a rapidly changing economic environment. These households are more susceptible to economic shocks, and their consumption patterns can substantially raise their risk of default. The study therefore seeks a systematic approach to predicting and managing this risk in advance.
Design/methodology/approach The study uses data from the Korea Welfare Panel Study (KOWEPS) to analyze the consumption patterns and default status of financially vulnerable households. Because defaults form a small minority of the observations, resampling techniques such as SMOTE, SMOTE-ENN, and SMOTE-Tomek Links were applied to address the class imbalance. Several machine learning algorithms, including Logistic Regression, Decision Tree, Random Forest, and Support Vector Machine (SVM), were used to build the prediction models. Model performance was evaluated using the confusion matrix and the F1-score.
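To make the resampling step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest minority neighbours. The abstract does not state which implementation the study used, so the function name `smote_oversample`, its parameters, and the toy data are all hypothetical and serve only as an illustration. The ENN and Tomek Links cleaning steps are not shown.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration; not the study's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base` (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority-class points with two features each (made-up values, not KOWEPS data).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_oversample(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real minority points, it stays inside the region the minority class already occupies. Resampling of this kind is normally applied to the training split only, so that the evaluation data keep the original class distribution.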
Findings With the original imbalanced data, prediction performance for the minority class (default) was poor. After imbalance handling techniques such as SMOTE were applied, predictive performance for the minority class improved substantially. In particular, the Random Forest model combined with SMOTE-Tomek Links showed the highest predictive performance, making it the most suitable of the tested models for default prediction. These results suggest that addressing data imbalance effectively is crucial to developing accurate default prediction models, and that appropriate resampling can greatly enhance predictive performance.
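The gap between overall accuracy and minority-class performance can be illustrated with a small worked example. The sketch below derives accuracy, precision, recall, and F1 for the default class from the four cells of a confusion matrix. The function name `minority_metrics` and the counts are hypothetical and are not results from the study.

```python
def minority_metrics(tp, fp, fn, tn):
    """Accuracy plus precision, recall and F1 for the positive (default) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts: 1,000 households, 50 of which default.
# The model flags 10 households and is right about 5 of them.
metrics = minority_metrics(tp=5, fp=5, fn=45, tn=945)
# accuracy = 0.95, precision = 0.5, recall = 0.1, F1 is roughly 0.167
```

In this example the model is 95% accurate yet finds only one in ten defaulting households, which is why a minority-class F1-score is more informative than accuracy for imbalanced default data.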