A hybrid machine learning model based on weighted voting for intelligent classification of bank customer value

Article type: Research article

Authors

1 Ph.D. Candidate, Department of Industrial Management, Faculty of Management, University of Tehran, Tehran, Iran.

2 Professor, Department of Industrial Management, Faculty of Management, University of Tehran, Tehran, Iran.

Abstract
In the competitive banking industry, accurately identifying and classifying customer value plays a key role in designing targeted marketing strategies, allocating resources optimally, and increasing profitability. This study designs and implements an advanced hybrid machine learning model that uses weighted voting to classify bank customers according to their value. Six machine learning algorithms (Random Forest, Gradient Boosting, XGBoost, LightGBM, CatBoost, and Extra Trees) were selected as base models and combined using both hard and soft voting. The contribution weight of each algorithm in the vote was tuned with Optuna to maximize the model's accuracy and balance. In addition, ADASYN was applied to counter class imbalance in real banking data, and the most influential predictive features were identified with the Random Forest algorithm. The proposed model was evaluated with four metrics (accuracy, precision, recall, and F1-score) and compared against 16 classical and modern machine learning algorithms. The results show that the hard voting model, with an accuracy of 0.9426, a precision of 0.9756, and a recall of 0.9112, and the soft voting model, with a precision of 0.9771 and an F1-score of 0.9422, outperform other algorithms such as SVM, KNN, and Logistic Regression and strike an effective balance among the evaluation metrics. With its high accuracy, robustness to imbalanced data, and flexibility in feature selection, this hybrid model offers banks and financial institutions a practical and reliable framework for analyzing customer value.
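
The difference between the two combination rules named in the abstract, weighted hard voting and weighted soft voting, can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' implementation: the three base-model outputs, the weights, and the function names `weighted_hard_vote` and `weighted_soft_vote` are hypothetical, whereas the study combines six base models and tunes the weights with Optuna.

```python
from collections import defaultdict


def weighted_hard_vote(labels, weights):
    """Return the class label backed by the largest total model weight."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def weighted_soft_vote(probas, weights):
    """Weight-average the class probabilities; return (argmax class, averages)."""
    total_weight = sum(weights)
    n_classes = len(probas[0])
    avg = [
        sum(w * p[k] for p, w in zip(probas, weights)) / total_weight
        for k in range(n_classes)
    ]
    return max(range(n_classes), key=avg.__getitem__), avg


# Hypothetical class probabilities from three base models for one customer
# (index 0 = low value, index 1 = high value). Values are made up.
probas = [[0.40, 0.60], [0.45, 0.55], [0.90, 0.10]]
# Each model's own predicted label is the argmax of its probabilities.
labels = [max(range(2), key=p.__getitem__) for p in probas]
# Hypothetical weights; in the study these would come from the tuning step.
weights = [1.0, 1.0, 1.5]

hard = weighted_hard_vote(labels, weights)
soft, avg = weighted_soft_vote(probas, weights)
print("hard vote:", hard)
print("soft vote:", soft, "averaged probabilities:", avg)
```

In this made-up case the two rules disagree: two weakly confident models outvote the third under hard voting, while the third model's strong confidence dominates the weighted average under soft voting. That is one reason the two variants can report different precision and recall on the same data.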

Article title (English)

A hybrid machine learning model based on weighted voting for intelligent classification of bank customer value

Authors (English)

Amir Mohammad Khani 1
Ahmad Jafarnejad 2
Arman Rezasoltani 1
1 Ph.D. Candidate, Department of Industrial Management, Faculty of Management, University of Tehran, Tehran, Iran.
2 Prof., Department of Industrial Management, Faculty of Management, University of Tehran, Tehran, Iran.
Abstract (English)

Accurately determining and categorizing customer value is crucial for targeted marketing campaigns, efficient resource allocation, and increased profitability in the highly competitive and ever-changing banking sector. This study proposes a hybrid machine learning model that uses a weighted voting technique to intelligently classify banking customers based on their value. Six machine learning algorithms were chosen as base models (Random Forest, Gradient Boosting, XGBoost, LightGBM, CatBoost, and Extra Trees) and combined using both hard and soft voting. The Optuna framework was used to tune the contribution weight of each algorithm in the voting process and to optimize the accuracy and balance of the model. Additionally, the ADASYN method was used to address class imbalance in real-world banking data, and the Random Forest algorithm was used to identify the most influential features for prediction. Four metrics (accuracy, precision, recall, and F1-score) were used to assess the proposed model's performance, and it was compared with 16 traditional and contemporary machine learning algorithms. The results showed that the hard voting model obtained an accuracy of 0.9426, a precision of 0.9756, and a recall of 0.9112, while the soft voting model recorded a precision of 0.9771 and an F1-score of 0.9422; both outperformed other algorithms such as SVM, KNN, and Logistic Regression and effectively balanced the evaluation metrics. Because of its high accuracy, resilience to imbalanced data, and feature selection flexibility, this hybrid model provides banks and other financial institutions with a useful and trustworthy framework for analyzing customer value.
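
For readers who want to relate the four reported metrics to one another, the following minimal sketch computes accuracy, precision, recall, and F1-score from binary predictions. It assumes the standard positive-class definitions; the abstract does not state whether the reported figures are binary or averaged across classes, and the toy labels and the helper names `f1_score` and `binary_metrics` are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a single positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Toy labels, made up for illustration (1 = high-value customer).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)

# F1 implied by the hard-voting precision and recall quoted in the abstract,
# valid only under the binary, positive-class assumption stated above.
implied_f1 = f1_score(0.9756, 0.9112)
print(round(implied_f1, 4))
```

Under that assumption, the hard-voting precision and recall imply an F1-score of about 0.9423, which sits next to the 0.9422 reported for soft voting. This is arithmetic on the quoted figures, not a value reported by the authors.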

Keywords (English)

Bank customer value
weighted voting
machine learning
intelligent classification