Using a naive Bayesian classifier methodology for loan risk assessment: Evidence from a Tunisian commercial bank


  • Aida Krichene, Department of Accounting, IHEC Carthage, Tunis, Tunisia


Keywords. ROC curve, Risk assessment, Default risk, Banking sector, Bayesian classifier algorithm


Purpose. Evaluating loan default risk (credit risk) is important to financial institutions that lend to businesses and individuals, because every loan carries a risk of default. To understand the risk levels of credit users (corporations and individuals), credit providers (banks) normally collect vast amounts of information on borrowers. Statistical predictive techniques can then be used to estimate the level of risk involved in a loan. This paper addresses the prediction of default on short-term loans for a Tunisian commercial bank.

Design/methodology/approach. The study uses a database of 924 credit files for loans granted to Tunisian industrial companies by a commercial bank in 2003, 2004, 2005 and 2006. A naive Bayesian classifier algorithm is applied, and the results show a correct classification rate of about 63.85 per cent. The default probability is explained by variables measuring working capital, leverage, solvency, profitability and cash flow.
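As an illustration of the classifier named above, the following is a minimal sketch of a naive Bayesian classifier, not the paper's implementation. It assumes Gaussian likelihoods for each financial ratio, which the abstract does not specify. The function names (`fit_gaussian_nb`, `predict_proba`) and the toy ratios are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / len(col), 1e-9)
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model


def predict_proba(model, row):
    """Posterior P(class | row), treating features as independent given the class."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # Log of the Gaussian density for one feature (assumed likelihood).
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_scores[label] = score

    # Normalise in log space for numerical stability.
    top = max(log_scores.values())
    unnormalised = {k: math.exp(s - top) for k, s in log_scores.items()}
    total = sum(unnormalised.values())
    return {k: u / total for k, u in unnormalised.items()}


# Hypothetical toy data: [leverage ratio, profitability ratio]; 1 = default.
X = [[0.90, 0.02], [0.80, 0.01], [0.85, 0.03],
     [0.30, 0.12], [0.40, 0.15], [0.35, 0.10]]
y = [1, 1, 1, 0, 0, 0]

model = fit_gaussian_nb(X, y)
print(predict_proba(model, [0.82, 0.02]))
```

The "naive" part is the sum over features inside `predict_proba`: each ratio contributes its own log-likelihood term, as if the ratios were independent once the class (default or not) is known.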

Findings. The validation test shows a correct classification rate of about 58.66 per cent; nevertheless, the type I and type II error rates remain relatively high, at 42.42 and 40.47 per cent, respectively. A receiver operating characteristic (ROC) curve is plotted to evaluate the performance of the model, and the area under the curve is about 69 per cent.
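The evaluation measures reported above can be sketched as follows. This is an illustrative example on made-up scores, not the paper's data or code. It assumes that a type I error is a defaulting firm classified as healthy and a type II error is a healthy firm classified as defaulting; the abstract does not define the two, and some authors use the opposite convention. The function names are hypothetical.

```python
def confusion_rates(y_true, y_pred):
    """Correct classification rate and the two error rates (1 = default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "correct_classification_rate": (tp + tn) / len(pairs),
        # Assumed convention: defaulters wrongly classified as healthy.
        "type_I_error": fn / (tp + fn),
        # Assumed convention: healthy firms wrongly classified as defaulters.
        "type_II_error": fp / (tn + fp),
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen defaulter receives a
    higher score than a randomly chosen non-defaulter (ties count half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predicted default probabilities for eight firms.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

print(confusion_rates(y_true, y_pred))
print(auc(y_true, scores))
```

Unlike the classification rate and the two error rates, the area under the curve does not depend on the cut-off: it summarises how well the scores rank defaulters above non-defaulters across all thresholds, with 0.5 corresponding to random ranking.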

Originality/value. The paper highlights that the Tunisian central bank has required all commercial banks to conduct a survey to collect qualitative data for better credit rating of borrowers.









How to Cite

Krichene, A. (2017). Using a naive Bayesian classifier methodology for loan risk assessment: Evidence from a Tunisian commercial bank. Journal of Economics, Finance and Administrative Science, 22(42), 3–24.