Development of a Phishing Detection System Using Ensemble Machine Learning Method

Authors

  • Bosede O Oguntunde Redeemer’s University, Ede Osun State, Nigeria.
  • Chizor S Iwuh Redeemer’s University, Ede Osun State, Nigeria.
  • Theresa Ojewumi Redeemer’s University, Ede Osun State, Nigeria.
  • Michael O Abolarinwa Osun State University, Osogbo Osun State, Nigeria.

Keywords:

Phishing, Machine Learning, Machine Learning algorithms, Random Forest, Logistic Regression, Naïve Bayes, Support Vector Machine

Abstract

Over the years, phishing has been a major problem, causing people to lose sensitive information and, in turn, financial assets. Machine learning algorithms have been used to assess phishing in several channels, including websites, emails and text messages. However, phishing attacks continue to increase in frequency and sophistication despite numerous attempts to combat them, so improved detection mechanisms are needed. This study therefore assessed four machine learning algorithms (Random Forest (RF), Logistic Regression (LR), Naïve Bayes (NB) and Support Vector Machine (SVM)), built an ensemble model from them and developed a system that uses this model to detect phishing websites. A dataset of 549,347 website records, obtained from the Kaggle machine learning repository, was split into two parts: 70% to train the ensemble model and 30% to test it. Two categories of URL features were selected: lexical-based features and domain-based features. The performance of the four algorithms was evaluated using accuracy, precision, recall and F1-score. The model was implemented in Python in Jupyter Notebook, and an accuracy of 97.42% was recorded. The results showed that the proposed model is comparable to existing models, with accuracies of 96%, 98%, 72% and 97% for LR, SVM, RF and NB respectively. The model was used to develop a user-friendly system in which users can paste a URL to check whether the address is safe. The system, however, is limited to HTTP protocols and might not be equipped to handle short URLs.
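
The abstract names lexical URL features and an ensemble of four classifiers but does not list the exact features or the rule used to combine the classifiers. The sketch below is an illustration only, under stated assumptions: the feature names in `lexical_features`, the hard majority vote in `majority_vote`, and the tie-breaking choice are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few simple lexical features of a URL.

    Assumption: these features are illustrative; the paper's actual
    feature set is not given in the abstract.
    """
    # urlparse only fills in the host when a scheme is present,
    # so prepend one for bare inputs such as "example.com/login".
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


def majority_vote(votes, tie_label="phishing"):
    """Combine per-classifier labels by hard majority vote.

    Assumption: hard voting with ties resolved to `tie_label`; with four
    voters a 2-2 split is possible, and flagging the URL is the more
    cautious default. The paper may combine its models differently.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


if __name__ == "__main__":
    print(lexical_features("http://secure-login.example.com/a/b?id=12"))
    # Hypothetical outputs of the four classifiers (LR, SVM, RF, NB) for one URL.
    print(majority_vote(["phishing", "legitimate", "phishing", "phishing"]))
```

In a full pipeline, the feature dictionaries would be converted to numeric vectors, each of the four classifiers would be trained on the 70% split, and their predictions on the 30% split would be combined before computing accuracy, precision, recall and F1-score.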

Published

2024-08-05

How to Cite

Oguntunde, B. O., Iwuh, C. S., Ojewumi, T., & Abolarinwa, M. O. (2024). Development of a Phishing Detection System Using Ensemble Machine Learning Method. LAUTECH JOURNAL OF COMPUTING AND INFORMATICS, 4(2), 87-100. Retrieved from https://laujci.lautech.edu.ng/index.php/laujci/article/view/125