Credit Card Fraud Detection on Skewed Data using Machine Learning Techniques
Keywords:
accuracy, credit card fraud, data imbalance, machine learning, precision, skewed data, under-sampling
Abstract
Fraud associated with credit card transactions is increasing at an alarming rate, resulting in huge financial losses for both cardholders and the financial institutions concerned. Most datasets for real-life problems such as this are imbalanced, which makes machine learning models trained on them less robust. Therefore, this research aimed to detect 100% of the fraudulent transactions while minimizing incorrect classifications by first balancing the data using an under-sampling technique and then developing classification models using four machine learning algorithms, namely logistic regression, random forest, k-nearest neighbors, and decision tree classifiers. The performance of the models was evaluated based on accuracy, precision, and recall. The results indicated that random forest recorded the highest accuracy, precision, and recall, at 95.19%, 97.94%, and 92.26% respectively, compared to the other three algorithms.
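As an illustration of the two steps named in the abstract, the following is a minimal sketch of random under-sampling and of the three evaluation metrics, written in plain Python. It is not the authors' implementation: the function names `undersample` and `evaluate`, the toy data, and the choice of label 1 for fraud are all assumptions made for this example, and the classifiers themselves are left out.

```python
import random
from collections import Counter


def undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class has the
    same number of rows as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        # Sample without replacement; the minority class is kept whole.
        for row in rng.sample(members, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [l for _, l in balanced]


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy imbalanced data (assumed): 1000 transactions, 20 labeled as fraud (1).
rows = [[float(i)] for i in range(1000)]
labels = [1 if i % 50 == 0 else 0 for i in range(1000)]

X_bal, y_bal = undersample(rows, labels)
print(Counter(y_bal))  # both classes now have 20 rows each

# Metrics for a small set of made-up predictions.
print(evaluate([1, 1, 0, 0], [1, 0, 0, 1]))
```

Recall is the metric tied to the stated goal of detecting every fraudulent transaction, since it measures the share of actual fraud cases that the model flags. Precision captures the cost of false alarms. Under-sampling discards most of the legitimate transactions, so a study following this approach would normally also check the model on data that keeps the original class ratio.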