Prediction of Students’ Success in Secondary Education Using Selected Machine Learning Techniques
Keywords:
Machine Learning, Logistic Regression, k-Nearest Neighbour, AdaBoost, Decision Stump, Secondary Education
Abstract
The success of secondary school students in passing final certificate examinations is an important factor in their progress to higher education and in their becoming skilled workers or entrepreneurs. The problem is multi-faceted, because student experiences and expectations differ from school to school. Failure to address it, however, can leave a high percentage of the workforce unskilled, which adversely affects the development of any country. This paper evaluates the performance of three machine learning algorithms in predicting the success or failure of a final-year secondary school student: Logistic Regression, AdaBoost with Decision Stump, and k-Nearest Neighbour (kNN). The experimental dataset was obtained from Kaggle and contains 395 instances, each originally having 31 attributes. Feature selection with a best-first attribute selection filter reduced the attributes to five, after which the algorithms were evaluated with ten-fold cross-validation in WEKA 3.9.5. AdaBoost with Decision Stump performed best, with an average accuracy of 71.65%, followed by Logistic Regression with 70.63% and kNN with 70.13%. We intend to experiment with data obtained locally from secondary schools in Nigeria to further validate the performance of these and other machine learning algorithms.
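
For readers unfamiliar with the evaluation protocol, the following is a minimal Python sketch of ten-fold cross-validation applied to a kNN classifier. It is not the WEKA 3.9.5 setup used in the experiments. The function names (`k_fold_indices`, `knn_predict`, `cross_val_accuracy`), the choice of k = 3, and the toy data are all assumptions made for illustration. The data are synthetic and deliberately well separated, so the resulting accuracy says nothing about the student dataset.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x
    (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, folds=10, k=3, seed=0):
    """Average the held-out accuracy over all folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), folds, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical toy data: 100 instances with five numeric attributes each,
# standing in for the five attributes kept after feature selection.
rng = random.Random(42)
X = [[rng.uniform(0, 1) for _ in range(5)] for _ in range(50)]
X += [[rng.uniform(5, 6) for _ in range(5)] for _ in range(50)]
y = ["fail"] * 50 + ["pass"] * 50

print(cross_val_accuracy(X, y))  # 1.0, because the toy classes do not overlap
```

Each instance is held out exactly once, and the reported figure is the mean accuracy over the ten held-out folds. An averaged accuracy such as those quoted above would be computed this way, although WEKA's own fold assignment may differ in detail.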