Enhancing Prediction of Parkinson’s Disease Using Stacked Ensemble Learning Algorithm
Keywords:
Parkinson disease, gain ratio, stacked ensemble, voice recording
Abstract
Accurate, early diagnosis of Parkinson's disease (PD) remains a challenge. This work explores stacked ensemble learning for improved PD prediction from voice data. The "Parkinson's" dataset, consisting of 195 instances described by 22 voice-recording features, was downloaded from Kaggle. Preprocessing included resampling with the Synthetic Minority Oversampling Technique (SMOTE) to correct possible class imbalance, and normalization with Min-Max scaling. Gain Ratio was used for feature ranking, and experiments were run with the top 5 and top 10 ranked features. Four machine learning algorithms were compared: K-Nearest Neighbor (KNN), Logistic Regression, Random Forest, and a Stacked Ensemble with Support Vector Machine (SVM), KNN, and Random Forest as base learners and Logistic Regression as the meta-learner. The models were evaluated with a hold-out strategy using accuracy, precision, recall, and F1-score. The Stacked Ensemble performed best, particularly when trained on the top 10 features (Accuracy: 95.7%, Precision: 95.0%, Recall: 95.0%, F1-Score: 95.0%), outperforming all individual models as well as the results obtained with only the top 5 features. The study concludes that stacked ensemble learning coupled with effective feature selection is an effective approach to improving Parkinson's disease prediction from voice data.
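The Gain Ratio ranking step mentioned in the abstract can be illustrated with a minimal, standard-library Python sketch. Gain Ratio divides a feature's information gain by the entropy of the split the feature induces, which penalizes features that fragment the data into many small groups. The abstract does not say how the continuous voice features were discretized or which tool computed the ranking, so the equal-width binning, the function names, and the toy data below are all assumptions made for illustration, not the authors' procedure.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def equal_width_bins(values, n_bins=2):
    """Discretize a continuous feature into equal-width bins.

    Assumption: the source does not state a discretization method.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def gain_ratio(feature_bins, labels):
    """Information gain of a discrete feature divided by its split entropy."""
    total = len(labels)
    groups = {}
    for b, y in zip(feature_bins, labels):
        groups.setdefault(b, []).append(y)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_bins)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels, n_bins=2):
    """Return (feature name, gain ratio) pairs sorted from best to worst."""
    scores = {
        name: gain_ratio(equal_width_bins(values, n_bins), labels)
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: feature_a separates the two classes, feature_b does not.
toy_columns = {
    "feature_a": [0.1, 0.2, 0.8, 0.9],
    "feature_b": [0.1, 0.9, 0.2, 0.8],
}
toy_labels = [0, 0, 1, 1]

ranking = rank_features(toy_columns, toy_labels)
top_k = [name for name, _ in ranking[:1]]  # keep the k best-ranked features
```

In a study like this one, the same ranking would be computed over all 22 features and the first 5 or 10 names kept for training. The number of bins changes the scores, so any real use should state the discretization choice explicitly.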
License
Copyright (c) 2025 UNIABUJA Journal of Engineering and Technology (UJET)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.