Performance Evaluation of Machine Learning Algorithms for AIDS-Infected Patient Classification

https://doi.org/10.22146/jkesvo.107716

Ardi Kurniawan(1*), Citrawani Marthabakti(2), Larisa Mutiara Putri(3), Billy Christandy Suyono(4), Rindiani Ahmada Alisiah(5)

(1) Department of Mathematics, Faculty of Science and Technology, Airlangga University
(2) Department of Mathematics, Faculty of Science and Technology, Airlangga University
(3) Department of Mathematics, Faculty of Science and Technology, Airlangga University
(4) Department of Mathematics, Faculty of Science and Technology, Airlangga University
(5) Department of Mathematics, Faculty of Science and Technology, Airlangga University
(*) Corresponding Author

Abstract


Background: According to UNAIDS (2023), approximately 39.9 million people are living with HIV worldwide, with 1.3 million new cases and 630,000 AIDS-related deaths in 2023. This indicates that HIV/AIDS remains a serious global health threat. Machine learning methods have the potential to improve the accuracy of AIDS infection classification.

Objective: This research aimed to compare classification methods by prediction accuracy and to identify the best-performing method for further analysis.

Methods: This research used a quantitative approach by evaluating the performance of machine learning algorithms: Decision Tree, Random Forest, XGBoost, Naive Bayes, and Logistic Regression. Secondary data were obtained from the UCI Machine Learning Repository, comprising 2,000 observations of AIDS patients and 23 variables. Model evaluation used a confusion matrix to calculate accuracy, precision, recall, and F1-score. The best model, logistic regression, was further analyzed with parameter significance tests, odds ratios, and goodness of fit.
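As an illustration of the evaluation step, the four metrics named above can be derived from the cells of a binary confusion matrix. The sketch below is not the authors' code; the class name `ConfusionCounts`, the function name `classification_metrics`, and the counts in the usage line are hypothetical and chosen only for demonstration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # infected, predicted infected
    fp: int  # not infected, predicted infected
    fn: int  # infected, predicted not infected
    tn: int  # not infected, predicted not infected


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall, and F1-score from the counts."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for demonstration only (not the study's results).
example = classification_metrics(ConfusionCounts(tp=90, fp=10, fn=10, tn=90))
```

The zero-denominator guards return 0.0 rather than raising, which is one common convention; other tools may report such cases as undefined.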

Results: Logistic regression achieved an accuracy of 88.4%, precision and recall of 90% each, and the highest F1-score among the five models. The variables significantly associated with AIDS infection were time, preanti, symptom, offtrt, and cd420. The model passed the Hosmer-Lemeshow goodness-of-fit test (p-value = 0.365) and had a Nagelkerke R-squared of 0.642.
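For readers less familiar with these quantities, the following minimal sketch shows how an odds ratio and a Nagelkerke R-squared are usually computed for a fitted logistic regression. It assumes the standard textbook definitions (odds ratio as the exponential of a coefficient; Nagelkerke as the Cox-Snell R-squared rescaled by its maximum). The function names and any inputs are hypothetical, and the abstract does not report the coefficients or log-likelihoods needed to reproduce its figures.

```python
import math


def odds_ratio(beta: float) -> float:
    """Odds ratio for a one-unit increase in a predictor with coefficient beta."""
    return math.exp(beta)


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke R-squared from null and fitted log-likelihoods.

    ll_null  : log-likelihood of the intercept-only model
    ll_model : log-likelihood of the fitted model
    n        : number of observations
    """
    # Cox-Snell R-squared: 1 - (L_null / L_model)^(2/n)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Its upper bound, reached when the fitted model has likelihood 1.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell
```

An odds ratio above 1 indicates higher odds of the outcome as the predictor increases, and below 1 indicates lower odds. The Nagelkerke value ranges from 0 (no improvement over the intercept-only model) to 1.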

Conclusion: Machine learning approaches, particularly logistic regression, support early detection of AIDS and data-driven medical decision-making.


Keywords


Logistic Regression; Classification Accuracy



References

Alehegn, M. (2022). Application of machine learning and deep learning for the prediction of HIV / AIDS. International Journal of HIV-Related Problems, 21(1), 17–23. https://doi.org/10.5114/hivar.2022.112852

Biau, G. (2012). Analysis of a random forests model. Journal of Machine Learning Research, 13(1), 1063–1095.

Chamid, A. A., Nindyasari, R., & Ghozali, M. I. (2025). Comparative Analysis of Machine Learning Algorithms for. 5(158), 185–194.

Dalal, S., Lilhore, U. K., Faujdar, N., Simaiya, S., Agrawal, A., Rani, U., & Mohan, A. (2025). Enhancing thyroid disease prediction with improved XGBoost model and bias management techniques. Multimedia Tools and Applications, 84(16), 16757–16788.

Hammer, S. M., Katzenstein, D. A., Hughes, M. D., Gundacker, H., Schooley, R. T., Haubrich, R. H., Henry, W. K., Lederman, M. M., Phair, J. P., Niu, M., Hirsch, M. S., & Merigan, T. C. (1996). Therapy in HIV-Infected Adults With CD4 Cell Counts. The New England Journal of Medicine, 335(15), 1081–1090.

Hosmer, D. W., Hosmer, T., Le Cessie, S., & Lemeshow, S. (1997). A comparison of goodness‐of‐fit tests for the logistic regression. Statistics in Medicine, 16(9), 965–980.

Ibrahim, N. S. (2024). Analisis Diskriminan Linear Robust dengan Penduga Minimum Covariance Determinant (Studi Kasus: Indeks Kerentanan Pangan Menurut Kabupaten/Kota di Indonesia Tahun 2023). Emerging Statistics and Data Science Journal, 2(2), 264–279. https://doi.org/10.20885/esds.vol2.iss.2.art20

Lu, Y., Ye, T., & Zheng, J. (2022). Decision Tree Algorithm in Machine Learning. 2022 IEEE International Conference on Advances in Electrical Engineering and Computer Applications, AEECA 2022, 1014–1017. https://doi.org/10.1109/AEECA55500.2022.9918857

Moore, R. D. (2011). Epidemiology of HIV infection in the United States: implications for linkage to care. Clinical Infectious Diseases, 52(2), 208–213. https://doi.org/10.1093/cid/ciq044

Mutiara, E., No, J. K. R., Barat, R., & Cengkareng, J. B. (2020). Algoritma Klasifikasi Naive Bayes Berbasis Particle Swarm Optimization Untuk Prediksi Penyakit Tuberculosis (Tb). Swabumi, 8(1), 46–58. https://doi.org/10.31294/swabumi.v8i1.7668

Norhidayati, A., Farid, F. M., Annisa, S., & Hamidy, A. (2023). Application of Binary Logistic Regression Analysis on Household Welfare in Banjarmasin City. Jurnal Al-Qardh, 8(2), 135–148. https://doi.org/10.23971/jaq.v8i2.7637

Premeaux, T. A., Bowler, S., Friday, C. M., Hoenigl, C. B. M. M., Landay, M. M. L. A. L., Gianella, S., Ndhlovu, L. C., Team, & Study, A. C. T. G. N. 411. (2024). Machine learning models based on fluid immunoproteins that predict non-AIDS adverse events in people with HIV. iScience, 27(6). https://doi.org/10.1016/j.isci.2024.109945

Salih, A. A., & Abdulazeez, A. M. (2021). Evaluation of Classification Algorithms for Intrusion Detection System: A Review. Journal of Soft Computing and Data Mining, 2(1), 31–40. https://doi.org/10.30880/jscdm.2021.02.01.004

Sui, Q., Li, G., Peng, Y., Zhang, J., Zhang, Y., & Zhao, R. (2025). Scalable and robust machine learning framework for HIV classification using clinical and laboratory data. Scientific Reports, 15(18727). https://doi.org/10.1038/s41598-025-00085-4

UNAIDS. (2023). Global HIV & AIDS statistics Fact sheet. Joint United Nations Programme on HIV/AIDS.

Zakaria, Y. S., Ariffin, N. A., Ahmad, A., Rainis, R., Muslim, A. M., & Wan Ibrahim, W. M. M. (2025). Optimizing Tuberculosis Treatment Predictions: A Comparative Study of XGBoost with Hyperparameter in Penang, Malaysia. Sains Malaysiana, 54(1), 3741–3752. https://doi.org/10.17576/jsm-2025-5401-22

Zaman, S., B, W. A., Siddiqui, M. K., Mumtaz, A., & Kosar, Z. (2025). Role of eccentricity based topological descriptors to predict anti-HIV drugs attributes with supervised machine learning algorithms. Computers in Biology and Medicine, 190(110101). https://doi.org/10.1016/j.compbiomed.2025.1101



Copyright (c) 2025 The Author(s)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


 

Jurnal Kesehatan Vokasional, registered under ISSN 2541-0644 (print) and ISSN 2599-3275 (online), is published by the Department of Health Information Management and Services, Vocational College, Universitas Gadjah Mada.
