MODEL PREDIKSI FAKTOR-FAKTOR RISIKO OBESITAS MENGGUNAKAN MACHINE LEARNING
Predictive Modeling of Obesity Risk Factors Using Machine Learning
Abstract
Background: Obesity is a major global health concern and a key risk factor for various non-communicable diseases, including diabetes, hypertension, and cardiovascular disorders. Despite extensive studies, accurately identifying the key contributing factors remains a challenge.
Objective: This study aimed to predict the likelihood of obesity from questionnaire-derived clinical and behavioral data using supervised machine learning. Four algorithms were compared: logistic regression, naïve Bayes, support vector machine (SVM), and random forest. Model performance was evaluated using accuracy, precision, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
Methods: We used an open-access dataset from Kaggle comprising 2,111 samples with anthropometric, demographic, and lifestyle data. Of these, 972 individuals were labeled obese and 1,139 non-obese, giving a binary target variable ("Obesity" vs. "Non-Obesity"). Preprocessing included one-hot encoding, label encoding, and train-test splitting. All four models were trained and evaluated on accuracy, AUC, precision, sensitivity, and specificity.
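The preprocessing steps named in the Methods can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the `transport` column and its category names, the 80/20 split ratio, the random seed, and the use of stratification are all assumptions made for illustration. Only the class counts (972 and 1,139) come from the abstract.

```python
import random


def one_hot(values):
    """One-hot encode a categorical column: one 0/1 indicator per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def label_encode(values, positive="Obesity"):
    """Label-encode the binary target: 1 for the positive class, 0 otherwise."""
    return [1 if v == positive else 0 for v in values]


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Split row indices into train/test while keeping class proportions similar."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical categorical feature (illustrative values only).
transport = ["Walking", "Automobile", "Public_Transportation", "Walking"]
encoded, categories = one_hot(transport)

# Target column rebuilt from the class counts reported in the abstract.
labels = label_encode(["Obesity"] * 972 + ["Non-Obesity"] * 1139)
train_idx, test_idx = stratified_split(labels)  # 80/20 ratio is an assumption
```

In practice a library such as scikit-learn would typically handle these steps; the plain-Python version is used here only to make each transformation explicit.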
Results: The model achieved an accuracy of 98.58%, AUC of 99.96%, sensitivity of 98.99%, specificity of 98.21%, and precision of 98.01%. The most influential predictors were weight, frequent consumption of high-caloric food, family history of being overweight, physical activity frequency, and daily water intake.
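As a worked illustration of how the reported threshold-based metrics relate to a confusion matrix, the sketch below computes accuracy, sensitivity, specificity, and precision from four counts. The counts are hypothetical: they are not taken from the paper and were chosen only because, for an assumed held-out set of 423 samples, they reproduce the reported percentages to two decimals. AUC is not included because it depends on the ranking of predicted scores, not on a single confusion matrix.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return threshold-based classification metrics as fractions in [0, 1]."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


# Hypothetical counts (assumption, not reported by the authors).
metrics = binary_metrics(tp=197, fn=2, fp=4, tn=220)
percentages = {name: round(100 * value, 2) for name, value in metrics.items()}
print(percentages)
# {'accuracy': 98.58, 'sensitivity': 98.99, 'specificity': 98.21, 'precision': 98.01}
```

Under these assumed counts, the gap between sensitivity and specificity corresponds to 2 missed obese cases versus 4 non-obese cases flagged as obese.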
Conclusion: The model demonstrated high performance and identified key lifestyle-related features. These findings support machine learning's potential for obesity screening and public health strategy development.
Copyright (c) 2026 Husnul Khuluq, Lazuardi Fatahillah Hamdi, Ayu Nissa Ainni, Tri Cahyani Widiastuti

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

