Clustering High School Students’ Career Interests Using K-Means with Multi-Metric Validation
Noor Latifah(1*), Imelda Annas Fatia(2), Soni Adiyono(3)
(1) Universitas Muria Kudus
(2) Universitas Muria Kudus
(3) Universitas Muria Kudus
(*) Corresponding Author
Abstract
Understanding students' career interests is essential for supporting effective career guidance programs in schools. However, identifying patterns of career interest is often difficult because the underlying motivational, cognitive, and planning-related factors are diverse. This study segments high school students' career interests by clustering questionnaire data. It applies the K-means algorithm together with the Elbow Method to determine the most appropriate number of clusters. Data preparation comprised data cleaning, normalization with Min-Max scaling, and dimensionality reduction with principal component analysis (PCA) to facilitate visualization and initial analysis. Cluster validity was evaluated with three internal validation indices: the Silhouette score, the Davies-Bouldin Index, and the Calinski-Harabasz Index. The experimental results show that the data can be grouped into three clusters representing different levels of career interest among students. The clusters differ in motivation, clarity of career planning, and expectations for future careers. These findings give school counselors a basis for designing targeted career guidance strategies.
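The workflow named in the abstract (Min-Max scaling, K-means over a range of k, the Elbow Method's within-cluster sum of squares, and the three internal validation indices) can be illustrated with a minimal, standard-library-only sketch. Everything below is an assumption made for illustration: the data are synthetic 1-to-5 style scores in three hypothetical columns (motivation, planning clarity, expectations), not the study's questionnaire; the function names are invented here; the deterministic farthest-point seeding is a choice of this sketch and not a documented step of the study; and the PCA step is omitted for brevity. In practice a library such as scikit-learn would typically be used instead.

```python
import math
import random


def min_max_scale(rows):
    """Rescale every column to [0, 1]; a constant column maps to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - lo) or 1.0 for c, lo in zip(cols, lows)]
    return [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]


def mean_point(points):
    return [sum(c) / len(points) for c in zip(*points)]


def groups(points, labels):
    out = {}
    for p, l in zip(points, labels):
        out.setdefault(l, []).append(p)
    return out


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with farthest-point seeding (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        new = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    wcss = sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return labels, wcss


def silhouette(points, labels):
    """Mean silhouette coefficient (higher is better); singletons score 0."""
    g = groups(points, labels)
    scores = []
    for p, l in zip(points, labels):
        own = g[l]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in m) / len(m) for o, m in g.items() if o != l)
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


def davies_bouldin(points, labels):
    """Davies-Bouldin Index (lower is better)."""
    g = groups(points, labels)
    cent = {l: mean_point(m) for l, m in g.items()}
    scat = {l: sum(math.dist(p, cent[l]) for p in m) / len(m) for l, m in g.items()}
    worst = [max((scat[i] + scat[j]) / math.dist(cent[i], cent[j]) for j in g if j != i)
             for i in g]
    return sum(worst) / len(g)


def calinski_harabasz(points, labels):
    """Calinski-Harabasz Index (higher is better)."""
    g = groups(points, labels)
    n, k, overall = len(points), len(g), mean_point(points)
    between = within = 0.0
    for members in g.values():
        c = mean_point(members)
        between += len(members) * math.dist(c, overall) ** 2
        within += sum(math.dist(p, c) ** 2 for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Synthetic stand-in data: three groups of 20 hypothetical respondents.
rng = random.Random(42)
centers = [(1.5, 1.5, 2.0), (3.0, 3.0, 3.0), (4.5, 4.5, 4.0)]
data = [[c + rng.uniform(-0.25, 0.25) for c in ctr] for ctr in centers for _ in range(20)]
scaled = min_max_scale(data)

results = {}
for k in range(2, 7):
    labels, wcss = kmeans(scaled, k)
    results[k] = {
        "labels": labels,
        "wcss": wcss,
        "silhouette": silhouette(scaled, labels),
        "davies_bouldin": davies_bouldin(scaled, labels),
        "calinski_harabasz": calinski_harabasz(scaled, labels),
    }
    print(k, {name: round(v, 3) for name, v in results[k].items() if name != "labels"})
```

Reading the printed table, the Elbow Method looks for the k after which the within-cluster sum of squares stops dropping sharply, while the three indices offer independent checks on the same choice. On real questionnaire data the indices may disagree, which is one reason to report several of them.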
Copyright (c) 2026 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.