Clustering High School Students’ Career Interests Using K-Means with Multi-Metric Validation

https://doi.org/10.22146/ijccs.117507

Noor Latifah(1*), Imelda Annas Fatia(2), Soni Adiyono(3)

(1) Universitas Muria Kudus
(2) Universitas Muria Kudus
(3) Universitas Muria Kudus
(*) Corresponding Author

Abstract


Understanding students' career interests is essential for supporting effective career guidance programs in schools. However, identifying patterns of career interest among students is often challenging because the underlying motivational, cognitive, and planning-related factors are diverse. This study segments high school students' career interests by applying clustering techniques to questionnaire data. The K-means algorithm is used together with the Elbow Method to determine the most appropriate number of clusters. Data preparation consisted of cleaning the data, normalizing it with Min-Max scaling, and reducing the number of variables with principal component analysis (PCA) to facilitate visualization and initial analysis. Cluster validity was then evaluated using three internal validation indices: the Silhouette Score, the Davies-Bouldin Index, and the Calinski-Harabasz Index. The experimental results show that the data can be grouped into three clusters representing different levels of career interest characteristics among students. The clusters differ in motivation, clarity of career planning, and expectations for future careers. These findings give school counselors a basis for designing targeted career guidance strategies.


Keywords


Career Interest, Clustering Analysis, K-Means Algorithm, Elbow Method, Educational Data Mining
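
The workflow summarized in the abstract (Min-Max scaling, K-means for several candidate values of k, then internal validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the data are synthetic Likert-style scores, the function names (`min_max_scale`, `kmeans`, `validate`) are hypothetical, K-means is seeded with a simple farthest-first rule, and the PCA step is omitted to keep the example dependency-free. It requires Python 3.8 or later and k >= 2.

```python
import math
import random

def min_max_scale(rows):
    """Rescale every column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(r, lows, spans)] for r in rows]

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(c) / len(points) for c in zip(*points)]

def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm with farthest-first seeding (a choice of this sketch)."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(math.dist(r, c) for c in centers)))
    labels = []
    for _ in range(max_iter):
        new = [min(range(k), key=lambda j: math.dist(r, centers[j])) for r in rows]
        if new == labels:
            break  # assignments stopped changing
        labels = new
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = centroid(members)
    return labels

def validate(rows, labels):
    """Return inertia (for an elbow plot) and three internal validity indices."""
    groups = {}
    for r, lab in zip(rows, labels):
        groups.setdefault(lab, []).append(r)
    cents = {lab: centroid(m) for lab, m in groups.items()}
    n, k, overall = len(rows), len(groups), centroid(rows)

    # Within-cluster sum of squared distances (the quantity plotted for the elbow).
    inertia = sum(math.dist(r, cents[lab]) ** 2 for r, lab in zip(rows, labels))

    # Silhouette: mean of (b - a) / max(a, b); singleton clusters contribute 0.
    sil = 0.0
    for r, lab in zip(rows, labels):
        own = groups[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(r, o) for o in own) / (len(own) - 1)
        b = min(sum(math.dist(r, o) for o in m) / len(m)
                for other, m in groups.items() if other != lab)
        sil += (b - a) / (max(a, b) or 1.0)
    sil /= n

    # Davies-Bouldin: mean over clusters of the worst scatter-to-separation ratio.
    scatter = {lab: sum(math.dist(r, cents[lab]) for r in m) / len(m)
               for lab, m in groups.items()}
    db = sum(max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in groups if j != i) for i in groups) / k

    # Calinski-Harabasz: between-cluster over within-cluster dispersion.
    between = sum(len(m) * math.dist(cents[lab], overall) ** 2
                  for lab, m in groups.items())
    ch = (between / (k - 1)) / (inertia / (n - k))

    return {"inertia": inertia, "silhouette": sil,
            "davies_bouldin": db, "calinski_harabasz": ch}

# Synthetic stand-in for questionnaire data: three groups of 30 respondents,
# each with three scores (e.g. motivation, planning clarity, expectations).
rng = random.Random(42)
raw = [[rng.gauss(mu, 0.2) for _ in range(3)]
       for mu in (1.5, 3.0, 4.5) for _ in range(30)]
scaled = min_max_scale(raw)

results = {k: validate(scaled, kmeans(scaled, k)) for k in range(2, 6)}
for k, metrics in results.items():
    print(k, {name: round(v, 3) for name, v in metrics.items()})

best_k = max(results, key=lambda k: results[k]["silhouette"])
print("k with the highest silhouette:", best_k)
```

When reading the output, a higher silhouette and Calinski-Harabasz value and a lower Davies-Bouldin value indicate more compact, better separated clusters, while the inertia column is what an elbow plot would show. On real questionnaire data the indices may disagree, which is one reason to report several of them.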











Copyright (c) 2026 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of :
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
is a scientific journal of research results in Computing and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
email:ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs


