Automatic Essay Scoring Using Data Augmentation in Bahasa Indonesia

Nur Fadilah(1*), Sigit Priyanta(2)

(1) Master Program of Computer Science, FMIPA UGM, Yogyakarta
(2) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(*) Corresponding Author


Essays are one form of assessment used to gauge students' abilities in depth. UKARA is an automatic essay scoring system that combines NLP and machine learning. This study uses the datasets provided for the UKARA challenge, which consist of two sets, A and B. The provided datasets are small for model training, which is one reason the resulting models are not optimal.

This research focuses on augmenting the data using EDA (Easy Data Augmentation Techniques). Four methods are applied: Synonym Replacement (SR), Random Insertion (RI), Random Swap (RS), and Random Deletion (RD). The data is then used to build a model with the BiLSTM method. Model performance is evaluated using a confusion matrix, with accuracy, precision, recall, and F-measure values.
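The four EDA operations named above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `SYNONYMS` table, and the parameters `n` and `p` are assumptions made for illustration, and a real system would draw synonyms from an Indonesian thesaurus rather than a hand-written dictionary.

```python
import random

# Hypothetical toy synonym table (assumption); not taken from the paper.
SYNONYMS = {"besar": ["raya", "agung"], "cepat": ["lekas", "kilat"]}

def synonym_replacement(words, n, rng):
    """SR: replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out

def random_insertion(words, n, rng):
    """RI: insert a synonym of a random known word at a random position, n times."""
    out = list(words)
    known = [w for w in words if w in SYNONYMS]
    for _ in range(n):
        if not known:
            break  # nothing to draw a synonym from
        synonym = rng.choice(SYNONYMS[rng.choice(known)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out

def random_swap(words, n, rng):
    """RS: swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(words, p, rng):
    """RD: drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]
```

Each function takes a tokenized answer and a `random.Random` instance, so augmented copies can be reproduced from a fixed seed, for example `random_swap("jawaban siswa sangat cepat".split(), 1, random.Random(0))`.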

The results show that, on dataset A, the model without augmentation evaluated using k-fold cross validation produced the highest accuracy, 85.07%. On dataset B, EDA insertion with k-fold cross validation gave an accuracy of 72.78%.
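As a rough illustration of the evaluation described above, the sketch below splits sample indices into k folds and derives accuracy, precision, recall, and F-measure from confusion-matrix counts. It assumes binary labels (1 = correct answer, 0 = incorrect) and contiguous, unshuffled folds; the helper names `k_fold_indices` and `binary_metrics` are hypothetical, and the abstract does not state the value of k or the authors' exact procedure.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def binary_metrics(y_true, y_pred, positive=1):
    """Compute metrics from the four confusion-matrix counts (binary labels assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}
```

In a k-fold run, a model would be trained on each `train` split, scored on the matching `test` split with `binary_metrics`, and the per-fold values averaged.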


Test Brief Description; Data Augmentation; Fasttext; BiLSTM







Copyright (c) 2022 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Copyright of :
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
is a scientific journal publishing results in Computing
and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281