Automatic Essay Scoring Using Data Augmentation in Bahasa Indonesia

Nur Fadilah; Sigit Priyanta

doi:10.22146/ijccs.76396

Nur Fadilah (1*), Sigit Priyanta (2)

(1) Master Program of Computer Science, FMIPA UGM, Yogyakarta
(2) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(*) Corresponding Author

Abstract

Essays are a form of assessment used to gauge students' abilities in depth. UKARA is an automatic essay scoring system that combines natural language processing (NLP) and machine learning. This study uses the datasets provided for the UKARA challenge, which consist of two sets, A and B. These datasets are small for model training, which is one reason the resulting models are not optimal.

This research focuses on augmenting the data using EDA (Easy Data Augmentation Techniques). Four methods are applied: Synonym Replacement (SR), Random Insertion (RI), Random Swap (RS), and Random Deletion (RD). The data are then used to build a model with the BiLSTM method. Model performance is evaluated using a confusion matrix, from which accuracy, precision, recall, and F-measure are computed.
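The four EDA operations named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `SYNONYMS` table is a hypothetical toy dictionary (a real system would draw on an Indonesian thesaurus), and the function names and the parameters `n` and `p` are assumptions chosen for the example.

```python
import random

# Hypothetical toy synonym table, for illustration only.
SYNONYMS = {
    "besar": ["agung", "raya"],
    "cepat": ["lekas", "laju"],
}

def synonym_replacement(words, n, rng):
    """SR: replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out

def random_insertion(words, n, rng):
    """RI: n times, insert a synonym of a random word at a random position."""
    out = list(words)
    for _ in range(n):
        candidates = [w for w in out if w in SYNONYMS]
        if not candidates:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out

def random_swap(words, n, rng):
    """RS: n times, swap two randomly chosen positions."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(words, p, rng):
    """RD: drop each word with probability p, always keeping at least one."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]

rng = random.Random(0)  # fixed seed so runs are repeatable
tokens = "rumah besar itu dibangun dengan cepat".split()
augmented = [
    synonym_replacement(tokens, 1, rng),
    random_insertion(tokens, 1, rng),
    random_swap(tokens, 1, rng),
    random_deletion(tokens, 0.1, rng),
]
for sentence in augmented:
    print(" ".join(sentence))
```

Each call returns a new token list and leaves the input untouched, so one original answer can yield several augmented variants that share its label.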

The results show that, for dataset A, the model trained without augmentation and evaluated with k-fold cross-validation produced the highest accuracy, 85.07%. For dataset B, EDA random insertion with k-fold cross-validation gave 72.78%.
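As an illustration of the evaluation described above, the sketch below splits sample indices into k folds and derives accuracy, precision, recall, and F-measure from confusion-matrix counts. It assumes binary labels (1 = correct answer, 0 = incorrect), which the abstract does not state, and the helper names `kfold_indices` and `binary_metrics` are hypothetical. The abstract also does not say whether augmentation was applied before or after splitting; a common precaution is to augment only the training fold so that variants of a test answer do not leak into training.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Distribute the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def binary_metrics(y_true, y_pred):
    """Compute metrics from confusion-matrix counts, assuming labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Made-up labels and predictions, for illustration only.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(binary_metrics(y_true, y_pred))

for train_idx, test_idx in kfold_indices(10, 3):
    print(len(train_idx), len(test_idx))
```

In a full pipeline, the per-fold metrics would be averaged over the k folds to give the reported score.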

Keywords

Short Essay Test; Data Augmentation; fastText; BiLSTM

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Copyright of: IJCCS (Indonesian Journal of Computing and Cybernetics Systems), ISSN 1978-1520 (print); ISSN 2460-7258 (online), is a scientific journal of research results in Computing and Cybernetics Systems.
A publication of IndoCEISS. Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281. Fax: +62274 555133. Email: ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs
