Comparing text classification algorithms with n-grams for mediation prediction

Retzi Y. Lewu; Kusrini Kusrini; Ainul Yaqin

doi:10.22146/ijccs.93929

Retzi Y. Lewu(1), Kusrini Kusrini(2*), Ainul Yaqin(3)

(1)
(2) (Scopus ID : 36057015500); The College of Information Management and Computer Science AMIKOM Yogyakarta
(3) Universitas AMIKOM Yogyakarta
(*) Corresponding Author

Abstract

The success rate of mediation in civil cases at district courts has remained very low from year to year, creating a backlog of cases that must be resolved at trial. Meanwhile, new cases with similar classifications continue to be registered and must undergo mediation. This study uses data from past mediations as a dataset to predict the mediation outcome of new cases. When n-grams were applied to the preprocessed dataset, values were found only for unigrams (n=1). When models were built with machine learning algorithms, Naïve Bayes, Logistic Regression, and Support Vector Machine (SVM) all achieved the same accuracy of 0.6875, while Decision Tree achieved the lowest accuracy, 0.375. The authors attribute the low score to Decision Tree being more prone to overfitting on Indonesian-language text. The formal sentence patterns in Indonesian-language mediation documents do not exhibit compound words, affixes, word-order variation, or lexical semantics. For future work, the authors recommend trying other classification algorithms, applying the approach to other documents such as court decisions, ranking mediators by mediation success, and implementing the model in an e-mediation application integrated with the case management information system.
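As an illustration of the n-gram step mentioned in the abstract, the following is a minimal sketch of n-gram extraction and counting in plain Python. It is not the authors' implementation: the function names `ngrams` and `ngram_counts`, the whitespace tokenization, and the two toy documents are all assumptions made for this example and are not taken from the study's dataset.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(documents, n):
    """Count n-grams over lowercased, whitespace-tokenized documents."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(doc.lower().split(), n))
    return counts


# Hypothetical toy documents (assumption), not data from the study.
docs = [
    "mediasi berhasil para pihak sepakat",
    "mediasi gagal para pihak tidak sepakat",
]

unigrams = ngram_counts(docs, 1)  # n = 1
bigrams = ngram_counts(docs, 2)   # n = 2

print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

Counts like these would typically be turned into feature vectors before a classifier such as Naïve Bayes is trained; that step is omitted here.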

Keywords

Text classification algorithms, N-gram, Mediation outcome prediction

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Copyright of: IJCCS (Indonesian Journal of Computing and Cybernetics Systems), ISSN 1978-1520 (print); ISSN 2460-7258 (online), is a scientific journal publishing research results in Computing and Cybernetics Systems.
A publication of IndoCEISS. Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281. Fax: +62274 555133. Email: ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs
