Classifying Indonesian Hoax News Titles with SVM, XGBoost, and BiLSTM

https://doi.org/10.22146/ijccs.106608

I Nyoman Prayana Trisna(1*), I Made Wiraharja Jaya Putra(2), Wayan Oger Vihikan(3)

(1) Udayana University
(2) Udayana University
(3) Udayana University
(*) Corresponding Author

Abstract


This study investigates the automated detection of hoaxes about President Jokowi in Indonesian news using only news titles, with the aim of detecting hoaxes efficiently and reducing traffic to harmful websites. We compared traditional machine learning algorithms (SVM, XGBoost) with a deep learning algorithm (BiLSTM), each with and without the Synthetic Minority Over-sampling Technique (SMOTE), which was applied to address class imbalance. The dataset was scraped from trusted news sources (CNN Indonesia, Detik News) and a fact-checking platform (turnbackhoax.id). BiLSTM generally outperformed SVM and XGBoost, demonstrating the potential of deep learning for this task. However, applying SMOTE reduced BiLSTM's performance, which suggests overfitting. Precision consistently exceeded recall across all models: titles flagged as hoaxes were usually hoaxes, but a significant number of actual hoaxes may go undetected. This highlights a trade-off between avoiding false positives and ensuring comprehensive detection. The findings also suggest that language-specific characteristics influence algorithm effectiveness. This research contributes to the development of efficient and accurate tools for combating misinformation in the Indonesian online environment, and it emphasizes the importance of title-based analysis and careful consideration of data balancing.
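
The two quantitative ideas in the abstract, SMOTE-style oversampling and the gap between precision and recall, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `precision_recall` and `smote_like`, the toy labels, and the two-dimensional feature vectors are all assumptions made for illustration, and the interpolation step is a simplified version of the SMOTE idea rather than a full library implementation.

```python
import random
from math import dist


def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive (hoax) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the chosen sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy labels: 1 = hoax, 0 = valid news.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 1.0 0.5

# Hypothetical 2-D feature vectors for the minority (hoax) class.
hoax_vectors = [(0.0, 1.0), (0.2, 0.8), (0.1, 0.9)]
new_points = smote_like(hoax_vectors, n_new=3)
print(new_points)
```

In the toy labels, every title flagged as a hoax really is one (precision 1.0), yet half of the actual hoaxes are missed (recall 0.5), which is the pattern the abstract describes. The synthetic points always lie on a line segment between two existing minority samples, so they add no genuinely new information; that is one plausible reason why oversampling can encourage a high-capacity model to overfit.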

Keywords


Hoax detection; SVM; XGBoost; BiLSTM; SMOTE











Copyright (c) 2025 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of:
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
IJCCS is a scientific journal publishing research results in computing and cybernetics systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
email:ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs


