Topic Modeling in the News Document on Sustainable Development Goals

Hidayatul Fitri(1*), Widyawan Widyawan(2), Indah Soesanti(3)

(1) Universitas Gadjah Mada
(2) Universitas Gadjah Mada
(3) Universitas Gadjah Mada
(*) Corresponding Author


Indonesia, a developing country, supports the Sustainable Development Goals (SDGs) program, which consists of 17 goals. Achieving the SDGs is not only the government's duty but a responsibility shared by all elements of society. Online media play a crucial role in implementing Indonesia's SDGs. Information published in online news about the SDGs is an important consideration for the government, society, and other stakeholders. Categorizing news manually to identify its topics is very time-consuming and depends on the ability of news editors. Topic modeling can be applied to news published on online news sites to uncover the hidden topics it contains. Topic modeling groups documents by topic and identifies relationships between texts. Latent Dirichlet allocation (LDA) is one topic modeling method, used here to identify topic trends in SDG news. The results of this research indicate that LDA is a suitable choice for finding topics in a document collection. Topic modeling with k = 17 topics yielded the highest coherence score, 0.5405, on topic 8. Topic 8 covered news related to the eighth SDG, decent work and economic growth. This categorization was based on the words produced by the LDA process. Topic 5 covered news on the 17th SDG, partnerships for the goals, and topic 6 covered news on the first SDG, no poverty.
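As a rough illustration of the workflow the abstract describes (fit LDA with a chosen k, inspect the top words of each topic, score each topic's coherence), the following is a minimal sketch in plain Python. It is not the authors' implementation: the collapsed Gibbs sampler, the hyperparameters `alpha` and `beta`, the toy English corpus, and the helper names `fit_lda`, `top_words`, and `umass_coherence` are all assumptions made for illustration. The abstract does not state which coherence measure produced 0.5405; the UMass measure is swapped in here because it needs only the corpus itself, so its values are not comparable with the reported score.

```python
import math
import random
from collections import Counter


def fit_lda(docs, k, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (topic_word, doc_topic): per-topic word counts and
    per-document topic counts. alpha and beta are assumed defaults.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * k for _ in docs]
    topic_word = [Counter() for _ in range(k)]
    topic_total = [0] * k
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(k) for _ in doc]
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(k)
                ]
                z = rng.choices(range(k), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word, doc_topic


def top_words(counter, n):
    """Return up to n words with the highest positive count in a topic."""
    return [w for w, c in counter.most_common() if c > 0][:n]


def umass_coherence(words, docs):
    """Average UMass coherence of a ranked word list over the corpus."""
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*ws):
        return sum(all(w in s for w in ws) for s in doc_sets)

    score, pairs = 0.0, 0
    for i in range(1, len(words)):
        for j in range(i):
            score += math.log((doc_freq(words[i], words[j]) + 1) / doc_freq(words[j]))
            pairs += 1
    return score / pairs if pairs else 0.0


# Toy, already-preprocessed corpus (hypothetical; the study used Indonesian news).
docs = [
    "work jobs economy growth wages".split(),
    "economy growth jobs investment work".split(),
    "poverty aid poor income poverty".split(),
    "poor income poverty aid welfare".split(),
    "partnership cooperation global partnership funding".split(),
    "global cooperation funding partnership goals".split(),
]

topic_word, doc_topic = fit_lda(docs, k=3)
for topic_id, counts in enumerate(topic_word):
    words = top_words(counts, 4)
    print(topic_id, words, round(umass_coherence(words, docs), 4))
```

In a setting like the one in the abstract, k would be set to 17 and each topic would then be mapped to an SDG by reading its top words, which remains a manual interpretation step.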


Keywords: Topic Modeling; LDA; SDGs; News; Online Media





Copyright (c) 2021 IJITEE (International Journal of Information Technology and Electrical Engineering)

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

ISSN  : 2550-0554 (online)
