Clustering User Characteristics Based on the Influence of Hashtags on the Instagram Platform

Muhammad Habibi(1*), Puji Winar Cahyo(2)

(1) Department of Informatics, FTTI UNJANI, Yogyakarta
(2) Department of Informatics, FTTI UNJANI, Yogyakarta
(*) Corresponding Author


Instagram is a social media platform with the potential to increase awareness of a product. Approximately 70% of users spend time searching for products on Instagram. Many people, however, promote their products with little attention to the target audience, so the information they distribute is often inaccurate and does not match user characteristics. This study aims to cluster the characteristics of Instagram users based on hashtag compatibility, using the K-Means clustering method. In the experiments, Instagram users were successfully clustered according to the hashtags matched in their caption text. In addition, TF-IDF proved to be a suitable feature for K-Means clustering. The analysis of the hashtag "#kopi" yielded hashtag suggestions that can be used to promote coffee-related products, including #coffeeshop and #coffee, with a total usage of 14968 captions.


Clustering; Instagram; K-Means; Social Media; Text Analysis
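
The pipeline named in the abstract (TF-IDF features fed to K-Means, then hashtag suggestions read off the cluster that contains a seed hashtag) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the sample captions, the choice to tokenize captions into hashtags only, the smoothed IDF formula, the deterministic "first k vectors" initialization, and all function names are assumptions made for the example.

```python
import math
import re
from collections import Counter


def extract_hashtags(caption):
    """Return the lowercase hashtags found in a caption (assumed tokenization)."""
    return re.findall(r"#\w+", caption.lower())


def tfidf_vectors(docs):
    """Build L2-normalized TF-IDF vectors over a sorted vocabulary."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Assumed IDF variant: log(N / df) + 1, so no term gets zero weight.
    idf = {term: math.log(n_docs / doc_freq[term]) + 1.0 for term in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[term] * idf[term] for term in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm with squared Euclidean distance.

    The first k vectors serve as initial centroids (an assumption that keeps
    the example deterministic; real runs usually use random restarts).
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical captions, invented for illustration only.
captions = [
    "Ngopi pagi #kopi #coffee #coffeeshop",
    "Trip to the beach #travel #beach #holiday",
    "New beans today #kopi #coffee",
    "Sunset views #travel #holiday",
    "#coffeeshop #kopi latte art",
    "#beach #travel",
]

docs = [extract_hashtags(c) for c in captions]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)

# Suggest hashtags that co-occur with the seed inside its cluster.
seed = "#kopi"
seed_cluster = next(lab for doc, lab in zip(docs, labels) if seed in doc)
counts = Counter(
    tag
    for doc, lab in zip(docs, labels)
    if lab == seed_cluster
    for tag in doc
    if tag != seed
)
suggestions = [tag for tag, _ in counts.most_common(2)]
print(labels, suggestions)
```

On real data the number of clusters, the text preprocessing, and the initialization would all need to be chosen and validated; the sketch only shows how the pieces named in the abstract fit together.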

Copyright (c) 2019 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Copyright of:
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
is a scientific journal publishing research results in Computing
and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
