Published: 31-03-2022

Twitter sentiment for Medan city election using the Naive Bayes method

Universitas AMIKOM Yogyakarta
Sentiment Analysis, Twitter, TF-IDF, information, data science


Indonesia ranks fifth in the world by number of Twitter users, with 19.5 million of a total of 500 million global users, and the number continues to grow. Along with the development of information technology, Twitter has become a source of information through user sentiment, trending topics, and trending hashtags. Recently, the archipelago vaccine has drawn both support and opposition. Classifying positive and negative sentences in Twitter sentiment towards the archipelago vaccine requires collecting data from Twitter users, labeling it by sentence class, and preprocessing it before it is fed into the IndoBERT model, which then yields the accuracy of sentiment classification for the archipelago vaccine. In Medan, the mayoral candidates and their volunteers used Twitter as an open forum for campaigning, and netizens responded to their tweets with both positive and negative reactions. This study therefore analyzes tweets expressing netizens' sentiments towards the 2020 Medan City Election. Opinions or sentiments from Twitter users can serve as criticism and suggestions for the candidates for mayor and deputy mayor of Medan. Netizens often express opinions about regional head candidates in their posts, but these opinions are unstructured and unclassified, so classifying them requires sentiment analysis. Sentiment analysis was carried out by classifying tweets containing netizen sentiments towards the 2020 Medan City Election. The classification method used in this study is the Naive Bayes method combined with TF-IDF feature extraction, and the results were validated with a confusion matrix. With TF-IDF feature extraction and the Naive Bayes method, tweets are classified by sentiment automatically with an accuracy of 76.00%.
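The pipeline described in the abstract (TF-IDF feature extraction, Naive Bayes classification, validation with a confusion matrix) can be illustrated with a minimal sketch. This is not the study's implementation: the tweets below are invented English toy examples rather than the study's data, the function names are hypothetical, and the smoothed IDF formula and the Laplace smoothing value `alpha=1.0` are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

# Invented toy tweets (NOT the study's dataset); labels are illustrative only.
train = [
    ("great candidate good program", "positive"),
    ("good vision great debate", "positive"),
    ("bad promise poor program", "negative"),
    ("poor debate bad answer", "negative"),
]
test = [
    ("great program", "positive"),
    ("bad debate poor answer", "negative"),
    ("good vision", "positive"),
    ("great good promise", "negative"),  # deliberately hard example
]

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency (assumed form): log((1+n)/(1+df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

def tfidf(tokens, idf):
    """Term count times IDF; terms unseen during fitting are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}

def train_nb(samples, idf, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    class_docs = Counter()
    weight = defaultdict(lambda: defaultdict(float))
    for text, label in samples:
        class_docs[label] += 1
        for t, w in tfidf(tokenize(text), idf).items():
            weight[label][t] += w
    total = sum(class_docs.values())
    model = {}
    for label, n_docs in class_docs.items():
        denom = sum(weight[label].values()) + alpha * len(idf)
        log_like = {t: math.log((weight[label].get(t, 0.0) + alpha) / denom)
                    for t in idf}
        model[label] = (math.log(n_docs / total), log_like)
    return model

def predict(text, model, idf):
    """Return the class with the highest log posterior score."""
    vec = tfidf(tokenize(text), idf)
    scores = {label: prior + sum(w * log_like[t] for t, w in vec.items())
              for label, (prior, log_like) in model.items()}
    return max(scores, key=scores.get)

def confusion_matrix(pairs, labels):
    """matrix[actual][predicted] = number of test items."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for actual, predicted in pairs:
        matrix[actual][predicted] += 1
    return matrix

def accuracy(matrix):
    """Share of items on the diagonal of the confusion matrix."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

idf = fit_idf([tokenize(text) for text, _ in train])
model = train_nb(train, idf)
pairs = [(label, predict(text, model, idf)) for text, label in test]
matrix = confusion_matrix(pairs, ["positive", "negative"])
print(matrix)
print(f"accuracy = {accuracy(matrix):.2%}")
```

Accuracy here is the diagonal of the confusion matrix divided by the total number of test tweets, which is the sense in which a figure such as the reported 76.00% is usually computed. A real replication would also need Indonesian-language preprocessing (normalization, stopword removal, stemming), which this sketch omits.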





How to Cite

Prasetyo. 2022. “Twitter Sentiment for Medan City Election Using the Naive Bayes Method”. JNANALOKA 3 (1):27-32.