A Comparison of Performance Metrics of Turkish Twitter Messages Using Text Representations



Karcıoğlu A. A.

1st INTERNATIONAL TECHNOLOGICAL SCIENCES AND DESIGN SYMPOSIUM, Giresun, Türkiye, 27 - 29 June 2018, pp.433-444

  • Publication Type: Conference Paper / Full-Text Paper
  • City of Publication: Giresun
  • Country of Publication: Türkiye
  • Page Numbers: pp.433-444
  • Affiliated with Atatürk University: No

Abstract

With the development of technology and the worldwide spread of the internet, social media platforms have evolved so that people can follow events around the world as they happen and share their own thoughts. Twitter, one of the most widely used social media platforms, has become an important part of everyday life. By sharing their feelings and thoughts on Twitter, users create valuable data sources for sentiment analysis of social media in the field of data mining. In this study, implemented in the Python programming language, sentiment analysis was performed on Turkish Twitter messages shared by users, using text representations. The aim of the study is to compare the effects on sentiment analysis performance of a Bag-of-Words (BOW) model weighted by TF-IDF and the semantic-relation-based Word2Vec model. Three different models were applied; the highest accuracy, 66.40%, was obtained with the third model, which applies the Random Forest algorithm to the Word2Vec representation. The performance metrics of the results obtained with machine learning algorithms from the scikit-learn library were compared, contributing to the literature on Turkish natural language processing.
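
The two text representations compared in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the code used in the study: the three tweets, the two-dimensional word vectors, and the helper names `tokenize`, `tfidf_vectors`, and `average_embedding` are all hypothetical, chosen only for illustration. The TF-IDF part uses a smoothed IDF and L2 normalization; the second part assumes a tweet is represented as the mean of its word vectors, which is one common way to feed Word2Vec output to a classifier and may differ from what the paper does.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption); real Turkish text would need
    # proper normalization, e.g. handling of dotted/dotless i when lowercasing.
    return text.split()


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for a list of texts."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def average_embedding(tokens, embeddings, dim):
    """Mean of the word vectors of the known tokens (zero vector if none)."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vals) / len(known) for vals in zip(*known)]


# Made-up example tweets (hypothetical data, not from the study).
docs = ["bu film çok güzel", "bu film çok kötü", "hizmet çok kötü"]
vocab, vectors = tfidf_vectors(docs)

# Made-up 2-dimensional "word vectors" standing in for trained Word2Vec output.
toy_embeddings = {"güzel": [1.0, 0.0], "kötü": [-1.0, 0.0], "film": [0.0, 1.0]}
doc_vector = average_embedding(tokenize(docs[0]), toy_embeddings, dim=2)
```

In the TF-IDF vectors, a word that appears in every tweet (here "çok") receives a lower weight than a word that appears in only one (here "güzel"). Either kind of vector could then be passed to a classifier such as Random Forest.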