Performance Comparison of Naive Bayes and Support Vector Machine Algorithms in Sentiment Analysis of TIX ID Application Reviews Using VADER Automatic Labeling

Zenia Kumala Rizka; Jumanto Unjung

Authors

Zenia Kumala Rizka, Department of Computer Science, Faculty of Mathematics and Natural Sciences, Universitas Negeri Semarang, Indonesia
Jumanto Unjung, Department of Computer Science, Faculty of Mathematics and Natural Sciences, Universitas Negeri Semarang, Indonesia

Keywords:

Sentiment Analysis, TIX ID, Naive Bayes, Support Vector Machine (SVM)

Abstract

This study compares the sentiment classification performance of Naive Bayes and Support Vector Machine (SVM) on 28,247 user reviews of the TIX ID application from the Google Play Store, obtained from Kaggle. The reviews were first translated into English and then labeled automatically with VADER. After text preprocessing, TF-IDF feature extraction over unigrams and bigrams, and class balancing through random oversampling, SVM achieved 93.45% accuracy and a 93.78% F1-score on the test data, compared with 90.90% accuracy and a 91.72% F1-score for Naive Bayes. SVM also outperformed Naive Bayes on precision and recall, so it leads on all three evaluation metrics (precision, recall, and F1-score). These results indicate that combining VADER annotation, TF-IDF feature extraction, and SVM is an effective approach to sentiment analysis of mobile application reviews and is suitable for industry use.
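
The labeling, n-gram, and balancing steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' code. The ±0.05 cutoffs on the VADER compound score follow a common convention and are an assumption, as are the three-class label set, the function names, and the toy reviews and scores. In practice these steps are usually delegated to library components such as scikit-learn's TfidfVectorizer with ngram_range=(1, 2) and imbalanced-learn's RandomOverSampler.

```python
import random
from collections import Counter


def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label.

    The +/-0.05 cutoffs are a common convention, assumed here;
    the abstract does not state which thresholds were used.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def word_ngrams(text, n_min=1, n_max=2):
    """Return word n-grams (unigrams and bigrams by default), in order of n."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: the reviews and compound scores are made up for illustration.
reviews = ["easy to book tickets", "payment failed again",
           "great app", "love the promo"]
scores = [0.44, -0.51, 0.62, 0.64]

labels = [label_from_compound(s) for s in scores]
balanced_reviews, balanced_labels = random_oversample(reviews, labels)

print(Counter(labels))            # class counts before balancing
print(Counter(balanced_labels))   # class counts after balancing
print(word_ngrams(reviews[2]))    # unigrams followed by bigrams
```

Oversampling is normally applied to the training split only, so that duplicated reviews do not leak into the test data; the abstract does not say at which stage it was applied in this study.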