Arabic Sentiment Analysis with Social Network Data: A Comparative Study

Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:
Conference Proceeding
Citation:
Proceedings - 2024 IEEE International Conference on e-Business Engineering, ICEBE 2024, 2024, 00, pp. 87-94
Issue Date:
2024-01-01
Social media platforms have become essential in modern society, and their usage is increasing rapidly. These platforms serve various roles, from entertainment and personal expression to education and e-commerce. Users frequently share opinions, generating valuable but largely unstructured data. Businesses and governments apply sentiment analysis to this data to understand public opinion and predict behavior, which helps them improve products, services, and policies. However, the unstructured nature of the data makes its analysis complex. This article provides a comprehensive evaluation of several machine learning (ML) models, including Support Vector Machine (SVM), Bernoulli Naive Bayes (BernoulliNB), Random Forest (RF), Gradient Boosting (GB), K-Nearest Neighbors (KNN), and Logistic Regression (LR), on Twitter data for Arabic sentiment classification. It also describes the preprocessing steps required to prepare the data for analysis, and it discusses the selection of feature extraction techniques for Saudi dialect and modern Arabic sentiment analysis, gauging their efficacy on social media data. Among the feature extraction methods compared, TF-IDF consistently yielded the highest accuracy, with SVM emerging as the top-performing model at 92.4% accuracy. Finally, a public dataset focusing on customer sentiment towards Saudi banks is employed to further validate our findings and their generalizability across different domains such as government policies and commercial sectors.
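To make the preprocessing and TF-IDF feature extraction steps mentioned in the abstract more concrete, the following is a minimal, standard-library Python sketch. It is not the authors' pipeline: the normalization recipe (stripping diacritics and tatweel, unifying alef variants, dropping URLs, mentions, and hashtags), the smoothed IDF formula, the function names `normalize`, `tokenize`, and `tfidf`, and the three toy sentences are all assumptions chosen for illustration. The classifier stage (SVM and the other models) is deliberately left out.

```python
import math
import re
from collections import Counter

# Arabic diacritics (U+064B..U+0652) and the tatweel elongation mark (U+0640).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# Alef with madda, hamza above, and hamza below (U+0622, U+0623, U+0625).
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")
# URLs, @mentions, and #hashtags as they commonly appear in tweets.
_NOISE = re.compile(r"https?://\S+|[@#]\w+")


def normalize(text):
    """Light normalization; an assumed recipe, not the one used in the paper."""
    text = _NOISE.sub(" ", text)
    text = _DIACRITICS.sub("", text)
    return _ALEF_VARIANTS.sub("\u0627", text)  # map variants to bare alef


def tokenize(text):
    """Split normalized text into word tokens (\\w matches Arabic letters)."""
    return re.findall(r"\w+", normalize(text))


def tfidf(docs):
    """Return L2-normalized TF-IDF vectors (as dicts) and the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1, so no term gets zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: tf[t] * idf[t] for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors, idf


# Hypothetical toy sentences, made up for this sketch (not from the dataset).
docs = [
    "الخدمة ممتازة وسريعة",
    "الخدمة سيئة جدا",
    "تطبيق البنك ممتاز",
]
vectors, idf = tfidf(docs)
```

In this sketch a term that occurs in several documents receives a lower IDF than a term that occurs in only one, which is the property that lets TF-IDF down-weight words carrying little sentiment signal. The resulting sparse vectors could then be passed to any of the classifiers listed in the abstract.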