Publisher
Springer, London
Authors: A. Mountassir, H. Benbrahim, I. Berrada
Publish Date: 2012/12/11
Pages: 259-272
Abstract
Sentiment analysis is a research area in which studies focus on processing and analyzing the opinions available on the web. Several interesting and advanced works have been performed on English. In contrast, very few works have been conducted on Arabic. This paper presents the study we have carried out to investigate supervised sentiment classification in an Arabic context. We use two Arabic corpora which are different in many aspects. We use three common classifiers known for their effectiveness, namely Naïve Bayes, Support Vector Machines, and k-Nearest Neighbor. We investigate some settings to identify those that allow achieving the best results. These settings concern stemming type, term frequency thresholding, term weighting, and n-gram words. We show that Naïve Bayes and Support Vector Machines are competitively effective; however, k-Nearest Neighbor’s effectiveness depends on the corpus. Through this study, we recommend using light-stemming rather than stemming, removing terms that occur once, combining unigram and bigram words, and using presence-based weighting rather than frequency-based weighting. Our results also show that classification performance may be influenced by document length, document homogeneity, and the nature of document authors. However, the size of the data sets does not have an impact on classification results.
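The feature settings recommended in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the English toy documents, and the reading of "terms that occur once" as a corpus-wide count are all assumptions made for illustration, and light-stemming is left out because it needs an Arabic-specific stemmer.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(text):
    """Combine unigram and bigram words (whitespace tokenization is an assumption)."""
    tokens = text.split()
    return ngrams(tokens, 1) + ngrams(tokens, 2)


def build_vocabulary(docs, min_count=2):
    """Keep terms whose corpus-wide count is at least min_count,
    so min_count=2 removes terms that occur only once."""
    counts = Counter()
    for doc in docs:
        counts.update(extract_terms(doc))
    return sorted(term for term, count in counts.items() if count >= min_count)


def presence_vector(text, vocabulary):
    """Presence-based weighting: 1 if the term appears in the document, else 0."""
    present = set(extract_terms(text))
    return [1 if term in present else 0 for term in vocabulary]


def frequency_vector(text, vocabulary):
    """Frequency-based weighting: raw count of each term in the document."""
    counts = Counter(extract_terms(text))
    return [counts[term] for term in vocabulary]


# Toy documents (hypothetical, English for readability).
docs = ["good film good plot", "bad film", "good plot"]
vocabulary = build_vocabulary(docs)
presence = presence_vector(docs[0], vocabulary)
frequency = frequency_vector(docs[0], vocabulary)
```

Vectors built this way could then be passed to any of the three classifiers named in the abstract. The two weighting schemes differ only for terms that repeat within a document, such as "good" in the first toy document.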