TY - JOUR
T1 - Incorporating sentiment prior knowledge for weakly supervised sentiment analysis
AU - He, Yulan
PY - 2012/6
Y1 - 2012/6
N2 - This article presents two novel approaches for incorporating sentiment prior knowledge into the topic model for weakly supervised sentiment analysis where sentiment labels are considered as topics. One is by modifying the Dirichlet prior for topic-word distribution (LDA-DP), the other is by augmenting the model objective function through adding terms that express preferences on expectations of sentiment labels of the lexicon words using generalized expectation criteria (LDA-GE). We conducted extensive experiments on English movie review data and multi-domain sentiment dataset as well as Chinese product reviews about mobile phones, digital cameras, MP3 players, and monitors. The results show that while both LDA-DP and LDAGE perform comparably to existing weakly supervised sentiment classification algorithms, they are much simpler and computationally efficient, rendering themmore suitable for online and real-time sentiment classification on the Web. We observed that LDA-GE is more effective than LDA-DP, suggesting that it should be preferred when considering employing the topic model for sentiment analysis. Moreover, both models are able to extract highly domain-salient polarity words from text.
AB - This article presents two novel approaches for incorporating sentiment prior knowledge into the topic model for weakly supervised sentiment analysis where sentiment labels are considered as topics. One is by modifying the Dirichlet prior for topic-word distribution (LDA-DP), the other is by augmenting the model objective function through adding terms that express preferences on expectations of sentiment labels of the lexicon words using generalized expectation criteria (LDA-GE). We conducted extensive experiments on English movie review data and multi-domain sentiment dataset as well as Chinese product reviews about mobile phones, digital cameras, MP3 players, and monitors. The results show that while both LDA-DP and LDAGE perform comparably to existing weakly supervised sentiment classification algorithms, they are much simpler and computationally efficient, rendering themmore suitable for online and real-time sentiment classification on the Web. We observed that LDA-GE is more effective than LDA-DP, suggesting that it should be preferred when considering employing the topic model for sentiment analysis. Moreover, both models are able to extract highly domain-salient polarity words from text.
UR - http://www.scopus.com/inward/record.url?scp=84863688889&partnerID=8YFLogxK
UR - http://dl.acm.org/citation.cfm?doid=2184436.2184437
U2 - 10.1145/2184436.2184437
DO - 10.1145/2184436.2184437
M3 - Article
AN - SCOPUS:84863688889
SN - 1530-0226
VL - 11
JO - ACM transactions on Asian Language information processing
JF - ACM transactions on Asian Language information processing
IS - 2
M1 - 4
ER -