Topic: Sentiment Analysis on Twitter

COMP621U, Spring 2011

Project Proposal

Team: Yuanfeng SONG, Jan VOSECKY

Dataset

Twitter dataset provided by Z. Cheng, J. Caverlee, and K. Lee and used in their CIKM 2010 paper.

Statistics:

The training set: 115,886 Twitter users and 3,844,612 tweets from the users.

The test set: 5,136 Twitter users and 5,156,047 tweets from the users.

Used in:

Z. Cheng, J. Caverlee, and K. Lee. You Are Where You Tweet: A Content-Based Approach to Geo-locating Twitter Users. In Proceedings of the 19th ACM Conference on Information and Knowledge Management (CIKM), Toronto, Oct 2010.

Suggested approach

We plan to employ several machine learning techniques to extract users’ sentiment (emotion) from the content of their Twitter messages (‘tweets’). Our general approach will consist of the following steps:

Preprocessing:

· Manually label a set of training and testing instances

· Represent tweets in an appropriate format, such as Bag-of-Words

· Identify additional tweet-specific features to include in the feature vector, since this extra information may increase classification accuracy. Currently, we are considering the addition of temporal data, such as the timestamp.

· Investigate attribute selection and transformation possibilities
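The representation step above can be sketched as follows. This is a minimal illustration, not the final pipeline: the tokenizer, the function names (`tokenize`, `build_vocabulary`, `featurize`) and the choice of hour-of-day as the temporal feature are all assumptions made for the example.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical tokenizer: keeps letters, digits, hashtags, mentions and apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9#@']+")

def tokenize(text):
    """Lowercase a tweet and split it into simple word-like tokens."""
    return TOKEN_RE.findall(text.lower())

def build_vocabulary(tweets):
    """Map every distinct token in the corpus to a column index."""
    vocab = {}
    for text in tweets:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab

def featurize(text, timestamp, vocab):
    """Return Bag-of-Words counts followed by one temporal feature."""
    counts = Counter(tokenize(text))
    # dicts preserve insertion order, so columns follow the vocabulary indices
    vector = [counts.get(token, 0) for token in vocab]
    vector.append(timestamp.hour)  # assumed temporal feature: hour of day
    return vector

tweets = ["I love this phone", "I hate waiting, hate it"]
vocab = build_vocabulary(tweets)
vec = featurize(tweets[1], datetime(2011, 3, 1, 22, 15), vocab)
print(vec)
```

Each tweet becomes a fixed-length vector of word counts with the temporal value appended as the last column, so further tweet-specific features can be added in the same way.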

Modelling:

· Build a machine-learning model from the training data

· Evaluate the model on the testing data

We plan to experiment with a number of machine-learning algorithms and compare their effectiveness. These may include Bayesian classifiers, Maximum Entropy, Support Vector Machines, as well as clustering algorithms.
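As an illustration of the build-and-evaluate loop, the sketch below implements one of the candidate algorithms, a multinomial Naive Bayes classifier with add-one smoothing, in plain Python. The class name, the toy training and testing instances, and the label names are invented for the example; the real experiments would use the manually labelled tweets.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocab = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy data, already tokenized (hypothetical examples).
train_docs = [["love", "this", "phone"], ["great", "day"],
              ["hate", "waiting"], ["awful", "service"]]
train_labels = ["positive", "positive", "negative", "negative"]
test_docs = [["love", "great"], ["hate", "awful", "service"]]
test_labels = ["positive", "negative"]

model = NaiveBayes().fit(train_docs, train_labels)
predictions = [model.predict(doc) for doc in test_docs]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

Keeping the `fit`/`predict` interface the same for every classifier would make it straightforward to swap in the other algorithms and compare them on the same testing data.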

Possible extensions to our work:

· Topic-sensitive sentiment analysis: public sentiment with respect to specific topics, including a comparison between different locations.

· Tag-clouds: listing the most prominent sentiment-rich words for a category of tweets.

Related work

There has been a large amount of prior research in sentiment analysis, especially in the domains of product reviews, movie reviews, and blogs. Pang and Lee [4] provide an up-to-date survey of previous work in sentiment analysis. Most of this work focuses on designing platforms or tools for automatic sentiment analysis using machine learning models such as latent semantic analysis (LSA), Naive Bayes, and support vector machines (SVM) [1] [2]. Besides the models, these works differ in their datasets, such as Twitter or blogs, and consequently in their feature sets and feature extraction methodology. For example, Mishne [1] uses many features extracted from the LiveJournal blog service to train an SVM binary classifier for sentiment analysis. Go et al. [2] use a Twitter dataset and extract features from messages to perform sentiment analysis.

Besides treating the problem as a binary classification of positive and negative emotion, researchers have also tried to identify a wider range of emotions. Jung et al. [5] show that mood expression in Plurk messages has some idiosyncratic properties; for example, the initial mood may change as time passes (also known as mood fluctuation). Moreover, some blogs are so intertwined that they are difficult even for a human to classify, let alone a machine. All these characteristics make it relatively hard to identify multiple emotions.

Emotion detection can be applied in different areas, such as recommendation systems [3] and computer-mediated communication (CMC) [6].

Evaluation metrics

Table 1: Confusion matrix [7]

                      Predicted
                      Positive    Negative    Total
Actual    Positive    TP          FN          N(RM)
          Negative    FP          TN          N(RB)
          Total       N(RPM)      N(RPB)      N
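The four cells of the confusion matrix can be counted directly from paired actual and predicted labels. The sketch below is only an illustration: the function name, the label strings and the toy data are assumptions, and precision, recall and accuracy are shown in their standard textbook forms.

```python
def confusion_counts(actual, predicted, positive="positive"):
    """Count TP, FN, FP and TN for a binary classification task."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1   # positive instance, predicted positive
        elif a == positive:
            fn += 1   # positive instance, predicted negative
        elif p == positive:
            fp += 1   # negative instance, predicted positive
        else:
            tn += 1   # negative instance, predicted negative
    return tp, fn, fp, tn

# Hypothetical labels for six test tweets.
actual    = ["positive", "positive", "positive", "negative", "negative", "negative"]
predicted = ["positive", "positive", "negative", "positive", "negative", "negative"]

tp, fn, fp, tn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)                   # share of predicted positives that are correct
recall = tp / (tp + fn)                      # share of actual positives that are found
accuracy = (tp + tn) / (tp + fn + fp + tn)   # share of all instances classified correctly
print(tp, fn, fp, tn, precision, recall, accuracy)
```

The row totals N(RM) and N(RB) correspond to `tp + fn` and `fp + tn`, and the column totals N(RPM) and N(RPB) to `tp + fp` and `fn + tn`.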

Classification Mean average Precision:

Kappa is used to measure the agreement between predicted and observed categorizations of a dataset, while correcting for agreement that occurs by chance. In its standard form (Cohen's kappa), the equation is as follows:

$$\kappa = \frac{P_o - P_e}{1 - P_e}$$

where P_o is the relative observed agreement between the predicted and actual categories, and P_e is the hypothetical probability of chance agreement, using the observed data to calculate the probabilities of each observer randomly saying each category.
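As a worked example, Cohen's kappa in its standard form, (P_o - P_e) / (1 - P_e), can be computed from the four confusion-matrix cells. The function name and the counts below are illustrative assumptions, not results from our data.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa for a 2x2 confusion matrix."""
    n = tp + fn + fp + tn
    p_o = (tp + tn) / n                          # observed agreement
    p_pos = ((tp + fn) / n) * ((tp + fp) / n)    # chance agreement on "positive"
    p_neg = ((fp + tn) / n) * ((fn + tn) / n)    # chance agreement on "negative"
    p_e = p_pos + p_neg                          # total chance agreement
    return (p_o - p_e) / (1 - p_e)               # undefined when p_e == 1

# Hypothetical counts: 50 test instances.
kappa = cohen_kappa(tp=20, fn=5, fp=10, tn=15)
print(round(kappa, 3))
```

Here P_o = 35/50 = 0.7 and P_e = 0.5 * 0.6 + 0.5 * 0.4 = 0.5, so kappa = (0.7 - 0.5) / (1 - 0.5) = 0.4. A value of 1 indicates perfect agreement, while 0 indicates agreement no better than chance.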

References

[1] G. Mishne, “Experiments with Mood Classification in Blog Posts,” in Proceedings of the 1st Workshop on Stylistic Analysis of Text For Information Access, 2005.

[2] A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classification using distant supervision,” Dec 2009. [Online]. Available: http://www.stanford.edu/~alecmgo/papers/TwitterDistantSupervision09.pdf

[3] L. Terveen, W. Hill, B. Amento, D. McDonald, and J. Creter, “PHOAKS: A system for sharing recommendations,” Communications of the Association for Computing Machinery (CACM), vol. 40, pp. 59–62, 1997.

[4] M. Y. Chen et al., “Classifying Mood in Plurks,” in Proceedings of the 22nd Conference on Computational Linguistics and Speech Processing, Chi-Nan University, Taiwan.

[5] Y. Jung, Y. Choi, and S.H. Myaeng, “Determining mood for a blog by combining multiple sources of evidence,” in Proceedings of IEEE/WIC/ACM International Conference on Web Intelligence, pp. 271-274, 2007.

[6] J. B. Walther and K. P. D'Addario, “The Impacts of Emoticons on Message Interpretation in Computer-Mediated Communication,” Social Science Computer Review, vol. 19, no. 3, pp. 324-347, 2001.

[7] “Confusion matrix,” Wikipedia. [Online]. Available: http://en.wikipedia.org/wiki/Confusion_matrix