Sentiment Classification using Machine Learning Techniques

Pranjal Vachaspati

Cathy Wu


We implement a series of classifiers (Naive Bayes, Maximum Entropy, and SVM) to distinguish positive from negative sentiment in critic and user reviews. We apply several processing methods, including negation tagging, part-of-speech tagging, and position tagging, to maximize accuracy. We test our classifiers on an external dataset to see how well they generalize. Finally, we use a majority-voting technique to combine the classifiers and achieve an accuracy of close to 90% in 3-fold cross-validation, far outperforming Pang's 2002 work [7]. Source code is located at
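
As a rough illustration of two of the techniques named in the abstract, the following Python sketch shows one common form of negation tagging and a simple majority vote over classifier outputs. It is not taken from our source code: the function names `tag_negation` and `majority_vote`, the negation word list, the punctuation set, and the `NOT_` prefix are all assumptions made for this example. The sketch assumes an odd number of classifiers and two labels, so that votes cannot tie.

```python
from collections import Counter

# Assumed, illustrative word and punctuation lists (not from the paper).
NEGATION_WORDS = {"not", "no", "never", "cannot"}
PUNCTUATION = {".", ",", ";", ":", "!", "?"}


def tag_negation(tokens):
    """Prefix NOT_ to every token between a negation word and the next punctuation mark."""
    tagged = []
    negated = False
    for tok in tokens:
        if tok in PUNCTUATION:
            # Punctuation ends the scope of the negation.
            negated = False
            tagged.append(tok)
        elif negated:
            tagged.append("NOT_" + tok)
        else:
            tagged.append(tok)
            lowered = tok.lower()
            if lowered in NEGATION_WORDS or lowered.endswith("n't"):
                negated = True
    return tagged


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per document.

    predictions: one list of labels per classifier, all of equal length.
    """
    combined = []
    for labels in zip(*predictions):
        # Pick the label chosen by the most classifiers for this document.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined
```

For example, `majority_vote` could be called with the label lists produced by the three classifiers on the same test fold, after the reviews have been tokenized and passed through `tag_negation`.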

Pranjal Vachaspati 2012-02-05