Using Twitter To Identify Sentiments About A Specific Entity In A System

Twitter has grown into a very popular platform over the past few years, with 190 million users publishing 65 million tweets every day. Twitter sentiment analysis has been a well-known paradigm for the past 10 years. This blog introduces a newer sentiment analysis technique called “target specific sentiment analysis”, which can be used to gauge public opinion on elections.

Before going on to the technical aspects, you may ask a very intuitive question: why do we need a new sentiment analysis technique? Why can’t the existing techniques meet the need of gauging public opinion?
The answer lies in how the state-of-the-art sentiment analysis techniques work. Most of them use machine learning based classifiers (info on classifiers) which do not consider the targets being talked about in the tweet. But what is a target? A target can be any entity that is being discussed in the tweet. For example, in “Oh, Jon Stewart. How I love you so.”, the author expresses love for “you” but actually means to express love for “Jon Stewart” (which is the target). The problem with state-of-the-art techniques becomes clearer with the following example: “I will be voting for BJP mainly to remove Congress as a ruling party”. The overall sentiment of the tweet is positive, but the sentiment towards Congress is negative. Such examples are very common among tweets on topics like politics (Wang et al.). It therefore becomes vital to distinguish between the sentiments expressed towards the major issues and entities being discussed on such topics.

Now the question is: how can this so-called “target specific sentiment analysis” be achieved? Do you need to build a completely new system or improve the existing state-of-the-art techniques? A completely new system can be built, but it is certainly more natural to add on top of the existing techniques. What should you add to the current techniques? Assuming you know a bit of machine learning (or have gone through the link above), one obvious approach is to add target specific features!
Before going on to the features, the first and most important task is to identify the targets in a set of tweets. Target identification is done using a technique called extended targets (Jiang et al.). Why do you need extended targets? Because people often express their sentiments about a target by commenting on things related to the target rather than on the target itself. For instance, in “I am passionate about Microsoft technologies especially Silverlight.”, the author expresses a positive sentiment about Microsoft by referring to Microsoft technologies. Hence the first job is to treat all the noun phrases as targets. You can then identify related references to a target as follows:
  1. All the words referring to the target can be regarded as extended targets. For example, in “All hail Jon Snow! He is king in the north.”, “He” is a reference to Jon Snow.
  2. The top-k nouns and noun phrases which share a strong affiliation with the target can be found and added as extended targets.
  3. Head nouns can be identified from the extended targets. For instance, if “The night’s watch” is an extended target for “Jon Snow”, we can further add “watch” as a new extended target.
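
Steps 2 and 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not say how "strong affiliation" is measured, so pointwise mutual information (PMI) over tweet-level co-occurrence is used here as one plausible choice; the noun phrases are assumed to have been extracted already by some tagger or chunker; and the head noun is approximated by the last word of the phrase. The function names top_k_related and head_noun are hypothetical.

```python
import math
from collections import Counter


def top_k_related(tweets_nouns, target, k=2):
    """Rank noun phrases by PMI with the target (assumed affiliation measure).

    tweets_nouns: list of sets of lowercase noun phrases, one set per tweet,
    assumed to come from an upstream noun-phrase extractor.
    """
    n = len(tweets_nouns)
    count = Counter()  # number of tweets containing each noun phrase
    joint = Counter()  # number of tweets containing both phrase and target
    for nouns in tweets_nouns:
        for noun in nouns:
            count[noun] += 1
            if target in nouns and noun != target:
                joint[noun] += 1
    scores = {}
    for noun, c_joint in joint.items():
        # PMI = log( p(noun, target) / (p(noun) * p(target)) )
        scores[noun] = math.log((c_joint * n) / (count[noun] * count[target]))
    # Highest PMI first; ties broken alphabetically for a stable result.
    return sorted(scores, key=lambda noun: (-scores[noun], noun))[:k]


def head_noun(phrase):
    # Naive heuristic: the last word of an English noun phrase is its head.
    return phrase.split()[-1]


# Toy corpus: each set holds the noun phrases found in one tweet.
tweets = [
    {"microsoft", "silverlight"},
    {"microsoft", "silverlight"},
    {"microsoft", "windows", "weather"},
    {"weather", "traffic"},
    {"weather", "windows"},
]
extended = top_k_related(tweets, "microsoft", k=2)
head = head_noun("the night's watch")
```

On this toy corpus "silverlight" ranks first because it only ever appears together with "microsoft". PMI tends to favour rare words, so a real system would likely add a minimum-frequency cutoff.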

Having explored the ways to find the targets, the target specific features now have to be added. A robust and practical approach is to take into account the part of speech of each word and its grammatical relation to the target. Hence the objective of adding features is not only to be target specific but also part-of-speech oriented. Consider a word w and a target T (or one of its extended targets); features are added in the following fashion:
  1. If w is a transitive verb with T (or any of its extended targets) as its object, then the feature w_arg2 is added. For instance, for the tweet “I love Macbook” we generate love_arg2 as the feature for the target Macbook.
  2. If w is a transitive verb and T is its subject, then the feature w_arg1 is added.
  3. If w is an intransitive verb and T is its subject, then the feature w_it_arg1 is added.
  4. If w is an adjective or noun joined to T via a connecting word (a copula such as “is”), then w_cp_arg1 is added as a feature for the target.
Some more features are described in Jiang et al. After adding these target-related features, one can train a classifier such as an SVM and analyse the sentiments using the existing techniques.
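
To make the feature rules concrete, here is a minimal Python sketch. It assumes the tweet has already been parsed into (label, word, argument) triples by some dependency parser; the labels "subj", "obj" and "cop" and the function name target_features are hypothetical and not tied to any particular library. It also assumes, following the convention in Jiang et al., that each feature name is built from the word w, that w_arg2 marks T as the object of a transitive verb, and that w_arg1 marks T as its subject.

```python
def target_features(relations, targets):
    """Generate target specific features from pre-parsed relations.

    relations: list of (label, word, argument) triples, where label is one of
    the hypothetical tags "subj", "obj" or "cop" (word linked to argument
    through a copula such as "is").
    targets: the target plus its extended targets.
    """
    targets = {t.lower() for t in targets}
    # Simplification: a verb counts as transitive if it has an object here.
    transitive = {word for label, word, _ in relations if label == "obj"}
    features = []
    for label, word, arg in relations:
        if arg.lower() not in targets:
            continue  # only relations that involve a target produce features
        if label == "obj":
            features.append(f"{word}_arg2")      # T is object of transitive w
        elif label == "subj" and word in transitive:
            features.append(f"{word}_arg1")      # T is subject of transitive w
        elif label == "subj":
            features.append(f"{word}_it_arg1")   # T is subject of intransitive w
        elif label == "cop":
            features.append(f"{word}_cp_arg1")   # w joined to T by a copula
    return features


targets = {"Macbook"}
# "I love Macbook"
f_love = target_features([("subj", "love", "I"), ("obj", "love", "Macbook")], targets)
# "Macbook beats everything"
f_beats = target_features(
    [("subj", "beats", "Macbook"), ("obj", "beats", "everything")], targets
)
# "Macbook crashed"
f_crashed = target_features([("subj", "crashed", "Macbook")], targets)
# "Macbook is great"
f_great = target_features([("cop", "great", "Macbook")], targets)
```

The resulting strings can be appended to whatever bag-of-words features the existing classifier already uses, so the rest of the training pipeline stays unchanged.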

Coming back to the title of the blog, consider elections as the system. The target can be a political party, and you want to find out how likely this party is to win the elections. Here, target specific sentiment analysis can come to the rescue. You can analyse the sentiments expressed about that political party on Twitter, which can serve as an aid to election polling. Moreover, this technique can also be used to find bots on Twitter which keep creating a positive sentiment for a particular political party. Besides elections, there can be many more systems, such as Bollywood (where the entities can be actors), sports and religion. Hence, target specific sentiment analysis can have a plethora of applications in varied domains and can be a very worthwhile technique.

