Suspicious Account Detection in the Social Networks




The social network is a medium for maintaining contact with our friends. As the use of social networks grows, so does the wealth of information that users store on them. This wealth of information, together with the ease with which it can be accessed, attracts the attention of malicious groups. To keep spammers from attacking users on social networks, various methods have been proposed to identify spammers and their malicious activities.

INTRODUCTION
In recent years, social networks such as Facebook, Twitter, and MySpace have gained much popularity. Social networks usually contain a lot of personal information about their users, along with ways to regulate access to it. Users usually share their personal information with their friends. This information can be private or public. Private information needs to be carried over a reliable network, and the user allows only friends to see it.
Also, social networking sites don’t provide strong identity confirmation. Therefore, a spammer can use a fake account to hide their identity.

SPAM ACCOUNT DETECTION ON THE SOCIAL NETWORKS
Various studies have been done on spam account detection and spam post detection on social networks. We look at them one by one:
In 2010, Alex Hai Wang [1] divided the features into two groups: content-based and graph-based.
The content-based group has four features for detecting a spam user account.
  • Repetitive Tweets/Posts: If the same tweets/posts are sent repeatedly from one account, that account may be a spam account.
  • Links: If most of the tweets/posts from an account contain a link, that account may be a spam account.
  • Trending Topics: If an account posts material that is unrelated to the trending topics it is posted under, that account is likely a spam account.
  • Replies and Mentions: If most of the tweets/posts from a user account contain replies and mentions, that account is likely a spam account.
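
The three content-based features that can be computed from post text alone can be sketched as follows. This is a minimal illustration, not the method from [1]: the function name, the regular expressions, and the choice to express each feature as a fraction of the account's posts are assumptions. The trending-topics feature is left out because it needs an external list of trending topics.

```python
import re
from collections import Counter

# Simple patterns (assumptions): any http/https URL, any @handle.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")

def content_features(posts):
    """Return per-account content features as fractions of all posts."""
    n = len(posts)
    if n == 0:
        return {"duplicate_ratio": 0.0, "link_ratio": 0.0, "mention_ratio": 0.0}

    # Normalize case and whitespace so trivially altered copies still match.
    normalized = [" ".join(p.lower().split()) for p in posts]
    distinct = len(Counter(normalized))

    return {
        # Share of posts that repeat an earlier post from the same account.
        "duplicate_ratio": (n - distinct) / n,
        # Share of posts containing at least one link.
        "link_ratio": sum(1 for p in posts if URL_RE.search(p)) / n,
        # Share of posts containing at least one reply/mention.
        "mention_ratio": sum(1 for p in posts if MENTION_RE.search(p)) / n,
    }
```

High values on any of these ratios only suggest spam; the thresholds at which an account is flagged would have to be chosen from labeled data.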

In the graph-based group, the following features are considered for spam detection on Facebook, Twitter, and MySpace.
  • Friends-to-Follower Ratio (FF): Spammers follow a large number of users, but very few users follow them back. This relationship is called the credibility degree; if the credibility is low or equal to zero, the account is likely a spammer account.
  • Friend Request-to-Friends Accept Ratio: The ratio of the number of friend requests sent to the number of friend requests accepted can also be a measure for spam detection. The reasoning is that people generally accept requests only from people they know in real life. A spam account does not represent a real acquaintance, so a request from it will be rejected with high probability.
  • URL Ratio: The share of posts/tweets that contain links is also a measure for spam detection.
  • No. of Sent Messages: Continuous observation shows that spammers usually send hundreds of messages.
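
The graph-based features above reduce to a few ratios over per-account counters. The sketch below is illustrative only: the `AccountStats` fields and function names are hypothetical, and two normalizations are assumptions. The credibility degree is computed here as followers / (followers + following), and the friend-request feature as the acceptance rate (accepted / sent, the inverse of the ratio named in the text) so that both stay between 0 and 1 and are low for spammers.

```python
from dataclasses import dataclass

@dataclass
class AccountStats:
    # Hypothetical counters; a real platform would expose these differently.
    following: int          # accounts this user follows ("friends")
    followers: int          # accounts that follow this user
    requests_sent: int      # friend requests sent
    requests_accepted: int  # friend requests that were accepted
    posts_total: int        # messages/posts sent
    posts_with_url: int     # posts that contain a link

def safe_ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def graph_features(s):
    return {
        # Assumed normalization: low when the account follows many, is followed by few.
        "credibility": safe_ratio(s.followers, s.followers + s.following),
        # Assumed form: fraction of sent friend requests that were accepted.
        "acceptance_rate": safe_ratio(s.requests_accepted, s.requests_sent),
        "url_ratio": safe_ratio(s.posts_with_url, s.posts_total),
        "messages_sent": s.posts_total,
    }
```

For example, an account that follows 900 users, has 100 followers, and had 10 of 200 friend requests accepted gets a credibility of 0.1 and an acceptance rate of 0.05.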

In 2011, Saeed Abu-Nimeh [2] examined spammers. He reported that spammers regularly check at what hours of the day users are most active on social networks, and send a remarkable volume of spam material in those hours. The researchers found that malicious posts increase in the evening and decrease in the early morning. If a user account sends a huge number of tweets/posts in a short period, it is likely a spammer.
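
The "many posts in a short period" heuristic can be sketched with a sliding window over post timestamps. This is an illustration under stated assumptions, not the procedure from [2]: the function names, the 10-minute window, and the threshold of 20 posts are all hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta

def max_posts_in_window(timestamps, window=timedelta(minutes=10)):
    """Largest number of posts that fall within any span of length `window`."""
    ts = sorted(timestamps)
    best = 0
    start = 0
    for end in range(len(ts)):
        # Shrink the window from the left until it spans at most `window`.
        while ts[end] - ts[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best

def is_bursty(timestamps, window=timedelta(minutes=10), threshold=20):
    """Flag an account whose posting rate reaches the (assumed) threshold."""
    return max_posts_in_window(timestamps, window) >= threshold
```

The same timestamps could also be bucketed by hour of day to check whether an account's activity is concentrated in the evening peak described above.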
The researchers also found that spammers use modern techniques such as clickjacking, malicious browser extensions, URL shorteners, and socially engineered script injections. Spammers also use counterfeit profiles, in which they introduce themselves as a friend of the individual. When the victim accepts the friend request, the spammer can steal the victim’s information through the counterfeit account.
In another method, a spammer sends a destructive link and encourages the user to click on it. This is known as click-theft.

SPAM POSTS DETECTION FEATURES ON THE SOCIAL NETWORKS
The study of posts on Facebook has two phases. In the first phase, all wall posts are analyzed, with a focus on the posts that contain a link. In the second phase, the features of destructive wall posts are analyzed. Since each account can send only a limited number of posts, spammers use many accounts to run campaigns.
In 2011, Kristopher Beck [3] analyzed posts/tweets for spam detection. Some special words and expressions indicate that a post is spam, so the probability of a post being spam can be estimated by identifying those words. The following checks are used to assess a post:
  • Does the post contain a link? (X0)
  • Does the post contain the word “Chat”? (X1)
  • Does the post contain the word “With”? (X2)
  • Does the user’s biography contain the word “Chat”? (X3)
  • Does the user’s biography contain the word “naughty”? (X4)

After these checks, the following formula can be applied to find the probability of the post being spam.
                        z = X1 + X2 + X3 + X4 + ….
                        β = 1 / (1 + e^(-z))
β is the risk of a malicious message, i.e., the probability of the post being spam.
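
The checks and the formula above can be put together in a short sketch. It follows the formula as printed, so z sums X1 through X4 and X0 is extracted but not used in the sum. The function names, the word-matching rules, and the optional `weights` and `bias` parameters are assumptions added for illustration; with the default unit weights and zero bias the code reproduces the unweighted sum shown above. A fitted logistic regression model would normally learn these weights from labeled posts.

```python
import math
import re

def has_word(text, word):
    """Case-insensitive whole-word match."""
    return re.search(rf"\b{re.escape(word)}\b", text, re.IGNORECASE) is not None

def extract_indicators(post, biography):
    """Binary indicators X0..X4 for one post and its author's biography."""
    return {
        "X0": int(re.search(r"https?://", post) is not None),
        "X1": int(has_word(post, "chat")),
        "X2": int(has_word(post, "with")),
        "X3": int(has_word(biography, "chat")),
        "X4": int(has_word(biography, "naughty")),
    }

def spam_probability(indicators, weights=None, bias=0.0):
    """beta = 1 / (1 + e^(-z)), with z summed over X1..X4 as in the text."""
    names = ["X1", "X2", "X3", "X4"]
    if weights is None:
        # Assumption: unit weights, matching the unweighted sum in the text.
        weights = {name: 1.0 for name in names}
    z = bias + sum(weights[name] * indicators[name] for name in names)
    return 1.0 / (1.0 + math.exp(-z))
```

With unit weights, a post that triggers none of X1 to X4 gets β = 0.5, which shows why a bias term and learned weights matter in practice: without them no post can score below 0.5.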

CONCLUSION
The features of spammers and spam posts should be determined first. Once the features are determined, a suitable method for spam detection needs to be chosen. Clustering and classification algorithms are used to estimate the probability that an account or post is spam.

REFERENCES
[1] Wang, A.H. Don't follow me: Spam detection on twitter. In Security and Cryptography (SECRYPT), Proceedings of the 2010 International Conference on. 2010. IEEE.

[2] Abu-Nimeh, S., T. Chen, and O. Alzubi. Malicious and Spam Posts in Online Social Networks. Computer, 2011. 44(9): p. 23-28.

[3] Beck, K. Analyzing tweets to identify malicious messages. In Electro/Information Technology (EIT), 2011 IEEE International Conference on. 2011. IEEE.
