Twitter Sentiment Analysis using Hadoop
Speakers
• Debarchan Sarkar
Agenda
• HDInsight – Hadoop on Windows
• Microsoft Offerings
• Demo – Twitter sentiment analysis
Social Media Opinion Mining
Microsoft Data Platform
Demo – Twitter Sentiment Analysis
• Enterprises may analyze sentiment about:
• Product
• Service
• Competitors
• Reputation
Sentiment analysis is used to understand how the public feels about something at a particular moment in time, and to track how those opinions change over time.
What We Can Determine
• What do people think about our product (service, company etc.)?
• How positive (or negative) are people about our product based on geographical locations?
• What would people prefer our product to be like?
Basic Steps For Sentiment Analysis
• Filtering – we remove URL links (e.g. http://example.com) and Twitter user names (e.g. @alex, where the @ symbol indicates a user name).
• Tokenization – we segment text by splitting it by spaces and punctuation marks, and form a bag of words.
• Removing stop words – we remove articles (“a”, “an”, “the”) from the bag of words.
• Constructing n-grams – we make a set of n-grams out of consecutive words. A negation (such as “no” and “not”) is attached to a word which precedes it or follows it.
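The four steps above could be sketched in Python roughly as follows. This is a minimal illustration, not the presenters' implementation: the regular expressions, the "+" joiner for negations, the choice to attach a negation to the word that follows it (falling back to the preceding word when the negation comes last), and all function names are assumptions.

```python
import re

# Assumed patterns and word lists, chosen for illustration only.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
USER_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")
STOP_WORDS = {"a", "an", "the"}
NEGATIONS = {"no", "not"}


def filter_text(text):
    """Filtering: remove URL links and Twitter user names."""
    text = URL_RE.sub(" ", text)
    return USER_RE.sub(" ", text)


def tokenize(text):
    """Tokenization: split on spaces and punctuation, lowercasing tokens."""
    return TOKEN_RE.findall(text.lower())


def remove_stop_words(tokens):
    """Stop-word removal: drop the articles listed in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def attach_negations(tokens):
    """Attach each negation to the following word, or to the
    preceding word when the negation is the last token."""
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in NEGATIONS and i + 1 < len(tokens):
            out.append(tok + "+" + tokens[i + 1])
            i += 2
        elif tok in NEGATIONS and out:
            out[-1] = out[-1] + "+" + tok
            i += 1
        else:
            out.append(tok)
            i += 1
    return out


def ngrams(tokens, n=2):
    """Build n-grams from consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def preprocess(text, n=2):
    """Run all four steps in order and return the list of n-grams."""
    tokens = remove_stop_words(tokenize(filter_text(text)))
    return ngrams(attach_negations(tokens), n)


if __name__ == "__main__":
    print(preprocess("@alex The movie was not good http://example.com"))
```

Attaching the negation before building n-grams keeps "not good" from being counted as evidence for "good" in later scoring.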
Demo – Twitter Sentiment Analysis – Continued
In this demo, we will:
• Convert the raw Twitter data into a tabular format.
• Use a dictionary file to score the sentiment of each Tweet by the number of positive words compared to the number of negative words, and then assign a positive, negative, or neutral sentiment value to each Tweet.
• Create a new table that includes the sentiment value for each Tweet.
• Project the sentiment grouped according to geographical location of the users in an interactive, Excel Map visualization.
References: https://tweetinvi.codeplex.com/, http://hortonworks.com/
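The dictionary-based scoring and the grouping by location described in the demo steps could look roughly like the plain Python sketch below. It only illustrates the logic and is independent of how the demo implements it on HDInsight. The tiny word dictionary, the (location, text) input shape, and the function names are all hypothetical; the demo's actual dictionary file and table layout are not shown in these slides.

```python
from collections import Counter

# Hypothetical mini dictionary; stands in for the demo's dictionary file.
DICTIONARY = {
    "good": "positive", "great": "positive", "love": "positive",
    "bad": "negative", "awful": "negative", "hate": "negative",
}


def score_tweet(text, dictionary=DICTIONARY):
    """Compare the count of positive words with the count of negative
    words and return 'positive', 'negative', or 'neutral'."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(1 for w in words if dictionary.get(w) == "positive")
    neg = sum(1 for w in words if dictionary.get(w) == "negative")
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def sentiment_by_location(tweets):
    """Group sentiment labels by location.

    tweets: iterable of (location, text) pairs (assumed input shape).
    Returns a dict mapping each location to a Counter of labels.
    """
    result = {}
    for location, text in tweets:
        result.setdefault(location, Counter())[score_tweet(text)] += 1
    return result


if __name__ == "__main__":
    sample = [("US", "I love it"), ("US", "I hate it"), ("IN", "great!")]
    for location, counts in sentiment_by_location(sample).items():
        print(location, dict(counts))
```

The per-location counts are the kind of aggregate that a map visualization would plot, one value per region and sentiment label.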
Send Us Feedback
• Support Team’s blog: http://blogs.msdn.com/b/bigdatasupport/
• Facebook Page: https://www.facebook.com/MicrosoftBigData
• Facebook Group: https://www.facebook.com/groups/bigdatalearnings/
• Twitter: @debarchans
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Thank You!