thought bubbles: a conceptual prototype for a twitter based recommender system for research 2.0

THOUGHT BUBBLES: A Conceptual Prototype for a Twitter based Recommender System for Research 2.0

Patrick Thonhauser1, Selver Softic1, Martin Ebner1

1 Department for Social Learning, Institute for Information Systems and Computer Media, Graz University of Technology, Austria


Keywords: Recommender System, Twitter, Thought Bubble, Classification, Social, Data Mining

Abstract: The concept of so-called Thought Bubbles deals with the problem of finding appropriate new connections within social networks, especially Twitter. As a side effect of exploring new users, Tweets are classified and rated and are used for generating a kind of news feed, which will extend the personal Twitter feed. Each user has several interests that can be classified first by evaluating his or her Tweets and second by evaluating the user's related and already existing contacts. By categorizing a user and the connections concerned, one can be placed in an imaginary category specific subset of users, called a Thought Bubble. Following the trace of people who are also active within the same specific Thought Bubble should reveal interesting and helpful connections between like-minded users.

1 INTRODUCTION

Twitter has grown tremendously in the last few years and is generating 200 million Tweets and 1.6 million search queries each day. As of now (2012), Twitter has over 250 million users¹. These are impressive numbers for a micro blogging/social-network platform, and Twitter has already become a cultural phenomenon. Every day people all over the world are communicating via Twitter, exchanging the latest news and discussing millions of diverse topics. The list of tweetable actions is almost infinite, and everybody who is interested in a specific person or a specific topic has the ability to consume the knowledge by reading certain Tweets or exploring the tweeted resources.

However, the interesting questions for researchers are how to make use of the information contained within millions of Tweets and what to extract from those 140 character micro blogs. How much useful information is in a Tweet, and how can we separate useful information from noise? This paper presents a novel concept for finding new interesting users and information for a specific Twitter account. Many researchers have already solved parts of this puzzle, and several parts of this concept are based on findings of (Softic et al., 2010), (Mika and Laniado, 2010) and (Choudhury and Breslin, 2010). To Semantic Web researchers, Twitter has become one of the most popular applications for the dissemination of information (Kraker et al., 2010), and it is therefore a legitimate candidate to serve as the main source for mining data concerning users and provided information of scientific interest.

¹ http://thesocialskinny.com/100-social-media-statistics-for-2012/ (April 2012)

This paper doesn’t serve as a detailed description of a forthcoming semantic recommender system for research 2.0, but rather as a brief overview of a proof of concept application whose main task is the classification and recommendation of Twitter users. Preliminary results of this extensive categorization task are also presented in this paper.

2 CONCEPT

Twitter users follow other users for specific reasons. In the majority of cases these reasons are concerned with similar fields of interest. Nonetheless, this doesn’t mean the connection between Twitter users with similar interests is bidirectional. When social network connections aren’t bidirectional, an individual user doesn’t necessarily know his or her followers. Obviously, the follower is interested in and involved with topics similar to those of the person he or she follows. Therefore, there is a high probability that friends and other colleagues of the followed user have similar connections, which can be of interest for a specific user.

Draft version, originally published in: Patrick Thonhauser, Selver Softic, and Martin Ebner. 2012. Thought Bubbles: a conceptual prototype for a Twitter based recommender system for research 2.0. In Proceedings of the 12th International Conference on Knowledge Management and Knowledge Technologies (i-KNOW '12). ACM, New York, NY, USA, Article 32, 4 pages. DOI=10.1145/2362456.2362496

A user is active in several kinds of topic based bubbles, where the participating users do not necessarily know all participants of such a bubble. In most cases, one doesn’t have just one special kind of interest, so he or she is part of several topic based subsets of users. Hence, users within one user’s specific bubble might be of interest for each other.

Figure 1 shows an example of a so-called network graph, which reveals the sphere of activity within diverse Thought Bubbles. Users marked with a star (*) are potentially of great interest for this account (blue highlighted in figure 1). These users belong to the same topic specific bubble, in this case the Science Bubble. However, the connection between the yellow marked account and the accounts marked with a star isn’t bidirectional either.

Figure 1: This is an example of how a user can be placed in a Twitter network graph. (Bubbles shown: Science Bubble, Music Bubble, Developer Bubble; accounts of potential interest are marked with a star.)

This implies that following a specific user of a certain field opens a high probability of finding further relevant users who are also active in a field of specific relevance. The missing bidirectionality of certain user connections hints at interest only relationships.

This insight led to the concept of Thought Bubbles. It makes it possible to recommend people and information that are contained within a bubble and have not been explored by a specific Twitter user so far.

3 SYSTEM MODULES

The conceptual realization of Thought Bubbles can be split into several sub modules.

3.1 Finding potentially interesting users

The first sub module deals with the problem of separating users that merely produce noise or spam from those that spread news, personal thoughts and facts. To simplify this process we have to define the pool of people who are connected to one’s Twitter account. This connection exists because one is following other users or because other users are following oneself. We call this pool of people the inner circle. Separating the inner circle of people by filtering useful information provided by those people helps to reveal further accounts of potential interest, which are hidden in the so-called outer circle. The outer circle of people represents the connections of every person acting within one’s inner circle. Subsequently, a second cycle of filtering is performed to efficiently narrow down and identify the people of potential interest.
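The inner and outer circle can be pictured as plain set operations on follow relations. The following is only a minimal sketch under stated assumptions: the dictionaries `following` and `followers`, the function names and all account names are hypothetical stand-ins for data that would be fetched from Twitter, and the noise and spam filtering step is left out.

```python
def inner_circle(user, following, followers):
    """Accounts the user follows plus accounts that follow the user."""
    return set(following.get(user, ())) | set(followers.get(user, ()))


def outer_circle(user, following, followers):
    """Connections of every inner-circle member, without the inner circle itself."""
    inner = inner_circle(user, following, followers)
    outer = set()
    for member in inner:
        outer |= inner_circle(member, following, followers)
    return outer - inner - {user}


# Hypothetical follow relations: each key follows every account in its value set.
following = {
    "alice": {"bob"},
    "bob": {"carol", "dave"},
    "erin": {"alice"},
    "frank": {"erin"},
}

# Derive the reverse mapping (account -> set of its followers).
followers = {}
for source, targets in following.items():
    for target in targets:
        followers.setdefault(target, set()).add(source)

print(sorted(inner_circle("alice", following, followers)))
print(sorted(outer_circle("alice", following, followers)))
```

Because the outer circle excludes the inner circle, accounts that are already connected to the user never show up as candidates.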

(Horn, 2010) uses Support Vector Machines (SVMs) for this rather rough classification task. SVMs are a commonly used technique for text classification and are recommended by many researchers like (Rios and Zha, 2004), (Hsu et al., 2010) or (Nakagawa et al., 2001). By applying this method, a potentially interesting set of users would remain for further consideration. Using a POS-tagger and a chunker in advance could also help to ascertain whether a Twitter account belongs to a person. Eliminating duplicates within this set and eliminating the accounts that one already follows usually leads to a clearly arranged set of Twitter accounts that is worth exploring in depth.

3.2 Categorization of users

Granular categorization of users is the most complex task within this system. First, it’s necessary to categorize the active user who uses the Thought Bubble service. At the very beginning, a set of appropriate categories that covers all possible interests a user could have has to be defined. Examples of such categories are developing, science, teaching, etc.

To be able to classify a user, it’s necessary to process one’s Tweet history. The first step is to annotate words in a user’s Tweets, which can be performed by applying Natural Language Processing (NLP) techniques (Ritter et al., 2011). Classifying Tweets is a very special task compared to the usual classification of text artefacts. The reasons are: (a) the shortness of Tweets (140 character strings), (b) the often changing context in which a word is used and (c) the above average occurrence of out of vocabulary words. By tagging all words in Tweets (Part-Of-Speech tagging), unimportant words like copulas or prepositions can be eliminated. (Gimple et al., 2011), for example, already developed a POS-tagger especially for the needs of Twitter. Summarizing the results of all categorized user Tweets leads to a percentage based classification of a user. There are several available and proven techniques for realizing this classification task. As mentioned in section 3.1, SVMs can be used for such a task, as applied by (Nakagawa et al., 2001). But there are several other ways of accomplishing this classification, like using Bayesian approaches (Goldwater and Griffiths, 2005). Future testing and evaluation will clarify the best way of realizing the categorization of Twitter users. One possibility for evaluation is presented by (Chen et al., 2009).
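The step from categorized Tweets to a percentage based classification of a user can be illustrated with a small sketch. It assumes that some classifier has already assigned exactly one category label to each Tweet; the function name `category_profile` and the example labels are hypothetical and not part of the described system.

```python
from collections import Counter


def category_profile(tweet_categories):
    """Turn per-Tweet category labels into percentage shares per category."""
    counts = Counter(tweet_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical classifier output for five recent Tweets of one user.
labels = ["science", "science", "developing", "teaching", "science"]
profile = category_profile(labels)
print(profile)
```

A profile like this could then be compared between the active user and candidate accounts, whichever classifier ends up producing the labels.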

3.3 Additional Ratios

In addition to measuring the similarity of Thought Bubble attributes regarding the affiliation of a user with a category, several other ratios for determining the importance of a user’s recommendation are used to sharpen the prediction accuracy. The following ratios are legitimate candidates for additionally influencing whether a user within a topic related bubble will be recommended or not.

• Tweet frequency is the number of Tweets a Twitter user sends within a defined period of time.

• The follower ratio. The more followers a user has, the more influence or credibility he or she might possess. On the other hand, a user who has very few followers but follows a huge number of other users might be a Blast Follower².

• The number of retweets a user’s Tweets receive indicates the reach of that user’s reputation.

• If an observed user isn’t connected with the inner circle bidirectionally, this denotes not a friendship but a purely interest related relationship.

• Clients will have the possibility to rate recommended users or Tweets as “interesting” or “not interesting” for a specified category. By comparing users that are rated as interesting with potential recommendations for a Thought Bubble, the similarity between them can also influence a user’s overall rating score within a bubble.

These ratios could help to sharpen the selection of recommended Tweets and Twitter users. However, the main task in applying these ratios is to find an appropriate weighting scheme for every ratio.
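Two of the ratios listed above can be computed directly from basic account statistics. The sketch below is illustrative only: the `AccountStats` fields, the `+ 1` smoothing and the cut-off values in `looks_like_blast_follower` are assumptions made for this example, not values taken from the system.

```python
from dataclasses import dataclass


@dataclass
class AccountStats:
    tweets_in_period: int  # Tweets sent within the observation period
    period_days: float     # length of the observation period in days
    followers: int         # accounts following this user
    following: int         # accounts this user follows


def tweet_frequency(stats):
    """Tweets per day within the observed period."""
    if stats.period_days <= 0:
        return 0.0
    return stats.tweets_in_period / stats.period_days


def follower_ratio(stats):
    """Followers per followed account; the +1 avoids division by zero."""
    return stats.followers / (stats.following + 1)


def looks_like_blast_follower(stats, max_ratio=0.05, min_following=1000):
    """Heuristic with assumed cut-offs: very few followers, very many followed accounts."""
    return stats.following >= min_following and follower_ratio(stats) <= max_ratio


# Hypothetical accounts for illustration.
mass_follower = AccountStats(tweets_in_period=60, period_days=30, followers=20, following=3999)
researcher = AccountStats(tweets_in_period=90, period_days=30, followers=1200, following=300)

print(tweet_frequency(mass_follower), looks_like_blast_follower(mass_follower))
print(tweet_frequency(researcher), looks_like_blast_follower(researcher))
```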

² http://www.makeuseof.com/dir/blastfollow-mass-follow-twitter-users/ (April 2012)

3.4 Recommendation

Recommendation decisions are made by calculating ratings for each potentially interesting user, based on their category classification and the additional ratios mentioned in section 3.2 and section 3.3. Subsequently, the category classification of an active service user is compared to the classified categories of potentially interesting other users. In addition, all additional ratios have different weights, which will finally influence the position of a user in the final recommendation list. Definite values for the weights of those ratios have to be found during development and test runs of the system and therefore can’t be predicted in advance.
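One possible way to combine the category similarity with weighted additional ratios is a simple weighted sum. This is a sketch under assumptions and not the rating formula of the system: the additive form, the normalization of every ratio to the range 0 to 1, the weight values and the account names are all hypothetical.

```python
def recommendation_score(category_similarity, ratios, weights):
    """Category similarity plus a weighted sum of ratios (each assumed to lie in [0, 1])."""
    return category_similarity + sum(
        weights[name] * value for name, value in ratios.items()
    )


def rank_candidates(candidates, weights):
    """Return account names sorted by descending recommendation score."""
    scored = {
        name: recommendation_score(data["similarity"], data["ratios"], weights)
        for name, data in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Assumed weights; real values would have to be found in test runs.
weights = {"tweet_frequency": 0.1, "follower_ratio": 0.2, "retweets": 0.3}

# Hypothetical candidates with normalized ratios.
candidates = {
    "account_a": {
        "similarity": 0.28,
        "ratios": {"tweet_frequency": 0.5, "follower_ratio": 0.5, "retweets": 0.5},
    },
    "account_b": {
        "similarity": 0.30,
        "ratios": {"tweet_frequency": 0.1, "follower_ratio": 0.1, "retweets": 0.1},
    },
    "account_c": {
        "similarity": 0.05,
        "ratios": {"tweet_frequency": 1.0, "follower_ratio": 1.0, "retweets": 1.0},
    },
}

ranking = rank_candidates(candidates, weights)
print(ranking)
```

With these assumed weights the ratios can outweigh the content similarity, as `account_c` shows, which is exactly why the weighting scheme needs careful tuning.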

4 DEMO APPLICATION

The Thought Bubble Server will be implemented in Python and run on an Apache 2 web server. Figure 2 visualizes the potential infrastructure of this system.

Figure 2: Thought Bubble infrastructure. (Components shown: Twitter API, REST API, SQLite Database, Database Wrapper, Database Operations Thread, Classification Worker Threads, Tweet Collector, Rater, Clients (iOS, Web, etc); grouped into External and Internal.)

Twitter related API calls that affect or are significant for the classification and recommendation task are processed and cached by the Thought Bubble server. The REST API acts as a junction between the Twitter REST API and the client. All requests that don’t affect the functionality of the Thought Bubble system are directly processed by the Twitter REST API. Once the system has completed categorizing and rating potential recommendations for a user who starts to use this service for the first time, it starts to enrich the Twitter stream with Tweets from recommended persons. Recommendation of single Tweets is based on the influence a Tweet has had during classification of a certain user. Thought Bubble clients can be used just like usual Twitter clients for reading one’s personal Twitter stream, tweeting or direct messaging. However, the big difference is that the user gets recommendations in the form of other Twitter accounts that most likely fit into his or her specific Thought Bubble. Twitter also features a system for recommending Twitter users concerning specific categories, but these recommendations aren’t user specific at all.

In fact, this is just one part of the whole proof of concept application. The second part is an iOS app for iPhone users, which uses the Thought Bubble service. This app can be used as a common iOS Twitter client. However, this iOS Twitter app includes the big additional feature of being able to explore newly recommended, topic specific information and rated users. Another very interesting feature, which would provide huge potential for future applications, is the visualization of Thought Bubbles, which would allow users to actively explore their own topic related bubbles. Currently, our group is developing an advanced prototype of such an app. The implementation of the server application is also in progress, and preliminary results are discussed in the next section.

5 PRELIMINARY RESULTS AND APPLICATION SETUP

A first proof of concept application according to the idea of Thought Bubbles has already been implemented. Figure 2 visualizes the architecture of the current Thought Bubble Server implementation. Django³ is used as the Web framework and the Natural Language Toolkit (NLTK)⁴ is used for classification and word processing tasks. Data storage is handled by SQLite3⁵ and Twitter related requests are generated by the Python Twitter framework⁶. The current stage of development contains the following implementations.

5.1 Proof of Concept Setup

Currently, the classification is done by filtering hashtags within Tweets of users and by POS tagging and chunking the last 200 Tweets in a user’s Twitter timeline. POS tagging is done using NLTK’s Trigram Tagger⁷, trained with CoNLL 2000 training data (Tjang and Buchholz, 2000), similar to (Ritter et al., 2011).

³ https://www.djangoproject.com/ (April 2012)
⁴ http://www.nltk.org/ (April 2012)
⁵ http://www.sqlite.org/ (April 2012)
⁶ http://code.google.com/p/python-twitter/ (April 2012)
⁷ http://nltk.googlecode.com/svn/trunk/doc/howto/tag.html (April 2012)

After applying POS tagging, sentences are brought into the form of so-called chunk trees (Abney, 1994). By iterating through the trees and searching for detected phrases, names and nouns, feature vectors are compiled. To strengthen the influence of hashtags, they are counted twice within a vector. Additionally, words that occur very often in the English language (the 200 most used English words) and aren’t useful for proper categorization are removed from the vectors. This task is performed for all Twitter users within a potential Thought Bubble, and the resulting vectors are afterwards compared by applying cosine similarity to rate the similarity of the tweeted content. This similarity is measured by comparing the frequency counts of words and phrases that were classified as relevant by the preceding operations (POS tagging, chunking and phrase, noun and name filtering).
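The vector construction and comparison described here can be sketched with the standard library alone. The sketch assumes that POS tagging and chunking have already produced a list of relevant terms per user; the tiny `STOPWORDS` set merely stands in for the 200 most used English words, and treating a hashtag as its plain word with double weight is an assumption of this example.

```python
import math
from collections import Counter

# Tiny stand-in for the list of the 200 most used English words.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "on", "at"}


def feature_vector(terms):
    """Count terms, dropping stopwords and counting hashtags twice."""
    vector = Counter()
    for term in terms:
        is_hashtag = term.startswith("#")
        word = term.lstrip("#").lower()
        if not word or word in STOPWORDS:
            continue
        vector[word] += 2 if is_hashtag else 1
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical terms extracted from the Tweets of two users.
user_a = feature_vector(["#elearning", "the", "lecture", "Graz"])
user_b = feature_vector(["#elearning", "podcast"])

similarity = cosine_similarity(user_a, user_b)
print(round(similarity, 4))
```

Since all counts are non-negative, the resulting similarity lies between 0 (no shared terms) and 1 (identical term distribution).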

5.2 First Test Results

This first test run included all operations mentioned in section 5.1. Test data for the Twitter accounts to be observed was fetched via the Twitter REST API and cached for further processing. Caching was done to ensure that all observed accounts were in the exact same state during testing, because users keep tweeting, which would affect the test results. In total, 49 Twitter accounts were compared to @mebner. Within the test set of users, 21 Twitter accounts of people who work in the same or similar fields as @mebner or are students of his were added to measure the reliability of the system. The rest of the Twitter accounts for this test run were chosen randomly.

Figure 3 visualizes the results of a first test run, based on Martin Ebner’s (@mebner) Twitter account.

Nearly all of the best scoring users were either students of TUGraz or researchers whose profession is very similar to @mebner’s. @gargamit100, for example, scored a similarity of 0.28 and was therefore the best match in this test set. This person is an e-learning specialist from India and is already followed by @mebner. No random pick scored higher than slightly above 0.09, so all of them stayed below 0.1. A tech blogger’s Twitter account scored best in the non researcher dataset, which could indeed also be of potential interest to a professor at a university of technology. Five of the 21 manually added researchers and students scored lower than expected. By applying more ratios as discussed in section 3.3, we expect to reduce the error rate to a satisfying level.

Nonetheless, the 0.1 mark seems to be a good threshold for deciding whether a Twitter account should still be considered for further analysis, at least in the case of @mebner.

Figure 3: Test run with 50 Twitter users.

In addition to this first test, similar test runs were done for every member of the manually picked users. Figure 4 visualizes all optimal thresholds found, which would enable the categorization of an account to reach an accuracy similar to that of @mebner’s test run.

Figure 4: Thresholds of the 22 hand picked users.

By observing each result set of the tested users, thresholds were defined. These thresholds were set to meet a minimum 75% limit, where at least three-fourths of the hand picked users were categorized as potentially interesting⁸. By summing all specific thresholds and dividing the sum by the number of tested users, we got an average threshold of 0.098. Although the average threshold of 0.098 is very close to the predicted 0.1 of @mebner’s case, the specific thresholds deviate from the average by up to 50% and more. So a threshold may not be the best choice for pre-elimination, because the number of accounts left for further processing may vary too much. Applying a simple k-nearest-neighbor approach would be more appropriate to limit the number of potential recommendations in advance. A limit for selecting the top n neighbors will be defined during the forthcoming tests.
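The difference between a fixed threshold and a top n selection can be made concrete with a short sketch. All numbers and account names below are made up for illustration and are not the measured values from the test runs; `relative_spread` is one assumed way of expressing how far individual thresholds deviate from the average.

```python
def average_threshold(thresholds):
    """Arithmetic mean of the per-user thresholds."""
    return sum(thresholds) / len(thresholds)


def relative_spread(thresholds):
    """Largest deviation from the average, relative to the average."""
    average = average_threshold(thresholds)
    return max(abs(value - average) for value in thresholds) / average


def top_n(similarities, n):
    """Names of the n accounts with the highest similarity scores."""
    return sorted(similarities, key=similarities.get, reverse=True)[:n]


# Hypothetical per-user thresholds.
thresholds = [0.05, 0.10, 0.15]
print(average_threshold(thresholds), relative_spread(thresholds))

# Hypothetical similarity scores of candidate accounts for one user.
similarities = {"account_a": 0.28, "account_b": 0.09, "account_c": 0.12, "account_d": 0.03}
print(top_n(similarities, 2))
```

Unlike a threshold, the top n selection always passes the same number of accounts on to further processing, regardless of how the scores of a particular user are distributed.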

5.3 Bubble Selection and Recommendation

All top n similar users within a test set are now part of the Thought Bubbles of a user of the service. Within this set of potentially interesting users, category specific bubbles can be extracted and then recommended as a topic based subset of users. Unfortunately, this feature is currently in a very early stage of development and therefore not part of the first proof of concept application and test runs. Furthermore, Thought Bubbles for a user of this service will be available via the REST API as visualized in figure 2. The bubbles will be delivered as JSON⁹ objects and presented in the user’s client application, according to the client platform, as category specific lists, where users will be able to explore the newly recommended Twitter profiles on their own and decide whether a recommendation is usable and interesting or not. The ability to rate the recommendations will again sharpen the classification task as mentioned in section 3.3.
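How bubbles might be delivered as JSON objects can be sketched as follows. The layout is purely hypothetical: the field names `user`, `bubbles`, `category` and `accounts` as well as the account names are assumptions for illustration, since the actual response format of the REST API is not specified here.

```python
import json


def bubbles_to_json(user, bubbles):
    """Serialize category specific bubbles into a JSON document (assumed layout)."""
    payload = {
        "user": user,
        "bubbles": [
            {"category": category, "accounts": sorted(accounts)}
            for category, accounts in sorted(bubbles.items())
        ],
    }
    return json.dumps(payload, indent=2)


# Hypothetical bubbles for one user of the service.
example = bubbles_to_json(
    "@some_user",
    {"science": {"@account_b", "@account_a"}, "music": {"@account_c"}},
)
print(example)
```

A client could turn each entry of `bubbles` into one category specific list.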

6 DISCUSSION

Categorization within the additional 21 test runs of course delivered different results. That is to be expected, simply because different people use different words and phrases and have different interests in addition to their professions. Hence students were often identified as potentially interested in musicians or sportsmen. Furthermore, the usage of Retweets in the set of Tweets that were POS tagged and chunked lowered the scores significantly within the set of accounts that should at least score close to a specific threshold. As a result, future test runs will exclude Retweets from the classification task. Instead, the number of Retweets a user’s Tweets receive will be considered as an additional ratio, as mentioned in section 3.3.

⁸ The 75% rate of correct classification is motivated by the results of @mebner’s Twitter account.

⁹ http://www.json.org/ (April 2012)

Another problem occurred during the test runs: the limit of 350 requests per hour of Twitter’s REST API was reached very quickly. This problem could be solved in future test runs either by scheduling the worker threads according to this limit or simply by using an alternative service like Grabeeter¹⁰ for grabbing people’s Tweets. The disadvantage of the second alternative is that the majority of users that are observed and categorized would have to be users of Grabeeter (Muehlberger et al., 2010). Scheduling threads in a way that they don’t exceed the Twitter REST API’s limit would, on the other hand, be very time consuming. Maybe a combination of those alternatives could solve this problem with an acceptable loss of time.
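The time cost of scheduling requests according to the 350 requests per hour limit can be estimated with simple arithmetic. The sketch below spaces requests evenly and is only an illustration: the function names and the figure of 700 requests are hypothetical, and a real scheduler would also have to coordinate the worker threads.

```python
def min_interval_seconds(limit_per_hour=350):
    """Smallest even spacing between requests that respects the hourly limit."""
    return 3600.0 / limit_per_hour


def schedule_requests(n_requests, start=0.0, limit_per_hour=350):
    """Evenly spaced start times in seconds, relative to `start`."""
    interval = min_interval_seconds(limit_per_hour)
    return [start + index * interval for index in range(n_requests)]


def hours_needed(n_requests, limit_per_hour=350):
    """Approximate duration in hours for a batch of requests."""
    return n_requests / limit_per_hour


print(round(min_interval_seconds(), 2))  # seconds between two requests
print(schedule_requests(3))
print(hours_needed(700))                 # hypothetical batch of 700 requests
```

At this limit one request can be sent roughly every ten seconds, so a batch of twice the hourly limit takes about two hours, which illustrates why pure scheduling is time consuming.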

A big advantage of this system compared to similar approaches like (DeVoch et al., 2011) is that the concept of Thought Bubbles isn’t limited to a specified community like Research 2.0, but can be used in any kind of topic related community. The fact that people are classified primarily on the content of their Tweets and not only on hashtags, mentions or already existing connections leads to new and so far undiscovered personalized recommendations of like-minded people. At the same time, all recommendations are always based on the context of the latest n Tweets of a user. Therefore, recommendations change automatically when a user changes his or her interests or the projects he or she is currently working on, assuming of course that the user is tweeting about his or her current activities.

Although we haven’t made use of any classic semantic technologies like FOAF¹¹ or SIOC¹² so far, we are considering using them before finishing this proof of concept application. This would enable the system to link people beyond the borders of Twitter. (DeVoch et al., 2011), for example, already conceived and partially validated a system using these classic semantic approaches to mine specific science related events and their participants.

Nonetheless, one of this project’s main purposes is to answer the question to what extent Twitter is helpful for discovering useful and interesting information for a community like Research 2.0. Considering the fact that recommendations depend on the quality of a user’s Tweets, we aim to extract and find metrics and techniques that enable us to filter as much noise as possible and detect relevant facts within a dynamically changing context.

¹⁰ http://grabeeter.tugraz.at/ (April 2012)
¹¹ http://www.foaf-project.org/ (April 2012)
¹² http://sioc-project.org/ (April 2012)

7 CONCLUSION AND FUTURE WORK

Classification of user profiles in social networks isn’t just a Twitter related topic but can be used for similar networks as well. It can help to establish connections between people with similar interests, especially regarding scientific interests or expertise (Stankovic et al., 2010). New connections to new users often lead to novel and useful information. Moreover, this kind of categorization of virtual individuals isn’t only useful for user recommendations, but also for focusing resources on the needs and interests of a specific user, which will probably be the next step in this project. A personalized stream of information, similar to a personalized search engine, is a very powerful tool for personalization in any kind of business field that deals with supplying information.

This scientific field is still in its early stages. In the near future we plan to finish a first complete proof of concept application, which would enable us to evaluate our chosen methods for classifying and recommending users. This will certainly help us to further assess the full potential of such applications. Future users of the Thought Bubble service will have the opportunity to access other people’s knowledge just by doing what they do and tweeting about it. This isn’t just a fast and convenient way of finding new interesting people, but rather a way to create one’s personal subset of people who might be able to answer one’s questions or influence one’s work. In other words, this is one step towards a personalized and focused stream of information for everyone.

Based on our future findings during development, we hope to be able to answer whether Twitter is a useful source in general for mining one’s information needs, especially for researchers and general science related content, or whether the huge amount of noise can’t be eliminated within a satisfying amount of computation time.

REFERENCES

Abney, S. P. (1994). Parsing by chunks.

Chen, J., Geyer, W., Dugan, C., Muller, M., and Guy, I. (2009). Make new friends, but keep the old: recommending people on social networking sites.

Choudhury, S. and Breslin, J. G. (2010). Extracting semantic entities and events from sports tweets.

DeVoch, L., Softic, S., and Ebner, M. (2011). Semantically driven social data aggregation interfaces for research 2.0.

Gimple, K., Schneider, N., Brendan, O., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogamata, D., Flanigan, J., and Smith, N. A. (2011). Part-of-speech tagging for twitter: Annotation, features, and experiments.

Goldwater, S. and Griffiths, T. L. (2005). A fully bayesian approach to unsupervised part-of-speech tagging.

Horn, C. (2010). Analysis and classification of twitter messages.

Hsu, C.-W., Chih-Chung, C., and Lin, C.-J. (2010). A practical guide to support vector classification.

Kraker, P., Wagner, C., Jeanquartier, F., and Lindstaed, S. (2010). On the way to a science intelligence: Visualizing tel tweets for trend detection.

Mika, P. and Laniado, D. (2010). Making sense of twitter.

Muehlberger, H., Ebner, M., and Taraghi, B. (2010). @twitter try out #grabeeter to export, archive and search your tweets.

Nakagawa, T., Kudoh, T., and Matsumoto, Y. (2001). Unknown word guessing and part-of-speech tagging using support vector machines.

Rios, G. and Zha, H. (2004). Exploring support vector machines and random forests for spam detection.

Ritter, A., Mausam, C. S., and Etzioni (2011). Named entity recognition in tweets: An experimental study.

Softic, S., Ebner, M., Muehlburger, H., Altmann, T., and Taraghi, B. (2010). @twitter mining microblogs using semantic technologies.

Stankovic, M., Wagner, C., Jovanovic, J., and Laubert, P. (2010). Looking for experts? what can linked data do for you?

Tjang, K. S. and Buchholz, S. (2000). Introduction to the conll-2000 shared task: Chunking.
