Facebook, Brands & Everything between…
Preriit Souda, TNS
Abstract: This paper discusses how techniques such as network mapping and contextual text mining can be used to
understand brand pages on Facebook. It shows how these techniques can help marketers
understand their fans, filter topics that concern or attract them, devise better communication strategies, understand
how products are perceived and decipher the linkages between them, understand the impact of competition on their page and, in
some cases, mine ideas for innovation. An attempt has been made to keep the paper simple and easy to read for a
variety of readers.
Keywords‐ Social Media, Social Media Analytics, Social Media Monitoring, Facebook Analytics, Network
Mapping, Data Visualization, Text Mining
Let me guess. Either you took a leap of faith in spending time on my paper, given the pompous abstract you read, or your
boss heard about this stuff and, being the boss, has asked you to burn your eyes filling your over-exhausted
mind with this knowledge and then vomit it before a host of uninterested colleagues. No matter which category you fall into, your journey
through this paper will at least show you some colorful pictures that you can print and hang on your wall!
Enough ink wasted! Coming to the point: in this paper I will be talking about analyzing Facebook (FB) brand pages
using network analytics and text mining. If your face has turned pale worrying that I am now going to spit out some difficult
mathematical stuff, worry not. For one, I don't know that stuff either, but more importantly, this paper is meant to give an
overview of these techniques and their usefulness from a marketing standpoint.
This paper shows how I have been applying these techniques, using three Facebook pages as examples: How I Met Your
Mother (HIMYM), Yes Scotland and Better Together. How I Met Your Mother is a famous American sitcom in its last
season, ending this March (sob sob!), so one can see a lot of chatter.
On the other side, I have shown pages from the two camps of the
Scottish referendum, Yes Scotland and Better Together. If you
are not aware (if so, very bad general knowledge!), Scotland
goes to the polls on 18th September to decide whether it wants to be an independent
country or continue to be part of the UK. The two topics are different in
a lot of ways, and that adds an interesting spice to this paper.
As you can see, this is not exactly a typical
academic paper, so feel no fear, sit back and enjoy the read.
A BIT OF FLASHBACK
I come from an engineering background and to this day continue to read
technical publications such as IEEE Spectrum, Potentials and the like. I had
come across graph theory [1] during my study year at UCLA, but hell, did I care
then! Lately, though, I have been seeing growing interest in such techniques
[2] and hence got drawn to the subject, to explore data from social media
networks in a manner useful for marketers. As a starting point, I extracted
the connections on my Facebook page, shown in fig 1, and could see clusters of
friends forming based on the connections between them. I slowly took on
Twitter, which I won't go into here. That's a beast in itself! As I looked deeper, I
felt that networks on their own don't give a complete picture. Network maps
in social media give a framework to think with and a context to aid understanding, but
they lack the soul offered by conversations. So I started looking at network
analysis in conjunction with text mining to understand social media better. Network mapping and analytics gave
me a framework to explore, while text mining added richness to the analysis. Coming back to Facebook, this social network
is a bit tricky but, once tamed, is one of the most wonderful places to understand a brand. Given that Facebook is not as
open as Twitter, what one can do with it is limited. Generally speaking, only conversations available in public can be analyzed, such as
those on public FB brand pages, or comments from people who have lost sync with FB's constantly changing
privacy settings. I have often felt that comments from outside brand pages lack context, making them
difficult to understand. Hence, I generally tend to focus only on FB brand pages. A few words about brand pages: I have
often felt that they are like town halls where people interested in a topic (say a brand, a political movement etc.)
come to talk about different issues. Brand pages (through updates) give fans fodder to start talking, but fans can then take
their own course and drive conversations forward by adding their own twists and turns. A page, if managed with due
diligence, simulates an actual meeting room, able to attract and hear voices from all over the globe. And mining such
data is an experience in itself. Moving on to the details now.
HERE WE GO
Any project starts with the extraction of data. Once the data is extracted, it's always a good idea to look at the volumetrics for the page:
things like the number of fans, demographics, most commented/liked/shared posts, engagement etc. Barring reach
and a few other metrics available exclusively to page owners, several useful metrics can be derived. Coming to our
Facebook pages, one can see the basic
stats in the figures below.
You can see that HIMYM has far
more fans, which is quite
understandable given that it draws viewers
from all over the world, and has far more
interactions (likes + comments). Having
said that, looking at average
comments per interactor shows that Better
Together (called BT henceforth; I am too
lazy to write the full name!) has a much
higher proportion of discussion going on.
In pure numeric terms, the number of comments
on BT (fig 2) is 72486 compared to
HIMYM's 40861. This shows that a large
number of HIMYM interactors are silently following the
page but not expressing themselves, which is quite intuitive: I may like Barney Stinson's (one of the characters of HIMYM)
gimmicks, but I may not be as motivated or emotionally invested to express my opinion as I might be on issues raised by the
BT or Yes pages. This is an important point for the many analysts who look only at volumetric analysis of social media and do
not delve into its richness. Numbers are important for gauging reach, but there is a limit to how far they can take one
in understanding actual conversations [3].
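For the analysts reading: the comments-per-interactor comparison can be sketched in a few lines of Python. This is only a minimal illustration, assuming the extracted data is a list of (user, kind) records; the function name and the record format are hypothetical, not the tooling used for this paper.

```python
def comments_per_interactor(interactions):
    """Average number of comments per distinct interactor.

    `interactions` is an iterable of (user_id, kind) pairs, where kind is
    "like" or "comment". This record format is an assumption for the sketch.
    """
    users = set()
    comments = 0
    for user, kind in interactions:
        users.add(user)          # anyone who liked or commented is an interactor
        if kind == "comment":
            comments += 1
    return comments / len(users) if users else 0.0


if __name__ == "__main__":
    # Toy, made-up records: a page of silent likers vs. a page of talkers.
    quiet_page = [("u1", "like"), ("u2", "like"), ("u3", "comment"), ("u4", "like")]
    vocal_page = [("u1", "comment"), ("u1", "comment"), ("u2", "comment"), ("u2", "like")]
    print(comments_per_interactor(quiet_page))
    print(comments_per_interactor(vocal_page))
```

A page with many interactors but a low ratio is being followed silently; a high ratio points to a page where people actually talk.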
Talking about likes, the Yes brand page got a
very high average of likes (3.99)
compared to the BT page (2.54), which brings
me to another interesting difference
between these pages.
BT has a lot of conversations (72k+)
happening around the referendum, often
heated, often informative, while the Yes page
lacks such discussions (18k+). Often,
a higher proportion of likers can mean
that many accept the message without much interest,
or that they agree with the brand/organization's
message without any contradiction
worth expressing. It's also worth noting
that political pages often attract
commenters from the other side of the
spectrum, who engage to spread their
word. If textual analysis
shows that not many of the conversations hidden beneath the thousands are favorable to the brand, that's an alarm bell for the
page managers. Either constant monitoring or advanced text mining is needed to decipher such issues.
Another interesting aspect is gender. HIMYM has a higher share of female followers (55%), but whether interest splits
by gender across different communications can easily be answered using network maps (coming next, hold
on). If so, communication strategies can be devised to suit each gender. Often it's also useful to
understand how different genders are interacting (actively, through comments, and/or passively, through likes) and whether any pattern
emerges. A look at the Scots shows that while BT has an even split, the insight lies in the Yes page, where males outnumber females by 30%,
something which also resonated with surveys done by TNS BMRB [4]. Brand managers can delve
deeper to devise appropriate communication strategies. Trend analysis shows whether any date garnered higher
interest. But more importantly, looking at time of day gives an interesting understanding of how your fans are
interacting. Depending on the subject and type of post, different patterns can be seen emerging. For example, a quick-service
restaurant page generally got higher interactions 2 hours prior to lunch time; taking propagation
time into consideration, the timing of communications can be planned accordingly. Now let's quickly move to network mapping.
FANCY CHARTS ARE COMING. Pay attention!
The pictures that follow show all interactions that have taken place on these three brand pages (last 100
posts). Let's start with fig 4. It shows all people (red circles) who have commented on the last 100 posts, while blue
circles represent FB updates (communications put out by the brand). Whenever a person interacts with a FB post, the two get
linked by an orange line, and the more that person interacts with the same post, the thicker the line becomes. The more a person has interacted
overall, the bigger the red circle, and the more interactions happen on a FB post, the bigger it becomes. For example,
as shown in fig 4, user ABC interacted 1478 times while user XYZ interacted 2 times, so the sizes change
proportionately. The post marked David Bowie got the highest combined number of comments, shares and likes compared to other
posts, hence the bigger size. The same explanation holds for fig 5 (Yes page), except that here the links are blue, and for
fig 6 (HIMYM page), where posts are orange, interactors violet and links yellow.
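For the curious, the data behind such a map is simple to assemble. The sketch below is a minimal illustration in plain Python, assuming the interactions are available as (user, post) pairs; the function name and the toy log are hypothetical, and drawing the picture itself would need a graph-visualization tool on top of this.

```python
from collections import Counter


def build_interaction_map(interactions):
    """Turn a log of (user_id, post_id) interactions into map ingredients.

    Returns three Counters:
      edge_weight[(user, post)] -> line thickness (repeat interactions)
      user_size[user]           -> size of the person's circle
      post_size[post]           -> size of the post's circle
    """
    edge_weight = Counter()
    user_size = Counter()
    post_size = Counter()
    for user, post in interactions:
        edge_weight[(user, post)] += 1
        user_size[user] += 1
        post_size[post] += 1
    return edge_weight, user_size, post_size


if __name__ == "__main__":
    # Toy, made-up log: one very active user and one occasional user.
    log = [("abc", "post1"), ("abc", "post1"), ("abc", "post2"), ("xyz", "post1")]
    edges, users, posts = build_interaction_map(log)
    print(edges.most_common())   # thickest lines first
    print(users.most_common(1))  # most active interactor
    print(posts.most_common(1))  # most engaged post
```

Sorting `user_size` is also a quick way to pull out the extremely active interactors before looking at what they are saying.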
Apart from the fact that these pics are beautiful, what insight do they give? To start with, they give a complete snapshot of all
interactions happening on your page. Once the map is created, you can analyze which post got the highest engagement and
see how the people who commented on it interacted with other posts. One can filter out posts on a specific theme and see whether
any pattern emerges from them. The map also gives a quick view of communications which generated
a high number of interactions like Facebook likes but fewer comments, helping you understand
communications based on fan behavior.
It also shows that the No camp (BT page) has lots of extremely active interactors while the Yes page doesn't have any. This can
be positive or negative depending on which side they represent. Having said that, one can filter out such highly active,
influential users to check whether they are bots (you never know!) and, if not, whether a specific theme emerges from their
conversations. Are they supportive of the brand and, if not, do they have genuine concerns which need to be addressed? If
handled properly, these users can turn out to be much more reliable (& cheaper) brand advocates, given that they are
common users and are perceived to have higher
credibility in certain scenarios.
Often fans comment on your brand communications and
those comments generate a large number of likes, showing
agreement among fellow fans with the idea expressed.
In figure 7, green lines show such fans, whose
comments on the No page attracted proportionally higher
positive interest. Often it can be useful to mine these
comments and see if there is a pattern emerging from
them, giving an idea of the underlying, often lost voices of
the common fans of your brand. Sometimes these comments
can also give ideas for product improvement or
spark innovation for the brand. We also looked at the likes
on each comment made by fans overall and
linked them with what the fans were speaking about, to give an
idea of what was drawing positive attention from other fans.
We did this for the No page and found that comments on
themes like nationalism and currency drew higher
interest. Such analysis, apart from helping understand the
mood of fans, can also help strategize communications. Sometimes it can also be useful to see if specific themes
are driving engagement on different updates. If you are an analyst reading this, a word of advice: the dependent variable you choose for
modeling here and the categories you derive from text mining determine how good your model will be.
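Linking likes on comments to themes can be sketched as follows. This is a minimal illustration, assuming each comment has already been tagged with one or more themes by the text-mining step; the function name and the toy figures are hypothetical and are not the numbers from the No page.

```python
from collections import defaultdict


def mean_likes_by_theme(comments):
    """Average likes received by comments mentioning each theme.

    `comments` is an iterable of (themes, likes) pairs, where `themes` is a
    set of theme labels. A comment counts toward every theme it mentions,
    since themes need not be mutually exclusive.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for themes, likes in comments:
        for theme in themes:
            totals[theme] += likes
            counts[theme] += 1
    return {theme: totals[theme] / counts[theme] for theme in totals}


if __name__ == "__main__":
    # Toy, made-up comments with like counts.
    tagged = [
        ({"currency"}, 10),
        ({"currency", "nationalism"}, 6),
        ({"oil"}, 1),
    ]
    print(mean_likes_by_theme(tagged))
```

The average is only one choice of summary; a median, or likes relative to the post's overall engagement, may be a fairer yardstick when a few comments attract most of the likes.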
One can also map themes based on fan comments to see which posts attracted the highest engagement on that
theme. Figure 8 shows a map of the Yes page where red lines show people (light blue circles here) who have talked about
a theme concerning the European Commission on specific posts. The thicker the red line, the more the comments, and the bigger the
orangeish-red circle, the more interactions have taken place on that post. When posts are on wider topics and you wish to
see how your specific topic/theme of interest is spoken of and in which context, such maps can be useful.
Moving to the next colorful picture, fig 9. What you are
seeing here is a clustering of FB posts based on fans'
interactions. For the data scientists among you: exercise caution,
because depending on which algorithm you choose, the
interpretation and the story emerging from it change. In
fig 9, all posts on the HIMYM page are clustered into 14
groups. This process is iterative and continues till you
are happy with the story emerging from your clusters. Here,
all posts and users clustered into one
group are colored the same. So the green group is the one which
talked about Barney (Neil Patrick Harris) at the Grammys,
while the fluorescent green ones are about the slap bet (only an
HIMYM fan will understand what a slap bet is) and the red
(pomegranate-colored) ones are about the season
finale. I know that you can't decipher so many colors, yet
I am telling you just for the sake of telling. Keeping it very simple,
here posts are clustered based on the common fans interacting
with them, and it is up to the analyst to decide which algorithm to use and which story to decipher. Once clustered, one can deep dive and
see if any story emerges in relation to behavioral attributes like time of posting, discussion dynamics, gender, themes of
conversation etc. and use such information to devise targeted communications. I have referred several times to
textual attributes like themes, and now is the time to look into them. Before you read ahead, stretch yourself, drink a
coffee/tea and come back with a fresh mind.
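For the data scientists who would rather read code over their coffee: one simple way to cluster posts by common fans is sketched below. The choice of method is an assumption made purely for illustration (Jaccard overlap of fan sets, a threshold, then merging linked posts); the paper does not say which algorithm produced fig 9, and a different algorithm or threshold will tell a different story.

```python
def jaccard(a, b):
    """Share of fans two posts have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_posts(post_fans, threshold=0.5):
    """Group posts whose fan sets overlap by at least `threshold`.

    `post_fans` maps post_id -> set of user_ids who interacted with it.
    Posts linked directly or through a chain of overlaps land in one group.
    """
    posts = list(post_fans)
    parent = {p: p for p in posts}

    def find(p):
        # Follow parent links to the group's representative post.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for i, p in enumerate(posts):
        for q in posts[i + 1:]:
            if jaccard(post_fans[p], post_fans[q]) >= threshold:
                parent[find(p)] = find(q)  # merge the two groups

    clusters = {}
    for p in posts:
        clusters.setdefault(find(p), set()).add(p)
    return sorted(clusters.values(), key=lambda c: sorted(c))


if __name__ == "__main__":
    # Toy, made-up fan sets for five posts.
    fans = {
        "p1": {"a", "b", "c"},
        "p2": {"a", "b"},
        "p3": {"x", "y"},
        "p4": {"x", "y", "z"},
        "p5": {"q"},
    }
    print(cluster_posts(fans))
```

Raising or lowering `threshold` is the iterative part: you keep adjusting until the groups tell a story you can defend.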
TAKE A BREAK & be back soon..
The soul of SM analysis is text mining. The richness of your analysis depends largely on the algorithms you create to
understand conversations on Facebook. Conversations on Facebook differ from those on Twitter given the absence of character
limits on Facebook. Additionally, almost all conversations on brand pages are prompted by messages put out by the brand.
Hence the importance of context to the algorithm you create.
Before starting, it's very important to understand the topics of concern around the brand, the brand's positioning and,
certainly, the communications by the brand. Here I will show text mining only for the two Scottish pages. To start with, I created a
very basic framework with three buckets: issues around the referendum,
information being mentioned in relation to it, and actions by citizens in this regard. Each of these can have different shades
of emotion mentioned in conjunction, and these are put into positive, negative, contextual and uncertainty. Positive can
include positive emotions around an issue, an action, the functioning of the govt and similar. Contextual is a type of tone which
has shades of gray across the emotional spectrum. Here every shade of emotion is preserved and never converted
into a +1/-1 like most SM software does. Moving deeper, say for issues, we create an algorithm that can decipher them explicitly,
implicitly and in context. To give an example, Alex Salmond or first minister is classified as who he is, which is quite simple,
but people often call him eck or eek, which needs to be taught to the system. Additionally, descriptors like
Westminster or London generally mean a location, but here people use them to refer to the UK govt, which needs to be modeled.
Then there are words like "ruk", which does not mean anything by itself, but Scots use it to denote the rest of the UK. Taking a diversion to quote an
example: I was once working on campaign monitoring where the client had an advertisement with an elephant in it, and
people were referring to it. Machines are unable to understand that here elephants are not simply animals but are
spoken of in relation to an ad campaign. Such customizations can make a major difference in picking up the rich insights
flowing past your feet on social media but left unnoticed.
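Teaching the system such nicknames and context-specific meanings can start with something as plain as an alias table. The sketch below is a deliberately naive illustration: the table holds only the examples mentioned above, the labels "UK government" and "rest of UK" and the function name are hypothetical, and treating every mention of London as the UK govt is an assumption that holds only in this referendum context.

```python
import re

# Alias -> what it stands for on these pages (examples taken from the text;
# the mapping is context-specific and would need to be built per study).
ALIASES = {
    "alex salmond": "Alex Salmond",
    "first minister": "Alex Salmond",
    "eck": "Alex Salmond",
    "eek": "Alex Salmond",
    "westminster": "UK government",
    "london": "UK government",
    "ruk": "rest of UK",
}


def tag_entities(comment, aliases=ALIASES):
    """Return the set of entities a comment refers to, explicitly or by nickname."""
    text = comment.lower()
    found = set()
    for alias, entity in aliases.items():
        # Whole-word match, so "eck" does not fire on "check" or "neck".
        if re.search(r"\b" + re.escape(alias) + r"\b", text):
            found.add(entity)
    return found


if __name__ == "__main__":
    print(tag_entities("Eck wants a deal with rUK, but Westminster says no"))
    print(tag_entities("Check your neck"))
```

Real comments need more than a lookup (misspellings, sarcasm, the elephant in the ad), which is where the contextual rules come in; the table is just the first layer.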
The BT page had around 75k+ conversations and thousands of
descriptors created for the various issues talked about.
The thematic maps shown in fig 10 categorize these
conversations into a manageable number of themes which
are precise and concise, giving a one-snapshot view of all the
chatter. In the thematic map, the bigger the circle, the more that
theme has been discussed. For the analysts reading this,
please note that these themes are not mutually exclusive
and exhaustive. They are created keeping in mind
the objectives of the study. Talking about the two Scottish pages,
there isn't much difference between the two camps, given that
currency, EU membership, benefits, oil, banking etc. appear
to be the most important issues. There are other things
which people also talk about, like political parties,
politicians, government etc., & fig 11 shows a map encompassing
these additional themes (for the BT page). Apart from giving an
idea of the chatter on the brand page, one can understand the
difference between what is being broadcast on the page and how fans are responding. One can also see whether the themes
coming up are in line with the brand's overall strategy or not.
A word about methodology: the themes are based on
the descriptors talked about, which are
categorized using a host of NLP (Natural
Language Processing) techniques; we also
build specific pattern-recognition rules into
the algorithm to suit the objectives of the
study. An important point to note is that
the themes you are seeing are based on the overall FB
page, so a portion of the
conversations is left uncategorized because
they are too specific to the FB update they
are commenting on. Hence, we also look
at the posts with the highest
engagement, individually, to decipher
themes. This improves understanding,
given the refined context being added.
Taking a step back, in fig 11 there are
themes related to political organizations,
politicians etc. Each of these themes has
several sub-themes. The sub-themes for the
politician theme are shown in fig 12 (Yes camp shown). A look at both pages showed that this referendum is all about Alex
Salmond. These sub-themes help deep dive into the top-level themes, and that's where several nuts and bolts of an effective
communication strategy lie.
This brings me to inter-themes. A lot of the time, people
speak about different topics in relation to each other. We
mine for such mentions to find the inter-relations between different
themes and understand the strength of each. We call
the result an inter-thematic map, which is shown next (fig 13).
Shown here is an inter-thematic map for the Issue-Currency
theme, created for the BT page. Here, whenever any other
theme is spoken of alongside currency, it gets linked to Issue-Currency,
and the more often this occurs, the thicker and darker the
link becomes. If you look at the pic, there are obvious themes
like banking and the Bank of England (the central bank of the UK), but
other themes like SNP, Alex Salmond, debt and EU
membership are also spoken about. Additionally,
though weak, there are also instances of 3 issues being spoken of in conjunction. Such inter-thematic maps help understand
how different dimensions of a brand are perceived in conjunction with each other. Diving deeper, one can
understand the tones inside such inter-connected themes. Often, minor aspects (sub-themes) of a brand (or your
product offering) have a strong linkage with important aspects of your offering but are left unnoticed. Such maps
help mine such linkages.
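The counting behind an inter-thematic map can be sketched as theme co-occurrence. This is a minimal illustration, assuming every comment has already been tagged with its set of themes; the function names and the toy comments are hypothetical, and only pairs are counted here (the weaker three-way links would need triples).

```python
from collections import Counter
from itertools import combinations


def theme_cooccurrence(tagged_comments):
    """Count how often each pair of themes appears in the same comment."""
    pair_counts = Counter()
    for themes in tagged_comments:
        # Sort so that (a, b) and (b, a) are counted as the same link.
        for a, b in combinations(sorted(set(themes)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def links_for(theme, pair_counts):
    """Link strengths between one theme and every theme mentioned with it."""
    links = {}
    for (a, b), n in pair_counts.items():
        if a == theme:
            links[b] = n
        elif b == theme:
            links[a] = n
    return links


if __name__ == "__main__":
    # Toy, made-up tagged comments.
    tagged = [
        {"currency", "banking"},
        {"currency", "banking", "debt"},
        {"currency", "SNP"},
        {"oil"},
    ]
    counts = theme_cooccurrence(tagged)
    print(links_for("currency", counts))  # the higher the count, the thicker the link
```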
This brings me to tones. I guess by now you have started flipping ahead to count the pages left in this paper, and to your delight let me tell
you that you are nearing the end of this torment. Talking about sentiments, I have never believed that sentiments coded as
+1/-1 help understand conversations meaningfully [3]. Additionally, by converting sentiments to +1/-1 we are in a sense
throwing away a lot of the richer insight offered by our fans.
With this in mind, I tried to create what I call tones. Basically, tonal
mappings try to map different aspects of your brand personality (or
objectives of interest) to the different shades of sentiment expressed.
Tonal maps are a qualitatively richer way of understanding
sentiments otherwise shown as a mere +1/-1. It does lose some
information, but it's a choice one makes between reading thousands of
conversations and one simple graph. We create such maps depending on the objective of the study.
Here, we focused on issues & personalities and mined
for related shades of positive, negative and contextual tones. Shown in
fig 14 is the entire tonal map of the Yes camp. Given the vast amount of
information shown in one picture, these maps often become too big.
Hence I have created a toned-down version of a similar tonal map for the BT
page, shown in fig 15.
Here blue and light blue circles represent issues and persons respectively, while red, green and grey represent
positive, negative and contextual tones respectively. The thicker the line and the darker its grey, the more strongly that
issue/person is associated with that shade of tone. So currency union is called excellent, the right thing to do and of interest,
while it has a contextual tone of not independent (said in relation to the Bank of England). Alex Salmond is associated with
dishonesty. Such tonal maps give a quick snapshot of the different tones linked with different issues, giving a richer feel
for opinion around different aspects of the brand, mined in accordance with the objective of the study.
And that's it, folks. These were a few
of my thoughts on how FB pages can
be analyzed. Feel free to correct me if
you think I am wrong, or advise me on how to
improve, or, in case you would like to
send me a LIKE for this work, do not
hesitate. I have often said that social
media analytics, specifically in the context
of marketing research, is new, and
hence the more conversations
happen, the more experiments we
undertake and the more failures we
experience, the better our chance of
succeeding in this subject. With this,
it's time to stop wasting time reading
my paper and start doing something
useful.
REFERENCES:
[1] http://en.wikipedia.org/wiki/Graph_theory
[2] http://www.wired.com/insights/2014/03/graph-theory-key-understanding-big-data/
[3] http://www.kdnuggets.com/2014/02/social-media-analysis-what-is-missing.html
[4] http://www.tns-bmrb.co.uk/uploads/files/scotland-september-18-poll-data-tables_1392031949.pdf
About me (Preriit Souda):
This is one of the most difficult questions I get asked and, having been asked so many times, I have prepared a short
paragraph which is copy-pasted below. I am easily accessible at my email id: [email protected] or at my
landline number: +44 (0) 20 7656 5846. I could have given my cell phone number, but I am not sure how good you are at working
out the difference between your time zone and British time. Anyway, you can also contact me at
http://www.linkedin.com/in/preritsouda or @preriit2131.
Preriit Souda works closely with clients and researchers in TNS (London), analyzing social media data, passively collected
and survey data to answer key research questions. Preriit has a background in engineering (SPU & UCLA) and
Management Sciences and was a Research Assistant at SMU, Dallas. Preriit won ESOMAR Young Researcher of the Year
(2011), was nominated as one of the New Faces of Engineering 2012 (IEEE USA) and won the Best Analytics paper award
at the MRSI-Annual Conference 2012. His most recent research was published in the CASRO 2014 journal. He has spoken
at ESOMAR, MRSI (India), IIEX North America, IIEX Europe, WARC, MRMW, ASC etc. Preriit blogs at ESOMAR &
KDNuggets.