Text, Content, and Social Analytics: BI for the New World
DESCRIPTION
Presentation by Seth Grimes to the TDWI Washington DC chapter, July 15, 2011
TRANSCRIPT
![Page 1: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/1.jpg)
Text, Content, and Social Analytics: BI for the New World
Seth Grimes
Alta Plana Corporation
@sethgrimes
TDWI – Washington DC
July 15, 2011
![Page 2: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/2.jpg)
Text, Content & Social Analytics
Table of Contents:
1. Principles.
2. Perspectives.
3. Semantics.
4. Text/content analytics.
5. Social.
6. BI for the New World.
![Page 3: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/3.jpg)
Imperatives for the 2010s:
Do more with more.
“It’s Not Information Overload. It’s Filter Failure”: Clay Shirky, 2008.
• More sources & types of data.
• Greater data volumes.
• New hardware and methods.
Automate more, more intelligently.
• Analytics.
• Semantics.
Engage. Socialize.
![Page 4: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/4.jpg)
I see three categories of data:
1. Quantities, whether measured, observed, or computed.
2. Content, which I’ll characterize as non-quantitative information.
3. Metadata (semantic & structural) describing quantities and content.
• Our concern is content, analytics & fusion.
• Structured/unstructured is a false dichotomy.
• Where do relationships fit?
![Page 5: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/5.jpg)
DW & BI relate numbers...
...but by-the-numbers BI doesn’t explain.
![Page 6: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/6.jpg)
Questions for business (& government):
What are people saying? What’s hot/trending?
What are they saying about {topic|person|product} X?
... about X versus {topic|person|product} Y?
How has opinion about X and Y evolved?
How has opinion correlated with {our|competitors’|general} {news|marketing|sales|events}?
What’s behind opinion, the root causes?
Who are opinion leaders?
How does sentiment propagate across multiple channels?
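A question such as "How has opinion about X and Y evolved?" can be sketched, very roughly, as sentiment scores averaged per topic and month. The following is a minimal illustration only: the word lists, the brand names Acme and Globex, and the sample posts are all invented for the example, and a tiny lexicon stands in for the linguistic and statistical models a real text-analytics tool would use.

```python
from collections import defaultdict

# Hypothetical toy lexicon; real systems use far richer sentiment models.
POSITIVE = {"love", "great", "good", "fast"}
NEGATIVE = {"hate", "bad", "slow", "broken"}

def tokens(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(".,!?").lower() for w in text.split()]

def score(text):
    """Positive-word count minus negative-word count for one post."""
    words = tokens(text)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def opinion_trend(posts, topics):
    """Average post-level score per (topic, month) for posts naming the topic."""
    totals = defaultdict(lambda: [0, 0])  # (topic, month) -> [score sum, post count]
    for date, text in posts:
        month = date[:7]  # dates are assumed to be "YYYY-MM-DD" strings
        words = set(tokens(text))
        for topic in topics:
            if topic.lower() in words:
                entry = totals[(topic, month)]
                entry[0] += score(text)
                entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Invented sample posts about two hypothetical brands.
posts = [
    ("2011-05-03", "I love Acme, great battery"),
    ("2011-05-20", "Acme is slow and broken"),
    ("2011-06-02", "Globex is good"),
    ("2011-06-15", "Acme is great, Globex is bad"),
]
trend = opinion_trend(posts, ["Acme", "Globex"])
```

Note that the last post praises one brand and criticizes the other, yet both receive the same post-level score. Attributing sentiment to the right entity is one of the harder parts of the problem and is not attempted here.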
![Page 7: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/7.jpg)
The answers are here...
But how do you get at them?
![Page 8: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/8.jpg)
“In this example, you can quickly see that the Drooling Dog Bar B Q has gotten lots of positive reviews, and if you want to see what other people have said about the restaurant, clicking this result is a good choice.”
-- http://googleblog.blogspot.com/2009/05/more-search-options-and-other-updates.html
“In the recap of [Searchology] from Google’s Matt Cutts, he tells us that: ‘If you sort by reviews, Google will perform sentiment analysis and highlight interesting comments.’”
-- Bill Slawski, “Google's New Review Search Option and Sentiment Analysis,” http://www.seobythesea.com/?p=1488
![Page 9: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/9.jpg)
Text Analytics!
More generally...
![Page 10: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/10.jpg)
Analytics is a collection of tools and techniques that extract insights from data. Apply or embed analytics within business contexts – collect data and information about customers, markets, suppliers, and business processes – use results to inform, drive, and optimize business decision making – and you harness analytics as a core BI asset.
![Page 11: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/11.jpg)
http://www.tropicalisland.de/NYC_New_York_Brooklyn_Bridge_from_World_Trade_Center_b.jpg
$x(t) = t$
$y(t) = \tfrac{1}{2}\, a \left(e^{t/a} + e^{-t/a}\right) = a \cosh(t/a)$
http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg
Analytics seeks structure in “unstructured” sources.
![Page 12: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/12.jpg)
“Statistical information derived from word frequency and distribution is used by the machine to compute a relative measure of significance, first for individual words and then for sentences.”
-- H.P. Luhn, The Automatic Creation of Literature Abstracts, IBM Journal, 1958.
Text analytics models text.
http://wordle.net
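The idea in the Luhn quote, word significance from frequency first and sentence significance second, can be illustrated with a short sketch. This is a simplification for illustration and not Luhn's exact scoring: it averages raw word frequencies per sentence, and the stopword list and function names are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "to", "in", "it", "on", "for"}

def content_words(text):
    """Lowercased alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def rank_sentences(text):
    """Score each sentence by the mean document frequency of its content words."""
    freq = Counter(content_words(text))  # word-level significance
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    scored = []
    for sentence in sentences:
        words = content_words(sentence)
        value = sum(freq[w] for w in words) / len(words) if words else 0.0
        scored.append((value, sentence))
    # Highest-scoring sentences first; these are the abstract candidates.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

sample = ("Text analytics models text. Analytics finds structure in text. "
          "The weather is nice.")
ranked = rank_sentences(sample)
```

Taking the top one or two entries of `ranked` gives a crude extractive abstract of the sample text.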
![Page 13: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/13.jpg)
Document input and processing
Knowledge handling is key
Desk Set (1957): Computer engineer Richard Sumner (Spencer Tracy), television network librarian Bunny Watson (Katharine Hepburn), and the "electronic brain" EMERAC.
Hans Peter Luhn, “A Business Intelligence System,” IBM Journal, October 1958
![Page 14: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/14.jpg)
“This rather unsophisticated argument on ‘significance’ avoids such linguistic implications as grammar and syntax... No attention is paid to the logical and semantic relationships the author has established.”
-- H.P. Luhn
![Page 15: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/15.jpg)
My 2009 text-analytics market survey asked, “[What information] do you need (or expect to need) to extract or analyze?” Responses:
• Named entities – people, companies, geographic locations, brands, ticker symbols, etc.: 71%
• Topics and themes: 65%
• Sentiment, opinions, attitudes, emotions: 60%
• Concepts, that is, abstract groups of entities: 58%
• Events, relationships, and/or facts: 55%
• Metadata such as document author, publication date, title, headers, etc.: 53%
• Other entities – phone numbers, e-mail & street addresses: 40%
• Other: 15%
Source: Text Analytics 2009: User Perspectives on Solutions and Providers
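Of the information types in the survey, pattern-based entities such as phone numbers and e-mail addresses are the simplest to extract, and a sketch makes the idea concrete. The regular expressions below are illustrative assumptions (a US-style phone format and a loose e-mail pattern), not a complete or authoritative specification. Named entities, sentiment, and concepts generally need linguistic or statistical models rather than patterns.

```python
import re

# Loose, illustrative patterns; real-world formats vary far more widely.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")  # assumes US-style numbers

def extract_entities(text):
    """Return pattern-based entities found in the text, grouped by type."""
    return {
        "email": EMAIL.findall(text),
        "phone": PHONE.findall(text),
    }

# Invented sample sentence.
sample = "Contact Jane at jane.doe@example.com or 202-555-0143."
found = extract_entities(sample)
```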
![Page 16: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/16.jpg)
![Page 17: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/17.jpg)
From document to DB; an IBM example: “The standard features are stored in the STANDARD_KW table, keywords with their occurrences in the KEYWORD_KW_OCC table, and the text list features in the TEXTLIST_TEXT table. Every feature table contains the DOC_ID as a reference to the DOCUMENT table.”
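Once extracted features sit in tables keyed by DOC_ID, ordinary SQL and BI tooling can work with them. The sketch below uses Python's built-in sqlite3 module to mimic that layout. The table names DOCUMENT and KEYWORD_KW_OCC and the DOC_ID key come from the quote; every other column name and all of the rows are hypothetical, and this is not the IBM product's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- TITLE, KEYWORD, and OCCURRENCES are assumed column names for illustration.
CREATE TABLE DOCUMENT (DOC_ID INTEGER PRIMARY KEY, TITLE TEXT);
CREATE TABLE KEYWORD_KW_OCC (
    DOC_ID INTEGER REFERENCES DOCUMENT(DOC_ID),
    KEYWORD TEXT,
    OCCURRENCES INTEGER
);
""")

# Invented sample documents and keyword counts.
conn.executemany(
    "INSERT INTO DOCUMENT VALUES (?, ?)",
    [(1, "Service call notes"), (2, "Warranty claim")],
)
conn.executemany(
    "INSERT INTO KEYWORD_KW_OCC VALUES (?, ?, ?)",
    [(1, "battery", 3), (1, "refund", 1), (2, "battery", 2)],
)

# A BI-style aggregate: per keyword, how many documents and total occurrences.
rows = conn.execute("""
    SELECT k.KEYWORD,
           COUNT(DISTINCT d.DOC_ID) AS docs,
           SUM(k.OCCURRENCES) AS total
    FROM KEYWORD_KW_OCC k
    JOIN DOCUMENT d ON d.DOC_ID = k.DOC_ID
    GROUP BY k.KEYWORD
    ORDER BY total DESC, k.KEYWORD
""").fetchall()
```

Because every feature table carries DOC_ID, the same join pattern would extend to the other feature tables and to conventional warehouse dimensions attached to the document.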
![Page 18: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/18.jpg)
Ken Jennings, IBM Watson, and Brad Rutter play Jeopardy!
https://secure.wikimedia.org/wikipedia/en/wiki/File:Watson_Jeopardy.jpg
Welcome to the New World.
The Far Side by Gary Larson
![Page 19: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/19.jpg)
Diagram labels: Search, BI, Text Analytics, Semantic search, Information Access, Integrated analytics.
In a sense, text analytics, by generating semantics, bridges search and BI to turn Information Retrieval into Information Access.
![Page 20: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/20.jpg)
Have we arrived?
2001: A Space Odyssey, Stanley Kubrick
![Page 21: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/21.jpg)
http://www.businessweek.com/magazine/content/04_19/b3882029_mz072.htm
En route.
![Page 22: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/22.jpg)
Intelligent computing involves:
Big (and little) Data.
• Quantities.
• Content.
• Metadata.
Analytics.
Semantics.
Integration.
Inference.
![Page 23: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/23.jpg)
Semantics enables better content production, management & use.
Semantics captures meaning, relationships, context, and understanding: the sense of “unstructured” online, social, and enterprise information, for content consumers and publishers.
Semantics unites data of all types.
![Page 24: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/24.jpg)
Content, composites, connections.
![Page 25: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/25.jpg)
Content, composites, connections, 2.
![Page 26: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/26.jpg)
Content, composites, connections, 3.
![Page 27: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/27.jpg)
From connections to influence: What’s wrong with these pictures? (Radian6, Sysomos, Klout)
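One way to see how a connection graph turns into an influence number is a PageRank-style score over who mentions whom. The sketch below is an illustration under stated assumptions: it is not how Radian6, Sysomos, or Klout compute their scores, and the account names and mention edges are invented. It also shows the limitation implied by the slide's question, since the score reflects only link structure and nothing about what was said.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline "teleport" share.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A node with no outgoing links spreads its rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

# Invented mention graph: (who mentions, who is mentioned).
mentions = [("ann", "cat"), ("bob", "cat"), ("cat", "dan"), ("dan", "cat")]
scores = pagerank(mentions)
top_account = max(scores, key=scores.get)
```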
![Page 28: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/28.jpg)
Social analytics:
1. Use social data in analyses (alongside enterprise & online information).
• Content.
• Connections.
2. Bring BI to social analyses.
3rd & 4th senses of social analytics:
3. Adopt agile, collaborative methods.
4. Share your data.
A challenge: Enterprise-social-online data integration.
![Page 29: Text, Content, and Social Analytics: BI for the New World](https://reader038.vdocuments.us/reader038/viewer/2022103111/54c65b674a7959e9438b45f3/html5/thumbnails/29.jpg)