Combining Social Data and Semantic Content Analysis for L’Aquila Social Urban Network
Cataldo Musto, Giovanni Semeraro, Marco de Gemmis, Pasquale Lops (Università degli Studi di Bari ‘Aldo Moro’, Italy - SWAP Research Group)
I-CiTies 2015: 2015 CINI Annual Workshop on ICT for Smart Cities and Communities
Palermo (Italy) - October 29-30, 2015
April 6, 2009
5.8 magnitude earthquake
20 billion in damages
70,000 people displaced
309 people died
Cataldo Musto, Giovanni Semeraro, Marco de Gemmis, Pasquale Lops Combining Social Data and Semantic Content Analysis for L’Aquila Social Urban Network. iCities 2015 Workshop, Palermo (Italy) 29.10.2015
L’Aquila
2015: six years later
7 billion in funding still needed
22,000 people still displaced
Diaspora
19 ‘new towns’ around L’Aquila
15,200 people live there today
What about the consequences?
Loss of trust, sense of belonging, relationships
Loss of social capital
L’Aquila Social Urban Network
Our contribution!
Research Question: Is it possible to extract and process social media content to monitor, in real time, people’s feelings, opinions and sentiments about the current state of the social capital of L’Aquila?
CrowdPulse: a framework for real-time Semantic Analysis of Social Streams
CrowdPulse features:
- Social Data Extraction
- Semantic Tagging
- Sentiment Analysis
- Processing & Visualization
CrowdPulse workflow
Step 1: Social Data Extraction
Extraction is configured by Source and Heuristics:
- Twitter: Content, User, Geo, Content+Geo
- Facebook: Page, Group
Example hashtags: #www2015, #icities2015, #democrats, #traffic, #earthquake
Example users: @barack_obama, @comunefi, @comunepalermo
We only extract public content.
18
Use CaseL’Aquila Social Urban Network
Heuristics: - Twitter users (local newspapers, mention to politicians) - Twitter content+geo (50km around l’Aquila and/or specific hashtags as #laquila #earthquake, etc)
CROWDPULSE SETTINGS
Cataldo Musto, Giovanni Semeraro, Marco de Gemmis, Pasquale Lops Combining Social Data and Semantic Content Analysis for L’Aquila Social Urban Network. iCities 2015 Workshop, Palermo (Italy) 29.10.2015
19
Use CaseL’Aquila Social Urban Network
CROWDPULSE SETTINGS
Heuristics: - Facebook groups (identified after a thorough analysis) - Facebook pages (identified after a thorough analysis)
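A minimal sketch of how the Twitter content+geo heuristic could be expressed, assuming posts arrive as dictionaries with a `text` field and optional `coords` (latitude, longitude). The field names, the city-centre coordinates and the helper names `haversine_km` and `matches` are illustrative assumptions, not part of CrowdPulse.

```python
import math

# Approximate coordinates of L'Aquila city centre (assumed for illustration).
LAQUILA = (42.3498, 13.3995)
RADIUS_KM = 50.0
HASHTAGS = {"#laquila", "#earthquake"}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def matches(post):
    """Keep a post geotagged within the radius and/or using a tracked hashtag."""
    coords = post.get("coords")
    near = coords is not None and haversine_km(coords, LAQUILA) <= RADIUS_KM
    # Hashtags are compared case-insensitively, ignoring trailing punctuation.
    tags = {w.lower().rstrip(".,!?") for w in post["text"].split() if w.startswith("#")}
    return near or bool(tags & HASHTAGS)
```

The "and/or" in the heuristic is read here as a logical OR; a stricter deployment could require both conditions.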
Extracted content (example):
- Tweets about the fear of new earthquakes.
- Facebook posts about citizens’ proposals.
- Tweets about people worried about the situation.
- Tweets about new buildings in the city.
Sentiment Analysis and Semantic Tagging of the content
Semantic Tagging: Motivations
Keyword-based representation introduces a lot of noise in the analysis.
The Italian word “aquila” is ambiguous: the eagle, or the Italian city?
“Fate qualcosa per favore, l’Aquila sta morendo!” can be read in two ways:
- Please, do something: L’Aquila is dying!
- Please, do something: the eagle is dying!
Step 2: Semantic Tagging
Non-trivial NLP tasks (stopword removal, n-gram identification, named entity recognition and disambiguation) are performed automatically.
Entities mentioned in the text are identified and disambiguated.
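To make the disambiguation idea concrete, here is a toy sketch based on context-word overlap (a simplified Lesk-style heuristic). It is not the method used by CrowdPulse, which the slides do not specify; the sense inventory `SENSES` and the function names are hypothetical.

```python
import re

# Hypothetical sense inventory: each candidate entity has a small bag of context words.
SENSES = {
    "L'Aquila (city)": {"città", "terremoto", "centro", "ricostruzione", "abruzzo", "cittadini"},
    "aquila (eagle)": {"uccello", "rapace", "ali", "nido", "volo", "montagna"},
}


def tokenize(text):
    """Lowercase the text and return the set of alphabetic tokens."""
    return set(re.findall(r"[a-zàèéìòù]+", text.lower()))


def disambiguate(text, senses=SENSES):
    """Return the sense sharing most context words with the text, or None if none overlap."""
    tokens = tokenize(text)
    scores = {sense: len(tokens & words) for sense, words in senses.items()}
    best = max(scores, key=scores.get)  # on ties, the first sense listed wins
    return best if scores[best] > 0 else None
```

A sentence with no supporting context words returns `None`, which illustrates why pure keyword matching is not enough and richer knowledge is needed.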
Step 3: Sentiment Analysis
Sentiment Analysis: Motivations
Is this content conveying any opinion?
This is a crucial issue if people-based findings are to be generated.
Sentiment Analysis: Definition
“It is the field of study that analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes” (*)
(*) Pang, Bo, and Lillian Lee. "Opinion mining and sentiment analysis." Foundations and Trends in Information Retrieval, 2008.
We concentrated on the polarity detection task.
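As an illustration of polarity detection, the following sketch uses a tiny lexicon with one-token negation handling. The lexicon, its scores and the function names are made up for the example; the slides do not say which polarity algorithm CrowdPulse applies.

```python
import re

# Tiny hypothetical polarity lexicon (word -> score in [-1, 1]).
LEXICON = {"good": 1.0, "great": 1.0, "hope": 0.5,
           "worried": -0.5, "fear": -1.0, "bad": -1.0}
NEGATIONS = {"not", "no", "never"}


def polarity_score(text):
    """Sum lexicon scores; a negation flips only the token right after it."""
    total, negate = 0.0, False
    for tok in re.findall(r"[a-z']+", text.lower()):
        if tok in NEGATIONS:
            negate = True
            continue
        if tok in LEXICON:
            total += -LEXICON[tok] if negate else LEXICON[tok]
        negate = False
    return total


def polarity(text):
    """Map the numeric score to a positive / negative / neutral label."""
    score = polarity_score(text)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

For instance, `polarity("I fear a new earthquake")` yields `"negative"`, while text with no lexicon words is labelled `"neutral"`.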
Overall sentiment: :-(
The process can be iterated over a larger set of content to obtain findings about the feelings of the population regarding a certain topic.
Step 4: Processing & Visualization
How can each piece of content be mapped to the social indicator it refers to?
Given a fixed set of social capital indicators, we built a classification model to associate each piece of content (along with its sentiment) with the social indicator it refers to.
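The slides do not state which classification model was used. As one possible sketch, a small multinomial Naive Bayes classifier can map a text to an indicator. The indicator labels (`trust`, `sense_of_belonging`, `relationships`) are borrowed from the consequences listed earlier and, like the training sentences in `TRAIN`, are illustrative assumptions.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        n_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / n_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled examples, one pair of sentences per indicator.
TRAIN = [
    ("I do not trust the institutions anymore", "trust"),
    ("politicians keep breaking their promises", "trust"),
    ("I miss my old neighbourhood and my city", "sense_of_belonging"),
    ("this city is still my home", "sense_of_belonging"),
    ("we lost touch with our friends and neighbours", "relationships"),
    ("meeting old friends in the square again", "relationships"),
]
model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
```

With this toy training set, `model.predict("can we trust the politicians")` returns `"trust"`. A real deployment would need far more labelled data per indicator.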
Social Capital Mapper (example: a tweet about new buildings in the city)
- Input: social indicators + classification model
- Domain-specific processing: classification task
- Output: (multi-class) classification + sentiment
The score of a social indicator is the average sentiment of all the content referring to it.
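The scoring rule can be sketched in a few lines, assuming each classified item is an `(indicator, sentiment)` pair with sentiment encoded numerically (for example -1, 0, +1). The encoding and the function name `indicator_scores` are assumptions made for illustration.

```python
from collections import defaultdict


def indicator_scores(items):
    """Return indicator -> mean sentiment over all items referring to it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for indicator, sentiment in items:
        totals[indicator] += sentiment
        counts[indicator] += 1
    return {ind: totals[ind] / counts[ind] for ind in totals}


# Hypothetical classified content: (indicator, sentiment) pairs.
items = [("trust", -1), ("trust", 1), ("trust", -1), ("relationships", 1)]
scores = indicator_scores(items)
```

Here `trust` averages to about -0.33 (two negative items, one positive) and `relationships` to 1.0.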
CROWDPULSE OUTPUT
Overall score of the social indicators between March and August 2014
Real-world application of the output: the community promoter
- defines some initiatives to empower the social capital
- monitors the state of the social indicators
Lessons Learned
Definition of a framework for real-time semantic content analysis
- Pipeline of state-of-the-art techniques: Semantic Processing, Sentiment Analysis, Machine Learning, Data Visualization
- Use Case: L’Aquila Social Urban Network
Thanks to the huge availability of textual data, very complex phenomena can be analyzed in a totally new way.