Copyright © 2012, SAS Institute Inc. All rights reserved.
Social Networks in Data Mining: Challenges and Applications SAS Talks May 10, 2012
PLEASE STAND BY
Today’s event will begin at 1:00pm EST.
The audio portion of the presentation will be heard through your computer speakers.
This is an automatic setup and is preferred. There will also be a limited option to listen
through the telephone to 250 lines.
If you would prefer to dial in, please call:
US Toll-Free: 1-888-682-4285
Toll/International: +1-973-368-0695
Conference Code: 4675179#
If you experience any technical difficulties,
you may contact WebEx Technical Support
at 866-229-3239.
#sastalks
Speakers
Stacy Hobson
Director, Customer Loyalty and Retention SAS Institute
Bart Baesens
Associate Professor, K.U. Leuven (Belgium)
Lecturer, University of Southampton (United Kingdom)
Social Networks in Data Mining: Challenges and Applications
Prof. dr. Bart Baesens (1)
Dr. Wouter Verbeke (2)
(1, 2) Department of Decision Sciences and Information Management, K.U.Leuven (Belgium)
(1) Vlerick Leuven Ghent Management School (Belgium)
(1) School of Management, University of Southampton (United Kingdom)
{Bart.Baesens;Wouter.Verbeke}@econ.kuleuven.be
Twitter: DataMiningApps
Facebook: Data Mining with Bart
My Research Team
• process mining
• business process management
• data mining
• (social) network analysis
• incorporating domain knowledge in classification models
• customer churn prediction
• data quality in a credit risk management context
• data quality and decision making
• data quality metrics
• customer churn prediction
• social network analysis
• profit based data mining
• credit risk modeling and scoring
• rating transitions
• microfinance
• survival analysis
• machine learning in software engineering: software fault & effort prediction
• comprehens. decision supportive data modeling systems
Overview
• Revisiting Traditional analytics
• Improving Traditional analytics
• Social networks and applications
• A three-layered social network learner
• Case study: social networks in Telco
– Markov assumption
– Local versus Network variables
– Featurization
– Empirical Findings
• Conclusions
Revisiting Traditional Analytics
Traditional Analytics: Performance benchmarks
Improving Traditional Analytics: 2 strategies
• Strategy 1: Use complex modeling techniques
– E.g. neural networks, support vector machines, random forests, …
– Pro: powerful models (e.g. universal approximation)
– Con: loss of interpretability, marginal performance gains
• Strategy 2: Enrich your data
– External data (FICO score, bureau data, …)
– Social Network data!
– Pro: model still interpretable
– Con: additional resources needed (economic, computational)
Traditional Approach to Analytics
Social Networks: Nodes versus Edges
• Nodes
– Customer (private/professional), household/family, patient, doctor, paper, author, terrorist, Web page, …
• Edges
– Different kinds of relationships, e.g., colleagues, friends, patients, disease, contact, reference, …
– Weighted based on, e.g., interaction frequency, importance of information exchange, intimacy, emotional intensity, …
Example Social Network Applications
• Churn detection in a Telco setting
– Nodes are customers
– Edges are calling patterns between customers (based on CDR data)
• System risk in a Credit Risk setting
– Nodes are banks
– Edges are liquidity dependencies
• Anti-Money Laundering
– Nodes are bank accounts
– Edges are money transfers
• Viral marketing
– Nodes are customers
– Edges are messages
Social Network Analytics: Challenges
• Finding the right balance between local, customer specific versus network information
– It’s not all in the network!
• Need procedures to infer the behavior of all nodes simultaneously
– Collective inference procedures (e.g. Gibbs sampling)
• No easy separation in training and test set
– Cannot just cut the network in two!
– Out-of-time validation needed
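Sketch: an out-of-time split keeps the whole network intact and shifts the observation window instead of cutting nodes out. The record layout and the names `calls`, `churned_in` and `build_graph` are hypothetical and only serve as illustration; they are not taken from the case study.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee, month, seconds).
calls = [
    ("A", "B", 1, 120), ("B", "C", 1, 60), ("A", "C", 2, 30),
    ("C", "D", 2, 300), ("A", "B", 3, 45), ("D", "E", 3, 90),
]
# Hypothetical churn observations: month -> customers who churned in that month.
churned_in = {3: {"C"}, 4: {"E"}}


def build_graph(records, months):
    """Aggregate seconds called per undirected customer pair over the given months."""
    weights = defaultdict(int)
    for caller, callee, month, seconds in records:
        if month in months:
            weights[frozenset((caller, callee))] += seconds
    return dict(weights)


# Training: network built from months 1-2, labels observed in month 3.
train_graph = build_graph(calls, {1, 2})
train_labels = churned_in[3]

# Out-of-time test: the same network one month later, labels from month 4.
# No customer is cut out of the network to form the test set.
test_graph = build_graph(calls, {2, 3})
test_labels = churned_in[4]
```

Both graphs cover the full customer base; only the time window differs, which avoids having to cut edges between a training part and a test part of the network.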
Out-of-Sample versus Out-of-Time Validation
[Figure: network nodes with unknown labels (?) shown along a Time axis]
A Three-Layered Social Network Learner
• Local model
– Only uses local (e.g., customer specific) information
– E.g. socio-demographic, RFM, customer interaction, …
– Can be estimated using e.g. logistic regression, decision trees, …
• Network model
– Takes into account the network information
• Collective inference
– Determines how the nodes mutually influence each other
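Sketch: the three layers can be illustrated on a toy network. This is a simplified version under stated assumptions, not the learner used in the case study: the network model is a weighted average of neighbor scores, the collective inference step is a plain iterative update (not Gibbs sampling), and the blending weight `alpha`, the graph and all scores are hypothetical.

```python
# Hypothetical weighted, undirected network: node -> {neighbor: weight}.
graph = {
    "A": {"B": 3.0, "C": 1.0},
    "B": {"A": 3.0, "C": 2.0},
    "C": {"A": 1.0, "B": 2.0, "D": 4.0},
    "D": {"C": 4.0},
}
# Layer 1 (local model): hypothetical churn scores, e.g. from a logistic regression.
local_score = {"A": 0.10, "B": 0.20, "C": 0.15, "D": 0.05}
# Nodes whose class is already known (1.0 = churner) stay fixed.
known = {"D": 1.0}


def network_score(node, scores):
    """Layer 2 (network model): weighted average of the neighbors' current scores."""
    neighbors = graph[node]
    total = sum(neighbors.values())
    return sum(w * scores[n] for n, w in neighbors.items()) / total


def collective_inference(iterations=20, alpha=0.5):
    """Layer 3: update all unknown nodes simultaneously, round after round."""
    scores = {**local_score, **known}
    for _ in range(iterations):
        updated = {}
        for node in graph:
            if node in known:
                updated[node] = known[node]
            else:
                # Blend local evidence with the current network evidence.
                updated[node] = ((1 - alpha) * local_score[node]
                                 + alpha * network_score(node, scores))
        scores = updated
    return scores


scores = collective_inference()
```

In this toy example the known churner D raises the score of its neighbor C most, and the effect propagates in weaker form to B and A.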
Case Study: Social Networks in Telco
• Traditional customer churn prediction models treat customers as isolated entities
• Customers are however believed to be strongly influenced by their social environment
– Recommendations from peers, word-of-mouth publicity
– Social leader influence
– Promotions to acquire groups of friends
– Reduced tariffs for intra-operator traffic
Local Models for Churn Prediction
Constructing a social network using CDR Data
• Call Detail Records (CDR) data
– Detailed logs about each interaction involving a customer
– Gigabytes to Terabytes of data each day
– Extract the call graph using computationally efficient algorithms
– Represent call graph as sparse matrix
– Edge definition (SMS/Voice/MMS/Email/…)
181806208300809 32462208699 206105300897975 357014032645640 I 32461002530 9 MOBISTAR MOBILE 99 21JAN2010:23:45:44 0 0 0 0 2 1 1 …
195455641 32475611232 206102200262341 351913035725230 I 32476000005 10 Base SMSC Platform 99 21JAN2010:23:46:02 0 0 0 0 2 1 1 …
187097451101277 32465245451 206101100499483 356712034636630 I 32473161616 8 Proximus SMSC Platform 99 21JAN2010:23:45:44 0 0 0 0 2 1 1 …
…
From CDR data to Sparse Matrix
• Need facilities for sparse matrix handling and parallel computing
[Figure: raw CDRs are converted into a weighted network with nodes A to J and integer edge weights]
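Sketch: turning call records into a sparse, weighted call graph. The three-field rows are a hypothetical simplification of real CDRs, and the dictionary-of-dictionaries layout is only one possible sparse representation; at the data volumes mentioned here a dedicated sparse matrix library and parallel processing would be needed.

```python
from collections import defaultdict

# Hypothetical, simplified CDR rows: (caller, callee, duration in seconds).
# Real CDRs carry many more fields.
cdr_rows = [
    ("A", "B", 120), ("B", "A", 60), ("A", "C", 30),
    ("C", "D", 300), ("A", "B", 45),
]


def to_sparse_matrix(rows):
    """Return {row: {col: weight}}, storing only the nonzero entries."""
    matrix = defaultdict(lambda: defaultdict(int))
    for caller, callee, seconds in rows:
        # Treat a call as an undirected tie and accumulate seconds both ways.
        matrix[caller][callee] += seconds
        matrix[callee][caller] += seconds
    return {node: dict(neighbors) for node, neighbors in matrix.items()}


call_graph = to_sparse_matrix(cdr_rows)
stored = sum(len(neighbors) for neighbors in call_graph.values())  # nonzero cells
dense = len(call_graph) ** 2                                       # cells in a full matrix
```

Only 6 of the 16 cells of the full 4 x 4 matrix are stored here; with millions of customers and a handful of contacts each, the saving is far larger.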
Case Study: European Telco operator
• Prepaid segment; about 2,000,000 customers
• 5 months of call detail records + local attributes
• Churn rate 0.5% per month (skewed class distribution!)
• Weighted edges: number of seconds called during 3 months
• About 8,000,000 edges
• Total data set about 300 Gigabytes in size
The Markov assumption
• The class/behavior of a node in the network only depends upon the class/behavior of its direct neighbors
• Aka homophily, guilt by association
– ‘Birds of a feather flock together’, attributed to Robert Burton (1577-1640)
– ‘(People) love those who are like themselves’, Aristotle, Rhetoric and Nicomachean Ethics
• Needed to facilitate computations (cf. Markov chains)
Local versus Network Variables
• A network variable aggregates information that is contained within a network structure and makes a differentiation in the destination of outgoing links or the origin of incoming links
• Examples:
– the number of contacts (local variable)
– the number of contacts with churners (network variable)
– the number of international calls (network variable)
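Sketch: the difference between a local and a network variable on a toy call graph. The graph, the churner set and the function names are hypothetical.

```python
# Hypothetical call graph: customer -> set of contacts.
contacts = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"A", "C"},
}
# Hypothetical set of customers known to have churned.
churners = {"C", "D"}


def number_of_contacts(customer):
    """Local variable: counts links without looking at who is on the other end."""
    return len(contacts[customer])


def number_of_churner_contacts(customer):
    """Network variable: differentiates links by the class of their destination."""
    return len(contacts[customer] & churners)
```

For customer A this gives 3 contacts, of which 2 are churners; the second number requires knowledge about the neighbors, the first does not.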
Local versus Network variables
A Basic Network Model: Featurization
• Featurization or propositionalization: translate network into traditional attributes
• Network attributes can be included in traditional model (e.g. logistic regression)
• Create as many as possible and do stepwise regression
• A simple, interpretable social network classifier!
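Sketch: featurization flattens the network around each customer into ordinary columns that a traditional model can consume. The attribute names (`age`, `n_contacts`, `n_churner_contacts`, `frac_seconds_to_churners`) and all data are hypothetical; the actual features of the case study are not listed on the slides.

```python
# Hypothetical weighted call graph: customer -> {contact: seconds called}.
graph = {
    "A": {"B": 100, "C": 300},
    "B": {"A": 100},
    "C": {"A": 300, "D": 50},
    "D": {"C": 50},
}
churners = {"C"}  # hypothetical known churners
local = {"A": {"age": 34}, "B": {"age": 51}, "C": {"age": 27}, "D": {"age": 45}}


def featurize(customer):
    """Flatten local and network information into one row of plain attributes."""
    neighbors = graph[customer]
    total_seconds = sum(neighbors.values())
    churner_seconds = sum(s for n, s in neighbors.items() if n in churners)
    return {
        **local[customer],
        "n_contacts": len(neighbors),
        "n_churner_contacts": sum(1 for n in neighbors if n in churners),
        # Guard against customers without any call traffic.
        "frac_seconds_to_churners": churner_seconds / total_seconds if total_seconds else 0.0,
    }


rows = {customer: featurize(customer) for customer in graph}
```

Each row in `rows` can be fed to, for example, a logistic regression together with the usual local attributes, which keeps the resulting classifier interpretable.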
Example Network Model: Featurization
Example Network Model: WVRN
Results: Finding 1
• Network models boost performance and profit compared to a local model
[Figure: incremental profit increase compared to no network effects]
Results: Finding 2
• Non-Markovian network effects: incorporating the impact of higher-order neighbors leads to improved predictive power and profit!
[Figure: incremental profit increase compared to first-order network effects]
Note: higher-order effects were previously discovered in the spreading of happiness and obesity (N. Christakis, ‘Social networks and happiness’)
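Sketch: one straightforward way to bring in higher-order effects is to compute network variables over the contacts of contacts as well. Whether the case study used exactly this construction is an assumption; the graph, the churner set and the function names are hypothetical.

```python
# Hypothetical call graph: customer -> set of direct contacts.
contacts = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}
churners = {"C", "D"}  # hypothetical known churners


def second_order_neighbors(customer):
    """Contacts of contacts, excluding the customer and the direct contacts."""
    first = contacts[customer]
    second = set()
    for neighbor in first:
        second |= contacts[neighbor]
    return second - first - {customer}


def churner_counts(customer):
    """Return (first-order, second-order) counts of churners around a customer."""
    first = len(contacts[customer] & churners)
    second = len(second_order_neighbors(customer) & churners)
    return first, second
```

Customer A has no churner among its direct contacts, but one (C) at distance two; a first-order model would miss that signal.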
Results: Finding 3
• Network models detect other types of churners compared to traditional models!
Synergy opportunities!
[Figure: fraction of the churners detected by the network models (as a function of the selected fraction of customers, ranked according to their predicted probability to churn) that are NOT detected by the local model. Different curves represent different network models (induced by different techniques).]
Ensemble approach: Combining Local and Network models
• Use two models in parallel by selecting customers indicated by the local model and the network model
• Decide upon optimal fraction (current research)
[Figure: the scores of the network model and the local model feed the ensemble model output]
Network model: 0.24, 0.68, 0.18, 0.92, 0.22
Local model: 0.13, 0.54, 0.34, 0.84, 0.29
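Sketch: running the two models in parallel and selecting the customers flagged by either one. The scores are the ones shown on the slide; pairing them position by position with customer ids `c1` to `c5`, and interpreting the combination as a union of two top-k lists, are assumptions made for illustration.

```python
# Scores taken from the slide; the pairing with ids c1..c5 is an assumption.
network_scores = {"c1": 0.24, "c2": 0.68, "c3": 0.18, "c4": 0.92, "c5": 0.22}
local_scores = {"c1": 0.13, "c2": 0.54, "c3": 0.34, "c4": 0.84, "c5": 0.29}


def top_k(scores, k):
    """Return the k customers with the highest predicted churn score."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def ensemble_selection(k_local, k_network):
    """Select the customers flagged by either of the two parallel models."""
    return top_k(local_scores, k_local) | top_k(network_scores, k_network)


selected = ensemble_selection(k_local=3, k_network=3)
```

The two arguments `k_local` and `k_network` correspond to the fraction taken from each model; choosing them is the open question noted on the slide.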
Ensemble approach: 2D Lift Curve
Current Research Topics
• Extensions towards regression context (e.g. CLV)
• Applications in other contexts (e.g. credit risk, anti-money laundering, customer acquisition, …)
• Integrating local information in a network learner
• Quasi-Social Networks
• Community mining
• Backtesting
Key lessons learnt
• Introduced a three-layer social network learning environment (local information, network information, collective inferencing)
• Defined local versus network variables
• Introduced featurization as a basic social network learner
• Discussed how non-Markovian behavior can be modelled in a straightforward way
• Illustrated the theoretical concepts using a real-life case study about churn prediction in the Telco sector
FYI
• Advanced Analytics for Customer Intelligence Using SAS
• Lecturer: prof. dr. Bart Baesens
• 3-day course offered
• Many companies have gathered huge amounts of customer data about marketing success, use of financial services, online usage, and even fraud behavior. Given recent trends and needs such as mass customization, personalization, Web 2.0, one-to-one marketing, risk management, and fraud detection, it becomes increasingly important to extract, understand, and exploit analytical patterns of customer behavior and strategic intelligence. This course helps clarify how to successfully adopt recently proposed state-of-the-art analytical and data-mining techniques for advanced customer intelligence applications. This highly interactive course provides a sound mix of theoretical and technical insights as well as practical implementation details and is illustrated by several real-life cases. Background material such as selected papers, tutorials, and guidelines is provided.
Acknowledgments
• Jerry Oglesby, Director Global Academic Program & Global Certification Education Division
• Larry Stewart, SAS Education Vice President
• Sean O’Brien, Director, Business and Curriculum Development
• Bob Lucas, Statistical Training and Technical Services Director
• Karen Washburn, Business Knowledge Series Manager
• Patsy Poole, Project Manager
• Hillary Kokes, former Business Knowledge Series Manager
• Lieve Goedhuys, former Academic Program Manager, SAS Institute Belgium-Luxembourg
• All the other great SAS folks for the excellent collaboration during the past years!
Q & A
Additional Resources
Live Classes
Advanced Analytics for Customer Intelligence Using SAS
Analytics: Putting It All to Work
Upcoming Live Webinars
May 18: Getting Started with SAS® Enterprise Miner™
June 14: SAS® Information Management: Leverage and Extend Hadoop
SAS Talks on support.sas.com
Upcoming Live Events
Analytics 2012
Follow along on Twitter using #sastalks
Copyright © 2011, SAS Institute Inc. All rights reserved.
support.sas.com