data driven decisions seminar
TRANSCRIPT
Veldkant 33A, Kontich ● [email protected] ● www.infofarm.be
Data Science Company
Data Driven Decisions with loosely structured data
InfoFarm seminar 25/11/2015
Overview
Times: 9:30, 9:40, 10:00, 10:30, 11:45
Coffee & welcome
Dark data
External data
To structure or not to structure?
– Log files
– Text mining
– Network analysis
– Image processing
Wrap up and lunch
About us
Building (Big) Data (Science) solutions
– Recommendation engines, Prediction models, Automated classification, …
– Custom-made data applications
Development
Domain knowledge
Data Science
Visualization
Completing the puzzle
Our approach
Acquire
• Gather the data you need
Prepare
• Identify use cases
• Clean data
• Enrich data
Analyze
• Exploratory Data Analysis (EDA)
• Formulate hypotheses
• Hypothesis testing
Act
• Implement
• Automate
• Integrate
• Add extra data gathering
• Rollout
Our approach
“Don’t run before you can walk”
Collect
Describe
Discover
Predict
Advise
This is where the hype around Big Data and Data Science generates unrealistic expectations!
Data Driven Decisions
Versus business knowledge
Business Knowledge
Acquired by experience
(assumed) insights
RISK: too strong a bias toward past experience and gut feeling
Data Science
Complementary to business knowledge
Confirmatory or new insights
Data-driven decision making
RISK: too naive data interpretation, disconnected from business
It’s all about asking the right question!
Finding those questions
What do you want?
What do you have?
What is feasible?
What is implementable?
What can you get?
Where to start?
The key point is spotting opportunities to outperform your competitors!
What to dream?
How about the elephant in the room?
Unused but valuable data sources
Hidden – Forgotten – Underestimated
Inaccessible
Terrifying
Or all of the above
Internal
ERP SYSTEM
Financial management
Supply Chain Management
Manufacturing Resource Planning
Human Resource Management
Customer Relationship Management
Secondary use of
When all comes together…
WEB SERVER LOGS
Which customers looked at similar products?
ORDER HISTORY
Which complementary products does the customer own?
EXTERNAL DATA
Reviews or critics?
CRM INFORMATION
Typical profile of a customer responsive to campaigns for a similar product?
Analyzing non-relational data
Who’s connected to who? Who bought what?
Log files
From weblogs
How long?
When?
What?
What will you buy???
Will you buy???
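A minimal sketch of how the "How long? When? What?" questions might be answered from web server log entries. The log lines, their three-field layout and the helper name `summarize_visits` are hypothetical illustrations, not data or code from the seminar.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, simplified log records: visitor id, ISO timestamp, requested page.
LOG_LINES = [
    "v1 2015-11-25T09:30:00 /products/laptop",
    "v1 2015-11-25T09:34:30 /products/laptop-bag",
    "v2 2015-11-25T10:00:00 /home",
    "v1 2015-11-25T09:40:00 /checkout",
]


def summarize_visits(lines):
    """Group log lines per visitor and report when, how long and what."""
    visits = defaultdict(list)
    for line in lines:
        visitor, stamp, page = line.split()
        visits[visitor].append((datetime.fromisoformat(stamp), page))

    summary = {}
    for visitor, hits in visits.items():
        hits.sort()  # order each visitor's hits by timestamp
        first, last = hits[0][0], hits[-1][0]
        summary[visitor] = {
            "when": first,                                # first request seen
            "how_long_s": (last - first).total_seconds(),  # first to last request
            "what": [page for _, page in hits],           # pages in visit order
        }
    return summary


summary = summarize_visits(LOG_LINES)
```

Real web server logs would need a proper parser and a rule for splitting one visitor's hits into separate sessions; the sketch treats all hits of a visitor as a single visit.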
Via usage logs
Uses of Google
When you're too lazy to type in ".com"
Finding Porn
Finding useful information
Spell Checking
To performance logs
Demo: the use of external logs
External weather logs
Coordinates?
Missing data
Do not blindly trust your data
Outliers
Missing
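A minimal sketch of such a sanity check, assuming a list of temperature readings in which `None` marks a missing value. The readings, the function name `check_readings` and the threshold are hypothetical. The outlier rule used here is the modified z-score based on the median absolute deviation, a common rule of thumb and not necessarily what was used in the demo.

```python
import math
import statistics

# Hypothetical temperature readings in degrees Celsius; None marks a missing value.
readings = [11.2, 10.8, None, 11.5, 99.9, 10.9, None, 11.1]


def check_readings(values, threshold=3.5):
    """Return (missing_indexes, outlier_indexes) for a list of readings.

    Outliers are flagged with the modified z-score, which relies on the
    median and the median absolute deviation (MAD) instead of mean and
    standard deviation, so a single extreme value does not hide itself.
    """
    missing = [i for i, v in enumerate(values) if v is None or math.isnan(v)]
    present = [(i, v) for i, v in enumerate(values) if i not in missing]

    median = statistics.median(v for _, v in present)
    mad = statistics.median(abs(v - median) for _, v in present)

    outliers = [
        i for i, v in present
        if mad > 0 and 0.6745 * abs(v - median) / mad > threshold
    ]
    return missing, outliers


missing, outliers = check_readings(readings)
```

Flagged values are candidates for inspection, not automatic deletion: an extreme reading can be a sensor error or a genuine event.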
But do not give up too soon
[Figure: plot with axes Latitude and Longitude]
As usage is still possible
Text mining
From named entity extraction
Via sentiment analysis
To topic extraction
Text
Topic
What should we communicate about?
Network analysis
From network optimization
Via mathematical mumbo-jumbo
To (predictive) network analysis
And recommenders
Image processing
From human taught
To self-learned