Big Data Becomes Big Analysis
TRANSCRIPT
Slide 2
The Current Situation in the Pharma Industry
Many challenges exist for data to be captured, integrated and shared
Data Silos
Incompatible instruments and software systems, proprietary data formats
Legacy architectures are brittle and rigid
SME knowledge resides in people’s heads, little common vocabulary
Data schemas are not explicitly understood
Lack of common vision between business units and scientists
How do we change this landscape?
Slide 3
Pharma Is An Example of One Industry that Must Adapt
“It's better to be a pirate than to join the Navy.”
―Steve Jobs
There is normally a persistent desire to look to past success and anchor ourselves to it
Following preconceived doctrines is not always what’s best
Apple changed telecommunications as a computer company
What will the future of technology hold?
Whatever it is, it will require an adventurous approach
Slide 4
Moving to Smart Data
Smart data can be added to existing systems
Does not require replacement of existing tech
Smart data provides a separation of:
Model Layer
Data Layer
Link to the model layer
Leave data in place
Smart data links information from the models to instance-level data
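The separation described on this slide can be illustrated with a minimal sketch. Everything in it is hypothetical and for illustration only: the concept identifiers (ex:Compound and so on), the two source systems ("lims", "eln"), their field names, and the helper functions are assumptions, not part of the original presentation. The point it shows is that the model layer and the links are added alongside the existing records, which stay in place and unchanged.

```python
# Hypothetical model layer: concepts with a label and an optional parent concept.
MODEL = {
    "ex:Compound": {"label": "Compound", "parent": None},
    "ex:SmallMolecule": {"label": "Small molecule", "parent": "ex:Compound"},
    "ex:Assay": {"label": "Assay", "parent": None},
}

# Hypothetical data layer: rows remain in their source systems and native shapes.
SOURCES = {
    "lims": [{"cmpd_id": "C-001", "mw": 180.16}, {"cmpd_id": "C-002", "mw": 151.16}],
    "eln": [{"substance": "C-001", "notebook_page": 12}],
}

# Links: which field of which source carries identifiers for which concept.
LINKS = [
    {"source": "lims", "field": "cmpd_id", "concept": "ex:SmallMolecule"},
    {"source": "eln", "field": "substance", "concept": "ex:SmallMolecule"},
]


def is_a(concept, ancestor):
    """Walk up the parent chain to test whether concept is ancestor or a descendant of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = MODEL[concept]["parent"]
    return False


def find_instances(concept):
    """Group linked rows by identifier for the given concept and its descendants."""
    found = {}
    for link in LINKS:
        if is_a(link["concept"], concept):
            for row in SOURCES[link["source"]]:
                found.setdefault(row[link["field"]], []).append((link["source"], row))
    return found


# Asking for the broad concept reaches instance-level rows in both systems.
instances = find_instances("ex:Compound")
for identifier, rows in sorted(instances.items()):
    print(identifier, [source for source, _ in rows])
```

In this sketch a query phrased at the model level ("Compound") reaches rows in both systems, although neither system was modified or knows about the other.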
Slide 5
Codes
Terms
Vocabularies
Taxonomies
Models
Ontologies
Reasoning
SEMANTIC METHOD
Slide 6
Enter Big Data
Hypothesis:
If I have more data at my fingertips, then I will have more answers
This is not necessarily the case.
One major hurdle:
“Real-world data […] is messy data, filled with inconsistencies, potential biases, and noise.”
Copping & Li, Harvard Business Review, Nov 29, 2016
Slide 7
Understanding the 4V’s of Big Data
Normally the focus – Big Data Analysis is more than just size
Performance is Critical to Success
Data complexity is increasing – Model complexity
Uncertainty abounds – requires statistics and probabilities
Majority of Big Data analytics approaches treat these two V’s
Semantic technologies provide clear advantages
Mathematical Clustering Techniques provide clear advantages
Slide 8
The power of analytics is now just beginning to be felt
Processing power, as tracked by Moore’s Law, is not the problem
Focus on the growth of Analysis:
From 1988-2003 Computer processing speed grew by 1000x
In the same period, speedups from algorithm development grew by 43,000x
Advanced analytics is reaching an inflection point in adoption by both mid-market organizations and large enterprises in an effort to gain a competitive advantage.
The Growth of Analytics is Changing the Game
ANALYTICS
International Institute for Analytics, Jan 6, 2015
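The two growth figures quoted on this slide (1000x for processing speed, 43,000x for algorithms, 1988-2003) are easier to compare as per-year rates. The short calculation below is a sketch under two assumptions that the slide does not state: that each gain accrued at a constant annual rate over the 15 years, and that the two gains multiply when combined.

```python
YEARS = 2003 - 1988        # 15-year window quoted on the slide
HARDWARE_GAIN = 1_000      # processing speed growth quoted on the slide
ALGORITHM_GAIN = 43_000    # algorithm growth quoted on the slide


def annual_factor(total_gain, years):
    """Constant yearly multiplier that compounds to total_gain over the given years."""
    return total_gain ** (1 / years)


hardware_per_year = annual_factor(HARDWARE_GAIN, YEARS)
algorithm_per_year = annual_factor(ALGORITHM_GAIN, YEARS)

# Assumption: hardware and algorithm gains are independent and multiply.
combined_gain = HARDWARE_GAIN * ALGORITHM_GAIN

print(f"hardware:  x{hardware_per_year:.2f} per year")
print(f"algorithm: x{algorithm_per_year:.2f} per year")
print(f"combined:  x{combined_gain:,} over {YEARS} years")
```

Under these assumptions, processing speed grew by a factor of roughly 1.58 per year while algorithms roughly doubled every year, which is the contrast the slide is drawing.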
Slide 9
THE MOVE FROM BIG DATA TO BIG ANALYSIS
STATISTICAL
SEMANTICS
MACHINE LEARNING
REASONING
Slide 10
Big Analysis Requires Hybrid Architectures
Semantic DBs
Unstructured Docs
Structured Data
Cloud DBs (NoSQL)
Analytics
Dashboards & Reports
Integration Layer
Slide 11
1. Data Lakes
Lightweight metadata provides search
Addresses problem of “schema on read”
2. Data Catalogs
Vocabs, Taxonomies, Ontologies
Links private & public data
3. Advanced Analytics
Text extraction – combines statistics and semantics
Classifiers inside of algorithms can be uniform
Trends, clusters can be labeled as “named graphs”
The WHAT (content), WHO (users) & HOW (workflows) can all be captured and used.
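The idea of labeling a cluster as a "named graph" and recording the WHAT, WHO and HOW around it can be sketched with plain tuples. This is only an illustration: the identifiers (ex:graph/cluster-7, ex:user/analyst-1, ex:workflow/text-clustering), the predicates, and the helper functions are hypothetical, and a real system would normally hold these statements in a semantic database rather than a Python list.

```python
# Each statement is a quad: (subject, predicate, object, graph name).
quads = []


def add(subject, predicate, obj, graph="default"):
    """Record one statement, optionally inside a named graph."""
    quads.append((subject, predicate, obj, graph))


# WHAT: the cluster's content lives inside its own named graph.
cluster = "ex:graph/cluster-7"  # hypothetical graph name
for doc in ("ex:doc/101", "ex:doc/205", "ex:doc/318"):
    add(doc, "ex:mentions", "ex:term/Solubility", graph=cluster)

# Statements about the graph itself use the graph name as the subject.
add(cluster, "rdfs:label", "Solubility-related reports")
add(cluster, "ex:createdBy", "ex:user/analyst-1")            # WHO
add(cluster, "ex:producedBy", "ex:workflow/text-clustering")  # HOW


def statements_in(graph):
    """Return the (subject, predicate, object) triples stored in one named graph."""
    return [(s, p, o) for s, p, o, g in quads if g == graph]


def describe(graph):
    """Return the label and provenance recorded about a named graph."""
    return {p: o for s, p, o, g in quads if s == graph and g == "default"}


print(len(statements_in(cluster)), "statements in", cluster)
print(describe(cluster))
```

Because the cluster has a name, later analyses can refer to it as a unit, ask who produced it and with which workflow, or retrieve only the statements that belong to it.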
Use Cases
• Small Molecule
• Large molecule
• Crop Sciences
• Regulatory Intelligence
• Archiving
Slide 12
Innovation is key
The Role of Innovation:
Requires foresight and stepping out of your comfort zone
Today’s problems will not be tomorrow’s problems, so we need new approaches
Cannot be “business as usual” because the landscape is changing
Think outside the box and reward creativity