Machine Learning - VERDAZO · 31/05/2018

Machine Learning: Practical Use in Upstream Oil & Gas, SPE Oil and Gas Analytics Breakfast Series, May 31, 2018


Machine Learning

Practical Use in Upstream Oil & Gas

SPE Oil and Gas Analytics Breakfast Series

May 31, 2018


Outline


1) Machine Learning Hype

2) What is Machine Learning?

3) Feature Importance

4) Machine Learning Challenges

5) What We’ve Found to Work Well in Practice

6) Machine Learning Power

7) Case Study 1: Predicting Reservoir Rock Properties

8) Case Study 2: Drilling Location & Completion Optimization

9) Conclusions


1) Machine Learning Hype


Gartner Hype Curve for Analytics & BI

[Figure: Gartner hype curve plotting Expectations against Time, marking the positions of Visual Data Discovery and Predictive Analytics (Machine Learning)]

Top Two Red Flags Signaling Your Analytics Program will Fail

1) Executive team doesn’t have a clear vision for its advanced analytics program

• CEO regularly mentions the company is using artificial intelligence or machine learning, but never any specifics

2) No one has determined the value that initial use cases can deliver within the first year

• Large-scale projects have long time-to-value, high chance of failure…and are expensive!

(Source: McKinsey)


How I Got into Machine Learning


Source: xkcd


Algorithmic Trading System Development Process

[Flow diagram: Price Data, Volume Data, Technical Indicators → Machine Learning → Trading Signals (BUY or SELL) → Risk & Position Sizing Rules → Automated Trading System → Optimize Profitability]

Source: The Wolf of Wall Street

Oil & Gas Development Planning Process

[Flow diagram: Reservoir Data, Completion Data, Drilling Data → Machine Learning → Performance Predictions → Development Possibilities, Costs, Commodity Prices → Development Plan → Optimize NPV]

Source: The Wolf of Wall Street

2) What is Machine Learning?


What is Machine Learning?

Machine Learning (ML) is a field of computer science that uses statistical techniques and algorithms to give computers the ability to “learn”:

• with respect to some task

• to optimize one or more performance measures

• without being explicitly programmed


the magic


What Machine Learning Isn’t

• New

• It’s been around in the form of statistical models for centuries

• Smarter than us

• It can’t think laterally and has no understanding of causality

• Able to overcome the laws of physics, statistics, etc.


Different Types of Machine Learning

Supervised Learning (today’s focus): Training a model by example – predicting an outcome using data where examples of input-output pairs (“correct answers”) can be provided

Unsupervised Learning: Using a model to draw inferences from unlabeled data to describe structures or patterns

Reinforcement Learning: Given a specific environment/context and a feedback mechanism, a model automatically tries to determine optimal behavior

Supervised Learning

Today’s focus

Classification: Predict which group each sample belongs to

• Based on its characteristics, is it an apple or an orange?

Regression: Predict some continuous value output

• What is the sugar content of an orange in grams given its size, color, weight, age, etc.?

Supervised Learning: Regression

y = f(a, b, c, …)


The goal of regression is to find the function ‘f’
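To make "finding f" concrete, here is a minimal sketch that assumes the simplest possible case: one feature and a straight-line f, fitted by ordinary least squares. The helper name `fit_line` and the training values are hypothetical, chosen only so the recovered function is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training examples that follow y = 3a + 2 exactly.
a_values = [0.0, 1.0, 2.0, 3.0, 4.0]
y_values = [2.0, 5.0, 8.0, 11.0, 14.0]

slope, intercept = fit_line(a_values, y_values)


def f(a):
    """The learned function: a prediction of y for a new value of a."""
    return slope * a + intercept
```

Real problems have many features and rarely a linear f, which is where the learning algorithms discussed later come in.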


Supervised Machine Learning Terminology

y = f(a, b, c, …)

1) Target (y): what we want to predict, with examples of the correct answers provided

2) Features (a, b, c, …): inputs used by the learning algorithm in the training process to form a predictive model for the Target

3) Model (f): the algorithm(s) used to calculate a prediction of the Target

Supervised Machine Learning Terminology

4) Training: the process that uses Features and Targets to inform a predictive Model

5) Feature Importance: how much impact a Feature has on the predictive power of a Model

6) Fitness Function: a measure of the predictive power of a Model (i.e., this is what we want to optimize, like R² or Mean Absolute Error)
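The two fitness functions named above can be sketched in a few lines. The function names and the sample values below are illustrative assumptions, not taken from the presentation.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between targets and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and model predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]

mae = mean_absolute_error(actual, predicted)  # 0.5
r2 = r_squared(actual, predicted)             # 0.95
```

Lower is better for Mean Absolute Error, while higher is better for R², so the direction of "optimize" depends on the measure chosen.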


3) Feature Importance: The Real Prize?

The Importance of Feature Importance

Understanding which features matter most can direct:

• our time and attention

• data acquisition and quality efforts

• which features we may want to remove from our dataset

The Many Faces of Feature Importance

Linear, Rank, Information Theoretical, Statistical, Impurity, Permutation, Dropout, Additive, Recursive, Bayesian, Grouped, Domain (Physics)
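As one example from this list, permutation importance can be sketched as follows: shuffle a single feature column and measure how much the model's error grows. The `model` below is a hypothetical stand-in for an already-trained model (it uses features 0 and 1 and ignores feature 2), and the data are synthetic and noise-free.

```python
import random


def model(row):
    # Hypothetical trained model: depends on features 0 and 1 only.
    return 5.0 * row[0] + 1.0 * row[1]


def mean_abs_error(rows, targets):
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, column, rng):
    """Increase in error after shuffling one feature column."""
    baseline = mean_abs_error(rows, targets)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return mean_abs_error(permuted, targets) - baseline


rng = random.Random(0)
rows = [[rng.uniform(0, 10) for _ in range(3)] for _ in range(200)]
targets = [model(r) for r in rows]  # noise-free targets, for illustration only

importances = [permutation_importance(rows, targets, c, rng) for c in range(3)]
```

Shuffling the ignored feature leaves the error unchanged, so its importance is zero, while the heavily weighted feature shows the largest increase. With correlated features the picture is less clean, which is the challenge discussed on the following slides.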


Features: Timing and Influence

• Which features are we able to influence rather than simply measure?

• Which features are knowable at each stage of the process?

• The features we know earliest are the ones that can guide our decisions the soonest

Feature Interdependence (Correlation) – A Challenge

1) It’s hard to understand and separate the effects of two or more features that are correlated

2) It’s harder to intelligently select which features to include in a predictive model

Some are obvious:

• Total proppant and total fluid volume

Some aren’t:

• Total proppant and location within field

Handling Correlated Features: Stage 1 of 4

Stage 1: Denial

• Let’s pretend we can just evaluate the impact of all these different features independently

• That would be so much easier…

Handling Correlated Features: Stage 2 of 4

Stage 2: Acceptance

• Accepting that some features are correlated, and that this must be part of our overall understanding, is valuable in itself

• Handling correlated features is difficult, but necessary

Handling Correlated Features: Stage 3 of 4

Stage 3: Analysis

• Measure feature correlations in different ways: linear, rank, mutual information

• Look for instances of unexpected correlation (or lack of correlation)

• Identify feature groups where information is unique and where it’s redundant
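A minimal sketch of why measuring correlation in more than one way matters: linear (Pearson) and rank (Spearman) correlation can disagree on the same data. The functions below are plain implementations that assume no tied values, and the sample data are hypothetical. Mutual information is omitted here because it needs a binning or density estimate.

```python
import math


def pearson(xs, ys):
    """Linear correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical feature pair with a monotonic but nonlinear relationship (y = x^3).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 8.0, 27.0, 64.0, 125.0]

linear_corr = pearson(x, y)
rank_corr = spearman(x, y)
```

Here the rank correlation is perfect while the linear correlation is not, which is the kind of difference that signals a nonlinear relationship between two features.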

Read more: Multivariate Analysis Using Advanced Probabilistic Techniques for Completion Optimization


Handling Correlated Features: Stage 4 of 4

Stage 4: Understanding

• There is no “cure for” or “solution to” feature correlation

• Understanding which features are strongly correlated informs feature selection and helps us intelligently reduce dimensionality

• We want maximum relevance with minimum redundancy

• Understand how features are correlated (linearly, ordinally, mutual information, etc.)

• Noting moderate correlations may uncover hidden insights (e.g., so far we’ve only drilled longer laterals in areas of poorer reservoir quality, so we should be cautious in making generalizations)

Feature Grouping

• Feature Grouping can help us understand which broader factors matter most (e.g., geology, pressure, lateral length, completion design parameters)

• e.g., how important are all the completion parameters in aggregate versus all the geological parameters in aggregate?

[Figure: feature groups labelled Geological, Geophysical, Completion Design, Proximal Production]

4) Machine Learning Challenges


Challenge: Data

• Data quantity

• Data quality

• Missing data

• Data normalization/calibration

• Outlier treatment

• Data matching (when integrating datasets)

• Representativeness of the data


Challenge: Communication, Transparency, Domain Expertise

Read the blog: Machine Learning: Is it really a Black Box?

Communication

• Terminology

• Turning results into recommendations

Transparency (i.e. black box syndrome)

• Explaining the result

• It’s easy to get an answer, but tough to back it up

Domain Expertise

• Understand data choices

• Evaluate results

• Have a clear goal

• Choose appropriate predictive performance measures


Challenge: Underfitting/Overfitting

Missing Relevant Relations (sign of underfitting): poor model fit during training and poor model fit on new, unseen data

Good Generalization (sign of a good fit): good model fit during training and good model fit on new, unseen data

Fitting the Noise (sign of overfitting): excellent model fit during training and poor model fit on new, unseen data
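The three signs above can be reproduced with a small, fully deterministic sketch. The data are hypothetical: a straight line plus an alternating ±1 term standing in for noise. The three "models" are deliberately simple stand-ins: a constant (too simple), a least-squares line (about right), and a nearest-neighbour lookup that memorizes the training points (too flexible).

```python
def mse(model, xs, ys):
    """Mean squared error of a model over a set of points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_mean(xs, ys):
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y  # ignores x entirely


def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs, ys):
    pairs = list(zip(xs, ys))
    # Returns the target of the closest training point (memorization).
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


# Hypothetical data: y = 2x plus an alternating +1/-1 "noise" term.
points = [(i, 2 * i + (1 if i % 2 == 0 else -1)) for i in range(20)]
train = [p for p in points if p[0] % 4 in (0, 1)]
test = [p for p in points if p[0] % 4 in (2, 3)]
train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("nearest", fit_nearest)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
```

Each entry in `results` is a (training error, unseen-data error) pair. The constant model is poor on both, the line is good on both, and the nearest-neighbour model has zero training error but a much larger error on the unseen points.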

Source: https://pythonmachinelearning.pro


Wait… What is the Goal?

Are we trying to get the best fit to the data?

or

Are we trying to get the best predictive capability?

Read the blog: Machine Learning: Finding the signal or fitting the noise?


Validation: Why Do We Do It?

Three main purposes:

1) To estimate the predictive capability (error) our model will have in the future

2) To reduce the chance our model will “overfit” our available data

3) To get an indication of the predictive limits of our dataset

Validation Example: K-Fold Cross-Validation

K-fold cross-validation is one technique used to estimate the predictive capability of a machine learning model on data it has not yet seen

[Figure: the full dataset divided into five folds numbered 1 to 5, with test (out-of-sample) and training (in-sample) portions marked]
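A minimal sketch of k-fold cross-validation, assuming the usual scheme in which each fold is held out once as the out-of-sample test set while the remaining folds are used for training. The helper names (`k_fold_indices`, `cross_validate`) and the toy mean-predicting model are hypothetical. In practice the samples are usually shuffled before being split into folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(xs, ys, fit, error, k=5):
    """Train on k-1 folds, score on the held-out fold, repeat for every fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        predictions = [model(xs[i]) for i in test_idx]
        scores.append(error([ys[i] for i in test_idx], predictions))
    return scores


def fit_mean(xs, ys):
    """Toy model: always predicts the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


xs = list(range(10))
ys = [2.0 * x for x in xs]
scores = cross_validate(xs, ys, fit_mean, mean_abs_error, k=5)
estimate = sum(scores) / len(scores)  # averaged out-of-sample error
```

Every sample is used for testing exactly once, so the averaged score is based only on predictions for data the model had not seen during training.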


Validation Provides Predictive Capability Estimates

[Figure: three ways of dividing the full dataset into training and test portions, with the labels “test set validation” and “5-fold cross-validation”]

Bad (overfitting likely)

Better (cross validation)

Best (cross validation + test set)

Post-Validation Out-of-Sample Testing: Why Do We Do It?

• It is entirely separate from the iterative training, validation and model selection process

• It provides one last “sanity check” to make sure the training, validation and model selection process was sound

• It is as close to a live, real-world test of a model as we can get

• Unfortunately, the only way to truly test a model’s future predictive power is to actually use it to make predictions about the future, wait for the results, and then measure how well it did

Noise and Bias in the Data


Noise: Unexplained variability within a data sample

• Measurement precision

• Data processing

Bias: Systematic difference between measurement and true value

• Selection bias, analytical bias, survivorship bias, observer bias, …

• Perverse incentives, laziness, career risk, honest mistakes, …


Illustration 1: A Simple Equation

Equation: Y = 20A + B^3 + 10e^C

A is a random integer between 0 and 10 from a uniform distribution

B is a random integer between -10 and 10 from a uniform distribution

C is a random real number between -5 and 5 from a normal distribution

We are predicting target Y, given features A, B and C
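A sketch of how such a dataset could be generated, reading the slide's equation as Y = 20A + B^3 + 10e^C (the exponents appear to have been flattened in extraction). Two things are assumptions rather than facts from the presentation: the standard deviation of 1.5 for C, and the choice to redraw any C value that falls outside the stated bounds, since a normal distribution is not bounded by itself. The sample count is also arbitrary.

```python
import math
import random


def sample_c(rng):
    # Assumption: draw from a normal distribution (mean 0, std 1.5) and
    # redraw anything outside the stated range of -5 to 5.
    while True:
        c = rng.gauss(0.0, 1.5)
        if -5.0 <= c <= 5.0:
            return c


def make_sample(rng):
    a = rng.randint(0, 10)    # uniform random integer, 0 to 10 inclusive
    b = rng.randint(-10, 10)  # uniform random integer, -10 to 10 inclusive
    c = sample_c(rng)
    y = 20 * a + b ** 3 + 10 * math.exp(c)
    return (a, b, c), y


rng = random.Random(42)
dataset = [make_sample(rng) for _ in range(1000)]
```

Each element of `dataset` pairs the features (A, B, C) with the target Y, which is the form a supervised learning algorithm expects.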


Noise and Bias Added to the Data


Model 1: Trained with original A, B, C values

Model 2: A, B, C values with 20% noise

Model 3: A, B, C values with 10% noise, plus 10% bias

(half the values for each of A, B, C were biased upward and half were biased downward)
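The presentation does not say how the noise and bias were applied, so the sketch below is one possible reading, not the authors' procedure. It assumes "20% noise" means each value is scaled by a random factor within ±20%, and "10% bias" means a random half of the values are scaled up by 10% and the rest scaled down by 10%. Function names and the seed are hypothetical.

```python
import random


def add_noise(values, fraction, rng):
    """Scale each value by a random factor in [1 - fraction, 1 + fraction]."""
    return [v * (1.0 + rng.uniform(-fraction, fraction)) for v in values]


def add_bias(values, fraction, rng):
    """Bias a random half of the values upward and the other half downward."""
    indices = list(range(len(values)))
    rng.shuffle(indices)
    upward = set(indices[: len(indices) // 2])
    return [v * (1.0 + fraction) if i in upward else v * (1.0 - fraction)
            for i, v in enumerate(values)]


rng = random.Random(7)
a_original = [float(rng.randint(0, 10)) for _ in range(100)]       # Model 1 input
a_model_2 = add_noise(a_original, 0.20, rng)                        # 20% noise
a_model_3 = add_bias(add_noise(a_original, 0.10, rng), 0.10, rng)   # 10% noise + 10% bias
```

The same transformations would be applied to B and C before training Models 2 and 3.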


Noise and Bias in Data Reduce Model Predictive Power

[Figure: three panels titled “Original”, “20% noise”, and “10% noise and 10% bias”]

Almost all oil & gas data is both noisy and biased to some degree

Missing Features Reduce Model Predictive Power

[Figure: three panels titled “Original”, “No C values provided”, and “No B values provided”]

This illustrates the inherent predictive limits of a dataset with important information missing (e.g., an attempt at completion optimization with no geological data)

5) Our Approach: General Principles We’ve Found to Work Well in Practice

Where We Spend Our Time

• Machine Learning: 10%

• Data Preparation: 20%

• Analysis & Building Understanding: 30%

• Discussing: 40%

Visualization supports all stages of our process

Never Talk About the In-Sample Training Fit

• Out-of-sample predictive power is the goal, not in-sample training fit perfection

• Modern ML algorithms can achieve a nearly perfect in-sample training fit to virtually any dataset (overfitting)

• Presenting an unrealistic R² value can inflate expectations of a predictive model to unachievable levels

Make Sure We Have Enough Data

Three equally unsatisfactory answers to “How much data is enough?”

1. There is no right answer – it is unknowable

2. More is always better – we can never have enough

3. It depends…

Factors that can affect sample size requirements

• Complexity of the problem (e.g., nonlinearities, dependencies, empirical understanding)

• Number of features

• Range and distribution of feature and target data

• Data quality, cleanliness, representativeness

• Complexity of the machine learning algorithm(s) used

Make Sure We Have Enough Data

Our rule of thumb

If a dataset has fewer than 200 samples, there’s a good chance it isn’t well-suited for machine learning…

…however, useful insights can still be found in smaller datasets (e.g., feature importance)

Understand the Predictive Limits of Our Dataset

A “good” R² value (model fitness) could be R² = 0.10 for one dataset and R² = 0.70 for another

• How predictable is our target?

• How relevant are our available features to our target?

• How many samples do we have?

• How good is the quality of our data?

• How good are existing predictive models?

• How good do predictions need to be for the model to be useful to us?


Use a Library of Algorithms

No single ML algorithm is best for all cases

[Figure: library of algorithms, including Linear Regression, Neural Networks, Random Forests, Gradient Boosted Trees, Genetic Programming, Support Vector Machines, K-Nearest Neighbours, Smart Scaling, Feature Engineering, Encoders, Bayesian, PCA/ICA]

Learning algorithms in combination with preprocessing algorithms generally perform better

Aside: Neural Networks

• We include some neural networks in our library of algorithms

• There are many different classes of neural networks with essentially infinite configurations

• Neural networks perform best with very large datasets

• On smaller datasets, including most oil & gas datasets, we’ve found neural networks are usually outperformed and out-generalized by other methods (at least within a reasonable amount of time)

Source: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence


Wisdom of the Crowd

• Ensembles of models almost always outperform single-algorithm models

• Ensembles protect against any individual model’s weaknesses or biases

• An iterative evolutionary approach to creating, optimizing, and validating these ensembles has consistently yielded our best results
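A minimal sketch of the simplest kind of ensemble, an unweighted average of several models' predictions. This is not the iterative evolutionary approach described above, and the numbers are hypothetical, constructed so that the individual models' errors partly cancel.

```python
def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical true values and predictions from three different models.
actual = [10.0, 20.0, 30.0, 40.0]
model_predictions = {
    "model_1": [12.0, 18.0, 33.0, 41.0],
    "model_2": [9.0, 23.0, 28.0, 38.0],
    "model_3": [11.0, 19.0, 29.0, 43.0],
}

# Ensemble prediction: average the models' predictions sample by sample.
ensemble = [sum(column) / len(column) for column in zip(*model_predictions.values())]

individual_errors = {name: mean_abs_error(actual, preds)
                     for name, preds in model_predictions.items()}
ensemble_error = mean_abs_error(actual, ensemble)
```

In this constructed case the ensemble's error is lower than that of every individual model, because each model errs in a different direction on different samples. When models share the same bias, averaging cannot remove it.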

https://www.analyticsvidhya.com/blog/2015/08/optimal-weights-ensemble-learner-neural-network/

[Figure: Input Data feeds Model 1 through Model 5, whose outputs are combined by an Ensemble Model to produce Predictions]

Use Supporting Visualizations


Focus on Time to Value

Insights from machine learning can and should be realized and applied to decisions in weeks or months, not years

• Time has a significant opportunity cost if ML benefits can’t be realized quickly

• Time has a significant real cost: worker-hours, salaries, software fees

• Well-defined shorter-term projects have a much better cost-benefit profile than trying to “boil the ocean” with machine learning

Domain Expertise is Critical

Domain expertise lets us make sure the results make sense

• Reduces time spent chasing spurious relationships

• Enables a quicker understanding of “why”

The best domain experts are the technical teams working on the problems every day

Source: xkcd


Domain Expertise is Critical

The laws of physics still matter

• Healthy skepticism toward any data-driven approach is good, especially when data is noisy, sparse, biased, incomplete, etc.

But sometimes there are genuine unexpected findings in the data that warrant challenging the status quo…isn’t this part of our goal?

6) Machine Learning Power


Not Everything is Linear


Sigmoid Function

Source: xkcd


Not Everything is a Number


Source: http://survivestatistics.com/variables/


Not Everything is Simple

Boss: “Look at these 80 features and tell me how everything relates to everything else and what matters most”

Tyler: “Sure, give me a few months”

ML Algorithms: “Sure, give me a few minutes”

Illustration 2: Latitude and Longitude

Goal: Predict Latitude and Longitude from a UWI

07 – 31 – 054 – 24 W 5

(LSD 07, Section 31, Township 054, Range 24, Meridian W5)

Boss: “Please write me an Excel macro that returns latitude and longitude, given any UWI in Western Canada, by the end of the day”

Tyler: “Yikes!”

Source: Alberta Environment and Parks
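Before a model can learn from a UWI, the identifier has to be split into numeric features. The sketch below handles only the short location form shown on this slide (LSD, Section, Township, Range, and a "W" Meridian); the pattern and the function name `uwi_to_features` are assumptions for illustration, and full UWIs carry additional components that are not handled here.

```python
import re

# Accepts hyphens or en dashes, with optional spaces, e.g. "07-31-054-24W5".
UWI_PATTERN = re.compile(
    r"^\s*(\d{1,2})\s*[-–]\s*(\d{1,2})\s*[-–]\s*(\d{1,3})\s*[-–]\s*(\d{1,2})\s*W\s*(\d)\s*$"
)


def uwi_to_features(uwi):
    """Split a short-form UWI location into integer features."""
    match = UWI_PATTERN.match(uwi)
    if match is None:
        raise ValueError(f"Unrecognized UWI format: {uwi!r}")
    lsd, section, township, range_, meridian = (int(g) for g in match.groups())
    return {
        "lsd": lsd,
        "section": section,
        "township": township,
        "range": range_,
        "meridian": meridian,
    }


features = uwi_to_features("07 – 31 – 054 – 24 W 5")
```

These five integers would be the features, with surveyed latitude and longitude as the two targets.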


Why This is Difficult

• Curvature and ellipticity of the Earth affect this translation differently in different places

• Discontinuous thresholds where some numbers get reset (LSDs, Sections, Ranges) and some don’t (Townships), and these thresholds vary

• It’s relatively easy for a small, focused area, but broad generalization is difficult

Source: An engineering textbook from 1897


Why This is Difficult


Model Predicted vs. Target, Out-of-Sample

[Figure: two panels, Latitude and Longitude]

Why This is Not Difficult for Machine Learning

• We know there is some connection between the features (LSD, Section, Township, Range, Meridian) and the result (Latitude/Longitude)

• Whenever there truly are informational relationships between the features and the target, even if they’re very complex, modern machine learning algorithms will almost certainly find them

7) Case Study 1: Predicting Reservoir Rock Properties

Case Study 1: Predicting Rock Properties

Business Case

Populate a detailed reservoir model with reliable (core) rock property values for development planning purposes

• Doing this with coring & lab analysis is cost-prohibitive

• Many existing wells have no core data, but they do have log data

• Predictions from the best existing model (using wireline log data and traditional approaches) are not as accurate as we would like

Can we use machine learning to develop a better predictive model?

Case Study 1: Data

1) Detailed core analysis from dozens of wells → source of target values

2) Open-hole log data from the same wells → source of feature values

3) Established depth matching between core and log data


Sample size: ~2500

Data acquisition costs for this case study: ~$50 million


Modelling Process

Example: CRISP-DM Model

1) Predict core porosity (Target)

2) Using open hole log data (Features)

3) Use learning algorithm(s) to Train a predictive Model using the Target and Features

4) Evaluate the predictive power of the Model using a Fitness Function (e.g., MSE, R²)

5) Iterate

Case Study 1: Feature Importance

[Figure: feature importance chart for features A through M, with a group of features annotated “Features we could probably exclude from our model”]

Case Study 1: Top 4 Features

The top 4 features all have very weak linear correlation to porosity


Case Study 1: Porosity Distribution (Density Function)


Case Study 1: Model Comparison

Model                     R      R²     Mean Squared Error (MSE)
Best Traditional Model    0.48   0.23   17.4E-5
ML Logs-Only Model        0.61   0.37   7.3E-5
ML Logs+Drilling Model    0.68   0.46   6.2E-5
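
The reported figures can be cross-checked with simple arithmetic: each R² is the square of the corresponding R, and the MSE values can be expressed as a reduction relative to the best traditional model. The snippet below uses only the numbers from the comparison above; the RMSE column is derived here and is in the same units as the target.

```python
# Values copied from the model comparison: R, R-squared, mean squared error.
models = {
    "Best Traditional Model": {"R": 0.48, "R2": 0.23, "MSE": 17.4e-5},
    "ML Logs-Only Model": {"R": 0.61, "R2": 0.37, "MSE": 7.3e-5},
    "ML Logs+Drilling Model": {"R": 0.68, "R2": 0.46, "MSE": 6.2e-5},
}

baseline_mse = models["Best Traditional Model"]["MSE"]
summary = {}
for name, m in models.items():
    summary[name] = {
        # R squared, rounded to the two decimals used in the table.
        "r2_from_r": round(m["R"] ** 2, 2),
        # Fractional MSE reduction versus the traditional model.
        "mse_reduction": 1.0 - m["MSE"] / baseline_mse,
        # Root mean squared error, derived from the reported MSE.
        "rmse": m["MSE"] ** 0.5,
    }
    s = summary[name]
    print(f"{name}: R^2 = {s['r2_from_r']:.2f}, "
          f"MSE reduction = {s['mse_reduction']:.0%}, RMSE = {s['rmse']:.4f}")
```

On these numbers the logs-only model cuts MSE by about 58% and the logs-plus-drilling model by about 64% relative to the best traditional model.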

A Similar Use Case: Generating Log Traces

1) Generate a predicted log trace for a missing density or sonic log

• From other log traces

• From drilling data

• From both

2) Generate a synthetic log trace for predicted core properties

Source: Wikipedia

8) Case Study 2: Optimizing Drilling Locations and Completion Designs

Case Study 2: Optimizing Drilling Locations and Completion Designs

Business Case

Decide where to drill new wells in a light tight oil play and how to complete them to maximize NPV

• Don’t drill where we’re unlikely to be successful

• Where we do drill, use the best completion design

• Our existing ability to predict productivity from new drills has been poor

Can we use machine learning to develop a better predictive model?

Case Study 2: Data

Sample size: ~300 wells

We want to predict first-six-month cumulative oil production (Target) from the following Features:

• Geological data

• Seismic data

• Completion data

• Drilling data

• Proximal production data

With these predictions in hand, we can then layer in cost and commodity price information to optimize expected NPV

The Problem is Complex

• More than 80 features

• Varying degrees of interdependence

• Most features could plausibly impact well performance

• Some feature values only exist in combination with each other, which makes it difficult to distinguish individual feature impacts

Individual Feature Importance

• In this dataset, many features are informative

• The top two individual features are related to geology

• Different feature importance measures yield different ranking orders

Grouped Feature Importance

• Grouped feature importance allows characterization of which categories of data have the greatest impact

• In this dataset, reservoir geology appears to have about 50% more influence than completion design, but both are important
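
One common way to measure grouped importance is permutation importance applied to a whole group at once: shuffle every feature in the group with the same row permutation and record how much the prediction error grows. The slides do not say which importance measure was used, so the sketch below is an assumption, and the feature names, the `predict` function and the data are all hypothetical stand-ins for a trained model and a real dataset.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def grouped_permutation_importance(predict, rows, y, groups, rng, repeats=20):
    """Average increase in MSE when all features in a group are shuffled together."""
    base = mse(y, [predict(r) for r in rows])
    importance = {}
    for group_name, cols in groups.items():
        increases = []
        for _ in range(repeats):
            order = list(range(len(rows)))
            rng.shuffle(order)
            shuffled = []
            for i, row in enumerate(rows):
                new_row = dict(row)
                # Take the whole group from one donor row, so relationships
                # inside the group are preserved.
                for c in cols:
                    new_row[c] = rows[order[i]][c]
                shuffled.append(new_row)
            increases.append(mse(y, [predict(r) for r in shuffled]) - base)
        importance[group_name] = sum(increases) / repeats
    return importance


def predict(row):
    # Hypothetical stand-in for a trained model.
    return (300.0 * row["porosity"] + 1.5 * row["thickness"]
            + 0.8 * row["stages"] + 5.0 * row["proppant"])


rng = random.Random(42)
rows = [{"porosity": rng.uniform(0.05, 0.15), "thickness": rng.uniform(5, 25),
         "stages": rng.uniform(10, 40), "proppant": rng.uniform(0.5, 2.0)}
        for _ in range(300)]
y = [predict(r) + rng.gauss(0, 2.0) for r in rows]  # synthetic target

groups = {"geology": ["porosity", "thickness"],
          "completion": ["stages", "proppant"]}
imp = grouped_permutation_importance(predict, rows, y, groups, rng)
for name, value in sorted(imp.items(), key=lambda kv: -kv[1]):
    print(f"{name}: MSE increase = {value:.1f}")
```

Shuffling a group as a unit avoids splitting credit between strongly related features, which is the difficulty with individual rankings when feature values only occur in combination.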

Model Results – Predicted vs. Target

[Figure: predicted vs. target plots, with panels titled Testing and Cross-Validation]

Target Sensitivity to Individual Features

How does changing the value of just one feature impact the prediction (holding everything else constant)?

[Figure: sensitivity plots for features A and B]
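
A one-at-a-time sensitivity sweep can be sketched as follows: copy a reference well, overwrite a single feature with each trial value, and record the model's prediction. The `predict` function and the feature names `net_pay` and `proppant` are hypothetical stand-ins for a trained model and its inputs.

```python
import math


def sensitivity_sweep(predict, base_row, feature, values):
    """Predictions as one feature varies while all others stay at base_row."""
    curve = []
    for v in values:
        row = dict(base_row)  # copy, so the reference well is never modified
        row[feature] = v
        curve.append((v, predict(row)))
    return curve


def predict(row):
    # Hypothetical stand-in for a trained model, with diminishing
    # returns on the completion feature.
    return (100.0 * math.sqrt(row["net_pay"])
            + 40.0 * (1.0 - math.exp(-row["proppant"])))


base_row = {"net_pay": 16.0, "proppant": 1.0}
curve = sensitivity_sweep(predict, base_row, "proppant", [0.5, 1.0, 1.5, 2.0])
for value, prediction in curve:
    print(f"proppant = {value:.1f} -> predicted = {prediction:.1f}")
```

A sweep around a single reference well gives that well's individual response curve; averaging the same sweep over many wells gives a partial dependence curve. When features are interdependent, some swept combinations may never occur in the data, so the curves deserve cautious reading.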

Statistical Distribution of Error

How do the absolute error, relative (%) error and direction of error vary within the dataset?

What is the P10/P90 ratio of the error distribution?

Note: this is not the case study area; it is intended for illustrative purposes only.
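
The error questions above can be answered with a few summary statistics. The sketch below uses ten made-up actual and predicted values. It also assumes the exceedance convention common in oil and gas, where P10 is the high value (exceeded by 10% of samples) and P90 the low value, so the P10/P90 ratio is greater than one; if the opposite convention is intended, swap the two percentiles.

```python
def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


# Hypothetical actual and predicted values, for illustration only.
actual = [120.0, 80.0, 200.0, 150.0, 60.0, 95.0, 310.0, 45.0, 170.0, 110.0]
predicted = [100.0, 90.0, 230.0, 140.0, 75.0, 90.0, 250.0, 60.0, 180.0, 130.0]

errors = [p - a for p, a in zip(predicted, actual)]  # positive = over-prediction
abs_err = [abs(e) for e in errors]
rel_err_pct = [100.0 * e / a for e, a in zip(errors, actual)]

over = sum(1 for e in errors if e > 0)   # direction of error
under = sum(1 for e in errors if e < 0)

# Exceedance convention (assumption): P10 is the 90th percentile.
p10 = percentile(abs_err, 90)
p90 = percentile(abs_err, 10)
ratio = p10 / p90

print(f"over-predicted: {over}, under-predicted: {under}")
print(f"mean absolute error: {sum(abs_err) / len(abs_err):.1f}")
print(f"mean absolute relative error: "
      f"{sum(abs(r) for r in rel_err_pct) / len(rel_err_pct):.1f}%")
print(f"P10 = {p10:.1f}, P90 = {p90:.1f}, P10/P90 = {ratio:.2f}")
```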

Spatial Distribution of Error

Note: this is not the case study area; it is intended for illustrative purposes only.

So, What Do We Get? (Final Outputs)

1) One or more predictive models which can be used to predict performance of future well locations and completion designs

2) Using the model, we can generate a range of predicted outcomes for each possible drilling location (covering a variety of possible completion designs)

3) An understanding of what matters (feature importance characterization)

4) Statistical and spatial characterization of prediction error (confidence)

5) Ability to test hypotheses (e.g., “I think combining X and Y might deliver a better production result…what would our model predict?”)

Conclusions

A good result relies on:

1) Domain Expertise: knowledge of the problem space, data and goals

2) Data Expertise: ability to explore, understand, select and condition data

3) Technical (Machine Learning) Expertise: ability to use tools and technology for efficient, effective, reliable outcomes

4) Communication Expertise: ability to craft a compelling narrative from modelling insights and make actionable recommendations

Good News: Machine Learning Won’t Take Our Jobs

• Domain expertise is critical

• Business understanding is critical

• Communication is critical

• Oil & gas data “needs us”: it’s too messy on its own

• Just like us, all ML models are wrong, but sometimes they’re useful

Source: xkcd

Thank You

Tyler Schlosser

Chief Data Scientist

Verdazo Analytics

403-708-2864

[email protected]

Check out our blog at verdazo.com

Appendix

Aside: Time Series Forecasting with ML

• Can be framed using an empirical model (Arps)

• Can be framed as a stochastic process (Markov)

• Can be framed as a supervised learning problem where:

  • Target is the value in the future (V_{t+1})

  • Features are past and current target values along with any other relevant data available at the current time (t)

• Need to be extra careful to get good generalization

• Is it likely to offer significant improvement?
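
The supervised-learning framing can be sketched by turning a production series into lagged feature rows, each paired with the next value as the target. The series below is generated from an Arps hyperbolic decline, q(t) = qi / (1 + b·Di·t)^(1/b), with made-up parameters; the function names and the choice of three lags are assumptions for illustration.

```python
def arps_hyperbolic(qi, di, b, t):
    """Arps hyperbolic decline rate at time t."""
    return qi / (1.0 + b * di * t) ** (1.0 / b)


def make_supervised(series, n_lags):
    """Build (features, target) pairs: the last n_lags values -> the next value."""
    features, targets = [], []
    for t in range(n_lags - 1, len(series) - 1):
        features.append(series[t - n_lags + 1: t + 1])  # values up to time t
        targets.append(series[t + 1])                   # V_{t+1}
    return features, targets


# Hypothetical monthly rates for one well (parameters are made up).
series = [arps_hyperbolic(qi=1000.0, di=0.1, b=0.8, t=month) for month in range(12)]

X, y = make_supervised(series, n_lags=3)

# Chronological split: train on the earlier rows, test on the later ones.
# Shuffling before splitting would leak future values into training.
n_train = 6
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

print(f"{len(X)} samples, {len(X_train)} train, {len(X_test)} test")
print("first row:", [round(v, 1) for v in X[0]], "->", round(y[0], 1))
```

Keeping the test rows strictly later in time than the training rows is one of the precautions behind the point about generalization: a random split would let the model see values from the period it is later asked to forecast.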