FRACTURE FLOW RATE
ESTIMATION USING MACHINE
LEARNING ON TEMPERATURE
DATA
A REPORT SUBMITTED TO THE DEPARTMENT OF ENERGY
RESOURCES ENGINEERING
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF MASTER OF SCIENCE
By
Dante Isaac Orta Alemán
June 2018
I certify that I have read this report and that in my opinion it is fully
adequate, in scope and in quality, as partial fulfillment of the degree
of Master of Science in Energy Resources Engineering.
__________________________________
Prof. Roland Horne
(Principal Advisor)
Abstract
Near-wellbore fracture characterization methodologies help identify fluid entry
points and flow rates in order to assess the effectiveness of hydraulic fracturing
treatments, optimize the completion plan or identify the need for refracturing. Temperature
transient analysis is one such method, and previous work has shown that it allows
the estimation of the flow rate coming out of fractures.
In this work, a machine learning approach to fracture flow rate estimation using
temperature data is presented. The problem was formulated as a time series regression
problem where the temperature data is used as the input of a reverse model that estimates
flow rate. The Lasso Regression, Random Forest and Kernel Ridge Regression algorithms
were tested in the study and three case studies are presented with varying levels of
complexity.
The Kernel Ridge Regression approach was found to outperform the other two algorithms
in the most complex case due to the specific formulation of the features as well as the
mathematical similarity of the learning algorithm with the analytical solution of the
physical problem.
Acknowledgments
This work would not have been possible without the support of Professor Roland Horne,
who opened the door to Stanford for me, guided me through the process of doing this research,
and always asked, and made me ask, the relevant questions. I would also like to acknowledge
the financial support of CONACYT, the Mexican Council for Science and Technology,
SENER, Mexico’s Energy Ministry as well as the Department of Energy Resources
Engineering at Stanford University.
Many thanks to my colleagues at the ERE Department, especially the members of the
SUPRI-D research group for their very useful questions and suggestions. I am also
extremely grateful to Dr. Timur Garipov for his generous help in doing reservoir
simulation. His explanations and suggestions enabled me to get through the multiple
obstacles encountered in the process.
Special thanks to my friends EJ, Jason, Srininkaeth, Greg and Jeff with whom I spent
innumerable hours during my time at Stanford brewing coffee and discussing the important
issues of both research and life. Meeting you made me a better student, researcher and
person.
Finally, I would like to thank my sister and my parents. They always supported and
encouraged me to do my best and ultimately taught me to be the person I am.
Contents
Abstract
Acknowledgments
Contents
List of Tables
List of Figures
1. Introduction
   1.1. Temperature transient analysis
2. Problem Statement
3. Methodology
   3.1. Forward model
   3.2. Inverse model
      3.2.1. Time series Cross-Validation
   3.3. Machine Learning algorithms
      3.3.1. Linear regression methods
      3.3.2. Basis expansions
      3.3.3. Regularization: The Lasso and Ridge Regression
      3.3.4. Kernels
      3.3.5. Tree based methods
      3.3.6. Random Forest
   3.4. Machine learning framework
4. Modeling Results
   4.1. Case 1. Injection at constant pressure
      4.1.1. Features set
      4.1.2. Model performance
   4.2. Case 2. Injection with variable flow rate
      4.2.1. Model Performance
   4.3. Case 3. Injection with variable flow rate - lookback approach
      4.3.1. Case 3A - Convolutional features
      4.3.2. Case 3B - Kernel Ridge Regression - Temperature lags
      4.3.3. Effect of temperature lags on model error
      4.3.4. Effect of number of training samples on model error
5. Conclusions
   5.1. Future work
References
List of Tables
Table 1. Feature set for Case 1.
Table 2. List of features used in Case 3B.
List of Figures
Figure 2-1. Schematic of a fractured wellbore.
Figure 3-1. Diagram of the modeling process.
Figure 3-2. Simulation grid used for the forward model.
Figure 3-3. Schematic of the K-Fold Cross-Validation method used for time series.
Figure 3-4. A Regression Tree can be visualized as a tree of binary partitions.
Figure 4-1. Out-of-sample prediction for Case 1 using the Lasso and Random Forest.
Figure 4-2. Performance metrics for the Lasso and Random Forest for Case 1.
Figure 4-3. Fracture injection profile for Case 2.
Figure 4-4. Features and actual fracture flow rate for Case 2.
Figure 4-5. Injection profile for Case 3.
Figure 4-6. Model results for the Lasso using the convolutional features set.
Figure 4-7. Model comparison for Case 3B.
Figure 4-8. Effect of different numbers of time lags on the Kernel Ridge Regression model.
1. Introduction
Understanding fractures is an important aspect of oil and gas reservoir evaluation. Fractures
have significant effects on reservoir fluid flow and influence the productivity of wells,
as they can increase the reservoir permeability and porosity as well as the
reservoir's anisotropy and heterogeneity (Nelson, 1985), and they directly influence the
reservoir's development strategy. The growing popularity of hydraulic fracturing has also
increased the need for advanced fracture characterization techniques that allow the industry
to optimize field development and well economics. Current fracture characterization
techniques can be divided into three main groups: indirect, direct far-field and direct near-
wellbore techniques (Cipolla and Wright, 2000).
Indirect methodologies are the most widely used. These techniques make assumptions about
physical processes and try to match a reservoir or fracture model to the observed data. Well
testing, production data analysis and net pressure analysis are examples of these indirect
methodologies that allow the estimation of fracture dimensions, effective fracture length
and fracture conductivity.
Direct far-field techniques are usually done during hydraulic fracturing stimulation and
rely on instruments placed on the surface or in offset wellbores. This group of techniques
includes tilt fracture mapping and microseismic mapping and usually has field-wide
coverage but has the downside of losing resolution with increasing distance from a fracture.
Direct near-wellbore methodologies measure physical properties in the near-wellbore
region and are usually done during or after fracture treatment. These methodologies are
used to identify fluid and proppant entry points as well as getting a profile of production
entering the wellbore in order to assess the effectiveness of hydraulic fracturing treatments,
optimize the completion plan or identify the need for refracturing (Barree, Fisher and
Woodroof, 2002). Among these techniques are radioactive tracing, production logging,
borehole image logging, downhole video and temperature logging.
1.1. Temperature transient analysis
Traditional temperature logging is done by using wireline tools to obtain multiple
temperature profiles of the wellbore at different times in order to create a temperature vs.
time profile. This makes it possible to determine where fluid has entered the formation by
observing a delayed recovery to geothermal temperature over a time interval (Sierra et al.,
2008). Distributed Temperature Sensing (DTS) is another option for measuring the
temperature profile of the wellbore. With DTS technology, it is possible to get a dynamic
picture of the temperature profile in the full wellbore as opposed to only snapshots like in
traditional temperature logging.
Distributed Temperature Sensing technology relies on an optical fiber deployed directly in
the well flow path or behind the casing. A laser is used to send light pulses through the
fiber and a detection system recovers the backscattered light, whose intensity depends on
the temperature of the fiber surroundings. DTS allows a sampling resolution of one meter
down to half a meter, and the sampling frequency can range from a few seconds to minutes
(Sierra et al., 2008). Hence, DTS technology can also provide real-time, continuous
measurement with no movement of the sensor and little or no impact on well operations,
resulting in a reduction in operational downtime (Ouyang et al., 2004).
Because temperature behavior in a reservoir is controlled mainly by convection, the rate of
change of temperature in the wellbore is highly affected by the flow rate coming into the
well (Ribeiro and Horne, 2014). This makes temperature data valuable for estimating the
flow rate profile of the well, which in turn is useful for applications such as identifying
well conditions before treatment, assessing the effectiveness of diverters and fracture
treatment, quantifying damage in the well and optimizing injected fluid volumes
(Glasbergen et al., 2009).
Previous work has used temperature data for flow profiling through the development of
forward physical models. Ouyang et al. (2004) developed a thermal model for single-phase
and multiphase fluid flow, explored the dependence of temperature on fluid properties and
observed that a production profile could be determined using DTS data through history
matching. Duru and Horne (2010) modeled temperature transients in a reservoir using a
semianalytical method and provided a method for porosity and permeability estimation
using temperature data.
Ribeiro and Horne (2013, 2014) explored the temperature and pressure responses during
and after hydraulic fracturing by using a numerical model, studied the limitations of
information carried by temperature and showed that the local characteristics of temperature
can be used to determine the number of fractures and interaction between them. Li and Zhu
(2016) presented a thermal and flow simulation model that allowed for fracture propagation
and examined the effect of DTS fiber location on the simulated temperature data.
Forward numerical models are currently more popular for temperature transient analysis.
Nonetheless, some effort has also been made to use reverse statistical models for temperature
transient analysis. Duru and Horne (2011) developed a Bayesian inversion method
to deconvolve pressure and temperature data to obtain well flow rates. Furthermore, Tian
and Horne (2015) applied feature-based machine learning to recover pressure from
synthetic temperature data. Nevertheless, the use of machine learning for temperature
transient analysis is still in its early stages and has not been thoroughly explored for flow
rate profiling in the presence of fractures. Machine learning models have the advantage of
being less computationally intensive than full-physics numerical models, and in a DTS
data-rich environment they are natural candidates for fracture characterization.
2. Problem Statement
The main objective of this study was to estimate the flow rate coming out of an individual
fracture when the well is subject to injection or shut in. The estimation was made by using
only temperature time series corresponding to the intersection of a fracture with the
wellbore, as shown in Figure 2-1. The time series were then used to train multiple machine
learning algorithms, which were then evaluated on unseen or test data.
To generate the temperature time series the Automatic Differentiation General Purpose
Research Simulator (AD-GPRS) simulator was used (Rin et al., 2017). The use of a
simulator allowed for precise control of the reservoir parameters and well flow rates,
critical inputs for the training of the machine learning algorithms.
Figure 2-1. Schematic of a fractured wellbore (left). Injected fluid causes a temperature response
at the intersection of the fracture and the wellbore (right).
Temperature in a reservoir is controlled by convection. For a given initial condition 𝑇𝑡 and
a fluid motion vector field 𝑤(𝑥), the general solution is represented by a convolution,
denoted in Equation 1 (de Bézenac, Pajot and Gallinari, 2018). Therefore, the main goal of
the study was to find a machine learning model that successfully deconvolves the
temperature history to recover the fracture flow rate.
T(t, x) = \sum_{y \in \Omega} K(x - w(x), y)\, T_t(y)

K(x - w, y) = \frac{1}{4\pi D \Delta t} \exp\left( -\frac{\|(x - w) - y\|^2}{4 D \Delta t} \right)    (1)
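As a rough illustration of this convolutional view, the sketch below applies a discretized Gaussian kernel to a one-dimensional temperature profile; the diffusivity D, time step Δt and uniform fluid motion w are hypothetical values, not parameters of the study.

```python
import numpy as np

# Hypothetical parameters for a 1-D version of Equation 1:
D, dt, w = 1e-2, 1.0, 0.5           # diffusivity, time step, fluid motion
x = np.linspace(0.0, 10.0, 101)     # spatial grid (the domain Omega)
T0 = np.exp(-((x - 5.0) ** 2))      # initial temperature profile T_t(y)

# Gaussian kernel K(x - w, y): the profile is advected by w and diffused.
K = np.exp(-(((x[:, None] - w) - x[None, :]) ** 2) / (4.0 * D * dt))
K /= K.sum(axis=1, keepdims=True)   # normalize rows as a discrete quadrature

T1 = K @ T0                         # next temperature profile T(t, x)
```

The peak of T1 sits near x = 5.5, i.e. the initial profile shifted by w and slightly smoothed, which is exactly the behavior a successful inverse model has to undo in order to recover flow rate.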
To estimate the flow rate coming out of an individual fracture, the spatial location of the
intersection of that fracture with the wellbore must be known. Previous knowledge of that
location was assumed in the modeling done for this
study. This was deemed a reasonable assumption, as identifying the location of a fracture
can be done manually by looking at the temperature transient data in the form of waterfall
plots (Ribeiro and Horne, 2014). This operation only needs to be done once, as opposed to
the flow rate estimation problem, which by its continuous nature is a better candidate for
automation. It can also be noted that in the case of hydraulically fractured reservoirs, the
intersection of the fractures with the wellbore is specified by the perforations made in the
casing.
Nonetheless, an exploration of the problem of detecting fracture location from temperature
data was done for this study and is presented in Appendix A.
3. Methodology
The modeling process was divided in two main components: a forward physics-based
model and an inverse statistical or machine learning model. The forward model was used
to generate the flow rate and temperature history, which was then used to build the inverse
model that recovered the flow rate using temperature data as an input. A schematic view
of the process is shown in Figure 3-1.
Figure 3-1. Diagram of the modeling process
3.1. Forward model
For the forward model a two-dimensional reservoir was used. This simplified reservoir
model provided enough flexibility to capture the important physical properties of
subsurface flow and fractures while keeping things simple enough to test the algorithm
capabilities and limitations. The simulation was run using the AD-GPRS flow simulator
(Rin et al., 2017).
For the spatial discretization of the reservoir, an unstructured grid was employed. The
simulator’s default automatic time stepping was used with a minimum time step of 0.024
hours and a maximum time step of 0.96 hours. The results from the simulation were then
preprocessed using the Python programming language so they could be used for the
training of the inverse model.
The characteristics of the reservoir were as follows:
Dimensions: 400 x 200 m
Well length: 200 m
Permeability: 0.5 md
Porosity: 0.15
Reservoir temperature: 368 K
Rock thermal conductivity: 124.5 kJ/(m day K)
Rock heat capacity: 0.9211 kJ/(kg K)
Rock density: 2250 kg/m3
Fracture number: 3
Fracture half length: [10, 30, 50] m
Spacing between fractures: 50 m
Injected fluid: water at 25 °C
No flow boundary conditions
Figure 3-2. Simulation grid used for the forward model with the fractures shown in the zoomed-in
area.
3.2. Inverse model
Flow rate estimation can be expressed as a regression problem of the
form:
q = f(X, α̂)    (2)

The inverse model was then defined as the regression model f(X, α̂) that takes X as an
input and is parametrized by α̂. In the present study, X was a matrix containing
encoded temperature and time data, and the output of f(X, α̂) was the flow rate estimate at
a given time. The parameters α̂ were case dependent, as different machine learning
methods1 require distinct parameters. Additional detail on each method's parameters is
given in Section 3.3.
The process of building the matrix 𝑋 is commonly known as feature encoding. In the
present case, each column of X represented a feature and each row a data point. The
features in this study included, but were not limited to, the absolute temperature and its
spatial and temporal derivatives.
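As a sketch of this encoding step, the snippet below builds such a matrix from a hypothetical temperature record; the values, the sensor spacing dx, and the up-hole series T_up are illustrative only, not data from the study.

```python
import numpy as np

# Hypothetical DTS samples at the fracture/wellbore intersection:
t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])               # time, hours
T = np.array([368.0, 360.0, 349.0, 341.0, 336.0])     # temperature, K
T_up = np.array([368.0, 362.0, 352.0, 344.0, 339.0])  # one cell up-hole
dx = 1.0                                              # assumed spacing, m

dTdt = np.gradient(T, t)        # temporal derivative dT/dt
dTdx = (T_up - T) / dx          # one-sided spatial derivative dT/dx

# Each row of X is a data point, each column a feature.
X = np.column_stack([T, dTdx, dTdt])
```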
Once the matrix X was defined, the model f(X, α̂) was fitted or trained. The standard way
of training a machine learning model is to partition the dataset X into three different
subsets: training, validation and test sets. The training data are used to fit the model using
a set of parameters α̂, followed by an evaluation of the model's performance using the
validation set. The process is repeated until an optimal set of parameters α̂ is found for the
validation set. Finally, the test set is used to obtain a true and unbiased estimate of the
model's performance, as those data were not 'seen' by the model during training or
parameter optimization.
However, in this study the dataset was limited in size because of the computational cost of
running the forward model. Therefore, splitting the data into three sets would severely limit
the ability of a model to capture the reservoir behavior as diversity in the data would
decrease and some observed characteristic of the reservoir could be omitted from the
training data. To solve this problem, Cross-Validation was applied, where the equivalent
of the training and validation dataset are fully used for both the fitting process and
parameter optimization.
3.2.1. Time series Cross-Validation
The most popular Cross-Validation methodology is K-Fold Cross-Validation (Hastie,
Tibshirani and Friedman, 2009). This methodology splits the dataset into K roughly equal-sized
parts that are then used to recursively train and validate a model.

1 In this study, a machine learning method or algorithm is the regression method. Examples of this are the Lasso, Ridge
Regression and Random Forests. A model f(X, α̂) is an instance of a method that is parametrized by α̂.

The process of K-Fold Cross-Validation goes as follows:
1. Choose the kth part of the dataset as the validation set.
2. Use the remaining K−1 parts to train the model and compute the error on the validation
set.
3. Repeat the process for k = 1, 2, …, K.
4. Combine the K estimates of error into a single one.
Regular K-Fold Cross-Validation relies on the assumption of independent and identically
distributed data points, which is why any kth subset of the dataset can be used for the
validation of a model. However, temperature time series data clearly violate this
assumption, as they have a high temporal correlation (Bergmeir and Benítez, 2012). This comes
from the nature of the physical process and can be seen in the case of sharp changes in flow
rate, which cause less sharp changes in temperature because of diffusion. This means
that the temperature T_t cannot take an arbitrary value, but must be close to T_{t-1}.
Using regular K-Fold Cross Validation would violate the independence assumption, as the
model could be fitted with data corresponding to a time posterior to the data in the
validation set. This would imply that the model has already seen ‘the future’, which
implicitly includes information from ‘the past’, where the validation set was defined.
To circumvent this problem, a different flavor of K-Fold Cross-Validation was used
to test the regression models' performance. The method kept the temporal order of the
dataset by extending the training set in time. This meant using successive training sets,
each a superset of those that come before it, as shown in Figure 3-3.
Figure 3-3. Schematic of the K-Fold Cross-Validation method used for time series
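This expanding-window scheme corresponds to scikit-learn's `TimeSeriesSplit`, sketched below on a toy series (the data are illustrative only):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy time-ordered dataset: 10 samples, one feature.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.arange(10, dtype=float)

# Each successive training set is a superset of the previous one, and
# validation indices always lie in the 'future' of the training indices.
splits = list(TimeSeriesSplit(n_splits=4).split(X))
for train_idx, val_idx in splits:
    assert train_idx.max() < val_idx.min()   # no peeking into the future
```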
3.3. Machine Learning algorithms
Three different machine learning algorithms were investigated in this study: Lasso
Regression, Random Forest and Kernel Ridge Regression. This section covers some of the
basic theory on the algorithms and the assumptions under which they work. There were
two main reasons why these algorithms were selected:

Ability to model nonlinear behavior. That is, the relation between temperature
and fracture flow rate is not constrained by the algorithm to be linear.

Robustness against irrelevant features. Feature encoding is usually an iterative
process, so the algorithms need to be able to pick the relevant features and drop
the irrelevant ones without compromising performance.
For this study, simple machine learning algorithms were preferred over complex ones. In
this way, the problem’s physics dictated the complexity necessary for the modeling,
avoiding the risk of introducing additional and potentially unnecessary algorithmic
complexities.
3.3.1. Linear regression methods
The Lasso Regression and Kernel Ridge Regression algorithms fall under the category of
generalized linear models. This may sound contradictory with the requirements mentioned
in Section 3.3; however, linear models can capture nonlinear behavior through feature
transformations.
A linear regression model is one where the inputs X are linearly related to the output
variable, that is, a model of the form:

f(X) = \alpha_0 + \sum_{j=1}^{p} \alpha_j X_j    (3)
where the parameters α are chosen in such a way that an error or loss function is minimized.
Typically, the loss function used is least squares, where the residual sum of squares (RSS)
is minimized:
RSS = \sum_{i=1}^{N} (y_i - f(x_i))^2 = \sum_{i=1}^{N} \left( y_i - \alpha_0 - \sum_{j=1}^{p} \alpha_j x_{i,j} \right)^2    (4)
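For a small worked example, Equation 4 can be minimized in closed form with numpy's least-squares solver; the toy data below are illustrative, and the bias α0 enters as a column of ones in the design matrix.

```python
import numpy as np

# Noiseless toy data following y = 1 + 2x exactly.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix with a column of ones for the bias term alpha_0.
X = np.column_stack([np.ones_like(x), x])

# Least-squares solution minimizing the residual sum of squares.
alpha = np.linalg.lstsq(X, y, rcond=None)[0]
# alpha recovers [1.0, 2.0]
```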
Another way of posing the regression problem is as finding the conditional expected value
of a variable y given inputs X. For this, Equation 3 can be rewritten as shown in Equation
5, where X now includes a column of ones corresponding to the bias term α0. Defining the
regression output as a conditional expectation makes a difference when specifying the
features, as discussed in Section 4.1.1.

E(y|X) = f(X) = \sum_{j=0}^{p} X_j \alpha_j    (5)
3.3.2. Basis expansions
In problems like convection, where the physics dictate a nonlinear relation between flow
rate and temperature, some modification to the model stated in Equation 3 is needed in
order to use a linear regression method. One way of introducing nonlinearity in such a
model is to use a so-called basis expansion. If h_m(X) denotes the mth transformation
of X, then the model in Equation 5 reads (Hastie, Tibshirani and Friedman,
2009):

f(X) = \sum_{m=0}^{M} \alpha_m h_m(X)    (6)
The transformations h_m(X) can be defined as products X_j X_k or higher order polynomial
terms of X. This specific form of h_m(X) is called a polynomial expansion and was applied
in this study. From Equation 6, it can be noted that the coefficients α now correspond to
the terms h_m(X) and no longer to X directly. This is how, even though the model is linear
in its coefficients, the relation between the output f(X) and X is no longer linear.
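A degree-2 polynomial expansion of this kind can be generated with scikit-learn's `PolynomialFeatures`; the two-feature input below is purely illustrative.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One data point with two features, X1 = 2 and X2 = 3.
X = np.array([[2.0, 3.0]])

# Degree-2 expansion: bias, X1, X2, X1^2, X1*X2, X2^2.
H = PolynomialFeatures(degree=2).fit_transform(X)
# H = [[1., 2., 3., 4., 6., 9.]]
```

A linear model fitted on H is then nonlinear in the original inputs X1 and X2.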
3.3.3. Regularization: The Lasso and Ridge Regression
When a polynomial expansion is done, the number of variables or features that the model
is fitting increases exponentially with the degree of the polynomial. However, there is no
reason for all of those newly introduced variables to be important for the model. Therefore,
it is necessary to have a mechanism to reduce the impact of the coefficients 𝛼𝑚
corresponding to those irrelevant features. The process of doing so is called regularization
and the idea behind it is to set some of the coefficients 𝛼 to zero or to very small values.
Regularization is not limited to polynomial expansion but can also be applied to any other
type of feature expansion.
Regularization works by introducing a penalty in the norm of the parameters 𝛼. Then, the
least squares loss function reads:
RSS = \sum_{i=1}^{N} \left( y_i - \alpha_0 - \sum_{j=1}^{p} \alpha_j x_{i,j} \right)^2 + \lambda \sum_{j=1}^{p} \|\alpha_j\|    (7)
The Lasso and Ridge Regression algorithms share the loss function stated in Equation 7,
with the only difference being the type of norm used for the penalty.
For the Lasso the L1 norm is used, so the loss function is:
RSS_{Lasso} = \sum_{i=1}^{N} \left( y_i - \alpha_0 - \sum_{j=1}^{p} \alpha_j x_{i,j} \right)^2 + \lambda \sum_{j=1}^{p} |\alpha_j|    (8)
In contrast, for Ridge Regression the L2 norm is used so the loss function is:
RSS_{Ridge} = \sum_{i=1}^{N} \left( y_i - \alpha_0 - \sum_{j=1}^{p} \alpha_j x_{i,j} \right)^2 + \lambda \sum_{j=1}^{p} \alpha_j^2    (9)
3.3.4. Kernels
A more general way of doing a feature basis expansion is through kernels. The underlying
idea, as in a polynomial expansion, is to have a model that is no longer dependent on 𝑋 but
on a different set of quantities resulting from a transformation of 𝑋. If the transformation
of X, also called a feature mapping, is denoted as φ(x), then the corresponding kernel can
be defined as:

K(x, z) = \phi(x)^T \phi(z)    (10)
If Equation 5 is written as 𝑓(𝑋) = 𝛼𝑇𝑋, then when a kernel is applied, the model is:
𝑓(𝑋𝑗) = 𝐾(𝛼, 𝑋𝑗) (11)
The effect of the kernel is then that the learning of the model takes place in the kernel space
and not in the feature space anymore. One of the benefits of using kernels is that different
feature mappings 𝜙(𝑥) can be used. Two of the most commonly used kernels are the
Polynomial Kernel and the Radial Basis Function (RBF) Kernel shown in Equations 12
and 13 respectively.
K(x, z) = (x^T z + c)^d    (12)

K(x, z) = \exp\left( -\frac{\|x - z\|^2}{2\sigma^2} \right)    (13)
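Both kernels are available as scikit-learn pairwise functions; the vectors below are illustrative, and note that scikit-learn parametrizes the RBF kernel as exp(-gamma ||x - z||^2), so gamma = 1 / (2σ²).

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel

# Two illustrative input vectors.
x = np.array([[1.0, 0.0]])
z = np.array([[0.0, 1.0]])

# Equation 12 with gamma = 1, c = 1, d = 2: (x.z + 1)^2
poly = polynomial_kernel(x, z, degree=2, gamma=1.0, coef0=1.0)

# Equation 13 with sigma = 1, i.e. gamma = 0.5.
rbf = rbf_kernel(x, z, gamma=0.5)
# poly[0, 0] = 1.0 and rbf[0, 0] = exp(-1)
```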
3.3.5. Tree based methods
Regularization and feature basis expansions allow linear models to capture nonlinearities
as well as to reduce the impact of irrelevant features. In contrast, tree based methods
natively meet those two requirements by using a different regression paradigm. The basic
model for tree-based regression is the Regression Tree. This model is composed of
piecewise constant functions, which means that the space of features is partitioned into M
regions 𝑅1, 𝑅2, … , 𝑅𝑀 and the value of the function 𝑓(𝑋) is represented as a constant in
region M as shown in Equation 14 (Hastie, Tibshirani and Friedman, 2009):
𝑓(𝑋) = ∑ 𝑐𝑚𝐼{𝑋 ∈ 𝑅𝑚}
𝑀
𝑚=1
(14)
where I{X ∈ R_m} is the indicator function of X lying in region R_m.
Similarly to the linear case presented in Section 3.3.1, the chosen loss function is least
squares, which has the consequence of defining the value of 𝑐𝑚 as the average of the
response data 𝑦𝑖 in the region 𝑅𝑚 as shown in Equation 15.
𝑐𝑚 = 𝑎𝑣𝑒𝑟𝑎𝑔𝑒(𝑦𝑖|𝑥𝑖 ∈ 𝑅𝑚) (15)
To specify the regions R_m such that the residual sum of squares is minimized, a process of
binary partition of the data 𝑋 is followed. The idea is to choose a feature 𝑗 and a split point
𝑠 such that the two new regions 𝑅1 and 𝑅2 minimize the residual sum of squares within
them as shown in Equation 16.
\min_{j,s} \left[ \min_{c_1} \sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2} \sum_{x_i \in R_2(j,s)} (y_i - c_2)^2 \right]    (16)
This binary partitioning process can be also represented as a tree of splitting decisions of
the dataset, hence the name of Regression Tree.
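A minimal sketch of this partitioning with scikit-learn's `DecisionTreeRegressor` on toy step data (one split only):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy data: the response jumps from 1 to 5 between x = 1 and x = 2.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 1.0, 5.0, 5.0])

# A depth-1 tree makes a single binary split (j, s) and predicts the
# region average c_m of Equation 15 on each side.
tree = DecisionTreeRegressor(max_depth=1).fit(X, y)
pred = tree.predict([[0.5], [2.5]])
# pred = [1., 5.]
```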
3.3.6. Random Forest
Regression Trees suffer from high variance in their estimates; that is, the output can be
highly sensitive to small changes in the input. The problem can be addressed by model
averaging, where the outputs of multiple model realizations are averaged, resulting in an
estimation with reduced variance compared to that of a single model. An arrangement of
this kind where the models are Regression Trees is called a Random Forest. The model is
represented in Equation 17 (Hastie, Tibshirani and Friedman, 2009).
f(X) = \frac{1}{B} \sum_{b=1}^{B} T_b(x)    (17)
where T_b(x) represents an individual Regression Tree. From Equation 17 it can be seen
that the main parameter of a Random Forest is the number B of Regression Trees in the
ensemble, which is usually chosen through Cross-Validation.
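Equation 17 can be checked directly with scikit-learn's `RandomForestRegressor`: the ensemble prediction is the average over its trees (toy data; B = 50 is an illustrative choice).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy regression target: y = sin(x) on 200 random samples.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0 * np.pi, size=(200, 1))
y = np.sin(X[:, 0])

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# The forest prediction at a point is the mean of its B tree predictions.
x0 = np.array([[np.pi / 2.0]])
tree_preds = [t.predict(x0)[0] for t in forest.estimators_]
assert np.isclose(forest.predict(x0)[0], np.mean(tree_preds))
```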
Figure 3-4. A Regression Tree can be visualized as a tree of binary partitions. The top left figure
shows an arbitrary partitioning of the dataset that cannot be achieved by binary partition.
The top right shows a binary partition equivalent to the tree shown in the bottom left.
The bottom right shows the partition in the feature space. Image taken from (Hastie,
Tibshirani and Friedman, 2009).
3.4. Machine learning framework
The Scikit-Learn 0.19 (Pedregosa et al., 2012) Python implementations of the Lasso, Ridge
and Regression Tree were used in this study. This allowed for faster development and
testing of the models and their capabilities. The Scikit-Learn library is widely adopted by
the industry and the implementation is highly stable.
All model parameters were chosen by conducting a grid search in the parameter space and
choosing the optimal combination using a 5-Fold Cross-Validation process as described in
Section 3.2.1. Scikit-Learn also allowed this process to be run with off-the-shelf
implementations of the grid search and Cross-Validation procedures.
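A sketch of that workflow with those off-the-shelf components (the toy series and the parameter grid below are illustrative, not the values used in the study):

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Toy time series standing in for temperature features and flow rate.
t = np.linspace(0.0, 10.0, 100).reshape(-1, 1)
q = np.sin(t[:, 0]) + 0.05 * np.random.default_rng(0).normal(size=100)

# Grid search over Kernel Ridge parameters with time-ordered folds.
search = GridSearchCV(
    KernelRidge(kernel="rbf"),
    param_grid={"alpha": [0.01, 0.1, 1.0], "gamma": [0.1, 1.0]},
    cv=TimeSeriesSplit(n_splits=5),
)
search.fit(t, q)
```

Using `TimeSeriesSplit` as the `cv` argument is what keeps the parameter search from validating on data that precede the training window.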
4. Modeling Results
Three different scenarios were tested for flow rate estimation in this study. The cases are
presented in this report from simplest to most complex, from the point of view of both the
regression problem and the physical processes involved in generating the data. The order
in which the cases are presented matches the order in which they were developed, which
guides the reader through the evolution of the understanding of the problem that took place
during the study.
The three cases have three common characteristics, the first of which is that at the
beginning of each process the reservoir was assumed to be untouched, meaning no
previous production or injection had taken place. Therefore, temperature was constant
throughout the reservoir at the beginning of each test.
The second shared characteristic is that all the cases are injection processes. The decision
to use injection instead of production was made to make the cases resemble the state at
which a recently hydraulically fractured well would be at the end of the fracturing process.
The third commonality between the cases is that the flow rate was estimated for a single
fracture at a time. As mentioned in Section 2, the location of the fracture was assumed
known before the modeling.
4.1. Case 1. Injection at constant pressure
For this case, the forward model simulated the well subject to injection at constant
bottomhole pressure (BHP), the simplest injection process besides constant flow rate. The
BHP target was set to 700 bar or 10,152 psi and the well was flowed for 3 days.
The simulation time step was set automatically, resulting in an average data-point spacing
of 43 minutes, with a higher time resolution of about 2 minutes at the beginning of the
process. Two machine learning algorithms were selected for the regression in this case:
the Lasso Regression and the Random Forest. These algorithms were selected because they
are very different kinds of models, as described in Section 3.3.
4.1.1. Features set
For this case, a feature set composed of the absolute temperature and its spatial and
temporal derivatives was used, as shown in Table 1. The inclusion of the derivatives was
motivated by the fact that it is the rate of change of temperature that is relevant for
identifying flow rate changes, as described in (Ribeiro and Horne, 2014).
Table 1. Feature set for Case 1.

Feature    Description                                      Data Type
T          Absolute temperature at the fracture-wellbore    Numeric (floating point)
           intersection at time t
dT/dx      Temperature spatial derivative                   Numeric (floating point)
dT/dt      Temperature time derivative                      Numeric (floating point)
The proposed feature set could be considered a naïve approach, as very little preprocessing
was done to the temperature time series before feeding the data to the algorithm. Note that
even though the process is time-dependent, time itself was not used as an input; it enters
only implicitly through the time derivative. Equation 18 shows the regression output as the
expected flow rate q at time t conditional on the values of {T, dT/dx, dT/dt}. Were time to
be included as a feature in X, the regression output would imply that for a specific value
of temperature and its derivatives, flow rate would also depend on time by itself, not only
on the rate of change of temperature.

f(X_t) = E( q_t | {T, dT/dx, dT/dt}_t )        (18)
Both the spatial and temporal derivatives were computed numerically using a two-point
formula. Because no noise was added, this method provided reliable values for the
derivatives. However, in the presence of noise a more robust method would need to be
implemented, such as a linear regression over a modest number of adjacent points.
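As an illustration, the two-point derivative and the suggested noise-robust alternative might be sketched as follows; T and t are hypothetical temperature and time arrays, and the window half-width is an assumed choice:

```python
import numpy as np

def two_point_derivative(T, t):
    """Forward-difference dT/dt; reliable only when the data are noise-free."""
    d = np.diff(T) / np.diff(t)
    return np.append(d, d[-1])  # repeat the last value to match input length

def windowed_slope(T, t, half_width=2):
    """Noise-robust dT/dt: least-squares slope over 2*half_width+1 adjacent points."""
    out = np.empty(len(T))
    for i in range(len(T)):
        lo, hi = max(0, i - half_width), min(len(T), i + half_width + 1)
        out[i] = np.polyfit(t[lo:hi], T[lo:hi], 1)[0]  # slope of the local linear fit
    return out
```

On noise-free data both give the same answer; under noise, the local least-squares slope averages out fluctuations that a two-point difference would amplify.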
In the case of the Lasso Regression, a third-order polynomial basis expansion was applied,
as described in Section 3.3.2. This effectively expanded the three original features
described in Table 1 to 19 features that include the squared and cubic variables as well as
all the possible interactions between them.
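This expansion can be reproduced with scikit-learn's PolynomialFeatures; X below is a synthetic stand-in for the three columns of Table 1:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.random.rand(10, 3)  # stand-in for [T, dT/dx, dT/dt]
# All monomials of total degree 1-3 in three variables: 19 features
# once the constant term is excluded.
poly = PolynomialFeatures(degree=3, include_bias=False)
X_expanded = poly.fit_transform(X)
```

The count follows from combinatorics: there are C(3+3, 3) = 20 monomials of degree at most three in three variables, and dropping the constant leaves 19.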
4.1.2. Model performance
Figure 4-1 shows the out-of-sample prediction results for the Lasso Regression and Random
Forest, as well as the original flow rate data. An out-of-sample prediction for the full
dataset was obtained by saving and concatenating the test estimations from the
Cross-Validation process, as described in Section 3.2.1.
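This fold-concatenation procedure is what scikit-learn packages as cross_val_predict; a minimal sketch with synthetic stand-in data:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, 1.0, -1.0]) + rng.normal(scale=0.05, size=100)

# Each entry of y_oos is predicted by a model fit on the other four folds,
# so concatenating the folds yields a full out-of-sample estimate.
y_oos = cross_val_predict(Lasso(alpha=1e-3, max_iter=10000), X, y, cv=5)
```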
It can be noted that the prediction from the Lasso is smoother than that of the Random
Forest, which shows oscillations around the initial part of the data. This is not unexpected
given the functional forms of the two models. Because the Lasso is defined as a continuous
function, as stated in Equations 3 and 8, the resulting estimation is smoother. The Random
Forest, in contrast, is an ensemble of piecewise-constant functions (Equations 14 and 17),
which makes the regression output less smooth.
Figure 4-2 shows two performance metrics for the Lasso Regression and Random Forest
for Case 1. The Lasso achieved a mean squared error 28% lower than the Random Forest,
and 68% less error when measured in terms of mean absolute error.
Figure 4-1. Out of Sample prediction for Case 1 using the Lasso and Random Forest.
Figure 4-2. Performance metrics for the Lasso and Random Forest for Case 1. The units are
rm3/day for the Mean Absolute Error and (rm3/day)2 for the Mean Squared Error.
4.2. Case 2. Injection with variable flow rate
For Case 2 a scenario with variable flow rate was tested. The injection profile is shown in
Figure 4-3 and is composed by four days of increasing stepwise injection followed by a
well shut-in for one day. For this case the same set of features as in Case 1 was tested
(Table 1). In this case only the Lasso Regression method was tested, as the Random Forest
proved to be noisy even in simple cases as shown in Section 4.2. Similarly to Case 1, a
third-order polynomial basis expansion was applied to allow the model to capture
nonlinearities thus expanding the feature set to 19 variables.
4.2.1. Model Performance
In Case 2, the first three days were used as training/validation sets and the last two days
were reserved for testing. The regularization parameter for the Lasso method was chosen
as the best one from a grid search, where each candidate was cross-validated as described
in Section 3.2.1.
Figure 4-3 shows the model performance as well as the actual flow rates for Case 2. It is
evident that the model did not capture the reservoir’s behavior even when the optimal
parameters were found. While the test estimate captures the direction of flow rate change,
with lower flow rate on day three and higher flow rate on day four, the estimate is nowhere
near the actual values.
Figure 4-3. Fracture injection profile for Case 2. The training set is composed of days 0-3
while the test set is the remaining two days. The test estimation is shown in blue.
The lack of generalization by the Lasso method in Case 2 can be explained by looking at
the flow rate data together with the corresponding temperature and temperature derivatives
at a given point in time. In Figure 4-4 it can be seen that for days one and five, the values
of flow rate are very similar but their corresponding values of temperature and derivatives
differ widely.
This nonuniqueness of the feature-output pairs makes it impossible for a machine learning
algorithm to generalize the reservoir behavior. For the model to capture the behavior, a
specific combination of feature values must correspond to a unique value of flow rate.
Therefore, Case 2 proved that the features proposed in Table 1 were not sufficient for
capturing the full reservoir behavior: temperature and its rate of change alone cannot
determine flow rate. Intuitively, this is consistent with the general solution of the
convection equation (Equation 1), which expresses temperature as a convolution, so that
temperature depends not only on the current value of flow rate but also on its previous
values.
Figure 4-4. Features and actual fracture flow rate for Case 2. The blue lines show where similar
values of flow rate have widely different values in the features.
4.3. Case 3. Injection with variable flow rate - lookback approach
The results of Case 2 made evident that the information of the current state of the system
is not sufficient for a model to estimate flow rate. Information on the past behavior of the
system is necessary as well. Two approaches were taken for this problem in this study,
using a set of physics inspired features and applying a simple feature lagging strategy.
In order to introduce more variability in the data, the injection process was extended to six
days with shorter periods of constant injection followed by shut in as showed in Figure 4-5.
The training/validation data was set to 70% of the data, corresponding to 4.25 days, with
the last 1.75 days used as test data.
Figure 4-5. Injection profile for Case 3
4.3.1. Case 3A - Convolutional features
A feature formulation introduced in (Liu and Horne, 2013) makes use of the results of a
physical model to represent pressure as a convolution of flow rate change events. Tian and
Horne (2015) made use of that feature formulation to model pressure using temperature
data and vice versa with reasonable success. Inspired in those applications, in this study, a
modification of this ‘convolutional features’ formulation was applied for the flow rate
estimation problem. The features definition is shown in Equation 19 for the features at time
𝑖. These set of features implicitly capture information from the past as all the previous
values of temperature are included in the summation for a time 𝑡 = 𝑖.
x_i = [ Σ_{j=1}^{i} (T_j - T_{j-1}),
        Σ_{j=1}^{i} (T_j - T_{j-1}) log(t_i - t_{j-1}),
        Σ_{j=1}^{i} (T_j - T_{j-1}) (t_i - t_{j-1}),
        Σ_{j=1}^{i} (T_j - T_{j-1}) / (t_i - t_{j-1}) ]        (19)
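A direct implementation of Equation 19 might look as follows; T and t are hypothetical temperature and time arrays, and row 0 is left at zero since no increments precede it:

```python
import numpy as np

def convolutional_features(T, t):
    """Features of Equation 19: sums of past temperature increments,
    weighted by functions of the elapsed time t_i - t_{j-1}."""
    n = len(T)
    X = np.zeros((n, 4))
    for i in range(1, n):
        dT = T[1:i + 1] - T[:i]   # increments T_j - T_{j-1}, j = 1..i
        lag = t[i] - t[:i]        # elapsed times t_i - t_{j-1}
        X[i] = [dT.sum(),
                (dT * np.log(lag)).sum(),
                (dT * lag).sum(),
                (dT / lag).sum()]
    return X
```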
Figure 4-6 shows the resulting fit for two Lasso models using the convolutional feature
set defined in Equation 19. The green line is the result of the Lasso without a polynomial
expansion, and the purple line shows the results for the Lasso with a third-order polynomial
basis expansion as defined in Section 3.3.2. From Figure 4-6 it is clear that both approaches
fail to capture the reservoir behavior in both the training and test data.
Figure 4-6. Model results for the Lasso using the convolutional features set. The
training/validation set covers the first 4.25 days.
4.3.2. Case 3B - Kernel Ridge Regression - Temperature lags
As shown in Section 3.3.3, Ridge Regression controls model complexity through
regularization, shrinking the coefficients of uninformative features. This opens the door
for a different approach to introducing past information into the model: adding lagged
values of temperature as new features. While the Lasso method additionally performs
explicit feature selection, Ridge Regression can be implemented in combination with a
Kernel, allowing for a more flexible basis expansion (Section 3.3.4).
Equation 1 states that the solution to the convection problem is expressed in terms of
exponential or Gaussian functions. This makes the Gaussian or RBF Kernel an ideal
candidate for use in combination with Ridge Regression, as the model effectively learns in
a space defined by exponential functions.
For Case 3B, a Kernel Ridge Regression model was implemented using the Gaussian
Kernel, and an expanded set of features was introduced. Table 2 lists the features; note
that in addition to five temperature lags, the temperature logarithmic time derivative and
the value of Δt were included as features. The inclusion of Δt instead of raw time, measured
from the beginning of each of the training, validation and test datasets, solved the issue of
the conditional expectation of flow rate given time described in Section 4.1.1.
Two additional Lasso models were also fitted to the data using the features listed in Table
2: one with a third-order polynomial basis expansion and one without.
Table 2. List of features used in Case 3B

Feature                               Description                                  Data Type
T_i                                   Absolute temperature at the fracture-        Numeric (floating point)
                                      wellbore intersection at time t_i
(dT/dx)_i                             Temperature spatial derivative               Numeric (floating point)
(dT/dt)_i                             Temperature time derivative                  Numeric (floating point)
(dT/dlog(t))_i                        Temperature log-time derivative              Numeric (floating point)
Δt_i                                  Time delta from the beginning of the data    Numeric (floating point)
{T_(i-1), T_(i-2), T_(i-3),           Previous temperature values (lags)           Numeric (floating point)
 T_(i-4), T_(i-5)}
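A minimal sketch of this setup under assumed synthetic data: five lagged temperature columns appended to the current temperature and fed to a Gaussian-kernel (RBF) Kernel Ridge Regression. The alpha and gamma values are illustrative, not the study's tuned ones:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(2)
T = np.cumsum(rng.normal(size=300))  # stand-in temperature series
q = np.diff(T, prepend=T[0])         # stand-in flow-rate target

n_lags = 5
idx = np.arange(n_lags, len(T))
# Feature columns: current temperature plus its five previous values.
X = np.column_stack([T[idx]] + [T[idx - k] for k in range(1, n_lags + 1)])
y = q[idx]

model = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.1)
model.fit(X, y)
pred = model.predict(X)
```

The remaining columns of Table 2 (derivatives and Δt) would be stacked into X in the same way.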
Figure 4-7 shows the performance of the three models for both the training/validation and
test sets. The Lasso with a polynomial basis expansion and the Kernel Ridge Regression fit
the training data similarly. However, on the test set it is clear that the Kernel Ridge
Regression model captures the behavior better than the Lasso with polynomial expansion.
The Lasso without the polynomial expansion fails to capture the flow rate behavior in both
the training and test sets.
Figure 4-7. Model comparison for Case 3B. The training/validation set covers the first 4.25 days.
The Lasso and Ridge algorithms are highly similar regularized regression methods as
shown in Equations 8 and 9. However, it is clear that the use of a Kernel greatly improves
the model performance. This is likely to be caused by the similarity in the functional form
of the Kernel Ridge model and the solution from the PDE in Equation 1.
4.3.3. Effect of temperature lags on model error
The number of temperature lags was set to five as that proved to minimize the mean squared
error in the test set as shown in Figure 4-8. A number of lags from five to nine showed very
similar error, however five was chosen to keep the dimensionality of the features as small
as possible. Below five lags the error increased by a full order of magnitude, with the
feature set that includes three temperature lags having the highest error.
A potential shortcoming of the proposed features with temperature lags is that there is no
specific treatment of potential differences in the temporal spacing of the data. In this
study, the time step of the data was capped at a maximum value of 0.96 hours, with most
of the dataset near that maximum. This capping was chosen to maximize precision in the
simulation, but data coming from a real well may not be as consistent.
Figure 4-8. Effect of different number of time lags on the Kernel Ridge Regression model
4.3.4. Effect of number of training samples on model error
A relevant performance metric of the Kernel Ridge Regression model is the amount of data
it requires to achieve reasonable error. Figure 4-9 displays the so-called learning curve of
the model, which plots the number of data points in the training dataset against the mean
squared error of the model. The curves for training and test error are displayed in the
figure. Additionally, the dispersion of the test error is shown by plotting the 25th and 75th
percentiles, computed by cross-validation at each of the training dataset sizes that were
tested.
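scikit-learn's learning_curve utility produces exactly this kind of plot data: test error at several training-set sizes, with per-fold dispersion. The data and model settings below are illustrative only:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import learning_curve

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

# Evaluate the model at 5 training sizes, scored by 5-fold CV.
sizes, train_sc, test_sc = learning_curve(
    KernelRidge(kernel="rbf", alpha=0.1), X, y, cv=5,
    train_sizes=np.linspace(0.2, 1.0, 5),
    scoring="neg_mean_squared_error")

test_mse = -test_sc                                   # shape (n_sizes, n_folds)
p25, p75 = np.percentile(test_mse, [25, 75], axis=1)  # dispersion bands
```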
As expected, when the model is trained with more data, the mean squared error on the test
data decreases. However, the dispersion of the error also increases with the size of the
training data. Two inflexion points can be observed in the test error around 200 and 400
data points, corresponding to two and four days of data respectively.
It should be noted that the two inflexion points appear just after the data includes a full
period of injection followed by shut-in (Figure 4-5). This suggests that adding more
variability to the training data, in the form of flow rate changes and shut-ins, heavily
influences the out-of-sample prediction of the model.
The test error line in Figure 4-9 does not show a continuous downward slope; instead, it
displays a stepwise behavior. Because of this, it cannot be said with certainty that the model
has converged to a stable test error, even though the slope of the line is small. Therefore,
additional data could still improve the test error of the Kernel Ridge Regression model. For
this study, the length of the dataset was constrained by the computational time required to
run the high-resolution spatial simulation using small time steps.
There is also a clear relationship between the length of the training dataset, the time step
between each data point and the number of chosen temperature lags in the features. How
the Kernel Ridge Regression model performs when these three variables change
simultaneously remains to be investigated.
Figure 4-9. Learning curve for Case 3B
5. Conclusions
This study showed that machine learning methodologies have the potential to recover
flow rate using only temperature data in simple geometries and flow regimes. The work
also allowed for the exploration of the difficulties that this inverse problem poses, as it is
described by a deconvolution.
The approach taken for the modeling, from simple models to complex ones, allowed for
the examination of the limitations of using simple features for the modeling and highlighted
the importance of the time dependence of the process. The comparison between Case 1 and
Case 2 clearly shows that the simple features that were successful for estimating flow rate
in the case of constant pressure failed to capture the reservoir behavior when the
temperature data was caused by a history of flow rate changes.
The comparative success of the Kernel Ridge Regression approach can be attributed to the
combined effect of the introduction of past information at a given point in time, as well as
to the use of the Gaussian Kernel. Introducing information from the past effectively broke
the nonuniqueness of the feature combinations and allowed the model to deconvolve the
temperature history to recover the flow rate.
It is worth mentioning that the Kernel Ridge Regression model still used features that
required very little preprocessing and are not purposely formulated to reflect actual
physical processes. This had the advantage of not requiring a full understanding of the
physical problem in order to build useful features for the problem.
Part of the performance improvement gained with the Kernel Ridge Regression could be
attributed to the similarity of the functional form that the Gaussian Kernel has with the
actual solution of the convection equation (Equation 1), which is expressed as a
combination of Gaussian functions. This opens the door for further increasing the model’s
performance by embedding the functional form of the physics in the machine learning
model. By modifying the kernel and the feature combinations, the function fitted by the
machine learning model can be brought closer to the functional form of the actual physical
process that created the data. Doing so could constrain the model to physically possible
solutions, a challenge that most data-driven models face in the physical sciences.
5.1. Future work
The impact of multiphase flow on the performance of the inverse model presented in this
study remains to be studied. In the presence of multiphase flow, the same temperature
profile could be caused by multiple component combinations, potentially making the
regression ill-posed (Ouyang et al., 2004).
The strategy of embedding the functional form of the physics in the machine learning
model belongs to a larger class of models known as hybrid models, as they combine
mechanistic or physics-based models with data-driven ones. In addition to the kernel and
feature modifications mentioned above, there are other ways to introduce physical
knowledge into data-driven models. One such approach is to treat the data as the result
of a stochastic process with unobserved or latent functions in it (Alvarez et al., 2013). The
physics can then be specified as a prior on those unobserved functions and the problem
becomes a statistical one.
While the Kernel Ridge Regression approach shows plenty of potential for improvement,
deep learning methodologies have also been shown to be useful for inverse modeling. Tian
(2018) successfully applied nonlinear autoregressive models with exogenous inputs
(NARX) to deconvolve flow rate from raw pressure data, eliminating the need to manually
define features or use kernels. These methodologies could also be useful for estimating
flow rate from temperature data.
References
Alvarez, M. A., Luengo-Garcia, D. and Lawrence, N. D. (2013) ‘Latent Forces Models
using Gaussian Processes’, IEEE Transactions on Pattern Analysis and Machine
Intelligence, 35(11), pp. 1–20. doi: 10.1109/TPAMI.2013.86.
Barree, R. D., Fisher, M. K. and Woodroof, R. A. (2002) 'A Practical Guide to Hydraulic
Fracture Diagnostic Technologies', Proceedings of SPE Annual Technical Conference and
Exhibition, SPE 77442. doi: 10.2523/77442-MS.
Bergmeir, C. and Benítez, J. M. (2012) ‘On the use of cross-validation for time series
predictor evaluation’, Information Sciences. Elsevier Inc., 191, pp. 192–213. doi:
10.1016/j.ins.2011.12.028.
de Bézenac, E., Pajot, A. and Gallinari, P. (2018) ‘Deep Learning for Physical Processes:
Incorporating Prior Scientific Knowledge’, in International Conference on Learning
Representations, pp. 2009–2010.
Cipolla, C. L. and Wright, C. A. (2000) 'State-of-the-Art in hydraulic fracture diagnostics',
SPE Asia Pacific Oil and Gas Conference and Exhibition, p. 15. doi: 10.2118/64434-MS.
Cook, P. (2017) ‘Use of Diagnostics in Refracturing Applications to Understand Treatment
Effectiveness’, in SPE Distributed Fiber-Optic Sensing for Well, Reservoir and Facilities
Management Workshop. Society of Petroleum Engineers.
Duru, O. and Horne, R. (2011) ‘Simultaneous Interpretation of Pressure, Temperature, and
Flow-Rate Data Using Bayesian Inversion Methods’, SPE Reservoir Evaluation &
Engineering, (April), pp. 225–238. Available at:
http://www.onepetro.org/mslib/servlet/onepetropreview?id=SPE-124827-PA.
Duru, O. O. and Horne, R. N. (2010) ‘Modeling Reservoir Temperature Transients and
Reservoir-Parameter Estimation Constrained to the Model’, SPE Reservoir Evaluation &
Engineering, 13(06), pp. 873–883. doi: 10.2118/115791-PA.
Glasbergen, G. et al. (2009) ‘Real-Time Fluid Distribution Determination in Matrix
Treatments Using DTS’, SPE Production & Operations, 24(1), pp. 135–146. doi:
10.2118/107775-pa.
Hastie, T., Tibshirani, R. and Friedman, J. (2009) The Elements of Statistical Learning.
Springer. doi: 10.1007/b94608.
Li, X. and Zhu, D. (2016) ‘Temperature behavior of multi-stage fracture treatments in
horizontal wells’, Society of Petroleum Engineers - SPE Asia Pacific Hydraulic Fracturing
Conference, (August 2016), pp. 24–26. Available at:
https://www.scopus.com/inward/record.uri?eid=2-s2.0-
84991662070&partnerID=40&md5=27f9c1b0ba3c621f5a5411ab05fd3265.
Liu, Y. and Horne, R. N. (2013) ‘Interpreting Pressure and Flow-Rate Data From
Permanent Downhole Gauges by Use of Data-Mining Approaches’, SPE Journal, 18(01),
pp. 69–82. doi: 10.2118/147298-PA.
Nelson, R. A. (1985) ‘Evaluating Fractured Reservoirs: Part 6. Geological Methods’, (m),
p. 3.
Ouyang, L. et al. (2004) 'Flow Profiling via Distributed Temperature Sensor (DTS)
System - Expectation and Reality', SPE 90541.
Pedregosa, F. et al. (2012) 'Scikit-learn: Machine Learning in Python', Journal of Machine
Learning Research, 12, pp. 2825–2830.
Ribeiro, P. and Horne, R. (2014) ‘Detecting Fracture Growth Out of Zone Using
Temperature Analysis’, SPE Annual Technical Conference and …, (October 2014), pp.
27–29. doi: 10.2118/170746-MS.
Ribeiro, P. M. and Horne, R. N. (2013) ‘Pressure and Temperature Transient Analysis:
Hydraulic Fractured Well Application’, SPE Annual Technical Conference and Exhibition,
(1981). doi: 10.2118/166222-MS.
Rin, R. et al. (2017) ‘General Implicit Coupling Framework for Multi-Physics Problems’,
SPE Reservoir Simulation Conference, (February). doi: 10.2118/182714-MS.
Sierra, J. R. et al. (2008) ‘DTS Monitoring Data of Hydraulic Fracturing: Experiences and
Lessons Learned’, SPE Annual Technical Conference and Exhibition, (1), pp. 1–15. doi:
10.2118/116182-ms.
Tian, C. (2018) Machine Learning Approaches for Permanent Downhole Gauge Data
Interpretation. Stanford University.
Tian, C. and Horne, R. N. (2015) ‘Applying Machine Learning Techniques to Interpret
Flow Rate, Pressure and Temperature Data From Permanent Downhole Gauges’, SPE
Western Regional Meeting, California, USA, (June). doi: 10.2118/174034-MS.
Ugueto, G. A. et al. (2014) ‘Application of Integrated Advanced Diagnostics and Modeling
To Improve Hydraulic Fracture Stimulation Analysis and Optimization’, SPE Hydraulic
Fracturing Technology Conference, pp. 1–14. doi: 10.2118/168603-MS.
Appendix A. Detecting Fracture Location
Identifying the presence and location of a fracture along the wellbore is a problem that
requires the integration of multiple sources of information for a meaningful interpretation.
Temperature data has been traditionally used for fracture diagnostics when there are
processes involving the injection of fluids into the formation such as hydraulic fracturing
stimulation, acid stimulation, or water injection for pressure support (Ugueto et al., 2014).
During the injection process, the injected fluid cools down the wellbore and the
surrounding region. As more fluid enters the formation, the cooling effect becomes more
pronounced in the regions adjacent to fractures. After the injection process is complete and
the well is shut-in, a thermal recovery or warm-back process takes place. The regions of
the well that have no fractures recover at a different rate compared to where fractures are
present. This time lag in the temperature recovery can be recognized in the data and used
for identifying the presence of fractures (Figure 5-1).
Figure 5-1. Temperature warm-back response in the presence of fractures. Figure modified from
(Ugueto et al., 2014)
Ugueto et al. (2014) showed that temperature data is useful for detecting fractured regions
as well as fracture length in hydraulically fractured wells when used in combination with
other data such as radioactive tracers and production logging tools. Ribeiro and Horne
(2014) used simulated data to show that the first derivative of temperature with respect to
position can be used to identify the exact location where fluid enters the well during early
times of injection and flowback.
Figure 5-2 shows an example of the temperature profile of a well during a hydraulic
fracturing stimulation. The fractures can be identified as 'cold spots', or points of constant,
comparatively lower, temperature throughout the process. The presence of the fractures
can be characterized by the depth, width and time duration of those 'cold spots' in the
data, which in turn makes data-driven methods good candidates for automatic or
semiautomatic fracture detection.
Automated fracture detection falls in the realm of time series pattern recognition and
anomaly detection. Multiple approaches can be taken for solving the problem, with varying
requirements of human interaction in the process. These approaches can be classified into
two general groups, supervised and unsupervised detection.
In supervised detection, the detection algorithm is trained with labeled data where fractures
are already tagged and it is the algorithm’s job to identify similar patterns in new data. This
data can be generated either by a simulator or can come from a real well where the fractures
have been previously identified with certainty.
Unsupervised detection does not require the data to be labeled. Instead, the algorithm
classifies the data into different clusters. Because the behavior of a fractured region is
different from that of a non-fractured region, the algorithm detects the differences and a
human interpreter then decides which cluster corresponds to the presence of fractures.
Figure 5-2. Temperature history in a hydraulically stimulated well. Taken from (Cook, 2017)
An exercise in supervised fracture detection was done for this study. In the exercise, a
two-fracture, two-dimensional reservoir was simulated, similar to the one presented in
Section 3.2. Water was injected into the reservoir at a constant rate and the temperature
was recorded along the wellbore.
The output of the simulation was a spatial time-series of temperature at each point in the
wellbore. Each point in the time series was tagged as being in the vicinity of a fracture or
not and those data were then used as the input for a Regression Tree, whose output was a
binary classification of the data. The features for the Regression Tree were the normalized
temperature and time values at each location in the wellbore.
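A hedged sketch of such a supervised setup, using a scikit-learn decision tree classifier in place of the study's Regression Tree, on synthetic stand-in data. The labeling rule below, tagging low-temperature 'cold spots' as near-fracture points, is an illustrative assumption, not the simulator output used in the study:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
n = 500
time_n = rng.uniform(0.0, 1.0, n)   # normalized time at each sample
temp_n = rng.uniform(0.0, 1.0, n)   # normalized temperature at each sample
# Assumed rule: points in persistent cold spots are near a fracture.
labels = (temp_n < 0.3).astype(int)

features = np.column_stack([temp_n, time_n])
clf = DecisionTreeClassifier(max_depth=4)
clf.fit(features, labels)
accuracy = clf.score(features, labels)  # training accuracy of the toy model
```

With real simulator output the labels would come from the tagged fracture vicinities, and accuracy would be assessed on held-out data rather than the training set.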
Figure 5-3 shows the results of the Regression Tree model. A length vs. time classification
plot is shown, where yellow points denote the presence of a fracture. It can be noted that
the Regression Tree correctly detected the presence of a fracture in most of the data tagged
as such, achieving an accuracy of around 80% for this simple scenario.
Figure 5-3. Model results for the supervised fracture detection algorithm
The presented exercise showed the potential of using data-driven methodologies and
temperature data for identifying the location of fractures. However, when a well is drilled,
temperature is not the only information available for fracture detection. A more
comprehensive and potentially less uncertain approach would also include additional data
such as well logs or distributed acoustic sensing (DAS) data when available.