data driven methodology for model selection in ow pattern

31
Data driven methodology for model selection in flow pattern prediction Juan Sebasti´ an Hernandez Dept of Industrial Engineering, Universidad de los Andes, [email protected] Carlos Felipe Valencia Dept of Industrial Engineering, Universidad de los Andes, [email protected] The determination of multiphase flow parameters such as flow pattern, pressure drop and liquid holdup, is a very challenging and valuable problem in chemical industry, especially during petroleum transport. Recently, many approaches have been made for establishing a Unified Flow Model to predict accurately the mentioned properties with an outstanding advance made by Zhang et al. (2003b). This paper proposes a novel methodology for selecting closure relationships from the models included in the Unified Flow Model developed by the previously cited authors. A tree based model is built based on a data driven methodology developed from a 27670 points data set and later tested for flow pattern prediction in a set made of 9224 observations. The closure relationship selection model achieved a 74% accuracy in classifying flow regimes for a wide range of two-phase flow conditions. Key words : Two phase flow; Flow pattern; Decision tree; Bagging; Unified flow model 1. Introduction Multiphase flows in pipes are complex physical processes which are very common in chemical industry, especially during petroleum transport (Picchi and Poesio 2017). In the latter, fluids are pushed upwards from oil wells using gas injection, water and steam to improve the production rate of the system. Once the product is extracted, it is taken to processing facilities through a pipeline system, which complexity depends on the ground conditions of the area that in hilly-terrain carries to a wide range of pipe inclination angles. Based on this, it becomes necessary to establish a unified model to predict accurately two-phase flow properties such as pressure drop, flow pattern and phase hold-up at all inclination angles (Zhang et al. 2003a). As mentioned before, one of the main properties in the study of two-phase flows is the flow pattern, which makes reference to the spatial distribution of the gas and liquid phases during the flow in pipes. When gases and liquids flow simultaneously in a pipe, depending on a wide number of variables, the phases can distribute themselves in a variety of configurations. The configuration is determined by the interface distribution, which results in different flow characteristics. The correct estimation of the regimes is fundamental in two-phase analysis, taking into account that 1

Upload: others

Post on 06-Dec-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Data driven methodology for model selection in flowpattern prediction

Juan Sebastian HernandezDept of Industrial Engineering, Universidad de los Andes, [email protected]

Carlos Felipe ValenciaDept of Industrial Engineering, Universidad de los Andes, [email protected]

The determination of multiphase flow parameters such as flow pattern, pressure drop and liquid holdup,

is a very challenging and valuable problem in chemical industry, especially during petroleum transport.

Recently, many approaches have been made for establishing a Unified Flow Model to predict accurately the

mentioned properties with an outstanding advance made by Zhang et al. (2003b). This paper proposes a

novel methodology for selecting closure relationships from the models included in the Unified Flow Model

developed by the previously cited authors. A tree based model is built based on a data driven methodology

developed from a 27670 points data set and later tested for flow pattern prediction in a set made of 9224

observations. The closure relationship selection model achieved a 74% accuracy in classifying flow regimes

for a wide range of two-phase flow conditions.

Key words : Two phase flow; Flow pattern; Decision tree; Bagging; Unified flow model

1. Introduction

Multiphase flows in pipes are complex physical processes which are very common in chemical

industry, especially during petroleum transport (Picchi and Poesio 2017). In the latter, fluids are

pushed upwards from oil wells using gas injection, water and steam to improve the production rate

of the system. Once the product is extracted, it is taken to processing facilities through a pipeline

system, which complexity depends on the ground conditions of the area that in hilly-terrain

carries to a wide range of pipe inclination angles. Based on this, it becomes necessary to establish

a unified model to predict accurately two-phase flow properties such as pressure drop, flow pattern

and phase hold-up at all inclination angles (Zhang et al. 2003a).

As mentioned before, one of the main properties in the study of two-phase flows is the flow

pattern, which makes reference to the spatial distribution of the gas and liquid phases during the

flow in pipes. When gases and liquids flow simultaneously in a pipe, depending on a wide number

of variables, the phases can distribute themselves in a variety of configurations. The configuration

is determined by the interface distribution, which results in different flow characteristics. The

correct estimation of the regimes is fundamental in two-phase analysis, taking into account that

1

2 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

design variables such as pressure drop, phase holdup, rate of chemical reaction and others, are

strongly related to the registered flow pattern (Pereyra et al. 2012). The overlapping between flow

regimes, especially at the transition zones makes the identification a difficult work and introduces

metering errors (Bratland 2008).

Petroleum industry, and in general chemical industry, has dealt with multiphase flow in a more

rigorous mathematical context for about 60 years, making a special emphasis in simulation

software over the past 30 years. Taking into account that flow patterns depend on parameters

such as pipe inclination, diameter and length, physical properties of the phases, and superficial

velocities (Shippen and Bailey 2012), many machine learning approaches have been developed in

the last years to identify flow patterns, focused especially in ANNs implementation. Xie et al.

(2004) used a transportable artificial neural network for the classification of flow regimes in three

phase gas/liquid/pulp fiber systems by using pressure signals as input was examined. Tan et al.

(2007) used features extracted from Electrical Resistance Tomography (ERT) data as input of a

Support Vector Machine (SVM) algorithm to recognize the flow regime. Ozbayoglu and Yuksel

(2012) implemented a back propagation neural network model for flow pattern identification and

a regression tree for liquid holdup estimation. Al-Naser et al. (2016) used ANN for flow pattern

identification with a preprocessing stage using natural logarithmic normalization, to reduce

overlapping between flow patterns.

There is not convention in the number of flow patterns in two-phase flow due to overlapping and

characterization subjectivity, especially at the transition zones. Four multiphase flow patterns are

considered in this work: Bubble, Intermittent, Annular and Segregated, which visual characteristics

are shown in figure 1.

Figure 1 Flow patterns

In bubble flow (BF), the gas phase is distributed as discrete bubbles in a continuous liquid phase,

for the case of horizontal flow and inclined pipes, the presence of bubbles is higher in the zones

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 3

which are closer to the top of the pipe (Taitel and Barnea 2016).

Intermittent flow (IT) is registered when the inventory of liquid in the pipe is distributed in a

non-uniform way in the axial direction, for horizontal flow, plugs or slugs of liquid separated by

gas zones fill the whole cross-section of the pipe with a stratified liquid layer flowing along the

bottom. For annular flow (A), the liquid flows as a continuous film in the border of the pipe due

to high velocity of the gas. The last flow regime considered was segregated flow (SG), for which

liquid flows along the bottom of the pipe and gas at the top (Taitel and Barnea 2016).

The main aim of this work is to develop a tree based data driven methodology to select correctly

the closure relationships of each submodel of the Unified Model for Gas-Liquid Pipe Flow,

using dimensionless numbers as input and a database of only horizontal flows, and to test the

performance of the proposed approach on two-phase flow pattern prediction. The results of this

analysis offer a starting point for the study of model selection based on machine learning in

fluid dynamics, which in addition to flow pattern prediction, provides a great estimation of other

properties such as pressure drop and phase hold-up, key aspects in the design process and real

time control strategies.

After implementing the tree based model, an accuracy of 74.34% was registered for flow pattern

prediction using the combinations of closure relationships obtained for each point of the test

set from the algorithm. Most misclassifications were presented for observations located in the

boundaries between similar flow regimes like annular-segregated and bubble-intermittent.

The paper is organized as follows: Section 2 explains the origin of the data base and the dimension-

less numbers included in the study. Section 3 explains briefly the structure and origin of the Unified

Flow Model. Section 4 describes the machine learning literature background and the proposed

methodology. Section 5 shows the results of the implementation based on the metrics established

previously. Section 6 develops a brief analysis of the results making special emphasis on misclassi-

fication problems. Conclusions and future work are presented in section 7.

2. Experimental data base

An experimental data base collected by Pereyra (Pereyra et al. 2012) , which consists of the

most relevant studies on flow pattern prediction was used, as given in table 1. The earliest set

of data is Shoham (1982) (Shoham 1982), acquired in 50.8 and 25.4 mm pipe diameters utilizing

air water at atmospheric conditions, which was the first study covering systematically all the

4 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

inclinations angles, from -90 to +90. On the same year, Lin (1982) (Lin 1985) developed horizontal

flow experiments in 25.4 mm and 95.4 mm diameter pipes, varying the superficial gas velocity

between 0.8 and 200 m/s. Four years later, Kouba (1986) (Kouba 1986) using air-kerosene, studied

slug-flow in a horizontal 76.2 mm diameter pipe. Afterward, Kokal (1987) (Kokal 1987) carried

out a study of two-phase flow patterns in horizontal and slightly inclined horizontal flow, using

25.8, 51.2 and 76.3 mm diameter pipes, with air and light oil as working fluid. On the next

decade, Wilkens (1997) (Wilkens 1997) developed a study for two phase gas-liquid flows at 0, 1

and 90 angles. Only his data for oil and CO2 were considered in the present study. Later, Manabe

(2001) (Manabe 2001) analyzed the relation between pressure and flow pattern for oil-natural gas

systems using 0, 1 and 90 inclinations. Later, Mata (2002) (Mata et al. 2002) worked on a flow

pattern map for high viscosity oil and air in a 50.8 mm horizontal pipe. Three years later, Gokcal

(2005) (Gokcal 2005) developed a study in a 50.8 mm diameter horizontal pipe for two different

liquid viscosities: 181 and 587 mPa s. The complete origin of the data is explained in detail in the

appendix A.1.

Authors Year Diameter Angle Gas velocity Liquid velocity Fluids Points(mm) (m/s) (m/s)

Shoham 1982 25 and 51 -90 to 90 0.0037 to 42.95 0.0011 to 25.51 Water-Air 5676Weisman 1981 25 2.75 to 10 0.0002 to 49.46 0.0033 to 26.65 Water-Air 2075Shi 2013 200 -90 to 85 0.01 to 61.68 0.0104 to 9.97 Water-Air 2058Ekinci 2015 50.8 -2 to 2 0.088 to 5.32 0.095 to 0.817 Oil-Nat Gas 1040Tzotzi 2013 24 0 to 1 1.28 to 25.71 0.002 to 0.246 Water-Air 943Ghajar 2009 12.7 and 27.9 0 to 90 0.19 to 26.72 0.0241 to 1.17 Water-Air 925Kokal 1987 25.8 and 76.3 -9 to 9 -0.1 to 19.42 0.03 to 3.048 Light Oil-Air 875Hand 1992 93.5 0 0.079 to 99.81 0.0067 to 26.19 Water-Air 874Hernandez-Perez 2007 38 0 to 90 0.13 to 12.42 0.036 to 2.68 Water-Air 861Matsubara 2011 20 0 1.06 to 26.32 0.0025 to 0.18 Water-Air 803Xu 2007 20 and 60 -75 to 75 0.011 to 3.52 0.010 to 13.95 Water-Air 802

Table 1: Data base references

The data base consists of a total 37649 experimental data points with information related with fluid

properties such as density, viscosity and surface tension, pipe configuration parameters like angle

and diameter, and operational conditions like liquid and gas velocity. The previously mentioned

variables were used to calculate values of fluid dimensionless numbers, which make the model

results scalable to other variable values not included in the data base. 755 points that did not

registered flow pattern were deleted.

2.1. Dimensionless numbers

Four dimensionless numbers were estimated from the original variables included in the data base:

the modified Froude number (Fr), Weber number (We), Lockhart-Martinelli parameter (Xm) and

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 5

Eotvos number (Eo). The inclusion of these numbers seeks to establish a global model and not

one adjusted to the range of the variables included originally.

The modified Froude number and the Lockhart-Martinelli parameter (which expresses the superfi-

cial pressure gradient ratio), regulate the transition boundaries from segregated to non-segregated

flow as well as annular to non-annular flow. These numbers have been used by authors like

Graham et al (1999) (Graham et al. 1999) for estimating the void fraction and Thome (2003)

(Thome 2003) to deduce the transition from annular flow to intermittent flow.

By the other side, Weber and Eotvos number explain the bubble agglomeration that carries to

the differentiation between bubble flow and non-bubble flow (Pereyra et al. 2012). In the case of

Weber number, if the surface tension of the fluid decreases (We is greater) bubbles will tend to

decrease because of higher momentum transfer between the phases (M 2012). Authors like Zhao

and Rezkallah (1993) (Zhao and Rezkallah 1993) used this number to determine the boundary

between different regimes like bubble flow, annular flow and slug flow at microgravity conditions.

Eotvos number was used by Clift et al (1987) (Cli et al. 1978) to characterize shape regimes for

bubbles and drops in unhindered gravitational motion liquids, additionally Ullmann and Brauner

(2007) (Ullmann and Brauner 2007) used this non-dimensional number to analyze the relation

between pipe diameter and flow pattern transitions.

3. Unified flow model

The Unified Flow Model used to estimate flow patterns based on pipe parameters and fluid

properties was proposed by Zhang and Sarica (Zhang et al. 2006) and Zhang et al. (Zhang et al.

2003b), who established a model for predicting flow pattern transitions, pressure gradient, liquid

holdup and slug characteristics for all angles from -90 to 90 from horizontal. Considering slug

flow (a subclass of intermittent flow) shares transition boundaries with all the other flow regimes,

the unified model presented by Zhang bases its calculations on the dynamics of this flow regime.

With this, equations of slug flow are taken to estimate the slug characteristics and also predict

transitions from slug flow to other regimes.

The closure relationships included in the Unified Flow Model consist basically of empirical

equations, established by many authors among different studies for different fluid combinations

and property values (appendix A.2), which can be classified in 8 main models: Entrainment Model,

6 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

Interfacial Friction Model, Wall Friction Model, Mixture Friction Model, Slug Model, Slug Body

Holdup Model, Slug Drift Velocity Model and Slug Translational Velocity Model. As mentioned

before, these models explain different factors that affect the flow behavior like the wall friction,

interface friction and transition boundaries between slug flow and other flow regimes, which are

essential for predicting the pattern.

Due to the huge number of possible combinations of equations that can be generated from the 8

subgroups of equations included in the model, 4 equations were selected from the 4 most important

models taking into account expert criteria: entrainment models, interfacial friction models, slug

body holdup models and slug drift velocity models , leaving the most popular closure relationship

in the others.

4. Methodology

In this section we present the proposed data analysis methodology used to select a flow theoretical

model on each point in the input space formed by the selected variables. We start with a brief

explanation of the statistical and machine learning models that inspired the proposed methodology,

followed by a detailed description of the developed machine learning algorithm. Afterwards, metrics

to measure the performance of the method are proposed.

4.1. Decision tree

Decision trees are statistical models which are learnt from a given training data set to perform a

classification or regression task (Aldrich and Auret 2013). Training data set is made of a response

vector Y and an input matrix X with number of columns equal to number of variables, and both

with N observations. When Y is a categorical variable (also known as factor), the model is called

a classification tree, while in the case of a continuous response the model is known as regression

tree.

The fundamental point of decision tree models is to recursively split the input data space (X) to

generate a particular number of regions with a higher purity index for the output (in the case

of classification). Purity is a measure for how homogeneous is the classification of the class with

the majority of votes in each region. High purity, means that the partition has a good fit of the

data. The purity index is established based on the task the model is seeking to solve, establishing

measures like the gini index for classification or the least-squares deviation for regression. When the

method stops iterating and the data-driven subspaces have been found, simple models or values are

fitted to each of the obtained spaces. Taking into account the previous explanation it is possible to

establish that a decision tree consists basically in a non-overlapping local models set with regions

determined by a recursively data driven partition of the training data space.

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 7

4.2. Purity index

As mentioned before, literature has established many purity indexes to measure the performance

of classification and regression trees, nevertheless, the task treated in this paper is a model

selection problem, for which purity indexes have not been developed.

Regarding that for the study development we used the flow pattern as response variable for

measuring the accuracy in model selection, the problem treated can be compared with a classi-

fication rather than regression task. The main difference between a classification model and the

one developed in this paper, is that for each observation we have as much estimated responses as

combinations of model equations implemented, while in a regular classification problem there is

only an estimated response per point. Taking into account this, we proposed a new purity index

to help us selecting the best combination of equations of the Unified Flow Model based on the

flow regime prediction.

First, for each observation data point, we calculate Y , the vector of binary variables that represents

for each model combination if the observed flow pattern is predicted correctly or not, for example:

Yi = (1,0,1,1,0, . . . ,1)∈RE

where E i the number of models considered, for the present case E = 256, and i= 1, . . . , n. Each

position, is filled with 1 if the selected model predicts correctly the flow pattern or 0 otherwise.

Once the response vector Yi is obtained for each point, it is possible to calculate yp ∈ RE for

each region of the partition (p= 1, . . . , P ), which consists of the total points classified correctly by

each combination of equations, like a multinomial distribution. P stands for the total number of

partitions the decision tree has in the current iteration. Therefore,

Yp =∑i∈Rp

Yi (1)

where the sum is over the point in the region Rp. After this vector is built up for each region, the

values were taken and organized decreasingly, to get the combinations of equations ordered from

the best to the worst. That is, Yp = (y[1]p , y[2]p , · · · , y[E]

p ), where y[1]p is the largest values of Yp. Taking

the resulting vector of this process it was possible to establish a purity index for model selection

8 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

Purityindex=∑p∈P

2y[1]p − y[2]p (2)

The purity index seeks to find the best combination of equations for each partition (first term) and

differentiate the accuracy of this model from the others (second term).

4.3. Bagging

The main advantage of decision tree algorithms relates to their capability to fit almost any

data distribution, nevertheless, this can be considered also as a problem if we analyze noisy

distributions, which can be erroneously overfitted. This issue can be reduced significantly by

restricting the complexity of the tree using a stopping criterion or by combining decision tree

models in ensembles with methods such as random forest, boosting and bagging (Aldrich and

Auret 2013).

Bagging, also called bootstrap aggregation, is a general-purpose procedure for reducing the variance

of machine learning methods that is frequently used for decision tree ensembles (James et al. 2013).

This method starts by generating a group of new data sets with the same size of the original one by

sampling randomly with replacement. Once the new sets are created, the same machine learning

algorithm is applied to each one to obtain different data driven models. For classification problems,

the final answer for each observation is the most frequently class registered by the resulting models.

4.4. Proposed method

Based on the previously explained review of literature, we proposed a tree based method, using

bagging to reduce overfitting (the variance of the method), for selecting accurately equations form

the Unified Flow Model.

The method starts by dividing the sample into a training set and a test set for a later validation.

Once the data set is divided, bootstrapping is applied to the training data set to generate 100

new data sets by sampling randomly with replacement. For each data set, a 15 partitions data

tree is built up for model selection using the purity index introduced previously. The established

tree-growing process is based in popular algorithms such as CART (Breiman et al. 1984) and C4.5

(Quinlan 2014), which create subspaces by recursively searching for a partition on a single input

variable that registers the greatest reduction in impurity for the output variable.

The tree construction starts by setting the vector Yi to each observation and estimating the total

correctly classified points by each combination of equations for the database without partitions in

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 9

Yp. Once these vectors are estimated, the purity index is calculated based on the original Yp vector,

taking the best two combinations of equations for the non-divided database. In the first partition,

the method searches recursively for a division in each single input, calculating the change in the

purity index for each possibility, which means, nxm changes in the index are estimated, where

n stands for the number of observations and m for the number of input variables, drawing the

first partition over the input and observation that registers the greatest change. For the second

partition the same procedure is repeated, but with a Yp vector for each subset. This time, the

method goes over all the observations for each input variable, searching initially in the first subset

and then in the second one, which means that when the method finds the best partition in terms

of the purity index, the division is drawn for the value of the input variable only in the subset

the observation is located. The previously explained procedure is repeated until the tree reaches

15 partitions and 16 subsets are created. Once the subsets are generated, the best combination of

equations of the Unified Flow Model is assigned to each one.

This tree growing process runs simultaneously for each data set generated by bootstrapping, until

the 100th tree is generated. After this procedure is over, the method selects the best combination of

equations for each observation by taking the most frequent selection over all the trees and estimates

the flow pattern taking the values of the input variables.

4.5. Metrics

Taking into account that the proposed method is a model selection tool, which result is used

for estimating the flow pattern, it is not possible to build a ROC curve or estimate an AUC

considering that the result is a flow regime and not a probability, which makes it impossible to

establish a threshold that can be changed in order to estimate different values of sensitivity and

specificity. Based on this, the proportion of correct classifications (accuracy) was established as

the most suitable metric for the problem.

For estimating this metric, it is necessary to build a confusion matrix first, which allows a more

detailed analysis of the final result. The confusion matrix is a table, where each row represents

the classes of the real values and each column represents the class of the predicted values.

10 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

Figure 2 Confusion matrix structure

Each xi,j is the number of observations that corresponds to the class i and are predicted as j.

Based on this it is possible to establish the accuracy as shown in the next equation.

Accuracy=

∑i xii∑

i

∑j xij

A confusion matrix was built for each data set (train and test) for measuring the performance

and estimating the accuracy in the model construction and validation. The information registered

in the confusion matrices was also used to analyze which flow regime registered the worst

classification rate and the causes of the misclassification.

Additionally, the variable importance was estimated for each of the inputs by calculating the

change in the purity index generated by the partitions associated with each variable for each tree.

This procedure was repeated for all the grown trees to establish the variable importance as the

percentage of the total purity index change explained by the input.

5. Results

This section starts by showing the results obtained for one of the decision trees that were grown and

its interpretation, followed by the confusion matrix obtained for the test data sets, ending with the

results registered for the variable importance. After implementing all the proposed methodology,

accuracies of 74.5% and 74.3% were registered for the training and test sets respectively. Also,

significant values were obtained for variable importance by three inputs (Xm, Eo and Fr), while

one input (We) presented a low importance on the equation selection.

5.1. Decision tree results

Figure 3 shows the results for a tree grown based on these inputs, where left and right branches

represent the positive and negative outcome of the condition respectively.

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 11

Figure 3 Sample result tree

Taking into account that the proposed methodology is based in bagging (a non-deterministic algo-

rithm), the trees generated can change for each iteration, considering that the data sets created by

bootstrapping take observations randomly. The grown trees are made of 15 conditions which take

to 16 leafs that return a number related to a combination of closure relationships. The final com-

bination selected by the model, was the most frequent sequence returned by the total constructed

trees.

5.2. Comparison of model prediction

As mentioned before, after running the tree based model proposed, over the test set, the accuracy

obtained for the flow pattern prediction was 74.34%, with 6857 correctly classified points from the

total 9224 experimental observations as shown in table 2.

Successful UnsuccessfulTotal No. [%] BF IT A SG

BF 965 522 54.09 522 383 5 55IT 4523 3881 85.81 246 3881 103 293A 1469 698 47.52 21 354 698 396

SG 2267 1756 77.46 53 335 123 1756Total 9224 6857 74.34 842 4953 929 2500

Table 2: Comparison between model prediction and test set

The left side of the table shows the total number of points, correctly classified observations and

percentage of accuracy per flow pattern. For example, in the case of bubble flow (BF) from the

total of 965 experimental points, 522 (54.09%) were successfully predicted. A deeper analysis of

the table shows that the best accuracy was registered for intermittent flow (85.81%) while annular

12 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

flow registered the worst (47.52%).

Right side of the table shows the confusion matrix for test set. The values registered for the

annular regime row present additional causes of the poor performance of the classification for

this pattern. Looking at the values related with intermittent flow and segregated flow for this

row, it can be seen that 354 (24.10%) and 396 (26.96%) were wrongly classified into these groups

respectively.

5.3. Variable importance

After building the whole tree based model described before and analyzing its performance on flow

pattern classification, it was possible to estimate the importance of the variables as established in

section 4.5.

As shown in figure 4 the variables that registered the greatest values for variable importance were

Ettvos number, Lockhart-Martinelli parameter and the modified Froud number.

We

Fr

Xm

Eo

%0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35

Figure 4 Variable importance

Considering the connection between non-dimensional numbers and flow patterns explained in

section 2.1, the values registered for Xm and Fr result consistent with the theoretical relations

reported in literature, taking into account both dimensionless numbers establish boundaries

between segregated and non-segregated flow, as well as annular and non-annular flow, cases that

represent 41.59% of the train set. Following the previous analysis, it results expected to register a

low value for Weber number importance considering it provides a boundary between non-bubble

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 13

and bubble flow, which represents only 10.37% of the complete training set.

The greatest value for variable importance was registered by Eotvos number. Apart of describing

the boundary for bubble flow, this dimensionless number has been used in many fluid studies to

characterize the shape of bubbles in liquid flows, which could have explained the transitions from

slug flow (subclass of intermittent flow) to the other flow regimes as established in the Unified

Flow Model.

6. Discussion6.1. Graphical analysis of model performance

The results shown in figure 5 register the correct and misclassified points of the test set, in black

and gray respectively, against their Modified Froude number and Lockhart-Martinelli parameter

values in logarithmic scale.

−10

0

10

20

−10 −5 0 5 10 15

Xm

Fr

Figure 5 Modified Froude number vs Lockhart-Martinelli parameter

As can be seen, most of the points in the test sample have Xm values from -5 and 10 and Fr

mostly between 5 and -10, where correctly classified points were more frequent than misclassified

points. However, in the inferior point cloud, is possible to notice some regions where the model

selection algorithm did not predict flow pattern properly. Specifically, observations which have

Xm values greater than 7.5 with Fr values lower than 0, registered a poor performance in flow

pattern prediction with an accuracy of 44.63%. Additional graphs of classification performance

14 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

−10

0

10

20

−10 −5 0 5 10 15

Xm

Fr

Flow PatternBFITASG

Figure 6 Real flow pattern-Fr vs Xm

against other inputs are presented in appendix B.

Figure 6 allows a deeper analysis of the causes of misclassification in the lower-right corner of the

previous graph. A brief look at the zone mentioned before, shows that most of the points there

correspond to observations which registered intermittent flow as real flow regime. The analyzed

points can be located also at the lower part of figure 7 which shows the predicted flow pattern for

each observation against modified Froude number and Weber number values. After crossing the

results of both graphs it was possible to conclude that the mentioned missclassified observations

were incorrectly classified as bubble regimes, which makes sense taking into account the low values

of Weber number presented by these observations.

Other big problem mentioned in section 5.2 was that many annular flow observations were miss-

classified as segregated regimes. As can be seen in figure 6, these flow patterns register similar

values of Lockhart-Martinelli parameter and modified Froude number, which makes it difficult for

the tree based model to select different combinations of equations in the zone where these flow

patterns overlap and also adds some difficulty to the flow regime prediction made by the Unified

Flow Model.

6.2. Non-covered zones

As was introduced in the previous section, additionally to the error registered by the closure

relationship selection method, the 25.66% misclassification rate obtained for the test set can be

explained partially by the lack of predictive capacity registered for the possible combinations of

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 15

−10

0

10

20

0 5 10 15

We

Fr

Flow PatternBFITASG

Figure 7 Predicted flow pattern - Fr vs We

closure relationships included in the study. In detail, 1397 (14.95%) observations from the test

set, which were mostly located in boundaries between patterns, did not get correct classifications

with any of the included combinations, which makes it impossible for the tree based algorithm to

classify them correctly.

−10

0

10

20

−10 −5 0 5 10 15

Xm

Fr

050100150200250

Correct combinations

Figure 8 Number of correct predictions per point

Figure 8 allows a deeper analysis of the performance of the selected combinations of closure

relationships in the prediction of the flow pattern for the observations of the test set. The graph

16 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

shows in darker colors the points that obtained a correct classification with a great amount of

combinations, while lighter colors stand for those that were missclasified by most of them. A

brief look at the zones mentioned in section 6.1, shows that none of the possible combinations of

equations established for the study manage to predict correctly the flow pattern for points with

very high values of Xm and low values of Fr.

Regarding to the problem of missclasification of annular flows, the cross analysis of figures 8 and 6

shows that the cause of this issue was not the combinations of closure relationships set by the tree

based algorithm, but the lack of predictive capacity of the selected combinations for this zone.

In detail, most of the annular flows of the low point cloud (low values of Fr) presented wrong

classifications for almost all the combinations included in the study, which shows that the selected

equations of the Unified Flow Model are prone to error in the boundary between annular and

segregated flow.

7. Conclusions and future work

A novel methodology is proposed for selecting closure relationships from the different models

included in the Unified Flow Model. Considering the results obtained for flow pattern prediction,

we conclude tree based methods provide an accurate tool for model selection in two-phase flow

problems. A total of 27670 points were used to build the model, which was tested later in a 9224

point set, registering the highest performance for intermittent flow classification (85.81%) and the

lowest for annular flow (47.52%) . The graphical results have shown that the prediction of flow

pattern is subject to the values of dimensionless numbers registered for an observation, which was

reflected in the worse performance presented for atypical values and zones close to boundaries

between flow patterns. The misclassifications registered by the algorithm in these zones, were

explained by the result of the predictions of the UFM analyzed submodels and not by the selection

of closure relationships developed by the decision tree algorithm, this means that even if the

method is run with more partitions (more combinations of closure relationships are assigned) the

performance of the model will not improve significantly.

Based on the lack of predictive capacity registered by the combinations of equations included

in the study for the boundaries between similar flow patterns, we recommend for future work

and extensions, including more proposed closure relationships for each of the 8 models that

conform the UFM, making emphasis in the 4 models which the study focused on. In addition,

include more dimensionless numbers after optimizing the UFM prediction capacity to get a better

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 17

understanding of the relation between the assigned closure relationships and the studied variables.

Furthermore, once these problems are solved, the model selection performance can be tested in

the prediction of other properties such as pressure drop and liquid holdup.

18 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

Appendix A: References

A.1. Data base references

Authors Year Diameter Angle Gas velocity Liquid velocity Fluids Points(mm) (m/s) (m/s)

Shoham 1982 25 and 51 -90 to 90 0.0037 to 42.9562 0.0011 to 25.517 Water-Air 5676Weisman 1981 25 2.75 to 10 0.0002 to 49.4657 0.0033 to 26.6514 Water-Air 2075Shi 2013 200 -90 to 85 0.01 to 61.6858 0.0104 to 9.9726 Water-Air 2058Ekinci 2015 50.8 -2 to 2 0.088 to 5.321 0.095 to 0.817 Oil-Nat Gas 1040Tzotzi 2013 24 0 to 1 1.2813 to 25.7106 0.002 to 0.246 Water-Air 943Ghajar 2009 12.7 and 27.9 0 to 90 0.19 to 26.7244 0.0241 to 1.17 Water-Air 925Kokal 1987 25.8 and 76.3 -9 to 9 -0.1 to 19.421 0.03 to 3.048 Light Oil-Air 875Hand 1992 93.5 0 0.0799 to 99.8157 0.0067 to 26.1918 Water-Air 874Hernandez-Perez 2007 38 0 to 90 0.1388 to 12.4292 0.0363 to 2.6827 Water-Air 861Matsubara 2011 20 0 1.0601 to 26.3202 0.0025 to 0.1835 Water-Air 803Xu 2007 20 and 60 -75 to 75 0.0117 to 3.5259 0.0104 to 13.9544 Water-Air 802Haoulo 2007 38.1 -10 to 10 2.9579 to 37.7391 0.0191 to 1.8665 Water-Air 760Caetano 2005 63.4 90 0.01 to 22.86 0.0001 to 3.58 Kerosene-Air 678Dziubinski 2004 25.3 and 50.5 90 0.0199 to 2.9068 0.004 to 1.442 Water-Air 619Mukherjee 1979 38.1 and 101.6 -70 to 90 0.0427 to 36.2621 0.0091 to 4.3628 Kerosene-Air 599Foletti 2011 22 0 0.0346 to 2.37 0.028 to 1.13 Oil -Air 597MacGillivray 2004 9.525 0 to 90 11.9 to 78.8 0.075 to 24.7 Water-Air 577Andritsos 1986 25.146 and 95.25 0 0.7986 to 98.8802 0.0003 to 0.335 Water + Glicerine-Air 535Nicholson 1978 30.48 -9 to 9 0 to 28.4968 0.0295 to 2.9982 Oil-Nat Gas 493Mathure 2010 12.7 0 0.077 to 12.768 0.1489 to 1.6972 Water-Air 485Triplett 1999 1.097 and 1.45 0 0.0364 to 83.3166 0.0223 to 6.1149 Water-Air 478Asante 2000 25.4 and 76.2 0 15 to 30 0.0001 to 0.2 Oil-Air 476Abduvayt 2003 54.9 and 106.4 0 to 3 0.03651 to 11.0959 0.0089 to 6.4755 Water-Nitrogen 443Gawas 2013 150 -2 to 2 11 to 23 0.01 to 0.03 Oil-Air 439Grolman 1997 26 and 51 0.5 to 1 3.4269 to 19.4793 0.0003 to 0.0219 Water-Air 364Wegmann 2005 5.6 and 7 0 0.087 to 86.361 0.018 to 2.237 Water-Air 355Fan 2005 50.8 and 149.6 -2 to 2 4.93 to 25.7 0.00026 to 0.0522 Water-Air 351Griffith 1961 57 30 to 90 0.0515 to 0.8097 0.1094 to 1.3681 Oil-Nat Gas 351Brito 2012 50.8 0 0.0936 to 7.7377 0.0097 to 2.9606 Oil-Air 346Vijay 1978 11.684 90 0.05 to 96.7719 0.0122 to 10.607 Water-Air 333Garcia 2009 38.1 0 to 10 2.8971 to 37.7394 0.0194 to 1.9534 Water-Air 331Chokshi 1994 76 90 0.7702 to 52.5808 0.1050 to 5.6587 Water-Air 325Xu 2009 20 and 40 -15 to 15 0.0107 to 13.42 0.4878 to 2.0846 Water-Air 297Tang 2011 18.6 0 0.03340 to 39.8509 0.0085 to 1.3958 Water-Air 296Khaledi 2014 69 0 0.0002 to 3.7345 0.0381 to 3.0999 Oil-Sulfur Hexafluoride 294Alsaadi 2013 76.2 2 to 30 1.829 to 39.992 0.01 to 0.101 Water-Air 288Wong 1997 25.4 0 0.217 to 27.439 0.011 to 3.488 Water-Air 278Liu 2008 2.37 and 3.04 90 0.0009 to 4.2763 0.0022 to 1.4987 Water-Air 273Almabrok 2013 101.6 -90 to 90 0.2056 to 34.2626 0.07 to 1.5 Water-Air 270Manabe 2001 54.9 0 to 90 0.0972 to 7.0134 0.0382 to 0.9504 Oil-Nat Gas 247Oddie 2003 150 and 152 0 to 88 0.0459 to 6.7639 0.0307 to 7.3574 Water-Nitrogen 227Sujumnong 2007 11.68 90 0.0031 to 98.6694 0.0073 to 81.4237 Water-Air 222Smith 1999 50.8 2 8.433 to 24.753 0.0001 to 0.0009 Water-Air 209Aggour 1976 11.684 90 0.0732 to 96.0272 0.3139 to 10.5796 Water-Air 206Wilkens 1997 97.2 -2 to 5 0.7973 to 13.6089 0.0953 to 1.5655 Oil-CO2 204Meng 2001 50.1 0 4.8 to 26.6 0.0013 to 0.0538 Oil-Air 203Govier 1958 30.75 and 30.78 0 to 90 0.0439 to 27.3919 0.0029 to 2.20427 Water-Air 194Taitel 1980 25 and 51 90 0.0109 to 27.4114 0.0026 to 3.7414 Water-Air 187Dhulesia 1996 75 and 200 90 0.0702 to 16.3375 0.04862 to 4.2975 Oil-Nat Gas 186Gokcal 2005 50.4 0 0.09 to 20.3 0.01 to 1.76 Oil-Air 183Aziz 1972 30.42 0 0.015 to 3.6879 0.03 to 1.6483 Oil-Air 179Hanafizadeh 2011 2 90 0.5321 to 13.0375 0.0524 to 1.3413 Water-Air 175Nydal 1991 31 and 90 0 0.226 to 24.59 0.59 to 3.5 Water-Air 173Okawa 2005 5 90 19.9866 to 106.0161 0.0891 to 1.6332 Water-Air 170Chung 2014 50.8 -90 0.285 to 7.837 0.0476 to 0.7037 Lubsoil ND-50-Air 159Spedding 1981 45 0 0.5146 to 55.1164 0.0037 to 1.0303 Water-Air 154Gregory 1978 24.8 and 51.2 0 0.046 to 15.626 0.03 to 2.316 Oil-Air 154Lucas 2010 195.3 90 0.0025 to 3.185 0.0405 to 1.611 Water-Air 153Jamari 2008 77.9 0 4.7003 to 20.9968 0.0484 to 0.7036 Water-Air 150Crowley 1986 50.8 and 150 0 0.2 to 48.91 0.0387 to 3.26 Water-Air 150Yuan 2011 76.2 30 to 90 9.9 to 36 0.0001 to 0.1 Water-Air 146Kora 2011 50.8 0 0.1 to 3.51 0.1 to 0.82 Oil-Nat Gas 144Mantilla 2008 50.8 and 152.4 0 1.5 to 82.2 0.0033 to 0.1 Water-Air 143Lin 1985 25.4 and 95.4 0 0.814 to 200.6098 0.0043 to 1.4939 Water-Air 141Kanizawa 2011 15.83 0 0.1638 to 7.031 0.0031 to 0.19 Refrigerant-Refrigerant 134Omebere(b) 2007 189 0 0.49 to 14.3 0.0048 to 4.67 Naphtha-Nitrogen 131Ali 2009 25.4 90 0.19 to 1.11 0.06 to 2.28 Water-Air 131Vieira 2014 76 0 to 90 9 to 40 0.005 to 0.2 Water-Air 127Jeyachandra 2011 50.8 2 0.1 to 3.6 0.1 to 0.8 Oil-Air 123Zubir 2011 21 and 95 90 0.0071 to 2.0084 0.0015 to 1.03 Water-Air 122Kim 2010 25.4 0 0.329 to 2.927 0.021 to 1.362 Water-Air 111Karami 2015 152.4 0 7.4 to 22.6 0.01 to 0.0269 Water-Air 106Monni 2014 19.5 0 0.1375 to 32.3188 0.0459 to 2.6245 Water-Air 102Guler-Quadir 1991 88.9 90 0.1189 to 17.7485 0.0366 to 1.6673 Water-Air 102Al-safran 2003 50.8 -2 to 2 0.6096 to 4.572 0.06096 to 1.2192 Lubspar 107-Air 101Mishima 1996 2.05 and 4.08 90 0.0719 to 36.4905 0.0107 to 2.3216 Water-Air 101Kadri 2011 52 0 0.4791 to 2.5148 0.0465 to 0.2936 Water-Air 100Sharaf 2012 127 90 0.0888 to 14.759 0.0039 to 4.0944 Naphtha-Nitrogen 100Franca 1992 19 0 0.13 to 23.76 0.01 to 14.85 Water-Air 99Omebere(a) 2007 189 90 0.0881 to 15.6331 0.0037 to 4.3380 Naphtha-Nitrogen 98Durant 2003 27.86 and 27.87 0 0.5883 to 20.9855 0.0853 to 0.89 Water-Air 95Costigan 1996 32 90 0.03 to 37.34 0.02 to 1.04 Water-Air 94Marcano 1996 77.9 0 0.45 to 8.67 0.14 to 2.07 Kerosene-Air 93

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 19

Picchi 2014 60 0 0.0476 to 2.9969 0.5148 to 1.7301 Water-Air 93Chupin 2003 60 and 60.4 0 to 1 4.93 to 29.6 0.0002 to 0.04 Oil-Air 93Yang 1996 50.8 -30 to 0 0.87 to 9.43 0.88 to 2.02 Kerosene-Air 90Taitel 1987 38 0 0.7298 to 20.9293 0.0012 to 0.4095 Glycerol-Air 90Lu 2007 200 0 0.1856to 1.4456 0.0313 to 0.2852 Oil-Nitrogen 90Abdul-Majeed 1995 50.8 0 0.2 to 48.91 0.0001 to 1.83 Kerosene-Air 88Nguyen 2010 80 90 0.1243 to 1.0075 0.0153 to 1.7719 Water-Air 86Kaji 2009 51.2 and 52.3 90 0.1016 to 1.6106 0.0093 to 19.0138 Water-Air 85Zhao 2013 26 0 0.2253 to 14.3301 0.0594 to 0.5101 Oil-Air 84Rozenblit 2006 24.2 90 0.0966 to 27.8509 0.0639 to 1.3855 Water-Air 83Mata 2002 50.8 0 0.2124 to 38.2373 0.0020 to 1.3408 Oil-Air 80Abdulkadir 2011 67 90 0.047 to 4.727 0.047 to 0.378 Silicone oil-Air 78Rosa 2010 26 90 0.12 to 28.8 0.22 to 3.08 Water-Air 73Alssayh 2013 50 0 0.6 to 7.5 0.3 to 1.03 Water-Air 60Ohnuki 1999 200 90 0.0573 to 1.0682 0.0327 to 4.6298 Water-Air 58Cai 1999 100 -2 0.4005 to 16.6472 0.142 to 1.7026 Water-CO2 55Kouba 1986 76.2 0 0.3018 to 7.3609 0.1524 to 2.1366 Kerosene-Air 53Adedigba 2007 50 0 0.03 to 2.59 0.15 to 3 Water-Air 51Jackman 1992 50.3 0 21.375 to 42.964 0.027 to 0.3175 Water-Air 50Agrawal 1973 30.42 0 0.1091 to 6.0657 0.0137 to 0.0605 Oil-Air 45Luo 1996 50.8 90 0.012 to 0.9558 0.1011 to 0.5089 Polysaccharide-Nitrogen 42Vongvuthipornchai 1982 76.2 0 0.06 to 2.94 0.07 to 2.03 Kerosene-Air 42Golan 1969 38 90 0.7907 to 52.0229 0.2234 to 1.2286 Water-Air 42Mora 2011 21 0 0.2181 to 10.628 0.0481 to 1.6232 Water-Air 39Smith 2011 68.6 0 0.5239 to 9.3877 0.7 to 0.9 Oil -SF6 38Alamu 2010 19 90 13.13 to 42.9011 0.03 to 0.1943 Water-Air 37Zheng 1991 76.2 0 to 5 1.21 to 3.68 0.15 to 1.22 Kerosene-Air 36Fernandez 1982 50.8 and 76.2 90 4.95 to 17.89 0.0157 to 0.0459 Oil-Air 35Skopich 2012 50.8 and 101.6 90 7.986 to 29.49 0.009 to 0.0503 Water-Air 35Andreussi 1983 24 -90 30.9467 to 88.9499 0.0214 to 0.4436 Water-Air 33Al-Lababidi 2006 50 0 0.6 to 3.15 0.3 to 1.03 Water-Air 33Ortiz-Vidal 2012 20.4 0 0.0542 to 23.67 0.2448 to 6.303 Water-Air 32Mata 1989 10.4 90 0.1635 to 14.715 0.1635 to 0.8175 Water-Air 30Brito et al. 2012 50.8 -2 0.37 to 7.42 0.01 to 0.096 Oil -Air 30McNulty 1992 50.8 0 0.316 to 6.32 0.04 to 0.8 Water-Air 29Icardi 2014 67 90 0.0476 to 5.7869 0.1755 to 0.6988 Water-Air 29Schmidt 2008 54.5 90 0.01 to 21.23 0.0045 to 3.31 Luviskol-Nitrogen 27Kim 2000 25.4 0 0.13 to 1.676 0.034 to 0.511 Water-Air 26Schubring 2010 23.7 90 35.6 to 83.5 0.063 to 0.338 Water-Air 25Zhu 2003 200 90 0.02 to 0.24 0.25 to 0.45 Water-Air 21Zhao 2005 50 90 0.38 to 1.5 0.016 to 0.61 Water-Air 21Camargo 1991 18.6 0 1.3805 to 14.4382 0.2591 to 3.3117 Water-Argon 20Nuland 2007 60 10 to 60 0.99 to 3.05 0.099 to 1.017 Light Oil (Exxsol D80)-SF6 20Faccini 2015 25.2 -10 to -2.5 0.2451 to 2.6733 0.1114 to 0.5569 Water-Air 17Duarte 2007 26 0 0.47 to 1.55 0.33 to 0.68 Water-Air 14Shanmugam 1994 26.1366 90 4.3956 to 17.5824 0.2348 to 0.4697 Water-Air 12Lunde 1997 290 5 0.5 to 1 0.5 to 2 Nahptha-Nitrogen 4Wang 2001 76.2 90 0.15 to 0.4 1.05 to 2.04 Water-Air 3

20 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

A.2. Unified model references

Model Key Authors Year1 Wallis 19692 Paleev and Filippovich 19663 Oliemans et al. 19864 Zhang et al. 2003

Entrainment Models 5 Ishii and Mishima 19896 Pan and Hanratty 20027 Sawant et al. 20088 Sawant et al. 20099 Ousaka et al. 199610 Al-Sarkhi et al. 20121 Taitel and Dukler 19762 Taitel and Dukler 1976

Wettability Models 3 Hart et al. 19894 Grolman and Fortuin 19975 Fan 20056 Zhang et al. 20111 Cohen and Hanratty 19682 Andritsos and Hanratty 19873 Baker et al. 19884 Bendiksen et al. 1991

Friction Models for 5 Cheremisinoff & Davis Cheremisinoff and Davis 1979Stratified Configuration 6 Hart et al. 1989

7 Kim et al. 19858 Kowalski 19879 Andreussi and Persen 198710 Taitel and Dukler 197611 Vlachos et al. 19971 Wallis 19692 Wallis 19693 Whalley and Hewitt 19784 Henstock and Hanratty 19765 Oliemans et al. 1986

Friction Models for 6 Asali et al. 1985Annular Configuration 7 Fore et al. 2000

8 Hamersma and Hart 19879 Fukano and Furukawa 199810 Dallman et al. 197911 Ambrosini et al. 199112 Govan et al. 19911 Cohen and Hanratty 19682 Hart et al. 19893 Kowalski 19874 Taitel and Dukler 19765 Vlachos et al. 19976 Wallis 1969

Friction Models for 7 Wallis 1969Separated Configuration 8 Whalley and Hewitt 1978

9 Oliemans et al. 198610 Fore et al. 200011 Dallman et al. 197912 Ambrosini et al. 199113 Hamersma and Hart 198714 Chen et al. 1997

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 21

15 Zhang et al. 20031 Blasius 19132 Hall 19573 Churchill 1977

Wall & Mixture Friction Models 4 Swanee and Jain 19765 Zigrang and Sylvester 19826 Haaland 19837 Colebrook et al. 19398 McKeon et al. 20051 Zhang et al. 20032 Scott et al. 19893 Brill et al. 19814 Norris 19825 Al-safran et al. 20136 Gregory and Scott 1969

Slug Main Closure 7 Heywood and Richardson 1979(length or frequency) 8 Trononi 1990

9 & Wood.a Hill et al. 199010 & Wood.b Hill et al. 199011 Hill et al. 199012 Zabaras et al. 200013 Gokcal et al. 201014 Alruhaimani 20151 Gregory et al. 19782 Malnes 19823 Ferschneider 19834 Andreussi and Bendiksen 19895 Marcano 1996

Slug Liquid Holdup Models 6 Gomez et al. 20007 Abdul-Majeed 20008 Barnea and Brauner 19859 Al-Safran 200910 Zhang et al. 200311 Alruhaimani 201512 Al-Safran et al. 20151 Zhang et al. 2003

Drift Velocity Models 2 Weber et al. 19863 Jeyachandra et al. 20124 Moreiras et al. 2013

Translational Velocity Models 1 Dukler & Hubbard Dukler and Hubbard 19752 Fabre et al. 1994

22 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

Appendix B: Graphical classification results

B.1. Modified Froude number vs Weber number

−10

0

10

20

0 5 10 15

We

Fr

B.2. Weber number vs Eotvos number

0

5

10

15

0.0 2.5 5.0 7.5 10.0

Eo

We

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 23

B.3. Modified Froude number vs Eotvos number

−10

0

10

20

0.0 2.5 5.0 7.5 10.0

Eo

Fr

B.4. Weber number vs Lockhart-Martinelli parameter

0

5

10

15

−10 −5 0 5 10 15

Xm

We

24 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

B.5. Eotvos number vs Lockhart-Martinelli parameter

0.0

2.5

5.0

7.5

10.0

−10 −5 0 5 10 15

Xm

Eo

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 25

References

Ghassan H Abdul-Majeed. Liquid slug holdup in horizontal and slightly inclined two-phase slug flow. Journal

of Petroleum Science and Engineering, 27(1):27–32, 2000.

Mustafa Al-Naser, Moustafa Elshafei, and Abdelsalam Al-Sarkhi. Artificial neural network application for

multiphase flow patterns detection: A new approach. Journal of Petroleum Science and Engineering,

145:548–564, 2016.

Eissa Al-Safran. Prediction of slug liquid holdup in horizontal pipes. Journal of Energy Resources Technology,

131(2):023001, 2009.

Eissa Al-Safran, Ceyda Kora, and Cem Sarica. Prediction of slug liquid holdup in high viscosity liquid and

gas two-phase flow in horizontal pipes. Journal of Petroleum Science and Engineering, 133:566–575,

2015.

Eissa M Al-safran, Bahadir Gokcal, Cem Sarica, et al. Investigation and prediction of high-viscosity liquid

effect on two-phase slug length in horizontal pipelines. SPE Production & Operations, 28(03):296–305,

2013.

Abdelsalam Al-Sarkhi, Cem Sarica, and Bilal Qureshi. Modeling of droplet entrainment in co-current annular

two-phase flow: A new approach. International Journal of Multiphase Flow, 39:21–28, 2012.

Chris Aldrich and Lidia Auret. Unsupervised process monitoring and fault diagnosis with machine learning

methods. Springer, 2013.

Feras AS Alruhaimani. Experimental analysis and theoretical modeling of high liquid viscosity two-phase

upward vertical pipe flow. The University of Tulsa, 2015.

W Ambrosini, P Andreussi, and BJ Azzopardi. A physically based correlation for drop size in annular flow.

International Journal of Multiphase Flow, 17(4):497–507, 1991.

P Andreussi and K Bendiksen. An investigation of void fraction in liquid slugs for horizontal and inclined

gasliquid pipe flow. International Journal of Multiphase Flow, 15(6):937–946, 1989.

P Andreussi and LN Persen. Stratified gas-liquid flow in downwardly inclined pipes. International journal

of multiphase flow, 13(4):565–575, 1987.

Nikolaos Andritsos and TJ Hanratty. Influence of interfacial waves in stratified gas-liquid flows. AIChE

journal, 33(3):444–454, 1987.

JC Asali, TJ t Hanratty, and Paolo Andreussi. Interfacial drag and film height for vertical annular flow.

AIChE Journal, 31(6):895–902, 1985.

A Baker, K Nielson, and A Gabb. Pressure loss, liquid-holdup calculations developed. Oil Gas J.;(United

States), 86(11), 1988.

Dvora Barnea and Neima Brauner. Holdup of the liquid slug in two phase intermittent flow. International

Journal of Multiphase Flow, 11(1):43–49, 1985.

26 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

Kjell H Bendiksen, Dag Maines, Randi Moe, Sven Nuland, et al. The dynamic two-fluid model olga: Theory

and application. SPE production engineering, 6(02):171–180, 1991.

Heinrich Blasius. Das ahnlichkeitsgesetz bei reibungsvorgangen in flussigkeiten. In Mitteilungen uber

Forschungsarbeiten auf dem Gebiete des Ingenieurwesens, pages 1–41. Springer, 1913.

O Bratland. Update on commercially avaliable flow assurance software tools: What they can and cannot

do? In 4th Asian Pipeline Conference & Exibition, Kuala Lumpur, Malaysia, 2008.

Leo Breiman, Jerome Friedman, Charles J Stone, and Richard A Olshen. Classification and regression trees.

CRC press, 1984.

James P Brill, Zelimir Schmidt, William A Coberly, John D Herring, David W Moore, et al. Analysis of two-

phase tests in large-diameter flow lines in prudhoe bay field. Society of Petroleum Engineers Journal,

21(03):363–378, 1981.

XT Chen, XD Cal, and JP Brill. Gas-liquid stratified-wavy flow in horizontal pipelines. Journal of energy

resources technology, 119(4):209–216, 1997.

Nicholas P Cheremisinoff and E James Davis. Stratified turbulent-turbulent gas-liquid flow. AIChE journal,

25(1):48–56, 1979.

Stuart W Churchill. Friction-factor equation spans all fluid-flow regimes. Chemical engineering, 84(24):

91–92, 1977.

R Cli, JR Grace, and ME Weber. Bubbles, drops and particles, 1978.

Leonard S Cohen and Thomas J Hanratty. Effect of waves at a gasliquid interface on a turbulent air flow.

Journal of Fluid Mechanics, 31(3):467–479, 1968.

Cyril Frank Colebrook, T Blench, H Chatley, EH Essex, JR Finniecome, G Lacey, J Williamson, and GG Mac-

donald. Correspondence. turbulent flow in pipes, with particular reference to the transition region

between the smooth and rough pipe laws.(includes plates). Journal of the Institution of Civil engineers,

12(8):393–422, 1939.

JC Dallman, BG Jones, and Thomas J Hanratty. Interpretation of entrainment measurements in annular

gas-liquid flows. Two-Phase Momentum, Heat and Mass Transfer in Chemical, Process and Energy

Engineering System, 2:681, 1979.

Abraham E Dukler and Martin G Hubbard. A model for gas-liquid slug flow in horizontal and near horizontal

tubes. Industrial & Engineering Chemistry Fundamentals, 14(4):337–347, 1975.

Jean Fabre et al. Advancements in two-phase slug flow modeling. In University of Tulsa Centennial Petroleum

Engineering Symposium. Society of Petroleum Engineers, 1994.

Yongqian Fan. An investigation of low liquid loading gas-liquid stratified flow in near-horizontal pipes., 2004.

G Ferschneider. Ecoulements diphasiques gaz-liquide a poches et a bouchons en conduites. Revue de l’Institut

Francais du Petrole, 38(2):153–182, 1983.

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 27

LB Fore, SG Beus, and RC Bauer. Interfacial friction in gas–liquid annular flow: analogies to full and

transition roughness. International journal of multiphase flow, 26(11):1755–1769, 2000.

T Fukano and T Furukawa. Prediction of the effects of liquid viscosity on interfacial shear stress and frictional

pressure drop in vertical upward gas–liquid annular flow. International journal of multiphase flow, 24

(4):587–603, 1998.

Bahadir Gokcal. Effects of High Oil Viscosity on Two-phase Oil-gass Flow Behavior in Horizontal Pipes.

PhD thesis, University of Tulsa, 2005.

Bahadir Gokcal, Abdelsalam Al-Sarkhi, Cem Sarica, Eissa M Alsafran, et al. Prediction of slug frequency

for high viscosity oils in horizontal pipes. In SPE Annual Technical Conference and Exhibition. Society

of Petroleum Engineers, 2009.

LE Gomez, O Shoham, and Y Taitel. Prediction of slug liquid holdup: horizontal to upward vertical flow.

International Journal of Multiphase Flow, 26(3):517–521, 2000.

AH Govan, GF Hewitt, HJ Richter, and A Scott. Flooding and churn flow in vertical pipes. International

journal of multiphase flow, 17(1):27–44, 1991.

DM Graham, HR Kopke, MJ Wilson, DA Yashar, JC Chato, and TA Newell. An investigation of void fraction

in the stratified/annular flow regions in smooth, horizontal tubes. Technical report, Air Conditioning

and Refrigeration Center. College of Engineering. University of Illinois at Urbana-Champaign., 1999.

GA Gregory and DS Scott. Correlation of liquid slug velocity and frequency in horizontal cocurrent gas-liquid

slug flow. AIChE Journal, 15(6):933–935, 1969.

GA Gregory, MK Nicholson, and K Aziz. Correlation of the liquid volume fraction in the slug for horizontal

gas-liquid slug flow. International Journal of Multiphase Flow, 4(1):33–39, 1978.

Eric Grolman and Jan MH Fortuin. Gas-liquid flow in slightly inclined pipes. Chemical engineering science,

52(24):4461–4471, 1997.

Skjalg E Haaland. Simple and explicit formulas for the friction factor in turbulent pipe flow. Journal of

Fluids Engineering, 105(1):89–90, 1983.

Newman A Hall. Thermodynamics of fluid flow. Longmans, Green, 1957.

PJ Hamersma and J Hart. A pressure drop correlation for gas/liquid pipe flow with a small liquid holdup.

Chemical engineering science, 42(5):1187–1196, 1987.

J Hart, PJ Hamersma, and JMH Fortuin. Correlations predicting frictional pressure drop and liquid holdup

during horizontal gas-liquid pipe flow with a small liquid holdup. International Journal of Multiphase

Flow, 15(6):947–964, 1989.

William H Henstock and Thomas J Hanratty. The interfacial drag and the height of the wall layer in annular

flows. AIChE Journal, 22(6):990–1000, 1976.

NI Heywood and JF Richardson. Slug flow of airwater mixtures in a horizontal pipe: Determination of liquid

holdup by γ-ray absorption. Chemical Engineering Science, 34(1):17–30, 1979.

28 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

TJ Hill, DG Wood, et al. A new approach to the prediction of slug frequency. In SPE Annual Technical

Conference and Exhibition. Society of Petroleum Engineers, 1990.

Mamoru Ishii and Kaichiro Mishima. Droplet entrainment correlation in annular two-phase flow. Interna-

tional Journal of Heat and Mass Transfer, 32(10):1835–1846, 1989.

Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An introduction to statistical learning,

volume 112. Springer, 2013.

BC Jeyachandra, B Gokcal, Abdelsalam Al-Sarkhi, Cem Sarica, A Sharma, et al. Drift-velocity closure

relationships for slug two-phase high-viscosity oil flow in pipes. SPE Journal, 17(02):593–601, 2012.

HJ Kim, SC Lee, and SG Bankoff. Heat transfer and interfacial drag in countercurrent steam-water stratified

flow. International journal of multiphase flow, 11(5):593–606, 1985.

Sunil L Kokal. An experimental study of two phase flow in inclined pipes. Chemical and Petroleum Engi-

neering, University of Calgary, 1987.

Gene E Kouba. Horizontal slug flow modeling and metering. Technical report, Tulsa Univ., OK (USA),

1986.

JE Kowalski. Wall and interfacial shear stress in stratified flow in a horizontal pipe. AIChE journal, 33(2):

274–281, 1987.

Pui-Yan Lin. Flow regime transitions in horizontal gas-liquid flow. University of Illinois at Urbana-

Champaign, 1985.

Awad M. An Overview of Heat Transfer Phenomena, chapter Two-Phase Flow. InTech, 2012.

D Malnes. Slug flow in vertical, horizontal and inclined pipes. 1982. CAMARGO, RMT Hidrodinamica e

Transferencia de Calor no Escoamento Intermitente Horizontal. Campinas: Departamento de Engen-

haria de Petroleo, Faculdade de Engenharia Mecanica, Universidade Estadual de Campinas, 1991.

R Manabe. A comprehensive mechanistic heat transfer model for two-phase flow with high-pressure flow

pattern validation (ph. d dissertation), u. Tulsa, Tulsa, 2001.

Robert Marcano. Slug Characteristics for Two-Phase Horizontal Flow. University of Tulsa Fluid Flow

Projects, 1996.

C Mata, E Pereyra, JL Trallero, and DD Joseph. Stability of stratified gas–liquid flows. International journal

of multiphase flow, 28(8):1249–1268, 2002.

BJ McKeon, MV Zagarola, and AJ Smits. A new friction factor relationship for fully developed pipe flow.

Journal of fluid mechanics, 538:429–443, 2005.

Jose Moreiras, Eduardo Pereyra, Cem Sarica, and Carlos F Torres. Unified drift velocity closure relation-

ship for large bubbles rising in stagnant viscous fluids in pipes. Journal of Petroleum Science and

Engineering, 124:359–366, 2014.

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 29

L Norris. Correlation of prudhoe bay liquid slug lengths and holdups including 1981 large diameter flow line

tests. Exxon Production Research Co., Houston, TX, 1982.

RVA Oliemans, BFM Pots, and N Trompe. Modelling of annular dispersed two-phase flow in vertical pipes.

International journal of multiphase flow, 12(5):711–732, 1986.

Akiharu Ousaka, T Nagashima, A Kariyasaki, I Morioka, M Kiyota, and H Mamoto. Effect of inclination

on distribution of entrainment flow rate in an inclined upward annular flow. Trans. JSME, Ser. B, 62

(600):2950–2956, 1996.

A Murat Ozbayoglu and H Ertan Yuksel. Analysis of gas–liquid behavior in eccentric horizontal annuli with

image processing and artificial intelligence techniques. Journal of petroleum science and engineering,

81:31–40, 2012.

II Paleev and BS Filippovich. Phenomena of liquid transfer in two-phase dispersed annular flow. International

Journal of Heat and Mass Transfer, 9(10):1089–1093, 1966.

Lei Pan and Thomas J Hanratty. Correlation of entrainment for annular flow in vertical pipes. International

Journal of Multiphase Flow, 28(3):363–384, 2002.

E Pereyra, C Torres, R Mohan, L Gomez, G Kouba, and O Shoham. A methodology and database to

quantify the confidence level of methods for gas–liquid two-phase flow pattern prediction. Chemical

Engineering Research and Design, 90(4):507–513, 2012.

Davide Picchi and Pietro Poesio. Uncertainty quantification and global sensitivity analysis of mechanistic

one-dimensional models and flow pattern transition boundaries predictions for two-phase pipe flows.

International Journal of Multiphase Flow, 90:64–78, 2017.

J Ross Quinlan. C4. 5: programs for machine learning. Elsevier, 2014.

Pravin Sawant, Mamoru Ishii, and Michitsugu Mori. Droplet entrainment correlation in vertical upward

co-current annular two-phase flow. Nuclear Engineering and Design, 238(6):1342–1352, 2008.

Pravin Sawant, Mamoru Ishii, and Michitsugu Mori. Prediction of amount of entrained droplets in vertical

annular two-phase flow. International Journal of Heat and Fluid Flow, 30(4):715–728, 2009.

Stuart L Scott, Ovadia Shoham, James P Brill, et al. Prediction of slug length in horizontal, large-diameter

pipes. SPE Production Engineering, 4(03):335–340, 1989.

Mack Shippen and William J Bailey. Steady-state multiphase flow past, present, and future, with a perspec-

tive on flow assurance. Energy & Fuels, 26(7):4145–4157, 2012.

O Shoham. Flow pattern transition and characterization in gas-liquid two-phase flow in inclined pipes. 1982.

PhD Disertation, 1982.

PK Swanee and Akalank K Jain. Explicit equations for pipeflow problems. Journal of the hydraulics division,

102(5), 1976.

Yehuda Taitel and Dvora Barnea. Encyclopedia of two-phase heat transfer and flow i: Fundamentals and

methods, 2016.

30 Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction

Yemada Taitel and AE Dukler. A model for predicting flow regime transitions in horizontal and near

horizontal gas-liquid flow. AIChE Journal, 22(1):47–55, 1976.

Chao Tan, Feng Dong, and Mengmeng Wu. Identification of gas/liquid two-phase flow regime through ert-

based measurement and feature extraction. Flow Measurement and Instrumentation, 18(5):255–261,

2007.

John R Thome. On recent advances in modeling of two-phase flow and heat transfer. Heat Transfer Engi-

neering, 24(6):46–59, 2003.

Enrico Trononi. Prediction of slug frequency in horizontal two-phase slug flow. AIChE Journal, 36(5):

701–709, 1990.

Amos Ullmann and Neima Brauner. The prediction of flow pattern maps in minichannels. Multiphase Science

and Technology, 19(1), 2007.

NA Vlachos, SV Paras, and AJ Karabelas. Liquid-to-wall shear stress distribution in stratified/atomization

flow. International journal of multiphase flow, 23(5):845–863, 1997.

Graham B Wallis. One-dimensional two-phase flow. McGraw-Hill, 1969.

ME Weber, A Alarie, and ME Ryan. Velocities of extended bubbles in inclined tubes. Chemical engineering

science, 41(9):2235–2240, 1986.

PB Whalley and Geoffrey Frederick Hewitt. The correlation of liquid entrainment fraction and entrainment

rate in annular two-phase flow. UKAEA Atomic Energy Research Establishment, 1978.

Robert Joseph Wilkens. Prediction of the flow regime transitions in high pressure, large diameter, inclined

multiphase pipelines. PhD thesis, Ohio University, 1997.

T Xie, SM Ghiaasiaan, and S Karrila. Artificial neural network approach for flow regime classification in

gas–liquid–fiber flows based on frequency domain analysis of pressure signals. Chemical Engineering

Science, 59(11):2241–2251, 2004.

GJ Zabaras et al. Prediction of slug frequency for gas/liquid flows. SPE journal, 5(03):252–258, 2000.

Hong-Quan Zhang, Qian Wang, Cem Sarica, and James P Brill. A unified mechanistic model for slug liquid

holdup and transition between slug and dispersed bubble flows. International journal of multiphase

flow, 29(1):97–107, 2003a.

Hong-Quan Zhang, Qian Wang, Cem Sarica, James P Brill, et al. Unified model for gas-liquid pipe flow via

slug dynamics-part 1: Model development. TRANSACTIONS-AMERICAN SOCIETY OF MECHAN-

ICAL ENGINEERS JOURNAL OF ENERGY RESOURCES TECHNOLOGY, 125(4):266–273, 2003b.

Hong-Quan Zhang, Cem Sarica, et al. Unified modeling of gas/oil/water pipe flow-basic approaches and

preliminary validation. SPE Projects, Facilities & Construction, 1(02):1–7, 2006.

Hong-Quan Zhang, Cem Sarica, et al. A model for wetted-wall fraction and gravity center of liquid film in

gas/liquid pipe flow. SPE Journal, 16(03):692–697, 2011.

Hernandez, Valencia Data driven methodology for model selection in flow pattern prediction 31

L Zhao and KS Rezkallah. Gas-liquid flow patterns at microgravity conditions. International Journal of

Multiphase Flow, 19(5):751–763, 1993.

DJ Zigrang and ND Sylvester. Explicit approximations to the solution of colebrook’s friction factor equation.

AIChE Journal, 28(3):514–515, 1982.