Computer Methods and Programs in Biomedicine 169 (2019) 59–69

Contents lists available at ScienceDirect

Computer Methods and Programs in Biomedicine

journal homepage: www.elsevier.com/locate/cmpb

Geometrical features for premature ventricular contraction recognition with analytic hierarchy process based machine learning algorithms selection

Bruno Rodrigues de Oliveira a,∗, Caio Cesar Enside de Abreu b, Marco Aparecido Queiroz Duarte c, Jozue Vieira Filho d

a Department of Electrical Engineering, São Paulo State University (UNESP), Ilha Solteira, Brazil
b Department of Computing, Mato Grosso State University (UNEMAT), Alto Araguaia, Brazil
c Department of Mathematics, Mato Grosso do Sul State University (UEMS), Cassilândia, Brazil
d Telecommunication and Aeronautic Engineering, São Paulo State University (UNESP), São João da Boa Vista, Brazil

∗ Corresponding author. Department of Electrical Engineering, São Paulo State University (UNESP), Brasil Avenue 56, Ilha Solteira, Brazil.
E-mail addresses: [email protected] (B.R.d. Oliveira), [email protected] (C.C.E.d. Abreu), [email protected] (M.A.Q. Duarte), [email protected] (J. Vieira Filho).

Article history: Received 22 August 2018. Revised 24 November 2018. Accepted 24 December 2018.

Keywords: Electrocardiogram analysis, Premature Ventricular Contraction, Geometrical features

Abstract

Background and Objective: Premature ventricular contraction is associated with the risk of coronary heart disease, and its diagnosis depends on long-term heart monitoring. For this purpose, monitoring through Holter devices is often used, and computational tools can provide essential assistance to specialists. This paper presents a new premature ventricular contraction recognition method based on a simplified set of features, extracted from geometric figures constructed over QRS complexes (Q, R and S waves).

Methods: Initially, a preprocessing stage based on wavelet denoising and electrocardiogram signal scaling is applied. Then, the signal is segmented taking into account the ventricular depolarization timing, and a new set of geometrical features is extracted. In order to validate this approach, simulations encompassing eight different classifiers are presented. To select the best classifiers, a new methodology is proposed based on the Analytic Hierarchy Process.

Results: The best results, achieved with an Artificial Immune System, were 98.4%, 91.1% and 98.7% for accuracy, sensitivity and specificity, respectively. When artificial examples were generated to balance the dataset, the recognition performance increased to 99.0%, 98.5% and 99.5%, employing the Support Vector Machine classifier.

Conclusions: The proposed approach is compared with some of the latest references, and the results indicate its effectiveness as a method for recognizing premature ventricular contraction. Besides, the overall system presents a low computational load.

© 2018 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.cmpb.2018.12.028
0169-2607/© 2018 Elsevier B.V. All rights reserved.

1. Introduction

The health of an individual is closely associated with the health of the heart. Several diseases affect the sinus rhythm, which is considered the normal heartbeat, and produce what is called arrhythmia. One of them is Premature Ventricular Contraction (PVC), which consists of premature heartbeats originating from the ventricles. In order to assess heart health, the most common clinical examinations are those that employ Electrocardiogram (ECG) analysis. An ECG records the electrical heart activity triggered by the atria and the ventricles, from electrodes placed on the body surface. It is mainly composed of the P, Q, R, S, T and U waves, and of the PR, ST and QT intervals, according to Fig. 1. The P wave and the QRS complex represent the atrial and ventricular depolarization, respectively, and the T wave represents the ventricular repolarization, whereas the atrial repolarization is hidden by the QRS complex due to its low amplitude.

The membrane of the cardiac cells has resting and action potentials. When the depolarization threshold is overcome, the action potential is triggered [1], resulting in the atrial and ventricular contractions. PVC is a result of three possible effects: reentry, triggered activity and abnormal impulse formation [2]. Although some patients may suffer from PVC episodes without realizing them (e.g. due to external factors such as food and medication), an increase in their occurrence frequency can lead to hemodynamic problems [3]. The occurrence of PVC episodes is common in some patients with heart
    diseases or in those who have already had a myocardial infarction. For the first group, PVC prevalence is associated with the risk of sudden death [4]. The prevalence of PVC was measured in some studies cited by Latchamsetty and Bogun [2]. In a study with 301 middle-aged (55 to 58 years old) men, 62% had some ventricular arrhythmia, including PVC. For those with a higher risk of coronary heart disease, PVC was more frequent. In another study, carried out with 122,043 men, mostly young and healthy, aged from 16 to over 50 years, the PVC occurrence was less than 1%. Latchamsetty and Bogun [2] highlighted that the prevalence of PVC depends on the screening duration and type. In general, PVC occurs in approximately 50% of the population when 24/48-hour Holter monitoring is considered. By contrast, for a simple ECG, the PVC occurrence in the whole population is close to 1%. Therefore, computational systems are essential for the analysis of long-term ECG obtained by Holter monitoring.

    Fig. 1. ECG waves and intervals.

    The problem of PVC recognition is very complex because the PVC pattern is quite changeable, even for the same patient. An example is shown in Fig. 2, where a segment of record 207 from the MIT-BIH database [11,12] with different PVC (V label) waveforms is presented.

    Many studies have addressed automatic arrhythmia recognition using different methodologies, but only a few have focused on PVC. Li et al. [5] proposed an approach based on template matching, considering the morphological differences between the ventricular depolarization (QRS complex) and repolarization phases. Principal Component Analysis and linear regression were used by Hadia et al. [6] to detect PVC and Normal heartbeats. Zarei et al. [7] proposed a strategy based on the variation of the principal directions provided by Principal Component Analysis: a matrix is built with non-PVC beats, and one of these beats is replaced by a PVC beat to obtain a new matrix. Bazi et al. [8] extracted features using the Wavelet Transform at four levels and the S-Transform to compute QRS duration and RR interval. Liu et al. [9] introduced a new set of features based on Lyapunov exponents and their derivative. Adnane and Belouchrani

    [10] evaluated an approach for QRS detection based on wavelet coefficients to obtain an h value representing the sum of the coefficients from the Haar wavelet transform at levels 3 and 4, which is compared to a threshold value to determine whether the clusters belong to the PVC or Normal beat.

    Fig. 2. Example of different PVCs for record 207 of the MIT-BIH database. V, R and L labels are PVC heartbeats, right and left bundle branch block, respectively.

    In the present work, a PVC recognition method is proposed based on a completely new set of features extracted from a geometrical viewpoint. Each QRS complex is projected onto the Cartesian plane, where the initial QRS complex sample is the plane origin. A triangle, whose vertices are the origin and the maximum and minimum of the QRS complex, is built on the plane. An incircle is drawn for this triangle, and twelve measurements are calculated from these two geometric figures, also including differences of the maximum/minimum projections on the horizontal and vertical axes. This approach is unprecedented, since the most common features are based on measures from RR intervals and on the duration and amplitude of the PQRST waves, according to Luz et al. [11]. On the other hand, there are classes of methods that make use of domain transformation techniques, such as Wavelet and Fourier transforms, or advanced decomposition processes such as Principal Component Analysis, in order to extract features [11]. However, these methods are more computationally intensive than the one proposed here.

    Aiming to highlight the contributions of this work, eight different types of machine learning algorithms were implemented, and the results show that the classification performance is clearly due to the new set of features, even considering some influence of the algorithm. In order to select the best classifiers, a new methodology based on the Analytic Hierarchy Process (AHP) is proposed. The main contribution in this aspect is the proposition of a new conversion function that converts numerical differences to Saaty's scale, enabling a suitable AHP execution.

    Simulations were performed using two datasets from two MIT-BIH [12] databases. In addition to real data, balanced datasets composed of synthetic QRS complexes from averaged PVC heartbeats were used. The proposed method is also confronted with a dataset that simulates QRS complex misdetection. Furthermore, the overall results are compared with those obtained from the reference methods cited previously.

    The remainder of this paper is organized as follows: in Section 2, the dataset description, evaluation measures, foundations and proposed approach are described; in Sections 3 and 4, results are respectively presented and discussed; finally, conclusions are presented in Section 5.

    2. Materials and methods

    2.1. Dataset description

    Two classical databases provided by Goldberger et al. [12,13] were chosen for the PVC heartbeat recognition: the MIT-BIH Arrhythmia Database (ARDB) and the MIT-BIH ST Change Database (STDB). They are composed of 48 half-hour ECG recordings and 28 recordings of varying lengths, respectively, all sampled at 360 Hz. Only recordings from the main lead were considered. According

    to the Association for the Advancement of Medical Instrumentation (AAMI) recommendations, recordings 102, 104, 107 and 217 from ARDB, which are composed of paced heartbeats, were discarded [14]. The remaining recordings were split into training (DS1) and test (DS2) datasets. An additional validation dataset (DS3), obtained from STDB, is used to evaluate the robustness of the proposed features on new test items. The recordings of these datasets are described in Table 1.

    Table 1
    Datasets description.

    | Name | Recordings | Normal | PVC |
    |---|---|---|---|
    | DS1 | 101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122, 124, 201, 203, 205, 207, 208, 209, 215, 220, 223, 230 | 38,087 | 3683 |
    | DS2 | 100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210, 212, 213, 214, 219, 221, 222, 228, 231, 232, 233, 234 | 36,428 | 3219 |
    | DS3 | 300 to 327 | 75,019 | 322 |
    | Total | 71 | 150,534 | 7224 |

    It is noteworthy that the DS1 and DS2 datasets used in this work are the same as in the studies conducted by Li et al. [5] and Zarei et al. [7]. Furthermore, the DS3 dataset is used only in the validation phase. Normal and PVC heartbeats are those labeled as N and V in the MIT-BIH databases, respectively.

    2.2. Evaluation measures

    In order to assess the performance of the proposed approach, several classical evaluation measures are employed. They are based on the number of true positive (TP), true negative (TN), false positive (FP) and false negative (FN) classifications obtained in the test phase: accuracy $A_{cc} = (TP + TN)/(TP + TN + FP + FN)$; sensitivity $S_e = TP/(TP + FN)$; specificity $S_p = TN/(TN + FP)$; positive prediction $P^{+} = TP/(TP + FP)$; negative prediction $P^{-} = TN/(TN + FN)$; and the AUC (Area Under Curve) of the Receiver Operating Characteristic curve (ROC curve), which plots the true positive rate versus the false positive rate [15].

    In this work, positive classifications are those belonging to the PVC class. It is important to highlight that the sensitivity and specificity measures are the recall measures for positive and negative values, respectively [15]. The measures mentioned above ensure an accurate assessment of the proposed approach, since the training, test and validation datasets are unbalanced, i.e., there are more Normal than PVC heartbeats. This imbalance also justifies the construction of artificial QRS complexes in order to balance the datasets in Section 3.4.
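    As an illustration of the measures defined in Section 2.2, the following minimal Python sketch computes them from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for the example; they are not results from this work.

```python
def evaluation_measures(tp, tn, fp, fn):
    """Measures of Section 2.2 computed from confusion-matrix counts.

    The positive class is PVC, the negative class is Normal.
    """
    return {
        "Acc": (tp + tn) / (tp + tn + fp + fn),  # accuracy
        "Se": tp / (tp + fn),                    # sensitivity (recall of PVC)
        "Sp": tn / (tn + fp),                    # specificity (recall of Normal)
        "P+": tp / (tp + fp),                    # positive prediction
        "P-": tn / (tn + fn),                    # negative prediction
    }


# Hypothetical, unbalanced counts used only to exercise the function.
measures = evaluation_measures(tp=90, tn=880, fp=20, fn=10)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

    With unbalanced classes such as these, accuracy alone can look high even when sensitivity is modest, which is why the measures are reported together.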

    2.3. Machine learning algorithms

    Machine learning algorithms are widely used for data classification. Therefore, in this work some of them are briefly discussed and applied. Furthermore, a new methodology based on AHP is proposed to select, among the studied machine learning algorithms, the top three that lead to the best PVC classifiers [15–23].

    2.3.1. k-Nearest neighbors

    k-Nearest Neighbors (k-NN) is one of the simplest algorithms for data classification. It is an instance-based learning algorithm because it labels a new sample by computing the distance between it and the samples stored in the training phase. The class assigned to the test item is chosen by majority vote: the most voted class among the k nearest neighbors is assigned to the test item [15].
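    The majority-vote rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the two-dimensional toy samples are assumptions, while k = 3 and the Euclidean distance follow the configuration reported in Table 3.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    train: list of (feature_vector, label) pairs stored in the training phase.
    """
    # Sort stored samples by Euclidean distance to the query, keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy samples invented for illustration ("N" = Normal, "V" = PVC).
train = [((0.0, 0.0), "N"), ((0.1, 0.2), "N"), ((0.2, 0.1), "N"),
         ((1.0, 1.0), "V"), ((0.9, 1.1), "V")]
label = knn_predict(train, (0.95, 1.0), k=3)
print(label)
```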

    2.3.2. Multinomial naive Bayes

    Bayesian classification algorithms are broadly used in pattern recognition and are based on Bayes' rule, which considers joint, conditional and marginal probabilities [16]. These probabilistic classifiers assume that the samples are generated by a mixture model where each class is a component of this mixture [16,17]. Naive Bayes is one of these algorithms. This model relies on the independence assumption about the probability distribution, which reduces its complexity. A variant of this algorithm is the so-called Multinomial Naive Bayes, for which the probability distribution is multinomial [16,17].

    2.3.3. Voted perceptron

    Perceptron is a popular algorithm for binary linear classification proposed by Rosenblatt [18,19] and based on the McCulloch-Pitts nonlinear neuron [19]. Essentially, the perceptron structure consists of the data input, a summing node, which computes a linear combination, and the output, which is produced by the hard limiter function [19]. The synaptic weights (elements of the linear combination) are updated, according to a convergence rule, until the computed output is equal to the desired output. For pattern recognition, the perceptron uses the linear combination of the synaptic weights and the data input to compute an output for a new example. Freund and Schapire [18] proposed a more sophisticated algorithm based on the Perceptron, named Voted Perceptron. In this algorithm, a list of all prediction vectors is maintained, and the weight of each vector is the number of iterations it survives without change, i.e., until a new mistake is made. Lastly, the prediction is calculated through a weighted majority vote.
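    A minimal sketch of the voted perceptron idea is given below, assuming labels in {-1, +1} and a constant first feature acting as bias. The function names and the toy data are hypothetical, and the sketch is not the implementation used in the experiments.

```python
def sign(z):
    """Hard limiter: +1 for positive input, -1 otherwise."""
    return 1 if z > 0 else -1


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_voted_perceptron(samples, epochs=10):
    """Return a list of (prediction_vector, survival_count) pairs.

    samples: list of (x, y) with y in {-1, +1}.
    """
    v = [0.0] * len(samples[0][0])
    c = 0
    vectors = []
    for _ in range(epochs):
        for x, y in samples:
            if sign(dot(v, x)) == y:
                c += 1                   # vector survives one more example
            else:
                vectors.append((v, c))   # store the vector and its weight
                v = [vi + y * xi for vi, xi in zip(v, x)]  # perceptron update
                c = 1
    vectors.append((v, c))
    return vectors


def predict(vectors, x):
    """Weighted majority vote over all stored prediction vectors."""
    return sign(sum(c * sign(dot(v, x)) for v, c in vectors))


# Toy linearly separable data, invented for illustration.
samples = [((1.0, 2.0, 1.0), 1), ((1.0, 1.5, 2.0), 1),
           ((1.0, -1.0, -1.5), -1), ((1.0, -2.0, -1.0), -1)]
vectors = train_voted_perceptron(samples, epochs=10)
print(predict(vectors, (1.0, 1.0, 1.0)), predict(vectors, (1.0, -1.0, -1.0)))
```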

    2.3.4. Multilayer perceptron

    Multilayer Perceptron (MLP) is a class of feed-forward artificial neural network with at least three layers of nodes. Therefore, the basic MLP configuration consists of input, intermediary (hidden) and output layers [19]. The number of hidden layers and the number of nodes per layer are defined by means of simulations and depend essentially on the data complexity. It is noteworthy that each node, except for those in the input layer, is a neuron with a nonlinear activation function. In our implementation, supervised learning was carried out with the back-propagation learning algorithm [19].

    2.3.5. Support vector machines

    Support Vector Machines (SVM) constitute a class of supervised learning binary classifiers and thus discriminate between two possible classes. The classification task is performed by constructing a linear hyperplane that separates the two classes of data. The optimal separating hyperplane is selected by maximizing the margin (distance) between the two classes. Due to data complexity, it is not always possible to create a linear data separation in the input data space. Therefore, space transformation by means of kernels is often used. Such an approach transforms the input data space into a higher-dimensional space, so that data that are not linearly separable in the input space become linearly separable in the new space [20].

    2.3.6. Radial-basis functions network

    In contrast to a neural network structure, the Radial-basis Functions (RBF) network has only three layers: input, hidden, and output [19]. In the hidden layer lies the main distinction, due to the

    nonlinear mapping from the input space to a higher-dimensional space through radial basis functions (usually Gaussian functions). In this layer, unsupervised learning is performed using the k-means algorithm, which determines the parameters of the functions (or kernels). The basis of this approach is Cover's theorem, which states that a nonlinear pattern recognition problem is more likely to be linearly separable in a high-dimensional space [19]. The output layer is linear and trained in a supervised manner.

    Table 2
    Saaty's scale. Converts verbal judgments to numeric values.

    | Verbal scale | Importance |
    |---|---|
    | Equal importance | 1 |
    | Somewhat more important | 3 |
    | Much more important | 5 |
    | Very much more important | 7 |
    | Absolutely more important | 9 |
    | Intermediate value (weaker) | 2, 4, 6, 8 |

    2.3.7. Random forest

    Tree-based predictors are widely employed in data classification

    problems, especially in environments where some expert knowl-

    edge is available, since this structure can be easily interpreted and

    validated. One disadvantage of such predictors is data-overfitting

    [22] , which makes their implementation questionable in many sit-

    uations, because it reduces their general predictive capacity. To

    overcome this problem and improve prediction, the Random For-

    est algorithm is an alternative that maintains the same tree model

    structure, but combining these predictors to generate a better over-

    all model, where each tree is constructed from a random vector

    which is sampled independently, whose probability distribution is

    the same for all the trees in the forest [23] .

    2.3.8. Artificial immune system

    Artificial Immune Systems (AIS) constitutes a relatively recent

    approach in the artificial intelligence field that connects con-

    cepts from biological immune system to computational concepts

    by means of metaphors. In other words, researchers in the AIS field

    look to the biological immune system as inspiration on how to

    solve problems in engineering and computer science [24]. Several of these biological concepts have natural learning properties that may be used to advantage in machine learning systems [24]: the distinction between molecules from our own cells and foreign molecules (the negative selection concept); the duplication of B-cells able to recognize an antigen, which stimulates the production of antibodies (the clonal selection concept); and the competition between these clones and the original cells, in which those with higher affinity may become memory cells (the affinity maturation concept).

    The AIS used in this work is the one proposed by Watkins, Tim-

    mis and Boggess [25] . Classification process performed by this AIS

    is based on the development of a set of memory cells generated in

    the training stage, capable of classifying data. Each memory cell is

    iteratively presented with each test item for stimulation. Then, sys-

    tem’s classification is determined by the most stimulated memory

    cells in a k nearest neighbor way: the class with higher number of

    stimulated memory cells is assigned to the test item.

    2.4. Analytic Hierarchy Process (AHP)

    AHP is a method for decision-making based on multiple crite-

    ria. It has been used in many areas to assist the decision mak-

    ers in complex and extremely important problems, such as: per-

    sonal, educational, manufacturing, political, engineering, industry,

    governmental [26] , judiciary [27] and machine learning algorithms

    selection [28–30] .

    Proposed by Saaty [31], AHP first models the decision process as a hierarchy. At the top is the goal, right below are the criteria and, at the bottom, the alternatives representing the available choices. After the problem is modeled, the alternatives and criteria are compared pairwise with respect to some criterion at the preceding level of the hierarchy. For each comparison, a value from Saaty's scale is assigned, according to Table 2.

    The result of these comparisons is a square pairwise matrix $A = (a_{ij})_{n \times n}$, $n \geq 2$:

    $$A_k^{(h)} = \begin{bmatrix} 1 & a_{12}^{(h)} & \cdots & a_{1n}^{(h)} \\ 1/a_{12}^{(h)} & 1 & \cdots & a_{2n}^{(h)} \\ \vdots & \vdots & \ddots & \vdots \\ 1/a_{1n}^{(h)} & 1/a_{2n}^{(h)} & \cdots & 1 \end{bmatrix}, \qquad (1)$$

    where $a_{ij}^{(h)} = w_i^{(h)}/w_j^{(h)}$ represents the judgment with respect to alternative $i$ over alternative $j$ in relation to some criterion in the hierarchy level $h$; $w_i^{(h)} > 0$, $\forall i, h$; $n$ is the number of alternatives and $k = 1, 2, 3, \ldots$ is used to identify the matrices. The elements of the normalized eigenvector $w_k^{(h)} = [w_1^{(h)} \; w_2^{(h)} \; \cdots \; w_n^{(h)}]^T$, associated with the principal eigenvalue $\lambda_{max}$, are the weights that explain the local importance of one alternative over another. It is named the local priority vector and is obtained by solving the system $A_k^{(h)} w_k^{(h)} = \lambda_{max} w_k^{(h)}$ subject to $w_k^{(h)T} [1 \; 1 \; \cdots \; 1]^T = 1$ [32]. Aiming to aggregate all local judgments into a unique priority, it is necessary to compute the global priority vector $v = [v_1 \; v_2 \; \ldots \; v_n]^T$. Each element in $v$ is given by:

    $$v_p = \sum_{l=1}^{o} s_l^{(h-1)} t_{pl}^{(h)}, \quad p = 1, \cdots, n, \qquad (2)$$

    where $s_l^{(h-1)}$ is the $l$-th criterion weight in relation to the goal, $t_{pl}^{(h)}$ is the local priority of the alternative $p$ with respect to the $l$-th criterion, and $o$ is the number of criteria. After obtaining vector $v$, it must be normalized by computing $v_{norm} = v/\|v\|_1$, where $\|v\|_1 = \sum_r |v_r|$ is the $l_1$-norm.

    AHP is used in this work to choose the best PVC classifier, considering each evaluation measure as a criterion and the classifiers themselves as the alternatives. In this way, the global priority vector contains the order of preference for choosing the classifiers. The best classifier is related to the greatest element in $v_{norm}$.
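    The computation of local and global priorities in Eqs. (1) and (2) can be illustrated with the sketch below. It approximates the principal eigenvector by power iteration, which is an assumption of this example (the text only states that the eigen-system is solved), and the pairwise matrices, criterion weights and function names are invented for illustration.

```python
def local_priorities(A, iterations=100):
    """Approximate the normalized principal eigenvector of a positive
    pairwise comparison matrix A (Eq. (1)) by power iteration."""
    n = len(A)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        w = [wi / total for wi in w]     # entries sum to 1
    return w


def global_priorities(criteria_weights, local):
    """Eq. (2) followed by l1 normalization.

    local[l][p] is the local priority of alternative p under criterion l.
    """
    n = len(local[0])
    v = [sum(s * t[p] for s, t in zip(criteria_weights, local))
         for p in range(n)]
    norm = sum(abs(x) for x in v)
    return [x / norm for x in v]


# Two hypothetical criteria comparing three alternatives on Saaty's scale.
A1 = [[1, 2, 6], [1 / 2, 1, 3], [1 / 6, 1 / 3, 1]]
A2 = [[1, 1 / 2, 1 / 4], [2, 1, 1 / 2], [4, 2, 1]]
t1, t2 = local_priorities(A1), local_priorities(A2)
v_norm = global_priorities([0.5, 0.5], [t1, t2])
best = max(range(len(v_norm)), key=v_norm.__getitem__)
print(v_norm, best)
```

    Both example matrices are perfectly consistent, so the local priorities reduce to [0.6, 0.3, 0.1] and [1/7, 2/7, 4/7]; real judgment matrices are usually only approximately consistent.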

    .5. Wavelet denoising

    Discrete Wavelet Transform (DWT) implementation is based on

    digital filter bank in a tree structure, where low (approximation

    avelet coefficients) and high (detail wavelet coefficients) frequen-

    ies are successively separated. Originally proposed by Donoho

    33] , for denoising purposes, the wavelet thresholding method con-

    ists in applying the DWT on the original signal and modifying de-

    ail coefficients with absolute values under a certain threshold. In

    ther words, it is accepted as noisy coefficients those ones with

    bsolute values under the estimated threshold. This procedure is

    fficient due to the DWT characteristics [33] . The way as the noisy

    oefficients will be modified depends on the thresholding func-

    ion choice. Hard thresholding is used in this work, since it has

    resented better results [34] . Its formulation is defined as follows

    33] :

    ˆ =

    {Y [ k, n ] if | Y [ k, n ] | > δ

    0 if | Y [ k, n ] | ≤ δ ,


    Fig. 3. Proposed approach overview. Fs is the sampling rate.

    Table 3
    Parameters configuration for the classifiers.

    | Name | Label | Parameters configuration |
    |---|---|---|
    | Random forest | RF | 90 trees |
    | k-Nearest neighbors | KNN | k = 3. Euclidean distance |
    | Multinomial Naive Bayes | MNB | None |
    | Support Vector Machine | SVM | Linear kernel. Use shrinking heuristic. γ = 1/12 and cost c = 10 |
    | Multilayer Perceptron | MLP | One hidden layer with 8 units. Learning rate equal to 0.3. Backpropagation learning algorithm. 500 epochs |
    | Radial basis function network | RBF | k-Means with 40 clusters |
    | Voted Perceptron | VP | 10 iterations and 34,169 perceptrons |
    | Artificial Immune System | AIS | Affinity threshold, clonal rate, hypermutation rate, k-nearest neighbor, stimulation value and B-cells equal to 0.2, 10, 2, 3, 0.9 and 150, respectively |


    where $Y[k,n]$ is the $n$-th wavelet coefficient at the $k$-th scale, $n = 1, 2, \ldots, N$, $k = 1, 2, \ldots$, and $\hat{Y}$ is the attenuated version of the signal $Y$. The threshold is estimated as $\delta = \sigma\sqrt{2\log_2 N}$, where $N$ is the length of $Y$ and $\sigma$ is the noise standard deviation. After the filter-bank analysis and the wavelet coefficient thresholding, a denoised signal is obtained by applying the inverse DWT.
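    The hard thresholding rule and the threshold estimate described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the example coefficients, the value of `sigma` and the signal length are hypothetical, and the DWT decomposition and reconstruction steps are left out.

```python
import math


def universal_threshold(sigma, signal_length):
    """Threshold delta = sigma * sqrt(2 * log2(N)), as written in the text."""
    return sigma * math.sqrt(2.0 * math.log2(signal_length))


def hard_threshold(coeffs, delta):
    """Keep coefficients with |c| > delta and set the remaining ones to zero."""
    return [c if abs(c) > delta else 0.0 for c in coeffs]


# Hypothetical detail coefficients of one scale, for illustration only.
detail = [0.05, -1.8, 0.3, 2.4, -0.1, 0.02, -0.6, 1.1]
delta = universal_threshold(sigma=0.2, signal_length=16)
denoised_detail = hard_threshold(detail, delta)
```

    Coefficients at or below the threshold are discarded entirely, while the surviving ones keep their original value, which is what distinguishes hard from soft thresholding.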

    2.6. A PVC recognition approach

    Supported by a new set of features extracted from a geometric viewpoint, a PVC recognition approach is proposed by means of six steps, described in Fig. 3.

    First, the raw ECG signal is read and then segmented, taking the R-peak location and the standard cardiac cycle duration (0.75 seconds). In the present implementation, the annotations provided by the MIT-BIH databases about the R-peak locations are used. For each obtained segment, the wavelet denoising method is applied, and then the segment is scaled to zero mean and unit variance. This preprocessing step ensures better results because it standardizes the data. After this step, an approximated QRS complex is obtained by segmenting it to the standard width of the QRS complexes, i.e., 0.12 seconds. After the QRS complex segmentation, the proposed features are extracted and used for the training and test/validation of the classifiers. In the following subsections, a detailed discussion encompassing the steps of the proposed method is provided.

    2.6.1. Preprocessing

    In the preprocessing step, a denoising procedure based on wavelet thresholding is performed, according to Section 2.5. In this approach, $Y$ and $\hat{Y}$ are the noisy and denoised ECG signal segments, respectively, and $\sigma = \mathrm{std}(Y[0:10])$, i.e., the standard deviation of the first ten samples. Besides removing high-frequency noise, the approximation coefficients in the last level are reduced to zero, aiming at baseline wandering removal [35]. The DWT decomposition is performed in five levels. Next, this ECG signal segment is scaled by means of the equation $\tilde{Y} = (\hat{Y} - \bar{\hat{Y}}) / \sigma_{\hat{Y}}$, where $\tilde{Y}$ is the denoised scaled ECG signal segment, and $\bar{\hat{Y}}$ and $\sigma_{\hat{Y}}$ are the average and the standard deviation of the denoised ECG signal segment $\hat{Y}$.
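    The noise estimate and the scaling step can be sketched with the standard library alone. The function names and the example segment are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption of this sketch, since the text does not state which estimator is used.

```python
import statistics


def noise_sigma(segment, n_first=10):
    """Noise estimate: standard deviation of the first n_first samples."""
    return statistics.pstdev(segment[:n_first])


def standardize(segment):
    """Scale a segment to zero mean and unit variance."""
    mean = statistics.fmean(segment)
    std = statistics.pstdev(segment)
    return [(x - mean) / std for x in segment]


# Hypothetical denoised segment, for illustration only.
segment = [0.0, 0.1, -0.1, 0.0, 0.2, 1.5, -0.8, 0.1, 0.0, -0.1, 0.05, 0.0]
sigma = noise_sigma(segment)
scaled = standardize(segment)
```

    A constant segment has zero standard deviation and would make the scaling undefined, so a production version would need to guard against that case.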

    2.6.2. Segmentation and training

    After obtaining a denoised scaled ECG signal segment, a new segmentation is performed considering 0.06 seconds forward and backward from the R-peak (0.12 seconds in total, the standard QRS width), resulting in a 44-sample segment. This procedure returns a signal $\tilde{Y} = [\tilde{Y}_1, \cdots, \tilde{Y}_{44}]$. On this segmented signal, the proposed features described in Table 4 are calculated. Thus, $\tilde{Y}$ is mapped into a feature space with fewer dimensions. This mapping is $f: \mathbb{R}^{44} \to \mathbb{R}^{12}$, where $f(\tilde{Y}) = [a_1, \cdots, a_{12}]$ is the feature (attribute) vector to be used as input to a machine learning algorithm. Considering that the main motivation of this research is to propose a new way to obtain consistent features for PVC recognition, eight classifiers encompassing different learning approaches were used in the simulations. Their main configurations are summarized in Table 3. Those configurations were defined by means of exhaustive experiments in order to obtain better results, using the DS1 and DS2 datasets in the training and test phases, respectively.
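    Cutting the 44-sample window around an annotated R-peak can be sketched as below. The function name `qrs_window` is hypothetical, and splitting the window into 22 samples before the R-peak and 22 samples starting at it is an assumption of this sketch; the text only states the total length.

```python
def qrs_window(signal, r_index, half_width=22):
    """Return 2 * half_width samples around an R-peak.

    The window holds half_width samples before the R-peak and half_width
    samples starting at the R-peak itself.
    """
    start = r_index - half_width
    stop = r_index + half_width
    if start < 0 or stop > len(signal):
        raise ValueError("window exceeds the signal boundaries")
    return signal[start:stop]


# Illustration on a dummy signal whose sample values equal their indices.
dummy = list(range(100))
window = qrs_window(dummy, r_index=50)
```

    Raising an error at the signal boundaries, instead of padding, keeps every window at exactly 44 samples so that the feature mapping always receives an input of the same size.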


    Table 4
    Features description.

    | Equation | Description |
    |---|---|
    | $a_1 = \lVert v_1 \rVert$ | Triangle side |
    | $a_2 = \lVert v_2 \rVert$ | Triangle side |
    | $a_3 = \lVert v_1 - v_2 \rVert$ | Triangle side |
    | $a_4 = \lVert (v_{1x} + v_{2x},\ v_{1y} + v_{2y}) \rVert / 3$ | Triangle center of mass |
    | $a_5 = \theta = \arccos(v_1 * v_2 / a_1 a_2)$ | Angle between $a_1$ and $a_2$ |
    | $a_6 = a_1 a_2 \sin(\theta) / 2$ | Triangle area |
    | $a_7 = 2 a_1 / p$ | Incircle radius |
    | $a_8 = \lVert (a_2 v_{1x} + a_3 v_{2x},\ a_2 v_{1y} + a_3 v_{2y}) \rVert / p$ | Incenter |
    | $a_9 = 2 \pi a_7$ | Length of the incircle |
    | $a_{10} = \pi a_7^2$ | Incircle area |
    | $a_{11} = p_x = \lvert v_{1x} - v_{2x} \rvert$ | Distance between projections on the $x$-axis |
    | $a_{12} = p_y = \lvert v_{1y} - v_{2y} \rvert$ | Distance between projections on the $y$-axis |

    Fig. 4. Modeling of the proposed geometric characteristics: (a) Standard non-PVC heartbeat. (b) Standard PVC heartbeat.


    2.6.3. Proposed features

    As outlined previously, the main motivation for this paper is the proposition of new features, based on a geometric viewpoint, that are able to recognize PVC heartbeats. In other words, the proposed features aim to explain the QRS complex morphology by building geometric figures that represent its waveform in a suitable way. Therefore, each QRS complex is projected onto the Cartesian plane. Afterward, a triangle is built by fixing the fiducial points R and S as two vertices and the third vertex at the origin of the plane, which corresponds to the beginning of the QRS complex segment. Furthermore, a circle (incircle) inscribed in this triangle is also built. This approach starts a new way to design attributes from ECG waveforms based on geometric figures, representing them in a simplified manner. Note that, unlike state-of-the-art methods, the proposed method considers not only the amplitude and width (duration) of the characteristic ECG waves, but also the relation among them, in a linear and a nonlinear way.

    After building these geometric figures, the measures presented in Table 4 are calculated. In the proposed approach, it is assumed that triangles from abnormal heartbeats suffer some distortion in relation to the ones from normal heartbeats, since the respective QRS complexes differ in amplitude and position. Consequently, the incircle is also modified. Specifically for PVC heartbeats, the minimum of the QRS complex tends to be larger than in normal beats. This is due to the fact that PVC is characterized by a gain in the amplitude of the R and S waves, generating a distortion in the QRS complex morphology [3]. Such morphology depends on the origin of the ectopic heartbeat. If it occurs at the right ventricle, the PVC has a left bundle branch block (LBBB) morphology, since the right ventricle depolarization occurs before that of the left ventricle. This makes the QRS complexes predominantly positive [4]. Such an occurrence complicates the Normal heartbeat recognition, since the LBBB and Normal heartbeats may be coincident [36].

    Fig. 4(a) and (b) show examples of standard waveforms for a Normal heartbeat and a PVC, respectively. Variables $v_1$ and $v_2$ ($v'_1$ and $v'_2$) are vectors whose coordinates on the $y$-axis are the global maximum and minimum of the QRS complex waveform, respectively. $p_y$ and $p_x$ ($p'_y$ and $p'_x$) are the differences between the coordinate projections of vectors $v_1$ and $v_2$ ($v'_1$ and $v'_2$) on the respective axes. The triangle center of mass (barycenter) is represented by $c$ ($c'$). The incircle is tangent to each side of the triangle. Finally, the features obtained by this modeling are listed in Table 4, where "$*$" is the inner (dot) product of the vectors, $p$ is the triangle perimeter, and $\|\cdot\|$ and $|\cdot|$ are the Euclidean norm ($l_2$-norm) and the absolute value, respectively.
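    A subset of the geometric measures can be computed from the two vectors as sketched below. This is not the authors' code: the function name and the example coordinates are hypothetical, the center of mass and incenter features are omitted, and the incircle radius is obtained with the textbook identity $r = 2 \cdot \text{area} / p$, which is an assumption of this sketch and may differ from the expression printed in Table 4.

```python
import math


def triangle_features(v1, v2):
    """Measures of the triangle with vertices (0, 0), v1 and v2."""
    a1 = math.hypot(v1[0], v1[1])                  # side from the origin to v1
    a2 = math.hypot(v2[0], v2[1])                  # side from the origin to v2
    a3 = math.hypot(v1[0] - v2[0], v1[1] - v2[1])  # side from v1 to v2
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # Clamp to [-1, 1] so rounding errors cannot leave the acos domain.
    theta = math.acos(max(-1.0, min(1.0, dot / (a1 * a2))))
    area = a1 * a2 * math.sin(theta) / 2.0
    perimeter = a1 + a2 + a3
    inradius = 2.0 * area / perimeter  # assumption: textbook inradius identity
    return {
        "a1": a1, "a2": a2, "a3": a3,
        "theta": theta, "area": area,
        "perimeter": perimeter, "inradius": inradius,
        "incircle_length": 2.0 * math.pi * inradius,
        "incircle_area": math.pi * inradius ** 2,
        "p_x": abs(v1[0] - v2[0]),
        "p_y": abs(v1[1] - v2[1]),
    }


# Arbitrary coordinates (a 3-4-5 right triangle), not taken from an ECG.
feats = triangle_features((3.0, 0.0), (0.0, 4.0))
```

    The right triangle is convenient for checking the sketch by hand: its sides are 3, 4 and 5, its area is 6 and its incircle radius is 1.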

    Although the geometric figures summarize the QRS complex waveforms, so that the features only explain these forms, it is emphasized that the projections on the $x$-axis are regarded as timing features, since they are related to the positions of the local fiducial points of the QRS complexes. This is important because such timing information is relevant for classification, as verified by Bazi et al. [8]. On the other hand, the projections on the $y$-axis represent the magnitude information. It is noteworthy that features $a_1, \ldots, a_{12}$ are simple to compute and do not require any kind of transformation or complicated decomposition process.

    Finally, as can be seen in Fig. 4, the QRS complexes for normal (Fig. 4(a)) and PVC (Fig. 4(b)) beats are different, since PVC configures distortions in the QRS. Therefore, the twelve proposed geometric characteristics provide a set of comparisons which will help a classifier to distinguish between the two QRS waveforms.

    Fig. 5. Proposed AHP model to choose a classifier.

    2.6.4. Classifiers selection

    In order to select the best classifiers and take into account the performance of all measures simultaneously, an AHP-based approach is proposed. The proposed AHP model is illustrated in Fig. 5. Note, from this figure, that the evaluation measures and the classifiers (alternatives) are located in hierarchy levels 1 and 2, respectively. At level $h = 2$, from left to right, each classifier is compared to another in relation to some measure at the top level, resulting in differences among the scores, which are evaluated by the function in (3). Each of the evaluated scores is an element of the pairwise comparison matrix, from which the local priority vector $t^{(2)}$ is obtained according to Section 2.4. Therefore, for each measure there will be a pairwise comparison matrix.

    On the other hand, at level $h = 1$, since the measures are not compared in the experiments and the most important measures are those related to the identification of PVC heartbeats, the elements in the local priority vector $s^{(1)}$ were set to 1 for the measures referring to the positive class, and to 0.5 for those referring to the negative class. Therefore, the pairwise comparison matrix is not required at level one. In this way, $s^{(1)} = [\,1\ 1\ 0.5\ 1\ 0.5\ 1\,]$ is related to the measures $A_{cc}$, $P^+$, $P^-$, $S_e$, $S_p$ and $AUC$, respectively, which, after the normalization mentioned in Section 2.4, is close to $s^{(1)} = [\,0.15\ 0.15\ 0.07\ 0.15\ 0.07\ 0.15\,]$, considering only two decimal places.

    Since the judgments are made in an automatic way, a function is proposed to replace the action of the decision maker, converting the scores of the objective measures to Saaty's scale:

    $$\begin{aligned}
    d_{pq} &= c_p - c_q, \quad \text{where } p, q = 1, \cdots, 8 \text{ and } p \neq q, \\
    \delta_{mpq} &= \lceil |d_{pq}|\, \kappa \rceil, \quad \text{where } m = 1, \cdots, 6 \text{ and } \kappa > 0, \\
    \phi(\delta_{mpq}) &= \begin{cases} 1 & \text{if } \delta_{mpq} < 1 \\ 9 & \text{if } \delta_{mpq} > 9 \\ 1/\delta_{mpq} & \text{if } d_{pq} < 0 \\ \delta_{mpq} & \text{otherwise,} \end{cases}
    \end{aligned} \tag{3}$$

    where $c_p$ and $c_q$ are the values of measure $m$ obtained by the classifiers $p$ and $q$, respectively, $\phi$ denotes the conversion function, and $\lceil x \rceil$ returns the smallest integer greater than or equal to $x$. The constant $\kappa$ can be adjusted to increase the weight of the difference $|d_{pq}|$. For example, in the comparisons among three distinct classifiers $p$, $q_1$ and $q_2$, if $d_{pq_1} = 0.001$, $d_{pq_2} = 0.1$ and $\kappa = 10$, then in both cases the conversion function gives $\phi(\delta_{mpq_1}) = \phi(\delta_{mpq_2}) = 1$. Therefore, both alternatives are equally preferable, i.e., the classifier $p$ has the same preference over $q_1$ and $q_2$. In order to increase the sensitivity of the function, setting $\kappa = 16$, for instance, the new results are $\phi(\delta_{mpq_1}) = 1$ and $\phi(\delta_{mpq_2}) = 2$. Therefore, the last result means that classifier $p$ is more preferred over $q_2$ than over $q_1$. In this work, $\kappa = 20$ presents appropriate results for the comparisons.
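    The conversion of score differences to Saaty's scale can be sketched as follows. The function names are hypothetical, and two details are assumptions of this sketch rather than statements from the text: the product is rounded to nine decimals before the ceiling, to guard against floating-point noise, and the cap at 9 is applied before the reciprocal is taken, so that the resulting matrix stays reciprocal.

```python
import math


def saaty_score(c_p, c_q, kappa=20.0):
    """Convert the score difference of classifiers p and q to Saaty's scale."""
    d = c_p - c_q
    # Assumption: round before the ceiling to absorb floating-point noise.
    delta = math.ceil(round(abs(d) * kappa, 9))
    if delta < 1:
        return 1.0  # identical scores: equal preference
    delta = min(delta, 9)  # Saaty's scale is capped at 9
    return 1.0 / delta if d < 0 else float(delta)


def comparison_matrix(scores, kappa=20.0):
    """Pairwise comparison matrix of all classifiers for one measure."""
    return [[saaty_score(cp, cq, kappa) for cq in scores] for cp in scores]


# P+ values from Table 5, ordered RF, KNN, MNB, SVM, MLP, RBF, VP, AIS.
p_plus = [0.827, 0.804, 0.819, 0.978, 0.778, 0.912, 0.953, 0.857]
matrix = comparison_matrix(p_plus)
svm_row = matrix[3]
```

    With $\kappa = 20$, the SVM row obtained this way is 4, 4, 4, 1, 4, 2, 1, 3, which agrees with the fourth row of the $P^+$ matrix in expression (4).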

    In the global priority vector calculation, given by Eq. (2), for the proposed hierarchy model, note that the indices $l = 1, 2, \ldots, 6$ and $p = 1, 2, \ldots, 8$ index the lists {$A_{cc}$, $P^+$, $P^-$, $S_e$, $S_p$, $AUC$} and {RF, KNN, MNB, SVM, MLP, RBF, VP, AIS}, representing measures and classifiers, respectively. For each measure's pairwise comparison matrix, a local priority vector $t_l^{(2)} \in \mathbb{R}^p$ is obtained. This vector contains the priority of an alternative over another. In order to calculate the global priority of the $p$-th alternative in relation to the $l$-th criterion (measure), the summation in Eq. (2) is applied.

    It should be noted that the AHP methodology proposed in this work is grounded on the conversion function in (3), which translates the dissimilarity among the objective measures related to the classifiers to Saaty's scale. This function is negligible in terms of computational load, since it is composed of simple mathematical operations. Although a consistency index is often used in the AHP framework, in this work it is replaced by an automatic conversion function in such a way that the subjectivity is purged. In decision-making problems, inconsistency arises due to subjective human judgments, when an alternative $A_1$ is preferable over $A_2$, and $A_2$ over $A_3$, but $A_3$ is preferable over $A_1$, characterizing the lack of transitivity among the alternatives. This results in an inconsistent matrix, to which there exists an associated consistency ratio, according to Saaty [31]. This fact is observed in Khanmohammadi and Rezaeiahari [29], where this ratio is calculated because an expert's knowledge was incorporated to obtain the priorities of the performance measures.

    In contrast to this proposal, Kou and Wu [30] use the AHP method together with other decision-making methods, such as TOPSIS, VIKOR, PROMETHEE II and GRA, to sort the classifiers. In Khanmohammadi and Rezaeiahari [29], the AHP method is combined with the expert's knowledge and used to establish the weights of the performance measures, which in turn rank the classifiers. However, the performance measures are not specifically compared to each other. Finally, in Jaya and Tamilselvi [28] the performance is computed by win-loss tables from the paired $t$-test, and the priority vectors are calculated in the same way as proposed here.

    Although AHP has been used in machine learning problems, as outlined above, the approach proposed in this paper (according to Fig. 5) is original. It is preferable to the others, since it depends only on the AHP method and the conversion function, which is easy to calculate. In addition, due to the ability to link classifiers to objective measures, this approach can be used in the construction of model ensembles. Thus, facing a large number of classifiers, the best ones can be selected to compose the ensemble according to several objective measures in an automated way.

    3. Results

    In order to verify the robustness of the proposed characteristics, experiments encompassing several classifiers with different learning approaches are carried out. This phase is performed in four stages: 1) General experiments: all classifiers are used in the dataset training and test. This establishes the overall performance and enables the choice of the best classifiers, besides providing the classifier parameters that are used later in the other experiments; 2) Selection of the classifiers: the top three classifiers are selected by the AHP method, based on the proposed conversion function (3); 3) Deviation in the QRS complex location: for the best classifiers, new tests are performed considering the characteristics extracted from the DS2 dataset modified by the insertion of deviations in the R-peak locations, in order to verify the robustness to misdetection; and 4) Artificial heartbeats: aiming to balance the datasets and provide more PVC instances for model learning, some artificial heartbeats are constructed and the cross-validation approach is implemented, avoiding misleading performance due to overfitting, for both databases described in Section 2.1.

    Table 5
    Results¹ considering DS1 and DS2 datasets for training and test, respectively.

    | Cls | Acc | P+ | P− | Se | Sp | AUC | TN | FP | FN | TP |
    |---|---|---|---|---|---|---|---|---|---|---|
    | RF | 0.972 | 0.827 | 0.985 | 0.833 | 0.985 | 0.963 | 35,866 | 562 | 539 | 2680 |
    | KNN | 0.968 | 0.804 | 0.983 | 0.805 | 0.983 | 0.928 | 35,797 | 631 | 623 | 2596 |
    | MNB | 0.970 | 0.819 | 0.984 | 0.821 | 0.984 | 0.945 | 35,841 | 587 | 571 | 2648 |
    | SVM | 0.976 | 0.978 | 0.976 | 0.721 | 0.999 | 0.860 | 36,376 | 52 | 898 | 2321 |
    | MLP | 0.967 | 0.778 | 0.985 | 0.835 | 0.985 | 0.955 | 35,662 | 766 | 530 | 2689 |
    | RBF | 0.971 | 0.912 | 0.976 | 0.724 | 0.994 | 0.977 | 36,204 | 224 | 888 | 2331 |
    | VP | 0.970 | 0.953 | 0.971 | 0.665 | 0.997 | 0.831 | 36,322 | 106 | 1079 | 2140 |
    | AIS | 0.984 | 0.857 | 0.992 | 0.911 | 0.987 | 0.949 | 35,938 | 490 | 287 | 2932 |
    | Avg ± σ | 0.972 ± 0.005 | 0.866 ± 0.069 | 0.982 ± 0.006 | 0.789 ± 0.075 | 0.989 ± 0.006 | 0.926 ± 0.049 | | | | |

    3.1. General experiments

    Table 5 shows the classification performance on test dataset

    (DS2). Classifiers and their configurations were presented in

    Table 3 . Some measures are calculated separately for Normal and

    PVC heartbeats, indicated by the superscript initials.
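    The measures reported in Table 5 can be recomputed from the confusion matrix entries, as sketched below. The function name is hypothetical, and the sketch assumes the usual definitions of accuracy, positive and negative prediction, sensitivity and specificity, with PVC taken as the positive class.

```python
def binary_metrics(tn, fp, fn, tp):
    """Usual two-class measures, with PVC as the positive class."""
    return {
        "Acc": (tp + tn) / (tp + tn + fp + fn),
        "P+": tp / (tp + fp),  # positive prediction
        "P-": tn / (tn + fn),  # negative prediction
        "Se": tp / (tp + fn),  # sensitivity
        "Sp": tn / (tn + fp),  # specificity
    }


# RF confusion matrix from Table 5: TN = 35,866, FP = 562, FN = 539, TP = 2680.
rf = {name: round(value, 3)
      for name, value in binary_metrics(35866, 562, 539, 2680).items()}
```

    Rounded to three decimals, these values match the RF row of Table 5 (0.972, 0.827, 0.985, 0.833 and 0.985).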

    3.2. Selecting the classifiers

    By means of the AHP method and the conversion function (3), the pairwise comparison matrices and their respective priority vectors are obtained, according to expressions (4)–(9), with respect to the $P^+$, $S_e$ and $AUC$ measures. The matrices for the $A_{cc}$, $P^-$ and $S_p$ measures and their priority vectors have all elements equal to 1.0 and 0.1, respectively. For these measures, such results mean that there is no preference among the classifiers.

    $$P^+ = \begin{bmatrix}
    1 & 1 & 1 & 1/4 & 1 & 1/2 & 1/3 & 1 \\
    1 & 1 & 1 & 1/4 & 1 & 1/3 & 1/3 & 1/2 \\
    1 & 1 & 1 & 1/4 & 1 & 1/2 & 1/3 & 1 \\
    4 & 4 & 4 & 1 & 4 & 2 & 1 & 3 \\
    1 & 1 & 1 & 1/4 & 1 & 1/3 & 1/4 & 1/2 \\
    2 & 3 & 2 & 1/2 & 3 & 1 & 1 & 2 \\
    3 & 3 & 3 & 1 & 4 & 1 & 1 & 2 \\
    1 & 2 & 1 & 1/3 & 2 & 1/2 & 1/2 & 1
    \end{bmatrix} \tag{4}$$

    ¹ σ, Avg and Cls mean standard deviation, average and classifier, respectively. Bold results are the best.


    $$t_2 = [\,0.071\ \ 0.062\ \ 0.071\ \ 0.265\ \ 0.060\ \ 0.166\ \ 0.208\ \ 0.093\,]^T \tag{5}$$

    $$S_e = \begin{bmatrix}
    1 & 1 & 1 & 3 & 1 & 3 & 4 & 1/2 \\
    1 & 1 & 1 & 2 & 1 & 2 & 3 & 1/3 \\
    1 & 1 & 1 & 2 & 1 & 2 & 4 & 1/2 \\
    1/3 & 1/2 & 1/2 & 1 & 1/3 & 1 & 2 & 1/4 \\
    1 & 1 & 1 & 3 & 1 & 3 & 4 & 1/2 \\
    1/3 & 1/2 & 1/2 & 1 & 1/3 & 1 & 2 & 1/4 \\
    1/4 & 1/3 & 1/4 & 1/2 & 1/4 & 1/2 & 1 & 1/5 \\
    2 & 3 & 2 & 4 & 2 & 4 & 5 & 1
    \end{bmatrix} \tag{6}$$

    $$t_4 = [\,0.151\ \ 0.125\ \ 0.136\ \ 0.061\ \ 0.151\ \ 0.061\ \ 0.037\ \ 0.273\,]^T \tag{7}$$

    $$AUC = \begin{bmatrix}
    1 & 1 & 1 & 3 & 1 & 1 & 3 & 1 \\
    1 & 1 & 1 & 2 & 1 & 1 & 2 & 1 \\
    1 & 1 & 1 & 2 & 1 & 1 & 3 & 1 \\
    1/3 & 1/2 & 1/2 & 1 & 1/2 & 1/3 & 1 & 1/2 \\
    1 & 1 & 1 & 2 & 1 & 1 & 3 & 1 \\
    1 & 1 & 1 & 3 & 1 & 1 & 3 & 1 \\
    1/3 & 1/2 & 1/3 & 1 & 1/3 & 1/3 & 1 & 1/3 \\
    1 & 1 & 1 & 2 & 1 & 1 & 3 & 1
    \end{bmatrix} \tag{8}$$

    $$t_6 = [\,0.153\ \ 0.138\ \ 0.145\ \ 0.063\ \ 0.145\ \ 0.153\ \ 0.054\ \ 0.145\,]^T \tag{9}$$

    In order to aggregate all results, from Eq. (2) and taking the values of the local priority vectors, the global priority vector is

    $$v_{norm} = [\,0.123\ \ 0.115\ \ 0.119\ \ 0.125\ \ 0.120\ \ 0.123\ \ 0.111\ \ 0.161\,]^T \tag{10}$$

    3.3. Deviation in the QRS complex location

    The proposed method is dependent on the QRS complex or R-peak location, in the same way as the baseline methods. Therefore, some deviation in this location is expected to damage the prediction accuracy [11]. To verify the robustness in relation to such deviations, a random number from the set {±1, ±2, …, ±5} is added to each R-peak location. This procedure is applied only to the test dataset (DS2). It is equivalent to shifting the QRS complex location by up to five samples, according to Ye and Kumar [37]. In practice, this experiment simulates errors in QRS complex detection algorithms. Results for the top three classifiers are presented in Table 6.

    Table 6
    Results² considering artificial deviations in R-peak locations.

    | Cls | Acc | P+ | P− | Se | Sp | AUC | TN | FP | FN | TP | Pri |
    |---|---|---|---|---|---|---|---|---|---|---|---|
    | SVM | 0.976 | 0.977 | 0.976 | 0.721 | 0.999 | 0.860 | 36,374 | 54 | 897 | 2322 | 0.35 |
    | RBF | 0.967 | 0.853 | 0.976 | 0.724 | 0.989 | 0.958 | 36,147 | 402 | 889 | 2330 | 0.33 |
    | AIS | 0.945 | 0.622 | 0.986 | 0.857 | 0.955 | 0.901 | 34,771 | 1657 | 491 | 2728 | 0.32 |
    | Avg ± σ | 0.962 ± 0.013 | 0.917 ± 0.147 | 0.979 ± 0.004 | 0.767 ± 0.063 | 0.981 ± 0.018 | 0.906 ± 0.040 | | | | | |

    ² Columns "Cls" and "Pri" mean Classifier and Priority, respectively.

    3.4. Balanced datasets

    The results presented so far come from experiments conducted on unbalanced datasets (see Table 1), since there are more instances of Normal than of PVC heartbeats. In order to overcome this drawback, it is proposed to generate artificial PVC heartbeat instances. To this end, two PVC heartbeats are randomly chosen among all those available and averaged, generating a new PVC heartbeat. This procedure results in 36,428 Normal heartbeats and 36,428 PVC heartbeats, of which 29,526 are artificial, for the DS1 and DS2 datasets. For the DS3 dataset, 38,292 Normal heartbeats and 38,292 PVC heartbeats are considered, of which 37,970 are artificial. The classification stage is performed considering the cross-validation method [38], with five folds, to avoid model overfitting. It is noteworthy that, in this experiment, different heartbeats from the same patient can be used for both the training and the test of the classifiers, unlike what was done with the unbalanced datasets. Table 7 shows the results considering the recordings from the DS1 and DS2 datasets. Table 8 shows the results only for the DS3 dataset, aiming to validate the performance obtained in the last experiments, using the classifier configurations described in Table 3, but now on a validation dataset.

    Table 7
    Five-fold cross-validation results using real and artificial PVC heartbeats from DS1 and DS2 datasets.

    | Cls | Acc | P+ | P− | Se | Sp | AUC | TN | FP | FN | TP | Pri |
    |---|---|---|---|---|---|---|---|---|---|---|---|
    | SVM | 0.990 | 0.995 | 0.985 | 0.985 | 0.995 | 0.990 | 36,253 | 175 | 547 | 35,881 | 0.33 |
    | RBF | 0.987 | 0.987 | 0.988 | 0.988 | 0.987 | 0.998 | 35,951 | 477 | 430 | 35,998 | 0.33 |
    | AIS | 0.982 | 0.981 | 0.984 | 0.984 | 0.981 | 0.982 | 35,735 | 693 | 588 | 35,840 | 0.33 |
    | Avg ± σ | 0.986 ± 0.003 | 0.987 ± 0.005 | 0.985 ± 0.001 | 0.985 ± 0.001 | 0.987 ± 0.005 | 0.990 ± 0.006 | | | | | |

    Table 8
    Five-fold cross-validation results using real and artificial PVC heartbeats from DS3 dataset.

    | Cls | Acc | P+ | P− | Se | Sp | AUC | TN | FP | FN | TP | Pri |
    |---|---|---|---|---|---|---|---|---|---|---|---|
    | SVM | 0.942 | 0.907 | 0.985 | 0.987 | 0.899 | 0.943 | 34,417 | 3875 | 513 | 37,779 | 0.30 |
    | RBF | 0.977 | 0.975 | 0.980 | 0.980 | 0.975 | 0.978 | 37,335 | 957 | 755 | 37,537 | 0.36 |
    | AIS | 0.945 | 0.953 | 0.938 | 0.937 | 0.954 | 0.945 | 36,512 | 1780 | 2431 | 35,861 | 0.34 |
    | Avg ± σ | 0.954 ± 0.015 | 0.945 ± 0.028 | 0.967 ± 0.021 | 0.968 ± 0.022 | 0.942 ± 0.032 | 0.955 ± 0.016 | | | | | |
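    The generation of artificial PVC heartbeats by averaging two randomly chosen real ones can be sketched as below. The function name, the fixed seed and the tiny example beats are hypothetical, and drawing two distinct beats for each average is an assumption of this sketch, since the text does not say whether the same beat may be picked twice.

```python
import random


def make_artificial_beats(pvc_beats, n_new, seed=0):
    """Create n_new beats, each the sample-wise mean of two distinct real beats."""
    rng = random.Random(seed)
    artificial = []
    for _ in range(n_new):
        beat_a, beat_b = rng.sample(pvc_beats, 2)
        artificial.append([(x + y) / 2.0 for x, y in zip(beat_a, beat_b)])
    return artificial


# Three hypothetical 4-sample "beats", for illustration only.
real = [[0.0, 1.0, -1.0, 0.0], [0.2, 1.4, -0.6, 0.0], [0.1, 0.8, -1.2, 0.1]]
new_beats = make_artificial_beats(real, n_new=5)
```

    Each artificial beat lies, sample by sample, between the two real beats it was built from, so the procedure adds instances without leaving the range covered by the recorded PVC beats.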

    4. Discussion

    Analyzing the average results of the classifiers in Table 5, it is noted that 86.6% of positive and 98.2% of negative examples are correctly classified. For SVM and AIS, these results are higher, 97.8% and 99.2%, respectively. Taking into account the overall performances, it is noted that AIS is the best, but it is worse than SVM with respect to the $P^+$ and $S_p$ measures, by close to 12.8% and 1.2%, respectively. It is also noteworthy that SVM generates fewer false positives than the other classifiers, whereas AIS produces fewer false negatives. In addition, RBF presents the greatest AUC result, around 2.8% and 11.7% higher in relation to the AIS and SVM classifiers, respectively. By taking the measures separately, it is not clear which classifier is the best in absolute terms. For example, AIS presents most measures higher than SVM; however, the positive prediction for PVC heartbeat recognition and the specificity are larger for SVM. For this reason, AHP is used to rank them.

    Analyzing the pairwise comparison matrices and their respective priority vectors, according to expressions (4)–(9), from the $P^+$ measure, matrix (4) and vector (5), it is concluded that the SVM classifier is the best, with a priority of 0.265 (4-th column in vector (5)) and since all values are integers, greater than or equal to one (4-th row in matrix (4)). Therefore, SVM is preferable over the others, except for the VP classifier, since the value $P^+_{4,7}$ (the element located in the fourth row and seventh column) is equal to one. Likewise, VP is the second best, with a priority of 0.208 and having all integer values on the 7-th row. These results are consistent with those in Table 5, since the SVM and VP classifiers obtained better results for $P^+$.

    In relation to the $S_e$ measure, from expressions (6) and (7), the AIS classifier is the best again. The preference over the VP classifier is five (element $S_{e\,8,7}$), since the VP classifier got the lowest value of all the measures, which is 0.665 (Table 5, row 8 and column 9). In addition, this preference is the highest value obtained.

    Lastly, from the matrix in (8) and the vector in (9), it is concluded that RBF and RF are the best in terms of the AUC measure, with a priority of 0.153 (first and 6-th rows in (9)), followed by AIS, MNB and MLP, with the same priority.

    Considering that the elements in the global priority vector $v_{norm}$, given by (10), are ordered as RF, KNN, MNB, SVM, MLP, RBF, VP and AIS, the top three classifiers, from the best to the worst, are AIS, SVM and RBF, with 16.12%, 12.53% and 12.39% of priority, respectively. This selection is consistent with the results presented in Table 5, taking into account the bold values.

    Comparing the respective results shown in Table 5 and Table 6, it is observed that AIS and RBF present worse performances. There was a decrease mainly in the positive class measures. Meanwhile, SVM maintained the same performance, getting only two more false positives and one fewer false negative. RBF only maintained the $P^-$ and $S_e$ measures. On the other hand, all AIS scores in Table 6 are lower than those obtained in Table 5. In general, AIS presents the worst performance, according to the priority vector (last column of Table 6). Nevertheless, AIS presented the lowest number of false negative alarms in both experiments. Therefore, these results attest to the robustness of the proposed geometrical features, using SVM, in the face of QRS complex misdetection.

    The overall results shown in Table 7 make clear a significant improvement in comparison with the previous results on unbalanced real data. The RBF classifier still keeps the first position with respect to the AUC measure. Furthermore, it overcomes AIS in relation to the other measures, but only from the third decimal place on. Finally, SVM obtained better results for a larger number of measures. Even so, the AHP tool provides the same priority values for these classifiers. This is justified because the relative difference among the measurements obtained by each classifier is very low.

    On the other hand, the results obtained from the balanced STDB database are worse than the ones obtained from ARDB in relation to all measures. Partly, this can be explained by the lower number of PVC instances in STDB, which prevents further generalization by the classifiers. Note that the decreases, comparing the results in Table 7 with the ones in Table 8, for $P^+$ and $P^-$, are on average 4.2% and 1.8%, respectively. Another circumstance contributing to these decreases refers to the classifier parameters, since they were stipulated from the ARDB database and adjusted for it.

    Ultimately, the proposed approach is compared with baseline methods. Note from Table 9 that some works did not report all the performance measures used in the evaluations. Furthermore, each selected paper used different classifiers. However, the goal in this work is to evaluate the proposed geometrical features, and thus, the classifier that provides the best results. It is also worth noting that the reported papers and the present work used the same database, ARDB, but different dataset recordings.

    Table 9
    Comparing the proposed method with baseline methods.

    | Approach | PVC Beats | Classifier | Acc | P+ | P− | Se | Sp |
    |---|---|---|---|---|---|---|---|
    | Liu et al. [9] | 2400 | LVQ neural network | 0.99 | 0.92 | – | 0.90 | – |
    | Bazi et al. [8] | 7117 | Gaussian process | 0.96 | – | – | 0.97 | 0.96 |
    | Hadia et al. [6] | – | k-NN | – | – | – | 0.93 | 0.93 |
    | Adnane and Belouchrani [10] | 1540 | Threshold coefficients | 0.98 | 0.92 | – | 0.97 | 0.98 |
    | Li et al. [5] | 3213 | Template matching | 0.98 | 0.81 | 0.99 | 0.93 | 0.99 |
    | Zarei et al. [7] | 3220 | Principal directions | 0.99 | 0.86 | – | 0.96 | – |
    | Proposed - Table 5 | 3219 | AIS | 0.98 | 0.85 | 0.99 | 0.91 | 0.98 |
    | Proposed - Table 7 | 36,428 | SVM | 0.99 | 0.99 | 0.98 | 0.98 | 0.99 |

    Regarding the recognition of Normal heartbeats, it is noted from Table 9 that the proposed approach performed at least as well as the baseline methods, being superior when compared to Hadia et al. [6], Bazi et al. [8] and Adnane and Belouchrani [10] in relation to the specificity measure. When considering the PVC heartbeat recognition results, the last two rows in Table 9, the proposed approach overcomes all methods in relation to the $P^+$ and $S_e$ measures. Broadly speaking, the proposed method obtained fewer false positive and false negative alarms than most of the baseline methods. Note also that the artificial dataset used in the simulations by the proposed method, in relation to PVC heartbeats, is larger than those used by the baseline methods. The real dataset is also larger, except in relation to Bazi et al. [8] and Zarei et al. [7]. Moreover, the cross-validation method implemented for the artificial dataset avoids model overfitting, which generally overestimates performance.

    5. Conclusion

    In this paper, a new set of features for PVC recognition, based on geometrical figures extracted from QRS complexes, was presented. It opens a new way of designing features from the geometrical aspects of ECG waveforms, aiming to represent them in a simple and low-cost manner. The proposed mathematical model maps the signal, obtained after a preprocessing stage, into a space of reduced dimension. The new features were built from virtual triangles and incircles constructed over each QRS complex, considering its fiducial points. Results obtained from extensive simulations indicated that the proposed approach performed best in terms of specificity (98.7%) and sensitivity (91.1%), evaluated on the DS2 dataset using an AIS classifier. When the dataset was balanced by inserting artificial PVC heartbeats, the results improved to 99.5% and 98.5% in terms of positive prediction and sensitivity, respectively. Applying this approach to another balanced dataset (DS3), the results were 97.5% for specificity and 98.0% for sensitivity. In general, in terms of accuracy, the proposed approach is as good as the baseline methods, and better than two of them. Furthermore, the proposed features are not sensitive to QRS detection errors of up to five samples when an SVM classifier is used.
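As an illustration of the kind of geometry involved, the sketch below computes the perimeter, the area and the incircle radius of a triangle whose vertices are the Q, R and S fiducial points, each given as a (sample index, amplitude) pair. It is only a sketch under stated assumptions: the exact feature set and signal scaling are those defined in Section 2.6.3 and are not reproduced here, and the function name and the example coordinates are hypothetical.

```python
import math


def triangle_features(q, r, s):
    """Perimeter, area and incircle radius of the triangle with vertices Q, R, S."""
    # Side lengths between the three fiducial points.
    side_rs = math.dist(r, s)
    side_qs = math.dist(q, s)
    side_qr = math.dist(q, r)
    perimeter = side_rs + side_qs + side_qr
    # Shoelace formula for the area of a triangle given its vertices.
    area = abs((r[0] - q[0]) * (s[1] - q[1]) - (s[0] - q[0]) * (r[1] - q[1])) / 2.0
    # Incircle radius: area divided by the semi-perimeter (zero if degenerate).
    inradius = 2.0 * area / perimeter if perimeter > 0 else 0.0
    return {"perimeter": perimeter, "area": area, "inradius": inradius}


# Hypothetical fiducial points as (sample index, scaled amplitude).
features = triangle_features(q=(0.0, -0.1), r=(10.0, 1.0), s=(20.0, -0.3))
print(features)
```

Each heartbeat is thus summarized by a handful of scalars instead of the full QRS waveform, which is what keeps the representation low-dimensional.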

    In addition to a new set of features, this paper has reported a new methodology based on AHP for selecting the best classifier according to consistent criteria. This methodology can be extended to any research in the field of machine learning, and is mainly useful in the construction of ensembles of models. This is a topic for further investigation.

    In future work, we intend to derive other geometrical figures constructed over T waves, making it possible to recognize other arrhythmia types, in addition to improving performance. In addition, the proposed features will be validated on more databases, and a single classifier will be tuned in order to obtain the parameters that achieve the best results.

    Conflict of interest statement

    There are no conflicts of interest.

    Acknowledgments and declarations

    The authors would like to thank the Coordination for the Improvement of Higher Education Personnel (CAPES) and the Department of Electrical Engineering of the State University of São Paulo (UNESP), Ilha Solteira, Brazil. There are no conflicts of interest. The experiments mentioned were not performed directly on people, since the ECG signals used for the tests were either artificial or acquired from two MIT-BIH databases.

    Supplementary materials

    Supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.cmpb.2018.12.028.

    References

    [1] T.W. Smith, Tarascon ECG Pocketbook, Jones and Bartlett Learning, Burlington, 2013.

    [2] R. Latchamsetty, F. Bogun, Premature ventricular complexes and premature ventricular complex induced cardiomyopathy, Curr. Probl. Cardiol. 40 (2015) 379–422. https://doi.org/10.1016/j.cpcardiol.2015.03.002

    [3] T.B. Garcia, G.T. Miller, Arrhythmia Recognition: The Art of Interpretation, Jones and Bartlett Publishers, Burlington, 2004.

    [4] K. Fred, ECG Interpretation: From Pathophysiology to Clinical Application, Springer, New York, 2009.

    [5] P. Li, C. Liu, X. Wang, D. Zheng, Y. Li, C. Liu, A low-complexity data-adaptive approach for premature ventricular contraction recognition, Signal, Image Video Process. 8 (2014) 111–120. https://doi.org/10.1007/395s11760-013-0478-6

    [6] R. Hadia, D. Guldenring, D.D. Finlay, A. Kennedy, G. Janjua, R. Bond, J. McLaughlin, Morphology-based detection of premature ventricular contractions, Comput. Cardiol. 44 (2017) 1–4. https://doi.org/10.22489/CinC.2017.211-260

    [7] R. Zarei, J. He, G. Huang, Y. Zhang, Effective and efficient detection of premature ventricular contractions based on variation of principal directions, Digital Signal Process. 50 (2016) 93–102. https://doi.org/10.1016/j.dsp.2015.12.002

    [8] Y. Bazi, H. Hichri, N. Alajlan, N. Ammour, Premature Ventricular Contraction Arrhythmia Detection and Classification with Gaussian Process and S Transform, in: IEEE Fifth Int. Conference on Comput. Intell., Commun. Syst. and Networks, 2013, pp. 36–41. https://doi.org/10.1109/CICSYN.2013.44

    [9] X. Liu, H. Du, G. Wang, S. Zhou, H. Zhang, Automatic diagnosis of premature ventricular contraction based on Lyapunov exponents and LVQ neural network, Comp. Methods Programs Bio. 22 (2015) 47–55. https://doi.org/10.1016/j.cmpb.2015.06.010

    [10] M. Adnane, A. Belouchrani, Premature Ventricular Contraction Arrhythmia Detection using Wavelets Coefficientes, in: IEEE 8th Int. Workshop on Syst., Signal Process. and their Appl., 2013, pp. 170–173. https://doi.org/10.1109/WoSSPA.2013.6602356

    [11] E.J. da S. Luz, W.R. Schwartz, G. Cámara-Chávez, D. Menotti, ECG-based heartbeat classification for arrhythmia detection: A survey, Comp. Methods Programs Bio. 127 (2016) 144–164. https://dx.doi.org/10.1016/j.cmpb.2015.12.008

    [12] A.L. Goldberger, L.A.N. Amaral, L. Glass, J.M. Hausdorff, P.C. Ivanov, R.G. Mark, J.E. Mietus, G.B. Moody, C.-K. Peng, H.E. Stanley, PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals, Circulation 101 (23) (2000) e215–e220.

    [13] G.B. Moody, R.G. Mark, The impact of the MIT-BIH arrhythmia database, IEEE Eng. Med. Biol. 3 (20) (2001) 45–50.

    [14] AAMI, Association for the Advancement of Medical Instrumentation (1987).

    [15] M. Kubat, An Introduction to Machine Learning, Springer, New York, 2015.

    [16] H. Deng, Y. Sun, Y. Chang, J. Han, Data Classification: Algorithms and Applications, Chapman & Hall/CRC, New York, 2015.

    [17] A. Mccallum, K. Nigam, A comparison of event models for naive bayes text classification, AAAI-98 Workshop on Learning for Text Categorization, 1998.

    [18] Y. Freund, R.E. Schapire, Large margin classification using the perceptron algorithm, Mach. Learn. (1999) 277–296.

    [19] S. Haykin, Neural Networks and Learning Machines, 3rd Edition, Pearson Prentice Hall, New York, 2009.

    [20] C. B., A tutorial on support vector machines for pattern recognition, Data Min. Knowl. Discovery 2 (2) (1998) 121–167.

    [21] V.N. V., The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.

    [22] T.K. Ho, The random subspace method for constructing decision forests, IEEE Trans. Pattern Anal. Mach. Intell. 20 (8) (1998) 832–844. https://doi:10.1109/34.709601

    [23] L. Breiman, Random forests, Mach. Learn. 45 (2001) 5–32.

    [24] D. Dasgupta, F. Nino, Immunological computation: Theory and Applications, CRC Press, New York, 2008.

    [25] A. Watkins, J. Timmis, L. Boggess, Artificial immune recognition system (airs): An immune-inspired supervised learning algorithm, Genet. Program. Evolvable Mach. 5 (2004) 291–317. https://doi.org/10.1023/B:GENP.0000

    [26] O.S. Vaidya, S. Kumar, Analytic hierarchy process: An overview of applications, Eur. J. Oper. Res. 169 (2006) 129. https://doi.org/10.1016/j.ejor.2004.04.028

    [27] B.R. de Oliveira, L.R. Oliveira, M.A.Q. Duarte, Multicriteria analysis applied at the choice of projects specified by resolution 154/2012 of the national council of justice (in portuguese), Revista Democracia Digital e Governo Eletrônico 14 (2016) 121–142.

    [28] Y.B.J. Jaya, J.J. Tamilselvi, Simplified MCDM analytical weighted model for ranking classifiers in financial risk datasets, in: 2014 International Conference on Intelligent Computing Applications, 2014, pp. 158–161. https://doi.org/10.1109/ICICA.2014.42

    [29] S. Khanmohammadi, M. Rezaeiahari, AHP based classification algorithm selection for clinical decision support system development, Procedia Comput. Sci. 36 (2014) 328–334. https://doi.org/10.1016/j.procs.2014.09.101

    [30] G. Kou, W. Wu, An analytic hierarchy model for classification algorithms selection in credit risk analysis, Math. Prob. Eng. 2014 (2014) 7. https://.org/10.1155/2014/297563

    [31] R. Saaty, The analytic hierarchy process - what it is and how it is used, Math. Modell. 9 (3) (1987) 161–176. https://doi.org/10.1016/0270-0255(87)90473-8

    [32] M. Brunelli, Introduction to the Analytic Hierarchy Process, no. VIII in Springer Briefs in Operations Research, Springer, New York, 2015.

    [33] D.L. Donoho, De-noising by soft-thresholding, IEEE Trans. Inf. Theory 41 (1995) 613–627. https://doi:10.1109/18.382009

    [34] B.R. de Oliveira, C.C.E. de Abreu, M.A.Q. Duarte, J.V. Filho, A wavelet-based method for power-line interference removal in ECG signals, Res. Biomed. Eng. 34 (2018) 73–86. https://doi.org/10.1590/2446-4740.01817

    [35] C. Bunluechokchai, T. Leeudomwong, Discrete wavelet transform-based baseline wandering removal for high resolution electrocardiogram, Int. J. Appl. Bio. Eng. 3 (1) (2010) 26–31.

    [36] F.E. Gossler, B.R. de Oliveira, M.A.Q. Duarte, R.L. Lamblém, F.V. Alvarado, A wavelet generated from Fibonacci-coefficient polynomials and its application in cardiac arrhythmia classification, in: Proc. of XIX ENMC-National Meeting on Comp. Model. and VII ECTM - Meeting on Materials Science and Tech., 2016.

    [37] C. Ye, B.V.K.V. Kumar, M.T. Coimbra, Heartbeat classification using morphological and dynamic features 460 of ecg signals, IEEE Trans. Biomed. Eng. 59 (10) (2012) 2930–2941.

    [38] R. Kohavi, in: A study Of Cross-Validation and Bootstrap For Accuracy Estimation and Model Selection, Morgan Kaufmann, 1995, pp. 1137–1143.

