
WELCOME Wearable Sensing and Smart Cloud Computing for Integrated Care to COPD Patients with Comorbidities

FP7-611223

Project co-funded by the European Commission under the Information and Communication Technologies (ICT) 7th Framework Programme

First prototype of WELCOME Cloud Software

Progress on WP 5: Development of DSS, Algorithms and Cloud Services

Document Identifier: D5.1

Due Date: 31/07/2016

Delivery Date: 06/11/2015

Classification: Public

Editors: Nicos MAGLAVERAS, Andreas RAPTOPOULOS

Document Version: 3.0

Contract Start Date: 1st November 2013

Contract Duration: 48 months

Project Partners: EXODUS (GR), CSEM (CH), KINGSTON (UK), AUTH (GR), INVENTYA (UK), CAU (DE), ROYAL COLLEGE OF SUR (IR), SMARTEX (IT), CIRO+ B.V. (NL), Kristronics GmbH (DE), UNIVERSADE DE COIM (PT), Croydon Health Servi (UK)



Contributors

Name Organization

Ioanna Chouvarda AUTH

Vasilis Kilintzis AUTH

Pantelis Natsiavas AUTH

Nikos Beredimas AUTH

Kostas Haris AUTH

Bruno Rocha UCOMB

Carlos Cortinhas UCOMB

César Teixeira UCOMB

Diogo Nunes UCOMB

Jorge Henriques UCOMB

Luís Mendes UCOMB

Paulo de Carvalho UCOMB

Ricardo Couceiro UCOMB

Rui Pedro Paiva UCOMB

Peer Reviewers

Name Organization

Andreas Raptopoulos EXUS

Isabelle Killane RCSI

Revision History

Version Date Modifications

0.1 01/10/2015 Initial draft

0.2 28/10/2015 Integration of partners inputs

0.3 30/10/2015 Minor changes

0.4 2/11/2015 Minor changes

0.5 2/11/2015 Formatting and minor modifications before internal review

0.6 3/11/2015 Integrating internal review comments

0.7 3/11/2015 Adding test cases in appendix

0.8 4/11/2015 Moving details of signal processing algorithms in appendix

0.9 6/11/2015 Minor typo corrections, adding more test cases

1.0 6/11/2015 Submitted

2.0 30/3/2016 Submitted after 2nd year’s review comments

3.0 15/7/2016 Submitted after 2nd year’s interim review comments


Executive Summary

This deliverable describes the progress and accomplishments of WP5 to date. WP5 concerns the development of the WELCOME cloud system, the “heart of the WELCOME system”, where the management and processing of data take place. Specifically, WP5 develops:

- the signal processing algorithms that use the biosignals made available by the multi-sensorial vest setup for robust detection and characterization of relevant arrhythmias and reliable detection of cough, wheezes and crackles, in the context of activity. In addition, existing medication adherence processing algorithms are incorporated to generate appropriate adherence information for COPD patients, based on the audio recordings of all inhaler events provided by the inhaler adherence devices.

- the EIT signal processing, employing the vest EIT sensor system, to estimate regional lung function during tidal breathing and ventilation manoeuvres and to make available novel quantitative measures for clinical evaluation, especially as regards COPD and CHF comorbidity.

- the COPD-comorbidities treatment methodologies and the DSS system for extracting the pertinent information to be presented to the healthcare stakeholders, allowing them to acquire a better view of the disease progress and the patient’s condition and supporting medical decisions for timely and efficient medical intervention.

- the cloud computing environment where the algorithms and the DSS are deployed as cloud services, including also cloud data management services.

WP5 development is based on the WELCOME system design as defined in WP3 (D3.1 and D3.2). This document describes the initial development, including the data management, data analysis and knowledge elicitation/management sections. The open issues and the plans for the next development phase are also discussed. This deliverable, D5.1, serves as an input for the integration work package (WP7). As development continues to evolve, the complete development of the WELCOME cloud system will be documented in D5.2.

This document has been revised in order to address the 2nd year’s review comments. Subsection 1.2 lists the comments one by one together with our responses.

In addition, this document has been revised in order to address the 2nd year’s interim review comments. Subsection 1.3 lists these comments one by one together with our responses. The updated parts of the deliverable are color coded: (a) a title highlighted in yellow indicates that the whole section is new or radically updated, and (b) body text highlighted in yellow indicates newly inserted text in an existing section. Note that these updated parts mainly contain updated reporting in response to the review questions, while the further development that took place during this period will be reported in D5.2.


Table of Contents

Executive Summary
1 Introduction
1.1 Development process
1.2 Review's Comments and Answers
1.3 Interim review’s Comments and Answers
2 Data management on the cloud
2.1 WELCOME Cloud Communication API
2.1.1 Overview
2.1.2 Storage Engine
2.1.3 SPARQL Graph Store Protocol Client
2.1.4 Validation
2.1.5 REST API and CRUD Operations
2.1.6 Web Application
2.1.7 Implementation Status
2.2 Orchestrator Agent
2.2.1 External Communication Interface (ECI)
2.2.2 Workflow Manager (WM)
2.2.3 External sources connector (ESC)
2.2.4 Calculations handler (CH)
2.2.5 Periodic behavior handler (PBH)
2.2.6 Feature Extractions Handler (FEH)
2.2.7 Storage Engine Module (SEM)
2.2.8 Logging Module (LM)
3 Security aspects regarding the integration of various modules
3.1 Types of integration of WELCOME modules
3.2 Measures taken for each interconnection point
3.3 Error handling and logging
4 Data analysis
4.1 Feature Extraction Server
4.1.1 Server Execution Environment
4.1.2 Server Communication API
4.2 Implementation of Algorithms
4.2.1 Signal Processing Algorithms
4.2.2 EIT Signal Processing
4.3 Performance of the feature extraction algorithms
4.3.1 Testing environment
4.3.2 Results and discussion
5 Medical Knowledge and decision support
5.1 Decision Support System
5.1.1 DSS Functionality Overview
5.1.2 Main components
5.1.3 DSS information flow
5.1.4 Usage scenarios
6 Risk Management
7 Conclusions
7.1 Future steps
7.2 Open issues
8 Bibliography
9 Abbreviations
Appendix A. REST API
Appendix B. Signal Processing Algorithms: Details
B.1 ECG
B.1.1 ECG signal quality assessment
B.1.2 ECG segmentation and intervals computation
B.1.3 Ventricular Arrhythmias
B.1.4 AV Blocks
B.1.5 ST deviation
B.1.6 Atrial Fibrillation using 12-lead ECG
B.1.7 References
B.2 Lung Sounds
B.2.1 Data Collection
B.2.2 Sound signal quality assessment
B.2.3 Cough Detection
B.2.4 Detection of crackles and wheezes
B.2.5 References
B.3 EIT Signal Processing
B.3.1 EIT Processing Modules and EIDORS
B.4 Medication adherence algorithms
Appendix C. Test cases

List of Figures

Figure 1: WELCOME cloud system overview
Figure 2: WELCOME Communication API components stack
Figure 3: Orchestrator agent’s modules’ stack
Figure 4: The main information handling workflow
Figure 5: Feature Extraction Server Architecture.
Figure 6: Potential positions for the acquisition of sounds (red). For each volunteer we selected the data acquired from the two positions where the adventitious sounds/normal sounds were better heard.
Figure 7: Outline of the cough detection algorithm.
Figure 8: Workflow of the crackles/wheezes events detector algorithm.
Figure 9: WELCOME DSS: Knowledge base enhancement and knowledge flow
Figure 10: Core DSS software stack
Figure 11: WELCOME Clinical DSS Process
Figure 12: Code snippet regarding the handling of the feature extraction processes. Web hooks logic is used as described in D3.2
Figure 13: OA calling the various DSS processes and handling possible exceptions
Figure 14: SPIN function used to identify sputum increase based on questionnaire
Figure 15: SPIN function used to identify purulent sputum
Figure 16: Features extracted directly connected to ECG characteristics.
Figure 17: Comparison of amplitude differences between normal beats and PVCs morphologic derivatives.
Figure 18: Examples of incorrectly classified ECG signals.
Figure 19: Example of an AV block – first-degree
Figure 20: Example of a 3:2 AV block – second degree type I.
Figure 21: Example of a 3:1 AV block – second degree type II.
Figure 22: Example of an electrocardiogram and the corresponding high frequency components (Wigner-Ville transform).
Figure 23: Spectra of AF and non-AF episodes and corresponding extracted features.
Figure 24: Potential positions for the acquisition of sounds (red). For each volunteer we selected the data acquired from the two positions where the adventitious sounds/normal sounds were better heard.
Figure 25: Outline of the cough detection algorithm.
Figure 26: Audio signal (a) before and (b) after pre-processing.
Figure 27: Magnitude spectrum of 2 cough segments followed by 1 artifact and 2 speech segments.
Figure 28: Workflow of the crackles/wheezes events detector algorithm.
Figure 29: (Left) Injection of electrical currents and surface voltage measurement. (Right) Functional EIT image showing the distribution of tidal ventilation in a seated, spontaneously breathing man
Figure 30: EIT Feature Extraction flowchart
Figure 31: EIT raw voltage measurements with electrode detachment problem
Figure 32: Raw Global Impedance Curve (Left) and a zoomed part of it (Right)
Figure 33: RGIC after band-pass filtering
Figure 34: Breath Segmentation of the filtered Raw Global Impedance Curve before (Top) and after (Bottom) weak breath detection and elimination.
Figure 35: An example of detected tidal breathing period.
Figure 36: Forward model used for training GREIT using contrasting lung regions (source: EIDORS web site)
Figure 37: Detection of three tidal breathing periods (red lines) in an EIT sequence of 20000 frames (10 minutes)
Figure 38: A functional EIT image (32 x 32)

List of Tables

Table 1: Development status of ECI functionality
Table 2: Development status of WM functionality
Table 3: Development status of ESC functionality
Table 4: Development status of FEH functionality
Table 5: Development status of SEM functionality
Table 6: Development status of LM functionality
Table 7: Test results for the MLII-lead, and the right one to the results of V2 signals.
Table 8: Test results for the V2-lead.
Table 9: Results achieved by the proposed algorithm for each dataset of the QT database.
Table 10: Results for PVC detection.
Table 11: VT/VF Classification performance.
Table 12: ST deviation correlation analysis.
Table 13: Results achieved by the proposed multi-lead and single-lead AF detection algorithms.
Table 14: Default thresholds.
Table 15: Classification results for cough detection with LOOCV.
Table 16: Classification results for cough detection with LOOCV after downsampling to 2KHz.
Table 17: Sensitivity and PPV (mean and standard deviation (SD)) measured after classifying the data with adventitious sounds using the crackles/wheezes detector.
Table 18: Downsampled signal. Sensitivity and PPV (mean and standard deviation (SD)) measured after classifying the data with adventitious sounds using the crackles/wheezes detector.
Table 19: Performance testing environment.
Table 20: Performance results.
Table 21: Risk management table
Table 22: Test results for the MLII-lead, and the right one to the results of V2 signals.
Table 23: Test results for the V2-lead.
Table 24: Results achieved by the proposed algorithm for each dataset of the QT database.
Table 25: Results achieved by the ECG segmentation algorithms proposed in literature (results extracted from [15] for comparison purposes).
Table 26: Results for PVC detection.
Table 27: VT/VF Classification performance.
Table 28: ST segment deviation measurement: Pang et al. [32].
Table 29: ST deviation correlation analysis.
Table 30: Results achieved by the proposed multi-lead and single-lead AF detection algorithms.
Table 31: Default thresholds.
Table 32: Dataset description.
Table 33: Classification results for cough detection.
Table 34: Classification results for cough detection after downsampling to 2KHz.
Table 35: Classification results for cough detection with LOOCV.
Table 36: Classification results for cough detection with LOOCV after downsampling to 2KHz.
Table 37: Detailed results for cough detection (after second step) with LOOCV after downsampling to 2KHz.
Table 38: Features tested to detect crackles and wheezes events.
Table 39: Rank of the first thirty features selected by the sequential feature selection in the forward direction when the objective of the classification was the detection of crackles (C) and when the objective was the detection of wheezes (W).
Table 40: Musical features and the correspondent labels used in this study.
Table 41: Results of the detection of crackles and wheezes events for the different volunteers. E corresponds to the number of events, TP to true positives and FP to false positives.
Table 42: Downsampled signal. Results of the detection of crackle and wheeze events for the different volunteers. E corresponds to the number of events, TP to the true positives and FP to the false positives.
Table 43: Sensitivity and PPV (mean and standard deviation (SD)) measured after classifying the data with adventitious sounds using the crackles/wheezes detector. The symbol *** means that crackles/wheezes were not present in that particular acquisition.
Table 44: Downsampled signal. Sensitivity and PPV (mean and standard deviation (SD)) measured after classifying the data with adventitious sounds using the crackles/wheezes detector. The symbol *** means that crackles/wheezes were not present in that particular acquisition.
Table 45: Test case template


1 Introduction

Integrated care of patients with COPD and comorbidities requires the ability to regard patient status as a complex system. It can benefit from technologies that extract multi-parametric information and detect changes in status along different axes. This creates a need for systems that can unobtrusively monitor, compute, and combine multi-organ information.

In the EU-funded project WELCOME (Wearable Sensing and Smart Cloud Computing for Integrated Care to COPD Patients with Comorbidities), such an approach is followed. Multiple types of data are recorded unobtrusively, and medically relevant features are extracted to support decision making by healthcare professionals.

The challenge is to monitor the patient’s condition as ubiquitously as possible and to present health professionals with valuable but not overwhelming information on the patient’s overall status and trends. This includes information on cardiac and lung function, the patient’s mobility and depression status, as well as correct use of medication. These pieces of information are then combined in order to detect deteriorations and, additionally, to trace their causes, which is extremely important for the timely and efficient treatment and management of multi-morbid patients. For example, a combination of deteriorating and stable signs and signals should suggest the probable causative factors and the required course of action.

In WELCOME, a number of enabling ICT technologies have been put in place to address these needs for acquisition, transmission, storage, analysis, decision support and user interaction. A core component is the sensor system, i.e. the sensors deployed on wearable technologies (vest) for recording and streaming bio-data to the mobile patient hub (e.g. a tablet), along with a pick-and-mix selection of standard Bluetooth-enabled sensors for periodic measurements (e.g. weight), which are also connected to the patient hub. The WELCOME cloud platform (Figure 1) is the heart of the system, where all the medical records and the monitoring data are managed and processed. It consists of several modules (storage engine, feature extraction module, decision support system and external applications connector) and the orchestrator agent. WP5 develops the WELCOME cloud platform based on the design defined in WP3 and described in D3.1 and D3.2. This deliverable describes the initial development of the cloud-based management and processing of the data streams, aimed at detecting changes in patient status and supporting decisions within an integrated care context. The user applications (developed in WP6 and currently described in D6.1) interact with the cloud system for data and information exchange via standard services.

Figure 1 illustrates the anatomy of the WELCOME cloud system. Its main components are:

- Storage engine (SE)
- Orchestrator Agent (OA)
- Feature extraction server (FES)
- Decision support system (DSS)
- External sources connector (ESC)

The cloud system communicates via standard interfaces with the patient hub and patient applications, as well as with the healthcare professional applications.


Figure 1: WELCOME cloud system overview

Section 2, “Data management on the cloud”, describes the implementation of the data management components of the WELCOME cloud platform. Section 3 addresses the security aspects of integrating the various modules. Section 4, “Data analysis”, describes the implementation of the data analysis algorithms that extract features from the provided data sets. Section 5, “Medical Knowledge and decision support”, gives details on the implementation of the decision support system and the methodology of medical knowledge utilization. Section 6, “Risk Management”, presents the details of the risk management procedures. Section 7.1 describes the planning of the next development steps. Finally, the Bibliography and Abbreviations sections contain the bibliographic references and the abbreviations, respectively. More detailed information on the REST API, the signal processing algorithms, and the test cases is included in the three Appendices.

1.1 Development process

Enforcing one strict development process across teams that are not co-located and that belong to different organizations, with different time schedules and priorities, is not easy, if it is feasible at all. The consortium has therefore decided to follow the principles of agile development in order to benefit from its flexibility and from the quality of the produced results. Accordingly, during the development of the WELCOME Cloud platform’s components we use unit testing and enforce delivery cycles that end with fully working demos. Such demos have been prepared for practically every consortium meeting during the last year, and these meetings have served as milestones for our development process.

Regarding testing, Appendix C presents a list of test cases that are already developed or under development. The tests described in Appendix C should be considered low-level unit tests, even if they involve more than one component. The integration testing procedure between the various cloud components and the respective applications is out of scope for the current document.


1.2 Review's Comments and Answers

In this section we present the reviewers’ comments, as they were presented in the Consolidated Report

of the second annual review of WELCOME, along with our responses to them. This section covers all

the issues related to the “First prototype of the WELCOME Cloud software” corresponding to

recommendations 4, 5, 11 and 12 of the Consolidated Report.

Comment

The system design is necessary to be improved especially presenting a more explicit analysis of (a)

clinical workflows, (b) privacy and security context per application and user role, (c) data models and

use of HL7 FHIR, (d) DSS performance in terms of empirical knowledge and stored background

knowledge, (e) The integration and testing of the final prototype.

Response

Regarding the WP5-related aspects of this comment, we added a detailed diagram of the complete, implemented data model classes and relations in deliverable D5.2. Integration and testing of the final prototype is part of the Integration WP7 and has been submitted as a separate document. Regarding the security aspects of the integration of the various cloud modules, a new section has been added (section 3).

Comment

Integration needs thorough planning including test cases which need to include fault injection and

checking the behaviour and recovery of the WELCOME system and its components

Response

The test cases presented in D5.1 have been updated to reflect the current development status, with emphasis on fault injection scenarios. Specifically, the test cases added or updated are: OA-FEH-1, OA-FEH-2, OA-FEH-3, OA-FEH-4, OA-SEM-1, OA-SEM-2, OA-SEM-3, OA-WM-6, OA-ECI-1, OA-ECI-2, OA-ECI-3, OA-LM-1, OA- 1.

Comment

The statistical accuracy of machine learning algorithms is rather high, and this likely will be a critical

advantage of the System; nevertheless, it is expected that this accuracy will be worsened when real-

time bio-signals from heterogeneous patients will be taken into account. This scenario is necessary to

be explored in future work of WP5

Response

While actions have already been taken to check signal quality, and to collect patient data in order to

train algorithms with realistic examples, this effort to improve performance will continue upon data

availability. We will perform data acquisitions with the vest to fine-tune the proposed algorithms in the

real scenario, if it proves necessary. Updating the currently working version of the algorithms will be

minimally invasive to the processing server, since, in practical terms, it will consist of the substitution

of the initial Dynamic Link Library (DLL) by a new one, following the previously specified interface.

Comment


In addition, the deliverable is not exploring how empirical knowledge derived by machine learning

module will comply with already existing knowledge patterns acquired by the physicians; in this sense

the DSS knowledge-model should be more sufficiently presented. Within this context, a clear map of

the real support provided by the DSS to the HCP should be provided presenting sufficiently the HCP

decision making and the using of knowledge derived by machine learning in parallel with validated

knowledge of DSS.

Response

The DSS output is a structure containing a text message and additional annotation information (e.g. severity, timestamp etc.) directed to the HCP through the HCP applications. The link between machine-learning-based features (feature extraction), personalised thresholds and rules based on domain knowledge has been clarified in D3.2, also via figures. In the updated D3.2, the sources of evidence for the rules (e.g. standards) have been clarified.

In D5.1 it is explained a) how feature extraction module outputs (machine learning based) are

validated, b) how these outputs of machine learning are further employed in DSS.

Comment

Overall, this deliverable maintains also the concept of “moving target approach” by referring to

“decisions on designs and expected functionality in the second and next year of the project”. In multiple

instances D5.1 presents revised designs (see page 19), uses future tense ("will") or unclear commitment

(see "possibly also…" page 34) for developments or lacks conclusions on results of testing (see “ST

analysis” page 26)

Response

D5.1 does not reflect a final state of development. However, we revised the document to reflect the

current state of development based on the completed design, also disambiguating any vague

expressions.

TABLE of Changes after 2nd year review

| Section | Page | Section Description |
|---|---|---|
| | 5 | Executive Summary |
| | 11 | Introduction |
| 2.1.2 | 17 | Storage Engine |
| 3 | 25 | Security aspects regarding the integration of various modules |
| 5.1 | 43 | Decision Support System |
| Appendix C | 121 | Test cases |


1.3 Interim review’s Comments and Answers

Comment

The revised version has no reference to the adopted care workflows.

Response

The adopted care workflows have been defined in D2.2 (appendix). How these care workflows are

supported in WELCOME system is part of the design and will be extensively reported in D3.3. Based on

the analysis, the cloud system has a generic design in terms of data management, feature extraction,

information flow that supports the different care workflows. The DSS takes into account comorbidity,

and implements rules for the different diseases and treatments and their interaction. The manner in which this

information becomes available to HCPs is part of the applications' logic.

Comment

The issue of the performance of the machine learning algorithms in real-time operation and especially

when real-time bio-signals from heterogeneous patients will be taken into account is not clearly

considered as an evident risk and thus no real convincing action has been foreseen.

Response

We have added the section 4.3 to include details about performance testing that demonstrate the

adequate performance of our Feature Extraction Algorithms.

Comment

The operation of the DSS is following the presentation provided in D3.2; it is understood that the revised

deliverable has no detailed contribution regarding this aspect. … The effective support provided by the

DSS to the HCP has still not been convincingly presented.

Response

This question is addressed in the updated section 5.1. More specifically, we provide a more extended

description of the DSS logic, a new explanatory figure, and two usage scenarios that demonstrate the

value provided for the end users.

TABLE of Changes after interim review

| Section | Page | Section Description |
|---|---|---|
| | 5 | Executive Summary |
| 1.3 | 15 | Interim review’s Comments and Answers |
| 4.2.3 | 41 | Performance of the feature extraction algorithms |
| 5.1 | 44 | Decision Support System |
| 6 | 52 | Risk Management |


2 Data management on the cloud

The WELCOME Cloud platform handles the data collected from the patients and their respective devices.

The development progress on the various WELCOME Cloud platform’s software components is

presented in this section.

2.1 WELCOME Cloud Communication API

2.1.1 Overview

The major components of the Communication stack are shown in Figure 2.

Figure 2: WELCOME Communication API components stack

Each of the components is developed as a separate module in the Java programming language,

integrated in a web application that is deployed on an Apache Tomcat web server.

A number of common external software modules are used throughout the modules' implementation.

Effort has been made to use software libraries that are widely adopted and open source if possible.

HTTP/REST functionality is provided by the Jersey framework [1], the reference implementation of the JAX-RS API. SPIN rules support is provided by the open source TopBraid SPIN API [2]. The Apache Jena framework [3] is used for RDF content manipulation. The Apache Maven build system [4] is used to manage these external dependencies.

Unit tests are written for each module, using the jUnit testing framework [5].

Logging is provided by the Apache Log4j framework [6].

Standard JavaDoc documentation is provided for each software module. This is used both for proper interconnection with other software modules (e.g. for the developers of the Orchestrator Agent) and for easy reuse of any of the modules in other parts of the WELCOME system (e.g. for the developers of the DSS).

[1] https://jersey.java.net/
[2] http://topbraid.org/spin/api/
[3] https://jena.apache.org/
[4] https://maven.apache.org/
[5] http://junit.org/
[6] http://topbraid.org/spin/api/

A manual for the REST API, along with training videos and example requests, has been developed and is constantly updated as the system develops and matures.

2.1.2 Storage Engine

The RDF Triple Store is a standard deployment of the OpenLink Virtuoso Universal Server 7.x [7]. Virtuoso

is offered with both commercial and open-source licenses. We have not, to this point, encountered an

issue that required commercial support. Virtuoso can run on Linux, Windows, and Mac OS X based

systems.

Regarding the File Store, currently files are stored on the file system of the application server. This will

change during the process of migration to the cloud infrastructure, in order to take advantage of the

respective storage cloud services and the provided flexibility.

2.1.3 SPARQL Graph Store Protocol Client

The SPARQL client is a module implementing the SPARQL Graph Store Protocol. It can also perform preformatted SPARQL queries against SPARQL 1.1 endpoints. The client is tested against the Virtuoso server, but also against the Apache Jena Fuseki Server [8], in order to ensure its independence from SPARQL server implementation variations. Internally, RDF content is represented as Apache Jena Models, and HTTP client operations are performed using the Jersey framework [9].
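As an illustration of how the SPARQL 1.1 Graph Store HTTP Protocol addresses a graph, the following minimal sketch builds the request URLs for a named graph and for the default graph. The class name, the endpoint URL and the graph URI are hypothetical and are not taken from the WELCOME client.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the WELCOME code base.
class GraphStoreUrls {

    // Addresses a named graph through the "graph" query parameter.
    static String namedGraphUrl(String endpoint, String graphUri) {
        try {
            return endpoint + "?graph=" + URLEncoder.encode(graphUri, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    // Addresses the default graph of the store.
    static String defaultGraphUrl(String endpoint) {
        return endpoint + "?default";
    }

    public static void main(String[] args) {
        String endpoint = "http://localhost:8890/sparql-graph-crud"; // assumed endpoint
        System.out.println(namedGraphUrl(endpoint, "http://example.org/graphs/patient-1"));
        System.out.println(defaultGraphUrl(endpoint));
    }
}
```

A GET, PUT, POST or DELETE request sent to such a URL then retrieves, replaces, extends or removes the addressed graph.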

2.1.4 Validation

The Validation module is a simple library that verifies that RDF resources conform to SPIN constraints. It is implemented by leveraging the methods provided by the SPIN API. Unit tests corresponding to the SPIN rules included in the data model are provided.

2.1.5 REST API and CRUD Operations

The REST API defines the HTTP interface endpoints that are used by the WELCOME system components in order to store and retrieve data. A detailed description of these endpoints is presented in Appendix A.

The CRUD module is the core of the Communication software stack. It receives requests from the REST

API and transforms them to appropriate method calls of the Validation module, the SPARQL client, and

the File Store. The CRUD module's method calls correspond to the REST API endpoints exposed by the

application. This way additional modules, such as the Orchestrator Agent, can be bundled in the build

of the application, and perform the same type of requests without implementing the added complexity

of HTTP communication. Appropriate Java Events are generated for every request that gets processed.

These events can be used by components such as the Orchestrator Agent to be notified of requests of

interest.
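The event mechanism described above can be pictured with a plain listener pattern. The sketch below is only an assumption about its general shape: the names RequestEvent, RequestListener and CrudModule are hypothetical, and the actual WELCOME event classes may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative sketch only; none of these names come from the WELCOME code base.
class RequestEventsSketch {

    // Describes one processed request.
    static class RequestEvent {
        final String method;
        final String resourceUri;

        RequestEvent(String method, String resourceUri) {
            this.method = method;
            this.resourceUri = resourceUri;
        }
    }

    // Implemented by components that want to be told about requests.
    interface RequestListener {
        void onRequest(RequestEvent event);
    }

    // Stand-in for the CRUD module: processes a request, then notifies listeners.
    static class CrudModule {
        private final List<RequestListener> listeners = new CopyOnWriteArrayList<>();

        void addListener(RequestListener listener) {
            listeners.add(listener);
        }

        void create(String resourceUri) {
            // Validation and storage would happen here.
            RequestEvent event = new RequestEvent("POST", resourceUri);
            for (RequestListener listener : listeners) {
                listener.onRequest(event);
            }
        }
    }

    public static void main(String[] args) {
        CrudModule crud = new CrudModule();
        List<String> seen = new ArrayList<>();
        // A listener that only reacts to observations.
        crud.addListener(e -> {
            if (e.resourceUri.contains("/Observation/")) {
                seen.add(e.method + " " + e.resourceUri);
            }
        });
        crud.create("http://example.org/Observation/1");
        crud.create("http://example.org/Patient/7");
        System.out.println(seen);
    }
}
```

A component bundled in the same build registers a listener once and is then called in-process for every request, without any HTTP round trip.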

2.1.6 Web Application

A JAX-RS application integrates all of the components in a single .war Java application that is deployable on the Apache Tomcat web server. This application exposes the HTTP/REST services defined above. Appropriate Jersey Filters and Jersey Entity providers are implemented to enable the application to communicate seamlessly using the Turtle RDF serialization, provide HTTP based authentication, isolate and dynamically transform the resources' and requests' URIs to and from an internal and external representation (making the application independent of either the application or the Triple Store servers' names and IP addresses), and enable gzip [10] handling for compressed requests and responses.

[7] http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/
[8] https://jena.apache.org/documentation/fuseki2/
[9] https://jersey.java.net/
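The URI isolation described for the web application can be pictured as a prefix swap between a client-facing base URI and an internal one. The following minimal sketch assumes exactly that; the class name and both base URIs are hypothetical, and the actual Jersey filters may work differently.

```java
// Hypothetical translator between external (client-facing) and internal (triple store) URIs.
class UriTranslatorSketch {
    private final String externalBase;
    private final String internalBase;

    UriTranslatorSketch(String externalBase, String internalBase) {
        this.externalBase = externalBase;
        this.internalBase = internalBase;
    }

    // Rewrites a client-facing URI into the form stored internally.
    String toInternal(String uri) {
        return swapPrefix(uri, externalBase, internalBase);
    }

    // Rewrites an internally stored URI into the form shown to clients.
    String toExternal(String uri) {
        return swapPrefix(uri, internalBase, externalBase);
    }

    // Replaces the leading base URI; URIs outside that base are left untouched.
    private static String swapPrefix(String uri, String from, String to) {
        return uri.startsWith(from) ? to + uri.substring(from.length()) : uri;
    }

    public static void main(String[] args) {
        UriTranslatorSketch translator = new UriTranslatorSketch(
                "https://cloud.example.org/api/", "http://welcome.internal/data/");
        String internal = translator.toInternal("https://cloud.example.org/api/Patient/7");
        System.out.println(internal);
        System.out.println(translator.toExternal(internal));
    }
}
```

With this kind of mapping, moving the application or the triple store to another host only requires changing the configured base URIs, not the stored data.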

2.1.7 Implementation Status

A first version of the web application has been developed and communicated to the partners responsible for building the client applications. Of the features specified in the design, this version does not include conditional requests (using ETags) for concurrency control and caching, enhanced search capabilities and gzip compressed communication.

These features are currently being tested and will be included in the next version. As the conditional

requests feature is expected to lead to incompatibilities, both versions of the application will be

deployed until client developers have successfully migrated their software to the newer version. Code-

wise these features require changes to the REST API and CRUD modules.
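To make the concurrency control role of conditional requests concrete, the sketch below models an update that is applied only when the client's If-Match value equals the resource's current ETag, and is otherwise rejected with HTTP status 412 (Precondition Failed). The in-memory store, the version-based ETag scheme and all names are assumptions made for illustration; they do not describe the WELCOME implementation.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory illustration of ETag-based optimistic concurrency (hypothetical names).
class ConditionalUpdateSketch {
    static final int OK = 200;
    static final int NOT_FOUND = 404;
    static final int PRECONDITION_FAILED = 412;

    private final Map<String, String> bodies = new HashMap<>();
    private final Map<String, Integer> versions = new HashMap<>();

    // Stores a representation and bumps the version used to derive the ETag.
    void store(String uri, String body) {
        bodies.put(uri, body);
        versions.merge(uri, 1, Integer::sum);
    }

    // Returns the current ETag of a resource, or null if it does not exist.
    String etag(String uri) {
        Integer version = versions.get(uri);
        return version == null ? null : "\"v" + version + "\"";
    }

    // Applies the update only if the client's If-Match value is still current.
    int updateIfMatch(String uri, String ifMatch, String newBody) {
        String current = etag(uri);
        if (current == null) return NOT_FOUND;
        if (!current.equals(ifMatch)) return PRECONDITION_FAILED;
        store(uri, newBody);
        return OK;
    }

    public static void main(String[] args) {
        ConditionalUpdateSketch server = new ConditionalUpdateSketch();
        server.store("/Patient/7", "weight 80");
        String seenByBothClients = server.etag("/Patient/7");
        // The first client updates successfully; the second holds a stale ETag.
        System.out.println(server.updateIfMatch("/Patient/7", seenByBothClients, "weight 81"));
        System.out.println(server.updateIfMatch("/Patient/7", seenByBothClients, "weight 79"));
    }
}
```

In this run the first update returns 200 and the second returns 412, so the second client has to re-read the resource before retrying. Clients that do not send the precondition header would need to change, which is one way such a feature can break compatibility.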

After a cloud provider for the File Storage has been selected, an appropriate module will be developed

to handle file CRUD operations. As stated above, this will require minimal effort.

The manual communicated to client developers will be finalized with the deployment of the newer

version to also document the specific HTTP error codes used in case of server errors or malformed

requests. The accompanying material (videos, demo requests) will also be updated to reflect any

changes.

2.2 Orchestrator Agent

The Orchestrator Agent (OA) is the active process which handles the flow of information through the

WELCOME project’s cloud components. As such, the OA triggers the execution of various processes,

communicates with other system components to execute specific tasks and handles all the composite

procedures of the WELCOME cloud hub.

The OA is organized in logical submodules. Figure 3 shows these submodules as a stack, in which the submodules at the bottom support the development of the submodules above them.

Each of these submodules’ development status is described in the following subsections. It should be

noted that while the existence and the abstract functionality of each of these modules should be

considered stable, their development is part of an ongoing agile process. This agile process involves a partial redesign of a module whenever new requirements are clarified. For example, adding a new rule to the DSS could require a new calculation to be implemented in the Calculations Handler module. Therefore, the implementation of these modules could change substantially in the future, as has already happened in the past.

[10] http://www.gzip.org/


Figure 3: Orchestrator agent’s modules’ stack

Regarding our basic development choices for the OA, we decided to use Java as the programming language because it is widely accepted, has a vast community and offers a large variety of open source libraries to support our programming effort. Furthermore, since the same programming language is used for the other major WELCOME cloud modules (Communications API, Feature Extraction Server etc.), using Java facilitates the integration process with them. Moreover, most of the major commercial cloud service providers (Amazon, Microsoft etc.) already provide out-of-the-box hosting for Java EE processes, facilitating the migration of our code to production cloud services.

In terms of testing, some additional libraries are used. Our unit tests are based on the jUnit [11] library, while the stub and mock objects that isolate the tested objects are built using the Mockito [12] library. In order to facilitate testing and the overall maintenance of the project, we also use the Spring framework [13] to exploit its dependency injection capabilities.

2.2.1 External Communication Interface (ECI)

This module exposes the core functionality of the OA. The functionality provided through the ECI allows the main hosting process to start an OA instance and to notify that instance of an incoming observation/measurement.

This functionality would typically be called from the WELCOME Communications API process, which hosts the OA in the same runtime. The Communications API would start the OA as part of its own startup procedure and notify the OA of an incoming observation/measurement through the respective Java event.

The ECI has been developed and tested as part of an internal demo/prototype process and is not expected to change heavily. It was tested through an exploratory procedure before the technical F2F meeting held among the WELCOME technical partners in Thessaloniki.

The ECI is implemented in pure Java and is visible to every Java process importing the OA’s jar file. The start functionality is implemented as a Java method, while the notify functionality is triggered through a Java event.

[11] http://junit.org/
[12] http://mockito.org/
[13] https://spring.io/

Regarding future development, we plan to enhance the ECI with a stop functionality which would

enable the OA to stop gracefully, properly handling special sensitive resources (closing connections to

files, connections to the SE etc.). This functionality would also allow the OA process to be restarted in

case of an unrecoverable error.
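The start, notify and planned stop operations can be sketched as a small lifecycle class. The example below is an assumption about how a graceful stop and a later restart could look using a standard Java executor; the class and method names are hypothetical and this is not the OA's actual interface.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical lifecycle wrapper; not the actual ECI code.
class AgentLifecycleSketch {
    private ExecutorService executor;
    private final AtomicInteger handled = new AtomicInteger();

    // Starts (or restarts) the agent.
    synchronized void start() {
        if (executor == null || executor.isShutdown()) {
            executor = Executors.newSingleThreadExecutor();
        }
    }

    // Queues the handling of one incoming observation.
    synchronized void notifyObservation(String observationUri) {
        if (executor == null || executor.isShutdown()) {
            throw new IllegalStateException("agent is not running");
        }
        // The workflow for observationUri would run here.
        executor.execute(() -> handled.incrementAndGet());
    }

    // Stops gracefully: queued work is finished before resources are released.
    synchronized boolean stop(long timeoutMillis) {
        if (executor == null) return true;
        executor.shutdown();
        try {
            if (executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        executor.shutdownNow(); // give up waiting and cancel what is left
        return false;
    }

    int handledCount() {
        return handled.get();
    }

    public static void main(String[] args) {
        AgentLifecycleSketch agent = new AgentLifecycleSketch();
        agent.start();
        agent.notifyObservation("http://example.org/Observation/1");
        agent.notifyObservation("http://example.org/Observation/2");
        System.out.println("stopped cleanly: " + agent.stop(2000));
        System.out.println("handled: " + agent.handledCount());
    }
}
```

Because start() creates a fresh executor after a shutdown, the same object can be restarted after a stop, which matches the restart scenario mentioned above.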

Table 1: Development status of ECI functionality

| Functionality | Implementation type | Status | Testing | Future development |
|---|---|---|---|---|
| Start OA | Java method | Built for demo | Unit test, exploratory test | Perhaps customizations will be needed for production |
| Notify OA for incoming measurement/observation | Java event | Built for demo | | |
| Stop OA | Java method | Not built | Not tested | |

2.2.2 Workflow Manager (WM)

The Workflow Manager (WM) module of the OA handles the execution of the workflows. While the actual execution of the workflow is based upon the jBPM library [14], we decided to wrap its functionality in a distinct module for the following reasons:

- Isolating the use of a library and reducing the dependencies between the various modules of a system is good practice in software engineering, leading to more maintainable and less error-prone code.
- Wrapping the functionality of jBPM makes it easier to add more functionality (e.g. multithreading).

WM uses the underlying jBPM runtime in order to execute BPMN2 workflows. WM not only wraps the functionality of jBPM for maintenance reasons, but also extends it. For example, jBPM provides only “logical multithreading” [15], which is not sufficient for a real production environment. WM provides real Java multi-threading in order to allow the concurrent execution of multiple processes.

WM also controls the execution of the respective BPMN2 workflows through the respective methods.

Special care has been taken in order to ensure that when stopping/killing a workflow process, all the

resources attached to it (threads etc.) are properly disposed.

Proper exception handling is a further functionality to be included in the WM. WM will wrap the exceptions that occur in the jBPM runtime engine so that they propagate through the Java runtime in a more maintainable fashion.
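The two ideas described for the WM, running several workflows on real Java threads and wrapping engine failures in one exception type, can be sketched without jBPM as follows. Each workflow is represented by a plain Callable, which is an assumption made to keep the example self-contained. WorkflowException is a hypothetical name and the thread pool is a stand-in for whatever threading the WM actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative runner; workflows are modelled as Callables instead of jBPM processes.
class WorkflowRunnerSketch {

    // Hypothetical wrapper that gives callers a single exception type to handle.
    static class WorkflowException extends RuntimeException {
        WorkflowException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Runs the workflows concurrently and returns their results in submission order.
    static List<String> runAll(List<Callable<String>> workflows, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Callable<String> workflow : workflows) {
                futures.add(pool.submit(workflow));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    // Re-throw the original failure inside the wrapper type.
                    throw new WorkflowException("workflow failed", e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new WorkflowException("interrupted while waiting", e);
                }
            }
            return results;
        } finally {
            pool.shutdownNow(); // always release the worker threads
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> workflows = new ArrayList<>();
        workflows.add(() -> "workflow-1 done");
        workflows.add(() -> "workflow-2 done");
        System.out.println(runAll(workflows, 2));
    }
}
```

The finally block mirrors the requirement that threads attached to a workflow are disposed of even when a workflow fails.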

[14] http://www.jbpm.org/
[15] https://docs.jboss.org/jbpm/v6.0/userguide/jBPMAsyncExecution.html


As part of the WM, the main information handling BPMN2 workflow has also been built. Figure 4 shows

the BPMN2 workflow as shown in the BPMN2 designer of the Eclipse IDE, used to develop the actual

workflow. Each call to an independent java module through a jBPM workflow implies the existence of

a so called “service” class. These service classes have been implemented in order to allow testing of the

workflow itself independently of the respective modules.

Figure 4: The main information handling workflow

Table 2 summarizes the development status of the WM module.

Table 2: Development status of WM functionality



| Functionality | Implementation type | Status | Testing | Future development |
|---|---|---|---|---|
| Handling BPMN2 workflows | Using jBPM2 library and custom java code | Built for production | | It should be considered stable for production |
| Exception management | Custom java code | Built for demo | | |
| Multi-threaded run of workflows | Custom java code | Built for production | | |
| Main information handling workflow | BPMN2 file | Built for production | | |
| Services to be called from main information handling workflow | Custom java code | Built for production | Exploratory test | It could need changes regarding the actual called functionality in the future, when the rest of the OA’s modules are ready |

2.2.3 External sources connector (ESC)

The External Sources Connector (ESC) is the module that retrieves data from external internet sources in order to integrate them with the rest of the data produced by the WELCOME system. We have identified two candidate sources that could be used during the implementation.

The first candidate source is the Yahoo Weather API [16]. It is planned to be used by the WELCOME cloud services to obtain weather data, as well as the short-range weather forecast.

[16] https://developer.yahoo.com/weather/

The second candidate source is the Global Near Real Time Data Access provided by the MACC-III project [17]. This data source could be used in order to provide data regarding air pollution.

The information provided by the two above sources could be used in the DSS rules’ evaluation. The

integration of the above sources and therefore the development of the respective clients as part of the

ESC depend on the DSS rules finalization process.

Table 3: Development status of ESC functionality



| Functionality | Implementation type | Status | Testing | Future development |
|---|---|---|---|---|
| Weather information retrieval | Web service client | Source identified, rather confident that it could be exploited | None | Not developed yet at all |
| Pollution information retrieval | FTP client | Possible source identified, we have to confirm access to the data through an API | None | Not developed yet at all. It will only be developed in case there are DSS rules based on the respective data. |

2.2.4 Calculations handler (CH)

The Calculations handler is the module of the OA that handles simple calculations. These include, for example, the calculation of averages, aggregates, means etc., excluding anything that could be considered part of a signal processing algorithm. The exact algorithms to be implemented as part of the CH are to be defined after the finalization of the DSS rules.
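As an indication of the kind of simple calculation meant here, the following sketch computes the count, mean, minimum and maximum of a series of values using only the Java standard library. The class name and the sample heart rate values are made up for illustration; the actual CH calculations depend on the DSS rules and are not yet defined.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;

// Illustrative aggregate calculation; names and values are hypothetical.
class CalculationsSketch {

    // Computes count, sum, min, max and average in a single pass.
    static DoubleSummaryStatistics summarize(double[] values) {
        return Arrays.stream(values).summaryStatistics();
    }

    public static void main(String[] args) {
        double[] heartRates = {62, 70, 75, 81}; // made-up sample values
        DoubleSummaryStatistics stats = summarize(heartRates);
        System.out.println("count=" + stats.getCount()
                + " mean=" + stats.getAverage()
                + " min=" + stats.getMin()
                + " max=" + stats.getMax());
    }
}
```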

2.2.5 Periodic behavior handler (PBH)

The Periodic behavior handler will be used to trigger specific functionality on a periodic basis. This functionality includes calculations executed by the CH (some values will probably need to be calculated periodically), the retrieval of external data through the ESC etc. The functionality that the PBH will execute has not yet been defined in detail.

2.2.6 Feature Extractions Handler (FEH)

The Feature Extractions Handler (FEH) handles the communication of the OA with the FES. The FEH implements the client part of this communication, which is based on web sockets and provides responses in an asynchronous fashion. Initially, we decided that the communication between the OA and the FES would be synchronous. However, this approach has been changed to asynchronous, so that the OA does not depend on possible delays in the analysis procedures executed on the FES.
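The benefit of the asynchronous approach is that the caller is not blocked while the analysis runs. The sketch below illustrates this with an in-process CompletableFuture. It deliberately swaps the web socket transport for a local background task, so it shows only the calling pattern, not the FEH protocol, and all names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

// Calling-pattern illustration only; a local task stands in for the remote FES.
class AsyncRequestSketch {

    // Returns immediately; the result is delivered when the analysis completes.
    static CompletableFuture<String> requestFeatures(String recordingId) {
        return CompletableFuture.supplyAsync(() -> "features-for-" + recordingId);
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = requestFeatures("recording-42");
        System.out.println("request sent, caller continues with other work");
        // The callback runs once the result is available.
        CompletableFuture<Void> handled =
                pending.thenAccept(result -> System.out.println("received " + result));
        handled.join(); // wait here only so that the demo does not exit early
    }
}
```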

We have already developed a prototype of the FEH following our initial synchronous approach. This first prototype was written using the Apache HTTP Components [18] library in order to implement the web service client functionality. As a testing method between the two partners engaged in the implementation of the specific communication process (AUTh implementing the OA part and Coimbra implementing the FES part), we adopted the SOAP UI tool [19]. However, since the overall communication between the OA and the FES has been revised, the implementation of the FEH has to be refactored.

[17] https://www.gmes-atmosphere.eu/oper_info/global_nrt_data_access/
[18] https://hc.apache.org/httpcomponents-client-ga/

Table 4: Development status of FEH functionality



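| Functionality | Implementation type | Status | Testing | Future development |
|---|---|---|---|---|
| FES client | First prototype as web service client | First prototype developed for demo. Approach changed and refactoring needed. | First prototype tested through unit tests and SOAP UI | Major refactoring needed. |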

2.2.7 Storage Engine Module (SEM)

The Storage Engine Module (SEM) is the module of the OA that isolates the functionality accessing the Storage Engine (SE). The SEM depends heavily on the WELCOME CRUD API, used as an independent .jar library. It has already been implemented against the first version of the CRUD API for a demo. Since the CRUD API will very soon be upgraded in order to provide new functionality, the SEM will also need to be refactored in order to exploit it.

Table 5: Development status of SEM functionality



| Functionality | Implementation type | Status | Testing | Future development |
|---|---|---|---|---|
| CRUD operations | First version implemented as a pure java code, based on WELCOME CRUD API | First version developed for demo. CRUD API will be upgraded in the near future and refactoring will be needed. | Unit tests, exploratory testing. | Refactoring will be needed. |

2.2.8 Logging Module (LM)

The Logging Module (LM) is very important for the maintenance and the debugging of the system. The LM facilitates logging of messages at different levels and to different targets (relational database, files etc.). The LM is implemented using the Apache Log4j2 library [20] in order to take advantage of its out-of-the-box functionality with no need to write custom code.

[19] http://www.soapui.org/
[20] http://logging.apache.org/log4j/2.x/


Table 6: Development status of LM functionality



| Functionality | Implementation type | Status | Testing | Future development |
|---|---|---|---|---|
| Logging information using different levels of logging | External java library | Finalized | None needed. A unit test is written in order to confirm correct configuration. | Extra configuration will be needed during integration with other modules and publishing on the cloud. |


3 Security aspects regarding the integration of various modules

Special care has been taken throughout the implementation process regarding the security of the interconnection points between the various modules of the WELCOME cloud infrastructure. The interconnection (or integration) points between the various modules of an enterprise system can be considered “gates” for security weaknesses, typically caused by misunderstandings, by network communication between the various modules, by poor definition of communication methods etc. [21], [22].

3.1 Types of integration of WELCOME modules

There are two main ways for various modules to be connected and communicate with each other:

- Through a normal network connection (SOAP web services, REST web services, TCP/IP, RMI, CORBA etc.).
- Through integration as libraries of “trusted” code (e.g. in Java, packaging each module as a jar file that contains the functionality of that module).

3.2 Measures taken for each interconnection point

The measures used in each interconnection point depend heavily on the nature of the connection. More

specifically, the provisions taken at each interconnection point can be summarized as follows:

WELCOME Communications API – Orchestrator Agent

The OA is integrated in the Communications API executable realm as an independent jar file. A special class has been written in the OA to function as the “gate” of communication between the Communications API and the OA, without exposing functionality of the main OA code. This class is called OrchestratorAgentAdapter, as it follows the Adapter design pattern [23], [24], and is part of the OA’s External Communication Interface (ECI). This approach was chosen in order to follow the Separation of Concerns and Single Responsibility principles, which in general lead to more secure software development [25].

Orchestrator Agent – Feature Extraction Server

The OA and the FES communicate through a well-defined REST API. Since the two modules run in separate execution realms and communicate through a typical REST communication scenario, we are going to use a public key infrastructure (PKI) security mechanism based on certificates and the widely accepted HTTPS protocol. This security mechanism is built upon the rest of the cloud infrastructure's security measures.

Orchestrator Agent – Decision Support System

The DSS is integrated with the OA as a reference to an independent jar file. The OA has isolated the access to the DSS functionality in a specific module, called DSS handler, adopting the principles of Separation of Concerns and Single Responsibility.

[21] http://www.computer.org/cms/CYBSI/docs/Top-10-Flaws.pdf
[22] http://www.oracle.com/technetwork/java/seccodeguide-139067.html#4
[23] http://www.headfirstlabs.com/books/hfdp/
[24] https://en.wikipedia.org/wiki/Adapter_pattern
[25] Martin, Robert C. (2003). Agile Software Development, Principles, Patterns, and Practices. Prentice Hall. pp. 95–98. ISBN 0-13-597444-5.

Orchestrator Agent – Storage Engine

The OA uses the SE through the so-called CRUD API jar file. The OA has isolated the access to the CRUD API in the Storage Engine Module (SEM) in order to follow the principles of Separation of Concerns and Single Responsibility, in an approach analogous to the other interconnections through jar files.

3.3 Error handling and logging

Another important aspect of the security of an enterprise system is the error handling procedures. We

have explicitly defined error handling procedures using custom programmatic exception classes or

taking advantage of the respective available network communication practices. More specifically:

Orchestrator agent

Raises a custom “ExceptionEvent” each time an error occurs. Through this event mechanism, other WELCOME cloud modules can be notified that an error occurred during the execution of OA-specific functionality. This event signals the occurrence of normal Java exceptions, and special care has been taken for the case where an uncaught runtime exception is raised, using the mechanisms provided by Java [26]. Moreover, a custom exception type wrapping the errors that may occur while workflows run inside the jBPM realm has been built. This custom exception class (WorkflowServicesException) is intended to wrap the exceptions thrown from jBPM in a way that makes them more comprehensible in case of runtime errors.

Feature extraction server

The FES has three levels of error handling. It uses HTTP status codes to propagate errors in its communication with the OA. Furthermore, it uses Java exceptions internally in order to handle errors. Finally, special care has been taken to handle exceptions inside the MATLAB realm, in order to ensure that errors are handled and propagated correctly to the layers above.

Communications API

The Communications API follows the RESTful paradigm and uses the HTTP status codes to indicate

errors to the client applications.

Storage Engine

The SE is accessed through the CRUD API library, which is used by the Communications API and the OA in order to store data. The CRUD API throws a custom exception type (implemented in a class named CRUDException) which wraps all the respective information and propagates it using standard Java development practices.
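The handling of uncaught runtime exceptions mentioned for the Orchestrator Agent relies on the standard Thread.setDefaultUncaughtExceptionHandler mechanism. The sketch below shows that mechanism in isolation: a process-wide handler records a failure raised on a background thread. The class name and the way the error is recorded are assumptions; in the OA the handler would presumably raise the ExceptionEvent instead.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal illustration of a default uncaught exception handler (hypothetical names).
class UncaughtHandlerSketch {
    static final AtomicReference<Throwable> lastError = new AtomicReference<>();

    // Registers a process-wide handler for exceptions that no thread catches.
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> lastError.set(error));
    }

    // Runs the task on its own thread and returns whatever the handler captured.
    static Throwable runAndCapture(Runnable task) {
        lastError.set(null);
        Thread worker = new Thread(task, "sketch-worker");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return lastError.get();
    }

    public static void main(String[] args) {
        install();
        Throwable error = runAndCapture(() -> {
            throw new IllegalStateException("workflow crashed");
        });
        System.out.println("captured: " + error);
    }
}
```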

Furthermore, we decided to use a logging framework as a security measure in order to facilitate the detection of malicious activity or errors and the respective recovery processes. More specifically, we use SLF4J [27] as a common logging layer between the various Java modules of the WELCOME cloud platform, with log4j as the logging framework.

[26] https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setDefaultUncaughtExceptionHandler%28java.lang.Thread.UncaughtExceptionHandler%29
[27] http://www.slf4j.org/


4 Data analysis

4.1 Feature Extraction Server

The Feature Extraction Server is the subsystem responsible for extracting features from the vital signs obtained from the sensors within the vest. This server communicates directly only with the Orchestrator Agent: it receives requests from the Orchestrator Agent and, once these are processed, sends back the feature extraction results. The Feature Extraction Server is composed of three main modules: a Web Service, a Task Manager and a Worker.

The architecture of the system can be viewed in Figure 5.

Figure 5: Feature Extraction Server Architecture.

The Web Service serves as the communication interface through which the Orchestrator Agent submits feature extraction requests. The status of the Feature Extraction Server modules can also be checked through this Web Service.

The Task Manager handles the feature extraction requests, placing them in a queue until they can be processed. To have these requests processed, the Task Manager sends them to the WELCOME Workers.

The Worker is a module whose sole purpose is to extract features from the vital signs contained within the requests. Once the extraction finishes, the results are sent back to the Orchestrator Agent. This module is intended to be installed on multiple machines, in order to maximize the number of feature extraction requests that can be processed simultaneously.

4.1.1 Server Execution Environment

The Web Service has to be installed on a web server capable of running Java code, such as Apache Tomcat, and therefore requires a Java Runtime Environment (JRE), preferably version 8. The Task Manager is installed on the same machine as the Web Service and only needs the same JRE. The Worker module has further requirements, starting with the operating system, which has to be Windows. This is due to the need to run dynamic link libraries (DLLs) that were previously compiled in MATLAB and contain the functions that call the feature extraction algorithms. These libraries in turn require MATLAB Runtime 8.5 and the Microsoft Visual C++ 2010 redistributable libraries to be installed.

4.1.2 Server Communication API

Communication with the Feature Extraction Server is performed using REST messages. The Orchestrator Agent submits feature extraction requests by calling specific endpoints of the Web Service and receives the results of these requests asynchronously. These endpoints are listed and defined in Deliverable D3.2, which also gives more detailed information about the communication protocol and the technologies adopted.

4.2 Implementation of Algorithms

4.2.1 Signal Processing Algorithms

This section provides an overview of the signal processing algorithms, which are detailed in Appendix A. Bibliographical references are also listed in the appendix.

ECG

Noise in the ECG signal affects the accuracy of algorithms designed to detect diverse cardiac pathologies, namely arrhythmias. Therefore, we propose a new method to detect noise periods and evaluate the quality of the ECG signals.

Following ECG signal quality assessment, the next sections describe the one-lead ECG algorithms that have been developed and implemented during the first phase of the project. A specific interface is provided for each algorithm, namely:

• ECG segmentation and interval computation: identification of the main fiducial points, such as the beginning and end of the P wave and the R peaks, as well as computation of relevant intervals, such as the PR interval duration;

• Detection of ventricular arrhythmia episodes, including premature ventricular contractions (identification of normal and abnormal beats), ventricular tachycardia and ventricular fibrillation;

• Atrioventricular (AV) block;

• ST deviation: estimation of the ST segment deviation.

Finally, the multi-lead algorithm for atrial fibrillation (AF) is described. The results obtained show that the multi-lead method outperforms the one-lead approach.

4.2.1.1.1 ECG signal quality assessment

The purpose of the presented algorithm is to serve as an entry barrier to the remaining ECG processing algorithms, by detecting noise periods and evaluating the quality of the ECG signals.

Methods

The algorithm consists of the following steps:

• Pre-processing, where mean removal and normalization are applied;

• Detection of R-peaks, using the Pan-Tompkins algorithm;

• Root Mean Square of the Approximation by PCA;

• High-Pass FIR Filtering;

• Noise Assessment in 4 s Segments and Thresholding.

Results and Analysis

We used the ECG signals from the MIT-BIH Arrhythmia Database [28] and noise records from the MIT-BIH Noise Stress Test Database [29], both available from Physionet, all with a sampling frequency of 360 Hz. The noise records were acquired in such a way that the subject’s ECG signals were not visible. Three types of noise were derived from these records: i) baseline wander (bw); ii) EMG artifact (ma); and iii) electrode motion (em). To add the noise to the ECG signals at different SNRs, we used the nst function from the WFDB Software Package, also provided by Physionet, which calculates the gains to apply to the noise records based on a peak-to-peak amplitude.

All records were resampled to a sampling frequency (fs) of 250 Hz, to match the sensors of the WELCOME vest. In the training stage we used the MLII-lead signals from records 201, 205, 213, 217, 223 and 231. We chose these records because of their high signal quality and the presence of various types of arrhythmias, which made it possible to determine the parameters that best discriminate the noise periods while keeping a low sensitivity to arrhythmia patterns.

The test results for the MLII-lead and V2-lead of the noise detection algorithm are presented in Table 7

and Table 8 respectively, where SE and SP stand for sensitivity and specificity, respectively (percent

values).
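As a reminder of how the SE and SP figures in such tables are obtained, the following minimal sketch computes them from segment-wise labels. It assumes each 4 s segment carries a binary reference label and a binary detector output; the function names and the example labels are illustrative and not taken from the WELCOME code base.

```python
def confusion_counts(reference, predicted):
    """Count TP, FP, TN, FN for two equally long sequences of 0/1 labels."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)
    fp = sum(1 for r, p in pairs if not r and p)
    tn = sum(1 for r, p in pairs if not r and not p)
    fn = sum(1 for r, p in pairs if r and not p)
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Percentage of true events that were detected."""
    return 100.0 * tp / (tp + fn)


def specificity(tn, fp):
    """Percentage of non-events that were correctly rejected."""
    return 100.0 * tn / (tn + fp)


def positive_predictive_value(tp, fp):
    """Percentage of detections that correspond to true events."""
    return 100.0 * tp / (tp + fp)


# Hypothetical labels for five 4 s segments (1 = noisy, 0 = clean).
reference = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
tp, fp, tn, fn = confusion_counts(reference, predicted)
print(sensitivity(tp, fn), specificity(tn, fp))
```

The same three functions apply to the beat-wise and event-wise results reported in the later sections, provided the counts are taken at the corresponding granularity.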

Table 7: Test results for the MLII-lead.

MLII       em             ma             bw             Total
SNR [dB]   SE     SP      SE     SP      SE     SP      SE     SP      SNR interval [dB]
-6         97.91  96.05   99.54  94.38   99.8   93.9    99.08  94.78   -6
0          98.62  96.82   99.64  94.57   99.37  94.72   99.15  95.07   [-6, 0]
6          98.81  97.23   99.6   95.41   93.01  95.27   98.48  95.37   [-6, 6]
12         97.97  96.97   98.44  96.95   66.5   96.04   95.77  95.69   [-6, 12]
18         86.84  96.83   95.09  96.79   34.62  97.19   91.05  95.94   [-6, 18]
NoiseAvg   96.03  96.78   98.46  95.62   78.66  95.42

Table 8: Test results for the V2-lead.

V2         em             ma             bw             Total
SNR [dB]   SE     SP      SE     SP      SE     SP      SE     SP      SNR interval [dB]
-6         96.81  96.79   96.39  96.51   96.29  95.96   96.5   96.42   -6
0          97.28  97.49   96.86  96.67   96.92  97.47   96.76  96.81   [-6, 0]
6          96.2   97.97   96.93  97.56   97.2   97.79   96.77  97.13   [-6, 6]
12         94.97  97.85   96.65  98.21   86.38  98.06   95.74  97.36   [-6, 12]
18         85.14  98.42   93.33  98.12   38.2   97.89   91.04  97.52   [-6, 18]
NoiseAvg   94.08  97.7    96.03  97.41   83     97.43

28 http://www.physionet.org/physiobank/database/mitdb/

29 http://www.physionet.org/physiobank/database/nstdb/


4.2.1.1.2 ECG segmentation and intervals computation

The detection of the ECG characteristic waves, i.e. ECG segmentation, is a fundamental task for the

diagnosis of cardiac disorders and heart-rate variability analysis.

Methods

Our algorithm for ECG segmentation is composed of three main steps:

• Elimination of the baseline wandering;

• Elimination of noise;

• Detection of the ECG characteristic waves.


To validate the ECG segmentation algorithm we adopted the following metrics: sensitivity (SE) and

positive predictive value (PPV). Additionally, the mean error of detection of the onset, peak and offset

of the characteristic waves was also assessed by the following equation:

$$ m = \frac{1}{N_{cycles}} \times \sum_{i=1}^{N_{cycles}} \left( I_{or}(i) - I_{oe}(i) \right) \qquad (1) $$

where $I_{oe}$ and $I_{or}$ are, respectively, the real and detected indexes of the characteristic waves in each heart cycle.
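The mean error of equation (1) can be sketched as follows. The conversion from sample indexes to milliseconds through the sampling frequency is an assumption made here so that the result is comparable with values reported in ms; the function and parameter names are illustrative.

```python
def mean_detection_error_ms(detected_idx, real_idx, fs_hz):
    """Mean of (detected - real) index differences over all cycles, in ms.

    detected_idx, real_idx: sample indexes of one fiducial point per heart
    cycle (same length). fs_hz: sampling frequency used for the conversion
    to milliseconds (an assumption of this sketch).
    """
    if len(detected_idx) != len(real_idx) or not detected_idx:
        raise ValueError("need two non-empty index lists of equal length")
    n_cycles = len(detected_idx)
    mean_samples = sum(d - r for d, r in zip(detected_idx, real_idx)) / n_cycles
    return 1000.0 * mean_samples / fs_hz


# Hypothetical R-peak indexes for three cycles sampled at 250 Hz.
error_ms = mean_detection_error_ms([105, 355, 610], [100, 350, 600], fs_hz=250)
print(round(error_ms, 2))
```

Because the differences are signed, early and late detections can cancel each other out in this form of the metric.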

The proposed algorithm was tested on the Physionet QT database, which is composed of 105 records of 15 minutes extracted from six databases.

The results achieved by the proposed algorithm are presented in Table 9. The algorithm achieved very good results in the detection of all three characteristic waves, with the best performance in the detection of the QRS complexes (SE: 99.8% and PPV: 99.7%). The performance in the detection of the P- and T-waves was slightly lower, with an SE of 93.2% and PPV of 99.8% for the P-waves and an SE and PPV of 99.1% for the T-waves. The mean error in the detection of the onset and offset of the characteristic waves was approximately 23 ms, which is about 10% of the length of the characteristic waves. Although the algorithm was not able to detect the exact boundaries of the characteristic waves, these results show only a minor error in the detection of the boundaries.

Table 9: Results achieved by the proposed algorithm for each dataset of the QT database.

Metric   Sensitivity [%]           PPV [%]
Wave     P       QRS     T         P       QRS     T
Mean     93.17   99.80   97.64     96.82   99.70   98.00

Metric   Mean error [ms]
Wave     P-waves                   QRS                       T-waves
         Onset   Peak    Offset    Onset   Peak    Offset    Onset   Peak    Offset
Mean     14.34   12.11   47.54     20.21   6.87    23.35     31.04   21.18   36.20


4.2.1.1.3 Ventricular Arrhythmias

This module presents the approach followed for the assessment of ventricular arrhythmias with clinical relevance for COPD. The framework includes algorithms for the detection of ventricular arrhythmias (PVC: premature ventricular contractions, VT: ventricular tachycardia and VF: ventricular fibrillation), which are currently incorporated into the algorithms for ECG analysis.

Methods

The proposed approach assumes that the fundamental differences in the physiologic origins of sinus rhythm and PVC/VT/VF can be discriminated via time analysis of the ECG’s morphology and spectral components. The set of discriminating features was determined using a correlation analysis of the most significant features found in the literature as well as new features developed during this work. These features are provided as inputs to a hierarchical neural network (NN) module that discriminates specific arrhythmias. In this classifier configuration, each module discriminates between only two classes. The achievable accuracy of a classifier generally depends on the number of classes it must separate. With only two classes, the mapping function to be identified is less complex, so each classifier can be expected to provide a better classification result. This justified the design of different neural network classifiers with specialized tasks (PVC, VT, and VF).


PVC detection

The PVC detection algorithm was validated using 46 of the 48 MIT-BIH database records. Records with non-MLII lead configurations were removed from the training and testing datasets, to preserve coherence in the morphological characteristics of the ECG records. The training dataset is composed of 1965 PVCs and 11250 normal QRS complexes from the aforementioned dataset. Validation was performed using all 46 dataset records (6595 PVCs and 95893 normal beats).

The PVC detection performance is presented in Table 10 and compared with state-of-the-art algorithms. The values shown for the latter are those reported by their respective authors. The sensitivity and specificity achieved by the proposed algorithm are 96.35% and 99.15%, respectively. Compared with the algorithms reported in the literature, the proposed algorithm is competitive: only Christov and Bortolan report higher values for both metrics.

Table 10: Results for PVC detection.

Algorithm               Sensitivity [%]   Specificity [%]
Proposed Algorithm      96.35             99.15
Jekova et al.           93.30             97.30
Christov et al.         96.90             96.70
Christov and Bortolan   98.50             99.70

VT and VF

To validate the VT/VF module of the algorithm, the following public databases were employed: MIT-BIH Arrhythmia Database (MIT), MIT-BIH Malignant Ventricular Arrhythmia Database (MVA) and Creighton University Ventricular Tachyarrhythmia Database (CVT) [30].

30 http://www.physionet.org/physiobank/database/cudb/


The performance of the algorithm for VT and VF detection on the MIT, MVA and CVT databases is presented in Table 11. The detection results are higher when each database is considered independently. Applied to all databases together, the method has a sensitivity of 89.3% and a specificity of 94.1%. This is mainly due to dubious annotations in some signals of the publicly available databases.

Table 11: VT/VF Classification performance.

Database          MIT    MVA    CVT    All
Sensitivity [%]   99.7   90.7   91.8   89.3
Specificity [%]   98.8   95.0   96.9   94.1

4.2.1.1.4 AV Blocks

An atrioventricular block (AV block) is a type of heart block in which the conduction between the atria and ventricles of the heart does not follow a correct path, i.e., the atrial depolarizations fail to reach the ventricles or the atrial depolarization is conducted with a delay.

Methods

There are three types of AV block: first-degree, second-degree (Mobitz type 1 and Mobitz type 2) and

third-degree atrioventricular block. From the perspective of the patient, the first-degree AV block is,

typically, not associated with any symptoms; the second-degree is usually asymptomatic, but some

irregularities of the heartbeat can be observed by the patient; the third-degree is, generally, associated

with symptoms such as fatigue, dizziness, and light-headedness.

From the clinical point of view, each type of AV block is typically diagnosed from the ECG analysis, based on the parameters associated with the conduction between the atria and ventricles: the duration of the PR interval and the occurrence/order of the P wave and QRS complex. Therefore, following ECG segmentation, a number of state-of-the-art rules are applied to identify AV blocks.
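To illustrate what rules on PR interval duration and P/QRS occurrence can look like, the sketch below applies simplified textbook criteria to a sequence of PR intervals. It is not the rule set implemented in WELCOME: the 200 ms limit is the usual clinical threshold for a prolonged PR interval, while the 20 ms prolongation tolerance, the three-beat look-back and the function name are assumptions made for this example. Third-degree block is not covered.

```python
def classify_av_conduction(pr_ms, pr_limit_ms=200.0, prolongation_ms=20.0):
    """Label a beat sequence from its PR intervals (illustrative rules only).

    pr_ms: one entry per P wave, the PR interval in ms, or None when the
    P wave is not followed by a QRS complex (non-conducted beat).
    """
    conducted = [p for p in pr_ms if p is not None]
    if not conducted:
        return "no conducted beats"
    if None in pr_ms:
        first_drop = pr_ms.index(None)
        # Up to three conducted beats immediately before the first dropped beat.
        before = pr_ms[:first_drop][-3:]
        if len(before) >= 2 and before[-1] - before[0] > prolongation_ms:
            # Progressive PR prolongation before the dropped beat.
            return "second-degree (Mobitz type 1)"
        return "second-degree (Mobitz type 2)"
    if all(p > pr_limit_ms for p in conducted):
        return "first-degree"
    return "normal conduction"


print(classify_av_conduction([180, 220, 280, None, 180]))
```

Since the PR intervals come from the segmentation stage, any error in the detected P-wave onset or QRS onset propagates directly into such rules.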


As described, all the methods for AV block detection rely on the identification of the main waves and intervals, namely the PR interval and QRS complex. To this end, the segmentation module developed within this project was used to compute these intervals. Since the methods implemented for AV blocks are a direct implementation of the referred rules (duration of the PR intervals and occurrence/order of P waves and QRS complexes), their performance is directly linked to the performance of the segmentation.

As a result, the validation of this algorithm is established directly by the segmentation performance.

4.2.1.1.5 ST deviation

In this section, we describe the approach followed for the estimation of the ST segment deviation.

Methods

The algorithms implemented to evaluate the ST segment deviation basically follow two stages:

• The ECG signal is first broken into cardiac cycles and a baseline removal process is applied to each individual interval;

• The second stage computes several measures of the deviation. Four measurements of ST deviation are available, so that the person analyzing the ST segment deviation has several different values to support decision making. The first three were chosen from the literature; for the last one, a new algorithm based on the Wigner-Ville transform was developed and implemented.


A true validation could not be carried out. The available databases in this area, namely the European ST-T Database and the Long-Term ST Database, do not provide the values of the ST segment deviation, which prevents a direct comparison. These datasets were created for the evaluation of algorithms that detect or differentiate between ischemic ST episodes, axis-related non-ischemic ST episodes, etc. This is not the case of the present algorithm, which only considers discrete values of the ST segment deviation without further processing. For this reason, a correlation analysis was carried out between our method and each of the state-of-the-art methods. The average results obtained are presented in Table 12.

Table 12: ST deviation correlation analysis.

Method              Correlation coefficient
Taddei’s method     0.512
Pang’s method       0.575
Akselrod’s method   0.576

Records: e0105, e0213, e0403, e0107, e0305, e0405, e0111, e0409, e0113, e0411, e0115, e0119, e0413, e0121, e0415, e0127, e0501, e0123, e0129, e0515, e0125, e0417, e0139, e0601, e0147, e0603, e0151, e0607, e0605, e0159, e0609, e0163, e0161, e0203, e0817, e0613, e0205, e0615, e0207, e0801, e0303, e0211, e0103, e0305
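A correlation analysis of this kind can be sketched as follows, assuming that the Pearson coefficient is used and that the per-record coefficients are averaged with equal weight. Both points are assumptions, since the text does not state which coefficient or averaging was applied, and the function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # A constant sequence has zero variance and makes this division fail.
    return sxy / math.sqrt(sxx * syy)


def mean_correlation(per_record_pairs):
    """Average the coefficient over records.

    per_record_pairs: list of (ours, reference) tuples, one per record,
    each holding the beat-wise ST deviation values of the two methods.
    """
    coefficients = [pearson(ours, ref) for ours, ref in per_record_pairs]
    return sum(coefficients) / len(coefficients)


# Hypothetical ST deviation values for two short records.
records = [([1, 2, 3, 4], [2, 4, 6, 8]), ([1, 2, 3], [3, 2, 1])]
print(mean_correlation(records))
```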

4.2.1.1.6 Atrial Fibrillation using 12-lead ECG

Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia and is associated with significant morbidity, mortality and decreased quality of life, especially in the elderly. This module presents the approach followed for its detection, using a novel 12-lead ECG method.

Methods

The proposed method consists of the following steps:

• a noise detection phase, where ECG signals are analyzed in order to detect the segments with noise;

• a feature extraction phase, where the ECG signals are processed and analyzed in order to extract relevant features, namely, performing segmentation of the MLII-lead ECG signal;

• a classification phase, responsible for the discrimination between AF and non-AF episodes.

Results and analysis

In this study, AF and non-AF episodes from 12 patients were considered. Of these, 1 episode (2 records of 30 min) was selected from the “St.-Petersburg Institute of Cardiological Technics 12-lead Arrhythmia Database” and 11 episodes (11 records of 60 min) were selected from the 12-lead ECG database collected by our team under the project “Cardiorisk - Personalized Cardiovascular Risk Assessment through Fusion of Current Risk Tools”.

The selected records were partitioned into segments of 5 min, leading to a dataset of 144 records of 5 min each, of which 72 present AF and 72 present rhythms other than AF.

The features most suitable for the detection of AF episodes were selected based on the F-score metric. A ROC analysis was performed for each feature using a 6-fold cross-validation approach, leading to the selection of eight features. The best features were extracted from the heart rate (HR) analysis (F4 and F3), followed by three features from the atrial activity (AA) analysis (F6, F8 and F10). Three features from the HR and AA analyses (F2, F9 and F11) presented an F-score below 50% and were therefore not selected.

The algorithm was validated using a 6-fold cross-validation approach, in which the dataset was randomly partitioned into 6 subsets of equal size. Of the 6 subsets, 5 were used for training (with episodes from 10 patients) and 1 (with episodes from 2 patients) was used for testing. The cross-validation process was repeated 6 times, so that each of the 6 subsets was used once for testing. This whole procedure was repeated 20 times, and the average and standard deviation (avg ± std) of the sensitivity (SE), specificity (SP) and positive predictive value (PPV) were evaluated.
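The repeated cross-validation scheme can be sketched as below. The sketch assumes that the random partition is made at patient level (two patients per fold), which is consistent with the 10/2 patient split described above but is not stated explicitly; the seed, function names and example values are illustrative.

```python
import random
import statistics


def repeated_cv_splits(patient_ids, n_folds=6, n_repeats=20, seed=0):
    """Yield (repeat, fold, train_patients, test_patients) tuples.

    Patients are shuffled once per repeat and dealt into n_folds groups,
    so a patient never appears in both the training and the testing set.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        ids = list(patient_ids)
        rng.shuffle(ids)
        folds = [ids[i::n_folds] for i in range(n_folds)]
        for k, test in enumerate(folds):
            train = [p for j, fold in enumerate(folds) if j != k for p in fold]
            yield repeat, k, train, test


def mean_std(values):
    """Average and sample standard deviation, as reported in 'avg ± std'."""
    return statistics.mean(values), statistics.stdev(values)


splits = list(repeated_cv_splits(range(12)))
print(len(splits))
```

With 12 patients this produces 120 train/test splits; a metric such as SE would be computed on each testing subset and then summarised with `mean_std`.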

Table 13 presents the results achieved on the testing subsets by a single-lead algorithm that we proposed previously and by the novel multi-lead algorithm. The multi-lead algorithm outperformed the single-lead algorithm. The analysis of the AA recovered from 12-lead source separation provided relevant features that increased the algorithm’s SE by approximately 9 percentage points, its SP by 1 and its PPV by 4. These results show that source separation techniques such as ICA can provide valuable insight into the AA and enable the extraction of reliable features for AF detection.

Table 13: Results achieved by the proposed multi-lead and single-lead AF detection algorithms.

Algorithm               Sensitivity [%]   Specificity [%]   PPV [%]
                        avg ± std         avg ± std         avg ± std
Single-lead algorithm   79.0 ± 3.0        91.4 ± 0.5        86.6 ± 2.2
Multi-lead algorithm    88.5 ± 1.4        92.9 ± 0.3        90.6 ± 1.4

Lung Sounds

4.2.1.1.7 Data Collection

An acquisition protocol was implemented to support the design of the pulmonary sound processing algorithms, in collaboration with the General Hospital of Thessaloniki ‘G. Papanikolaou’ and the General Hospital of Imathia (Health Unit of Naoussa), Greece. The protocol includes the collection of chest sounds from 30 patients at six recording sites. For each site, lung sounds as well as cough and speech were acquired. The ethical committee of the General Hospital of Thessaloniki ‘G. Papanikolaou’ authorized the data acquisition.

Auscultation was performed with the participants in a sitting position, using six channels that were set

in different positions: four in the back and two in the front of the chest (Figure 6). For each volunteer

we selected the data acquired from the two positions where the adventitious sounds/normal sounds

were better heard.

Figure 6: Potential positions for the acquisition of sounds (red). For each volunteer we selected the data acquired from the two positions where the adventitious sounds/normal sounds were better heard.


The data were acquired at 4 kHz using a 3M Littmann electronic stethoscope (model 3200), which complies with the EMC requirements of IEC 60601-1-2. In order to evaluate the performance of the algorithms at a lower sampling rate (for compatibility with the vest requirements), the data were also downsampled to 2 kHz.

All the recorded data are being annotated by pulmonologists. So far, data from 9 patients and 3 healthy volunteers have been annotated and were used in the experimental study. The study will be complemented with the remaining annotations as soon as they are available.

As for cough, during the acquisition, the volunteers were asked to simulate cough and then to count

from one to ten. The physicians who supervised the acquisition annotated the different events in the

timeline and we assigned them to three classes: (1) cough, (2) speech, and (3) other. Cough and speech

periods are the predominant events in the cough sub-dataset. In total, 343 cough events were

annotated.

A total of 113 wheezes were annotated in the temporal space. Using this information, the frames in the spectrogram space were annotated as containing or not containing wheezes. During the selection of the most suitable features for wheeze detection, data from four patients with episodes of forced cough and speech were also used, as mentioned above. Although these events are detected in the previous stages of the data processing workflow, these additional data improve the robustness of the algorithm.

A total of 199 crackle events were also annotated in the temporal space. Since crackles can appear individually or in groups, the algorithm developed in this study aims to detect crackle events rather than individual crackles. Frames were annotated as containing or not containing crackles. Neighboring frames with a maximum distance of 5 frames (each frame has a duration of 128 ms, with an overlap of 75%) were grouped and considered to belong to the same crackle event.
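The grouping of neighboring frames into events can be sketched as follows. The sketch assumes that the "maximum frame distance of 5 frames" is a difference of at most 5 between frame indexes; with 128 ms frames and 75% overlap, consecutive frames start 32 ms apart. Function names are illustrative.

```python
def group_frames(positive_frames, max_gap=5):
    """Group indexes of positive frames into events.

    Two positive frames belong to the same event when their index
    difference is at most max_gap. Returns (first_frame, last_frame) pairs.
    """
    events = []
    for frame in sorted(positive_frames):
        if events and frame - events[-1][1] <= max_gap:
            events[-1][1] = frame  # extend the current event
        else:
            events.append([frame, frame])  # start a new event
    return [tuple(event) for event in events]


def event_times_ms(event, frame_ms=128.0, overlap=0.75):
    """Start and end time of an event, from its first and last frame index."""
    hop_ms = frame_ms * (1.0 - overlap)
    return event[0] * hop_ms, event[1] * hop_ms + frame_ms


events = group_frames([3, 4, 8, 20, 21, 30])
print(events, event_times_ms(events[0]))
```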

4.2.1.1.8 Sound signal quality assessment

The goal of this algorithm is to assess the quality of the audio signals before the rest of the sound

processing algorithms start. In its current version, the algorithm is quite simple: given a number of

thresholds (related to amplitude and length), it finds silent and saturated segments, and it outputs the

useful segments of the signal. The default thresholds are shown in Table 14. A segment is considered

silent or saturated if it reaches both amplitude and length thresholds (e.g. a segment is silent if the

number of consecutive samples with absolute amplitude ≤0.1% has length ≥2ms).

We are presently working on a more general lung sound noise detection approach.

Table 14: Default thresholds.

                Silent   Saturated
Amplitude (%)   0.1      99.9
Length (ms)     2        1
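A minimal sketch of this threshold-based quality check is given below. It assumes that samples are normalised to the range [-1, 1], so that the amplitude thresholds of Table 14 correspond to 0.001 and 0.999 of full scale; the function names are illustrative and the actual implementation may differ.

```python
def find_runs(flags, min_len):
    """Return (start, end) pairs (end exclusive) of runs of True values
    that are at least min_len samples long."""
    runs = []
    start = None
    for i, flag in enumerate(list(flags) + [False]):  # sentinel closes the last run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def useful_segments(x, fs_hz, silent_amp=0.001, silent_ms=2.0,
                    saturated_amp=0.999, saturated_ms=1.0):
    """Return the (start, end) sample ranges that are neither silent nor saturated."""
    silent_len = max(1, round(silent_ms * fs_hz / 1000.0))
    saturated_len = max(1, round(saturated_ms * fs_hz / 1000.0))
    bad = find_runs([abs(v) <= silent_amp for v in x], silent_len)
    bad += find_runs([abs(v) >= saturated_amp for v in x], saturated_len)
    keep = [True] * len(x)
    for start, end in bad:
        for i in range(start, end):
            keep[i] = False
    return find_runs(keep, 1)


# Hypothetical signal at 4 kHz: 2 ms of silence and 1 ms of saturation.
signal = [0.5] * 10 + [0.0] * 8 + [0.3] * 5 + [1.0] * 4 + [0.2] * 3 + [0.0] * 3
print(useful_segments(signal, fs_hz=4000))
```

In the example, the final three zero samples are shorter than 2 ms at 4 kHz and are therefore kept as part of a useful segment.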

4.2.1.1.9 Cough Detection

Cough is a respiratory reflex characterized by sudden expulsion of air at a high velocity accompanied by

a transient sound of varying pitch and intensity. Here we briefly describe a method for automatic

recognition and counting of coughs solely from sound recordings, which ideally removes the need for

trained listeners.


Methods

Algorithm overview

Figure 7 outlines the cough detection process.

Figure 7: Outline of the cough detection algorithm.


Due to the small size of the dataset, a leave-one-out (volunteer) cross-validation (LOOCV) approach was used to test the performance of the detectors. Note that the "one left out" is always a subject, not a single event.

Table 15 and Table 16 show the mean classification results for LOOCV and the sum of detected coughs.

Table 15: Classification results for cough detection with LOOCV (mean ± standard deviation).

              Detected Coughs   Sensitivity [%]   Specificity [%]   PPV [%]   F-measure [%]
First Step    331/343           96.5±5            80.3±15           88.9±8    92.3±5
Second Step   319/343           93.2±6            87.6±11           92.5±6    92.6±4

Table 16: Classification results for cough detection with LOOCV after downsampling to 2 kHz (mean ± standard deviation).

              Detected Coughs   Sensitivity [%]   Specificity [%]   PPV [%]   F-measure [%]
First Step    332/343           97.0±2            78.4±16           88.1±8    92.1±5
Second Step   320/343           93.6±7            89.3±13           93.4±7    93.2±4

4.2.1.1.10 Detection of crackles and wheezes

The automatic detection of adventitious sounds (additional respiratory sounds superimposed on breath sounds) is a valuable non-invasive tool to detect and follow up respiratory diseases such as chronic obstructive pulmonary disease (COPD). Adventitious sounds include wheezes (continuous sounds), stridors, squawks and crackles (discontinuous sounds). Here, we briefly illustrate our methods to detect crackles and wheezes.

Methods

Two independent classification models, one for wheezes and another for crackles, were developed.

Both follow the workflow presented in Figure 8.

Figure 8: Workflow of the crackles/wheezes events detector algorithm.

We tested the performance of several features for detecting wheeze and crackle events: 30 features for wheezes and 33 features for crackles.

The sensitivity and positive predictive value (PPV) measured after classifying the data with adventitious sounds using the crackles/wheezes detector are presented in Table 17. Table 18 presents the same metrics when the signals were downsampled. The results decrease slightly, because some higher-frequency content is discarded in some cases.

Table 17: Sensitivity and PPV (mean ± standard deviation) measured after classifying the data with adventitious sounds using the crackles/wheezes detector.

            Crackles                      Wheezes
            Sensitivity [%]   PPV [%]     Sensitivity [%]   PPV [%]
Mean ± SD   84 ± 22           78 ± 17     79 ± 28           90 ± 10

Table 18: Downsampled signal. Sensitivity and PPV (mean ± standard deviation) measured after classifying the data with adventitious sounds using the crackles/wheezes detector.

            Crackles                      Wheezes
            Sensitivity [%]   PPV [%]     Sensitivity [%]   PPV [%]
Mean ± SD   80 ± 23           78 ± 20     76 ± 30           90 ± 14

Activity and SpO2 signal

The activity signal consists of integer values that encode the type of activity performed. More precisely, there are 5 different values, one for each activity type (lying, sitting/standing, walking, running and unknown). Several features are calculated from the activity signal. First, the dominant activity of the signal is extracted. Furthermore, the absolute time spent in each activity type present in each segment of the signal is extracted, as well as its relative percentage of the total signal duration. Finally, both the signal duration and the signal quality are calculated; the latter is defined as the percentage of accepted values in the signal.
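The activity features described above can be sketched as follows. The integer codes, the sample period and the definition of an "accepted" value (here: any code of a known activity, i.e. not "unknown") are assumptions made for this example, since the actual coding used by the sensor is not given in this section.

```python
from collections import Counter

# Hypothetical coding of the five activity types.
ACTIVITY_CODES = {0: "lying", 1: "sitting/standing", 2: "walking",
                  3: "running", 4: "unknown"}


def activity_features(codes, sample_period_s=1.0):
    """Compute duration, dominant activity, time and percentage per
    activity, and signal quality from a sequence of activity codes."""
    if not codes:
        raise ValueError("empty activity signal")
    # Values outside the code table are treated as "unknown".
    counts = Counter(ACTIVITY_CODES.get(c, "unknown") for c in codes)
    n = len(codes)
    known = {name: cnt for name, cnt in counts.items() if name != "unknown"}
    return {
        "duration_s": n * sample_period_s,
        "dominant": max(known, key=known.get) if known else "unknown",
        "time_s": {name: cnt * sample_period_s for name, cnt in counts.items()},
        "percent": {name: 100.0 * cnt / n for name, cnt in counts.items()},
        "quality_percent": 100.0 * sum(known.values()) / n,
    }


features = activity_features([0, 0, 0, 1, 1, 2, 4, 4, 0, 9])
print(features["dominant"], features["quality_percent"])
```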

The SpO2 signal estimates the oxygen saturation level. It is by nature a low-frequency signal, so abrupt changes can be considered artifacts. For this reason, a second-order low-pass Butterworth filter is applied to the SpO2 signal. The preprocessing of the signal is completed by rounding the signal to the nearest integer. The mean value of the SpO2 signal is extracted only if the quality of the signal is at least 80%, so that the results are meaningful. Additionally, the mean value of the SpO2 signal is also computed for each type of activity performed.
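The quality gate and the per-activity means can be sketched as below. The low-pass filtering step is left out because its cut-off frequency is not given here, and the per-sample validity flags are an assumption of this example (the text only defines quality as a percentage of accepted values).

```python
import math


def round_to_nearest(value):
    """Round half up, to avoid Python's round-half-to-even behaviour."""
    return math.floor(value + 0.5)


def spo2_mean(samples, valid, min_quality=80.0):
    """Mean SpO2 over accepted samples, or None when fewer than
    min_quality percent of the samples are accepted."""
    if not samples:
        return None
    accepted = [round_to_nearest(s) for s, ok in zip(samples, valid) if ok]
    quality = 100.0 * len(accepted) / len(samples)
    if quality < min_quality:
        return None
    return sum(accepted) / len(accepted)


def spo2_mean_by_activity(samples, valid, activities):
    """Mean SpO2 of the accepted samples for each activity label."""
    groups = {}
    for s, ok, activity in zip(samples, valid, activities):
        if ok:
            groups.setdefault(activity, []).append(round_to_nearest(s))
    return {activity: sum(v) / len(v) for activity, v in groups.items()}


# Hypothetical samples; the fourth one is flagged as not accepted.
samples = [95.4, 96.6, 97.5, 0.0, 94.2]
valid = [True, True, True, False, True]
activities = ["sitting", "sitting", "walking", "walking", "walking"]
print(spo2_mean(samples, valid), spo2_mean_by_activity(samples, valid, activities))
```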

For the extraction of the activity and SpO2 features, several DLL files have been created using the MATLAB computing environment. For this purpose, the Microsoft Windows 7 SDK has been installed, which works with a specific C++ compiler. These DLL files run in the Feature Extraction Server (FES). A specific error handling procedure has been adopted to prevent the MATLAB runtime engine from crashing during the execution of the algorithms. In order to test the effectiveness and reliability of the developed algorithms, we created multiple EDF+ files containing several activity and SpO2 signals, using specific MATLAB algorithms. These files contain simulated data that use the same coding as the data that will be produced by the wearable sensor, and they will be used for WELCOME testing purposes.

4.2.2 EIT Signal Processing

In WELCOME, the monitoring of the patient’s regional lung ventilation is accomplished through electrical impedance tomography (EIT). EIT is a non-invasive, radiation-free medical imaging technique which will become wireless and wearable through the WELCOME project. In lung EIT, a set of electrodes is placed around the patient’s thorax and used for injecting electrical currents and measuring the resulting potentials through well-defined stimulation patterns. These potentials are used to compute images showing the distribution of electrical resistivity changes in the studied chest cross-section. In WELCOME, in addition to the monitoring of spontaneous tidal breathing, standard ventilation maneuvers are foreseen. Early results have shown that regional ratios of forced expiratory volume in 1 s (FEV1) and forced vital capacity (FVC) can be computed using the acquired EIT image sequences. The EIT feature extraction software consists of independent modules written in MATLAB. For raw EIT data reading and reconstruction, the software platform of the Electrical Impedance and Diffuse Optical Reconstruction Software (EIDORS) project is used.


The EIT reconstruction process is sensitive to measurement errors; consequently, movements,

sweat, electrode drifting or detachment cause artifacts. To overcome these, disturbance detection is

performed with simple statistical rules and thresholding that detect extremely large voltage variations.

Most of the existing EIT reconstruction algorithms use a segment of the raw voltage measurements as

reference. The optimal reference is considered to be a short tidal breathing period in the raw EIT data. To

detect this reference we compute the RGIC, which is the average voltage of each time frame, and apply

band-pass filtering. The next step of the RGIC processing is the detection of a relatively stable tidal

breathing period, that is, the detection of a small number (4-6) of consecutive breaths of almost

constant volume and duration. This is accomplished by first detecting the end-inspiratory and

end-expiratory values of the RGIC through the differentiation of its filtered version. A common problem of the

initial breath segmentation is the detection of smaller breath-like variations which cannot be

considered normal breaths. These weak breaths appear as outliers due to their small volume and duration.

They are eliminated by merging each of them with one of its adjacent normal breaths. The

resulting breath sequence is scanned until a predetermined number of consecutive tidal breaths is

found (the most stable tidal breathing). The criterion used is based on the standard deviation of the

volume and duration of the breaths under examination.
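The scan for a stable tidal breathing period can be illustrated with a minimal sketch. It assumes that breaths have already been segmented and weak breaths merged, and that "stable" means the coefficient of variation (standard deviation divided by mean) of both volume and duration stays below a tolerance. The class name, method names and the tolerance are hypothetical; the actual modules are written in MATLAB and their exact criterion is not reproduced here.

```java
/** Hypothetical sketch of the stable tidal breathing scan; not the WELCOME MATLAB code. */
final class StableTidalPeriodSketch {

    /** Coefficient of variation (population standard deviation / mean) of values[from..from+count). */
    static double coefficientOfVariation(double[] values, int from, int count) {
        double mean = 0;
        for (int i = from; i < from + count; i++) {
            mean += values[i];
        }
        mean /= count;
        double variance = 0;
        for (int i = from; i < from + count; i++) {
            variance += (values[i] - mean) * (values[i] - mean);
        }
        variance /= count;
        return Math.sqrt(variance) / mean; // assumes strictly positive volumes and durations
    }

    /**
     * Scans the breath sequence and returns the index of the first breath of the first window of
     * `count` consecutive breaths whose volume and duration both vary less than `maxCv`.
     * Returns -1 when no such window exists.
     */
    static int findStableWindow(double[] volumes, double[] durations, int count, double maxCv) {
        for (int start = 0; start + count <= volumes.length; start++) {
            boolean stableVolume = coefficientOfVariation(volumes, start, count) < maxCv;
            boolean stableDuration = coefficientOfVariation(durations, start, count) < maxCv;
            if (stableVolume && stableDuration) {
                return start;
            }
        }
        return -1;
    }
}
```

With a window of 4 breaths, a sequence whose second breath is three times larger than the others yields its first stable window at the third breath, since the first two windows still contain the outlier.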

The development of the EIT feature extraction modules is based on the reconstruction algorithms

provided by the EIDORS open source library. More specifically, the adjacent electrode current drive and

voltage measurement pattern is used. Also, the GREIT reconstruction matrix for adult human thorax geometry

is used for the forward model.

A portion of the EIT features required for the characterization of regional ventilation must be computed

based on the processing and analysis of tidal (or quiet) breathing periods. The Global Impedance Curve

(GIC) of the reconstructed EIT image sequence must be partitioned into tidal breathing periods (TBP).

The GIC is computed, filtered and differentiated to identify the end-inspiratory and end-expiratory

values. The sequence of detected breaths is partitioned into TBPs by a heuristic rule: the

sequence is sequentially scanned by a sliding window of constant width and, at each position, the set

of breaths in the sliding window is classified as tidal or not. Next, with a second scan, overlapping tidal

segments are merged into larger tidal segments. The resulting tidal segments constitute the reported

TBPs.
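The two-scan heuristic can be sketched as follows. The first scan classifies every sliding-window position, and the second scan merges overlapping tidal windows. The tidal test used here (relative spread of breath amplitudes within the window) is only a hypothetical stand-in for the actual classifier, and all names and thresholds are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of TBP partitioning; the real tidal/non-tidal classifier is not shown in this document. */
final class TidalPeriodPartitionSketch {

    /** Stand-in classifier: a window is tidal when (max - min) / mean of its amplitudes is small. */
    static boolean isTidal(double[] amplitudes, int from, int to, double maxRelativeSpread) {
        double min = Double.MAX_VALUE;
        double max = -Double.MAX_VALUE;
        double sum = 0;
        for (int i = from; i < to; i++) {
            min = Math.min(min, amplitudes[i]);
            max = Math.max(max, amplitudes[i]);
            sum += amplitudes[i];
        }
        double mean = sum / (to - from);
        return (max - min) / mean <= maxRelativeSpread;
    }

    /** Returns the merged tidal segments as {firstBreath, lastBreathExclusive} index pairs. */
    static List<int[]> partition(double[] amplitudes, int window, double maxRelativeSpread) {
        // First scan: classify the set of breaths at each sliding-window position.
        List<int[]> tidalWindows = new ArrayList<>();
        for (int start = 0; start + window <= amplitudes.length; start++) {
            if (isTidal(amplitudes, start, start + window, maxRelativeSpread)) {
                tidalWindows.add(new int[] {start, start + window});
            }
        }
        // Second scan: merge overlapping tidal windows into larger tidal segments.
        List<int[]> segments = new ArrayList<>();
        for (int[] current : tidalWindows) {
            int[] last = segments.isEmpty() ? null : segments.get(segments.size() - 1);
            if (last != null && current[0] < last[1]) {
                last[1] = Math.max(last[1], current[1]);
            } else {
                segments.add(new int[] {current[0], current[1]});
            }
        }
        return segments;
    }
}
```

For eight breaths with one deep breath at index 4 and a window of 3, the sketch reports two TBPs: breaths 0 to 3 and breaths 5 to 7.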

For each spontaneous tidal breathing period, functional EIT images (fEIT) and various ventilation indices

are computed. fEIT images can potentially identify alterations of local lung dynamics. Currently, each

fEIT image is computed as the difference between the end-inspiratory EIT image and the previous

end-expiratory one. More advanced methods for fEIT image computation, taking into account the phase

shifts of individual pixel-level impedance courses in the EIT sequence, are being implemented.
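The current difference-based fEIT computation amounts to a pixel-wise subtraction of two reconstructed frames. The sketch below assumes that each EIT image is available as a rectangular array of impedance-change values; the class and method names are hypothetical, and Java is used only for consistency with the other listings (the actual modules are written in MATLAB).

```java
/** Hypothetical sketch of the difference-based fEIT image. */
final class FunctionalEitSketch {

    /** Pixel-wise difference: end-inspiratory image minus the previous end-expiratory image. */
    static double[][] tidalImage(double[][] endInspiratory, double[][] previousEndExpiratory) {
        if (endInspiratory.length != previousEndExpiratory.length) {
            throw new IllegalArgumentException("images must have the same number of rows");
        }
        double[][] difference = new double[endInspiratory.length][];
        for (int row = 0; row < endInspiratory.length; row++) {
            if (endInspiratory[row].length != previousEndExpiratory[row].length) {
                throw new IllegalArgumentException("rows must have the same number of pixels");
            }
            difference[row] = new double[endInspiratory[row].length];
            for (int col = 0; col < endInspiratory[row].length; col++) {
                difference[row][col] = endInspiratory[row][col] - previousEndExpiratory[row][col];
            }
        }
        return difference;
    }
}
```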

More details on the EIT signal processing can be found in B.3.

4.3 Performance of the feature extraction algorithms

In this subsection, performance testing of the feature extraction algorithms is presented. First, the

testing environment is described and then the results are presented and discussed.

4.3.1 Testing environment

To test the performance, we sent a significant number of simultaneous extraction requests to the

Feature Extraction Server (FES) installed on the Amazon cloud. The characteristics of the test

machine are as follows:


Table 19: Performance testing environment

Machine | CPU | RAM | # workers

Amazon instance | Intel Xeon [email protected] | 16 GB | 4

We created a Java client that sends 20 feature extraction requests to the FES in a single burst. To be

clear, the 20 requests are sent sequentially, one after the other. Sending all 20 requests and their

respective data to the FES takes on average 12 seconds from start to finish.

We decided to run 4 Workers on the machine, which means that it can process 4 requests

simultaneously and take advantage of the multi-core processor.

We chose 20 simultaneous requests because we expect to have 20 vests in the project

and we want to simulate the unlikely scenario of all 20 vests sending their data at the same time.

Each request contains 5 minutes' worth of vital signs data (1-lead ECG, breathing rate, activity and SpO2).

We measure how much time it takes to process all 20 requests, as well as the mean time that

each request takes to be processed by a Worker.

4.3.2 Results and discussion

The performance of the machine learning algorithms can be analyzed in two ways: time performance

and classification accuracy.

Regarding time performance, processing a 5-minute chunk of ECG, Activity, SpO2 and Tachypnea data

takes, on average, 37 seconds in the cloud infrastructure, which is well below the duration of the

signals. In addition, in a worst-case scenario in which the signals from all the 20 vests

foreseen in all the WELCOME pilots arrive at the same time, their processing would take, on average,

just over 3 minutes (see Table 20), which we believe is acceptable.

We performed 5 runs of the testing scenario. The results are the following:

Table 20: Performance results

# of simultaneous requests | Total time spent | Time per request (minimum - maximum)

20 | 183 seconds | 34-39 seconds

Average | 186.4 seconds | 36-38 seconds

Taking into account that each request carries 5 minutes of vital signs data, we find the processing

time for 20 requests quite satisfactory: all of them are processed in around 3 minutes, which is well

below the 5-minute mark. Each individual request takes around 36-37 seconds to be

processed on average. Additionally, these tests show that the processing time is rather consistent across

multiple executions.

Finally, note that we are using 4 Workers on an 8-core processor, which means that in

theory we can increase the number of Workers. For instance, even with a minor increase from 4 to 6

Workers we expect some performance gain in processing the same 20 requests, since 4

Workers are not taking full advantage of the 8-core processor. This has not been tested yet, but we

plan to do so soon.
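The measured totals are consistent with a simple model in which the Workers process the queued requests in waves. The sketch below is only a back-of-the-envelope estimate: it assumes that every request takes the same time and that the per-request time does not change when more Workers share the machine, which has not been verified. The class and method names are hypothetical.

```java
/** Hypothetical back-of-the-envelope estimate of the total FES processing time. */
final class FesThroughputEstimate {

    /** Estimated seconds to process all requests when `workers` requests run in parallel per wave. */
    static double estimateTotalSeconds(int requests, int workers, double secondsPerRequest) {
        int waves = (requests + workers - 1) / workers; // ceiling of requests / workers
        return waves * secondsPerRequest;
    }
}
```

With 20 requests, 4 Workers and 37 seconds per request the estimate is 5 waves x 37 s = 185 s, close to the measured average of 186.4 s. Under the same assumptions, 6 Workers would need 4 waves, that is 148 s; the real gain may be smaller if the Workers compete for CPU or memory.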

As for classification accuracy, our algorithms are, in general, quite robust, particularly regarding ECG

processing (over 95% sensitivity and specificity in most tasks). Regarding the classification of crackles

and wheezes, current evaluation metrics are around 80%, which is on par with the state of the art.

However, standard deviation is high (around 20%) and may be explained by the small size of the dataset,

the variability of the adventitious sounds between patients and the use of the leave-one-out (volunteer)

cross-validation approach to test the performance of the detectors. To tackle this issue, we are

presently extending our dataset so that our algorithms are trained with data with higher heterogeneity,

which we hope will improve their robustness. In addition, prior to classification, the quality of audio

signals is assessed. This step excludes low quality signals from further processing, which also improves

the robustness of our approach.


5 Medical Knowledge and decision support

5.1 Decision Support System

In this section, the functionality and implementation of the WELCOME DSS is described. After an

overview of the DSS’s functionality, its main components, the information flow and two usage

scenarios depicting the DSS’s main impact on the end user are presented.

5.1.1 DSS Functionality Overview

In this section we briefly present what type of information the WELCOME DSS employs as input and

what type of output is made dynamically available to the HCP users of the WELCOME system.

DSS inputs:

• Patient hub data (bio signals, external medical devices data)

• Clinical exam data, for current clinical status

• Medication list, for known adverse effects

• Questionnaire answers and scores, for symptoms, depression, etc.

• Feature extraction output (Clinical findings device assertables, e.g. coughing, tachycardia etc.)

Rules:

• Rules identifying distinct abnormalities according to known and medically accepted thresholds, not necessarily pointing to a specific ‘diagnosed’ situation if not combined,

  o “if coughing detected then abnormal state”

  o “if body temperature > 37.9 Celsius then abnormal state”

• Complex Comorbidity rules from experts and medical evidence

• Combinations of conditions pointing at specific problems/causalities

  o “if activity decreasing and depression scale high, and no other vital sign change, then potential depression”

Output:

• Abnormalities detected based on the provided information

  o Aggregation of daily distinct abnormality types

  o List of daily abnormalities

• Warnings for specific states based on single or combined conditions

  o Adverse drug event

  o Disease deterioration

    ▪ Exacerbation

    ▪ Pulmonary oedema

    ▪ Depression

    ▪ Pulmonary embolism

    ▪ Hypoglycemia

    ▪ Hyperglycemia

    ▪ Diabetes deregulation

    ▪ Pneumonia

    ▪ Cardiac arrhythmias

Output format:

• Text message

  o Rule specific content

  o Certainty where applicable, as derived from the number of “optional” conditions met, i.e. related conditions that are true on top of the mandatory ones, adding more confidence to the detection

  o Link to facts triggering rule, that can be accessed for further investigation of the situation (traceability and justification)
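The relation between mandatory conditions, optional conditions and the reported certainty can be sketched as follows. The sketch assumes that each condition has already been evaluated to true or false; in the actual DSS the rules are evaluated as SPIN rules, so the class and method names below are purely hypothetical.

```java
/** Hypothetical sketch of how a rule outcome and its certainty text could be derived. */
final class RuleOutcomeSketch {

    /**
     * Returns null when at least one mandatory condition is false (the rule does not fire).
     * Otherwise returns the certainty text built from the number of optional conditions met.
     */
    static String evaluate(boolean[] mandatory, boolean[] optional) {
        for (boolean condition : mandatory) {
            if (!condition) {
                return null;
            }
        }
        int met = 0;
        for (boolean condition : optional) {
            if (condition) {
                met++;
            }
        }
        return "Rule's optional conditions met: " + met + " out of " + optional.length;
    }
}
```

The optional conditions never prevent a rule from firing; they only raise the confidence reported to the HCP.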

5.1.2 Main components

Much of the data processing needed for Decision Support in the WELCOME system takes place in

distinct modules (Feature Extraction Server, Calculations Handler of the OA etc.) as outlined in Figure

9 (reproduced from D3.2). Although the data produced by these modules can be thought of as part of the

entire Decision Support process, for the purposes of this section DSS refers to the Execution Engine

node of Figure 9, as the implementation of the other modules is analyzed in other parts of this

document.

[Figure 9 diagram. Recoverable labels: WELCOME Knowledge Base; Storage Server; DSS Execution Engine; Feature Extraction; Calculations Handler; Enhanced KB; Rules; Standard Thresholds; Personalized Thresholds; Mandatory conditions; Optional conditions; Identified rule outcome; Biosignal data; Group of domain Experts (Pulmonologists, Cardiologists)]

Figure 9: WELCOME DSS: Knowledge base enhancement and knowledge flow


As outlined in the design, the DSS implementation is based on the SPIN framework, reusing software

libraries developed for other modules of the Communication system. The DSS is packaged as a separate

Java web application in order to assure and facilitate independence, scalability and maintainability. Its

implementation is similar to the component stack outlined for the Communication API in section 2.1

and shown below.

Figure 10: Core DSS software stack

The main components of the DSS module are:

• Communication Handler: This component is responsible for all Input and Output to the DSS. It

handles requests by the Orchestrator Agent (OA) to initiate DSS procedures (via an HTTP REST

endpoint), and communicates with the storage engine (reusing the SPARQL and File Storage Clients

developed for the Cloud API) to retrieve rules from the rule ontology, resources and files relevant to

the DSS procedures, and store back the DSS output.

• Rules Evaluation: This component uses the SPIN API to evaluate the defined rules.

• Storage: A number of different needs can be addressed by adding a storage component to the

DSS. First, it can provide a way to keep temporal information on rules' evaluation and firing. This can

be used to store intermediate rules' results (e.g. rules w33-w34 and w42-w43 in D3.2) needed by other

rules in the ruleset. Furthermore, it enables a mechanism to control the rate of alerts' generation by

defining a strategy of meta-rules that specify how to handle the repeated firing of the same rule.

Second, it enables keeping a detailed provenance log of when, how and why each rule fired. This can

help in performing a retrospective analysis of the initial ruleset, in order to evaluate, validate and fine-

tune individual rules. All of the above needs, whether for persistent storage as is the second case, or

short-term non-persistent storage as is the first case, can be met by using an RDF Triple Store. An

embedded instance of the Apache Jena Fuseki server is launched as needed to serve as a storage engine

for the DSS. Communication with this local RDF Store is achieved reusing the SPARQL module developed

for the Communication web application.

5.1.3 DSS information flow

Typically, data arrive every five minutes, when vital sign data are transmitted from the patient hub to the

cloud, or when the vest starts charging and raw data (including multi-lead ECG, EIT signals and lung

sounds) are transmitted. The DSS process is triggered as part of the OA’s Main Information Handling

Workflow, described in detail in D3.2.


Figure 11: WELCOME Clinical DSS Process

As shown in Figure 11, a typical information flow in the overall DSS process involves the OA’s

Calculations Handler (CH) calculating the baseline values for each patient. Baseline values (like the average

number of baseline crackles per minute, daily percentage of resting time, heart rate average,

respiratory average and SpO2 average) are calculated as the average of the values recorded over the first

2-4 days and are stored in the Storage Engine (SE). Thus, they constitute personal baseline values,

and comparison to them suggests a deviation from the previously known steady state of the patient.

The algorithms used to calculate the various baselines are part of the Calculations Handler module and

as such are considered out of the scope of the current revision of this document; they are going to be

described in D5.2.
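Although the baseline algorithms themselves are deferred to D5.2, the idea of a personal baseline and of a deviation from it can be illustrated with a minimal sketch. It assumes that the baseline is the plain arithmetic mean of the initial recordings and that the deviation is expressed as a percentage of the baseline; the class and method names are hypothetical and do not describe the Calculations Handler implementation.

```java
/** Hypothetical sketch of a personal baseline and the percentage deviation from it. */
final class BaselineSketch {

    /** Baseline as the arithmetic mean of the values recorded during the initial days. */
    static double baseline(double[] initialValues) {
        double sum = 0;
        for (double value : initialValues) {
            sum += value;
        }
        return sum / initialValues.length;
    }

    /** Signed deviation of the current value from the baseline, in percent of the baseline. */
    static double deviationPercent(double current, double baseline) {
        return 100.0 * (current - baseline) / baseline;
    }
}
```

For example, a resting heart rate baseline of 72 bpm and a current average of 90 bpm give a deviation of +25%, which would satisfy a condition such as "Deviation from baseline HR > 15%".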

As data arrive at the cloud, the Communication API saves them to the SE and notifies the OA. The OA executes the

main information handling workflow and, as part of it, calls the CH in order to store the last two hours'

average values of the respective measurement.

After that, the OA calls the FES in order to extract high-level features from low-level signal data. For example,

the FES calculates the number of crackle events per minute, the percentage of the activity in resting or

non-resting status and the SpO2 average, and sends the results to the OA.

The OA stores the received results in the SE and initiates the DSS process.

// Start every feature extraction process and wait for its results.
for (FeatureExtractionProcess process : processes.values()) {
    logger.info("Starting feature extraction process " + process.getUuid().toString()
            + " of type " + process.getClass().getSimpleName());
    try {
        // Register the process under its UUID in the web hooks callback manager.
        WebhooksCallbackServerManager.activeProcesses.put(process.getUuid().toString(), process);
        if (sendHttpRequestToEndpoint(process)) {
            process.setStatus(FeatureExtractionProcessStatus.Waiting);
            waitForResults(process);
        } else {
            throw new Exception("The feature extraction process endpoint "
                    + process.getEndpointURL()
                    + " has not responded correctly for process with ID "
                    + process.getUuid().toString()
                    + ". It has not thrown an exception but it has not accepted the request either.");
        }
    } catch (Exception ex) {
        // Mark the failed process as aborted and propagate the error to the caller.
        process.setStatus(FeatureExtractionProcessStatus.Aborted);
        throw ex;
    }
}

Figure 12: Code snippet regarding the handling of the feature extraction processes. Web hooks logic is used as described in D3.2

// Collect the DSS processes of the current analysis context, keyed by UUID.
for (DecisionSupportSystemProcess process :
        observationAnalysisContext.getDecisionSupportSystemProcesses()) {
    process.setStatus(DecisionSupportSystemProcessStatus.InProcess);
    processes.put(process.getUuid().toString(), process);
}
try {
    logger.info("Starting DSS processes execution (DSS Handler module has been called)");
    logger.trace("Processes number:"
            + observationAnalysisContext.getDecisionSupportSystemProcesses().size());
    // Run the DSS processes one after the other and keep the result of each.
    for (DecisionSupportSystemProcess process : processes.values()) {
        logger.info("Starting DSS process " + process.getUuid().toString()
                + " of type " + process.getClass().getSimpleName());
        try {
            process.setStatus(DecisionSupportSystemProcessStatus.Waiting);
            process.run();
            process.setStatus(DecisionSupportSystemProcessStatus.Completed);
            this.results.put(process, process.getResult());
        } catch (Exception ex) {
            process.setStatus(DecisionSupportSystemProcessStatus.Aborted);
            throw ex;
        }
    }
    saveResultsToSEM();
    return DecisionSupportSystemProcessStatus.Completed;
} catch (Exception e) {
    // A failure in any process marks all DSS processes as aborted before rethrowing.
    for (DecisionSupportSystemProcess process : processes.values()) {
        process.setStatus(DecisionSupportSystemProcessStatus.Aborted);
    }
    throw e;
}

Figure 13: OA calling the various DSS processes and handling possible exceptions


The DSS’s Communication Handler is notified by the OA to initiate the evaluation of the SPIN rule set

through the Rules Evaluation component. Typically, each rule defined in D3.2 Appendix A

corresponds to one SPIN rule.

Figure 14: SPIN function used to identify sputum increase based on questionnaire

Figure 15: SPIN function used to identify purulent sputum

Figure 14 and Figure 15 show two SPIN functions that respectively identify increase in sputum

production and purulent sputum, depending on a patient's answers to the respective questionnaires.

These functions are used as SPARQL functions inside SPIN rules.

The Communication Handler component retrieves the data needed as input facts for the rules’

evaluation and forwards them to the Rules Evaluation (RE) component. RE produces a message out of


each fired SPIN rule and forwards it to the Communication Handler to be stored in the SE. It can then

be retrieved by the respective web applications in order to inform the medical personnel.

Meta-rules and Temporally Aggregated Abnormality Handling

In order to prevent over-alerting and improve robustness, a meta-rule with temporal thresholds is

applied to any given rule. Thus, although a specific rule based on the information from the recorded

signals can fire at any time, a corresponding alert is generated by the DSS only when the rule-specific

temporal threshold is exceeded (e.g. conditions of rule w42 must be met continuously for more

than 30 minutes).

Rule w42 (copied from D3.1) is as follows:

IF

Current Diseases contains E08*-E13*

Deviation from baseline HR > 15%

Temperature > 38 C

Deviation from baseline Blood Glucose Level > 30%

ABS(Deviation from baseline SpO2) < 3

Deviation from baseline respRate resting < 15%

ABS(Deviation from baseline FEV1 (EIT)) < 3%

ABS(Deviation from baseline IVC (EIT)) < 5%

THEN

Alert Diabetologist / Possible diabetes deregulation

Alert patient for more frequent temperature measurements / possible infection

Alert patient for more frequent Blood Glucose measurements

This rule refers to the case where an increase of heart rate, temperature and glucose, without

pneumonology-related findings, in a patient with COPD and diabetes comorbidity suggests a

deregulation of an origin other than COPD, i.e. potential diabetes deregulation or infection. This

rule follows the logic of medical (differential) diagnosis.
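For illustration, the IF part of rule w42 can be written as a single boolean predicate. The sketch assumes that all listed conditions are combined with a logical AND and that the deviations from baseline are already available as numbers; in the actual system the rule is expressed as a SPIN rule, so the class, method and parameter names below are hypothetical.

```java
/** Hypothetical sketch of the conditions of rule w42 as a plain Java predicate. */
final class RuleW42Sketch {

    /** True when at least one ICD-10 code falls in the E08-E13 range named by the rule. */
    static boolean hasDiabetesCode(String[] icd10Codes) {
        for (String code : icd10Codes) {
            if (code.matches("E(0[89]|1[0-3]).*")) {
                return true;
            }
        }
        return false;
    }

    /** Assumes that all conditions of the rule must hold at the same time (logical AND). */
    static boolean conditionsMet(String[] icd10Codes, double hrDeviationPct, double temperatureC,
            double glucoseDeviationPct, double spo2Deviation, double respRateDeviationPct,
            double fev1DeviationPct, double ivcDeviationPct) {
        return hasDiabetesCode(icd10Codes)
                && hrDeviationPct > 15
                && temperatureC > 38
                && glucoseDeviationPct > 30
                && Math.abs(spo2Deviation) < 3
                && respRateDeviationPct < 15
                && Math.abs(fev1DeviationPct) < 3
                && Math.abs(ivcDeviationPct) < 5;
    }
}
```

The first four conditions capture the heart rate, temperature and glucose increase in a diabetic patient, while the last four express the absence of respiratory findings.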

In this case, resting heart rate, SpO2, respiration rate, FEV1 and IVC have to meet the criteria for more

than a single segment (when available), so that the rule fires in enough segments to meet the

temporal aggregation threshold. A single alert for “possible diabetes deregulation” will then be

generated. In addition, this increases robustness by avoiding alerts caused by erroneous or noisy

measurements, for example wrong heart rate readings for some segments.
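The temporal meta-rule can be sketched as a check on the longest unbroken run of segments in which the underlying rule fired. The sketch assumes fixed-length segments (for example the 5-minute vital sign transmissions) and a rule-specific threshold in minutes; the class and method names are hypothetical.

```java
/** Hypothetical sketch of the temporal aggregation meta-rule. */
final class TemporalMetaRuleSketch {

    /**
     * segmentFired[i] tells whether the rule's conditions held in the i-th consecutive segment.
     * Returns true when an unbroken run of firing segments covers more than thresholdMinutes.
     */
    static boolean shouldAlert(boolean[] segmentFired, int segmentMinutes, int thresholdMinutes) {
        int run = 0;
        for (boolean fired : segmentFired) {
            run = fired ? run + 1 : 0; // a segment that does not fire resets the run
            if (run * segmentMinutes > thresholdMinutes) {
                return true;
            }
        }
        return false;
    }
}
```

With 5-minute segments and a 30-minute threshold, seven consecutive firing segments are needed before an alert is raised, so a few isolated noisy segments do not produce an alert.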

Furthermore, regarding simple abnormality rules (e.g. D3.2 App. A Table 16, and answers to simple

questions) an aggregation strategy is applied, that generates a periodic (daily) report. This report

contains the types of abnormalities observed during the day and links to their respective instances. The

DSS process that generates this report is triggered once per day, by the OA.

For example, High/Low BP, Tachypnoea, Tachy/Bradycardia, High/Low Blood Sugar and High/Low

Temperature, which can be detected following medically acceptable thresholds and algorithms,

constitute detected abnormalities. The same holds for questions with a known and structured negative

answer (e.g. increase of sputum or dyspnea). This is also applicable to new questions added to the

system by the doctor, provided that they have structured positive/negative answers.

The aggregation of abnormalities, although it does not point at a specific ‘diagnosis’, is an attempt to

offer a quantified view of a patient’s status, and a longitudinal view of the patient’s overall stability or

improvement/deterioration. In the future, this quantified information can be the basis for discovering

new rules from the patient data (e.g. by finding the points where slopes appear and correlating them

with previous changes in specific biosignals).
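The daily aggregation can be illustrated with a minimal sketch that counts how often each abnormality type was observed during one day. It assumes that every detected abnormality is represented by its type name; the class and method names are hypothetical, and the links from each type to its instances are omitted for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the daily abnormality aggregation. */
final class DailyAbnormalityReportSketch {

    /** Maps each abnormality type observed during the day to its number of occurrences. */
    static Map<String, Integer> countByType(List<String> observedTypes) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String type : observedTypes) {
            counts.merge(type, 1, Integer::sum);
        }
        return counts;
    }
}
```

The size of the returned map is the number of distinct abnormality types of the day, which is the kind of quantity that can be plotted over consecutive days to show a trend.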

5.1.4 Usage scenarios

The first scenario is an example of individual DSS rules firing. The rules evaluated in this context of

execution are the ones described in Appendix A of D3.2. The second scenario demonstrates the

value of the abnormalities' aggregation process and its impact on the end user.

Scenario 1 - Individual rules processing

Alice, a 67-year-old COPD patient who also suffers from heart failure, follows her typical daily routine

wearing the WELCOME vest. The vest constantly measures HR and SpO2 and transmits them periodically

to the WELCOME cloud as part of its vital signs transmission process, along with other data including

estimation of activity and single-lead ECG. Daily, all the data, including EIT signals and lung sound

recordings, are also transmitted, typically during the night when Alice is sleeping and the vest is charging.

The WELCOME cloud infrastructure compares the data received to the patient’s baseline and identifies:

• Answers to questions asked with the respective patient hub application

  o Did your sputum increase today == true

  o What is the color of your sputum == purulent

  o mMRC score increase over previous assessment (+2)

• 18% increase of crackles

• Over the last two days the total resting activity is 130% on average compared to baseline measurements

• 20% increase of resting heart rate

• 18% increase of resting respiratory rate

• 3% decrease of SpO2

The above conditions trigger a specific rule (namely rule w38 described in Appendix A of D3.2).

Provided the conditions are met continuously for more than 30 minutes (meta-rule threshold for w38), a

message is formed and communicated to the doctor through the respective web application (developed

in WP6), mainly composed of the basic triggered rule parts:

“There are indications of possible acute exacerbation of COPD. Possible indications coming from

WELCOME system:

• Answers to questions:

  o Did your sputum increase today == true

  o What is the color of your sputum == purulent

• The average of baseline crackles has increased 18%

• Increase in mMRC score: +2

• For the last 2 days the total resting activity has been 130%

• Heart rate average increased 20%

• Respiratory rate average increased 18%

• SpO2 average decreased 3%

Rule’s optional conditions met: 4 out of 6”

The DSS message highlights the facts that triggered a specific rule, as “indications”. Furthermore, since

the system’s rules also contain “optional” parts, which add to the overall rule’s confidence, only the

optional parts that contribute to the rule firing are contained in the message, in order to facilitate the

evaluation of the message by the doctor. In this example, the rule’s extra confidence is 4/6, as 4 out of

6 optional conditions of the rule are actually valid according to the collected data. At the moment, it is

up to the doctor to make use of this ‘certainty’ information.

Scenario 2 - Aggregated abnormalities

For the last couple of days, Bob has felt that his health is deteriorating, but his symptoms are minor (minor

headache, minor increase of coughing, light dyspnea etc.) and therefore he prefers to rest and does not

seek medical help. Bob has followed his typical daily routine and gone to bed without noticing a

minor increase in his body temperature. This combination of conditions did not fire any of the complex

rules that point at specific diagnoses (e.g. COPD exacerbation, according to the known medical logic).

The WELCOME cloud infrastructure formulates the daily report regarding aggregated abnormalities

observed during the day. The WELCOME applications show a graph representation of the total number

of abnormalities during the last few days. In the case of Bob, there is an increase in the aggregated

abnormalities, suggesting a possible start of deviation from steady state. This increased number, easily

identified on the UI, allows the HCP to quickly identify a possible overall health deterioration. The UI

allows navigating from the aggregate view to the source abnormalities that generated the aggregate

DSS report, further facilitating the HCP in investigating the patient's condition. For example the doctor

can see the biodata values that constituted these abnormalities, and decide to initiate further

communication with the patient, or a closer monitoring for the next days, to help avoid a further

deterioration.

6 Risk Management

The following table presents the status of risks that were identified in the system design process (Del.3.1) along with new risks that emerge from the development at cloud

level work package tasks. For the previously identified risks the results and residual risk are documented.

Table 21: Risk management table. Color coding (green : minimal, yellow: partially handled, red: action needed)

System Risk Consequences Severity Mitigation Result Residual risk Color

coding

FES Algorithms Not enough or

adequate data for

training

Low accuracy

of feature

extraction

algorithms

H Data collected at various

locations in Coimbra and

Greece, with

multimorbid patients, to

generate adequate

training data

A dataset collected from 30

volunteers was acquired.

Based on those acquisitions a

dataset with 12 volunteers

was used to test the crackles

and wheeze detection. The

respiratory sounds of six

patients contain wheezes or

wheezes and crackles. The

healthy subjects exhibit

normal respiratory sounds,

while another set of three

patients had only crackle

manifestations.

A dataset with at least 30

patients (with COPD and with

manifestation of wheezes

None


and/or crackles) and 15

healthy subjects should be

acquired using the vest.

FES Algorithms Bad data

acquisition in

training data

Low accuracy M Perform careful data

acquisition and

annotation

All data were annotated

by experts and double-

checked.

None

FES Algorithms Cough data

acquisition not

representative

L Acquire data from COPD

patients

Data were acquired from

COPD patients

Data were

from

controlled

environment

and maybe

results differ

from data

acquired at

home

FES Algorithms Insufficient

algorithm accuracy,

e.g., confusion

between different

lung sounds (and

M Train the classification

algorithm with a

sufficiently broad range

of sound classes Multi-

channel context

For the crackles event

detector we measure a

detectability value and a

positive predictive value

equal to 84 ± 22 % and

78 ± 17%, For the

Results may

differ from

data acquired

at home


background

sounds)

exploit source

separation

False positives, true

negatives

wheezes event detector

we measure a

detectability value and a

positive predictive value

equal to 79 ± 28 % and

90 ± 10%, respectively

FES Algorithms Noisy data in

testing

Low accuracy H Algorithms developed

with noise robustness

Algorithms tested with

realistic data before use

Rules are based not only

on signals but also on

reported data and simple

signs

Under development Partly resolved

by the temporal

aggregation

meta-rule

FES Algorithms High computational

cost

Unfeasible

processing

time

H Evaluate the trade-off between algorithm complexity and accuracy; Eliminate unnecessary extracted audio features.

Performance testing in order to estimate the order of the execution

In the plenary meeting in Coimbra, the FES development team presented a quantified performance analysis of each algorithm. It seems that the performance of every algorithm

The remaining

risk has to do

with feature

extraction

algorithms

that have not

yet been

implemented


time needed for each feature extraction processing algorithm.

Another mitigation strategy would be to buy more cloud resources in order to decrease delay.

implemented until now is acceptable. Testing ongoing

as they have

not been

tested for

performance.

DSS Rules inadequately

expressed

Loss of

alerting and

difficulty in

maintenance

M Testing with simulated

data before actual use.

Both simple and complex

rules present, simple

rules can be more

robust. Initial data also

available, besides rules

Rule set revised to include

simple along with more

complex rules. Rules

reviewed by experts.

None

DSS Frequent updating

of rules

This might

(under

conditions)

lead to some

down time.

L Rule-base external to

DSS rules’ execution

engine

Minimal downtime

The DSS is designed to

accept rules as part of the

DSS ontology and it is

external to the engine.

Downtime to

reload rule

knowledge

base may have

minor impact

in case the OA

requests DSS

at this time.


Data model Requirements for

entities that are in

conflict/overlapping

with existing ones.

A complex

model which

could be very

hard to

maintain.

L These issues are matter

of a balance. We will try

to take advantage of

ontology engineering

best practices in order to

avoid pitfalls. E.g. new

entities will be dropped

and we will proceed with

existing ones and local

entities where possible

Until now, the feedback

from the users of the

model is very positive. It

seems that the model

covers most of their needs.

The data

model has

reached a

steady state

and needs

minimal

update

Cloud Storage Unstable/complex

model, needs for

frequent updates

Inconsistency

/ System

often down

for upgrade

M Early tests for initial end

to end stabilization

Software lifecycle

management

Early test done

None

Communication

API

Frequent Loss of

communication

Reduced

speed of

response

H Data Compression

Prioritization policy

Data compression is

supported by the Cloud

service. Prioritisation has

been implemented as

regards real time and raw

data transmission

(vest/patient hub)

None


External

Sources

Connector

Highly complex API

regarding

environmental data

retrieval

Delays in

development

or not able to

retrieve

environmental

data

M We could always search

for another source of

environmental data

source or retrieve data

without the use of the

formal API (scrapping).

We could also

communicate with the

provider of the data in

order to find a more

viable solution

We confirmed that our

environmental data source

provides data for

operational usage on a

near real time through

FTP.

As our data

provider is the

outcome of a

EC’s project,

we cannot

guarantee the

stability of the

service until it

has been

thoroughly

tested.

However, we

consider that

even in the

worst case the

weather

related

information

data sources

would be

adequate for

our rules set

and provide

the needed

clinical value.


Orchestrator

Agent

Performance issues Delays in the

execution of

the main

workflow of

information of

the OA would

cause delays

in the overall

system.

L We use jBPM as a

workflow engine which is

widely used and

supported by a major

Java EE vendor (Red Hat).

The lack of real

multithreading has been

overcome with the use

of custom java code

which allows the

multiple run of jBPM

workflows.

We tested the multiple

parallel run of jBPM

workflows using unit tests.

Any performance delays

have to do with

dependencies, i.e.

modules that the OA

communicates with

(calculations handler, DSS,

FES)..

None

EIT image reconstruction | The procedure is highly risky as it heavily depends on the placement of the electrodes and their electrical output. Since we do not yet have real data from a real vest, there is high uncertainty regarding our ability to reconstruct useful EIT images. | Bad quality of the produced EIT images, which would consequently lead to bad feature extraction results based on EIT images. | M | We already test the most widely accepted EIT reconstruction algorithms against our selected electrode topology. We use simulation models in order to compensate for the lack of real data. | In a few months we will have an evidence-based selection of the best EIT reconstruction algorithm for our electrode placement. | We have to verify that our selected EIT reconstruction algorithm works well with the data produced by the real vest.

WELCOME cloud as a whole | Request overload | Reduced speed of response | M | Early performance tests to define upper limits. A cloud policy that allows for increasing resources when necessary. | Performance tests at storage server and processing server | Integrated performance

WELCOME cloud as a whole | Unable to acquire suitable PaaS infrastructure that matches our needs. | We will set up the components using IaaS cloud resources. This means that we would have to manually maintain the required components (application server, triple store etc.). | L | We have confirmed that Java EE PaaS containers are available through the major commercial cloud service providers. We also confirmed that an out-of-the-box Virtuoso (RDF triple store) can be hosted on the cloud. | We are currently investigating the options provided by the major cloud service providers. While we have made progress, we have not yet finalized our cloud infrastructure choices and therefore have not yet mitigated this risk. | In the worst case, we will have to buy IaaS services (virtual machines) and set up and maintain the rest of the infrastructure manually.


WELCOME cloud as a whole | Lack of a robust and automatic integration and publishing mechanism on the production cloud environment | The lack of such an automated mechanism could significantly complicate the procedure of publishing updates, bug fixes etc. Publishing the system on the production environment manually can lead to errors and significant downtime. | M | We have already discussed such issues and agreed on using a common tool which would allow us to integrate, test and publish our code (GitLab). | We have not yet used the GitLab tool. | In practice, it is very difficult to harmonize remote working teams in following the same development practices, tools etc. This risk has not yet been mitigated. It will be further addressed and mitigated in WP7.

7 Conclusions

This deliverable describes the progress of the development so far regarding the cloud-based management and processing of the data streams, towards detecting changes in patient status and supporting decisions within an integrated care context for patients with co-morbidities. The work described in the previous sections on the development of the WELCOME cloud platform is based on the design rationale and decisions described in deliverables D3.1 and D3.2 of WP3, which is responsible for the design. As presented, the cloud-based modules, namely the Communications API, the Orchestrator Agent, the Feature Extraction Server and the Decision Support System, are mature enough for their core functionality to be presented in a demo.

7.1 Future steps

The cloud system has been developed along its major axes to a degree that allows for integrated testing; however, it is not yet finalized. The next development steps, to be completed in the next year, are:

Migration to cloud infrastructure
o The components of the cloud system that were developed and tested locally must be migrated to the Cloud infrastructure.

Updates on data model
o Minor additions to the entities are expected. Questionnaire translations in Greek, Dutch and German will also be added.

Development of unimplemented feature extraction algorithms
o Some algorithms are not yet finalized and are to be completed in the next period.

Integration of algorithms to the FES
o Locally developed and tested algorithms will be integrated to the FES.

Development of OA independent blocks and integration of DSS engine in an orchestrator agent workflow
o The sub-components of the main workflow (e.g. Calculations Handler) are to be implemented.

Implementation of security requirements
o The authentication and authorization services offered by the organization hub will be integrated with and consumed by the API/Storage server.

DSS complete development and testing
o The implementation of the DSS component will begin in this period.

External sources connector implementation and integration

EIT Reconstruction algorithms assessment
o The evaluation of the current state-of-the-art EIT reconstruction algorithms based on the WELCOME vest sensor topology will take place in this period.

EIT Reconstruction
o The algorithm of EIT reconstruction from the EIT signal will be implemented as part of the FES.

7.2 Open issues

In terms of audio quality assessment, a more thorough approach is in progress. Regarding the validation of the lung sound processing algorithms, in particular wheeze and crackle detection, a more substantial amount of data is necessary. Hence, the data collection study will continue, and the algorithms will continue to be improved and validated using the incoming collected and annotated data.

Additionally, all algorithms must be validated in a real scenario, i.e., in a real situation involving the vest.


8 Bibliography

A.B.M.A. Hossain and M.A. Haque, (2013), “Analysis of Noise Sensitivity of Different ECG Detection

Algorithms,” vol. 3, no. 3, 2013.

I. Jekova, A. Cansell and I. Dotsinsky, (2001), “Noise sensitivity of three surface ECG fibrillation detection

algorithms.,” Physiological measurement, vol. 22, no. 2. pp. 287–297, 2001.

M. Rahman, R. Shaik, and D. Reddy, (2009), “Noise Cancellation in ECG Signals using Computationally

Simplified Adaptive Filtering Techniques: Application to Biotelemetry,” Signal Process. An …, vol. 3,

no. 5, pp. 120–131, 2009.

R. Sivakumar, R. Tamilselvi and S. Abinaya, (2012) “Noise Analysis & QRS Detection in ECG Signals,”

2012 Int. Conf. Comput. Technol. Sci. (ICCTS 2012), vol. 47, no. Iccts, pp. 141–146, 2012.

C. So-In, C. Phaudphut and K. Rujirakul, (2015), “Real-Time ECG Noise Reduction with QRS Complex

Detection for Mobile Health Services,” Arab. J. Sci. Eng., 2015.

H. Yoon, H. Kim, S. Kwon and K. Park, (2013) “An Automated Motion Artifact Removal Algorithm in

Electrocardiogram Based on Independent Component Analysis,” Fifth Int. Conf. eHealth,

Telemedicine, Soc. Med., no. c, pp. 15–20, 2013.

I. Jekova, V. Krasteva, I. Christov and R. Abächerli, (2012) “Threshold-based system for noise detection

in multilead ECG recordings,” Physiol. Meas., vol. 33, no. 9, pp. 1463–1477, 2012.

R. Kher, D. Vala and T. Pawar, (2011), “Detection of Low-pass Noise in ECG Signals,” no. May, pp. 3–6,

2011.

Y. Kishimoto, Y. Kutsuna and K. Oguri, (2007), “Detecting motion artifact ECG noise during sleeping by

means of a tri-axis accelerometer,” Annu. Int. Conf. IEEE Eng. Med. Biol. - Proc., pp. 2669–2672,

2007.

J. Lee, D.D. McManus, S. Merchant and K.H. Chon, (2012), “Automatic motion and noise artifact

detection in holter ECG data using empirical mode decomposition and statistical approaches,” IEEE

Trans. Biomed. Eng., vol. 59, no. 6, pp. 1499–1506, 2012.

A. Mincholé, L. Sörnmo and P. Laguna, (2011). “ECG-based detection of body position changes using a

Laplacian noise model,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, vol. 14, pp. 6931–

6934, 2011.

P. Raphisak, S.C. Schuckers and A. de Jongh Curry, (2004), “An algorithm for EMG noise detection in

large ECG data,” Comput. Cardiol. 2004, vol. 1, no. 1, pp. 369–372, 2004.

J. Pan and W.J. Tompkins, (1985), “A real-time QRS detection algorithm.,” IEEE Trans. Biomed. Eng., vol.

32, no. 3, pp. 230–236, 1985.

C. Ahlström, (2008), "Nonlinear phonocardiographic Signal Processing," Institutionen för medicinsk

teknik, 2008.

Y. Sun, K. Chan, and S. Krishnan, (2005), "Characteristic wave detection in ECG signal using

morphological transform," BMC Cardiovascular Disorders, vol. 5, pp. 1-7, 2005.


C. Li, C. Zheng, and C. Tai, (1995), "Detection of ECG characteristic points using wavelet transforms,"

Biomedical Engineering, IEEE Transactions on, vol. 42, pp. 21-28, 1995.

J. P. Martinez, S. Olmos, and P. Laguna, (2000), "Evaluation of a wavelet-based ECG waveform detector

on the QT database," in Computers in Cardiology 2000, 2000, pp. 81-84.

Z.-E. Hadj Slimane and A. Naït-Ali, (2010), "QRS complex detection using Empirical Mode Decomposition," Digital Signal Processing, vol. 20, pp. 1221-1228, 2010.

I. K. Daskalov and I. I. Christov, (1999), "Automatic detection of the electrocardiogram T-wave end," Medical & Biological Engineering & Computing, vol. 37, pp. 348-353, 1999.

R. Couceiro, P. Carvalho, J. Henriques and M. Antunes, (2008), “On the detection of premature

ventricular contractions”, IEEE EMBS, 2008.

M. Chikh, N. Gbelgacem and F. Reguig, (2003), “The use of artificial Neural networks to detect PVC

beats”, Lab. de Génie Biomédical. Dép. d’électronique, Univ. Abou Bekr Belkaïd, 2003.

L. Tian and J. Tompkins, (1997), “Time domain based algorithm for detection of ventricular fibrillation”, Proceedings of the 19th Int. Conference IEEE/EMBS, Oct 30-Nov 2, Chicago, USA, 1997.

I. Jekova, and V. Krasteva, (2004), “Real time detection of ventricular fibrillation and tachycardia”,

Physiol. Meas. 25, 1167–1178, 2004.

U. Kunzmann, G. Schochlin and A. Bolz, (2002), “Parameter extraction of ECG signals in real-time”.

Biomed Tech (Berl). 4, 2:875-8, 2002.

I. Jekova, G. Bortolan and I. Christov, (2004), “Pattern Recognition and Optimal Parameter Selection in

Premature Ventricular Contraction Classification”, IEEE Computers in Cardiology 2004; 31: 357-

360.

I. Christov, I. Jekova and G. Bortolan, (2006), “Premature ventricular contraction classification by Kth nearest neighbours rule”, Physiologic Measurements 2006; 24:123–130.

I. Christov and G. Bortolan, (2004), “Ranking of pattern recognition parameters for premature

ventricular contractions classification by neural networks”, Physiologic Measurements 2004; 25:

1281-1290.

M. Gertsch, (2004), The ECG: A two-step approach to diagnosis, Springer, 2004.

A. Wolf, (2004), Automatic Analysis of Electrocardiogram Signals using Neural networks, (in

Portuguese), PUC-Rio, Ms. Thesis, nº 0210429/CA2004, 2004.

S. Akselrod, M. Norymberg, I. Peled, E. Karabelnik, M.S. Green. (1987), “Computerised Analysis of ST

Segment Changes in Ambulatory Electrocardiograms”, Medical and Biological Engineering and

Computing, v. 25, p. 513-519, 1987.

A. Taddei, Distante G, Emdin M, Pisani P, Moody G B, Zeelenberg C and Marchesi C, (1992), The

European ST Database: standard for evaluating systems for the analysis of ST-T changes in

ambulatory electrocardiography Eur. Heart J. 13 1164–72, 1992.

L. Pang, I. Tchoudovski, A. Bolz, M. Braecklein, K. Egorouchkina and W. Kellermann, (2005), Real time

heart ischemia detection in the smart home care system 27th Annu. Int. Conf. Eng. Med. Biol. Soc.,

2005. IEEE-EMBS 2005.


R.C. Davis, F.D.R. Hobbs, J.E. Kenkre, A.K. Roalfe, R. Iles, G.Y.H. Lip and M.K. Davies, (2012), "Prevalence of atrial fibrillation in the general population and in high-risk groups: the ECHOES study," EP Europace, vol. 14, pp. 1553-1559, 2012.

G.B. Moody and R.G. Mark, (1983), "A new method for detecting atrial fibrillation using R-R intervals.,"

in Computers in Cardiology, 1983, pp. 227-230.

S. Cerutti, L.T. Mainardi, A. Porta and A.M. Bianchi, (1997), "Analysis of the dynamics of RR interval

series for the detection of atrial fibrillation episodes," in Computers in Cardiology 1997, 1997, pp.

77-80.

K. Tateno and L. Glass, (2000), "A method for detection of atrial fibrillation using RR intervals," in

Computers in Cardiology 2000, 2000, pp. 391-394.

L. Senhadji, F. Wang, A. Hernandez and G. Carrault, (2002), "Wavelets extrema representation for QRS-

T cancellation and P wave detection," in Computers in Cardiology, 2002, 2002, pp. 37-40.

C. Sanchez, J. Millet, J.J. Rieta, F. Castells, J. Rodenas, R. Ruiz-Granell et al., (2002), "Packet wavelet

decomposition: An approach for atrial activity extraction," in Computers in Cardiology, 2002, 2002,

pp. 33-36.

S. Shkurovich, A.V. Sahakian and S. Swiryn, (1998), "Detection of atrial activity from high-voltage leads

of implantable ventricular defibrillators using a cancellation technique," Biomedical Engineering,

IEEE Transactions on, vol. 45, pp. 229-234, 1998.

J.J. Rieta, F.Castells, C. Sanchez, V. Zarzoso and J. Millet, (2004), "Atrial activity extraction for atrial

fibrillation analysis using blind source separation," Biomedical Engineering, IEEE Transactions on,

vol. 51, pp. 1176-1186, 2004.

R. Couceiro, P. Carvalho, J. Henriques, M. Antunes, M. Harris, and J. Habetha, (2008), "Detection of

atrial fibrillation using model-based ECG analysis," in ICPR 2008. 19th International Conference on

Pattern Recognition, 2008., 2008, pp. 1-5.

J. Korpáš, M. Vrabec, J. Sadlonova, D. Salat, L.A. Debreczeni, (2003), "Analysis of the cough sound frequency in adults and children with bronchial asthma." Acta Physiologica Hungarica 90.1 (2003): 27-34.

J. Korpáš and T. Zoltán, (1979), "Cough and other respiratory reflexes." S. Karger, 1979.

American College of Chest Physicians. (1998), "Managing cough as a defense mechanism and as a

symptom." Chest 114.2 (1998): 133S-181S.

J. N. Evans and M.J. Jaeger, (1975), "Mechanical aspects of coughing." Lung 152.4 (1975): 253-257.

A.A. Abaza, J.B. Day, J.S. Reynolds, A.M. Mahmoud, W.T. Goldsmith, W.G. McKinney, E.L. Petsonk and D.G. Frazer, (2009), "Classification of voluntary cough sound and airflow patterns for detecting abnormal pulmonary function." Cough 5.8 (2009): 9284.

O. Lartillot and P. Toiviainen. (2007), "A Matlab toolbox for musical feature extraction from audio."

International Conference on Digital Audio Effects (2007).

M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I. H. Witten, (2009), "The WEKA data mining

software: an update." ACM SIGKDD explorations newsletter 11.1 (2009): 10-18.


P. Piirilä and A. R. A. Sovijärvi, (1995), “Crackles: recording, analysis and clinical significance,” Eur.

Respir. J., vol. 8, no. 12, Dec. 1995.

A. R. A. Sovijärvi, F. Dalmasso, J. Vanderschoot, L. P. Malmberg, and G. Righini, (2000), “Definition of

terms for applications of respiratory sounds,” Eur. Respir. Rev., vol. 10, no. 77, 2000.

X. Lu and M. Bahoura, (2008), “An integrated automated system for crackles extraction and

classification,” Biomed. Signal Process. Control, vol. 3, Jul. 2008.

M. Bahoura and X. Lu, (2006), “Separation of crackles from vesicular sounds using wavelets packet

transform,” in 2006 Proc. Int. Conf. on Acoustics, Speech, and Signal.

L. J. Hadjileontiadis, (2007), “Empirical Mode Decomposition and Fractal Dimension Filter,” IEEE Eng.

Med. Biol. Mag., Feb. 2007.

P. A. Mastorocostas and J. B. Theocharis, (2007), “A dynamic fuzzy neural filter for separation of

discontinuous adventitious sounds from vesicular sounds,” Comput. Biol. Med., vol. 37, Jan. 2007.

S. Charleston-Villalobos, G. Martinez-Hernandez, R. Gonzalez-Camarena, G. Chi-Lem, J. G. Carrillo, and

T. Aljama-Corrales, (2011), “Assessment of multichannel lung sounds parameterization for two-

class classification in interstitial lung disease patients,” Comput. Biol. Med., vol. 41, Jul. 2011.

S. A. Taplidou and L. J. Hadjileontiadis, (2007), “Wheeze detection based on time-frequency analysis of breath sounds,” Comput. Biol. Med., vol. 37, issue 8, pp. 1073-1083, 2007.

D. Emmanouilidou, K. Patil, J. West, and M. Elhilali, (2012), “A multiresolution analysis for detection of

abnormal lung sounds”, in Conf Proc Eng Med Biol Soc., pp. 3139-3242, 2012.

Y. Shabtai-Musih, J. B. Grotberg, and N. Gavriely, (1992), “Spectral content of forced expiratory wheezes

during air, He, and SF6 breathing in normal humans,” J Appl. Physiol., vol. 72, pp. 629–635, 1992.

M. Bahoura, (2009), “Pattern recognition methods applied to respiratory sounds classification into

normal and wheeze classes,” Comput Biol Med., vol 39, issue 9, pp. 824-843, 2009.

Y. Qiu, A. Whittaker, M. Lucas and K. Anderson, (2005), “Automatic wheeze detection based on auditory

modeling,” in Proc. Inst. Mech. Eng. H., vol. 219, issue 3, pp. 219-227, 2005.

E. Kvedalen, (2003), “Signal processing using the Teager energy operator and other nonlinear

operators,” Master Thesis, Dep. Informatics, Univ. Oslo, Norway, 2003.

R. C. Gonzalez, R. E. Woods, and S. L. Eddins, (2004), Digital Image Processing Using Matlab. Gatesmark

Publishing, 2004.

L. Mendes, I. Vogiatzis, E. Perantoni, E. Kaimakamis, I. Chouvarda, N. Maglaveras, V. Tsara, C. Teixeira,

P. Carvalho, J. Henriques, R. P. Paiva, (2015), “Detection of wheezes using their signature in the

spectrogram space and musical features”, 37th Annual International Conference of the IEEE

Engineering in Medicine and Biology Society, 2015.



I. Chouvarda, et al., (2014), "Combining pervasive technologies and cloud computing for COPD and

comorbidities management." Wireless Mobile Communication and Healthcare (Mobihealth), 2014

EAI 4th International Conference on. IEEE, 2014.

D.C. Barber, B.H. Brown, (1984), “Applied Potential Tomography”, Journal of Physics E: Scientific

Instruments, vol 17(9), pp. 723-733, 1984.

M. Bodenstein, M. David, and K. Markstaller, (2009), “Principles of electrical impedance tomography

and its clinical application“, Critical Care Medicine, vol 37(2), p. 713-24, 2009.

A. Adler et al, (2009), “GREIT: a unified approach to 2D linear EIT reconstruction of lung images”,

Physiol. Meas, vol 30(6),pp. S35-55, 2009.

A. Adler et al, (2012), “Whither lung EIT: Where are we, where do we want to go and what do we need

to get there?”, Physiol. Meas, vol 33(5), pp. 679-94, 2012.

I. Frerichs, (2000), ”Electrical impedance tomography (EIT) in applications related to lung and

ventilation: a review of experimental and clinical activities”, Physiol. Meas. vol 21(2), pp. R1-21,

2000.

I. Frerichs, T. Becher, and N. Weiler, (2014), “Electrical impedance tomography imaging of the cardiopulmonary system”, Curr Opin Crit Care, vol 20, pp. 323-332, 2014.

C. Gomez-Laberge, J.H. Arnold, and G.K. Wolf, (2012), “A Unified Approach for EIT Imaging of Regional

Overdistension and Atelectasis in Acute Lung Injury”, IEEE Trans. On Medical Imaging, vol 31(3) pp.

834-42, 2012.

B. Vogt, S. Pulletz, G. Elke, Z. Zhao, P. Zabel, N. Weiler, and I. Frerichs, (2012), ”Spatial and temporal

heterogeneity of regional lung ventilation determined by electrical impedance tomography during

pulmonary function testing”, J Appl Physiol, vol 113(7), pp.1154-61, 2012.

P. Gaggero, et al., (2013), "Open EIT: a common and open file format for electrical impedance

tomography.", 2013

Hahn, G., et al., (2008), "Improvements in the image quality of ventilatory tomograms by electrical

impedance tomography." Physiological measurement 29.6 (2008): S51.

MS Holmes et.al, (2014), “An Acoustic-Based Method to Detect and Quantify the Effect of Exhalation

into a Dry Powder Inhaler.”, J Aerosol Med Pulm Drug Deliv. 2015 Aug;28(4):247-53. doi:

10.1089/jamp.2014.1169. Epub 2014 Nov 13


9 Abbreviations

AF Atrial Fibrillation

AFL Atrial Flutter

API Application Programming Interface

AV Atrioventricular

BAN Body Area Network

BLE Bluetooth Low Energy

BP Blood Pressure

BT Bluetooth

CA Cloud Agent

CAD Clinical Administration

CH Calculations Handler

CRUD Create-Retrieve-Update-Delete

DSL Domain Specific Languages

DSS Decision Support System

ECG Electrocardiogram

ECI External communication interface

EDF European Data Format

EIT Electrical Impedance Tomography

ESC External Sources Connector

F2F Face to face

FE Feature extraction

FEH Feature extractions handler

FES Feature Extraction Server

HCP Healthcare professionals

LM Logging module

NA Not applicable

OA Orchestrator Agent

OH Organization hub

P2P Peer to Peer

PBH Periodic behavior handler

RD Raw Data

SA Sensor Agent

SE Storage Engine

SEM Storage Engine Module

SSP Bluetooth Secure Simple Pairing

URI Uniform Resource Identifier

VA Vest Agent

VS Vital Signs

WAHD WELCOME Application Hosting Device

WM Workflow Manager

WoT Web of Things


Appendix A. REST API

The REST API defines the HTTP-exposed interface endpoints, as described in the following tables.

Request parameters (for all requests, unless otherwise specified)

Parameter Value

HTTP authentication type BASIC

Username [not publicly communicated]

Password [not publicly communicated]

Content-Type text/turtle

API base path http://{server-url or IP:port number [not publicly communicated]}/welcome/api
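The common request parameters above can be sketched in client code as follows. This is only an illustrative sketch using the Python standard library: the helper name `build_request`, the host `example.org` and the credentials are hypothetical placeholders, since the real server address and credentials are not publicly communicated.

```python
import base64
import urllib.request


def build_request(base_url, path, username, password,
                  method="GET", body=None, content_type="text/turtle"):
    """Build (but do not send) a request carrying the common parameters.

    base_url is the API base path, e.g. "http://example.org/welcome/api"
    (hypothetical host). path is a service path such as "./schema".
    """
    # HTTP BASIC authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    headers = {
        "Authorization": "Basic " + token.decode("ascii"),
        "Content-Type": content_type,
    }
    # Service paths in the tables are written relative to the base path.
    if path.startswith("./"):
        path = path[2:]
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, data=body, headers=headers,
                                  method=method)


# Hypothetical usage: prepare a retrieval of the ontology schema.
request = build_request("http://example.org/welcome/api", "./schema",
                        "user", "pass")
```

The returned object could then be passed to `urllib.request.urlopen`; the sketch stops before sending so that it does not depend on a reachable server.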

Ontology Schema Service

Service Ontology Schema

Path ./schema

Action Retrieve from server

HTTP Verb GET

Example:

Request

GET /welcome/api/schema HTTP/1.1
Host: ...
Authorization: ...

Response

Status → 200 OK
Content-Type → text/turtle
Date → Fri, 03 Jul 2015 10:39:44 GMT

[...]
@prefix ns15: <http://lomi.med.auth.gr/ontologies/WELCOME_entities#> .
@prefix ns14: <http://lomi.med.auth.gr/ontologies/FHIRResourcesExtensions#> .

ns14:question a owl:ObjectProperty ;
    rdfs:domain [ a owl:Class ; owl:unionOf ( ns14:QuestionsGroup ns14:QuestionAnswer ) ] ;
    rdfs:label "Ερωτήσεις ομάδας ερωτήσεων"@el-gr , "Link to questions in this group"@en ;
    rdfs:range ns14:Question .

ns15:Pulmonologist a owl:Class ;
    rdfs:subClassOf ns15:MedicalDoctor .
[...]


Path ./schema

Action Create on server

HTTP Verb PUT


Credentials Service requests that change the ontology are intended for data model maintenance; a different set of credentials is used for them.

Response Status → 201 Created Location → [Schema URI]


Path ./schema

Action Update on server

HTTP Verb POST

Credentials See PUT description

Response 200 OK


Path ./schema

Action Delete from server

HTTP Verb DELETE

Credentials See PUT description

Response 204 No Content

File Service

Service File Service

Path ./files

Action Store file on server

HTTP Verb POST

Content-Type application/octet-stream

Example:

Request

POST /welcome/api/files HTTP/1.1
Host: ...
Authorization: ...
Content-Type: application/octet-stream

Response

Status → 201 Created
Content-Length → 0
Date → ...
Location → [File URI]

Note that the filename is auto-assigned by the server.
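Because the server assigns the name, a client has to recover it from the Location header of the 201 response. The following sketch shows one way to do so; the function name `name_from_location` and the host `example.org` are hypothetical, and the same approach would apply to the auto-assigned uuid of RDF resources.

```python
import posixpath
from urllib.parse import urlparse


def name_from_location(location):
    """Return the last path segment of a Location header value.

    For the File Service this is the server-assigned filename; the
    caller is assumed to have read the header from a 201 response.
    """
    path = urlparse(location).path
    name = posixpath.basename(path)
    if not name:
        raise ValueError("Location does not end in a resource name: "
                         + repr(location))
    return name


# Hypothetical Location value returned after storing a file.
stored_as = name_from_location(
    "http://example.org/welcome/api/files/"
    "d4421987-7e8a-4054-bf6c-7ca24e780d2d.edf")
```

The recovered name can then be used in the `./files/{filename}` path for later retrieval or deletion.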


Path ./files/{filename}

Action Retrieve file from server


HTTP Verb GET

Content-Type application/octet-stream

Example:

Request

GET /welcome/api/files/d4421987-7e8a-4054-bf6c-7ca24e780d2d.edf HTTP/1.1
Host: ...
Authorization: ...
Content-Type: application/octet-stream

Response

Status → 200 OK
Content-Type → application/octet-stream
Date → ...

[...] raw file stream [...]


Path ./files/{filename}

Action Delete file from server

HTTP Verb DELETE

Example:

Request

DELETE /welcome/api/files/d4421987-7e8a-4054-bf6c-7ca24e780d2d.edf HTTP/1.1
Host: ...
Authorization: ...

Response

Status → 204 No Content
Date → ...

A 404 will be returned if the file to be deleted does not exist.

RDF Resources Service

Service RDF Service

Path ./data/{type}

Action Store an instance of type {type} on the server

HTTP Verb POST

Example:

Request

POST /welcome/api/data/Patient HTTP/1.1
Host: ...
Authorization: ...
Content-Type: text/turtle

@prefix FHIRResources: <http://lomi.med.auth.gr/ontologies/FHIRResources#> .
@prefix FHIRpt: <http://lomi.med.auth.gr/ontologies/FHIRPrimitiveTypes#> .
@prefix FHIRResourcesExtensions: <http://lomi.med.auth.gr/ontologies/FHIRResourcesExtensions#> .
@prefix FHIRct: <http://lomi.med.auth.gr/ontologies/FHIRComplexTypes#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

[ a FHIRResources:Patient ;
  FHIRResources:Patient.active [ a FHIRpt:boolean ; rdf:value true ] ;
  FHIRResources:Person.birthDate [ a FHIRpt:dateTime ; rdf:value "1974"^^xsd:gYear ] ;
  FHIRResources:Person.gender FHIRResources:AdministrativeGender_male ;
  FHIRResourcesExtensions:Person.language [
    a FHIRct:Coding ;
    FHIRct:Coding.code [ a FHIRpt:code ; rdf:value "en"^^xsd:string ] ;
    FHIRct:Coding.display [ a FHIRpt:string ; rdf:value "english"^^xsd:string ] ;
    FHIRct:Coding.system [ a FHIRpt:uri ; rdf:value "https://tools.ietf.org/html/bcp47"^^xsd:anyURI ]
  ] ;
  FHIRResourcesExtensions:Person.preferred [
    a FHIRct:Coding ;
    FHIRct:Coding.code [ a FHIRpt:code ; rdf:value "en"^^xsd:string ] ;
    FHIRct:Coding.display [ a FHIRpt:string ; rdf:value "english"^^xsd:string ] ;
    FHIRct:Coding.system [ a FHIRpt:uri ; rdf:value "https://tools.ietf.org/html/bcp47"^^xsd:anyURI ]
  ]
] .

Response

Status → 201 Created
Content-Length → 0
Date → ...
Location → http://[server name]/welcome/api/data/Patient/dea1d7b1-3a85-489f-8463-db0f6b48b938

Note that the resource uuid is auto-assigned by the server.

Service RDF Service

Path ./data/{type}/{uuid}

Action Retrieve the instance of type {type} and uuid {uuid} from the server

HTTP Verb GET

Example:

Request

GET /welcome/api/data/Patient/dea1d7b1-3a85-489f-8463-db0f6b48b938 HTTP/1.1
Host: ...
Authorization: ...
Content-Type: text/turtle

Response

Status → 200 OK
Content-Length → 2321
Date → ...

@prefix ns5: <http://lomi.med.auth.gr/ontologies/FHIRComplexTypes#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ns2: <http://lomi.med.auth.gr/ontologies/FHIRResources#> .
@prefix ns1: <welkv2://welcome-project.eu/data/Patient/> .
@prefix ns3: <http://lomi.med.auth.gr/ontologies/FHIRPrimitiveTypes#> .

<http://aerospace.med.auth.gr:8080/welcome/api/data/Patient/dea1d7b1-3a85-489f-8463-db0f6b48b938> a ns2:Patient ;
  ns2:Patient.active [ a ns3:boolean ; rdf:value 1 ] ;
  ns2:Person.birthDate [ a ns3:dateTime ; rdf:value "1974-01-01T00:00:00+03:00"^^xsd:gYear ] ;
  ns2:Person.gender ns2:AdministrativeGender_male ;
  <http://lomi.med.auth.gr/ontologies/FHIRResourcesExtensions#Person.language> [
    a ns5:Coding ;
    ns5:Coding.code [ a ns3:code ; rdf:value "en"^^xsd:string ] ;
    ns5:Coding.display [ a ns3:string ; rdf:value "english"^^xsd:string ] ;
    ns5:Coding.system [ a ns3:uri ; rdf:value "https://tools.ietf.org/html/bcp47"^^xsd:anyURI ]
  ] ;
  <http://lomi.med.auth.gr/ontologies/FHIRResourcesExtensions#Person.preferred> [
    a ns5:Coding ;
    ns5:Coding.code [ a ns3:code ; rdf:value "en"^^xsd:string ] ;
    ns5:Coding.display [ a ns3:string ; rdf:value "english"^^xsd:string ] ;
    ns5:Coding.system [ a ns3:uri ; rdf:value "https://tools.ietf.org/html/bcp47"^^xsd:anyURI ]
  ] .

Service RDF Service

Path ./data/{type}/{uuid}


Action Delete the instance of type {type} and uuid {uuid} from the server

HTTP Verb DELETE

Example:

Request

DELETE /welcome/api/data/Patient/dea1d7b1-3a85-489f-8463-db0f6b48b938 HTTP/1.1
Host: ...
Authorization: ...

Response

Status → 204 No Content
Date → ...

A 404 will be returned if the resource to be deleted does not exist.

Service RDF Service

Path ./data/{type}/{uuid}


Action Replace the instance of type {type} and uuid {uuid} on the server

HTTP Verb PUT

Response 200 OK

Service RDF Service

Path ./data/{type}

Action Retrieve an RDF bag containing the URIs of every instance of type {type} from the server

HTTP Verb GET

Example:

Request

GET /welcome/api/data/Patient HTTP/1.1
Host: ...
Authorization: ...
Content-Type: text/turtle

Response

Status → 200 OK
Date → ...
Content-Length → 298

<http://[server name]/welcome/api/data/Patient> a <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> <http://[server name]/welcome/api/data/Patient/e7d994d9-11bd-4b56-8b1c-f6d5d45023d8> .

1. An empty bag will be returned if no instances exist that satisfy the search criteria.
2. {type} can be any Class from the ontology that can be instantiated, or any superclass of such a class.
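A client typically needs the member URIs of the returned bag in order to retrieve the individual instances. The sketch below extracts them with a regular expression; it assumes that the response uses the full-IRI form of the `rdf:_n` membership properties, as in the example above, and the function name `bag_members` and the host `example.org` are hypothetical. A general client would more safely rely on a complete Turtle parser.

```python
import re

# Matches one "rdf:_n <member>" pair written with full IRIs.
_MEMBER = re.compile(
    r"<http://www\.w3\.org/1999/02/22-rdf-syntax-ns#_(\d+)>\s*<([^>]+)>")


def bag_members(turtle_text):
    """Return the member URIs of an RDF bag, ordered by membership index.

    An empty list is returned for an empty bag.
    """
    found = [(int(index), uri) for index, uri in _MEMBER.findall(turtle_text)]
    return [uri for _, uri in sorted(found)]


# Hypothetical response body with two members listed out of order.
example = """
<http://example.org/welcome/api/data/Patient>
    a <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> <http://example.org/welcome/api/data/Patient/b> ;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> <http://example.org/welcome/api/data/Patient/a> .
"""
members = bag_members(example)
```

Each returned URI can then be requested with GET to obtain the corresponding instance.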

Service RDF Service

Path ./data/{typeA}/{uuid}/{typeB}

Action Retrieve an RDF bag containing the URIs of every instance of type {typeB} that "points" directly to the instance of type {typeA} and uuid {uuid}

HTTP Verb GET

Example:

Request

GET /welcome/api/data/Patient/a1083798-3c0f-4fb4-8a85-823cbb16ceba/Device HTTP/1.1
Host: ...
Authorization: ...
Content-Type: text/turtle

Response

Status → 200 OK
Date → ...
Content-Length → 559

<http://[server name]/welcome/api/data/Patient/a1083798-3c0f-4fb4-8a85-823cbb16ceba/Device> a <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> ;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> <http://[server name]/welcome/api/data/PortableBiomedicalSensorDevice/5c78d8b8-39d3-4ecb-a99c-b84b519b04ab> ;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> <http://[server name]/welcome/api/data/PortableBiomedicalSensorDevice/5c78d8b8-39d3-4ecb-a99c-b84b519b04ab> .

1. An empty bag will be returned if no instances exist that satisfy the search criteria.
2. {typeB} can be any Class from the ontology that can be instantiated, or any superclass of such a class.


Appendix B. Signal Processing Algorithms: Details

B.1 ECG

The introduction of noise to the ECG signal affects the accuracy of algorithms designed to detect diverse cardiac pathologies, namely arrhythmias [1, 2]. Therefore, we propose a new method to detect noise periods and evaluate the quality of the ECG signal.

Following ECG signal quality assessment, the next sections describe the one-lead ECG algorithms that

have been developed and implemented during the first phase of the project. Specific interfaces,

corresponding to each specific algorithm, are provided, namely:

ECG segmentation and intervals computation: relates to the identification of the main fiducial points, such as the beginning and end of the P wave and the R peaks, as well as the computation of relevant intervals, such as the PR interval duration;

Ventricular arrhythmia episode detection, including premature ventricular contraction (identification of normal and abnormal beats), ventricular tachycardia and ventricular fibrillation;

Atrioventricular (AV) block;

ST deviation: estimation of ST segment deviation.

Finally, the multi-lead algorithm for atrial fibrillation (AF) will be described. The attained results show

that the multi-lead method outperforms the one-lead approach.

B.1.1 ECG signal quality assessment

The scope of the presented algorithm is to serve as an entry barrier to the remaining ECG processing algorithms, by detecting noise periods and evaluating the quality of the ECG signal.

One of the strategies used to control the influence of noise artifacts is to denoise the ECG signal without affecting the original signal length [3-6]. In [6], ICA was used to separate the clean ECG signal from the noise sources. In [4], the signal was denoised using a notch filter together with Wavelet and Empirical Mode Decomposition methods. In [3], adaptive filtering was used for noise cancellation. In [5], noise was reduced by smoothing the signal with a Savitzky-Golay filter.

Another strategy is to remove noise periods when they are detected [6-12]. In [6], before source separation, a gaussianity measure, the negentropy, was used to evaluate the presence of noise segments. In [12], morphological filtering was performed in order to detect noise segments. The use of accelerometers to detect movement noise was explored in [9]. In [10], statistical metrics were investigated across the first Intrinsic Mode Function of the Empirical Mode Decomposition. In [11], statistical properties were explored on the Laplacian model of the ECG signal. In [8], the RMS error was computed between the original signal and its approximation obtained by PCA reconstruction using some of the top eigenvectors. In [7], a set of detectors, each one specific to one noise or interference type, was explored, and the effects of each interference were finally weighted into the overall signal quality.


In the next paragraphs, we describe an algorithm to detect noise periods on multi-lead ECG signals for

quality evaluation. We use two main features to do so, the error of the reconstruction by PCA and a

high frequency feature, the high-pass filtered signal.

Methods

Pre-processing

The algorithm was optimized for 5-minutes segments of ECG signal (according to the requirements of

the Welcome project); thus, each signal must be at least 5 minutes length. In each 5 minutes segment,

the signal mean is removed by using a high-pass FIR filtering with a 0.5 Hz cut-off frequency. Then, the

signal is normalized using its standard deviation.

R-Peak Detection

In order to obtain the beat matrix on which the PCA approximation is performed, each beat must first be segmented. To do so, we use an R-peak detector based on the Pan & Tompkins algorithm [13], modified so that the threshold used to locate the peaks is adaptive. The threshold is derived from a moving average filter with a span of 2 seconds applied to the resulting energy vector. When no peaks occur for more than 2 seconds, this adaptive threshold falls to the zero-energy baseline and the detector would report peaks where there are none. To prevent this effect, we impose a minimum threshold level of 0.001. The adaptive threshold is used to take into account the different possible beat amplitudes.

If the heart rate is below 25 beats per minute, the whole 5-minute segment is considered a non-quality signal: such a rate is physiologically implausible and indicates that the electrodes are disconnected from the skin.
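The adaptive threshold and the heart-rate check described above can be sketched as follows. This is only an illustration, assuming NumPy and an already computed Pan & Tompkins energy vector; the function names, the centred moving average and the default arguments are assumptions and not the project's implementation.

```python
import numpy as np

def adaptive_threshold(energy, fs=250.0, span_s=2.0, floor=0.001):
    """Moving average of the energy vector over span_s seconds, never below `floor`."""
    span = max(1, int(round(span_s * fs)))
    kernel = np.ones(span) / span
    moving_avg = np.convolve(np.asarray(energy, dtype=float), kernel, mode="same")
    # The floor keeps the threshold above the zero-energy baseline during pauses.
    return np.maximum(moving_avg, floor)

def is_disconnected(n_beats, duration_s=300.0, min_bpm=25.0):
    """True when the mean heart rate over the segment is below min_bpm."""
    return 60.0 * n_beats / duration_s < min_bpm
```

With an all-zero energy vector the threshold stays at the 0.001 floor, so no spurious peaks exceed it.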

Root Mean Square of the Approximation by PCA

The PCA is performed on the beat matrix (M), which consists of one beat per row. Each beat Bi in M is delimited using the R-peaks adjacent to the current R-peak location, Ri: Bi is the segment of the signal that extends from (Ri – Ri-1)/2 before Ri to (Ri+1 - Ri)/2 after Ri. As the beats have different lengths, they are resampled to a common length, chosen as half of the sampling frequency, so that the PCA can be performed. All the beats in M undergo a min-max normalization. Then we derive the eigenvalues and the eigenvectors of the covariance matrix of M, and reconstruct the beat matrix using only the eigenvectors that account for at least 98% of the initial total variance. This value was found to discriminate best between noisy and clean periods according to a ROC analysis.

$\mathrm{ApproxM} = M V V^{T}$ (2)

In (2), the matrix ApproxM is the reconstruction of M based only on the most significant eigenvectors, V. ApproxM has the same size as M, with N rows and fs/2 columns, corresponding to the number of beats and the beat length, respectively.


$\mathrm{RMSerr}(i) = \sqrt{\sum_{j=1}^{fs/2} \left( \mathrm{ApproxM}(i,j) - M(i,j) \right)^{2}}$, $i = 1, \dots, N$; $j = 1, \dots, fs/2$ (3)

In (3), the vector RMSerr is the root mean square error per beat between the original beats and the

approximation beats. This vector is one of the features used to assess the presence of noise in ECG

segments. Finally, it is smoothed with a MA filter.
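A minimal sketch of equations (2) and (3) is given below, assuming NumPy and a beat matrix that has already been resampled and min-max normalized. The function name is hypothetical, and centring the beats on their mean before projection is an assumption of this sketch (usual for PCA), since equation (2) does not show an explicit mean term.

```python
import numpy as np

def pca_rms_error(M, variance_kept=0.98):
    """Per-beat reconstruction error after keeping `variance_kept` of the variance."""
    M = np.asarray(M, dtype=float)
    mean_beat = M.mean(axis=0)
    centred = M - mean_beat                      # assumption: beats are centred
    # Eigen-decomposition of the covariance matrix (columns are beat samples).
    eigvals, eigvecs = np.linalg.eigh(np.cov(M, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # Smallest number of eigenvectors whose cumulative variance reaches the target.
    k = int(np.argmax(ratio >= variance_kept)) + 1
    V = eigvecs[:, :k]
    approx = mean_beat + centred @ V @ V.T       # equation (2) on the centred beats
    rms_err = np.sqrt(((approx - M) ** 2).sum(axis=1))   # equation (3)
    return rms_err, k
```

A beat whose shape departs from the dominant modes of variation is reconstructed poorly and therefore receives a larger error than its neighbours.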

High-Pass FIR Filtering

The second feature to assess the presence of noise artifacts is the result of a high-pass FIR filter with a

cut-off frequency of 90 Hz.

𝐻𝐹𝑋 = 𝑋[𝑛] ∗ 𝐵[𝑚] 𝑛 = 1, … , 𝐿𝑥; 𝑚 = 1, … , 𝐿𝑏 + 1 (4)

In (4), HFX results from the convolution of the ECG signal, X of length Lx with the filter coefficients B

corresponding to a high-pass 90 Hz cut-off frequency (fc), and Lb order. The 90 Hz fc was found as the

best to discriminate between noise and clean periods according to a ROC analysis. Finally, this feature

is smoothed with a MA filter.
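Equation (4) can be illustrated with the sketch below, assuming NumPy. The windowed-sinc design with a Hamming window, the filter order of 100 and the function names are assumptions chosen for the example; the source states only the 90 Hz cut-off. The MA smoothing step is not shown.

```python
import numpy as np

def highpass_coefficients(fs=250.0, fc=90.0, order=100):
    """Return the order+1 coefficients B of a windowed-sinc high-pass FIR filter."""
    if order % 2:
        raise ValueError("order must be even for this design")
    n = np.arange(order + 1) - order / 2
    lowpass = 2 * fc / fs * np.sinc(2 * fc / fs * n) * np.hamming(order + 1)
    lowpass /= lowpass.sum()          # unity gain at DC for the low-pass prototype
    b = -lowpass
    b[order // 2] += 1.0              # spectral inversion: delta minus low-pass
    return b

def high_frequency_feature(x, b):
    """Equation (4): convolution of the ECG signal X with the coefficients B."""
    return np.convolve(np.asarray(x, dtype=float), b, mode="same")
```

With these assumed settings a 10 Hz component is almost fully suppressed while a 110 Hz component passes nearly unchanged, which is the behaviour the feature relies on.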

Noise Assessment in 4s Segments and Thresholding

The corrupted periods are assessed by splitting the whole signal into 4-second windows with 50% overlap and examining the two features in each window. Before windowing, the thresholds that separate noise from clean signal must be determined. These thresholds are not fixed to specific values; they are recomputed for each analyzed signal. To set them, we first look for noise-free periods. The clean periods are the 3 non-overlapping segments of 10 beats each with the lowest RMSerr. The average RMSerr of these segments is taken as the reference error for clean periods, REFerr. The thresholds for the first feature are derived from this value as shown in (5) and (6).

𝑡ℎ1𝑒𝑟𝑟 = 𝑓1𝑒𝑟𝑟 × 𝑅𝐸𝐹𝑒𝑟𝑟 (5)

𝑡ℎ2𝑒𝑟𝑟 = 𝑅𝐸𝐹𝑒𝑟𝑟 + 𝑓2𝑒𝑟𝑟 × (𝑡ℎ1𝑒𝑟𝑟 − 𝑅𝐸𝐹𝑒𝑟𝑟) (6)

In (5) and (6), th1err and th2err are the adaptive thresholds for the first feature, RMSerr. The f1err and f2err parameters are constants found in the training stage by ROC analysis and are equal to 2 and 0.5, respectively. The thresholds for the second feature are found in a similar way. The same periods used to compute REFerr are used to calculate REFHF, which is the mean value of HFX over those periods. The thresholds for the second feature are derived from this value as shown in (7) and (8).

𝑡ℎ1𝐻𝐹 = 𝑓1𝐻𝐹 × 𝑅𝐸𝐹𝐻𝐹 (7)

𝑡ℎ2𝐻𝐹 = 𝑅𝐸𝐹𝐻𝐹 + 𝑓2𝐻𝐹 × (𝑡ℎ1𝐻𝐹 − 𝑅𝐸𝐹𝐻𝐹) (8)

The th1HF and th2HF parameters are the adaptive thresholds for the second feature, HFX. The f1HF and f2HF parameters are constants found in the training stage by ROC analysis and are equal to 1.115 and 0.6, respectively.
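As a worked example of equations (5) to (8), the sketch below computes both threshold pairs. The constants are those reported in the text; the reference values 0.10 and 0.20 and the function name are purely illustrative assumptions.

```python
def adaptive_thresholds(ref, f1, f2):
    """Equations (5)-(8): th1 = f1 * REF and th2 = REF + f2 * (th1 - REF)."""
    th1 = f1 * ref
    th2 = ref + f2 * (th1 - ref)
    return th1, th2

# Constants reported in the text (found by ROC analysis in the training stage).
F1_ERR, F2_ERR = 2.0, 0.5
F1_HF, F2_HF = 1.115, 0.6

# Illustrative reference values, not taken from the source.
th1_err, th2_err = adaptive_thresholds(0.10, F1_ERR, F2_ERR)
th1_hf, th2_hf = adaptive_thresholds(0.20, F1_HF, F2_HF)
```

Because f2 lies between 0 and 1, th2 always falls between the clean-signal reference and th1, so each feature has a stricter and a more permissive threshold.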


Before the final decision rule described in (9) is applied, each 4 s window is checked for beats detected by the R-peak detector; if there are none, the whole window is considered a non-quality segment.

$\text{IF } \left( \max\{HF_X^{w}\} > th1_{HF} \ \&\ \max\{RMSerr^{w}\} > th2_{err} \right) \ | \ \left( \max\{RMSerr^{w}\} > th1_{err} \ \&\ \max\{HF_X^{w}\} > th2_{HF} \right)$ (9)

In (9), max{HFX^w} and max{RMSerr^w} are the maximum values, within the 4 s window w, of the second feature (HFX) and the first feature (RMSerr), respectively. If the condition is true, the whole window is classified as noise corrupted.
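The windowing and the decision rule (9) can be sketched as below. The function names and argument order are assumptions; the feature values falling inside each window are assumed to have been collected beforehand (RMSerr is defined per beat and HFX per sample).

```python
def windows(n_samples, fs=250.0, length_s=4.0, overlap=0.5):
    """Start/stop sample indices of 4 s windows with 50% overlap."""
    size = int(round(length_s * fs))
    step = int(round(size * (1.0 - overlap)))
    return [(start, start + size) for start in range(0, n_samples - size + 1, step)]

def is_noisy(hf_w, rms_w, th1_hf, th2_hf, th1_err, th2_err, n_beats):
    """Rule (9) on one window; a window without detected beats is non-quality."""
    if n_beats == 0:
        return True
    hf_max, rms_max = max(hf_w), max(rms_w)
    return (hf_max > th1_hf and rms_max > th2_err) or \
           (rms_max > th1_err and hf_max > th2_hf)
```

Each clause requires one feature to exceed its stricter threshold and the other feature to exceed its more permissive one, so a window is flagged only when both features agree to some degree.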

Results and Analysis

We used the ECG signals available from Physionet (MIT-BIH Arrhythmia Database, see footnote 31) and noise records from the MIT-BIH Noise Stress Test Database (see footnote 32), also from Physionet, all with a sampling frequency of 360 Hz. The noise records were acquired in such a way that the subject's ECG signals were not visible. Three types of noise were derived from these records: i) baseline wandering (bw); ii) EMG artifact (ma); and iii) electrode motion (em). To add the noise to the ECG signals at different SNRs, we used the nst function from the WFDB Software Package, also provided by Physionet, which calculates the gains to apply to the noise records based on peak-to-peak amplitude.

All the records were resampled to a sampling frequency (fs) of 250 Hz, to match the sensors of the WELCOME vest. In the training stage we used the MLII-lead signals from records 201, 205, 213, 217, 223 and 231. We chose these records for their high signal quality and the presence of various types of arrhythmias, so that the parameters that best discriminate the noise periods could be determined while keeping a low sensitivity to arrhythmia patterns.

The algorithm was designed to adapt well to different leads. The complete noise detection method comprises four main stages:

- The preprocessing stage, where the baseline shifts are removed and the signal is normalized;

- An R-peak detection stage;

- The stage where the two features used for classification are derived;

- A final stage where noise corruption is assessed in 4-second windows using the two features.

For the MLII-lead testing set, only the high-quality signals were used, i.e., only the signals with few periods of noise without annotations were considered. The overall MLII testing set comprises a total of 25 subjects. In the testing of V2 signals, all 4 records were used. In total, approximately 2.5 and 10 hours of training and testing data were employed, respectively.

The test results of the noise detection algorithm for the MLII-lead and the V2-lead are presented in Table 22 and Table 23, respectively, where SE and SP stand for sensitivity and specificity (percent values).

Table 22: Test results for the MLII-lead.

| SNR [dB] | em SE | em SP | ma SE | ma SP | bw SE | bw SP | Total SE | Total SP | SNR range of Total [dB] |
|---|---|---|---|---|---|---|---|---|---|
| -6 | 97.91 | 96.05 | 99.54 | 94.38 | 99.8 | 93.9 | 99.08 | 94.78 | -6 |
| 0 | 98.62 | 96.82 | 99.64 | 94.57 | 99.37 | 94.72 | 99.15 | 95.07 | [-6, 0] |
| 6 | 98.81 | 97.23 | 99.6 | 95.41 | 93.01 | 95.27 | 98.48 | 95.37 | [-6, 6] |
| 12 | 97.97 | 96.97 | 98.44 | 96.95 | 66.5 | 96.04 | 95.77 | 95.69 | [-6, 12] |
| 18 | 86.84 | 96.83 | 95.09 | 96.79 | 34.62 | 97.19 | 91.05 | 95.94 | [-6, 18] |
| NoiseAvg | 96.03 | 96.78 | 98.46 | 95.62 | 78.66 | 95.42 | | | |

31 http://www.physionet.org/physiobank/database/mitdb/

32 http://www.physionet.org/physiobank/database/nstdb/

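Table 23: Test results for the V2-lead.

| SNR [dB] | em SE | em SP | ma SE | ma SP | bw SE | bw SP | Total SE | Total SP | SNR range of Total [dB] |
|---|---|---|---|---|---|---|---|---|---|
| -6 | 96.81 | 96.79 | 96.39 | 96.51 | 96.29 | 95.96 | 96.5 | 96.42 | -6 |
| 0 | 97.28 | 97.49 | 96.86 | 96.67 | 96.92 | 97.47 | 96.76 | 96.81 | [-6, 0] |
| 6 | 96.2 | 97.97 | 96.93 | 97.56 | 97.2 | 97.79 | 96.77 | 97.13 | [-6, 6] |
| 12 | 94.97 | 97.85 | 96.65 | 98.21 | 86.38 | 98.06 | 95.74 | 97.36 | [-6, 12] |
| 18 | 85.14 | 98.42 | 93.33 | 98.12 | 38.2 | 97.89 | 91.04 | 97.52 | [-6, 18] |
| NoiseAvg | 94.08 | 97.7 | 96.03 | 97.41 | 83 | 97.43 | | | |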

As the results show, the overall sensitivity and specificity for electrode motion and muscle noise are close to or above 95% (the lowest value being the 94.08% average sensitivity for electrode motion on the V2-lead), demonstrating a good performance. The lower sensitivity for baseline wandering noise may be due to the detrending step in the preprocessing stage, which reduces the influence of this noise. This lower sensitivity is not a concern, as this noise type is less troublesome for pathology detection algorithms than the remaining noise types [1].

If we take the results from only the most significant SNR levels, i.e., from -6 to 12 dB, the sensitivity and specificity are 95.77% and 95.69% for the MLII-lead, and 95.74% and 97.36% for the V2-lead, respectively.

To conclude, the high sensitivity and specificity achieved by the noise detection algorithm allow us to infer the general quality of a signal from the amount of noise detected. Depending on the quality of the signal, the algorithm can discard noise-corrupted signals that are not suitable for the algorithms that detect cardiac characteristics, or approve those that have the required quality.

B.1.2 ECG segmentation and intervals computation

The cardiac activity has an inherent periodicity, which is controlled by an electrical conducting system composed of specialized fibers that spontaneously generate and rapidly conduct electrical impulses through the heart. These impulses, or action potentials, coordinate the contraction and relaxation of the cardiac muscle and allow the filling and emptying of the atria and ventricles; this process is called the cardiac cycle. The electrical activity of the heart can be detected at the skin surface and measured as an electrical potential using electrocardiography. In the electrocardiogram (ECG) the P-wave and QRS complex correspond to the depolarization of the atria and ventricles, respectively, while the T-wave is known as a repolarization wave, corresponding to the recovery of the ventricles from the state of depolarization [14].

The detection of the ECG characteristic waves, i.e. ECG segmentation, is a fundamental task for the diagnosis of cardiac disorders and heart-rate variability analysis. One of the most widely used and recognized algorithms for ECG segmentation was proposed by Pan & Tompkins [13] in 1985, and many algorithms have since been built on its foundations. In the last decades, several other methods have been proposed in the literature, based on techniques such as morphological derivatives [15], wavelet transforms [16, 17] and, more recently, empirical mode decomposition [18], to improve not only the detection of the QRS complexes but also the precision in the detection of the boundaries of the ECG characteristic waves.

Methods

Our algorithm for ECG segmentation was developed based on the principles proposed by Sun et al. [15]. The algorithm is composed of three main steps: 1) elimination of the baseline wandering; 2) elimination of noise; and 3) detection of the ECG characteristic waves.

To eliminate the baseline wander and noise, the ECG signal is submitted to opening and closing operations with different structural elements. First, an approximation of the ECG baseline is calculated and subtracted from the original ECG signal using (10). Then, the noise is removed by applying two structural elements to the resulting signal using (11).

$f_{bc} = f_o - (f_o \circ B_o \bullet B_c)$ (10)

$f = \frac{1}{2} \left( f_{bc} \bullet B_{par} + f_{bc} \circ B_{par} \right)$ (11)

where fbc is the ECG signal without baseline wandering, f is the ECG signal after removing both baseline

wandering and noise and fo is the original ECG signal. The length of the structural elements used to

eliminate the baseline wander and noise (Bo , Bc , B1 and B2) was defined according to the length and

shape of the ECG characteristic waves and to the ECG signal sample frequency (fs).
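The morphological filtering of (10) and (11) can be sketched as below, assuming NumPy. Flat structuring elements with odd lengths and a single element standing in for the pair B1, B2 are simplifying assumptions of this sketch; the function names and element sizes are hypothetical.

```python
import numpy as np

def erode(f, size):
    """Grey-scale erosion with a flat element of `size` (odd) samples."""
    padded = np.pad(np.asarray(f, dtype=float), size // 2, mode="edge")
    return np.array([padded[i:i + size].min() for i in range(len(f))])

def dilate(f, size):
    """Grey-scale dilation with a flat element of `size` (odd) samples."""
    padded = np.pad(np.asarray(f, dtype=float), size // 2, mode="edge")
    return np.array([padded[i:i + size].max() for i in range(len(f))])

def opening(f, size):
    return dilate(erode(f, size), size)

def closing(f, size):
    return erode(dilate(f, size), size)

def remove_baseline(f_o, size_open, size_close):
    """Equation (10): subtract the opened-then-closed signal (baseline estimate)."""
    return f_o - closing(opening(f_o, size_open), size_close)

def suppress_noise(f_bc, size):
    """Equation (11), with one flat element used in place of the pair B1, B2."""
    return 0.5 * (closing(f_bc, size) + opening(f_bc, size))
```

On a slow ramp with narrow spikes, the opening removes the spikes and the closing fills narrow troughs, so the subtraction in (10) leaves the spikes on a flat baseline.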

To detect the ECG characteristic waves, the pre-processed ECG signal is differentiated using a

morphological transformation (see (12)) with different scales (depending on the characteristic wave to

be detected) and the local maxima and minima of the resulting signal 𝑀𝑓𝑠 were detected.

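$M_f^{s}(x) = M_f^{+}(x) - M_f^{-}(x) = \lim_{s \to 0} \frac{(f \oplus g_s)(x) - f(x)}{s} - \lim_{s \to 0} \frac{f(x) - (f \ominus g_s)(x)}{s} = \lim_{s \to 0} \frac{(f \oplus g_s)(x) + (f \ominus g_s)(x) - 2 f(x)}{s}$ (12)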
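A discrete counterpart of (12), evaluated at a finite scale, is sketched below with NumPy. Using a flat structuring element of 2·scale+1 samples and an integer scale in samples are assumptions of this sketch; the function name is hypothetical.

```python
import numpy as np

def morphological_derivative(f, scale):
    """(dilation + erosion - 2 f) / scale with a flat element of 2*scale+1 samples."""
    f = np.asarray(f, dtype=float)
    padded = np.pad(f, scale, mode="edge")
    width = 2 * scale + 1
    dilated = np.array([padded[i:i + width].max() for i in range(len(f))])
    eroded = np.array([padded[i:i + width].min() for i in range(len(f))])
    return (dilated + eroded - 2.0 * f) / scale
```

On a straight slope the dilation and erosion offsets cancel and the result is zero, while a positive peak yields a negative local minimum, which is consistent with R-peaks being searched among the local minima. At fs = 250 Hz, a scale of 0.035 × fs corresponds to 8.75 samples, which would be rounded to 9 in this discrete form.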

QRS detection

The R-peaks were defined as the local minima with absolute amplitude greater than the adaptive threshold ThR. ThR was defined as the lowest value above which 90% of the observations are found in the cumulative histogram of $M_f^{s_1}$ ($s_1 = 0.035 \times f_s$). The onset and offset of the R-wave were defined as the last maximum before and the first maximum after the R-peak, respectively. The onset and offset of the Q- and S-waves were defined as the last minimum (maximum, for negative R peaks) before and the first minimum (maximum, for negative R peaks) after the previously defined R-wave. To verify that the detected QRS complexes were correctly identified (i.e., that they are not T-waves, for example), the amplitude of $M_f^{s_1}$ around each R peak was examined, and the R-waves with amplitudes lower than 60% of the mean amplitude in the analyzed window were excluded.


P and T-wave detection

In order to define the onset, peak and offset of the P and T-waves, the first step is to calculate the morphological derivatives (with different scales), detect the local maxima and minima of $M_f^{s}$ and exclude the ones with very low amplitude. To this end, all the maxima below 10% of $M_f^{+}$ and all the minima above 10% of $M_f^{-}$ were excluded. The scales used to calculate $M_f^{s}$ were defined according to the morphological characteristics of the P- and T-waves as:

P-waves: 𝑠1 = 0.055 × 𝑓𝑠 and T-waves: 𝑠2 = 0.09 × 𝑓𝑠 (13)

The second step is to find sets of potential characteristic points, corresponding to the onset, peak and end of the characteristic waves. These sets were defined as the local extremes and the corresponding local extremes around them, before and after the QRS complexes, within specific search intervals (eqs. (14) and (15)). The sets that are not compliant with physiological patterns were excluded as follows:

- P-waves: Characteristic waves with amplitude/length greater than 0.217 mV/0.28 s or lower than 0.0195 mV/0.06 s were excluded.

- T-waves: Characteristic waves with amplitude/length greater than 0.3 mV/0.72 s or lower than 0.0195 mV/0.06 s were excluded.

P-waves: 𝑃𝑄 = 𝑃𝑄𝑟 + 0.1 × 𝐹𝑠 (14)

T-waves: $QT_c = \frac{2}{9} \times RR \times 0.3 \times F_s + QT_r$ (15)

where PQ is the maximum length between the onset of the P-wave and the onset of the Q-wave, 𝑃𝑄𝑟

is the mean of the previously estimated PQ intervals, QTc is the maximum length between the onset of

the Q-wave and the offset of the T-wave and 𝑄𝑇𝑟 is the mean of the previously estimated QT intervals.

The last step is to find the set of characteristic points that best approximates the characteristic wave under analysis. This selection is based on two key aspects:

1. Morphological similarity (analyzed using indexes C and D)

2. Physiology (proximity to the QRS complex)

The index C was defined as the correlation coefficient between the detected characteristic wave and a

model of the characteristic wave (identified using QT database), while the index D, a measure of

dispersion, was defined as:

$D = \frac{f(m) - L(m)}{m(M) - m(1)} \times \max(f(m))$, $m(i) = O_1, \dots, O_N$, $i = 1, \dots, M$ (16)

where f is the pre-processed signal, L is a straight line joining the first and last point of f and m is the sample corresponding to the temporal window of the analyzed characteristic set.

Results and Analysis

To validate the ECG segmentation algorithm we adopted the following metrics: sensitivity (SE) and positive predictive value (PPV). Additionally, the mean error of detection of the onset, peak and offset of the characteristic waves was also assessed by the following equation:


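$m = \frac{1}{N} \times \sum_{i=1}^{N} \left( I_{or}(i) - I_{oe}(i) \right)$ (17)

where $I_{oe}$ and $I_{or}$ are the real and detected indexes of the characteristic waves in each heart cycle, and N is the number of heart cycles considered.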
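The validation metrics can be sketched as follows. SE and PPV use their standard definitions in terms of true positives (TP), false negatives (FN) and false positives (FP), which the text does not spell out; the conversion of the index difference of (17) to milliseconds through the sampling frequency and the function names are assumptions.

```python
def sensitivity(tp, fn):
    """SE = TP / (TP + FN)."""
    return tp / (tp + fn)

def positive_predictive_value(tp, fp):
    """PPV = TP / (TP + FP)."""
    return tp / (tp + fp)

def mean_detection_error_ms(real_idx, detected_idx, fs):
    """Mean signed difference (detected - real) of paired sample indexes, in ms."""
    diffs = [d - r for r, d in zip(real_idx, detected_idx)]
    return 1000.0 * sum(diffs) / (len(diffs) * fs)
```

For example, detections that are on average 5 samples late at 250 Hz correspond to a mean error of 20 ms.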

The proposed algorithm was tested on the Physionet QT database, which is composed of 105 records of 15 minutes extracted from six databases.

The results achieved by the proposed algorithm are presented in Table 24. Our algorithm achieved very good results in the detection of all three characteristic waves, with the best performance in the detection of the QRS complexes (SE: 99.8% and PPV: 99.7%). The performance of the algorithm in the detection of the P- and T-waves was slightly lower, with an SE of 93.2% and PPV of 99.8% for the P-waves and an SE and PPV of 99.1% for the T-waves. The mean error of detection of the characteristic waves onset and offset was approximately 23 ms, which is about 10% of the characteristic wave length. Although the algorithm was not able to detect the exact boundaries of the characteristic waves, these results show a minor error in the detection of the boundaries.

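Table 24: Results achieved by the proposed algorithm for each dataset of the QT database.

| Metric | SE [%] P | SE [%] QRS | SE [%] T | PPV [%] P | PPV [%] QRS | PPV [%] T |
|---|---|---|---|---|---|---|
| Mean | 93.17 | 99.80 | 97.64 | 96.82 | 99.70 | 98.00 |

| Mean error [ms] | P Onset | P Peak | P Offset | QRS Onset | QRS Peak | QRS Offset | T Onset | T Peak | T Offset |
|---|---|---|---|---|---|---|---|---|---|
| Mean | 14.34 | 12.11 | 47.54 | 20.21 | 6.87 | 23.35 | 31.04 | 21.18 | 36.20 |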

In Table 25 we present the results achieved by the algorithm presented by Sun et al. [15] (MMD), an adaptive thresholding-based detector (TD) [19] and a wavelet transform-based detector (WD) [17]. Our algorithm presents an SE similar to that of the state-of-the-art algorithms in the detection of the QRS complexes and T-waves. However, in the detection of the P-waves, the proposed algorithm presented a lower SE than the MMD and TD algorithms and a higher SE than the WD algorithm. Concerning the PPV, only Sun et al. [15] reported a value, for the detection of the QRS complexes (PPV: 65%), which is much lower than the PPV achieved by our algorithm (PPV: 99.7%). The high PPV achieved by our algorithm shows the importance of the introduced changes for improving the detection of the characteristic waves and, ultimately, for excluding incorrectly detected characteristic waves, which is of great importance in the analysis of ECG signals.


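Table 25: Results achieved by the ECG segmentation algorithms proposed in literature (results extracted from [15] for comparison purposes).

| Algorithm | SE [%] P Onset | SE [%] P Offset | SE [%] QRS Onset | SE [%] QRS Offset | SE [%] T Onset | SE [%] T Offset | PPV [%] P | PPV [%] QRS | PPV [%] T |
|---|---|---|---|---|---|---|---|---|---|
| MMD | 97.20 | 94.80 | 100.00 | 100.00 | 99.80 | 99.60 | - | 65.00 | - |
| TD | 96.20 | 97.00 | 99.90 | 99.90 | 98.80 | 98.90 | - | - | - |
| WD | 89.90 | 89.90 | 100.00 | 100.00 | 99.10 | 99.10 | - | - | - |

| Mean error (ms) | P Onset | P Offset | QRS Onset | QRS Offset | T Onset | T Offset |
|---|---|---|---|---|---|---|
| MMD | 9.00 | 12.80 | 3.50 | 2.40 | 7.90 | 8.30 |
| TD | 10.30 | -5.70 | -7.40 | -3.60 | 23.30 | 18.70 |
| WD | 13.00 | 5.40 | 4.50 | 0.80 | -4.80 | -8.90 |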

B.1.3 Ventricular Arrhythmias

This module presents the approach followed for the assessment of ventricular arrhythmias with clinical relevance for COPD. The framework includes algorithms for the detection of ventricular arrhythmias (PVC: premature ventricular contractions, VT: ventricular tachycardia and VF: ventricular fibrillation), which are currently incorporated into the algorithms for ECG analysis.

Methods

The proposed approach assumes that the fundamental differences in the physiologic origins of sinus rhythm and PVC/VT/VF can be discriminated via time analysis of the ECG's morphology and spectral components. The set of discriminating features was determined using a correlation analysis of the most significant features found in the literature as well as new features developed during this work. These features are provided as inputs to a hierarchical NN module enabling the discrimination of specific arrhythmias. In this classifier configuration, each module discriminates only between two classes. The achievable accuracy of a classifier is highly dependent on the number of classes present in the input data; with only two classes, each classifier is able to provide a superior classification result, due to the lower complexity of the mapping function to be identified. This justified the design of different neural network classifiers with specialized tasks (PVC, VT, and VF).

PVC detection

The proposed PVC detection module considers, for each beat classification, a comparative analysis

using the ECG signal in close proximity to the current beat. It has been established that every analysis

window must contain at least 10 beats. In order to meet this constraint for real-time applications, the

length of the present analysis window is estimated based on the heart rate frequency observed in the

previous window.


For each beat in a given time window a set of 13 features (fi, i=1...13) is extracted (for a complete

review, the reader is referred to [20]). Some of the features are directly related to well defined

characteristics of PVCs: R wave length, area and centre of mass of QRS complex, T wave deflection and

amplitude, P wave absence and RR interval variability [20], as illustrated in Figure 16.

Figure 16: Features extracted directly connected to ECG characteristics. [Plot of an ECG trace over t(s), annotated with the P, Q, R, S and T waves, the RR intervals, the QS length, the centre of mass, normal QRS complexes and a PVC with no P-wave.]

The remaining features have been defined using feature extraction methods based on the

morphological derivative, spectral and information content.

Morphology Information: Two features are based on the ECG signal’s morphological derivative. It is

observed that PVC complexes exhibit a lower slope before and/or after each R peak. The slope from the

Q peak to the R peak can be measured by calculating the morphological derivative’s peak amplitudes

in this segment (QRamp, see Figure 17).

Figure 17: Comparison of amplitude differences between normal beats and PVCs morphologic derivatives. [Plot over t(s) of the ECG and its morphological derivative for a normal beat and a PVC, with the QS segments marked.]

Analogously, the slope after the R peak can be represented by the amplitude of the RS peak segment ($RS_{amp}$). An approximation to the normal beat R wave left and right slopes can be estimated by calculating the averages of QR and RS amplitudes. Let these be $\overline{QR}_{amp}$ and $\overline{RS}_{amp}$, respectively. The relation between $QR_{amp}$ and $\overline{QR}_{amp}$, and the relation between $RS_{amp}$ and $\overline{RS}_{amp}$, provide two original features, equations (18) and (19):

$f_1(i) = QR_{amp}(i) \log \frac{QR_{amp}(i)}{\overline{QR}_{amp}}$, $i = 1, \dots, nbeats$ (18)

$f_2(i) = RS_{amp}(i) \log \frac{RS_{amp}(i)}{\overline{RS}_{amp}}$, $i = 1, \dots, nbeats$ (19)

Spectral Information: Chick et al. [21] proposed that the morphology differences between the QRS complexes of PVCs and normal beats might be evaluated using frequency spectrum signatures. Namely, PVC spectra tend to be more concentrated in lower frequencies, while spectra from normal beats tend to be more dispersed. The following features are based on this observation. The entropy of each normalized QRS spectrum assesses the concentration of each spectrum. The logarithmic comparison between the entropy ($H$) and the average of all entropies ($\overline{H}$) leads to the feature presented in (20). Another feature is calculated using the Kullback–Leibler divergence ($D_{kl}$) between every normalized spectrum ($S_p$) and the average of all spectra ($\overline{S}_p$). This feature expresses the similarity between each spectrum and a spectrum that is an approximation of a normal QRS complex spectrum, according to (21).

$f_3(i) = H(i) \log \frac{H(i)}{\overline{H}}$, $i = 1, \dots, nbeats$ (20)

$f_4(i) = D_{kl}\left( S_p(i), \overline{S}_p \right)$, $i = 1, \dots, nbeats$ (21)

$D_{kl}\left( P(X), Q(X) \right) = \sum_{x \in X} P(x) \log \frac{P(x)}{Q(x)}$ (22)
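The spectral entropy and the Kullback-Leibler divergence to the mean spectrum, equations (21) and (22), can be sketched in plain Python as below. The natural logarithm, the function names and the representation of each QRS spectrum as a list of non-negative magnitudes are assumptions of this sketch.

```python
import math

def normalize(spectrum):
    """Scale a non-negative spectrum so that its values sum to 1."""
    total = sum(spectrum)
    return [v / total for v in spectrum]

def entropy(p):
    """Shannon entropy of a normalized spectrum (natural logarithm assumed)."""
    return -sum(v * math.log(v) for v in p if v > 0)

def kl_divergence(p, q):
    """Equation (22): sum over x of P(x) * log(P(x) / Q(x))."""
    return sum(pv * math.log(pv / qv) for pv, qv in zip(p, q) if pv > 0)

def spectral_features(spectra):
    """Entropy of each beat spectrum and its divergence to the mean spectrum (21)."""
    probs = [normalize(s) for s in spectra]
    mean_spectrum = [sum(col) / len(probs) for col in zip(*probs)]
    entropies = [entropy(p) for p in probs]
    f4 = [kl_divergence(p, mean_spectrum) for p in probs]
    return entropies, f4
```

A beat whose spectrum differs from the majority ends up farther from the mean spectrum and therefore has a larger divergence, which is what makes the feature discriminative when normal beats dominate the window.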

VT and VF detection

The selection of the most relevant features for VT and VF discrimination was performed through a

correlation analysis procedure. This approach took into consideration a set of available features found

in literature and developed within this work and their dependency with respect to the desired task.

Concerning temporal domain markers, five morphological features were chosen. These represent

information about the shape of the ECG signal:

a) PTABT (percentage of time above or below thresholds) is defined as the relative amount of time of

beat peaks, which are above a high threshold or below a low threshold [22]. This parameter is a

characteristic of the temporal ECG morphology: a normal ECG presents a very small PTABT and a

ventricular tachycardia/fibrillation exhibits a larger value of PTABT.

b) Another feature was based on an algorithm presented by Jekova and Krasteva [23]. Following this

approach, a particular band pass digital filter is applied to the original signal. Then, from the filtered

signal a set of time domain parameters is extracted, enabling the rhythm classification.

c) A feature comparable to the heart rate was extracted. This feature employs a nonlinear transform,

derived from multiplication of backward differences, providing an estimation of extreme variations in

the ECG [24].

d) Another feature was obtained from a two dimensional phase space reconstruction diagram, a tool

able to identify chaotic behaviour of signals. Fundamentally, if the signal is non-chaotic (normal sinus


rate), the curve in the phase space diagram showing a regular form is concentrated in a restricted region

of the plot. However, a chaotic signal (VT/VF) produces a curve that is uniformly distributed over the

entire diagram.

e) For detection of abnormal signal amplitudes and slopes, appropriate markers were implemented.

These markers were evaluated inside a specific window (10 seconds) by assessing the portions of small

and high derivatives in the ECG signal: i) the number of points close to the baseline where the derivative

is small (signal is almost horizontal) and ii) the number of points where the derivative is high (signal is

almost vertical). The baseline (bLine) as well as the respective derivative (dLine) was found. The number

of points close to the baseline (horizontalP) and the number of points, where the derivative is high

(verticalP) were computed using (23) and (24):

If $|dLine(i)| < lowT$ AND $|ecg(i) - bLine(i)| < baseT$ then $horizontalP = horizontalP + 1$ (23)

If $|dLine(i)| > highT$ then $verticalP = verticalP + 1$ (24)

Variables lowT, highT and baseT define three thresholds, which are established based on the amplitude

of the ECG signal. The number of points (horizontalP and verticalP) is evaluated for every window and

allows the estimation of the time interval where the signal is almost horizontal or vertical.
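The two counters can be sketched as below. The use of absolute values in the comparisons is an assumption of this sketch (the direction of each comparison follows from the description of small and high derivatives), and the function name and argument order are hypothetical.

```python
def count_flat_and_steep(ecg, b_line, d_line, low_t, high_t, base_t):
    """Count near-horizontal and near-vertical points in one analysis window."""
    horizontal_p = vertical_p = 0
    for sample, base, slope in zip(ecg, b_line, d_line):
        # Small derivative and close to the baseline: almost horizontal.
        if abs(slope) < low_t and abs(sample - base) < base_t:
            horizontal_p += 1
        # High derivative: almost vertical.
        if abs(slope) > high_t:
            vertical_p += 1
    return horizontal_p, vertical_p
```

Dividing each counter by the sampling frequency gives the time, within the window, during which the signal is almost horizontal or almost vertical.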

Results and Analysis

PVC detection

The PVC detection algorithm was validated using 46 of the 48 MIT-BIH database records. Records without an MLII lead configuration were removed from the training and testing datasets, preserving coherence in the morphological characteristics of the ECG records. The training dataset is composed of 1965 PVCs and 11250 normal QRS complexes from the aforementioned dataset. Validation was performed using all 46 dataset records (6595 PVCs and 95893 normal beats).

The results achieved for PVC detection are presented and compared with state-of-the-art algorithms in Table 26. The values shown for the latter are those reported by their respective authors. The sensitivity and specificity achieved by the proposed algorithm are 96.35% and 99.15%, respectively. Compared with the values of the algorithms reported in the literature, the proposed algorithm yields very accurate classification results.

Table 26: Results for PVC detection.

| Algorithm | Sensitivity [%] | Specificity [%] |
|---|---|---|
| Proposed Algorithm | 96.35 | 99.15 |
| Jekova et al. [25] | 93.30 | 97.30 |
| Christov et al. [26] | 96.90 | 96.70 |
| Christov and Bortolan [27] | 98.50 | 99.70 |

Christov and Bortolan [27] present higher sensitivity (+2.15%) and slightly higher specificity (+0.55%)

than the proposed algorithm. However, it should be noted that the algorithm proposed by these

authors is based on two ECG leads and 26 features, while the proposed algorithm is based on only one

ECG lead and a much lower number of features. Another advantage of the proposed PVC detection


module is that it is more patient-invariant than other state-of-the-art PVC algorithms, since it uses

features that rely on local relative comparisons of ECG properties instead of global absolute values.

VT and VF

To validate the VT/VF module of the algorithm, the following public databases were employed: MIT-BIH Arrhythmia Database (MIT), MIT-BIH Malignant Ventricular Arrhythmia Database (MVA) and Creighton University Ventricular Tachyarrhythmia Database (CVT, see footnote 33).

In a first phase, NN structures were trained and validated independently for each database. In a second

phase, the training was performed taking into account simultaneously all available databases. In both

cases the training data was carefully selected in order to include representative examples of the

arrhythmias under study. The validation was performed using randomly selected data from these databases.

The NNs were trained using the Levenberg-Marquardt algorithm and the number of hidden neurons

was determined experimentally.

The performance of the algorithm for VT and VF detection (MIT/MVA/CVT) is presented in Table 27. As can be observed, the detection results are higher when each database is considered independently. Applied to all databases, the method has a sensitivity of 89.3% and a specificity of 94.1%. This is mainly due to dubious annotations in some signals of the publicly available databases. For instance, in Figure 18, two ECG signals from the MVA (record 421) and CVT (record 07) databases are shown. One has been annotated as VT (Figure 18a), while the other has been annotated as normal sinus rhythm (Figure 18b). Recognizing signals of this kind is a challenge for the algorithm and should be dealt with in further studies.

Table 27: VT/VF Classification performance.

| Database | MIT | MVA | CVT | All |
|---|---|---|---|---|
| Sensitivity [%] | 99.7 | 90.7 | 91.8 | 89.3 |
| Specificity [%] | 98.8 | 95.0 | 96.9 | 94.1 |

Figure 18: Examples of incorrectly classified ECG signals (panels a and b).

33 http://www.physionet.org/physiobank/database/cudb/


B.1.4 AV Blocks

An atrioventricular block (AV block) is a type of heart block in which the conduction between the atria and ventricles of the heart does not follow a correct path, i.e., the atrial depolarizations fail to reach the ventricles or the atrial depolarization is conducted with a delay [28].

Methods

There are three types of AV block: first-degree, second-degree (Mobitz type I and Mobitz type II) and third-degree atrioventricular block. From the perspective of the patient, the first-degree AV block is typically not associated with any symptoms; the second-degree is usually asymptomatic, but some irregularities of the heartbeat can be observed by the patient; the third-degree is generally associated with symptoms such as fatigue, dizziness, and light-headedness.

From the clinical point of view, each type of AV block is typically diagnosed from the ECG analysis [28], based on the parameters associated with the conduction between the atria and ventricles: duration of the PR interval and occurrence/order of the P wave and QRS complex.

First degree (see Figure 19)

First-degree AV block is defined by the duration of the PR interval, namely a PR interval longer than 0.2 seconds. Moreover, since the conduction of the electrical activity between the atria and ventricles is only delayed, each atrial activation leads to a ventricular activation, which is designated as a 1:1 correspondence.

Figure 19: Example of an AV block – first-degree (footnote 34).

Second degree

The prolongation of the PR interval, associated with intermittent failure in the conduction of the impulses from the atria to the ventricles, can lead to missing/lost beats. Two main types of second-degree block can be distinguished: Mobitz type I AV block and Mobitz type II AV block.

Mobitz type I AV block (see Figure 20): In this case there is a progressive prolongation of the PR interval, until an atrial impulse fails to be conducted to the ventricles. As a result, a ventricular impulse is lost. After this, the electrical conduction recovers to its baseline, and the cycle can repeat.

34 https://www.acadoodle.com/index.php/atrio-ventricular-block



Typical values for characterizing this condition are: the PR interval progressively increases, being always greater than 0.12 seconds; after 3 progressively increasing PR intervals, a beat is dropped (identified by the occurrence of a P wave instead of an expected QRS complex), as depicted in Figure 20.

Additionally, since this type of block occurs in regular cycles, there is a fixed ratio between the number of P waves and the number of QRS complexes per cycle, which is frequently used to identify the block. For example, a Mobitz type I block with 4 P waves and 3 QRS complexes per cycle is designated as a "4:3" block, i.e., by its P:QRS ratio.

Figure 20: Example of a 3:2 AV block – second degree type I (footnote 34).

Second degree Mobitz type II (see Figure 21): This condition is characterized by an intermittent failure of conduction of atrial impulses to the ventricles without a progressive increase of the PR interval, i.e., the PR interval is constant. Moreover, a regular pattern relating the number of atrial impulses to a dropped ventricular activation can be recognized. For example, a ventricular beat may be missing at every second or third atrial impulse, designated as a 2:1 and a 3:1 block, respectively.

Like Mobitz type I, the Mobitz type II block occurs with a fixed P:QRS ratio, such as 3:1, 4:1 or 5:1. It should be noted that the P:QRS ratio is of the form "R:(R-1)" in a Mobitz type I block and of the form "R:1" in a Mobitz type II block. As a result, for R ≥ 3 each AV block can be designated without ambiguity and without referring to the type (a 2:1 ratio fits both forms). For example, a "3:1 Mobitz block" and a "4:3 Mobitz block" identify a type II and a type I block, respectively.

Typical values for characterizing this condition are: constant PR interval, greater than 0.12 seconds; a ventricular beat is missing after 2, 3, 4, 5 or 6 PR intervals.

Figure 21: Example of a 3:1 AV block – second degree type II (footnote 34).
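The P:QRS naming rule described above can be sketched in a few lines of Python. This is only an illustration of the "R:(R-1)" versus "R:1" convention; the function name and the returned labels are hypothetical and are not taken from the WELCOME software. Note that a 2:1 ratio matches both forms, so the sketch reports it separately.

```python
def mobitz_type_from_ratio(p_waves: int, qrs_complexes: int) -> str:
    """Map the P:QRS ratio of one cycle to a second-degree AV block type."""
    if p_waves < 2 or not 1 <= qrs_complexes < p_waves:
        raise ValueError("expected at least one QRS and more P waves than QRS")
    is_type_i_form = qrs_complexes == p_waves - 1   # ratio of the form R:(R-1)
    is_type_ii_form = qrs_complexes == 1            # ratio of the form R:1
    if is_type_i_form and is_type_ii_form:
        return "ambiguous"                          # only 2:1 matches both forms
    if is_type_i_form:
        return "Mobitz I"
    if is_type_ii_form:
        return "Mobitz II"
    return "unclassified"


if __name__ == "__main__":
    for ratio in [(4, 3), (3, 1), (2, 1)]:
        print(ratio, mobitz_type_from_ratio(*ratio))
```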


Third degree or complete heart block

This condition is characterized by the fact that there is complete failure of transmission of atrial

impulses to the ventricles. Although P waves occur regularly, they are completely unconnected to the

rhythm of the QRS complexes.

Results and Analysis

As described, all the methods for AV block detection rely on the identification of the main waves and intervals, namely the PR interval and the QRS complex. To this end, the segmentation module developed within this project was used to compute these intervals. Since the methods implemented for AV blocks are a direct implementation of the rules above (duration of the PR intervals and occurrence/order of P waves and QRS complexes), their performance is directly linked to the performance of the segmentation.

As a result, the validation of this algorithm is established directly by the segmentation performance.
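To make the rule-based nature of the detection concrete, the following Python sketch applies the PR-interval rules to the output of a segmentation step. It is a minimal illustration under stated assumptions, not the project's implementation: the input format (one PR interval in seconds per P wave, with `None` marking a P wave that is not followed by a QRS complex), the tolerance `PR_TOLERANCE_S` and the returned labels are all hypothetical. Third-degree block is not covered, since it requires testing whether P waves and QRS complexes are dissociated.

```python
from typing import Optional, Sequence

FIRST_DEGREE_PR_S = 0.20   # from the text: PR interval longer than 0.2 s
PR_TOLERANCE_S = 0.01      # assumption: jitter tolerated in a "constant" PR


def classify_av_block(pr_s: Sequence[Optional[float]]) -> str:
    """Classify a sequence of PR intervals; None marks a dropped beat."""
    conducted = [pr for pr in pr_s if pr is not None]
    if not conducted:
        return "undetermined"
    if len(conducted) == len(pr_s):                 # no dropped beat at all
        return "first-degree" if min(conducted) > FIRST_DEGREE_PR_S else "no block"

    # Collect the conducted beats that precede each dropped beat. Runs with a
    # single beat (and the trailing incomplete cycle) carry no PR trend.
    runs, run = [], []
    for pr in pr_s:
        if pr is None:
            if len(run) >= 2:
                runs.append(run)
            run = []
        else:
            run.append(pr)
    if not runs:
        return "second-degree (type undetermined)"

    if all(b - a > PR_TOLERANCE_S for r in runs for a, b in zip(r, r[1:])):
        return "second-degree Mobitz I"             # progressive prolongation
    if all(max(r) - min(r) <= PR_TOLERANCE_S for r in runs):
        return "second-degree Mobitz II"            # constant PR interval
    return "second-degree (type undetermined)"
```

Because every decision depends only on the PR values and on which P waves lack a QRS complex, any segmentation error propagates directly to the classification, which is the point made in the paragraph above.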

B.1.5 ST deviation

In this section, we describe the approach followed for the estimation of the ST segment deviation.

Methods

The algorithms implemented to evaluate ST segment deviation follow two stages. First, the ECG signal is broken into cardiac cycles and a baseline removal process is applied to each individual interval. The main goal of this step is to guarantee that the isoelectric line coincides with the zero line, to facilitate the evaluation of the ST segment shift. The second stage computes several measures of the deviation, since the literature shows a great variety of approaches to assess this ECG feature. Four measurements of ST deviation are available, so that the person analyzing the ST segment deviation has several different values to support decision making. The first three were chosen from the literature; their details are presented below, and they make use of the ECG segmentation method presented previously. In addition, a new algorithm based on the Wigner-Ville transform was developed and implemented.

Baseline removal

Based on the localization of the R peaks, the entire ECG signal is broken into cardiac cycles using the average of the distances between consecutive R peaks. Each cardiac cycle is then submitted to baseline removal using Wolf's method [29]. This method starts by determining the initial and final heights (H1 and H2) of the interval, using the average of the first five samples and the average of the last five samples, respectively. Then, the line segment connecting H1 to H2 is subtracted from the ECG, yielding a baseline-corrected signal.
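The baseline removal step can be illustrated with the short Python sketch below. It follows the description above (H1 and H2 as averages of the first and last five samples, then subtraction of the connecting line). Where exactly H1 and H2 are anchored in time is not stated in the text; the sketch assumes they sit at the centres of the two five-sample windows, which has the side effect of removing a purely linear drift exactly. The function name is hypothetical.

```python
def remove_baseline(cycle, edge=5):
    """Subtract the line joining the mean of the first and last `edge` samples."""
    n = len(cycle)
    if n < 2 * edge:
        raise ValueError("cardiac cycle too short for the edge windows")
    h1 = sum(cycle[:edge]) / edge          # initial height H1
    h2 = sum(cycle[-edge:]) / edge         # final height H2
    # Assumption: H1 and H2 are anchored at the centres of their windows.
    t1 = (edge - 1) / 2
    t2 = n - 1 - (edge - 1) / 2
    slope = (h2 - h1) / (t2 - t1)
    return [x - (h1 + slope * (i - t1)) for i, x in enumerate(cycle)]


if __name__ == "__main__":
    drifting = [1.0 + 0.5 * i for i in range(20)]   # pure linear drift
    print(max(abs(v) for v in remove_baseline(drifting)))
```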

ST segment deviation measurement


The first algorithm, proposed by Akselrod et al. [30], measures the ST amplitude at the point located 104 ms after the R peak. The second algorithm, introduced by Taddei et al. [31], considers the ST deviation 80 ms after the J point or, in case of sinus tachycardia (heart rate > 120 bpm), 60 ms after that point. This approach has the disadvantage of depending on accurate detection of the J point. The third method, used by Pang et al. [32], measures the ST segment deviation at a point that depends on the heart rate, according to the following table.

Table 28: ST segment deviation measurement: Pang et al. [32].

Heart rate (HR, bpm)   ST segment deviation measuring point
HR ≤ 100               R + 120 ms
100 < HR ≤ 110         R + 112 ms
110 < HR ≤ 120         R + 104 ms
HR > 120               R + 100 ms
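The three literature-based measuring points can be written down directly from the description and from Table 28. The Python sketch below is illustrative only; the function names and the conversion helper `sample_index` (which assumes a known sampling frequency in Hz) are hypothetical.

```python
def st_offset_akselrod_ms() -> int:
    """Offset after the R peak, in ms (Akselrod et al. [30])."""
    return 104


def st_offset_taddei_ms(heart_rate_bpm: float) -> int:
    """Offset after the J point, in ms (Taddei et al. [31])."""
    return 60 if heart_rate_bpm > 120 else 80


def st_offset_pang_ms(heart_rate_bpm: float) -> int:
    """Offset after the R peak, in ms, following Table 28 (Pang et al. [32])."""
    if heart_rate_bpm <= 100:
        return 120
    if heart_rate_bpm <= 110:
        return 112
    if heart_rate_bpm <= 120:
        return 104
    return 100


def sample_index(reference_index: int, offset_ms: float, fs_hz: float) -> int:
    """Convert an offset in ms into a sample index relative to a reference point."""
    return reference_index + round(offset_ms * fs_hz / 1000)
```

For instance, at a hypothetical sampling rate of 250 Hz the Akselrod point lies 26 samples after the R peak.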

It is recognized that time-frequency methods are especially suited to the detection of small transient characteristics hidden in the ECG, such as the ST segment. Thus, our approach for the estimation of ST deviation was based on a time-frequency analysis, in particular using the Wigner-Ville transform.

The Wigner-Ville distribution is a time-frequency representation that operates on an analytic signal. For the ECG, the analytic signal y(n) equivalent to the initial real signal x(n) was obtained by adding to the real signal its Hilbert transform H[.] as the imaginary part, (25):

$$ y(n) = x(n) + j\,H[x(n)] \tag{25} $$

The basic idea followed here is to divide the time-frequency map into characteristic areas and, within each area, to evaluate particular characteristics. With respect to ST estimation, two time bands and one frequency band were considered. The time bands are the areas to the left (isoelectric line) and to the right (ST segment) of the R peak (assumed to be previously determined). Within each time band, the aim is to locate the region where there is no signal activity: the isoelectric line (the interval between the end of the P wave and the beginning of the QRS complex) and the ST segment (the interval between the end of the QRS complex and the beginning of the T wave). Thus, for those time bands, a high-frequency band was considered and, in particular, the region where the high-frequency components present minimum values. Figure 22 depicts this idea, showing an electrocardiogram and its corresponding high-frequency time-frequency components (between 0.5 and 1.0, i.e. the upper half of the normalized range). By evaluating the minimum of the sum of the high-frequency components in each time band, the isoelectric and J points can be obtained. Having determined these points, the ST deviation is straightforwardly estimated as the difference between the J and isoelectric values.


Figure 22: Example of an electrocardiogram and the corresponding high-frequency components (Wigner-Ville transform), with the isoelectric point and the J point marked (horizontal axis: time).
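The final step of this approach (locating the isoelectric and J points as the minima of the high-frequency activity on either side of the R peak) can be sketched as follows. The sketch assumes that the per-sample sum of the high-frequency Wigner-Ville components has already been computed and is passed in as `hf_energy`; the window lengths and the function name are hypothetical, since the text does not specify the extent of the two time bands.

```python
def st_deviation(ecg, hf_energy, r_index, left_window, right_window):
    """Return (ST deviation, isoelectric index, J index) for one cardiac cycle.

    ecg        : baseline-corrected samples of the cycle
    hf_energy  : per-sample sum of the high-frequency components (precomputed)
    r_index    : index of the R peak
    *_window   : number of samples searched before / after the R peak (assumed)
    """
    left = range(max(0, r_index - left_window), r_index)
    right = range(r_index + 1, min(len(ecg), r_index + 1 + right_window))
    if not left or not right:
        raise ValueError("R peak too close to the edge of the cycle")
    iso = min(left, key=hf_energy.__getitem__)    # quietest point before R
    j = min(right, key=hf_energy.__getitem__)     # quietest point after R
    return ecg[j] - ecg[iso], iso, j
```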

Results and Analysis

A true validation could not be carried out. The available databases in this area, namely the European ST-T Database and the Long-Term ST Database, do not provide the values of the ST segment deviation, which prevents a direct comparison. These datasets were created for the evaluation of algorithms that detect or differentiate between ischemic ST episodes, axis-related non-ischemic ST episodes, etc. This is not the case of the present algorithm, which only considers discrete values of the ST segment deviation without further processing. For this reason, a correlation analysis was carried out between our method and each of the state-of-the-art methods. The average results obtained are presented in the table below.

Table 29: ST deviation correlation analysis.

Method              Correlation coefficient
Taddei’s method     0.512
Pang’s method       0.575
Akselrod’s method   0.576

Records: 'e0105', 'e0213', 'e0403', 'e0107', 'e0305', 'e0405', 'e0111', 'e0409', 'e0113', 'e0411', 'e0115', 'e0119', 'e0413', 'e0121', 'e0415', 'e0127', 'e0501', 'e0123', 'e0129', 'e0515', 'e0125', 'e0417', 'e0139', 'e0601', 'e0147', 'e0603', 'e0151', 'e0607', 'e0605', 'e0159', 'e0609', 'e0163', 'e0161', 'e0203', 'e0817', 'e0613', 'e0205', 'e0615', 'e0207', 'e0801', 'e0303', 'e0211', 'e0103', 'e0305'

B.1.6 Atrial Fibrillation using 12-lead ECG

Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia and is associated with significant morbidity, mortality and decreased quality of life, especially in the elderly, where its prevalence increases to 8% [33]. Despite not being lethal, AF is associated with an increased risk of heart failure, dementia, and stroke. AF results from multiple re-entrant wavelets in the atria, which lead to their partial disorganization. It can be recognized by the absence of the P-wave before the QRS-complex, which is replaced by a “sawtooth” shaped wave, by the appearance of an irregular cardiac frequency, or by both. The proposed algorithm is based on the assessment of these characteristics using a 12-lead ECG approach. In this algorithm, the atrial activity is retrieved using ICA and is then used to extract discriminant features of AF episodes.

Several methods have been proposed in the literature to detect AF using single-lead or multi-lead ECG signals. To assess the irregularity of the heart rhythm, methods based on the analysis of Hidden Markov Model transition probabilities [34], linear and non-linear analysis of auto-regressive (AR) models [35] and histogram-based statistical analysis [36] have been proposed. To extract the atrial activity (AA), two main directions have been followed: i) single-lead ECG analysis and ii) multi-lead ECG analysis. Techniques such as blind source separation, spatio-temporal cancellation and artificial neural networks are the most promising in these two research fields. In single-lead ECG analysis, the main approaches for QRS-T cancellation are based on wavelet transforms [37, 38] and template-based approaches (e.g. [39]), while in multi-lead ECG analysis the main methods are based on blind source separation techniques, such as independent component analysis (ICA) [40]. In our previous work [41] we proposed an algorithm for the detection of AF based on single-lead ECG analysis, combining features assessed from heart rate (HR) and atrial activity.

Methods

The proposed method consists of a noise detection phase, where the ECG signals are analyzed in order to detect the segments with noise; a feature extraction phase, where the ECG signals are processed and analyzed in order to extract relevant features; and a classification phase, responsible for the discrimination between AF and non-AF episodes.

In the first phase, the segments in each ECG lead that contain noise are detected using a rule-based strategy aimed at excluding the ECG segments that are not suitable for further analysis. The adopted rules are specified below:

1) The MLII ECG lead must not contain noise.

2) The maximum number of leads with noise must not be greater than three.

The rationale behind these rules is that one of the main feature extraction steps relies on the analysis of the MLII lead to detect the ECG characteristic points. Noise in this particular lead can therefore negatively affect the extraction of features and, consequently, the discrimination of AF episodes. Additionally, the maximum number of leads with noise was set to three in order to guarantee the correct separation of the atrial activity from the ventricular activity and the subsequent extraction of features during the atrial activity analysis. The ECG segments that comply with these rules are subject to further analysis, while the remaining segments are excluded.
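The two exclusion rules amount to a simple predicate over the set of leads flagged as noisy in a segment. The Python sketch below is a minimal illustration; how a lead is flagged as noisy is outside its scope, and the function and parameter names are hypothetical.

```python
def segment_is_usable(noisy_leads, reference_lead="MLII", max_noisy_leads=3):
    """Apply rules 1) and 2): clean reference lead, at most three noisy leads."""
    noisy = set(noisy_leads)
    return reference_lead not in noisy and len(noisy) <= max_noisy_leads


if __name__ == "__main__":
    print(segment_is_usable({"V1", "V2"}))          # kept for further analysis
    print(segment_is_usable({"MLII"}))              # excluded by rule 1
    print(segment_is_usable({"V1", "V2", "V3", "V4"}))  # excluded by rule 2
```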

In the feature extraction phase, the first step of the proposed algorithm is the segmentation of the MLII-lead ECG signal, i.e. the detection of its characteristic waves (P-wave, QRS-complex and T-wave) using an algorithm similar to the one proposed in [15].

Heart rate analysis

In heart rate (HR) analysis the main objective is the extraction of features that quantify the regularity of the RR intervals in the ECG. To this end, the RR sequence was modelled as a Markov process (see Figure 1) with three states [34]: short (S1), regular (S2) and long (S3) RR intervals.

From the transition probabilities between the states, we constructed a transition probability (TrP) matrix, which characterizes the regularity (or irregularity) of the heart cycles. The probability of state S2 and the probability of transition from S2 to S2 quantify the regularity of the heart rate and were defined as the first two features (F1 and F2).


$$ F_1 = P(S_2) \tag{26} $$

$$ F_2 = P(S_i, S_j) = P(S_j \mid S_i) \times P(S_i) \tag{27} $$

where i=2 and j=2 are the labels corresponding to the second state (regular RR interval).

From the analysis of the TrP matrix we found that AF and non-AF episodes present very characteristic distributions. Based on this finding, we propose assessing the dispersion of the TrP matrix by measuring its entropy (H), as defined in (28).

$$ F_3 = \sum_{i=1}^{3} P(S_i) \times \sum_{j=1}^{3} P(S_j \mid S_i) \times \log_2 P(S_j \mid S_i) \tag{28} $$
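The three heart-rate features can be computed from a labelled RR sequence as sketched below in Python. Several details are assumptions made only for illustration: the thresholds used to label an RR interval as short, regular or long (here 85% and 115% of the mean RR), the estimation of P(S_i) from the origins of the observed transitions, and all function names. F3 is evaluated exactly as printed in (28), without a leading minus sign, so its value is at most 0.

```python
import math
from collections import Counter


def rr_states(rr_s, low=0.85, high=1.15):
    """Label each RR interval: 0 = short (S1), 1 = regular (S2), 2 = long (S3).

    Assumption: thresholds are expressed relative to the mean RR interval.
    """
    mean_rr = sum(rr_s) / len(rr_s)
    return [0 if rr < low * mean_rr else 2 if rr > high * mean_rr else 1
            for rr in rr_s]


def markov_features(states, n_states=3):
    """Return (F1, F2, F3) from the transition counts of a state sequence."""
    pairs = Counter(zip(states, states[1:]))        # counts of S_i -> S_j
    total = sum(pairs.values())
    if total == 0:
        raise ValueError("at least two states are needed")
    rows = [sum(pairs[(i, j)] for j in range(n_states)) for i in range(n_states)]
    f1 = rows[1] / total                            # P(S2)
    f2 = pairs[(1, 1)] / total                      # P(S2, S2) = P(S2|S2) P(S2)
    f3 = 0.0
    for i in range(n_states):
        for j in range(n_states):
            if pairs[(i, j)]:
                p_cond = pairs[(i, j)] / rows[i]    # P(S_j | S_i)
                f3 += (rows[i] / total) * p_cond * math.log2(p_cond)
    return f1, f2, f3
```

A perfectly regular rhythm gives F1 = F2 = 1 and F3 = 0, while an irregular sequence spreads probability mass over the TrP matrix and moves F3 away from zero.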

Additionally, the similarity between a probabilistic distribution under analysis and a model

representative of AF episodes (collected from MIT-BIH Atrial Fibrillation database) was also assessed

using the Kullback–Leibler divergence, as defined in (29).

$$ F_4 = \sum_{x=1}^{3} \sum_{y=1}^{3} P(x, y) \log\left( \frac{P(x, y)}{P_{AF}(x, y)} \right) \tag{29} $$

where 𝑃𝐴𝐹(𝑥, 𝑦) is the defined AF model and 𝑃(𝑥, 𝑦) is the distribution under analysis. More about

these features can be found in [41].
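Equation (29) is a Kullback–Leibler divergence between two 3×3 joint distributions and can be sketched as follows. The AF model matrix used in the usage lines is a made-up placeholder, not the model derived from the MIT-BIH Atrial Fibrillation database; the small floor `eps` that protects against zero entries in the model is also an assumption.

```python
import math


def kl_divergence(p, p_af, eps=1e-12):
    """Sum of P(x,y) * log(P(x,y) / P_AF(x,y)) over all matrix entries."""
    total = 0.0
    for row_p, row_q in zip(p, p_af):
        for a, b in zip(row_p, row_q):
            if a > 0:                       # terms with P(x,y) = 0 contribute 0
                total += a * math.log(a / max(b, eps))
    return total


if __name__ == "__main__":
    # Hypothetical joint distributions, for illustration only.
    af_model = [[0.15, 0.10, 0.10], [0.10, 0.10, 0.10], [0.10, 0.10, 0.15]]
    regular = [[0.00, 0.02, 0.00], [0.02, 0.92, 0.02], [0.00, 0.02, 0.00]]
    print(kl_divergence(af_model, af_model))   # identical distributions
    print(kl_divergence(regular, af_model))    # far from the AF model
```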

P-wave detection

The first step in the analysis of the atrial activity is to search for the presence of P-waves before the QRS complex. While during non-AF episodes the P-waves are commonly distinguishable, during AF episodes the P-waves are replaced by a “sawtooth”-like waveform resulting from the fibrillatory process. To correctly evaluate the presence or absence of P-waves, the Pearson correlation coefficient (ρ) is calculated between the segmented P-waves and a P-wave model, and the rate of P-waves per window (F5) is assessed by:

$$ F_5 = R_{Pwaves} = \frac{N_{SP}}{N_{RR}} \tag{30} $$

where NSP is the number of selected P waves (with ρ greater than 0.2) and NRR is the number of cardiac

cycles detected in the analysed window.
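A minimal Python sketch of (30) is given below. It assumes that each segmented P-wave candidate has already been resampled to the same length as the P-wave model, which the text does not specify; the function names and the toy waveforms in the usage lines are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0                      # a flat segment carries no shape information
    return sxy / math.sqrt(sxx * syy)


def p_wave_rate(candidates, model, n_rr, rho_min=0.2):
    """F5: number of candidates with rho > rho_min divided by the number of cycles."""
    n_selected = sum(1 for c in candidates if pearson(c, model) > rho_min)
    return n_selected / n_rr


if __name__ == "__main__":
    model = [0, 1, 2, 1, 0]
    candidates = [[0, 2, 4, 2, 0], [2, 1, 0, 1, 2], [0, 1, 2, 1, 0]]
    print(p_wave_rate(candidates, model, n_rr=4))
```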

Atrial activity analysis

The third main characteristic of AF is the lack of coordination of the atrial activation, which results from the disorganization in the path of the electrical impulses in the atria. In the ECG, the result is the replacement of the commonly seen P-waves by fibrillatory waves, with typical frequencies ranging from 5 to 8 Hz (herein defined as AFregion). Moreover, the spectrum of AF episodes presents no harmonics and the amplitudes above 15 Hz are minimal [40].

In order to analyse this process, it is essential to retrieve the signal components related to the AA, i.e., to cancel or extract the QRS complex and the T wave (QRST) from the analysed signals. To recover the atrial components of the ECGs, we applied independent component analysis (ICA) as proposed in [40]. After the separation process is concluded, the components corresponding to the AA are summed into a single AA source and the power spectral density (PSD) is estimated using Welch's (WOSA) method.

From the analysis of the estimated PSDs, five features were extracted. Although AF episodes are characterized by a peak in the spectrum within the AFregion, occasionally, due to complications in the separation process or in the peak detection, no peak is found within this region. Therefore, the first AA feature (F6) was defined as the distance from the maximum peak of the spectrum to the frequency interval characteristic of AF episodes, i.e. 5 to 8 Hz.

In contrast with AF spectra, which present a very characteristic shape with a major peak in the AFregion, non-AF episodes present a spectrum dispersed along a wider frequency range. This observation led to the definition of two more AA features, which are the entropy of the spectrum (F7) and the Kullback–Leibler divergence between the spectrum and a generalized bell-shaped membership function (F8):

$$ f(x, a, b, c) = \frac{1}{1 + \left| \dfrac{x - c}{a} \right|^{2b}} \tag{31} $$

where the parameters a=2, b=6 and c=6 control the shape and position of the function in the AFregion.

Let P(w) be the spectrum under analysis and Q(w) be the aforementioned bell-shaped function, the

features F7 and F8 are defined as follows:

$$ F_7 = - \sum_{w \in W} P(w) \times \log_2 P(w) \tag{32} $$

$$ F_8 = - \sum_{w \in W} P(w) \log \frac{P(w)}{Q(w)} \tag{33} $$

where w is the frequency bin and W is the range of spectrum frequencies.
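The bell-shaped reference (31) and the two spectral features (32) and (33) can be sketched in Python as below. The sketch assumes that both the PSD and the reference function are normalised to sum to one over the frequency bins before the features are evaluated, which the text does not state explicitly; F8 keeps the leading minus sign as printed in (33). Function names and the frequency grid in the usage lines are hypothetical.

```python
import math


def gbell(x, a=2.0, b=6.0, c=6.0):
    """Generalized bell-shaped membership function (31), centred on the AFregion."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def normalise(values):
    """Scale non-negative values so that they sum to one (assumed preprocessing)."""
    s = sum(values)
    return [v / s for v in values]


def spectral_entropy(p):
    """F7 as in (32); bins with zero power contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def spectral_divergence(p, q, eps=1e-12):
    """F8 as printed in (33), including the leading minus sign."""
    return -sum(x * math.log(x / max(y, eps)) for x, y in zip(p, q) if x > 0)


if __name__ == "__main__":
    freqs = [0.5 * k for k in range(1, 41)]            # 0.5 ... 20 Hz
    q = normalise([gbell(f) for f in freqs])
    flat = normalise([1.0] * len(freqs))               # dispersed, non-AF-like
    print(spectral_entropy(q), spectral_entropy(flat))
    print(spectral_divergence(q, q), spectral_divergence(flat, q))
```

With a = 2 and c = 6 the function equals 0.5 at 4 Hz and 8 Hz, and the large exponent (b = 6) makes it nearly flat between those frequencies and steep outside them.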

Additionally, the dispersion of the spectrum was also assessed by the number of spectrum peaks above half the height of the maximum peak (F9) and by the weight of the main peak spectrum frequencies (F10), the latter as defined in (34).

$$ F_{10} = \frac{\sum_{w \in W} P(w) \times Q(w)}{\sum_{w \in W} P(w)} \tag{34} $$

where WP is the range of frequencies corresponding to the main spectrum peak.

To assess the weight of the spectrum frequencies above 15 Hz, the last feature (F11) was defined as in (35):

$$ F_{11} = \frac{\sum_{w > 15} P(w)}{\sum_{w \in W} P(w)} \tag{35} $$
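Two of the simpler spectral features, F6 (distance of the dominant peak to the 5 to 8 Hz interval) and F11 (35), can be sketched as follows. The sketch takes the global maximum bin of the PSD as "the maximum peak", which is an assumption, since the peak detection procedure is not described; function names and the example spectrum are hypothetical.

```python
def peak_distance_to_af_band(freqs_hz, psd, band=(5.0, 8.0)):
    """F6: distance in Hz from the dominant spectral bin to the AF band (0 if inside)."""
    k = max(range(len(psd)), key=psd.__getitem__)   # assumption: global maximum bin
    f_peak = freqs_hz[k]
    if f_peak < band[0]:
        return band[0] - f_peak
    if f_peak > band[1]:
        return f_peak - band[1]
    return 0.0


def high_frequency_weight(freqs_hz, psd, cutoff_hz=15.0):
    """F11 as in (35): share of spectral power above the cutoff frequency."""
    return sum(p for f, p in zip(freqs_hz, psd) if f > cutoff_hz) / sum(psd)


if __name__ == "__main__":
    freqs = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
    psd = [1, 1, 2, 8, 2, 1, 1, 1, 2, 1]
    print(peak_distance_to_af_band(freqs, psd))
    print(high_frequency_weight(freqs, psd))
```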

In Figure 23 we illustrate the main characteristics of the AF and non-AF spectra and the rationale behind the extracted features.


Figure 23: Spectra of AF and non-AF episodes and corresponding extracted features.

The classification between AF and non-AF episodes was performed on a 10 second window basis using

a support vector machine classification model (C-SVC algorithm) with a radial basis function.

Results and analysis

In this study, AF and non-AF episodes from 12 patients were considered. Of those, 1 episode (2 records of 30 min) was selected from the “St.-Petersburg Institute of Cardiological Technics 12-lead Arrhythmia Database” and 11 episodes (11 records of 60 min) were selected from the 12-lead ECG database collected by our team under the project “Cardiorisk - Personalized Cardiovascular Risk Assessment through Fusion of Current Risk Tools”.

The selected records were partitioned into records of 5 min, leading to a dataset of 144 records of 5 min length, of which 72 records present AF and 72 records present rhythms other than AF.

The selection of the features most suitable for the detection of AF episodes was based on the F-score metric. A ROC analysis was performed for each feature using a 6-fold cross-validation approach, leading to the selection of eight features. The best features were extracted from the HR analysis (F4 and F3), followed by three features from the AA analysis (F6, F8 and F10). Three features from the HR and AA analyses (F2, F9, F11) presented an F-score below 50% and were therefore not selected.

The validation of our algorithm was performed using a 6-fold cross-validation approach, where the dataset was randomly partitioned into 6 equal-size subsets. Of the 6 subsets, 5 were used for training (with episodes from 10 patients) and 1 (with episodes from 2 patients) was used for testing. The cross-validation process was repeated 6 times, so that each of the 6 subsets was used once for testing. This whole process was repeated 20 times, and the average and standard deviation (avg ± std) of the sensitivity (SE), specificity (SP) and positive predictive value (PPV) were evaluated.
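The validation protocol can be outlined with the Python sketch below. It only illustrates the bookkeeping (patient-wise folds, the three metrics, and the avg ± std summary); the classifier itself is omitted. Splitting by patient identifier is inferred from the "10 patients / 2 patients" description, the use of the sample standard deviation is an assumption, and all names are hypothetical.

```python
import random
import statistics


def patient_folds(patient_ids, n_folds=6, seed=0):
    """Randomly split the distinct patients into n_folds groups of equal size."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    return [ids[k::n_folds] for k in range(n_folds)]


def se_sp_ppv(tp, fn, tn, fp):
    """Sensitivity, specificity and positive predictive value, in percent."""
    return 100 * tp / (tp + fn), 100 * tn / (tn + fp), 100 * tp / (tp + fp)


def summarise(values):
    """avg and std of a metric over repetitions (sample std is an assumption)."""
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    for seed in range(3):                       # the text uses 20 repetitions
        print(patient_folds(range(1, 13), seed=seed))
    print(se_sp_ppv(tp=9, fn=1, tn=8, fp=2))
```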

Table 30 presents the results achieved by the single-lead [41] and multi-lead algorithms in the testing subsets. The multi-lead algorithm outperformed the single-lead algorithm. The analysis of the AA recovered from 12-lead source separation provided relevant features that increased the algorithm’s SE by approximately 9%, its SP by 1% and its PPV by 4%. These results show that source separation techniques such as ICA can provide valuable insight into the AA and enable the extraction of reliable features for AF detection.

Table 30: Results achieved by the proposed multi-lead and single-lead AF detection algorithms.

Algorithm                    Sensitivity [%]   Specificity [%]   PPV [%]
                             avg ± std         avg ± std         avg ± std
Single-lead algorithm [41]   79.0 ± 3.0        91.4 ± 0.5        86.6 ± 2.2
Multi-lead algorithm         88.5 ± 1.4        92.9 ± 0.3        90.6 ± 1.4

B.1.7 References

[1] A.B.M.A. Hossain and M.A. Haque, “Analysis of Noise Sensitivity of Different ECG Detection Algorithms,” vol. 3, no. 3, 2013.

[2] I. Jekova, A. Cansell and I. Dotsinsky, “Noise sensitivity of three surface ECG fibrillation detection algorithms,” Physiological Measurement, vol. 22, no. 2, pp. 287–297, 2001.

[3] M. Rahman, R. Shaik, and D. Reddy, “Noise Cancellation in ECG Signals using Computationally Simplified Adaptive Filtering Techniques: Application to Biotelemetry,” Signal Process. An …, vol. 3, no. 5, pp. 120–131, 2009.

[4] R. Sivakumar, R. Tamilselvi and S. Abinaya, “Noise Analysis & QRS Detection in ECG Signals,” 2012 Int. Conf. Comput. Technol. Sci. (ICCTS 2012), vol. 47, no. Iccts, pp. 141–146, 2012.

[5] C. So-In, C. Phaudphut and K. Rujirakul, “Real-Time ECG Noise Reduction with QRS Complex Detection for Mobile Health Services,” Arab. J. Sci. Eng., 2015.

[6] H. Yoon, H. Kim, S. Kwon and K. Park, “An Automated Motion Artifact Removal Algorithm in Electrocardiogram Based on Independent Component Analysis,” Fifth Int. Conf. eHealth, Telemedicine, Soc. Med., no. c, pp. 15–20, 2013.

[7] I. Jekova, V. Krasteva, I. Christov and R. Abächerli, “Threshold-based system for noise detection in multilead ECG recordings,” Physiol. Meas., vol. 33, no. 9, pp. 1463–1477, 2012.

[8] R. Kher, D. Vala and T. Pawar, “Detection of Low-pass Noise in ECG Signals,” no. May, pp. 3–6, 2011.

[9] Y. Kishimoto, Y. Kutsuna and K. Oguri, “Detecting motion artifact ECG noise during sleeping by means of a tri-axis accelerometer,” Annu. Int. Conf. IEEE Eng. Med. Biol. - Proc., pp. 2669–2672, 2007.

[10] J. Lee, D.D. McManus, S. Merchant and K.H. Chon, “Automatic motion and noise artifact detection in holter ECG data using empirical mode decomposition and statistical approaches,” IEEE Trans. Biomed. Eng., vol. 59, no. 6, pp. 1499–1506, 2012.

[11] A. Mincholé, L. Sörnmo and P. Laguna, “ECG-based detection of body position changes using a Laplacian noise model,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS, vol. 14, pp. 6931–6934, 2011.

[12] P. Raphisak, S.C. Schuckers and A. de Jongh Curry, “An algorithm for EMG noise detection in large ECG data,” Comput. Cardiol. 2004, vol. 1, no. 1, pp. 369–372, 2004.

[13] J. Pan and W.J. Tompkins, “A real-time QRS detection algorithm.,” IEEE Trans. Biomed. Eng., vol. 32, no. 3, pp. 230–236, 1985.

[14] C. Ahlström, "Nonlinear phonocardiographic Signal Processing," Institutionen för medicinsk teknik, 2008.

[15] Y. Sun, K. Chan, and S. Krishnan, "Characteristic wave detection in ECG signal using morphological transform," BMC Cardiovascular Disorders, vol. 5, pp. 1-7, 2005.

[16] C. Li, C. Zheng, and C. Tai, "Detection of ECG characteristic points using wavelet transforms," Biomedical Engineering, IEEE Transactions on, vol. 42, pp. 21-28, 1995.

[17] J. P. Martinez, S. Olmos, and P. Laguna, "Evaluation of a wavelet-based ECG waveform detector on the QT database," in Computers in Cardiology 2000, 2000, pp. 81-84.

[18] Z.-E. Hadj Slimane and A. Naït-Ali, "QRS complex detection using Empirical Mode Decomposition," Digital Signal Processing, vol. 20, pp. 1221-1228, 2010.

[19] I. K. Daskalov and I. I. Christov, "Automatic detection of the electrocardiogram T-wave end," Medical & Biological Engineering & Computing, vol. 37, pp. 348-353, 1999.

[20] R. Couceiro, P. Carvalho, J. Henriques and M. Antunes, “On the detection of premature ventricular contractions”, IEEE EMBS, 2008.

[21] M Chikh, N. Gbelgacem and F. Reguig, “The use of artificial Neural networks to detect PVC beats”, Lab. de Génie Biomédical. Dép. d’électronique, Univ. Abou Bekr Belkaïd, 2003.

[22] L. Tian and J. Tompkins, “Time domain based algorithm for detection of ventricular fibrillation“, Proceedings of the 19 Int. Conference IEEE/EMBS Oct 30-Nov 2, Chicago, USA, 1997.

[23] I. Jekova, and V. Krasteva, “Real time detection of ventricular fibrillation and tachycardia”, Physiol. Meas. 25, 1167–1178, 2004.


[24] U. Kunzmann, G. Schochlin and A. Bolz, “Parameter extraction of ECG signals in real-time”. Biomed Tech (Berl). 4, 2:875-8, 2002.

[25] I. Jekova, G. Bortolan and I. Christov, “Pattern Recognition and Optimal Parameter Selection in Premature Ventricular Contraction Classification”, IEEE Computers in Cardiology 2004; 31: 357-360.

[26] I. Christov, I. Jekova and G. Bortolan, “Premature ventricular contraction classification by Kth nearestneighbours rule”, Physiologic Measurements 2006; 24:123–130.

[27] I. Christov and G. Bortolan, “Ranking of pattern recognition parameters for premature ventricular contractions classification by neural networks”, Physiologic Measurements 2004; 25: 1281-1290.

[28] M. Gertsch, The ECG: A two-step approach to diagnosis, Springer, 2004.

[29] A. Wolf, Automatic Analysis of Electrocardiogram Signals using Neural networks, (in Portuguese), PUC-Rio, Ms. Thesis, nº 0210429/CA2004, 2004.

[30] S. Akselrod, M. Norymberg, I. Peled, E. Karabelnik, M.S. Green. “Computerised Analysis of ST Segment Changes in Ambulatory Electrocardiograms”, Medical and Biological Engineering and Computing, v. 25, p. 513-519, 1987.

[31] A. Taddei, Distante G, Emdin M, Pisani P, Moody G B, Zeelenberg C and Marchesi C, The European ST Database: standard for evaluating systems for the analysis of ST-T changes in ambulatory electrocardiography Eur. Heart J. 13 1164–72, 1992.

[32] L. Pang, I. Tchoudovski, A. Bolz, M. Braecklein, K. Egorouchkina and W. Kellermann, Real time heart ischemia detection in the smart home care system 27th Annu. Int. Conf. Eng. Med. Biol. Soc., 2005. IEEE-EMBS 2005.

[33] R.C. Davis, F.D.R. Hobbs, J.E. Kenkre, A.K. Roalfe, R. Iles, G.Y.H. Lip and M.K. Davies, "Prevalence of atrial fibrillation in the general population and in high-risk groups: the ECHOES study," EP Europace, vol. 14, pp. 1553-1559, 2012.

[34] G.B. Moody and R.G. Mark, "A new method for detecting atrial fibrillation using R-R intervals.," in Computers in Cardiology, 1983, pp. 227-230.

[35] S. Cerutti, L.T. Mainardi, A. Porta and A.M. Bianchi, "Analysis of the dynamics of RR interval series for the detection of atrial fibrillation episodes," in Computers in Cardiology 1997, 1997, pp. 77-80.

[36] K. Tateno and L. Glass, "A method for detection of atrial fibrillation using RR intervals," in Computers in Cardiology 2000, 2000, pp. 391-394.

[37] L. Senhadji, F. Wang, A. Hernandez and G. Carrault, "Wavelets extrema representation for QRS-T cancellation and P wave detection," in Computers in Cardiology, 2002, 2002, pp. 37-40.

[38] C. Sanchez, J. Millet, J.J. Rieta, F. Castells, J. Rodenas, R. Ruiz-Granell et al., "Packet wavelet decomposition: An approach for atrial activity extraction," in Computers in Cardiology, 2002, 2002, pp. 33-36.

[39] S. Shkurovich, A.V. Sahakian and S. Swiryn, "Detection of atrial activity from high-voltage leads of implantable ventricular defibrillators using a cancellation technique," Biomedical Engineering, IEEE Transactions on, vol. 45, pp. 229-234, 1998.

[40] J.J. Rieta, F.Castells, C. Sanchez, V. Zarzoso and J. Millet, "Atrial activity extraction for atrial fibrillation analysis using blind source separation," Biomedical Engineering, IEEE Transactions on, vol. 51, pp. 1176-1186, 2004.

[41] R. Couceiro, P. Carvalho, J. Henriques, M. Antunes, M. Harris, and J. Habetha, "Detection of atrial fibrillation using model-based ECG analysis," in ICPR 2008. 19th International Conference on Pattern Recognition, 2008., 2008, pp. 1-5.

B.2 Lung Sounds

B.2.1 Data Collection

An acquisition protocol was implemented to support the design of the pulmonary sound processing algorithms, in collaboration with the General Hospital of Thessaloniki ‘G. Papanikolaou’ and the General Hospital of Imathia (Health Unit of Naoussa), Greece. The protocol includes the collection of chest sounds from 30 patients at six recording sites. For each site, lung sounds as well as cough and speech were acquired. The ethical committee of the General Hospital of Thessaloniki ‘G. Papanikolaou’ authorized the data acquisition.

Auscultation was performed with the participants in a sitting position, using six channels that were set

in different positions: four in the back and two in the front of the chest (Figure 24). For each volunteer

we selected the data acquired from the two positions where the adventitious sounds/normal sounds

were better heard.


Figure 24: Potential positions for the acquisition of sounds (red). For each volunteer we selected the data acquired from the two positions where the adventitious sounds/normal sounds were better heard.

The data were acquired at 4 kHz using a 3M Littmann electronic stethoscope (model 3200), which complies with the EMC requirements of IEC 60601-1-2. The acquisitions were done with the volunteers in the sitting position. In order to evaluate the performance of the algorithms with a lower sampling rate (for compatibility with the vest requirements), the data were also downsampled to 2 kHz.

All the recorded data are being annotated by pulmonologists. So far, data from 9 patients and 3 healthy volunteers have been annotated and were used in the experimental study. The study will be complemented with the remaining annotations as soon as they are available.

As for cough, during the acquisition, the volunteers were asked to simulate cough and then to count

from one to ten. The physicians who supervised the acquisition annotated the different events in the

timeline and we assigned them to three classes: (1) cough, (2) speech, and (3) other. Cough and speech

periods are the predominant events in the cough sub-dataset. In total, 343 cough events were

annotated.

113 wheezes were annotated in the temporal space. Using this information, the frames in the spectrogram space were annotated as containing or not containing wheezes. During the selection of the most suitable features for wheeze detection, data from four patients with episodes of forced cough and speech were also used, as mentioned above. Although these events will be detected in the previous stages of the data processing workflow, these additional data help to improve the robustness of the algorithm.

199 crackle events were also annotated in the temporal space. Since crackles can appear individually or in groups, the algorithm developed in this study aims to detect crackle events. Frames were annotated as containing or not containing crackles. Neighboring frames (at most 5 frames apart; each frame has a duration of 128 ms and an overlap of 75%) were grouped and considered to belong to the same crackle event.
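
The grouping rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name is hypothetical, and "maximum frame distance of 5" is assumed to mean that the indices of two consecutive flagged frames differ by at most 5. The frame geometry is derived from the stated 128 ms duration and 75% overlap at 4 kHz.

```python
def group_frames_into_events(positive_frames, max_distance=5):
    """Merge flagged frame indices into events.

    Assumption: two flagged frames belong to the same event when their
    indices differ by at most `max_distance`.
    """
    events = []
    for idx in sorted(positive_frames):
        if events and idx - events[-1][-1] <= max_distance:
            events[-1].append(idx)   # close enough: extend the current event
        else:
            events.append([idx])     # too far: start a new event
    return events


# Frame geometry implied by the text at the 4 kHz sampling rate.
FS_HZ = 4000
FRAME_LEN = 128 * FS_HZ // 1000      # 128 ms -> 512 samples
HOP = FRAME_LEN // 4                 # 75% overlap -> hop of 128 samples

events = group_frames_into_events([3, 4, 9, 20, 21, 40])
print(len(events), events)
```

With these hypothetical frame indices, frames 3, 4 and 9 form one event, frames 20 and 21 a second one, and frame 40 a third.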

B.2.2 Sound signal quality assessment

The goal of this algorithm is to assess the quality of the audio signals before the rest of the sound processing algorithms start. In its current version, the algorithm is quite simple: given a number of thresholds (related to amplitude and length), it finds silent and saturated segments, and it outputs the useful segments of the signal. The default thresholds are shown in Table 31. A segment is considered silent or saturated if it reaches both the amplitude and the length thresholds (e.g. a segment is silent if it consists of consecutive samples with absolute amplitude ≤ 0.1% and lasts ≥ 2 ms).

We are presently working on a more general lung sound noise detection approach.


Table 31: Default thresholds.

                 Silent   Saturated
Amplitude (%)    0.1      99.9
Length (ms)      2        1
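
A minimal sketch of the thresholding described in this section is given below. It is illustrative only: the function names are hypothetical, and the amplitude percentages are assumed to be relative to full scale, with samples normalized to the range [-1, 1].

```python
def find_runs(flags, min_len):
    """Return (start, end) pairs, end exclusive, of runs of True lasting >= min_len."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(flags) - start >= min_len:
        runs.append((start, len(flags)))
    return runs


def useful_segments(x, fs_hz, silent_amp=0.001, silent_ms=2.0,
                    saturated_amp=0.999, saturated_ms=1.0):
    """Return the (start, end) sample ranges that are neither silent nor saturated.

    Assumption: x is normalized to [-1, 1], so 0.1% -> 0.001 and 99.9% -> 0.999.
    """
    silent_min = max(1, round(silent_ms * fs_hz / 1000))
    saturated_min = max(1, round(saturated_ms * fs_hz / 1000))
    silent_runs = find_runs([abs(v) <= silent_amp for v in x], silent_min)
    saturated_runs = find_runs([abs(v) >= saturated_amp for v in x], saturated_min)
    bad = [False] * len(x)
    for start, end in silent_runs + saturated_runs:
        for i in range(start, end):
            bad[i] = True
    return find_runs([not b for b in bad], 1)


# Toy signal at 4 kHz: 8 silent samples (2 ms) and 4 saturated samples (1 ms).
x = [0.5] * 10 + [0.0] * 8 + [0.5] * 5 + [1.0] * 4 + [0.5] * 3
segments = useful_segments(x, fs_hz=4000)
print(segments)
```

Runs shorter than the length threshold are kept, so isolated zero crossings or single clipped samples do not split the signal.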

B.2.3 Cough Detection

Cough is a respiratory reflex characterized by a sudden expulsion of air at high velocity, accompanied by a transient sound of varying pitch and intensity [1]. It is a natural defense mechanism that protects the respiratory tract and one of the most common symptoms of pulmonary disease [2]. It is also the most common symptom for which patients seek medical advice [3]. A cough can be characterized by an initial contraction of the expiratory muscles against a closed glottis, followed by a violent expiration as the glottis opens suddenly [4]. Currently, no standard method for automatically evaluating coughs has been established, even though a variety of approaches have been reported in the literature [1, 5]. Here we describe a method for the automatic recognition and counting of coughs solely from sound recordings, which ideally removes the need for trained listeners.

Methods

Algorithm overview

Figure 25 outlines the classification process. First, we perform a pre-processing step, where we apply an 8th-order high-pass filter at 80 Hz, followed by normalization, after which near-silent segments are discarded (through a process similar to the sound signal quality assessment). Figure 26 shows the audio signal before and after pre-processing.

Then, we compute the magnitude spectrum in frames of 46 ms with an overlap of 80%. Next, a set of features (described in the next paragraphs) is extracted from each peak, as explained in [6]. Finally, each segment is classified as either cough, speech or other.

So far, as the dataset's recordings are mono, there was no test with multi-channel synchronous

recordings. Nevertheless, the algorithm takes into account this possibility and only processes the

channels that are not correlated with others, i.e., if the absolute correlation coefficient of C1 and C2 is

above a threshold (default=0.75), C2 is not processed and the results of C1 are copied to C2.
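
The channel-skipping rule can be sketched as below. This is a hedged illustration: the function names are hypothetical, and the Pearson coefficient is assumed as the correlation measure, since the text only speaks of an "absolute correlation coefficient".

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def result_sources(channels, threshold=0.75):
    """Map each channel index to the channel whose results it uses.

    A channel is processed (maps to itself) unless it is correlated, above
    `threshold` in absolute value, with an already processed channel.
    """
    sources, processed = {}, []
    for i, channel in enumerate(channels):
        sources[i] = i
        for j in processed:
            if abs(pearson(channels[j], channel)) > threshold:
                sources[i] = j       # copy the results of channel j
                break
        if sources[i] == i:
            processed.append(i)
    return sources


c1 = [0, 1, 0, -1, 0, 1, 0, -1]
c2 = [2 * v for v in c1]             # scaled copy of c1: fully correlated
c3 = [1, 0, -1, 0, 1, 0, -1, 0]      # quarter-period shift: uncorrelated with c1
sources = result_sources([c1, c2, c3])
print(sources)
```

In this toy example only the first and third channels would be processed; the second one reuses the results of the first.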


Figure 25: Outline of the cough detection algorithm.

Figure 26: Audio signal (a) before and (b) after pre-processing.

Feature Extraction

For each non-silent segment, seven features are extracted: (1) Mean Inharmonicity, the mean of the

pitch inharmonicity, (2,3) Mean and Max Flux, the mean and the maximum of the spectral flux, (4) Max

RMS, the maximum of the RMS values, and (5,6,7) Pitch Features.

The computation of the pitch features involves some additional steps. First we apply an 8th-order low-

pass filter at 300Hz (the typical adult human voice frequency range, considering both men and women,

is between 80 and 300Hz). Then we compute the spectrum of the low-passed signal and find the peaks

corresponding to the fundamental frequency at each frame. Next, we estimate the waveform envelope

for each non-silent segment and find its peaks. Subsequently, for each peak, we extract (5) Pitch

Duration, which measures the number of frames for which a fundamental frequency is detected, (6)

Percentage Ratio, which is the ratio of Pitch Duration to the total number of frames of each peak, and


(7) Standard Deviation of the detected fundamental frequencies for each peak. Finally, we apply some

heuristic rules whenever there is more than one peak per segment, in order to obtain a single value per

feature for each segment.

Classification

We implemented a two-step classification for cough detection. Figure 27 shows the magnitude

spectrum of 2 cough segments followed by 1 artifact and 2 speech segments. It is possible to see how

the spectral characteristics of speech are distinct from those of the other types. They have clear

fundamental frequencies and most of their energy is contained in a few frequency regions. By

comparison, the energy of the other segments is more spread out along the spectrum and is limited to

small bursts of activity. This behavior was consistently observed for most of the cough and speech

segments, but not for the artifacts, which have more ambiguous spectral characteristics.

The first step is dedicated to speech discrimination. To this end, we first train a model with the ground-truth annotations for speech and cough, leaving the artifacts out, and we perform a binary classification through multinomial logistic regression. Tests using the WEKA software package [7] showed that features (2) and (4) should be excluded from this classification.

The goal of the second step is to discard other artifacts and keep only the cough segments. In this step we work with all the ground-truth annotations and we do not use features (5) and (6), as they do not add any significant predictive value.

Figure 27: Magnitude spectrum of 2 cough segments followed by 1 artifact and 2 speech segments.

Results and Analysis

For evaluation purposes, the dataset was divided into two parts: train (TR) and test (T). TR consists of 36 files from 6 subjects, while T has 18 files from 3 subjects. The outline of the dataset can be seen in Table 32 (the values between parentheses correspond to the number of events detected after downsampling to 2 kHz).


Table 32: Dataset description.

             # Subjects   # Recordings   # Cough   # Speech   # Other
Train (TR)   6            36             227       97         52 (55)
Test (T)     3            18             116       72 (73)    6

Table 33 and Table 34 show the results of both classification steps on the train/test evaluation (sensitivity, specificity, positive predictive value (PPV) and F-measure). It can be seen that the specificity improves in the second step at the cost of sensitivity. After the second step, the number of artifacts other than speech detected as coughs is 3, half of the total. This may mean that the feature set is not yet optimized for the detection of artifacts other than speech, but we speculate that the small number of artifacts present in the training set and their ambiguous spectral characteristics are responsible for this result. A possible solution to this problem would be to add more kinds of artifacts to the dataset and to divide them into several classes.

Table 33: Classification results for cough detection.

              Detected coughs   Sensitivity [%]   Specificity [%]   PPV [%]   F-measure [%]
First Step    111/116           95.7              88.5              92.5      94.1
Second Step   107/116           92.2              91.0              93.9      93.0

Table 34: Classification results for cough detection after downsampling to 2 kHz.

              Detected coughs   Sensitivity [%]   Specificity [%]   PPV [%]   F-measure [%]
First Step    114/116           98.3              86.1              91.2      94.6
Second Step   107/116           92.2              94.9              96.4      94.3
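
As a consistency check on the figures of Table 33, the short sketch below recomputes the sensitivity from the detected/total cough counts and the F-measure from sensitivity and PPV. The document does not define the F-measure, so the usual harmonic mean of sensitivity and PPV (F1) is assumed here; the function name is hypothetical.

```python
def f_measure(sensitivity, ppv):
    """Harmonic mean of sensitivity (recall) and PPV (precision); assumed F1 definition."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# First step: 111 of the 116 test coughs were detected.
first_step_sensitivity = 100 * 111 / 116
print(round(first_step_sensitivity, 1))      # 95.7

# F-measure from the reported sensitivity and PPV of each step.
print(round(f_measure(95.7, 92.5), 1))       # 94.1 (first step)
print(round(f_measure(92.2, 93.9), 1))       # 93.0 (second step)
```

Under this assumption the recomputed values agree with the reported ones.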

Due to the small size of the dataset, a leave-one-out (volunteer) cross-validation (LOOCV) approach was also used to test the performance of the detectors. It is important to note that the "one left out" is always a subject, not a single event.
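
The subject-wise splitting can be sketched as follows. The function name and the record layout (pairs of subject identifier and item) are hypothetical and serve only to illustrate that all events of the held-out subject leave the training set together.

```python
def leave_one_subject_out(records):
    """Yield (held_out_subject, train_items, test_items) for each subject.

    `records` is a list of (subject_id, item) pairs (hypothetical layout).
    """
    subjects = sorted({subject for subject, _ in records})
    for held_out in subjects:
        train = [item for subject, item in records if subject != held_out]
        test = [item for subject, item in records if subject == held_out]
        yield held_out, train, test


records = [("A", "a1"), ("A", "a2"), ("B", "b1"), ("C", "c1")]
folds = list(leave_one_subject_out(records))
for held_out, train, test in folds:
    print(held_out, train, test)
```

With nine subjects, as in the cough dataset, this yields nine folds, and no event of the test subject is ever seen during training.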

Table 35 and Table 36 show the mean classification results for LOOCV and the sum of detected coughs. This evaluation setup shows more clearly the usefulness of the second step in weeding out the artifacts. It is also worth noting that the metrics are not uniform across subjects, with specificity ranging from ≈58 to 100%, and sensitivity and PPV ranging from ≈76 to 100%. Table 37 presents the detailed results for each subject.

Table 35: Classification results for cough detection with LOOCV.

              Detected coughs   Sensitivity [%]   Specificity [%]   PPV [%]   F-measure [%]
First Step    331/343           96.5±5            80.3±15           88.9±8    92.3±5
Second Step   319/343           93.2±6            87.6±11           92.5±6    92.6±4

Table 36: Classification results for cough detection with LOOCV after downsampling to 2 kHz.

              Detected coughs   Sensitivity [%]   Specificity [%]   PPV [%]   F-measure [%]
First Step    332/343           97.0±2            78.4±16           88.1±8    92.1±5
Second Step   320/343           93.6±7            89.3±13           93.4±7    93.2±4

Table 37: Detailed results for cough detection (after second step) with LOOCV after downsampling to 2 kHz.

# Cough # Speech # Other Sensitivity [%] Specificity [%]

Subject A 32 43 7 96.9 94.0

Subject B 29 9 7 96.6 87.5

Subject C 43 12 8 88.4 100

Subject D 25 6 13 100 57.9

Subject E 37 8 9 94.6 82.4

Subject F 61 19 11 91.8 93.3

Subject G 38 24 2 100 100

Subject H 45 32 2 95.6 94.1

Subject I 33 17 2 78.8 94.7

B.2.4 Detection of crackles and wheezes

The automatic detection of adventitious sounds (additional respiratory sounds superimposed on breath sounds) is a valuable non-invasive tool to detect and follow up respiratory diseases such as chronic obstructive pulmonary disease (COPD). Adventitious sounds include wheezes (continuous sounds), stridors, squawks and crackles (discontinuous sounds).

Crackles are short explosive sounds that seem to result from an abrupt opening or closing of the airways [8]. Usually crackles are classified based on their total duration as fine (<10 ms) or coarse (>10 ms) [9].

These sounds are associated with cardiopulmonary diseases and typically present a very characteristic

waveform. The waveform of the crackle generally begins with a width deflection, followed by

deflections with greater amplitude. Several methods have been proposed for automatic detection of

crackles: based on wavelets [10, 11], on empirical mode decomposition method with Katz fractal

dimension filter [12], on adaptive computing methods [13] and on autoregressive models [14].

Wheezes are continuous sounds that are usually associated with obstructions in the air passages. These whistling sounds are characterized by periodic waveforms with a duration equal to or longer than 100 ms [9]. Due to their musical nature, these sounds have a distinct signature in the spectrogram space. Different methods have been proposed to automatically detect wheezes, such as: 1) detection of the wheeze signature in the spectrogram space [15, 16, 17]; 2) Mel-frequency cepstral coefficients combined with a Gaussian mixture model [18]; and 3) auditory modelling [19].

Methods

Algorithm overview

Two independent classification models, one for wheezes and another for crackles, were developed. Both follow the workflow presented in Figure 28. For each sound channel, we begin with the feature extraction and perform a binary classification frame by frame. Afterwards, we try to improve the classification results. To this end, we begin by reducing the false positives. This reduction is obtained by ensuring that the event is detected in at least two channels. Frames that were classified as noise or cough in the previous processing stages are also discarded. After that, we concatenate the neighboring frames. Finally, we count the number of events.


Figure 28: Workflow of the crackles/wheezes events detector algorithm.

Feature Extraction

We tested the performance of several features for the detection of wheeze and crackle events. For the detection of wheezes we tested 30 features, and for crackles we evaluated 33 features, as listed in Table 38.

Table 38: Features tested to detect crackles and wheezes events.

Features                            Crackles   Wheezes
Teager energy                       x
Fractal dimension of the WPST–NST   x
Entropy                             x
WS-SS                               x          x
29 Musical features                 x          x

Teager energy operator

For each frame the maximum of the Teager energy was computed. In the continuous case, the Teager Energy Operator 𝜓(·) [20] for a real signal x(t) is defined as:

$$\psi(x(t)) = \left(\frac{dx}{dt}\right)^{2} - x(t)\,\frac{d^{2}x}{dt^{2}} \tag{36}$$

The discrete version is given by:

$$\psi(x[n]) = x^{2}[n] - x[n-1]\,x[n+1] \tag{37}$$

with 𝑛 ∈ ℤ and 𝑡 ∈ ℝ.
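
The discrete operator of equation (37) and the per-frame maximum can be sketched as follows. The function names are hypothetical, and the operator is evaluated only at interior samples, which is an assumption about how frame boundaries are handled.

```python
def teager_energy(x):
    """Discrete Teager energy, x[n]^2 - x[n-1]*x[n+1], at the interior samples of x."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def max_teager_energy(frame):
    """Per-frame feature: maximum of the Teager energy over the frame."""
    return max(teager_energy(frame))


# One period of a unit-amplitude sinusoid sampled at a quarter of its period.
frame = [0, 1, 0, -1, 0]
print(teager_energy(frame))
print(max_teager_energy(frame))
```

For a pure sinusoid the operator is constant, which is why abrupt transients such as crackles stand out as peaks in the Teager energy.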

Fractal dimension of the WPST–NST

Bahoura and Lu [21] proposed a filter scheme based on the wavelet packet transform to separate crackles from vesicular sounds. The proposed filter, the wavelet packet stationary – non-stationary filter (WPST-NST), is a double-thresholding, non-iterative method. Two assumptions were made in the formulation of the WPST–NST filter: 1) the wavelet coefficients related to the crackles are larger than the coefficients related to the vesicular sound, and 2) the coefficients related to the background decrease to zero as the scale increases. For each frame the maximum of the Katz fractal dimension of the non-stationary part of the WPST–NST output was computed.

Entropy

For each frame the maximum of the entropy was computed. The information entropy, $H$, is a measure of the disorder of a system. A discrete random variable $X$ with $V$ possible outcomes $\{a_1, a_2, \dots, a_V\}$ and associated probabilities $\{P(a_1), P(a_2), \dots, P(a_V)\}$ has an entropy equal to [22]:

$$H = -\sum_{v=1}^{V} P(a_v)\,\log P(a_v) \tag{38}$$

The entropy of a signal quantized into $V$ levels is given by [22]:

$$H = -\sum_{v=1}^{V} \frac{g_v}{N}\,\log\frac{g_v}{N} \tag{39}$$

where $g_v$ is the number of times that the $v$-th level appears in the signal and $N$ is the size of the signal.
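
Equation (39) can be sketched as follows. The quantization scheme (uniform bins over an amplitude range of [-1, 1]), the base-2 logarithm and the function name are assumptions made for illustration; the document does not specify them. Levels that never occur contribute nothing to the sum.

```python
import math
from collections import Counter


def quantized_entropy(signal, levels, lo=-1.0, hi=1.0):
    """Entropy (in bits) of a signal uniformly quantized into `levels` bins over [lo, hi]."""
    n = len(signal)
    counts = Counter(
        min(levels - 1, max(0, int((v - lo) / (hi - lo) * levels)))
        for v in signal
    )
    # g/n is the relative frequency of each occupied level (g_v / N in the text).
    return -sum((g / n) * math.log2(g / n) for g in counts.values())


uniform = quantized_entropy([-0.9, -0.4, 0.1, 0.6], levels=4)   # one sample per level
constant = quantized_entropy([0.3, 0.3, 0.3, 0.3], levels=4)    # a single level
print(uniform, constant)
```

A signal that visits all four levels equally often reaches the maximum of 2 bits, whereas a constant signal has zero entropy.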

Detection of the wheezes signature in the spectrogram space (WS-SS)

One of the features evaluated in this study aims to detect the signature of the wheezes in the

spectrogram space. This feature is computed following the next steps:

1 – Filter the signal

The first derivative of the discrete Gaussian kernel was used to filter the signal. The kernel size was

equal to 5 bins (on the time domain).

2 - Compute the spectrogram

The spectrogram of the filtered signal, S[t, f] (with f the frequency and t the time), is computed using a

flat top window, partitions with the length equal to 512 bins and an offset equal to 128 bins (on the

time domain).


3 - Subtraction of the background

The subtraction of the background (trend) was done using the method proposed in [23], i.e.:

$$S^{B}[t] = S[t] - B[t] \tag{40}$$

where $S[t]$ is the array of spectral values computed at time $t$, $B[t]$ is the background estimated by applying a moving average filter to $S[t]$, and $S^{B}$ is the spectrogram without the background.
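
A minimal sketch of the background subtraction for a single time frame is shown below. The window width and the handling of the edges (shrinking the window) are hypothetical choices, since the text only states that a moving average filter is used; the function name is also illustrative.

```python
def subtract_background(s_frame, width=5):
    """Subtract a moving-average background (along frequency) from one spectrogram column.

    `width` is a hypothetical window size; near the edges the window is truncated.
    """
    half = width // 2
    result = []
    for i, value in enumerate(s_frame):
        lo = max(0, i - half)
        hi = min(len(s_frame), i + half + 1)
        background = sum(s_frame[lo:hi]) / (hi - lo)
        result.append(value - background)
    return result


print(subtract_background([0, 0, 9, 0, 0], width=3))
print(subtract_background([4, 4, 4, 4], width=3))
```

A narrow spectral peak survives the subtraction, while a flat spectrum is reduced to zero.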

4 - Peak detection

In this step we identify the elements of $S^{B}[t, f]$ that should be part of a wheeze, following the method used in [23]. As in [23], we restrict our search to the interval of frequencies between 100 and 1000 Hz. All the wheezes in our dataset are within this interval. We use two frequency bands, $B_1 = [100, 600[$ Hz and $B_2 = [600, 1000]$ Hz. For each frequency band $B_k$, with $k = \{1, 2\}$, we calculate a binary matrix, $P_k$, with the same size as the spectrogram $S$ using:

$$P_k[t, f] = \begin{cases} 1 & \text{if } S_k^{B}[t, f] \ge \overline{S_k^{B}}[t] + C_k\,\sigma\!\left(S_k^{B}[t]\right) \\ 0 & \text{otherwise} \end{cases} \tag{41}$$

where a value of 1 in the matrix $P_k$ corresponds to a possible element of a wheeze, and $\overline{S_k^{B}}[t]$ and $\sigma(S_k^{B}[t])$ correspond to the mean value and the standard deviation, respectively, of $S_k^{B}[t]$, the background-subtracted spectrum at time $t$ restricted to band $B_k$. The parameters $C_k = \{1.5, 2.5\}$ are the thresholds used.
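
The band-wise thresholding of this step can be sketched for a single time frame as follows. The function name is hypothetical, and two details are assumptions: the thresholds are taken to map in order to the bands (1.5 for the first band, 2.5 for the second), and the population standard deviation is used.

```python
import statistics


def band_mask(sb_frame, freqs_hz, lo_hz, hi_hz, c, closed_hi=False):
    """Flag bins of one background-subtracted frame that reach mean + c * std within a band.

    The band is [lo_hz, hi_hz[ by default and [lo_hz, hi_hz] when closed_hi is True.
    """
    idx = [i for i, f in enumerate(freqs_hz)
           if lo_hz <= f < hi_hz or (closed_hi and f == hi_hz)]
    mask = [0] * len(sb_frame)
    if not idx:
        return mask
    values = [sb_frame[i] for i in idx]
    threshold = statistics.mean(values) + c * statistics.pstdev(values)
    for i in idx:
        if sb_frame[i] >= threshold:
            mask[i] = 1
    return mask


# Toy frame with one dominant bin at 300 Hz, evaluated in the first band with C = 1.5.
freqs = [100, 200, 300, 400, 500]
mask = band_mask([1, 1, 10, 1, 1], freqs, 100, 600, c=1.5)
print(mask)
```

In this toy frame the band mean is 2.8 and the standard deviation 3.6, so only the bin at 300 Hz reaches the threshold of 8.2.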

5 - Reduction of false positives

For the reduction of the false positives we propose to use the (geodesic) morphological opening by

reconstruction operator.

6 - Computation of the array of weights (w)

After the reduction of the false positives we compute a binary array of weights, $w_b$, using:

$$w_b[t] = \begin{cases} 1 & \text{if } \sum_{f} P[t, f] \ge 1 \\ 0 & \text{otherwise} \end{cases} \tag{42}$$

In order to improve the accuracy of the classification, we propose to add a temporal Gaussian regularization to the binary weights. For more details on the additional Gaussian weights, $w_g$, see [24]. The final array of weights, $w$, is the sum of $w_b$ and $w_g$.

Musical features

We tested twenty-nine musical features computed using the MIRtoolbox [6]. For the wheeze detection we considered only the frequencies between 100 Hz and 1000 Hz. The frame duration used was 128 ms and the hop factor (ratio) was 0.25. Table 40 presents the musical features used in this study.


Feature Selection

The ability of the features to discriminate respiratory sounds with wheeze/crackle events was studied using the Matthews correlation coefficient (MCC), measured after classifying the data with a logistic regression classifier. The MCC is a balanced performance measure, especially suitable when the dataset is unbalanced.
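
For reference, the MCC can be computed from the entries of a binary confusion matrix as sketched below. The function name and the example counts are hypothetical, and returning 0 when the denominator vanishes is a common convention adopted here as an assumption.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the entries of a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0                   # convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


print(mcc(tp=5, tn=5, fp=0, fn=0))   # perfect classification
print(mcc(tp=90, tn=0, fp=10, fn=0)) # every frame labelled positive
```

The second call illustrates why the measure suits unbalanced data: labelling every frame as positive gives 90% accuracy on these hypothetical counts, yet the MCC is 0.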

To rank the importance of the features we use sequential feature selection in the forward direction. Each frame was classified as containing or not containing wheeze/crackle events. For the wheeze feature selection, the additional data with voice and cough were used. For each classification, a stratified 10-fold cross-validation approach with ten Monte Carlo repetitions was used.
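
The forward selection loop can be sketched generically as follows. The `score` callable is a hypothetical stand-in for the cross-validated MCC of a logistic regression trained on a candidate subset; the toy scoring function used in the example is not related to the real data.

```python
def forward_selection(features, score, k):
    """Greedy sequential forward selection.

    At each round, add the feature whose inclusion gives the best `score`
    for the subset selected so far. Returns the features in selection order.
    """
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy additive score (hypothetical): each feature label has a fixed merit.
merit = {1: 0.5, 2: 0.3, 3: 0.1}
ranking = forward_selection([3, 1, 2], lambda subset: sum(merit[f] for f in subset), k=3)
print(ranking)
```

The order in which features are selected provides the ranking.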

Classification

The twenty most relevant features for the detection of wheeze and crackle events were used as inputs to the classification models. As with cough, due to the small size of the dataset (which exhibits great variability of the adventitious sounds between patients), a leave-one-out (volunteer) cross-validation approach was used to test the performance of the detectors. The pre-detection and elimination of artifacts was not performed. Since the acquisitions were unsynchronized, it was not possible to reduce the false positives based on the simultaneity of the detected events. For one patient, P09 (see Table 41), the positions of the acquired data used for wheeze and crackle detection were different.

Results and Analysis

Table 39 presents the first thirty features selected by the sequential feature selection, both when the objective of the classification was the detection of crackle events and when it was the detection of wheeze events. The labels of the features WS-SS, fractal dimension of the WPST-NST, Teager energy and entropy are, respectively, 1, 31, 32 and 33.

Table 39: Rank of the first thirty features selected by the sequential feature selection in the forward direction when the objective of the classification was the detection of crackles (C) and when the objective was the detection of wheezes (W).

# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

C 13 8 7 23 11 27 10 33 17 12 28 2 3 14 30

W 6 27 1 20 11 3 4 23 28 13 2 5 22 30 19

# 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

C 6 25 21 31 18 20 24 1 32 29 26 5 4 15 12

W 18 24 7 29 12 21 17 16 15 25 8 26 10 14 9

Table 40: Musical features and the corresponding labels used in this study.

Feature                                 Label   Description
RMS                                     2       Root-mean-square energy of the frame
Spec. Brightness                        3       Amount of energy of the frame above 500 Hz
Spec. Centroid                          4       Geometric center (centroid) of the spectral distribution
Spec. Flatness                          5       Ratio between the geometric mean and the arithmetic mean of the spectral distribution
Spec. Irregularity                      6       Degree of variation of the successive peaks of the spectrum
Keyclarity                              7       Key strength associated with the best key
Spec. Kurtosis                          8       Excess kurtosis of the spectral distribution
13 Mel-frequency cepstral coefficients  9–21    Compact description of the shape of the spectral envelope of an audio signal
Spec. Rolloff 85                        22      Frequency below which 85% of the total energy is contained
Spec. Rolloff 95                        23      Frequency below which 95% of the total energy is contained
Roughness                               24      Average of the dissonance between all possible pairs of spectrogram frame peaks
Spread                                  25      Standard deviation of the spectral distribution
Skewness                                26      Coefficient of skewness of the spectral distribution
Zerocross                               27      Number of times the signal changes sign
Chromagram centroid                     28      Centroid of the redistribution of the spectrum energy along the different pitches
Chromagram peak                         29      Peak of the redistribution of the spectrum energy along the different pitches
Mode                                    30      Estimation of the modality (major mode vs minor mode)

Table 41 and Table 42 show the results of the detection of crackle and wheeze events when the original signals were used and when the downsampled sounds were used, respectively. Currently, the implementation of the function that computes the fractal dimension is too slow for the requirements of the WELCOME project. For this reason, the feature fractal dimension of the WPST-NST (19th in the rank) has not been used so far for the detection of crackle events.

Table 41: Results of the detection of crackles and wheezes events for the different volunteers. E corresponds to the number of events, TP to true positives and FP to false positives.

Crackles Wheezes

ID E TP FP E TP FP

P1a 14 13 1 14 11 0

P1b 19 18 1 9 4 0

P2a 0 0 5 4 1 0

P2b 0 0 8 6 6 2

P3a 0 0 0 7 7 1

P3b 0 0 0 7 7 2

P4a 8 5 4 11 11 1

P4b 10 1 0 11 11 0

P5a 14 14 1 - - -

P5b 18 17 0 - - -

P5c - - - 10 10 3

P5d - - - 14 6 0

P6a 23 23 2 0 0 1

P6b 17 15 7 0 0 0

P7a 12 11 13 0 0 0

P7b 26 23 6 0 0 2

P8a 12 10 4 5 5 1

P8b 11 9 9 15 9 0

P9a 7 7 3 0 0 4

P9b 8 8 2 0 0 0

H01a 0 0 1 0 0 0

H01b 0 0 10 0 0 0

H02a 0 0 1 0 0 1

H02b 0 0 7 0 0 0

H03a 0 0 0 0 0 14

H03b 0 0 0 0 0 2


Total 199 174 85 113 88 34

Table 42: Downsampled signal. Results of the detection of crackle and wheeze events for the different volunteers. The E corresponds to the number of events, TP to the true positives and FP to the false positives.

Crackles Wheezes

ID E TP FP E TP FP

P1a 14 13 1 14 11 0

P1b 19 18 2 9 3 0

P2a 0 0 2 4 1 0

P2b 0 0 7 6 6 1

P3a 0 0 0 7 7 0

P3b 0 0 0 7 7 6

P4a 8 5 3 11 11 1

P4b 10 1 0 11 11 0

P5a 14 14 0 - - -

P5b 18 16 0 - - -

P5c - - - 10 9 3

P5d - - - 14 5 0

P6a 23 22 2 0 0 1

P6b 17 15 1 0 0 1

P7a 12 10 11 0 0 0

P7b 26 20 6 0 0 1

P8a 12 10 5 5 5 1

P8b 11 8 11 15 7 1

P9a 7 5 4 0 0 4

P9b 8 8 3 0 0 1

H01a 0 0 1 0 0 0

H01b 0 0 12 0 0 0

H02a 0 0 1 0 0 2

H02b 0 0 4 0 0 0

H03a 0 0 0 0 0 14

H03b 0 0 1 0 0 1

Total 199 165 77 113 83 39

The sensitivity and positive predictive values (PPV) measured after classifying the data with adventitious sounds using the crackles/wheezes detector are presented in Table 43. Similarly, Table 44 presents the same metrics when the signals were downsampled. There, it can be seen that the results decrease slightly, as a consequence of the fact that some higher frequency content is discarded in some cases.
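
The per-acquisition metrics follow from the event counts (E, TP, FP) of the detection tables. The sketch below assumes the usual definitions, sensitivity = TP/E and PPV = TP/(TP+FP), and returns None when a metric is undefined; the function name is hypothetical.

```python
def sensitivity_and_ppv(events, tp, fp):
    """Return (sensitivity, PPV) in percent; None where the metric is undefined."""
    sensitivity = 100 * tp / events if events else None
    ppv = 100 * tp / (tp + fp) if (tp + fp) else None
    return sensitivity, ppv


# Crackles, acquisition P1a (original signal): E = 14, TP = 13, FP = 1.
sens, ppv = sensitivity_and_ppv(14, 13, 1)
print(round(sens), round(ppv))               # 93 93

# Crackles, acquisition P4b (original signal): E = 10, TP = 1, FP = 0.
print(sensitivity_and_ppv(10, 1, 0))         # (10.0, 100.0)
```

Acquisitions without annotated events have no defined sensitivity, which corresponds to the entries marked *** in the tables.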

Table 43: Sensitivity and PPV (mean and standard deviation (SD) ) measured after classifying the data with adventitious sounds using the crackles/wheezes detector. The symbol *** means that crackles/wheezes were not present in that particular acquisition.

Crackles Wheezes

ID Sensitivity [%] PPV [%] Sensitivity [%] PPV [%]

P1a 93 93 79 100

P1b 95 95 44 100

P2a *** *** 25 100

P2b *** *** 100 75

P3a *** *** 100 88

P3b *** *** 100 78

P4a 63 56 100 92

P4b 10 100 100 100

P5a 100 93 - -

P5b 90 100 - -


P5c - - 100 77

P5d - - 43 100

P6a 100 92 *** ***

P6b 89 68 *** ***

P7a 92 46 *** ***

P7b 88 80 *** ***

P8a 83 71 100 83

P8b 82 50 60 100

P9a 100 70 *** ***

P9b 100 80 *** ***

Mean ± STD 84 ± 22 78 ± 17 79 ± 28 90 ± 10

Table 44: Downsampled signal. Sensitivity and PPV (mean and standard deviation (SD)) measured after classifying the data with adventitious sounds using the crackles/wheezes detector. The symbol *** means that crackles/wheezes were not present in that particular acquisition.

Crackles Wheezes

ID Sensitivity [%] PPV [%] Sensitivity [%] PPV [%]

P1a 93 93 79 100

P1b 95 90 33 47

P2a *** *** 25 100

P2b *** *** 100 47

P3a *** *** 100 100

P3b *** *** 100 47

P4a 63 63 100 100

P4b 10 100 100 47

P5a 100 100 - -

P5b 89 100 - -

P5c - - 90 75

P5d - - 36 100

P6a 96 92 *** ***

P6b 88 94 *** ***

P7a 83 48 *** ***

P7b 77 77 *** ***

P8a 83 67 100 83

P8b 73 42 47 88

P9a 71 56 *** ***

P9b 100 73 *** ***

Mean ± STD 80 ± 23 78 ± 20 76 ± 30 90 ± 14

To sum up, for the crackle event detector we measured a sensitivity and a positive predictive value equal to 84 ± 22% and 78 ± 17% (mean ± standard deviation), respectively (see Table 43). Thirty-two false positives were detected in the ten acquisitions where no crackles were present (see Table 41).

For the wheeze event detector we measured a sensitivity and a positive predictive value equal to 79 ± 28% and 90 ± 10%, respectively (see Table 43). Twenty-four false positives were detected in the twelve acquisitions where no wheezes were present (see Table 41).

The large values measured for the standard deviation (STD) may be explained by the small size of the dataset, the variability of the adventitious sounds between patients and the use of the leave-one-out (volunteer) cross-validation approach to test the performance of the detectors. The mean values are also affected by these factors. A slight decrease in sensitivity and PPV was observed when the signal was downsampled (see Table 44).

Due to the small size of the dataset, the results presented in this study must be validated with more data. The use of more data in the training dataset (and in the testing dataset) will allow the use of nonlinear classifiers with less risk of generalization problems. The proposed approach should also be validated using multichannel data.

B.2.5 References

[1] J. Korpáš, M. Vrabec, J. Sadlonova, D. Salat, L.A. Debreczeni, "Analysis of the cough sound frequency in adults and children with bronchial asthma." Acta Physiologica Hungarica 90.1 (2003): 27-34.

[2] J. Korpáš and T. Zoltán, "Cough and other respiratory reflexes." S. Karger, 1979.

[3] American College of Chest Physicians. "Managing cough as a defense mechanism and as a symptom." Chest 114.2 (1998): 133S-181S.

[4] J. N. Evans and M.J. Jaeger, "Mechanical aspects of coughing." Lung 152.4 (1975): 253-257.

[5] A.A. Abaza, J.B. Day, J.S. Reynolds, A.M. Mahmoud, W.T. Goldsmith, W.G. McKinney, E.L. Petsonk and D.G. Frazer, "Classification of voluntary cough sound and airflow patterns for detecting abnormal pulmonary function." Cough 5.8 (2009): 9284.

[6] O. Lartillot and P. Toiviainen. "A Matlab toolbox for musical feature extraction from audio." International Conference on Digital Audio Effects (2007).

[7] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I. H. Witten, "The WEKA data mining software: an update." ACM SIGKDD explorations newsletter 11.1 (2009): 10-18.

[8] P. Piirilä and A. R. A. Sovijärvi, “Crackles: recording, analysis and clinical significance,” Eur. Respir. J., vol. 8, no. 12, Dec. 1995.

[9] A. R. A. Sovijärvi, F. Dalmasso, J. Vanderschoot, L. P. Malmberg, and G. Righini, “Definition of terms for applications of respiratory sounds,” Eur. Respir. Rev., vol. 10, no. 77, 2000.

[10] X. Lu and M. Bahoura, “An integrated automated system for crackles extraction and classification,” Biomed. Signal Process. Control, vol. 3, Jul. 2008.

[11] M. Bahoura and X. Lu, “Separation of crackles from vesicular sounds using wavelets packet transform,” in 2006 Proc. Int. Conf. on Acoustics, Speech, and Signal.

[12] L. J. Hadjileontiadis, “Empirical Mode Decomposition and Fractal Dimension Filter,” IEEE Eng. Med. Biol. Mag., Feb. 2007.

[13] P. A. Mastorocostas and J. B. Theocharis, “A dynamic fuzzy neural filter for separation of discontinuous adventitious sounds from vesicular sounds,” Comput. Biol. Med., vol. 37, Jan. 2007.

[14] S. Charleston-Villalobos, G. Martinez-Hernandez, R. Gonzalez-Camarena, G. Chi-Lem, J. G. Carrillo, and T. Aljama-Corrales, “Assessment of multichannel lung sounds parameterization for two-class classification in interstitial lung disease patients,” Comput. Biol. Med., vol. 41, Jul. 2011.

[15] S. A. Taplidou and L. J. Hadjileontiadis, “Wheeze detection based on time-frequency analysis of breath sounds,” Comput. Biol. Med., vol. 37, issue 8, pp. 1073-1083, 2007.

[16] D. Emmanouilidou, K. Patil, J. West, and M. Elhilali, “A multiresolution analysis for detection of abnormal lung sounds,” in Conf Proc Eng Med Biol Soc., pp. 3139-3242, 2012.

[17] Y. Shabtai-Musih, J. B. Grotberg, and N. Gavriely, “Spectral content of forced expiratory wheezes during air, He, and SF6 breathing in normal humans,” J Appl. Physiol., vol. 72, pp. 629–635, 1992.

[18] M. Bahoura, “Pattern recognition methods applied to respiratory sounds classification into normal and wheeze classes,” Comput Biol Med., vol. 39, issue 9, pp. 824-843, 2009.

[19] Y. Qiu, A. Whittaker, M. Lucas and K. Anderson, “Automatic wheeze detection based on auditory modeling,” in Proc. Inst. Mech. Eng. H., vol. 219, issue 3, pp. 219-227, 2005.

[20] E. Kvedalen, “Signal processing using the Teager energy operator and other nonlinear operators,” Master Thesis, Dep. Informatics, Univ. Oslo, Norway, 2003.

[21] M. Bahoura and X. Lu, “Separation of crackles from vesicular sounds using wavelets packet transform,” in 2006 Proc. Int. Conf. on Acoustics, Speech, and Signal.

[22] R. C. Gonzalez, R. E. Woods, and S. L. Eddins, Digital Image Processing Using Matlab. Gatesmark Publishing, 2004.

[23] S. A. Taplidou and L. J. Hadjileontiadis, “Wheeze detection based on time-frequency analysis of breath sounds,” Comput. Biol. Med., vol. 37, issue 8, pp. 1073-1083, 2007.

[24] L. Mendes, I. Vogiatzis, E. Perantoni, E. Kaimakamis, I. Chouvarda, N. Maglaveras, V. Tsara, C. Teixeira, P. Carvalho, J. Henriques, R. P. Paiva, “Detection of wheezes using their signature in the spectrogram space and musical features,” 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015.



B.3 EIT Signal Processing

Introduction

In WELCOME, the monitoring of the patient’s regional lung ventilation is accomplished through a medical imaging modality known as electrical impedance tomography (EIT). As a reminder, EIT is a non-invasive, radiation-free medical imaging technique which will become wireless and wearable through the WELCOME project [Chouvarda, et al, 2014].

In lung EIT, a set of electrodes is placed around the patient’s thorax and used for injecting electrical currents and measuring the resulting potentials through well-defined stimulation patterns (Figure 29, Left) [Barber and Brown, 1984][Bodenstein et al., 2009]. These potentials are used for the computation of images showing the distribution of electrical resistivity changes in the studied chest cross-section. These images constitute a regularized inverse solution of the generalized Laplace equation [Adler et al., 2009], a highly nonlinear ill-posed problem [Adler et al., 2012].

Figure 29: (Left) Injection of electrical currents and surface voltage measurement. (Right) Functional EIT image showing the distribution of tidal ventilation in a seated, spontaneously breathing man

Assessment of regional lung ventilation is one of the most promising applications of EIT because large

volumes of air are moved in and out of the lungs during breathing, resulting in measurable changes in

lung tissue resistivity. An example of a functional EIT image showing the distribution of ventilation in a

healthy subject is depicted in Figure 29 (Right). Well ventilated lung regions exhibit high resistivity

variation (bright) and, inversely, regions with absent ventilation are depicted as areas of low variation

(dark). In spite of the low spatial resolution of EIT images, several clinical studies have shown that useful

quantitative information about, e.g., lung ventilation or respiratory system mechanics, can be extracted

[Frerichs, 2000][Frerichs et al., 2014]. This is justified, in part, by the high temporal resolution of the

current EIT systems (ranging from 13 to 80 frames per second).

So far, lung EIT has mainly been used as a tool for the determination of the least injurious mechanical ventilator settings in intensive care units [Gomez-Laberge et al., 2012]. A recent clinical study on COPD patients has shown that EIT can detect the effects of regional airway obstruction during pulmonary function testing [Gomez-Laberge et al., 2012]. In WELCOME, in addition to the monitoring of spontaneous tidal breathing, standard ventilation manoeuvres are also foreseen. Early results have shown that regional ratios of forced expiratory volume in 1 s (FEV1) and forced vital capacity (FVC) can be computed using the acquired EIT image sequences, while classical spirometry gives only one global FEV1/FVC value. New indices characterizing the spatial heterogeneity of lung function in COPD may be further developed. As an example, we may consider the frequency distributions of pixel FEV1/FVC ratios, which successfully depict the heterogeneity of lung disease in COPD compared with healthy subjects [Frerichs et al., 2014]. Active research aims to establish and standardize these new indices so that they can be used as indicators of COPD progression and as detectors of early stages of exacerbations. To this end, EIT findings can be combined with other features (e.g. lung sounds and heart rate) for the design of more robust exacerbation detection rules.

B.3.1 EIT Processing Modules and EIDORS

The processing and analysis of the EIT signals, from the raw data acquired via the vest sensors up to the final numerical features that serve as input to the decision support modules of WELCOME, involves the following phases/stages:

Reading of the raw voltage measurements,

Preprocessing of the raw voltage measurements,

EIT signal disturbance detection,

EIT image reconstruction,

Breath Detection,

Tidal Breathing Periods Detection,

fEIT image computation,

Forced Manoeuvre Detection,

Pulmonary Function estimation,

Expiration Time computation,

Ventilation Heterogeneity

The flowchart in Figure 30 illustrates the flow of raw EIT data through the processing pipeline.

Figure 30: EIT Feature Extraction flowchart

[Flowchart blocks: START; Read Raw EIT data; Preprocess Data; Disturbance Detection; Accept? (YES/NO); Reconstruction; Breath Detection; Tidal Breathing Period Detection; Forced Manoeuvre Detection; Functional EIT image and Ventilation Indices Estimation; Expiration Times & Ventilation Indices Estimation; Ventilation Heterogeneity Estimation; END]


It is noted that Filtering, Breath Segmentation and Tidal Breathing Period Detection are processing modules that are used twice. Initially, they are applied on the raw global impedance curve during the raw data preprocessing stage and then, they are applied on the global impedance curve of the reconstructed image sequence. In this report, they are described only once at the point where they are used for the first time.

The EIT feature extraction software consists of independent modules written in MATLAB. For raw EIT data reading and reconstruction, the software platform of the Electrical Impedance and Diffuse Optical Reconstruction Software (EIDORS) project is used. EIDORS is maintained on the Internet by EIT researchers worldwide; it is platform-independent and compatible with most EIT systems. The data acquisition and image reconstruction algorithms used in WELCOME conform to the Graz consensus Reconstruction algorithm for EIT (GREIT), as described in [Adler et al, 2009]. GREIT provides the broadest consensus on EIT lung imaging techniques for electrode placement, stimulation and measurement patterns, as well as image reconstruction algorithms.

B.3.1.1.1 Read the raw voltage measurements

Currently, it is assumed that the raw voltage measurements are stored in a widely used file format, namely the .eit format. However, it has been decided that the recently proposed OpenEIT (OEIT) format is more generic and, therefore, more suitable for WELCOME. OEIT has been proposed as a standardized data storage and exchange format, to help bridge the gap between EIT hardware manufacturers, algorithm developers and other EIT data users. EIT data in the OEIT format can be read by anyone with access to an OEIT reader (for which a reference open-source implementation will be provided), and the format documents the gain, the signal synchronization and the stimulation and measurement strategy. This allows an automatic selection of image reconstruction parameters. OEIT can also serve as an exchange format into which legacy EIT data can be converted [Gaggero, et al., 2013].

B.3.1.1.2 EIT signal disturbance detection

It is well known that the EIT reconstruction process is sensitive to measurement errors. Consequently, movements, sweat, electrode drifting or detachment cause artifacts that severely limit the diagnostic value of the reconstructed images, and a mechanism for EIT signal disturbance detection is necessary. This mechanism is based on the fact that the various types of system instabilities cause extremely large voltage variations (see Figure 31), which can be easily detected by simple statistical rules and thresholding. Currently, EIT raw signals containing measurement errors are rejected.

Figure 31: EIT raw voltage measurements with electrode detachment problem
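The report does not specify the statistical rule that is used, so the following Python sketch is only one possible reading of "simple statistical rules and thresholding". It flags a recording when any frame-to-frame voltage jump is an extreme outlier relative to the typical jump. The function name `detect_disturbance` and the multiplier `k` are hypothetical and are not taken from the WELCOME implementation, which is written in MATLAB.

```python
from statistics import median


def detect_disturbance(frames, k=10.0):
    """Return True if any frame-to-frame voltage jump is an extreme outlier.

    frames: list of frames, each a list of raw channel voltages.
    k: hypothetical threshold multiplier (an assumption, not from the report).
    """
    # Largest absolute change across channels between consecutive frames.
    jumps = [
        max(abs(b - a) for a, b in zip(prev, cur))
        for prev, cur in zip(frames, frames[1:])
    ]
    if not jumps:
        return False
    typical = median(jumps)
    # Median absolute deviation as a robust estimate of the normal spread.
    mad = median(abs(j - typical) for j in jumps)
    threshold = typical + k * max(mad, 1e-12)
    return any(j > threshold for j in jumps)
```

A recording for which the function returns True would be rejected, in line with the current policy described above.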


B.3.1.1.3 Raw Data Preprocessing, Breath Segmentation & Tidal Breathing Period Detection

Most of the existing EIT reconstruction algorithms use a segment of the raw voltage measurements as reference. This is because the final output of the reconstruction process is the relative impedance change. There is experimental evidence that the quality of the reconstructed EIT image sequence is better if the reference period is a period of tidal breathing. For this reason, the detection of a short tidal breathing period in the raw EIT data (i.e. the voltage measurements) is considered a necessary first preprocessing step. Initially, the Raw Global Impedance Curve (RGIC) is computed from the raw EIT data; the RGIC is simply the average voltage of each time frame. Figure 32 (Left) shows a typical RGIC signal. It is evident that this signal, in addition to a dominant respiratory component, contains artifacts and noise (Figure 32, Right).

Figure 32: Raw Global Impedance Curve (Left) and a zoomed part of it (Right)

In order to isolate the respiratory component, band-pass filtering is used. An appropriate pass band is [0.15, 0.5] Hz. Figure 33 shows the result of the band-pass filtering operation on the signal of Figure 32.

Figure 33: RGIC after band-pass filtering
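As an illustration of the two steps above, the following Python sketch computes the RGIC as the per-frame average voltage and then isolates the [0.15, 0.5] Hz band. The filter type is not stated in the report; an ideal frequency-domain band-pass based on a naive DFT is assumed here purely for clarity, and the function names are hypothetical. A production implementation would more likely use a dedicated filter design routine.

```python
import cmath
import math


def raw_global_impedance_curve(frames):
    """Average the voltages of each time frame (the RGIC described above)."""
    return [sum(frame) / len(frame) for frame in frames]


def bandpass(signal, fs, f_lo=0.15, f_hi=0.5):
    """Ideal band-pass via a naive DFT: keep bins with f_lo <= |f| <= f_hi.

    fs is the frame rate in Hz. The O(n^2) transform is only meant for
    short illustrative signals.
    """
    n = len(signal)
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]
    for k in range(n):
        # Frequency of bin k, folding the upper half onto negative frequencies.
        f = min(k, n - k) * fs / n
        if not (f_lo <= f <= f_hi):
            spectrum[k] = 0.0
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]
```

Because the DC bin is removed together with everything outside the pass band, the filtered curve oscillates around zero, which simplifies the subsequent detection of extremal points.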

The next step of the RGIC processing is the detection of a relatively stable tidal breathing period, that is, the detection of a small number (4-6) of consecutive breaths of almost constant volume and duration. This is accomplished by first detecting the end-inspiratory and end-expiratory values of RGIC through the differentiation of its filtered version. Figure 34 (Top) shows an indicative result where these characteristic extremal points have been detected.
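The detection of extremal points through differentiation can be sketched as follows in Python. The sketch looks for sign changes of the first difference of the filtered curve; it assumes that the curve rises during inspiration, so that local maxima correspond to end-inspiratory points and local minima to end-expiratory points. The function name is hypothetical.

```python
def find_extrema(signal):
    """Return (peaks, troughs): indices where the first difference changes sign.

    Peaks approximate end-inspiratory points and troughs end-expiratory
    points, under the assumption that the curve rises during inspiration.
    """
    peaks, troughs = [], []
    prev_slope = 0.0
    for i in range(1, len(signal)):
        slope = signal[i] - signal[i - 1]
        if slope == 0.0:
            continue  # flat sample: keep the last non-zero slope
        if prev_slope > 0 and slope < 0:
            peaks.append(i - 1)
        elif prev_slope < 0 and slope > 0:
            troughs.append(i - 1)
        prev_slope = slope
    return peaks, troughs
```

Each breath can then be described by the interval between two consecutive troughs, with its volume approximated by the peak-to-trough amplitude.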

A common problem of the initial breath segmentation is the detection of smaller breath-like variations which cannot be considered as normal breaths. We call such patterns weak breaths, and their detection and elimination is the next step of the processing pipeline. Specifically, weak breath detection is based on the comparison of the volume and duration of each breath with the corresponding average values of the whole set of detected breaths. Weak breaths appear as outliers due to their small volume and duration and, therefore, their detection can be done by a thresholding operation. Next, their elimination is the operation of merging them with one of their adjacent normal breaths. Figure 34 (Bottom) shows the result of the weak breath detection and elimination procedure.


Figure 34: Breath Segmentation of the filtered Raw Global Impedance Curve before (Top) and after (Bottom) weak breath detection and elimination.
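A minimal Python sketch of weak breath detection and elimination is given below. The threshold `frac`, the rule that both volume and duration must be small, and the choice to merge a weak breath into the preceding normal breath are assumptions made for illustration; the report only states that thresholding against the average values is used and that weak breaths are merged with an adjacent normal breath.

```python
def merge_weak_breaths(breaths, frac=0.5):
    """Merge weak breaths into an adjacent normal breath.

    breaths: list of (start, end, volume) tuples in time order.
    A breath is flagged weak when both its volume and its duration fall
    below frac times the mean over all detected breaths (frac is a
    hypothetical threshold). A weak breath is merged into the preceding
    normal breath, or into the following one if no normal breath precedes it.
    """
    if not breaths:
        return []
    mean_vol = sum(b[2] for b in breaths) / len(breaths)
    mean_dur = sum(b[1] - b[0] for b in breaths) / len(breaths)

    def is_weak(b):
        return b[2] < frac * mean_vol and (b[1] - b[0]) < frac * mean_dur

    merged = []
    pending_start = None  # start of leading weak breaths awaiting a normal one
    for b in breaths:
        start, end, vol = b
        if is_weak(b):
            if merged:
                # Extend the previous normal breath over the weak one.
                p_start, _, p_vol = merged[-1]
                merged[-1] = (p_start, end, p_vol)
            elif pending_start is None:
                pending_start = start
        else:
            if pending_start is not None:
                start, pending_start = pending_start, None
            merged.append((start, end, vol))
    return merged
```

For simplicity the merged breath keeps the volume of the normal breath; recomputing the volume from the curve after merging would be a natural refinement.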

After the elimination of the weak breaths, the resulting breath sequence is sequentially scanned from left to right until a predetermined number of consecutive tidal breaths are found. The criterion used is based on the standard deviation of the volume and duration of the breaths under examination and expresses mathematically the expectation that a tidal breathing period contains breaths of almost constant volume and duration. In the current implementation, instead of stopping at the first detected tidal breathing period, the search continues so that the most stable tidal breathing is detected.

Figure 35: An example of detected tidal breathing period.
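The exhaustive search for the most stable tidal breathing period can be sketched as follows in Python. The report states only that the criterion is based on the standard deviation of volume and duration; the specific score used here (sum of the two coefficients of variation) and the function name are assumptions.

```python
from statistics import mean, pstdev


def most_stable_period(breaths, n=5):
    """Return (start_index, score) of the most stable run of n breaths.

    breaths: list of (volume, duration) pairs with positive values.
    The score is the sum of the coefficients of variation of volume and
    duration within the window; lower means more stable. Returns None when
    fewer than n breaths are available.
    """
    best = None
    for i in range(len(breaths) - n + 1):
        window = breaths[i:i + n]
        vols = [v for v, _ in window]
        durs = [d for _, d in window]
        score = pstdev(vols) / mean(vols) + pstdev(durs) / mean(durs)
        # Keep scanning instead of stopping at the first acceptable window.
        if best is None or score < best[1]:
            best = (i, score)
    return best
```

Scanning all windows rather than stopping at the first acceptable one mirrors the behaviour described above for the current implementation.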

B.3.1.1.4 EIT image reconstruction

After the computation of the reference impedance frame in the data preprocessing stage, the raw EIT data is input to the EIT reconstruction module. Currently, the reconstruction module for the WELCOME vest is not yet ready, because the electrode topology and stimulation patterns of the vest are still under adjustment and final decisions are pending. For this reason, the development of the EIT feature extraction modules is based on the reconstruction algorithms provided by the EIDORS open source library. More specifically, the adjacent electrode current drive and voltage measurement pattern is used. Also, the GREIT reconstruction matrix for adult human thorax geometry is used for the forward model.


Figure 36: Forward model used for training GREIT using contrasting lung regions (source: EIDORS web site)

B.3.1.1.5 Tidal Breathing Periods Detection

A portion of the EIT features required for the characterization of regional ventilation must be computed based on the processing and analysis of tidal (or quiet) breathing periods. The same problem was encountered in the EIT data preprocessing, where the most stable tidal breathing period in the Raw Global Impedance Curve was sought. This time, the Global Impedance Curve (GIC) of the reconstructed EIT image sequence must be partitioned into tidal breathing periods (TBP). Again, the GIC is computed, filtered and differentiated in order to identify the end-inspiratory and end-expiratory values. The sequence of detected breaths is partitioned into TBPs by a simple heuristic rule: the sequence is sequentially scanned by a sliding window of constant width and, at each position, the set of breaths in the sliding window is classified as tidal or not. Next, with a second scan, overlapping tidal segments are merged into larger tidal segments. The resulting tidal segments constitute the reported TBPs. Figure 37 shows an indicative result of partitioning into TBPs.

Figure 37: Detection of three tidal breathing periods (red lines) in an EIT sequence of 20000 frames (10 minutes)
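The two-scan heuristic can be sketched in Python as follows. The window width, the tidal criterion (coefficient of variation of volume and duration below `max_cv`) and the function name are hypothetical; the report only specifies a sliding window of constant width followed by merging of overlapping tidal segments.

```python
from statistics import mean, pstdev


def tidal_periods(breaths, width=4, max_cv=0.1):
    """Partition a breath sequence into tidal breathing periods.

    breaths: list of (volume, duration) pairs in time order.
    A window is labelled tidal when the coefficient of variation of both
    volume and duration is at most max_cv (an assumed criterion).
    Returns a list of (first_breath, last_breath) index pairs.
    """
    def is_tidal(window):
        vols = [v for v, _ in window]
        durs = [d for _, d in window]
        return (pstdev(vols) <= max_cv * mean(vols)
                and pstdev(durs) <= max_cv * mean(durs))

    # First scan: classify every window position.
    segments = [(i, i + width - 1)
                for i in range(len(breaths) - width + 1)
                if is_tidal(breaths[i:i + width])]

    # Second scan: merge overlapping tidal windows into larger segments.
    merged = []
    for first, last in segments:
        if merged and first <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(last, merged[-1][1]))
        else:
            merged.append((first, last))
    return merged
```

Breath indices can be mapped back to frame indices through the end-expiratory points that delimit each breath.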

B.3.1.1.6 fEIT image computation

For each spontaneous tidal breathing period, this module computes functional EIT (fEIT) images and various ventilation indices. fEIT images constitute an important feature of EIT since they can potentially identify alterations of local lung dynamics. Currently, each fEIT image is computed as the difference between the end-inspiratory EIT image and the preceding end-expiratory one (Figure 38).


More advanced methods for fEIT image computation taking into account the phase shifts of individual pixel-level impedance courses in the EIT sequence are being implemented [Hahn et al., 2008].

Figure 38: A functional EIT image (32 x 32)
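The current difference-based fEIT computation can be sketched in Python as follows. The first function mirrors the description above (end-inspiratory frame minus the preceding end-expiratory frame). The second function, which averages the per-breath images over a tidal breathing period, is an assumption added for illustration and is not stated in the report; both function names are hypothetical.

```python
def tidal_difference_image(frames, end_exp, end_insp):
    """Pixel-wise difference between the end-inspiratory frame and the
    preceding end-expiratory frame. frames[t] is a 2-D list of pixel values."""
    insp, exp = frames[end_insp], frames[end_exp]
    return [[a - b for a, b in zip(row_i, row_e)]
            for row_i, row_e in zip(insp, exp)]


def mean_tidal_image(frames, breaths):
    """Average the per-breath difference images of one tidal breathing period.

    breaths: list of (end_exp, end_insp) frame index pairs.
    Averaging over breaths is an assumption, not taken from the report.
    """
    images = [tidal_difference_image(frames, e, i) for e, i in breaths]
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / len(images) for c in range(cols)]
            for r in range(rows)]
```

In the WELCOME setting each frame would be a 32 x 32 reconstructed image, as in Figure 38.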

B.4 Medication adherence algorithms

The medication adherence algorithm is to be included within the WELCOME cloud. As per the work package description, the algorithm is employed in the WELCOME project to generate adherence data for COPD patients. The adherence device provides audio recordings of all inhaler events. The algorithm identifies inhaler events that are simply noise, as well as inhaler events that indicate a less than successful attempt to take one’s medication. The signal processing system is geared towards outpatients using an inhaler. The algorithm has been tailored to the typical behavior of COPD patients and to the WELCOME project requirements.

The patient compliance data will be uploaded through a secure data channel. The HCP will examine the data collected from the INCA device and compare it to the treatment schedule in order to evaluate the patient’s compliance. As per the Initial and Complete System Design, the algorithm completes the following tasks:

Assesses inhaler technique performance from the inhaler audio signals

Provides feedback on what type of technique error occurred

Provides feedback on the date and time the inhaler was used

Additionally, it provides information on the Patient Identification number and the duration of the inhaler use (start date and end date).

This algorithm has been developed into a DLL for deployment in the WELCOME cloud, which loads the inhaler compliance files from the patient hub. The INCA device, upload system, medication adherence algorithm and extracted features are being validated in ongoing clinical trials with the Royal College of Surgeons in Ireland prior to their use in WELCOME. The medication adherence algorithm has been found to have an accuracy of 83% in determining the correct inhaler technique, compared to clinical raters, in a community-dwelling population with asthma [MS Holmes et.al, 2014].

Appendix C. Test cases

The test case template is presented first, followed by the details of the actual test cases.


Table 45: Test case template

Field Description

ID Unique identifier of the test case

Summary A brief description of the test case

Test Area The system component(s) under testing, e.g. the Clinical Administration Web Application

Related Requirement The system requirement(s) validated by the specific test case

Objective The test case goal

Assignee The partner(s) to perform the test case

Preconditions Potential prerequisites for the test case to be available for execution, e.g. test data requirements or previous test cases that should be executed first

Target Platform The platform that is targeted by this test case, e.g. vest, server, browser, mobile device

Test Data The actual test data that should be used for this test case during execution

Actions The test procedure, i.e. the actions taken to execute the test case

Expected Results The expected outcome of the test case execution

Effects on Data The potential effects that this test case would have on the already stored data

Status e.g. under development, stable, unstable, deprecated

The rest of the tables contain the details of the test cases.

Field Description

ID OA-FEH-1

Summary Testing specific feature extraction

Test Area OA’s FEH – Integration between OA and FES

Related Requirement OA-FR-1, OA-FR-2

Objective Testing that the OA’s FEH can communicate with the FES and get the respective feature extraction result in the expected format.

Assignee AUTH

Preconditions Some test data (e.g. EDF files, signals) already exist and contain valid data.

Target Platform Tests the communication between FES and the OA.

Test Data Sample EDF files created from AUTh, sample values for each test. The test data are created for each specific endpoint to be tested.

Actions FEH is instantiated. The respective method is called having as arguments the respective data. The FES responds synchronously with an ACK confirming that it received the analysis request. After a period of time, FES asynchronously notifies the OA that the analysis is over and returns the actual analysis results.

Expected Results The FES returns the expected analysis response.

Effects on Data The analysis data are stored in the SE while testing and removed after the test is complete in order to keep our testing environment clean.

Status This is a generalized unit test implemented in such a way that only the endpoint specific parts would have to be rewritten. It has been implemented for tachycardia/bradycardia and the SpO2 feature extraction endpoints. It will be implemented for the rest of the FES endpoints as they get integrated in the system.

Field Description

ID OA-FEH-2

Summary Testing error handling during feature extraction



Objective Testing that the OA’s FEH will terminate the respective feature extraction process and propagate an exception in case of an error.

Assignee AUTH

Preconditions Some test data (e.g. EDF files, signals) already exist and contain valid data.

Test Data Sample EDF files created from AUTh, sample values for each test. The test data are created for each specific endpoint to be tested.

Actions FEH is instantiated. The respective method is called having as arguments the respective data. An exception is mocked. We check that the exception is thrown and contains the right message. We also check that the respective OA’s feature extraction process has been terminated.

Expected Results The exception is thrown and contains the right message. The feature extraction process is terminated.

Effects on Data None. The test case is totally independent of stored data.

Status This is a generalized unit test. It has been implemented for tachycardia/bradycardia and the SpO2 feature extraction endpoints. It will be implemented for the rest of the FES endpoints as they get integrated in the system.

Field Description

ID OA-FEH-3

Summary Testing timeout handling during feature extraction




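Objective Testing that the OA’s FEH will terminate the respective feature extraction process and propagate an exception in case of a communication timeout.

Assignee AUTH

Preconditions Some test data (e.g. EDF files, signals) already exist and contain valid data.

Test Data Sample EDF files created from AUTh, sample values for each test. The test data are created for each specific endpoint to be tested.

Actions FEH is instantiated. The respective method is called having as arguments the respective data. A timeout behaviour is mocked. We check that a timeout exception is thrown and contains the right message. We also check that the respective OA’s feature extraction process has been terminated.

Expected Results The timeout exception is thrown and contains the right message. The feature extraction process is terminated.

Status This is a generalized unit test. It has been implemented for tachycardia/bradycardia and the SpO2 feature extraction endpoints. It will be implemented for the rest of the FES endpoints as they get integrated in the system.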

Field Description

ID OA-FEH-4

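Summary Testing multiple feature extraction processes as part of the same analysis plan

Objective Testing that the OA’s FEH will terminate the respective feature extraction process and propagate an exception in case of a communication timeout.

Assignee AUTH

Preconditions Some test data (e.g. EDF files, signals) already exist and contain valid data.

Test Data Sample EDF files created from AUTh, sample values for each test. The test data are created for each specific endpoint to be tested.

Actions FEH is instantiated. The respective method is called having as arguments the respective data. A timeout behaviour is mocked. We check that a timeout exception is thrown and contains the right message. We also check that the respective OA’s feature extraction process has been terminated.

Expected Results The timeout exception is thrown and contains the right message. The feature extraction process is terminated.

Status This test uses the currently implemented FES processes (tachycardia/bradycardia and SpO2). Ways to generalize it, so that all the available FES processes can be checked in the future without manually changing the test, are to be investigated.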

Field Description

ID OA-SEM-1

Summary Testing retrieval of data as an RDF resource from the SE


Test Area OA’s SEM – Integration between OA and SE

Related Requirement Implied by all requirements of OA

Objective Testing that the OA’s SEM can retrieve data from the SE

Assignee AUTH

Preconditions The SE contains data in RDF form in a graph used with a test URI

Target Platform Tests the communication between OA and SE through the SE API

Test Data The data of the graph contained in the SE

Actions SEM is instantiated. The respective method is called trying to retrieve data from a non-existing graph. The test checks that the SE responds with an HTTP 404 error. Then the SEM requests data from an existing graph and tests that the expected resource is returned.

Expected Results The SE returns the requested data for the happy path of the test and an HTTP 404 for the request of non-existing data.

Effects on Data The test case has a precondition that a specific test data graph exists on the SE.

Status Unstable as the SE API has changed. The test case has to be revisited.

Field Description

ID OA-SEM-2

Summary Testing retrieval of data as an RDF resource, including a signal file from the SE

Objective Testing that the OA’s SEM can retrieve data from the SE and especially the respective EDF file

Assignee AUTH

Preconditions The SE contains data in RDF form in a graph used with a test URI and the respective EDF file

Test Data The data of the graph contained in the SE and the respective EDF file

Actions SEM is instantiated. The respective method is called trying to retrieve data from a non-existing graph. The test checks that the SE responds with an HTTP 404 error. Then the SEM requests data from an existing graph and tests that the expected resource and the expected EDF file are returned.

Expected Results The SE returns the requested data for the happy path of the test and an HTTP 404 for the request of non-existing data.

Effects on Data The test case has a precondition that a specific test data graph exists on the SE and that the specific EDF file exists on the SE.

Status Unstable as the SE API has changed. The test case has to be revisited.

Field Description

ID OA-SEM-3

Summary Testing the ability to update already retrieved data as an RDF resource, including a signal file from the SE

Objective Testing that the OA’s SEM can retrieve and update data from the SE and especially the respective EDF file

Assignee AUTH

Preconditions The SE contains data in RDF form in a graph used with a test URI and the respective EDF file

Test Data The data of the graph contained in the SE and the respective EDF file

Actions SEM is instantiated and requests data from an existing graph and tests that the expected resource and the expected EDF file are returned. Then the test changes the data in the respective RDF resource and executes the update operation. Finally, the test retrieves the specific graph again in order to verify that the update is successful.

Expected Results The SE returns the requested data and when the update is executed no exception is raised. When the data are checked after the update, the update is verified.

Effects on Data The test case has a precondition that a specific test data graph exists on the SE and that the specific EDF file exists on the SE.

Status Deprecated as the SE API has changed.

Field Description

ID OA-WM-1

Summary Testing workflow which leads to tachypnea feature extraction

Test Area OA’s main information handling workflow

Related Requirement OA-FR1, OA-FR2

Objective Testing that the OA’s WM can execute a workflow that gets data from SEM, decides which FE process to follow and uses FEH to communicate with the FES and get the respective feature extraction result in the expected format. Stubs are used to imitate the functionality of the SEM.

Assignee AUTH

Preconditions The SEM and the SE behave as expected, as their behavior is not tested here (a stub is used).

Target Platform Tests the overall main workflow’s execution

Test Data Sample EDF files created from AUTh

Actions WM and the rest of the OA’s modules are instantiated. A new measurement arrives from the client through the WELCOME communication API and the OA gets notified through a raised java event. The workflow is executed and the data are analyzed from the FES.

Expected Results The FES returns the analysis result recognizing tachypnea.

Effects on Data None. The test case is totally independent of stored data.

Status Deprecated as the main workflow of information handling has been revised.

Field Description

ID OA-WM-2

Summary Testing exception handling inside the main information workflow

Test Area OA’s main information handling workflow

Related Requirement OA-FR1, OA-FR2

Objective Testing that the OA’s WM can handle exceptions inside the main workflow of information handling

Assignee AUTH

Preconditions The test supposes that something goes wrong and an exception is raised from the SEM.

Target Platform Tests the WELCOME Cloud OA’s WM module

Test Data None

Actions WM and the rest of the OA’s modules are instantiated. A new measurement arrives from the client through the WELCOME communication API and the OA gets notified through a raised java event. The workflow is executed and an exception is raised by the SEM. The test checks that the exception is handled inside the workflow according to the exception handling strategy (log the error).

Expected Results The exception is raised and handled through the main workflow getting logged.

Effects on Data None. The test case is totally independent of stored data.

Status Stable. This test has been revised as the main information handling workflow changed and is now considered stable.

Field Description

ID OA-WM-3

Summary Testing multiple workflows execution in parallel.

Test Area OA’s main information handling workflow

Related Requirement OA-FR1, OA-FR2, OA-NFR3

Objective Testing that the OA’s WM can execute many workflows in parallel

Assignee AUTH

Preconditions The test supposes that multiple processes should be raised before

Test Data None

Actions WM and the rest of the OA’s modules are instantiated. Two measurements arrive from the client through the WELCOME communication API and the OA gets notified through raised java events. The workflow is executed for each incoming measurement independently. The test checks that the two workflows are executed in parallel.

Expected Results The raised workflows are executed in a truly parallel way, on separate java threads.

Effects on Data None. The test case is totally independent of stored data.



Field Description

ID OA-WM-4

Summary Testing starting of workflow

Test Area OA’s WM

Related Requirement OA-FR1, OA-FR2, OA-NFR3

Objective Testing that the OA’s WM can start a workflow on OA’s demand

Assignee AUTH

Preconditions None

Test Data None

Actions WM is instantiated and a specific stub workflow is started. The status of the workflow instance is checked and it is verified that the execution has started.

Expected Results No exception is raised and the workflow is normally started

Effects on Data None. The test case is totally independent of stored data.

Status Stable.


Field Description

ID OA-WM-5

Summary Testing stopping of workflow

Test Area OA’s WM

Related Requirement OA-FR1, OA-FR2, OA-NFR3

Objective Testing that the OA’s WM can stop a workflow on OA’s demand

Assignee AUTH

Preconditions None

Target Platform Tests the OA’s WM module

Test Data None

Actions WM is instantiated and a specific stub workflow is started. The status of the workflow instance is checked and it is verified that the execution has started. Then the workflow is stopped and its status is checked.

Expected Results No exception is raised and the workflow is normally stopped.

Effects on Data None. The test case is totally independent of stored data.



Field Description

ID OA-WM-6

Summary Testing the release of resources in order to prevent memory leaks

Test Area OA’s WM

Related Requirement OA-NFR3

Objective Testing that the OA’s WM releases the occupied resources (java threads) when it should

Assignee AUTH

Preconditions None

Test Data None

Actions WM is instantiated and ten workflows are executed in parallel. When the execution of the workflows ends, the cleanup mechanism is called and the number of the live java threads is checked.

Expected Results No exception is raised and the java resources are freed normally, confirming that no memory leaks or other errors occur


Status Stable.

Field Description

ID OA-WM-7

Summary Testing the passing of arguments in workflows

Test Area OA’s WM

Related Requirement No direct requirement

Objective Testing that the OA’s WM can start a workflow and pass arguments to it

Assignee AUTH

Preconditions None

Test Data None

Actions WM is instantiated and one workflow starts with a specific argument given. The test verifies that the workflow process started and checks that the specific argument has been used to set the value of a specific variable.

Expected Results No exception is raised and the tested variable contains the value of the given argument.

Effects on Data None. The test case is totally independent of stored data.

Status Stable.


Field Description

ID OA-WM-8

Summary Testing that an exception occurring in a workflow is thrown in the WM

Test Area OA’s WM

Related Requirement No direct requirement

Objective Testing that when an exception happens during normal workflow execution, the WM sees the exception and handles it accordingly.

Assignee AUTH

Preconditions None

Test Data None

Actions WM is instantiated and one workflow starts. This workflow raises an exception and the WM catches it.

Expected Results The exception is raised as expected and caught by the WM module.

Effects on Data None. The test case is totally independent of stored data.

Status Under development.

Field Description

ID OA-WM-9

Summary Testing starting of the main information handling workflow

Test Area OA’s WM

Related Requirement OA-FR1, OA-FR2, OA-NFR3

Objective Testing that the OA’s WM can start the main information handling workflow on OA’s demand

Assignee AUTH

Preconditions None

Test Data None

Actions WM is instantiated and the real main information handling workflow is started. The status of the workflow instance is checked and it is verified that the execution has started.

Expected Results No exception is raised and the main information handling workflow is normally started

Effects on Data None. The test case is totally independent of stored data.

Status Stable.

Field Description

ID OA-WM-10

Summary Testing stopping of the main information handling workflow

Test Area OA’s WM

Related Requirement OA-FR1, OA-FR2, OA-NFR3

Objective Testing that the OA’s WM can stop the main information handling workflow on OA’s demand

Assignee AUTH

Preconditions None

Test Data None

Actions WM is instantiated and the main information handling workflow is started. The status of the workflow instance is checked and it is verified that the execution has started. Then the workflow is stopped and its status is checked.

Expected Results No exception is raised and the workflow is normally stopped.

Effects on Data None. The test case is totally independent of stored data.


Status Stable.

Field Description

ID OA-WM-11

Summary Testing the passing of arguments in the main information handling workflow

Test Area OA’s WM

Related Requirement No direct requirement

Objective Testing that the OA’s WM can start the main information handling workflow and pass a URI regarding the specific measurement to be served

Assignee AUTH

Preconditions None

Test Data None

Actions WM is instantiated and the main information handling workflow starts with the respective measurement URI given as argument. The test verifies that the workflow process started and checks that the specific argument has been used to set the value of a specific inner workflow variable.

Expected Results No exception is raised and the tested variable contains the value of the given argument.

Effects on Data None. The test case is totally independent of stored data.

Status Stable.

Field Description

ID OA-WM-12

Summary Testing that the main information handling workflow saves the state of the analysis process

Test Area OA’s WM

Related Requirement No direct requirement

Objective The objective is to check that when the workflow finishes execution, it saves its own state.

Assignee AUTH

Preconditions None

Test Data None

The various services called by the main workflow are modelled as stubs. The execution of the workflow according to the stubs’ behavior is checked.

Expected Results No exception is raised and the “save analysis status” functionality is called from the main information handling workflow.

Effects on Data None. The test case is totally independent of stored data.

Status Stable.

Field Description

ID OA-WM-13

Summary Testing that the main information handling workflow does not raise the DSS rules validation procedure when it should not

Test Area OA’s WM

Related Requirement No direct requirement

Objective The objective is to check that the main information handling workflow correctly avoids calling the DSS validation procedure when it should not.

Assignee AUTH

Preconditions The test supposes that the analysis context information retrieved (using the respective stub) prescribes that DSS rules should not be validated.

Target Platform Tests the overall main workflow’s execution

Test Data None






behavior is checked.

Expected Results No exception is raised and the “Raise the respective DSS processes” functionality is not called from the main information handling workflow.

Effects on Data None. The test case is totally independent of stored data.

Status Stable.

Field Description

ID OA-WM-14

Summary Testing that the main information handling workflow raises the DSS rules validation procedure when it should

Test Area OA’s WM


workflow correctly raises the DSS validation procedure when it should.

Assignee AUTH

Preconditions The test supposes that the analysis context information retrieved (using the respective stub) prescribes that DSS rules should be validated.

Target Platform Tests the overall main workflow’s execution

Test Data None





behavior is checked.

Expected Results No exception is raised and the “Raise the respective DSS processes” functionality is called from the main information



Status Stable.
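
The stub-based checks of OA-WM-13 and OA-WM-14 can be illustrated with a minimal Python sketch. It is not the project's test code: the function `run_main_workflow`, the service names `get_analysis_context`, `raise_dss_processes` and `save_analysis_status`, and the `validate_dss_rules` flag are all hypothetical names chosen for this illustration. Only the pattern follows the test descriptions: the analysis context comes from a stub, and the test checks whether the DSS step was called.

```python
from unittest.mock import Mock


def run_main_workflow(observation_uri, services):
    """Hypothetical stand-in for the main information handling workflow."""
    context = services.get_analysis_context(observation_uri)
    if context["validate_dss_rules"]:
        services.raise_dss_processes(observation_uri)
    services.save_analysis_status(observation_uri)


def make_stubbed_services(validate_dss_rules):
    """Build stub services whose analysis context is fixed by the test."""
    services = Mock()
    services.get_analysis_context.return_value = {
        "validate_dss_rules": validate_dss_rules
    }
    return services


# OA-WM-13 style check: the context says DSS rules must not be validated.
services = make_stubbed_services(validate_dss_rules=False)
run_main_workflow("urn:example:observation-1", services)
services.raise_dss_processes.assert_not_called()

# OA-WM-14 style check: the context says DSS rules must be validated.
services = make_stubbed_services(validate_dss_rules=True)
run_main_workflow("urn:example:observation-1", services)
services.raise_dss_processes.assert_called_once_with("urn:example:observation-1")
```

The same pattern would apply to the FEH, CH and ESC cases by swapping the flag and the checked stub.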

Field Description


raise the FEH procedure when it should not Test Area OA’s WM


workflow correctly avoids calling the FEH procedure when it should not.

Assignee AUTH

Preconditions The test supposes that the analysis context information retrieved (using the respective stub) prescribes that Feature Extraction Server (FES) should not be called.

Target Platform Tests the overall main workflow’s execution

Test Data None





behavior is checked.

Expected Results No exception is raised and the “Raise the respective FEH



Status Stable.

Field Description


FEH procedure when it should


Test Area OA’s WM


workflow correctly raises the FEH procedure when it should.

Assignee AUTH

Preconditions The test supposes that the analysis context information retrieved (using the respective stub) prescribes that Feature Extraction Server (FES) should be called.

Target Platform Tests the overall main workflow’s execution

Test Data None





behavior is checked.

Expected Results No exception is raised and the “Raise the respective FEH



Status Stable.

Field Description


raise the CH procedure when it should not Test Area OA’s WM


workflow correctly avoids calling the CH procedure when it


retrieved (using the respective stub) prescribes that Calculation

Handler (CH) should not be called.



Test Data None





behavior is checked.

Expected Results No exception is raised and the “Raise the respective CH processes” functionality is not called from the main information handling workflow.

Effects on Data None. The test case is totally independent of stored data.

Status Stable.

Field Description


CH procedure when it should Test Area OA’s WM


workflow correctly raises the CH procedure when it should.

Assignee AUTH

Preconditions The test supposes that the analysis context information retrieved (using the respective stub) prescribes that Calculation Handler (CH) should be called.

Target Platform Tests the overall main workflow’s execution

Test Data None





behavior is checked.


Expected Results No exception is raised and the “Raise the respective CH processes” functionality is called from the main information handling workflow.

Effects on Data None. The test case is totally independent of stored data.

Status Stable.

Field Description


raise the ESC procedure when it should not Test Area OA’s WM


workflow correctly avoids calling the ESC procedure when it


retrieved (using the respective stub) prescribes that External

Sources Connector (ESC) should not be called. Target Platform Tests the overall main workflow’s execution

Test Data None





behavior is checked.

Expected Results No exception is raised and the “Raise the respective ESC



Status Stable.


Field Description


ESC procedure when it should Test Area OA’s WM


workflow correctly raises the ESC procedure when it should.

Assignee AUTH

Preconditions The test supposes that the analysis context information retrieved (using the respective stub) prescribes that External Sources Connector (ESC) should be called.

Target Platform Tests the overall main workflow’s execution

Test Data None





behavior is checked.

Expected Results No exception is raised and the “Raise the respective ESC



Status Stable.

Field Description

ID OA-WM-21

Summary Testing that the main information handling workflow retrieves the analysis context information needed to proceed

Test Area OA’s WM

Related Requirement No direct requirement

Objective The objective is to check that the main information handling workflow correctly retrieves the analysis context information which will guide the execution of the rest of the workflow

Assignee AUTH

Preconditions None


Test Data None





behavior is checked.

Expected Results No exception is raised and the “Retrieve arrived observation's analysis context” functionality is called from the main information handling workflow.

Effects on Data None. The test case is totally independent of stored data.

Status Stable.

Field Description

ID OA-ECI-1

Summary Testing the starting of the OA

Test Area OA’s ECI – Integration between OA and WELCOME

communications API

Related Requirement Needed according to the design decisions.

Objective Testing that the OA can start through a call to its ECI.

Assignee AUTH

Preconditions None

Target Platform Tests that the OA can start through an ECI call

Test Data None


Actions The ECI’s Starter class is instantiated and workflow session is

prepared. The status of the OA is checked to verify that it has

actually been started.

Expected Results No exception is raised and the OA has started normally.


Status Developed. However, this test has to be revisited, as the WM has been revised and the respective classes have been substantially refactored.

Field Description

ID OA-ECI-2

Summary Testing the starting of the OA with multiple workflows in memory


communications API


Objective Testing that the OA can start through a call to its ECI, loading two

workflows in its memory.

Assignee AUTH

Preconditions None

Target Platform Tests that the OA can start through an ECI call

Test Data None


prepared. The status of the OA is checked to verify that it has actually been started and that two BPMN2 workflows are actually loaded in memory.

Expected Results No exception is raised and the OA has started normally.


Status Developed. However, this test has to be revisited, as the WM has been revised and the respective classes have been substantially refactored.

Field Description


ID OA-ECI-3

Summary Testing that errors occurring in the OA are propagated correctly


communications API


Objective Testing that the OA notifies listeners of an error event. Typically,

the Communication API would be such a listener.

Assignee AUTH

Preconditions None

Target Platform OA’s ECI

Test Data None


prepared. An exception is mocked during workflow execution.

Expected Results An exception is raised and is propagated through our custom

event mechanism.


Status Stable.
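
The listener-based propagation checked in OA-ECI-3 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the OA's event mechanism: the class `ErrorEventSource`, its methods and the `failing_task` function are hypothetical. A failure is mocked during execution, every registered listener is notified, and the exception is then raised again.

```python
class ErrorEventSource:
    """Hypothetical minimal event mechanism: listeners are plain callables."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def run(self, task):
        """Run a task; on failure notify every listener, then re-raise."""
        try:
            return task()
        except Exception as exc:
            for listener in self._listeners:
                listener(exc)
            raise


def failing_task():
    # Stands in for an exception mocked during workflow execution.
    raise RuntimeError("mocked workflow failure")


received = []                          # plays the role of the Communication API
source = ErrorEventSource()
source.add_listener(received.append)
try:
    source.run(failing_task)
except RuntimeError:
    pass                               # the exception is still raised to the caller
```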


Field Description

ID OA-LM-1

Summary Testing the OA’s LM capability to log

Test Area OA’s LM

Related Requirement OA-NFR-1, Needed according to the design decisions.

Objective Testing that the OA’s log can log at multiple levels

Assignee AUTH


Preconditions None

Target Platform OA’s LM

Test Data None

Actions The OA’s LM is instantiated and four messages at various log levels are logged.

Expected Results No exception is raised and the messages have been logged

successfully according to the LM configuration.


Status Developed. However this test has to be revisited as the LM

needs to be checked as a database log.
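
A multi-level logging check of the kind described in OA-LM-1 can be sketched with Python's standard `logging` module. The logger name `oa.lm.sketch`, the message texts and the INFO threshold are assumptions made for this illustration; the OA's LM itself is not shown here.

```python
import io
import logging


def build_logger(level, stream):
    """Create a logger writing 'LEVEL message' lines to the given stream."""
    logger = logging.getLogger("oa.lm.sketch")   # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
logger = build_logger(logging.INFO, stream)      # configuration: INFO and above

# Four messages at four different levels, as in the test description.
logger.debug("debug message")
logger.info("info message")
logger.warning("warning message")
logger.error("error message")

logged_lines = stream.getvalue().splitlines()
```

With the threshold set to INFO, the debug message is filtered out and the other three are written, which is the sense in which messages are logged "according to the LM configuration".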

Field Description

ID SGC-1

Summary Create RDF Graph using SPARQL Graph Store HTTP Protocol communication to the RDF Triple Store

Test Area SPARQL Graph Store Protocol Client module

Related Requirement Overall Cloud design

Objective Testing that the SPARQL Graph Store Protocol Client module can communicate with the RDF Triple Store using HTTP requests as defined in the SPARQL 1.1 Graph Store Protocol Specification

Assignee AUTH

Preconditions RDF Triple Store supporting SPARQL 1.1 Graph Store Protocol (Virtuoso, Fuseki)

Target Platform Communication API application, Storage Engine

Test Data Sample RDF graph, Sample URI

Actions 1. Client is instantiated.

2. The respective create() method is called with the sample RDF graph and URI as arguments.

3. The respective create() method is called a second time with the same arguments.

Expected Results Method returns successfully in step 2 and throws an exception in step 3.

Effects on Data None (Sample RDF data have distinct namespace and tests are not run on production infrastructure)

Status stable
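
The create-twice and delete-twice expectations of SGC-1 and SGC-2 can be sketched against an in-memory stand-in. This is a Python illustration only: `InMemoryGraphStore`, `GraphStoreError` and the sample URI are hypothetical, and no HTTP request is made to a triple store. Graphs are modelled as sets of (subject, predicate, object) tuples.

```python
class GraphStoreError(Exception):
    """Raised when an operation does not match the current state of the store."""


class InMemoryGraphStore:
    """Hypothetical stand-in for the client; graphs are sets of triples."""

    def __init__(self):
        self._graphs = {}

    def create(self, triples, uri):
        if uri in self._graphs:
            raise GraphStoreError(f"graph already exists: {uri}")
        self._graphs[uri] = frozenset(triples)

    def delete(self, uri):
        if uri not in self._graphs:
            raise GraphStoreError(f"no such graph: {uri}")
        del self._graphs[uri]


def raises_store_error(operation):
    """Return True if calling the operation raises GraphStoreError."""
    try:
        operation()
    except GraphStoreError:
        return True
    return False


SAMPLE_URI = "http://example.org/test/graph-1"
SAMPLE_GRAPH = {("http://example.org/test/s", "http://example.org/test/p", "o")}

store = InMemoryGraphStore()
store.create(SAMPLE_GRAPH, SAMPLE_URI)                       # first create succeeds
second_create_failed = raises_store_error(
    lambda: store.create(SAMPLE_GRAPH, SAMPLE_URI)           # second create must fail
)
```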

Field Description

ID SGC-2 Summary Delete RDF Graph using SPARQL Graph Store HTTP Protocol





(Virtuoso, Fuseki)




2. Sample RDF graph with sample URI is created

3. The respective delete() method is called with the sample URI as argument.

4. The respective method is called a second time with the same arguments.

Expected Results Method returns successfully in step 3 and throws an exception in step 4.

Effects on Data None (Sample data are not on the WELCOME namespace and tests are not run on production infrastructure)

Status stable

Field Description

ID SGC-3


Summary Read RDF Graph using SPARQL Graph Store HTTP Protocol





(Virtuoso, Fuseki)




2. Sample RDF graph with sample URI is created

3. The respective read() method is called having as arguments

the sample URI.

4. The delete() method is called for the sample URI.

5. The respective read() method is called a second time with the

same arguments.

Expected Results RDF graph returned on step 3 is isomorphic with sample graph.

Exception thrown on step 5.

Effects on Data None (Sample data are not on the WELCOME namespace and


Field Description

ID SGC-4 Summary Replace RDF Graph using SPARQL Graph Store HTTP Protocol




defined in the SPARQL 1.1 Graph Store Protocol Specification Assignee AUTH


Preconditions RDF Triple Store supporting SPARQL 1.1 Graph Store Protocol

(Virtuoso, Fuseki)


Test Data Sample RDF graph A, Sample RDF graph B, Sample URI


2. The respective replace() method is called having as arguments

the sample RDF graph A and the sample URI.

3. Sample RDF graph A with sample URI is created

4. The respective replace() method is called having as arguments

the sample RDF graph B and the sample URI.

5. The read() method is called for the sample URI.

Expected Results Exception thrown on step 2.

The RDF graph returned on step 5 is isomorphic with sample

graph B.



Field Description

ID SGC-5 Summary Update RDF Graph using SPARQL Graph Store HTTP Protocol





(Virtuoso, Fuseki)


Test Data Sample RDF graph A, Sample RDF graph B, Sample URI



2. Sample RDF graph A is created with sample URI.

3. The respective update() method is called having as arguments

sample RDF graph B and sample URI.

4. The read() method is called for the sample URI.

Expected Results The RDF graph returned on step 4 is isomorphic with the sum of

sample graphs A and B.
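
The difference between the replace and update expectations (SGC-4 and SGC-5) can be sketched on plain sets of triples. The helper names `replace_graph` and `update_graph` and the dictionary-based store are assumptions made for this Python illustration. For graphs without blank nodes, set equality stands in for graph isomorphism. SGC-5 does not state what an update of a missing graph should do; in this sketch a `KeyError` simply surfaces.

```python
def replace_graph(store, uri, triples):
    """Overwrite an existing graph; a missing graph is an error (as in SGC-4, step 2)."""
    if uri not in store:
        raise KeyError(uri)
    store[uri] = set(triples)


def update_graph(store, uri, triples):
    """Merge triples into an existing graph, giving the 'sum' of old and new content."""
    store[uri] = store[uri] | set(triples)


URI = "http://example.org/test/graph-1"
GRAPH_A = {("ex:s", "ex:p", "a")}
GRAPH_B = {("ex:s", "ex:p", "b")}

store = {URI: set(GRAPH_A)}
update_graph(store, URI, GRAPH_B)
after_update = set(store[URI])       # contains the triples of both A and B

replace_graph(store, URI, GRAPH_B)
after_replace = set(store[URI])      # contains the triples of B only
```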



Field Description

ID SGC-6 Summary Execute ASK SPARQL query on RDF Triple Store Test Area SPARQL Graph Store Protocol Client module


make ASK queries on the SPARQL endpoint of the RDF Triple

Store Assignee AUTH Preconditions RDF Triple Store supporting SPARQL 1.1 Graph Store Protocol

(Virtuoso, Fuseki)


Test Data Sample RDF graph A, Sample URI, Sample ASK query A, Sample ASK query B

Actions 1. Client is instantiated.


3. The respective ASK() method is called having as arguments

sample query A.

4. The respective ASK() method is called having as arguments

sample query B.

Expected Results Method returns true on step 3, false on step 4.




Field Description

ID SGC-7

Summary Determine if a specified graph exists on the RDF Triple Store

Test Area SPARQL Graph Store Protocol Client module

Related Requirement Overall Cloud design

Objective The HEAD method as defined in the SPARQL Graph Store Protocol spec. is not implemented on the Virtuoso server. A workaround method has been implemented on the SPARQL Graph Store Client module, to check for the existence of a specified graph, using either a full read request (GET) or an appropriate ASK query

Assignee AUTH

Preconditions RDF Triple Store supporting SPARQL 1.1 Graph Store Protocol (Virtuoso, Fuseki)


Test Data Sample RDF graph A, Sample URI


2. Exists() method is called having as arguments the sample URI.


3. Exists() is called again with same arguments.

Expected Results Method returns false on step 2, true on step 3.
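
One possible form of an "appropriate ASK query" for the existence check is sketched below in Python. The function name `build_graph_exists_query` is hypothetical, and the query text is an assumption, not necessarily the one used by the client module. Note that this query returns false for a named graph that contains no triples.

```python
def build_graph_exists_query(graph_uri):
    """Build an ASK query that is true when the named graph holds at least one triple."""
    forbidden = set('<>"{}|^`\\')
    if any(ch in forbidden or ch.isspace() for ch in graph_uri):
        # Refuse characters that are not allowed inside a SPARQL IRI reference.
        raise ValueError(f"not a valid IRI for a SPARQL query: {graph_uri!r}")
    return f"ASK {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}"


query = build_graph_exists_query("http://example.org/test/graph-1")
```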



Field Description

ID SGC-8

Summary Search for Graphs on RDF Triple Store based on URI and rdf:type property

Test Area SPARQL Graph Store Protocol Client module


perform basic search for graphs, using prepared SPARQL queries


Assignee AUTH Preconditions RDF Triple Store supporting SPARQL 1.1 Graph Store Protocol

(Virtuoso, Fuseki)


Test Data Sample RDF graph A, B and C, Sample URIs A, B and C


2. Sample RDF graph A is created with sample URI A.

3. Sample RDF graph B is created with sample URI B.

4. Sample RDF graph C is created with sample URI C.

5. The respective search() method is called having as arguments

URI A, rdf:type null.

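6. The respective search() method is called having as arguments: a string that is a substring of all sample URIs and rdf:type null.

7. The respective search() method is called having as arguments: a string that is not a substring of any of the sample URIs and rdf:type null.

8. The respective search() method is called having as arguments: a null string, and rdf:type same as graph C.

9. The respective search() method is called having as arguments: a string that is a substring of all sample URIs, and an rdf:type same as graph C.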

Expected Results Method returns

URI A on step 5

URIs A, B and C on step 6

Empty set on step 7

URI C on step 8

URI C on step 9
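
The search expectations above can be sketched on an in-memory dictionary of graphs. This Python illustration does not use the prepared SPARQL queries of the client module: the `search` function, the sample URIs and the two sample types are hypothetical. Graphs A and B are assumed to share a type that differs from the type of graph C.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def search(graphs, uri_fragment=None, rdf_type=None):
    """Return URIs of graphs matching an optional URI substring and rdf:type."""
    matches = set()
    for uri, triples in graphs.items():
        if uri_fragment is not None and uri_fragment not in uri:
            continue
        if rdf_type is not None and not any(
            p == RDF_TYPE and o == rdf_type for _, p, o in triples
        ):
            continue
        matches.add(uri)
    return matches


URI_A = "http://example.org/test/graph-a"
URI_B = "http://example.org/test/graph-b"
URI_C = "http://example.org/test/graph-c"
TYPE_AB = "http://example.org/test/TypeOne"   # assumed type of graphs A and B
TYPE_C = "http://example.org/test/TypeTwo"    # assumed type of graph C

graphs = {
    URI_A: {(URI_A, RDF_TYPE, TYPE_AB)},
    URI_B: {(URI_B, RDF_TYPE, TYPE_AB)},
    URI_C: {(URI_C, RDF_TYPE, TYPE_C)},
}

step5 = search(graphs, URI_A)                          # exact URI, no type
step6 = search(graphs, "example.org/test/graph")       # substring of all URIs
step7 = search(graphs, "no-such-graph")                # substring of none
step8 = search(graphs, None, TYPE_C)                   # type only
step9 = search(graphs, "example.org/test/graph", TYPE_C)
```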



Field Description

ID SGC-9 Summary Search for references to specific graph on RDF Triple Store based

on rdf:type property Test Area SPARQL Graph Store Protocol Client module

Related Requirement Overall Cloud design


Objective Testing that the SPARQL Graph Store Protocol Client module can

search for relationships between graphs, using prepared

SPARQL queries Assignee AUTH Preconditions RDF Triple Store supporting SPARQL 1.1 Graph Store Protocol

(Virtuoso, Fuseki)


Test Data Sample RDF graph A, B, Sample URIs A, B


2. Sample RDF graph A is created with sample URI A.

3. Sample RDF graph B is created with sample URI B (graph B

having a triple pointing to URI A).

4. The respective method is called having as arguments URI A,

rdf:type same as graph B.

5. The respective method is called having as arguments URI A

and rdf:type different than graph B.

Expected Results Method returns

URI B on step 4

Empty set on step 5



Field Description

ID CRUD-FH-1

Summary CRUD Layer - File Operations

Test Area CRUD Layer - File Handler

Related Requirement SE-FR-3 and Overall Cloud design

Objective Testing that the CRUD Layer can perform all required file operations (create, read, delete)

Assignee AUTH

Preconditions An existing path with write permissions is properly defined in API configuration



Test Data Sample data to be handled as a file

Actions 1. File Handler is instantiated.

2. File is written.

3. File is read back.

4. Data read back are compared to original data

5. File is deleted.

Expected Results All methods return without throwing any exceptions

Effects on Data None (tests are not run on production infrastructure)

Status Deprecated (current test is rudimentary and a full test suite for

the File Handler will be developed against the cloud file storage

solution that will be selected)
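
The write, read back, compare and delete sequence of CRUD-FH-1 can be sketched with the Python standard library. The function `file_round_trip`, the file name and the payload are hypothetical, and a temporary directory stands in for the configured path with write permissions.

```python
import os
import tempfile


def file_round_trip(directory, name, data):
    """Write data to a file, read it back, delete the file, and report both results."""
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(data)
    with open(path, "rb") as handle:
        read_back = handle.read()
    os.remove(path)
    return read_back, os.path.exists(path)


with tempfile.TemporaryDirectory() as directory:
    read_back, still_exists = file_round_trip(
        directory, "sample.bin", b"sample payload"
    )
```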

Field Description

ID SR-1

Summary SPIN rules basic tests

Test Area Validation module (SPIN rules)

Related Requirement SE-FR-2, SE-FR-4 and overall Cloud Design

Objective Testing that SPIN rules are firing as expected

Assignee AUTH

Preconditions SPIN API available

Target Platform Communication API application

Test Data Sample ontologies with corresponding instance data developed using TopBraid Composer

Actions 1. RDF models are created consisting of respective ontology and instances

2. SPIN API method getSpinConstraintViolations() is called with the models of step 1 as arguments.

3. Constraint Violations returned are compared against those

produced by TopBraid Composer.

Expected Results Tests should produce the same constraint violations as TopBraid

Composer.

Effects on Data None (tests are independent)


Status Under development (test suite is evolving alongside the

WELCOME data model)

Field Description

ID CRUD-SH-1

Summary CRUD Layer - Schema Handler Test

Test Area CRUD Layer - Schema Handler

Related Requirement SE-FR-2 and Overall Cloud design

Objective Testing that the CRUD Layer can perform all required WELCOME schema operations (create, read, update, delete)

Assignee AUTH

Preconditions SGC-* tests passed successfully

SR-* tests passed successfully


Test Data Sample Ontologies A, B (serialized in Turtle)

Actions 1. Schema Handler is instantiated.

2. Create() method is called with argument ontology A

3. Update() method is called with argument ontology B.

4. Read() method is called.

5. Delete() method is called.

6. Delete() method is called.

Expected Results Result of step 4 is isomorphic with the sum of A and B.


Effects on Data Test MUST NOT be run against production infrastructure as it alters the cloud data model.

Status stable

Field Description

ID CRUD-MH-1

Summary CRUD Layer - Model Handler Test

Test Area CRUD Layer - Model Handler

Related Requirement SE-FR-2, SE-FR-4, and Overall Cloud design

Objective Testing that the CRUD Layer can perform all required graph operations (create, read, replace, delete)

Assignee AUTH

Preconditions SGC-* tests passed successfully

SR-* tests passed successfully

CRUD-SH-* tests passed successfully


Test Data Sample Ontology, Sample RDF Models A and B, Sample URI

Actions 1. Schema Handler is instantiated with sample ontology.

2. Create() method is called with arguments RDF Model A,

sample URI

3. Model is read back from sample URI.

4. Replace() method is called with arguments RDF Model B,

sample URI

5. Model is read back from sample URI.

6. Delete() method is called with argument sample URI

7. Delete() method is called with argument sample URI

Expected Results Result of step 3 is isomorphic with RDF Model A

Result of step 5 is isomorphic with RDF Model B


Effects on Data Test MUST NOT be run against production infrastructure as it alters the cloud data model.

Status stable

Field Description

ID CRUD-RDF-1

Summary CRUD Layer - RDF Operations - Verify Single Resource

Test Area CRUD Layer

Related Requirement SE-FR-2, SE-FR-4, and Overall Cloud design

Objective Testing the method that checks that an input RDF model contains one and only one instance of the classes allowed to be instantiated by the WELCOME data model.

Assignee AUTH

Preconditions none



Test Data Sample Ontology, Sample RDF Models

Actions Respective method is called with sample arguments

Expected Results Method returns true if and only if there is one and only one

instance of a class that is inside the hierarchy of allowed classes,

and that class is a terminal node in the hierarchy class tree.

Otherwise method returns false.

Effects on Data None (Tests are independent)

Status stable
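
The rule stated in the expected results of CRUD-RDF-1 can be sketched on a toy class hierarchy. This Python illustration is not the CRUD Layer's method: the function `has_single_terminal_instance`, the `ex:` class names and the parent-map representation of the hierarchy are all assumptions.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def has_single_terminal_instance(triples, parent_of):
    """True iff exactly one rdf:type triple names an allowed class and that class is a leaf.

    parent_of maps every allowed class to its parent class (None for the root).
    """
    allowed = set(parent_of)
    non_terminal = {parent for parent in parent_of.values() if parent is not None}
    typed = [(s, o) for s, p, o in triples if p == RDF_TYPE and o in allowed]
    if len(typed) != 1:
        return False
    return typed[0][1] not in non_terminal


# Hypothetical hierarchy: one root with two terminal subclasses.
HIERARCHY = {"ex:Base": None, "ex:LeafA": "ex:Base", "ex:LeafB": "ex:Base"}

one_leaf = {("ex:r1", RDF_TYPE, "ex:LeafA")}
non_leaf = {("ex:r1", RDF_TYPE, "ex:Base")}
two_leaves = {("ex:r1", RDF_TYPE, "ex:LeafA"), ("ex:r2", RDF_TYPE, "ex:LeafB")}
outside = {("ex:r1", RDF_TYPE, "ex:Other")}
```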

Field Description

ID CRUD-RDF-2

Summary CRUD Layer - RDF Operations - Change Resource URI

Test Area CRUD Layer

Related Requirement SE-FR-2, SE-FR-4, and Overall Cloud design

Objective Testing the method that assigns URIs to stored resources.

Assignee AUTH

Preconditions none


Test Data Sample RDF model A with root resource having URI X

Sample RDF model B with root resource as a blank node

Sample RDF model C with root resource having URI Y

Models A, B and C are otherwise identical

Actions 1. changeResourceURI() method is called for model A with a

target URI of Y

2. changeResourceURI() method is called for model B with a

target URI of Y

Expected Results After method call all Models A, B and C are isomorphic.


Status stable
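
The renaming checked in CRUD-RDF-2 can be sketched on sets of triples. The Python function `change_resource_uri` below is a hypothetical analogue of changeResourceURI(): its signature, which takes the node to be renamed explicitly, is an assumption, and the blank node is modelled as the plain label `_:root`. Once every root has been renamed to URI Y, set equality stands in for isomorphism.

```python
def change_resource_uri(triples, old_node, new_uri):
    """Return a copy of the model with old_node renamed to new_uri everywhere."""

    def rename(term):
        return new_uri if term == old_node else term

    return {(rename(s), p, rename(o)) for s, p, o in triples}


URI_X = "http://example.org/test/x"
URI_Y = "http://example.org/test/y"
BLANK = "_:root"   # blank node label used only inside this sketch


def sample_model(root):
    """Build the same small model around a given root node."""
    return {(root, "ex:label", "sample"), ("ex:other", "ex:refersTo", root)}


model_a = sample_model(URI_X)
model_b = sample_model(BLANK)
model_c = sample_model(URI_Y)

renamed_a = change_resource_uri(model_a, URI_X, URI_Y)
renamed_b = change_resource_uri(model_b, BLANK, URI_Y)
```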


Field Description

ID CRUD-RDF-3

Summary CRUD Layer - RDF Operations - Get the local class name for an RDF resource

Test Area CRUD Layer

Related Requirement SE-FR-2, SE-FR-4, and Overall Cloud design

Objective Testing the method that retrieves the local class name from an RDF resource.

Assignee AUTH

Preconditions none

Target Platform CRUD Layer - Communication API application

Test Data Sample Ontology, Sample RDF Models


Expected Results 1. Method returns a String representing the local name of the class of the root resource of the model, if and only if there exists one and only one rdf:type triple, and the respective class is a terminal node in the allowed classes hierarchy.

2. Otherwise method returns null.


Status stable

Field Description

ID CRUD-RDF-4

Summary CRUD Layer - RDF Operations - Internalize/Externalize RDF model namespaces and request URIs

Test Area CRUD Layer

Related Requirement Overall Cloud design


Objective Test the methods that decouple the namespaces of the actual API deployment URL from the internal URIs stored in the RDF Triple Store.

Assignee AUTH

Preconditions none

Target Platform CRUD Layer, Communication API application

Test Data Sample RDF Models


Expected Results Methods successfully transform between external and internal

URI namespaces.


Status Deprecated (this functionality is now implemented as a filter on

the Cloud API's top layer)
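
The internalize and externalize transformation can be sketched as a namespace prefix swap. Both namespaces and the function names in this Python illustration are hypothetical; the real deployment URL and internal namespace are not given in this document.

```python
EXTERNAL_NS = "https://api.example.org/welcome/"    # hypothetical deployment URL
INTERNAL_NS = "http://internal.example.org/data/"   # hypothetical internal namespace


def swap_namespace(uri, from_ns, to_ns):
    """Replace a leading namespace; URIs outside from_ns are returned unchanged."""
    if uri.startswith(from_ns):
        return to_ns + uri[len(from_ns):]
    return uri


def internalize(uri):
    return swap_namespace(uri, EXTERNAL_NS, INTERNAL_NS)


def externalize(uri):
    return swap_namespace(uri, INTERNAL_NS, EXTERNAL_NS)


external_uri = EXTERNAL_NS + "Patient/42"
internal_uri = internalize(external_uri)
round_trip = externalize(internal_uri)
```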

Field Description

ID OA- 1

Summary Testing full scenario of vital sign analysis

Test Area Overall OA analysis – Full integration test


Objective Testing that OA is working in the overall happy case scenario

Assignee AUTH


valid data.

Target Platform Tests the communication between FES and the OA, the

communication between the DSS and OA, the communication

with External Sources and the functionality of the CH. Testing

that all modules are correctly integrated and work as expected.



tested.

Actions OA is instantiated and notified that a vital sign session has ended. The OA retrieves the related vital signs and builds the so-called ObservationAnalysisContext entity, which contains the plan of the analysis. The ObservationAnalysisContext is used by the WM throughout the overall workflow path. The workflow is executed and the overall analysis results are stored in the SEM.

Expected Results The ObservationAnalysisContext and the results returned are checked. The fact that the results are actually stored in the SEM is checked.

Effects on Data The analysis data are stored in the SE while testing and removed

after the test is complete in order to keep our testing

environment clean.

Status This test currently is executed for the implemented

functionality. It does not include CH and ESC which should be

included in the future.
