CORRELATING CAROTID IMAGING AND PHYLOGENETIC TREES FOR THE PRE AND POST ANALYSIS OF GENETIC ISCHEMIC STROKES _____________________________________________________________________ A THESIS SUBMITTED TO LAHORE COLLEGE FOR WOMEN UNIVERSITY IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN COMPUTER SCIENCE By HUMA IFTIKHAR Registration No.: 07-M/LCWU-12446 ________________________________________________________________________ DEPARTMENT OF COMPUTER SCIENCE LAHORE COLLEGE FOR WOMEN UNIVERSITY, LAHORE PAKISTAN 2015

Page 1: CORRELATING CAROTID IMAGING AND PHYLOGENETIC TREES …prr.hec.gov.pk/jspui/bitstream/123456789/6678/1/... · My supervisor, Dr. Muhammad Abuzar Fahiem has been a great source of guidance

CORRELATING CAROTID IMAGING AND

PHYLOGENETIC TREES FOR THE PRE AND POST

ANALYSIS OF GENETIC ISCHEMIC STROKES _____________________________________________________________________

A THESIS SUBMITTED TO LAHORE COLLEGE FOR WOMEN UNIVERSITY IN

PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY IN COMPUTER SCIENCE

By

HUMA IFTIKHAR

Registration No.: 07-M/LCWU-12446

________________________________________________________________________

DEPARTMENT OF COMPUTER SCIENCE

LAHORE COLLEGE FOR WOMEN UNIVERSITY, LAHORE

PAKISTAN

2015


CERTIFICATE

This is to certify that the research work described in this thesis, submitted by Ms. Huma Iftikhar to the Department of Computer Science, Lahore College for Women University, has been carried out under my direct supervision. I have personally gone through the raw data and certify the correctness and authenticity of all results reported herein. I further certify that the thesis data have not been used, in part or in full, in a manuscript already submitted or in the process of submission in partial or complete fulfilment of the award of any other degree from any other institution, at home or abroad. I also certify that the enclosed manuscript has been prepared under my supervision, and I endorse its evaluation for the award of the PhD degree through the official procedure of the University.

____________________________

Dr. Muhammad Abuzar Fahiem

Supervisor

Date:

Verified By

________________

Dr. Muhammad Abuzar Fahiem

Chairperson

Department of Computer Science

Stamp

_________________

Controller of Examination

Stamp

Date: ___________


Dedicated to

My Parents

Muhammad Iftikhar Ali & Farhana Iftikhar For their unconditional love and support

My loving Husband Tauseef Ali Sulehri

For giving me my identity. Who wipes out the sense of time, memory of a difficult beginning and fear of an end in me.

My kids

Areesha, Marva, Shafay, Saad and Eshaal Who are indeed treasures from Allah. They have made me

complete.


ACKNOWLEDGMENTS

I am grateful to Allah for giving me courage and for enlightening my path. I am blessed by Him to have caring and loving people around me.

My supervisor, Dr. Muhammad Abuzar Fahiem, has been a great source of guidance. He has provided constant support and guidance through each step of this research work. This work could never have been completed without his earnest supervision.

I would like to thank my parent university, Lahore College for Women University, Lahore, Pakistan, which played a very important role throughout my studies as well as in the completion of my PhD.

I would like to thank the HapMap consortium (www.hapmap.org/) for providing data for this research.

I would like to thank my friends for bearing with me in the hard times, especially Saima for spending her precious time to help and support me.

Most of all, I owe my degree to my husband, kids, family and in-laws; without their support and patience, achieving my goal would have been impossible. I want to express special gratitude to my mother-in-law and my parents for their moral support and encouragement.

Huma Iftikhar.


CONTENTS

List of Tables
List of Figures
List of Abbreviations
Abstract

Chapter 1: Introduction
1.1 Types of Strokes
1.2 Ischemic Strokes
1.2.1 Subtypes of Ischemic Strokes
1.2.2 Causes of Ischemic Strokes
1.2.3 Prevention and Diagnostics

Chapter 2: Review of Literature
2.1 Ischemic Stroke Risk Estimation Approaches
2.1.1 Medical Image Analysis based Ischemic Stroke Risk Estimation
2.1.1.1 Image Acquisition
2.1.1.1.1 Carotid Duplex Ultrasound (CDU)
2.1.1.1.2 Computed Tomography Angiography (CTA)
2.1.1.1.3 Magnetic Resonance Angiography (MRA)
2.1.1.1.4 Cerebral Angiography (CAG)
2.1.1.1.5 Digital Subtraction Angiography (DSA)
2.1.1.2 CA and IMT Segmentation Techniques
2.1.1.2.1 Edge Tracking and Gradient
2.1.1.2.2 Dynamic Programming
2.1.1.2.3 Active Contours
2.1.1.2.4 Nakagami Mixture Modeling
2.1.1.2.5 Hough Transform
2.1.1.2.6 Integrated Approaches
2.1.1.3 Plaque Segmentation Techniques
2.1.1.3.1 Discrete Dynamic Contour
2.1.1.3.2 Kalman Filters
2.1.1.3.3 Balloon
2.1.1.3.4 Canny Edge Detection
2.1.1.3.5 Morphological Operations
2.1.1.4 Features
2.1.2 Genetic Data Analysis based Ischemic Stroke Risk Estimation
2.1.2.1 Phylogenetic Trees
2.1.2.1.1 Phylogenetic Data
2.1.2.1.2 Phylogenetic Tree Construction Methods
2.1.3 Classifiers
2.2 Comparison
2.2.1 Comparison of Different Imaging Techniques
2.2.2 Comparison of Different Approaches for Ischemic Stroke Risk Estimation

Chapter 3: Proposed Approach - Ischemic Stroke Risk Estimation Using Carotid Imaging
3.1 Materials and Methods
3.1.1 Phase-I: Preprocessing
3.1.2 Phase-II: Intima-Media Segmentation
3.1.3 Phase-III: Estimation
3.2 Results
3.3 Discussion
3.4 Conclusion

Chapter 4: Proposed Approach - Genetic Data Based Ischemic Stroke Classification
4.1 Materials and Methods
4.1.1 Data
4.1.2 Method
4.1.3 Classification
4.1.3.1 Bayes Net
4.1.3.2 Naïve Bayes
4.1.3.3 IBk
4.1.3.4 AdaBoostM1
4.1.3.5 Classification via Regression
4.1.3.6 J48
4.1.3.7 Random Forest
4.1.3.8 Bagging
4.1.3.9 Multilayer Perceptron
4.2 Results and Discussion
4.3 Conclusion

Chapter 5: Analysis & Discussion - Correlating Phylogenetic Trees
5.1 Phylogenetic Tree Construction Methods
5.1.1 Unweighted Pair Group Method using Arithmetic Averages (UPGMA)
5.1.2 Neighbor Joining (NJ)
5.1.3 Maximum Parsimony (MP)
5.1.4 Maximum Likelihood (ML)
5.2 Softwares and Tools Available for Phylogenetic Tree Construction
5.3 Bioinformatics Databanks
5.4 Materials and Methods
5.4.1 Data
5.4.2 Method
5.5 Results and Discussion
5.6 Conclusion

Chapter 6: Conclusion & Future Recommendations
6.1 Conclusion
6.2 Future Recommendations

References
Plagiarism Report
List of Publications and Reprints


List of Tables

Table No. Title

2.1 Comparison of different medical imaging techniques for the carotid artery
2.2 Review of different stenosis and IMT estimation approaches
3.1 Dataset details
3.2 Comparison of manual and proposed approach measurements for 100 carotid artery ultrasound images
3.3 Wilcoxon ranksum test computed for Expert 1, Expert 2 and Automatic measurements for 100 carotid artery ultrasound images
3.4 Difference computed for classification results of Expert 1, Expert 2 and Automatic measurements for 100 carotid artery ultrasound images
3.5 Comparison of proposed approach with existing approaches
4.1 Stroke associated genes/ locus, SNP id/ haplotype and corresponding SNPs
4.2 Details of dataset
4.3 Sample genetic data for randomly chosen subjects
4.4 Classification results for genetic data using different classifiers
5.1 Phylogenetic tools and their implementation methods
5.2 Details of selected HapMap population data
5.3 Stroke associated genes/ locus, SNP id/ haplotype and corresponding allele risk
5.4 Allele frequencies for the SNPs from sample population data
5.5 Percent allele frequencies for the SNPs from sample population data
5.6 Distance matrix for all populations


List of Figures

Figure No. Title

3.1 Layers of carotid arterial wall
3.2 Block diagram of proposed approach
3.3 Detailed working of proposed approach
3.4 Image processing steps on carotid ultrasound images
3.5 Decision tree for classification of stenosis and ischemic stroke risk
3.6 Bland-Altman plots of (a) Expert1 NM1 versus Automatic measurements by proposed approach NA (b) Expert2 NM2 versus Automatic measurements by proposed approach NA
4.1 Identified SNPs causing stroke risk
4.2 Comparison of % accuracy of classifiers using genetic data
4.3 Comparison of % specificity of classifiers using genetic data
4.4 Comparison of % sensitivity of classifiers using genetic data
5.1 Phylogenetic tree using our distance matrix
5.2 Comparison of % allele frequency of all sample populations
5.3 Combined allele frequencies of all sample populations
5.4 Constructed phylogenetic tree using FST matrix as calculated by Altshuler et al. [201]


List of Abbreviations

ANN Artificial Neural Networks

ASM Angular Second Moment

BMI Body Mass Index

CA Carotid Artery

CAG Cerebral Angiography

CDU Carotid Duplex Ultrasound

CIMT Carotid Intima Media Thickness

CT Computed Tomography

CTA Computed Tomography Angiography

CV Coefficient of Variation

DDP Dual Dynamic Programming

DGV Database of Genomic Variants

DNA DeoxyriboNucleic Acid

DP Dynamic Programming

DPAD Detail Preserving Anisotropic Diffusion

DSA Digital Subtraction Angiography

EGA European Genome Phenome Archive

EMBL European Molecular Biology Laboratories

FDTA Fractal Dimension Texture Analysis

FOAM First Order Absolute Moment

FOM Figure of Merit

FPS Fourier Power Spectrum


GLDS Gray Level Difference Statistics

GVF Gradient Vector Flow

GWAS Genome-Wide Association Studies

HGMD Human Gene Mutation Database

HMM Hidden Markov Model

HT Hough Transform

IDM Inverse Difference Moment

IM Intima Media

IMT Intima Media Thickness

IOE Intra-Observer Error

ISGS Ischaemic Stroke Genetics Study

LI Lumen-Intima

MA Media Adventitia

MBPN Multilayer Back Propagation Network

ML Maximum Likelihood

MLE Maximum Likelihood Estimation

MLP Multilayer Perceptron

MP Maximum Parsimony

MRA Magnetic Resonance Angiography

MRI Magnetic Resonance Imaging

mRNA Messenger RNA

NGTDM Neighborhood Gray Tone Difference Matrix

NJ Neighbor Joining


OMIM Online Mendelian Inheritance in Man

RF Radio Frequency

RNA Ribonucleic Acid

ROI Region of Interest

rRNA Ribosomal RNA

SF Statistical Features

SGLDM Spatial Gray Level Dependence Matrices

SNP Single Nucleotide Polymorphism

SVM Support Vector Machine

TEM Texture Energy Measures

TIA Transient Ischemic Attack

UPGMA Unweighted Pair Group Method using arithmetic Averages

WHO World Health Organization


Abstract

Ischemic stroke is the most common type of stroke and, according to the World Health Organization, one of the most common causes of disability and death in the world. Multiple factors such as hypertension, diabetes, atrial fibrillation, heart diseases and transient ischemic attacks contribute to ischemic stroke susceptibility. There is a compelling need for follow-up checkups and post analysis to prevent further strokes. Apart from clinical tests, a lot of research is being carried out on computer-based automated techniques for estimating ischemic stroke risk. Ultrasound images of the carotid artery are used to develop noninvasive image-based methods for stroke risk estimation; however, carotid artery morphology, noise and artifacts in the ultrasound images can lead to false classification.

Carotid intima media thickness is an indicator of future ischemic stroke. In this research, we propose an automatic ischemic stroke risk estimation approach that uses carotid intima media thickness measured from longitudinal B-mode ultrasound images of the carotid artery. Based on carotid intima media thickness, a classification scheme is proposed to associate carotid artery stenosis with ischemic stroke risk. The proposed approach is tested and clinically validated on a data set of 100 longitudinal ultrasound images of the carotid artery. There is no significant difference between the intima media thickness measurements obtained using our approach and the manual measurements by experts. An intra-observer error of 0.088, a Coefficient of Variation of 12.99%, Bland-Altman plots with small differences from the expert measurements (0.01 and 0.03 for Expert 1 and Expert 2, respectively) and a Figure of Merit of 98.5% are obtained. The proposed approach automates the risk estimation process and reduces the subjectivity and operator variability of intima media thickness measurement.

Additionally, some stroke cases are suspected to be genetic, as the patients do not suffer from the conventional risk factors. Extensive research has been conducted to investigate unknown factors other than the conventional ones and their relationship with genetics. We have analyzed genotype data for stroke risk estimation. Nine classification models are applied to the SNP data to analyze and classify individuals. An accuracy of 88.16% is achieved by the proposed approach. Ischemic stroke risk has also been correlated with genetic distances using phylogenetic trees. The analysis suggests that two populations may be genetically close yet far apart with respect to ischemic stroke risk.

The proposed research addresses both medical image analysis and genetic data analysis for stroke risk estimation. The proposed approach achieves higher accuracy, specificity and sensitivity than existing approaches.


CHAPTER NO. 1

INTRODUCTION


Stroke is caused by deprivation of oxygen to the brain, which leads to the death of brain cells. The oxygen deprivation results from poor blood flow. According to the World Health Organization (WHO) [1], 15 million people suffer from stroke worldwide each year. Of these, 5 million die and another 5 million are permanently disabled. In 2013 stroke was the second major cause of death [2], responsible for 6.4 million deaths, i.e. 12% of the total. Almost 67% of stroke patients are 65 years of age or older. Half of the patients who suffer a stroke live for at most one year after its occurrence.

1.1 Types of Strokes

There are three major types of stroke:

1. Ischemic Stroke

Ischemic stroke is caused by a blood clot. This type accounts for 87% of all stroke cases [3]. Fatty deposits lining the blood vessels are the major source of the obstruction that causes ischemic strokes.

2. Hemorrhagic Stroke

Hemorrhagic strokes are caused by bleeding in the brain due to the rupture of blood vessels. They account for 13% of stroke cases. Weakened blood vessels rupture and bleed into the brain, compressing the brain tissue. Uncontrolled hypertension is the major cause of hemorrhagic strokes.

3. Transient Ischemic Attack (TIA)

TIAs are also known as mini strokes, as they are brief episodes of stroke caused by temporary blockage of a vessel. They are taken as a warning of an upcoming stroke. A TIA occurs rapidly and its effects last for a short period of time. TIAs usually do not cause permanent injury or loss.

Our research work is focused on ischemic strokes and their risk estimation. Ischemic strokes are discussed in detail in the next section.

1.2 Ischemic Strokes

Ischemic strokes [4] occur due to thrombosis [5], embolism [6] or atherosclerosis [7, 8]. An individual affected by ischemic stroke may lose the ability to move one side of the body, speak, see, eat and drink. Ischemic stroke also increases the individual’s chances of heart attack and heart failure [6]. The damage it causes may be temporary or permanent. It is the leading cause of long-lasting disability, long-lasting injury and death. Studies confirm that an individual who has suffered an ischemic stroke is at high risk of having more strokes [9-11]. It is a common perception that stroke usually occurs in old people, but statistics show that 28% of strokes occur in individuals younger than 65 years of age. Studies have shown that stroke is preventable and treatable provided the necessary steps are taken in time [12]. Preventive measures can preclude 80% of strokes.

1.2.1 Subtypes of Ischemic Strokes

There are two major subtypes of ischemic strokes, namely:

1. Embolic Strokes

2. Thrombotic Strokes

In an embolic stroke, an embolus (a clot) forms in some part of the body and eventually travels to the brain, where it forms a blockage. This blockage is then responsible for the stroke. Thrombotic strokes, in contrast, are caused by a thrombus or ruptured plaque that blocks an artery supplying blood to the brain.

1.2.2 Causes of Ischemic Strokes

Stroke is a complex disease, as many factors play a role in stroke susceptibility. The risk factors are either modifiable or unmodifiable. Some of the major modifiable risk factors for stroke are:

1. Hypertension

2. Diabetes

3. Atrial fibrillation

4. Smoking

5. Heart diseases such as coronary artery disease, valve defects and enlargement of one of the heart’s chambers

6. TIAs

7. Cholesterol imbalance

8. Physical inactivity

9. Obesity

The factors listed above can be modified with medical treatment or lifestyle changes. Some of the unmodifiable risk factors [7] are:

1. Age

2. Gender

3. Race

1.2.3 Prevention and Diagnostics

Two things can help prevent stroke and the risk of death or disability from it: controlling the factors that cause stroke and looking out for the stroke warning signs. Medical imaging techniques are useful in detecting any changes in the blood flow to the brain. The invisible or inheritable factors can be diagnosed using genetic data. The methods used for diagnostics and risk estimation include:

1. Medical Image Analysis

2. Gene Data Analysis

Once a stroke occurs, less can be done. After the occurrence of a stroke, the nature and severity of the damage to the brain and its functions are assessed. Stroke can result in minor impairments, severe impairments or, in the worst cases, death. Rehabilitation assistance starts shortly after the patient is stable (usually within 24 to 48 hours). In the last decade, intense research has been done to identify the apparent causes of strokes and their prevention.

There is a need to devise automated algorithms that help in stroke risk assessment. Medical images are used to assess stroke risk because visible changes in the arteries can be analyzed. Genetic data can be used to identify the risk of an individual by analyzing the risk alleles, and may help to pre-diagnose individuals at high risk of ischemic stroke. Genotypes as well as phenotypes contribute highly to the risk of stroke. Further, individuals who have already suffered an ischemic stroke are at very high risk of having a stroke again, so there is a compelling need for post analysis and follow-up checkups to prevent further strokes.

Intima media thickness (IMT), atherosclerotic carotid plaque, the severity of stenosis due to atherosclerotic plaques and plaque characterization can all be assessed using carotid imaging. Texture patterns extracted from carotid ultrasound images can help in analyzing and characterizing the plaques for pre-diagnosis of ischemic strokes. Plaques have different compositions and appearances, so texture patterns are helpful in characterizing different types of plaques. However, carotid artery (CA) morphology, noise and artifacts in the carotid images can lead to false classification. Phylogenetic trees can help to take into account the genetic risk factors and gene mutations. Risk patterns can be extracted from the sampled data using these trees, and these patterns can improve the accuracy of risk estimation.

Our research mainly takes into account carotid images and genetic data for estimating an individual’s risk of ischemic stroke. The primary focus is on prediction accuracy. The objective of our research is to carry out a comprehensive analysis of ultrasound images of the CA and of genetic data to pre-diagnose individuals at high risk of ischemic stroke. For effective prevention and prognostic implications of ischemic strokes, real-time analysis of genetic data and ultrasound imaging is used. The research improves the correct identification of individuals at high risk of ischemic stroke by contributing to the visual assessment procedure conducted by medical personnel. It also facilitates early diagnosis and assessment of stroke risk.

The thesis is organized as follows. Chapter 2 comprises a comprehensive literature review, a comparative analysis of different carotid imaging techniques, a survey of approaches to carotid image analysis for stroke risk identification and assessment, and the advancements made in assessing an individual’s risk of genetic ischemic stroke. The proposed approach for ischemic stroke risk identification using carotid imaging, together with its classification and results, is presented in chapter 3. The proposed risk assessment of ischemic stroke for an individual using genetic data, together with its classification and results, is presented in chapter 4. Phylogenetic tree analysis for the risk estimation of different populations and its comparison with the traditional phylogenetic tree are given in chapter 5. Chapter 6 concludes the thesis and provides recommendations for future work.


CHAPTER NO. 2

REVIEW OF LITERATURE


Stroke and the risk of death or disability from it can be prevented by taking some precautionary measures: controlling the factors that are its root cause and looking out for the warning signs. However, in many stroke cases the phenotypes are not present. Researchers are of the opinion that strokes are strongly related to genes and are passed on through generations. Even the risk factors that contribute to strokes are passed on genetically [7]. Genetics influence the inherited predisposition to certain diseases.

Real-time analysis of genetic data and ultrasound imaging provides a quick means to qualitatively analyze the input data and draw meaningful interpretations. It subsequently helps in analyzing the risk factors and in taking preventive measures for stroke into account. Phylogenetic trees can be used to represent highly diverse, multidimensional data sets. They are used to evaluate the findings of genetic data analysis among various populations.

Analysis of images of the CA and of genetic data plays a key role in assessing ischemic stroke risk. This chapter gives a comprehensive survey of medical imaging techniques, medical image analysis based ischemic stroke risk estimation techniques, phylogenetic trees and the classifiers generally used for classification. Comparisons of different CA imaging techniques and ischemic stroke risk estimation approaches are also given in this chapter. Our improved approach for ischemic stroke risk estimation is proposed keeping in view the shortcomings and limitations of the existing techniques.

2.1 Ischemic Stroke Risk Estimation Approaches

There are two types of risk estimation approaches for ischemic strokes: medical image analysis based approaches and genetic data analysis based approaches. Both approaches and the steps involved are discussed in the following sections.

2.1.1 Medical Image Analysis based Ischemic Stroke Risk Estimation

Image based techniques use medical images for analysis. The analysis includes image segmentation of the CA and IMT, followed by segmentation of the plaque inside the CA. A feature set is then extracted and used for classification. Tests that provide information about the CA structure and the blood flow are useful in estimating stroke risk.

2.1.1.1 Image Acquisition

Several medical imaging techniques are in practice. The most common ones are Carotid Duplex Ultrasound (CDU), Computed Tomography Angiography (CTA), Magnetic Resonance Angiography (MRA), Cerebral Angiography (CAG) and Digital Subtraction Angiography (DSA).

2.1.1.1.1 Carotid Duplex Ultrasound (CDU)

CDU is used to observe the blood flow in the CA. It uses sound waves to produce images of the CA and combines blood flow information with traditional imaging of the carotid vessels. The term duplex means that two modes of ultrasound are used: Doppler and B-mode. Doppler evaluates the velocity and direction of the blood flowing inside the artery, while B-mode obtains the image of the artery. CDU is the most frequently used technique for the estimation of stenosis.

2.1.1.1.2 Computed Tomography Angiography (CTA)

CTA is used to see the blood flow in blood vessels throughout the body. It makes use of Computed Tomography (CT), which uses x-rays and a computer system to generate detailed images of blood vessels and soft tissues. Sometimes a dye is injected using a catheter. Because the images are captured by a rotating device at different angles, projection images are obtained. These images can be viewed in different planes and projections.

2.1.1.1.3 Magnetic Resonance Angiography (MRA)

MRA is based on Magnetic Resonance Imaging (MRI); it is the MRI of the blood vessels. The images are formed by a scan that uses a magnetic field and pulses of radio wave energy, generally to capture the blood vessels in the head and neck region. A contrast material or dye may be used to make the blood vessels clearer.

2.1.1.1.4 Cerebral Angiography (CAG)

CAG is also known as intra-arterial digital subtraction angiography. In CAG a contrast dye is injected into the blood vessels through a catheter. The catheter is moved all the way up to the heart. X-rays are used to obtain images of the blood vessels. This technique is invasive and is usually performed after some other noninvasive method confirms the stenosis. People with diabetes or kidney disease are at risk of complications during the test. The injected dye can cause temporary damage to the kidneys.

2.1.1.1.5 Digital Subtraction Angiography (DSA)

In DSA, one image is acquired before the contrast dye is injected into the blood vessels and

another is acquired after the injection. The pre-contrast image is subtracted from

the contrast image to remove the overlying structures other than the blood vessels. This method

uses an image intensifier and is a type of fluoroscopy technique. The area of interest is exposed

to time-controlled x-rays to capture the images. DSA has largely been replaced by CTA and is rarely used in


hospitals and imaging departments.
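The subtraction step described above can be illustrated with a minimal sketch. It assumes the two images are already registered and stored as arrays of equal size; the function name `digital_subtraction` and the toy intensity values are hypothetical and are not taken from any clinical system.

```python
import numpy as np

def digital_subtraction(mask_image, contrast_image):
    """Subtract the pre-contrast (mask) image from the contrast image."""
    mask = np.asarray(mask_image, dtype=np.float64)
    contrast = np.asarray(contrast_image, dtype=np.float64)
    if mask.shape != contrast.shape:
        raise ValueError("mask and contrast images must have the same shape")
    # Structures present in both images (bone, soft tissue) cancel out,
    # so only the change caused by the dye remains.
    return contrast - mask

# Toy example: uniform background of 100 with a "vessel" in the middle column.
mask = np.full((3, 3), 100.0)
contrast = mask.copy()
contrast[:, 1] -= 40.0  # assumption: the dye lowers the recorded intensity by 40
difference = digital_subtraction(mask, contrast)
```

In this toy case only the middle column of `difference` is non-zero, which corresponds to the vessel filled with dye.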

2.1.1.2 CA and IMT Segmentation Techniques

In this section some of the well-known and commonly practiced segmentation techniques for

CA and IMT are discussed in detail. IMT serves as an indicator of stroke risk and cardiovascular

diseases. IMT is directly associated with increased risk of stroke especially in elderly population

without any history of cardiovascular diseases. IMT is measured as the distance between the

Lumen-Intima (LI) and Media-Adventitia (MA) interfaces of the CA. IMT measurement

methods can be manual or computer assisted. Usually IMT is measured manually by an

operator, which introduces operator variability and requires clinical experience. The manual method is

time consuming and varies with the training and subjective judgment of the operator.

Computer aided methods are used to measure IMT to overcome these problems.

These methods mainly rely on the computerized segmentation of the LI and MA interfaces in

CA images. Different image segmentation algorithms have been used to segment CA, leading to

more accurate results. Computer aided methods are either semi-automatic or completely

automatic. Most of the proposed computer aided methods are semi-automatic [13-20] and may

require the operator to manually provide the Region of Interest (ROI), to perform the

initial segmentation, to provide the segmentation seed points or to

correct the system-segmented images. Fully automatic techniques, on the other hand,

do not require user interaction [21-25].
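Once the LI and MA interfaces have been segmented, the IMT measurement itself is simple. The sketch below assumes a roughly horizontal artery, so that the thickness in each image column is the vertical distance between the two interfaces; the function name `mean_imt_mm`, the interface positions and the pixel spacing are hypothetical values chosen for illustration.

```python
import numpy as np

def mean_imt_mm(li_rows, ma_rows, mm_per_pixel):
    """Mean IMT from per-column row positions of the LI and MA interfaces."""
    li = np.asarray(li_rows, dtype=float)
    ma = np.asarray(ma_rows, dtype=float)
    if li.shape != ma.shape:
        raise ValueError("LI and MA profiles must have the same length")
    # Thickness per column in pixels, then converted to millimetres.
    thickness_px = np.abs(ma - li)
    return float(thickness_px.mean() * mm_per_pixel)

# Hypothetical interface positions (row index per image column).
li = [50, 51, 51, 52]
ma = [60, 61, 62, 62]
imt = mean_imt_mm(li, ma, mm_per_pixel=0.06)
```

For curved or inclined arteries the vertical distance overestimates the thickness, so a distance measured along the normal to the interfaces would be needed instead.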

Computer aided methods can be categorized according to the type of technique used for

segmentation:

2.1.1.2.1 Edge Tracking and Gradient

Edge tracking and gradient based methods generally make use of the intensity profile graphs


[16, 20] or image gradients [15, 19] for segmentation of LI and MA interfaces. Faita et al. [13]

proposed an improved gradient based approach using First Order Absolute Moment (FOAM)

edge operator. This approach was robust to noise but could not process curved and

non-horizontal CA. Another variation is a multistep gradient-based algorithm [26] that makes use of

the intensity, intensity gradient and interface continuity of pixels to segment the CA.

2.1.1.2.2 Dynamic Programming (DP)

DP based algorithms segment the IMT from the CA image and have outperformed maximum gradient,

mathematical model and matched filter approaches in speed and boundary continuity [27].

DP based techniques [28] generally make use of echo intensity, intensity gradient and boundary

continuity as weighted terms of a cost function that is to be minimized. These techniques

generally require training, and if the scanner is changed, retraining is required for the system

to work properly. A variation is the multiscale dynamic programming technique [29] for CA

analysis, which iteratively refines the location of the CA wall from a coarse to a fine scale. Dual

Dynamic Programming (DDP) [30] is also used for CA segmentation.
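The idea of minimizing a cost function with a boundary continuity term can be sketched as follows. This is a simplified illustration and not the method of any cited work: the cost is taken to be the negative vertical intensity gradient, and the continuity term is a penalty proportional to the row jump between neighbouring columns. The names `dp_boundary` and `smoothness` are hypothetical.

```python
import numpy as np

def dp_boundary(cost, smoothness=1.0):
    """Return one row index per column tracing the minimum-cost boundary."""
    cost = np.asarray(cost, dtype=float)
    n_rows, n_cols = cost.shape
    total = np.empty_like(cost)
    back = np.zeros((n_rows, n_cols), dtype=int)
    total[:, 0] = cost[:, 0]
    rows = np.arange(n_rows)
    # jump[i, j] penalises moving from row j (previous column) to row i.
    jump = smoothness * np.abs(rows[:, None] - rows[None, :])
    for c in range(1, n_cols):
        candidates = total[:, c - 1][None, :] + jump
        back[:, c] = candidates.argmin(axis=1)
        total[:, c] = cost[:, c] + candidates.min(axis=1)
    # Trace the optimal path backwards from the cheapest end point.
    path = np.empty(n_cols, dtype=int)
    path[-1] = total[:, -1].argmin()
    for c in range(n_cols - 1, 0, -1):
        path[c - 1] = back[path[c], c]
    return path

# Toy image: dark lumen (10) above a bright wall (90).
image = np.array([[10, 10, 10, 10],
                  [10, 10, 10, 10],
                  [90, 90, 90, 90],
                  [90, 90, 90, 90]], dtype=float)
gradient = np.diff(image, axis=0)  # large where intensity rises
path = dp_boundary(-gradient)      # low cost where the gradient is high
```

A larger `smoothness` value keeps the boundary continuous even when a single column contains a spurious strong edge, at the price of following genuinely curved walls less closely.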

2.1.1.2.3 Active Contours

Active contours, also known as snakes, are deformable models that adapt themselves through a

dynamic process that minimizes a global energy function. A snake is a deformable spline.

Constraint and image forces attract the snake towards the object, while the internal forces

resist deformation. Conventional snakes need some seed points or initial contour points from the

user. They are used for object tracking, shape recognition, segmentation and edge detection.

The snake model is also used to detect the CA in B-mode and sonography images [31-35].

Local image gradients are mostly used to model snake based algorithms [25, 36, 37]. One of the

approaches is to combine local statistics with snakes to segment CA [14, 38, 39]. Some


approaches use local statistics combined with snakes and fuzzy k-means classifier to segment

CA [40, 41].
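For reference, the global energy minimized by a snake is commonly written in the classical active contour formulation as shown below. The symbols are introduced here only for illustration and are not used elsewhere in this chapter: v(s) is the parametric contour, the weights alpha and beta control the resistance to stretching and bending (the internal forces), and the remaining terms are the image and constraint energies.

```latex
E_{\text{snake}} = \int_{0}^{1} \Big[ \tfrac{1}{2}\big( \alpha\,\lvert v'(s)\rvert^{2} + \beta\,\lvert v''(s)\rvert^{2} \big)
  + E_{\text{image}}\big(v(s)\big) + E_{\text{con}}\big(v(s)\big) \Big]\, ds
```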

2.1.1.2.4 Nakagami Mixture Modeling

A mixture of Nakagami distributions can be used to model the brightness of the Radio Frequency

(RF) envelope. The Nakagami distribution is effective for modeling the RF

ultrasound signal scattered by the artery wall layers. The parameters of the model are estimated

using the Expectation-Maximization (EM) algorithm. Nakagami modeling and stochastic optimization are used for CA

segmentation [42] and plaque segmentation from B-mode ultrasound images [43].
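For reference, the Nakagami density and a K-component mixture can be written as follows. The symbols are assumptions introduced for illustration: r is the envelope amplitude, m is the shape parameter, Omega is the spread parameter and pi_k are the mixture weights.

```latex
f(r \mid m, \Omega) = \frac{2\, m^{m}}{\Gamma(m)\, \Omega^{m}}\; r^{2m-1} \exp\!\Big(-\frac{m}{\Omega}\, r^{2}\Big),
\qquad r \ge 0,\; m \ge \tfrac{1}{2},\; \Omega > 0

p(r) = \sum_{k=1}^{K} \pi_{k}\, f(r \mid m_{k}, \Omega_{k}), \qquad \sum_{k=1}^{K} \pi_{k} = 1
```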

2.1.1.2.5 Hough Transform (HT)

HT is a technique used for feature extraction with its applications in image analysis, computer

vision and image processing. It works by finding imperfect instances of objects within a certain

class of shapes by a voting procedure. The HT was originally applied to the identification of

lines in an image, but it has been extended to other shapes including circles and ellipses.

Segmentation algorithms based on HT [22, 44-46] are also used to segment CA boundaries. Xu et

al. [35] proposed an HT and dual snake model for CA segmentation.
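The voting procedure can be sketched for straight lines as follows. Each edge pixel votes for every line (rho, theta) that could pass through it, and peaks in the accumulator correspond to detected lines. The function name `hough_lines` and the synthetic test image are hypothetical; the sketch illustrates the general transform and not the CA-specific algorithms cited above.

```python
import numpy as np

def hough_lines(binary_image, n_theta=180):
    """Accumulate votes in (rho, theta) space for every edge pixel."""
    img = np.asarray(binary_image, dtype=bool)
    n_rows, n_cols = img.shape
    diag = int(np.ceil(np.hypot(n_rows, n_cols)))
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)
    accumulator = np.zeros((2 * diag + 1, n_theta), dtype=int)
    ys, xs = np.nonzero(img)
    theta_idx = np.arange(n_theta)
    for x, y in zip(xs, ys):
        # Normal form of a line: rho = x cos(theta) + y sin(theta).
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + diag, theta_idx] += 1  # offset so the index is >= 0
    return accumulator, thetas, diag

# Synthetic edge map with a single horizontal line at y = 7.
image = np.zeros((100, 100), dtype=bool)
image[7, :] = True
acc, thetas, diag = hough_lines(image)
rho_idx, best_theta_idx = np.unravel_index(acc.argmax(), acc.shape)
best_rho = rho_idx - diag
best_theta_deg = np.rad2deg(thetas[best_theta_idx])
```

The strongest accumulator cell recovers the line parameters, here a distance of 7 pixels from the origin at an angle of 90 degrees.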

2.1.1.2.6 Integrated Approaches

Integrated approaches proposed for CA segmentation use a combination of more than one

technique for CA segmentation. Molinari et al. [47] developed an approach using local intensity

maxima and fuzzy k-means classifier for CA segmentation. DelSanto et al. [48] proposed an

approach based on signal analysis, image gradients and active contours for CA segmentation

(CULEXsa). Molinari et al. [24] combined signal processing, snakes, fuzzy clustering,

probability based connectivity and morphological approaches to segment CA. Molinari et al.

proposed an automated multiresolution edge snapper based segmentation technique using scale


space and statistical classification in multiresolution framework (CAMES) [49]. CALEXia [50],

CARES 3.0 [51] and CAMES [49] are all integrated technique based fully automated

approaches for CA segmentation.

2.1.1.3 Plaque Segmentation Techniques

The segmented CA and IMT images are further segmented to find the plaque inside the arteries.

Several different algorithms are generally used for this purpose.

2.1.1.3.1 Discrete Dynamic Contour

Contours in 2D images can be defined using discrete dynamic model. This model is a set of

connected vertices which is automatically modified by an energy minimizing process. Local

contour curvature determines the internal energy and image features determine the external

energy of the contour. Energy Entropy map generated from a large database of artery images

and initial seed point provided by an expert is used to generate the contour of inner CA [52].

2.1.1.3.2 Kalman Filters

Kalman filters are also known as linear quadratic estimators. A Kalman filter uses measurements

observed over time or space to estimate unknown variables. The data might contain noise and

other inaccuracies. The filter recursively processes real-time noisy data to produce optimal estimates of the

unknown variables. Kalman filters are used in the navigation and control of vehicles, spacecraft and

aircraft, and in robotic motion planning and control. Both temporal and spatial Kalman filters are used

in this technique to extract the boundaries of the CA and center of its walls to measure the

diameter of the artery [53].
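The recursive predict and update cycle can be illustrated with a scalar filter. The sketch assumes a constant underlying value, for example an artery diameter observed through noisy measurements; the function name `kalman_1d`, the noise variances and the measurements are hypothetical and do not reproduce the temporal and spatial filters of [53].

```python
def kalman_1d(measurements, process_var=1e-4, measurement_var=0.25,
              x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant quantity."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, the uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# Hypothetical noisy diameter measurements (mm) around a true value of 6.0.
measurements = [6.3, 5.8, 6.1, 5.9, 6.2, 5.7, 6.0, 6.1]
estimates = kalman_1d(measurements, x0=0.0, p0=1000.0)
```

The large initial variance `p0` expresses that nothing is known about the diameter beforehand, so the first measurement dominates the first estimate and later measurements refine it.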

2.1.1.3.3 Balloon

The balloon model adds an extra force to the snake model to extract the contour of an

object. This additional force is an inflation force that expands the snake contour towards the minima instead


of shrinking into it. This eliminates the problem of a snake shrinking inwards. Gill et al. [54]

proposed an algorithm to detect atherosclerotic plaque in CA using the triangular mesh based

balloon model proposed by Cohen [55].

2.1.1.3.4 Canny Edge Detection

Canny is a well-known edge detector with applications in various fields. Canny edge detection

works by first smoothing the image with a Gaussian filter and then computing the gradient magnitudes.

After non-maximum suppression thins the edges, the final edges are selected by double thresholding with hysteresis. Hamou & El-Sakka [56] proposed an

algorithm that used the Canny edge detector with threshold parameters to segment CA plaque.

2.1.1.3.5 Morphological Operations

Morphological operations can be used to analyze and process shape based data. Edges of an

object in an image are produced using morphological gradient operation. An appropriate

structuring element is selected by the morphological edge detection algorithm for the processed

image. The structuring element makes use of the basic morphological theory including erosion,

dilation, opening and closing and their combined operations to extract edges from the image. A

multistage method was proposed by Abdel-Dayen & Sakka [57]. It generates

carotid boundaries in the form of small contours from ultrasound images. The stages

include filtering, quantization, edge detection and edge enhancement.
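The morphological gradient mentioned above is the difference between the dilation and the erosion of an image. A minimal sketch with a fixed 3x3 square structuring element is given below; the helper names are hypothetical and the choice of structuring element is an assumption, since the cited work selects it according to the processed image.

```python
import numpy as np

def _neighbourhood_stack(image):
    """Stack the nine 3x3-shifted copies of the image (edge padded)."""
    image = np.asarray(image)
    padded = np.pad(image, 1, mode="edge")
    n_rows, n_cols = image.shape
    return np.stack([padded[r:r + n_rows, c:c + n_cols]
                     for r in range(3) for c in range(3)])

def dilate(image):
    return _neighbourhood_stack(image).max(axis=0)  # neighbourhood maximum

def erode(image):
    return _neighbourhood_stack(image).min(axis=0)  # neighbourhood minimum

def morphological_gradient(image):
    return dilate(image) - erode(image)

# Toy image: a 3x3 bright square on a dark background.
image = np.zeros((7, 7), dtype=int)
image[2:5, 2:5] = 1
edges = morphological_gradient(image)
```

The result is non-zero only in a band around the border of the square, which is the edge map; opening and closing can be built from the same two primitives.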

2.1.1.4 Features

Segmented images are used to extract feature sets. The features commonly used for texture

analysis in medical images are statistical features (SF), spatial gray level dependence matrices

(SGLDM), gray level difference statistics (GLDS), neighborhood gray tone difference matrix

(NGTDM), texture energy measures (TEM), fractal dimension texture analysis (FDTA) and

Fourier power spectrum (FPS).


1. Statistical Features (SF)

SF have many applications in image processing. These can be used for probabilistic

description and classification of images as well as for the quality estimation of the

images. These features describe the gray level histogram distribution without

considering the spatial dependence of the pixels. Different SF include mean, median,

variance etc.

2. Spatial Gray Level Dependence Matrices (SGLDM)

SGLDM are the most commonly used features [57] that are computed on the basis of

probability density functions. These density functions are second-order joint conditional

probability density functions. An intermediate matrix of measures is computed from

image. Features are defined as functions of the calculated intermediate matrix. The

commonly used features include angular second moment (ASM), contrast, correlation,

inverse difference moment (IDM), sum average, variance and entropy of the pixel values.

3. Gray Level Difference Statistics (GLDS)

GLDS are calculated using first order statistics of an image. Probability density of image

pixel pairs is estimated at a given distance having a certain absolute gray level difference

value. These features are calculated from difference between pairs of gray levels [58].

Commonly used features are contrast, ASM, entropy and mean.

4. Neighborhood Gray Tone Difference Matrix (NGTDM)

NGTDM is calculated using the pixels in the neighborhood of the pixel under consideration,

excluding the pixel itself. NGTDM features comprise coarseness,

contrast, busyness, complexity and strength as texture features [59].


5. Texture Energy Measures (TEM)

TEM are also known as Laws TEM. These features are calculated using local masks to

detect various types of textures in images. Energy of texture is computed using

convolution masks. The energy of texture for each pixel is represented by vectors. The

statistical features mean, standard deviation and entropy of these vectors are used in

place of images [60].

6. Fractal Dimension Texture Analysis (FDTA)

Fractal values at different scales are used in FDTA approaches to get image features.

Features are computed from the parameters of affine relations among different regions of

an image. One method calculates FDTA features that measure

the roughness of a surface using the fractional Brownian motion model of an image. The

extracted features describe the roughness or smoothness of a surface and are calculated using Hurst

coefficients [61, 62].

7. Fourier Power Spectrum (FPS)

Fourier analysis can be used to study texture properties of images. The FPS reveals

coarseness, fineness and directionality of a texture. FPS calculates the radial and angular

sum to find out the nature of the surface i.e. whether the surface is coarse or fine [58].
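As an illustration of the feature families listed above, the sketch below computes an SGLDM (item 2) for horizontally adjacent pixels and derives a few features from it. It assumes a symmetric, normalized matrix and a small image with four gray levels; the function names and the displacement are hypothetical choices.

```python
import numpy as np

def sgldm(image, levels, dx=1, dy=0):
    """Symmetric, normalized spatial gray level dependence matrix."""
    img = np.asarray(image, dtype=int)
    n_rows, n_cols = img.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                counts[img[r, c], img[r2, c2]] += 1
    counts = counts + counts.T        # count each pair in both directions
    return counts / counts.sum()      # joint probability estimate

def texture_features(p):
    """A few features defined as functions of the matrix p."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "asm": float((p ** 2).sum()),
        "contrast": float((((i - j) ** 2) * p).sum()),
        "idm": float((p / (1.0 + (i - j) ** 2)).sum()),
        "entropy": float(-(nonzero * np.log2(nonzero)).sum()),
    }

# Small test image with gray levels 0..3.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = sgldm(image, levels=4)
features = texture_features(p)
```

In practice the matrix is usually computed for several directions and distances, and the mean and range of each feature over the directions are used.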

2.1.2 Genetic Data Analysis based Ischemic Stroke Risk Estimation

Genetics plays an important role in the predisposition to many diseases. Gene data analysis is

conducted using different techniques. One method is to use phylogenetic data and phylogenetic

trees.


2.1.2.1 Phylogenetic Trees

Phylogenetic trees [63] represent the evolutionary descent of different species or genes from a

common ancestor. They represent the diversity among species or genes that share a common ancestor.

They are helpful for structuring classification and identifying the changes that took place in

genes over the course of time. Recently, genetic [64, 65] and genomic [66] studies have greatly

contributed to the study of brain and genetically transmitted diseases. Advanced studies have

shown that certain genes can be mutated or deactivated [67] to reduce the risks of these diseases.

2.1.2.1.1 Phylogenetic Data

Different types of data are used to construct phylogenetic trees. The data selection depends on

the purpose for which the tree is to be constructed [68]. Data can be phenotypes, genome gene

ordered data, and nucleotide or protein sequenced data.

1. Phenotype Data

Phenotype data describe observable traits that can be obtained easily from appearance. Certain

heritable factors clearly indicate the risk of ischemic stroke [69].

2. Genome Gene Ordered Data

Genome rearrangements in gene order data are used to construct phylogenetic trees

[70].

3. Nucleotide Sequenced Data

Deoxyribonucleic Acid (DNA) or Ribonucleic Acid (RNA) data are coded using an

alphabet for the four nucleotides. This data represents the genetic characters. The

phylogenetic trees are constructed using these DNA or RNA sequences as these provide

immense phylogenetic information [71, 72].


4. Protein Sequenced Data

Proteins are encoded into sequences based on amino acids. This data is also rich in

phylogenetic information and thus is very commonly used nowadays for phylogenetic

tree construction [73].

2.1.2.1.2 Phylogenetic Tree Construction Methods

Phylogenetic data in any of the above mentioned forms are used to construct a

phylogenetic tree. Trees are the most logical way to represent evolutionary relationships.

There are many algorithms for phylogenetic tree construction [74, 75]. All of these

fall into three major classes:

1. Maximum Parsimony Methods (MP)

MP methods are also known as minimum evolution methods. The MP method tries to

estimate the tree that explains the observed sequences with the minimum

number of evolutionary steps (mutations). The predicted tree is the one that requires the minimum

number of steps to generate the observed variation in sequences from the sequences of

the common ancestors. It uses the simplest and most parsimonious explanation of an

observed variation.

2. Evolutionary Distances based Methods

These methods infer evolutionary relationships from similarity among organisms.

Organisms sharing an ancestor in recent past are believed to be more similar than the

organisms that have a common ancestor that is more ancient. These methods calculate

genetic distances between sequence alignments in the form of distance matrix. A

distance matrix is also known as a table of evolutionary distances. This matrix is used


to construct phylogenetic trees.

3. Maximum Likelihood Estimation (MLE) Principle

MLE is a method for estimating the parameters of a statistical model. The MLE principle

tries to estimate the tree that is optimal with respect to a likelihood function. It requires sample data on

which the function is constructed. For a set of data and a statistical model, MLE selects

the values of the model’s parameters that maximize the likelihood

function. Probabilities are assigned to possible phylogenetic trees. A

substitution model is used to calculate the probability of a mutation.
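The maximum parsimony criterion of the first class can be made concrete with a small sketch that counts the minimum number of changes a given tree needs to explain an alignment, using Fitch's counting procedure. The taxa, sequences and tree topologies below are invented for illustration, and the function names are hypothetical.

```python
def fitch_score(tree, leaf_states):
    """Return (possible ancestral states, minimum changes) for one site."""
    if isinstance(tree, str):               # a leaf: its observed state
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_score(left, leaf_states)
    right_set, right_cost = fitch_score(right, leaf_states)
    common = left_set & right_set
    if common:                              # children agree: no extra change
        return common, left_cost + right_cost
    # Children disagree: one substitution is needed on this branch pair.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_length(tree, alignment):
    """Sum the per-site minimum number of changes over the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

# Invented four-taxon alignment and two candidate topologies.
alignment = {"A": "ACGT", "B": "ACGA", "C": "TGGT", "D": "TGGT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
length_1 = parsimony_length(tree_1, alignment)
length_2 = parsimony_length(tree_2, alignment)
```

Here `tree_1` requires fewer changes than `tree_2` and would therefore be preferred under the parsimony criterion. A full MP method additionally searches over tree topologies, which this sketch does not do.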

2.1.3 Classifiers

Assigning a well-known and defined class or category to a new observation is known as

classification. Various classification algorithms have been implemented. Different types of

classifiers are used to classify the data based upon the application. In this section some of the

well-known classifiers are discussed in detail.

Linear Classifiers

Linear classifiers [76] make a classification decision based on the characteristics of the

objects. These characteristics are known as feature values and are fed to the classifier. A

linear predictor score function is used to calculate score for each possible category based

on feature vector and weights vector. The assigned class is the one with the highest score.
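A minimal sketch of the linear score function is given below: each class has one weight vector, the score is the dot product with the feature vector plus a bias, and the class with the highest score is assigned. The weights, biases and feature values are arbitrary illustrative numbers, not trained parameters.

```python
import numpy as np

def linear_classify(features, weights, biases):
    """Return the index of the highest-scoring class and all class scores."""
    scores = weights @ features + biases   # one linear score per class
    return int(np.argmax(scores)), scores

# Three classes, two features (hypothetical values).
weights = np.array([[1.0, -1.0],
                    [-1.0, 1.0],
                    [0.5, 0.5]])
biases = np.array([0.0, 0.0, -1.0])
label, scores = linear_classify(np.array([2.0, 0.5]), weights, biases)
```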

Support Vector Machines (SVM)

SVM [77] are supervised algorithms used for classification and

regression analysis of data. In its basic form, SVM is a non-probabilistic binary linear classifier that

assigns each input to one of two classes. The SVM model maps the training examples so that


separate classes are divided by a clear gap when plotted i.e. they are easily separable.

Quadratic Classifiers

Quadratic classifiers [76] use quadratic discriminant analysis for classification. They

assume that the measurements from each class are normally distributed. The decision boundary

is quadratic, i.e. the two classes are assumed to be separable by

a quadric surface.

Kernel Estimation

Kernel estimation, also known as k-nearest neighbors [78], is one of the simplest

classification algorithms. The classification decision is based on majority voting of

neighbors. The classification function is approximated locally resulting in sensitivity to

the local structure of the data.
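The majority voting of neighbors can be sketched in a few lines. The training points, labels and the use of Euclidean distance are assumptions made for illustration only.

```python
from collections import Counter

import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Assign the majority label among the k nearest training points."""
    train_x = np.asarray(train_x, dtype=float)
    distances = np.linalg.norm(train_x - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training set with two classes.
train_x = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
           [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]]
train_y = ["low risk"] * 3 + ["high risk"] * 3
prediction = knn_predict(train_x, train_y, [1.1, 1.0], k=3)
```

Because only the k closest points vote, the decision depends on the local structure of the data, which is the sensitivity noted above.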

Decision Trees Learning

Decision tree learning [77] is a widely used machine learning and classification

algorithm. It uses a decision tree in which the classes are represented as leaves and the

feature conjunctions are labeled on the branches. A model is created that is used to

predict the class of an observation based on several input variables. The predicted

outcome is the class of the new observation.

Artificial Neural Networks (ANN)

ANN [77], also known as neural networks, are inspired by the working of the human brain. A network

continuously adjusts its design as it adapts to variations during the learning process.

These are well suited for the complex datasets and for the situations where complex

relationships exist between inputs and outputs, or where one has to find patterns and


explore statistical structures with unknown joint probability distribution between

observed variables.

Bayesian Networks

A Bayesian network [77], also known as a belief network or directed acyclic graphical model,

is a probabilistic graphical model. It uses a directed acyclic graph to show the

conditional dependencies of the variables based on probabilities. The classification

decision is made on the basis of probabilities.

Hidden Markov Models (HMM)

HMM [76] assumes that the system being classified has unobserved or hidden states and

is closely related to the optimal nonlinear filtering problem. In these models the states are

hidden, but the output, which depends on these states, is visible. The sequence of

output tokens produced by the HMM can be used to get knowledge about the sequence

of states.

2.2 Comparison

A detailed comparison of existing medical imaging techniques for the CA is presented in

this section. Moreover, a comprehensive review of the research work conducted on ischemic stroke

risk estimation is given. Summarized tables of both comparisons are given as Tables 2.1 and 2.2.

2.2.1 Comparison of Different Imaging Techniques

A comparison of commonly practiced medical imaging techniques for CA analysis, with their

advantages and limitations, is given in Table 2.1.


Table 2.1: Comparison of different medical imaging techniques for the carotid artery

| Imaging | CDU | CTA | MRA | CAG | DSA |
| Invasive | × | × | × | √ | √ |
| Radiation Exposure | × | √ | × | √ | √ |
| Expensive | × | × | √ | √ | × |
| Safety Risk | None | Minimal | Minimal | Significant | Significant |
| Repetition Frequency over Short Duration of Time | Often | Often | Rare | Rare | Rare |
| Motion Sensitive | × | √ | √ | √ | √ |
| Sedative | × | × | √ | Sometimes | Sometimes |


| Imaging | CDU | CTA | MRA | CAG | DSA |
| Stenosis | 50-69% / ≥ 70% | ≥ 70% | ≥ 70% | - | ≥ 70% |
| Sensitivity | 93% / 99% | 95% | 93% | - | 92.9% |
| Specificity | 68% / 86% | 98% | 97% | - | 81.9% |
| Accuracy | 85% / 95% | 97% | 95% | - | - |
| Overestimation of Degree of Stenosis | √ | √ | √ | × | × |
| Hair line Lumen Detection | × | √ | × | √ | √ |
| Risk of Allergic Reaction | × | √ | √ | √ | √ |


| Imaging | CDU | CTA | MRA | CAG | DSA |
| Contrast Agents | × | √ | × | √ | √ |
| Contrast Dosage | NA | High | NA | High | Low |
Image Quality High High High
Artifacts: Image Speckle; Shadowing; Poor definition of artery boundaries; Motion artifacts; Turbulent flow; Phase wrapping; Maxwell terms; Laminar flow; Venetian blinds; Motion artifacts; Motion artifacts; Larynx artifact
| Technology | Sound Waves | X-Rays | Magnetic Field and Radio Frequency Waves | X-Rays | X-Rays |


| Imaging | CDU | CTA | MRA | CAG | DSA |
Usage: Blood flow velocity; Severity of Stenosis; Carotid index; Arterial & Venous blood flow; Anatomic image of CA Lumen; Image of soft tissue & bony structure; Evaluation of Extra cranial CA; Image carotid arteries; Information about disease process; Assess collaterals; Arterial and Venous occlusions; Arterial Stenosis; Cerebral aneurysms
| Affected by Metallic Implants | × | √ | √ | √ | √ |
| Performed in Individuals with Renal Insufficiency | √ | × | × | × | × |


| Imaging | CDU | CTA | MRA | CAG | DSA |
| Procedure Duration | Short | Short | Long | Long | Short |
| Views of Carotid Bifurcation | Limited | Seen | Seen | Limited | Limited |
| Operator Dependent | √ | × | × | √ | √ |
| Image | Usually 2D | 3D/4D | 3D/4D | 3D | 3D |
| Blood Flow Information | √ | × | √ | √ | √ |
| SNR | Low | High | High | High | High |
Accuracy Affected by Carotid Calcification √ √ × ×


| Imaging | CDU | CTA | MRA | CAG | DSA |
Limitations: Cannot assess intracranial CA. Superimposed jugular veins and arteries may hide stenosis. Evaluation of small vessels is difficult. Risk of stroke. Bones and muscle tissue are present in images. Larynx artifact. External carotid or vertebral artery overlying the internal CA. Poor arterial contrast density.


It is evident from the comparison that ultrasound has many advantages: it is a noninvasive

technique, has low cost and involves no radiation exposure. Moreover, it can be performed in

patients with renal insufficiency and in individuals with metallic implants. It does not require a

sedative to be given to the individual undergoing the test and has no safety risk. Ultrasound

therefore proves to be a safe method for the analysis of stenosis.

2.2.2 Comparison of Different Approaches for Ischemic Stroke Risk

Estimation

Intensive research has been done on the detection of stenosis and plaque in the CA in the past

decade. A comprehensive comparison is given in Table 2.2. The factors

considered include input, genetics, segmentation technique, features, classifier and results. Input

type, its source, modality and sample size are considered while reviewing the literature.

Input can be in the form of images or signals. Images can be whole images or images of the plaque only.

The input can come from a database or a laboratory, or be collected by the researcher for

experimentation. Modality may include ultrasound, MRI, CT etc. The features

extracted for classification, their extraction techniques and the methods used to reduce features are

also reviewed. The classifier and its type are also considered. The results achieved

by different researchers are also mentioned.
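The result measures compared in the review (accuracy, positive predictive value, sensitivity, false positive, false negative and specificity) are all derived from the four counts of a confusion matrix. The sketch below shows the standard definitions expressed as percentages; the counts are hypothetical and the function name `diagnostic_metrics` is introduced only for illustration.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures in percent from confusion-matrix counts."""
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + fp + tn + fn),
        "positive_predictive_value": 100.0 * tp / (tp + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
        "false_positive": 100.0 * fp / (fp + tn),
        "false_negative": 100.0 * fn / (fn + tp),
        "specificity": 100.0 * tn / (tn + fp),
    }

# Hypothetical counts for 274 cases (137 positive, 137 negative).
metrics = diagnostic_metrics(tp=115, fp=50, tn=87, fn=22)
```

By definition, sensitivity and the false negative rate sum to 100%, as do specificity and the false positive rate, which gives a quick consistency check on reported values.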


Table 2.2: Review of Different Stenosis and IMT Estimation Approaches

| Reference | Input Type | Input Source | Modality | Sample Size | Genetics | Automated / Semi-Automated System | Segmentation Technique | Features Extracted | Feature Selection Technique | Feature Reduction Technique | Machine Learning Algorithm / Classifier | Classifier Type | Accuracy | Positive Predictive Value | Sensitivity | False Positive | False Negative | Specificity | Other |
| Tsiaparas et al. (2012) [79] | Image | - | B mode ultrasound of plaque | 20 | × | - | - | Texture Features | DTCWT: 36 | Z score | SVM | Supervised | 67.6 | - | 68.9 | - | - | 72.1 | - |
|  |  |  |  |  |  |  |  |  | FRIT: 6 | Z score | SVM | Supervised | 71.5 | - | 75.2 | - | - | 72.6 | - |
|  |  |  |  |  |  |  |  |  | FDCT: 44 | Z score | SVM | Supervised | 79.3 | - | 78.2 | - | - | 84.3 | - |
|  |  |  |  |  |  |  |  | Frequency Features | WP | - | SVM | - | 70.9 | - | 71.9 | - | - | 75.2 | - |
| Kemény et al. (1999) [82] | Color-coded power spectra of the audible Doppler shift | Self | Transcranial Doppler Ultrasound | 282 |  | Automatic | - | FFT | - | - | ANN | Supervised | - | 56.7 | 73.4 | - | - | - | - |


| Reference | Input Type | Input Source | Modality | Sample Size | Genetics | Automated / Semi-Automated System | Segmentation Technique | Features Extracted | Feature Selection Technique | Feature Reduction Technique | Machine Learning Algorithm / Classifier | Classifier Type | Accuracy | Positive Predictive Value | Sensitivity | False Positive | False Negative | Specificity | Other |
| Kyriacou et al. (2009) [83] | Carotid plaque images | Irvine Lab | Ultrasound images | 274 images | × | Semi automatic | Manual | Morphological Features | Multilevel binary morphological model | PCA | SVM | Supervised | 73.72 | - | 83.94 | 36.5 | 16.06 | 63.5 | Acoustic shadows are excluded |
|  |  |  |  |  |  |  |  | Direct | Gray scale morphology model | - | SVM | Supervised | 66.79 | - | 54.01 | 20.44 | 45.99 | 79.56 |  |
| Stoitsis et al. (2004) [80] | Static images, image sequences | Irvine Lab | B-mode ultrasound | 19 patients | × | Automatic | ANALYSIS (s/w for interpretation of medical images) | Texture features: 99 | FOS, SOS, Laws' TEM, FDTA | ANOVA | Fuzzy c-means | Unsupervised | 74 | - | - | - | - | - | Atheromatous plaques were predicted. |
|  |  |  |  |  |  |  |  | Motion feature: 2 | MSV, MRSV | ANOVA | Fuzzy c-means | Unsupervised | 79 | - | - | - | - | - |  |


| Reference | Input Type | Input Source | Modality | Sample Size | Genetics | Automated / Semi-Automated System | Segmentation Technique | Features Extracted | Feature Selection Technique | Feature Reduction Technique | Machine Learning Algorithm / Classifier | Classifier Type | Accuracy | Positive Predictive Value | Sensitivity | False Positive | False Negative | Specificity | Other |
| Stoitsis et al. (2004) [80] (continued) |  |  |  |  |  |  |  | Texture + Motion feature | FOS, SOS, Laws' TEM, FDTA, MSV, MRSV | ANOVA | Fuzzy c-means | Unsupervised | 84 | - | - | - | - | - |  |
| Christodoulou et al. (2003) [84] | Carotid plaque u/s image | Irvine Lab | B mode Ultrasound | 230 images |  | Semi automated | Manual | Texture + Shape: 61 | FOS, SGLDM (mean), SGLDM (range), GLDS, NGTDM, SFM, TEM, FDTA, FPS, Shape Parameters | Majority Vote | SOM | Unsupervised | 66 | - | - | - | - | - | - |
|  |  |  |  |  |  |  |  |  |  |  | KNN | Supervised | 65.1 | - | - | - | - | - | - |
|  |  |  |  |  |  |  |  |  |  | Averaging confidence Measures | SOM | Unsupervised | 73.1 | - | - | - | - | - | - |
|  |  |  |  |  |  |  |  |  |  |  | KNN | Supervised | 68.8 | - | - | - | - | - | - |
|  |  |  |  |  |  |  |  |  |  | All | SOM | Unsupervised | 68.8 | - | - | - | - | - | - |
|  |  |  |  |  |  |  |  |  |  |  | KNN | Supervised | 69.7 | - | - | - | - | - | - |


Table 2.2 (continued)

Reference: Kyriacou et al. (2006) [85]
  Input Type: Gray scale carotid plaque images
  Input Source: Irvine Lab
  Modality: Ultrasound
  Sample Size: 274
  Genetics: ×
  Automated/Semi Automated System: Semi automated
  Segmentation Technique: Manual
  Features Extracted: Morphological feature
  Features: Gray Scale Morphological Analysis
  Other: Acoustic shadows are excluded
  Results (Accuracy / Positive Predict Value / Sensitivity / False Positive / False Negative / Specificity):
    Feature Selection/Reduction Technique: -
      PNN (Supervised): 62.04 / - / 59.85 / 35.77 / 40.15 / 64.23
      SVM (Supervised): 63.14 / - / 62.77 / 36.5 / 37.23 / 63.5
    Feature Selection/Reduction Technique: PCA
      PNN (Supervised): 60.58 / - / 57.66 / 36.5 / 42.34 / 63.5
      SVM (Supervised): 66.79 / - / 54.01 / 20.44 / 45.99 / 79.56

Reference: Kyriacou et al. (2005) [86]
  Input Type: Clinical data + ultrasound images
  Input Source: EUBIOMED II ACSRS + Irvine Laboratory
  Modality: Longitudinal scans using duplex scanning and color flow imaging
  Sample Size: Clinical data: 1298 cases + U/S: 274 images
  Genetics: ×
  Automated/Semi Automated System: Semi automated
  Segmentation Technique: Manual
  Features Extracted: 340 clinical factors + 54 Texture Features
  Other: Acoustic shadows are excluded
  Results for feature set SF (all remaining metrics: -):
    Feature Selection/Reduction Technique: -
      PNN (Supervised), Accuracy 65.3; SVM (Supervised), Accuracy 69.3
    Feature Selection/Reduction Technique: PCA
      PNN (Supervised), Accuracy 65.3; SVM (Supervised), Accuracy 70.1


Table 2.2 (continued)

(entry continued from the previous page; all remaining metrics: -)
  Results for feature set SGLDM:
    Feature Selection/Reduction Technique: -
      PNN (Supervised), Accuracy 71.2; SVM (Supervised), Accuracy 69.7
    Feature Selection/Reduction Technique: PCA
      PNN (Supervised), Accuracy 70.8; SVM (Supervised), Accuracy 68.6
  Results for feature set SF + NGTDM:
    Feature Selection/Reduction Technique: -
      PNN (Supervised), Accuracy 62.8; SVM (Supervised), Accuracy 71.2
    Feature Selection/Reduction Technique: PCA
      PNN (Supervised), Accuracy 62.4; SVM (Supervised), Accuracy 69.3
  Results for feature set SF + SGLDM + TEM + NGTDM:
    Feature Selection/Reduction Technique: -
      PNN (Supervised), Accuracy 65.3; SVM (Supervised), Accuracy 70.8


Table 2.2 (continued)

(entry continued from the previous page; all remaining metrics: -)
    Feature Selection/Reduction Technique: PCA
      PNN (Supervised), Accuracy 64.2; SVM (Supervised), Accuracy 70.1

Reference: Kyriacou et al. (2007) [87]
  Input Type: Clinical risk factors + ultrasound images
  Input Source: ACSRS, Irvine Lab
  Modality: Longitudinal scans using duplex scanning and color flow imaging
  Sample Size: 18 clinical factors, 274 images
  Automated/Semi Automated System: Semi automated
  Segmentation Technique: IMT segmentation: Williams and Shah. Atherosclerotic plaque segmentation: Lai and Chin snake.
  Features Extracted: Texture Features
  Features: SF + SGLDM + GLDS + NGTDM + SFM + Laws' TEM + Fractals
  Feature Selection/Reduction Technique: PCA
  Results, in the order listed in the table (Accuracy / Positive Predict Value / Sensitivity / False Positive / False Negative / Specificity):
    PNN (Supervised): 72.3 / - / 75.9 / 31.4 / 24.1 / 68.6
    SVM (Supervised): 73 / - / 82.5 / 36.5 / 17.5 / 63.5
    PNN (Supervised): 69.7 / - / 74.4 / 35 / 25.5 / 65
    SVM (Supervised): 73.4 / - / 81 / 34.3 / 19 / 65.7
  Other: -


Table 2.2 (continued)

Reference: Santhiyakumari et al. (2011) [81]
  Input Type: Ultrasound carotid artery images
  Input Source: -
  Modality: Ultrasound
  Sample Size: 100 images
  Genetics: ×
  Automated/Semi Automated System: Automated
  Segmentation Technique: Contour extraction using energy minimization process.
  Features Extracted: Contours
  Features / Feature Selection/Reduction Technique: - -
  Classifier: ANN, MBPN (Supervised)
  Accuracy: Normal: 96%, Cardiovascular Disease: 90%, Cerebrovascular Disease: 92%
  All remaining metrics: -

Reference: Abdolmaleki et al. (2005) [88]
  Input Type: Longitudinal view
  Input Source: -
  Modality: High resolution B-mode ultrasound, color Doppler images
  Sample Size: 128
  Genetics: ×
  Automated/Semi Automated System: Semi Automated
  Segmentation Technique: -
  Features Extracted: Quantitative features: 10
  Features: Ultrasonic measurement
  Feature Selection/Reduction Technique: -
  Classifier: Logistic Regression Model (Supervised)
  Accuracy: -; Positive Predict Value: .94; all remaining metrics: -


Table 2.2 (continued)

Reference: Lambrou et al. (2012) [89]
  Input Type: Ultrasound images of carotid plaque
  Input Source: Irvine Lab
  Modality: -
  Sample Size: 274
  Genetics: ×
  Automated/Semi Automated System: Semi automated
  Segmentation Technique: Manual
  Results (all remaining metrics: -):
    Features Extracted: Texture: 7 feature sets
    Features: SF, SGLDM, GLDS, NGTDM, SFM, TEM, FDTA, FPS, RUNL
    Feature Selection/Reduction Technique: - ; Confidence Predict Value
      ANN (Supervised), Accuracy 71.53
      k-NN (Supervised), Accuracy 70.8

    Features Extracted: Morphological
    Features: Multi level approach, gray scale morphological analysis
    Feature Selection/Reduction Technique: - ; Confidence Predict Value
      ANN (Supervised), Accuracy 72.26
      SVM (Supervised), Accuracy 73.72


Table 2.2 (continued)

Reference: Christodoulou et al. (2010) [90]
  Input Type: Carotid plaque image
  Input Source: Irvine Lab
  Modality: Ultrasound
  Sample Size: 274
  Genetics: ×
  Automated/Semi Automated System: Semi automated
  Segmentation Technique: Manual
  Features Extracted: Multi region Histogram
  Features / Feature Selection/Reduction Technique: - -
  Results (all remaining metrics: -):
    SOM, Accuracy 64.8
    k-NN (Supervised), Accuracy 63.1


As is apparent from Table 2.2, the classifiers most commonly used for ischemic stroke risk
estimation are ANN and SVM. Tsiaparas et al. [79] used B-mode ultrasound images of plaque
and achieved an accuracy of 79.3% using SVM on texture features. Stoitsis et al. [80]
achieved a classification accuracy of 84% using texture features combined with motion
features and Fuzzy c-means clustering. Santhiyakumari et al. [81] used contour-based
classification with ANN and a Multilayer Back Propagation Network (MBPN) and achieved the
highest accuracy, 96%. Among the surveyed studies, contour, texture and motion features
give the best classification results.

We have discussed and compared many carotid imaging, image processing and classification
techniques. Each of the discussed techniques has advantages as well as disadvantages. The
factors of prime importance for automated analysis and classification of carotid images are
the spatial resolution and the quality of the image. The preferred imaging technique should
be low risk, noninvasive and low cost without much compromise on image quality. Carotid
ultrasound is a preferred technique because it has all of these properties. The challenge is
to identify the features that contribute the most to assessing ischemic stroke risk.


CHAPTER NO. 3

PROPOSED APPROACH - ISCHEMIC STROKE RISK ESTIMATION USING CAROTID IMAGING


Proposed Approach - Ischemic Stroke Risk Estimation Using Carotid Imaging

CA atherosclerosis is one of the major causes of ischemic stroke. A wide range of medical
imaging techniques is available for CA assessment, including CDU [91, 92], CT [93], MRA
[94, 95], CA [96] and DSA [97]. Carotid ultrasound is widely used for the diagnosis of carotid
diseases and for stroke risk estimation because it is noninvasive, low cost and quick to
perform, supports accurate approximation of carotid intima media thickness (CIMT) values,
and benefits from continuous improvement in the quality of ultrasound images.
Nighoghossian et al. [98] have reviewed different CA assessment techniques and discussed
the future prospects of the research field. A detailed comparison of these medical imaging
techniques for CA assessment is given in the previous chapter.

IMT serves as an indicator of stroke risk and cardiovascular diseases. Increased IMT is
directly associated with an increased risk of stroke, especially in the elderly population
without any history of cardiovascular diseases. IMT is measured as the distance between the
LI and MA interfaces of the CA. The layers of the carotid arterial wall can be seen in Figure 3.1.

[Figure labels: Lumen, Intima, Media, Adventitia]

Figure 3.1: Layers of carotid arterial wall


The inner layer is the intima, which consists of endothelial cells. The central layer is the
media, which consists of smooth muscle cells, collagen, elastin and proteoglycan. The
outermost layer is the adventitia, which consists of fibroblasts, collagen and elastin. Each
layer is separated from the next by an elastic membrane. The lumen is the central cavity of
the CA through which the blood flows.

We propose an SF based approach for ischemic stroke risk estimation from B-mode
ultrasound images. An active contour model based intima-media (IM) segmentation approach
is proposed: line segments are extracted from the ultrasound image using HT, their
coordinates are used to form initial contours, and these contours are deformed by
minimizing an energy function to obtain the final contours.

Here, a parametric snake model is used as the active contour model. The coordinates of the
line segments detected using HT can adapt to variation in the curvature and thickness of the
blood vessel because they are treated as points on concentric circles. These circles can be
large enough to contain points that form straight lines, or small enough to cover points that
form curved vessels. The calculated IMT values are used to estimate stroke risk using
objective criteria based on scientific research [99, 100].

Section 3.1 describes the data and the details of our proposed approach. Section 3.2 contains
the results of the proposed approach, followed by a discussion of the results in Section 3.3.
We conclude this chapter in Section 3.4 with a comparison between our approach and other
approaches and the future prospects of the proposed approach.

3.1 Materials and Methods

We have used the CA B-mode ultrasound dataset available at the eHealth Laboratory website,
University of Cyprus [101]. The dataset includes 100 B-mode longitudinal ultrasound images
of 100 patients who were at risk of atherosclerosis and had already developed symptoms
such as stroke or TIA.

This dataset was recorded at the Cyprus Institute of Neurology and Genetics, Nicosia,
Cyprus and used for segmentation of CA IM [33]. Images were stored on a magneto optical
drive after logarithmic compression. The recorded images were resized using the bicubic
method to 16.66 pixels/mm. The details of the dataset are given in Table 3.1.

Table 3.1: Dataset details.

Attributes                        Values
No. of Patients                   100
Male                              58
Female                            42
Age Group                         26–95 years
Mean Age                          54 years
Image Size                        768 x 576 pixels
Image Type                        Grayscale
Image Extension                   .cri
Ultrasound Mode                   B-mode
Plane                             Longitudinal Ultrasound Image
Scanner Type                      ATL HDI-3000
No. of Elements                   64
Head Operating Frequency Range    4-7 MHz
Acoustic Aperture                 10 x 8 mm
Transmission Focal Range          0.8-11 cm


For ischemic stroke risk estimation from carotid ultrasound images, the main phases are
Preprocessing, Intima-Media Segmentation and Estimation. Figure 3.2 illustrates these phases
and their outputs.

In the Preprocessing phase (Phase-I) of the proposed approach, Image Enhancement, ROI
Coordinates Extraction and Noise Removal are performed on the Original Image (ultrasound
image) to get the Enhanced Image, ROI Coordinates and Filtered Image respectively.

The Intima-Media Segmentation phase (Phase-II) includes Line Extraction, Candidate Line
Selection and Contour Extraction. Line Extraction is executed on the Enhanced Image using
the ROI Coordinates obtained from the ROI Coordinates Extraction process to get the
Extracted Lines. The Extracted Lines undergo Candidate Line Selection to determine whether
or not they are part of the CA wall; the output of this process is the Selected Lines. The
Original Image, restricted to the ROI Coordinates, is filtered by the Noise Removal process
to get the Filtered Image.

The Selected Lines and Filtered Image are then processed by the Contour Extraction process
to get the final Contours of the Intima-Media. In the Estimation phase (Phase-III), Features
are acquired by calculating IMT values in the Feature Extraction process, and these Features
are then fed to the Classification process for Stroke Risk Estimation.

Figure 3.3 shows the detailed working of the proposed approach for ischemic stroke risk
estimation from ultrasound images.


[Figure blocks: Original Image → Preprocessing → Intima-Media Segmentation → Estimation → Stroke Risk Estimation. Intermediate outputs: Enhanced Image, ROI Coordinates, Filtered Image, Contours.]

Figure 3.2: Block diagram of proposed approach.


[Figure blocks. Preprocessing: Image Enhancement, ROI Coordinates Extraction, Noise Removal. Intima-Media Segmentation: Line Extraction, Candidate Line Selection, Contour Extraction. Estimation: Feature Extraction, Classification. Data items: Original Image, Enhanced Image, ROI Coordinates, Filtered Image, Extracted Lines, Selected Lines, Contours, Features, Stroke Risk Estimation.]

Figure 3.3: Detailed working of proposed approach.


3.1.1 Phase-I: Preprocessing

The Original Image is processed for automatic extraction of the Enhanced Image, ROI
Coordinates and Filtered Image. Contrast in the Original Image is enhanced using
Adaptive Histogram Equalization to get the Enhanced Image. The Image Enhancement
process uses a mapping function that adjusts the intensity of pixels. For each pixel,
the difference between its value and the local standard deviation of the neighboring
pixels is calculated. If the absolute difference is greater than a fixed threshold, the
intensity of the pixel is replaced with its globally histogram-equalized intensity;
otherwise the original intensity is kept.
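The enhancement rule described above can be sketched in plain Python as follows. This is a minimal illustration rather than the thesis implementation (which was written in Matlab): the function names `equalize` and `enhance`, the 3x3 window clipped at the image border, and the use of the population standard deviation are assumptions; the threshold of 10 is the value noted in Algorithm 1.

```python
from statistics import pstdev

def equalize(image, levels=256):
    """Global histogram equalization of a 2-D list of integer grey levels."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:            # constant image: nothing to stretch
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]

def enhance(image, threshold=10):
    """Keep a pixel, or swap in its equalized value when |pixel - local sd| > threshold."""
    eq = equalize(image)
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(rows):
        for j in range(cols):
            # 3x3 neighbourhood, clipped at the border (assumption)
            window = [image[r][c]
                      for r in range(max(0, i - 1), min(rows, i + 2))
                      for c in range(max(0, j - 1), min(cols, j + 2))]
            sd = pstdev(window)
            if abs(image[i][j] - sd) > threshold:
                out[i][j] = eq[i][j]
    return out
```

For a 2x2 image `[[0, 100], [200, 255]]` the local standard deviation is about 97.5, so only the pixel with value 100 keeps its original intensity.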

The Enhanced Image is first binarized during the ROI Coordinates Extraction process.
The resultant binary image is then scanned starting from the last row, exploiting the
prior information that the first occurrence of the maximum number of white pixels is
found in the row that is closest to the Media Layer. The coordinates of the starting
point of the identified row are used to calculate the ROI Coordinates. The ROI is
centered on the identified row and comprises w rows above and w rows below it, so that
w is half of the ROI height. A value of w = 15, determined through experimental
analysis, is used in our proposed approach. The ROI Coordinates are the coordinates of
the first pixels of the first and last rows of the ROI, i.e. C(row-w, col) and
C(row+w, col) respectively.
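The row search described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the binarization threshold of 128 is hypothetical (the text does not give one), the ROI is clipped to the image bounds, and the names `binarize` and `roi_rows` are illustrative.

```python
def binarize(image, threshold=128):
    """Map grey levels to 0 / 255; the threshold value is an assumption."""
    return [[0 if p <= threshold else 255 for p in row] for row in image]

def roi_rows(binary, w=15):
    """Return ((top_row, 0), (bottom_row, 0)) for an ROI centred on the
    brightest row, found by scanning upward from the last row."""
    counts = [sum(1 for p in row if p == 255) for row in binary]
    best = max(counts)
    row = 0
    for r in range(len(binary) - 1, -1, -1):
        if counts[r] == best:       # first row, from the bottom, with the max count
            row = r
            break
    top = max(0, row - w)                       # clipped to the image (assumption)
    bottom = min(len(binary) - 1, row + w)
    return (top, 0), (bottom, 0)
```

With w = 15 the ROI spans 31 rows when it is not clipped by the image border.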

A sub image is formed from the Original Image using the ROI Coordinates and is
filtered for speckle noise reduction to produce the Filtered Image. The Detail
Preserving Anisotropic Diffusion (DPAD) filter proposed in [102] is used to filter the
sub image for speckle noise. Δt is set to 0.2 with 100 iterations. Noise statistics are
estimated using C^2_MAD. The gradient is calculated on a 5x5 window.
The algorithm for the Preprocessing phase is given as Algorithm 1.


Algorithm 1: Preprocessing phase Algorithm

Function Preprocessing (OriginalImage) : Enhanced Image, ROI coordinates, Filtered Image
{
    AHEImage = Call AdaptiveHistogramEqualization for OriginalImage
    [ Point x1, y1 , Point x2, y2 ] = Call ROI for AHEImage
    FLTRImage = Call NoiseRemoval for OriginalImage, [ Point x1, y1 , Point x2, y2 ]
    return AHEImage, [ Point x1, y1 , Point x2, y2 ], FLTRImage
}

Function AdaptiveHistogramEqualization (Image c) : AHEImage d
{
    EImage e = Call HistogramEqualization for c
    ∀ row i ∈ c
        ∀ column j ∈ c
            sd = Call StandardDeviation for c(i-1:i+1, j-1:j+1)
            if | c(i,j) – sd | > threshold        // threshold = 10
                then d(i,j) = e(i,j)
                else d(i,j) = c(i,j)
    return d
}

Function ROI (AHEImage) : ROI coordinates
{
    BinaryImage = Call Binarize for AHEImage
    C(row, col) = maxwhite (BinaryImage)
    return C(row-w, col), C(row+w, col)        // w is ½ (ROI height)
}

Function NoiseRemoval (Image, [ Point x1, y1 , Point x2, y2 ]) : Filtered Image
{
    // Method DPAD for Image row Point x1, y1 to row Point x2, y2 as in [102]
}

Function StandardDeviation (Intensities of 8 Neighbours) : Standard Deviation
{
    // Default Method
}

Function HistogramEqualization (Image) : EImage
{
    // Default Method
}

Function Binarize (AHEImage a) : BinaryImage b
{
    ∀ row i ∈ a
        ∀ column j ∈ a
            if a[i][j] <= threshold
                then b[i][j] = 0
                else b[i][j] = 255
    return b
}

3.1.2 Phase-II: Intima-Media Segmentation

A sub image is obtained from the Enhanced Image using the ROI Coordinates and is
processed for Line Extraction. The LI and MA interfaces contain some piecewise straight
boundaries, so HT is applied to the sub image to get the Extracted Lines. These
Extracted Lines are then processed by the Candidate Line Selection process, described
in Algorithm 2, to find the lines that are part of the LI and MA. Candidate Line
Selection is treated as a 2D problem: given a set of n points Pi (xi, yi) (i = 1…n)
comprising the starting and ending points of each of the Extracted Lines, find the
center points C1(xc1, yc1), C2(xc2, yc2) and the radii r1 and r2 of two unknown circles
that pass closest to the points. The LI and MA are each considered to be part of a
circle; even a straight LI or MA can be considered to be part of a very large circle.
The two best fit circles are found using the conjugate gradient method with the Polak
and Ribiere factor [103], implemented in Matlab 7.14. The circles are selected such
that they are concentric within a tolerance and cover the maximum number of points.
Moreover, the selected circles do not have points inside the illumination region, since
the adventitia region is brighter than the media region [104].
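To make the circle-fitting step concrete, the sketch below fits a single circle to a set of points. Note that it deliberately swaps in a simpler technique: an algebraic least-squares (Kåsa-style) fit solved through its 3x3 normal equations, not the conjugate gradient method with the Polak and Ribiere factor used in the thesis. The function names are illustrative, and the fit becomes ill-conditioned for nearly straight walls, where the best-fit circle is very large.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("points are collinear or too few for a circle fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in range(2, -1, -1):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x

def fit_circle(points):
    """Least-squares fit of x^2 + y^2 + D*x + E*y + F = 0; returns (cx, cy, r)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sz = sum(x * x + y * y for x, y in points)
    sxz = sum(x * (x * x + y * y) for x, y in points)
    syz = sum(y * (x * x + y * y) for x, y in points)
    d, e, f = solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]],
                     [-sxz, -syz, -sz])
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)
```

For example, the four points (8, 4), (3, 9), (-2, 4) and (3, -1) lie on a circle of radius 5 centered at (3, 4), and the fit recovers that center and radius up to floating-point error.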

The Selected Lines are processed for Contour Extraction. The coordinates of the
Selected Lines are used as seed points for accurate contour extraction using a snake
model. The gradient vector flow (GVF) snake model [105] is used in the proposed
approach; it deforms the initial contour built from the seed points.

Algorithm 2: Candidate Line Selection Algorithm

Function CandidateLineSelection (Extracted Lines) : Selected Lines
{
    Circles [C1:n], Member Lines [L1:n] = Call ExtractCircles for Extracted Lines
    Selected Lines = Call LineSelection for Circles [C1:n], Member Lines [L1:n]
    return Selected Lines
}

Function ExtractCircles (Extracted Lines) : Circles, Member Lines
{
    // Method for finding best fit circles using conjugate gradient with Polak and Ribiere factor [103]
}

Function LineSelection (Circles, Member Lines) : Selected Lines
{
    maxcoordinates = Call CoordinatesCountOrdering for Circles and Member Lines
    ∀ Ci ∈ Circles, i = 1:n
        ∀ Cj ∈ Circles, j = 1:n
            x = Call IsCocentric for Ci and Cj
            y = Call CoversMaxCoordinates for Ci and Cj and maxcoordinates
            z = Call IlluminationRegionCheck for Ci and Cj
            if (x && y && z)
                Ca = Ci
                Cb = Cj
                Selected Lines = S1S2 : S1 ∈ Member Lines Ca, S2 ∈ Member Lines Cb
                break
    return Selected Lines
}

Function CoordinatesCountOrdering (Circles, Member Lines) : maxcoordinates
{
    // return maximum number of coordinates on a circle from Circles
}

Function IsCocentric (Ci, Cj) : BOOL
{
    if |Center Ci – Center Cj| <= threshold
        return TRUE
    else
        return FALSE
}

Function CoversMaxCoordinates (Ci, Cj, maxcoordinates) : BOOL
{
    if (Coordinates count Ci or Coordinates count Cj >= maxcoordinates / 2)
        return TRUE
    else
        return FALSE
}

Function IlluminationRegionCheck (Ci, Cj) : BOOL
{
    if (Illumination value of Ci and Cj is in range [104])
        return FALSE
    else
        return TRUE
}
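The pair-selection logic of Algorithm 2 can be sketched in Python as below. Several details are assumptions made for illustration: the `Circle` record, the center tolerance `tol` of 5 pixels, reading the coverage test as "either circle holds at least half of the maximum point count", iterating over distinct pairs only, and passing the illumination test as a caller-supplied predicate (it depends on image intensities, which are not modeled here).

```python
from itertools import combinations
from math import hypot
from typing import Callable, NamedTuple, Optional, Sequence, Tuple

class Circle(NamedTuple):
    cx: float        # center x
    cy: float        # center y
    r: float         # radius
    n_points: int    # number of line end points lying on the circle

def is_concentric(a: Circle, b: Circle, tol: float) -> bool:
    """Centers closer than tol count as concentric."""
    return hypot(a.cx - b.cx, a.cy - b.cy) <= tol

def covers_max(a: Circle, b: Circle, max_points: int) -> bool:
    """Either circle holds at least half of the largest point count."""
    return a.n_points >= max_points / 2 or b.n_points >= max_points / 2

def select_pair(circles: Sequence[Circle], tol: float = 5.0,
                in_bright_region: Callable[[Circle], bool] = lambda c: False
                ) -> Optional[Tuple[Circle, Circle]]:
    """Return the first pair of circles passing all three checks, else None."""
    if not circles:
        return None
    max_points = max(c.n_points for c in circles)
    for a, b in combinations(circles, 2):
        if (is_concentric(a, b, tol) and covers_max(a, b, max_points)
                and not in_bright_region(a) and not in_bright_region(b)):
            return a, b
    return None
```

The member lines of the two returned circles would then form the Selected Lines passed on to Contour Extraction.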

For the proposed approach, GVF field calculation and initial contour deformation are
implemented in Matlab 7.14. After experimentation, the weights were set to
Weight(line component) = 0.30, Weight(edge component) = 0.30 and
Weight(terminal component) = 0.30, with number of iterations = 30.
The initial values of the snake parameters were set to α = 0.40, β = 0.20, γ = 1, κ = 0.15.
Figure 3.4 illustrates the image processing steps performed in Phase-I and Phase-II on
the Original Image to get the Contours.


[Figure panels. Preprocessing: Original Image, Image Enhancement, ROI Coordinates Extraction, Noise Removal (ROI Coordinates). Intima-Media Segmentation: Line Extraction, Candidate Line Selection, Contour Extraction, Contours.]

Figure 3.4: Image processing steps on carotid ultrasound images.


3.1.3 Phase-III: Estimation

After the CA Contours are obtained, the distances between the contours are calculated
and the Features are formed by the Feature Extraction process. Pixels are counted
downwards from the upper contour (LI) until the lower contour (MA) is reached. A
stepping function is used so that the distance is calculated only at regular steps along
the upper contour. A step value of 5 pixels is chosen.
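The stepped distance measurement can be sketched as follows, assuming (for illustration only) that each contour is stored as a list giving its row index for every image column; the function name `contour_distances` is hypothetical.

```python
def contour_distances(upper, lower, step=5):
    """Vertical pixel distance from the upper (LI) to the lower (MA) contour,
    sampled every `step` columns. upper[x] and lower[x] are row indices."""
    if len(upper) != len(lower):
        raise ValueError("contours must cover the same columns")
    return [lower[x] - upper[x] for x in range(0, len(upper), step)]
```

With a step of 5, a contour spanning 12 columns is sampled at columns 0, 5 and 10.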

The distance values are used to calculate the features from the carotid images.
Four features are calculated and taken into account for analyzing the thickness data:
the minimum, maximum, mean and standard deviation of the line distance data,
calculated using Equations (3.1), (3.2), (3.3) and (3.4) respectively.

$$distance_{min_{pixels}} = \min_{1 \le i \le n} \left( distance_{pixels_i} \right) \tag{3.1}$$

$$distance_{max_{pixels}} = \max_{1 \le i \le n} \left( distance_{pixels_i} \right) \tag{3.2}$$

$$distance_{mean_{pixels}} = \frac{\sum_{i=1}^{n} distance_{pixels_i}}{n} \tag{3.3}$$

$$distance_{sd_{pixels}} = \sqrt{\frac{\sum_{i=1}^{n} \left( distance_{pixels_i} - distance_{mean_{pixels}} \right)^2}{n}} \tag{3.4}$$

where,

$distance_{pixels}$ is the distance calculated in pixels

$distance_{min_{pixels}}$ is the Minimum value for the distance in pixels

$distance_{max_{pixels}}$ is the Maximum value for the distance in pixels

$distance_{mean_{pixels}}$ is the Mean value for the distance in pixels

$distance_{sd_{pixels}}$ is the Standard deviation for the distance in pixels

$n$ is the total number of distance values
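Equations (3.1) to (3.4) translate directly into a few lines of Python. The sketch below uses the population form of the standard deviation (division by n), as in Equation (3.4); the function name and the dictionary keys are illustrative.

```python
from math import sqrt

def thickness_features(distances):
    """Minimum, maximum, mean and standard deviation (dividing by n) of the
    contour-to-contour distances, all in pixels."""
    n = len(distances)
    if n == 0:
        raise ValueError("at least one distance value is required")
    mean = sum(distances) / n
    sd = sqrt(sum((d - mean) ** 2 for d in distances) / n)
    return {"min": min(distances), "max": max(distances), "mean": mean, "sd": sd}
```

For the distances 10, 15 and 20 pixels this gives a mean of 15 pixels and a standard deviation of about 4.08 pixels.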

The calculated distance values are mapped from pixels to millimeters (mm). The general
mapping function is given in Equation (3.5). $IMT\_thickness_{min_{mm}}$,
$IMT\_thickness_{max_{mm}}$, $IMT\_thickness_{mean_{mm}}$ and $IMT\_thickness_{sd_{mm}}$
are calculated using the thickness mapping functions given as Equations (3.6), (3.7),
(3.8) and (3.9) respectively.

The pixel density per mm is already known for the ultrasound images; in our case it is
16.66 pixels per mm. These thickness values form the Features. The thickness mapping
functions are as below:

$$map(X_{pixels}) = \frac{X_{pixels}}{pixel\_density} = X_{mm} \tag{3.5}$$

$$IMT\_thickness_{min_{mm}} = \frac{distance_{min_{pixels}}}{pixel\_density} \tag{3.6}$$

$$IMT\_thickness_{max_{mm}} = \frac{distance_{max_{pixels}}}{pixel\_density} \tag{3.7}$$

$$IMT\_thickness_{mean_{mm}} = \frac{distance_{mean_{pixels}}}{pixel\_density} \tag{3.8}$$

$$IMT\_thickness_{sd_{mm}} = \frac{distance_{sd_{pixels}}}{pixel\_density} \tag{3.9}$$

where,

$X_{pixels}$ is any X value in pixels

$X_{mm}$ is the mapped X value in mm

$pixel\_density$ is the pixel density per mm

$distance_{min_{pixels}}$ is the Minimum value for the distance in pixels

$distance_{max_{pixels}}$ is the Maximum value for the distance in pixels

$distance_{mean_{pixels}}$ is the Mean value for the distance in pixels

$distance_{sd_{pixels}}$ is the Standard Deviation for the distance in pixels

$IMT\_thickness_{min_{mm}}$ is the mapped Minimum value for IMT thickness in mm

$IMT\_thickness_{max_{mm}}$ is the mapped Maximum value for IMT thickness in mm

$IMT\_thickness_{mean_{mm}}$ is the mapped Mean value for IMT thickness in mm

$IMT\_thickness_{sd_{mm}}$ is the mapped Standard Deviation value for IMT thickness in mm

$IMT\_thickness_{min_{mm}}$, $IMT\_thickness_{max_{mm}}$ and $IMT\_thickness_{mean_{mm}}$
correspond to the $IMT_{min}$, $IMT_{max}$ and $IMT_{mean}$ values.
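As a small worked illustration of the mapping in Equation (3.5), the sketch below converts pixel values to millimeters using the stated density of 16.66 pixels per mm. The names `px_to_mm` and `map_features` are illustrative.

```python
PIXEL_DENSITY = 16.66  # pixels per mm, as stated for this dataset

def px_to_mm(value_px, pixel_density=PIXEL_DENSITY):
    """Equation (3.5): divide a pixel measurement by the pixel density."""
    return value_px / pixel_density

def map_features(features_px, pixel_density=PIXEL_DENSITY):
    """Apply the same mapping to every feature (min, max, mean, sd)."""
    return {name: px_to_mm(v, pixel_density) for name, v in features_px.items()}
```

A distance of 10 pixels therefore corresponds to roughly 0.60 mm.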

In the Classification process the subjects are classified into four major categories,
i.e. No Stenosis, Mild Stenosis, Moderate Stenosis and Severe Stenosis, based on
well-established criteria [99, 100]. This classification is further associated with the
% risk of an ischemic stroke episode, expressed as no risk or as a 25%, 50% or 75%
increased risk.


$IMT\_thickness_{sd_{mm}}$ is a marker of thickness irregularities. It is used to find
the % risk for an individual examined for the risk of an ischemic stroke episode.
$IMT\_thickness_{max_{mm}}$ is used for the major classification; further classification
is done on the basis of $IMT\_thickness_{sd_{mm}}$. The proposed decision tree for the
classification is given in Figure 3.5.


[Figure labels. Branch conditions: < 0.88 mm; ≥ 0.88 mm & ≤ 0.99 mm; > 0.99 mm & ≤ 1.12 mm; > 1.12 mm; < 0.1 mm; ≥ 0.1 mm & ≤ 0.89 mm; > 0.9 mm. Leaves: Classified as No Stenosis, No Risk; Classified as Mild Stenosis; Classified as Mild Stenosis with 25% Increased Risk; Classified as Moderate Stenosis; Classified as Moderate Stenosis with 25% Increased Risk; Classified as Moderate Stenosis with 50% Increased Risk; Classified as Severe Stenosis with 75% Increased Risk.]

Figure 3.5: Decision tree for classification of stenosis and ischemic stroke risk.
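One possible reading of the decision tree is sketched below in Python. The assignment of branch conditions to leaves is an assumption inferred from the labels of Figure 3.5 and the surrounding text (maximum IMT selects the stenosis class, the standard deviation refines the risk); it should be checked against the original figure. In particular, the figure leaves a gap between 0.89 mm and 0.9 mm, which this sketch folds into the upper branch, and it lists no leaf for Mild Stenosis with a standard deviation above 0.89 mm, which is reported here as an error.

```python
from typing import Optional, Tuple

def classify_stenosis(imt_max_mm: float, imt_sd_mm: float) -> Tuple[str, Optional[int]]:
    """Return (stenosis class, increased risk in percent).
    0 means 'No Risk'; None means the leaf names no risk figure."""
    if imt_max_mm < 0.88:
        return "No Stenosis", 0
    if imt_max_mm > 1.12:
        return "Severe Stenosis", 75
    grade = "Mild Stenosis" if imt_max_mm <= 0.99 else "Moderate Stenosis"
    if imt_sd_mm < 0.1:
        return grade, None
    if imt_sd_mm <= 0.89:
        return grade, 25
    if grade == "Moderate Stenosis":
        return grade, 50   # assumption: values in (0.89, 0.9] treated like '> 0.9 mm'
    raise ValueError("combination not covered by the leaves listed in Figure 3.5")
```

For instance, a maximum IMT of 1.05 mm with a standard deviation of 0.2 mm would fall under Moderate Stenosis with 25% increased risk under this reading.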


3.2 Results

To validate the proposed approach, the Features extracted after Intima-Media
Segmentation and the Stroke Risk Estimation results are investigated. The automatic
segmentation results are compared with the manual results to determine how much they
differ. The evaluation metrics used are Intra-Observer Error (IOE) [29, 33],
coefficient of variation (CV%) [29, 33], the Wilcoxon matched pairs rank sum test [33],
Bland-Altman plots [33, 106] and figure of merit (FoM%) [34].

$IMT_{min}$, $IMT_{max}$ and $IMT_{mean}$ have been obtained both for the proposed approach, using equations (3.6), (3.7) and (3.8), and from manual measurements by two experts for each of the 100 images. Intra-observer error is calculated as $IOE = SD_{IMT} / \sqrt{2}$, where $SD_{IMT}$ is the standard deviation of each set of 100 measurements. The difference as a percentage of the collective mean value is calculated as $CV\% = (IOE / \overline{IMT}) \times 100$, where $\overline{IMT}$ is the mean IMT of each set of 100 measurements.
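The two formulas above can be checked with a short sketch. The function names are hypothetical and the inputs are the rounded mean and standard deviation reported for Expert 1, so the outputs only approximate the tabulated values.

```python
import math


def intra_observer_error(sd_imt: float) -> float:
    """IOE = SD_IMT / sqrt(2)."""
    return sd_imt / math.sqrt(2)


def coefficient_of_variation_pct(sd_imt: float, mean_imt: float) -> float:
    """CV% = IOE / mean IMT * 100."""
    return intra_observer_error(sd_imt) / mean_imt * 100


# Rounded Expert 1 values: mean 0.67 mm, SD 0.15 mm.
ioe = intra_observer_error(0.15)
cv = coefficient_of_variation_pct(0.15, 0.67)
print(round(ioe, 3), round(cv, 2))
```

With these rounded inputs the sketch gives an IOE of about 0.106 and a CV of about 15.8%, close to the values reported for Expert 1; the small gap is presumably due to rounding of the inputs.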

Each set of measurements is checked for a significant difference between the segmented boundaries using the Wilcoxon matched-pairs rank sum test at p < 0.05. Agreement between automatic and manual results is evaluated using Bland-Altman plots with 95% limits of agreement.

The measurements for all 100 images are given as manual measurements (NM1, NM2) by the experts and automatic measurements (NA) generated by the proposed approach. The observed mean of the $IMT_{mean}$ values is 0.67 mm for Expert 1 (NM1), 0.65 mm for Expert 2 (NM2) and 0.68 mm for the automatic measurements by the proposed approach (NA). The results computed for the 100 images are given in Table 3.2 for $IMT_{mean}$, $IMT_{min}$ and $IMT_{max}$ for NM1, NM2 and NA, along with the values of IOE and CV%. The $IMT_{mean} \pm SD_{IMT}$ is 0.67 ± 0.15 mm for NM1, 0.65 ± 0.16 mm for NM2 and 0.68 ± 0.15 mm for NA.

Table 3.2: Comparison of manual and proposed approach measurements for 100 carotid artery ultrasound images

| Metrics | Manual, Expert 1 NM1 (mm) | Manual, Expert 2 NM2 (mm) | Automatic, proposed approach NA (mm) |
|---|---|---|---|
| IMT mean (SDIMT) | 0.67 (0.15) | 0.65 (0.16) | 0.68 (0.15) |
| IMT min (SDIMT) | 0.53 (0.13) | 0.57 (0.15) | 0.51 (0.11) |
| IMT max (SDIMT) | 0.82 (0.20) | 0.74 (0.16) | 0.86 (0.16) |
| IOE | 0.105 | 0.109 | 0.088 |
| CV % | 15.72 | 16.92 | 12.99 |

IOE: intra-observer error; CV%: coefficient of variation.

The results of the Wilcoxon rank sum test for both sets of manual measurements and the automatic measurements are given in Table 3.3. No significant difference is observed between the automatic and manual measurements, suggesting that manual measurements can be replaced by automatic measurements with confidence.

Table 3.3: Wilcoxon rank sum test computed for Expert 1, Expert 2 and Automatic measurements for 100 carotid artery ultrasound images

| Wilcoxon rank sum test | Expert 1 | Expert 2 |
|---|---|---|
| Automatic | Not Significant (0.3538) | Not Significant (0.0532) |
| Expert 1 | - | Not Significant (0.2365) |

The p value is shown in parentheses (significant difference at p < 0.05, no significant difference at p > 0.05).

Bland-Altman plots between the manual measurements by Expert 1 (NM1) and Expert 2 (NM2) and the automatic measurements by the proposed approach (NA) are given in Figure 3.6. The middle line represents the mean difference. The upper and lower lines represent the limits of agreement between the manual and automatic measurements, i.e. the mean difference ± 2SD. The difference between the measurements of the proposed approach and Expert 1 is 0.012 + 0.17 and 0.012 - 0.14 (Figure 3.6 (a)), and for Expert 2 it is 0.034 + 0.31 and 0.034 - 0.24 (Figure 3.6 (b)).
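The quantities plotted in a Bland-Altman analysis can be sketched as follows. This is a generic illustration under stated assumptions: the function name `bland_altman_limits` is hypothetical, the limits are taken as mean difference ± 2SD as described in the text, and the four measurement pairs are made-up values, not thesis data.

```python
import statistics


def bland_altman_limits(automatic, manual):
    """Return (mean difference, lower limit, upper limit) using +/- 2 SD."""
    diffs = [a - m for a, m in zip(automatic, manual)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff, mean_diff - 2 * sd_diff, mean_diff + 2 * sd_diff


# Illustrative IMT values in mm (hypothetical, not from the dataset).
auto_mm = [0.70, 0.66, 0.72, 0.61]
manual_mm = [0.68, 0.67, 0.70, 0.60]
mean_diff, lower, upper = bland_altman_limits(auto_mm, manual_mm)
print(round(mean_diff, 3), round(lower, 3), round(upper, 3))
```

Each point in the plot would then be one difference drawn against the pairwise mean of the two measurements, with the three returned values drawn as horizontal lines.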

FoM% is calculated for the proposed approach using equation 3.10.

$$\mathrm{FoM} = 100 - \left| \frac{\overline{IMT}_{Automatic} - \overline{IMT}_{Manual}}{\overline{IMT}_{Manual}} \right| \times 100 \qquad (3.10)$$

where,

$\overline{IMT}_{Automatic}$ is the average IMT value calculated for the 100 measurements done by the proposed approach

$\overline{IMT}_{Manual}$ is the mean IMT for the 100 measurements done manually by an expert

The FoM% of the proposed approach is 98.5 with respect to Expert 1 and 95.4 with respect to Expert 2.
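As a quick check of equation 3.10, the figure of merit can be computed from the rounded mean IMT values reported earlier (0.68 mm automatic, 0.67 mm for Expert 1, 0.65 mm for Expert 2). The function name is hypothetical; this is a sketch, not the thesis code.

```python
def figure_of_merit_pct(mean_automatic: float, mean_manual: float) -> float:
    """FoM% = 100 - |(automatic - manual) / manual| * 100."""
    return 100 - abs((mean_automatic - mean_manual) / mean_manual) * 100


print(round(figure_of_merit_pct(0.68, 0.67), 1))  # vs. Expert 1
print(round(figure_of_merit_pct(0.68, 0.65), 1))  # vs. Expert 2
```

With the rounded means the two results are about 98.5 and 95.4, in line with the reported values.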

Patients are classified for stroke risk by the proposed decision tree using the automatic IMT values calculated by the proposed approach. The manual measurements by the experts are evaluated by an expert physician for stroke risk estimation. The risk estimation results from the manual measurements evaluated by the physician are then compared with those from the automatic measurements evaluated by the proposed decision tree.

The major classes for stroke risk estimation, i.e. no stenosis, mild stenosis, moderate stenosis and severe stenosis, are assigned the values 1, 2, 3 and 4 respectively. The difference between classification results is measured using equation 3.11.

$$D_f = \left| ManualClassValue - AutomaticClassValue \right| \qquad (3.11)$$

where,

ManualClassValue is the class value for the manual measurement of an image

AutomaticClassValue is the class value for the automatic measurement of an image
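Equation 3.11 and the way its values can be tallied into percentages are sketched below. The names `CLASS_VALUE`, `classification_difference` and `difference_distribution` are hypothetical, and the four example labels are illustrative rather than taken from the dataset.

```python
from collections import Counter

# Class values as assigned in the text: 1 to 4 in order of severity.
CLASS_VALUE = {
    "no stenosis": 1,
    "mild stenosis": 2,
    "moderate stenosis": 3,
    "severe stenosis": 4,
}


def classification_difference(manual_class: str, automatic_class: str) -> int:
    """Df = |ManualClassValue - AutomaticClassValue|."""
    return abs(CLASS_VALUE[manual_class] - CLASS_VALUE[automatic_class])


def difference_distribution(manual_classes, automatic_classes):
    """Percentage of images at each Df value (0 means agreement)."""
    counts = Counter(
        classification_difference(m, a)
        for m, a in zip(manual_classes, automatic_classes)
    )
    n = len(manual_classes)
    return {df: 100 * counts.get(df, 0) / n for df in range(4)}


manual = ["no stenosis", "mild stenosis", "severe stenosis", "moderate stenosis"]
automatic = ["no stenosis", "moderate stenosis", "severe stenosis", "mild stenosis"]
print(difference_distribution(manual, automatic))
```

In this reading, the percentage at Df = 0 corresponds to the classification accuracy, and the remaining entries correspond to the Df = 1, 2 and 3 columns.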

Figure 3.6: Bland-Altman plots of (a) Expert 1 NM1 versus automatic measurements by the proposed approach NA and (b) Expert 2 NM2 versus automatic measurements by the proposed approach NA.

The difference value was 0 where the classification from the automatic values matched that of the expert physician for the manual measurements. Where the results did not match, the difference value was 1, 2 or 3. The distribution of these differences is given in Table 3.4.

Table 3.4: Difference computed for classification results of Expert 1, Expert 2 and Automatic measurements for 100 carotid artery ultrasound images

| Comparison | Classification Accuracy | Classification Difference Df = 1 | Df = 2 | Df = 3 |
|---|---|---|---|---|
| Automatic vs. Expert 1 | 76% | 19% | 5% | 0 |
| Automatic vs. Expert 2 | 68% | 24% | 8% | 0 |
| Expert 1 vs. Expert 2 | 69% | 20% | 6% | 5% |

3.3 Discussion

CIMT is considered a marker for early diagnosis and risk estimation of atherosclerosis. Ultrasound images are generally used to measure CIMT. We have developed a fully automatic, image processing based approach to measure the CIMT. The proposed approach can predict ischemic stroke risk based on the IMT values and the variation in the IMT values of the same individual. An IM segmentation approach is proposed that uses an improved snake initialization method for the GVF snakes. The coordinates of the selected lines extracted by HT are used for automatic initialization of the snakes.

The proposed approach is more reproducible, as its CV% (12.99%) and IOE (0.088) are both smaller than those of the manual measurements by the experts (see Table 3.2). The manual measurements by Expert 1 and Expert 2 (0.67 mm and 0.65 mm respectively) were smaller than the measurements by the proposed approach (0.68 mm). Similar results have been reported in other studies [29, 33]. Intra-observer variability is an effective metric for the performance of a computer-aided measurement approach. Table 3.2 shows that the intra-observer variability, calculated as IOE, is higher for both experts than for the proposed approach.

The Wilcoxon rank sum test results are given in Table 3.3. There is no significant difference between the automatic measurements made by the proposed approach and the manual measurements made by the experts. The Bland-Altman plots in Figure 3.6 show that almost all of the data points lie within ±2 SD of the mean difference. The mean difference between the proposed approach and Expert 1 is 0.012 mm, and between the proposed approach and Expert 2 it is 0.034 mm.

The images in the dataset were recorded using a standard recording technique in which the position of the probe is adjusted so that the ultrasound beam is at a right angle to the arterial wall [33]. This improves the Intima-Media visualization.

Correct segmentation depends on the estimation and positioning of the initial snake contour. All the images are correctly segmented by the proposed approach, giving a segmentation accuracy of 100%.

The classification results by an expert physician for both sets of manual measurements and by the proposed decision tree for the automatic measurements are compared. The classification results of the proposed approach are closer to those of Expert 1 than to those of Expert 2. The Bland-Altman plots show a similar pattern (see Figure 3.6). The reason is that Expert 2 tends to give smaller values for the IMT measurements. The classification accuracy of the proposed approach is 76% when compared with the physician's diagnosis for the Expert 1 measurements and 68% when compared with that for the Expert 2 measurements. The classification accuracy for the comparison between the two sets of manual measurements is 69%.

The classification difference (Df) of 1 is observed for 19% of the images for the automatic

vs. expert1 measurements, for 24% of the images for the automatic vs. expert2 measurements

and for 20% of the images for the expert 1 vs. expert 2 measurements. Classification

difference (Df) of 2 is observed for 5% of the images for the automatic vs. expert1

measurements, for 8% of the images for the automatic vs. expert2 measurements and for 6%

of the images for the expert 1 vs. expert 2 measurements. Classification difference (Df) of 3

is observed for 0% of the images for the automatic vs. expert1 measurements, for 0% of the

images for the automatic vs. expert2 measurements and for 5% of the images for the expert 1

vs. expert 2 measurements.

A classification difference (Df) of 3 means that the diagnosis is completely incorrect. For example, an individual with severe stenosis is diagnosed as having no stenosis and the stroke risk is estimated as ‘risk free’, or an individual with no stenosis is diagnosed as having severe stenosis with the stroke risk increased by 75%. The proposed approach, when compared with the diagnosis by the expert physician for either set of manual measurements, did not produce any such incorrect diagnosis. In contrast, the diagnoses for the two sets of manual measurements, when compared with each other, had such incorrect results in 5% of the cases.

3.4 Conclusion

In this research we have proposed an improved approach for Intima-Media segmentation with an improved snake initialization process. A classification scheme is also proposed to associate stenosis with ischemic stroke risk estimation. The proposed approach extracts the contours in the ultrasound images using gradient vector flow snakes with an improved snake initialization process. The seed points for this initialization are extracted using selected edges returned by the candidate line selection algorithm.

IMT is calculated from the extracted contours. SF are calculated using the IMT values and are included in the feature set for risk estimation analysis. The proposed approach is tested and clinically validated on a data set of 100 longitudinal ultrasound images of the CA. An IOE of 0.088, a CV of 12.99%, Bland-Altman plots with small differences from the experts (0.01 and 0.03 for Expert 1 and Expert 2, respectively) and an FoM of 98.5% are obtained.

We found no significant difference between the IMT measurements obtained through the proposed approach and the manual measurements, which supports the accuracy of our approach. Based on the Bland-Altman analysis, CV% and FoM, the measurements of the proposed approach can be regarded as interchangeable with manual measurements.

The proposed approach can be used for the measurement of IMT, complementing manual IMT measurements. The IMT values are then used to estimate an individual’s risk of stroke. The risk estimates from the measurements by the proposed approach and from the manual measurements are also found to be similar.

The proposed approach is better than the existing approaches in terms of FoM% and variation (see Table 3.5). Variation is the absolute difference between the manual and system means (equation 3.12).

$$\mathrm{Variation} = \left| \overline{IMT}_{Automatic} - \overline{IMT}_{Manual} \right| \qquad (3.12)$$

where,

$\overline{IMT}_{Automatic}$ is the average IMT value calculated for the 100 measurements done by the proposed approach

$\overline{IMT}_{Manual}$ is the mean IMT for the 100 measurements done manually by an expert

Table 3.5: Comparison of proposed approach with existing approaches

| Approach | Method | Mode | Dataset Size | Manual IMT (Mean ± SD) mm | System IMT (Mean ± SD) mm | \|Variation\| | FoM % |
|---|---|---|---|---|---|---|---|
| Delsanto et al., 2007 [22] | Snakes, Fuzzy C-means method | Automatic | 200 | 0.77 ± 0.22 | 0.71 ± 0.16 | 0.06 | 92.2 |
| Loizou et al., 2007 [33] | Snakes based SF | Semi-Automatic | 100 | 0.65 ± 0.18 | 0.68 ± 0.12 | 0.03 | 95.4 |
| Faita et al., 2008 [13] | FOAM edge operator | Automatic | 150 | 0.56 ± 0.14 | 0.57 ± 0.14 | 0.01 | 98.2 |
| Loizou et al., 2009 [107] | Snakes | Semi-Automatic | 100 | 0.71 ± 0.17 | 0.67 ± 0.12 | 0.04 | 94.4 |
| Molinari et al., 2010 [24] | Integrated approach | Automatic | 182 | 0.92 ± 0.30 | 0.75 ± 0.39 | 0.17 | 81.5 |
| Molinari et al., 2010 [47] | Integrated approach | Automatic | 200 | 0.92 ± 0.30 | 0.75 ± 0.39 | 0.17 | 81.5 |
| Molinari et al., 2011 [108] | Integrated approach, FOAM | Automatic | 295 | 0.782 ± 0.281 | 0.750 ± 0.203 | 0.03 | 95.9 |
| Meiburger et al., 2011 [38] | Edge flow | Automatic | 300 | 0.818 ± 0.246 | 0.861 ± 0.276 | 0.04 | 94.7 |
| Molinari et al., 2012 [49] | FOAM | Automatic | 365 | 0.95 ± 0.39 | 0.91 ± 0.44 | 0.04 | 95.8 |
| Molinari et al., 2012 [34] | FOAM, Dual snakes | Automatic | 665 | 0.836 ± 0.296 | 0.823 ± 0.239 | 0.01 | 98.4 |
| Xu et al., 2012 [35] | HT, Dual snakes | Semi-Automatic | 50 | 0.63 ± 0.14 | 0.65 ± 0.16 | 0.02 | 96.8 |
| Menchón-Lara et al., 2014 [109] | Morphological operations, ANN | Automatic | 60 | 0.64 ± 0.19 | 0.61 ± 0.19 | 0.03 | 95.3 |
| Proposed | Integrated approach | Automatic | 100 | 0.672 ± 0.149 | 0.683 ± 0.148 | 0.01 | 98.5 |

The presented research may be extended to risk estimation on ultrasound images of the CA at a large scale. Moreover, it may take into account plaque texture properties, phenotype data (e.g. age, gender, stroke history, smoking and Body Mass Index (BMI)) and genetic data to develop a substantially improved and more extensive criterion for ischemic stroke risk estimation.

CHAPTER NO. 4

PROPOSED APPROACH - GENETIC DATA BASED ISCHEMIC STROKE CLASSIFICATION

In the last decade the genetic basis of human diseases has been investigated intensively. Different Genome-Wide Association Studies (GWAS) have been conducted to find the genetic causes of different human diseases [110-113]. Such a study consists of analyzing genotype data from affected individuals (cases) and healthy individuals (controls) to identify the Single Nucleotide Polymorphisms (SNPs) whose frequencies differ significantly between the two groups. Thousands of SNPs associated with diseases have been identified [114-117], including SNPs for diabetes [118-121], cancer [122-125], Alzheimer’s disease [126-128] and autism [129, 130].

Ischemic stroke is a common multifactorial neurological disorder. It has many risk factors, some of which can be changed and some of which cannot.

Changeable factors are the risk factors that can be treated, controlled or changed. They include high blood pressure, smoking, diabetes, artery diseases, atrial fibrillation, sickle cell disease/anemia, high blood cholesterol, physical inactivity and obesity.

Unchangeable risk factors, in contrast, are those that cannot be changed, modified or treated. They include age, genetic predisposition, race, gender, prior stroke, TIAs and heart attack.

Other factors related to ischemic stroke risk are geographic location and drug and alcohol usage. Research indicates a genetic influence on ischemic stroke [112, 131-133]. Almost half of the ischemic stroke cases are suspected to be genetic, as the patients do not have the conventional risk factors.

Extensive research has been conducted to investigate the unknown causes and their relationship with genetics. Different studies have suggested an association of genetic factors with stroke among young individuals. Family history of stroke is also being investigated as a factor that increases stroke risk. Family history is found to be a major risk factor in both young individuals and adults.

Living organisms are made up of cells. The instructions for growth and functionality are stored in the nucleus of the cell. Proteins produced inside a cell enable it to perform special functions. There are about 100 trillion cells in a human body.

DeoxyriboNucleic Acid (DNA) is the genetic material of a cell in all living things. DNA is a molecule that carries a detailed set of encoded instructions, also known as the blueprint of a human being. There are two main types of DNA, namely:

1. Genomic DNA

2. Mitochondrial DNA

Genomic DNA (nuclear DNA) comprises the whole genome of an organism. Nuclear DNA undergoes recombination. Mitochondrial DNA is present in the mitochondria. It is always maternally inherited and remains unchanged from parent to child. It is used to investigate maternal hereditary diseases.

DNA has the shape of a twisted ladder known as the “double helix”, with the ladder rungs made of base pairs held together by hydrogen bonds. Each base is one of the four letters A (Adenine), C (Cytosine), T (Thymine) and G (Guanine). The bases pair according to a fixed rule: A always pairs with T, and C with G.

A sequence of base letters is known as a DNA strand, e.g. ATGCTCGAATAAATGTCAATTTGA. These letters combine to make words, for example ATG CTC GAA TAA ATG TCA ATT TGA. Words combine to make sentences. These sentences are known as genes.
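The pairing rule and the grouping of letters into three-letter words can be illustrated with a short sketch. The function names are hypothetical; the strand is the example given in the text.

```python
# Base pairing rule from the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the paired base for every base of the strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def split_into_words(strand: str, size: int = 3) -> list[str]:
    """Split a strand into consecutive words of the given size."""
    return [strand[i:i + size] for i in range(0, len(strand), size)]


strand = "ATGCTCGAATAAATGTCAATTTGA"
print(" ".join(split_into_words(strand)))
print(complement_strand(strand))
```

The first line printed reproduces the word grouping shown in the text; the second gives the bases on the opposite side of each ladder rung.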

Genes instruct the cell to produce molecules known as proteins. These proteins enable a cell to work together with other cells to perform special functions; for example, heart cells work together to make the heart pump blood to all parts of the body. There are about 25,000 genes in the human body. DNA is packaged in compact units called chromosomes. Each chromosome has two DNA chains. There are 46 chromosomes (2 sets of 23 chromosomes) in humans. One set is inherited from the mother and the other from the father. A specific position on a chromosome is known as a marker or locus. There are two types of markers: SNPs and macrosatellite markers. SNPs are short DNA sequences that surround a single base-pair change. Macrosatellite markers are long DNA sequences that surround base-pair changes.

If the instructions in a gene (DNA sequence) are changed, the gene is known as a mutated gene. Mutated genes are responsible for the malfunctioning of cells and for different disorders. When DNA is copied just before cell division, an error occurs roughly once in every 100,000 nucleotides. The error can be the substitution of one base by another, or the deletion or addition of a base. Most of these changes are repaired by the cell itself. If a change in the DNA is not repaired, it can be passed on from the parent carrying the changed cell to the child.

Heredity is the passing of traits from parents to children. Genes encode the instructions that define our traits, which are the notable features or qualities of an individual. Every human has a different combination of traits that makes him or her unique. These qualities and features are passed from generation to generation: one generation inherits the traits from the previous generation and passes them on to the next. Environmental conditions affect the traits and can partially or completely change them. There are several types of traits.

1. Physical Traits

2. Behavioral Traits

Physical traits form one’s appearance. Skin color, eye color, height, and hair type and color are some examples of physical traits. Environmental factors such as exposure to sunlight or certain chemicals can change them. Behavioral traits form one’s personality and disposition. An example is a guard dog’s instinct to guard houses or herds of cattle. A watch dog can nevertheless be trained to be a play dog, which implies that environmental conditions can change behavioral traits. Predisposition to certain disease traits increases an individual’s risk of certain genetically transmissible diseases. Hereditary diseases include heart disease, sickle cell anaemia, cancer and mental disorders. Disease risk can be reduced by preventive measures and a healthy lifestyle.

An allele is the genetic information for a trait. Alleles are DNA sequences within a marker or locus. There are two alleles for each trait, one copied from the mother and one from the father. These copies can be identical to or different from each other. Individuals who have two identical alleles for a particular trait are known as homozygous. The identical alleles can be used to predict an individual’s traits. Individuals who have two different alleles for a trait are known as heterozygous. One of the alleles is masked by the other: the masked allele is called “recessive” and the one that masks it is called “dominant”.
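The homozygous/heterozygous distinction maps directly onto two-letter genotype strings such as CC or CT. A minimal sketch, with a hypothetical function name:

```python
def zygosity(genotype: str) -> str:
    """Classify a two-allele genotype such as 'CC' or 'CT'."""
    if len(genotype) != 2:
        raise ValueError("expected exactly two alleles, e.g. 'CT'")
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"


for g in ("CC", "CT", "TT"):
    print(g, zygosity(g))
```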

Well-defined physical traits can be easily traced through generations. For such traits the

alleles are known. These include eye color, skin color, thumb extension etc. Incomplete

dominance traits are not easy to trace through generations. In such a case the alleles interact

together to produce a particular trait. Single-gene traits are influenced by a single gene. Traits

that are formulated by more than one gene are known as complex traits.

Research has shown that genetic mutations can cause more than 4,000 diseases. Environmental factors as well as multi-gene variations also play a major role in elevating disease risk. A human cell usually has 5 to 10 mutated genes, but having a mutated gene does not always mean that an individual will develop a certain disease. The problem occurs either when the diseased gene is dominant or when both copies of the recessive gene are mutated. Individuals who meet either of the two conditions are diseased.

Individuals with a mutation in only one copy of a recessive gene are called carriers; the normal copy of the gene takes over the functionality. If both parents are carriers, there is a 25% chance of the child developing the disease. One variation is X-linked disease. In such a case, a recessive mutation on the X chromosome may cause a genetic disease in males, who have a single X chromosome, whereas females may not develop it because they have two copies of the X chromosome.
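The 25% figure for two carrier parents follows from enumerating the four equally likely allele combinations. The sketch below assumes simple single-gene inheritance, writes the normal allele as `A` and the mutated recessive allele as `a`, and uses a hypothetical function name.

```python
from fractions import Fraction
from itertools import product


def offspring_genotype_probabilities(parent1: str, parent2: str) -> dict:
    """Enumerate one allele from each parent; all pairings equally likely."""
    outcomes = ["".join(sorted(pair)) for pair in product(parent1, parent2)]
    return {g: Fraction(outcomes.count(g), len(outcomes)) for g in set(outcomes)}


# Both parents are carriers (Aa): one normal and one mutated copy each.
probs = offspring_genotype_probabilities("Aa", "Aa")
print(probs["aa"])  # affected: both copies mutated
print(probs["Aa"])  # carrier
print(probs["AA"])  # unaffected non-carrier
```

The affected genotype `aa` appears in one of four combinations, which is the 25% chance stated above; half of the children are expected to be carriers.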

Another variation is an extra or missing chromosome in the cell, arising during cell division in the reproduction phase. This can also cause genetic diseases. The pair of alleles at a specific locus on a pair of chromosomes is known as the genotype. A haplotype is a sequence of multiple alleles on a chromosome.

The International Haplotype Map Project (HapMap), along with genotyping methods, has provided opportunities for association studies. Different linkage and candidate association studies have been carried out to identify the candidate genes and mutations that are prospective risk factors for ischemic stroke. Many candidate genes are significantly associated with ischemic stroke [131, 134, 135], but only a few have been replicated.

The SNPs and the one haplogroup reported in the literature to be associated with ischemic stroke risk are shown in Figure 4.1. Table 4.1 shows the genes, their SNP ids and the respective SNPs identified by meta-analysis of genes for stroke risk estimation.

Multiple studies have been conducted to associate SNPs with disease risk in individuals using their SNP profiles [136-138]. The disease risk is evaluated using the SNP profile and then compared with the actual status of the individual (case/control).

[Figure 4.1: diagram of stroke-associated genes/loci and SNP ids. Gene/locus labels: MTHFR, F5, PRKCH, PROC, CYP11B2, APOE e4, Near NINJ2, 9p21.3. SNP id labels: rs1801133, rs1801131, rs12121543, rs13306553, rs9651118, rs1801133, rs2274976, rs1801131, rs6025, rs2230500, rs2246700, rs3783799, rs12587610, rs3825655, rs1401296, rs1799998, rs7412, rs429358, rs1333049, rs12425791, rs11833579. One node is labelled Haplogroup.]

Figure 4.1: Identified SNPs causing stroke risk

Table 4.1: Stroke associated genes/ locus, SNP id/ haplotype and corresponding SNPs

| Gene Symbol/ Locus | Gene Name | Chromosome | Cytogenetic Location | SNP id/ Haplotype | SNP |
|---|---|---|---|---|---|
| MTHFR [133, 139-142] | Methylenetetrahydrofolate reductase (NAD(P)H) | 1 | 1p36.3 | rs1801133 | TTGAAGGAGAAGGTGTCTGCGGGAG[C/T]CGATTTCATCATCACGCAGCTTTTC |
| | | | | rs1801131 | TGGGGGGAGGAGCTGACCAGTGAAG[A/C]AAGTGTCTTTGAAGTCTTCGTTCTT |
| | | | | rs12121543 | GCCACCACATGCCCAGGAGGCCATT[A/C]CTGTAAATTCTGCCCCTGACTCCTC |
| | | | | Haplotype: rs13306553 | ACATGCAGAGGTGAACTGCACCATG[C/T]CCTTGCTCCTTTTGTATCACCCACT |
| | | | | Haplotype: rs9651118 | ACTTTTCACAGCGCTTGCCTGTTTA[C/T]TATCTCAGGTGAGTTAAGACATCAT |
| | | | | Haplotype: rs1801133 | TTGAAGGAGAAGGTGTCTGCGGGAG[C/T]CGATTTCATCATCACGCAGCTTTTC |
| | | | | Haplotype: rs2274976 | GAGGCCTTTGCCCTGTGGATTGAGC[A/G]GTGGGGAAAGCTGTATGAGGAGGAG |
| | | | | Haplotype: rs1801131 | TGGGGGGAGGAGCTGACCAGTGAAG[A/C]AAGTGTCTTTGAAGTCTTCGTTCTT |
| F5 [143-145] | Coagulation factor V (proaccelerin, labile factor) | 1 | 1q23 | rs6025 | TGTAAGAGCAGATCCCTGGACAGGC[A/G]AGGAATACAGGTATTTTGTCCTTGA |
| PRKCH [112, 146, 147] | Protein kinase C, eta | 14 | 14q23.1 | rs2230500 | TTTGCCATAGGTGATGCTTGCAAGA[A/G]TAAAAGAAACAGGAGACCTCTATGC |
| | | | | rs2246700 | GATCCTAAATGGGGAAAAGGCATTT[A/T]ATGGCTCTAGAGAGGGTCCTGGGGA |
| | | | | rs3783799 | GCCTGGGGACAATGAAGGATCTGAG[A/G]CGTTATCAGCTGGAATAAATTCTGA |
| | | | | rs12587610 | CATATTATATATGGTGGTTAAGATT[A/G]GGGCTCTGGAATCAGATTTGGATTT |
| | | | | rs3825655 | TGCTGATGGGAGGTGAAACTGAAGC[A/G]ACAAGCAACACATTCCTGTTTTATG |
| PROC [148, 149] | Protein C (inactivator of coagulation factors Va and VIIIa) | 2 | 2q12-q14 | rs1401296 | AACCGCGCCCGGGGCTGGAAGCACC[C/T]GCCGAATGGCACAGGGCCAGTGCCC |
| CYP11B2 [133, 150, 151] | Cytochrome P450, family 11, subfamily B, polypeptide 2 | 8 | 8q21-q22 | rs1799998 | AAAGTCTATTAAAAGAATCCAAGGC[C/T]CCCTCTCATCTCACGATAAGATAAA |
| APOE-e4 [152-154] | Apolipoprotein E version e4 | 19 | 19q13.2 | rs7412 | CCGCGATGCCGATGACCTGCAGAAG[C/T]GCCTGGCAGTGTACCAGGCCGGGGC |
| | | | | rs429358 | GCTGGGCGCGGACATGGAGGACGTG[C/T]GCGGCCGCCTGGTGCAGTACCGCGG |
| Near NINJ2 [155, 156] | - | 12 | 12p13 | rs12425791 | CCTGGTAAAAAGATTTTGTGCCAAC[A/G]GTTCTTGGTTTCTCCTCTGACAACC |
| | | | | rs11833579 | CTTTCTGGAAAACCTTATTTCGGAT[A/G]CCAGAAGCAAAATATTAACTATTTA |
| 9p21.3 [157-159] | - | 9 | 9p21.3 | rs1333049 | CATACTAACCATATGATCAACAGTT[C/G]AAAAGCAGCCACTCGCAGAGGTAAG |


4.1 Materials and Methods

4.1.1 Data

We have compiled data from various databases and studies on ischemic stroke [160-163].

Our compiled data includes DNA data for 249 patients with stroke and 268 controls. The

details about the dataset are given in Table 4.2.

Table 4.2: Details of dataset

Attributes Controls Cases

No. of Subjects 268 249

Male 128 134

Female 140 115

Mean Age (years) 69.5 71.8

4.1.2 Method

Genotype data for 15 SNPs and 1 haplotype (listed in Table 4.1) are organized for analysis for all subjects. The alleles of the selected SNPs and of the haplotype are considered for every subject. Table 4.3 shows the genetic data for eight randomly chosen subjects as an example. The attribute ‘class’ in Table 4.3 shows whether the individual is a normal control or an ischemic stroke patient.


Table 4.3: Sample genetic data for randomly chosen subjects

SNP id/ Haplotype    Allele
                     Subject 1  Subject 2  Subject 3  Subject 4  Subject 5  Subject 6  Subject 7  Subject 8
rs1801133            CT         CC         CT         CT         CT         CT         CC         CC
rs1801131            AA         AA         AC         AC         AA         AC         AA         AA
Haplotype:
  rs12121543         CC         CC         AC         AC         CC         AC         CC         CC
  rs13306553         TT         TT         TT         CT         TT         CT         TT         TT
  rs9651118          CT         CT         TT         TT         CT         TT         CC         CC
  rs1801133          CT         CC         CT         CT         CT         CT         CC         CC
  rs2274976          GG         GG         GG         AG         GG         AG         GG         GG
  rs1801131          AA         AA         AC         AC         AA         AC         AA         AA
rs6025               GG         GG         GG         GG         GG         -          GG         GG
rs2230500            GG         GG         GG         GG         GG         GG         GG         GG
rs2246700            AT         AT         AT         AT         AA         -          AA         AT
rs3783799            GG         GG         AG         AG         GG         -          GG         GG
rs12587610           GG         AG         AG         AA         GG         -          AG         AG
rs3825655            CC         -          CT         -          -          -          CC         CC
rs1401296            CT         CT         TT         CT         CT         -          CT         TT
rs1799998            CT         CT         CC         CT         CT         CT         TT         CT
rs7412               CT         CC         CC         CC         CC         CC         CC         CC
rs429358             TT         TT         TT         TT         TT         -          TT         TT
rs12425791           AG         AG         GG         GG         AG         AG         GG         AA
rs11833579           AG         AG         AG         GG         AG         AG         GG         AA
rs1333049            GG         CG         GG         CG         CC         CG         GG         CG
Class*               1          1          0          0          1          1          0          1

* Class 0 = normal subject (control class), Class 1 = ischemic stroke subject (case class)

This data is used for classification purposes.
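As an illustration of how rows such as those in Table 4.3 could be prepared for WEKA, the following minimal sketch writes three of the sample subjects to ARFF text. The function name `to_arff`, the relation name `stroke_snps` and the choice of three SNPs are assumptions made for this example only; the thesis does not state how its input files were built. Missing genotypes (the dashes in Table 4.3) are written as `?`, the ARFF missing-value marker.

```python
def to_arff(relation, snp_ids, rows):
    """Build ARFF text. rows: list of (genotype list aligned with snp_ids, class label)."""
    lines = [f"@relation {relation}"]
    for col, snp in enumerate(snp_ids):
        # Nominal attribute: the set of genotypes observed for this SNP.
        values = sorted({g[col] for g, _ in rows if g[col] != "-"})
        lines.append(f"@attribute {snp} {{{','.join(values)}}}")
    lines.append("@attribute class {0,1}")
    lines.append("@data")
    for genotypes, label in rows:
        cells = ["?" if g == "-" else g for g in genotypes]  # '-' means missing
        lines.append(",".join(cells + [str(label)]))
    return "\n".join(lines)

# Subjects 1 to 3 of Table 4.3, restricted to three SNPs for brevity.
snp_ids = ["rs1801133", "rs3825655", "rs1333049"]
rows = [
    (["CT", "CC", "GG"], 1),
    (["CC", "-", "CG"], 1),
    (["CT", "CT", "GG"], 0),
]
arff_text = to_arff("stroke_snps", snp_ids, rows)
print(arff_text)
```

Treating each genotype as a nominal value keeps the encoding neutral; a numeric allele-count encoding would be an alternative design choice.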


4.1.3 Classification

We have used the machine learning tool WEKA (freely available at http://www.cs.waikato.ac.nz/ml/weka) for the classification of genetic data. Classifiers are chosen from the bayes, functions, lazy, meta and trees groups of classifiers. Classifiers from different groups are chosen because their classification results vary with the type of application. Nine classifiers are chosen, namely Bayes Net [164], Naïve Bayes [165, 166], IBk [167], AdaBoostM1 [168], Classification via Regression [169], J48 [170], Random Forest [171], Bagging [170, 172] and Multilayer Perceptron (MLP) [173].

4.1.3.1 Bayes Net

A Bayes Net is a statistical model in which the conditional dependencies of variables are represented using a directed acyclic graph. Such networks are generally used to answer probabilistic queries about the variables. These probabilities are used to classify data.

We have used the simple estimator algorithm for finding the conditional probability tables of the Bayes network and K2 for searching network structures.

4.1.3.2 Naïve Bayes

Naïve Bayes classifiers are probabilistic classifiers. They are based on probabilistic models built by applying Bayes’ theorem to a feature set with strong independence assumptions. Each feature contributes independently to the classification, and any possible correlations among the features are ignored. These classifiers require only a small training set for parameter estimation.

A normal distribution is used for numeric attributes.

4.1.3.3 IBk

IBk is based on the k-nearest-neighbor algorithm. The number of nearest neighbors can be specified, or it can be determined automatically using leave-one-out cross-validation with a specified upper limit. The similarity function in IBk calculates the similarity between a training instance and the instances stored in the concept description. The similarity results and classification performance records are fed to a classification function to get the classification of an instance. The concept description is updated after each classification.

We have set the number of neighbors to 1 and have not enabled the option that outputs additional information to the console. No distance weighting is used. Mean absolute error is used for cross-validation in regression. Nearest neighbor search is done using Euclidean distance. No limit is set on the number of training instances.
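The nearest-neighbor rule with one neighbor can be sketched as follows. This is a simplified illustration, not the WEKA implementation: the names `distance` and `predict_1nn` and the toy genotypes are hypothetical, and it is assumed here that each nominal attribute contributes 0 to the Euclidean distance on a match and 1 on a mismatch.

```python
import math

def distance(a, b):
    """Euclidean distance where each nominal attribute contributes 0 (match) or 1 (mismatch)."""
    return math.sqrt(sum((0 if x == y else 1) ** 2 for x, y in zip(a, b)))

def predict_1nn(train, query):
    """train: list of (genotype tuple, label). Returns the label of the closest instance."""
    nearest = min(train, key=lambda item: distance(item[0], query))
    return nearest[1]

# Toy training set: three genotype vectors with their class labels.
train = [
    (("CT", "AA", "GG"), 1),
    (("CC", "AA", "GG"), 1),
    (("CT", "AC", "AG"), 0),
]
print(predict_1nn(train, ("CT", "AC", "AG")))
```

With k = 1 no voting or distance weighting is needed, which matches the settings described above; ties are resolved here simply by taking the first closest instance.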

4.1.3.4 AdaBoostM1

AdaBoost stands for adaptive boosting, and M1 is a version of the adaptive boosting algorithm. It works in conjunction with other learning algorithms to improve their performance. The outputs of these learning algorithms, referred to as weak learners, are combined into a weighted sum, which is the result of the AdaBoostM1 classifier.

Individual learners might be weak, but as long as each one produces slightly better results than random guessing, the final model converges to a strong learner. It reduces the effect of the curse of dimensionality by selecting only the features that improve the predictive power of the model.

We have used a decision stump as the base classifier. The number of iterations is set to 10 and the seed to 1. Reweighting is used, and the weight threshold for weight pruning is set to 100.
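The reweighting idea can be made concrete with a single boosting round. The sketch below follows the standard AdaBoost.M1 update (correctly classified instances are down-weighted by beta = error / (1 - error), and the learner's vote is log(1 / beta)); the function name `adaboost_m1_round` and the four-instance example are illustrative assumptions, not the configuration used in the experiments.

```python
import math

def adaboost_m1_round(weights, correct):
    """One AdaBoost.M1 reweighting step.

    weights: current instance weights (summing to 1).
    correct: one boolean per instance, True if the weak learner classified it correctly.
    Returns (new_weights, vote_weight) for this weak learner.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if error == 0 or error >= 0.5:
        raise ValueError("weak learner must have 0 < error < 0.5")
    beta = error / (1.0 - error)
    # Shrink the weight of correctly classified instances, then renormalise.
    updated = [w * beta if ok else w for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], math.log(1.0 / beta)

weights = [0.25, 0.25, 0.25, 0.25]
new_weights, vote = adaboost_m1_round(weights, [True, True, True, False])
print(new_weights, vote)
```

After the round, the single misclassified instance carries half of the total weight, so the next weak learner concentrates on it.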

4.1.3.5 Classification via Regression

In classification via regression, a linear regression function is estimated from the dataset. Weights are calculated for the different parameters such that the learning objective reduces to a least-squares fit. The resulting regression function is used for the classification of new data.

We have used M5P as the base classifier with the option to generate a model tree/rule. The minimum number of instances allowed at a leaf node is set to 4.

4.1.3.6 J48

J48 is the WEKA implementation of the C4.5 algorithm, which produces a decision tree for classification. It is a statistical classifier that builds the decision tree from a set of training data using entropy. The attribute chosen at each node is the one that most effectively splits the data into one class or the other, as measured by the information gain of that attribute.

We have used a confidence factor of 0.25 with the minimum number of instances per leaf set to 2. numFolds is set to 3.
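The entropy-based split criterion can be illustrated on the eight sample subjects of Table 4.3. The sketch below computes plain information gain for one nominal attribute; C4.5 additionally normalises this by the split information (gain ratio), which is omitted here. The helper names are hypothetical, and eight subjects are far too few to draw conclusions from.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction obtained by splitting the labels on a nominal attribute."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# rs12425791 genotypes and class labels of Subjects 1 to 8 in Table 4.3.
genotypes = ["AG", "AG", "GG", "GG", "AG", "AG", "GG", "AA"]
classes = [1, 1, 0, 0, 1, 1, 0, 1]
gain = information_gain(genotypes, classes)
print(round(entropy(classes), 4), round(gain, 4))
```

For these eight subjects the genotype happens to separate the two classes completely, so the gain equals the entropy of the class labels.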

4.1.3.7 Random Forest

Random Forest is an ensemble learning based classifier. Many decision trees are built at training time, and the classification of an instance is the mode of the classifications of these trees (or, for regression, the mean of their predictions). It is a combination of bagging and random selection of features.

A new object is classified by feeding the input vector to each of the decision trees in the random forest. Each tree votes for a class, and the instance is assigned to the class with the majority of votes. The maximum depth of the trees is set to unlimited and the number of trees to be generated is set to 10.

4.1.3.8 Bagging

Bagging, also known as bootstrap aggregating, is a model averaging approach. It improves the stability and accuracy of a classification algorithm. New training sets, known as bootstrap samples, are generated by sampling from the original training set uniformly and with replacement. Multiple models are fitted on the bootstrap samples. These models are combined by averaging or voting for the classification of an instance.

We have set the size of each bag to 100 percent of the training set size. The fast decision tree learner is used as the base classifier without any restriction on the maximum tree depth. The minimum total weight of the instances in a leaf is set to 2, the minimum proportion of the variance on all the data that needs to be present at a node for splitting to be performed in regression trees is set to 0.001, and the number of folds is set to 3. Ten iterations are performed with the seed equal to 1.
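Bootstrap sampling and voting can be sketched in a few lines. To keep the example self-contained, the base learner is a deliberately trivial stand-in (it predicts the most common label in its bag) rather than the fast decision tree learner used in the experiments; all function names and the toy data are assumptions for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items uniformly with replacement (bag size = 100%)."""
    return [rng.choice(data) for _ in data]

def majority_label(sample):
    """Stand-in base learner: always predicts the most common label in its bag."""
    return Counter(label for _, label in sample).most_common(1)[0][0]

def bagged_prediction(data, iterations=10, seed=1):
    """Fit one stand-in model per bootstrap sample and combine them by voting."""
    rng = random.Random(seed)
    votes = [majority_label(bootstrap_sample(data, rng)) for _ in range(iterations)]
    return Counter(votes).most_common(1)[0][0], votes

# Toy data: (genotype, class label) pairs.
data = [("CT", 1), ("CC", 1), ("CT", 0), ("CT", 0), ("CT", 1)]
prediction, votes = bagged_prediction(data)
print(prediction, votes)
```

Because each bag is drawn with replacement, individual models see slightly different data, and the vote smooths out their disagreements.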

4.1.3.9 Multilayer Perceptron (MLP)

MLP is an artificial neural network (ANN) based model that maps inputs onto outputs. An MLP is a directed graph with multiple layers of nodes in which each layer is fully connected to the next layer. The network is trained via backpropagation, which is a supervised learning technique. It can even distinguish instances that are not linearly separable.

We have used MLP with the autobuild option to add and connect up hidden layers in the network. The learning rate is set to 0.3 and the momentum to 0.2, with normalized attributes and normalized numeric classes. The reset option is set to true. The number of training epochs is set to 500 and the validation threshold to 20.

All of these classifiers belong to different classifier families differentiated by their algorithmic nature. We have used 10 fold and 15 fold cross validation and a 66% split for the performance analysis of our scheme and the classifiers.
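The k fold cross validation used for the performance analysis partitions the instances into k folds and uses each fold once for testing. A minimal sketch of the index bookkeeping is given below; the function name `k_fold_indices` is hypothetical, and the sketch neither shuffles nor stratifies the data, which an actual evaluation would normally do. The instance count of 517 is the sum of the 249 cases and 268 controls stated in Section 4.1.1.

```python
def k_fold_indices(n_instances, k):
    """Yield (train_indices, test_indices) for k roughly equal, contiguous folds."""
    # The first (n_instances % k) folds receive one extra instance.
    fold_sizes = [n_instances // k + (1 if i < n_instances % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_instances))
        yield train, test
        start += size

folds = list(k_fold_indices(517, 10))
print([len(test) for _, test in folds])
```

Every instance appears in exactly one test fold, so each classifier is evaluated on all instances without ever being tested on data it was trained on in that fold.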

4.2 Results and Discussion

The results of the genetic data classification using different classifiers are summarized in Table 4.4. The measures used to compare the selected classifiers are accuracy, sensitivity and specificity, obtained using 10 fold cross validation, 15 fold cross validation and a 66% split. For 10 fold cross validation, MLP gave the best accuracy of 88.16%. MLP produced almost the same results under 10 fold cross validation, 15 fold cross validation and the 66% split. IBk gave the worst accuracy (82.46%) for 10 fold cross validation on the selected SNPs data. For 15 fold cross validation the best accuracy is produced by AdaBoostM1 (88.01%). MLP produced the best accuracy of 87.8% for the 66% split test. Naïve Bayes produced the overall lowest accuracy of 80.61% when the selected SNPs data was tested with the 66% split.

MLP, AdaBoostM1 and classification via regression gave the best results for SNPs data. IBk and Naïve Bayes did not perform very well for the classification of the data. Figures 4.2, 4.3 and 4.4 present the comparison of accuracy, specificity and sensitivity respectively.
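For reference, the three measures can be computed from the counts of a binary confusion matrix as sketched below, with stroke cases treated as the positive class. The function name and the counts in the example are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) in percent.

    tp/fn: stroke cases classified correctly/incorrectly.
    tn/fp: controls classified correctly/incorrectly.
    """
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    sensitivity = 100.0 * tp / (tp + fn)   # share of cases that are detected
    specificity = 100.0 * tn / (tn + fp)   # share of controls that are cleared
    return accuracy, sensitivity, specificity

# Hypothetical counts for illustration only.
acc, sen, spe = classification_metrics(tp=90, tn=80, fp=20, fn=10)
print(acc, sen, spe)
```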


Table 4.4: Classification results for genetic data using different classifiers

                             Cross Validation                            Split Test
                             10 Fold              15 Fold                66% split
Classifier                   ACC    SPE    SEN    ACC    SPE    SEN      ACC    SPE    SEN
BayesNet                     84.46  82.54  87.06  84.38  82.54  86.89    82.14  78.57  87.71
Naïve Bayes                  83.42  82.16  85.14  83.42  82.41  84.79    80.61  77.5   85.47
IBk                          82.46  80.62  84.97  82.9   80.74  85.84    82.57  79.28  87.71
AdaBoostM1                   87.42  88.58  85.84  88.01  88.19  87.76    86.49  85.71  87.71
ClassificationViaRegression  87.27  91.01  82.17  87.79  91.4   82.87    85.62  89.64  79.32
J48                          84.6   91.01  75.87  83.94  90     75.7     82.57  88.57  73.18
RandomForest                 84.97  88.58  80.07  84.75  89.21  78.67    83.22  85.36  79.89
Bagging                      84.97  90.12  77.97  85.34  90.5   78.32    83.22  87.86  75.98
MultilayerPerceptron         88.16  89.34  86.54  87.93  88.32  87.41    87.8   88.21  87.15

ACC = Accuracy, SPE = Specificity, SEN = Sensitivity


Figure 4.2 compares the accuracies. MLP gives the best accuracy overall (88.16%), obtained with 10 fold cross validation on SNP data. AdaBoostM1 and classification via regression also produce close to best results, i.e. 88.01% and 87.79% respectively, using 15 fold cross validation. Compared with 10 fold cross validation, 15 fold cross validation gives equal or better accuracy for Naïve Bayes, IBk, AdaBoostM1, classification via regression and Bagging, and slightly lower accuracy for the remaining classifiers.

Figure 4.3 shows that classification via regression achieves the highest specificity of 91.4% using 15 fold cross validation on the SNPs data. The lowest specificity (77.5%) is produced by Naïve Bayes on SNP data using the 66% split test. The 66% split test produced the lowest specificity for every classifier, whereas both 10 fold and 15 fold cross validation provided high specificity.

Figure 4.4 compares the percentage sensitivities of the different classifiers. AdaBoostM1 produces the highest sensitivity of 87.76% for the SNPs data using 15 fold cross validation. BayesNet, IBk and AdaBoostM1 are very close to the best, each with a sensitivity of 87.71% using the 66% split test.


Figure 4.2: Comparison of % Accuracy of Classifiers using Genetic Data

[Bar chart "Accuracies of Classifiers". X-axis: Classifiers. Y-axis: Percentage Accuracy (76 to 90). Series: 10-Fold CV, 15-Fold CV, 66% Split.]


Figure 4.3: Comparison of % Specificity of Classifiers using Genetic Data

[Bar chart "Specificities of Classifiers". X-axis: Classifiers. Y-axis: Percentage Specificity (70 to 95). Series: 10-Fold CV, 15-Fold CV, 66% Split.]


Figure 4.4: Comparison of % Sensitivity of Classifiers using Genetic Data

[Bar chart "Sensitivities of Classifiers". X-axis: Classifiers. Y-axis: Percentage Sensitivity (65 to 90). Series: 10-Fold CV, 15-Fold CV, 66% Split.]


4.3 Conclusion

Genotype data is analyzed for ischemic stroke risk estimation in this chapter. The data for the SNPs known to be associated with ischemic stroke risk is arranged for classification. Different classification models are used to analyze and classify the data. The highest accuracy is achieved using MLP. Research is still going on to uncover the genes and SNPs responsible for ischemic strokes. Stroke is a complex disease which can occur due to any or all of multiple risk factors. A risk allele found in one population might not be present in another population and thus not be responsible for ischemic stroke in that population. Despite all these considerations, our proposed approach has shown an accuracy of 88.16%.


CHAPTER NO. 5

ANALYSIS & DISCUSSION-

CORRELATING PHYLOGENETIC

TREES


Analysis & Discussion- Correlating Phylogenetic

Trees

A phylogenetic tree is a branching diagram that is traditionally used to represent the evolutionary relations among different species and organisms. Similarities and dissimilarities in physical traits are used to build such trees. A more practical approach nowadays is to generate these trees from gene or protein sequences. In such trees the similarity and dissimilarity are calculated from the differences in the gene or protein sequences. These trees serve as a tool to reconstruct the evolutionary linkage between groups of organisms and also to estimate the time of divergence between them. The number of changes can be estimated using the phylogenetic trees.

A phylogenetic tree is also known as a dendrogram. Trees can be rooted or unrooted. Rooted phylogenetic trees are also known as cladograms. These trees are built for objects that have descended from a common ancestor, which is placed at the root. The paths from the root to the nodes represent the evolutionary time from the ancestor to the object. An unrooted tree, also known as a phenogram, is a tree where the objects are known to be related but the ancestor is not confirmed or known. The path between such nodes does not tell about the evolutionary time involved between the organisms or the objects.

Two approaches to building trees are discussed below:

1. Traditional Approach

The traditional approach characterizes organisms either through their morphology or through the fossil record of the impressions left by an organism. Morphological features are simply the physical characteristics of an organism, e.g. shape of beak, feathers, tail, number of legs, etc. These features are helpful when the organisms under discussion are not extinct. Where using the morphological features is not feasible, the fossil records are used. The age of the fossilized remains is estimated from the age of the rock surrounding the fossil imprints.

2. Modern Approach

Protein, messenger RNA (mRNA), ribosomal RNA (rRNA), genome sequence data etc. are used to characterize different organisms. This data is used to construct phylogenetic trees. Proteins are large molecules consisting of one or more long chains of amino acid residues. They are responsible for most of the functions in organisms, including DNA replication, response to stimuli, transport of molecules and catalysis of metabolic reactions. mRNA molecules carry genetic information from the DNA to the ribosome. rRNA is the RNA part of the ribosome. It is needed for protein synthesis in all organisms.

5.1 Phylogenetic Tree Construction Methods

The most commonly used methods for phylogenetic tree construction are either distance based or character based. Distance based methods include the Unweighted Pair Group Method using Arithmetic Averages (UPGMA) [174], Neighbor Joining (NJ) [175] and the Fitch and Margoliash algorithm [68]. Character based methods include the Maximum Parsimony (MP) [176, 177] and Maximum Likelihood (ML) [178, 179] methods. The most commonly used methods are discussed here.

5.1.1 Unweighted Pair Group Method using Arithmetic Averages (UPGMA)

UPGMA builds the tree bottom up. At each step the two nearest clusters are combined into a higher level cluster, which forms a new node on the tree. The tree is built on the assumption that all the leaf nodes are equidistant from the root. At each level the length of a branch is determined by the difference in the heights of the nodes at each end of the branch. The equidistance assumption holds only under the molecular clock hypothesis, i.e. a constant rate of evolution, so UPGMA can give misleading trees when rates differ between lineages. It produces a rooted tree that reflects the pairwise similarity or dissimilarity matrix.
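The clustering procedure described above can be sketched compactly. In this illustration the function name `upgma`, the `frozenset`-keyed distance dictionary and the four-leaf example are assumptions chosen for brevity; each merged node is placed at a height of half the distance between the clusters it joins, and distances to a merged cluster are averaged with weights equal to the cluster sizes.

```python
from itertools import combinations

def upgma(names, dist):
    """Build a rooted tree from pairwise distances.

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    Returns nested tuples (left_subtree, right_subtree, node_height).
    """
    size = {n: 1 for n in names}   # cluster -> number of leaves it contains
    d = dict(dist)
    active = list(names)
    while len(active) > 1:
        # Pick the two nearest clusters and merge them.
        a, b = min(combinations(active, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b, d[frozenset((a, b))] / 2.0)
        for c in active:
            if c != a and c != b:
                # Size-weighted mean equals the unweighted mean over all leaf pairs.
                d[frozenset((merged, c))] = (
                    size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[merged] = size[a] + size[b]
        active = [c for c in active if c != a and c != b] + [merged]
    return active[0]

pairs = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
         ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
dist = {frozenset(k): v for k, v in pairs.items()}
tree = upgma(["A", "B", "C", "D"], dist)
print(tree)
```

The example distances are ultrametric, so every leaf ends up at the same distance from the root, which is exactly the situation in which the constant-rate assumption is satisfied.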

5.1.2 Neighbor Joining (NJ)

NJ works like UPGMA, with the difference that a new distance matrix is calculated at each iteration. The distance matrix is calculated using the Hamming distance between each pair of nodes. This method does not require the distances to be strictly additive and still produces a tree when they are not. This method produces an unrooted tree. If a common ancestor has to be assigned to the tree, an outgroup has to be selected.
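As a small illustration, the sketch below computes Hamming distances between aligned sequences and then selects the first pair to join using the standard neighbor-joining criterion Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k). Only this first selection step is shown, not the full algorithm; the function names and the four short sequences are made up for the example.

```python
from itertools import combinations

def hamming(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))

def nj_pair_to_join(seqs):
    """Return the pair of sequence names that minimises the neighbor-joining Q criterion."""
    names = list(seqs)
    n = len(names)
    d = {(i, j): hamming(seqs[i], seqs[j]) for i in names for j in names}
    row_sum = {i: sum(d[i, k] for k in names) for i in names}
    q = {(i, j): (n - 2) * d[i, j] - row_sum[i] - row_sum[j]
         for i, j in combinations(names, 2)}
    return min(q, key=q.get)

# Made-up aligned sequences for illustration.
seqs = {"A": "AAAA", "B": "AAAT", "C": "GGCC", "D": "GGCA"}
print(hamming(seqs["A"], seqs["D"]), nj_pair_to_join(seqs))
```

Unlike UPGMA, the Q criterion corrects each pairwise distance by how far the two nodes are from all others, which is why the method does not rely on a constant rate of evolution.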

5.1.3 Maximum Parsimony (MP)

MP searches through all possible tree structures and assigns a cost to each tree. The most parsimonious tree is the one that requires the fewest changes to explain the aligned data. This tree is taken to explain the evolutionary pattern of the objects analyzed.
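The cost assigned to a single candidate tree can be computed with Fitch's algorithm, which counts the minimum number of changes needed at each aligned site. The sketch below scores two fixed four-leaf topologies; it does not perform the search over all trees. The function names and the toy alignment are illustrative assumptions.

```python
def fitch_cost(tree, states):
    """Minimum number of changes for one aligned site on a fixed binary tree.

    tree: a leaf name (str) or a (left, right) tuple.
    states: dict mapping leaf name to the character observed at this site.
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            return common, left_cost + right_cost
        # No shared state: one change is required at this node.
        return left_set | right_set, left_cost + right_cost + 1
    return visit(tree)[1]

def parsimony_score(tree, alignment):
    """Sum the per-site Fitch costs over all columns of the alignment."""
    length = len(next(iter(alignment.values())))
    return sum(fitch_cost(tree, {name: seq[i] for name, seq in alignment.items()})
               for i in range(length))

alignment = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CAT"}
score_ab_cd = parsimony_score((("A", "B"), ("C", "D")), alignment)
score_ac_bd = parsimony_score((("A", "C"), ("B", "D")), alignment)
print(score_ab_cd, score_ac_bd)
```

The topology that groups the identical sequences together needs fewer changes and would therefore be preferred under parsimony.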

5.1.4 Maximum Likelihood (ML)

ML is a computationally intensive approach that optimizes the likelihood of the observed data given a tree and a nucleotide evolution model. It is based on probability theory. It tries to find the tree that has the highest probability under a specific evolution model. This method has the disadvantage that it relies on the assumption that the evolution model is accurate and correct. If it is provided with a false model, the resultant tree will not be consistent. It can generate multiple trees having the same probabilities.


5.2 Software and Tools Available for Phylogenetic Tree Construction

A number of software tools are available for the construction of phylogenetic trees. Some of the well-known tools and the methods they implement are given in Table 5.1.

Table 5.1: Phylogenetic tools and their implementation methods

Tool Description

Method

Distance Character

NJ UPGMA MP ML

ClustalW2

[180]

It is used for multiple sequence alignment and

phylogenetic tree construction √ √

MEGA [181] It is an integrated tool for sequence alignment

and phylogenetic tree construction and analysis √ √

PAUP [182] It is a program for inferring phylogenetic trees

on the basis of parsimony. √

PAUP*

[183]

It is the version 4 onwards program for PAUP.

It has an additional support for distance matrix

and likelihood based methods for phylogenetic

tree construction.

√ √ √ √

PHYLIP

[184]

It is a computational package for constructing

and analyzing the phylogenetic trees. √ √ √ √

BioNumerics

[185]

It is a software package for analysis with a wide range

of Bioinformatics applications. One of its

applications is the inference of phylogenetics.

√ √ √ √

NJ= Neighbor-Joining, MP= Maximum Parsimony, ML= Maximum Likelihood, UPGMA= Unweighted Pair

Group Method using Arithmetic Averages.


5.3 Bioinformatics Databanks

There are many bioinformatics databanks. Some of the famous databanks include:

1. European Molecular Biology Laboratory (EMBL) [186]

It is a nucleotide sequence resource that contains submissions of DNA and RNA sequences by different researchers.

2. Database of Genomic Variants (DGV) [187]

This database contains control data from studies that investigate the genomic

variation associations with phenotype data.

3. European Genome Phenome Archive (EGA) [188]

It has genotype data from various case controls, population and family studies.

4. dbSNP - NCBI [189]

This database consists of Single Nucleotide Polymorphisms and their relation with

heritable phenotypes.

5. The SNP Consortium Ltd [190]

This site has Single nucleotide polymorphisms (SNPs) which are common DNA

sequence variations among individuals and have great significance for biomedical

research.

6. HGBASE [191]

HGBASE summarizes all known sequence variations in the human genome. It facilitates research on the effects of genotypes on common diseases, drug responses and other complex phenotypes.

7. HAPMAP [192]

The International HapMap Project is a partnership of scientists and funding agencies from Canada, China, Japan, Nigeria, the United Kingdom and the United States to develop a public resource that will help researchers find genes associated with human disease and response to pharmaceuticals.

8. 1000 Genomes [193]

It is a catalog of human genetic variations.

9. ICGC [194]

ICGC has a comprehensive description of genomic, transcriptomic and epigenomic

changes in 50 different tumor types and/or subtypes. This data is of clinical and

societal importance all over the world.

10. COSMIC [195]

COSMIC is designed to store and display somatic mutation information and related

details. It contains information relating to human cancers.

11. HGMD [196]

The Human Gene Mutation Database (HGMD) keeps record of published gene

lesions responsible for human inherited disease.

12. OMIM [197]

Online Mendelian Inheritance in Man (OMIM) is a catalog of human genes and

genetic disorders. The database contains textual information, pictures, and reference

information.

13. GeneTests [198]

GeneTests is a medical genetics information resource developed for physicians,

other healthcare providers, and researchers.

14. Genomic Variants [187]

The Database of Genomic Variants is a catalog of control data for studies aiming to

correlate genomic variation with phenotypic data.

15. Mitelman Database of Chromosome Aberrations in Cancer [199]

The Mitelman Database of Chromosome Aberrations in Cancer is a repository that relates chromosomal aberrations to tumor characteristics. The relations are based either on individual cases or associations.

16. Genetic Association Database [200]

The Genetic Association Database is an archive of human genetic association

studies of complex diseases and disorders.

5.4 Materials and Methods

5.4.1 Data

The genotype data of 1417 individuals is downloaded from the publicly accessible HapMap database (HapMap Genome Browser release #28, The International HapMap Project, available at http://hapmap.ncbi.nlm.nih.gov; genotype files at ftp://ftp.ncbi.nlm.nih.gov/hapmap/genotypes/2010-08_phaseII+III/ (Jul 12, 2014)). The raw genotype files from the different populations were merged to generate a combined genotype file. The details of the population data are given in Table 5.2. There are a total of 11 populations.

Table 5.2: Details of selected HapMap population data

Population                                                          Number of
Name   Detail                                                       Individuals
ASW    African ancestry in Southwest USA                            87
CEU    Utah residents with Northern and Western European ancestry   174
CHB    Han Chinese in Beijing, China                                139
CHD    Chinese in Metropolitan Denver, Colorado                     109
GIH    Gujarati Indians in Houston, Texas                           101
JPT    Japanese in Tokyo, Japan                                     116
LWK    Luhya in Webuye, Kenya                                       110
MEX    Mexican ancestry in Los Angeles, California                  86
MKK    Maasai in Kinyawa, Kenya                                     184
TSI    Toscans in Italy                                             102
YRI    Yoruban in Ibadan, Nigeria                                   209
Total                                                               1417

5.4.2 Method

A meta-analysis has been conducted to identify the risk of different SNPs for ischemic

stroke. The risk associated with the SNPs, genes and genotype or haplotype values are given

in Table 5.3.


Table 5.3: Stroke associated genes/ locus, SNP id/ haplotype and corresponding allele risk

Gene/ Locus   SNP id                                Genotype/ Haplotype   Allele Risk
MTHFR         rs1801133                             TT                    4
              rs1801131                             CC                    1
              rs12121543, rs13306553, rs9651118,    C-T-T-T-G-A           1
              rs1801133, rs2274976, rs1801131
F5            rs6025                                AA                    9
                                                    AG                    2.7
PRKCH         rs2230500                             AA                    1.4
                                                    AG                    1.4
              rs2246700                             AA                    1
                                                    AT                    1
              rs3783799                             AA                    1.4
                                                    AG                    1.4
              rs12587610                            AG                    1
                                                    GG                    1
              rs3825655                             CC                    1
                                                    CT                    1
PROC          rs1401296                             CT                    1
CYP11B2       rs1799998                             CT                    1
                                                    TT                    1
APOE- E4      rs7412                                CC                    1
                                                    CT                    1
              rs429358                              CT                    1.4
Near NINJ2    rs12425791                            AA                    1.4
                                                    AG                    1.4
              rs11833579                            AA                    1.4
                                                    AG                    1.4
9p21.3        rs1333049                             CC                    1.15
                                                    CG                    1.15

Data for the SNPs given in Table 5.3 are extracted from the genotype data of 1417 samples from the different populations. The allele frequencies of the SNPs calculated for each population are given in Table 5.4. Frequency values are given for each population against each SNP allele. A value of zero (‘0’) indicates that the particular allele value does not occur in that population. In contrast, a dash (‘-’) indicates that the population under discussion does not have that particular SNP.


Table 5.4: Allele frequencies for the SNPs from sample population data

Gene/ Locus   SNP id        Genotype/      Populations
                            Haplotype      ASW  CEU  CHB  CHD  GIH  JPT  LWK  MEX  MKK  TSI  YRI
MTHFR         rs1801133     TT             5    14   34   13   2    17   0    16   1    26   1
              rs1801131     CC             1    17   3    8    16   2    5    4    16   13   4
              rs12121543,   C-T-T-T-G-A    15   51   37   12   3    17   -    15   1    26   0
              rs13306553,
              rs9651118,
              rs1801133,
              rs2274976,
              rs1801131
F5            rs6025        AA             -    1    1    0    0    0    0    0    -    0    0
                            AG             -    4    0    1    1    0    1    3    -    1    0
PRKCH         rs2230500     AA             -    0    0    0    -    0    0    -    -    0    0
                            AG             -    0    7    6    -    3    1    -    -    1    0
              rs2246700     AA             -    -    14   -    -    15   -    -    -    -    20
                            AT             -    -    23   -    -    26   -    -    -    -    36
              rs3783799     AA             -    -    1    -    -    3    -    -    -    -    0
                            AG             -    -    14   -    -    16   -    -    -    -    0
              rs12587610    AG             -    47   20   -    -    6    -    -    -    -    13
                            GG             -    5    13   -    -    29   -    -    -    -    75
              rs3825655     CC             -    67   24   -    -    15   -    -    -    -    76
                            CT             -    5    15   -    -    22   -    -    -    -    2
PROC          rs1401296     CT             -    40   26   -    -    25   -    -    -    -    29
CYP11B2       rs1799998     CT             26   85   62   57   49   54   39   40   46   55   53
                            TT             60   51   67   35   28   44   68   30   129  19   145
APOE- E4      rs7412        CC             68   -    109  92   93   101  95   75   154  90   160
                            CT             16   -    24   15   7    12   11   9    22   9    36
              rs429358      CT             -    -    0    0    -    1    -    -    -    -    2
Near NINJ2    rs12425791    AA             1    4    9    5    10   10   1    11   30   8    2
                            AG             16   50   43   42   38   54   13   43   3    38   34
              rs11833579    AA             5    8    17   10   11   14   4    17   6    10   8
                            AG             25   56   56   57   44   63   29   38   50   40   76
9p21.3        rs1333049     CC             2    36   30   28   20   33   6    16   10   23   6
                            CG             39   82   73   48   53   53   46   52   70   60   60


The frequencies are then used to calculate a weighted risk for each population. All frequencies are first converted into percentages using Equation 5.1, so that the populations and their risks can be compared on a common scale. The weighted risk for each allele in each population is then calculated using Equation 5.2. Finally, the aggregate weighted risk for each population is calculated using Equation 5.3. If a population does not have a particular SNP, the weighted average is calculated with the alleles of that SNP excluded.

$$\text{Percent SNP allele frequency} = \frac{\text{SNP allele frequency}}{\text{Population sample size}} \times 100 \tag{5.1}$$

$$\text{Weighted SNP allele risk} = \text{Percent SNP allele frequency} \times \text{Allele risk} \tag{5.2}$$

$$\text{Aggregate weighted risk} = \frac{\sum_{i=1}^{n} \text{Weighted SNP allele risk}_i}{\sum_{i=1}^{n} \text{Allele risk}_i} \tag{5.3}$$

where,

n = number of SNP alleles

The percent allele frequency for each SNP allele and the aggregate weighted risk for each population are given in Table 5.5.
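Equations 5.1 to 5.3 can be sketched in a few lines of code. This is an illustrative sketch only: the function names are hypothetical, the input numbers are made up rather than taken from the tables, and it assumes that an allele of a missing SNP (a dash, represented here as `None`) is dropped from both the numerator and the denominator of Equation 5.3.

```python
from typing import Optional, Sequence


def percent_frequency(count: float, sample_size: float) -> float:
    """Equation 5.1: a count expressed as a percentage of the sample size."""
    if sample_size <= 0:
        raise ValueError("sample size must be positive")
    return count / sample_size * 100.0


def aggregate_weighted_risk(percent_freqs: Sequence[Optional[float]],
                            allele_risks: Sequence[float]) -> float:
    """Equations 5.2 and 5.3 for one population.

    percent_freqs[i] is the percent frequency of allele i, or None when the
    population does not have that SNP. Assumption of this sketch: such
    alleles contribute to neither sum.
    """
    if len(percent_freqs) != len(allele_risks):
        raise ValueError("one risk value is required per allele")
    weighted_sum = 0.0
    risk_sum = 0.0
    for pct, risk in zip(percent_freqs, allele_risks):
        if pct is None:
            continue                      # SNP absent: exclude this allele
        weighted_sum += pct * risk        # Equation 5.2
        risk_sum += risk
    if risk_sum == 0.0:
        raise ValueError("no usable alleles for this population")
    return weighted_sum / risk_sum        # Equation 5.3


# Made-up example with three alleles, the second one missing:
# (50.0 * 1.0 + 10.0 * 4.0) / (1.0 + 4.0) = 18.0
example = aggregate_weighted_risk([50.0, None, 10.0], [1.0, 1.4, 4.0])
```

Dividing by the sum of the allele risks makes the result a risk-weighted mean of the percent frequencies, which is why populations with different numbers of available SNPs remain comparable.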


Table 5.5: Percent allele frequencies for the SNPs from sample population data

Gene/Locus  SNP id      Genotype/Haplotype  Risk  ASW    CEU    CHB    CHD    GIH    JPT    LWK    MEX    MKK    TSI    YRI
MTHFR       rs1801133   TT                  4     5.74   8.54   24.82  11.93  1.98   15.04  0      18.6   0.54   25.74  0.49
            rs1801131   CC                  1     1.15   10.3   2.19   7.34   15.84  1.77   4.55   4.65   8.7    12.75  1.97
            rs12121543  C-T-T-T-G-A         1     17.24  29.31  26.62  11.01  2.97   14.66  -      17.44  0.54   25.49  0
            rs13306553
            rs9651118
            rs1801133
            rs2274976
            rs1801131
F5          rs6025      AA                  9     -      0.61   0.73   0.00   0      0      0      0      -      0      0
                        AG                  2.7   -      2.42   0.00   0.92   0.99   0      0.92   3.49   -      0.98   0
PRKCH       rs2230500   AA                  1.4   -      0      0.00   0.00   -      0      0      -      -      0      0
                        AG                  1.4   -      0      5.22   5.50   -      2.68   0.92   -      -      0.98   0
            rs2246700   AA                  1     -      -      31.11  -      -      33.33  -      -      -      -      22.22
                        AT                  1     -      -      51.11  -      -      57.78  -      -      -      -      40
            rs3783799   AA                  1.4   -      -      2.22   -      -      6.67   -      -      -      -      0
                        AG                  1.4   -      -      31.11  -      -      35.56  -      -      -      -      0
            rs12587610  AG                  1     -      52.81  44.44  -      -      13.33  -      -      -      -      14.61
                        GG                  1     -      5.62   28.89  -      -      64.44  -      -      -      -      84.27
            rs3825655   CC                  1     -      93.06  60.00  -      -      37.5   -      -      -      -      97.44
                        CT                  1     -      6.94   37.50  -      -      55     -      -      -      -      2.56
PROC        rs1401296   CT                  1     -      44.44  57.78  -      -      56.82  -      -      -      -      32.22
CYP11B2     rs1799998   CT                  1     30.23  52.15  46.27  53.27  50.52  49.54  35.78  46.51  25     56.12  26.37
                        TT                  1     69.77  31.29  50.00  32.71  28.87  40.37  62.39  34.88  70.11  19.39  72.14
APOE-E4     rs7412      CC                  1     79.07  -      81.95  85.98  92.08  89.38  89.62  89.29  87.5   90.91  81.22
                        CT                  1     18.6   -      18.05  14.02  6.93   10.62  10.38  10.71  12.5   9.09   18.27
            rs429358    CT                  1.4   -      -      0.00   -      -      2.27   -      -      -      -      2.22
Near NINJ2  rs12425791  AA                  1.4   1.15   2.42   6.57   4.59   9.9    8.85   0.94   12.79  16.3   7.84   0.99
                        AG                  1.4   18.39  30.3   31.39  38.53  37.62  47.79  12.26  50     1.63   37.25  16.75
            rs11833579  AA                  1.4   5.81   4.88   12.69  9.35   11.11  12.61  3.77   20.24  3.3    9.8    3.94
                        AG                  1.4   29.07  34.15  41.79  53.27  44.44  56.76  27.36  45.24  27.47  39.22  37.44
9p21.3      rs1333049   CC                  1.15  2.3    21.82  21.90  25.69  19.8   29.2   5.45   18.6   5.43   22.55  2.96
                        CG                  1.15  44.83  49.7   53.28  44.04  52.48  46.9   44.82  60.47  38.04  58.82  29.56
Aggregate Weighted Risk                           20.64  15.67  20.91  15.14  14.7   20.98  10.07  18.84  18.14  16.85  14.19


The aggregate weighted risk is used to calculate a distance matrix. The absolute difference in aggregate weighted risk is calculated for every pair of populations using Equation 5.4, and the resulting distance matrix is shown in Table 5.6.

$$d_{a,b} = \left| \text{Aggregate weighted risk}_a - \text{Aggregate weighted risk}_b \right| \tag{5.4}$$

where,

$d_{a,b}$ is a cell in the distance matrix.

a and b are any two populations from the set of 11 populations.
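Equation 5.4 is simple enough to check directly. The sketch below builds the full pairwise matrix from the aggregate weighted risks listed in the last row of Table 5.5. The function name `distance_matrix` and the rounding to two decimals are choices made for this illustration.

```python
from typing import Dict, List

# Aggregate weighted risk per population, last row of Table 5.5.
AGGREGATE_RISK: Dict[str, float] = {
    "ASW": 20.64, "CEU": 15.67, "CHB": 20.91, "CHD": 15.14, "GIH": 14.70,
    "JPT": 20.98, "LWK": 10.07, "MEX": 18.84, "MKK": 18.14, "TSI": 16.85,
    "YRI": 14.19,
}


def distance_matrix(risk: Dict[str, float]) -> List[List[float]]:
    """Equation 5.4: absolute difference of aggregate risks for every pair."""
    names = list(risk)
    return [[round(abs(risk[a] - risk[b]), 2) for b in names] for a in names]


names = list(AGGREGATE_RISK)
d = distance_matrix(AGGREGATE_RISK)

# Look up one cell by population name, e.g. the ASW/CEU distance.
asw_ceu = d[names.index("ASW")][names.index("CEU")]
```

Because each population is represented by a single number, the matrix is symmetric with a zero diagonal, and all distances lie along one axis of risk.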

The calculated distance matrix is used to create a phylogenetic tree for the populations (Figure 5.1). The tree is generated in MATLAB 7.14 using the neighbor-joining method.
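For readers unfamiliar with neighbor joining, the sketch below shows only its first step: computing the Q matrix, Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k), and selecting the pair with the smallest Q value as the first two nodes to join. It is not the MATLAB code used for Figure 5.1. The function names are hypothetical and the five-node matrix is a small textbook-style example, not data from this thesis.

```python
from typing import List, Tuple


def nj_q_matrix(d: List[List[float]]) -> List[List[float]]:
    """Neighbor-joining Q matrix for a symmetric distance matrix d."""
    n = len(d)
    row_sums = [sum(row) for row in d]
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
    return q


def first_pair_to_join(d: List[List[float]]) -> Tuple[int, int]:
    """Indices (i, j), i < j, of the pair with the smallest Q value."""
    q = nj_q_matrix(d)
    n = len(d)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return min(pairs, key=lambda p: q[p[0]][p[1]])


# Toy distance matrix for five nodes a, b, c, d, e (illustrative only).
toy = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair = first_pair_to_join(toy)   # a and b are joined first: Q(a, b) = -50
```

A complete implementation would replace the joined pair with a new internal node, recompute the distances to that node, and repeat until the tree is fully resolved. Note that the pair chosen is not necessarily the pair with the smallest raw distance, because Q also accounts for how far each node is from all the others.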


Table 5.6: Distance matrix for all populations

Distance  ASW    CEU    CHB    CHD    GIH    JPT    LWK    MEX    MKK    TSI    YRI
ASW              4.97   0.27   5.5    5.94   0.34   10.57  1.8    2.5    3.79   6.45
CEU       4.97          5.24   0.53   0.97   5.31   5.6    3.17   2.47   1.18   1.48
CHB       0.27   5.24          5.77   6.21   0.07   10.84  2.07   2.77   4.06   6.72
CHD       5.5    0.53   5.77          0.44   5.84   5.07   3.7    3      1.71   0.95
GIH       5.94   0.97   6.21   0.44          6.28   4.63   4.14   3.44   2.15   0.51
JPT       0.34   5.31   0.07   5.84   6.28          10.91  2.14   2.84   4.13   6.79
LWK       10.57  5.6    10.84  5.07   4.63   10.91         8.77   8.07   6.78   4.12
MEX       1.8    3.17   2.07   3.7    4.14   2.14   8.77          0.7    1.99   4.65
MKK       2.5    2.47   2.77   3      3.44   2.84   8.07   0.7           1.29   3.95
TSI       3.79   1.18   4.06   1.71   2.15   4.13   6.78   1.99   1.29          2.66
YRI       6.45   1.48   6.72   0.95   0.51   6.79   4.12   4.65   3.95   2.66


Figure 5.1: Phylogenetic tree using our distance matrix


5.5 Results and Discussion

The JPT population has the highest aggregate weighted risk (20.98), while LWK has the lowest (10.07). Figure 5.2 shows the percent SNP allele frequencies of all 11 populations; the frequencies of the SNP alleles differ from population to population. ASW and MKK are the populations least well described by the genetic data and the SNPs chosen for analysis, whereas JPT, CHD and CHB are the best described.

Combined allele frequencies of all populations are shown for each SNP in Figure 5.3. The figure shows that data for rs7412 (APOE-E4) are present in most of the populations. The AA allele of rs2230500 is not present in any of the populations. The AA allele of rs6025 is present in CEU and CHB only. The CT allele of rs429358 is present only in JPT and YRI.

The phylogenetic tree generated from the calculated distance matrix (Figure 5.1) shows that MEX, MKK and TSI are closely related with respect to the distance based on aggregate weighted risk. Similarly, CHB, JPT and ASW form a second group of closely related populations, and CHD, GIH, YRI and CEU form a third. LWK is distant from all other populations.

Another phylogenetic tree (Figure 5.4) is constructed from the FST distances calculated by Altshuler et al. [201]. The phylogenetic relationships observed in this second tree differ from those in Figure 5.1. The first group of closely related populations comprises CEU, TSI, MEX and GIH. The second group comprises CHB, CHD and JPT. The third group comprises LWK, YRI, ASW and MKK.
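The disagreement between the two groupings can also be expressed as a number. The sketch below uses the Rand index, the fraction of population pairs that are treated the same way (together or apart) by both groupings. This measure is not used in the thesis and is introduced here purely for illustration. The function names are hypothetical, and the groups are those read from Figures 5.1 and 5.4 as described above.

```python
from itertools import combinations
from typing import Dict, List


def labels(groups: List[List[str]]) -> Dict[str, int]:
    """Map each population to the index of the group that contains it."""
    return {pop: idx for idx, group in enumerate(groups) for pop in group}


def rand_index(groups_a: List[List[str]], groups_b: List[List[str]]) -> float:
    """Fraction of pairs grouped consistently (both together or both apart)."""
    la, lb = labels(groups_a), labels(groups_b)
    if set(la) != set(lb):
        raise ValueError("both groupings must cover the same populations")
    pairs = list(combinations(sorted(la), 2))
    agree = sum((la[x] == la[y]) == (lb[x] == lb[y]) for x, y in pairs)
    return agree / len(pairs)


# Groups reported for the risk-based tree (Figure 5.1); LWK stands alone.
risk_tree_groups = [["MEX", "MKK", "TSI"], ["CHB", "JPT", "ASW"],
                    ["CHD", "GIH", "YRI", "CEU"], ["LWK"]]
# Groups reported for the FST-based tree (Figure 5.4).
fst_tree_groups = [["CEU", "TSI", "MEX", "GIH"], ["CHB", "CHD", "JPT"],
                   ["LWK", "YRI", "ASW", "MKK"]]

score = rand_index(risk_tree_groups, fst_tree_groups)
```

A score of 1 would mean the two trees group the populations identically; lower values indicate more pairs that are grouped together in one tree but not in the other.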


Figure 5.2: Comparison of % allele frequency of all sample populations

[Chart: populations (ASW, CEU, CHB, CHD, GIH, JPT, LWK, MEX, MKK, TSI, YRI) against percent allele frequency (axis 0 to 120), with one series per SNP allele.]


Figure 5.3: Combined allele frequencies of all sample populations

[Chart: combined allele frequency (axis 0 to 1000) for each SNP allele, with one series per population (ASW, CEU, CHB, CHD, GIH, JPT, LWK, MEX, MKK, TSI, YRI).]


Figure 5.4: Phylogenetic tree constructed using the FST matrix calculated by Altshuler et al. [201]


One reason for the difference between the two trees is that populations may be genetically close without being close with respect to ischemic stroke risk. Another reason is that general genetic distances are computed from a very large number of SNPs and a large amount of genetic data, whereas the ischemic stroke risk estimate uses only a few selected SNPs. This indicates that genetic closeness of populations does not imply that their risk for a given disease is the same.

5.6 Conclusion

This research addresses the correlation of phylogenetic trees with ischemic stroke risk. Two phylogenetic trees are generated: one based on a genetic distance matrix and the other based on an ischemic stroke risk difference matrix. A detailed comparison of the two matrices, in the form of phylogenetic trees, is presented in this chapter. The trees show different relationships among the populations, which indicates that populations may be genetically close and yet differ in disease risk. The reason for this difference is that genetic distances are calculated using data from all genes, while the ischemic stroke risk difference matrix is calculated using selected SNPs only.


CHAPTER NO. 6

CONCLUSION & FUTURE RECOMMENDATIONS


Conclusion and Future Recommendations

6.1 Conclusion

The real-time analysis of genetic data and ultrasound imaging provides a quick means to analyze the data qualitatively and to draw meaningful interpretations. This in turn helps in analyzing the risk factors and in considering preventive measures for stroke. The trees can be used to represent multi-dimensional data sets and to assess the genetic distance of a population based on ischemic stroke risk.

Multi-dimensional feature sets (image and genetic features) are used for the early diagnosis and assessment of stroke risk, which supports the correct identification of individuals at high risk of ischemic stroke. This research is an effort to improve and contribute to the visual assessment procedure conducted by medical personnel. It facilitates not only medical personnel but also the community at large by elucidating the risk factors.

The proposed approach contributes to the area of Computer Aided Diagnostics and Preventive Studies. The research facilitates meaningful interpretation of genetic and image-based data, which ultimately helps in the critical analysis of the risk factors and the preventive measures for ischemic strokes.

6.2 Future Recommendations

This research can be enhanced in future by incorporating the texture features of plaque into the feature set already used, which may strengthen the risk estimation. In addition, multiple CA imaging modalities such as CTA, MRA, CAG and DSA can be used to improve the accuracy of the classification algorithms.


References

1 World Health Organization. Top 10 causes of death: Fact sheet no. 310. WHO. (2011).

2 Naghavi, M., Wang, H., Lozano, R., Davis, A., Liang, X., Zhou, M., Vollset, S. E. V., Abbasoglu Ozgoren, A., Norman, R. E., and Vos, T. Global, regional, and national age–sex specific all-cause and cause-specific mortality for 240 causes of death, 1990–2013: A systematic analysis for the global burden of disease study 2013. The Lancet, 385(9963): 117-171. (2015).

3 Association, A. H. A. A. S. Types of stroke. Retrieved 12/02, 2014, from http://www.strokeassociation.org/STROKEORG/AboutStroke/TypesofStroke/Types-of-Stroke_UCM_308531_SubHomePage.jsp

4 Cerebrovascular accident - national library of medicine - pubmed health. (2012), 2012,

from http://www.ncbi.nlm.nih.gov/pubmedhealth/PMH0001740/

5 Deep venous thrombosis: Medlineplus medical encyclopedia. Retrieved 20/04, 2015,

from http://www.nlm.nih.gov/medlineplus/ency/article/000156.htm

6 Stroke.org. (2014). What is stroke? Retrieved 10/12/2014, 2015, from

http://www.stroke.org/site/PageServer?pagename=TYPE

7 Goldschmidt-Clermont, P. J., Dong, C., Seo, D. M., and Velazquez, O. C.

Atherosclerosis, inflammation, genetics, and stem cells: 2012 update. Current

atherosclerosis reports, 14(3): 201-210. (2012).

8 National Heart Lung and Blood Institute, N. What is atherosclerosis? Retrieved 10/12/2014, 2014, from http://www.nhlbi.nih.gov/health/health-topics/topics/atherosclerosis/

9 Bots, M. L., Hoes, A. W., Koudstaal, P. J., Hofman, A., and Grobbee, D. E. Common

carotid intima-media thickness and risk of stroke and myocardial infarction the rotterdam


study. Circulation, 96(5): 1432-1437. (1997).

10 Simons, L. A., McCallum, J., Friedlander, Y., and Simons, J. Risk factors for ischemic

stroke dubbo study of the elderly. Stroke, 29(7): 1341-1346. (1998).

11 Touboul, P.-J., Labreuche, J., Vicaut, E., and Amarenco, P. Carotid intima-media

thickness, plaques, and framingham risk score as independent determinants of stroke

risk. Stroke, 36(8): 1741-1745. (2005).

12 Johnson, H., Turke, T. L., Grossklaus, M., Dall, T., Carimi, S., Koenig, L. M.,

Aeschlimann, S. E., Korcarz, C. E., and Stein, J. H. Detection of subclinical

atherosclerosis by office-based carotid ultrasound increases prescription of preventive

therapies. Journal of the American College of Cardiology, 55(10): A58. E557. (2010).

13 Faita, F., Gemignani, V., Bianchini, E., Giannarelli, C., Ghiadoni, L., and Demi, M. Real-

time measurement system for evaluation of the carotid intima-media thickness with a

robust edge operator. Journal of Ultrasound in Medicine, 27(9): 1353-1361. (2008).

14 Gariepy, J., Massonneau, M., Levenson, J., Heudes, D., and Simon, A. Evidence for in

vivo carotid and femoral wall thickening in human hypertension. Groupe de prévention

cardio-vasculaire en médecine du travail. Hypertension, 22(1): 111-118. (1993).

15 Liguori, C., Paolillo, A., and Pietrosanto, A. An automatic measurement system for the

evaluation of carotid intima-media thickness. Instrumentation and Measurement, IEEE

Transactions on, 50(6): 1684-1691. (2001).

16 Pignoli, P., and Longo, T. Evaluation of atherosclerosis with b-mode ultrasound imaging.

The Journal of nuclear medicine and allied sciences, 32(3): 166-173. (1987).

17 Rocha, R., Campilho, A., Silva, J., Azevedo, E., and Santos, R. Segmentation of the

carotid intima-media region in b-mode ultrasound images. Image and Vision Computing,

28(4): 614-625. (2010).

18 Selzer, R. H., Hodis, H. N., Kwong-Fu, H., Mack, W. J., Lee, P. L., Liu, C.-r., and Liu,

C.-h. Evaluation of computerized edge tracking for quantifying intima-media thickness


of the common carotid artery from b-mode ultrasound images. Atherosclerosis, 111(1):

1-11. (1994).

19 Stein, J. H., Korcarz, C. E., Mays, M. E., Douglas, P. S., Palta, M., Zhang, H., LeCaire,

T., Paine, D., Gustafson, D., and Fan, L. A semiautomated ultrasound border detection

program that facilitates clinical measurement of ultrasound carotid intima-media

thickness. Journal of the American Society of Echocardiography, 18(3): 244-251.

(2005).

20 Touboul, P.-J., Prati, P., Scarabin, P.-Y., Adrai, V., Thibout, E., and Ducimetière, P. Use of

monitoring software to improve the measurement of carotid wall thickness by b-mode

imaging. Journal of hypertension, 10: S37-S42. (1992).

21 Bastida-Jumilla, M. C., Menchón-Lara, R. M., Morales-Sánchez, J., Verdú-Monedero,

R., Larrey-Ruiz, J., and Sancho-Gómez, J. L. Segmentation of the common carotid artery

walls based on a frequency implementation of active contours. Journal of digital

imaging, 26(1): 129-139. (2013).

22 Delsanto, S., Molinari, F., Giustetto, P., Liboni, W., Badalamenti, S., and Suri, J. S.

Characterization of a completely user-independent algorithm for carotid artery

segmentation in 2-d ultrasound images. Instrumentation and Measurement, IEEE

Transactions on, 56(4): 1265-1274. (2007).

23 Golemati, S., Stoitsis, J., Sifakis, E. G., Balkizas, T., and Nikita, K. S. Using the hough

transform to segment ultrasound images of longitudinal and transverse sections of the

carotid artery. Ultrasound in medicine & biology, 33(12): 1918-1932. (2007).

24 Molinari, F., Zeng, G., and Suri, J. S. Intima-media thickness: Setting a standard for a

completely automated method of ultrasound measurement. Ultrasonics, Ferroelectrics,

and Frequency Control, IEEE Transactions on, 57(5): 1112-1124. (2010).

25 Petroudi, S., Loizou, C., Pantziaris, M., Pattichis, M., and Pattichis, C. 2011. A fully

automated method using active contours for the evaluation of the intima-media thickness

in carotid us images. Engineering in Medicine and Biology Society, EMBC, 2011 Annual


International Conference of the IEEE. p. 8053-8057

26 Mahmoud, A., Morsy, A., and de Groot, E. 2010. A new gradient-based algorithm for

edge detection in ultrasonic carotid artery images. Engineering in Medicine and Biology

Society (EMBC), 2010 Annual International Conference of the IEEE. p. 5165-5168

27 Gustavsson, T., Abu-Gharbieh, R., Hamarneh, G., and Liang, Q. 1997. Implementation

and comparison of four different boundary detection algorithms for quantitative

ultrasonic measurements of the human carotid artery. Computers in Cardiology 1997. p.

69-72

28 Holdfeldt, P., Viberg, M., and Gustavsson, T. 2008. A new method based on dynamic

programming for boundary detection in ultrasound image sequences. Engineering in

Medicine and Biology Society, 2008. EMBS 2008. 30th Annual International Conference

of the IEEE. p. 3072-3074

29 Liang, Q., Wendelhag, I., Wikstrand, J., and Gustavsson, T. A multiscale dynamic

programming procedure for boundary detection in ultrasonic artery images. Medical

Imaging, IEEE Transactions on, 19(2): 127-142. (2000).

30 Cheng, D.-C., and Jiang, X. Detections of arterial wall in sonographic artery images

using dual dynamic programming. Information Technology in Biomedicine, IEEE

Transactions on, 12(6): 792-799. (2008).

31 Cheng, D.-c., Schmidt-Trucksass, A., Cheng, K.-s., Sandrock, M., Pu, Q., and Burkhardt,

H. 1999. Automatic detection of the intimal and the adventitial layers of the common

carotid artery wall in ultrasound b-mode images using snakes. Image Analysis and

Processing, 1999. Proceedings. International Conference on. p. 452-457

32 Cheng, D.-c., Schmidt-Trucksäss, A., Cheng, K.-s., and Burkhardt, H. Using snakes to

detect the intimal and adventitial layers of the common carotid artery wall in sonographic

images. Computer methods and programs in biomedicine, 67(1): 27-37. (2002).

33 Loizou, C. P., Pattichis, C. S., Pantziaris, M., Tyllis, T., and Nicolaides, A. Snakes based

segmentation of the common carotid artery intima media. Medical & biological


engineering & computing, 45(1): 35-49. (2007).

34 Molinari, F., Meiburger, K. M., Saba, L., Zeng, G., Acharya, U. R., Ledda, M.,

Nicolaides, A., and Suri, J. S. Fully automated dual-snake formulation for carotid intima-

media thickness measurement a new approach. Journal of Ultrasound in Medicine,

31(7): 1123-1136. (2012).

35 Xu, X., Zhou, Y., Cheng, X., Song, E., and Li, G. Ultrasound intima–media segmentation

using hough transform and dual snake model. Computerized Medical Imaging and

Graphics, 36(3): 248-258. (2012).

36 Bastida-Jumilla, M., Morales-Sánchez, J., Verdú-Monedero, R., Larrey-Ruiz, J., and

Sancho-Gómez, J.-L. 2010. Detection of the intima and media walls of the carotid artery

with geodesic active contours. Image Processing (ICIP), 2010 17th IEEE International

Conference on. p. 2213-2216

37 Moursi, S. G., and El-Sakka, M. R. 2008. Active contours initialization for ultrasound

carotid artery images. Computer Systems and Applications, 2008. AICCSA 2008.

IEEE/ACS International Conference on. p. 629-636

38 Meiburger, K. M., Molinari, F., Zeng, G., Saba, L., and Suri, J. S. 2011. Carotid

automated ultrasound double line extraction system (cadles) via edge-flow. Engineering

in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the

IEEE. p. 575-578

39 Molinari, F., Liboni, W., Giustetto, P., Badalamenti, S., and Suri, J. S. Automatic

computer-based tracings (act) in longitudinal 2-d ultrasound images using different

scanners. Journal of Mechanics in Medicine and Biology, 9(04): 481-505. (2009).

40 Delsanto, S., Molinari, F., Liboni, W., Giustetto, P., Badalamenti, S., and Suri, J. S. 2006.

User-independent plaque characterization and accurate imt measurement of carotid artery

wall using ultrasound. Engineering in Medicine and Biology Society, 2006. EMBS'06.

28th Annual International Conference of the IEEE. p. 2404-2407

41 Molinari, F., Delsanto, S., Giustetto, P., Liboni, W., Badalamenti, S., and Suri, J. User-


independent plaque segmentation and accurate intima-media thickness measurement of

carotid artery wall using ultrasound. (2008).

42 Destrempes, F., Meunier, J., Giroux, M.-F., Soulez, G., and Cloutier, G. Segmentation in

ultrasonic b-mode images of healthy carotid arteries using mixtures of nakagami

distributions and stochastic optimization. Medical Imaging, IEEE Transactions on,

28(2): 215-229. (2009).

43 Destrempes, F., Soulez, G., Giroux, M.-F., Meunier, J., and Cloutier, G. 2009.

Segmentation of plaques in sequences of ultrasonic b-mode images of carotid arteries

based on motion estimation and nakagami distributions. Ultrasonics Symposium (IUS),

2009 IEEE International. p. 2480-2483

44 Golemati, S., Stoitsis, J., Balkizas, T., and Nikita, K. 2006. Comparison of b-mode, m-

mode and hough transform methods for measurement of arterial diastolic and systolic

diameters. Engineering in Medicine and Biology Society, 2005. IEEE-EMBS 2005. 27th

Annual International Conference of the. p. 1758-1761

45 Golemati, S., Tegos, T. J., Sassano, A., Nikita, K. S., and Nicolaides, A. N. Echogenicity

of b-mode sonographic images of the carotid artery work in progress. Journal of

Ultrasound in Medicine, 23(5): 659-669. (2004).

46 Stoitsis, J., Golemati, S., Kendros, S., and Nikita, K. 2008. Automated detection of the

carotid artery wall in b-mode ultrasound images using active contours initialized by the

hough transform. Engineering in Medicine and Biology Society, 2008. EMBS 2008. 30th

Annual International Conference of the IEEE. p. 3146-3149

47 Molinari, F., Zeng, G., and Suri, J. S. An integrated approach to computer-based

automated tracing and its validation for 200 common carotid arterial wall ultrasound

images a new technique. Journal of Ultrasound in Medicine, 29(3): 399-418. (2010).

48 Delsanto, S., Molinari, F., Giustetto, P., Liboni, W., and Badalamenti, S. 2006. Culex-

completely user-independent layers extraction: Ultrasonic carotid artery images

segmentation. Engineering in Medicine and Biology Society, 2005. IEEE-EMBS 2005.


27th Annual International Conference of the. p. 6468-6471

49 Molinari, F., Pattichis, C. S., Zeng, G., Saba, L., Acharya, U. R., Sanfilippo, R.,

Nicolaides, A., and Suri, J. S. Completely automated multiresolution edge snapper—a

new technique for an accurate carotid ultrasound imt measurement: Clinical validation

and benchmarking on a multi-institutional database. Image Processing, IEEE

Transactions on, 21(3): 1211-1222. (2012).

50 Molinari, F., Zeng, G., and Suri, J. S. 2010. Inter-greedy technique for fusion of different

carotid segmentation boundaries leading to high-performance imt measurement.

Engineering in Medicine and Biology Society (EMBC), 2010 Annual International

Conference of the IEEE. p. 4769-4772

51 Molinari, F., Meiburger, K. M., Acharya, U. R., Zeng, G., Rodrigues, P., Saba, L.,

Nicolaides, A., and Suri, J. 2011. Cares 3.0: A two stage system combining feature-based

recognition and edge-based segmentation for cimt measurement on a multi-institutional

ultrasound database of 300 images. Engineering in Medicine and Biology Society,

EMBC, 2011 Annual International Conference of the IEEE. p. 5149-5152

52 Mao, F., Gill, J., Downey, D., and Fenster, A. Segmentation of carotid artery in

ultrasound images: Method development and evaluation technique. Medical Physics,

27(8): 1961-1970. (2000).

53 Abolmaesumi, P., Sirouspour, M. R., and Salcudean, S. 2000. Real-time extraction of

carotid artery contours from ultrasound images. Computer-Based Medical Systems,

2000. CBMS 2000. Proceedings. 13th IEEE Symposium on. p. 181-186

54 Gill, J., Ladak, H., Steinman, D., and Fenster, A. 2000. Segmentation of ulcerated

plaque: A semi-automatic method for tracking the progression of carotid atherosclerosis.

Engineering in Medicine and Biology Society, 2000. Proceedings of the 22nd Annual

International Conference of the IEEE. p. 669-672

55 Cohen, L. D. On active contour models and balloons. CVGIP: Image understanding,

53(2): 211-218. (1991).


56 Hamou, A. K., and El-Sakka, M. R. 2004. A novel segmentation technique for carotid

ultrasound images. Acoustics, Speech, and Signal Processing, 2004.

Proceedings.(ICASSP'04). IEEE International Conference on. p. iii-521-524 vol. 523

57 Abdel-Dayem, A. R., and El-Sakka, M. 2004. A novel morphological-based carotid artery contour extraction. Electrical and Computer Engineering, 2004. Canadian Conference on. p. 1873-1876

58 Weszka, J. S., and Rosenfeld, A. A comparative study of texture measures for terrain

classification. NASA STI/Recon Technical Report N, 76: 13470. (1975).

59 Amadasun, M., and King, R. Textural features corresponding to textural properties.

Systems, Man and Cybernetics, IEEE Transactions on, 19(5): 1264-1274. (1989).

60 Laws, K. I. 1980. Rapid texture identification. 24th Annual Technical Symposium. p.

376-381

61 Wu, C.-M., Chen, Y.-C., and Hsieh, K.-S. Texture features for classification of ultrasonic

liver images. Medical Imaging, IEEE Transactions on, 11(2): 141-152. (1992).

62 Mandelbrot, B. B. The fractal geometry of nature/revised and enlarged edition. New

York, WH Freeman and Co., 1983, 495 p., 1. (1983).

63 Robinson, D., and Foulds, L. R. Comparison of phylogenetic trees. Mathematical

Biosciences, 53(1): 131-147. (1981).

64 Floßmann, E., Schulz, U. G., and Rothwell, P. M. Systematic review of methods and

results of studies of the genetic epidemiology of ischemic stroke. Stroke, 35(1): 212-227.

(2004).

65 Jerrard-Dunne, P., Cloud, G., Hassan, A., and Markus, H. S. Evaluating the genetic

component of ischemic stroke subtypes a family history study. Stroke, 34(6): 1364-1369.

(2003).

66 Matarin, M., Brown, W. M., Singleton, A., Hardy, J. A., and Meschia, J. F. Whole


genome analyses suggest ischemic stroke and heart disease share an association with

polymorphisms on chromosome 9p21. Stroke, 39(5): 1586-1589. (2008).

67 Li, J., Lang, J., Zeng, Z., and McCullough, L. D. Akt1 gene deletion and stroke. Journal

of the neurological sciences, 269(1): 105-112. (2008).

68 Fitch, W. M., and Margoliash, E. Construction of phylogenetic trees. Science,

155(3760): 279-284. (1967).

69 Sacco, R. L. Risk factors and outcomes for ischemic stroke. Neurology, 45(2 Suppl 1):

S10-14. (1995).

70 Tang, J., and Moret, B. M. Scaling up accurate phylogenetic reconstruction from gene-

order data. Bioinformatics, 19(suppl 1): i305-i312. (2003).

71 Nei, M., Tajima, F., and Tateno, Y. Accuracy of estimated phylogenetic trees from

molecular data. Journal of Molecular Evolution, 19(2): 153-170. (1983).

72 Rambaut, A., and Grassly, N. C. Seq-gen: An application for the Monte Carlo simulation of

DNA sequence evolution along phylogenetic trees. Computer applications in the

biosciences: CABIOS, 13(3): 235-238. (1997).

73 Parry-Smith, D., and Attwood, T. K. Somap: A novel interactive approach to multiple

protein sequences alignment. Computer applications in the biosciences: CABIOS, 7(2):

233-235. (1991).

74 Milinkovitch, M. C., LeDuc, R. G., Adachi, J., Farnir, F., Georges, M., and Hasegawa,

M. Effects of character weighting and species sampling on phylogeny reconstruction: A

case study based on DNA sequence data in cetaceans. Genetics, 144(4): 1817. (1996).

75 Baxevanis, A. D. The importance of biological databases in biological discovery.

Current Protocols in Bioinformatics: 1.1.1-1.1.6. (2006).

76 Sigaud, O., and Wilson, S. W. Learning classifier systems: A survey. Soft Computing,

11(11): 1065-1078. (2007).

77 Kotsiantis, S. B., Zaharakis, I., and Pintelas, P. Supervised machine learning: A review

of classification techniques. (2007).

78 Zhu, X. Semi-supervised learning literature survey. (2005).

79 Tsiaparas, N., Golemati, S., Andreadis, I., Stoitsis, J., Valavanis, I., and Nikita, K.

Assessment of carotid atherosclerosis from b-mode ultrasound images using directional

multiscale texture features. Measurement Science and Technology, 23(11): 114004.

(2012).

80 Stoitsis, J., Golemati, S., Nikita, K., and Nicolaides, A. 2004. Characterization of carotid

atherosclerosis based on motion and texture features and clustering using fuzzy c-means.

Engineering in Medicine and Biology Society, 2004. IEMBS'04. 26th Annual

International Conference of the IEEE. p. 1407-1410

81 Santhiyakumari, N., Rajendran, P., and Madheswaran, M. Medical decision-making

system of ultrasound carotid artery intima–media thickness using neural networks.

Journal of digital imaging, 24(6): 1112-1125. (2011).

82 Kemény, V., Droste, D. W., Hermes, S., Nabavi, D. G., Schulte-Altedorneburg, G.,

Siebler, M., and Ringelstein, E. B. Automatic embolus detection by a neural network.

Stroke, 30(4): 807-810. (1999).

83 Kyriacou, E., Pattichis, M. S., Pattichis, C. S., Mavrommatis, A., Christodoulou, C. I.,

Kakkos, S., and Nicolaides, A. Classification of atherosclerotic carotid plaques using

morphological analysis on ultrasound images. Applied Intelligence, 30(1): 3-23. (2009).

84 Christodoulou, C. I., Pattichis, C. S., Pantziaris, M., and Nicolaides, A. Texture-based

classification of atherosclerotic carotid plaques. Medical Imaging, IEEE Transactions

on, 22(7): 902-912. (2003).

85 Kyriacou, E., Pattichis, C. S., Pattichis, M. S., Mavrommatis, A., Panagiotou, S.,

Christodoulou, C. I., Kakkos, S., and Nicolaides, A. Classification of atherosclerotic

carotid plaques using gray level morphological analysis on ultrasound images. Artificial

intelligence applications and innovations (pp. 737-744). Springer. (2006).

86 Kyriacou, E., Pattichis, C., Christodoulou, C., Karaolis, M., and Nicolaides, A. 2005. An

integrated teleconsultation system for the evaluation of the risk of stroke. Proceedings of

the 5th International Network Conference INC2005. p.

87 Kyriacou, E. C., Pattichis, C. S., Karaolis, M. A., Loizou, C. P., Christodoulou, C. I.,

Pattichis, M. S., Kakkos, S., and Nicolaides, A. An integrated system for assessing stroke

risk. Engineering in Medicine and Biology Magazine, IEEE, 26(5): 43-50. (2007).

88 Abdolmaleki, P., Mokhtari-Dizaji, M., Montazeri, M., Saberi, H., and Nikanjam, N.

2005. Applying the logistic regression model to predict the carotid artery stenosis using

the sequential color doppler ultrasound image processing. Proceedings of Third Iranian

Conference on Machine Vision and Image Processing and Applications (MVIP), Tehran,

Iran. p. 312-317

89 Lambrou, A., Papadopoulos, H., Kyriacou, E., Pattichis, C. S., Pattichis, M. S.,

Gammerman, A., and Nicolaides, A. Evaluation of the risk of stroke with confidence

predictions based on ultrasound carotid image analysis. International Journal on

Artificial Intelligence Tools, 21(04). (2012).

90 Christodoulou, C., Pattichis, C., Kyriacou, E., and Nicolaides, A. Image retrieval and

classification of carotid plaque ultrasound images. The Open Cardiovascular Imaging

Journal, 2: 18-28. (2010).

91 French-Sherry, E., Bassiouny, H., Pooley, T., Marcoux, M., Castilla, M., Watson, D., and

Lozanski, L. Assessing stroke risk with carotid duplex ultrasound scanning. Journal

of Critical Illness, 13: 448-460. (1998).

92 Perez, I. Appropriateness of carotid duplex ultrasound in acute stroke management. PhD,

King's College, London. (1999).

93 Chalela, J. A., Kidwell, C. S., Nentwich, L. M., Luby, M., Butman, J. A., Demchuk, A.

M., Hill, M. D., Patronas, N., Latour, L., and Warach, S. Magnetic resonance imaging

and computed tomography in emergency assessment of patients with suspected acute

stroke: A prospective comparison. Lancet, 369(9558): 293-298. (2007).

94 Amin-Hanjani, S., Du, X., Zhao, M., Walsh, K., Malisch, T. W., and Charbel, F. T. Use of

quantitative magnetic resonance angiography to stratify stroke risk in symptomatic

vertebrobasilar disease. Stroke, 36(6): 1140-1145. (2005).

95 Lee, L. J., Kidwell, C. S., Alger, J., Starkman, S., and Saver, J. L. Impact on stroke

subtype diagnosis of early diffusion-weighted magnetic resonance imaging and magnetic

resonance angiography. Stroke, 31(5): 1081-1089. (2000).

96 Zanette, E. M., Fieschi, C., Bozzao, L., Roberti, C., Toni, D., Argentino, C., and Lenzi,

G. L. Comparison of cerebral angiography and transcranial doppler sonography in acute

stroke. Stroke, 20(7): 899-903. (1989).

97 Hassan, A. E., Rostambeigi, N., Chaudhry, S. A., Khan, A. A., Zacharatos, H., Khatri, R.,

Uzun, G., and Qureshi, A. I. Combination of noninvasive neurovascular imaging

modalities in stroke patients: Patterns of use and impact on need for digital subtraction

angiography. Journal of Stroke and Cerebrovascular Diseases, 22(7): e53-e58. (2013).

98 Nighoghossian, N., Derex, L., and Douek, P. The vulnerable carotid artery plaque:

Current imaging methods and new perspectives. Stroke, 36(12): 2764-2772. (2005).

99 Lorenz, M. W., Markus, H. S., Bots, M. L., Rosvall, M., and Sitzer, M. Prediction of

clinical cardiovascular events with carotid intima-media thickness: A systematic review

and meta-analysis. Circulation, 115(4): 459-467. (2007).

100 van der Meer, I. M., Bots, M. L., Hofman, A., del Sol, A. I., van der Kuip, D. A., and

Witteman, J. C. Predictive value of noninvasive measures of atherosclerosis for incident

myocardial infarction: The Rotterdam study. Circulation, 109(9): 1089-1094. (2004).

101 E-health laboratory, computer science department. (2015) Retrieved 26 March 2015,

from http://www.medinfo.cs.ucy.ac.cy/

102 Aja-Fernández, S., and Alberola-López, C. On the estimation of the coefficient of

variation for anisotropic diffusion speckle filtering. Image Processing, IEEE

Transactions on, 15(9): 2694-2701. (2006).

103 Maisonobe, L. Finding the circle that best fits a set of points. October 25th. (2007).

104 Rocha, R., Campilho, A., Silva, J., Azevedo, E., and Santos, R. Segmentation of

ultrasound images of the carotid using ransac and cubic splines. Computer methods and

programs in biomedicine, 101(1): 94-106. (2011).

105 Xu, C., and Prince, J. L. Snakes, shapes, and gradient vector flow. Image Processing,

IEEE Transactions on, 7(3): 359-369. (1998).

106 Bland, J. M., and Altman, D. Statistical methods for assessing agreement between two

methods of clinical measurement. The lancet, 327(8476): 307-310. (1986).

107 Loizou, C. P., Pattichis, C. S., Nicolaides, A. N., and Pantziaris, M. Manual and

automated media and intima thickness measurements of the common carotid artery.

Ultrasonics, Ferroelectrics, and Frequency Control, IEEE Transactions on, 56(5): 983-

994. (2009).

108 Molinari, F., Acharya, R. U., Zeng, G., Meiburger, K. M., Rodrigues, P. S., Saba, L., and

Suri, J. S. Cares 2.0: Completely automated robust edge snapper for cimt measurement in

300 ultrasound images—a two stage paradigm. Journal of Medical Imaging and Health

Informatics, 1(2): 150-163. (2011).

109 Menchón-Lara, R.-M., Bastida-Jumilla, M.-C., Morales-Sánchez, J., and Sancho-Gómez,

J.-L. Automatic detection of the intima-media thickness in ultrasound images of the

common carotid artery using neural networks. Medical & biological engineering &

computing, 52(2): 169-181. (2014).

110 Bush, W. S., and Moore, J. H. Genome-wide association studies. PLoS computational

biology, 8(12): e1002822. (2012).

111 Hirschhorn, J. N., and Daly, M. J. Genome-wide association studies for common diseases

and complex traits. Nature Reviews Genetics, 6(2): 95-108. (2005).

112 Traylor, M., Farrall, M., Holliday, E. G., Sudlow, C., Hopewell, J. C., Cheng, Y.-C.,

Fornage, M., Ikram, M. A., Malik, R., and Bevan, S. Genetic risk factors for ischaemic

stroke and its subtypes (the metastroke collaboration): A meta-analysis of genome-wide

association studies. The Lancet Neurology, 11(11): 951-962. (2012).

113 Kato, N., Takeuchi, F., Tabara, Y., Kelly, T. N., Go, M. J., Sim, X., Tay, W. T., Chen, C.-

H., Zhang, Y., and Yamamoto, K. Meta-analysis of genome-wide association studies

identifies common variants associated with blood pressure variation in east asians.

Nature genetics, 43(6): 531-538. (2011).

114 Maurano, M. T., Humbert, R., Rynes, E., Thurman, R. E., Haugen, E., Wang, H.,

Reynolds, A. P., Sandstrom, R., Qu, H., and Brody, J. Systematic localization of common

disease-associated variation in regulatory DNA. Science, 337(6099): 1190-1195. (2012).

115 Schaub, M. A., Boyle, A. P., Kundaje, A., Batzoglou, S., and Snyder, M. Linking disease

associations with regulatory information in the human genome. Genome research, 22(9):

1748-1759. (2012).

116 Jeck, W. R., Siebold, A. P., and Sharpless, N. E. Review: A meta‐analysis of gwas and

age‐associated diseases. Aging cell, 11(5): 727-731. (2012).

117 Campo, M., Randhawa, A. K., Dunstan, S., Farrar, J., Caws, M., Bang, N. D., Lan, N. N.,

Chau, T. T. H., Horne, D. J., and Thuong, N. T. Common polymorphisms in the cd43

gene region are associated with tuberculosis disease and mortality. American journal of

respiratory cell and molecular biology. (2014).

118 Steinthorsdottir, V., Thorleifsson, G., Sulem, P., Helgason, H., Grarup, N., Sigurdsson,

A., Helgadottir, H. T., Johannsdottir, H., Magnusson, O. T., and Gudjonsson, S. A.

Identification of low-frequency and rare sequence variants associated with elevated or

reduced risk of type 2 diabetes. Nature genetics, 46(3): 294-298. (2014).

119 Black, M. H., Wu, J., Takayanagi, M., Wang, N., Taylor, K. D., Haritunians, T., Trigo, E.,

Lawrence, J. M., Watanabe, R. M., and Buchanan, T. A. Variation in pparg is associated

with longitudinal change in insulin resistance in mexican americans at risk for type 2

diabetes. The Journal of Clinical Endocrinology & Metabolism. (2015).

120 Dayeh, T. A., Olsson, A. H., Volkov, P., Almgren, P., Rönn, T., and Ling, C. Identification

of cpg-snps associated with type 2 diabetes and differential DNA methylation in human

pancreatic islets. Diabetologia, 56(5): 1036-1046. (2013).

121 de Miguel-Yanes, J. M., Shrader, P., Pencina, M. J., Fox, C. S., Manning, A. K., Grant, R.

W., Dupuis, J., Florez, J. C., D'Agostino, R. B., and Cupples, L. A. Genetic risk

reclassification for type 2 diabetes by age below or above 50 years using 40 type 2

diabetes risk single nucleotide polymorphisms. Diabetes Care, 34(1): 121-125. (2011).

122 Michailidou, K., Hall, P., Gonzalez-Neira, A., Ghoussaini, M., Dennis, J., Milne, R. L.,

Schmidt, M. K., Chang-Claude, J., Bojesen, S. E., and Bolla, M. K. Large-scale

genotyping identifies 41 new loci associated with breast cancer risk. Nature genetics,

45(4): 353-361. (2013).

123 Cowper-Sal, R., Zhang, X., Wright, J. B., Bailey, S. D., Cole, M. D., Eeckhoute, J.,

Moore, J. H., and Lupien, M. Breast cancer risk-associated snps modulate the affinity of

chromatin for foxa1 and alter gene expression. Nature genetics, 44(11): 1191-1198.

(2012).

124 Wu, C., Miao, X., Huang, L., Che, X., Jiang, G., Yu, D., Yang, X., Cao, G., Hu, Z., and

Zhou, Y. Genome-wide association study identifies five loci associated with

susceptibility to pancreatic cancer in chinese populations. Nature genetics, 44(1): 62-66.

(2012).

125 Spurdle, A. B., Thompson, D. J., Ahmed, S., Ferguson, K., Healey, C. S., O'Mara, T.,

Walker, L. C., Montgomery, S. B., Dermitzakis, E. T., and Fahey, P. Genome-wide

association study identifies a common variant associated with risk of endometrial cancer.

Nature genetics, 43(5): 451-454. (2011).

126 Hollingworth, P., Harold, D., Sims, R., Gerrish, A., Lambert, J.-C., Carrasquillo, M. M.,

Abraham, R., Hamshere, M. L., Pahwa, J. S., and Moskvina, V. Common variants at

abca7, ms4a6a/ms4a4e, epha1, cd33 and cd2ap are associated with alzheimer's disease.

Nature genetics, 43(5): 429-435. (2011).

127 Lambert, J.-C., Ibrahim-Verbaas, C. A., Harold, D., Naj, A. C., Sims, R., Bellenguez, C.,

Jun, G., DeStefano, A. L., Bis, J. C., and Beecham, G. W. Meta-analysis of 74,046

individuals identifies 11 new susceptibility loci for alzheimer's disease. Nature genetics,

45(12): 1452-1458. (2013).

128 Jonsson, T., Stefansson, H., Steinberg, S., Jonsdottir, I., Jonsson, P. V., Snaedal, J.,

Bjornsson, S., Huttenlocher, J., Levey, A. I., and Lah, J. J. Variant of trem2 associated

with the risk of alzheimer's disease. New England Journal of Medicine, 368(2): 107-116.

(2013).

129 Baron-Cohen, S., Murphy, L., Chakrabarti, B., Craig, I., Mallya, U., Lakatošová, S.,

Rehnstrom, K., Peltonen, L., Wheelwright, S., and Allison, C. A genome wide

association study of mathematical ability reveals an association at chromosome 3q29, a

locus associated with autism and learning difficulties: A preliminary study. PloS one,

9(5): e96374. (2014).

130 Gilman, S. R., Iossifov, I., Levy, D., Ronemus, M., Wigler, M., and Vitkup, D. Rare de

novo variants associated with autism implicate a large functional network of genes

involved in formation and function of synapses. Neuron, 70(5): 898-907. (2011).

131 Bevan, S., Traylor, M., Adib-Samii, P., Malik, R., Paul, N. L., Jackson, C., Farrall, M.,

Rothwell, P. M., Sudlow, C., and Dichgans, M. Genetic heritability of ischemic stroke

and the contribution of previously reported candidate gene and genomewide associations.

Stroke, 43(12): 3161-3167. (2012).

132 Holliday, E. G., Maguire, J. M., Evans, T.-J., Koblar, S. A., Jannes, J., Sturm, J. W.,

Hankey, G. J., Baker, R., Golledge, J., and Parsons, M. W. Common variants at 6p21.1

are associated with large artery atherosclerotic stroke. Nature genetics, 44(10): 1147-

1151. (2012).

133 Yadav, S., Hasan, N., Marjot, T., Khan, M. S., Prasad, K., Bentley, P., and Sharma, P.

Detailed analysis of gene polymorphisms associated with ischemic stroke in south asians.

PloS one, 8(3): e57305. (2013).

134 Della-Morte, D., Guadagni, F., Palmirotta, R., Testa, G., Caso, V., Paciaroni, M., Abete,

P., Rengo, F., Ferroni, P., and Sacco, R. L. Genetics of ischemic stroke, stroke-related

risk factors, stroke precursors and treatments. Pharmacogenomics, 13(5): 595-613.

(2012).

135 Leu, H.-B., Chung, C.-M., Chuang, S.-Y., Bai, C.-H., Chen, J.-R., Chen, J.-W., and Pan,

W.-H. Genetic variants of connexin37 are associated with carotid intima-medial

thickness and future onset of ischemic stroke. Atherosclerosis, 214(1): 101-106. (2011).

136 Milne, R., and Antoniou, A. Genetic modifiers of cancer risk for brca1 and brca2

mutation carriers. Annals of oncology, 22(suppl 1): i11-i17. (2011).

137 Webster, J. A., Gibbs, J. R., Clarke, J., Ray, M., Zhang, W., Holmans, P., Rohrer, K.,

Zhao, A., Marlowe, L., and Kaleem, M. Genetic control of human brain transcript

expression in alzheimer disease. The American Journal of Human Genetics, 84(4): 445-

458. (2009).

138 MacInnis, R. J., Antoniou, A. C., Eeles, R. A., Severi, G., Olama, A., Amin, A.,

McGuffog, L., Kote‐Jarai, Z., Guy, M., and O'Brien, L. T. A risk prediction algorithm

based on family history and common genetic variants: Application to prostate cancer

with potential clinical impact. Genetic epidemiology, 35(6): 549-556. (2011).

139 Sazci, A., Ergul, E., Tuncer, N., Akpinar, G., and Kara, I. Methylenetetrahydrofolate

reductase gene polymorphisms are associated with ischemic and hemorrhagic stroke:

Dual effect of mthfr polymorphisms c677t and a1298c. Brain research bulletin, 71(1):

45-50. (2006).

140 Li, P., and Qin, C. Methylenetetrahydrofolate reductase (mthfr) gene polymorphisms and

susceptibility to ischemic stroke: A meta-analysis. Gene, 535(2): 359-364. (2014).

141 Zhou, B.-S., Bu, G.-Y., Li, M., Chang, B.-G., and Zhou, Y.-P. Tagging snps in the mthfr

gene and risk of ischemic stroke in a chinese population. International journal of

molecular sciences, 15(5): 8931-8940. (2014).

142 Somarajan, B., Kalita, J., Mittal, B., and Misra, U. Evaluation of mthfr c677t

polymorphism in ischemic and hemorrhagic stroke patients. A case–control study in a

northern indian population. Journal of the neurological sciences, 304(1): 67-70. (2011).

143 Ridker, P. M., Hennekens, C. H., Lindpaintner, K., Stampfer, M. J., Eisenberg, P. R., and

Miletich, J. P. Mutation in the gene coding for coagulation factor v and the risk of

myocardial infarction, stroke, and venous thrombosis in apparently healthy men. New

England Journal of Medicine, 332(14): 912-917. (1995).

144 Lynch, J. K., Nelson, K. B., Curry, C. J., and Grether, J. K. Cerebrovascular disorders in

children with the factor v leiden mutation. Journal of child neurology, 16(10): 735-744.

(2001).

145 Chabrier, S., Husson, B., Dinomais, M., Landrieu, P., and Tich, S. N. T. New insights

(and new interrogations) in perinatal arterial ischemic stroke. Thrombosis research,

127(1): 13-22. (2011).

146 Wu, L., Shen, Y., Liu, X., Ma, X., Xi, B., Mi, J., Lindpaintner, K., Tan, X., and Wang, X.

The 1425g/a snp in prkch is associated with ischemic stroke and cerebral hemorrhage in

a chinese population. Stroke, 40(9): 2973-2976. (2009).

147 Kubo, M., Hata, J., Ninomiya, T., Matsuda, K., Yonemoto, K., Nakano, T., Matsushita,

T., Yamazaki, K., Ohnishi, Y., and Saito, S. A nonsynonymous snp in prkch (protein

kinase c η) increases the risk of cerebral infarction. Nature genetics, 39(2): 212-217.

(2007).

148 van der Bom, J. G., Bots, M. L., Haverkate, F., Slagboom, P. E., Meijer, P., de Jong, P. T.,

Hofman, A., Grobbee, D. E., and Kluft, C. Reduced response to activated protein c is

associated with increased risk for cerebrovascular disease. Annals of internal medicine,

125(4): 265-269. (1996).

149 Simioni, P., De Ronde, H., Prandoni, P., Saladini, M., Bertina, R., and Girolami, A.

Ischemic stroke in young patients with activated protein c resistance: A report of three

cases belonging to three different kindreds. Stroke, 26(5): 885-890. (1995).

150 Munshi, A., Sharma, V., Kaul, S., Rajeshwar, K., Babu, M. S., Shafi, G., Anila, A.,

Balakrishna, N., Alladi, S., and Jyothy, A. Association of the −344C/T aldosterone

synthase (cyp11b2) gene variant with hypertension and stroke. Journal of the

neurological sciences, 296(1): 34-38. (2010).

151 Saidi, S., Mahjoub, T., and Almawi, W. Y. Aldosterone synthase gene (cyp11b2)

promoter polymorphism as a risk factor for ischaemic stroke in tunisian arabs. Journal

of Renin-Angiotensin-Aldosterone System. (2010).

152 Kessler, C., Spitzer, C., Stauske, D., Mende, S., Stadlmüller, J., Walther, R., and Rettig,

R. The apolipoprotein e and β-fibrinogen g/a-455 gene polymorphisms are associated

with ischemic stroke involving large-vessel disease. Arteriosclerosis, thrombosis, and

vascular biology, 17(11): 2880-2884. (1997).

153 Peng, D.-Q., Zhao, S.-P., and Wang, J.-L. Lipoprotein (a) and apolipoprotein e ε4 as

independent risk factors for ischemic stroke. European Journal of Cardiovascular Risk,

6(1): 1-6. (1999).

154 Couderc, R., Mahieux, F., Bailleul, S., Fenelon, G., Mary, R., and Fermanian, J.

Prevalence of apolipoprotein e phenotypes in ischemic cerebrovascular disease. A case-

control study. Stroke, 24(5): 661-664. (1993).

155 Ikram, M. A., Seshadri, S., Bis, J. C., Fornage, M., DeStefano, A. L., Aulchenko, Y. S.,

Debette, S., Lumley, T., Folsom, A. R., and van den Herik, E. G. Genomewide

association studies of stroke. New England Journal of Medicine, 360(17): 1718-1728.

(2009).

156 Tong, Y., Zhang, Y., Zhang, R., Geng, Y., Lin, L., Wang, Z., Liu, J., Li, X., Cao, Z., and

Xu, J. Association between two key snps on chromosome 12p13 and ischemic stroke in

chinese han population. Pharmacogenetics and genomics, 21(9): 572-578. (2011).

157 Gschwendtner, A., Bevan, S., Cole, J. W., Plourde, A., Matarin, M., Ross‐Adams, H.,

Meitinger, T., Wichmann, E., Mitchell, B. D., and Furie, K. Sequence variants on

chromosome 9p21.3 confer risk for atherosclerotic stroke. Annals of neurology, 65(5):

531-539. (2009).

158 Anderson, C. D., Biffi, A., Rost, N. S., Cortellini, L., Furie, K. L., and Rosand, J.

Chromosome 9p21 in ischemic stroke: Population structure and meta-analysis. Stroke,

41(6): 1123-1131. (2010).

159 Smith, J. G., Melander, O., Lövkvist, H., Hedblad, B., Engström, G., Nilsson, P., Carlson,

J., Berglund, G., Norrving, B., and Lindgren, A. Common genetic variants on

chromosome 9p21 confers risk of ischemic stroke: A large-scale genetic association study.

Circulation: Cardiovascular Genetics, 2(2): 159-164. (2009).

160 Meschia, J. F., Brott, T. G., Brown, R. D., Crook, R. J., Frankel, M., Hardy, J., Merino, J.

G., Rich, S. S., Silliman, S., and Worrall, B. B. The ischemic stroke genetics study (isgs)

protocol. BMC neurology, 3(1): 4. (2003).

161 Heiss, G., Sharrett, A. R., Barnes, R., Chambless, L., Szklo, M., and Alzola, C. Carotid

atherosclerosis measured by b-mode ultrasound in populations: Associations with

cardiovascular risk factors in the aric study. American journal of epidemiology, 134(3):

250-256. (1991).

162 Hofman, A., Breteler, M. M., van Duijn, C. M., Krestin, G. P., Pols, H. A., Stricker, B. H.

C., Tiemeier, H., Uitterlinden, A. G., Vingerling, J. R., and Witteman, J. C. The rotterdam

study: Objectives and design update. European journal of epidemiology, 22(11): 819-

829. (2007).

163 Bellenguez, C., Bevan, S., Gschwendtner, A., Spencer, C., Burgess, A., Pirinen, M.,

Jackson, C., Traylor, M., Strange, A., and Su, Z. International stroke genetics consortium

(isgc); wellcome trust case control consortium 2 (wtccc2). Genome-wide association

study identifies a variant in hdac9 associated with large vessel ischemic stroke. Nat

Genet, 44(3): 328-333. (2012).

164 Goldszmidt, M. Bayesian network classifiers. Wiley Encyclopedia of Operations

Research and Management Science. (2010).

165 Rish, I. 2001. An empirical study of the naive bayes classifier. IJCAI 2001 workshop on

empirical methods in artificial intelligence. p. 41-46

166 Murphy, K. P. Naive bayes classifiers. University of British Columbia. (2006).

167 Ali, S., and Smith, K. A. On learning algorithm selection for classification. Applied Soft

Computing, 6(2): 119-138. (2006).

168 Freund, Y., and Schapire, R. E. 1996. Experiments with a new boosting algorithm.

ICML. p. 148-156

169 Thornton, C., Hutter, F., Hoos, H. H., and Leyton-Brown, K. 2013. Auto-weka:

Combined selection and hyperparameter optimization of classification algorithms.

Proceedings of the 19th ACM SIGKDD international conference on Knowledge

discovery and data mining. p. 847-855

170 Quinlan, J. R. 1996. Bagging, boosting, and C4.5. AAAI/IAAI, Vol. 1. p. 725-730

171 Breiman, L. Random forests. Machine learning, 45(1): 5-32. (2001).

172 Breiman, L. Bagging predictors. Machine learning, 24(2): 123-140. (1996).

173 Ruck, D. W., Rogers, S. K., Kabrisky, M., Oxley, M. E., and Suter, B. W. The multilayer

perceptron as an approximation to a bayes optimal discriminant function. Neural

Networks, IEEE Transactions on, 1(4): 296-298. (1990).

174 Sneath, P., and Sokal, R. Unweighted pair group method with arithmetic mean.

Numerical Taxonomy: 230-234. (1973).

175 Saitou, N., and Nei, M. The neighbor-joining method: A new method for reconstructing

phylogenetic trees. Molecular biology and evolution, 4(4): 406-425. (1987).

176 Eck, R., and Dayhoff, M. Atlas of protein sequence and structure. National Biomedical

Research Foundation, Silver Spring, Maryland. (1966).

177 Fitch, W. M. Toward defining the course of evolution: Minimum change for a specific

tree topology. Systematic Biology, 20(4): 406-416. (1971).

178 Felsenstein, J. Evolutionary trees from DNA sequences: A maximum likelihood

approach. Journal of Molecular Evolution, 17(6): 368-376. (1981).

179 Felsenstein, J. Evolutionary trees from gene frequencies and quantitative characters:

Finding maximum likelihood estimates. Evolution: 1229-1242. (1981).

180 Larkin, M. A., Blackshields, G., Brown, N., Chenna, R., McGettigan, P. A., McWilliam,

H., Valentin, F., Wallace, I. M., Wilm, A., and Lopez, R. Clustal w and clustal x version

2.0. Bioinformatics, 23(21): 2947-2948. (2007).

181 Tamura, K., Stecher, G., Peterson, D., Filipski, A., and Kumar, S. Mega6: Molecular

evolutionary genetics analysis version 6.0. Molecular biology and evolution, 30(12):

2725-2729. (2013).

182 Swofford, D. L. Phylogenetic analysis using parsimony. Champaign, IL: Illinois Natural

History Survey. (1993).

183 Swofford, D. L. Paup 4.0 for macintosh: Phylogenetic analysis using parsimony

(software and user's book for macintosh). Sinauer Associates, Incorporated. (2004).

184 Felsenstein, J. Phylogeny inference package. (2006).

185 Bionumerics seven. (2011) Retrieved 02/12/2014, from

http://www.applied-maths.com/bionumerics

186 Kanz, C., Aldebert, P., Althorpe, N., Baker, W., Baldwin, A., Bates, K., Browne, P., van

den Broek, A., Castro, M., and Cochrane, G. The embl nucleotide sequence database.

Nucleic Acids Research, 33(suppl 1): D29-D33. (2005).

187 MacDonald, J. R., Ziman, R., Yuen, R. K., Feuk, L., and Scherer, S. W. The database of

genomic variants: A curated collection of structural variation in the human genome.

Nucleic Acids Research, 42(D1): D986-D992. (2014).

188 Church, D. M., Lappalainen, I., Sneddon, T. P., Hinton, J., Maguire, M., Lopez, J.,

Garner, J., Paschall, J., DiCuccio, M., and Yaschenko, E. Public data archives for

genomic structural variation. Nature genetics, 42(10): 813-814. (2010).

189 Sherry, S. T., Ward, M.-H., Kholodov, M., Baker, J., Phan, L., Smigielski, E. M., and

Sirotkin, K. Dbsnp: The ncbi database of genetic variation. Nucleic Acids Research,

29(1): 308-311. (2001).

190 Thorisson, G. A., and Stein, L. D. The snp consortium website: Past, present and future.

Nucleic Acids Research, 31(1): 124-127. (2003).

191 Brookes, A. J., Lehväslaiho, H., Siegfried, M., Boehm, J. G., Yuan, Y. P., Sarkar, C. M.,

Bork, P., and Ortigao, F. Hgbase: A database of snps and other variations in and around

human genes. Nucleic Acids Research, 28(1): 356-360. (2000).

192 Gibbs, R. A., Belmont, J. W., Hardenbol, P., Willis, T. D., Yu, F., Yang, H., Ch'ang, L.-Y.,

Huang, W., Liu, B., and Shen, Y. The international hapmap project. Nature, 426(6968):

789-796. (2003).

193 Clarke, L., Zheng-Bradley, X., Smith, R., Kulesha, E., Xiao, C., Toneva, I., Vaughan, B.,

Preuss, D., Leinonen, R., and Shumway, M. The 1000 genomes project: Data

management and community access. Nature methods, 9(5): 459-462. (2012).

194 Consortium, I. C. G. ICGC cancer genome projects. (2013).

195 Bamford, S., Dawson, E., Forbes, S., Clements, J., Pettett, R., Dogan, A., Flanagan, A.,

Teague, J., Futreal, P. A., and Stratton, M. The cosmic (catalogue of somatic mutations in

cancer) database and website. British journal of cancer, 91(2): 355-358. (2004).

196 Stenson, P. D., Ball, E. V., Mort, M., Phillips, A. D., Shiel, J. A., Thomas, N. S.,

Abeysinghe, S., Krawczak, M., and Cooper, D. N. Human gene mutation database

(hgmd®): 2003 update. Human mutation, 21(6): 577-581. (2003).

197 Amberger, J., Bocchini, C. A., Scott, A. F., and Hamosh, A. McKusick's Online Mendelian

Inheritance in Man (OMIM®). Nucleic Acids Research, 37(suppl 1): D793-D796. (2009).

198 Pagon, R. A. GeneTests: An online genetic information resource for health care providers.

Journal of the Medical Library Association, 94(3): 343. (2006).

199 Mitelman, F., Johansson, B., and Mertens, F. Mitelman database of chromosome

aberrations in cancer. Cancer Genome Anatomy Project. (2007).

200 Becker, K. G., Barnes, K. C., Bright, T. J., and Wang, S. A. The genetic association

database. Nature Genetics, 36(5): 431-432. (2004).

201 Altshuler, D. M., Gibbs, R. A., Peltonen, L., Dermitzakis, E., Schaffner, S. F., Yu, F. L.,

Bonnen, P. E., de Bakker, P. I. W., Deloukas, P., Gabriel, S. B., Gwilliam, R., Hunt, S.,

Inouye, M., Jia, X. M., Palotie, A., Parkin, M., Whittaker, P., Chang, K., Hawes, A.,

Lewis, L. R., Ren, Y. R., Wheeler, D., Muzny, D. M., Barnes, C., Darvishi, K., Hurles,

M., Korn, J. M., Kristiansson, K., Lee, C., McCarroll, S. A., Nemesh, J., Keinan, A.,

Montgomery, S. B., Pollack, S., Price, A. L., Soranzo, N., Gonzaga-Jauregui, C., Anttila,

V., Brodeur, W., Daly, M. J., Leslie, S., McVean, G., Moutsianas, L., Nguyen, H., Zhang,

Q. R., Ghori, M. J. R., McGinnis, R., McLaren, W., Takeuchi, F., Grossman, S. R.,

Shlyakhter, I., Hostetter, E. B., Sabeti, P. C., Adebamowo, C. A., Foster, M. W., Gordon,

D. R., Licinio, J., Manca, M. C., Marshall, P. A., Matsuda, I., Ngare, D., Wang, V. O.,

Reddy, D., Rotimi, C. N., Royal, C. D., Sharp, R. R., Zeng, C. Q., Brooks, L. D.,

McEwen, J. E., and International HapMap 3 Consortium. Integrating common and rare genetic variation in

diverse human populations. Nature, 467(7311): 52-58. (2010).

Plagiarism Report

List of Publications and Reprints

1 Farhan, S., Fahiem, M. A., and Tauseef, H. An ensemble-of-classifiers based approach for

early diagnosis of Alzheimer’s disease: Classification using structural features of brain

images. Computational and Mathematical Methods in Medicine, 2014. (2014).

2 Tauseef, H., Fahiem, M. A., Farhan, S., and Tahir, F. A review of image and phylogenetic

analysis based techniques for ischemic stroke risk estimation. Life Science Journal,

10(7s). (2013).

3 Farhan, S., Fahiem, M. A., Tahir, F., and Tauseef, H. A comparative study of

neuroimaging and pattern recognition techniques for estimation of Alzheimer’s. Life

Science Journal, 10(7s). (2013).

4 Tahir, F., Fahiem, M. A., Tauseef, H., and Farhan, S. A survey of multispectral high

resolution imaging based drug surface morphology validation techniques. Life Science

Journal, 10(7): 1050-1059. (2013).

5 Aftab, Z., and Tauseef, H. Enhancing pixel oriented visualization by merging circle view

and circle segment visualization techniques. Multi-disciplinary Trends in Artificial

Intelligence (pp. 101-109): Springer. (2012).

6 Tauseef, H., Fahiem, M. A., and Farhan, S. 2009. Recognition and translation of hand

gestures to Urdu alphabets using a geometrical classification. Visualisation, 2009. VIZ'09.

Second International Conference in. p. 213-217

7 Tauseef, H., Farhan, S., and Fahiem, M. A. 2009. A systematic approach for selecting a

suitable software architecture evaluation method. Software Engineering Research and

Practice. p. 295-299

8 Farhan, S., Fahiem, M. A., and Tauseef, H. 2009. Geometrical features based approach

for the classification and recognition of handwritten characters. Visualisation, 2009.

VIZ'09. Second International Conference in. p. 185-190

9 Fahiem, M. A., Haq, S. A., Saleemi, F., and Tauseef, H. 2009. 3D reconstruction:

Estimating depth of hole from 2D camera perspectives. Proceedings of the European

Computing Conference. p. 213-221

10 Farhan, S., Tauseef, H., and Fahiem, M. A. 2009. Adding agility to architecture tradeoff

analysis method for mapping on crystal. Software Engineering, 2009. WCSE'09. WRI

World Congress on. p. 121-125

11 Tauseef, H., Fahiem, M. A., and Saleemi, F. 2007. Target recognition task based online

system for refractive error measurement using font transformations. Proceedings of the

7th Conference on 7th WSEAS International Conference on Applied Computer Science-

Volume 7. p. 162-167

12 Tauseef, H., Fahiem, M. A., Farhan, S.; Image Based Sign Language Translation: Urdu

Text. LAP LAMBERT Academic Publishing, Germany, 2011, ISBN-13: 978-3-8454-0969-6,

ISBN-10: 384540969X.

13 Farhan, S., Fahiem, M. A., Tauseef, H.; Hand Written Character Recognition:

Non-Cursive Scripts. LAP LAMBERT Academic Publishing, Germany, 2011,

ISBN-13: 978-3-8454-0104-1, ISBN-10: 3845401044.