DETECTION OF APPLE MARSSONINA BLOTCH DISEASE USING HYPERSPECTRAL DATA

By

MUBARAKAT SHUAIBU

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2016

To my family and friends

ACKNOWLEDGMENTS

I would like to express my sincere gratitude and appreciation to my Committee

Chair, Dr. Won Suk Lee, for his guidance and encouragement through the course of my

Master’s program. I would also like to thank the other members of my supervisory

committee, Dr. John Schueller and Dr. Paul Gader, for their advice and help.

My sincere gratitude also goes to the Rural Development Administration (RDA),

Korea, for supporting this research and to my colleagues at the Precision Agriculture

Laboratory for their support and friendship.

I thank my Mum and my dear friends, Tega and Sabrina, for their moral

encouragement and prayer throughout the course of this work. Finally, and most

importantly, I would like to thank God for seeing me through the course of this program.

TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

ABSTRACT

CHAPTER

1 INTRODUCTION
   Precision Agriculture
   Spectroscopy
   Multispectral and Hyperspectral Imaging
   Global Apple Industry
   Apple Industry in South Korea
   Apple Marssonina Blotch

2 LITERATURE REVIEW
   Application of Spectroscopy for Plant Disease Detection
   Application of Multispectral and Hyperspectral Imaging for Plant Disease Detection

3 APPLE MARSSONINA BLOTCH DETECTION USING INDOOR SPECTRORADIOMETER DATA
   Background
   Materials and Methods
      Data Collection
      Vegetation Indices
      Jeffries-Matusita Orthogonal Subspace Projection (JM-OSP) Band Selection
   Results and Discussion
      Spectral Feature Analysis
      JM-OSP Band Selection
      ARI1 and MAREP
      Classification Based On JM-OSP Bands
      Classification Based On ARI1 and MAREP Features
      QDA Binary Classification Based On ARI1 and MAREP Features
   Conclusion

4 QUANTITATIVE ANALYSIS AND SPECTRAL UNMIXING OF EARLY STAGE APPLE MARSSONINA BLOTCH DISEASE
   Background
   Methods
      Abundance Estimation of Early Stage AMB Diseased Data
      Endmember Extraction from Early Stage AMB Diseased Data
      Regression Models
   Results and Discussion
      Spectral Unmixing of Brown and Green Endmembers for Early Stage AMB Diseased Samples
      Quantitative Analysis of Early Stage AMB Diseased Samples
   Conclusion

5 HYPERSPECTRAL IMAGE ANALYSIS OF EARLY STAGE APPLE MARSSONINA BLOTCH DISEASE USING SEQUENTIAL MAXIMUM ANGLE CONVEX CONE ALGORITHM
   Background
   Materials and Methods
      Outdoor Hyperspectral Data Acquisition
      Image Preprocessing
      Vegetation Indices
      Sequential Maximum Angle Convex Cone (SMACC)
      Kullback-Leibler Divergence (KLD)
      Support Vector Machine (SVM) Classification
   Results and Discussion
      SVM and Texture Filter Background Masking
      SMACC Endmember Extraction and Spectral Feature Analysis
      Vegetation Indices
      Band Selection using KLD Hierarchical Clustering for 2014 Dataset
      SVM Classification of MTVI and MAREP Features
      SVM Classification of JM-KLD Hierarchical Clustering
   Conclusion

6 SUMMARY AND FUTURE DIRECTION

LIST OF REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

Table

3-1 QDA classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.

3-2 Neural network classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.

3-3 Discriminant tree classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.

3-4 QDA classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.

3-5 Neural network classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.

3-6 Discriminant tree classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.

3-7 QDA classification accuracy for five spectroradiometer classes in 2014 dataset using ARI1 and MAREP vegetation indices.

3-8 QDA classification accuracy for five spectroradiometer classes in 2015 dataset using ARI1 and MAREP vegetation indices.

3-9 QDA classification accuracy for healthy-nd and early stage AMB diseased spectroradiometer classes in 2014 dataset using ARI1 and MAREP vegetation indices.

3-10 QDA classification accuracy for healthy-nd and early stage AMB diseased spectroradiometer classes in 2015 dataset using ARI1 and MAREP vegetation indices.

4-1 Disease severity statistical analysis results for 20 healthy and 261 early stage AMB diseased samples from 2014 and 2015 spectroradiometer datasets.

5-1 SVM classification accuracy for six outdoor hyperspectral classes using MTVI and MAREP2 vegetation indices as input features.

5-2 SVM classification accuracy for six outdoor hyperspectral classes using six spectral bands selected by JM-KLD hierarchical clustering algorithm.

LIST OF FIGURES

Figure

1-1 Comparison of the apple and other fruits in South Korea

1-2 AMB distribution in South Korea

3-1 Apple leaves used in indoor spectroradiometer analysis

3-2 Mean reflectance spectra of samples in spectroradiometer dataset

3-3 JM distance matrix of spectral bands in 2014 and 2015 indoor spectroradiometer datasets.

3-4 The plot of MAREP against ARI1 for samples in spectroradiometer dataset

4-1 Flowchart showing steps for extraction of brown and green colored pixel abundances in early-stage AMB diseased samples

4-2 Steps taken in segmenting early stage AMB diseased color images

4-3 The plot of MAREP against ARI1 classes in spectroradiometer dataset

4-4 Predicted versus actual disease severity for validation dataset using a combination of PLSR and SMLR prediction models.

5-1 RGB color display of an outdoor hyperspectral image taken in 2014

5-2 Comparison of original hyperspectral image and SVM-masked image

5-3 Texture images from previously SVM-masked image at wavelength 450.8 nm

5-4 Combined SVM-texture filter masking result

5-5 Mean reflectance spectra of non-symptomatic and AMB symptomatic samples in 2014 outdoor hyperspectral dataset

5-6 The plot of MAREP 2 against MTVI for non-symptomatic and AMB symptomatic samples in 2014 outdoor hyperspectral dataset.

LIST OF ABBREVIATIONS

AMB Apple Marssonina blotch

ARI1 Anthocyanin reflectance index 1

FAO Food and Agriculture Organization

GA Genetic algorithm

GPS Global positioning system

JM Jeffries-Matusita

KLD Kullback–Leibler divergence

K-NN K-Nearest Neighbor

MAREP Matrix-adjusted red edge position

MTVI Modified triangular vegetation index

NIR Near-infrared

qPCR Quantitative fluorogenic polymerase chain reaction

PLSR Partial least squares regression

PSO Particle swarm optimization

QDA Quadratic discriminant analysis

R² Coefficient of determination

REP Red edge position

SID Spectral information divergence

SMACC Sequential maximum angle convex cone

SMLR Stepwise multiple linear regression

SVM Support vector machine

SWIR Short-wave infrared

TVI Triangular vegetation index

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

DETECTION OF APPLE MARSSONINA BLOTCH DISEASE USING HYPERSPECTRAL DATA

By

Mubarakat Shuaibu

May 2016

Chair: Won Suk Lee

Major: Agricultural and Biological Engineering

Apple Marssonina blotch (AMB) is one of the most devastating apple diseases in

the world, and it has caused huge economic losses to countries like Japan, India, and

Korea. It is a fungal disease that mainly affects the leaves of apple trees and causes

premature defoliation, which in turn results in low quality and quantity of harvested

apples. Technologies that can efficiently detect the disease in its early stage could help

growers apply timely control measures to contain the spread of the disease. In this

work, the use of hyperspectral imaging and spectroradiometer measurements was

examined for the early diagnosis of the disease. Vegetation indices, namely anthocyanin

reflectance index 1 (ARI1), modified triangular vegetation index (MTVI), and

matrix-adjusted red edge position (MAREP), together with optimal spectral bands selected using

Jeffries-Matusita distance, orthogonal subspace projection, and hierarchical clustering algorithms,

were used as features in building various classifiers. An endmember extraction

algorithm, known as sequential maximum angle convex cone (SMACC), was used to

create hyperspectral endmembers based on the severity of the disease. Both

spectroradiometer and hyperspectral imaging systems worked well in distinguishing

healthy samples from diseased ones. The highest classification accuracies achieved in

this work for the healthy and early-stage AMB diseased classes were 97.7% and

99.2%, respectively.

CHAPTER 1 INTRODUCTION

This thesis examines the use of hyperspectral technology for the detection of a

severe fungal disease of apples called apple Marssonina blotch (AMB). It serves as a

preliminary step towards the design of a low-cost automated sensing system for the

early diagnosis of the disease, especially for use in South Korean apple orchards. With

such a sensing system available, apple growers will be able to apply fungicides more

precisely to areas of their orchards infected to varying degrees by the disease, thereby

improving control of disease spread, reducing fungicide wastage and maximizing fruit

yield.

The thesis is structured as follows: an overview of precision agriculture and

current remote sensing technologies (spectroscopy, multispectral imaging, and

hyperspectral imaging) in agriculture is given, followed by an outline of the South

Korean and global apple industries. Next, some information about AMB disease is

provided, including its symptoms and current management practices being adopted by

apple growers in infected regions. A literature review of some spectroscopic,

multispectral and hyperspectral imaging technologies currently utilized for plant disease

detection is then presented. Subsequent chapters provide a description of field and lab

data acquisition methods used in extracting information for spectral and image

analyses. Finally, results from various disease detection methods are compared and

discussed, followed by recommendations for improving AMB disease detection at both

the asymptomatic and symptomatic stages.

Precision Agriculture

The conventional method of managing a farm involves treating a field as a single

unit and distributing crop production inputs, such as water, fertilizer, seeds, and

pesticides, uniformly on the whole field. Growers who treat their field in this way tend to

over-apply crop inputs as insurance, and more often than not this leads to

counterproductive results including poorer yield, input wastage, and environmental

pollution.

Over the past two decades, the use of crop management technologies that take

into account in-field variability has grown tremendously all around the world (Batte,

1999; Zhang et al., 2002). A combination of these technologies to achieve site-specific

crop management is referred to as precision agriculture. Global positioning system

(GPS), yield monitoring and mapping, soil sampling, variable rate application, and

remote sensing are just a few of the tools that have made the concept of precision

agriculture a reality. Unlike the conventional method of farming, precision agriculture

has the potential to produce the same level of yield with decreased input, higher yield

with decreased input, and higher yield with the same amount of input (Morgan & Ess,

1997).

Good crop management decisions rely on accurate spatial and temporal

variability information collected in a field. The first step in achieving this is to use

information sensing and extraction technologies that make use of sensors in acquiring

information relating to field conditions. One major way this is accomplished today

is by using remote sensing technologies mounted on ground-based platforms,

aerial platforms (drones and helicopters), or space-borne platforms (satellites).

Remote sensing technologies are capable of extracting information about an

object without coming in physical contact with it and provide an economical way of

acquiring field data in a short period of time. The primary type of remote sensors used

for most agricultural applications detects natural radiation that is emitted or reflected by

an object and makes use of surface reflectance between the visible and near-infrared

regions of the electromagnetic spectrum in providing information about the object.

Spectroscopy

Spectroscopy is a scientific technique used for the study and identification of

materials. It is capable of capturing the chemical composition of a material by

measuring the amount of light that the material absorbs, emits, or reflects. Light, in this

context, refers to electromagnetic radiation that can come from any region of the

electromagnetic spectrum. Light waves are classified by their energies or wavelengths,

and the energy of light is inversely proportional to its wavelength. When light is absorbed or

reflected by materials, not all of the light behaves the same way; only certain wavelengths of

light get absorbed while others get reflected. In the past, spectroscopy was simply the

study of visible light according to its wavelength and dispersion by a prism; but since the

nineteenth century, optical components called diffraction gratings have made it possible

to expand the range to other regions of the electromagnetic spectrum including

ultraviolet, near-infrared, and shortwave-infrared. Devices that use these optical

elements are called spectrometers. Spectroscopy is commonly used for precision

agriculture applications because it is an inexpensive, fast, and, most importantly,

non-destructive method that can be used in the field.
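
The inverse relationship between light energy and wavelength noted above follows the Planck relation, E = hc/λ. The short sketch below is only an illustration of that standard formula and is not part of the thesis workflow; the function name and the sample wavelengths are hypothetical choices.

```python
# Planck relation E = h * c / wavelength (standard physics, not taken from the thesis).
PLANCK_J_S = 6.62607015e-34       # Planck constant in joule-seconds
LIGHT_SPEED_M_S = 299_792_458.0   # speed of light in vacuum in meters per second
JOULES_PER_EV = 1.602176634e-19   # joules in one electronvolt


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the energy of one photon, in electronvolts, for a wavelength in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


if __name__ == "__main__":
    # Hypothetical sample wavelengths in the visible, NIR, and SWIR regions.
    for nm in (450, 680, 800, 1660):
        print(f"{nm} nm -> {photon_energy_ev(nm):.3f} eV")
```

Doubling the wavelength halves the photon energy, which is why shorter-wavelength (ultraviolet and visible) light carries more energy per photon than near-infrared or shortwave-infrared light.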

Multispectral and Hyperspectral Imaging

Multispectral and hyperspectral imaging technologies combine two sensing

methods in their operation: spectroscopy and imaging. They are both capable of

capturing spatial and spectral information, and they contain spectral bands that extend

beyond the visible range of the electromagnetic spectrum. Their spectral range is

typically between the ultraviolet and near-infrared regions for most agricultural

applications. Multispectral imaging systems differ from hyperspectral imaging systems

in that the former contain broader bands and typically comprise fewer than ten spectral

bands. Hyperspectral systems, on the other hand, contain narrower bands and usually

comprise hundreds of spectral bands. In other words, hyperspectral systems have finer

spectral resolution than multispectral systems. Multispectral sensors are more portable

than hyperspectral sensors and can be easily integrated into most remote sensing

platforms. The finer spectral resolution of hyperspectral systems cuts both ways:

they can “see” better than their multispectral counterparts and provide more accurate

information about a material, but they also contain a great deal of redundant

information. A number of agricultural applications combine hyperspectral data and

feature selection algorithms in developing multispectral sensors that can be used in the

field.
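
The difference in spectral resolution described above can be illustrated by averaging many narrow bands of a single-pixel spectrum into a few broad ones. This is only a rough sketch under simplifying assumptions (a simple boxcar average instead of a real sensor response function); the function name, band limits, and synthetic spectrum are all hypothetical.

```python
def aggregate_bands(wavelengths_nm, reflectance, band_edges_nm):
    """Average narrow-band reflectance values into broader bands.

    wavelengths_nm : band-center wavelengths of the narrow bands
    reflectance    : reflectance value measured in each narrow band
    band_edges_nm  : (low, high) limits of each broad band; low is inclusive
    """
    if len(wavelengths_nm) != len(reflectance):
        raise ValueError("wavelengths and reflectance must have equal length")
    broad = []
    for low, high in band_edges_nm:
        values = [r for w, r in zip(wavelengths_nm, reflectance) if low <= w < high]
        if not values:
            raise ValueError(f"no narrow bands fall in [{low}, {high}) nm")
        broad.append(sum(values) / len(values))
    return broad


if __name__ == "__main__":
    # Synthetic vegetation-like spectrum: low reflectance up to 700 nm, high beyond.
    wavelengths = list(range(400, 1001, 5))            # 121 narrow bands
    spectrum = [0.45 if w > 700 else 0.05 for w in wavelengths]
    edges = [(450, 520), (520, 600), (630, 690), (760, 900)]  # hypothetical broad bands
    print(len(wavelengths), "narrow bands ->", aggregate_bands(wavelengths, spectrum, edges))
```

The averaging step discards fine spectral detail, which is the trade-off between the redundancy of hyperspectral data and the portability of a multispectral design.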

Global Apple Industry

The apple (Malus domestica) is one of the most important fruit crops in the world

mostly because of the numerous ways it can be consumed and the many health

benefits it offers to people. According to the Food and Agriculture Organization (FAO),

apple ranks second worldwide after banana in terms of production. Over 80.8 million

metric tons of apples were produced in 2013, and Asia alone accounted for over 60% of

that quantity (FAO, 2013). China and the United States are the top two apple-producing

countries in the world; in 2013, China produced over 40 million tons of apples, while

the United States produced about 4 million tons.

Apple Industry in South Korea

The apple industry in Korea is highlighted in this work because the apple data

used in our analysis were acquired in that country. In 2013, the apple industry added

over $9 million to the nation’s economy, and apple was the second most produced fruit

in the country, with over 395,000 tons harvested. The total cultivation area that same

year was over 30,000 hectares, higher than that of any other fruit grown in the country

(Figure 1-1). Korea grows several varieties of apples, but Fuji is by far the most important apple

cultivar in the country.

Apple Marssonina Blotch

Apple Marssonina blotch (AMB), caused by Diplocarpon mali, is a highly

destructive fungal disease that mainly affects the leaves of apple trees. The disease has

been found in apple orchards in several countries, including Japan, China, Korea,

Brazil, Italy, and Canada. The occurrence of the disease was first recorded in Japan

over a century ago and by the 1980s, the disease had made its way to other countries

in Asia, North America, and Europe. In Korea alone, AMB has caused significant

economic losses, with over 50% of apple orchards in the country infected by the disease

(Figure 1-2).

The disease thrives in warm and humid climates and usually occurs between the

months of June and July. It has a long latency period, typically two to five

weeks, after which symptoms develop very rapidly. Symptoms

start off as small brown spots, with dark pin-like fruiting bodies, known as acervuli,

growing in symptomatic areas. At the advanced stage of the disease, leaves turn yellow

and prematurely fall off the tree. The disease lowers both the quality and quantity of apples

that can be harvested, for example by reducing starch content and fruit size.

Burning and burying of defoliated diseased leaves have been adopted by apple

growers to prevent the spread of AMB. Fungicides, such as thiophanate-methyl, are

also sprayed on crops at the appearance of early symptoms before the rainy season

begins. Unfortunately, in 1997, an incidence of thiophanate-methyl resistance in

Diplocarpon mali was found in Japan. As a result of the many management challenges

that exist with AMB, more emphasis is being placed on finding ways to prevent the

disease rather than on controlling it.

Figure 1-1. Comparison of the apple and other fruits in South Korea in 2013. A) Cultivation area (1,000 ha), B) production quantity (1,000 tons). Fruit categories shown in both panels: apple, pear, grape, peach, citrus, and others.

Figure 1-2. AMB distribution in South Korea in 2013 (infected regions are highlighted in red).

CHAPTER 2 LITERATURE REVIEW

Application of Spectroscopy for Plant Disease Detection

Over the past couple of decades, the use of spectroscopy has grown for

precision agriculture applications. This growth was prompted by the economic need of

growers to reduce crop production input, increase yield and maximize profit in an

efficient and environmentally friendly way. Spectroscopy has been used by several

researchers for various quality and quantity assessments of crops, and one major use

has been for the detection of diseases. Disease stress can greatly influence the

biochemical properties of plants. Infected plants have been shown to produce spectral

characteristics different from healthy ones due to the difference in the way they absorb

light in the visible and near-infrared spectral regions.

Many researchers have taken advantage of these unique spectral characteristics

in detecting plant diseases. Xu et al. (2007) used several spectral parameters including

single-wavelength reflectance, peak area, and water band index to classify five different

severity stages of leaf miner damage on tomato leaves. They found that spectral

reflectance in the NIR region from 800 nm to 1100 nm decreased significantly with

increasing damage severity levels, while the reverse was the case between wavelengths

1450 nm and 1900 nm. They achieved the highest correlation coefficient when the

damage severity levels were modeled using the 1450 nm to 1900 nm range.

Jones et al. (2010) also investigated the use of reflectance spectroscopy in the

quantitative analysis of a tomato disease called bacterial leaf spot. They found

significant wavelengths from the absorbance spectra that distinguished between several

degrees of disease infestation using both partial least squares regression (PLSR) and stepwise

multiple linear regression (SMLR). The disease prediction model built from the spectral

data achieved a coefficient of determination (R²) of 0.82. A highly destructive disease in

winter wheat called powdery mildew was detected by Yuan et al. (2014). They extracted

32 spectral features and examined them using both independent t-test and correlation

analysis. PLSR and multivariate linear regression (MLR) were also used in estimating

disease severity. They reported that the PLSR model outperformed the MLR model and

achieved an R² of 0.8 using seven regression components.
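
The coefficient of determination reported in the studies above is commonly defined as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that common definition only; it is not the cited authors' code, and the function name and severity values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: squared error of the model predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared deviation of the observations from their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical disease severity scores and model predictions.
    observed = [1.0, 2.0, 3.0, 4.0]
    estimated = [1.1, 1.9, 3.2, 3.8]
    print(f"R^2 = {r_squared(observed, estimated):.2f}")
```

A value of 1 means the predictions match the observations exactly, while a value of 0 means the model does no better than always predicting the mean of the observations.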

Huanglongbing (HLB), arguably the most severe disease affecting the citrus

industry in Florida and other regions of the world, has been analyzed by several

researchers using spectroscopic technology. One such analysis, conducted by

Sankaran et al. (2011), yielded an accuracy of 98% for HLB detection using a quadratic

discriminant analysis classification algorithm. It was reported that the raw spectral data

originally consisting of 989 spectral features, extracted from the wavelength range of

350 nm to 2500 nm, was initially reduced to 86 spectral features and even further to 24

features using a feature extraction algorithm known as principal component analysis

(PCA).

In 2001, transmittance and reflectance spectroscopy was reported to be a

preeminent tool for quickly detecting aflatoxin in corn. In the study, more than 95% of

corn kernels tested using transmittance or reflectance spectroscopy were correctly

classified as containing either high or low levels of aflatoxin (Pearson et al., 2001).

Laser-induced fluorescence spectroscopy has also been used successfully to detect

canker disease, caused by the bacterium Xanthomonas axonopodis pv. citri, in citrus

plants (Belasque Jr et al., 2008).

In 2008, it was reported that a preliminary study of visible and near-infrared

reflectance spectroscopy produced a model that appears to be valuable in the early

detection of Botrytis cinerea on non-symptomatic eggplant leaves. The resulting back

propagation neural network (BP-NN) model was found to have an accuracy rate of 70% to

85% in predicting fungal infections (Wu et al., 2008). Researchers have also

anticipated that spectroscopy will become an important component in food safety. For

example, near-infrared spectroscopy for the detection of organic matter has

been found to be non-destructive, accurate, and easy to implement. As a result,

near-infrared spectroscopy has been used to detect mycotoxigenic fungi and their toxic

metabolites in maize crops (Berardo et al., 2005).

Application of Multispectral and Hyperspectral Imaging for Plant Disease Detection

Remote sensing has been shown to be a useful tool for monitoring the

heterogeneity of crop vitality within agricultural sites (Franke & Menz, 2007). In 2001,

airborne multispectral scanners were found to be effective in detecting the occurrence of rice

panicle blast using a band combination of the 530 nm to 570 nm and 650 nm to 700 nm

regions (Kobayashi et al., 2001). A ground-based real-time remote

sensing system for detecting pre-symptomatic yellow rust disease in winter wheat crops

was also developed using fused multispectral fluorescence imaging and

hyperspectral reflection, with an overall error of about 5.5%. In addition, data fusion

using a self-organizing map neural network decreased the overall classification error to

1% (Moshou et al., 2005).

The agricultural industry desires a method to detect fungal infections as early as

possible. Therefore, multispectral remote sensing has been explored for the analysis of

winter wheat infected with powdery mildew and leaf rust pathogens at

various stages of infection. Classification accuracies for the infections were between

56.8% and 88.6% during the trials, which indicates a moderate success rate of early

detection of infection (Franke & Menz, 2007).

Citrus canker has continuously threatened the marketability of citrus crops. Thus,

a hyperspectral imaging approach was developed to detect canker lesions on Ruby

Red grapefruit with a resulting accuracy of 95.2%. It was concluded that the hyperspectral

imaging technique coupled with the spectral information divergence (SID) based image

classification method was effective in discriminating citrus canker from other surface

diseases (Qin et al., 2009). HLB, another threatening citrus disease, has been detected

using both multispectral and hyperspectral images acquired from aerial platforms.

Kumar et al. (2012) detected HLB-infected trees from aerial images obtained from a

citrus grove in Florida with an accuracy of 87% using mixture tuned matched filtering

(MTMF) and spectral angle mapping (SAM) algorithms. They used an

aerial platform rather than a ground-based one to acquire the images in order to

expedite the detection of the disease in very large citrus groves, thus allowing

growers to apply more efficient management practices to HLB-infected regions.

Hyperspectral imaging was also found to be a valuable tool in detecting disease

in crops in their early stage using a procedure based on both support vector machines

and spectral vegetation indices. This method distinguished diseased from non-diseased

sugar beet leaves as well as differentiated between leaves infected by the pathogens

Cercospora beticola, Uromyces betae, and Erysiphe betae at the asymptomatic stage

with accuracies between 65% and 90% (Rumpf et al., 2010).

Also, a study of sugarcane areas affected by orange rust disease found that

hyperspectral imagery can be used to detect the disease in sugarcane crops. The

combination of visible and near-infrared spectral bands with the moisture-sensitive

band, 1660 nm, yielded increased ability to identify rust-affected areas. However,

disease-water stress indices (R800/R1660; R1660/R550; (R800+R550)/(R1660+R680))

performed the best in targeting affected areas (Apan et al., 2004).

CHAPTER 3 APPLE MARSSONINA BLOTCH DETECTION USING INDOOR SPECTRORADIOMETER DATA

Background

The apple is a very important fruit crop and ranks second after banana in terms

of production globally (FAO, 2013). Based on 2013 statistics from the Food and

Agriculture Organization (FAO), over 60% of the world’s apples are grown in Asia.

Unfortunately, apple production has been declining in recent years in some of the top

producing countries in the world including China, Japan, India, and South Korea. A

significant part of this decline is attributed to a disease called apple Marssonina blotch

(AMB). AMB is a severe fungal disease that primarily affects apple tree leaves, and it is

caused by a pathogen called Diplocarpon mali. The first appearance of the disease was

recorded in Japan in 1907; by the 1980s, the disease had spread to some

other countries in Asia, Europe and North America (Harada, 1974; Lee et al., 2011; Lee

& Shin, 2000; Tamietti & Matta, 2003). Korea illustrates the prevalence of the disease:

the country has suffered significant economic losses, with over 50% of

apple orchards infected by the disease.

AMB disease occurs in the summer after periods of extended rainfall, and it

thrives in humid, warm, high-rainfall climates. It is dispersed by wind and rain

and spreads in two major ways during the apple growing season. The primary

form of infection is caused by ascospores that are released from overwintered apothecia

in fallen leaves while the secondary infection is caused by asexually produced fungal

spores in the acervuli (EPPO, 2013). A long latency period of two to five weeks is typical

for AMB and symptoms begin to develop after this time. At the early symptomatic stage

of the disease, small grayish black or brownish spots appear on the surface of the leaf.

The disease then progresses to a stage where the spots coalesce, and necrotic and

chlorotic spots appear. The size of the spots keeps growing until leaves turn yellow and

prematurely fall off the tree. This defoliation affects the quality and quantity of apples on

a tree, including a reduction in fruit size and starch content.

The primary preventative methods being adopted by apple growers whose fields

have been infected by the disease are burning and burying of defoliated leaves.

Treatments for AMB, including thiophanate-methyl fungicide, exist for the control of

the disease; however, there have been reports of the pathogen being resistant

to some of these treatments (Tanaka et al., 2000). At this point, growing AMB-resistant

cultivars may be one of the few ways to control its spread economically, reliably, and

efficiently, and some researchers have been working on finding disease-resistant

cultivars and species (Yin et al., 2013). Checking for the disease by visually inspecting

each tree in the field is challenging, not only because it

is time-consuming, but also because diseased

leaves are likely to be missed given the long latency period of the disease and because,

at the symptomatic stage, AMB can be mistaken for other apple blotch-like diseases (Back et

al., 2015). As a result of the many management challenges AMB is posing to growers, it

is imperative that methods that can be used for its early diagnosis be developed without

delay.

Oberhänsli et al. (2014) used quantitative fluorogenic polymerase chain reaction

(qPCR) for the early diagnosis of the disease. The authors concluded that the method

efficiently diagnosed AMB infected leaves and did so more accurately than visual

diagnosis by growers and other AMB experts. Methods like qPCR, however, are

destructive in nature and require that leaves be plucked from a tree for analysis. The

use of non-destructive methods for the detection of the disease is still relatively new and

to the best of the authors’ knowledge, only one technique has been explored in

literature. Lee et al. (2012) reported diagnosing the disease at the early and

asymptomatic stages using a non-invasive tool called optical coherence tomography.

From two-dimensional (2D) and three-dimensional (3D) imaging scans created by the

system, they were able to find distinctive differences between the inner cross-sectional

layers of healthy and diseased leaves. They concluded that an early stage AMB

detection tool can be developed based on an upgraded version of the system. There

was no mention, however, of whether the technology could be used for quantitatively analyzing

the disease.

Spectroscopy is extensively used in precision agriculture for assessing the

general health of crops (Del Fiore et al., 2010; Gómez-Sanchis et al., 2008; Graeff et

al., 2006; Qin et al., 2008). It is preferred to some other tools being used for qualitative

and quantitative analyses in agriculture because it is inexpensive, accurate, fast, and

non-destructive (Del Fiore et al., 2010; Li et al., 2007; Roggo et al., 2002; Sankaran et

al., 2010). Some researchers have used spectral analysis in detecting plant diseases.

Jones et al. (2010) and Xu et al. (2007) showed the potential of spectral technology in

the detection of bacterial leaf spot and leafminer diseases in tomatoes. They

successfully created disease prediction models capable of diagnosing tomato diseases

at different severity levels of infestation. The notorious Huanglongbing (HLB) disease of

citrus has also been detected with an accuracy of 87% using spectral features

developed from reflectance spectral data (Sankaran et al., 2013).

Spectroscopic data usually possesses high numbers of spectral bands with some

data containing hundreds or even thousands of bands. With so many spectral bands,

the feature space of a given spectral data could potentially contain tens of thousands of

features (Thenkabail et al., 2011). In most situations, some of these features hold little

or no information about the target of interest and including them in information

extraction processes such as classification could slow down the process and cause

inaccurate classification results. One way of dealing with this problem is to apply

preprocessing techniques such as feature selection and extraction to the spectral data

before classification is performed. A number of methods for feature selection and

extraction exist with the most popular being spectral band selection, spectral indices,

and projection pursuit measures. Many researchers have investigated several

dimensionality reduction methods based on these measures (Bruce et al., 2002; Jia &

Richards, 1999; Martínez-Usó et al., 2007; Wang & Angelopoulou, 2006; Yang et al.,

2012; Zhang et al., 2005). Feature selection methods are generally preferred to feature

extraction methods because the latter do not preserve original information; instead, they

produce transformed features. This is especially undesirable in situations where the

goal is to build a multispectral sensor based on selected bands or features.

The primary objective of this work was to evaluate the potential of using spectral

data for the identification of AMB disease. The specific objectives were to:

i. determine optimal spectral features for AMB disease detection, and

ii. develop a classification algorithm capable of distinguishing between diseased and healthy leaves.

Materials and Methods

Data Collection

Leaf samples used in this work were acquired from Fuji apple trees in an

experimental apple orchard located in Gunwi-city, Korea. A test area measuring 40 m x

60 m, with a total of 260 trees, was set aside for the experiment. Datasets were

acquired during the fall season in two consecutive years, 2014 and 2015. Before leaves

were plucked from the trees and analyzed, the trees were inoculated with AMB spores

in order to facilitate disease development. Leaf samples were collected on different

days to ensure leaves that were healthy and others that showed varying degrees of

AMB infection were included in the dataset. Leaves deficient in molybdenum and

manganese were also included in the dataset because their color resembled that of

some of the other classes.

Indoor spectral measurements were obtained from the samples immediately after

they were plucked from the trees. This was done to minimize damage to the cell

structure of the leaves. Reflectance spectral information was acquired between

wavelengths 350 nm and 2500 nm using a spectroradiometer system that consisted of a

spectroradiometer (Field Spec 3, ASD Inc., Boulder, CO, USA) and a plant probe with

an attached leaf clip (Leafclip Assembly A122325, ASD Inc., Boulder, CO, USA). The

clip had a spot size of 10 mm, and it was used in holding leaves in place while spectral

measurements were obtained. Before the leaves were clipped to the plant probe, a

white polytetrafluoroethylene (PTFE) reflector material with 99% reflectance was clipped

to the plant probe and used in calibrating the system to reflectance. After spectral

measurements of leaf samples had been acquired, they were transferred and stored on

a laptop (Intel Core i7-3720QM, HP, USA). The spectroradiometer system had a

spectral resolution of 1 nm and three detectors, a VNIR detector (350-1000 nm), a

SWIR1 detector (1000-1800 nm) and a SWIR2 detector (1800-2500 nm), and operated

with a scanning time of 100 milliseconds. Stable illumination was produced throughout

the spectral range using a halogen lamp (ASD Illuminator - 70 watts, ASD Inc., Boulder,

CO, USA).

A total of 621 and 751 samples were acquired in 2014 and 2015, respectively.

Spectral measurements from both years contained 2151 spectral bands and five

different classes were defined for the datasets as shown in Figure 3-1: mature healthy

(healthM), young healthy (healthY), early stage AMB diseased (ambE), advanced stage

AMB diseased (ambA), and molybdenum/manganese nutrient deficient (nd). Aside from

the healthy classes, all other classes had, at least, two different colored regions on the

leaves, and as earlier stated, a leaf clip with a spot size of 10 mm was used in the

spectral data acquisition process. This large spot size made it impossible to capture

spectral information of individual symptomatic and asymptomatic regions; what was

captured was the average reflectance spectra of all the colors enclosed within the

clipped area, thus resulting in spectrally mixed data for AMB diseased and nutrient

deficient samples. The ambE spectral class was a spectral mixture of reflectance data

from brown and green portions of early stage AMB diseased leaves. Most samples in

the ambA spectral class were spectral mixtures of just green, yellow and brown colored

regions, but some of its other samples also had orange symptomatic regions included in

their spectral mixtures. The nd spectral class was a spectral mixture of light and dark

green portions of nutrient deficient samples.

Vegetation Indices

A number of vegetation indices were developed and tested on the five classes

including carotenoid reflectance index (CRI), sum green index (SGI) and normalized

difference vegetation index (NDVI). Among those tested, only anthocyanin reflectance

index 1 (ARI1) and a newly developed index called matrix-adjusted red edge position

(MAREP), which is based on the 4-point interpolation red edge position, were efficient in

discriminating among the classes. MATLAB R2014a (Version 8.3, The MathWorks Inc.,

Natick, MA, USA) was used in developing vegetation indices.

Anthocyanins are water-soluble pigments that impart the color of plant leaves.

After chlorophyll, anthocyanins are the most important group of pigments found in the

visible range of the spectrum for assessing vegetation health. Anthocyanins are typically

found in more abundance in stressed vegetation than in healthy ones. Anthocyanin

reflectance index 1 (ARI1) is calculated using reflectance at two very significant

wavelengths for plants, 550 nm and 700 nm, and it is given by the following equation:

$$\mathrm{ARI1} = \frac{1}{\rho_{550}} - \frac{1}{\rho_{700}} \qquad (3\text{-}1)$$

where $\rho_{550}$ represents the reflectance at the nitrogen absorption band of 550 nm and

$\rho_{700}$ represents the reflectance at the red band of 700 nm.
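
As a minimal sketch of equation (3-1), assuming a reflectance spectrum stored as a Python list sampled every 1 nm from 350 nm (the layout described under Data Collection), ARI1 could be computed as shown below. The function names `band_index` and `ari1` and the synthetic reflectance values are illustrative assumptions; they are not taken from the MATLAB code used in this work.

```python
def band_index(wavelength_nm, start_nm=350, step_nm=1):
    """Return the list position of a wavelength in a regularly sampled spectrum."""
    offset = wavelength_nm - start_nm
    if offset < 0 or offset % step_nm != 0:
        raise ValueError("wavelength is not on the sampling grid")
    return offset // step_nm


def ari1(spectrum, start_nm=350, step_nm=1):
    """Anthocyanin reflectance index 1: 1/R550 - 1/R700."""
    r550 = spectrum[band_index(550, start_nm, step_nm)]
    r700 = spectrum[band_index(700, start_nm, step_nm)]
    return 1.0 / r550 - 1.0 / r700


# Synthetic spectrum (hypothetical values): 2151 bands covering 350-2500 nm.
spectrum = [0.05] * 2151
spectrum[band_index(550)] = 0.10  # reflectance at 550 nm
spectrum[band_index(700)] = 0.20  # reflectance at 700 nm

value = ari1(spectrum)  # 1/0.10 - 1/0.20
print(value)
```

With these synthetic values the index evaluates to 1/0.10 - 1/0.20 = 5; measured leaf spectra would of course give different values.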

The spectral reflectance curve of vegetation contains a region that abruptly

changes from low to high reflectance between the red and near infrared (NIR) range of

the spectrum. This point of inflection is referred to as red edge, and weakened

vegetation typically has this position shifted towards shorter wavelengths—a

phenomenon known as blue shift. Several existing algorithms for calculating the red

edge position (REP) were assessed, but none separated the classes as well as a

newly developed vegetation index created by the authors based on the 4-point

interpolation approach of calculating REP. This vegetation index is called matrix-

adjusted red edge position (MAREP). MAREP, unlike the 4-point interpolation REP

method which performs element-wise operations on any given class dataset, uses a

matrix based approach in computing its vegetation index. The algorithm takes into

account the combined effect a given cluster or class has on each of its samples and as

a result, it allows for better separation among samples in different classes.

It should be noted that for MAREP to work efficiently, information about the

location of class samples is required and in a situation where this information is not

known, it is recommended first to apply unsupervised clustering to the dataset before

applying MAREP. The equations in (3-2) and (3-3) represent the 4-point interpolation

REP and MAREP algorithms, respectively.

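$$\mathrm{REP} = 700 + 40\,\frac{R_{\mathrm{rededge}} - \rho_{700}}{\rho_{740} - \rho_{700}} \qquad (3\text{-}2)$$

[Equation (3-3), MAREP: the expression could not be recovered from the source text. Its surviving terms are the constants 700 and 40, $R_{\mathrm{rededge}}$, $\rho_{700}$, $\rho_{740}$, and sums over the samples $i = 1, \ldots, k$ of cluster $c$.]

$$R_{\mathrm{rededge}} = \frac{\rho_{670} + \rho_{780}}{2} \qquad (3\text{-}4)$$

where $R_{\mathrm{rededge}}$ is the reflectance at the inflection point. $\rho_{670}$, $\rho_{780}$, $\rho_{700}$, and $\rho_{740}$

represent reflectance at 670 nm, 780 nm, 700 nm, and 740 nm, respectively. The

constants 700 and 40 result from interpolating over the 700 nm to 740 nm

spectral range (40 nm wide), c represents a particular cluster or class, and k represents the

number of samples in the cluster.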
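
The 4-point interpolation REP described above can be sketched in a few lines. This is only an illustration of the standard per-sample REP calculation; it does not implement MAREP, whose cluster-level adjustment is not reproduced here. The function name `rep_four_point` and the reflectance values are hypothetical.

```python
def rep_four_point(r670, r700, r740, r780):
    """4-point linear interpolation red edge position, in nm."""
    if r740 == r700:
        raise ValueError("R740 and R700 must differ to interpolate")
    # Reflectance at the inflection point, taken as the mean of R670 and R780.
    r_rededge = (r670 + r780) / 2.0
    # Interpolate between 700 nm and 740 nm (an interval 40 nm wide).
    return 700.0 + 40.0 * (r_rededge - r700) / (r740 - r700)


# Synthetic reflectances (hypothetical values chosen for illustration only).
healthy_rep = rep_four_point(r670=0.05, r700=0.15, r740=0.45, r780=0.55)
stressed_rep = rep_four_point(r670=0.10, r700=0.20, r740=0.40, r780=0.46)

print(healthy_rep, stressed_rep)
```

In this synthetic example the stressed spectrum yields the shorter red edge position, which is the blue shift described earlier in this section.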

Jeffries-Matusita Orthogonal Subspace Projection (JM-OSP) Band Selection

Most band selection methods do not take into account feature redundancy. If

redundancy is not accounted for, the computed optimal spectral bands by a feature

selection algorithm could all be concentrated in one spectral region having very similar

information. To minimize this effect, a very robust distance measure called the

Jeffries-Matusita (JM) distance was used as a criterion for removing redundant spectral bands

before feature selection was performed. JM distance is traditionally used for performing

class separability operations; but in this work, it was modified for the task of minimizing

feature redundancy. The JM distance algorithm computes the distance between density

functions of two classes or features based on Bhattacharyya distance with an

assumption of Gaussian class distributions made in order to simplify the computation of

Bhattacharyya distance. A JM distance value of 1.414 (the maximum, $\sqrt{2}$) suggests two spectral bands

contain very distinct information about the classes to be separated and thus would be

good candidates for the band selection process. The JM distance, $JM$, for any pair

of spectral bands is given as:

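$$JM = \left[2\left(1 - e^{-D_B}\right)\right]^{1/2} \qquad (3\text{-}5)$$

$$D_B = \frac{D_M^{2}}{8} + \frac{1}{2}\ln\left[\frac{\left|(C_{s_i} + C_{s_j})/2\right|}{\left(|C_{s_i}|\,|C_{s_j}|\right)^{1/2}}\right] \qquad (3\text{-}6)$$

$$D_M = \left[(\mu_{s_i} - \mu_{s_j})^{T}\left(\frac{C_{s_i} + C_{s_j}}{2}\right)^{-1}(\mu_{s_i} - \mu_{s_j})\right]^{1/2} \qquad (3\text{-}7)$$

where $D_B$ is the Bhattacharyya distance and $D_M$ is the Mahalanobis distance. $\mu_{s_i}$, $\mu_{s_j}$ and

$C_{s_i}$, $C_{s_j}$ are the means and covariances of the reflectance data in bands $i$ and $j$, respectively.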
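
The JM computation can be illustrated for a single pair of bands. The sketch below assumes that each band is treated as a one-dimensional Gaussian distribution of reflectance values across samples, so that the covariance terms reduce to scalar variances, and it uses the standard Bhattacharyya distance between two Gaussians. This is a simplifying assumption made for illustration; the function name `jm_distance` and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def jm_distance(band_x, band_y):
    """JM distance between two bands, each modelled as a 1-D Gaussian."""
    m1, m2 = mean(band_x), mean(band_y)
    v1, v2 = variance(band_x), variance(band_y)
    if v1 <= 0 or v2 <= 0:
        raise ValueError("each band needs a non-zero variance")
    v_avg = (v1 + v2) / 2.0
    # Bhattacharyya distance: mean-separation term plus variance-mismatch term.
    d_b = (m1 - m2) ** 2 / (8.0 * v_avg) + 0.5 * math.log(v_avg / math.sqrt(v1 * v2))
    d_b = max(d_b, 0.0)  # guard against tiny negative values from rounding
    return math.sqrt(2.0 * (1.0 - math.exp(-d_b)))


# Hypothetical reflectance values of four samples in three bands.
band_a = [0.10, 0.11, 0.12, 0.13]
band_b = [0.50, 0.51, 0.52, 0.53]  # far from band_a
band_c = [0.11, 0.12, 0.13, 0.14]  # close to band_a

print(jm_distance(band_a, band_b))  # saturates near sqrt(2)
print(jm_distance(band_a, band_c))  # well below sqrt(2)
```

Bands whose distributions barely overlap saturate at the upper bound of the measure, which is why a threshold close to 1.414 isolates band pairs carrying distinct information.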

Du and Yang (2008) proposed a band selection method based on using a similarity

metric that is commonly employed for endmember extraction called orthogonal subspace

projection (OSP). OSP, unlike other similarity measures which take measurements from

pairs of bands, evaluates bands jointly. The algorithm is less computationally expensive

than most other band selection methods because the others find optimal band

combinations by performing an exhaustive search, whereas OSP performs a sequential

forward search to find the best bands.

OSP performs band selection as follows: assuming there are M bands in the

original dataset, in order to find the first band, the algorithm randomly selects a band for

band 1, A1 and projects all the other M-1 bands to its orthogonal subspace. It then finds

a second band, A2, with the maximum projection in A1’s orthogonal subspace; this is

considered as the band most dissimilar to A1. All other M-2 bands are now projected on

A2’s orthogonal subspace, and band A3 is chosen as the band with maximum projection

in A2’s orthogonal subspace. The algorithm continues until $A_{i+1} = A_{i-1}$. When this occurs,

$A_{i+1}$ is selected as the true band 1, B1, and $A_i$ is selected as the true band 2, B2. To find

the third band, B3 (and subsequent bands), the algorithm finds the band that is most

dissimilar from B1 and B2 by using the orthogonal subspace, 𝐏, of both bands defined

below:

𝐏 = 𝐈 − 𝐙(𝐙𝐓𝐙)−𝟏𝐙𝐓 (3-8)

where 𝐈 is an N x N identity matrix, N is the number of pixels per band, and 𝐙 is an N x 2

matrix whose first column contains all pixels in band 1, B1, and whose second column

contains all pixels in band 2, B2. (The superscript 𝐓 denotes the transpose.)

After implementing (3-8), the projection 𝐲𝟎 = 𝐏𝐓𝐲 is computed; 𝒚 includes all

pixels in the original dataset, B and 𝐲𝟎 is the component of B in the orthogonal subspace

of B1 and B2. The band that yields the maximum orthogonal component ‖𝐲𝟎‖ is considered

the most dissimilar band to the first two bands and is selected as band B3. For finding

subsequent bands, the size of 𝐙 in (3-8) changes to [B1 B2 B3] and then to [B1 B2 B3 B4]

and so forth until the desired number of bands are selected. (Note: ‖𝐲𝟎‖ denotes the

norm of 𝐲𝟎 ).
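
The sequential forward search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the starting band is fixed rather than random so that the result is repeatable, and the orthogonal component is obtained by Gram-Schmidt orthogonalization instead of forming the matrix 𝐏 explicitly (the resulting norm is the same when the selected bands are linearly independent). All function names and the toy data are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors):
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        r = list(v)
        for q in basis:
            c = dot(r, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm = math.sqrt(dot(r, r))
        if norm > 1e-12:
            basis.append([ri / norm for ri in r])
    return basis


def residual_norm(y, basis):
    """Norm of the component of y orthogonal to the span of the basis."""
    r = list(y)
    for q in basis:
        c = dot(r, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return math.sqrt(dot(r, r))


def most_dissimilar(bands, selected):
    """Index of the unselected band with the largest orthogonal component."""
    basis = orthonormal_basis([bands[i] for i in selected])
    candidates = [i for i in range(len(bands)) if i not in selected]
    return max(candidates, key=lambda i: residual_norm(bands[i], basis))


def osp_select(bands, n_select, start=0, max_iter=100):
    """Sequential forward band selection by orthogonal subspace projection."""
    if n_select < 2 or n_select > len(bands):
        raise ValueError("n_select must be between 2 and the number of bands")
    prev, curr = None, start
    for _ in range(max_iter):
        nxt = most_dissimilar(bands, [curr])
        if nxt == prev:  # A(i+1) equals A(i-1): the initial pair is stable
            break
        prev, curr = curr, nxt
    selected = [prev, curr]  # B1 = A(i+1), B2 = A(i)
    while len(selected) < n_select:
        selected.append(most_dissimilar(bands, selected))
    return selected


# Toy data: five "bands", each holding three pixel values (hypothetical).
bands = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 0.5],
    [0.5, 1.0, 0.0],
]
selected = osp_select(bands, 3)
print(selected)
```

In the toy data, band 1 is nearly a copy of band 0 and band 4 lies in the span of bands 0 and 2, so the search skips both and picks band 3, the only band with a component outside that span.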

Results and Discussion

Spectral Feature Analysis

The mean reflectance spectra for the five classes defined for 2014 and 2015

datasets are given in Figure 3-2. Upon inspection of the figure, it can be seen that the

signatures of the five classes were notably different. The slight differences between the

signatures of the same class in the two years can be attributed to varying fractional abundances of

colors. In both plots, at the nitrogen absorption band of 550 nm, healthier

samples have a lower peak than stressed samples. Healthy leaves

generally have higher concentrations of nitrogen because they contain more chlorophyll

than stressed vegetation. Higher reflectance at this band was observed only

for stressed samples, indicating lower levels of nitrogen in them.

The reflectance at the near infrared (NIR) region, between 700 nm and 1000 nm,

was high for all classes due to the internal scattering of light within leaves’ structure—a

phenomenon present in all vegetation. As the water content in leaves increases, so

does the absorption strength at the 1450 nm and 1950 nm bands. From the plots, it can be

seen that healthier samples have stronger absorption at these bands than stressed

vegetation, which indicates that they contain more water.

JM-OSP Band Selection

The original dataset from 2014 and 2015 contained a total of 2151 spectral

bands. The JM distance was calculated for each pair of bands and resulted in a total of

2151(2151-1)/2 = 2,312,325 distances as depicted in Figure 3-3. The main diagonal in the figure

represents the JM distance between same band pairs and as expected, resulted in zero

JM distances. Higher JM distances between bands suggest those bands have higher

class separability than bands with lower values, and they would be good candidates for

the band selection process. It can also be seen from the figure that spectral bands

roughly between wavelength range pairs 400 nm – 700 nm and 720 nm – 1400 nm, and

between 700 nm – 1400 nm and 1800 nm – 2500 nm have very high JM distances. A

threshold of 1.4 was applied to the data and spectral bands above this threshold (1915

in total) were selected for OSP band selection. In order to determine the appropriate

number of optimal bands required for the classification process, a stopping criterion for

the OSP algorithm was created using a cumulative average entropy approach. For the

combined 2014 and 2015 datasets, the algorithm stopped at the fifth iteration and

resulted in optimal bands at wavelengths 790 nm, 1384 nm, 695 nm, 1076 nm and 1716

nm.

ARI1 and MAREP

Two-dimensional scatter plots were created for 2014 and 2015 datasets to

assess visually how well ARI1 and MAREP worked in separating the classes, and they

are shown in Figure 3-4. From the figure, it can be seen that samples in the ambA class had the

lowest MAREP values and could easily be separated from the other classes. Following

the same principle as the 4-point interpolation REP

method, MAREP values were higher for healthier samples than for stressed samples,

which indicates higher concentrations of chlorophyll in the healthier samples. As expected, the ARI1

values for diseased samples were generally higher than those for healthy and nutrient

deficient samples. There was a slight overlap between some healthM samples and nd

samples. This was because samples in the nd class had different proportions of nutrient

deficient symptoms, with some samples having more abundances of

healthy/asymptomatic regions than stressed areas.

Classification Based On JM-OSP Bands

Because of the small sample size in each class, and to avoid bias and

ensure that results from this work would generalize well to an independent dataset,

three-fold cross validation was applied to the spectral data. Classification performance

in the individual folds of the validation process was averaged to give one set of classification

results. The spectral bands selected by the JM-OSP algorithm were used as input

features for three classifiers: quadratic discriminant analysis (QDA), neural network, and

discriminant tree. Spectral datasets from both years were not combined for the

classification process because there were slight differences in their reflectance

amplitudes. Classification results achieved for 2014 and 2015 datasets using the

aforementioned classifiers are given in Tables 3-1 to 3-6. The number of correctly

classified samples together with their corresponding percentage accuracies are

provided in the cells in the main diagonal while values in the other cells indicate

misclassification errors.

Comparing the classification results achieved by all three classifiers for 2014

dataset, given in Tables 3-1 to 3-3, it can be seen that the QDA classifier achieved the

highest overall accuracy of 91.9%, followed by the neural network classifier with an

overall accuracy of 89.2% and finally, the discriminant tree classifier with an overall

accuracy of 88.5%. Both QDA and discriminant tree had the highest accuracy for the

ambE class with an accuracy of 79.4%, and a total of 27 of its samples were

misclassified as healthM, healthY and nd. QDA and neural network both classified the

ambA class with an accuracy of 90.4% and misclassified 11 of its samples as ambE,

healthY, and nd. For the healthM class, QDA achieved the highest accuracy of 95.6%,

followed by discriminant tree with an accuracy of 89.5%, and the neural network

classifier achieved the lowest accuracy for this class with an accuracy of 89.5% and

misclassified 12 of its samples as ambE and nd. All three classifiers correctly classified

the healthY class with an accuracy over 97%, and it was the only class for which all

three classifiers achieved their highest accuracy. The neural network performed the worst in

classifying the nd class. It correctly classified 84.5% of its samples and misclassified 10

as ambE and healthM. QDA and discriminant tree, on the other hand, achieved an

accuracy of 93.9% and 93.2%, respectively.

The classification results for the 2015 dataset are given in Tables 3-4 to 3-6.

Comparing the classification performance by all three classifiers, it can be seen that the

QDA classifier again achieved the highest overall accuracy of 92.1%, followed by the

neural network classifier with an overall accuracy of 89.2% and finally, the discriminant

tree classifier with an overall accuracy of 86.1%. All three classifiers achieved an

accuracy of over 86% for the ambE class with QDA leading with an accuracy of 88.5%.

The ambA class had the overall highest accuracy among all the classes with an

accuracy of 98.2% achieved by the QDA classifier; two of its samples were

misclassified as ambE and one as nd. For the healthM class, QDA achieved the highest

accuracy of 86.7%, followed by discriminant tree with an accuracy of 78.3%, and the

neural network classifier achieved the lowest accuracy for this class with an accuracy of

86.7% and misclassified 27 of its samples as ambE, healthY, and nd.

All three classifiers correctly classified the healthY class with an accuracy over

95%. This time, discriminant tree performed the worst in classifying the nd class. It

correctly classified 81.2% of its samples while QDA and neural network achieved

accuracies of 92.9% and 85.3%, respectively.

Classification Based On ARI1 and MAREP Features

ARI1 and MAREP were used as input features for a quadratic discriminant

analysis (QDA) classifier. A QDA classifier was chosen over neural network and

discriminant tree classifiers because it performed better in mapping input data to their

respective classes; as a result, classification results from the other two

classifiers are not shown.

The classification results that were achieved for 2014 and 2015 datasets are

given in Tables 3-7 and 3-8, respectively. Again, the number of correctly classified

samples together with their corresponding percentage accuracies are given in the cells

in the main diagonal. Values in the other cells indicate misclassification errors. For 2014

dataset, the overall accuracy from the classification analysis was 98.4%. Samples

belonging to healthY, ambE, and ambA classes were all perfectly classified. From

Figure 3-4a, it can be seen there was no overlap between the ARI1- MAREP features of

the three classes and those of other classes, thus explaining why the classifier was able

to classify them efficiently. The class healthM had the lowest classification accuracy of

93.9% with a total of five samples misclassified as nd and two samples misclassified as

ambE. The classification accuracy for the nd class was 4 percentage points higher than that of the

healthM class. Two of its samples were misclassified as ambE while only one was

misclassified as healthM. One major reason some samples in the healthM and nd classes

were misclassified as ambE, nd, or healthM was that ambE and nd samples

still retained some healthy and asymptomatic regions even after symptoms began to

show on the leaves.

Even though classification results from this analysis were high, more efficient and

meaningful results could have been generated had a system capable of acquiring

spectral measurements from much smaller regions been used. Since such a dataset did

not exist at the time of this analysis, a workaround was developed, and it is discussed in

Chapter 4. Future work on the detection of the disease will include using a

hyperspectral imaging system to acquire both spectral and spatial information on a per

pixel basis from leaves. A spectroradiometer system was chosen over a hyperspectral

imaging system for this preliminary analysis because hyperspectral imaging systems

are more expensive and usually contain more noise.

For 2015 dataset, an overall accuracy of 98.4% was also achieved. Samples

belonging to the ambA class were unsurprisingly classified with an accuracy of 100%,

again because they had the most distinct features among all the classes. The healthM

class achieved an accuracy of 99.2% with one misclassification as nd. The ambE class

had the lowest classification accuracy of 96.2% with a total of five samples misclassified

as healthY. As for the nd class, 97.1% of its samples were correctly classified with five

misclassifications as ambE.

QDA Binary Classification Based On ARI1 and MAREP Features

Since the primary objective of this work was to detect AMB at the earliest stage

possible, samples belonging to the ambA class were ignored in another analysis due to

their very distinct color. ARI1 and MAREP features were chosen for the classification

process because they achieved better classification results than JM-OSP spectral

bands. From the 2014 dataset, a total of 50 samples were randomly selected from each

of the healthM, healthY, and nd classes and combined to form one class with 150

samples while all 131 samples from the ambE class were used.

For the purpose of this analysis, the combined class was termed “healthy-nd.”

Table 3-9 shows the QDA binary classification results for healthy-nd and ambE samples

for 2014. An overall accuracy of 94.7% was achieved with 96% of healthy-nd samples

correctly classified and six samples misclassified as ambE. For the ambE class, 93.1%

of its samples were correctly classified with nine misclassifications as healthy-nd. The

overall classification accuracy decreased by 3.7 percentage points when only two classes were

considered. Table 3-10 shows the binary classification for 2015 dataset.

Just as was done with the 2014 dataset, 150 samples were extracted from the

healthM, healthY and nd classes and all 130 samples in the ambE class were used. An

overall accuracy of 94.6% was achieved with 98.7% of healthy-nd samples correctly

classified and two misclassifications as ambE. For the ambE class, 90% of its samples

were correctly classified with 13 misclassifications as healthy-nd. The overall

classification accuracy decreased by 3.8 percentage points when compared to the classification accuracy

achieved with five classes.
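
The accuracies reported for the binary analysis can be reproduced from the sample counts stated above (150 healthy-nd samples in each year, 131 and 130 ambE samples, and the stated numbers of misclassifications). The sketch below is a simple arithmetic check; the function name `accuracy_summary` and the matrix layout are illustrative choices.

```python
def accuracy_summary(confusion):
    """Per-class and overall accuracy from a confusion matrix.

    confusion[i][j] is the number of class-i samples assigned to class j.
    """
    per_class = [row[i] / sum(row) for i, row in enumerate(confusion)]
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return per_class, correct / total


# Rows and columns are ordered as (healthy-nd, ambE).
confusion_2014 = [[144, 6], [9, 122]]   # 150 healthy-nd, 131 ambE
confusion_2015 = [[148, 2], [13, 117]]  # 150 healthy-nd, 130 ambE

per_class_2014, overall_2014 = accuracy_summary(confusion_2014)
per_class_2015, overall_2015 = accuracy_summary(confusion_2015)

print(round(overall_2014 * 100, 1))  # 94.7
print(round(overall_2015 * 100, 1))  # 94.6
```

Subtracting these values from the 98.4% obtained with five classes gives the decreases of 3.7 and 3.8 percentage points quoted in the text.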

Figure 3-4 provides some insight into the cause of the reduction in classification

accuracies observed in both years. From the figure, it can be seen there was some

slight overlap between ambE and healthY samples in the 2015 dataset. Color images of

ambE samples were compared with those of healthY, and it was found that some ambE

samples had asymptomatic and healthy regions the same color as healthY samples.

This suggests that those samples were initially healthy, young leaves that later

became infected by the disease. The plots also reveal that the combined healthy-nd

classes were not clustered tightly enough for the classifier to create a quadratic

boundary that could perfectly separate the two classes. The elimination of the ambA

class was another reason for the reduction in accuracy since its perfect classification

enhanced classification results in the previous analysis.

Conclusion

The primary objective of this study was to detect AMB disease at the earliest

stage possible using spectral data from two consecutive years. Features were selected

using two different methods. JM distance was combined with OSP to select five

optimal spectral bands between the red and SWIR spectral regions, while the ARI1 and

MAREP spectral indices were built from five bands in the 550 nm to 780 nm

wavelength range. This spectral region is known to be notably affected by many

plant diseases. Both types of features tested in this analysis were used in discriminating

between healthy, AMB diseased and nutrient deficient samples and were combined with

at least one of three different classifiers (QDA, neural network, and discriminant tree)

to classify samples in all five classes. Results showed that MAREP, a vegetation index

derived from the 4-point interpolation REP, efficiently separated the classes and worked

well for the early detection of AMB. Based on results achieved from this analysis, a

multispectral camera can be built using the five ARI1-MAREP spectral bands for the

detection of the disease in apple orchards. Overall, the results of this work indicate the

potential of using spectroscopic technology as a valuable non-invasive tool for the early

diagnosis of AMB.

Figure 3-1. Apple leaves used in indoor spectroradiometer analysis. A) healthy mature (healthM), B) healthy young (healthY), C) early stage AMB diseased (ambE), D) advanced stage AMB diseased (ambA), and E) molybdenum/manganese nutrient deficient (nd).

Figure 3-2. Mean reflectance spectra of samples in spectroradiometer dataset for two consecutive years. A) 2014, B) 2015. The classes include healthy mature (healthM), healthy young (healthY), early stage AMB diseased (ambE), advanced stage AMB diseased (ambA), and molybdenum/manganese nutrient deficient (nd).

Figure 3-3. JM distance matrix of spectral bands in 2014 and 2015 indoor spectroradiometer datasets.

Figure 3-4. The plot of MAREP against ARI1 for samples in spectroradiometer dataset for two consecutive years. A) 2014, B) 2015. The classes include healthy mature (healthM), healthy young (healthY), early stage AMB diseased (ambE), advanced stage AMB diseased (ambA), and molybdenum/manganese nutrient deficient (nd).

Table 3-1. QDA classification accuracy for five spectroradiometer classes in the 2014 dataset using five spectral bands selected by JM-OSP.

| Predicted \ Actual | ambE | ambA | healthM | healthY | nd | Total |
|---|---|---|---|---|---|---|
| ambE | 104 (79.4%) | 9 (7.8%) | 1 (0.9%) | 0 | 7 (4.7%) | 121 |
| ambA | 8 (6.1%) | 104 (90.4%) | 0 | 2 (1.8%) | 0 | 114 |
| healthM | 1 (0.8%) | 0 | 109 (95.6%) | 0 | 2 (1.4%) | 112 |
| healthY | 0 | 2 (1.7%) | 0 | 111 (98.2%) | 0 | 113 |
| nd | 18 (13.7%) | 0 | 4 (3.5%) | 0 | 139 (93.9%) | 161 |
| Total | 131 | 115 | 114 | 113 | 148 | 621 |

Table 3-2. Neural network classification accuracy for five spectroradiometer classes in the 2014 dataset using five spectral bands selected by JM-OSP.

| Predicted \ Actual | ambE | ambA | healthM | healthY | nd | Total |
|---|---|---|---|---|---|---|
| ambE | 102 (77.9%) | 6 (5.2%) | 8 (7.0%) | 0 | 6 (4.1%) | 122 |
| ambA | 6 (4.6%) | 104 (90.4%) | 0 | 1 (0.9%) | 0 | 111 |
| healthM | 6 (4.6%) | 0 | 98 (86.0%) | 0 | 4 (2.7%) | 108 |
| healthY | 0 | 4 (3.5%) | 0 | 112 (99.1%) | 0 | 116 |
| nd | 17 (13.0%) | 1 (0.9%) | 8 (7.0%) | 0 | 138 (93.2%) | 164 |
| Total | 131 | 115 | 114 | 113 | 148 | 621 |

Table 3-3. Discriminant tree classification accuracy for five spectroradiometer classes in the 2014 dataset using five spectral bands selected by JM-OSP.

| Predicted \ Actual | ambE | ambA | healthM | healthY | nd | Total |
|---|---|---|---|---|---|---|
| ambE | 104 (79.4%) | 8 (7.0%) | 5 (4.4%) | 0 | 15 (10.1%) | 132 |
| ambA | 8 (6.1%) | 103 (89.6%) | 0 | 2 (1.8%) | 5 (3.4%) | 118 |
| healthM | 3 (2.3%) | 0 | 102 (89.5%) | 0 | 2 (1.4%) | 107 |
| healthY | 0 | 4 (3.5%) | 0 | 110 (97.3%) | 1 (0.7%) | 115 |
| nd | 16 (12.2%) | 0 | 7 (6.1%) | 1 (0.9%) | 125 (84.5%) | 149 |
| Total | 131 | 115 | 114 | 113 | 148 | 621 |

Table 3-4. QDA classification accuracy for five spectroradiometer classes in the 2015 dataset using five spectral bands selected by JM-OSP.

| Predicted \ Actual | ambE | ambA | healthM | healthY | nd | Total |
|---|---|---|---|---|---|---|
| ambE | 115 (88.5%) | 2 (1.2%) | 2 (1.7%) | 0 | 1 (0.6%) | 120 |
| ambA | 7 (5.4%) | 163 (98.2%) | 1 (0.8%) | 3 (1.8%) | 1 (0.6%) | 175 |
| healthM | 1 (0.8%) | 0 | 104 (86.7%) | 3 (1.8%) | 8 (4.7%) | 116 |
| healthY | 0 | 0 | 3 (2.5%) | 159 (96.4%) | 2 (1.2%) | 164 |
| nd | 7 (5.4%) | 1 (0.6%) | 10 (8.3%) | 0 | 158 (92.9%) | 176 |
| Total | 130 | 166 | 120 | 165 | 170 | 751 |

Table 3-5. Neural network classification accuracy for five spectroradiometer classes in the 2015 dataset using five spectral bands selected by JM-OSP.

| Predicted \ Actual | ambE | ambA | healthM | healthY | nd | Total |
|---|---|---|---|---|---|---|
| ambE | 114 (87.7%) | 2 (1.2%) | 4 (3.3%) | 0 | 3 (1.8%) | 123 |
| ambA | 8 (6.2%) | 159 (95.8%) | 0 | 1 (0.6%) | 1 (0.6%) | 169 |
| healthM | 0 | 1 (0.6%) | 93 (77.5%) | 2 (1.2%) | 20 (11.8%) | 116 |
| healthY | 0 | 4 (2.4%) | 3 (2.5%) | 159 (96.4%) | 1 (0.6%) | 167 |
| nd | 8 (6.2%) | 0 | 20 (16.7%) | 3 (1.8%) | 145 (85.3%) | 176 |
| Total | 130 | 166 | 120 | 165 | 170 | 751 |

Table 3-6. Discriminant tree classification accuracy for five spectroradiometer classes in the 2015 dataset using five spectral bands selected by JM-OSP.

| Predicted \ Actual | ambE | ambA | healthM | healthY | nd | Total |
|---|---|---|---|---|---|---|
| ambE | 112 (86.2%) | 12 (7.2%) | 8 (6.7%) | 0 | 6 (3.5%) | 138 |
| ambA | 7 (5.4%) | 147 (88.6%) | 1 (0.8%) | 3 (1.8%) | 0 | 158 |
| healthM | 4 (3.1%) | 0 | 94 (78.3%) | 2 (1.2%) | 17 (10.0%) | 117 |
| healthY | 2 (1.5%) | 7 (4.2%) | 7 (5.8%) | 157 (95.2%) | 9 (5.3%) | 182 |
| nd | 5 (3.8%) | 0 | 10 (8.3%) | 3 (1.8%) | 138 (81.2%) | 156 |
| Total | 130 | 166 | 120 | 165 | 170 | 751 |

Table 3-7. QDA classification accuracy for five spectroradiometer classes in the 2014 dataset using ARI1 and MAREP vegetation indices.

| Predicted \ Actual | ambE | ambA | healthM | healthY | nd | Total |
|---|---|---|---|---|---|---|
| ambE | 131 (100%) | 0 | 2 (1.7%) | 0 | 2 (1.4%) | 135 |
| ambA | 0 | 115 (100%) | 0 | 0 | 0 | 115 |
| healthM | 0 | 0 | 107 (93.9%) | 0 | 1 (0.7%) | 108 |
| healthY | 0 | 0 | 0 | 113 (100%) | 0 | 113 |
| nd | 0 | 0 | 5 (4.4%) | 0 | 145 (97.9%) | 150 |
| Total | 131 | 115 | 114 | 113 | 148 | 621 |

Table 3-8. QDA classification accuracy for five spectroradiometer classes in the 2015 dataset using ARI1 and MAREP vegetation indices.

| Predicted \ Actual | ambE | ambA | healthM | healthY | nd | Total |
|---|---|---|---|---|---|---|
| ambE | 125 (96.2%) | 0 | 0 | 1 (0.6%) | 5 (2.9%) | 131 |
| ambA | 0 | 166 (100%) | 0 | 0 | 0 | 166 |
| healthM | 0 | 0 | 119 (99.2%) | 0 | 0 | 119 |
| healthY | 5 (3.8%) | 0 | 0 | 164 (99.4%) | 0 | 169 |
| nd | 0 | 0 | 1 (0.8%) | 0 | 165 (97.1%) | 166 |
| Total | 130 | 166 | 120 | 165 | 170 | 751 |

Table 3-9. QDA classification accuracy for healthy-nd and early stage AMB diseased spectroradiometer classes in the 2014 dataset using ARI1 and MAREP vegetation indices.

| Predicted \ Actual | ambE | healthy-nd | Total |
|---|---|---|---|
| ambE | 122 (93.1%) | 6 (4%) | 128 |
| healthy-nd | 9 (6.9%) | 144 (96%) | 153 |
| Total | 131 | 150 | 281 |

Table 3-10. QDA classification accuracy for healthy-nd and early stage AMB diseased spectroradiometer classes in the 2015 dataset using ARI1 and MAREP vegetation indices.

| Predicted \ Actual | ambE | healthy-nd | Total |
|---|---|---|---|
| ambE | 117 (90%) | 2 (1.3%) | 119 |
| healthy-nd | 13 (10%) | 148 (98.7%) | 161 |
| Total | 130 | 150 | 280 |

CHAPTER 4

QUANTITATIVE ANALYSIS AND SPECTRAL UNMIXING OF EARLY STAGE APPLE MARSSONINA BLOTCH DISEASE

Background

In chapter 3, qualitative analysis of five classes of apple leaves was performed

using spectral data acquired from a spectroradiometer system. These classes included

healthy mature (healthM), healthy young (healthY), early stage AMB diseased (ambE),

advanced stage AMB diseased (ambA) and nutrient deficient (nd). It was reported that

data collected from all three stressed classes were a spectral mixture of at least two

regions (or colors) with possibly very distinct spectral characteristics. The spectral

analysis results in Figure 3-4 showed the ARI1 and MAREP features for the two

diseased classes varied more widely than the other three classes having only one

color—green. This occurred as a result of combining samples with various levels of

infestation into one class. Since the main objective of this research was to detect AMB

at the earliest stage possible so as to allow growers to apply fungicides more precisely to

areas of their field infected by various degrees of the disease, methods capable of

estimating the extent of disease infestation were analyzed.

The spectroradiometer is one of the most popular spectroscopic systems used

to detect plant stress by measuring crop spectral reflectance. It has several advantages

including ease of use, efficiency and portability. However, some spectroradiometer

systems, like the one utilized in this work, are only capable of extracting one spectral

signature for several regions on an object being sensed. This poses a problem when

the aim is to analyze areas with different chemical properties separately. Optimization

algorithms, such as particle swarm optimization (PSO), have been successfully used by

some researchers in extracting endmembers from spectrally mixed data (Omran et al.,

2006; Zhang et al., 2011). PSO is preferred to some of the other well-known

endmember extraction algorithms due to its ease of use, efficiency and robustness in

solving optimization problems.

The main objectives of this work were to (i) develop a model capable of

performing quantitative analysis on previously named “early stage diseased” samples

and (ii) create an optimal method for unmixing mixed spectral features for early stage

AMB diseased samples.

Methods

Abundance Estimation of Early Stage AMB Diseased Data

Early stage AMB diseased samples had both brown and green colored regions.

A number of steps were taken to arrive at the abundance estimation for both colored

regions (Figure 4-1). After spectral measurements had been taken, a white ring was

used in marking regions from which spectral data had been acquired. A digital camera

(Canon EOS 5D, Canon, Japan), with a focal length of 24 mm and an exposure time of

0.04 s, was used to acquire color images of leaf samples. The Hough transform

was used to extract only the regions marked by the circular ring. To allow for

easy segmentation of brown pixels, green pixels were first masked using the ratio of the

red channel to the green channel: pixels with a ratio less than or equal

to a threshold of 0.5 were treated as green. This threshold was chosen because it gave the best separation

between the green pixels and other colors in the histogram plot.

Nineteen color and texture features were then created and used as input for a

supervised k-nearest neighbor (kNN) classifier. The classifier used a Euclidean

distance metric and a neighborhood size of four for segmenting ambE color images.
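The masking and classification steps described above can be illustrated with a minimal sketch. It assumes 8-bit RGB tuples and a toy two-feature training set. The function names, the fill value, and the sample data are hypothetical, and the 19 color and texture features used in this work are not reproduced.

```python
import math
from collections import Counter


def is_green(pixel, threshold=0.5):
    """True when the red-to-green ratio is at or below the threshold."""
    red, green, _blue = pixel
    if green == 0:
        return False  # assumption: no green signal means "not green"
    return red / green <= threshold


def mask_green(image, threshold=0.5, fill=(0, 0, 0)):
    """Replace green pixels with a fill value so only other colors remain."""
    return [[fill if is_green(p, threshold) else p for p in row] for row in image]


def knn_predict(train_features, train_labels, query, k=4):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    distances = (math.dist(f, query) for f in train_features)
    ranked = sorted(zip(distances, train_labels))
    votes = Counter(label for _, label in ranked[:k])
    # On a tied vote, the class encountered first (the nearest one) is returned.
    return votes.most_common(1)[0][0]


# Hypothetical 2 x 2 RGB patch: two green pixels, one brown, one white (ring).
patch = [[(40, 120, 30), (120, 80, 40)],
         [(50, 100, 20), (250, 250, 250)]]
masked = mask_green(patch)

# Hypothetical two-feature training set standing in for the 19 features.
features = [(0.90, 0.10), (0.80, 0.20), (0.85, 0.15),
            (0.10, 0.90), (0.20, 0.80), (0.15, 0.85)]
labels = ["brown", "brown", "brown", "green", "green", "green"]
predicted = knn_predict(features, labels, (0.80, 0.10))
```

With an even neighborhood size such as four, tied votes are possible; the tie-breaking rule in this sketch is an assumption, since the text does not state how ties were resolved.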

Endmember Extraction from Early Stage AMB Diseased Data

Particle swarm optimization is a methodology in evolutionary computation that

was invented by Eberhart and Kennedy in 1995 (Eberhart & Kennedy, 1995). It is a

stochastic algorithm inspired by the social behavior of fish schooling and bird flocking.

The algorithm simulates the social behavior of humans and insects; individuals can

interact with one another while also learning from their experiences and with time, the

population members move into more suitable regions in the problem space. Just like

the genetic algorithm (GA) and other heuristic tools, PSO is randomly initialized with a set of

potential solutions and iteratively searches for an optimal solution by updating the

population in each iteration. Unlike GA, however, PSO does not make use of the

selection, crossover and mutation evolution operators in its implementation. In the PSO

algorithm, each particle represents a potential solution and flies through a

multidimensional search space by following the current optimum particles, keeping a

record of its current position as well as the best position it has achieved so far. Aside

from the personal best positions obtained by each particle, the algorithm also finds a

global best position for all particles in the search space, and it can be referred to as the

“best” of the personal bests. Particles are initialized with random positions and velocities

with the velocity of each particle adjusted according to its flying experience and those of

other particles. The position of each particle is updated according to Euler’s integration

equation. The velocity and position of each particle are modified according to the

equations below:

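$$v_i^{(t+1)} = \omega\, v_i^{(t)} + C_1\,\mathrm{rand}\,\bigl(p_{best,i} - x_i^{(t)}\bigr) + C_2\,\mathrm{rand}\,\bigl(g_{best} - x_i^{(t)}\bigr) \tag{4-1}$$

$$x_i^{(t+1)} = x_i^{(t)} + v_i^{(t+1)} \tag{4-2}$$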

where $v_i$ = velocity of the $i$th particle

$x_i$ = position of the $i$th particle

$i$ = particle index

$t$ = discrete time index

$p_{best,i}$ = personal best position of the $i$th particle

$g_{best}$ = global best of all particles

$\mathrm{rand}$ = uniformly distributed number between 0 and 1

$C_1, C_2$ = weighting factors. In most situations $C_1 = C_2 = 2$

$\omega$ = inertia function. Values close to one facilitate global exploration while values

close to zero facilitate a local exploration. The algorithm performs best if the

inertia function linearly decreases through the course of the implementation of

the algorithm.
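The velocity and position updates defined above can be sketched as a small, self-contained routine. This is an illustrative sketch and not the implementation used in this work. The swarm size, iteration count, inertia range (0.9 to 0.4), velocity clamp, and position bounds are all assumptions added to keep the example stable and runnable.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iterations=200,
                 c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    """Minimize `objective` over the box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Assumption: velocities are clamped to half the width of each bound.
    v_max = [0.5 * (hi - lo) for lo, hi in bounds]

    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[rng.uniform(-vm, vm) for vm in v_max]
                  for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_cost = [objective(p) for p in positions]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for t in range(n_iterations):
        # Inertia decreases linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * t / max(n_iterations - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                lo, hi = bounds[d]
                # Velocity update: inertia + personal-best pull + global-best pull.
                v = (w * velocities[i][d]
                     + c1 * rng.random() * (pbest[i][d] - positions[i][d])
                     + c2 * rng.random() * (gbest[d] - positions[i][d]))
                v = max(-v_max[d], min(v_max[d], v))
                velocities[i][d] = v
                # Position update, kept inside the search bounds.
                positions[i][d] = max(lo, min(hi, positions[i][d] + v))
            cost = objective(positions[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = positions[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = positions[i][:], cost
    return gbest, gbest_cost


# Usage on a simple test function (sum of squares, minimum at the origin).
best_position, best_cost = pso_minimize(
    lambda p: sum(x * x for x in p), bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

A separate random number is drawn for each of the two attraction terms and for each dimension, which is a common convention but is not specified in the text.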

In this work, PSO was used in spectrally unmixing brown and green

endmembers’ reflectance data corresponding to spectral bands used in computing ARI1

and MAREP. The linear unmixing model was used in relating abundances of each

colored region to the reflectance spectrum generated by the spectroradiometer system.

The linear mixing model for a mixed spectrum, $X_{mixed}$, containing two endmembers is

given by:

$$X_{mixed} = f_{c_1}\, a_{c_1} + f_{c_2}\, b_{c_2} \tag{4-3}$$

where $f_{c_1}$ represents the fractional abundance for class $c_1$ and $f_{c_2}$ represents the fractional

abundance for class $c_2$, while $a_{c_1}$ and $b_{c_2}$ represent the unmixed reflectance data for class

$c_1$ and class $c_2$, respectively.

In order to get the spectral reflectance for each of the classes, the objective function, $J$, had to be minimized using the least squares equation given below.

$$J = \frac{1}{2}\,\bigl(X_{mixed} - (f_{c_1} a_{c_1} + f_{c_2} b_{c_2})\bigr)^{T}\,\bigl(X_{mixed} - (f_{c_1} a_{c_1} + f_{c_2} b_{c_2})\bigr) \tag{4-4}$$
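The least squares objective can be written out directly for spectra stored as lists of band reflectances. The sketch below is illustrative only: the endmember spectra and abundances are hypothetical values, and summing the objective over several samples is an assumption about how candidate endmembers might be scored, not a description of the procedure used here.

```python
def unmixing_cost(x_mixed, f_c1, f_c2, a, b):
    """J = 1/2 * r^T r, where r = x_mixed - (f_c1 * a + f_c2 * b)."""
    residual = [x - (f_c1 * a_k + f_c2 * b_k)
                for x, a_k, b_k in zip(x_mixed, a, b)]
    return 0.5 * sum(r * r for r in residual)


# Hypothetical endmember reflectances at five bands (not measured values).
brown = [0.10, 0.12, 0.20, 0.30, 0.35]
green = [0.08, 0.05, 0.25, 0.45, 0.50]

# Hypothetical (brown, green) abundances for three mixed samples.
abundances = [(0.1, 0.9), (0.3, 0.7), (0.5, 0.5)]
mixed_spectra = [[f1 * a_k + f2 * b_k for a_k, b_k in zip(brown, green)]
                 for f1, f2 in abundances]


def total_cost(candidate_a, candidate_b):
    """Assumption: candidate endmembers are scored by summing J over samples."""
    return sum(unmixing_cost(x, f1, f2, candidate_a, candidate_b)
               for x, (f1, f2) in zip(mixed_spectra, abundances))


cost_at_truth = total_cost(brown, green)
cost_perturbed = total_cost([v + 0.05 for v in brown], green)
```

The cost vanishes when the candidate endmembers reproduce every mixed spectrum and grows as they move away, which is the property an optimizer such as PSO exploits.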

Regression Models

Partial least squares regression (PLSR) and stepwise multiple linear regression (SMLR) were used to build prediction models for the

quantitative assessment of early stage AMB samples. The models were developed in SPSS Statistics 23 (IBM, Armonk, NY, USA).


Spectral Unmixing of Brown and Green Endmembers for Early Stage AMB Diseased Samples

Before a kNN classifier was applied to the 19 color and texture feature images, a

couple of preprocessing steps were applied to the color images. The Hough transform

was used to extract regions of interest, while green pixels were masked for

easy segmentation of brown pixels using the ratio of the red channel to the green

channel. A threshold of 0.5 was chosen for green pixel masking after studying the band

ratio histogram. A summary of the steps is shown in Figure 4-2 using one of the

samples in the ambE dataset. Even after the application of the Hough transform, there

were still some remnants of the white ring on some of the images. The green pixels were

also not completely removed after applying the mask. As a result, five classes were

defined for the segmentation of brown pixels. The five classes were green, brown, vein,

background, and ring. The background class represented the previously extracted green

pixel mask. The kNN algorithm assigned a unique color to each of the classes and the

light blue region in Figure 4-2c indicates regions in the image containing brown pixels.

Abundance estimation for the brown subclass was computed by dividing the sum of

brown pixels by the total number of pixels within the region of interest. In order to

simplify computation, all the other pixels representing vein and green regions were

regarded as the “green” class. For the 2014 dataset, the estimated abundance of the brown

class ranged roughly between 0.015 and 0.58, while for 2015 it ranged between

0.00006 and 0.21. The formula used in estimating the abundance, $A$, for each subclass

is given below:

$$A = \frac{S_c}{S_T} \tag{4-5}$$

where $S_c$ and $S_T$ represent the total number of pixels enclosed by a particular subclass

and the total number of pixels enclosed by the entire region of interest, respectively.
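The abundance calculation amounts to a pixel count per subclass divided by the pixel count of the region of interest. A minimal sketch follows, assuming the segmentation result is available as a flat list of class labels for the pixels inside the region of interest. The `merge` argument and the sample labels are hypothetical.

```python
from collections import Counter


def subclass_abundance(labels_in_roi, merge=None):
    """Return A = S_c / S_T for every subclass found in the region of interest."""
    merge = merge or {}
    # Optionally fold some labels into another subclass before counting.
    counts = Counter(merge.get(label, label) for label in labels_in_roi)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical segmentation of a 10-pixel region of interest.
roi_labels = ["brown"] * 2 + ["green"] * 7 + ["vein"]
abundance = subclass_abundance(roi_labels, merge={"vein": "green"})
```

Here vein pixels are folded into the green subclass, mirroring the simplification described in the text.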

Before the PSO algorithm was applied to the ambE dataset, its parameters were

tuned using a spectral dataset with known mixed features, endmembers, and

abundances. This was done so as to ensure the algorithm would produce accurate

results for the new dataset. To guarantee stable results, the algorithm was run 1000

times on reflectance data at the five spectral bands used in computing ARI1 and

MAREP indices. After PSO analysis, ARI1 and MAREP spectral indices were calculated

for early stage AMB asymptomatic (green subclass), and symptomatic (brown subclass)

endmembers and Figure 4-3 shows plots of MAREP against ARI1 for 2014 and 2015

datasets. Asymptomatic and healthM classes had some slight overlap in 2014 while

healthY and asymptomatic classes had some overlap in 2015. The overlap resulted from

the similarity in color of the three classes. As expected, there was no overlap

between the symptomatic class and the other classes as a result of its distinct color.

These results from the PSO analysis indicate that mixed spectral data acquired from a

system, such as the one used in this analysis, can be separated and more efficiently

analyzed.

Quantitative Analysis of Early Stage AMB Diseased Samples

Both 2014 and 2015 datasets were combined, and the abundances estimated

using the kNN classifier were used as degrees of infestation for ambE. A total of 20

healthy samples were also included in the analysis to represent zero disease infestation

while all 261 ambE samples were used. Table 4-1 shows some statistical measures

that were computed for early stage disease analysis. A total of 140 samples had

disease severity levels less than 4%, 92 samples had severity levels between 4% and

15% while 49 samples had severity levels over 15%. Disease severity ranged between

0 and about 58%.

A PLSR analysis was performed using six components, and a coefficient of

determination (R²) of 0.74 was achieved. The regression coefficients, BETA, were

plotted against the full wavelength range to extract bands with the highest discriminatory

power. As was done by Jones et al. (2010), a threshold of 0.005 was set on the absolute value of the coefficients,

and any spectral bands above this limit were retained. A total of 143 bands

met this criterion and were extracted for further analysis using SMLR. The dataset

contained a total of 281 samples; two-thirds of the samples were used as calibration

data while one-third was used for validation. The stepping method criteria were set to a

p-value of 0.05 for entry and a p-value of 0.1 for removal. A total of nine bands were

selected, and they were: 673 nm, 367 nm, 690 nm, 1655 nm, 988 nm, 1669 nm, 996

nm, 1415 nm and 1407 nm. The calibration dataset achieved an R² of 0.76 while an R²

of 0.73 was achieved for the validation dataset. As demonstrated in Figure 4-4, these

results show some promise in using predictive models in quantifying disease severity.

Future work will include repeating prediction analysis using degrees of infestation

extracted from more accurate abundance estimation algorithms.
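The band screening and data split described in this section can be sketched as follows. This is a hedged illustration: the regression itself was carried out in SPSS and is not reproduced, the shuffled split is an assumption (the text does not state how samples were assigned), and the coefficient values in the usage example are hypothetical.

```python
import random


def select_bands(wavelengths, beta, threshold=0.005):
    """Keep wavelengths whose regression coefficient exceeds the threshold
    in absolute value."""
    return [w for w, b in zip(wavelengths, beta) if abs(b) > threshold]


def split_samples(samples, calibration_fraction=2 / 3, seed=0):
    """Assumption: samples are shuffled before the two-thirds / one-third split."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_calibration = round(len(shuffled) * calibration_fraction)
    return shuffled[:n_calibration], shuffled[n_calibration:]


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical coefficients for three bands; only two pass the threshold.
kept = select_bands([673, 690, 700], [0.012, -0.008, 0.001])

# Splitting 281 sample indices gives 187 calibration and 94 validation samples.
calibration, validation = split_samples(range(281))
```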

Conclusion

The main objective of this study was to detect AMB disease at the earliest stage

possible using spectral data from two consecutive years. Abundance estimation and

spectral unmixing algorithms were developed using a kNN classifier and PSO algorithm.

PSO showed promising results in spectrally unmixing early stage AMB diseased

samples by creating symptomatic and asymptomatic endmembers from previously

mixed spectral data. Two prediction models, PLSR and SMLR, were combined to quantify various degrees

of infestation in early-stage AMB samples. A total of nine spectral bands, spanning the

visible to SWIR wavelength range, were identified as significant in predicting disease

severity, and a validation R² of 0.73 indicated a moderately good fit of the model. Future work on

this research will include collecting spectral reflectance information of leaf samples

using a system that can acquire data on a per pixel basis, such as a hyperspectral

imaging system, and investigating more efficient methods for disease abundance

estimation, spectral unmixing, and regression analysis. Overall, the results of this work

indicate the potential of using spectroscopic technology as a valuable non-invasive tool

for the early diagnosis of AMB.

Figure 4-1. Flowchart showing steps for extraction of brown and green colored pixel abundances in early-stage AMB diseased samples.

Step 1: Spectral measurements (350 - 2500 nm range)

Step 2: RGB image acquisition of leaf samples

Step 3: Extraction of circular region of interest using Hough transformation method

Step 4: Development of a method to mask green pixels

Step 5: Creation of color and texture features

Step 6: Abundance estimation using supervised kNN classifier

Figure 4-2. Steps taken in segmenting early stage AMB diseased color images. A) extraction of region of interest using circle Hough transform method, B) masking of green pixels using ratio of red and green channels, C) kNN image segmentation, where light blue regions indicate locations of brown pixels with a total abundance estimation of about 0.58.

Figure 4-3. The plot of MAREP against ARI1 for classes in spectroradiometer dataset for two consecutive years. A) 2014, B) 2015. The classes include healthy mature (healthM), healthy young (healthY), early stage AMB asymptomatic (ambEGr), and early stage AMB symptomatic (ambEBr). Asymptomatic and symptomatic endmembers were calculated using a combination of kNN abundance estimation and particle swarm optimization.

Table 4-1. Disease severity statistical analysis results for 20 healthy and 261 early stage AMB diseased samples from 2014 and 2015 spectroradiometer datasets.

| Disease Severity (%) | Number of samples | Min (%) | Max (%) | Mean (%) | Standard Deviation (%) |
|---|---|---|---|---|---|
| < 4 | 140 | 0 | 3.9 | 1.4 | 1.2 |
| 4 - 15 | 92 | 4.1 | 14.8 | 8.5 | 3.2 |
| > 15 | 49 | 15.1 | 57.9 | 23.5 | 9.1 |

Figure 4-4. Predicted versus actual disease severity for validation dataset using a combination of PLSR and SMLR prediction models. The classes that were used in the analysis were healthy and early stage AMB diseased samples from 2014 and 2015 datasets.

CHAPTER 5

HYPERSPECTRAL IMAGE ANALYSIS OF EARLY STAGE APPLE MARSSONINA BLOTCH DISEASE USING SEQUENTIAL MAXIMUM ANGLE CONVEX CONE ALGORITHM

Background

Optical sensing technologies, such as hyperspectral imaging, are gaining

popularity as tools that can be used for quality assessment in the agricultural field.

Hyperspectral imaging is a quick, non-destructive and cost-effective technique for

detecting plant diseases. Hyperspectral images contain an abundance of information

that can be simplified to identify plant diseases. Researchers have been successful in

using the hyperspectral imaging technique for the identification of plant diseases. Qin

et al. (2009) discovered four wavelengths (553, 677, 718, and 858 nm) that could be

used to recognize citrus canker in grapefruits with 92.7% accuracy. Additionally,

Penicillium digitatum in mandarin was successfully identified with an accuracy above

91% when using 20 wavelength bands (Gómez-Sanchis et al., 2008). Zhang et al.

(2005) succeeded in using five vegetation indices to detect late blight disease in tomato

leaves. They were able to separate healthy leaves from diseased ones before any

economic damage occurred. Hyperspectral imaging is also being used to detect

diseases in wheat, maize, and wine grapes (Del Fiore et al., 2010; Graeff et al., 2006;

Huang et al., 2007; Muhammed & Larsolle, 2003; Naidu et al., 2009).

The main objective of this work was to evaluate the potential of using a

hyperspectral imaging system for the identification of apple Marssonina blotch disease.

The specific objectives were to determine optimal spectral features and bands for early

disease detection and to develop a detection algorithm for early stage AMB disease

detection.

Materials and Methods

Outdoor Hyperspectral Data Acquisition

Hyperspectral images of leaf samples used in this work were acquired from Fuji

apple trees in an experimental apple orchard located in Gunwi-city, Korea. A test area

measuring 40 m x 60 m, with a total of 260 trees, was set aside for the experiment.

Datasets used in this analysis were acquired between the months of August and

October 2014. A hyperspectral camera (Spectral camera PS-V10E, SPECIM Inc.,

Finland) was used to acquire hyperspectral images for the outdoor hyperspectral

dataset. The camera consisted of an imaging spectrograph covering the spectral range

of 400 nm to 1000 nm and a sensitive, high-speed interlaced CCD detector. The spectral

resolution of the camera was set to 2.8 nm, and its output was saved as digital 12-bit

files.

Unlike the spectroradiometer system referred to in previous sections, the

hyperspectral imaging system was able to capture both spatial and spectral information

on a per-pixel basis and allowed for pixel-by-pixel classification. The system was set up

with the hyperspectral camera mounted on a tripod, and a dark cloth was placed on the

ground to prevent weeds from being displayed in the image. Tree branches containing

about fifteen leaves with healthy, AMB asymptomatic and AMB symptomatic regions

were imaged. Figure 5-1 shows the RGB composite of the outdoor hyperspectral image.

Each hyperspectral image measured about 975 (spatial) x 696 (spatial) x 519 (spectral)

dimensions.

Image Preprocessing

Flat field correction, using a 99% white reflectance standard and a dark current

measurement, was used to normalize each hyperspectral image to unit reflectance

using the normalization formula given below:

$$R_{ctd} = \frac{R_{raw} - R_{dark}}{R_{white} - R_{dark}} \tag{5-1}$$

where $R_{ctd}$ is the corrected reflectance image, $R_{raw}$ is the original uncorrected image, and $R_{dark}$ and $R_{white}$ are the mean radiance spectral values of regions of interest extracted from dark current and white reflectance images, respectively.
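Applied to a single pixel, the normalization is a band-by-band ratio. The sketch below assumes each spectrum is a list of per-band values; the function name and the numbers in the usage example are hypothetical.

```python
def flat_field_correct(raw, dark, white):
    """Band-wise reflectance: (raw - dark) / (white - dark)."""
    corrected = []
    for r, d, w in zip(raw, dark, white):
        if w == d:
            # Assumption: a band with no usable white reference is an error.
            raise ValueError("white and dark references are equal for a band")
        corrected.append((r - d) / (w - d))
    return corrected


# Hypothetical digital numbers for three bands of one pixel.
reflectance = flat_field_correct(raw=[300, 1100, 2100],
                                 dark=[100, 100, 100],
                                 white=[2100, 2100, 4100])
```

In practice the same mean dark and white spectra would be applied to every pixel of the image.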

After images were calibrated to reflectance, a Savitzky-Golay filter, with a

second-degree polynomial and a filter width of 7, was used to smooth and minimize

noisy signals in the images. All hyperspectral images were spectrally subsetted so as to

remove noisy bands. A total of 429 spectral bands between the wavelength range of

400 nm to 1000 nm were retained after spectrally subsetting the images. A background

mask was created using a combination of support vector machine (SVM) and texture

filters. All hyperspectral image analyses, including the preprocessing steps, were

performed using a combination of ENVI 5.2 (Exelis Visual Information Solutions Inc.,

Boulder, CO, USA) and MATLAB R2014a (Version 8.3, The MathWorks Inc., Natick,

MA, USA).
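For a filter width of 7 and a second-degree polynomial, Savitzky-Golay smoothing reduces to a fixed weighted moving average with the standard coefficients (-2, 3, 6, 7, 6, 3, -2)/21. The following sketch applies those weights to one spectrum. It is an illustration rather than the ENVI or MATLAB routine used in this work, and leaving the first and last three bands unchanged is an assumption about edge handling.

```python
# Standard 7-point quadratic Savitzky-Golay smoothing weights (sum = 21).
SG_WEIGHTS_7 = (-2, 3, 6, 7, 6, 3, -2)
SG_NORM_7 = 21


def savgol_smooth_7(spectrum):
    """Smooth a spectrum with a 7-point, second-degree Savitzky-Golay filter."""
    half = len(SG_WEIGHTS_7) // 2
    smoothed = list(spectrum)  # assumption: edge bands are left unsmoothed
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        smoothed[i] = sum(c * v for c, v in zip(SG_WEIGHTS_7, window)) / SG_NORM_7
    return smoothed


# A quadratic signal passes through unchanged; an isolated spike is damped.
quadratic = [x * x for x in range(10)]
smoothed_quadratic = savgol_smooth_7(quadratic)
smoothed_spike = savgol_smooth_7([0, 0, 0, 21, 0, 0, 0])
```

Because the filter fits a local second-degree polynomial, broad spectral features are preserved while narrow noise spikes are reduced.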

Vegetation Indices

Over ten vegetation indices (VIs) were investigated for the outdoor hyperspectral

dataset, but only a combination of modified triangular vegetation index (MTVI) and

matrix-adjusted red edge position (MAREP) efficiently separated the spectral classes.

Haboudane et al. (2004) created a modified version of the triangular vegetation

index (TVI) which was first introduced by Broge and Leblanc (2001). The concept

behind TVI, according to Broge and Leblanc (2001) is that the total area of the triangle,

defined by the green peak, the NIR shoulder, and the minimum reflectance in the red

region, will increase as a result of chlorophyll absorption (decrease in red reflectance)

and leaf tissue abundance (increase of NIR reflectance). Modified triangular vegetation

index (MTVI) makes TVI a better predictor of green leaf area index (LAI) by replacing

the 750 nm wavelength with 800 nm. MTVI is stated mathematically as the following

equation:

$$MTVI = 1.2\,\bigl[\,1.2\,(\rho_{800} - \rho_{550}) - 2.5\,(\rho_{670} - \rho_{550})\,\bigr] \tag{5-2}$$

where $\rho_{800}$, $\rho_{550}$, and $\rho_{670}$ represent reflectance at 800 nm, 550 nm, and 670 nm,

respectively.
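The index is a direct arithmetic combination of three band reflectances, as the short sketch below shows. The function name and the example reflectance values are hypothetical.

```python
def mtvi(rho_800, rho_550, rho_670):
    """Modified triangular vegetation index from three band reflectances."""
    return 1.2 * (1.2 * (rho_800 - rho_550) - 2.5 * (rho_670 - rho_550))


# Hypothetical reflectances for a green leaf pixel: high NIR, low red.
example_index = mtvi(rho_800=0.50, rho_550=0.10, rho_670=0.05)
```

For these inputs the index is approximately 0.726; a lower NIR reflectance or a higher red reflectance reduces the value.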

Matrix-adjusted red edge position (MAREP) is a vegetation index based on the 4-

point interpolation red edge position algorithm. The red edge in vegetation spectra

refers to the region of abrupt change in reflectance (also known as the point of

inflection) close to the near infrared (NIR) region. Stressed vegetation typically has this

point of inflection shifted towards shorter wavelengths in the visible spectral range and

this phenomenon is referred to as blue shift. More detailed information about MAREP

can be found in chapter 3. The MAREP algorithm used in this analysis was adjusted

slightly by removing the summation signs given in equation 3-3 and calculating the

MAREP index for each pixel without regarding the combined effect of a pixel’s cluster.

This was done because the modified version of the algorithm produced better

separation results between classes than when clustering analysis was combined with

MAREP. The modified version of the algorithm used in this chapter was named

MAREP2 so as to distinguish it from the version discussed in chapter 3.

Sequential Maximum Angle Convex Cone (SMACC)

In order to identify healthy, AMB asymptomatic and AMB symptomatic leaf

regions in the outdoor hyperspectral image, a model capable of distinguishing between

healthy and diseased pixels had to be utilized. In this work, the sequential maximum

angle convex cone (SMACC) algorithm was used to identify healthy, asymptomatic and

symptomatic regions on a cluster of leaves. SMACC is an endmember extraction

algorithm that simultaneously finds endmembers and their respective abundances in

hyperspectral images (Gruninger et al., 2004). SMACC uses a convex cone model for

representing vector data. The technique finds extreme vectors within a dataset and

uses these extreme vectors as endmembers. It first finds the endmember with the

highest intensity; the next endmember is the one most extreme relative to

the first. Subsequent endmembers are chosen as the pixels most different from

the already found endmembers. If no predefined number of endmembers is stated, the

algorithm continues the search until an endmember pixel already accounted for in the

previous group is found. SMACC uses the following equation in finding each

endmember, H:

$$H_{c,i} = \sum_{k=1}^{N} L_{c,k}\, f_{k,j} \tag{5-3}$$

where i is the pixel index. j and k are endmember indices from 1 to the expansion

length, N. L is a matrix that contains the endmember spectra as columns. c is the

spectral channel index. f is a matrix that contains the fractional contribution (abundance)

of each endmember j in each endmember, k, for each pixel.

Kullback-Leibler Divergence (KLD)

Kullback-Leibler divergence (KLD) is a dissimilarity measure between two

probability distributions and there are both asymmetric and symmetric versions of KLD.

It is not a true metric, but it is still often used to measure the distance between two

probability distributions. In the context of hyperspectral band selection, it can tell how

different two image bands are. Bands that are less correlated have higher KLD values

and vice versa. The symmetric KLD was used as a dissimilarity measure for a

hierarchical clustering band selection process, and it is given as:

$$D_{KL}(X_i, X_j) = \sum_{x \in \Omega} p_i(x) \log\frac{p_i(x)}{p_j(x)} + \sum_{x \in \Omega} p_j(x) \log\frac{p_j(x)}{p_i(x)} \tag{5-4}$$

where $X_i$ and $X_j$ are a pair of discrete random variables defined on a space $\Omega$, and $i$ and $j$

are two bands of a hyperspectral image. $p_i(x)$ and $p_j(x)$ are the probability distributions

of variables in bands $i$ and $j$, respectively. $D_{KL}(X_i, X_j)$ requires that $p_i(x)$ and $p_j(x)$

be absolutely continuous with respect to each other. $D_{KL}(X_i, X_j)$ is always nonnegative

and results in zero when the probability distributions $p_i(x)$ and $p_j(x)$ are the same.
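The symmetric divergence can be computed from the normalized gray-level histograms of two bands. The sketch below is illustrative: the small `epsilon` added to every bin is an assumption used to satisfy the requirement that neither distribution is zero where the other is positive, and the histogram counts are hypothetical.

```python
import math


def normalize(counts, epsilon=1e-12):
    """Turn histogram counts into probabilities, avoiding exact zeros."""
    smoothed = [c + epsilon for c in counts]  # assumption: additive smoothing
    total = sum(smoothed)
    return [s / total for s in smoothed]


def symmetric_kld(p, q):
    """Sum of the two one-directional Kullback-Leibler divergences."""
    return sum(p_x * math.log(p_x / q_x) + q_x * math.log(q_x / p_x)
               for p_x, q_x in zip(p, q))


# Hypothetical four-bin histograms for two image bands.
band_i = normalize([10, 30, 40, 20])
band_j = normalize([25, 25, 25, 25])
divergence = symmetric_kld(band_i, band_j)
```

The value is zero for identical histograms and is unchanged when the two bands are swapped, which is what makes the symmetric form usable as a dissimilarity measure for clustering.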

Just as was done in chapter 3 with the OSP algorithm, JM distance was also

used in removing redundant bands before the KLD hierarchical clustering band

selection process.

Support Vector Machine (SVM) Classification

A support vector machine (SVM) classifier with a radial basis function kernel was designed for the classification process. SVM classifiers are known to yield good classification results for complex and noisy data. The gamma parameter used in the kernel function was set to the inverse of the number of spectral features used in the classification process. A neural network classifier was also tested on the hyperspectral data, but SVM performed better in mapping input data to their respective classes; as a result, the neural network classifier was not included in this work. SVM in its most basic form is a binary classifier and cannot perform multi-class classification in a single stage. To classify the hyperspectral data, a series of SVM classifiers was built in ENVI, one for each pair of classes, and their results were combined to give one set of classification results.
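The two ideas in this paragraph, a radial basis function kernel with gamma set to the inverse of the number of features and a pairwise (one-vs-one) combination of binary decisions, can be illustrated with the sketch below. It is not the ENVI implementation: the binary rule is a hypothetical stand-in for a trained SVM, majority voting is assumed as the combination rule, and the prototype values are invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations

def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def make_pair_rule(label_a, proto_a, label_b, proto_b, gamma):
    """Stand-in for one trained binary SVM (hypothetical, not a real SVM):
    it returns the label whose prototype is more similar under the kernel."""
    def decide(x):
        if rbf_kernel(x, proto_a, gamma) >= rbf_kernel(x, proto_b, gamma):
            return label_a
        return label_b
    return decide

def one_vs_one_predict(pair_rules, x):
    """Combine the pairwise decisions by majority vote."""
    votes = Counter(rule(x) for rule in pair_rules)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature prototypes for the five leaf classes.
prototypes = {
    "non-symp 1": (0.90, 0.80),
    "non-symp 2": (0.70, 0.60),
    "non-symp 3": (0.50, 0.45),
    "symp 1": (0.30, 0.30),
    "symp 2": (0.10, 0.10),
}
n_features = 2
gamma = 1.0 / n_features  # inverse of the number of input features

# One binary rule per pair of classes: 5 classes give 10 pairs.
pair_rules = [
    make_pair_rule(a, prototypes[a], b, prototypes[b], gamma)
    for a, b in combinations(prototypes, 2)
]
print(len(pair_rules))
print(one_vs_one_predict(pair_rules, (0.88, 0.79)))
```

With K classes the pairwise scheme needs K(K-1)/2 binary classifiers, so the number of classifiers grows quadratically with the number of classes.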


SVM and Texture Filter Background Masking

The original 2014 outdoor hyperspectral image contained leaves and background

(branches, apple fruit, reflectance standards, etc.). In order to ensure the SMACC

endmember extraction algorithm would only find leaf endmembers in the image, the

background was masked using a two-stage SVM and occurrence texture filter process.

At first, supervised classification was performed using SVM to mask the background,

but not all the regions were successfully masked as shown in Figure 5-2.

Five occurrence filters were then applied to the SVM-masked image and their results were compared. The filters were data range, mean, entropy, variance and skewness. Each filter calculates image texture in every 3 x 3 processing window from the number of occurrences of each gray level in the window. Due to the large size of the hyperspectral image, only the band image at 450.8 nm was used in this process, and the results are shown in Figure 5-3. The figure shows that the entropy image emphasizes the leaf regions more than the other texture images; as a result, it was chosen for the second stage of the masking process. The final result after applying the entropy image mask is given in Figure 5-4. Together, the SVM and the entropy texture filter effectively masked the background regions.
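The five occurrence measures can be sketched for a single-band image as shown below. This is an illustrative example, not the ENVI texture filter: it assumes population (divide-by-n) variance, natural-logarithm entropy, and border pixels left at zero, and the function names are hypothetical.

```python
import math
from collections import Counter

def occurrence_stats(window):
    """Occurrence measures for a flat list of gray levels from one window."""
    n = len(window)
    mean = sum(window) / n
    variance = sum((v - mean) ** 2 for v in window) / n
    std = math.sqrt(variance)
    if std == 0:
        skewness = 0.0
    else:
        skewness = sum((v - mean) ** 3 for v in window) / (n * std ** 3)
    # Entropy uses the relative frequency of each gray level in the window.
    probs = [count / n for count in Counter(window).values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return {
        "data_range": max(window) - min(window),
        "mean": mean,
        "variance": variance,
        "entropy": entropy,
        "skewness": skewness,
    }

def texture_image(image, measure, size=3):
    """Apply one occurrence measure in a sliding size x size window."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = [
                image[r + dr][c + dc]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            out[r][c] = occurrence_stats(window)[measure]
    return out

# Hypothetical 4 x 4 gray-level band image.
band = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
entropy_map = texture_image(band, "entropy")
print(entropy_map[1][1])
```

A uniform window gives zero entropy, while a window with many distinct gray levels gives high entropy, which is consistent with textured leaf regions standing out in the entropy image.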

SMACC Endmember Extraction and Spectral Feature Analysis

The SMACC algorithm was applied to the hyperspectral image after the background was masked. Three endmember spectra and three abundance images were created using SMACC. Each abundance image was analyzed, and threshold values were set to extract regions of interest that could be used to define classes for the outdoor hyperspectral dataset. At the end of the analysis, five endmember classes were generated and named according to their disease severity levels. Three of the classes were non-symptomatic and two were symptomatic, so the classes were named non-symp 1, non-symp 2, non-symp 3, symp 1 and symp 2. A total of 3034 pixels were randomly selected from each endmember region (15,170 pixels combined) and a new image was created using these pixels. For each class, 1001 pixels were randomly selected for calibration while the remaining 2033 pixels were used as validation data.
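The per-class calibration and validation split described above can be sketched as follows. This is a minimal example that assumes pixels are identified by integer indices; the function name and the seeds are hypothetical and are not taken from the original processing chain.

```python
import random

def split_class_pixels(pixel_ids, n_calibration, seed=0):
    """Randomly split one class's pixels into calibration and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(pixel_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_calibration], shuffled[n_calibration:]

classes = ["non-symp 1", "non-symp 2", "non-symp 3", "symp 1", "symp 2"]

# 3034 pixels per class: 1001 for calibration, 2033 for validation.
splits = {
    name: split_class_pixels(range(3034), 1001, seed=k)
    for k, name in enumerate(classes)
}

n_calibration_total = sum(len(cal) for cal, _ in splits.values())
n_validation_total = sum(len(val) for _, val in splits.values())
print(n_calibration_total, n_validation_total)
```

Across the five classes this gives 5005 calibration pixels and 10,165 validation pixels, which together account for the 15,170 selected pixels.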

The mean spectra developed using the calibration dataset are given in Figure 5-5. The figure shows that only the symptomatic classes lack a peak at the nitrogen absorption band of 550 nm, indicating the absence of chlorophyll in those samples. Due to chlorophyll absorption in the red range, healthier samples have lower reflectance than AMB diseased samples in that range. The internal scattering of light in the NIR range is responsible for the higher reflectance of healthier samples. Since the original hyperspectral image was calibrated to unit reflectance, the reflectance data for each class were expected not to exceed one; however, the figure shows the reflectance signature of the healthy class going over one in the NIR region. One probable cause of this anomaly is the transmission of light through the leaves, resulting in nonlinearities in the data.

Vegetation Indices

A total of five spectral bands were used in computing both MAREP2 and MTVI for the five SMACC endmembers, and the results are shown in Figure 5-6. The figure shows that the healthier the class, the higher the MAREP2 and MTVI values. The plot also shows slight overlap between the classes, but for the most part the five classes could be separated from one another.
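As an illustration of how a vegetation index is computed from a few bands, the sketch below uses the MTVI1 form published by Haboudane et al. (2004). Whether this exact form and these band centers (800, 670 and 550 nm) match the MTVI definition used in this work is an assumption, the reflectance values are hypothetical, and MAREP2 is not sketched because its definition is not restated in this section.

```python
def mtvi1(r800, r670, r550):
    """Modified triangular vegetation index (MTVI1 form, Haboudane et al., 2004)."""
    return 1.2 * (1.2 * (r800 - r550) - 2.5 * (r670 - r550))

# Hypothetical mean reflectances for a healthier and a more diseased class.
healthier = mtvi1(r800=0.60, r670=0.05, r550=0.12)
diseased = mtvi1(r800=0.40, r670=0.15, r550=0.14)
print(healthier, diseased)
```

In this form, higher NIR reflectance and stronger red absorption both raise the index, which agrees with the trend of healthier classes having higher values.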

Band Selection using KLD Hierarchical Clustering for 2014 Dataset

Before the band selection process, JM distance was used to remove 92 redundant spectral bands. The remaining 337 bands were used for the band selection stage. KLD was calculated for each pair of the remaining spectral bands, and an agglomerative hierarchical clustering band selection was applied to 5005 training pixel vectors (1001 pixels for each class). The clustering algorithm started by defining each hyperspectral band as a separate cluster and then merged bands based on the KLD measure. The maximum number of clusters defined for the dataset was ten, and six optimal spectral bands were obtained in the end. Each cluster represented bands that were highly correlated with one another. The bands that were least correlated with the other bands in each cluster were selected as optimal bands. The six selected spectral bands were 666.6 nm, 671.7 nm, 828 nm, 876.5 nm, 947.9 nm and 1000 nm.
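The clustering step can be sketched on a precomputed band dissimilarity matrix as shown below. This is a minimal example, not the procedure used to obtain the six bands: average linkage is an assumption (the linkage criterion is not stated here), the representative rule follows the description above by picking the band with the highest mean dissimilarity to the rest of its cluster, and the toy matrix is hypothetical.

```python
def agglomerative_clusters(dissimilarity, n_clusters):
    """Average-linkage agglomerative clustering on a symmetric dissimilarity matrix."""
    clusters = [[i] for i in range(len(dissimilarity))]  # each band starts alone

    def linkage(a, b):
        return sum(dissimilarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return clusters

def pick_representative(cluster, dissimilarity):
    """Band with the highest mean dissimilarity to the other bands in its cluster."""
    if len(cluster) == 1:
        return cluster[0]

    def mean_dissimilarity(band):
        others = [b for b in cluster if b != band]
        return sum(dissimilarity[band][b] for b in others) / len(others)

    return max(cluster, key=mean_dissimilarity)

# Hypothetical symmetric KLD values for five bands.
kld = [
    [0.00, 0.10, 0.20, 2.00, 2.10],
    [0.10, 0.00, 0.15, 1.90, 2.00],
    [0.20, 0.15, 0.00, 1.80, 1.90],
    [2.00, 1.90, 1.80, 0.00, 0.05],
    [2.10, 2.00, 1.90, 0.05, 0.00],
]
clusters = agglomerative_clusters(kld, n_clusters=2)
selected = [pick_representative(c, kld) for c in clusters]
print(clusters, selected)
```

In the toy matrix the first three bands are mutually similar and the last two are mutually similar, so two clusters separate them.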

SVM Classification of MTVI and MAREP2 Features

Both MTVI and MAREP2 vegetation indices were used as input features for an SVM classifier. The classification accuracy results achieved for the test data are shown in Table 5-1. The overall classification accuracy was 93.5%. The background class aside, the non-symp 1 and symp 2 classes had the highest classification accuracies of 94.5% and 94.6%, respectively. For the non-symp 1 class, 112 samples were misclassified as non-symp 2. Non-symp 2 samples were classified with an accuracy of 87.1%, with 118 samples misclassified as non-symp 1 and 144 samples misclassified as non-symp 3. Samples in the non-symp 3 class were classified with an accuracy of 90.7%, with 107 samples misclassified as non-symp 2 and 82 samples misclassified as symp 1. For the symp 1 class, 94.1% of the samples were correctly classified, but there were 70 misclassifications as non-symp 3 and 48 misclassifications as symp 2. Lastly, 109 samples in the symp 2 class were misclassified as symp 1, while one sample was misclassified as non-symp 3. The general trend showed that misclassified samples were assigned to adjacent classes. This is not surprising, since adjacent classes tend to be spectrally more similar to one another than classes that are farther apart. It can be inferred from the results that if both symptomatic classes were merged into one class and the non-symptomatic classes into another, the classification results would have been better. However, such an analysis was not pursued, since the goal was to detect the disease at the earliest stage possible so as to reduce economic losses.

SVM Classification of JM-KLD Hierarchical Clustering

The spectral reflectance data at the six optimal bands selected by JM-KLD hierarchical clustering were also used as input features for an SVM classifier. The classification accuracy results achieved for the test data are shown in Table 5-2. The overall classification accuracy computed from Table 5-2 was 93.9% for all six classes (92.7% for the five leaf classes alone), about 0.4 percentage points higher than the 93.5% achieved using MTVI and MAREP2 features. The non-symp 1 class achieved the highest accuracy

among all the classes with an accuracy of 96.9%, about 2.4 percentage points higher than in the previous classification. The second highest accuracy was obtained by the symp 2 class, about 2.4 percentage points lower than that of the non-symp 1 class. For the non-symp 1 class, 63 samples were misclassified as non-symp 2. Non-symp 2 samples were classified with an accuracy of 88.3%, with 94 samples misclassified as non-symp 1 and 143 samples misclassified as non-symp 3. Samples in the non-symp 3 class were classified with an accuracy of 90.2%, with 125 samples misclassified as non-symp 2 and 74 samples misclassified as symp 1. For the symp 1 class, there were 63 misclassifications as non-symp 3 and 66 misclassifications as symp 2. Finally, 1921 samples in the symp 2 class were correctly classified, with 112 misclassifications as symp 1. Again, the general trend showed that misclassified samples were assigned to adjacent classes with similar spectral characteristics.
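The accuracies quoted from Table 5-2 can be recomputed from the confusion matrix counts, as in the sketch below. The counts are transcribed from Table 5-2 with rows as predicted classes and columns as actual classes; the function names are hypothetical. Per-class accuracy is taken as the diagonal count divided by the actual-class (column) total.

```python
labels = ["non_symp1", "non_symp2", "non_symp3", "symp1", "symp2", "background"]

# Rows: predicted class. Columns: actual class. Counts from Table 5-2.
confusion = [
    [1970,   94,    0,    0,    0,    0],
    [  63, 1796,  125,    0,    0,    0],
    [   0,  143, 1834,   63,    0,    0],
    [   0,    0,   74, 1904,  112,    0],
    [   0,    0,    0,   66, 1921,    0],
    [   0,    0,    0,    0,    0, 2033],
]

def per_class_accuracy(matrix):
    """Correctly classified samples divided by the actual-class (column) total."""
    n = len(matrix)
    column_totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [matrix[c][c] / column_totals[c] for c in range(n)]

def overall_accuracy(matrix, keep=None):
    """Share of correctly classified samples, optionally over a subset of classes."""
    idx = list(range(len(matrix))) if keep is None else list(keep)
    correct = sum(matrix[i][i] for i in idx)
    total = sum(matrix[r][c] for r in idx for c in idx)
    return correct / total

for name, acc in zip(labels, per_class_accuracy(confusion)):
    print(name, round(100 * acc, 1))
print("all six classes:", round(100 * overall_accuracy(confusion), 1))
print("five leaf classes:", round(100 * overall_accuracy(confusion, keep=range(5)), 1))
```

With these counts the overall accuracy is about 93.9% when the background class is included and about 92.7% when only the five leaf classes are considered.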

Overall, the classification results achieved using vegetation indices and optimal bands were very similar, and both provided overall classification accuracies higher than 90% across the five classes. While SMACC is a well-known method for extracting endmembers, more efficient asymptomatic detection methods need to be explored and their results compared to those achieved by SMACC. Future work on the detection of the disease will include analyzing time-lapse hyperspectral images taken every two to three days in order to ensure the detection of healthy and asymptomatic regions before symptoms become visible.

Conclusion

The main objective of this study was to detect AMB disease at the asymptomatic stage using outdoor hyperspectral images acquired in 2014. To define endmembers for the outdoor hyperspectral dataset, an endmember extraction algorithm called sequential maximum angle convex cone (SMACC) was used. Five classes were created, and the MTVI and MAREP2 vegetation indices were built based on five spectral bands. Six optimal spectral bands were also chosen using a combination of Jeffries-Matusita (JM) distance and KLD hierarchical clustering algorithms. Both feature sets were used as inputs for an SVM classifier. Results showed that both the MTVI and MAREP2 vegetation indices and the six selected bands could efficiently separate non-symptomatic and symptomatic pixels on AMB diseased leaves.

Figure 5-1. RGB color display of an outdoor hyperspectral image taken in 2014.

Figure 5-2. Comparison of original hyperspectral image and SVM-masked image. A) before mask was applied, B) after mask was applied.

Figure 5-3. Texture images from previously SVM-masked image at wavelength 450.8 nm. A) data range, B) entropy, C) variance, D) mean, and E) skewness.

Figure 5-4. Combined SVM-texture filter masking result. A) entropy mask, B) hyperspectral image after applying mask.

Figure 5-5. Mean reflectance spectra of non-symptomatic and AMB symptomatic samples in 2014 outdoor hyperspectral dataset

Figure 5-6. The plot of MAREP 2 against MTVI for non-symptomatic and AMB symptomatic samples in 2014 outdoor hyperspectral dataset.

Table 5-1. SVM classification accuracy for six outdoor hyperspectral classes using MTVI and MAREP2 vegetation indices as input features.

Predicted \ Actual   non_symp1      non_symp2      non_symp3      symp1          symp2          background     Total
non_symp1            1921 (94.5%)   118 (5.8%)     0              0              0              0              2039
non_symp2            112 (5.5%)     1771 (87.1%)   107 (5.3%)     1 (0.05%)      0              0              1991
non_symp3            0              144 (7.1%)     1844 (90.7%)   70 (3.4%)      1 (0.05%)      0              2059
symp1                0              0              82 (4.0%)      1914 (94.1%)   109 (5.4%)     0              2105
symp2                0              0              0              48 (2.4%)      1924 (94.6%)   0              1971
background           0              0              0              0              0              2033 (100%)    2033
Total                2033           2033           2033           2033           2033           2033           12198

Table 5-2. SVM classification accuracy for six outdoor hyperspectral classes using six spectral bands selected by JM-KLD hierarchical clustering algorithm.

Predicted \ Actual   non_symp1      non_symp2      non_symp3      symp1          symp2          background     Total
non_symp1            1970 (96.9%)   94 (4.6%)      0              0              0              0              2064
non_symp2            63 (3.1%)      1796 (88.3%)   125 (6.1%)     0              0              0              1984
non_symp3            0              143 (7.0%)     1834 (90.2%)   63 (3.1%)      0              0              2040
symp1                0              0              74 (3.6%)      1904 (93.7%)   112 (5.5%)     0              2090
symp2                0              0              0              66 (3.2%)      1921 (94.5%)   0              1987
background           0              0              0              0              0              2033 (100%)    2033
Total                2033           2033           2033           2033           2033           2033           12198

CHAPTER 6 SUMMARY AND FUTURE DIRECTION

Two different datasets were analyzed in this study for the early detection of AMB disease: indoor spectroradiometer data and outdoor hyperspectral images. For the indoor dataset, both qualitative and quantitative analyses were performed. Qualitative analysis of the disease was carried out by building classification algorithms that used vegetation indices and reflectance data at five optimal spectral bands as input features. These spectral features were developed using JM distance, OSP, ARI1 and MAREP algorithms. The results indicated that a combination of MAREP and ARI1 was more effective in separating the healthy and stressed classes than reflectance data at the selected spectral bands.

The qualitative analysis of the disease using data acquired from a

spectroradiometer system revealed two important things. The first was that spectral

data of small regions on the leaves could not be acquired and analyzed due to the large

diameter of the system component used in holding the leaves in place during spectral

data acquisition. This resulted in spectrally mixed data for leaf samples containing both

seemingly healthy and diseased regions. Another significant finding resulted from

analyzing the MAREP and ARI1 features of early-stage diseased spectral samples. It

was found that the spectral features of samples belonging to this class were more

widely distributed than those of the healthy and nutrient deficient classes. As a result,

quantitative analysis of previously analyzed early-stage diseased samples was

performed.

Abundance estimation and spectral unmixing algorithms were developed using a kNN classifier and PSO algorithm. PSO showed promising results in spectrally unmixing early-stage AMB diseased samples by creating symptomatic and asymptomatic endmembers from previously mixed spectral data. As for the quantitative aspect of the analysis, PLSR and SMLR prediction models were combined to quantify various degrees of infestation in early-stage AMB samples. A total of nine spectral bands, between the visible and SWIR wavelength ranges, were identified as significant in predicting disease severity.

Even though the analysis of distinct regions on early-stage diseased leaves was improved using a combination of abundance estimation and spectral unmixing algorithms, the algorithms could not separate the regions completely. As a result, a device capable of extracting regions on a per-pixel basis was investigated for more efficient analysis of the disease. The device was a hyperspectral imaging system, and this time data were acquired in an outdoor setting. A tree branch containing about fifteen leaves, with some regions on the leaves showing early symptoms of the disease, was imaged in 2014. To extract both non-symptomatic and symptomatic regions on the leaves, the SMACC endmember extraction algorithm was used. Spectral features were developed using MTVI, MAREP2, JM distance and KLD hierarchical clustering algorithms, and they served as input features for an SVM classifier. Comparable results were achieved using vegetation indices and optimal spectral bands, and both methods efficiently separated non-symptomatic and symptomatic pixels on AMB diseased leaves with an overall accuracy of over 92%.

The SMACC algorithm worked well in creating endmembers at different disease severity stages, but it is by no means the most efficient method for analyzing AMB, particularly at the asymptomatic stage. A recommendation for future work on this project is to collect time-lapse hyperspectral images every two to three days, at about the same time of day, until early-stage symptoms begin to develop. To ensure that the same regions where symptoms eventually develop are analyzed, the preprocessing stages should include image registration and resampling. This process will allow for more accurate analysis of regions of interest from the time-series images. After this, thresholds can be set for healthy, asymptomatic and symptomatic classes using either the six optimal spectral bands found with JM distance and KLD hierarchical clustering or the spectral features built from MAREP2 and MTVI. Based on results achieved from analyzing the time-lapse images, a low-cost multispectral camera can then be built and mounted on either a ground-based platform such as an unmanned ground vehicle (UGV) or an aerial platform such as a drone for the early detection of the disease in apple orchards.

85

LIST OF REFERENCES

Apan, A., Held, A., Phinn, S., & Markley, J. (2004). Detecting sugarcane ‘orange rust’disease using EO-1 Hyperion hyperspectral imagery. International Journal of Remote Sensing, 25(2), 489-498.

Back, C.-G., Lee, S.-Y., Kang, I.-K., Yoon, T.-M., & Jung, H.-Y. (2015). Occurrence and

Analysis of Apple Blotch-like Symptoms on Apple Leaves. 원예과학기술지,

33(3), 429-434.

Batte, M. T. (1999). National Research Council. Precision Agriculture in the 21st Century: Geospatial and Information Technologies in Crop Management. Washington DC: National Academy Press, 1997, 168 pp., $39.95. American Journal of Agricultural Economics, 81(3), 755-756.

Belasque Jr, J., Gasparoto, M., & Marcassa, L. G. (2008). Detection of mechanical and disease stresses in citrus plants by fluorescence spectroscopy. Applied Optics, 47(11), 1922-1926.

Berardo, N., Pisacane, V., Battilani, P., Scandolara, A., Pietri, A., & Marocco, A. (2005). Rapid detection of kernel rots and mycotoxins in maize by near-infrared reflectance spectroscopy. Journal of Agricultural and Food Chemistry, 53(21), 8128-8134.

Broge, N. H., & Leblanc, E. (2001). Comparing prediction power and stability of broadband and hyperspectral vegetation indices for estimation of green leaf area index and canopy chlorophyll density. Remote sensing of environment, 76(2), 156-172.

Bruce, L. M., Koger, C. H., & Li, J. (2002). Dimensionality reduction of hyperspectral data using discrete wavelet transform feature extraction. Geoscience and Remote Sensing, IEEE Transactions on, 40(10), 2331-2338.

Del Fiore, A., Reverberi, M., Ricelli, A., Pinzari, F., Serranti, S., Fabbri, A., . . . Fanelli, C. (2010). Early detection of toxigenic fungi on maize by hyperspectral imaging analysis. International Journal of Food Microbiology, 144(1), 64-71.

Du, Q., & Yang, H. (2008). Similarity-based unsupervised band selection for hyperspectral image analysis. Geoscience and Remote Sensing Letters, IEEE, 5(4), 564-568.

Eberhart, R. C., & Kennedy, J. (1995). A new optimizer using particle swarm theory. Paper presented at the Proceedings of the sixth international symposium on micro machine and human science.

86

EPPO. (2013). Diplocarpon mali (anamorph: Marssonina coronaria). Retrieved December 12, 2015, from http://www.eppo.int/QUARANTINE/Alert_List/fungi/Diplocarpon_mali.htm

FAO. (2013). Global fruit production in 2013, by variety (in million metric tons). Retrieved November 16, 2015, from http://www.statista.com/statistics/264001/worldwide-production-of-fruit-by-variety/

Franke, J., & Menz, G. (2007). Multi-temporal wheat disease detection by multi-spectral remote sensing. Precision Agriculture, 8(3), 161-172.

Gómez-Sanchis, J., Gómez-Chova, L., Aleixos, N., Camps-Valls, G., Montesinos-Herrero, C., Moltó, E., & Blasco, J. (2008). Hyperspectral system for early detection of rottenness caused by penicilliumdigitatum in mandarins. Journal of Food Engineering, 89(1), 80-86.

Graeff, S., Link, J., & Claupein, W. (2006). Identification of powdery mildew (Erysiphe graminis sp. tritici) and take-all disease (Gaeumannomyces graminis sp. tritici) in wheat (Triticum aestivum L.) by means of leaf reflectance measurements. Open Life Sciences, 1(2), 275-288.

Gruninger, J. H., Ratkowski, A. J., & Hoke, M. L. (2004). The sequential maximum angle convex cone (SMACC) endmember model. Paper presented at the Defense and Security.

Haboudane, D., Miller, J. R., Pattey, E., Zarco-Tejada, P. J., & Strachan, I. B. (2004). Hyperspectral vegetation indices and novel algorithms for predicting green LAI of crop canopies: Modeling and validation in the context of precision agriculture. Remote sensing of environment, 90(3), 337-352.

Harada, Y., Sawamura, K, & Konno, K. (1974). Diplocarpon mali sp. nov., the perfect state of apple blotch fungus Marssonina coronaria. Annals of the Phytopathological Society of Japan.

Huang, W., Lamb, D. W., Niu, Z., Zhang, Y., Liu, L., & Wang, J. (2007). Identification of yellow rust in wheat using in-situ spectral reflectance measurements and airborne hyperspectral imaging. Precision Agriculture, 8(4-5), 187-197.

Jia, X., & Richards, J. (1999). Segmented principal components transformation for efficient hyperspectral remote-sensing image display and classification. Geoscience and Remote Sensing, IEEE Transactions on, 37(1), 538-542.

Jones, C., Jones, J., & Lee, W. (2010). Diagnosis of bacterial spot of tomato using spectral signatures. Computers and Electronics in Agriculture, 74(2), 329-335.

Kobayashi, T., Kanda, E., Kitada, K., Ishiguro, K., & Torigoe, Y. (2001). Detection of rice panicle blast with multispectral radiometer and the potential of using airborne multispectral scanners. Phytopathology, 91(3), 316-323.

http://www.eppo.int/QUARANTINE/Alert_List/fungi/Diplocarpon_mali.htm

http://www.statista.com/statistics/264001/worldwide-production-of-fruit-by-variety/

87

Kumar, A., Lee, W. S., Ehsani, R. J., Albrigo, L. G., Yang, C., & Mangan, R. L. (2012). Citrus greening disease detection using aerial hyperspectral and multispectral imaging techniques. Journal of Applied Remote Sensing, 6(1), 063542-063541-063542-063522.

Lee, C.-H., Lee, S.-Y., Jung, H.-Y., & Kim, J.-H. (2012). The application of optical coherence tomography in the diagnosis of Marssonina blotch in apple leaves. Journal of the Optical Society of Korea, 16(2), 133-140.

Lee, D.-H., Back, C.-G., Win, N. K. K., Choi, K.-H., Kim, K.-M., Kang, I.-K., . . . Jung, H.-Y. (2011). Biological characterization of Marssonina coronaria associated with apple blotch disease. Mycobiology, 39(3), 200-205.

Lee, H.-T., & Shin, H.-D. (2000). Taxonomic studies on the genus Marssonina in Korea. Mycobiology, 28(1), 39-46.

Li, H., Lv, X., Wang, J., Li, J., Yang, H., & Qin, Y. (2007). Quantitative determination of soybean meal content in compound feeds: comparison of near-infrared spectroscopy and real-time PCR. Analytical and bioanalytical chemistry, 389(7-8), 2313-2322.

Martínez-Usó, A., Pla, F., Sotoca, J. M., & García-Sevilla, P. (2007). Clustering-based hyperspectral band selection using information measures. Geoscience and Remote Sensing, IEEE Transactions on, 45(12), 4158-4171.

Morgan, M., & Ess, D. (1997). The precision-farming guide for agriculturists. Deere and Company.

Moshou, D., Bravo, C., Oberti, R., West, J., Bodria, L., McCartney, A., & Ramon, H. (2005). Plant disease detection based on data fusion of hyper-spectral and multi-spectral fluorescence imaging using Kohonen maps. Real-Time Imaging, 11(2), 75-83.

Muhammed, H. H., & Larsolle, A. (2003). Feature vector based analysis of hyperspectral crop reflectance data for discrimination and quantification of fungal disease severity in wheat. Biosystems engineering, 86(2), 125-134.

Naidu, R. A., Perry, E. M., Pierce, F. J., & Mekuria, T. (2009). The potential of spectral reflectance technique for the detection of Grapevine leafroll-associated virus-3 in two red-berried wine grape cultivars. Computers and Electronics in Agriculture, 66(1), 38-45.

Oberhänsli, T., Vorley, T., Tamm, L., & Schärer, H. (2014). Development of a quantitative PCR for improved detection of Marssonina coronaria in field samples. Paper presented at the Ecofruit. 16th International Conference on Organic-Fruit Growing: Proceedings, 17-19 February 2014, Hohenheim, Germany.

88

Omran, M. G., Engelbrecht, A. P., & Salman, A. (2006). Particle swarm optimization for pattern recognition and image processing Swarm intelligence in data mining (pp. 125-151): Springer.

Pearson, T., Wicklow, D., Maghirang, E., Xie, F., & Dowell, F. (2001). Detecting aflatoxin in single corn kernels by transmittance and reflectance spectroscopy. Transactions of the ASAE, 44(5), 1247.

Qin, J., Burks, T. F., Kim, M. S., Chao, K., & Ritenour, M. A. (2008). Citrus canker detection using hyperspectral reflectance imaging and PCA-based image classification method. Sensing and Instrumentation for Food Quality and Safety, 2(3), 168-177.

Qin, J., Burks, T. F., Ritenour, M. A., & Bonn, W. G. (2009). Detection of citrus canker using hyperspectral reflectance imaging with spectral information divergence. Journal of food engineering, 93(2), 183-191.

Roggo, Y., Duponchel, L., Noe, B., & Huvenne, J. (2002). Sucrose content determination of sugar beets by near infrared reflectance spectroscopy. Comparison of calibration methods and calibration transfer. Journal of near infrared spectroscopy, 10(2), 137-150.

Rumpf, T., Mahlein, A.-K., Steiner, U., Oerke, E.-C., Dehne, H.-W., & Plümer, L. (2010). Early detection and classification of plant diseases with Support Vector Machines based on hyperspectral reflectance. Computers and Electronics in Agriculture, 74(1), 91-99.

Sankaran, S., Maja, J. M., Buchanon, S., & Ehsani, R. (2013). Huanglongbing (citrus greening) detection using visible, near infrared and thermal imaging techniques. Sensors, 13(2), 2117-2130.

Sankaran, S., Mishra, A., Ehsani, R., & Davis, C. (2010). A review of advanced techniques for detecting plant diseases. Computers and Electronics in Agriculture, 72(1), 1-13.

Sankaran, S., Mishra, A., Maja, J. M., & Ehsani, R. (2011). Visible-near infrared spectroscopy for detection of Huanglongbing in citrus orchards. Computers and electronics in agriculture, 77(2), 127-134.

Tamietti, G., & Matta, A. (2003). First report of leaf blotch caused by Marssonina coronaria on apple in Italy. Plant Disease, 87(8), 1005-1005.

Tanaka, S., Kamegawa, N., Ito, S., & Kameya Iwaki, M. (2000). Detection of thiophanate-methyl-resistant strains in Diplocarpon mali, causal fungus of apple [Malus pumila] blotch. Journal of General Plant Pathology (Japan).

Thenkabail, P. S., Lyon, J. G., & Huete, A. (2011). Hyperspectral remote sensing of vegetation: CRC Press.

89

Wang, H., & Angelopoulou, E. (2006). Sensor band selection for multispectral imaging via average normalized information. Journal of Real-Time Image Processing, 1(2), 109-121.

Wu, D., Feng, L., Zhang, C., & He, Y. (2008). Early detection of Botrytis cinerea on eggplant leaves based on visible and near-infrared spectroscopy. Transactions of the ASABE, 51(3), 1133-1139.

Xu, H., Ying, Y., Fu, X., & Zhu, S. (2007). Near-infrared spectroscopy in detecting leaf miner damage on tomato leaf. Biosystems Engineering, 96(4), 447-454.

Yang, C., Lee, W. S., & Williamson, J. G. (2012). Classification of blueberry fruit and leaves based on spectral signatures. biosystems engineering, 113(4), 351-362.

Yin, L., Li, M., Ke, X., Li, C., Zou, Y., Liang, D., & Ma, F. (2013). Evaluation of Malus germplasm resistance to marssonina apple blotch. European journal of plant pathology, 136(3), 597-602.

Yuan, L., Huang, Y., Loraamm, R. W., Nie, C., Wang, J., & Zhang, J. (2014). Spectral analysis of winter wheat leaves for detection and differentiation of diseases and insects. Field Crops Research, 156, 199-207.

Zhang, B., Sun, X., Gao, L., & Yang, L. (2011). Endmember extraction of hyperspectral remote sensing images based on the discrete particle swarm optimization algorithm. Geoscience and Remote Sensing, IEEE Transactions on, 49(11), 4173-4176.

Zhang, M., Qin, Z., & Liu, X. (2005). Remote sensed spectral imagery to detect late blight in field tomatoes. Precision Agriculture, 6(6), 489-508.

Zhang, N., Wang, M., & Wang, N. (2002). Precision agriculture—a worldwide overview. Computers and electronics in agriculture, 36(2), 113-132.

90

BIOGRAPHICAL SKETCH

Mubarakat Shuaibu was born and raised in Nigeria, Africa. In 2016, she graduated with a Master of Science degree from the University of Florida, where she pursued her interest in agricultural engineering. She was also appointed as a graduate research assistant at the university's Department of Agricultural and Biological Engineering. Her research focused on finding ways hyperspectral imaging technology could be used as a tool for the early detection of a fungal disease in apples called Marssonina blotch.

Shuaibu's interest in agriculture began in 2002 while she was still a senior in high school; however, it was only in 2011, while working as a project engineer at her family's real estate company, that she decided to pursue agricultural engineering as a career. This decision stemmed from doing personal research on the agricultural sector of her country and finding out that factors such as climate change, insufficient land for farming, population growth, and most importantly, inadequate technical expertise, had not favored the production of adequate food. Her career goal is to be an accomplished agricultural engineer who focuses on developing better and more sustainable methods to grow food to meet the world's needs.

In 2009, Shuaibu graduated with a first-class BEng (Hons) in Electrical and Electronic Engineering from the University of Nottingham, one of the top engineering universities in the UK. After her graduation, she returned home and worked as an instrument engineer at National Engineering and Technical Company (NETCO), an engineering firm primarily focused on designing and building plants and facilities for companies in the Nigerian oil and gas sector. She later went on to work as a project engineer at her family's real estate company, Bright Star Realties, and as an electrical design engineer and health and safety officer at C.A. Preston Engineering Ltd, a full-service oil and gas engineering consulting company with specialized expertise in both onshore and offshore petroleum developments.
