DETECTION OF APPLE MARSSONINA BLOTCH DISEASE USING HYPERSPECTRAL DATA
By
MUBARAKAT SHUAIBU
A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
UNIVERSITY OF FLORIDA
2016
© 2016 Mubarakat Shuaibu
To my family and friends
ACKNOWLEDGMENTS
I would like to express my sincere gratitude and appreciation to my Committee
Chair, Dr. Won Suk Lee, for his guidance and encouragement through the course of my
Master’s program. I would also like to thank the other members of my supervisory
committee, Dr. John Schueller and Dr. Paul Gader, for their advice and help.
My sincere gratitude also goes to the Rural Development Administration (RDA),
Korea, for supporting this research and to my colleagues at the Precision Agriculture
Laboratory for their support and friendship.
I thank my Mum and my dear friends, Tega and Sabrina, for their moral
encouragement and prayer throughout the course of this work. Finally, and most
importantly, I would like to thank God for seeing me through the course of this program.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABSTRACT
CHAPTER
1 INTRODUCTION
  Precision Agriculture
  Spectroscopy
  Multispectral and Hyperspectral Imaging
  Global Apple Industry
  Apple Industry in South Korea
  Apple Marssonina Blotch
2 LITERATURE REVIEW
  Application of Spectroscopy for Plant Disease Detection
  Application of Multispectral and Hyperspectral Imaging for Plant Disease Detection
3 APPLE MARSSONINA BLOTCH DETECTION USING INDOOR SPECTRORADIOMETER DATA
  Background
  Materials and Methods
    Data Collection
    Vegetation Indices
    Jeffries-Matusita Orthogonal Subspace Projection (JM-OSP) Band Selection
  Results and Discussion
    Spectral Feature Analysis
    JM-OSP Band Selection
    ARI1 and MAREP
    Classification Based On JM-OSP Bands
    Classification Based On ARI1 and MAREP Features
    QDA Binary Classification Based On ARI1 and MAREP Features
  Conclusion
4 QUANTITATIVE ANALYSIS AND SPECTRAL UNMIXING OF EARLY STAGE APPLE MARSSONINA BLOTCH DISEASE
  Background
  Methods
    Abundance Estimation of Early Stage AMB Diseased Data
    Endmember Extraction from Early Stage AMB Diseased Data
    Regression Models
  Results and Discussion
    Spectral Unmixing of Brown and Green Endmembers for Early Stage AMB Diseased Samples
    Quantitative Analysis of Early Stage AMB Diseased Samples
  Conclusion
5 HYPERSPECTRAL IMAGE ANALYSIS OF EARLY STAGE APPLE MARSSONINA BLOTCH DISEASE USING SEQUENTIAL MAXIMUM ANGLE CONVEX CONE ALGORITHM
  Background
  Materials and Methods
    Outdoor Hyperspectral Data Acquisition
    Image Preprocessing
    Vegetation Indices
    Sequential Maximum Angle Convex Cone (SMACC)
    Kullback-Leibler Divergence (KLD)
    Support Vector Machine (SVM) Classification
  Results and Discussion
    SVM and Texture Filter Background Masking
    SMACC Endmember Extraction and Spectral Feature Analysis
    Vegetation Indices
    Band Selection using KLD Hierarchical Clustering for 2014 Dataset
    SVM Classification of MTVI and MAREP Features
    SVM Classification of JM-KLD Hierarchical Clustering
  Conclusion
6 SUMMARY AND FUTURE DIRECTION
LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES
Table
3-1 QDA classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.
3-2 Neural network classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.
3-3 Discriminant tree classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.
3-4 QDA classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.
3-5 Neural network classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.
3-6 Discriminant tree classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.
3-7 QDA classification accuracy for five spectroradiometer classes in 2014 dataset using ARI1 and MAREP vegetation indices.
3-8 QDA classification accuracy for five spectroradiometer classes in 2015 dataset using ARI1 and MAREP vegetation indices.
3-9 QDA classification accuracy for healthy-nd and early stage AMB diseased spectroradiometer classes in 2014 dataset using ARI1 and MAREP vegetation indices.
3-10 QDA classification accuracy for healthy-nd and early stage AMB diseased spectroradiometer classes in 2015 dataset using ARI1 and MAREP vegetation indices.
4-1 Disease severity statistical analysis results for 20 healthy and 261 early stage AMB diseased samples from 2014 and 2015 spectroradiometer datasets.
5-1 SVM classification accuracy for six outdoor hyperspectral classes using MTVI and MAREP2 vegetation indices as input features.
5-2 SVM classification accuracy for six outdoor hyperspectral classes using six spectral bands selected by JM-KLD hierarchical clustering algorithm.
LIST OF FIGURES
Figure
1-1 Comparison of the apple and other fruits in South Korea
1-2 AMB distribution in South Korea
3-1 Apple leaves used in indoor spectroradiometer analysis
3-2 Mean reflectance spectra of samples in spectroradiometer dataset
3-3 JM distance matrix of spectral bands in 2014 and 2015 indoor spectroradiometer datasets.
3-4 The plot of MAREP against ARI1 for samples in spectroradiometer dataset
4-1 Flowchart showing steps for extraction of brown and green colored pixel abundances in early-stage AMB diseased samples
4-2 Steps taken in segmenting early stage AMB diseased color images
4-3 The plot of MAREP against ARI1 classes in spectroradiometer dataset
4-4 Predicted versus actual disease severity for validation dataset using a combination of PLSR and SMLR prediction models.
5-1 RGB color display of an outdoor hyperspectral image taken in 2014
5-2 Comparison of original hyperspectral image and SVM-masked image
5-3 Texture images from previously SVM-masked image at wavelength 450.8 nm
5-4 Combined SVM-texture filter masking result
5-5 Mean reflectance spectra of non-symptomatic and AMB symptomatic samples in 2014 outdoor hyperspectral dataset
5-6 The plot of MAREP 2 against MTVI for non-symptomatic and AMB symptomatic samples in 2014 outdoor hyperspectral dataset.
LIST OF ABBREVIATIONS
AMB Apple Marssonina blotch
ARI1 Anthocyanin reflectance index 1
FAO Food and Agriculture Organization
GA Genetic algorithm
GPS Global positioning system
JM Jeffries-Matusita
KLD Kullback–Leibler divergence
K-NN K-Nearest Neighbor
MAREP Matrix-adjusted red edge position
MTVI Modified triangular vegetation index
NIR Near-infrared
qPCR Quantitative fluorogenic polymerase chain reaction
PLSR Partial least squares regression
PSO Particle swarm optimization
QDA Quadratic discriminant analysis
R2 Coefficient of determination
REP Red edge position
SID Spectral information divergence
SMACC Sequential maximum angle convex cone
SMLR Stepwise multiple linear regression
SVM Support vector machine
SWIR Short-wave infrared
TVI Triangular vegetation index
Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science
DETECTION OF APPLE MARSSONINA BLOTCH DISEASE USING
HYPERSPECTRAL DATA
By
Mubarakat Shuaibu
May 2016
Chair: Won Suk Lee
Major: Agricultural and Biological Engineering
Apple Marssonina blotch (AMB) is one of the most devastating apple diseases in
the world, and it has caused huge economic losses to countries like Japan, India, and
Korea. It is a fungal disease that mainly affects the leaves of apple trees and causes
premature defoliation, which in turn results in low quality and quantity of harvested
apples. Technologies that can efficiently detect the disease in its early stage could help
growers apply timely control measures to contain the spread of the disease. In this
work, the use of hyperspectral imaging and spectroradiometer measurements was
examined for the early diagnosis of the disease. Vegetation indices--anthocyanin
reflectance index 1 (ARI1), modified triangular vegetation index (MTVI), and
matrix-adjusted red edge position (MAREP)--and optimal spectral bands, selected using
Jeffries-Matusita distance, orthogonal subspace projection, and hierarchical clustering
algorithms, were used as features in building various classifiers. An endmember extraction
algorithm, known as sequential maximum angle convex cone (SMACC), was used to
create hyperspectral endmembers based on the severity of the disease. Both
spectroradiometer and hyperspectral imaging systems worked well in distinguishing
healthy samples from diseased ones. The highest classification accuracies achieved in
this work for the healthy and early-stage AMB diseased classes were 97.7% and
99.2%, respectively.
CHAPTER 1 INTRODUCTION
This thesis examines the use of hyperspectral technology for the detection of
apple Marssonina blotch (AMB), a severe fungal disease of apple trees. It serves as a
preliminary step towards the design of a low-cost automated sensing system for the
early diagnosis of the disease, especially for use in South Korean apple orchards. With
such a sensing system available, apple growers will be able to apply fungicides more
precisely to areas of their fields infected to varying degrees, thereby
improving control of disease spread, reducing fungicide wastage, and maximizing fruit
yield.
The thesis is structured as follows: an overview of precision agriculture and
current remote sensing technologies (spectroscopy, multispectral imaging, and
hyperspectral imaging) in agriculture is given, followed by an outline of the South
Korean and global apple industries. Next, some information about AMB disease is
provided, including its symptoms and current management practices being adopted by
apple growers in infected regions. A literature review of some spectroscopic,
multispectral and hyperspectral imaging technologies currently utilized for plant disease
detection is then presented. Subsequent chapters provide a description of field and lab
data acquisition methods used in extracting information for spectral and image
analyses. Finally, results from various disease detection methods are compared and
discussed, followed by recommendations for improving AMB disease detection at both
the asymptomatic and symptomatic stages.
Precision Agriculture
The conventional method of managing a farm involves treating a field as a single
unit and distributing crop production inputs, such as water, fertilizer, seeds, and
pesticides, uniformly on the whole field. Growers who treat their fields in this way tend to
over-apply crop inputs as insurance, and more often than not this leads to
counterproductive results, including poorer yield, input wastage, and environmental
pollution.
Over the past two decades, the use of crop management technologies that take
into account in-field variability has grown tremendously all around the world (Batte,
1999; Zhang et al., 2002). A combination of these technologies to achieve site-specific
crop management is referred to as precision agriculture. Global positioning system
(GPS), yield monitoring and mapping, soil sampling, variable rate application, and
remote sensing are just a few of the tools that have made the concept of precision
agriculture a reality. Unlike the conventional method of farming, precision agriculture
has the potential to produce the same yield with decreased input, higher yield
with decreased input, or higher yield with the same amount of input (Morgan & Ess,
1997).
Good crop management decisions rely on accurate spatial and temporal
variability information collected in a field. The first step in achieving this is to use
information sensing and extraction technologies that make use of sensors in acquiring
information relating to field conditions. One major way of accomplishing this today
is to mount remote sensing instruments on ground-based platforms,
aerial platforms (drones and helicopters), or space-borne platforms (satellites).
Remote sensing technologies are capable of extracting information about an
object without coming in physical contact with it and provide an economical way of
acquiring field data in a short period of time. The primary type of remote sensors used
for most agricultural applications detects natural radiation that is emitted or reflected by
an object and makes use of surface reflectance between the visible and near-infrared
regions of the electromagnetic spectrum in providing information about the object.
Spectroscopy
Spectroscopy is a scientific technique used for the study and identification of
materials. It is capable of capturing the chemical composition of a material by
measuring the amount of light that the material absorbs, emits, or reflects. Light, in this
context, refers to electromagnetic radiation that can come from any region of
the electromagnetic spectrum. Light waves are classified by their energy or wavelength,
and the energy of light is inversely proportional to its wavelength. When light is absorbed or
reflected by materials, not all of the light behaves the same; only certain wavelengths of
light get absorbed while others get reflected. In the past, spectroscopy was simply the
study of visible light according to its wavelength and dispersion by a prism; but since the
nineteenth century, optical components called diffraction gratings have made it possible
to expand the range to other regions of the electromagnetic spectrum including
ultraviolet, near-infrared, and shortwave-infrared. Devices that use these optical
elements are called spectrometers. Spectroscopy is commonly used for precision
agriculture applications because it is an inexpensive, fast, and, most importantly,
non-destructive method that can be used in the field.
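As a minimal illustration of the inverse relationship between energy and wavelength, the sketch below applies the Planck relation E = hc/λ. This relation is standard physics rather than something stated in this thesis, and the function name and sample wavelengths are chosen here for illustration only.

```python
# Planck relation E = h * c / wavelength (standard physics, not taken from the thesis).
PLANCK_J_S = 6.62607015e-34       # Planck constant in J*s (exact SI value)
LIGHT_SPEED_M_S = 299_792_458.0   # speed of light in vacuum in m/s (exact SI value)
JOULES_PER_EV = 1.602176634e-19   # one electronvolt in joules (exact SI value)


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the photon energy in electronvolts for a wavelength in nanometres."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


# Illustrative wavelengths in the visible, near-infrared, and short-wave infrared regions.
for name, nm in [("blue", 450.0), ("red", 680.0), ("NIR", 800.0), ("SWIR", 1660.0)]:
    print(f"{name:>4} {nm:7.1f} nm -> {photon_energy_ev(nm):.3f} eV")
```

Halving the wavelength doubles the photon energy, so a blue photon carries more energy than a near-infrared one.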
Multispectral and Hyperspectral Imaging
Multispectral and hyperspectral imaging technologies combine two sensing
methods for their operation--spectroscopy and imaging. They are both capable of
capturing spatial and spectral information, and they contain spectral bands that extend
beyond the visible range of the electromagnetic spectrum. Their spectral range is
typically between the ultraviolet and near-infrared regions for most agricultural
applications. Multispectral imaging systems differ from hyperspectral imaging systems
in that the former contain broader bands and typically comprise fewer than ten spectral
bands. Hyperspectral systems, on the other hand, contain narrower bands and usually
comprise hundreds of spectral bands. In other words, hyperspectral systems have finer
spectral resolution than multispectral systems. Multispectral sensors are more portable
than hyperspectral sensors and can be easily integrated into most remote sensing
platforms. The finer spectral resolution of hyperspectral systems cuts both ways:
they can “see” better than their multispectral counterparts and provide more accurate
information about a material, but they also contain a great deal of redundant
information. A number of agricultural applications combine hyperspectral data and
feature selection algorithms in developing multispectral sensors that can be used in the
field.
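The difference between narrow and broad bands can be sketched by averaging several narrow hyperspectral bands into one broad band. This is a simplified illustration that assumes a flat (boxcar) sensor response; the function name, band limits, and reflectance values are hypothetical and are not taken from the thesis data.

```python
def broad_band_mean(wavelengths_nm, reflectance, lo_nm, hi_nm):
    """Average the narrow-band reflectances whose centers fall in [lo_nm, hi_nm]."""
    if len(wavelengths_nm) != len(reflectance):
        raise ValueError("wavelengths and reflectance must have equal length")
    selected = [r for w, r in zip(wavelengths_nm, reflectance) if lo_nm <= w <= hi_nm]
    if not selected:
        raise ValueError("no narrow bands fall inside the requested range")
    return sum(selected) / len(selected)


# Hypothetical narrow bands every 10 nm across the red edge (illustrative values only).
wavelengths = [660, 670, 680, 690, 700, 710, 720, 730, 740, 750]
reflectance = [0.05, 0.04, 0.04, 0.06, 0.12, 0.22, 0.33, 0.41, 0.45, 0.47]

# Two simulated broad bands built from the ten narrow bands.
red = broad_band_mean(wavelengths, reflectance, 660, 690)
red_edge = broad_band_mean(wavelengths, reflectance, 700, 750)
print(f"broad red band: {red:.3f}, broad red-edge band: {red_edge:.3f}")
```

The two broad bands summarize ten narrow ones, which shows both the data reduction and the loss of fine spectral detail that comes with a multispectral design.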
Global Apple Industry
The apple (Malus domestica) is one of the most important fruit crops in the world
mostly because of the numerous ways it can be consumed and the many health
benefits it offers to people. According to the Food and Agriculture Organization (FAO),
the apple ranks second worldwide after the banana in terms of production. Over 80.8 million
metric tons of apples were produced in 2013, and Asia alone accounted for over 60% of
that quantity (FAO, 2013). China and the United States are the top two apple-producing
countries in the world; in 2013, China produced over 40 million tons of apples while
the United States produced about 4 million tons.
Apple Industry in South Korea
The apple industry in Korea is highlighted in this work because the apple data
used in the analysis were acquired there. In 2013, the apple industry added
over $9 million to the nation’s economy, and apple was the second most produced fruit
in the country, with over 395,000 tons harvested. The total cultivation area that same
year was over 30,000 hectares--higher than that of any other fruit grown in the country
(Figure 1-1). Korea grows several varieties of apples, but Fuji is by far the most
important apple cultivar in the country.
Apple Marssonina Blotch
Apple Marssonina blotch (AMB), caused by Diplocarpon mali, is a highly
destructive fungal disease that mainly affects the leaves of apple trees. The disease has
been found in apple orchards in several countries, including Japan, China, Korea,
Brazil, Italy, and Canada. The occurrence of the disease was first recorded in Japan
over a century ago and by the 1980s, the disease had made its way to other countries
in Asia, North America, and Europe. In Korea alone, AMB has caused significant
economic losses with over 50% of apple orchards in the country infected by the disease
(Figure 1-2).
The disease thrives in warm and humid climates and usually occurs between
June and July. It has a long latency period, typically two to five
weeks, after which symptoms develop very rapidly. Symptoms
start off as small brown spots, with dark pin-like fruiting bodies, known as acervuli,
growing in symptomatic areas. At the advanced stage of the disease, leaves turn yellow
and prematurely fall off the tree. The disease reduces both the quality and the quantity
of apples that can be harvested, including reductions in starch content and fruit size.
Burning and burying of defoliated diseased leaves have been adopted by apple
growers to prevent the spread of AMB. Fungicides, such as thiophanate-methyl, are
also sprayed on crops at the appearance of early symptoms before the rainy season
begins. Unfortunately, in 1997, an incidence of thiophanate-methyl resistance in
Diplocarpon mali was found in Japan. As a result of the many management challenges
that exist with AMB, more emphasis is being placed on finding ways to prevent the
disease rather than on controlling it.
Figure 1-1. Comparison of the apple and other fruits in South Korea in 2013. A)
cultivation area, B) production quantity.
[Bar charts. A) Cultivation area (1,000 ha): Apple 31, Pear 14, Grape 17, Peach 14, Citrus 21, Others 62. B) Production (1,000 tons): Apple 395, Pear 173, Grape 278, Peach 202, Citrus 692, Others 635.]
Figure 1-2. AMB distribution in South Korea in 2013 (infected regions are highlighted in red).
CHAPTER 2 LITERATURE REVIEW
Application of Spectroscopy for Plant Disease Detection
Over the past couple of decades, the use of spectroscopy has grown for
precision agriculture applications. This growth was prompted by the economic need of
growers to reduce crop production input, increase yield and maximize profit in an
efficient and environmentally friendly way. Spectroscopy has been used by several
researchers for various quality and quantity assessments of crops, and one major use
has been the detection of diseases. Disease stress can greatly influence the
biochemical properties of plants. Infected plants have been shown to produce spectral
characteristics different from healthy ones due to the difference in the way they absorb
light in the visible and near-infrared spectral regions.
Many researchers have taken advantage of these unique spectral characteristics
in detecting plant diseases. Xu et al. (2007) used several spectral parameters including
single-wavelength reflectance, peak area, and water band index to classify five different
severity stages of leaf miner damage on tomato leaves. They found that spectral
reflectance in the NIR region from 800 nm to 1100 nm decreased significantly with
increasing disease severity, whereas the reverse was the case between
1450 nm and 1900 nm. They achieved the highest correlation coefficient when the
disease severity levels were modeled using the 1450 nm – 1900 nm range.
Jones et al. (2010) also investigated the use of reflectance spectroscopy in the
quantitative analysis of a tomato disease called bacterial leaf spot. They found
significant wavelengths from the absorbance spectra that distinguished between several
degrees of disease infestation using both partial least squares regression (PLSR) and
stepwise multiple linear regression (SMLR). The disease predictive model built from the
spectral data achieved a coefficient of determination (R2) of 0.82. Yuan et al. (2014)
detected powdery mildew, a highly destructive disease in winter wheat. They extracted
32 spectral features and examined them using both an independent t-test and correlation
analysis. PLSR and multivariate linear regression (MLR) were also used in estimating
disease severity. They reported that the PLSR model outperformed the MLR model and
achieved an R2 of 0.8 using seven regression components.
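The coefficient of determination reported in these studies can be computed as in the sketch below. It uses the standard definition R2 = 1 - SS_res / SS_tot; the function name is chosen here for illustration, and the severity values are made up and do not come from any of the cited studies.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical disease severity scores (percent leaf area affected) and model predictions.
actual_severity = [5.0, 12.0, 20.0, 33.0, 47.0]
predicted_severity = [7.0, 10.0, 24.0, 30.0, 45.0]
print(f"R2 = {r_squared(actual_severity, predicted_severity):.3f}")
```

A value of 1 means the predictions reproduce the measured severity exactly, while a value of 0 means the model does no better than predicting the mean severity for every sample.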
Huanglongbing (HLB), arguably the most severe disease affecting the citrus
industry in Florida and other regions of the world, has been analyzed by several
researchers using spectroscopic technology. One such analysis, conducted by
Sankaran et al. (2011), yielded an accuracy of 98% for HLB detection using a quadratic
discriminant analysis classification algorithm. The raw spectral data,
originally consisting of 989 spectral features extracted from the wavelength range of
350 nm to 2500 nm, were first reduced to 86 spectral features and then to 24
features using a feature extraction algorithm known as principal component analysis
(PCA).
In 2001, transmittance and reflectance spectroscopy was reported to be a
preeminent tool for quickly detecting aflatoxin in corn. In the study, more than 95% of
corn kernels tested using transmittance or reflectance spectroscopy were correctly
classified as containing either high or low levels of aflatoxin (Pearson et al., 2001).
Laser-induced fluorescence spectroscopy has also been used successfully to detect
canker disease caused by the bacterium Xanthomonas axonopodis pv. citri in citrus
plants (Belasque Jr et al., 2008).
In 2008, it was reported that a preliminary study of visible and near-infrared
reflectance spectroscopy produced a model that appears to be valuable in the early
detection of Botrytis cinerea on non-symptomatic eggplant leaves. The resulting back
propagation neural network (BP-NN) model was found to have an accuracy of
70-85% in predicting fungal infections (Wu et al., 2008). Researchers have also
anticipated that spectroscopy will become an important component of food safety. For
example, near-infrared spectroscopy for the detection of organic matter has
been found to be non-destructive, accurate, and easy to implement. As a result,
near-infrared spectroscopy has been used to identify mycotoxigenic fungi and their
toxic metabolites in maize crops (Berardo et al., 2005).
Application of Multispectral and Hyperspectral Imaging for Plant Disease Detection
Remote sensing has been shown to be a useful tool for monitoring the
heterogeneity of crop vitality within agricultural sites (Franke & Menz, 2007). In 2001,
airborne multispectral scanners were shown to be effective in detecting the occurrence of
rice panicle blast using a combination of the 530 nm – 570 nm and 650 nm – 700 nm
bands (Kobayashi et al., 2001). A ground-based real-time remote
sensing system for detecting pre-symptomatic yellow rust disease in winter wheat crops
was also developed by combining multispectral fluorescence imaging and
hyperspectral reflectance imaging, with an overall error of about 5.5%. In addition, data
fusion using a self-organizing map neural network decreased the overall classification
error to 1% (Moshou et al., 2005).
The agricultural industry desires a method to detect fungal infections as early as
possible. Therefore, multispectral remote sensing has been explored for the analysis of
winter wheat infected with powdery mildew and leaf rust pathogens at
various stages of infection. Classification accuracies for the infections were between
56.8% and 88.6% during the trials, which indicates moderate success in the early
detection of infection (Franke & Menz, 2007).
Citrus canker has continuously threatened the marketability of citrus crops. Thus,
a hyperspectral imaging approach was developed to detect canker lesions on Ruby
Red grapefruit, with a resulting accuracy of 95.2%. It was concluded that the hyperspectral
imaging technique coupled with the spectral information divergence (SID) based image
classification method was effective in discriminating citrus canker from other surface
diseases (Qin et al., 2009). HLB, another threatening citrus disease, has been detected
using both multispectral and hyperspectral images acquired from aerial platforms.
Kumar et al. (2012) detected HLB-infected trees from aerial images obtained from a
citrus grove in Florida with an accuracy of 87% using mixture tuned matched filtering
(MTMF) and spectral angle mapping (SAM) algorithms. They used an
aerial platform rather than a ground-based one to
expedite the detection of the disease in very large citrus groves, thus allowing
growers to apply more efficient management practices to HLB-infected regions.
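The two spectral similarity measures named above can be sketched as follows. The code uses the standard textbook definitions of the spectral angle and of spectral information divergence; it is not the implementation used by Qin et al. (2009) or Kumar et al. (2012), and the function names and example spectra are hypothetical.

```python
import math


def spectral_angle(x, y):
    """Angle in radians between two spectra treated as vectors (smaller = more similar)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point round-off before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_theta)


def spectral_information_divergence(x, y):
    """Symmetric relative entropy between two spectra normalized to sum to one.

    Assumes every band value is strictly positive.
    """
    if any(v <= 0 for v in x) or any(v <= 0 for v in y):
        raise ValueError("all band values must be strictly positive")
    sum_x, sum_y = sum(x), sum(y)
    p = [v / sum_x for v in x]
    q = [v / sum_y for v in y]
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical four-band reflectance spectra (illustrative values only).
reference = [0.05, 0.08, 0.04, 0.45]
brighter_copy = [0.10, 0.16, 0.08, 0.90]   # same shape, doubled brightness
lesion_like = [0.09, 0.12, 0.11, 0.30]     # different spectral shape

for name, spectrum in [("brighter copy", brighter_copy), ("lesion-like", lesion_like)]:
    angle = spectral_angle(reference, spectrum)
    sid = spectral_information_divergence(reference, spectrum)
    print(f"{name}: angle = {angle:.4f} rad, SID = {sid:.4f}")
```

Both measures depend on spectral shape rather than overall brightness, so a uniformly brighter copy of the reference scores close to zero while a spectrum with a different shape scores higher.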
Hyperspectral imaging was also found to be a valuable tool in detecting crop
diseases at an early stage using a procedure based on both support vector machines
and spectral vegetation indices. This method distinguished diseased from non-diseased
sugar beet leaves as well as differentiated between leaves infected by the pathogens
Cercospora beticola, Uromyces betae, and Erysiphe betae at the asymptomatic stage
with accuracies between 65% and 90% (Rumpf et al., 2010).
Also, a study of sugarcane areas affected by orange rust disease found that
hyperspectral imagery can be used to detect the disease in sugarcane crops. The
combination of visible and near-infrared spectral bands with the moisture-sensitive
band, 1660 nm, yielded increased ability to identify rust-affected areas. However,
disease-water stress indices (R800/R1660; R1660/R550; (R800+R550)/(R1660+R680))
performed the best in targeting affected areas (Apan et al., 2004).
CHAPTER 3 APPLE MARSSONINA BLOTCH DETECTION USING INDOOR
SPECTRORADIOMETER DATA
Background
The apple is a very important fruit crop and ranks second after banana in terms
of production globally (FAO, 2013). Based on 2013 statistics from the Food and
Agriculture Organization (FAO), over 60% of the world’s apples are grown in Asia.
Unfortunately, apple production has been declining in recent years in some of the top
producing countries in the world including China, Japan, India, and South Korea. A
significant part of this decline is attributed to a disease called apple Marssonina blotch
(AMB). AMB is a severe fungal disease that primarily affects apple tree leaves, and it is
caused by a pathogen called Diplocarpon mali. The first appearance of the disease was
recorded in Japan in 1907, and by the 1980s the disease had spread to some
other countries in Asia, Europe and North America (Harada, 1974; Lee et al., 2011; Lee
& Shin, 2000; Tamietti & Matta, 2003). A case in point of the prevalence of the disease
is in Korea. The country has suffered significant economic losses with over 50% of
apple orchards infected by the disease.
AMB disease occurs in the summer after periods of extended rainfall, and it
thrives in high humidity, warm temperature, and high rainfall climates. It is dispersed by
wind and rain and spreads in two major ways during the apple growing season. The primary
form of infection is caused by ascospores that are released from overwintered apothecia
in fallen leaves while the secondary infection is caused by asexually produced fungal
spores in the acervuli (EPPO, 2013). A long latency period of two to five weeks is typical
for AMB and symptoms begin to develop after this time. At the early symptomatic stage
of the disease, small grayish black or brownish spots appear on the surface of the leaf.
The disease then progresses to a stage where the spots coalesce, and necrotic and
chlorotic spots appear. The size of the spots keeps growing until leaves turn yellow and
prematurely fall off the tree. This defoliation affects the quality and quantity of apples on
a tree, including a reduction in fruit size and starch content.
The primary preventative methods being adopted by apple growers whose fields
have been infected by the disease are burning and burying of defoliated leaves.
Treatments for AMB, including thiophanate-methyl fungicide, exist for the control of
the disease; however, there have been reports of the pathogen being resistant
to some of these treatments (Tanaka et al., 2000). At this point, growing AMB-resistant
cultivars might be one of the few ways to control its spread economically, reliably, and
efficiently, and some researchers have been working on finding disease-resistant
cultivars and species (Yin et al., 2013). Checking for the occurrence of the disease by
visually inspecting each tree in the field is challenging, not only because it is
time-consuming, but also because diseased leaves may easily be missed given the
long latency period of the disease. In addition, at the
symptomatic stage, AMB can be mistaken for other apple blotch-like diseases (Back et
al., 2015). As a result of the many management challenges AMB is posing to growers, it
is imperative that methods that can be used for its early diagnosis be developed without
delay.
Oberhänsli et al. (2014) used quantitative fluorogenic polymerase chain reaction
(qPCR) for the early diagnosis of the disease. The authors concluded that the method
efficiently diagnosed AMB infected leaves and did so more accurately than visual
diagnosis by growers and other AMB experts. Methods like qPCR, however, are
destructive in nature and require that leaves be plucked from a tree for analysis. The
use of non-destructive methods for the detection of the disease is still relatively new and
to the best of the authors’ knowledge, only one technique has been explored in
literature. Lee et al. (2012) reported diagnosing the disease at the early and
asymptomatic stages using a non-invasive tool called optical coherence tomography.
From two-dimensional (2D) and three-dimensional (3D) imaging scans created by the
system, they were able to find distinctive differences between the inner cross-sectional
layers of healthy and diseased leaves. They concluded that an early stage AMB
detection tool can be developed based on an upgraded version of the system. There
was no mention, however, of whether the technology could be used for quantitatively
analyzing the disease.
Spectroscopy is extensively used in precision agriculture for assessing the
general health of crops (Del Fiore et al., 2010; Gómez-Sanchis et al., 2008; Graeff et
al., 2006; Qin et al., 2008). It is preferred to some other tools being used for qualitative
and quantitative analyses in agriculture because it is inexpensive, accurate, fast, and
non-destructive (Del Fiore et al., 2010; Li et al., 2007; Roggo et al., 2002; Sankaran et
al., 2010). Some researchers have used spectral analysis in detecting plant diseases.
Jones et al. (2010) and Xu et al. (2007) showed the potential of spectral technology in
the detection of bacterial leaf spot and leafminer diseases in tomatoes. They
successfully created disease prediction models capable of diagnosing tomato diseases
at different severity levels of infestation. The notorious Huanglongbing (HLB) disease of
citrus has also been detected with an accuracy of 87% using spectral features
developed from reflectance spectral data (Sankaran et al., 2013).
Spectroscopic data usually possesses high numbers of spectral bands with some
data containing hundreds or even thousands of bands. With so many spectral bands,
the feature space of a given spectral data could potentially contain tens of thousands of
features (Thenkabail et al., 2011). In most situations, some of these features hold little
or no information about the target of interest and including them in information
extraction processes such as classification could slow down the process and cause
inaccurate classification results. One way of dealing with this problem is to apply
preprocessing techniques such as feature selection and extraction to the spectral data
before classification is performed. A number of methods for feature selection and
extraction exist with the most popular being spectral band selection, spectral indices,
and projection pursuit measures. Many researchers have investigated several
dimensionality reduction methods based on these measures (Bruce et al., 2002; Jia &
Richards, 1999; Martínez-Usó et al., 2007; Wang & Angelopoulou, 2006; Yang et al.,
2012; Zhang et al., 2005). Feature selection methods are generally preferred to feature
extraction methods because the latter do not preserve the original information; instead, they
produce transformed features. This is especially undesirable in situations where the
goal is to build a multispectral sensor based on selected bands or features.
The primary objective of this work was to evaluate the potential of using spectral
data for the identification of AMB disease. The specific objectives were to:
i. determine optimal spectral features for AMB disease detection, and
ii. develop a classification algorithm capable of distinguishing between diseased and healthy leaves.
Materials and Methods
Data Collection
Leaf samples used in this work were acquired from Fuji apple trees in an
experimental apple orchard located in Gunwi-city, Korea. A test area measuring 40 m x
60 m, with a total of 260 trees, was set aside for the experiment. Datasets were
acquired during the fall season in two consecutive years–2014 and 2015. Before leaves
were plucked from the trees and analyzed, the trees were inoculated with AMB spores
in order to facilitate disease development. Leaf samples were collected on different
days to ensure leaves that were healthy and others that showed varying degrees of
AMB infection were included in the dataset. Molybdenum and manganese nutrient
deficient leaves were also included in the dataset due to their similarity in color with
some of the other classes.
Indoor spectral measurements were obtained from the samples immediately after
they were plucked from the trees. This was done so as to minimize damage to the cell
structure of the leaves. Reflectance spectral information was acquired between
wavelengths 350 nm and 2500 nm using a spectroradiometer system that consisted of a
spectroradiometer (Field Spec 3, ASD Inc., Boulder, CO, USA) and a plant probe with
an attached leaf clip (Leafclip Assembly A122325, ASD Inc., Boulder, CO, USA). The
clip had a spot size of 10 mm, and it was used in holding leaves in place while spectral
measurements were obtained. Before the leaves were clipped to the plant probe, a
white polytetrafluoroethylene (PTFE) reflector material with 99% reflectance was clipped
to the plant probe and used in calibrating the system to reflectance. After spectral
measurements of leaf samples had been acquired, they were transferred and stored on
a laptop (Intel Core i7-3720QM, HP, USA). The spectroradiometer system had a
spectral resolution of 1 nm, three detectors (a VNIR detector (350-1000 nm), a
SWIR1 detector (1000-1800 nm), and a SWIR2 detector (1800-2500 nm)), and a
scanning time of 100 milliseconds. Stable illumination was produced throughout
the spectral range using a halogen lamp (ASD Illuminator - 70 watts, ASD Inc., Boulder,
CO, USA).
A total of 621 and 751 samples were acquired in 2014 and 2015, respectively.
Spectral measurements from both years contained 2151 spectral bands and five
different classes were defined for the datasets as shown in Figure 3-1: mature healthy
(healthM), young healthy (healthY), early stage AMB diseased (ambE), advanced stage
AMB diseased (ambA), and molybdenum/manganese nutrient deficient (nd). Aside from
the healthy classes, all other classes had, at least, two different colored regions on the
leaves, and as earlier stated, a leaf clip with a spot size of 10 mm was used in the
spectral data acquisition process. This large spot size made it impossible to capture
spectral information of individual symptomatic and asymptomatic regions; what was
captured was the average reflectance spectra of all the colors enclosed within the
clipped area, thus resulting in spectrally mixed data for AMB diseased and nutrient
deficient samples. The ambE spectral class was a spectral mixture of reflectance data
from brown and green portions of early stage AMB diseased leaves. Most samples in
the ambA spectral class were spectral mixtures of just green, yellow and brown colored
regions, but some of its other samples also had orange symptomatic regions included in
their spectral mixtures. The nd spectral class was a spectral mixture of light and dark
green portions of nutrient deficient samples.
Vegetation Indices
A number of vegetation indices were developed and tested on the five classes
including carotenoid reflectance index (CRI), sum green index (SGI) and normalized
difference vegetation index (NDVI). Among those tested, only anthocyanin reflectance
index 1 (ARI1) and a newly developed index called matrix-adjusted red edge position
(MAREP), which is based on the 4-point interpolation red edge position, were effective in
discriminating among the classes. MATLAB R2014a (Version 8.3, The MathWorks Inc.,
Natick, MA, USA) was used in developing vegetation indices.
Anthocyanins are water-soluble pigments that impart the color of plant leaves.
After chlorophyll, anthocyanins are the most important group of pigments found in the
visible range of the spectrum for assessing vegetation health. Anthocyanins are typically
found in more abundance in stressed vegetation than in healthy ones. Anthocyanin
reflectance index 1 ($ARI_1$) is calculated using reflectance found in two very significant
wavelengths for plants: 550 nm and 700 nm, and it is given by the following equation:
$$ARI_1 = \frac{1}{\rho_{550}} - \frac{1}{\rho_{700}} \tag{3-1}$$
where $\rho_{550}$ represents the reflectance at the nitrogen absorption band of 550 nm and
$\rho_{700}$ represents the reflectance at the red band of 700 nm.
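As a minimal sketch of equation (3-1), the index can be computed from the two reflectance values as shown below. The function name `ari1` and the example reflectance values are hypothetical and are not taken from the thesis data; the original work used MATLAB rather than Python.

```python
def ari1(rho_550, rho_700):
    """Anthocyanin reflectance index 1, per equation (3-1): 1/rho_550 - 1/rho_700."""
    if rho_550 <= 0 or rho_700 <= 0:
        raise ValueError("reflectance values must be positive")
    return 1.0 / rho_550 - 1.0 / rho_700


# Hypothetical reflectance fractions (not measured values).
example_ari1 = ari1(rho_550=0.10, rho_700=0.20)
print(example_ari1)
```

Because the index is a difference of reciprocals, it rises as reflectance at 550 nm falls relative to reflectance at 700 nm.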
The spectral reflectance curve of vegetation contains a region that abruptly
changes from low to high reflectance between the red and near infrared (NIR) range of
the spectrum. This point of inflection is referred to as red edge, and weakened
vegetation typically has this position shifted towards shorter wavelengths—a
phenomenon known as blue shift. Several existing algorithms for calculating the red
edge position (REP) were assessed, but none showed as good a separation as a
newly developed vegetation index created by the authors based on the 4-point
interpolation approach of calculating REP. This vegetation index is called matrix-
adjusted red edge position (MAREP). MAREP, unlike the 4-point interpolation REP
method which performs element-wise operations on any given class dataset, uses a
matrix based approach in computing its vegetation index. The algorithm takes into
account the combined effect a given cluster or class has on each of its samples and as
a result, it allows for better separation among samples in different classes.
It should be noted that for MAREP to work efficiently, information about the
location of class samples is required and in a situation where this information is not
known, it is recommended first to apply unsupervised clustering to the dataset before
applying MAREP. The equations in (3-2) and (3-3) represent the 4-point interpolation
REP and MAREP algorithms, respectively.
$$REP = 700 + 40\,\frac{R_{rededge} - \rho_{700}}{\rho_{740} - \rho_{700}} \tag{3-2}$$
$$MAREP = 700 + 40\,\frac{\sum_{c,\,i=1}^{k}\left(R_{rededge} - \rho_{700_i}\right)}{\sum_{c,\,i=1}^{k}\left(\rho_{740_i} - \rho_{700_i}\right)} \tag{3-3}$$
$$R_{rededge} = \frac{\rho_{670} + \rho_{780}}{2} \tag{3-4}$$
where $R_{rededge}$ is the reflectance at the inflection point; $\rho_{670}$, $\rho_{780}$, $\rho_{700}$, and $\rho_{740}$
represent reflectance at 670 nm, 780 nm, 700 nm, and 740 nm, respectively. The
constants 700 and 40 are values resulting from interpolating between the 700 and 740
nm spectral range, $c$ represents a particular cluster or class, and $k$ represents the
number of samples in the cluster.
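The element-wise 4-point interpolation REP in (3-2) and (3-4) can be sketched for a single sample as follows. This covers only the standard REP calculation, not MAREP; the function names and the reflectance values are hypothetical illustrations rather than thesis data.

```python
def red_edge_reflectance(rho_670, rho_780):
    """Reflectance at the inflection point, per equation (3-4)."""
    return (rho_670 + rho_780) / 2.0


def rep_four_point(rho_670, rho_700, rho_740, rho_780):
    """4-point interpolation red edge position in nm, per equation (3-2)."""
    denominator = rho_740 - rho_700
    if denominator == 0:
        raise ValueError("rho_740 and rho_700 must differ")
    r_rededge = red_edge_reflectance(rho_670, rho_780)
    return 700.0 + 40.0 * (r_rededge - rho_700) / denominator


# Hypothetical reflectance fractions for one leaf sample.
example_rep = rep_four_point(rho_670=0.05, rho_700=0.10, rho_740=0.40, rho_780=0.55)
print(round(example_rep, 2))
```

A sample whose inflection-point reflectance lies closer to the 700 nm reflectance yields a REP nearer 700 nm, which corresponds to the blue shift described for weakened vegetation.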
Jeffries-Matusita Orthogonal Subspace Projection (JM-OSP) Band Selection
Most band selection methods do not take into account feature redundancy. If
redundancy is not accounted for, the computed optimal spectral bands by a feature
selection algorithm could all be concentrated in one spectral region having very similar
information. To minimize this effect, a very robust distance measure called the
Jeffries-Matusita (JM) distance was used as a criterion for removing redundant spectral bands
before feature selection was performed. JM distance is traditionally used for performing
class separability operations; but in this work, it was modified for the task of minimizing
feature redundancy. The JM distance algorithm computes the distance between density
functions of two classes or features based on Bhattacharyya distance with an
assumption of Gaussian class distributions made in order to simplify the computation of
Bhattacharyya distance. A JM distance value of 1.414 (the maximum, $\sqrt{2}$) suggests two spectral bands
contain very distinct information about the classes to be separated and thus, would be
good candidates for the band selection process. The JM distance, $JM$, for any pair
of spectral bands is given as:
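$$JM = \left[2\left(1 - e^{-D_B}\right)\right]^{1/2} \tag{3-5}$$
$$D_B = \frac{D_M}{8} + \frac{1}{2}\ln\left[\frac{\left|(C_{s_i} + C_{s_j})/2\right|}{\left(|C_{s_i}|\,|C_{s_j}|\right)^{1/2}}\right] \tag{3-6}$$
$$D_M = \left[(\mu_{s_i} - \mu_{s_j})^{T}\left(\frac{C_{s_i} + C_{s_j}}{2}\right)^{-1}(\mu_{s_i} - \mu_{s_j})\right]^{1/2} \tag{3-7}$$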
where $D_B$ is the Bhattacharyya distance and $D_M$ is the Mahalanobis distance. $\mu_{s_i}$, $\mu_{s_j}$ and
$C_{s_i}$, $C_{s_j}$ are the mean and covariance of reflectance data in bands $i$ and $j$, respectively.
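A minimal sketch of the JM distance between two bands is given below, under two stated assumptions: each band is treated as a one-dimensional Gaussian (so the covariances reduce to scalar variances), and the first Bhattacharyya term uses the squared Mahalanobis distance, as in the standard Bhattacharyya definition. Whether this matches the exact implementation used in this work is not confirmed by the text; the function name and sample values are hypothetical.

```python
import math
from statistics import mean, variance


def jm_distance(band_i, band_j):
    """JM distance between two bands, each treated as a 1-D Gaussian (assumption)."""
    mu_i, mu_j = mean(band_i), mean(band_j)
    var_i, var_j = variance(band_i), variance(band_j)
    if var_i <= 0 or var_j <= 0:
        raise ValueError("each band needs non-zero variance")
    var_avg = (var_i + var_j) / 2.0
    # Squared Mahalanobis term of the standard Bhattacharyya distance.
    mahalanobis_sq = (mu_i - mu_j) ** 2 / var_avg
    d_b = mahalanobis_sq / 8.0 + 0.5 * math.log(var_avg / math.sqrt(var_i * var_j))
    d_b = max(d_b, 0.0)  # guard against tiny negative values from rounding
    return math.sqrt(2.0 * (1.0 - math.exp(-d_b)))


# Hypothetical reflectance values for two bands across the same three samples.
similar = jm_distance([0.10, 0.20, 0.30], [0.10, 0.20, 0.30])
distinct = jm_distance([0.10, 0.11, 0.12], [0.80, 0.81, 0.82])
print(similar, distinct)
```

Identical bands give a distance of zero, while bands with well-separated distributions approach the upper bound of about 1.414.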
Du and Yang (2008) proposed a band selection method based on using a similarity
metric that is commonly employed for endmember extraction called orthogonal subspace
projection (OSP). OSP, unlike other similarity measures which take measurements from
pairs of bands, evaluates bands jointly. The algorithm is less computationally expensive
than most other band selection methods because the others find optimal band
combinations by performing an exhaustive search; whereas, OSP performs a sequential
forward search to find the best bands.
OSP performs band selection as follows: assuming there are M bands in the
original dataset, in order to find the first band, the algorithm randomly selects a band for
band 1, A1 and projects all the other M-1 bands to its orthogonal subspace. It then finds
a second band, A2, with the maximum projection in A1’s orthogonal subspace; this is
considered as the band most dissimilar to A1. All other M-2 bands are now projected on
A2’s orthogonal subspace, and band A3 is chosen as the band with maximum projection
in A2’s orthogonal subspace. The algorithm continues until Ai+1 = Ai-1. When this occurs,
Ai+1 is selected as the true band 1, B1 and Ai is selected as the true band 2, B2. To find
the third band, B3 (and subsequent bands), the algorithm finds the band that is most
dissimilar from B1 and B2 by using the orthogonal subspace, 𝐏, of both bands defined
below:
$$\mathbf{P} = \mathbf{I} - \mathbf{Z}(\mathbf{Z}^{T}\mathbf{Z})^{-1}\mathbf{Z}^{T} \tag{3-8}$$
where $\mathbf{I}$ is an N x N identity matrix, N is the number of pixels per band, and $\mathbf{Z}$ is an N x 2
matrix whose first column contains all pixels in band 1, B1, and whose second column
contains all pixels in band 2, B2. (The superscript $T$ denotes the transpose.)
After implementing (3-8), the projection $\mathbf{y}_0 = \mathbf{P}^{T}\mathbf{y}$ is computed, where $\mathbf{y}$ includes all
pixels in a band B of the original dataset and $\mathbf{y}_0$ is the component of B in the orthogonal subspace
of B1 and B2. The band that yields the maximum orthogonal component $\|\mathbf{y}_0\|$ is considered
the most dissimilar band to the first two bands and will be selected as band B3. For finding
subsequent bands, $\mathbf{Z}$ in (3-8) is extended to [B1 B2 B3] and then to [B1 B2 B3 B4]
and so forth until the desired number of bands is selected. (Note: $\|\mathbf{y}_0\|$ denotes the
norm of $\mathbf{y}_0$.)
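The sequential forward search described above can be sketched as follows, assuming the first two bands, B1 and B2, have already been found. Instead of forming the matrix in (3-8) explicitly, this sketch removes from each candidate band its components along an orthonormal basis of the selected bands (Gram-Schmidt), which yields the same orthogonal component when the selected bands are linearly independent. The function names and the toy data are hypothetical, and the search for the initial pair is omitted.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthogonal_component(y, basis):
    """Part of y orthogonal to the span of an orthonormal basis."""
    residual = list(y)
    for q in basis:
        coeff = dot(residual, q)
        residual = [r - coeff * qi for r, qi in zip(residual, q)]
    return residual


def osp_select(bands, first, second, n_bands, tol=1e-12):
    """Sequential forward OSP band selection starting from two given bands."""
    selected, basis = [], []
    for idx in (first, second):
        residual = orthogonal_component(bands[idx], basis)
        norm = math.sqrt(dot(residual, residual))
        if norm <= tol:
            raise ValueError("starting bands must be linearly independent")
        selected.append(idx)
        basis.append([r / norm for r in residual])
    while len(selected) < n_bands:
        best_idx, best_norm, best_residual = None, tol, None
        for idx, band in enumerate(bands):
            if idx in selected:
                continue
            residual = orthogonal_component(band, basis)
            norm = math.sqrt(dot(residual, residual))
            if norm > best_norm:
                best_idx, best_norm, best_residual = idx, norm, residual
        if best_idx is None:  # no remaining band adds new information
            break
        selected.append(best_idx)
        basis.append([r / best_norm for r in best_residual])
    return selected


# Toy data: five "bands", each with three pixel values.
toy_bands = [[1, 0, 0], [0, 1, 0], [0.9, 0.1, 0], [0, 0, 0.5], [0.5, 0.5, 0.1]]
print(osp_select(toy_bands, first=0, second=1, n_bands=3))
```

In the toy data, the third band is a combination of the first two and is never chosen, which illustrates how the search avoids redundant bands.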
Results and Discussion
Spectral Feature Analysis
The mean reflectance spectra for the five classes defined for 2014 and 2015
datasets are given in Figure 3-2. Upon inspection of the figure, it can be seen that the
signatures of the five classes were notably different. The slight differences in signatures
for the same class in the two years can be attributed to varying fractional abundances of
colors. In both plots, at the nitrogen absorption band of 550 nm, it can be seen that healthier
samples have a lower peak when compared to stressed samples. Healthy leaves
generally have higher concentrations of nitrogen because they contain more chlorophyll
than stressed vegetation. Higher measured reflectance at this band was observed
only for stressed samples, indicating lower levels of nitrogen in them.
The reflectance at the near infrared (NIR) region, between 700 nm and 1000 nm,
was high for all classes due to the internal scattering of light within leaves’ structure—a
phenomenon present in all vegetation. As the water content in leaves increases, so
does the absorption strength at bands 1450 nm and 1950 nm. From the plots, it can be
seen that healthier samples have stronger absorption at these bands than stressed
vegetation, indicating that they contain higher amounts of water.
JM-OSP Band Selection
The original dataset from 2014 and 2015 contained a total of 2151 spectral
bands. The JM distance was calculated for each pair of bands and resulted in a total of
2151(2151-1)/2 distances as depicted in Figure 3-3. The main diagonal in the figure
represents the JM distance between same band pairs and as expected, resulted in zero
JM distances. Higher JM distances between bands suggest those bands have higher
class separability than bands with lower values, and they would be good candidates for
the band selection process. It can also be seen from the figure that spectral bands
roughly between wavelength range pairs 400 nm – 700 nm and 720 nm – 1400 nm, and
between 700 nm – 1400 nm and 1800 nm – 2500 nm have very high JM distances. A
threshold of 1.4 was applied to the data and spectral bands above this threshold (1915
in total) were selected for OSP band selection. In order to determine the appropriate
number of optimal bands required for the classification process, a stopping criterion for
the OSP algorithm was created using a cumulative average entropy approach. For the
combined 2014 and 2015 datasets, the algorithm stopped at the fifth iteration and
resulted in optimal bands at wavelengths 790 nm, 1384 nm, 695 nm, 1076 nm and 1716
nm.
ARI1 and MAREP
Two-dimensional scatter plots were created for 2014 and 2015 datasets to
assess visually how well ARI1 and MAREP worked in separating the classes, and they
are shown in Figure 3-4. From the figure, it can be seen samples in ambA class had the
lowest MAREP values and could easily be separated from the other classes. Working
with the same principle as would have been applied to the 4-point interpolation REP
method, MAREP shows healthier samples have higher values than stressed samples,
indicating that they contain higher concentrations of chlorophyll. As expected, the ARI1
values for diseased samples were generally higher than those for healthy and nutrient
deficient samples. There was a slight overlap between some healthM samples and nd
samples. This was because samples in the nd class had different proportions of nutrient
deficient symptoms, with some samples having more abundances of
healthy/asymptomatic regions than stressed areas.
Classification Based On JM-OSP Bands
Because of the small sample size in each class, and to avoid bias and ensure that the
results of this work would generalize well to an independent dataset,
three-fold cross validation was applied to the spectral data. Classification performance
across the individual folds was averaged to produce a single set of classification
results. The spectral bands selected by the JM-OSP algorithm were used as input
features for three classifiers: quadratic discriminant analysis (QDA), neural network, and
discriminant tree. Spectral datasets from both years were not combined for the
classification process because there were slight differences in their reflectance
amplitudes. Classification results achieved for 2014 and 2015 datasets using the
aforementioned classifiers are given in Tables 3-1 to 3-6. The number of correctly
classified samples together with their corresponding percentage accuracies are
provided in the cells in the main diagonal while values in the other cells indicate
misclassification errors.
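The three-fold cross validation described above can be sketched as a partition of sample indices into three folds, each used once for testing while the other two are used for training. The text does not state how samples were assigned to folds, so the shuffling, the fixed seed, and the function names below are assumptions for illustration only.

```python
import random


def three_fold_splits(n_samples, n_folds=3, seed=0):
    """Return (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment is an assumption
    folds = [indices[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, test))
    return splits


def mean_accuracy(fold_accuracies):
    """Average the accuracy obtained on each fold."""
    return sum(fold_accuracies) / len(fold_accuracies)


# 621 samples, as in the 2014 dataset.
splits_2014 = three_fold_splits(621)
print([len(test) for _, test in splits_2014])
```

Each sample appears in exactly one test fold, so every sample contributes once to the averaged classification results.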
Comparing the classification results achieved by all three classifiers for 2014
dataset, given in Tables 3-1 to 3-3, it can be seen that the QDA classifier achieved the
highest overall accuracy of 91.9%, followed by the neural network classifier with an
overall accuracy of 89.2% and finally, the discriminant tree classifier with an overall
accuracy of 88.5%. Both QDA and discriminant tree had the highest accuracy for the
ambE class with an accuracy of 79.4%, and a total of 27 of its samples were
misclassified as healthM, healthY and nd. QDA and neural network both classified the
ambA class with an accuracy of 90.4% and misclassified 11 of its samples as ambE,
healthY, and nd. For the healthM class, QDA achieved the highest accuracy of 95.6%,
followed by discriminant tree with an accuracy of 89.5%, and the neural network
classifier achieved the lowest accuracy for this class with an accuracy of 89.5% and
misclassified 12 of its samples as ambE and nd. All three classifiers correctly classified
the healthY class with an accuracy over 97%, and it was the class that each of the three
classifiers classified most accurately. Neural network performed the worst in
classifying the nd class. It correctly classified 84.5% of its samples and misclassified 10
as ambE and healthM. QDA and discriminant tree, on the other hand, achieved an
accuracy of 93.9% and 93.2%, respectively.
The classification results for the 2015 dataset are given in Tables 3-4 to 3-6.
Comparing the classification performance by all three classifiers, it can be seen that the
QDA classifier again achieved the highest overall accuracy of 92.1%, followed by the
neural network classifier with an overall accuracy of 89.2% and finally, the discriminant
tree classifier with an overall accuracy of 86.1%. All three classifiers achieved an
accuracy of over 86% for the ambE class with QDA leading with an accuracy of 88.5%.
The ambA class had the overall highest accuracy among all the classes with an
accuracy of 98.2% achieved by the QDA classifier; two of its samples were
misclassified as ambE and one as nd. For the healthM class, QDA achieved the highest
accuracy of 86.7%, followed by discriminant tree with an accuracy of 78.3%, and the
neural network classifier achieved the lowest accuracy for this class with an accuracy of
86.7% and misclassified 27 of its samples as ambE, healthY, and nd.
All three classifiers correctly classified the healthY class with an accuracy over
95%. This time, discriminant tree performed the worst in classifying the nd class. It
correctly classified 81.2% of its samples while QDA and neural network achieved an
accuracy of 92.9% and 85.3%, respectively.
Classification Based On ARI1 and MAREP Features
ARI1 and MAREP were used as input features for a quadratic discriminant
analysis (QDA) classifier. A QDA classifier was chosen over neural network and
discriminant tree classifiers because it performed better in mapping input data to their
respective classes; as a result, classification results from the other two
classifiers are not shown.
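To illustrate how a QDA classifier uses two features such as ARI1 and MAREP, the following sketch fits a separate mean, covariance, and prior for each class and assigns a sample to the class with the highest quadratic discriminant score. It is a generic two-feature QDA written for illustration, not the MATLAB implementation used in this work, and the feature values are hypothetical.

```python
import math


def fit_qda(samples, labels):
    """Estimate per-class mean, 2x2 covariance (s11, s12, s22), and prior."""
    model = {}
    for label in sorted(set(labels)):
        pts = [s for s, lab in zip(samples, labels) if lab == label]
        n = len(pts)
        m1 = sum(p[0] for p in pts) / n
        m2 = sum(p[1] for p in pts) / n
        s11 = sum((p[0] - m1) ** 2 for p in pts) / (n - 1)
        s22 = sum((p[1] - m2) ** 2 for p in pts) / (n - 1)
        s12 = sum((p[0] - m1) * (p[1] - m2) for p in pts) / (n - 1)
        model[label] = ((m1, m2), (s11, s12, s22), n / len(samples))
    return model


def qda_score(x, params):
    """Quadratic discriminant score for one class."""
    (m1, m2), (s11, s12, s22), prior = params
    det = s11 * s22 - s12 * s12
    d1, d2 = x[0] - m1, x[1] - m2
    # (x - mu)^T Sigma^{-1} (x - mu) using the closed-form 2x2 inverse.
    mahalanobis_sq = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return -0.5 * math.log(det) - 0.5 * mahalanobis_sq + math.log(prior)


def predict_qda(model, x):
    return max(model, key=lambda label: qda_score(x, model[label]))


# Hypothetical (ARI1, MAREP) feature pairs, not thesis measurements.
train_x = [(1.0, 720), (1.2, 719), (0.8, 721), (1.1, 722), (0.9, 718),
           (4.0, 705), (4.3, 704), (3.7, 706), (4.1, 707), (3.9, 703)]
train_y = ["healthy"] * 5 + ["ambE"] * 5
qda_model = fit_qda(train_x, train_y)
print(predict_qda(qda_model, (1.05, 720.5)), predict_qda(qda_model, (3.8, 704.5)))
```

Because each class has its own covariance, the decision boundary between two classes is a quadratic curve in the feature plane.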
The classification results that were achieved for 2014 and 2015 datasets are
given in Tables 3-7 and 3-8, respectively. Again, the number of correctly classified
samples together with their corresponding percentage accuracies are given in the cells
in the main diagonal. Values in the other cells indicate misclassification errors. For 2014
dataset, the overall accuracy from the classification analysis was 98.4%. Samples
belonging to healthY, ambE, and ambA classes were all perfectly classified. From
Figure 3-4a, it can be seen there was no overlap between the ARI1- MAREP features of
the three classes and those of other classes, thus explaining why the classifier was able
to classify them efficiently. The class healthM had the lowest classification accuracy of
93.9% with a total of five samples misclassified as nd and two samples misclassified as
ambE. The classification accuracy for the nd class was 4% higher than that of the
healthM class. Two of its samples were misclassified as ambE while only one was
misclassified as healthM. One major reason some samples in healthM and nd classes
were misclassified as either ambE, nd or healthM was that ambE and nd samples
still retained some healthy and asymptomatic regions even after symptoms began to
show on the leaves.
Even though classification results from this analysis were high, more efficient and
meaningful results could have been generated had a system capable of acquiring
spectral measurements from much smaller regions been used. Since such a dataset did
not exist at the time of this analysis, a workaround was developed, and it is discussed in
Chapter 4. Future work on the detection of the disease will include using a
hyperspectral imaging system to acquire both spectral and spatial information on a per
pixel basis from leaves. A spectroradiometer system was chosen over a hyperspectral
imaging system for this preliminary analysis because hyperspectral imaging systems
are more expensive and usually contain more noise.
For 2015 dataset, an overall accuracy of 98.4% was also achieved. Samples
belonging to the ambA class were unsurprisingly classified with an accuracy of 100%,
again because they had the most distinct features among all the classes. The healthM
class achieved an accuracy of 99.2% with one misclassification as nd. The ambE class
had the lowest classification accuracy of 96.2% with a total of five samples misclassified
as healthY. As for the nd class, 97.1% of its samples were correctly classified with five
misclassifications as ambE.
QDA Binary Classification Based On ARI1 and MAREP Features
Since the primary objective of this work was to detect AMB at the earliest stage
possible, samples belonging to the ambA class were ignored in another analysis due to
their very distinct color. ARI1 and MAREP features were chosen for the classification
process because they achieved better classification results than JM-OSP spectral
bands. From the 2014 dataset, a total of 50 samples were randomly selected from each
of the healthM, healthY, and nd classes and combined to form one class with 150
samples while all 131 samples from the ambE class were used.
For the purpose of this analysis, the combined class was termed “healthy-nd.”
Table 3-9 shows the QDA binary classification results for healthy-nd and ambE samples
for 2014. An overall accuracy of 94.7% was achieved with 96% of healthy-nd samples
correctly classified and six samples misclassified as ambE. For the ambE class, 93.1%
of its samples were correctly classified with nine misclassifications as healthy-nd. The
overall classification accuracy decreased by 3.7 percentage points, relative to the
five-class analysis, when only two classes were
considered. Table 3-10 shows the binary classification for 2015 dataset.
Just as was done with the 2014 dataset, 150 samples were extracted from the
healthM, healthY and nd classes and all 130 samples in the ambE class were used. An
overall accuracy of 94.6% was achieved with 98.7% of healthy-nd samples correctly
classified and two misclassifications as ambE. For the ambE class, 90% of its samples
were correctly classified with 13 misclassifications as healthy-nd. The overall
classification accuracy decreased by 3.8 percentage points when compared to the
classification accuracy achieved with five classes.
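The overall binary accuracies reported above follow directly from the class sizes and misclassification counts given in the text. The short check below derives the correctly classified counts by subtraction (for example, 150 - 6 = 144 for healthy-nd in 2014); the function name is a hypothetical helper.

```python
def overall_accuracy(correct_counts, class_totals):
    """Overall accuracy in percent from per-class correct counts and class sizes."""
    return 100.0 * sum(correct_counts) / sum(class_totals)


# 2014: healthy-nd 150 samples (6 errors), ambE 131 samples (9 errors).
acc_2014 = overall_accuracy([150 - 6, 131 - 9], [150, 131])
# 2015: healthy-nd 150 samples (2 errors), ambE 130 samples (13 errors).
acc_2015 = overall_accuracy([150 - 2, 130 - 13], [150, 130])
print(round(acc_2014, 1), round(acc_2015, 1))
```

Subtracting these values from the five-class overall accuracy of 98.4% gives the 3.7 and 3.8 point decreases noted for 2014 and 2015.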
Figure 3-4 provides some insight into the cause of the reduction in classification
accuracies observed in both years. From the figure, it can be seen there was some
slight overlap between ambE and healthY samples in the 2015 dataset. Color images of
ambE samples were compared with those of healthY, and it was found that some ambE
samples had asymptomatic and healthy regions the same color as healthY samples.
This shows that those samples were initially healthy and young leaves that later
became infected by the disease. The plots also reveal that the combined healthy-nd
classes were not clustered so tightly together that the classifier could create a quadratic
boundary that could perfectly separate the two classes. The elimination of the ambA
class was another reason for the reduction in accuracy since its perfect classification
enhanced classification results in the previous analysis.
Conclusion
The primary objective of this study was to detect AMB disease at the earliest
stage possible using spectral data from two consecutive years. Features were selected
using two different methods. JM distance was combined with OSP in selecting five
optimal spectral bands between the red and SWIR spectral range while ARI1 and
MAREP spectral indices were built based on five bands between 550 nm and 780 nm
wavelength range. This spectral region is known to have a notable influence on many
plant diseases. Both types of features tested in this analysis were used in discriminating
between healthy, AMB diseased and nutrient deficient samples and were combined with
at least one of three different classifiers (QDA, neural network, and discriminant tree)
to classify samples in all five classes. Results showed that MAREP, a vegetation index
derived from the 4-point interpolation REP, efficiently separated the classes and worked
well for the early detection of AMB. Based on results achieved from this analysis, a
multispectral camera can be built using the five ARI1-MAREP spectral bands for the
detection of the disease on apple fields. Overall, the results of this work indicate the
potential of using spectroscopic technology as a valuable non-invasive tool for the early
diagnosis of AMB.
Figure 3-1. Apple leaves used in indoor spectroradiometer analysis. A) healthy mature
(healthM), B) healthy young (healthY), C) early stage AMB diseased (ambE), D) advanced stage AMB diseased (ambA), and E) molybdenum/manganese nutrient deficient (nd).
Figure 3-2. Mean reflectance spectra of samples in spectroradiometer dataset for two
consecutive years. A) 2014, B) 2015. The classes include healthy mature (healthM), healthy young (healthY), early stage AMB diseased (ambE), advanced stage AMB diseased (ambA), and molybdenum/manganese nutrient deficient (nd).
Figure 3-3. JM distance matrix of spectral bands in 2014 and 2015 indoor spectroradiometer datasets.
Figure 3-4. The plot of MAREP against ARI1 for samples in spectroradiometer dataset
for two consecutive years. A) 2014, B) 2015. The classes include healthy mature (healthM), healthy young (healthY), early stage AMB diseased (ambE), advanced stage AMB diseased (ambA), and molybdenum/manganese nutrient deficient (nd).
Table 3-1. QDA classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.
            Actual
Predicted   ambE          ambA          healthM       healthY       nd            Total
ambE        104 (79.4%)   9 (7.8%)      1 (0.9%)      0             7 (4.7%)      121
ambA        8 (6.1%)      104 (90.4%)   0             2 (1.8%)      0             114
healthM     1 (0.8%)      0             109 (95.6%)   0             2 (1.4%)      112
healthY     0             2 (1.7%)      0             111 (98.2%)   0             113
nd          18 (13.7%)    0             4 (3.5%)      0             139 (93.9%)   161
Total       131           115           114           113           148           621
Table 3-2. Neural network classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.
            Actual
Predicted   ambE          ambA          healthM       healthY       nd            Total
ambE        102 (77.9%)   6 (5.2%)      8 (7.0%)      0             6 (4.1%)      122
ambA        6 (4.6%)      104 (90.4%)   0             1 (0.9%)      0             111
healthM     6 (4.6%)      0             98 (86.0%)    0             4 (2.7%)      108
healthY     0             4 (3.5%)      0             112 (99.1%)   0             116
nd          17 (13.0%)    1 (0.9%)      8 (7.0%)      0             138 (93.2%)   164
Total       131           115           114           113           148           621
Table 3-3. Discriminant tree classification accuracy for five spectroradiometer classes in 2014 dataset using five spectral bands selected by JM-OSP.
            Actual
Predicted   ambE          ambA          healthM       healthY       nd            Total
ambE        104 (79.4%)   8 (7.0%)      5 (4.4%)      0             15 (10.1%)    132
ambA        8 (6.1%)      103 (89.6%)   0             2 (1.8%)      5 (3.4%)      118
healthM     3 (2.3%)      0             102 (89.5%)   0             2 (1.4%)      107
healthY     0             4 (3.5%)      0             110 (97.3%)   1 (0.7%)      115
nd          16 (12.2%)    0             7 (6.1%)      1 (0.9%)      125 (84.5%)   149
Total       131           115           114           113           148           621
Table 3-4. QDA classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.
            Actual
Predicted   ambE          ambA          healthM       healthY       nd            Total
ambE        115 (88.5%)   2 (1.2%)      2 (1.7%)      0             1 (0.6%)      120
ambA        7 (5.4%)      163 (98.2%)   1 (0.8%)      3 (1.8%)      1 (0.6%)      175
healthM     1 (0.8%)      0             104 (86.7%)   3 (1.8%)      8 (4.7%)      116
healthY     0             0             3 (2.5%)      159 (96.4%)   2 (1.2%)      164
nd          7 (5.4%)      1 (0.6%)      10 (8.3%)     0             158 (92.9%)   176
Total       130           166           120           165           170           751
Table 3-5. Neural network classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.
            Actual
Predicted   ambE          ambA          healthM       healthY       nd            Total
ambE        114 (87.7%)   2 (1.2%)      4 (3.3%)      0             3 (1.8%)      123
ambA        8 (6.2%)      159 (95.8%)   0             1 (0.6%)      1 (0.6%)      169
healthM     0             1 (0.6%)      93 (77.5%)    2 (1.2%)      20 (11.8%)    116
healthY     0             4 (2.4%)      3 (2.5%)      159 (96.4%)   1 (0.6%)      167
nd          8 (6.2%)      0             20 (16.7%)    3 (1.8%)      145 (85.3%)   176
Total       130           166           120           165           170           751
Table 3-6. Discriminant tree classification accuracy for five spectroradiometer classes in 2015 dataset using five spectral bands selected by JM-OSP.
            Actual
Predicted   ambE          ambA          healthM       healthY       nd            Total
ambE        112 (86.2%)   12 (7.2%)     8 (6.7%)      0             6 (3.5%)      138
ambA        7 (5.4%)      147 (88.6%)   1 (0.8%)      3 (1.8%)      0             158
healthM     4 (3.1%)      0             94 (78.3%)    2 (1.2%)      17 (10.0%)    117
healthY     2 (1.5%)      7 (4.2%)      7 (5.8%)      157 (95.2%)   9 (5.3%)      182
nd          5 (3.8%)      0             10 (8.3%)     3 (1.8%)      138 (81.2%)   156
Total       130           166           120           165           170           751
Table 3-7. QDA classification accuracy for five spectroradiometer classes in 2014 dataset using ARI1 and MAREP vegetation indices.
            Actual
Predicted   ambE          ambA          healthM       healthY       nd            Total
ambE        131 (100%)    0             2 (1.7%)      0             2 (1.4%)      135
ambA        0             115 (100%)    0             0             0             115
healthM     0             0             107 (93.9%)   0             1 (0.7%)      108
healthY     0             0             0             113 (100%)    0             113
nd          0             0             5 (4.4%)      0             145 (97.9%)   150
Total       131           115           114           113           148           621
Table 3-8. QDA classification accuracy for five spectroradiometer classes in 2015 dataset using ARI1 and MAREP vegetation indices.
            Actual
Predicted   ambE          ambA          healthM       healthY       nd            Total
ambE        125 (96.2%)   0             0             1 (0.6%)      5 (2.9%)      131
ambA        0             166 (100%)    0             0             0             166
healthM     0             0             119 (99.2%)   0             0             119
healthY     5 (3.8%)      0             0             164 (99.4%)   0             169
nd          0             0             1 (0.8%)      0             165 (97.1%)   166
Total       130           166           120           165           170           751
Table 3-9. QDA classification accuracy for healthy-nd and early stage AMB diseased spectroradiometer classes in 2014 dataset using ARI1 and MAREP vegetation indices.
            Actual
Predicted   ambE          healthy-nd    Total
ambE        122 (93.1%)   6 (4%)        128
healthy-nd  9 (6.9%)      144 (96%)     153
Total       131           150           281
Table 3-10. QDA classification accuracy for healthy-nd and early stage AMB diseased spectroradiometer classes in 2015 dataset using ARI1 and MAREP vegetation indices.
            Actual
Predicted   ambE          healthy-nd    Total
ambE        117 (90%)     2 (1.3%)      119
healthy-nd  13 (10%)      148 (98.7%)   161
Total       130           150           280
CHAPTER 4 QUANTITATIVE ANALYSIS AND SPECTRAL UNMIXING OF EARLY STAGE APPLE
MARSSONINA BLOTCH DISEASE
Background
In chapter 3, qualitative analysis of five classes of apple leaves was performed
using spectral data acquired from a spectroradiometer system. These classes included
healthy mature (healthM), healthy young (healthY), early stage AMB diseased (ambE),
advanced stage AMB diseased (ambA) and nutrient deficient (nd). It was reported that
data collected from all three stressed classes were a spectral mixture of at least two
regions (or colors) with possibly very distinct spectral characteristics. The spectral
analysis results in Figure 3-4 showed the ARI1 and MAREP features for the two
diseased classes varied more widely than the other three classes having only one
color—green. This occurred as a result of combining samples with various levels of
infestation into one class. Since the main objective of this research was to detect AMB
at the earliest stage possible so as to allow growers to apply fungicides more precisely to
areas of their field infected by various degrees of the disease, methods capable of
estimating the extent of disease infestation were analyzed.
The spectroradiometer is one of the most popular spectroscopic systems being used
to detect plant stress by measuring crop spectral reflectance. It has several advantages
including ease of use, efficiency and portability. However, some spectroradiometer
systems, like the one utilized in this work, are only capable of extracting one spectral
signature for several regions on an object being sensed. This poses a problem when
the aim is to analyze areas with different chemical properties separately. Optimization
algorithms, such as particle swarm optimization (PSO), have been successfully used by
some researchers in extracting endmembers from spectrally mixed data (Omran et al.,
2006; Zhang et al., 2011). PSO is preferred to some of the other well-known
endmember extraction algorithms due to its ease of use, efficiency and robustness in
solving optimization problems.
The main objectives of this work were to (i) develop a model capable of
performing quantitative analysis on the previously named “early stage diseased” samples
and (ii) create an optimal method for unmixing mixed spectral features for early stage
AMB diseased samples.
Methods
Abundance Estimation of Early Stage AMB Diseased Data
Early stage AMB diseased samples had both brown and green colored regions.
A number of steps were taken to arrive at the abundance estimation for both colored
regions (Figure 4-1). After spectral measurements had been taken, a white ring was
used in marking regions from which spectral data had been acquired. A digital camera
(Canon EOS 5D, Canon, Japan), with a focal length of 24 mm and an exposure time of
0.04 s, was used to acquire color images of the leaf samples. The Hough transform
was used to extract only the regions marked by the circular ring. To allow for
easy segmentation of brown pixels, green pixels were first masked using the ratio of the
red channel to the green channel and setting green pixel extraction to less than or equal
to a threshold of 0.5. This threshold was chosen because it gave the best separation
between the green pixels and other colors from the analysis of the histogram plot.
Nineteen color and texture features were then created and used as input for a
supervised k-nearest neighbor (kNN) classifier. The classifier used a Euclidean
distance metric and a neighborhood size of four for segmenting ambE color images.
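The green pixel masking step described above can be sketched in a few lines. This is a minimal illustration, assuming an image stored as nested lists of (R, G, B) tuples; the function name and the sample pixel values are hypothetical, and only the red-to-green ratio and the 0.5 threshold come from the text.

```python
def green_mask(pixels, threshold=0.5):
    """Return a boolean mask that is True where a pixel is treated as green."""
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            # Guard against division by zero for pixels with no green signal.
            ratio = r / g if g > 0 else float("inf")
            # A pixel is masked as green when red/green <= threshold.
            mask_row.append(ratio <= threshold)
        mask.append(mask_row)
    return mask

# Hypothetical 2 x 2 image: green tissue, brown lesion, white ring remnant, green tissue.
image = [
    [(40, 120, 30), (150, 90, 60)],
    [(240, 240, 240), (50, 110, 40)],
]
mask = green_mask(image)
```

Pixels left unmasked (False) are the ones passed on to feature extraction and kNN segmentation.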
Endmember Extraction from Early Stage AMB Diseased Data
Particle swarm optimization is a methodology in evolutionary computation that
was invented by Eberhart and Kennedy in 1995 (Eberhart & Kennedy, 1995). It is a
stochastic algorithm inspired by the social behavior of fish schooling and bird flocking.
The algorithm simulates the social behavior of humans and insects; individuals can
interact with one another while also learning from their experiences and with time, the
population members move into more suitable regions in the problem space. Just like
genetic algorithm (GA) and other heuristic tools, PSO is randomly initialized with a set of
potential solutions and iteratively searches for an optimal solution by updating the
population in each iteration. Unlike GA, however, PSO does not make use of the
selection, crossover and mutation evolution operators in its implementation. In the PSO
algorithm, each particle represents a potential solution and flies through a
multidimensional search space by following the current optimum particles and keeps a
record of its current position as well as the best position it has achieved so far. Aside
from the personal best positions obtained by each particle, the algorithm also finds a
global best position for all particles in the search space, and it can be referred to as the
“best” of the personal bests. Particles are initialized with random positions and velocities
with the velocity of each particle adjusted according to its flying experience and those of
other particles. The position of each particle is updated according to Euler’s integration
equation. The velocity and position of each particle are modified according to the
equations below:
\[ v_i(t+1) = \omega\, v_i(t) + C_1\, \mathrm{rand}\, (\mathrm{pbest}_i - x_i(t)) + C_2\, \mathrm{rand}\, (\mathrm{gbest} - x_i(t)) \tag{4-1} \]
\[ x_i(t+1) = x_i(t) + v_i(t+1) \tag{4-2} \]
where v_i = velocity of the i-th particle
x_i = position of the i-th particle
i = particle index
t = discrete time index
pbest_i = personal best position of the i-th particle
gbest = global best position of all particles
rand = uniformly distributed random number between 0 and 1
C_1, C_2 = weighting factors. In most situations C_1 = C_2 = 2
ω = inertia function. Values close to one facilitate global exploration while values
close to zero facilitate local exploration. The algorithm performs best if the
inertia function decreases linearly over the course of the implementation of
the algorithm.
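The update rules in Equations 4-1 and 4-2 can be sketched as follows. This is an illustrative implementation rather than the code used in this work: the swarm size, iteration count, inertia range (0.9 to 0.4), position clipping, and the sphere test function are all assumptions chosen for the example.

```python
import random

def pso_minimize(cost, bounds, n_particles=20, n_iter=100, c1=2.0, c2=2.0,
                 w_start=0.9, w_end=0.4, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds and small random velocities.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[rng.uniform(-(hi - lo), hi - lo) * 0.1 for lo, hi in bounds]
         for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_cost = [cost(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for t in range(n_iter):
        # Inertia decreases linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * t / max(n_iter - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                lo, hi = bounds[d]
                # Equation 4-1: inertia term, personal-best pull, global-best pull.
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (pbest[i][d] - x[i][d])
                           + c2 * rng.random() * (gbest[d] - x[i][d]))
                # Equation 4-2, with the position clipped to the bounds (added safeguard).
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            c = cost(x[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = x[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = x[i][:], c
        history.append(gbest_cost)
    return gbest, gbest_cost, history

def sphere(p):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(q * q for q in p)

best, best_cost, history = pso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In this sketch the global best is refreshed as soon as any particle improves on it, so later particles in the same iteration already see the new value; updating it only once per iteration is an equally common choice.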
In this work, PSO was used in spectrally unmixing brown and green
endmembers’ reflectance data corresponding to spectral bands used in computing ARI1
and MAREP. The linear unmixing model was used in relating abundances of each
colored region to the reflectance spectrum generated by the spectroradiometer system.
The linear mixing model for a mixed spectrum, X_mixed, containing two endmembers is
given by:
\[ X_{mixed} = f_{c_1} a_{c_1} + f_{c_2} b_{c_2} \tag{4-3} \]
where f_c1 represents the fractional abundance for class c1 and f_c2 represents the fractional
abundance for class c2, while a_c1 and b_c2 represent the unmixed reflectance data for class
c1 and class c2, respectively.
In order to get the spectral reflectance for each of the classes, the objective function, J, had to be minimized using the least squares equation given below.
\[ J = \frac{1}{2} \left( X_{mixed} - (f_{c_1} a_{c_1} + f_{c_2} b_{c_2}) \right)^{T} \left( X_{mixed} - (f_{c_1} a_{c_1} + f_{c_2} b_{c_2}) \right) \tag{4-4} \]
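To make the mixing model and its objective concrete, the sketch below evaluates Equations 4-3 and 4-4 for a single spectrum. The five-band reflectance values and the abundances are hypothetical, chosen only to show that the objective is zero when the candidate endmembers reproduce the observed spectrum and positive otherwise.

```python
def mix(f1, a, f2, b):
    """Linear mixing model (Equation 4-3) applied band by band."""
    return [f1 * ai + f2 * bi for ai, bi in zip(a, b)]

def objective(x_mixed, f1, a, f2, b):
    """Least squares objective J (Equation 4-4) for one mixed spectrum."""
    residual = [xm - m for xm, m in zip(x_mixed, mix(f1, a, f2, b))]
    return 0.5 * sum(r * r for r in residual)

# Hypothetical five-band reflectance values for the brown and green endmembers.
brown = [0.10, 0.12, 0.15, 0.30, 0.45]
green = [0.08, 0.05, 0.20, 0.40, 0.50]

# Simulated observation: 25% brown and 75% green.
observed = mix(0.25, brown, 0.75, green)

j_true = objective(observed, 0.25, brown, 0.75, green)
j_swapped = objective(observed, 0.25, green, 0.75, brown)
```

In an optimization such as PSO, a function like `objective` would serve as the cost, with the endmember spectra as the unknowns and the abundances supplied from the image segmentation.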
Regression Models
Both partial least squares regression (PLSR) and stepwise multiple linear
regression (SMLR) statistical methods were used in building prediction models for the
quantitative assessment of early stage AMB samples and SPSS Statistics 23 (IBM,
Armonk, NY, USA) was used in developing prediction models.
Results and Discussion
Spectral Unmixing of Brown and Green Endmembers for Early Stage AMB Diseased Samples
Before the kNN classifier was applied to the 19 color and texture feature images, two
preprocessing steps were applied to the color images. The Hough transform was used
to extract regions of interest, and green pixels were masked using the ratio of the red
channel to the green channel so that brown pixels could be segmented more easily.
A threshold of 0.5 was chosen for green pixel masking after studying the band
ratio histogram. A summary of the steps is shown in Figure 4-2 using one of the
samples in the ambE dataset. Even after the Hough transform was applied, there
were still some remnants of the white ring on some of the images. The green pixels were
also not completely removed after applying the mask. As a result, five classes were
defined for the segmentation of brown pixels: green, brown, vein,
background, and ring. The background class represented the previously extracted green
pixel mask. The kNN algorithm assigned a unique color to each of the classes and the
light blue region in Figure 4-2c indicates regions in the image containing brown pixels.
Abundance estimation for the brown subclass was computed by dividing the sum of
brown pixels by the total number of pixels within the region of interest. In order to
simplify computation, all the other pixels representing vein and green regions were
regarded as the “green” class. For the 2014 dataset, abundance estimation for the brown
class ranged roughly between 0.015 and 0.58, while for 2015, it ranged between
0.00006 and 0.21. The formula used in estimating the abundance, A , for each subclass
is given below:
\[ A = \frac{S_c}{S_T} \tag{4-5} \]
where S_c and S_T represent the total number of pixels enclosed by a particular subclass
and the total number of pixels enclosed by the entire region of interest, respectively.
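Equation 4-5 amounts to a pixel count ratio over the segmented region of interest. A minimal sketch, assuming a hypothetical label image in which every pixel inside the region of interest already carries a class name:

```python
def abundance(labels, subclass):
    """Equation 4-5: pixels in the subclass divided by all pixels in the region of interest."""
    flat = [label for row in labels for label in row]
    return sum(1 for label in flat if label == subclass) / len(flat)

# Hypothetical 3 x 4 segmentation result (vein pixels already merged into "green").
segmented = [
    ["green", "green", "brown", "green"],
    ["green", "brown", "brown", "green"],
    ["green", "green", "green", "green"],
]
brown_fraction = abundance(segmented, "brown")
green_fraction = abundance(segmented, "green")
```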
Before the PSO algorithm was applied to the ambE dataset, its parameters were
tuned using a spectral dataset with known mixed features, endmembers, and
abundances. This was done so as to ensure the algorithm would produce accurate
results for the new dataset. To guarantee stable results, the algorithm was run 1000
times on reflectance data at the five spectral bands used in computing ARI1 and
MAREP indices. After PSO analysis, ARI1 and MAREP spectral indices were calculated
for early stage AMB asymptomatic (green subclass), and symptomatic (brown subclass)
endmembers and Figure 4-3 shows plots of MAREP against ARI1 for 2014 and 2015
datasets. Asymptomatic and healthM classes had some slight overlap in 2014 while
healthY and asymptomatic classes had some overlap in 2015. The overlap was due
to the similarity in color of the three classes. As expected, there was no overlap
between the symptomatic class and the other classes as a result of its distinct color.
These results from the PSO analysis indicate that mixed spectral data acquired from a
system, such as the one used in this analysis, can be separated and more efficiently
analyzed.
Quantitative Analysis of Early Stage AMB Diseased Samples
Both 2014 and 2015 datasets were combined, and the abundances estimated
using the kNN classifier were used as degrees of infestation for ambE. A total of 20
healthy samples were also included in the analysis to represent zero disease infestation
while all 261 ambE samples were used. Table 4-1 shows some statistical measures
that were computed for early stage disease analysis. A total of 140 samples had
disease severity levels less than 4%, 92 samples had severity levels between 4% and
15% while 49 samples had severity levels over 15%. Disease severity ranged between
0 and about 58%.
A PLSR analysis was performed using six components, and a coefficient of
determination (R2) of 0.74 was achieved. The regression coefficients, BETA, were
plotted against the full wavelength range to extract bands with the highest discriminatory
power. As was done by Jones et al. (2010), a threshold of the absolute value of 0.005
was set, and any spectral bands above this limit were retained. A total of 143 bands
met this criterion and were extracted for further analysis using SMLR. The dataset
contained a total of 281 samples; two-thirds of the samples were used as calibration
data while one-third was used for validation. The stepping method criteria were set to a
p-value of 0.05 for entry and a p-value of 0.1 for removal. A total of nine bands were
selected, and they were: 673 nm, 367 nm, 690 nm, 1655 nm, 988 nm, 1669 nm, 996
nm, 1415 nm and 1407 nm. The calibration dataset achieved an R2 of 0.76 while an R2
of 0.73 was achieved for the validation dataset. As demonstrated in Figure 4-4, these
results show some promise for using predictive models to quantify disease severity.
Future work will include repeating prediction analysis using degrees of infestation
extracted from more accurate abundance estimation algorithms.
Conclusion
The main objective of this study was to detect AMB disease at the earliest stage
possible using spectral data from two consecutive years. Abundance estimation and
spectral unmixing algorithms were developed using a kNN classifier and PSO algorithm.
PSO showed promising results in spectrally unmixing early stage AMB diseased
samples by creating symptomatic and asymptomatic endmembers from previously
mixed spectral data. Two prediction models were combined to quantify various degrees
of infestation in early-stage AMB samples. The models that were used for this analysis
were PLSR and SMLR statistical methods. A total of nine spectral bands, spanning the
visible to SWIR wavelength range, were identified as significant in predicting disease
severity, and a validation R2 of 0.73 indicated a reasonably good fit of the model. Future work on
this research will include collecting spectral reflectance information of leaf samples
using a system that can acquire data on a per-pixel basis, such as a hyperspectral
imaging system, and investigating more efficient methods for disease abundance
estimation, spectral unmixing, and regression analysis. Overall, the results of this work
indicate the potential of using spectroscopic technology as a valuable non-invasive tool
for the early diagnosis of AMB.
Figure 4-1. Flowchart showing steps for extraction of brown and green colored pixel abundances in early-stage AMB diseased samples.
Step 1: Spectral measurements (350 - 2500 nm range)
Step 2: RGB image acquisition of leaf samples
Step 3: Extraction of circular region of interest using Hough transformation method
Step 4: Development of a method to mask green pixels
Step 5: Creation of color and texture features
Step 6: Abundance estimation using supervised kNN classifier
Figure 4-2. Steps taken in segmenting early stage AMB diseased color images. A)
extraction of region of interest using circle Hough transform method, B) masking of green pixels using ratio of red and green channels, C) kNN image segmentation; light blue regions indicate locations of brown pixels with a total abundance estimation of about 0.58.
Figure 4-3. The plot of MAREP against ARI1 for classes in spectroradiometer dataset
for two consecutive years. A) 2014, B) 2015. The classes include healthy mature (healthM), healthy young (healthY), early stage AMB asymptomatic (ambEGr), and early stage AMB symptomatic (ambEBr). Asymptomatic and symptomatic endmembers were calculated using a combination of kNN abundance estimation and particle swarm optimization.
Table 4-1. Disease severity statistical analysis results for 20 healthy and 261 early stage AMB diseased samples from 2014 and 2015 spectroradiometer datasets
Disease Severity (%)   Number of samples   Min (%)   Max (%)   Mean (%)   Standard Deviation (%)
< 4                    140                 0         3.9       1.4        1.2
4 - 15                 92                  4.1       14.8      8.5        3.2
> 15                   49                  15.1      57.9      23.5       9.1
Figure 4-4. Predicted versus actual disease severity for validation dataset using a combination of PLSR and SMLR prediction models. The classes that were used in the analysis were healthy and early stage AMB diseased samples from 2014 and 2015 datasets.
CHAPTER 5 HYPERSPECTRAL IMAGE ANALYSIS OF EARLY STAGE APPLE MARSSONINA
BLOTCH DISEASE USING SEQUENTIAL MAXIMUM ANGLE CONVEX CONE ALGORITHM
Background
Optical sensing technologies, such as hyperspectral imaging, are gaining
popularity as tools that can be used for quality assessment in the agricultural field.
Hyperspectral imaging is a quick, non-destructive and cost-effective technique for
detecting plant diseases. Hyperspectral images contain an abundance of information
that can be simplified to identify plant diseases. Researchers have been successful in
using the hyperspectral imaging technique for the identification of plant diseases. Qin
et al. (2009) discovered four wavelengths (553, 677, 718, and 858 nm) that could be
used to recognize citrus canker in grapefruits with 92.7% accuracy. Additionally,
Penicillium digitatum in mandarin was successfully identified with an accuracy above
91% when using 20 wavelength bands (Gómez-Sanchis et al., 2008). Zhang et al.
(2005) succeeded in using five vegetation indices to detect late blight disease in tomato
leaves. They were able to separate healthy leaves from diseased ones before any
economic damage occurred. Hyperspectral imaging is also being used to expose
diseases in wheat, maize, and wine grapes (Del Fiore et al., 2010; Graeff et al., 2006;
Huang et al., 2007; Muhammed & Larsolle, 2003; Naidu et al., 2009).
The main objective of this work was to evaluate the potential of using a
hyperspectral imaging system for the identification of apple Marssonina blotch disease.
The specific objectives were to determine optimal spectral features and bands for early
disease detection and to develop a detection algorithm for early stage AMB disease
detection.
Materials and Methods
Outdoor Hyperspectral Data Acquisition
Hyperspectral images of leaf samples used in this work were acquired from Fuji
apple trees in an experimental apple orchard located in Gunwi-city, Korea. A test area
measuring 40 m x 60 m, with a total of 260 trees, was set aside for the experiment.
Datasets used in this analysis were acquired between the months of August and
October 2014. A hyperspectral camera (Spectral camera PS-V10E, SPECIM Inc.,
Finland) was used to acquire hyperspectral images for the outdoor hyperspectral
dataset. The camera consisted of an imaging spectrograph covering the spectral range
of 400 nm to 1000 nm and a sensitive high-speed interlaced CCD detector. The spectral
resolution of the camera was set to 2.8 nm, and its output was saved as digital 12-bit
files.
Unlike the spectroradiometer system referred to in previous sections, the
hyperspectral imaging system was able to capture both spatial and spectral information
on a per-pixel basis and allowed for pixel-by-pixel classification. The system was set up
with the hyperspectral camera mounted on a tripod, and a dark cloth was placed on the
ground to prevent weeds from being displayed in the image. Tree branches containing
about fifteen leaves with healthy, AMB asymptomatic and AMB symptomatic regions
were imaged. Figure 5-1 shows the RGB composite of the outdoor hyperspectral image.
Each hyperspectral image measured about 975 (spatial) x 696 (spatial) x 519 (spectral)
dimensions.
Image Preprocessing
Flat field correction, using 99% white reflectance standard and a dark current
measurement, was used to normalize each hyperspectral image to unit reflectance
using the normalization formula given below:
\[ R_{ctd} = \frac{R_{raw} - R_{dark}}{R_{white} - R_{dark}} \tag{5-1} \]
where R_ctd is the corrected reflectance image, R_raw is the original uncorrected image,
and R_dark and R_white are the mean radiance spectral values of regions of interest
extracted from dark current and white reflectance images, respectively.
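Equation 5-1 is applied independently to every band of every pixel. A minimal sketch for a single pixel spectrum, assuming hypothetical raw, white-reference, and dark-current values for three bands:

```python
def flat_field_correct(raw, white, dark):
    """Equation 5-1 applied band by band to one pixel spectrum."""
    corrected = []
    for r, w, d in zip(raw, white, dark):
        span = w - d
        # A zero span means the band carries no usable signal; return 0.0
        # instead of dividing by zero (an added safeguard, not from the text).
        corrected.append((r - d) / span if span != 0 else 0.0)
    return corrected

# Hypothetical digital numbers for three bands.
raw = [600.0, 1100.0, 2100.0]
white = [2100.0, 2100.0, 4100.0]
dark = [100.0, 100.0, 100.0]

reflectance = flat_field_correct(raw, white, dark)
```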
After images were calibrated to reflectance, a Savitzky-Golay filter, with a
second-degree polynomial and a filter width of 7, was used to smooth and minimize
noisy signals in the images. All hyperspectral images were spectrally subsetted so as to
remove noisy bands. A total of 429 spectral bands between the wavelength range of
400 nm to 1000 nm were retained after spectrally subsetting the images. A background
mask was created using a combination of support vector machine (SVM) and texture
filters. All hyperspectral image analyses, including the preprocessing steps, were
performed using a combination of ENVI 5.2 (Exelis Visual Information Solutions Inc.,
Boulder, CO, USA) and MATLAB R2014a (Version 8.3, The MathWorks Inc., Natick,
MA, USA).
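The Savitzky-Golay smoothing step described above can be illustrated with the standard convolution weights for a second-degree polynomial and a seven-point window, (-2, 3, 6, 7, 6, 3, -2) / 21. The sketch below is an assumption about how such a filter could be applied to a single spectrum; leaving the three samples at each edge unchanged is a simplification, and ENVI or MATLAB may treat the edges differently.

```python
# Savitzky-Golay smoothing weights: quadratic fit, 7-point window, normalised by 21.
SG_COEFFS = [-2, 3, 6, 7, 6, 3, -2]

def savgol_smooth(spectrum):
    """Smooth one spectrum; the three samples at each edge are left unchanged."""
    half = len(SG_COEFFS) // 2
    smoothed = list(spectrum)
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        smoothed[i] = sum(c * y for c, y in zip(SG_COEFFS, window)) / 21.0
    return smoothed

# A quadratic signal passes through unchanged, while an isolated spike is damped.
quadratic = [float(x * x) for x in range(10)]
spike = [0.0, 0.0, 0.0, 21.0, 0.0, 0.0, 0.0]

smoothed_quadratic = savgol_smooth(quadratic)
smoothed_spike = savgol_smooth(spike)
```

Because the filter fits a local quadratic, it preserves smooth spectral features such as the red edge better than a plain moving average of the same width.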
Vegetation Indices
Over ten vegetation indices (VIs) were investigated for the outdoor hyperspectral
dataset, but only a combination of modified triangular vegetation index (MTVI) and
matrix-adjusted red edge position (MAREP) efficiently separated the spectral classes.
Haboudane et al. (2004) created a modified version of the triangular vegetation
index (TVI) which was first introduced by Broge and Leblanc (2001). The concept
behind TVI, according to Broge and Leblanc (2001) is that the total area of the triangle,
defined by the green peak, the NIR shoulder, and the minimum reflectance in the red
region, will increase as a result of chlorophyll absorption (decrease in red reflectance)
and leaf tissue abundance (increase of NIR reflectance). Modified triangular vegetation
index (MTVI) makes TVI a better predictor of green leaf area index (LAI) by replacing
the 750 nm wavelength with 800 nm. MTVI is stated mathematically as the following
equation:
\[ MTVI = 1.2 \left[ 1.2 (\rho_{800} - \rho_{550}) - 2.5 (\rho_{670} - \rho_{550}) \right] \tag{5-2} \]
where ρ800, ρ550, and ρ670 represent reflectance at 800 nm, 550 nm, and 670 nm,
respectively.
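As a worked example of Equation 5-2, the sketch below computes MTVI for one pixel. The reflectance values are hypothetical and chosen to resemble a healthy leaf (high NIR, low red).

```python
def mtvi(r800, r550, r670):
    """Modified triangular vegetation index (Equation 5-2)."""
    return 1.2 * (1.2 * (r800 - r550) - 2.5 * (r670 - r550))

# Hypothetical reflectance values at 800 nm, 550 nm, and 670 nm.
value = mtvi(0.5, 0.1, 0.05)
```

With these inputs the index evaluates to 1.2 × (1.2 × 0.4 + 2.5 × 0.05) = 0.726; a spectrally flat pixel, where all three reflectances are equal, gives zero.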
Matrix-adjusted red edge position (MAREP) is a vegetation index based on the 4-
point interpolation red edge position algorithm. The red edge in vegetation spectra
refers to the region of abrupt change in reflectance (also known as the point of
inflection) close to the near infrared (NIR) region. Stressed vegetation typically has this
point of inflection shifted towards shorter wavelengths in the visible spectral range and
this phenomenon is referred to as blue shift. More detailed information about MAREP
can be found in chapter 3. The MAREP algorithm used in this analysis was adjusted
slightly by removing the summation signs given in equation 3-3 and calculating the
MAREP index for each pixel without regarding the combined effect of a pixel’s cluster.
This was done because the modified version of the algorithm produced better
separation results between classes than when clustering analysis was combined with
MAREP. The modified version of the algorithm used in this chapter was named
MAREP2 so as to distinguish it from the version discussed in chapter 3.
Sequential Maximum Angle Convex Cone (SMACC)
In order to identify healthy, AMB asymptomatic and AMB symptomatic leaf
regions in the outdoor hyperspectral image, a model capable of distinguishing between
healthy and diseased pixels had to be utilized. In this work, the sequential maximum
angle convex cone (SMACC) algorithm was used to identify healthy, asymptomatic and
symptomatic regions on a cluster of leaves. SMACC is an endmember extraction
algorithm that simultaneously finds endmembers and their respective abundances in
hyperspectral images (Gruninger et al., 2004). SMACC uses a convex cone model for
representing vector data. The technique finds extreme vectors within a dataset and
uses these extreme vectors as endmembers. It first finds the endmember with the
highest intensity and then, the next endmember it finds is the one most extreme from
the first one found. Subsequent endmembers are chosen as pixels most different from
the already found endmembers. If no predefined number of endmembers is stated, the
algorithm continues the search until an endmember pixel already accounted for in the
previous group is found. SMACC uses the following equation in finding each
endmember, H:
\[ H_{c,i} = \sum_{k=1}^{N} L_{c,k}\, f_{k,j} \tag{5-3} \]
where i is the pixel index. j and k are endmember indices from 1 to the expansion
length, N. L is a matrix that contains the endmember spectra as columns. c is the
spectral channel index. f is a matrix that contains the fractional contribution (abundance)
of each endmember j in each endmember, k, for each pixel.
Kullback-Leibler Divergence (KLD)
Kullback-Leibler divergence (KLD) is a dissimilarity measure between two
probability distributions and there are both asymmetric and symmetric versions of KLD.
It is not a true metric, but it is still often used to measure the distance between two
probability distributions. In the context of hyperspectral band selection, it can tell how
different two image bands are. Bands that are less correlated have higher KLD values
and vice versa. The symmetric KLD was used as a dissimilarity measure for a
hierarchical clustering band selection process, and it is given as:
\[ D_{KL}(X_i, X_j) = \sum_{x \in \Omega} p_i(x) \log \frac{p_i(x)}{p_j(x)} + \sum_{x \in \Omega} p_j(x) \log \frac{p_j(x)}{p_i(x)} \tag{5-4} \]
where X_i and X_j represent a pair of discrete random variables defined on a space Ω, and i and j
are two bands of a hyperspectral image. p_i(x) and p_j(x) are the probability distributions
of variables in bands i and j, respectively. D_KL(X_i, X_j) requires that p_i(x) and p_j(x)
be absolutely continuous with respect to each other. D_KL(X_i, X_j) is always nonnegative
and equals zero when the probability distributions p_i(x) and p_j(x) are the same.
Just as was done in chapter 3 with the OSP algorithm, JM distance was also
used in removing redundant bands before the KLD hierarchical clustering band
selection process.
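The symmetric KLD in Equation 5-4 can be computed from normalised histograms of two band images. The sketch below is illustrative only: the bin count, the value range, the sample pixel values, and the small `eps` used to avoid taking the logarithm of zero for empty bins are all assumptions, not details taken from this work.

```python
import math

def normalized_histogram(values, n_bins, lo, hi):
    """Estimate a discrete probability distribution from pixel values in [lo, hi]."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

def symmetric_kld(p, q, eps=1e-12):
    """Symmetric Kullback-Leibler divergence between two discrete distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        # Clip to eps so empty bins do not cause log(0) or division by zero.
        pi, qi = max(pi, eps), max(qi, eps)
        total += pi * math.log(pi / qi) + qi * math.log(qi / pi)
    return total

# Hypothetical reflectance values of the same four pixels in two bands.
band_a = [0.1, 0.2, 0.2, 0.8]
band_b = [0.7, 0.8, 0.9, 0.9]
p_a = normalized_histogram(band_a, 4, 0.0, 1.0)
p_b = normalized_histogram(band_b, 4, 0.0, 1.0)

divergence = symmetric_kld(p_a, p_b)
```

In a hierarchical clustering band selection scheme, a matrix of such pairwise divergences would serve as the dissimilarity input, with similar bands merged first.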
Support Vector Machine (SVM) Classification
A support vector machine (SVM) classifier with a radial basis function kernel was
designed for the classification process. SVM classifiers are known to yield good
classification results for complex and noisy data. The gamma parameter used in the
kernel function was set to the inverse of the number of spectral features used in the
classification process. A neural network classifier was also tested on the
hyperspectral data, but SVM performed better in mapping input data to their respective
classes, and as a result the neural network was not included in this work. SVM in its
most basic form is a binary classifier and cannot perform multiclass classification in a
single stage. In order to efficiently classify the hyperspectral data, a series of SVM
classifiers were built in ENVI for each pair of classes and the results were combined to
give a single classification result.
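The pairwise scheme described above is commonly combined by majority voting. The sketch below illustrates that idea with simple stand-in binary classifiers; the class names, the one-dimensional feature, the centroid values, and the tie-breaking rule are all hypothetical, and the way ENVI actually combines its pairwise SVM results may differ.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(sample, classes, pairwise):
    """Combine pairwise binary decisions into one label by majority vote."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](sample)] += 1
    # Ties are broken by the order in which the classes are listed.
    return max(classes, key=lambda c: votes[c])

# Stand-in binary classifiers: nearest hypothetical centroid on a single feature.
classes = ["healthy", "asymptomatic", "symptomatic"]
centroids = {"healthy": 0.8, "asymptomatic": 0.5, "symptomatic": 0.2}

def make_pair(a, b):
    return lambda s: a if abs(s - centroids[a]) <= abs(s - centroids[b]) else b

pairwise = {(a, b): make_pair(a, b) for a, b in combinations(classes, 2)}

label_high = one_vs_one_predict(0.75, classes, pairwise)
label_low = one_vs_one_predict(0.15, classes, pairwise)
```

With five classes, as used later in this chapter, this scheme requires ten pairwise classifiers.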
Results and Discussion
SVM and Texture Filter Background Masking
The original 2014 outdoor hyperspectral image contained leaves and background
(branches, apple fruit, reflectance standards, etc.). In order to ensure the SMACC
endmember extraction algorithm would only find leaf endmembers in the image, the
background was masked using a two-stage SVM and occurrence texture filter process.
At first, supervised classification was performed using SVM to mask the background,
but not all the regions were successfully masked as shown in Figure 5-2.
Five occurrence filters were then applied to the SVM-masked image and their
results were compared. The filters applied were data range, mean, entropy,
variance, and skewness. These filters calculate the image texture in
every 3 x 3 processing window from the number of occurrences of each gray level in
the window. Due to the large size of the hyperspectral image, only the band image at
450.8 nm was used in this process; the results are shown in Figure 5-3.
From the figure, it can be seen that the entropy image emphasizes the leaf regions more
than the other texture images and, as a result, it was chosen for the second stage of the
masking process. The final result after applying the entropy image mask is given in
Figure 5-4. It can be seen that the combination of SVM and the entropy texture filter
successfully masked the background regions.
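The entropy occurrence measure used in the second masking stage can be sketched as follows. This is a minimal illustration, assuming the band image has already been quantized to integer gray levels, that entropy is computed with a base-2 logarithm, and that border pixels without a full 3 x 3 neighborhood are left at zero. These choices, the function names, and the small example image are assumptions and do not reproduce the ENVI implementation.

```python
import math
from collections import Counter

def window_entropy(window_values):
    """Shannon entropy of the gray-level occurrences in one window."""
    counts = Counter(window_values)
    n = len(window_values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def entropy_filter(image, size=3):
    """Entropy image over size x size windows (interior pixels only)."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = [image[rr][cc]
                      for rr in range(r - half, r + half + 1)
                      for cc in range(c - half, c + half + 1)]
            out[r][c] = window_entropy(window)
    return out

# Hypothetical quantized single-band image: a uniform region (left columns)
# next to a textured region (right columns).
image = [
    [5, 5, 5, 1, 7, 3],
    [5, 5, 5, 6, 2, 8],
    [5, 5, 5, 4, 9, 0],
    [5, 5, 5, 2, 6, 1],
]
texture = entropy_filter(image)
```

A window containing a single gray level has zero entropy, while a window with nine distinct gray levels reaches the maximum of log2(9), so textured regions stand out in the entropy image.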
SMACC Endmember Extraction and Spectral Feature Analysis
The SMACC algorithm was applied to the hyperspectral image after the background
was masked. Three endmember spectra and three abundance images were created
using SMACC. The abundance images were thoroughly analyzed and, for each
abundance image, threshold values were set so as to extract regions of interest that
could be used to define classes for the outdoor hyperspectral dataset. At the end of the
analysis, five endmember classes were generated and named according to their
disease severity levels. Three of the classes were non-symptomatic while two were
symptomatic; as a result, the classes were named non-symp 1, non-symp 2,
non-symp 3, symp 1, and symp 2. A total of 3034 pixels were randomly selected from each
endmember region (15,170 pixels combined) and a new image was created using these
pixels. For each class, 1001 pixels were randomly selected for calibration while the
remaining 2033 pixels were used as validation data.
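The per-class split described above can be sketched as follows. This is an illustration only: the pixel identifiers are placeholders for the actual pixel locations, and the fixed random seed and the function name are assumptions introduced so that the example is reproducible.

```python
import random

def split_class_pixels(pixel_ids, n_calibration, seed=0):
    """Randomly split one class's pixels into calibration and validation sets."""
    rng = random.Random(seed)
    shuffled = list(pixel_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_calibration], shuffled[n_calibration:]

classes = ["non_symp1", "non_symp2", "non_symp3", "symp1", "symp2"]
pixels_per_class = 3034
n_calibration = 1001

splits = {}
for name in classes:
    # Hypothetical pixel identifiers; real data would hold pixel coordinates.
    ids = range(pixels_per_class)
    splits[name] = split_class_pixels(ids, n_calibration)

total_pixels = sum(len(cal) + len(val) for cal, val in splits.values())
total_calibration = sum(len(cal) for cal, _ in splits.values())
```

The totals follow directly from the counts in the text: 5 x 3034 = 15,170 pixels overall, of which 5 x 1001 = 5005 are calibration pixels and 2033 per class remain for validation.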
The mean spectra developed using the calibration dataset are given in Figure 5-5.
From the figure, it can be seen that only the symptomatic classes lack a peak at the
nitrogen absorption band of 550 nm, indicating the absence of chlorophyll in those
samples. Due to chlorophyll absorption in the red range, healthier samples have lower
reflectance than AMB-diseased samples in that range. The internal scattering of light in
the NIR range is responsible for higher reflectance in healthier samples. Since the
original hyperspectral image was calibrated to unit reflectance, it was expected that the
reflectance data for each of the classes would not exceed one; however, the figure
shows that the reflectance signature of the healthy class exceeds one in the NIR region.
One probable cause of this anomaly is the transmission of light
through the leaves, resulting in nonlinearities in the data.
Vegetation Indices
A total of five spectral bands were used in computing both MAREP2 and MTVI
for the five SMACC endmembers and the results are shown in Figure 5-6. From the
figure, it can be seen the healthier the class, the higher the MAREP2 and MTVI values.
The plot also shows slight overlap between the classes but for the most part, the five
classes could be separated from one another.
Band Selection using KLD Hierarchical Clustering for 2014 Dataset
Before the band selection process, JM distance was used to remove 92
redundant spectral bands. A total of 337 bands were used for the band selection stage.
KLD was calculated for each pair of the remaining spectral bands and an agglomerative
hierarchical clustering band selection was applied to 5005 training pixel vectors (1001
pixels for each class). The clustering algorithm started by defining each hyperspectral
band as a separate cluster and then began clustering bands based on the KLD
measure. The maximum number of clusters defined for the dataset was ten,
and six optimal spectral bands were obtained in the end. Each cluster represented
bands that were highly correlated with one another. The bands that were least
correlated with the other bands in each cluster were selected as optimal bands. The six
spectral bands that were selected at the end of the process were 666.6 nm, 671.7 nm,
828 nm, 876.5 nm, 947.9 nm and 1000 nm.
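The clustering procedure described above can be sketched on a toy example. The sketch assumes a symmetrized divergence (the sum of the two directed KLD values), average linkage between clusters, and a selection rule that interprets "least correlated" as the band with the largest total divergence from the other bands in its cluster. The source does not state the linkage or the exact selection criterion, so these choices, the function names, and the example histograms are assumptions.

```python
import math

def symmetric_kld(p, q):
    """Symmetrized divergence: D_KL(p || q) + D_KL(q || p)."""
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))

def cluster_distance(c1, c2, dist):
    """Average-linkage distance between two clusters of band indices."""
    return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

def agglomerate(dist, n_clusters):
    """Start with one cluster per band; merge the closest pair repeatedly."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b], dist)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def pick_band(cluster, dist):
    """Band with the largest total divergence from the rest of its cluster."""
    if len(cluster) == 1:
        return cluster[0]
    return max(cluster, key=lambda i: sum(dist[i][j] for j in cluster if j != i))

# Hypothetical band distributions (normalized histograms with no zero bins).
band_hists = [
    [0.70, 0.20, 0.10],
    [0.68, 0.21, 0.11],
    [0.10, 0.20, 0.70],
    [0.12, 0.19, 0.69],
    [0.30, 0.40, 0.30],
]
n = len(band_hists)
dist = [[symmetric_kld(band_hists[i], band_hists[j]) for j in range(n)]
        for i in range(n)]
clusters = agglomerate(dist, n_clusters=3)
selected = [pick_band(c, dist) for c in clusters]
```

In this toy case, the two pairs of nearly identical histograms are merged first, leaving three clusters from which one band each is selected.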
SVM Classification of MTVI and MAREP Features
Both MTVI and MAREP2 vegetation indices were used as input features for an
SVM classifier. The classification accuracy results achieved for the test data are shown
in Table 5-1. The overall classification accuracy was 93.5%. The background class
aside, the non-symp 1 and symp 2 classes had the highest classification accuracies of
94.5% and 94.6%, respectively. For the non-symp 1 class, 112 samples were
misclassified as non-symp 2. Non-symp 2 samples were classified with an accuracy of
87.1%, with 118 samples misclassified as non-symp 1 and 144 samples misclassified as
non-symp 3. Samples in the non-symp 3 class were classified with an accuracy of
90.7%, with 107 samples misclassified as non-symp 2 and 82 samples misclassified as
symp 1. For the symp 1 class, 94.1% of the samples were correctly classified, but there
were 70 misclassifications as non-symp 3 and 48 misclassifications as symp 2. Lastly,
109 samples in the symp 2 class were misclassified as symp 1 while one of its samples
was misclassified as non-symp 3. The general trend from the classification analysis
showed that misclassified samples were assigned to adjacent classes. This is not
surprising since adjacent classes tend to be more spectrally similar to one another than
classes that are farther apart. It can be inferred from the results that if both symptomatic
classes were merged into one class and the non-symptomatic classes into another, the
classification results would have been better. However, such an analysis was not pursued
since our goal was to detect the disease at the earliest stage possible so as to reduce
economic losses.
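The per-class and overall accuracies quoted above can be recomputed from the counts in Table 5-1. The sketch below transcribes the table as printed, with rows as predicted classes and columns as actual classes; the function names are illustrative assumptions.

```python
classes = ["non_symp1", "non_symp2", "non_symp3", "symp1", "symp2", "background"]

# Rows are predicted classes, columns are actual classes (counts from Table 5-1).
confusion = [
    [1921,  118,    0,    0,    0,    0],
    [ 112, 1771,  107,    1,    0,    0],
    [   0,  144, 1844,   70,    1,    0],
    [   0,    0,   82, 1914,  109,    0],
    [   0,    0,    0,   48, 1924,    0],
    [   0,    0,    0,    0,    0, 2033],
]

def per_class_accuracy(matrix):
    """Percentage of each actual class (column) that was predicted correctly."""
    n = len(matrix)
    result = []
    for j in range(n):
        column_total = sum(matrix[i][j] for i in range(n))
        result.append(100.0 * matrix[j][j] / column_total)
    return result

def overall_accuracy(matrix):
    """Percentage of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total

class_acc = dict(zip(classes, (round(a, 1) for a in per_class_accuracy(confusion))))
overall = round(overall_accuracy(confusion), 1)
```

Each per-class accuracy is the diagonal count divided by its column total, and the overall accuracy is the sum of the diagonal divided by the sum of all entries.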
SVM Classification of JM-KLD Hierarchical Clustering
The spectral reflectance data at the six optimal bands selected by JM-KLD
hierarchical clustering were again used as input features for an SVM classifier. The
classification accuracy results achieved for the test data are shown in Table 5-2. The
overall classification accuracy was 92.7%, 0.8 percentage points lower than the accuracy
achieved using MTVI and MAREP2 features. The non-symp 1 class achieved the highest
accuracy among all the classes at 96.9%, about 2.4 percentage points higher than in the
previous classification. The second highest accuracy was obtained by the symp 2 class,
about 2 percentage points lower than that of the non-symp 1 class. For the non-symp 1
class, 63 samples were misclassified as non-symp 2. Non-symp 2 samples were classified
with an accuracy of 88.3%, with 94 samples misclassified as non-symp 1 and 143 samples
misclassified as non-symp 3. Samples in the non-symp 3 class were classified with an
accuracy of 90.2%, with 125 samples misclassified as non-symp 2 and 74 samples
misclassified as symp 1. For the symp 1 class, there were 63 misclassifications as
non-symp 3 and 66 misclassifications as symp 2. Finally, 1921 samples in the symp 2 class
were correctly classified, with 112 misclassifications as symp 1. Again, the
general trend from the classification analysis showed that misclassified samples were
assigned to adjacent classes with similar spectral characteristics.
Overall, the classification results achieved using the vegetation
indices and the optimal bands were very similar, and both provided overall classification
accuracies higher than 90%. While SMACC is a well-known method for
extracting endmembers, more efficient asymptomatic detection methods need to be
explored and their results compared to those achieved by SMACC. Future work on the
detection of the disease will include analyzing time-lapse hyperspectral images taken
every two to three days in order to ensure the detection of healthy and asymptomatic
regions before symptoms become visible.
Conclusion
The main objective of this study was to detect AMB disease at the asymptomatic
stage using outdoor hyperspectral images acquired in 2014. To define endmembers
for the outdoor hyperspectral dataset efficiently, an endmember extraction algorithm
called sequential maximum angle convex cone (SMACC) was used. Five classes were
created, and the MTVI and MAREP2 vegetation indices were built based on five spectral
bands. Six optimal spectral bands were also chosen using a combination of
Jeffries-Matusita distance and KLD hierarchical clustering algorithms. These features
were used as input features for an SVM classifier. Results showed that both the MTVI and
MAREP2 vegetation indices and the six selected bands could efficiently separate
non-symptomatic and symptomatic pixels on AMB-diseased leaves.
Figure 5-1. RGB color display of an outdoor hyperspectral image taken in 2014.
Figure 5-2. Comparison of original hyperspectral image and SVM-masked image. A) before mask was applied, B) after mask was applied.
Figure 5-3. Texture images from previously SVM-masked image at wavelength 450.8 nm. A) data range, B) entropy, C) variance, D) mean, and E) skewness.
Figure 5-4. Combined SVM-texture filter masking result. A) entropy mask, B) hyperspectral image after applying mask.
Figure 5-5. Mean reflectance spectra of non-symptomatic and AMB symptomatic samples in 2014 outdoor hyperspectral dataset.
Figure 5-6. The plot of MAREP2 against MTVI for non-symptomatic and AMB symptomatic samples in 2014 outdoor hyperspectral dataset.
Table 5-1. SVM classification accuracy for six outdoor hyperspectral classes using MTVI and MAREP2 vegetation indices as input features.
                                                   Actual
Predicted    non_symp1      non_symp2      non_symp3      symp1          symp2          background     Total
non_symp1    1921 (94.5%)   118 (5.8%)     0              0              0              0              2039
non_symp2    112 (5.5%)     1771 (87.1%)   107 (5.3%)     1 (0.05%)      0              0              1991
non_symp3    0              144 (7.1%)     1844 (90.7%)   70 (3.4%)      1 (0.05%)      0              2059
symp1        0              0              82 (4.0%)      1914 (94.1%)   109 (5.4%)     0              2105
symp2        0              0              0              48 (2.4%)      1924 (94.6%)   0              1971
background   0              0              0              0              0              2033 (100%)    2033
Total        2033           2033           2033           2033           2033           2033           12198
Table 5-2. SVM classification accuracy for six outdoor hyperspectral classes using six spectral bands selected by JM-KLD hierarchical clustering algorithm.
                                                   Actual
Predicted    non_symp1      non_symp2      non_symp3      symp1          symp2          background     Total
non_symp1    1970 (96.9%)   94 (4.6%)      0              0              0              0              2064
non_symp2    63 (3.1%)      1796 (88.3%)   125 (6.1%)     0              0              0              1984
non_symp3    0              143 (7.0%)     1834 (90.2%)   63 (3.1%)      0              0              2040
symp1        0              0              74 (3.6%)      1904 (93.7%)   112 (5.5%)     0              2090
symp2        0              0              0              66 (3.2%)      1921 (94.5%)   0              1987
background   0              0              0              0              0              2033 (100%)    2033
Total        2033           2033           2033           2033           2033           2033           12198
CHAPTER 6 SUMMARY AND FUTURE DIRECTION
Two different datasets were analyzed in this study for the early detection of AMB
disease: indoor spectroradiometer data and outdoor hyperspectral images. For the
indoor dataset, both qualitative and quantitative analyses were performed. Qualitative
analysis of the disease was carried out by building classification algorithms and using
vegetation indices and reflectance data at five optimal spectral bands as input features.
These spectral features were developed using JM distance, OSP, ARI1 and MAREP
algorithms. The results indicated that a combination of MAREP and ARI1 was more
effective in separating the healthy and stressed classes than reflectance data at the
selected spectral bands.
The qualitative analysis of the disease using data acquired from a
spectroradiometer system revealed two important things. The first was that spectral
data of small regions on the leaves could not be acquired and analyzed due to the large
diameter of the system component used in holding the leaves in place during spectral
data acquisition. This resulted in spectrally mixed data for leaf samples containing both
seemingly healthy and diseased regions. Another significant finding resulted from
analyzing the MAREP and ARI1 features of early-stage diseased spectral samples. It
was found that the spectral features of samples belonging to this class were more
widely distributed than those of the healthy and nutrient-deficient classes. As a result,
quantitative analysis of previously analyzed early-stage diseased samples was
performed.
Abundance estimation and spectral unmixing algorithms were developed using a
kNN classifier and PSO algorithm. PSO showed promising results in spectrally unmixing
early-stage AMB-diseased samples by creating symptomatic and asymptomatic
endmembers from previously mixed spectral data. As for the quantitative aspect of the
analysis, PLSR and SMLR prediction models were combined to quantify various
degrees of infestation in early-stage AMB samples. A total of nine spectral bands,
spanning the visible to SWIR wavelength range, were identified as significant in
predicting disease severity.
Even though the analysis of distinct regions on early-stage diseased leaves was
improved using a combination of abundance estimation and spectral unmixing
algorithms, the algorithms could not fully separate the regions. As a result, a
device capable of extracting regions on a per-pixel basis was investigated for more
efficient analysis of the disease. The device used was a hyperspectral imaging
system, and this time data were acquired in an outdoor setting. A tree branch
containing about fifteen leaves, with some regions on the leaves showing early
symptoms of the disease, was imaged in 2014. To extract both non-symptomatic
and symptomatic regions on the leaves, the SMACC endmember extraction
algorithm was used. Spectral features were developed using MTVI, MAREP2, JM
distance, and KLD hierarchical clustering algorithms, and they served as input features
for an SVM classifier. Comparable results were achieved using the vegetation indices
and the optimal spectral bands, and both methods efficiently separated non-symptomatic
and symptomatic pixels on AMB-diseased leaves with an overall accuracy of over 92%.
The SMACC algorithm worked well in creating endmembers at different disease
severity stages, but it is by no means the most efficient method for analyzing AMB,
particularly at the asymptomatic stage. A recommendation for future work on this project
would be to collect time-lapse hyperspectral images every two to three days, at about
the same time of day, until early-stage symptoms begin to develop. To ensure that the
same regions where symptoms eventually develop are analyzed, the
preprocessing stages should include image registration and resampling. This process
will allow for more accurate analysis of regions of interest from the time-series images.
Thresholds can then be set for healthy, asymptomatic, and
symptomatic classes using either the six optimal spectral bands selected by
JM distance and KLD hierarchical clustering or the
spectral features built from MAREP2 and MTVI. Based on the results
achieved from analyzing the time-lapse images, a low-cost multispectral camera can
then be built and mounted on either a ground-based platform such as an unmanned ground
vehicle (UGV) or an aerial platform such as a drone for the early detection of the disease in
apple orchards.
LIST OF REFERENCES
Apan, A., Held, A., Phinn, S., & Markley, J. (2004). Detecting sugarcane ‘orange rust’disease using EO-1 Hyperion hyperspectral imagery. International Journal of Remote Sensing, 25(2), 489-498.
Back, C.-G., Lee, S.-Y., Kang, I.-K., Yoon, T.-M., & Jung, H.-Y. (2015). Occurrence and
Analysis of Apple Blotch-like Symptoms on Apple Leaves. 원예과학기술지,
33(3), 429-434.
Batte, M. T. (1999). National Research Council. Precision Agriculture in the 21st Century: Geospatial and Information Technologies in Crop Management. Washington DC: National Academy Press, 1997, 168 pp., $39.95. American Journal of Agricultural Economics, 81(3), 755-756.
Belasque Jr, J., Gasparoto, M., & Marcassa, L. G. (2008). Detection of mechanical and disease stresses in citrus plants by fluorescence spectroscopy. Applied Optics, 47(11), 1922-1926.
Berardo, N., Pisacane, V., Battilani, P., Scandolara, A., Pietri, A., & Marocco, A. (2005). Rapid detection of kernel rots and mycotoxins in maize by near-infrared reflectance spectroscopy. Journal of Agricultural and Food Chemistry, 53(21), 8128-8134.
Broge, N. H., & Leblanc, E. (2001). Comparing prediction power and stability of broadband and hyperspectral vegetation indices for estimation of green leaf area index and canopy chlorophyll density. Remote sensing of environment, 76(2), 156-172.
Bruce, L. M., Koger, C. H., & Li, J. (2002). Dimensionality reduction of hyperspectral data using discrete wavelet transform feature extraction. Geoscience and Remote Sensing, IEEE Transactions on, 40(10), 2331-2338.
Del Fiore, A., Reverberi, M., Ricelli, A., Pinzari, F., Serranti, S., Fabbri, A., . . . Fanelli, C. (2010). Early detection of toxigenic fungi on maize by hyperspectral imaging analysis. International Journal of Food Microbiology, 144(1), 64-71.
Du, Q., & Yang, H. (2008). Similarity-based unsupervised band selection for hyperspectral image analysis. Geoscience and Remote Sensing Letters, IEEE, 5(4), 564-568.
Eberhart, R. C., & Kennedy, J. (1995). A new optimizer using particle swarm theory. Paper presented at the Proceedings of the sixth international symposium on micro machine and human science.
86
EPPO. (2013). Diplocarpon mali (anamorph: Marssonina coronaria). Retrieved December 12, 2015, from http://www.eppo.int/QUARANTINE/Alert_List/fungi/Diplocarpon_mali.htm
FAO. (2013). Global fruit production in 2013, by variety (in million metric tons). Retrieved November 16, 2015, from http://www.statista.com/statistics/264001/worldwide-production-of-fruit-by-variety/
Franke, J., & Menz, G. (2007). Multi-temporal wheat disease detection by multi-spectral remote sensing. Precision Agriculture, 8(3), 161-172.
Gómez-Sanchis, J., Gómez-Chova, L., Aleixos, N., Camps-Valls, G., Montesinos-Herrero, C., Moltó, E., & Blasco, J. (2008). Hyperspectral system for early detection of rottenness caused by penicilliumdigitatum in mandarins. Journal of Food Engineering, 89(1), 80-86.
Graeff, S., Link, J., & Claupein, W. (2006). Identification of powdery mildew (Erysiphe graminis sp. tritici) and take-all disease (Gaeumannomyces graminis sp. tritici) in wheat (Triticum aestivum L.) by means of leaf reflectance measurements. Open Life Sciences, 1(2), 275-288.
Gruninger, J. H., Ratkowski, A. J., & Hoke, M. L. (2004). The sequential maximum angle convex cone (SMACC) endmember model. Paper presented at the Defense and Security.
Haboudane, D., Miller, J. R., Pattey, E., Zarco-Tejada, P. J., & Strachan, I. B. (2004). Hyperspectral vegetation indices and novel algorithms for predicting green LAI of crop canopies: Modeling and validation in the context of precision agriculture. Remote sensing of environment, 90(3), 337-352.
Harada, Y., Sawamura, K, & Konno, K. (1974). Diplocarpon mali sp. nov., the perfect state of apple blotch fungus Marssonina coronaria. Annals of the Phytopathological Society of Japan.
Huang, W., Lamb, D. W., Niu, Z., Zhang, Y., Liu, L., & Wang, J. (2007). Identification of yellow rust in wheat using in-situ spectral reflectance measurements and airborne hyperspectral imaging. Precision Agriculture, 8(4-5), 187-197.
Jia, X., & Richards, J. (1999). Segmented principal components transformation for efficient hyperspectral remote-sensing image display and classification. Geoscience and Remote Sensing, IEEE Transactions on, 37(1), 538-542.
Jones, C., Jones, J., & Lee, W. (2010). Diagnosis of bacterial spot of tomato using spectral signatures. Computers and Electronics in Agriculture, 74(2), 329-335.
Kobayashi, T., Kanda, E., Kitada, K., Ishiguro, K., & Torigoe, Y. (2001). Detection of rice panicle blast with multispectral radiometer and the potential of using airborne multispectral scanners. Phytopathology, 91(3), 316-323.
87
Kumar, A., Lee, W. S., Ehsani, R. J., Albrigo, L. G., Yang, C., & Mangan, R. L. (2012). Citrus greening disease detection using aerial hyperspectral and multispectral imaging techniques. Journal of Applied Remote Sensing, 6(1), 063542-063541-063542-063522.
Lee, C.-H., Lee, S.-Y., Jung, H.-Y., & Kim, J.-H. (2012). The application of optical coherence tomography in the diagnosis of Marssonina blotch in apple leaves. Journal of the Optical Society of Korea, 16(2), 133-140.
Lee, D.-H., Back, C.-G., Win, N. K. K., Choi, K.-H., Kim, K.-M., Kang, I.-K., . . . Jung, H.-Y. (2011). Biological characterization of Marssonina coronaria associated with apple blotch disease. Mycobiology, 39(3), 200-205.
Lee, H.-T., & Shin, H.-D. (2000). Taxonomic studies on the genus Marssonina in Korea. Mycobiology, 28(1), 39-46.
Li, H., Lv, X., Wang, J., Li, J., Yang, H., & Qin, Y. (2007). Quantitative determination of soybean meal content in compound feeds: comparison of near-infrared spectroscopy and real-time PCR. Analytical and bioanalytical chemistry, 389(7-8), 2313-2322.
Martínez-Usó, A., Pla, F., Sotoca, J. M., & García-Sevilla, P. (2007). Clustering-based hyperspectral band selection using information measures. Geoscience and Remote Sensing, IEEE Transactions on, 45(12), 4158-4171.
Morgan, M., & Ess, D. (1997). The precision-farming guide for agriculturists. Deere and Company.
Moshou, D., Bravo, C., Oberti, R., West, J., Bodria, L., McCartney, A., & Ramon, H. (2005). Plant disease detection based on data fusion of hyper-spectral and multi-spectral fluorescence imaging using Kohonen maps. Real-Time Imaging, 11(2), 75-83.
Muhammed, H. H., & Larsolle, A. (2003). Feature vector based analysis of hyperspectral crop reflectance data for discrimination and quantification of fungal disease severity in wheat. Biosystems engineering, 86(2), 125-134.
Naidu, R. A., Perry, E. M., Pierce, F. J., & Mekuria, T. (2009). The potential of spectral reflectance technique for the detection of Grapevine leafroll-associated virus-3 in two red-berried wine grape cultivars. Computers and Electronics in Agriculture, 66(1), 38-45.
Oberhänsli, T., Vorley, T., Tamm, L., & Schärer, H. (2014). Development of a quantitative PCR for improved detection of Marssonina coronaria in field samples. Paper presented at the Ecofruit. 16th International Conference on Organic-Fruit Growing: Proceedings, 17-19 February 2014, Hohenheim, Germany.
88
Omran, M. G., Engelbrecht, A. P., & Salman, A. (2006). Particle swarm optimization for pattern recognition and image processing Swarm intelligence in data mining (pp. 125-151): Springer.
Pearson, T., Wicklow, D., Maghirang, E., Xie, F., & Dowell, F. (2001). Detecting aflatoxin in single corn kernels by transmittance and reflectance spectroscopy. Transactions of the ASAE, 44(5), 1247.
Qin, J., Burks, T. F., Kim, M. S., Chao, K., & Ritenour, M. A. (2008). Citrus canker detection using hyperspectral reflectance imaging and PCA-based image classification method. Sensing and Instrumentation for Food Quality and Safety, 2(3), 168-177.
Qin, J., Burks, T. F., Ritenour, M. A., & Bonn, W. G. (2009). Detection of citrus canker using hyperspectral reflectance imaging with spectral information divergence. Journal of food engineering, 93(2), 183-191.
Roggo, Y., Duponchel, L., Noe, B., & Huvenne, J. (2002). Sucrose content determination of sugar beets by near infrared reflectance spectroscopy. Comparison of calibration methods and calibration transfer. Journal of near infrared spectroscopy, 10(2), 137-150.
Rumpf, T., Mahlein, A.-K., Steiner, U., Oerke, E.-C., Dehne, H.-W., & Plümer, L. (2010). Early detection and classification of plant diseases with Support Vector Machines based on hyperspectral reflectance. Computers and Electronics in Agriculture, 74(1), 91-99.
Sankaran, S., Maja, J. M., Buchanon, S., & Ehsani, R. (2013). Huanglongbing (citrus greening) detection using visible, near infrared and thermal imaging techniques. Sensors, 13(2), 2117-2130.
Sankaran, S., Mishra, A., Ehsani, R., & Davis, C. (2010). A review of advanced techniques for detecting plant diseases. Computers and Electronics in Agriculture, 72(1), 1-13.
Sankaran, S., Mishra, A., Maja, J. M., & Ehsani, R. (2011). Visible-near infrared spectroscopy for detection of Huanglongbing in citrus orchards. Computers and electronics in agriculture, 77(2), 127-134.
Tamietti, G., & Matta, A. (2003). First report of leaf blotch caused by Marssonina coronaria on apple in Italy. Plant Disease, 87(8), 1005-1005.
Tanaka, S., Kamegawa, N., Ito, S., & Kameya Iwaki, M. (2000). Detection of thiophanate-methyl-resistant strains in Diplocarpon mali, causal fungus of apple [Malus pumila] blotch. Journal of General Plant Pathology (Japan).
Thenkabail, P. S., Lyon, J. G., & Huete, A. (2011). Hyperspectral remote sensing of vegetation: CRC Press.
89
Wang, H., & Angelopoulou, E. (2006). Sensor band selection for multispectral imaging via average normalized information. Journal of Real-Time Image Processing, 1(2), 109-121.
Wu, D., Feng, L., Zhang, C., & He, Y. (2008). Early detection of Botrytis cinerea on eggplant leaves based on visible and near-infrared spectroscopy. Transactions of the ASABE, 51(3), 1133-1139.
Xu, H., Ying, Y., Fu, X., & Zhu, S. (2007). Near-infrared spectroscopy in detecting leaf miner damage on tomato leaf. Biosystems Engineering, 96(4), 447-454.
Yang, C., Lee, W. S., & Williamson, J. G. (2012). Classification of blueberry fruit and leaves based on spectral signatures. biosystems engineering, 113(4), 351-362.
Yin, L., Li, M., Ke, X., Li, C., Zou, Y., Liang, D., & Ma, F. (2013). Evaluation of Malus germplasm resistance to marssonina apple blotch. European journal of plant pathology, 136(3), 597-602.
Yuan, L., Huang, Y., Loraamm, R. W., Nie, C., Wang, J., & Zhang, J. (2014). Spectral analysis of winter wheat leaves for detection and differentiation of diseases and insects. Field Crops Research, 156, 199-207.
Zhang, B., Sun, X., Gao, L., & Yang, L. (2011). Endmember extraction of hyperspectral remote sensing images based on the discrete particle swarm optimization algorithm. Geoscience and Remote Sensing, IEEE Transactions on, 49(11), 4173-4176.
Zhang, M., Qin, Z., & Liu, X. (2005). Remote sensed spectral imagery to detect late blight in field tomatoes. Precision Agriculture, 6(6), 489-508.
Zhang, N., Wang, M., & Wang, N. (2002). Precision agriculture—a worldwide overview. Computers and electronics in agriculture, 36(2), 113-132.
90
BIOGRAPHICAL SKETCH
Mubarakat Shuaibu was born and raised in Nigeria, Africa. In 2016, she
graduated with a Master of Science degree from the University of Florida, where her
interest was in agricultural engineering. She was also appointed as a graduate research
assistant at the university’s Department of Agricultural and Biological Engineering. Her
research focused on finding ways hyperspectral imaging technology could be used as a
tool for the early detection of a fungal disease in apples called Marssonina blotch.
Shuaibu’s interest in agriculture began in 2002 while she was still a senior in high
school; however, it was only in 2011, while working as a project engineer at her family’s
real estate company, that she decided to pursue agricultural engineering as a
career. This decision stemmed from doing personal research on the agricultural sector
of her country and finding out that factors such as climate change, insufficient land for
farming, population growth, and most importantly, inadequate technical expertise, had
not favored the production of adequate food. Her career goal is to be an
accomplished agricultural engineer who focuses on developing better and
more sustainable methods of growing food to meet the world's needs.
In 2009, Shuaibu graduated with a first-class BEng (Hons) in Electrical and
Electronic Engineering from one of the top engineering universities in the UK, the
University of Nottingham. After her graduation, she returned home and worked as an
instrument engineer at National Engineering and Technical Company (NETCO), an
engineering firm primarily focused on designing and building plants and facilities for
companies in the Nigerian oil and gas sector. She later went on to work as a project
engineer at her family's real estate company, Bright Star Realties, and as an
electrical design engineer and health and safety officer at C.A. Preston Engineering Ltd,
a full-service oil and gas engineering consulting company with specialized expertise in
both onshore and offshore petroleum developments.