TRANSCRIPT
-
MACHINE VISION CLASSIFICATION OF PISTACHIO NUTS
USING
PATTERN RECOGNITION AND NEURAL NETWORKS
A Thesis
Submitted to the College of Graduate Studies and Research
in Partial Fulfillment of the Requirements
for the Degree of
Doctor of Philosophy
in the
Department of Agricultural and Bioresource Engineering
University of Saskatchewan
Saskatoon, Saskatchewan, Canada
by
Ahmad Ghazanfari-Moghaddam
Fall, 1996
© Copyright Ghazanfari, Ahmad, 1996. All rights reserved.
-
National Library of Canada / Bibliothèque nationale du Canada
Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques
395 Wellington Street, Ottawa ON K1A 0N4, Canada
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
-
University of Saskatchewan
College of Graduate Studies and Research
SUMMARY OF DISSERTATION
Submitted in Partial Fulfillment
of the Requirements for the
DEGREE OF DOCTOR OF PHILOSOPHY
by
Ahmad Ghazanfari-Moghaddam
Department of Agricultural and Bioresource Engineering
University of Saskatchewan
Summer, 1996
Examining Committee:
Dr. L.A. Kells
Dr. E. Barber
Dr. J. Irudayaraj
Dr. D. Wulfsohn
Dr. R. Bolton
Dr. R. Ford
Dr. A. Kusalik
External Examiner:
Dean/Dean's Designate, Chair, College of Graduate Studies and Research
Chair of Advisory Committee, Department of Agricultural and Bioresource Engineering
Co-supervisor, Department of Agricultural and Bioresource Engineering
Co-supervisor, Department of Agricultural and Bioresource Engineering
Department of Electrical Engineering
Department of Agricultural and Bioresource Engineering
Department of Computer Science
Dr. D.S. Jayas, Department of Biosystems Engineering, University of Manitoba, Winnipeg, Manitoba R3T 5V6
-
MACHINE VISION CLASSIFICATION OF PISTACHIO NUTS
USING
PATTERN RECOGNITION AND NEURAL NETWORKS
Machine vision-based sorting of agricultural commodities is an alternative to the conventional mechanical and electro-optical sorting methods. This method offers high-speed, multi-category classification by processing multiple features obtained through image processing algorithms. The purpose of this thesis was to select an appropriate set of features and to investigate different classification schemes for efficient machine vision-based sorting of pistachio nuts.
Kerman cultivar pistachio nuts obtained from California were used in this study. A sample of nuts was weighed and manually sorted into four classes: "Grade One" (G1), "Grade Two" (G2), "Grade Three" (G3), and "unsplit nuts" (UN). Each class consisted of 260 nuts. Morphological features (area, length, width, perimeter, and roundness), Fourier descriptors (FD's) of the boundary, and gray-level histograms were extracted from images of the nuts using a Macintosh-based machine vision system and commercial image processing software.
The discrimination power of the individual sets of features for separating the four classes was investigated using Gaussian classifiers. The morphological features and FD's resulted in relatively low classification accuracies. The gray-level histograms yielded an average classification accuracy of 98.5%. Analysis of the classification results indicated that morphological features had a high potential for separating G1, G2, and G3 from each other, and that the FD's had high discrimination power for separating the split nuts from unsplit ones.
-
Different feature selection methods, including forward selection, backward elimination, the Fisher criterion, and graphical analysis, were applied to select a suitable subset of features. The feature selection results indicated that a combination of seven selected FD's and the area (7FD's & A), or a combination of the frequency of gray level 56 and the area (GL-56 & A), were suitable for separating the four classes. The selected features were used as input to different classifiers: Gaussians, decision trees, multi-layer neural networks (MLNN), and multi-structure neural networks (MSNN). A procedure for calculating the computational complexity of the classifiers was developed. The classifiers were compared in terms of performance and computational complexity.
A decision tree classifier using GL-56 & A resulted in 91.7% classification accuracy. The same features using MLNN and MSNN resulted in 92.4% and 93.2% accuracy, respectively. GL-56 & A using a Gaussian classifier resulted in an overall classification accuracy of 89.6%. Using 7FD's & A, the classification accuracies were 82.8%, 88.7%, 94.1%, and 95.0% for the Gaussian, decision tree, MLNN, and MSNN classifiers, respectively.
The decision tree classifiers required the least computational time, but relied heavily on the threshold values supplied by the user. The neural network classifiers, in sequential executions, required more computational time, but in terms of classification accuracy they were superior to the statistical classification methods. The MSNN classifiers were the most suitable method for this multi-category classification problem. These classifiers learned their input-output mapping faster and were more robust than the MLNN classifiers.
-
COPYRIGHT
The author has agreed that the Library, University of Saskatchewan, may
make this thesis freely available for inspection. Moreover, the author has agreed
that permission for extensive photocopying of this thesis for scholarly purposes
may be granted by the professor or professors who supervised the thesis work
recorded herein or, in their absence, by the Head of the Department or the Dean
of the college in which the thesis work was done. It is understood that due
recognition will be given to the author of this thesis and to the University of
Saskatchewan in any use of the material in this thesis. Copying or publishing or
any use of the thesis for financial gain without approval by the University of
Saskatchewan and the author's written permission is prohibited.
Request for permission to copy or to make any other use of the material in
this thesis in whole or in part should be addressed to:
Head of the Department of Agricultural and Bioresource Engineering
University of Saskatchewan
57 Campus Drive
Saskatoon, Saskatchewan S7N 5A9
Canada
-
ABSTRACT

Machine vision-based sorting of agricultural commodities is an alternative to the conventional mechanical and electro-optical sorting methods. This method offers high-speed, multi-category classification by processing multiple features obtained through image processing algorithms. The purpose of this thesis was to determine an appropriate set of features and to investigate different classification schemes for efficient machine vision-based sorting of pistachio nuts.
Kerman cultivar pistachio nuts obtained from California were used in this study. A sample of nuts was weighed and manually sorted into four classes: "Grade One" (G1), "Grade Two" (G2), "Grade Three" (G3), and "unsplit nuts" (UN). Each class consisted of 260 nuts. Morphological features (area, length, width, perimeter, and roundness), Fourier descriptors (FD's) of the boundary, and gray-level histograms were extracted from images of the nuts using a Macintosh-based machine vision system and commercial image processing software.
The discrimination power of the individual sets of features for separating the four classes was investigated using Gaussian classifiers. The morphological features and FD's resulted in relatively low classification accuracies. The gray-level histograms yielded an average classification accuracy of 98.5%. Analysis of the classification results indicated that morphological features had a better potential for separating G1, G2, and G3 from each other, while the FD's had a higher discrimination power for separating the split nuts from unsplit ones.
-
Different feature selection methods, including forward selection, backward elimination, the Fisher criterion, and graphical analysis, were applied to select a suitable subset of features. The feature selection results indicated that a combination of seven selected FD's and the area (7FD's & A), or a combination of the frequency of gray level 56 and the area (GL-56 & A), were suitable for separating the four classes. The selected features were used as input to different classifiers: Gaussians, decision trees, multi-layer neural networks (MLNN), and multi-structure neural networks (MSNN). A procedure for calculating the computational complexity of the classifiers was developed. The classifiers were compared in terms of performance and computational complexity.
A decision tree classifier using GL-56 & A resulted in 91.7% classification accuracy. The same features using MLNN and MSNN resulted in 92.4% and 93.2% accuracy, respectively. GL-56 & A using a Gaussian classifier resulted in an overall classification accuracy of 89.6%. Using 7FD's & A, the classification accuracies were 82.8%, 88.7%, 94.1%, and 95.0% for the Gaussian, decision tree, MLNN, and MSNN classifiers, respectively.
The decision tree classifiers required the least computational time, but relied heavily on the threshold values supplied by the user. The neural network classifiers, in sequential executions, required more computational time, but in terms of classification accuracy they were superior to the statistical classification methods. The MSNN classifiers were the most suitable method for this multi-category classification problem. These classifiers learned their input-output mapping faster and were more robust than the MLNN classifiers.
-
ACKNOWLEDGMENT
Praise be to God, the Beneficent, the Merciful.
I would like to express my appreciation and gratitude to the individuals who assisted, encouraged, guided, and supported me throughout my education. Special appreciation is expressed to my Ph.D. advisors, Professor J. Irudayaraj (Dept. of Bio. & Ag. Engin., Utah State University) and Professor D. Wulfsohn (Dept. of Ag. & Bio. Engin.), for their excellent supervision and advice.
My deepest thanks and gratitude are extended to the members of my advisory and examining committees: Professor E. Barber (Dept. of Ag. & Bio. Eng.), Professor R. Bolton (Dept. of Electrical Eng.), Professor R. Ford (Dept. of Ag. & Bio. Eng.), Professor A. Kusalik (Dept. of Computer Science), and Professor D.S. Jayas (Dept. of Biosystems Eng., Univ. of Manitoba), the external examiner, for their guidance and support.
Many thanks to Professor S. Sokhansanj (Dept. of Ag. & Bio. Eng.), for use of the Bioprocess facilities; to Mr. M. Romaniuk, former research engineer (Dept. of Ag. & Bio. Eng.), for his assistance in using the machine vision system; to Professor H.C. Wood (Dept. of Electrical Eng.), for use of ANN software; and to the personnel and graduate students of the Department of Agricultural and Bioresource Engineering, for their support.
Appreciation is further extended to the Ministry of Culture and Higher
Education, Islamic Republic of Iran, for granting a Ph.D. scholarship to cover my entire living and educational expenses.
Thanks are due to the California Pistachio Commission and Paramount Farms for supplying the pistachio nuts required by this project, as well as brochures on pistachio nuts.
Finally, I would like to express my appreciation to my beloved wife, Shahin and to my two little daughters, Nadia (8) and Farida (6) for their understanding, patience, and support, and for being so nice.
God bless all of you.
-
TABLE OF CONTENTS
COPYRIGHT
ABSTRACT
ACKNOWLEDGMENT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF FREQUENTLY USED ABBREVIATIONS

1 INTRODUCTION AND OBJECTIVES
  1.1 Introduction
  1.2 Objectives
  1.3 An Overview of The Thesis Chapters

2 BACKGROUND AND REVIEW OF LITERATURE
  2.1 Pistachio Nuts
    2.1.1 Cultural Practices
    2.1.2 Post-Harvest Processing of Pistachio Nuts
    2.1.3 USDA Standards for Grading Pistachio Nuts
  2.2 Inspection by Machine Vision
  2.3 Image Acquisition
  2.4 Classification Features
    2.4.1 External Image Features
      2.4.1.1 Morphological Features
      2.4.1.2 Fourier Descriptors
      2.4.1.3 Boundary Sequences
      2.4.1.4 Boundary Chain Codes
    2.4.2 Internal Image Features
      2.4.2.1 Moments
      2.4.2.2 Textural Features
      2.4.2.3 Gray-Level Histograms
  2.5 Feature Selection
    2.5.1 Interclass/Intraclass Method
    2.5.2 Forward Selection
-
3.6 Primary Classifications Using Gaussian Classifier
  3.6.1 Gaussian Classifier
3.7 Feature Selection
  3.7.1 Feature Selection for Morphological Features
  3.7.2 Feature Selection for FD's
  3.7.3 Feature Selection for Gray-level Histograms
3.8 Classification Using Selected Features
  3.8.1 Decision Tree Classifiers
  3.8.2 Neural Network Classifiers
    3.8.2.1 Selecting a Network Topology
    3.8.2.2 Selecting a Learning Step
    3.8.2.3 Training a Neural Network
  3.8.3 MSNN Classifier
3.9 Classification Performance Evaluation
3.10 Computational Complexity Calculations
  3.10.1 Computational Complexity for Gaussian Classifier
  3.10.2 Computational Complexity for MLNN and MSNN
3.11 Summary

4 RESULTS AND DISCUSSION
  4.1 Introduction
  4.2 Random Position Experiments
    4.2.1 Classifications Using Morphological Features
    4.2.2 Classifications Using FD's
    4.2.3 Conclusions from Random Position Experiments
  4.3 Controlled Position Experiments
    4.3.1 Classification Using Morphological Features
    4.3.2 Classification Using Fourier Descriptors
    4.3.3 Classification Using Selected FD's and Area
    4.3.4 Classification Using Gray-level Histogram Data
  4.4 Exploring the Training Behavior of MLNN
    4.4.1 Learning Step
    4.4.2 Data Arrangement
    4.4.3 Effects of Network Topology on Training
  4.5 Classification of Pistachio Nuts by MLNN
    4.5.1 GL-56 & A
    4.5.2 7FD's & A
  4.6 Classification Using MSNN
    4.6.1 GL-56 & A
    4.6.2 7FD's & A
  4.7 Comparison of Classifiers' Performances
    4.7.1 Performance of Gaussian Classifiers
    4.7.2 Performance of Decision Tree Classifiers
    4.7.3 Performance of The MLNN Classifiers
    4.7.4 Performance of MSNN Classifiers
  4.8 Summary

5 SUMMARY AND CONCLUSIONS
  5.1 Summary
  5.2 Conclusions
  5.3 Recommendations

6 REFERENCES

APPENDIX A: USDA STANDARDS FOR GRADING PISTACHIO NUTS

APPENDIX B: DELTA LEARNING RULE AND ERROR BACK-PROPAGATION FOR MULTI-LAYER NEURAL NETWORKS

APPENDIX C: CHI-SQUARE TEST FOR GOODNESS OF FIT
-
LIST OF FIGURES

The main post-harvest processing of pistachio nuts
An optical nut-sorter (ESM International Inc., Houston, TX, USA)
A schematic representation for a classification process by machine vision
The concept of interclass and intraclass distances
Bayes decision rule for multi-class problems
Machine classification of patterns using discrimination
A mathematical model for a neuron
Typical activation functions for neurons: a) sigmoid, b) hard ...
Mathematical model of a perceptron
A multi-category classifier using c discrete perceptrons
A typical three-layer neural network
The general classification system
The machine vision system, 1: Back lighting device, 2: Camera, 3: Monitor, 4: Computer, 5: Object, 6: Image of the object
The polygonal representation of the boundary of an image
A typical MSNN for four-class classification
A typical discriminator of a MSNN
The performance of individual morphological features in recognizing the three classes of G1, G2 and G3
Area/mass correlation of split pistachio nuts
Area histograms: (a) G1, (b) G2, (c) G3
The mean of the first 15 harmonics of the four classes of pistachio nuts
A typical gray-scale image of a pistachio nut
The mean of the first 15 harmonics of split and unsplit nuts
The proposed classification scheme using 7FD's & A
The gray-level distribution of the four classes
Increase in accuracy of classification by increasing the number ...
Tree classification scheme using GL-56 & A data
4.11 The effect of learning step in training a MLNN
4.12 Effect of data arrangement on the training of MLNN
4.13 A typical configuration of a data file for MLNN
4.14 A typical data file for the MSNN classifiers
4.15 The location of the G1, G2 and G3 classes on a hypothetical normal ...
-
LIST OF TABLES

1.1 USDA standards for size grading of the pistachio nuts (California Pistachio Commission, 1995)
4.2 Performance of the Gaussian classifier using morphological features
4.3 Gaussian classification of G1, G2 and G3 using area
4.4 Performance of the Gaussian classifier using FD's and Area
4.5 Performance of the decision tree classifier using 7FD's & A
4.6 Performance of the Gaussian classifier using 7FD's & A
4.7 Performance of the Gaussian classifier using gray- ...
4.8 Performance of the decision tree classifier using GL-56 & A
4.9 Performance of the Gaussian classifier using GL-56 & A
4.10 The performance of different network topologies under ...
4.11 MLNN classification using 6-5-4 network and transformed data of GL-56 & A
4.12 Classification using 10-5-4 network with 7FD's & A
4.13 Classification results of MSNN with GL-56 & A
4.14 The classification results of MSNN using 7FD's & A
4.15 MLNN performance using GL-56 & A using a threshold of 0.6 for the network output
-
LIST OF FREQUENTLY USED ABBREVIATIONS

MLNN        Multi-layer neural network
MSNN        Multi-structure neural network
UN          Unsplit pistachio nuts
G1          Pistachio nuts, Grade one
G2          Pistachio nuts, Grade two
G3          Pistachio nuts, Grade three
FD's        Fourier descriptors
7FD's & A   Seven selected FD's and area
GL-56 & A   Gray level 56 and area
-
CHAPTER 1
INTRODUCTION AND OBJECTIVES

1.1 INTRODUCTION

Harvested pistachio nuts contain a considerable number of empty,
undeveloped, and unsplit shells due to factors such as unfavorable climate,
incomplete pollination, lack of nutrition, and disease (Woodroof 1967).
However, consumer demand is for large, in-shell, split pistachio nuts. The
United States Department of Agriculture (USDA) standards for pistachio nuts
designate size grades of "Extra Large", "Large", "Medium", and "Small" for
these nuts (Table 1.1). The standards also identify as degrading factors the
existence of foreign materials, and damaged and unsplit nuts.
Table 1.1 USDA standards for size grading of the pistachio nuts (California Pistachio Commission, 1995)

Extra Large | Large   | Medium  | Small
20 or less  | 21 - 25 | 26 - 30 | 31 or more
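The grade boundaries of Table 1.1 amount to a simple threshold lookup. The following minimal sketch encodes only the count ranges given in the table; the meaning of the count (nuts per unit sample weight) is as defined by the USDA standard and is not restated here:

```python
def size_grade(count):
    """Map a nut count to a USDA size grade using the ranges of
    Table 1.1. Only the table's ranges are encoded here; the unit
    of `count` follows the USDA standard."""
    if count <= 20:
        return "Extra Large"
    if count <= 25:
        return "Large"
    if count <= 30:
        return "Medium"
    return "Small"
```

For example, `size_grade(23)` falls in the 21 - 25 range and returns "Large".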
-
Currently, on-farm separation of split from unsplit pistachio nuts is
accomplished by flotation methods. Primary sorting of the nuts is usually
done using mechanical devices such as screens, aspirators, and gravity
separators (Kader 1985). Electro-mechanical devices, optical sorters, and
manual sorting are used in processing plants for further grading of the nuts.
To implement the grades indicated in Table 1.1, a USDA inspector will take
random samples from a lot and grade the entire lot based on personal
experience. Industries also have their own standards for packaging and
marketing these nuts.
Inspection and sorting by human labor is a subjective and time-consuming
process. It also becomes cumbersome after a prolonged period of time. On the
other hand, mechanical sorting is not precise and, because of direct contact,
can cause damage to the nuts. Optical sorting devices use light of certain
wavelengths reflected from a product to assess its quality. However, such
optical devices cannot be used for size or shape grading.
Inspection and grading of pistachio nuts by machine vision is an
attractive alternative to conventional methods because it offers the potential
for high-speed, non-destructive classification of the nuts using a single
machine. In this process a charge-coupled device (CCD) camera provides analog
signals of an object to a computer, where the signals are digitized and stored as
an image. Further processing can be done on the image to extract quantitative
information to be used as input to a classification algorithm.
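The digitize-then-extract step described above can be illustrated with a short sketch. This is not the implementation used in this thesis (which relied on a Macintosh-based system and commercial software); it merely shows, in NumPy, how morphological features such as area, length, width, perimeter, and roundness might be computed from a binary object mask, using one common definition of roundness:

```python
import numpy as np

def morphological_features(mask):
    """Compute simple shape features from a binary object mask
    (2-D array, nonzero = object pixel). Illustrative sketch only."""
    mask = mask.astype(bool)
    area = int(mask.sum())                     # pixels inside the object
    rows = np.any(mask, axis=1)                # rows touched by the object
    cols = np.any(mask, axis=0)                # columns touched
    length = int(rows.sum())                   # bounding-box height
    width = int(cols.sum())                    # bounding-box width
    # Crude perimeter estimate: object pixels that have at least one
    # background 4-neighbour.
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = int((mask & ~interior).sum())
    # Roundness = 4*pi*A / P^2 (equals 1.0 for an ideal circle).
    roundness = 4.0 * np.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "length": length, "width": width,
            "perimeter": perimeter, "roundness": roundness}
```

A real system would first threshold the digitized gray-scale image to obtain the mask; the feature dictionary then becomes the input vector to a classifier.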
The continual improvement in the price/performance of digital computers
has made it practical to automate visual inspection in many areas. As vision
technology continues to develop and industry becomes increasingly aware of
its potential, computer vision will find many new applications. Much of the
ongoing research in food and agricultural processing is focused on the
application of machine vision to quality control. Examples include maturity
detection of peanuts (Ghate et al. 1993), sorting of dried prunes (Delwiche et al.
1993), and potato inspection (Tao et al. 1990). These industries are extremely
competitive; hence, efficiency and quality are primary means to increase
market share and profit. Automation is not a luxury in these industries, but
an essential requirement.
On the other hand, agricultural products present a challenge to
inspection technology because of the wide variability in the properties
used for assessing quality and grade. Machine vision systems are well suited to
inspecting rigid or predefined objects such as machine tools and metal parts.
However, visual characteristics of agricultural products such as color, shape,
size, and texture are difficult for a machine vision system to discern. It is even
harder to assess quality based on visual processing of these features.
Artificial neural networks, which resemble biological nervous systems,
have proved to be robust in dealing with ambiguous data and with the kind of
problems that require the interpolation of large amounts of data. Instead of
sequentially performing a program of instructions, neural networks explore
many competing hypotheses simultaneously using massive parallelism
(Lippmann 1987). In addition, neural networks have the potential for solving
problems in which some inputs and corresponding output values are known,
but the relationship between the inputs and outputs is not well understood.
These conditions are commonly found in agricultural inspection problems.
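The situation just described, known input-output pairs with an unknown relationship, is exactly what back-propagation training addresses. The following minimal NumPy sketch trains a one-hidden-layer network on the XOR truth table, a toy problem whose mapping is known from examples but is not linearly separable. It is purely didactic; the MLNN and MSNN classifiers actually used in this thesis are developed in later chapters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_mlp(X, y, hidden=4, lr=0.5, epochs=5000, seed=0):
    """Train a one-hidden-layer network by error back-propagation.
    Returns the weight matrices and the mean-squared-error history."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0, (X.shape[1], hidden))
    W2 = rng.normal(0.0, 1.0, (hidden, y.shape[1]))
    losses = []
    for _ in range(epochs):
        h = sigmoid(X @ W1)                    # hidden-layer activations
        out = sigmoid(h @ W2)                  # network outputs
        err = y - out
        losses.append(float((err ** 2).mean()))
        d_out = err * out * (1.0 - out)        # delta at the output layer
        d_h = (d_out @ W2.T) * h * (1.0 - h)   # delta propagated back
        W2 += lr * h.T @ d_out                 # weight updates
        W1 += lr * X.T @ d_h
    return W1, W2, losses

# XOR: the output is known for every input, but no single linear
# boundary separates the two classes.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
W1, W2, losses = train_mlp(X, y)
```

The mean squared error falls as the weights adapt, even though the network is never told the form of the XOR function, which is the property that makes such networks attractive for agricultural inspection.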
-
Pattern recognition has emerged as an important application of artificial
neural networks. One of the most important attributes of neural network
classifiers is their capability to approximate the a posteriori distribution of their
training samples through learning and adaptation. This ability makes them
unique among pattern classifiers. The application of machine vision, coupled
with the capabilities of neural networks, seems to offer promise for inspecting
agricultural products.
Grading pistachio nuts using machine vision in conjunction with
pattern recognition techniques, including neural networks, offers many
advantages over the conventional optical or mechanical sorting devices.
Multiple sensors can be used to gather the necessary information from the
nuts and send suitable signals to a computer, where they can be decoded for
multi-category classification. Image processing algorithms can be used to
extract higher-level information from the input signals for improved
classification performance. The classification parameters can be easily
modified to take into account annual variations in the product. When neural
networks are used as pattern classifiers, the sorting device can be equipped
with a training option through which the machine can be trained to
recognize new grades or different products.
An extensive literature search and direct communication with
industrial sources indicated that no pattern recognition or
neural network-based system has been used for sorting or grading
pistachio nuts. Benchmark studies are thus needed to develop an efficient
and practical machine vision-based method for grading pistachio nuts.
Research must be conducted to determine suitable image features and proper
classification methods for accurate grading of pistachio nuts.
The main objective of this research was to study the feasibility of
classifying pistachio nuts into four classes of "Grade One" (G1), "Grade Two"
(G2), "Grade Three" (G3), and "Unsplit" (UN) using a machine vision system.
Specific goals were:
1. To investigate the potential of different image-extracted
features for classification of pistachio nuts.
2. To investigate the feasibility of classifying pistachio nuts
into their appropriate classes using the selected features by:
a. Designing or selecting appropriate statistical
pattern classifiers.
b. Designing suitable multi-layer neural network
based classifiers.
3. To compare the performance of the applied classifiers and
determine an efficient classification technique.
1.3 AN OVERVIEW OF THE THESIS CHAPTERS
The material presented in this thesis is organized into five chapters.
The first chapter addresses the justification, importance, and the statement of
the objectives of the research. Chapter Two of this thesis begins with a brief
look at cultural and processing activities for pistachio nuts. It continues with
an explanation of the principles of machine vision and pattern recognition
methods. Exampleç of application of these principles in different areas, with
-
an emphasis on agricultural engineering applications are given throughout
the text. A complete background on multi-layer neural networks and their
relationship to staüstical pattern dassifiers is presented as well. Chapter Two
condudes with a presentation of some examples of application of neural
network dassifiers to selected problems in agricultural engineering.
Chapter Three begins with a brief explanation of the methods followed
to perform the classification procedures. Then a detailed description of the
image acquisition system is presented. Features, feature extraction and
classification methods, classifiers, and the methods for evaluating their
performance are also described in detail.
Results and discussion of different classification methods are presented
in Chapter Four. The presentation of the results follows the flow of
experiments starting with classification of the nuts using bulk features and
evolving towards more suitable features. The discussion covers the source of
variation in different feature types and their possible limitations and merits.
Also, the performance of the different classifiers is compared and the reasons for
the variations in their performance are explained.
Chapter Five includes a summary of the research, methods, findings
and the conclusions made from the experimental results. Finally, suggestions
for future research are proposed.
CHAPTER 2
BACKGROUND
AND
REVIEW OF LITERATURE
2.1 PISTACHIO NUTS
The pistachio tree belongs to the family Anacardiaceae, of which the
cashew, mango, sumac, and poison oak are also members. There are about a
dozen species of Pistacia, a few of which produce small nuts. Only Pistacia vera,
however, yields the acceptable, larger edible nuts of commercial value. The
nuts of Pistacia vera are known as "pesteh" in Persia, from which the name
"pistachio" is derived (Rosengarten 1984). The pistachio tree is deciduous and
dioecious. One male tree is usually adequate to pollinate eight to ten trees. The
male trees are planted in the orchard in a way that takes advantage of the
prevailing winds for pollination.
The pistachio tree attains a height of 2 to 8 m, with a spreading top. It is
native to western Asia (Woodroof 1967). The nut has a light-colored woody
shell. The kernel, which is light greenish-yellow, has a sweet and delicate
flavor. The kernels are eaten raw or are roasted with salt and different spices.
They are also used for flavoring in cookery and confectionery.
2.1.1 Cultural Practices
The pistachio is a dry-climate tree and requires little irrigation. Because
of the deeper soil penetration of its root system, the pistachio tree is better able
to withstand a lack of soil moisture in the upper soil area than other
commonly cultivated fruit or nut trees. The pistachio tree thrives best in areas
having winters cool enough for proper breaking of bud dormancy and long,
hot, dry summers for proper ripening of the nuts.
Pistachio trees begin nut production at about six or seven years of age,
but full bearing is not attained until the fifteenth to twentieth year
(Rosengarten 1984). Under favorable conditions, pistachio trees live and
produce for centuries. In the future, the pistachio nut trees could play an
important role in the development of some arid regions where rainfall
precludes the successful growing of almost any other commercial crop.
Pistachio nuts are borne in clusters. The seed color ranges from light
yellow to deep green throughout, and the leathery husk shows different
shades of yellow, red, and purple. It is peculiar to the pistachio nut that, when
ripe, a large portion of the nuts split along the shell suture. This splitting
property is desirable since pistachio nuts are usually marketed in-shell, to be
opened by the consumer (Woodroof 1967).
The time to harvest pistachio nuts is critical. Earlier harvesting results
in a higher percentage of undeveloped kernels, lower nutritional quality, and
loss in yield. Delayed harvesting causes increased shell staining, losses in
kernel quality, and increased incidence of insect infestation and fungal attack
(Woodroof 1967). The easiest indicator used to determine optimum harvest
time is when the hull slips from the shell when the fruit is pressed between
thumb and fingers. In the Central Valley of California this normally occurs
during the first two weeks of September.
In California, the pistachio nuts mature in late August through late
September. At maturity the external hull of the fruit changes from light green
to a pale straw or whitish, opaque appearance, at the same time softening and
loosening itself from the stony inner part of the ovary wall, which is the
gray-white inner shell. Thus, it is easily slipped off by pressing between the fingers
(Woodroof 1967).
The nuts are harvested by hand picking, knocking them off the tree
with poles, or by shaking the tree with mechanical shakers. When
harvested, the nuts are in hulls and usually contain 40 to 45% moisture
(Woodroof 1967). Proper management of harvesting and post-harvest
handling procedures for pistachio nuts is important to attain maximum yield
of good quality nuts, which in turn determines marketability and profit.
2.1.2 Post-Harvest Processing of Pistachio Nuts
Pistachio nuts, like many other agricultural products, require a
considerable amount of post-harvest processing. The main activities for post-
harvest handling of pistachio nuts are given in Fig. 2.1. Pistachio nuts are
covered with a tight hull when immature. At the time of harvest the hull
softens and loosens itself from the shell. After harvest the hull must be
removed as soon as possible to reduce the chance of fungal growth and to
avoid shell staining. Each hour after the nut is off the tree, the hull degrades.
If the hull is not removed within 24 hours of harvesting, the deteriorating
hull will stain the shell so badly that the nut will no longer be marketable in
the prime market. Abrasive peeling machines are used to remove the hulls
(Kader and Maranto 1985).
[Figure: flow diagram of the main post-harvest operations; the labeled steps include shipping, grading, and roasting]
Fig. 2.1 The main post-harvest processing of pistachio nuts.
The dehulled nuts always contain a considerable number of unsplit and
blank nuts. Floatation techniques are used to separate these nuts from the split
nuts, since immediately after harvesting and dehulling the unsplit nuts tend
to float. During the floatation process, nuts are immersed in a tank of water
and agitated for about 10 minutes. Once the agitation has ceased, the "sinkers"
are primarily the splits and the "floaters" are the unsplits and blanks. Since
the floatation process is not accurate, some unsplit and low
quality nuts always remain. After floatation, the nuts are washed to clean the
surface of the residuals left from dehulling.
The moisture content of nuts increases as a result of the floatation
process. To prevent spoilage and increase storage life, the dehulled nuts must
be dried to about 5-7% moisture content (Kader et al. 1980). The most common
methods for drying pistachio nuts are sun drying and heated air drying.
Further on-farm sorting of the nuts is usually done by electro-mechanical
devices such as aspirators, air streams, gravity separators, shakers, and sieves.
The dried pistachios are usually stored on the farm for a period of 2 to 6
months before being shipped to a processing plant or before being exported.
The nuts are sold by mass and the price is set by the grade of the nuts and the
availability. The USDA has established standards for different grades of
pistachio nuts (California Pistachio Commission 1994). These standards are
briefly described in the next section.
Many nut-processing plants have their own standard for marketing
pistachio nuts. Sorting, grading, roasting, dyeing, inspecting and packaging are
some of the unit processes which may take place in a given plant. Pistachio
companies are very reluctant to release any information on the methods
and equipment for processing these nuts. However, based on personal
communication with North American manufacturers of nut sorting
devices, some pistachio companies use optical sorting machines for detecting
the unsplit, contaminated, and stained nuts. Bichromatic infrared sensors are
commonly used for detecting the stained or contaminated nuts, and
monochromatic sensors are used for detecting the unsplit nuts.
A diagram of an optical sorting machine used for sorting pistachio nuts
is given in Fig. 2.2. The unsorted nuts are poured into the hopper of the
machine and the feeding mechanism gradually distributes the nuts between
several channels. Each channel is equipped with two (180° apart) or three (120°
apart) optical sensors. As the nuts slide down the channels, they pass through
an illuminated tunnel where the optical sensing elements are mounted. The
sensors send signals to electronic circuitry which activates a pneumatic ejector.
These machines are capable of processing approximately one metric tonne of
unsorted nuts per hour.
[Figure: schematic of the sorter showing the hopper, feeder, flow channel, illumination and sensors, and pneumatic ejector]
Fig. 2.2 An optical nut-sorter (ESM International Inc., Houston, TX, USA).
Pistachio production in the USA is a young and rapidly growing
industry. Within the past decade there have been a few published research
projects on the engineering aspects of pistachio processing. Pearson et al. (1993)
measured the physical properties of early-split and normal-split pistachios to
determine a sorting criterion. Hsu et al. (1991) determined some physical and
thermal properties of pistachio nuts. Farsaie et al. (1981) developed an
automatic electro-optical sorter for removing aflatoxin-contaminated pistachio
nuts. The device detected the defective nuts when excited by an incident beam of
long-wavelength ultraviolet light.
2.1.3 USDA Standards for Grading Pistachio Nuts
In 1994 the USDA approved a standard for grading pistachio nuts
(California Pistachio Commission 1994). The standards define separate criteria
for "in shell" and "shelled" nuts. A copy of these standards is reproduced and
presented in Appendix A. Since the nuts used in this research were all "in
shell", the grades of these nuts as approved by the USDA are briefly explained.
Pistachio nuts, regardless of their size, can be graded as "US Fancy", "US
No. 1", "US No. 2", or "US No. 3". The basic requirement for these grades is
that they must be free from foreign material, blanks, stains, unsplit shells,
mold, insects, and nuts having diameters less than 10.0 mm as measured
using a round hole screen. The difference between these grades is the result of
the allowable tolerances for the above mentioned factors.
As was shown in Table 1.1, based on size, pistachio nuts may be graded
as "Extra Large", "Large", "Medium", and "Small". The specification also
indicates that the average number of nuts in a grade should not exceed one-half
nut above or below the specifications given for that grade. For example, a
sample of Large grade pistachios may have an average number of nuts from
20.5 to 25.5 per 28.5 grams (1 oz.). To ensure uniformity within a grade and
to put a limit on the number of smaller nuts in a grade, the specification
indicates that the mass of 10% by count of the largest nuts in a sample should
not exceed 1.7 times the mass of 10% by count of the smallest nuts.
2.2 INSPECTION BY MACHINE VISION
Machine vision is a technology that has arisen from a union between
camera and computer. Figure 2.3 presents a block diagram of the hardware
and software components of a typical classification machine. A video camera
acts as an eye to a machine vision system (Batchelor et al. 1985). Analog signals
generated by the camera are digitized into a sequence of numbers and stored as
an image in the computer. Image processing algorithms are used to extract a
pattern from the image to represent the object. The pattern is classified by a
classification algorithm, which in turn may generate a signal to activate an
actuator to direct the object into its proper route.
Machine vision systems have gained tremendous attention for
inspecting products in different industries, and demands for their new
applications are increasing. Here inspection refers to many industrial tasks
including defect detection, measuring, locating, detecting orientation, grading,
sorting, and counting. Machine vision offers many advantages over the
conventional grading systems. It is compatible with other automated on-line
processing. It will continue working around the clock and under conditions
which would be unpleasant or impossible for a human operator. It can take
dimensional measurements more accurately than a person can estimate by
eye, and can give an objective measure of other variables such as color,
projected area, and shape which an inspector could only assess subjectively
(Batchelor et al. 1985). Since the inspection is done through a non-contacting
procedure, there is less damage to the products when they are being inspected.
[Figure: block diagram linking a sensor to image storage, image processing/analysis, feature selection/extraction, classification, and actuators]
Fig. 2.3 A schematic representation of a classification process by machine vision.
Machine vision sorting of agricultural products is more versatile and
has more maneuverability than existing optical sorters. This method offers
multiple-feature processing. The features can be the signals sent by different
sensors, features obtained through an algorithm, or a combination of the two.
The pattern classification algorithms implemented on a machine
vision system provide multi-category classification of the products. The
classification parameters can be easily changed to improve the classification
performance or to use the same machine for a different product without any
change in its hardware.
2.3 IMAGE ACQUISITION
The first problem in any automated inspection task is to obtain a good
image of the object under investigation. Image acquisition involves collection
of images and their transmission to an associated processing system. There is
no substitute for a high quality image. The first step in achieving this is to
provide proper illumination for the object. Improper illumination often
causes the key features to be obscured by glare, or it may reduce the intensity of
light reaching the detector. The best method of illumination is not always
obvious.
The image sensor is the basic element within a camera for capturing
images. Self-scanned solid-state arrays are the most widely used image sensors
in these systems. Photodiode arrays, charge-coupled devices (CCD), and charge-injection
devices (CID) are examples of self-scanned array devices (Batchelor et
al. 1985). Many industrial inspection problems in which the product moves
along a transport system require line-scan cameras. Using these cameras, an
image of an object is built up line by line as it moves past the camera. Linear
solid-state arrays are extremely well suited for such applications. Video
cameras equipped with self-scanned solid-state sensors and high quality
photographic lenses and filters are widely used for image acquisition (Batchelor
1985). The image obtained for processing does not have to be a simple optical
one. It can be outside the range of visible light. Special infrared cameras are
used for thermal imaging, and ultraviolet light has been used in crack
detection (Batchelor 1985).
An image acquisition board (frame-grabber) is a device which is
installed in a computer to digitize the analog signals received from the image
sensor and store them as an image in computer memory (Data Translation
Inc. 1988). The analog signal of each sensor pixel is usually digitized by an 8-bit
buffer, producing a 0-255 range for the signal. Therefore a white pixel of an
image has a value of 255 and a black pixel is stored as a value of 0. The image
is stored as an array of numbers within the computer memory to be used for
further processing.
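As a rough sketch of this quantization step (the normalized analog voltages below are made-up values, not readings from any particular frame-grabber):

```python
import numpy as np

# Hypothetical analog sensor voltages, normalized to [0, 1]
analog = np.array([0.0, 0.25, 0.5, 1.0])

# 8-bit quantization: scale to 0-255 and round to integers,
# so black maps to 0 and white maps to 255
digital = np.round(analog * 255).astype(np.uint8)
```

A real board performs this conversion in hardware, but the mapping of the signal range onto 256 integer levels is the same idea.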
An image is stored in computer memory as an array of numbers that
may contain over 300,000 elements. Sequential processing of this information is
time consuming and is not feasible for high-speed on-line inspection.
Therefore, usually some image processing and/or image analysis algorithms
are applied to the gray scale images to extract quantitative information
known as "features". The features are used as inputs to a classification
algorithm to determine the class of the object.
When different features of an image are put together they form a vector
of numbers known as a "pattern". Thus, a pattern is a numerical description
of an object. Many different methods have been proposed to obtain numerical
features from a digitized image for the purpose of classification. Pavlidis (1978,
1980) presented a complete survey of algorithms for shape analysis. He
distinguished two main categories of image-extracted features, namely,
external and internal features. They are briefly reviewed in the following
sections.
2.4.1 External Image Features
External image features are those types of features which encode the
boundary information (Pavlidis 1978). They usually require image
segmentation to get the coordinates of the pixels on the external contour of the
image. The external features do not require any knowledge of the gray level of
the internal pixels. After the image segmentation or edge detection process,
mathematical procedures are performed on the coordinates of the boundary
to extract useful features. Examples of external image features are
morphological features, Fourier descriptors, boundary chain codes, and
boundary sequences.
2.4.1.1 Morphological Features
Morphology refers to the area of image processing concerned with the
analysis of images based on their shapes and sizes (Dougherty and Giardina
1978). Morphological features of a shape are obtained by image segmentation
and analysis of the image. To get this type of feature one does not need to
know the coordinates or the gray-level intensity of the pixels within an image.
Therefore a silhouette image is sufficient to extract these features. Examples of
these features are area, width, perimeter, and aspect ratio.
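These quantities can be read directly off a binary silhouette. A minimal numpy sketch, using a small made-up silhouette (the definitions of length and width via the bounding box are one common convention, not necessarily the thesis's exact one):

```python
import numpy as np

# A small hypothetical silhouette image (1 = object, 0 = background)
img = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
])

# Area: number of object pixels
area = int(img.sum())

# Bounding box gives length (horizontal extent) and width (vertical extent)
rows, cols = np.nonzero(img)
length = int(cols.max() - cols.min() + 1)
width = int(rows.max() - rows.min() + 1)

# Aspect ratio of the bounding box
aspect_ratio = length / width
```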
Morphological features have been widely used in automated grading,
sorting and detecting of grains. Neuman et al. (1987) used images of wheat
kernels and extracted shape and size characteristics to discriminate different
classes and varieties of wheat. Brogan and Edison (1974) used morphological
features in conjunction with a recursive learning technique and a Bayesian
decision rule for identification of six different types of grains. Hehn (1991) used
morphological data for separating canola and mustard seeds.
Morphological features sometimes are not sufficient for a high
performance inspection process. These features are combined with other
appropriate features to achieve higher classification rates. Neuman et al. (1987)
used a combination of morphological features and Fourier descriptors to
separate different varieties of wheat. A recent application of morphological
features in a machine vision system examined the quality of snacks by a neural
network (Sayeed et al. 1995). The features considered included area, length,
width, roundness, and perimeter.
2.4.1.2 Fourier Descriptors
Fourier descriptors (FD's) are shape recognition features based on the
Fourier series expansion of periodic functions. The theory behind these
features is discussed in detail in section 3.5.2. The general idea is to represent
the boundary as a periodic function with a period of 2π. The obtained periodic
function is then expanded in a Fourier series and its coefficients are calculated.
Ordinary Fourier coefficients are difficult to use as input to classifiers, because
they contain factors dependent on size, rotation, and phase angle (Granlund
1972).
Different methods have been developed to obtain scale and rotation
invariant FD's for 2-dimensional object recognition. Ehrlich and Weinberg
(1970) described the contour of an object in terms of the lengths of equispaced
radii extending from the centroid to its boundary. They presented the radii as a
periodic function of the central angle with a period of 2π. The function was
then expanded as a Fourier series and the polar coefficients of the expansion
were used as shape descriptors. Granlund (1972) derived the Fourier
coefficients from the expansion of the boundary coordinates of the objects in
the complex plane. Zahn and Roskies (1972) represented a curve as a function
of arc length by the accumulated changes in direction of the curve from a
starting point on the curve. The function was normalized as a 2π-periodic
function and the Fourier coefficients of the function were calculated.
Fourier descriptors have been extensively used as shape descriptors in
many pattern recognition applications. Segerlind and Weinberg (1973) used
FD's obtained using the method developed by Ehrlich and Weinberg (1970) for
grain kernel identification. Persoon and Fu (1986) used FD's obtained by
Granlund's (1972) method for character and machine part recognition.
Romaniuk (1994) used FD's obtained by the Zahn and Roskies (1972) method as
input to a neural network for classification of barley seeds.
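As an illustration of the complex-coordinate approach (a sketch in the spirit of Granlund's method, not the exact formulation used in any of the cited works), the boundary points can be treated as complex numbers and the Fourier coefficients normalized for translation and scale; the circular boundary here is synthetic:

```python
import numpy as np

# Hypothetical boundary: N equally spaced points on a circle of radius 2,
# expressed as complex numbers x + iy
N = 64
t = 2 * np.pi * np.arange(N) / N
boundary = 2 * np.exp(1j * t)

# Fourier coefficients of the complex boundary sequence
coeffs = np.fft.fft(boundary) / N

# Translation invariance: drop the zero-frequency (centroid) term.
# Scale invariance: divide by the magnitude of the first coefficient.
# Taking magnitudes also removes rotation and starting-point phase.
descriptors = np.abs(coeffs[2:]) / np.abs(coeffs[1])
```

For a perfect circle all higher descriptors vanish; departures from zero describe how the shape deviates from a circle.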
2.4.1.3 Boundary Sequences
Boundary sequence is a general term for patterns approximating the
boundary of an object. Dubois and Glanz (1986) approximated the boundary of
an image as an ordered sequence of the lengths of N equiangular radial vectors
projected between the object centroid and the boundary. Ghazanfari and
Irudayaraj (1994a) used the same type of sequence for classification of four
varieties of pistachio nuts. Gupta and Srinath (1987) also used this type of
sequence to derive moments for classification of 2D shapes.
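A minimal sketch of such a radial sequence; for simplicity the distances here are taken at the sampled boundary points themselves rather than along strictly equiangular rays as in Dubois and Glanz (1986), and the four-point contour is made up:

```python
import numpy as np

# Hypothetical boundary pixel coordinates (x, y) of a closed contour
boundary = np.array([[4.0, 0.0], [0.0, 3.0], [-4.0, 0.0], [0.0, -3.0]])

# Centroid of the boundary points
centroid = boundary.mean(axis=0)

# Ordered sequence of radial distances from the centroid to the boundary
radii = np.linalg.norm(boundary - centroid, axis=1)

# Normalizing by the maximum radius gives a scale-invariant sequence
sequence = radii / radii.max()
```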
Bunk and Buhler (1993) used the curvature of the boundary for
symbolic representation of the object. In their method a starting point is
selected on the boundary and the local change in the curvature of the
boundary segments is recorded at equidistant intervals. This method of
boundary representation is very accurate but results in long sequences, which
increase processing times.
2.4.1.4 Boundary Chain Codes
The boundary chain code was first developed by Freeman (1970) to
represent a plane object boundary as a string of directional codes. A boundary
chain code is a sequence of directional codes representing the boundary of an
image. Applications of boundary chain codes in machine vision inspection
have included fruit stem detection (Wolfe and Sander 1985) and tomato
sorting (Sarkar and Wolfe 1985). Boundary chain codes have also been used in
some image processing algorithms to extract morphological features from an
image (Hehn and Sokhansanj 1990; Liu and Srinath 1990).
2.4.2 Internal Image Features
The internal image features are obtained by analysis of the pixels
within the boundary of an image. Both the location of a pixel and its gray level
may play an important role. There are many different types of features that
may be considered for different applications. For example, Gunasekaran et al.
(1987) used line detection within an image to indicate the existence of a crack
in corn, and the existence of pixels with gray levels above a certain
threshold was used for defect detection of dates by Wulfsohn et al. (1993).
Moments, textural features, and gray-level histograms are examples of
internal image features which will be reviewed briefly.
2.4.2.1 Moments
Moments have been among the most commonly used image extracted
features for shape discrimination. Several of the most essential image
attributes such as size, centroid, orientation, shape spreadness, and shape
elongation are directly related to moments (Leu 1991; Hehn 1991). Moment
invariants were first proposed and used for pattern recognition by Hu (1962).
The major disadvantage of moments is that although the first few moments
convey significant information for simple objects, they fail to do so for more
complicated ones (Pavlidis 1978). The computational time for deriving
moments of an image is also high.
Moments were originally calculated using the locations of the internal
pixels of an image. Leu (1991) presented a method for computing moments
from the boundary pixels of an image. He showed that this new method was
much more efficient than the traditional methods for computing moments.
Khotanzad and Lu (1991) used moments for character recognition.
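The low-order moments and the attributes they yield can be sketched as follows (the 3x4 binary image is a made-up example; this is the traditional internal-pixel computation, not Leu's boundary method):

```python
import numpy as np

# Hypothetical binary image (1 = object pixel)
img = np.array([
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
])

rows, cols = np.indices(img.shape)

def raw_moment(p, q):
    """Raw moment m_pq: sum over pixels of x^p * y^q * intensity."""
    return float((cols**p * rows**q * img).sum())

# Area and centroid follow directly from the low-order moments
m00 = raw_moment(0, 0)
xc = raw_moment(1, 0) / m00   # centroid column
yc = raw_moment(0, 1) / m00   # centroid row

# A central moment, e.g. mu_20, measures horizontal spread about the centroid
mu20 = float(((cols - xc)**2 * img).sum())
```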
2.4.2.2 Textural Features
Texture is a qualitative description of a surface in terms of some
properties such as fineness, coarseness, smoothness, and granulation.
Researchers have investigated many methods for evaluating the texture of an
image. Despite its importance, a formal approach to texture description does
not exist. Early image texture studies employed autocorrelation functions,
power spectra, and the relative frequencies of various gray levels (Haralick et
al. 1973). The approaches to texture description are mostly on an ad-hoc basis
and generally utilize the gray level values of the internal pixels of an image in
some way (Haralick 1979).
Haralick et al. (1973) suggested 28 textural features which could be
extracted from gray level images. Sayeed et al. (1995) used a subset of these
features for texture evaluation of a snack food. Khotanzad and Lu (1991)
analyzed gray-level images for texture classification of several different
commodities.
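Many of these textural features are derived from a gray-level co-occurrence matrix. A minimal sketch of such a matrix and two Haralick-style statistics, energy and contrast (the 4-level image and the horizontal-neighbor offset are illustrative choices, not the thesis's configuration):

```python
import numpy as np

# Hypothetical image quantized to 4 gray levels
img = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
])
levels = 4

# Co-occurrence counts for horizontally adjacent pixel pairs
glcm = np.zeros((levels, levels))
for a, b in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
    glcm[a, b] += 1
glcm /= glcm.sum()   # normalize to joint probabilities

# Energy: sum of squared probabilities (high for uniform texture)
energy = float((glcm**2).sum())

# Contrast: weights pairs by the squared gray-level difference
i, j = np.indices(glcm.shape)
contrast = float(((i - j)**2 * glcm).sum())
```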
Gray-level histograms are discrete frequency plots of the pixels with a
particular gray level. These plots may be viewed as a probability density
function so long as the elements of the image are randomly selected (Levine
1985). Then the continuous frequency gray-level plots of different classes of
objects can be used as their discrimination functions. Das and Evans (1992)
used gray level histogram data for detecting fertility of hatching eggs. Han et al.
(1992) used gray level histograms obtained from X-ray images for detecting
split-pit peaches.
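A gray-level histogram and its normalization to an empirical density can be sketched as follows (the tiny 8-bit image is made up):

```python
import numpy as np

# Hypothetical 8-bit gray-level image
img = np.array([[0, 0, 128], [128, 255, 255]], dtype=np.uint8)

# Discrete frequency of each gray level: 256 bins covering 0-255
counts, _ = np.histogram(img, bins=256, range=(0, 256))

# Dividing by the pixel count gives an empirical probability
# distribution over gray levels
density = counts / img.size
```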
The performance of a classification system depends chiefly on selecting
an appropriate set of features which best describe their associated classes.
Redundant and irrelevant features may degrade the classification performance
(Devijver and Kittler 1982). Feature selection speeds up the processing time and
increases the reliability of a classifier by eliminating redundant and
irrelevant information. A feature selection procedure should extract the most
useful information from the representation vector and present it in the form
of a pattern vector of lower dimensionality whose elements represent only the
most significant aspects of the input data.
Mathematical feature selection techniques are classified into two major
categories: feature selection in the measurement space, and feature selection in
the transformed space. The methods in the first category are referred to as
"feature selection" and the methods in the latter category are known as
"feature extraction" methods. Feature selection in the measurement space is
achieved by eliminating those measurements which are redundant or do not
contain enough relevant information. In this process the subset X of n
features:

X = {x1, x2, ..., xn}

is selected from the N-feature pattern Y = {y1, y2, ..., yN}.
The subset X should have the best possible combination of the features of Y for
minimizing the classification error (Kittler 1975). To find the best possible
subset, one should try all possible combinations of features. In most cases this is
not practical, because the required number of trials,

N! / [n! (N - n)!],

is very large. For practical situations, some computationally feasible
procedures have been suggested to select a sub-optimal subset from the
original features. These methods are explained in detail by Kittler (1975) and
Devijver and Kittler (1982).
Mucciardi and Gose (1971) compared several different feature selection
techniques. They indicated that the error rates of the features selected by any of
the investigated techniques were lower than the error rate of randomly
selected features. They concluded that all of those feature selection techniques were
applicable to most pattern classification problems, but the choice of a particular
selection method depends on the ease of implementation, economical
considerations, and the particular application. Three different feature selection
methods, identified for their possible applicability in this research, are briefly
explained below.
2.5.1 Interclass/Intraclass Method
A class of feature selection criteria in the measurement space which
are more heuristic in nature is based on the Euclidean distance between the
elements of the class sets. These criteria originate from the intuitive argument
that "the greater the distance between the elements of different classes, the better
the class separability". Based on this argument, a "good" feature should
provide a large distance between elements of different classes (the interclass
distance) while the distance between elements within a single class (the
intraclass distance) is as small as possible. This concept for a two-class case
using two features (X1 and X2) is presented in Fig. 2.4. In this figure d1 is the
intraclass distance between a member of class 1 and the mean of this class, M1.
The distance between the means of the two classes, d(M1, M2), is the interclass
distance for the two classes.
Fig. 2.4 The concept of interclass and intraclass distances.
Alternatively, it can be stated that good features should have small
scatter within their class and large scatter between classes. One way to select
the best features is to use the Fisher criterion, defined by:

J = (M1 - M2)^T (K1 + K2)^-1 (M1 - M2)     (2.4)

where K1 and K2 are the class covariance matrices, and M1 and M2 are the respective
means of the two classes. The term K1 + K2 in Eq. 2.4 is the within-class scatter
matrix and (M1 - M2)(M1 - M2)^T is the between-class scatter matrix (Therrien
1989). Thus, good features should maximize the Fisher criterion function. The
greater the ratio of the interclass and intraclass scatters, the greater the spatial
separation of the classes. Therefore, features can be ranked based on this
criterion and those with a higher ratio can be considered good features. A
detailed discussion of this criterion and its mathematical formulation is given
by Devijver and Kittler (1982).
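A small numerical sketch of this criterion (the two-feature, two-class samples are made up; `np.cov` here computes the sample covariance matrices K1 and K2):

```python
import numpy as np

# Hypothetical feature vectors for two classes (rows = samples)
class1 = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]])
class2 = np.array([[6.0, 0.0], [7.0, 1.0], [8.0, 0.0]])

m1, m2 = class1.mean(axis=0), class2.mean(axis=0)
k1 = np.cov(class1, rowvar=False)
k2 = np.cov(class2, rowvar=False)

# Fisher criterion J = (M1-M2)^T (K1+K2)^-1 (M1-M2); a larger J means
# the classes are farther apart relative to their within-class scatter
d = m1 - m2
J = float(d @ np.linalg.inv(k1 + k2) @ d)
```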
2.5.2 Forward Selection
In the sequential forward feature selection method, the features are
ranked based on their ability to separate the classes using an appropriate
classifier. To begin with, the feature with the highest rank is selected. If the
classification accuracy using this feature is not satisfactory, the feature is
selected from the remaining set which gives the highest classification rate
together with the current feature. The procedure continues until the selected
set of features yields an acceptable classification accuracy. There are two main
drawbacks to this method: first, there is no mechanism for removing a feature
that was already selected; and second, correlation between features is not
taken into account (Devijver and Kittler 1982).
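The greedy loop described above can be sketched as follows; the scoring callback stands in for whatever classifier measures accuracy, and all names here are illustrative rather than from the thesis:

```python
def forward_select(features, score, target_accuracy):
    """Sequential forward selection.
    features: candidate feature indices.
    score(subset): classification accuracy obtained with that subset.
    Stops once accuracy is acceptable or no candidate improves it."""
    selected, remaining = [], list(features)
    best = 0.0
    while remaining and best < target_accuracy:
        # Candidate giving the highest rate together with the current set.
        trial = max(remaining, key=lambda f: score(selected + [f]))
        gain = score(selected + [trial])
        if gain <= best:          # no improvement: stop early
            break
        selected.append(trial)
        remaining.remove(trial)
        best = gain
    return selected, best

# Toy score: accuracy depends mainly on whether feature 2 is included.
score = lambda s: 0.5 + 0.4 * (2 in s) + 0.05 * (0 in s)
print(forward_select([0, 1, 2], score, target_accuracy=0.9))
```

The first drawback mentioned above is visible here: once a feature enters `selected`, it is never reconsidered.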
2.5.3 Backward Elimination
In the sequential backward elimination procedure, the performance of a
classifier is first tested with the whole set of features. Then the feature that
gives the lowest classification rate with the rest of the features is eliminated.
The process of successive elimination continues until any further feature
elimination results in an unacceptable classification rate. This procedure has
the same drawbacks as the sequential forward method. Furthermore, both
sequential forward and backward selection methods are time consuming
(Mucciardi and Gose 1971). The STEPDISC procedure provided by SAS
(Statistical Analysis System, SAS Institute, Inc., Cary, NC, USA) can be used to
perform the forward selection and backward elimination methods.
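The mirror-image procedure can be sketched the same way (again with an illustrative scoring callback, not the SAS implementation):

```python
def backward_eliminate(features, score, min_accuracy):
    """Sequential backward elimination.
    Repeatedly drops the feature whose removal hurts accuracy least,
    stopping before accuracy would fall below min_accuracy."""
    selected = list(features)
    while len(selected) > 1:
        # Feature whose removal leaves the highest classification rate.
        worst = max(selected,
                    key=lambda f: score([g for g in selected if g != f]))
        reduced = [g for g in selected if g != worst]
        if score(reduced) < min_accuracy:
            break                 # further elimination is unacceptable
        selected = reduced
    return selected

# Toy score: only feature 2 matters for accuracy.
score = lambda s: 0.6 + 0.3 * (2 in s)
print(backward_eliminate([0, 1, 2], score, min_accuracy=0.85))
```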
2.6 Feature Extraction
Feature selection performed in transformed space is known as "feature
extraction". Feature extraction methods eliminate the irrelevant information
and redundancy in a pattern Y by mapping it into a lower-dimensional
pattern X by a transformation T:

X = T(Y)    (2.5)

In general, the map T could be any vector function of Y which maximizes
an appropriate separability measure in the feature space (Kittler 1975).
The Karhunen-Loeve expansion is the most widely used method for
feature extraction. The idea behind this method is to reduce the dimensionality
of the features by creating new features which are linear combinations of the
original features (Mucciardi and Gose 1971). Other methods of feature
extraction, such as those based on separability measures and those based on
non-orthogonal mapping, are explained by Kittler (1975).
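For two original features, the Karhunen-Loeve idea reduces to finding the leading eigenvector of the covariance matrix and projecting each pattern onto it. A present-day sketch using power iteration (an assumed, simplified stand-in for a full eigendecomposition; all names are illustrative):

```python
import math

def principal_direction(data, iters=100):
    """Leading eigenvector of the covariance matrix of 2-D data,
    found by power iteration."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    centered = [[x - m for x, m in zip(row, means)] for row in data]
    # 2x2 sample covariance matrix
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1)
            for j in range(2)] for i in range(2)]
    v = [1.0, 1.0]
    for _ in range(iters):
        w = [cov[0][0] * v[0] + cov[0][1] * v[1],
             cov[1][0] * v[0] + cov[1][1] * v[1]]
        norm = math.hypot(*w)
        v = [w[0] / norm, w[1] / norm]
    return means, v

def project(data, means, v):
    """Map each 2-D pattern to a single new feature: a linear
    combination of the (centered) original features."""
    return [(row[0] - means[0]) * v[0] + (row[1] - means[1]) * v[1]
            for row in data]
```

For data lying along a line, the single projected feature captures essentially all of the scatter, which is the dimensionality reduction the text describes.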
Feature extraction methods were not considered for this research for the
following reason. The transformation applied during feature extraction
procedures creates a new set of features from the originals. When the
transformed features are used as input to a classifier, it is not easy to
investigate the behavior of the classifier with changes in the original features.
In addition, use of feature extractors as a preprocessing device in on-line
classification is computationally expensive.
2.7 Classifiers
Classifiers are algorithms implemented on digital computers for
purposes of classification. Classification algorithms are developed and applied
in two stages. In the first stage, called the training stage, the required
classification parameters are estimated from a set of patterns called the
"training set". During the second stage, called the test stage, the algorithm uses
the parameters to determine the class of a new set of patterns called the "test
set". Once a classifier gives an acceptable accuracy for the test data, it can then
be used for real-world applications.
There are many different types of classifiers, and they are classified and
explained in various pattern recognition books and research papers (e.g.,
Devijver and Kittler 1982; Nilsson 1990). Determining which classifier works
best for a particular application usually involves some degree of trial and
error. Most classifiers, when applied to a particular problem, result in
comparable classification accuracies. The real difference between them lies in
their time complexity, storage requirements, and precise degree of accuracy
(Hush and Horne 1993). A brief review of different classification methods and
their applications is given in the following sections.
2.7.1 Bayesian Classifiers
In statistical pattern recognition, a classifier assigns the unknown
represented by the pattern X to the class ω according to the Bayes decision
rule. For a two-class case the Bayes decision rule is given by:

Assign X to ω1 if P(ω1) p(X|ω1) > P(ω2) p(X|ω2); otherwise assign X to ω2.    (2.6)

where p(X|ωi) is the conditional density function and P(ωi) is the a priori
probability of class i. In this case, i has a value equal to 1 or 2. Equation 2.6
states: assign X to the class ω1 if P(ω1) p(X|ω1) is larger than P(ω2) p(X|ω2).
It is shown (Duda and Hart 1973) that this decision rule minimizes the
probability of error, i.e., the probability of making an incorrect decision. The
Bayes decision rule for multi-class classification problems is presented in
Fig. 2.5. When classes are separated by "discrimination functions", e.g.,
normal distribution functions, then these functions are used in the Bayes
decision rule instead of the a posteriori probabilities.
Fig. 2.5 Bayes decision rule for multi-class problems.
The Gaussian classifier is one of the most frequently used classifiers in
pattern classification problems. This classifier, a special case of the Bayes
decision rule, assumes that individual features have a Gaussian distribution
(Therrien 1989). To implement this classifier one need only estimate the mean
vector and the covariance matrix for each class. These parameters define the a
posteriori probabilities that are substituted into the Bayes decision rule. The
classifier assigns the unknown to the class with the higher probability. When
classes have different covariances, the decision boundary may be in the form
of an ellipsoid, hyperboloid, paraboloid, or some combination of these
(Therrien 1989). When the covariance matrices are equal, the decision
boundaries between classes reduce to hyperplanes. More details on this
classifier are given in section 3.6.1.
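A toy implementation of this classifier, assuming (for brevity) a diagonal covariance matrix so that each feature contributes an independent Gaussian term to the log-discriminant; the names and data are illustrative, not from the thesis:

```python
import math
from statistics import mean, stdev

def train(classes):
    """classes: dict of class name -> list of pattern vectors.
    Estimates the prior, mean vector, and per-feature standard
    deviation (diagonal covariance) for each class."""
    total = sum(len(rows) for rows in classes.values())
    params = {}
    for name, rows in classes.items():
        cols = list(zip(*rows))
        params[name] = {"prior": len(rows) / total,
                        "mu": [mean(c) for c in cols],
                        "sigma": [stdev(c) for c in cols]}
    return params

def classify(x, params):
    """Assign x to the class with the largest Gaussian log-discriminant,
    a monotone stand-in for the a posteriori probability."""
    def g(p):
        s = math.log(p["prior"])
        for xj, mu, sd in zip(x, p["mu"], p["sigma"]):
            s -= math.log(sd) + (xj - mu) ** 2 / (2 * sd ** 2)
        return s
    return max(params, key=lambda name: g(params[name]))
```

With full covariance matrices the same structure holds, with the quadratic term replaced by the Mahalanobis distance discussed in section 3.6.1.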
Brogan and Edison (1974) used the Bayesian decision rule for classifying
six different types of grains. Segerlind and Weinberg (1973) used the
Mahalanobis distance (see section 3.6.1) for separating kernels of corn, oats,
wheat, barley, rye, soybeans, and navy beans from each other. Singh et al.
(1993) applied the technique of Bayes minimum risk classification for defect
grading of stonefruit.
In agricultural research it is usually assumed that the variables are
normally distributed. Therefore, many researchers have applied Gaussian
classifiers to perform classification of different agricultural commodities. The
DISCRIM procedure in SAS has been widely used for these purposes.
Examples of the use of discriminant analysis include: separating early split
from normal split pistachio nuts by Pearson et al. (1993); evaluating snack
quality by Sayeed et al. (1995); and discriminating canola and mustard seeds
by Hehn and Sokhansanj (1990).
2.7.2 Classifiers Using Discrimination Functions
Discrimination functions have the property that they partition the
pattern space into mutually exclusive regions, where each region contains the
domain of a given class. In classification applications, a posteriori
distributions can be replaced by the discrimination functions (Duda and Hart
1973). Then a
classifier is viewed as a machine that computes C discriminant functions,
g1, ..., gC (Fig. 2.6). A maximum selector is used to assign the pattern X to the
category associated with the largest discriminant.
Fig. 2.6 Machine classification of patterns using discrimination functions.
Linear discrimination functions are the simplest form of discrimination
functions. The decision boundary formed by a linear discrimination function
in a two-dimensional feature space is a line, in a three-dimensional feature
space it is a plane, and in a multi-dimensional feature space it is a hyperplane.
The general form of a linear discrimination function is:

g(X) = w0 + w1x1 + w2x2 + ... + wnxn    (2.7)

where the w's are the weights of the function and the xi's are the features
constituting the multi-dimensional feature space. The function is completely
specified by determining the values of the weights. A classifier which
implements linear discrimination functions is sometimes referred to as a
linear classifier or linear machine.
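A linear machine is only a few lines of code: each class holds a weight vector, and a maximum selector picks the class whose g is largest. The weight values below are arbitrary placeholders, not trained parameters:

```python
def g(weights, x):
    """Linear discrimination function: g(X) = w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))

def linear_machine(x, class_weights):
    """Maximum selector: assign x to the class with the largest g."""
    return max(class_weights, key=lambda name: g(class_weights[name], x))

# Hypothetical weights [w0, w1, w2] for two classes in a 2-D space.
weights = {"class1": [0.5, 1.0, -1.0], "class2": [-0.5, -1.0, 1.0]}
print(linear_machine([2.0, 0.5], weights))  # prints "class1"
```

Determining the weight values is the training problem discussed next.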
The selection of a suitable functional form for a discrimination function
is a problem of crucial importance. The theory for the construction of
discrimination functions has not reached an effective stage. Prior knowledge
about the a posteriori distributions of the classes is always helpful in building
proper discrimination functions (Devijver and Kittler 1982). Sometimes
reasonable guesses are made on the basis of qualitative knowledge about the
patterns (Nilsson 1990). An example of the application of linear discrimination
functions is the classification of different industrial objects represented by
autoregression models (Persoon and Fu 1986).
Piecewise linear discrimination functions have been used for
classification of objects whose classes are separated by non-linear decision
boundaries. In this method the classifier partitions the feature space into a
number of regions using a set of hyperplanes (Nilsson 1990). Haralick et al.
(1973) used piecewise linear discrimination functions for texture classification
of five kinds of sandstones.
2.7.3 Nearest Neighbor Classifiers
The Nearest Neighbor Classifier (NNC) makes use of the correspondence
between similarity and distance, i.e., the smaller the Euclidean distance
between patterns, the more similar they are (Batchelor 1974). The nearest
neighbor decision rule assigns an unknown U to the class of its nearest
neighbor X:

U ∈ class(i) if d(U, Xi) = min_j d(U, Xj),  j = 1, 2, ..., C    (2.8)
where d(U, X) is a distance measure between U and X and C is the number of
classes. Various distance measures (metrics) can be defined which can be used
in Eq. 2.8. The Euclidean, city block, and Mahalanobis distances are examples
of different metrics. A complete discussion of different types of metrics is
given by Devijver and Kittler (1982).
The basic idea behind nearest neighbor rules is that samples which fall
close together in feature space are likely to belong to the same class. An NNC
stores a number of patterns for each class. Then an unknown is compared to
all of the stored patterns and assigned to the class of the pattern which is most
similar to the unknown. The decision surface created by an NNC is piecewise
linear (Batchelor 1974).
The k-nearest neighbor (k-NN) classifier is an extension of the NNC. The
k-NN rule classifies X by assigning it the class most frequently represented
among the k nearest samples. In other words, a decision is made by examining
the labels of the k nearest neighbors and taking a vote. A complete description
of the NNC and k-NN is given by Devijver and Kittler (1982).
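Both rules fit in a few lines of present-day Python (Euclidean metric assumed; the training data here are illustrative). With k = 1 this reduces to the basic NNC:

```python
import math
from collections import Counter

def knn_classify(u, training, k=3):
    """training: list of (pattern, label) pairs. Assigns u the label
    most frequent among its k nearest neighbors (Euclidean metric)."""
    nearest = sorted(training, key=lambda t: math.dist(u, t[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

training = [([0.0, 0.0], "a"), ([0.1, 0.1], "a"), ([0.2, 0.0], "a"),
            ([5.0, 5.0], "b"), ([5.1, 5.0], "b"), ([5.0, 5.2], "b")]
print(knn_classify([0.05, 0.05], training))  # prints "a"
```

Note that the full training set must be stored and searched for every unknown, which is the storage and time cost mentioned earlier.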
2.7.4 Minimum Distance
The minimum distance classifier, like the NNC, makes use of the
correspondence between similarity and distance. However, in this method a
prototype is considered for each class, and an unknown U is assigned to the
class, i, of the prototype Mi which has the minimum distance to it. That is:

U ∈ class(i) if d(U, Mi) = min_j d(U, Mj),  j = 1, 2, ..., C    (2.9)

where C is the number of classes. Minimum distance classifiers are
appropriate in situations where each class is represented by a single prototype
pattern around which all other patterns in that class tend to cluster.
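Taking the class mean as the prototype, the rule of Eq. 2.9 can be sketched as follows (illustrative names, Euclidean metric assumed):

```python
import math

def class_prototypes(classes):
    """Prototype Mi = mean vector of each class's training patterns.
    classes: dict of class name -> list of pattern vectors."""
    protos = {}
    for name, rows in classes.items():
        cols = list(zip(*rows))
        protos[name] = [sum(c) / len(c) for c in cols]
    return protos

def min_distance_classify(u, protos):
    """Assign u to the class whose prototype is nearest (Eq. 2.9)."""
    return min(protos, key=lambda name: math.dist(u, protos[name]))
```

Unlike the NNC, only one prototype per class is stored, which makes this the cheapest of the distance-based rules.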
Gupta and Srinath (1987) used the minimum distance rule for
classification of 2D shapes using contour sequence moments. Persoon and Fu
(1986) also used the minimum distance rule, but they used Fourier descriptors
for 2D shape representation. String matching techniques, which have been
used for 2D shape recognition (Bunke and Bühler 1993; Mase 1991), use the
minimum distance classification method.
2.7.5 Decision Tree Classifiers
Classification trees constitute an important and popular form of
hierarchical classifiers. A decision tree classifier utilizes a series of simple
decision functions, usually binary in nature, to determine the class of an
unknown pattern (Levin 1981). The evaluation of these decision functions is
initiated from the tree root node and branches out through the internal nodes
toward the terminal nodes. The decision function at each node is evaluated in
such a way that the outcome of successive decision functions reduces
uncertainty about the unknown pattern. The most common choice for the
node decision functions is a threshold comparison on a component of the
feature vector. The thresholds are usually determined by examining a training
sample, and their accuracies are validated using a test set.
The classification capability of a tree classifier arises from its ability to
partition the feature space into complex regions by making a sequence of
simple decisions at each node. Some advantages of classification trees are low
storage requirements, simple decisions at the nodes, and ease of
understanding of the classification process. The disadvantages are abrupt
decisions at the nodes, comparison of continuous features against a threshold
to determine branching, uncertainty about the thresholds, difficulty with
missing features, and an increase in complexity with increasing tree size
(Gelfand and Delp 1991).
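The node structure described above — a binary threshold test on one feature component, descending from the root to a terminal node — can be sketched as follows; the tree shape and threshold values are illustrative stand-ins for values estimated from a training sample:

```python
def make_node(feature, threshold, low_branch, high_branch):
    """Internal node: binary threshold comparison on one feature."""
    return {"f": feature, "t": threshold, "lo": low_branch, "hi": high_branch}

def tree_classify(x, node):
    """Descend from the root through internal nodes until reaching a
    terminal node (a plain string class label)."""
    while isinstance(node, dict):
        node = node["lo"] if x[node["f"]] <= node["t"] else node["hi"]
    return node

# Root splits on feature 0; the right branch refines on feature 1.
tree = make_node(0, 2.5, "class1",
                 make_node(1, 1.0, "class2", "class3"))
print(tree_classify([3.0, 2.0], tree))  # prints "class3"
```

Each path from root to leaf corresponds to one of the axis-aligned regions into which the tree partitions the feature space.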
There have been numerous research efforts to apply the principles of
machine vision and pattern recognition to the classification of agricultural
products. One of the pioneering works in this area is the classification of grain
kernels by Segerlind and Weinberg (1973). They investigated the feasibility of
identifying grain kernels by analyzing their profile using a Fourier series
expansion of the periphery radius. Tests were performed on samples of corn,
oats, wheat, barley, rye, soybeans, and navy beans. The values of the first ten
Fourier coefficients (harmonics) were obtained for a training set and a test set.
The authors mentioned that the application of the method to intraspecies
discrimination was only partially successful, with errors ranging from 11 to
25%. They also indicated that the implementation of this technique was too
tedious for routine use. Considering the fact that this research was performed
more than two decades ago, at the present time, with the existence of powerful
computers and advances in software engineering, the burden of this type of
research has been greatly reduced.
Images of wheat kernels in plan-form view were acquired and processed
to extract kernel shape and size characteristics to discriminate different
classes and varieties of wheat by Neuman et al. (1987). Feature extraction
algorithms based on object contours were developed and implemented using
the FORTRAN 77 computer language. In addition to kernel spatial size and
shape parameters as discriminating features, contour curvature was further
quantified in the frequency domain to obtain Fourier descriptors as
shape-specific features. Statistical pattern recognition methods were used for
discriminant analysis, which resulted in an overall performance of 87%. This
research is important because determining the variety of an agricultural
product is more difficult than separating different kinds of products.
Pattern recognition techniques were used for automatic classification of
six different grains by Brogan and Edison (1974). An algorithm based on a
recursive learning technique and the Bayesian decision rule was developed,
and a prototype device was built for rapid, accurate, and automatic
classification of the grains. Of the six grains, corn and soybeans were perfectly
identified. Wheat, oats, barley, and rye were much more similar and were
more likely to be misclassified. An overall accuracy of about 98% was
obtained.
A method using machine vision for detecting the fertility of hatching
eggs during the third and fourth days of incubation was developed by Das and
Evans (1992). In this method, images of eggs were acquired using back-lighting
with a high-intensity candling lamp. Parameters of the gray-level histograms
were estimated to describe the shape of the histograms. An algorithm was
developed using the estimated parameters to distinguish fertile from infertile
eggs. The algorithm gave prediction accuracies of 96% using the fourth-day
data and 88% using the third-day data.
Han et al. (1992) developed a method for detecting split-pit peaches
using a machine vision system. X-ray films of peaches were placed under a
video camera and the gray-level histograms of the images were analyzed.
Using the histogram data, a threshold equation was developed to separate the
split-pit peaches from the unsplit-pit peaches. An accuracy of 98% was
reported. The limitation of this method was in the proper orientation of the
peaches under the X-ray camera.
Sarkar and Wolfe (1985) developed a prototype tomato sorting machine.
Tomatoes moving over a belt conveyor were detected by a photocell, which
signaled a computer to grab two views, stem end and blossom end, and
analyze them. Based on the algorithm output, the computer activated
solenoid-operated pneumatic cylinders. The cylinder action raised the
appropriate inclined flap and dropped the tomatoes into their corresponding
boxes. The authors indicated that the developed prototype was not suitable
for commercial purposes because of speed limitations. There was no mention
of whether the speed limitation was due to computer processing time or the
hardware implementation.
Gunasekaran et al. (1987) applied image processing techniques for
detecting stress cracks in corn. The algorithm used for stress crack evaluation
was similar to a high-pass filtering process. The pixels representing stress
cracks had significantly different gray-level values than the pixels of the rest
of the kernel surface. Therefore, the gray levels of the pixels representing the
stress cracks were extracted by first creating an image suppressing the
gray-scale levels of the stress crack part and then subtracting this newly
created image from the original image. The success rate was determined by
comparing the visual evaluation of the kernels for stress cracks with the
vision system's evaluation of the same set of kernels. The algorithm
performed satisfactorily, detecting stress cracks in 90% of the examined
kernels.
A vision system and algorithm were developed to locate fruit on a tree
(Sites and Delwiche 1988). Images were obtained using a solid-state camera.
Several bandpass optical filters