
  • MACHINE VISION CLASSIFICATION OF PISTACHIO NUTS

    USING

    PATTERN RECOGNITION AND NEURAL NETWORKS

    A Thesis

    Submitted to the College of Graduate Studies and Research

    in Partial Fulfillment of the Requirements

    for the Degree of

    Doctor of Philosophy

    in the

    Department of Agricultural and Bioresource Engineering

    University of Saskatchewan

    Saskatoon, Saskatchewan, Canada

    by

    Ahmad Ghazanfari-Moghaddam

    Fall, 1996

    © Copyright Ghazanfari, Ahmad, 1996. All rights reserved.

  • National Library of Canada / Bibliothèque nationale du Canada
    Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques

    395 Wellington Street, Ottawa ON K1A 0N4, Canada

    The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

    The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


  • University of Saskatchewan

    College of Graduate Studies and Research

    SUMMARY OF DISSERTATION

    Submitted in Partial Fulfillment

    of the Requirements for the

    DEGREE OF DOCTOR OF PHILOSOPHY

    by

    Ahmad Ghazanfari-Moghaddam

    Department of Agricultural and Bioresource Engineering

    University of Saskatchewan

    Summer, 1996

    Examining Committee:

    Dr. L.A. Kells

    Dr. E. Barber

    Dr. J. Irudayaraj

    Dr. D. Wulfsohn

    Dr. R. Bolton

    Dr. R. Ford

    Dr. A. Kusalik

    External Examiner:

    Dean/Dean's Designate, Chair, College of Graduate Studies and Research

    Chair of Advisory Committee, Department of Agricultural and Bioresource Engineering

    Co-supervisor, Department of Agricultural and Bioresource Engineering

    Co-supervisor, Department of Agricultural and Bioresource Engineering

    Department of Electrical Engineering

    Department of Agricultural and Bioresource Engineering

    Department of Computer Science

    Dr. D.S. Jayas Department of Biosystems Engineering University of Manitoba Winnipeg, Manitoba, R3T 5V6

  • MACHINE VISION CLASSIFICATION OF PISTACHIO NUTS

    USING

    PATTERN RECOGNITION AND NEURAL NETWORKS

    Machine vision-based sorting of agricultural commodities is an alternative to the conventional mechanical and electro-optical sorting methods. This method offers high-speed, multi-category classification by processing multiple features obtained through image processing algorithms. The purpose of this thesis was to select an appropriate set of features and to investigate different classification schemes for efficient machine vision-based sorting of pistachio nuts.

    Kerman cultivar pistachio nuts obtained from California were used in this study. A sample of nuts was weighed and manually sorted into four classes: "Grade One" (G1), "Grade Two" (G2), "Grade Three" (G3), and "unsplit nuts" (UN). Each class consisted of 260 nuts. Morphological features (area, length, width, perimeter, and roundness), Fourier descriptors (FD's) of the boundary, and gray-level histograms were extracted from images of the nuts using a Macintosh-based machine vision system and commercial image processing software.

    The discrimination power of the individual sets of features for separating the four classes was investigated using Gaussian classifiers. The morphological features and FD's resulted in relatively low classification accuracies. The gray-level histograms yielded an average classification accuracy of 98.5%. Analysis of the classification results indicated that morphological features had a high potential for separating G1, G2, and G3 from each other, and that the FD's had high discrimination power for separating the split nuts from the unsplit.

  • Different feature selection methods, including forward selection, backward elimination, Fisher criterion, and graphical analysis, were applied to select a suitable subset of features. The feature selection results indicated that a combination of seven selected FD's and the area (7FD's & A), or a combination of the frequency of the gray level 56 and the area (GL-56 & A), was suitable for separating the four classes. The selected features were used as input to different classifiers such as Gaussians, decision trees, multi-layer neural networks (MLNN), and multi-structure neural networks (MSNN). A procedure for calculating the computational complexity of the classifiers was developed. The classifiers were compared in terms of performance and computational complexity.

    A decision tree classifier using GL-56 & A resulted in 91.7% classification accuracy. The same features using MLNN and MSNN resulted in 92.4% and 93.2% accuracy, respectively. GL-56 & A using a Gaussian classifier resulted in an overall classification accuracy of 89.6%. Using 7FD's & A, the classification accuracies were 82.8%, 88.7%, 94.1%, and 95.0% for Gaussian, decision tree, MLNN, and MSNN classifiers, respectively.

    The decision tree classifiers required the least amount of computational time, but relied heavily on the threshold values supplied by the user. The neural network classifiers, in sequential executions, required higher computational time, but in terms of classification accuracy they were superior to the statistical classification methods. The MSNN classifiers were the most suitable method for this multi-category classification problem. These classifiers learned their input-output mapping faster and were more robust when compared to MLNN classifiers.

  • COPYRIGHT

    The author has agreed that the Library, University of Saskatchewan, may

    make this thesis freely available for inspection. Moreover, the author has agreed

    that permission for extensive photocopying of this thesis for scholarly purposes

    may be granted by the professor or professors who supervised the thesis work

    recorded herein or, in their absence, by the Head of the Department or the Dean

    of the college in which the thesis work was done. It is understood that due

    recognition will be given to the author of this thesis and to the University of

    Saskatchewan in any use of the material in this thesis. Copying or publishing or

    any use of the thesis for financial gain without approval by the University of

    Saskatchewan and the author's written permission is prohibited.

    Request for permission to copy or to make any other use of the material in

    this thesis in whole or in part should be addressed to:

    Head of the Department of Agricultural and Bioresource Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, Saskatchewan S7N 5A9, Canada

  • Machine vision-based sorting of agricultural commodities is an alternative to the conventional mechanical and electro-optical sorting methods. This method offers high-speed, multi-category classification by processing multiple features obtained through image processing algorithms. The purpose of this thesis was to determine an appropriate set of features and to investigate different classification schemes for efficient machine vision-based sorting of pistachio nuts.

    Kerman cultivar pistachio nuts obtained from California were used in this study. A sample of nuts was weighed and manually sorted into four classes: "Grade One" (G1), "Grade Two" (G2), "Grade Three" (G3), and "unsplit nuts" (UN). Each class consisted of 260 nuts. Morphological features (area, length, width, perimeter, and roundness), Fourier descriptors (FD's) of the boundary, and gray-level histograms were extracted from images of the nuts using a Macintosh-based machine vision system and commercial image processing software.

    The discrimination power of the individual sets of features for separating the four classes was investigated using Gaussian classifiers. The morphological features and FD's resulted in relatively low classification accuracies. The gray-level histograms yielded an average classification accuracy of 98.5%. Analysis of the classification results indicated that morphological features had a better potential for separating G1, G2, and G3 from each other, while the FD's had a higher discrimination power for separating the split nuts from the unsplit.

  • Different feature selection methods, including forward selection, backward elimination, Fisher criterion, and graphical analysis, were applied to select a suitable subset of features. The feature selection results indicated that a combination of seven selected FD's and the area (7FD's & A), or a combination of the frequency of the gray level 56 and the area (GL-56 & A), was suitable for separating the four classes. The selected features were used as input to different classifiers such as Gaussians, decision trees, multi-layer neural networks (MLNN), and multi-structure neural networks (MSNN). A procedure for calculating the computational complexity of the classifiers was developed. The classifiers were compared in terms of performance and computational complexity.

    A decision tree classifier using GL-56 & A resulted in 91.7% classification accuracy. The same features using MLNN and MSNN resulted in 92.4% and 93.2% accuracy, respectively. GL-56 & A using a Gaussian classifier resulted in an overall classification accuracy of 89.6%. Using 7FD's & A, the classification accuracies were 82.8%, 88.7%, 94.1%, and 95.0% for Gaussian, decision tree, MLNN, and MSNN classifiers, respectively.

    The decision tree classifiers required the least amount of computational time, but relied heavily on the threshold values supplied by the user. The neural network classifiers, in sequential executions, required higher computational time, but in terms of classification accuracy they were superior to the statistical classification methods. The MSNN classifiers were the most suitable method for this multi-category classification problem. These classifiers learned their input-output mapping faster and were more robust compared to MLNN classifiers.

  • ACKNOWLEDGMENT

    Praise be to God, the Beneficent, the Merciful.

    I would like to express my appreciation and gratitude to the individuals who assisted, encouraged, guided, and supported me throughout my education. Special appreciation is expressed to my Ph.D. advisors, Professor J. Irudayaraj (Dept. of Bio. & Irrig. Engin., Utah State University) and Professor D. Wulfsohn (Dept. of Ag. & Bio. Engin.), for their excellent supervision and advice.

    My deepest thanks and gratitude are extended to the members of my advisory and examining committees: Professor E. Barber (Dept. of Ag. & Bio. Eng.), Professor R. Bolton (Dept. of Electrical Eng.), Professor R. Ford (Dept. of Ag. & Bio. Eng.), Professor A. Kusalik (Dept. of Computer Science), and Professor D.S. Jayas (Dept. of Biosystems Eng., Univ. of Manitoba), the external examiner, for their guidance and support.

    Many thanks to Professor S. Sokhansanj (Dept. of Ag. & Bio. Eng.) for use of the Bioprocess facilities; to Mr. M. Romaniuk, former research engineer (Dept. of Ag. & Bio. Eng.), for his assistance in using the machine vision system; to Professor H.C. Wood (Dept. of Electrical Eng.) for use of the ANN software; and to the personnel and graduate students at the Department of Agricultural and Bioresource Engineering, for their support.

    Appreciation is further extended to the Ministry of Culture and Higher

    Education, Islamic Republic of Iran, for granting a Ph.D. scholarship to cover my entire living and educational expenses.

    Thanks are due to the California Pistachio Commission and Paramount Farms for supplying the pistachio nuts required by this project, as well as various brochures on pistachio nuts.

    Finally, I would like to express my appreciation to my beloved wife, Shahin, and to my two little daughters, Nadia (8) and Farida (6), for their understanding, patience, and support, and for being so nice.

    God bless all of you.

  • TABLE OF CONTENTS

    COPYRIGHT
    ABSTRACT
    ACKNOWLEDGMENT
    TABLE OF CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    LIST OF FREQUENTLY USED ABBREVIATIONS

    1 INTRODUCTION AND OBJECTIVES
      1.1 Introduction
      1.2 Objectives
      1.3 An Overview of The Thesis Chapters

    2 BACKGROUND AND REVIEW OF LITERATURE
      2.1 Pistachio Nuts
        2.1.1 Cultural Practices
        2.1.2 Post-Harvest Processing of Pistachio Nuts
        2.1.3 USDA Standards for Grading Pistachio Nuts
      2.2 Inspection by Machine Vision
      2.3 Image Acquisition
      2.4 Classification Features
        2.4.1 External Image Features
          2.4.1.1 Morphological Features
          2.4.1.2 Fourier Descriptors
          2.4.1.3 Boundary Sequences
          2.4.1.4 Boundary Chain Codes
        2.4.2 Internal Image Features
          2.4.2.1 Moments
          2.4.2.2 Textural Features
          2.4.2.3 Gray-Level Histograms
      2.5 Feature Selection
        2.5.1 Interclass/Intraclass Method
        2.5.2 Forward Selection

      3.6 Primary Classifications Using Gaussian Classifier
        3.6.1 Gaussian Classifier
      3.7 Feature Selection
        3.7.1 Feature Selection for Morphological Features
        3.7.2 Feature Selection for FD's
        3.7.3 Feature Selection for Gray-level Histograms
      3.8 Classification Using Selected Features
        3.8.1 Decision Tree Classifiers
        3.8.2 Neural Network Classifiers
          3.8.2.1 Selecting a Network Topology
          3.8.2.2 Selecting a Learning Step
          3.8.2.3 Training a Neural Network
        3.8.3 MSNN Classifier
      3.9 Classification Performance Evaluation
      3.10 Computational Complexity Calculations
        3.10.1 Computational Complexity for Gaussian Classifier
        3.10.2 Computational Complexity for MLNN and MSNN
      3.11 Summary

    4 RESULTS AND DISCUSSION
      4.1 Introduction
      4.2 Random Position Experiments
        4.2.1 Classifications Using Morphological Features
        4.2.2 Classifications Using FD's
        4.2.3 Conclusions from Random Position Experiments
      4.3 Controlled Position Experiments
        4.3.1 Classification Using Morphological Features
        4.3.2 Classification Using Fourier Descriptors
        4.3.3 Classification Using Selected FD's and Area
        4.3.4 Classification Using Gray-level Histogram Data
      4.4 Exploring the Training Behavior of MLNN
        4.4.1 Learning Step
        4.4.2 Data Arrangement
        4.4.3 Effects of Network Topology on Training
      4.5 Classification of Pistachio Nuts by MLNN
        4.5.1 GL-56 & A
        4.5.2 7FD's & A
      4.6 Classification Using MSNN
        4.6.1 GL-56 & A
        4.6.2 7FD's & A
      4.7 Comparison of Classifiers' Performances
        4.7.1 Performance of Gaussian Classifiers
        4.7.2 Performance of Decision Tree Classifiers
        4.7.3 Performance of the MLNN Classifiers
        4.7.4 Performance of MSNN Classifiers
      4.8 Summary

    5 SUMMARY AND CONCLUSIONS
      5.1 Summary
      5.2 Conclusions
      5.3 Recommendations

    6 REFERENCES

    APPENDIX A: USDA STANDARDS FOR GRADING PISTACHIO NUTS
    APPENDIX B: DELTA LEARNING RULE AND ERROR BACK-PROPAGATION FOR MULTI-LAYER NEURAL NETWORKS
    APPENDIX C: CHI-SQUARE TEST FOR GOODNESS OF FIT

  • LIST OF FIGURES

    2.1 The main post-harvest processing of pistachio nuts
    2.2 An optical nut-sorter (ESM International Inc., Houston, TX, USA)
    2.3 A schematic representation for a classification process by machine vision
    The concept of interclass and intraclass distances
    Bayes decision rule for multi-class problems
    Machine classification of patterns using discrimination ...
    A mathematical model for a neuron
    Typical activation functions for neurons: (a) sigmoid, (b) hard ...
    Mathematical model of a perceptron
    A multi-category classifier using c discrete perceptrons
    A typical three-layer neural network
    The general classification system
    The machine vision system. 1: Back-lighting device, 2: Camera, 3: Monitor, 4: Computer, 5: Object, 6: Image of the object
    The polygonal representation of the boundary of an image
    A typical MSNN for four-class classification
    A typical discriminator of a MSNN
    The performance of individual morphological features in recognizing the three classes of G1, G2 and G3
    Area/mass correlation of split pistachio nuts
    Area histograms: (a) G1, (b) G2, (c) G3
    The mean of the first 15 harmonics of the four classes of pistachio nuts
    A typical gray-scale image of a pistachio nut
    The mean of the first 15 harmonics of split and unsplit nuts
    The proposed classification scheme using 7FD's & A
    The gray-level distribution of the four classes
    Increase in accuracy of classification by increasing the number ...
    Tree classification scheme using GL-56 & A data
    4.11 The effect of learning step in training a MLNN
    4.12 Effect of data arrangement on the training of MLNN
    4.13 A typical configuration of a data file for MLNN
    4.14 A typical data file for the MSNN classifiers
    4.15 The location of the G1, G2 and G3 classes on a hypothetical normal ...

  • LIST OF TABLES

    1.1 USDA standards for size grading of the pistachio nuts (California Pistachio Commission, 1995)
    4.2 Performance of the Gaussian classifier using morphological features
    4.3 Gaussian classification of G1, G2 and G3 using area
    4.4 Performance of the Gaussian classifier using FD's and Area
    4.5 Performance of the decision tree classifier using 7FD's & A
    4.6 Performance of the Gaussian classifier using 7FD's & A
    4.7 Performance of the Gaussian classifier using gray-level ...
    4.8 Performance of the decision tree classifier using GL-56 & A
    4.9 Performance of the Gaussian classifier using GL-56 & A
    4.10 The performance of different network topologies under ...
    4.11 MLNN classification using 6-5-4 network and transformed data of GL-56 & A
    4.12 Classification using 10-5-4 network with 7FD's & A
    4.13 Classification results of MSNN with GL-56 & A
    4.14 The classification results of MSNN using 7FD's & A
    4.15 MLNN performance using GL-56 & A with a threshold of 0.6 for the network output

  • LIST OF FREQUENTLY USED ABBREVIATIONS

    MLNN        Multi-layer neural network
    MSNN        Multi-structure neural network
    UN          Unsplit pistachio nuts
    G1          Pistachio nuts, Grade one
    G2          Pistachio nuts, Grade two
    G3          Pistachio nuts, Grade three
    FD's        Fourier descriptors
    7FD's & A   Seven selected FD's and area
    GL-56 & A   Gray level 56 and area

  • CHAPTER 1

    INTRODUCTION AND OBJECTIVES

    Harvested pistachio nuts contain a considerable number of empty,

    undeveloped, and unsplit shells due to factors such as unfavorable climate,

    incomplete pollination, lack of nutrition, and disease (Woodroof 1967).

    However, consumer demand is for large, in-shell, split pistachio nuts. The

    United States Department of Agriculture (USDA) standards for pistachio nuts

    designate size grades of "Extra Large", "Large", "Medium", and "Small" for

    these nuts (Table 1.1). The standards also identify as degrading factors the

    existence of foreign materials, and damaged and unsplit nuts.

    Table 1.1 USDA standards for size grading of the pistachio nuts (California Pistachio Commission, 1995)

        Size grade     Nuts per 28.35 g (1 oz)
        Extra Large    20 or less
        Large          21 - 25
        Medium         26 - 30
        Small          31 or more

  • Currently, on-farm separation of split from unsplit pistachio nuts is

    accomplished by flotation methods. Primary sorting of the nuts is usually

    done using mechanical devices such as screens, aspirators, and gravity

    separators (Kader 1985). Electro-mechanical devices, optical sorters, and

    manual sorting are used in processing plants for further grading of the nuts.

    To implement the grades indicated in Table 1.1, a USDA inspector will take

    random samples from a lot and grade the entire lot based on personal

    experience. Industries also have their own standards for packaging and

    marketing these nuts.

    Inspection and sorting by human labor is a subjective and time

    consuming process. It also becomes cumbersome after a prolonged period of

    time. On the other hand, mechanical sorting is not precise, and because of

    direct contact can cause damage to the nuts. Optical sorting devices utilize

    light with certain wavelengths reflected from a product to assess its quality.

    However, such optical devices cannot be used for size or shape grading.

    Inspection and grading of pistachio nuts by machine vision is an

    attractive alternative to conventional methods because it offers the potential

    for high speed, non-destructive classification of the nuts using a single

    machine. In this process a charge-coupled device (CCD) provides analog

    signals of an object to a computer where the signals are digitized and stored as

    an image. Further processing can be done on the image to extract quantitative

    information to be used as input to a classification algorithm.

    The continual improvement of price/performance of digital computers

    has made it practical to automate visual inspection in many areas. As vision

    technology continues to develop and industry becomes increasingly aware of

  • its potential, computer vision will find many new applications. Much of the

    on-going research in food and agricultural processing is focused on the

    application of machine vision to quality control. Examples include maturity

    detection of peanuts (Ghate et al. 1993), sorting of dried prunes (Delwiche et al.

    1993), and potato inspection (Tao et al. 1990). These industries are extremely

    competitive, hence efficiency and quality are primary means to increase

    market share and profit. Automation is not a luxury in these industries, but

    an essential requirement.

    On the other hand, agricultural products present a challenge in

    inspection technology because of the wide variabilities present in properties

    used for assessing qualities and grades. Machine vision systems are suitable for

    inspecting rigid or predefined objects such as machine tools and metal parts.

    However, visual characteristics of agricultural products such as color, shape,

    size, and texture are difficult for a machine vision system to discern. It is even

    harder to assess quality based on visual processing of these features.

    Artificial neural networks, resembling biological nervous systems, have

    proved to be robust in dealing with ambiguous data and the kind of problems

    that require the interpolation of large amounts of data. Neural networks,

    instead of sequentially performing a program of instructions, explore many

    competing hypotheses simultaneously using massive parallelism (Lippmann

    1987). In addition, neural networks have the potential for solving problems in

    which some inputs and corresponding output values are known, but the

    relationship between the inputs and outputs is not well understood. These

    conditions are commonly found in agricultural inspection problems.

  • Pattern recognition has emerged as an important application of artificial

    neural networks. One of the most important attributes of neural network

    classifiers is their capability to approximate the a posteriori distribution of their

    training samples through learning and adaptation. This ability makes them

    unique among pattern classifiers. The application of machine vision, coupled

    with neural network capabilities, seems to offer promise for inspecting

    agricultural products.

    Grading pistachio nuts using machine vision in conjunction with

    pattern recognition techniques, including neural networks, offers many

    advantages over the conventional optical or mechanical sorting devices.

    Multiple sensors can be used to gather the necessary information from the

    nuts and send suitable signals to a computer where they can be decoded for

    multi-category classification. Image processing algorithms can be used to

    extract higher-level information from the input signals for improved

    classification performance. The classification parameters can be easily

    modified to take into account annual variations in the product. When neural

    networks are used as pattern classifiers, the sorting device can be equipped

    with a training option through which the machine can be trained for

    recognizing new grades or for different products.

    An extensive literature search and direct communication with

    industrial sources have indicated that no pattern recognition machine or

    neural network-based system has been used for sorting or grading of the

    pistachio nuts. Bench-mark studies are thus needed to develop an efficient

    and practical machine vision-based method for grading pistachio nuts.

    Research must be conducted to determine suitable image features and proper

    classification methods for accurate grading of pistachio nuts.

  • The main objective of this research was to study the feasibility of

    classifying pistachio nuts into four classes of "Grade One" (G1), "Grade Two"

    (G2), "Grade Three" (G3), and "Unsplit" (UN) using a machine vision system.

    Specific goals were:

    1. To investigate the potential of different image-extracted

    features for classification of pistachio nuts.

    2. To investigate the feasibility of classifying pistachio nuts

    into their appropriate classes using the selected features by:

    a. Designing or selecting appropriate statistical

    pattern classifiers.

    b. Designing suitable multi-layer neural network

    based classifiers,

    3. To compare the performance of the applied classifiers and

    determine an efficient classification technique.

    1.3 AN OVERVIEW OF THE THESIS CHAPTERS

    The material presented in this thesis is organized into five chapters.

    The first chapter addresses the justification, importance, and the statement of

    the objectives of the research. Chapter Two of this thesis begins with a brief

    look at cultural and processing activities for pistachio nuts. It continues with

    an explanation of the principles of machine vision and pattern recognition

    methods. Examples of application of these principles in different areas, with

  • an emphasis on agricultural engineering applications are given throughout

    the text. A complete background on multi-layer neural networks and their

    relationship to statistical pattern classifiers is presented as well. Chapter Two

    concludes with a presentation of some examples of application of neural

    network classifiers to selected problems in agricultural engineering.

    Chapter Three begins with a brief explanation of the methods followed

    to perform the classification procedures. Then a detailed description of the

    image acquisition system is presented. Features, feature extraction and

    classification methods, classifiers and the methods for evaluating their

    performance are also described in detail.

    Results and discussion of different classification methods are presented

    in Chapter Four. The presentation of the results follows the flow of

    experiments starting with classification of the nuts using bulk features and

    evolving towards more suitable features. The discussion covers the source of

    variation in different feature types and their possible limitations and merits.

    Also, the performances of different classifiers are compared and the reasons for

    the variations in their performance are explained.

    Chapter Five includes a summary of the research, methods, findings

    and the conclusions made from the experimental results. Finally, suggestions

    for future research are proposed.

  • CHAPTER 2

    BACKGROUND

    AND

    REVIEW OF LITERATURE

    2.1 PISTACHIO NUTS

    The pistachio tree belongs to the family Anacardiaceae, of which the

    cashew, mango, sumac, and poison oak are also members. There are about a

    dozen species of Pistacia, a few of which produce small nuts. Only Pistacia vera,

    however, yields the acceptable, larger edible nuts of commercial value. The

    nuts of Pistacia vera are known as "pesteh" in Persia, from which the name

    "pistachio" is derived (Rosengarten 1984). The pistachio tree is deciduous, and

    dioecious. One male tree is usually adequate to pollinate eight to ten trees. The

    male trees are planted in the orchard in a way that takes advantage of the

    prevailing winds for pollination.

    The pistachio tree attains a height of 2 to 8 m, with a spreading top. It is

    native to western Asia (Woodroof 1967). The nut has a light-colored woody

    shell. The kernel, which is light greenish-yellow, has a sweet and delicate

  • flavor. The kernels are eaten raw or are roasted with salt and different spices.

    They are also used for flavoring in cookery and confectionery.

    2.1.1 Cultural Practices

    The pistachio is a dry-climate tree and requires little irrigation. Because

    of the deeper soil penetration of its root system, the pistachio tree is better able

    to withstand lack of soil moisture in the upper soil area than others of the

    commonly cultivated fruit or nut trees. The pistachio tree thrives best in areas

    having cool enough winters for proper breaking of bud dormancy and long,

    hot, dry summers for proper ripening of the nuts.

    Pistachio trees begin nut production at about six or seven years of age,

    but full bearing is not attained until the fifteenth to twentieth year

    (Rosengarten 1984). Under favorable conditions, pistachio trees live and

    produce for centuries. In the future, the pistachio nut trees could play an

    important role in the development of some arid regions where rainfall

    precludes the successful growing of almost any other commercial crop.

    Pistachio nuts are borne in clusters. The seed color ranges from light

    yellow to deep green throughout, and the leathery husk shows different

    shades of yellow, red, and purple. It is peculiar to the pistachio nut that, when

    ripe, a large portion of the nuts split along the shell suture. This splitting

    property is desirable since pistachio nuts are usually marketed in-shell, to be

    opened by the consumer (Woodroof 1967).

    The time to harvest pistachio nuts is critical. Earlier harvesting results

    in a higher percentage of undeveloped kernels, lower nutritional quality, and

    loss in yield. Delayed harvesting causes increased shell staining, losses in

  • kernel quality, and increased incidence of insect infestation and fungal attack

    (Woodroof 1967). The easiest indicator used to determine optimum harvest

    time is when the hull slips from the shell when the fruit is pressed between

    thumb and fingers. In the Central Valley of California this normally occurs

    during the first two weeks of September.

    In California, the pistachio nuts mature in late August through late

    September. At maturity the external hull of the fruit changes from light green

    to a pale straw or whitish, opaque appearance, at the same time softening and

    loosening itself from the stony inner part of the ovary wall, which is the gray-

    white inner shell. Thus, it is easily slipped off by pressing between the fingers

    (Woodroof 1967).

    The nuts are harvested by hand picking, knocking them off the tree

    with poles, or by shaking the tree by use of mechanical shakers. When

    harvested, the nuts are in hulls and usually contain 40 to 45% moisture

    (Woodroof 1967). Proper management of harvesting and post-harvest

    handling procedures of pistachio nuts is important to attain maximum yield

    of good quality nuts which in turn determine marketability and profit.

    2.1.2 Post-Harvest Processing of Pistachio Nuts

    The pistachio nuts, like many other agricultural products, require

    a considerable amount of post-harvest processing. The main activities for post-

    harvest handling of pistachio nuts are given in Fig. 2.1. Pistachio nuts are

    covered with a tight hull when immature. At the time of harvest the hull

    softens and loosens itself from the shell. After harvest the hull must be

    removed as soon as possible to reduce the chance for fungal growth and to

  • avoid shell staining. Each hour after the nut is off the tree, the hull degrades.

    If the hull is not removed within 24 hours of harvesting, the deteriorating

    hull will stain the shell so badly that the nut will no longer be marketable in

    the prime market. Abrasive peeling machines are used to remove the hulls

    (Kader and Maranto 1985).

    Fig. 2.1 The main post-harvest processing of pistachio nuts.

    The dehulled nuts always contain a considerable number of unsplit and

    blank nuts. Floatation techniques are used to separate these nuts from the split

    nuts, since immediately after harvesting and dehulling the unsplit nuts tend

    to float. During the floatation process, nuts are immersed in a tank of water

    and agitated for about 10 minutes. Once the agitation is ceased, the "sinkers"

    are primarily the splits and the "floaters" are the unsplits and blanks. Since

    the floatation process is not accurate, some amount of unsplit and low

  • quality nuts always remain. After floatation, the nuts are washed to clean the

    surface from the residuals left during dehulling.

    The moisture content of nuts increases as a result of the floatation

    process. To prevent spoilage and increase storage life, the dehulled nuts must

    be dried to about 5-7% moisture content (Kader et al. 1980). The most common

    methods for drying pistachio nuts are sun drying and heated-air drying.

    Further on-farm sorting of the nuts is usually done by electro-mechanical

    devices such as aspirators, air streams, gravity separators, shakers, and sieves.

    The dried pistachios are usually stored on the farm for a period of 2 to 6

    months before being shipped to a processing plant or before being exported.

    The nuts are sold by mass and the price is set by the grade of the nuts and the

    availability. The USDA has established standards for different grades of

    pistachio nuts (California Pistachio Commission 1994). These standards are

    briefly described in the next section.

    Many nut-processing plants have their own standard for marketing

    pistachio nuts. Sorting, grading, roasting, dyeing, inspecting and packaging are

    some of the unit processes which may take place in a given plant. Pistachio

    companies are very reluctant to release any information on the methods

    and equipment for processing these nuts. However, based on personal

    communication with the North American manufacturers of nut sorting

    devices, some pistachio companies use optical sorting machines for detecting

    the unsplit, contaminated and stained nuts. Bichromatic-infrared sensors are

    commonly used for detecting the stained or the contaminated nuts and

    monochromatic sensors are used for detecting the unsplit nuts.

  • A diagram of an optical sorting machine used for sorting pistachio nuts

    is given in Fig. 2.2. The unsorted nuts are poured into the hopper of the

    machine and the feeding mechanism gradually distributes the nuts between

    several channels. Each channel is equipped with two (180° apart) or three (120°

    apart) optical sensors. As the nuts slide down the channels, they pass through

    an illuminated tunnel where the optical sensing elements are mounted. The

    sensors send signals to electronic circuitry which activates a pneumatic ejector.

    These machines are capable of processing approximately one metric tonne of

    unsorted nuts per hour.

    Fig. 2.2 An optical nut-sorter (ESM International Inc., Houston, TX, USA).

  • Pistachio production in the USA is a young and rapidly growing

    industry. Within the past decade there have been a few published research

    projects on the engineering aspects of pistachio processing. Pearson et al. (1993)

    measured the physical properties of early-split and normal-split pistachios to

    determine a sorting criterion. Hsu et al. (1991) determined some physical and

    thermal properties of pistachio nuts. Farsaie et al. (1981) developed an

    automatic electro-optical sorter for removing aflatoxin-contaminated pistachio

    nuts. The device detected the defective nuts excited by an incident beam of long-

    wavelength ultraviolet light.

    2.1.3 USDA Standards for Grading Pistachio Nuts

    In 1994 the USDA approved a standard for grading pistachio nuts

    (California Pistachio Commission 1994). The standards define separate criteria

    for "in shell" and "shelled" nuts. A copy of these standards is reproduced and

    presented in Appendix A. Since the nuts used in this research were all "in

    shell", the grades of these nuts as approved by the USDA are briefly explained.

    Pistachio nuts, regardless of their size, can be graded as "US Fancy", "US

    No. 1", "US No. 2", or "US No. 3". The basic requirement for these grades is

    that they must be free from foreign material, blanks, stains, unsplit shells,

    mold, insects, and nuts having diameters less than 10.0 mm as measured

    using a round-hole screen. The difference between these grades lies in the

    allowable tolerances for the above-mentioned factors.

    As was shown in Table 1.1, based on size, pistachio nuts may be graded

    as "Extra Large", "Large", "Medium", and "Small". The specification also

    indicates that the average number of nuts in a grade should not exceed one-

  • half nut above or below the specifications given for that grade. For example, a

    sample of Large grade pistachios may have an average number of nuts from

    20.5 to 25.5 per 28.35 grams (1 oz). To ensure the uniformity within a grade and

    to put a limit on the number of smaller nuts in a grade, the specification

    indicates that the mass of 10% by count of the largest nuts in a sample should

    not exceed 1.7 times the mass of 10% by count of the smallest nuts.
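    The two tolerances just described reduce to simple arithmetic. The sketch below (Python) shows how the count and uniformity checks might be applied to a sample; the grade boundaries follow Table 1.1 and the rules above, while the function names and the example masses are hypothetical.

        # Hypothetical sketch of the USDA size-grade checks described above.
        # Grade boundaries follow Table 1.1 (nuts per 28.35 g); the +/-0.5 nut
        # tolerance and the 1.7x mass-uniformity rule follow the text above.

        GRADE_LIMITS = {            # (low, high) nuts per ounce
            "Extra Large": (0, 20),
            "Large": (21, 25),
            "Medium": (26, 30),
            "Small": (31, float("inf")),
        }

        def meets_grade(avg_nuts_per_oz, grade):
            # The average count may fall up to half a nut outside the nominal range.
            low, high = GRADE_LIMITS[grade]
            return (low - 0.5) <= avg_nuts_per_oz <= (high + 0.5)

        def is_uniform(nut_masses):
            # Mass of the largest 10% (by count) must not exceed 1.7 times the
            # mass of the smallest 10% (by count).
            ranked = sorted(nut_masses)
            n10 = max(1, len(ranked) // 10)
            return sum(ranked[-n10:]) <= 1.7 * sum(ranked[:n10])

        # A "Large" sample averaging 25.4 nuts/oz is acceptable (upper limit 25.5).
        masses = [0.90, 0.95, 1.00, 1.00, 1.05, 1.05, 1.10, 1.10, 1.15, 1.20]
        print(meets_grade(25.4, "Large"), is_uniform(masses))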

    2.2 INSPECTION BY MACHINE VISION

    Machine vision is a technology that has arisen from a union between

    camera and computer. Figure 2.3 presents a block diagram of the hardware

    and software components of a typical classification machine. A video camera

    acts as an eye to a machine vision system (Batchelor et al. 1985). Analog signals

    generated by the camera are digitized into a sequence of numbers and stored as

    an image in the computer. Image processing algorithms are used to extract a

    pattern from the image to represent the object. The pattern is classified by a

    classification algorithm which in turn may generate a signal to activate an

    actuator to direct the object into its proper route.
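    Conceptually, the chain just described is a short loop: digitize a frame, extract a pattern, classify it, and fire the appropriate actuator. The sketch below (Python) is schematic only; the feature extractor, classifier, and routing function are made-up stand-ins, not the components used in this thesis.

        import random

        def extract_features(image):
            # Stand-in feature extraction: total intensity and the top-left pixel.
            return (sum(sum(row) for row in image), image[0][0])

        def classify(pattern):
            # Stand-in for a trained classifier (Gaussian, decision tree, MLNN or MSNN).
            return "G1" if pattern[0] > 2000 else "UN"

        def route(label):
            # Stand-in for the pneumatic actuator that diverts the nut to its bin.
            print("route nut to bin:", label)

        # Simulated 4 x 4 digitized frames standing in for camera images.
        for _ in range(3):
            frame = [[random.randint(0, 255) for _ in range(4)] for _ in range(4)]
            route(classify(extract_features(frame)))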

    Machine vision systems have gained tremendous attention for

    inspecting products in different industries and demands for their new

    applications are increasing. Here inspection refers to many industrial tasks

    including defect detection, measuring, locating, detecting orientation, grading,

    sorting, and counting. Machine vision offers many advantages over the

    conventional grading systems. It is compatible with other automated on-line

    processing. It will continue working round the clock and under conditions

    which would be unpleasant or impossible for a human operator. It can take

    dimensional measurements more accurately than a person can estimate by

  • eye, and can give an objective measure of other variables such as color,

    projected area and shape which an inspector could only assess subjectively

    (Batchelor et al. 1985). Since the inspection is done through a non-contacting

    procedure, there is less damage to the products when they are being inspected.

    Fig. 2.3 A schematic representation for a classification process by machine vision.

    Machine vision sorting of agricultural products is more versatile and

    has more maneuverability than existing optical sorters. This method offers

    multiple-feature processing. The features can be the signals sent by different

    sensors or obtained through an algorithm or a combination of the two

    methods. The pattern classification algorithms implemented on a machine

    vision system provide multi-category classification of the products. The

  • classification parameters can be easily changed to improve the classification

    performance or to use the same machine for a different product without any

    change in its hardware.

    2.3 IMAGE ACQUISITION

    The first problem in any automated inspection task is to obtain a good

    image of the object under investigation. Image acquisition involves collection

    of images and their transmission to an associated processing system. There is

    no substitute for a high quality image. The first step in achieving this is to

    provide proper illumination for the object. Improper illumination often

    causes key features to be obscured by glare, or may reduce the intensity of

    light reaching the detector. The best method of illumination is not always

    obvious.

    The image sensor is the basic element within a camera for capturing

    images. Self-scanned solid-state arrays are the most widely used image sensors

    in these systems. Photodiode arrays, charge-coupled devices (CCD) and charge-

    injection devices (CID) are examples of self-scanned array devices (Batchelor et

    al. 1985). Many industrial inspection problems in which the product moves

    along a transport system require line-scan cameras. Using these cameras, an

    image of an object is built up line by line as it moves past a camera. Linear

    solid state arrays are extremely well suited for such applications. Video

    cameras equipped with self-scanned solid-state sensors and high quality

    photographic lenses and filters are widely used for image acquisition (Batchelor

    1985). The image obtained for processing does not have to be a simple optical

    one. It can be outside the range of visible light. Special infrared cameras are

  • used for thermal imaging, and ultraviolet light has been used in crack

    detection (Batchelor 1985).

    An image acquisition board (frame-grabber) is a device which is

    installed in a computer to digitize the analog signals received from the image

    sensor and store them as an image in computer memory (Data Translation

    Inc. 1988). The analog signal of each sensor pixel is usually digitized by an 8 bit

    buffer producing a 0-255 range for the signal. Therefore a white pixel of an

    image has a value of 255 and a black pixel is stored as a value of 0. The image

    is stored as an array of numbers within the computer memory to be used for

    further processing.
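    As a small illustration of this quantization, the sketch below (Python) maps a normalized analog pixel intensity onto the 0-255 range of an 8-bit buffer; the sample sensor readings are invented.

        def to_8bit(intensity):
            # Quantize a normalized analog intensity (0.0 = black, 1.0 = white)
            # into one of 256 gray levels.
            return max(0, min(255, round(intensity * 255)))

        analog_row = [0.0, 0.25, 0.5, 0.99]     # hypothetical readings for one scan line
        digital_row = [to_8bit(v) for v in analog_row]
        print(digital_row)                      # [0, 64, 128, 252]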

    An image is stored in computer memory as an array of numbers that

    may contain over 300,000 elements. Sequential processing of this information is

    time-consuming and is not feasible for high-speed on-line inspections.

    Therefore, usually some image processing and/or image analysis algorithms

    are applied to the gray scale images to extract some quantitative information

    known as "features". The features are used as inputs to some classification

    algorithm to determine the class of the object.

    When different features of an image are put together they form a vector

    of numbers known as a "pattern". Thus, a pattern is a numerical description

    of an object. Many different methods have been proposed to obtain numerical

    features from a digitized image for the purpose of classification. Pavlidis (1978,

    1980) presented a complete survey of algorithms for shape analysis. He

    distinguished two main categories of image extracted features, namely,

  • external and internal features. They are briefly reviewed in the following

    sections.

    2.4.1 External Image Features

    External image features are those types of features which encode the

    boundary information (Pavlidis 1978). They usually require an image

    segmentation to get the coordinates of the pixels on the external contour of the

    image. The external features do not require any knowledge of the gray level of

    the internal pixels. After image segmentation or edge detection process,

    mathematical procedures are performed on the coordinates of the boundaries

    to extract useful features. Examples of external image features are

    morphological features, Fourier descriptors, boundary chain codes, and

    boundary sequences.

    2.4.1.1 Morphological Features

    Morphology refers to the area of image processing concerned with the

    analysis of images based on their shapes and sizes (Dougherty and Giardina

    1978). Morphological features of a shape are obtained by image segmentation

    and analysis of the image. To get this type of feature one does not need to

    know the coordinates or the gray-level intensity of the pixels within an image.

    Therefore a silhouette image is sufficient to extract these features. Examples of

    these features are area, width, perimeter, and aspect ratio.
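    As an illustration, the sketch below (Python) computes such features from a small binary silhouette stored as a 0/1 array. The perimeter estimate (foreground pixels touching the background) and the roundness formula 4*pi*A/P^2 are common conventions assumed here; they are not necessarily the exact definitions used in this thesis.

        import math

        def morphological_features(img):
            rows, cols = len(img), len(img[0])
            fg = [(r, c) for r in range(rows) for c in range(cols) if img[r][c]]
            area = len(fg)
            length = max(r for r, _ in fg) - min(r for r, _ in fg) + 1
            width = max(c for _, c in fg) - min(c for _, c in fg) + 1

            def on_edge(r, c):
                # A foreground pixel is an edge pixel if any 4-neighbour is background.
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols) or not img[rr][cc]:
                        return True
                return False

            perimeter = sum(on_edge(r, c) for r, c in fg)
            roundness = 4 * math.pi * area / perimeter ** 2
            return {"area": area, "length": length, "width": width,
                    "perimeter": perimeter, "roundness": roundness}

        # Example: a filled 5 x 5 square silhouette.
        print(morphological_features([[1] * 5 for _ in range(5)]))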

    Morphological features have been widely used in automated grading,

    sorting and detecting of grains. Neuman et al. (1987) used images of wheat

    kernels and extracted shape and size characteristics to discriminate different

  • classes and varieties of wheat. Brogan and Edison (1974) used morphological

    features in conjunction with a recursive learning technique and a Bayesian

    decision rule for identification of six different types of grains. Hehn (1991) used

    morphological data for separating canola and mustard seeds.

    Morphological features sometimes are not sufficient for a high

    performance inspection process. These features are combined with other

    appropriate features to achieve higher classification rates. Neuman et al. (1987)

    used a combination of morphological features and Fourier descriptors to

    separate different varieties of wheat. A recent application of morphological

    features in machine vision system examined quality of snacks by a neural

    network (Sayeed et al. 1995). The features considered included area, length,

    width, roundness, and perimeter.

    2.4.1.2 Fourier Descriptors

    Fourier descriptors (FD's) are shape recognition features based on the

    Fourier series expansion of periodic functions. The theory behind these

    features is discussed in detail in Section 3.5.2. The general idea is to represent

    the boundary as a periodic function with a period of 2π. The obtained periodic

    function is then expanded in a Fourier series and its coefficients are calculated.

    Ordinary Fourier coefficients are difficult to use as input to classifiers, because

    they contain factors dependent on size, rotation, and phase angle (Granlund

    1972).

Different methods have been developed to obtain scale- and rotation-invariant FD's for two-dimensional object recognition. Ehrlich and Weinberg (1970) described the contour of an object in terms of the lengths of equispaced radii extending from the centroid to the boundary. They presented the radii as a periodic function of the central angle with a period of 2π; the function was then expanded as a Fourier series and the polar coefficients of the expansion were used as shape descriptors. Granlund (1972) derived the Fourier coefficients from the expansion of the boundary coordinates of the object in the complex plane. Zahn and Roskies (1972) represented a curve as a function of arc length by the accumulated changes in direction of the curve from a starting point on the curve; the function was normalized as a 2π-periodic function and its Fourier coefficients were calculated.
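The following sketch (an illustration, not the implementation used in this thesis; see section 3.5.2 for that) computes Granlund-style Fourier descriptors from an ordered array of boundary coordinates, assuming NumPy; invariance is obtained by dropping the DC term, taking magnitudes, and normalizing by the first harmonic.

import numpy as np

def fourier_descriptors(boundary_xy, n_descriptors=10):
    """Scale-, rotation-, and start-point-invariant Fourier descriptors.

    boundary_xy: (N, 2) array of ordered boundary coordinates.
    Returns the magnitudes of n_descriptors harmonics normalized by the
    first harmonic.
    """
    # Represent the boundary as a complex sequence x + jy and transform it.
    z = boundary_xy[:, 0] + 1j * boundary_xy[:, 1]
    mags = np.abs(np.fft.fft(z))

    # Ignoring the DC term removes translation dependence; magnitudes
    # remove rotation and start-point (phase) dependence; dividing by the
    # first harmonic removes scale dependence.
    return mags[2:n_descriptors + 2] / mags[1]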

Fourier descriptors have been used extensively as shape descriptors in many pattern recognition applications. Segerlind and Weinberg (1973) used FD's obtained with the method of Ehrlich and Weinberg (1970) for grain kernel identification. Persoon and Fu (1986) used FD's obtained by Granlund's (1972) method for character and machine-part recognition. Romaniuk (1994) used FD's obtained by the method of Zahn and Roskies (1972) as input to a neural network for classification of barley seeds.

2.4.1.3 Boundary Sequences

Boundary sequence is a general term for patterns that approximate the boundary of an object. Dubois and Glanz (1986) approximated the boundary of an image as an ordered sequence of the lengths of N equiangular radial vectors projected between the object centroid and the boundary. Ghazanfari and Irudayaraj (1994a) used the same type of sequence for classification of four varieties of pistachio nuts. Gupta and Srinath (1987) also used this type of sequence to derive moments for classification of 2D shapes.
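A minimal sketch of such a radial boundary sequence, assuming NumPy and an ordered array of boundary coordinates; the number of radii is a free parameter and the simple angular binning used here is only one possible implementation.

import numpy as np

def radial_sequence(boundary_xy, n_radii=64):
    """Lengths of n_radii equiangular radial vectors from the centroid to
    the boundary, one per angular bin, normalized to unit maximum."""
    centroid = boundary_xy.mean(axis=0)
    dx = boundary_xy[:, 0] - centroid[0]
    dy = boundary_xy[:, 1] - centroid[1]
    radii = np.hypot(dx, dy)
    angles = np.mod(np.arctan2(dy, dx), 2 * np.pi)

    # For each angular bin keep the mean radius of the boundary points
    # that fall into it, giving one length per direction.
    bins = (angles / (2 * np.pi) * n_radii).astype(int) % n_radii
    sequence = np.zeros(n_radii)
    for k in range(n_radii):
        in_bin = radii[bins == k]
        sequence[k] = in_bin.mean() if in_bin.size else 0.0
    return sequence / sequence.max()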

Bunk and Buhler (1993) used the curvature of the boundary for symbolic representation of the object. In their method a starting point is selected on the boundary and the local change in the curvature of the boundary segments is recorded at equidistant intervals. This method of boundary representation is very accurate, but it results in long sequences which increase processing times.

    2.4.1.4 Boundary Chain Codes

The boundary chain code was first developed by Freeman (1970) to represent the boundary of a plane object as a string of directional codes. Applications of boundary chain codes in machine vision inspection have included fruit stem detection (Wolfe and Sander 1985) and tomato sorting (Sarkar and Wolfe 1985). Boundary chain codes have also been used in image processing algorithms to extract morphological features from an image (Hehn and Sokhansanj 1990; Liu and Srinath 1990).
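For illustration, the sketch below generates a Freeman 8-direction chain code for a boundary that is already ordered and 8-connected; it is not the thesis's implementation.

# Freeman 8-direction codes: 0 = east, then counter-clockwise.
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(boundary_xy):
    """Freeman chain code of an ordered, 8-connected boundary given as a
    list of (x, y) tuples; the curve is treated as closed."""
    codes = []
    n = len(boundary_xy)
    for i in range(n):
        x0, y0 = boundary_xy[i]
        x1, y1 = boundary_xy[(i + 1) % n]
        codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return codes

# Example: a small square traced counter-clockwise.
print(chain_code([(0, 0), (1, 0), (1, 1), (0, 1)]))   # [0, 2, 4, 6]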

2.4.2 Internal Image Features

Internal image features are obtained by analyzing the pixels within the boundary of an image; both the location of a pixel and its gray level may play an important role. Many different types of internal features may be considered for different applications. For example, Gunasekaran et al. (1987) used line detection within an image to indicate the existence of a crack in corn, and Wulfsohn et al. (1993) used the existence of pixels with gray levels above a certain threshold for defect detection in dates. Moments, textural features, and gray-level histograms are examples of internal image features, and they are reviewed briefly below.

    2.4.2.1 Moments

Moments have been among the most commonly used image-extracted features for shape discrimination. Several of the most essential image attributes, such as size, centroid, orientation, spread, and elongation, are directly related to moments (Leu 1991; Hehn 1991). Moment invariants were first proposed and used for pattern recognition by Hu (1962). The major disadvantage of moments is that, although the first few moments convey significant information for simple objects, they fail to do so for more complicated ones (Pavlidis 1978). The computational cost of deriving the moments of an image is also high.

Moments were originally calculated using the locations of the internal pixels of an image. Leu (1991) presented a method for computing moments from the boundary pixels of an image and showed that it was much more efficient than the traditional method. Khotanzad and Lu (1991) used moments for character recognition.
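As an illustration, the sketch below computes conventional region-based central moments, their scale-normalized form, and the first of Hu's (1962) invariants, assuming NumPy; Leu's (1991) boundary-based formulation is not reproduced here.

import numpy as np

def central_moment(img, p, q):
    """Central moment mu_pq of a binary (or gray-level) image."""
    y, x = np.indices(img.shape)
    m00 = img.sum()
    x_bar = (x * img).sum() / m00          # centroid column
    y_bar = (y * img).sum() / m00          # centroid row
    return (((x - x_bar) ** p) * ((y - y_bar) ** q) * img).sum()

def normalized_moment(img, p, q):
    """Scale-normalized central moment eta_pq."""
    mu00 = central_moment(img, 0, 0)
    return central_moment(img, p, q) / mu00 ** (1 + (p + q) / 2)

def hu_first_invariant(img):
    """First of Hu's (1962) moment invariants: eta_20 + eta_02."""
    return normalized_moment(img, 2, 0) + normalized_moment(img, 0, 2)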

2.4.2.2 Textural Features

Texture is a qualitative description of a surface in terms of properties such as fineness, coarseness, smoothness, and granulation. Researchers have investigated many methods for evaluating the texture of an image, but despite its importance a formal approach to texture description does not exist. Early image texture studies employed autocorrelation functions, power spectra, and the relative frequencies of various gray levels (Haralick et al. 1973). The approaches to texture description are mostly ad hoc and generally utilize the gray-level values of the internal pixels of an image in some way (Haralick 1979).

Haralick et al. (1973) suggested 28 textural features which could be extracted from gray-level images. Sayeed et al. (1995) used a subset of these features for texture evaluation of a snack food. Khotanzad and Lu (1991) analyzed gray-level images for texture classification of several different commodities.
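The sketch below illustrates two of the co-occurrence-based features of Haralick et al. (1973), the angular second moment and contrast, assuming NumPy, 8-bit gray levels, and a single displacement of one pixel to the right; a full implementation would average over several displacements.

import numpy as np

def glcm(gray, levels=16):
    """Gray-level co-occurrence matrix for displacement (0, +1),
    i.e. each pixel paired with its right-hand neighbour."""
    q = (gray.astype(float) / 256 * levels).astype(int)   # quantize to 0..levels-1
    left, right = q[:, :-1].ravel(), q[:, 1:].ravel()
    p = np.zeros((levels, levels))
    np.add.at(p, (left, right), 1)
    return p / p.sum()                                     # normalize to probabilities

def texture_features(gray, levels=16):
    p = glcm(gray, levels)
    i, j = np.indices(p.shape)
    asm = np.sum(p ** 2)                       # angular second moment (uniformity)
    contrast = np.sum(((i - j) ** 2) * p)      # contrast
    return {"asm": asm, "contrast": contrast}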

Gray-level histograms are discrete frequency plots of the number of pixels at each gray level. These plots may be viewed as probability density functions, provided the elements of the image are randomly selected (Levine 1985); the gray-level frequency plots of different classes of objects can then be used as their discrimination functions. Das and Evans (1992) used gray-level histogram data for detecting the fertility of hatching eggs. Han et al. (1992) used gray-level histograms obtained from X-ray images for detecting split-pit peaches.
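A small illustrative sketch, assuming NumPy and 8-bit images, of treating the normalized gray-level histogram as a discrete probability density and deriving simple shape parameters from it.

import numpy as np

def histogram_features(gray):
    """Normalized gray-level histogram plus two simple shape parameters."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                       # discrete probability density
    levels = np.arange(256)
    mean = np.sum(levels * p)
    variance = np.sum(((levels - mean) ** 2) * p)
    return p, mean, variance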

The performance of a classification system depends chiefly on selecting an appropriate set of features that best describe their associated classes. Redundant and irrelevant features may degrade classification performance (Devijver and Kittler 1982). Feature selection speeds up processing and increases the reliability of a classifier by eliminating redundant and irrelevant information. A feature selection procedure should extract the most useful information from the representation vector and present it in the form of a pattern vector of lower dimensionality whose elements represent only the most significant aspects of the input data.

Mathematical feature selection techniques fall into two major categories: feature selection in the measurement space and feature selection in a transformed space. The methods in the first category are referred to as "feature selection" methods and those in the second category as "feature extraction" methods. Feature selection in the measurement space is achieved by eliminating those measurements which are redundant or do not contain enough relevant information. In this process a subset X of n features,

X = \{x_1, x_2, \ldots, x_n\}, \qquad n < N,

is selected from the N-feature set Y = \{y_1, y_2, \ldots, y_N\}.

The subset X should be the combination of features from Y that minimizes the classification error (Kittler 1975). Finding the best possible subset would require trying all possible combinations of features, which in most cases is not practical because the required number of trials,

\binom{N}{n} = \frac{N!}{n!\,(N-n)!},

is very large. For practical situations, computationally feasible procedures have been suggested for selecting a sub-optimal subset from the original features. These methods are explained in detail by Kittler (1975) and Devijver and Kittler (1982).

Mucciardi and Gose (1971) compared several different feature selection techniques. They indicated that the error rates of the features selected by any of the investigated techniques were lower than the error rate of randomly selected features. They concluded that all of those feature selection techniques were applicable to most pattern classification problems, but that the choice of a particular selection method depends on ease of implementation, economic considerations, and the particular application. Three feature selection methods, identified for their possible applicability to this research, are briefly explained below.

    2.5.1 Interclass/Intraclass Method

One class of criteria for feature selection in the measurement space, more heuristic in nature, is based on the Euclidean distance between the elements of the class sets. These criteria originate from the intuitive argument that the greater the distance between the elements of different classes, the better the class separability. Based on this argument, a "good" feature should provide a large distance between elements of different classes (the interclass distance) while keeping the distance between elements within a single class (the intraclass distance) as small as possible. This concept is illustrated for a two-class case with two features (x1 and x2) in Fig. 2.4. In this figure d1 is the intraclass distance between a member of class 1 and the mean of that class, M1. The distance between the means of the two classes, d(M1, M2), is the interclass distance.

Fig. 2.4 The concept of interclass and intraclass distances.

Alternatively, it can be stated that good features should have small scatter within their class and large scatter between classes. One way to select the best features is to use the Fisher criterion, defined by

J = (M_1 - M_2)^T (K_1 + K_2)^{-1} (M_1 - M_2)    (2.4)

where K1 and K2 are the class covariance matrices and M1 and M2 are the respective means of the two classes. The term K1 + K2 in Eq. 2.4 is the within-class scatter matrix and (M_1 - M_2)(M_1 - M_2)^T is the between-class scatter matrix (Therrien 1989). Good features should therefore maximize the Fisher criterion: the greater the ratio of interclass to intraclass scatter, the greater the spatial separation of the classes. Features can thus be ranked by this criterion, and those with a higher ratio can be considered good features. A detailed discussion of this criterion and its mathematical formulation is given by Devijver and Kittler (1982).
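As a sketch of how features can be ranked with the Fisher criterion, the code below (assuming NumPy, with two classes stored as samples-by-features arrays) evaluates the criterion feature by feature; in this one-dimensional case Eq. 2.4 reduces to the squared difference of the class means divided by the sum of the class variances.

import numpy as np

def fisher_scores(class1, class2):
    """Per-feature Fisher criterion for two classes.

    class1, class2: (n_samples, n_features) arrays.
    Returns one score per feature; larger is better.
    """
    m1, m2 = class1.mean(axis=0), class2.mean(axis=0)
    v1, v2 = class1.var(axis=0), class2.var(axis=0)
    return (m1 - m2) ** 2 / (v1 + v2)          # squared mean difference / summed variance

def rank_features(class1, class2):
    """Indices of features ordered from best to worst."""
    return np.argsort(fisher_scores(class1, class2))[::-1]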

2.5.2 Forward Selection

In the sequential forward selection method, features are ranked according to their ability to separate the classes using an appropriate classifier. The feature with the highest rank is selected first. If the classification accuracy obtained with this feature is not satisfactory, the feature from the remaining set that gives the highest classification rate in combination with the current feature is added. The procedure continues until the selected set of features yields an acceptable classification accuracy. There are two main drawbacks to this method: first, there is no mechanism for removing a feature that has already been selected; and second, correlation between features is not taken into account (Devijver and Kittler 1982).
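A sketch of the sequential forward procedure; score(features) is a hypothetical callback supplied by the user that returns the classification accuracy obtained with a given subset of feature indices.

def forward_selection(n_features, score, target_accuracy):
    """Sequential forward selection.

    score: callable taking a list of feature indices and returning the
           classification accuracy obtained with those features
           (hypothetical; supplied by the caller).
    Returns the selected feature indices and the accuracy reached.
    """
    selected, remaining = [], list(range(n_features))
    best_accuracy = 0.0
    while remaining and best_accuracy < target_accuracy:
        # Add the feature that most improves accuracy with the current set.
        trial = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(trial)
        remaining.remove(trial)
        best_accuracy = score(selected)
    return selected, best_accuracy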

2.5.3 Backward Elimination

In the sequential backward elimination procedure, the performance of a classifier is first tested with the whole set of features. The feature that gives the lowest classification rate in combination with the rest of the features is then eliminated. The process of successive elimination continues until any further elimination results in an unacceptable classification rate. This procedure has the same drawbacks as the sequential forward method, and both sequential forward and backward selection are time consuming (Mucciardi and Gose 1971). The STEPDISC procedure provided by SAS (Statistical Analysis System, SAS Institute, Inc., Cary, NC, USA) can be used to perform the forward selection and backward elimination methods.
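A corresponding sketch of backward elimination, using the same hypothetical score(features) callback as in the forward-selection sketch above.

def backward_elimination(n_features, score, min_accuracy):
    """Sequential backward elimination using a user-supplied score callback."""
    selected = list(range(n_features))
    while len(selected) > 1:
        # Find the feature whose removal hurts accuracy the least.
        candidate = max(selected,
                        key=lambda f: score([s for s in selected if s != f]))
        if score([s for s in selected if s != candidate]) < min_accuracy:
            break                       # any further elimination is unacceptable
        selected.remove(candidate)
    return selected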

Feature selection performed in a transformed space is known as "feature extraction". Feature extraction methods eliminate the irrelevant information and redundancy in the pattern Y by mapping it into a lower-dimensional pattern E through a transformation T:

E = T(Y).

In general, the map T can be any vector function of Y that maximizes an appropriate separability measure in the feature space (Kittler 1975).

The Karhunen-Loeve expansion is the most widely used method for feature extraction. The idea behind this method is to reduce the dimensionality of the features by creating new features which are linear combinations of the original features (Mucciardi and Gose 1971). Other methods of feature extraction, such as those based on separability measures and those based on non-orthogonal mapping, are explained by Kittler (1975).
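A sketch of the Karhunen-Loeve transform, assuming NumPy: the new features are the projections of the mean-centred patterns onto the eigenvectors of the covariance matrix with the largest eigenvalues.

import numpy as np

def karhunen_loeve(Y, n_components):
    """Project patterns Y (n_samples x N features) onto the n_components
    eigenvectors of the covariance matrix with the largest eigenvalues."""
    Y_centered = Y - Y.mean(axis=0)
    cov = np.cov(Y_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    T = eigvecs[:, order]                           # N x n_components transform
    return Y_centered @ T                           # lower-dimensional patterns E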

Feature extraction methods were not considered for this research for the following reasons. The transformation performed during feature extraction creates a new set of features from the originals; when the transformed features are used as input to a classifier, it is not easy to investigate the behavior of the classifier with respect to changes in the original features. In addition, the use of a feature extractor as a preprocessing step in on-line classification is computationally expensive.

Classifiers are algorithms implemented on digital computers for the purpose of classification. Classification algorithms are developed and applied in two stages. In the first stage, called the training stage, the required classification parameters are estimated from a set of patterns called the "training set". During the second stage, called the test stage, the algorithm uses these parameters to determine the class of a new set of patterns called the "test set". Once a classifier gives acceptable accuracy on the test data, it can be used in real-world applications.

There are many different types of classifiers, and they are categorized and explained in various pattern recognition books and research papers (e.g., Devijver and Kittler 1982; Nilsson 1990). Determining which classifier works best for a particular application usually involves some degree of trial and error. Most classifiers, when applied to a particular problem, give comparable classification accuracies; the real differences between them lie in their time complexity, storage requirements, and precise degree of accuracy (Hush and Horne 1993). A brief review of different classification methods and their applications is given in the following sections.

2.7.1 Bayesian Classifiers

In statistical pattern recognition, a classifier assigns the unknown represented by the pattern X to a class ω according to the Bayes decision rule. For a two-class case the Bayes decision rule is given by

P(\omega_1)\, p(X \mid \omega_1) > P(\omega_2)\, p(X \mid \omega_2) \;\Rightarrow\; X \in \omega_1, \quad \text{otherwise } X \in \omega_2    (2.6)

where p(X|ωi) is the conditional density function and P(ωi) is the a priori probability of class i; here i is equal to 1 or 2. Equation 2.6 states: assign X to class ω1 if P(ω1)p(X|ω1) is larger than P(ω2)p(X|ω2). It has been shown (Duda and Hart 1973) that this decision rule minimizes the probability of error, i.e., the probability of making an incorrect decision. The Bayes decision rule for multi-class classification problems is presented in Fig. 2.5. When classes are separated by "discrimination functions", e.g., normal distribution functions, these functions are used in the Bayes decision rule instead of the a posteriori probabilities.

Fig. 2.5 Bayes decision rule for multi-class problems.

The Gaussian classifier is one of the most frequently used classifiers in pattern classification problems. This classifier, a special case of the Bayes decision rule, assumes that the individual features have a Gaussian distribution (Therrien 1989). To implement it, one needs only to estimate the mean vector and the covariance matrix of each class; these are the parameters of the a posteriori probabilities used in the Bayes decision rule. The classifier assigns the unknown to the class with the higher probability. When the classes have different covariances, the decision boundary may take the form of an ellipsoid, hyperboloid, paraboloid, or some combination of these (Therrien 1989); when the covariance matrices are equal, the decision boundaries between classes reduce to hyperplanes. More details on this classifier are given in section 3.6.1.
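As an illustration (not the implementation described in section 3.6.1), the sketch below evaluates a two-or-more-class Gaussian classifier in log form for numerical stability, assuming NumPy; each class is summarized by its mean vector, covariance matrix, and prior probability.

import numpy as np

def gaussian_log_score(x, mean, cov, prior):
    """log[ P(w_i) * p(x | w_i) ] for a multivariate normal class model."""
    d = np.asarray(x) - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    log_density = -0.5 * (d @ inv @ d + logdet + d.size * np.log(2 * np.pi))
    return np.log(prior) + log_density

def classify(x, class_params):
    """class_params: list of (mean, cov, prior) tuples, one per class.
    Returns the index of the class with the highest posterior score."""
    scores = [gaussian_log_score(x, m, c, p) for m, c, p in class_params]
    return int(np.argmax(scores))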

Brogan and Edison (1974) used a Bayesian decision rule for classifying six different types of grain. Segerlind and Weinberg (1973) used the Mahalanobis distance (see section 3.6.1) for separating kernels of corn, oats, wheat, barley, rye, soybeans, and navy beans from each other. Singh et al. (1993) applied the technique of Bayes minimum-risk classification for defect grading of stonefruit.

In agricultural research it is usually assumed that the variables are normally distributed, and many researchers have therefore applied Gaussian classifiers to the classification of agricultural commodities. The DISCRIM procedure in SAS has been widely used for this purpose. Examples of the use of discriminant analysis include separating early-split from normal-split pistachio nuts (Pearson et al. 1993), evaluating snack quality (Sayeed et al. 1995), and discriminating canola and mustard seeds (Hehn and Sokhansanj 1990).

    2.7.2 Classifiers Using Discrimination Functions

Discrimination functions have the property that they partition the pattern space into mutually exclusive regions, where each region contains the domain of a given class. In classification applications, the a posteriori distributions can be replaced by discrimination functions (Duda and Hart 1973). A classifier is then viewed as a machine that computes C discriminant functions, g1, ..., gC (Fig. 2.6), and a maximum selector assigns the pattern X to the category associated with the largest discriminant.

    Fig. 2.6 Machine classification of patterns using discrimination functions.

Linear discrimination functions are the simplest form of discrimination functions. The decision boundary formed by a linear discrimination function is a line in a two-dimensional feature space, a plane in a three-dimensional feature space, and a hyperplane in a higher-dimensional feature space. The general form of a linear discrimination function is

g(X) = w_0 + w_1 x_1 + w_2 x_2 + \cdots + w_n x_n

where the w's are the weights of the function and the x_i's are the features constituting the multi-dimensional feature space. The function is completely specified by determining the values of the weights. A classifier which implements linear discrimination functions is sometimes referred to as a linear classifier or linear machine.
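A sketch of a linear machine as in Fig. 2.6, assuming NumPy; the weight vectors and bias terms are assumed to have been obtained beforehand by some training procedure.

import numpy as np

def linear_machine(x, weights, biases):
    """Evaluate C linear discrimination functions and pick the largest.

    weights: (C, n_features) array of weight vectors w_i.
    biases:  (C,) array of bias terms w_i0.
    Returns the index of the winning class.
    """
    g = weights @ np.asarray(x) + biases     # g_i(X) = w_i . X + w_i0 for each class
    return int(np.argmax(g))                 # maximum selector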

The selection of a suitable functional form for a discrimination function is a problem of crucial importance, and the theory for constructing discrimination functions has not reached an effective stage. Prior knowledge about the a posteriori distributions of the classes is always helpful in building proper discrimination functions (Devijver and Kittler 1982), and sometimes reasonable guesses are made on the basis of qualitative knowledge about the patterns (Nilsson 1990). An example of the application of linear discrimination functions is the classification of different industrial objects represented by autoregression models (Persoon and Fu 1986).

Piecewise linear discrimination functions have been used for classification of objects whose classes are separated by non-linear decision boundaries. In this method the classifier partitions the feature space into a number of regions using a set of hyperplanes (Nilsson 1990). Haralick et al. (1973) used piecewise linear discrimination functions for texture classification of five kinds of sandstones.

2.7.3 Nearest Neighbor Classifiers

The nearest neighbor classifier (NNC) makes use of the correspondence between similarity and distance, i.e., the smaller the Euclidean distance between patterns the more similar they are (Batchelor 1974). The nearest neighbor decision rule assigns an unknown U to the class of its nearest neighbor X_i:

U \in \text{class}(i) \quad \text{if} \quad d(U, X_i) = \min_{j = 1, \ldots, C} d(U, X_j)    (2.8)

where d(U, X) is a distance measure between U and X and C is the number of classes. Various distance measures (metrics) can be defined for use in Eq. 2.8; the Euclidean, city block, and Mahalanobis distances are examples. A complete discussion of different types of metrics is given by Devijver and Kittler (1982).

The basic idea behind nearest neighbor rules is that samples which fall close together in feature space are likely to belong to the same class. An NNC stores a number of patterns for each class; an unknown is compared with all of the stored patterns and assigned to the class of the pattern that is most similar to it. The decision surface created by an NNC is piecewise linear (Batchelor 1974).

The k-nearest neighbor (k-NN) classifier is an extension of the NNC. The k-NN rule classifies X by assigning it the class most frequently represented among its k nearest samples; in other words, a decision is made by examining the labels of the k nearest neighbors and taking a vote. A complete description of the NNC and k-NN rules is given by Devijver and Kittler (1982).
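A sketch of the NNC/k-NN rule with a Euclidean metric, assuming NumPy; patterns holds the stored training patterns and labels their class labels (k = 1 gives the plain nearest-neighbor rule).

import numpy as np

def knn_classify(u, patterns, labels, k=1):
    """Assign the unknown u to the class most frequent among its k nearest
    stored patterns.

    patterns: (n_stored, n_features) array of training patterns.
    labels:   array of class labels, one per stored pattern.
    """
    labels = np.asarray(labels)
    distances = np.linalg.norm(patterns - u, axis=1)    # Euclidean metric
    nearest = np.argsort(distances)[:k]
    classes, counts = np.unique(labels[nearest], return_counts=True)
    return classes[np.argmax(counts)]                    # majority vote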

    2.7.4 Minimum Distance

The minimum distance classifier, like the NNC, makes use of the correspondence between similarity and distance. In this method, however, a prototype is considered for each class, and an unknown U is assigned to the class i of the prototype M_i which is at the minimum distance from it. That is,

U \in \text{class}(i) \quad \text{if} \quad d(U, M_i) = \min_{j = 1, \ldots, C} d(U, M_j)

where C is the number of classes. Minimum distance classifiers are appropriate in situations where each class is represented by a single prototype pattern around which all other patterns in that class tend to cluster.

Gupta and Srinath (1987) used the minimum distance rule for classification of 2D shapes using contour sequence moments. Persoon and Fu (1986) also used the minimum distance rule, but with Fourier descriptors for 2D shape representation. String matching techniques, which have been used for 2D shape recognition (Bunk and Buhler 1993; Mase 1991), also use the minimum distance classification method.

    2.7.5 Decision Tree Classifiers

Classification trees constitute an important and popular form of hierarchical classifiers. A decision tree classifier utilizes a series of simple decision functions, usually binary in nature, to determine the class of an unknown pattern (Levin 1981). The evaluation of these decision functions starts at the tree's root node and branches out through the internal nodes toward the terminal nodes, in such a way that the outcome of successive decision functions reduces the uncertainty about the unknown pattern. The most common choice for the node decision function is a threshold comparison on a single component of the feature vector. The thresholds are usually determined by examining a training sample, and their accuracies are validated using a test set.

The classification capability of a tree classifier arises from its ability to partition the feature space into complex regions by making a sequence of simple decisions at each node. Advantages of classification trees include low storage requirements, simple decisions at the nodes, and ease of understanding of the classification process. The disadvantages are the abrupt decisions at the nodes, the comparison of continuous features against a threshold to determine branching, uncertainty about the thresholds, difficulty with missing features, and increasing complexity as the tree grows (Gelfand and Delp 1991).
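For illustration, the sketch below implements a small binary decision tree whose node functions are threshold comparisons on single features; the feature indices, thresholds, and class labels are invented for the example and are not taken from this thesis.

class Node:
    """A decision-tree node: either a terminal class label, or a threshold
    test on one component of the feature vector with two children."""
    def __init__(self, label=None, feature=None, threshold=None,
                 below=None, above=None):
        self.label, self.feature, self.threshold = label, feature, threshold
        self.below, self.above = below, above

    def classify(self, x):
        if self.label is not None:                 # terminal node
            return self.label
        child = self.below if x[self.feature] <= self.threshold else self.above
        return child.classify(x)

# Hypothetical two-level tree: split on feature 0, then on feature 2.
tree = Node(feature=0, threshold=5.0,
            below=Node(label="class A"),
            above=Node(feature=2, threshold=1.2,
                       below=Node(label="class B"),
                       above=Node(label="class C")))
print(tree.classify([7.3, 0.4, 0.9]))   # -> "class B"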

There has been considerable research into applying the principles of machine vision and pattern recognition to the classification of agricultural products. One of the pioneering works in this area is the classification of grain kernels by Segerlind and Weinberg (1973). They investigated the feasibility of identifying grain kernels by analyzing their profiles using a Fourier series expansion of the periphery radius. Tests were performed on samples of corn, oats, wheat, barley, rye, soybeans, and navy beans, and the values of the first ten Fourier coefficients (harmonics) were obtained for a training set and a test set. The authors noted that the application of the method to intraspecies discrimination was only partially successful, with errors ranging from 11 to 25%, and that the implementation of the technique was too tedious for routine use. Considering that this research was performed more than two decades ago, the computational burden of this type of work has since been greatly reduced by faster computers and advances in software.

Neuman et al. (1987) acquired images of wheat kernels in plan-form view and processed them to extract kernel shape and size characteristics for discriminating different classes and varieties of wheat. Feature extraction algorithms based on object contours were developed and implemented in FORTRAN 77. In addition to kernel size and shape parameters, contour curvature was quantified in the frequency domain to obtain Fourier descriptors as shape-specific features. Statistical pattern recognition methods were used for the discrimination analysis, which resulted in an overall performance of 87%. This research is important because determining the variety of an agricultural product is more difficult than separating different kinds of products.

Brogan and Edison (1974) used pattern recognition techniques for automatic classification of six different grains. An algorithm based on a recursive learning technique and a Bayesian decision rule was developed, and a prototype device was built for rapid, accurate, and automatic classification of the grains. Of the six grains, corn and soybeans were identified perfectly; wheat, oats, barley, and rye were much more similar to one another and were more likely to be misclassified. An overall accuracy of about 98% was obtained.

A method using machine vision for detecting the fertility of hatching eggs during the third and fourth days of incubation was developed by Das and Evans (1992). Images of eggs were acquired using back-lighting with a high-intensity candling lamp, and parameters describing the shape of the gray-level histograms were estimated. An algorithm was developed using the estimated parameters to distinguish fertile from infertile eggs. The algorithm gave prediction accuracies of 96% using the fourth-day data and 88% using the third-day data.

Han et al. (1992) developed a method for detecting split-pit peaches using a machine vision system. X-ray films of the peaches were placed under a video camera and the gray-level histograms of the images were analyzed. Using the histogram data, a threshold equation was developed to separate the split-pit peaches from the unsplit-pit peaches, and an accuracy of 98% was reported. The limitation of this method was the need for proper orientation of the peaches under the X-ray camera.

Sarkar and Wolfe (1985) developed a prototype tomato sorting machine. Tomatoes moving on a belt conveyor were detected by a photocell which signaled a computer to grab and analyze two views, the stem end and the blossom end. Based on the algorithm output, the computer activated solenoid-operated pneumatic cylinders; the cylinder action raised the appropriate inclined flap and dropped the tomatoes into their corresponding boxes. The authors indicated that the prototype was not suitable for commercial purposes because of speed limitations, although it was not stated whether the limitation was due to computer processing time or to the hardware implementation.

Gunasekaran et al. (1987) applied image processing techniques to detect stress cracks in corn. The algorithm used for stress crack evaluation was similar to a high-pass filtering process. The pixels representing stress cracks had significantly different gray-level values from the pixels of the rest of the kernel surface; the stress-crack pixels were therefore extracted by creating an image in which the gray levels of the stress-crack region were suppressed and subtracting this newly created image from the original image. The success rate was determined by comparing the visual evaluation of the kernels for stress cracks with the corresponding evaluation by the vision system on the same set of kernels. The algorithm performed satisfactorily, detecting stress cracks in 90% of the examined kernels.

A vision system and algorithm were developed to locate fruit on a tree (Sites and Delwiche 1988). Images were obtained using a solid-state camera. Several bandpass optical filters