
  • MACHINE VISION CLASSIFICATION OF PISTACHIO NUTS

    USING

    PATTERN RECOGNITION AND NEURAL NETWORKS

    A Thesis

    Submitted to the College of Graduate Studies and Research

    in Partial Fulfillment of the Requirements

    for the Degree of

    Doctor of Philosophy

    in the

    Department of Agricultural and Bioresource Engineering

    University of Saskatchewan

    Saskatoon, Saskatchewan, Canada

    by

    Ahmad Ghazanfari-Moghaddam

    Fall, 1996

    © Copyright Ghazanfari, Ahmad, 1996. All rights reserved.

  • National Library of Canada / Bibliothèque nationale du Canada
    Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques

    395 Wellington Street, Ottawa ON K1A 0N4, Canada

    The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

    The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


  • University of Saskatchewan

    College of Graduate Studies and Research

    SUMMARY OF DISSERTATION

    Submitted in Partial Fulfillment

    of the Requirements for the

    DEGREE OF DOCTOR OF PHILOSOPHY

    by

    Ahmad Ghazanfari-Moghaddam

    Department of Agricultural and Bioresource Engineering

    University of Saskatchewan

    Summer, 1996

    Examining Committee:

    Dr. L.A. Kells

    Dr. E. Barber

    Dr. J. Irudayaraj

    Dr. D. Wulfsohn

    Dr. R. Bolton

    Dr. R. Ford

    Dr. A. Kusalik

    External Examiner:

    Dean/Dean's Designate, Chair, College of Graduate Studies and Research

    Chair of Advisory Committee, Department of Agricultural and Bioresource Engineering

    Co-supervisor, Department of Agricultural and Bioresource Engineering

    Co-supervisor, Department of Agricultural and Bioresource Engineering

    Department of Electrical Engineering

    Department of Agricultural and Bioresource Engineering

    Department of Computer Science

    Dr. D.S. Jayas Department of Biosystems Engineering University of Manitoba Winnipeg, Manitoba, R3T 5V6

  • MACHINE VISION CLASSIFICATION OF PISTACHIO NUTS

    USING

    PATTERN RECOGNITION AND NEURAL NETWORKS

    Machine vision-based sorting of agricultural commodities is an alternative to the conventional mechanical and electro-optical sorting methods. This method offers high-speed, multi-category classification by processing multiple features obtained through image processing algorithms. The purpose of this thesis was to select an appropriate set of features and to investigate different classification schemes for efficient machine vision-based sorting of pistachio nuts.

    Kerman cultivar pistachio nuts obtained from California were used in this study. A sample of nuts was weighed and manually sorted into four classes: "Grade One" (G1), "Grade Two" (G2), "Grade Three" (G3), and "unsplit nuts" (UN). Each class consisted of 260 nuts. Morphological features (area, length, width, perimeter, and roundness), Fourier descriptors (FD's) of the boundary, and gray-level histograms were extracted from images of the nuts using a Macintosh-based machine vision system and commercial image processing software.

    The discrimination power of the individual sets of features for separating the four classes was investigated using Gaussian classifiers. The morphological features and FD's resulted in relatively low classification accuracies. The gray-level histograms yielded an average classification accuracy of 98.5%. Analysis of the classification results indicated that morphological features had a high potential for separating G1, G2, and G3 from each other, and that the FD's had high discrimination power for separating the split nuts from the unsplit.

  • Different feature selection methods, including forward selection, backward elimination, Fisher criterion, and graphical analysis, were applied to select a suitable subset of features. The feature selection results indicated that a combination of seven selected FD's and the area (7FD's & A), or a combination of the frequency of the gray level 56 and the area (GL-56 & A), was suitable for separating the four classes. The selected features were used as input to different classifiers such as Gaussians, decision trees, multi-layer neural networks (MLNN), and multi-structure neural networks (MSNN). A procedure for calculating the computational complexity of the classifiers was developed. The classifiers were compared in terms of performance and computational complexity.

    A decision tree classifier using GL-56 & A resulted in 91.7% classification accuracy. The same features using MLNN and MSNN resulted in 92.4% and 93.2% accuracy, respectively. GL-56 & A using a Gaussian classifier resulted in an overall classification accuracy of 89.6%. Using 7FD's & A, the classification accuracies were 82.8%, 88.7%, 94.1%, and 95.0% for Gaussian, decision tree, MLNN, and MSNN classifiers, respectively.

    The decision tree classifiers required the least amount of computational time, but relied heavily on the threshold values supplied by the user. The neural network classifiers, in sequential executions, required higher computational time, but in terms of classification accuracy they were superior to the statistical classification methods. The MSNN classifiers were the most suitable method for this multi-category classification problem. These classifiers learned their input-output mapping faster and were more robust when compared to MLNN classifiers.

  • COPYRIGHT

    The author has agreed that the Library, University of Saskatchewan, may

    make this thesis freely available for inspection. Moreover, the author has agreed

    that permission for extensive photocopying of this thesis for scholarly purposes

    may be granted by the professor or professors who supervised the thesis work

    recorded herein or, in their absence, by the Head of the Department or the Dean

    of the college in which the thesis work was done. It is understood that due

    recognition will be given to the author of this thesis and to the University of

    Saskatchewan in any use of the material in this thesis. Copying or publishing or

    any use of the thesis for financial gain without approval by the University of

    Saskatchewan and the author's written permission is prohibited.

    Request for permission to copy or to make any other use of the material in

    this thesis in whole or in part should be addressed to:

    Head of the Department of Agricultural and Bioresource Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, Saskatchewan S7N 5A9, Canada

  • Machine vision-based sorting of agricultural commodities is an alternative to the conventional mechanical and electro-optical sorting methods. This method offers high-speed, multi-category classification by processing multiple features obtained through image processing algorithms. The purpose of this thesis was to determine an appropriate set of features and to investigate different classification schemes for efficient machine vision-based sorting of pistachio nuts.

    Kerman cultivar pistachio nuts obtained from California were used in this study. A sample of nuts was weighed and manually sorted into four classes: "Grade One" (G1), "Grade Two" (G2), "Grade Three" (G3), and "unsplit nuts" (UN). Each class consisted of 260 nuts. Morphological features (area, length, width, perimeter, and roundness), Fourier descriptors (FD's) of the boundary, and gray-level histograms were extracted from images of the nuts using a Macintosh-based machine vision system and commercial image processing software.

    The discrimination power of the individual sets of features for separating the four classes was investigated using Gaussian classifiers. The morphological features and FD's resulted in relatively low classification accuracies. The gray-level histograms yielded an average classification accuracy of 98.5%. Analysis of the classification results indicated that morphological features had a better potential for separating G1, G2, and G3 from each other, while the FD's had a higher discrimination power for separating the split nuts from the unsplit.

  • Different feature selection methods, including forward selection, backward elimination, Fisher criterion, and graphical analysis, were applied to select a suitable subset of features. The feature selection results indicated that a combination of seven selected FD's and the area (7FD's & A), or a combination of the frequency of the gray level 56 and the area (GL-56 & A), was suitable for separating the four classes. The selected features were used as input to different classifiers such as Gaussians, decision trees, multi-layer neural networks (MLNN), and multi-structure neural networks (MSNN). A procedure for calculating the computational complexity of the classifiers was developed. The classifiers were compared in terms of performance and computational complexity.

    A decision tree classifier using GL-56 & A resulted in 91.7% classification accuracy. The same features using MLNN and MSNN resulted in 92.4% and 93.2% accuracy, respectively. GL-56 & A using a Gaussian classifier resulted in an overall classification accuracy of 89.6%. Using 7FD's & A, the classification accuracies were 82.8%, 88.7%, 94.1%, and 95.0% for Gaussian, decision tree, MLNN, and MSNN classifiers, respectively.

    The decision tree classifiers required the least amount of computational time, but relied heavily on the threshold values supplied by the user. The neural network classifiers, in sequential executions, required higher computational time, but in terms of classification accuracy they were superior to the statistical classification methods. The MSNN classifiers were the most suitable method for this multi-category classification problem. These classifiers learned their input-output mapping faster and were more robust compared to MLNN classifiers.

  • ACKNOWLEDGMENT

    Praise be to God, the Beneficent, the Merciful.

    I would like to express my appreciation and gratitude to the individuals who assisted, encouraged, guided, and supported me throughout my education. Special appreciation is expressed to my Ph.D. advisors, Professor J. Irudayaraj (Dept. of Bio. & Irrig. Engin., Utah State University) and Professor D. Wulfsohn (Dept. of Ag. & Bio. Engin.), for their excellent supervision and advice.

    My deepest thanks and gratitude are extended to the members of my advisory and examining committees: Professor E. Barber (Dept. of Ag. & Bio. Eng.), Professor R. Bolton (Dept. of Electrical Eng.), Professor R. Ford (Dept. of Ag. & Bio. Eng.), Professor A. Kusalik (Dept. of Computer Science), and Professor D.S. Jayas (Dept. of Biosystems Eng., Univ. of Manitoba), the external examiner, for their guidance and support.

    Many thanks to Professor S. Sokhansanj (Dept. of Ag. & Bio. Eng.) for use of the Bioprocess facilities; to Mr. M. Romaniuk, former research engineer (Dept. of Ag. & Bio. Eng.), for his assistance in using the machine vision system; to Professor H.C. Wood (Dept. of Electrical Eng.) for use of the ANN software; and to the personnel and graduate students at the Department of Agricultural and Bioresource Engineering, for their support.

    Appreciation is further extended to the Ministry of Culture and Higher

    Education, Islamic Republic of Iran, for granting a Ph.D. scholarship to cover my entire living and educational expenses.

    Thanks are due to the California Pistachio Commission and Paramount Farms for supplying the pistachio nuts required by this project, as well as various brochures on pistachio nuts.

    Finally, I would like to express my appreciation to my beloved wife, Shahin, and to my two little daughters, Nadia (8) and Farida (6), for their understanding, patience, and support, and for being so nice.

    God bless all of you.

  • TABLE OF CONTENTS

    COPYRIGHT
    ABSTRACT
    ACKNOWLEDGMENT
    TABLE OF CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    LIST OF FREQUENTLY USED ABBREVIATIONS

    1 INTRODUCTION AND OBJECTIVES
      1.1 Introduction
      1.2 Objectives
      1.3 An Overview of The Thesis Chapters

    2 BACKGROUND AND REVIEW OF LITERATURE
      2.1 Pistachio Nuts
        2.1.1 Cultural Practices
        2.1.2 Post-Harvest Processing of Pistachio Nuts
        2.1.3 USDA Standards for Grading Pistachio Nuts
      2.2 Inspection by Machine Vision
      2.3 Image Acquisition
      2.4 Classification Features
        2.4.1 External Image Features
          2.4.1.1 Morphological Features
          2.4.1.2 Fourier Descriptors
          2.4.1.3 Boundary Sequences
          2.4.1.4 Boundary Chain Codes
        2.4.2 Internal Image Features
          2.4.2.1 Moments
          2.4.2.2 Textural Features
          2.4.2.3 Gray-Level Histograms
      2.5 Feature Selection
        2.5.1 Interclass/Intraclass Method
        2.5.2 Forward Selection

      3.6 Primary Classifications Using Gaussian Classifier
        3.6.1 Gaussian Classifier
      3.7 Feature Selection
        3.7.1 Feature Selection for Morphological Features
        3.7.2 Feature Selection for FD's
        3.7.3 Feature Selection for Gray-level Histograms
      3.8 Classification Using Selected Features
        3.8.1 Decision Tree Classifiers
        3.8.2 Neural Network Classifiers
          3.8.2.1 Selecting a Network Topology
          3.8.2.2 Selecting a Learning Step
          3.8.2.3 Training a Neural Network
        3.8.3 MSNN Classifier
      3.9 Classification Performance Evaluation
      3.10 Computational Complexity Calculations
        3.10.1 Computational Complexity for Gaussian Classifier
        3.10.2 Computational Complexity for MLNN and MSNN
      3.11 Summary

    4 RESULTS AND DISCUSSION
      4.1 Introduction
      4.2 Random Position Experiments
        4.2.1 Classifications Using Morphological Features
        4.2.2 Classifications Using FD's
        4.2.3 Conclusions from Random Position Experiments
      4.3 Controlled Position Experiments
        4.3.1 Classification Using Morphological Features
        4.3.2 Classification Using Fourier Descriptors
        4.3.3 Classification Using Selected FD's and Area
        4.3.4 Classification Using Gray-level Histogram Data
      4.4 Exploring the Training Behavior of MLNN
        4.4.1 Learning Step
        4.4.2 Data Arrangement
        4.4.3 Effects of Network Topology on Training
      4.5 Classification of Pistachio Nuts by MLNN
        4.5.1 GL-56 & A
        4.5.2 7FD's & A
      4.6 Classification Using MSNN
        4.6.1 GL-56 & A
        4.6.2 7FD's & A
      4.7 Comparison of Classifiers' Performances
        4.7.1 Performance of Gaussian Classifiers
        4.7.2 Performance of Decision Tree Classifiers
        4.7.3 Performance of the MLNN Classifiers
        4.7.4 Performance of MSNN Classifiers
      4.8 Summary

    5 SUMMARY AND CONCLUSIONS
      5.1 Summary
      5.2 Conclusions
      5.3 Recommendations

    6 REFERENCES

    APPENDIX A: USDA STANDARDS FOR GRADING PISTACHIO NUTS
    APPENDIX B: DELTA LEARNING RULE AND ERROR BACK-PROPAGATION FOR MULTI-LAYER NEURAL NETWORKS
    APPENDIX C: CHI-SQUARE TEST FOR GOODNESS OF FIT

  • LIST OF FIGURES

    2.1 The main post-harvest processing of pistachio nuts
    2.2 An optical nut-sorter (ESM International Inc., Houston, TX, USA)
    2.3 A schematic representation for a classification process by machine vision
    The concept of interclass and intraclass distances
    Bayes decision rule for multi-class problems
    Machine classification of patterns using discrimination ...
    A mathematical model for a neuron
    Typical activation functions for neurons: (a) sigmoid, (b) hard ...
    Mathematical model of a perceptron
    A multi-category classifier using c discrete perceptrons
    A typical three-layer neural network
    The general classification system
    The machine vision system. 1: Back-lighting device, 2: Camera, 3: Monitor, 4: Computer, 5: Object, 6: Image of the object
    The polygonal representation of the boundary of an image
    A typical MSNN for four-class classification
    A typical discriminator of a MSNN
    The performance of individual morphological features in recognizing the three classes of G1, G2 and G3
    Area/mass correlation of split pistachio nuts
    Area histograms: (a) G1, (b) G2, (c) G3
    The mean of the first 15 harmonics of the four classes of pistachio nuts
    A typical gray-scale image of a pistachio nut
    The mean of the first 15 harmonics of split and unsplit nuts
    The proposed classification scheme using 7FD's & A
    The gray-level distribution of the four classes
    Increase in accuracy of classification by increasing the number ...
    Tree classification scheme using GL-56 & A data
    4.11 The effect of learning step in training a MLNN
    4.12 Effect of data arrangement on the training of MLNN
    4.13 A typical configuration of a data file for MLNN
    4.14 A typical data file for the MSNN classifiers
    4.15 The location of the G1, G2 and G3 classes on a hypothetical normal ...

  • LIST OF TABLES

    1.1 USDA standards for size grading of the pistachio nuts (California Pistachio Commission, 1995)
    4.2 Performance of the Gaussian classifier using morphological features
    4.3 Gaussian classification of G1, G2 and G3 using area
    4.4 Performance of the Gaussian classifier using FD's and Area
    4.5 Performance of the decision tree classifier using 7FD's & A
    4.6 Performance of the Gaussian classifier using 7FD's & A
    4.7 Performance of the Gaussian classifier using gray-level ...
    4.8 Performance of the decision tree classifier using GL-56 & A
    4.9 Performance of the Gaussian classifier using GL-56 & A
    4.10 The performance of different network topologies under ...
    4.11 MLNN classification using 6-5-4 network and transformed data of GL-56 & A
    4.12 Classification using 10-5-4 network with 7FD's & A
    4.13 Classification results of MSNN with GL-56 & A
    4.14 The classification results of MSNN using 7FD's & A
    4.15 MLNN performance using GL-56 & A with a threshold of 0.6 for the network output

  • LIST OF FREQUENTLY USED ABBREVIATIONS

    MLNN        Multi-layer neural network
    MSNN        Multi-structure neural network
    UN          Unsplit pistachio nuts
    G1          Pistachio nuts, Grade one
    G2          Pistachio nuts, Grade two
    G3          Pistachio nuts, Grade three
    FD's        Fourier descriptors
    7FD's & A   Seven selected FD's and area
    GL-56 & A   Gray level 56 and area

  • CHAPTER 1

    INTRODUCTION AND OBJECTIVES

    Harvested pistachio nuts contain a considerable number of empty,

    undeveloped, and unsplit shells due to factors such as unfavorable climate,

    incomplete pollination, lack of nutrition, and disease (Woodroof 1967).

    However, consumer demand is for large, in-shell, split pistachio nuts. The

    United States Department of Agriculture (USDA) standards for pistachio nuts

    designate size grades of "Extra Large", "Large", "Medium", and "Small" for

    these nuts (Table 1.1). The standards also identify as degrading factors the

    existence of foreign materials, and damaged and unsplit nuts.

    Table 1.1 USDA standards for size grading of the pistachio nuts (California Pistachio Commission, 1995)

        Size grade     Nuts per 28.35 g (1 oz)
        Extra Large    20 or less
        Large          21 - 25
        Medium         26 - 30
        Small          31 or more

  • Currently, on-farm separation of split from unsplit pistachio nuts is

    accomplished by flotation methods. Primary sorting of the nuts is usually

    done using mechanical devices such as screens, aspirators, and gravity

    separators (Kader 1985). Electro-mechanical devices, optical sorters, and

    manual sorting are used in processing plants for further grading of the nuts.

    To implement the grades indicated in Table 1.1, a USDA inspector will take

    random samples from a lot and grade the entire lot based on personal

    experience. Industries also have their own standards for packaging and

    marketing these nuts.

    Inspection and sorting by human labor is a subjective and time

    consuming process. It also becomes cumbersome after a prolonged period of

    time. On the other hand, mechanical sorting is not precise, and because of

    direct contact can cause damage to the nuts. Optical sorting devices utilize

    light with certain wavelengths reflected from a product to assess its quality.

    However, such optical devices cannot be used for size or shape grading.

    Inspection and grading of pistachio nuts by machine vision is an

    attractive alternative to conventional methods because it offers the potential

    for high speed, non-destructive classification of the nuts using a single

    machine. In this process a charge-coupled device (CCD) provides analog

    signals of an object to a computer where the signals are digitized and stored as

    an image. Further processing can be done on the image to extract quantitative

    information to be used as input to a classification algorithm.

    The continual improvement of price/performance of digital computers

    has made it practical to automate visual inspection in many areas. As vision

    technology continues to develop and industry becomes increasingly aware of

  • its potential, computer vision will find many new applications. Much of the

    on-going research in food and agricultural processing is focused on the

    application of machine vision to quality control. Examples include maturity

    detection of peanuts (Ghate et al. 1993), sorting of dried prunes (Delwiche et al.

    1993), and potato inspection (Tao et al. 1990). These industries are extremely

    competitive, hence efficiency and quality are primary means to increase

    market share and profit. Automation is not a luxury in these industries, but

    an essential requirement.

    On the other hand, agricultural products present a challenge in

    inspection technology because of the wide variabilities present in properties

    used for assessing qualities and grades. Machine vision systems are suitable for

    inspecting rigid or predefined objects such as machine tools and metal parts.

    However, visual characteristics of agricultural products such as color, shape,

    size, and texture are difficult for a machine vision system to discern. It is even

    harder to assess quality based on visual processing of these features.

    Artificial neural networks, resembling biological nervous systems, have

    proved to be robust in dealing with ambiguous data and the kind of problems

    that require the interpolation of large amounts of data. Neural networks,

    instead of sequentially performing a program of instructions, explore many

    competing hypotheses simultaneously using massive parallelism (Lippmann

    1987). In addition, neural networks have the potential for solving problems in

    which some inputs and corresponding output values are known, but the

    relationship between the inputs and outputs is not well understood. These

    conditions are commonly found in agricultural inspection problems.

  • Pattern recognition has emerged as an important application of artificial

    neural networks. One of the most important attributes of neural network

    classifiers is their capability to approximate the a posteriori distribution of their

    training samples through learning and adaptation. This ability makes them

    unique among pattern classifiers. The application of machine vision, coupled

    with neural network capabilities, seems to offer promise for inspecting

    agricultural products.

    Grading pistachio nuts using machine vision in conjunction with

    pattern recognition techniques, including neural networks, offers many

    advantages over the conventional optical or mechanical sorting devices.

    Multiple sensors can be used to gather the necessary information from the

    nuts and send suitable signals to a computer where they can be decoded for

    multi-category classification. Image processing algorithms can be used to

    extract higher-level information from the input signals for improved

    classification performance. The classification parameters can be easily

    modified to take into account annual variations in the product. When neural

    networks are used as pattern classifiers, the sorting device can be equipped

    with a training option through which the machine can be trained for

    recognizing new grades or for different products.

    An extensive literature search and direct communication with

    industrial sources have indicated that no pattern recognition machine or

    neural network-based system has been used for sorting or grading of the

    pistachio nuts. Bench-mark studies are thus needed to develop an efficient

    and practical machine vision-based method for grading pistachio nuts.

    Research must be conducted to determine suitable image features and proper

    classification methods for accurate grading of pistachio nuts.

  • The main objective of this research was to study the feasibility of

    classifying pistachio nuts into four classes of "Grade One" (G1), "Grade Two"

    (G2), "Grade Three" (G3), and "Unsplit" (UN) using a machine vision system.

    Specific goals were:

    1. To investigate the potential of different image-extracted

    features for classification of pistachio nuts.

    2. To investigate the feasibility of classifying pistachio nuts

    into their appropriate classes using the selected features by:

    a. Designing or selecting appropriate statistical

    pattern classifiers.

    b. Designing suitable multi-layer neural network

    based classifiers,

    3. To compare the performance of the applied classifiers and

    determine an efficient classification technique.

    1.3 AN OVERVIEW OF THE THESIS CHAPTERS

    The material presented in this thesis is organized into five chapters.

    The first chapter addresses the justification, importance, and the statement of

    the objectives of the research. Chapter Two of this thesis begins with a brief

    look at cultural and processing activities for pistachio nuts. It continues with

    an explanation of the principles of machine vision and pattern recognition

    methods. Examples of application of these principles in different areas, with

  • an emphasis on agricultural engineering applications are given throughout

    the text. A complete background on multi-layer neural networks and their

    relationship to statistical pattern classifiers is presented as well. Chapter Two

    concludes with a presentation of some examples of application of neural

    network classifiers to selected problems in agricultural engineering.

    Chapter Three begins with a brief explanation of the methods followed

    to perform the classification procedures. Then a detailed description of the

    image acquisition system is presented. Features, feature extraction and

    classification methods, classifiers and the methods for evaluating their

    performance are also described in detail.

    Results and discussion of different classification methods are presented

    in Chapter Four. The presentation of the results follows the flow of

    experiments starting with classification of the nuts using bulk features and

    evolving towards more suitable features. The discussion covers the source of

    variation in different feature types and their possible limitations and merits.

    Also, the performances of different classifiers are compared and the reasons for

    the variations in their performance are explained.

    Chapter Five includes a summary of the research, methods, findings

    and the conclusions made from the experimental results. Finally, suggestions

    for future research are proposed.

  • CHAPTER 2

    BACKGROUND

    AND

    REVIEW OF LITERATURE

    2.1 PISTACHIO NUTS

    The pistachio tree belongs to the family Anacardiaceae, of which the

    cashew, mango, sumac, and poison oak are also members. There are about a

    dozen species of Pistacia, a few of which produce small nuts. Only Pistacia vera,

    however, yields the acceptable, larger edible nuts of commercial value. The

    nuts of Pistacia vera are known as "pesteh" in Persia, from which the name

    "pistachio" is derived (Rosengarten 1984). The pistachio tree is deciduous, and

    dioecious. One male tree is usually adequate to pollinate eight to ten trees. The

    male trees are planted in the orchard in a way that takes advantage of the

    prevailing winds for pollination.

    The pistachio tree attains a height of 2 to 8 m, with a spreading top. It is

    native to western Asia (Woodroof 1967). The nut has a light-colored woody

    shell. The kernel, which is light greenish-yellow, has a sweet and delicate

  • flavor. The kernels are eaten raw or are roasted with salt and different spices.

    They are also used for flavoring in cookery and confectionery.

    2.1.1 Cultural Practices

    The pistachio is a dry-climate tree and requires little irrigation. Because

    of the deeper soil penetration of its root system, the pistachio tree is better able

    to withstand lack of soil moisture in the upper soil area than others of the

    commonly cultivated fruit or nut trees. The pistachio tree thrives best in areas

    having cool enough winters for proper breaking of bud dormancy and long,

    hot, dry summers for proper ripening of the nuts.

    Pistachio trees begin nut production at about six or seven years of age,

    but full bearing is not attained until the fifteenth to twentieth year

    (Rosengarten 1984). Under favorable conditions, pistachio trees live and

    produce for centuries. In the future, the pistachio nut trees could play an

    important role in the development of some arid regions where rainfall

    precludes the successful growing of almost any other commercial crop.

    Pistachio nuts are borne in clusters. The seed color ranges from light

    yellow to deep green throughout, and the leathery husk shows different

    shades of yellow, red, and purple. It is peculiar to the pistachio nut that, when

    ripe, a large portion of the nuts split along the shell suture. This splitting

    property is desirable since pistachio nuts are usually marketed in-shell, to be

    opened by the consumer (Woodroof 1967).

    The time to harvest pistachio nuts is critical. Earlier harvesting results

    in a higher percentage of undeveloped kernels, lower nutritional quality, and

    loss in yield. Delayed harvesting causes increased shell staining, losses in

  • kernel quality, and increased incidence of insect infestation and fungal attack

    (Woodroof 1967). The easiest indicator used to determine optimum harvest

    time is when the hull slips from the shell when the fruit is pressed between

    thumb and fingers. In the Central Valley of California this normally occurs

    during the first two weeks of September.

    In California, the pistachio nuts mature in late August through late

    September. At maturity the external hull of the fruit changes from light green

    to a pale straw or whitish, opaque appearance, at the same time softening and

    loosening itself from the stony inner part of the ovary wall, which is the gray-

    white inner shell. Thus, it is easily slipped off by pressing between the fingers

    (Woodroof 1967).

    The nuts are harvested by hand picking, knocking them off the tree

    with poles, or by shaking the tree by use of mechanical shakers. When

    harvested, the nuts are in hulls and usually contain 40 to 45% moisture

    (Woodroof 1967). Proper management of harvesting and post-harvest

    handling procedures of pistachio nuts is important to attain maximum yield

    of good quality nuts which in turn determine marketability and profit.

    2.1.2 Post-Harvest Processing of Pistachio Nuts

    The pistachio nuts, like many other agricultural products, require

    a considerable amount of post-harvest processing. The main activities for post-

    harvest handling of pistachio nuts are given in Fig. 2.1. Pistachio nuts are

    covered with a tight hull when immature. At the time of harvest the hull

    softens and loosens itself from the shell. After harvest the hull must be

    removed as soon as possible to reduce the chance for fungal growth and to

  • avoid shell staining. Each hour after the nut is off the tree, the hull degrades.

    If the hull is not removed within 24 hours of harvesting, the deteriorating

    hull will stain the shell so badly that the nut will no longer be marketable in

    the prime market. Abrasive peeling machines are used to remove the hulls

    (Kader and Maranto 1985).

    Fig. 2.1 The main post-harvest processing of pistachio nuts.

    The dehulled nuts always contain a considerable number of unsplit and

    blank nuts. Floatation techniques are used to separate these nuts from the split

    nuts, since immediately after harvesting and dehulling the unsplit nuts tend

    to float. During the floatation process, nuts are immersed in a tank of water

    and agitated for about 10 minutes. Once the agitation is ceased, the "sinkers"

    are primarily the splits and the "floaters" are the unsplits and blanks. Since

    the floatation process is not accurate, some amount of unsplit and low

  • quality nuts always remain. After floatation, the nuts are washed to clean the

    surface from the residuals left during dehulling.

    The moisture content of nuts increases as a result of the floatation

    process. To prevent spoilage and increase storage life, the dehulled nuts must

    be dried to about 5-7% moisture content (Kader et al. 1980). The most common

    methods for drying pistachio nuts are sun drying and heated-air drying.

    Further on-farm sorting of the nuts is usually done by electro-mechanical

    devices such as aspirators, air streams, gravity separators, shakers, and sieves.

    The dried pistachios are usually stored on the farm for a period of 2 to 6

    months before being shipped to a processing plant or before being exported.

    The nuts are sold by mass and the price is set by the grade of the nuts and the

    availability. The USDA has established standards for different grades of

    pistachio nuts (California Pistachio Commission 1994). These standards are

    briefly described in the next section.

    Many nut-processing plants have their own standard for marketing

    pistachio nuts. Sorting, grading, roasting, dyeing, inspecting and packaging are

    some of the unit processes which may take place in a given plant. Pistachio

    companies are very reluctant to release any information on the methods

    and equipment for processing these nuts. However, based on personal

    communication with the North American manufacturers of nut sorting

    devices, some pistachio companies use optical sorting machines for detecting

    the unsplit, contaminated and stained nuts. Bichromatic-infrared sensors are

    commonly used for detecting the stained or the contaminated nuts and

    monochromatic sensors are used for detecting the unsplit nuts.

  • A diagram of an optical sorting machine used for sorting pistachio nuts

    is given in Fig. 2.2. The unsorted nuts are poured into the hopper of the

    machine and the feeding mechanism gradually distributes the nuts between

    several channels. Each channel is equipped with two (180° apart) or three (120°

    apart) optical sensors. As the nuts slide down the channels, they pass through

    an illuminated tunnel where the optical sensing elements are mounted. The

    sensors send signals to electronic circuitry which activates a pneumatic ejector.

    These machines are capable of processing approximately one metric tonne of

    unsorted nuts per hour.

    Fig. 2.2 An optical nut-sorter (ESM International Inc., Houston, TX, USA).

  • Pistachio production in the USA is a young and rapidly growing

    industry. Within the past decade there have been a few published research

    projects on the engineering aspects of pistachio processing. Pearson et al. (1993)

    measured the physical properties of early-split and normal-split pistachios to

    determine a sorting criterion. Hsu et al. (1991) determined some physical and

    thermal properties of pistachio nuts. Farsaie et al. (1981) developed an

    automatic electro-optical sorter for removing aflatoxin-contaminated pistachio

    nuts. The device detected the defective nuts excited by an incident beam of long-

    wavelength ultraviolet light.

    2.1.3 USDA Standards for Grading Pistachio Nuts

    In 1994 the USDA approved a standard for grading pistachio nuts

    (California Pistachio Commission 1994). The standards define separate criteria

    for "in shell" and "shelled" nuts. A copy of these standards is reproduced and

    presented in Appendix A. Since the nuts used in this research were all "in

    shell", the grades of these nuts as approved by the USDA are briefly explained.

    Pistachio nuts, regardless of their size, can be graded as "US Fancy", "US

    No. 1", "US No. 2", or "US No. 3". The basic requirement for these grades is

    that they must be free from foreign material, blanks, stains, unsplit shells,

    mold, insects, and nuts having diameters less than 10.0 mm as measured

    using a round-hole screen. The difference between these grades lies in the

    allowable tolerances for the above-mentioned factors.

    As was shown in Table 1.1, based on size, pistachio nuts may be graded

    as "Extra Large", "Large", "Medium", and "Small". The specification also

    indicates that the average number of nuts in a grade should not exceed one-

  • half nut above or below the specifications given for that grade. For example, a

    sample of Large grade pistachios may have an average number of nuts from

    20.5 to 25.5 per 28.35 grams (1 oz). To ensure the uniformity within a grade and

    to put a limit on the number of smaller nuts in a grade, the specification

    indicates that the mass of 10% by count of the largest nuts in a sample should

    not exceed 1.7 times the mass of 10% by count of the smallest nuts.
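    The two tolerances just described reduce to simple arithmetic. The sketch below (Python) shows how the count and uniformity checks might be applied to a sample; the grade boundaries follow Table 1.1 and the rules above, while the function names and the example masses are hypothetical.

        # Hypothetical sketch of the USDA size-grade checks described above.
        # Grade boundaries follow Table 1.1 (nuts per 28.35 g); the +/-0.5 nut
        # tolerance and the 1.7x mass-uniformity rule follow the text above.

        GRADE_LIMITS = {            # (low, high) nuts per ounce
            "Extra Large": (0, 20),
            "Large": (21, 25),
            "Medium": (26, 30),
            "Small": (31, float("inf")),
        }

        def meets_grade(avg_nuts_per_oz, grade):
            # The average count may fall up to half a nut outside the nominal range.
            low, high = GRADE_LIMITS[grade]
            return (low - 0.5) <= avg_nuts_per_oz <= (high + 0.5)

        def is_uniform(nut_masses):
            # Mass of the largest 10% (by count) must not exceed 1.7 times the
            # mass of the smallest 10% (by count).
            ranked = sorted(nut_masses)
            n10 = max(1, len(ranked) // 10)
            return sum(ranked[-n10:]) <= 1.7 * sum(ranked[:n10])

        # A "Large" sample averaging 25.4 nuts/oz is acceptable (upper limit 25.5).
        masses = [0.90, 0.95, 1.00, 1.00, 1.05, 1.05, 1.10, 1.10, 1.15, 1.20]
        print(meets_grade(25.4, "Large"), is_uniform(masses))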

    2.2 INSPECTION BY MACHINE VISION

    Machine vision is a technology that has arisen from a union between

    camera and computer. Figure 2.3 presents a block diagram of the hardware

    and software components of a typical classification machine. A video camera

    acts as an eye to a machine vision system (Batchelor et al. 1985). Analog signals

    generated by the camera are digitized into a sequence of numbers and stored as

    an image in the computer. Image processing algorithms are used to extract a

    pattern from the image to represent the object. The pattern is classified by a

    classification algorithm which in turn may generate a signal to activate an

    actuator to direct the object into its proper route.
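    Conceptually, the chain just described is a short loop: digitize a frame, extract a pattern, classify it, and fire the appropriate actuator. The sketch below (Python) is schematic only; the feature extractor, classifier, and routing function are made-up stand-ins, not the components used in this thesis.

        import random

        def extract_features(image):
            # Stand-in feature extraction: total intensity and the top-left pixel.
            return (sum(sum(row) for row in image), image[0][0])

        def classify(pattern):
            # Stand-in for a trained classifier (Gaussian, decision tree, MLNN or MSNN).
            return "G1" if pattern[0] > 2000 else "UN"

        def route(label):
            # Stand-in for the pneumatic actuator that diverts the nut to its bin.
            print("route nut to bin:", label)

        # Simulated 4 x 4 digitized frames standing in for camera images.
        for _ in range(3):
            frame = [[random.randint(0, 255) for _ in range(4)] for _ in range(4)]
            route(classify(extract_features(frame)))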

    Machine vision systems have gained tremendous attention for

    inspecting products in different industries and demands for their new

    applications are increasing. Here inspection refers to many industrial tasks

    including defect detection, measuring, locating, detecting orientation, grading,

    sorting, and counting. Machine vision offers many advantages over the

    conventional grading systems. It is compatible with other automated on-line

    processing. It will continue working round the clock and under conditions

    which would be unpleasant or impossible for a human operator. It can take

    dimensional measurements more accurately than a person can estimate by

  • eye, and can give an objective measure of other variables such as color,

    projected area and shape which an inspector could only assess subjectively

    (Batchelor et al. 1985). Since the inspection is done through a non-contacting

    procedure, there is less damage to the products when they are being inspected.

    Fig. 2.3 A schematic representation for a classification process by machine vision.

    Machine vision sorting of agricultural products is more versatile and

    has more maneuverability than existing optical sorters. This method offers

    multiple-feature processing. The features can be the signals sent by different

    sensors or obtained through an algorithm or a combination of the two

    methods. The pattern classification algorithms implemented on a machine

    vision system provide multi-category classification of the products. The

  • classification parameters can be easily changed to improve the classification

    performance or to use the same machine for a different product without any

    change in its hardware.

    2.3 IMAGE ACQUISITION

    The first problem in any automated inspection task is to obtain a good

    image of the object under investigation. Image acquisition involves collection

    of images and their transmission to an associated processing system. There is

    no substitute for a high quality image. The first step in achieving this is to

    provide proper illumination for the object. Improper illumination often

    causes key features to be obscured by glare, or may reduce the intensity of

    light reaching the detector. The best method of illumination is not always

    obvious.

    The image sensor is the basic element within a camera for capturing

    images. Self-scanned solid-state arrays are the most widely used image sensors

    in these systems. Photodiode arrays, charge-coupled devices (CCD) and charge-

    injection devices (CID) are examples of self-scanned array devices (Batchelor et

    al. 1985). Many industrial inspection problems in which the product moves

    along a transport system require line-scan cameras. Using these cameras, an

    image of an object is built up line by line as it moves past a camera. Linear

    solid state arrays are extremely well suited for such applications. Video

    cameras equipped with self-scanned solid-state sensors and high quality

    photographic lenses and filters are widely used for image acquisition (Batchelor

    1985). The image obtained for processing does not have to be a simple optical

    one. It can be outside the range of visible light. Special infrared cameras are

  • used for thermal imaging, and ultraviolet light has been used in crack

    detection (Batchelor 1985).

    An image acquisition board (frame-grabber) is a device which is

    installed in a computer to digitize the analog signals received from the image

    sensor and store them as an image in computer memory (Data Translation

    Inc. 1988). The analog signal of each sensor pixel is usually digitized by an 8 bit

    buffer producing a 0-255 range for the signal. Therefore a white pixel of an

    image has a value of 255 and a black pixel is stored as a value of 0. The image

    is stored as an array of numbers within the computer memory to be used for

    further processing.
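    As a small illustration of this quantization, the sketch below (Python) maps a normalized analog pixel intensity onto the 0-255 range of an 8-bit buffer; the sample sensor readings are invented.

        def to_8bit(intensity):
            # Quantize a normalized analog intensity (0.0 = black, 1.0 = white)
            # into one of 256 gray levels.
            return max(0, min(255, round(intensity * 255)))

        analog_row = [0.0, 0.25, 0.5, 0.99]     # hypothetical readings for one scan line
        digital_row = [to_8bit(v) for v in analog_row]
        print(digital_row)                      # [0, 64, 128, 252]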

    An image is stored in computer memory as an array of numbers that

    may contain over 300,000 elements. Sequential processing of this information is

    time-consuming and is not feasible for high-speed on-line inspections.

    Therefore, usually some image processing and/or image analysis algorithms

    are applied to the gray scale images to extract some quantitative information

    known as "features". The features are used as inputs to some classification

    algorithm to determine the class of the object.

    When different features of an image are put together they form a vector

    of numbers known as a "pattern". Thus, a pattern is a numerical description

    of an object. Many different methods have been proposed to obtain numerical

    features from a digitized image for the purpose of classification. Pavlidis (1978,

    1980) presented a complete survey of algorithms for shape analysis. He

    distinguished two main categories of image extracted features, namely,

  • external and internal features. They are briefly reviewed in the following

    sections.

    2.4.1 External Image Features

    External image features are those types of features which encode the

    boundary information (Pavlidis 1978). They usually require an image

    segmentation to get the coordinates of the pixels on the external contour of the

    image. The external features do not require any knowledge of the gray level of

    the internal pixels. After image segmentation or edge detection process,

    mathematical procedures are performed on the coordinates of the boundaries

    to extract useful features. Examples of external image features are

    morphological features, Fourier descriptors, boundary chain codes, and

    boundary sequences.

    2.4.1.1 Morphological Features

    Morphology refers to the area of image processing concerned with the

    analysis of images based on their shapes and sizes (Dougherty and Giardina

    1978). Morphological features of a shape are obtained by image segmentation

    and analysis of the image. To get this type of feature one does not need to

    know the coordinates or the gray-level intensity of the pixels within an image.

    Therefore a silhouette image is sufficient to extract these features. Examples of

    these features are area, width, perimeter, and aspect ratio.
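    As an illustration, the sketch below (Python) computes such features from a small binary silhouette stored as a 0/1 array. The perimeter estimate (foreground pixels touching the background) and the roundness formula 4*pi*A/P^2 are common conventions assumed here; they are not necessarily the exact definitions used in this thesis.

        import math

        def morphological_features(img):
            rows, cols = len(img), len(img[0])
            fg = [(r, c) for r in range(rows) for c in range(cols) if img[r][c]]
            area = len(fg)
            length = max(r for r, _ in fg) - min(r for r, _ in fg) + 1
            width = max(c for _, c in fg) - min(c for _, c in fg) + 1

            def on_edge(r, c):
                # A foreground pixel is an edge pixel if any 4-neighbour is background.
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols) or not img[rr][cc]:
                        return True
                return False

            perimeter = sum(on_edge(r, c) for r, c in fg)
            roundness = 4 * math.pi * area / perimeter ** 2
            return {"area": area, "length": length, "width": width,
                    "perimeter": perimeter, "roundness": roundness}

        # Example: a filled 5 x 5 square silhouette.
        print(morphological_features([[1] * 5 for _ in range(5)]))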

    Morphological features have been widely used in automated grading,

    sorting and detecting of grains. Neuman et al. (1987) used images of wheat

    kernels and extracted shape and size characteristics to discriminate different

  • classes and varieties of wheat. Brogan and Edison (1974) used morphological

    features in conjunction with a recursive learning technique and a Bayesian

    decision rule for identification of six different types of grains. Hehn (1991) used

    morphological data for separating canola and mustard seeds.

    Morphological features sometimes are not sufficient for a high

    performance inspection process. These features are combined with other

    appropriate features to achieve higher classification rates. Neuman et al. (1987)

    used a combination of morphological features and Fourier descriptors to

    separate different varieties of wheat. A recent application of morphological

    features in machine vision system examined quality of snacks by a neural

    network (Sayeed et al. 1995). The features considered included area, length,

    width, roundness, and perimeter.

    2.4.1.2 Fourier Descriptors

    Fourier descriptors (FD's) are shape recognition features based on the

    Fourier series expansion of periodic functions. The theory behind these

    features is discussed in detail in Section 3.5.2. The general idea is to represent

    the boundary as a periodic function with a period of 2π. The obtained periodic

    function is then expanded in a Fourier series and its coefficients are calculated.

    Ordinary Fourier coefficients are difficult to use as input to classifiers, because

    they contain factors dependent on size, rotation, and phase angle (Granlund

    1972).

Different methods have been developed to obtain scale- and rotation-invariant FD's for two-dimensional object recognition. Ehrlich and Weinberg (1970) described the contour of an object in terms of the lengths of equispaced radii extending from the centroid to the boundary. They presented the radii as a periodic function of the central angle with a period of 2π; the function was then expanded as a Fourier series and the polar coefficients of the expansion were used as shape descriptors. Granlund (1972) derived the Fourier coefficients from the expansion of the boundary coordinates of the object in the complex plane. Zahn and Roskies (1972) represented a curve as a function of arc length by the accumulated changes in direction of the curve from a starting point on the curve; the function was normalized as a 2π-periodic function and its Fourier coefficients were calculated.
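The following sketch (an illustration, not the implementation used in this thesis; see section 3.5.2 for that) computes Granlund-style Fourier descriptors from an ordered array of boundary coordinates, assuming NumPy; invariance is obtained by dropping the DC term, taking magnitudes, and normalizing by the first harmonic.

import numpy as np

def fourier_descriptors(boundary_xy, n_descriptors=10):
    """Scale-, rotation-, and start-point-invariant Fourier descriptors.

    boundary_xy: (N, 2) array of ordered boundary coordinates.
    Returns the magnitudes of n_descriptors harmonics normalized by the
    first harmonic.
    """
    # Represent the boundary as a complex sequence x + jy and transform it.
    z = boundary_xy[:, 0] + 1j * boundary_xy[:, 1]
    mags = np.abs(np.fft.fft(z))

    # Ignoring the DC term removes translation dependence; magnitudes
    # remove rotation and start-point (phase) dependence; dividing by the
    # first harmonic removes scale dependence.
    return mags[2:n_descriptors + 2] / mags[1]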

Fourier descriptors have been used extensively as shape descriptors in many pattern recognition applications. Segerlind and Weinberg (1973) used FD's obtained with the method of Ehrlich and Weinberg (1970) for grain kernel identification. Persoon and Fu (1986) used FD's obtained by Granlund's (1972) method for character and machine-part recognition. Romaniuk (1994) used FD's obtained by the method of Zahn and Roskies (1972) as input to a neural network for classification of barley seeds.

2.4.1.3 Boundary Sequences

Boundary sequence is a general term for patterns that approximate the boundary of an object. Dubois and Glanz (1986) approximated the boundary of an image as an ordered sequence of the lengths of N equiangular radial vectors projected between the object centroid and the boundary. Ghazanfari and Irudayaraj (1994a) used the same type of sequence for classification of four varieties of pistachio nuts. Gupta and Srinath (1987) also used this type of sequence to derive moments for classification of 2D shapes.
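A minimal sketch of such a radial boundary sequence, assuming NumPy and an ordered array of boundary coordinates; the number of radii is a free parameter and the simple angular binning used here is only one possible implementation.

import numpy as np

def radial_sequence(boundary_xy, n_radii=64):
    """Lengths of n_radii equiangular radial vectors from the centroid to
    the boundary, one per angular bin, normalized to unit maximum."""
    centroid = boundary_xy.mean(axis=0)
    dx = boundary_xy[:, 0] - centroid[0]
    dy = boundary_xy[:, 1] - centroid[1]
    radii = np.hypot(dx, dy)
    angles = np.mod(np.arctan2(dy, dx), 2 * np.pi)

    # For each angular bin keep the mean radius of the boundary points
    # that fall into it, giving one length per direction.
    bins = (angles / (2 * np.pi) * n_radii).astype(int) % n_radii
    sequence = np.zeros(n_radii)
    for k in range(n_radii):
        in_bin = radii[bins == k]
        sequence[k] = in_bin.mean() if in_bin.size else 0.0
    return sequence / sequence.max()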

Bunk and Buhler (1993) used the curvature of the boundary for symbolic representation of the object. In their method a starting point is selected on the boundary and the local change in the curvature of the boundary segments is recorded at equidistant intervals. This method of boundary representation is very accurate, but it results in long sequences which increase processing times.

    2.4.1.4 Boundary Chain Codes

The boundary chain code was first developed by Freeman (1970) to represent the boundary of a plane object as a string of directional codes. Applications of boundary chain codes in machine vision inspection have included fruit stem detection (Wolfe and Sander 1985) and tomato sorting (Sarkar and Wolfe 1985). Boundary chain codes have also been used in image processing algorithms to extract morphological features from an image (Hehn and Sokhansanj 1990; Liu and Srinath 1990).
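For illustration, the sketch below generates a Freeman 8-direction chain code for a boundary that is already ordered and 8-connected; it is not the thesis's implementation.

# Freeman 8-direction codes: 0 = east, then counter-clockwise.
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(boundary_xy):
    """Freeman chain code of an ordered, 8-connected boundary given as a
    list of (x, y) tuples; the curve is treated as closed."""
    codes = []
    n = len(boundary_xy)
    for i in range(n):
        x0, y0 = boundary_xy[i]
        x1, y1 = boundary_xy[(i + 1) % n]
        codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return codes

# Example: a small square traced counter-clockwise.
print(chain_code([(0, 0), (1, 0), (1, 1), (0, 1)]))   # [0, 2, 4, 6]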

2.4.2 Internal Image Features

Internal image features are obtained by analyzing the pixels within the boundary of an image; both the location of a pixel and its gray level may play an important role. Many different types of internal features may be considered for different applications. For example, Gunasekaran et al. (1987) used line detection within an image to indicate the existence of a crack in corn, and Wulfsohn et al. (1993) used the existence of pixels with gray levels above a certain threshold for defect detection in dates. Moments, textural features, and gray-level histograms are examples of internal image features, and they are reviewed briefly below.

    2.4.2.1 Moments

Moments have been among the most commonly used image-extracted features for shape discrimination. Several of the most essential image attributes, such as size, centroid, orientation, spread, and elongation, are directly related to moments (Leu 1991; Hehn 1991). Moment invariants were first proposed and used for pattern recognition by Hu (1962). The major disadvantage of moments is that, although the first few moments convey significant information for simple objects, they fail to do so for more complicated ones (Pavlidis 1978). The computational cost of deriving the moments of an image is also high.

Moments were originally calculated using the locations of the internal pixels of an image. Leu (1991) presented a method for computing moments from the boundary pixels of an image and showed that it was much more efficient than the traditional method. Khotanzad and Lu (1991) used moments for character recognition.
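As an illustration, the sketch below computes conventional region-based central moments, their scale-normalized form, and the first of Hu's (1962) invariants, assuming NumPy; Leu's (1991) boundary-based formulation is not reproduced here.

import numpy as np

def central_moment(img, p, q):
    """Central moment mu_pq of a binary (or gray-level) image."""
    y, x = np.indices(img.shape)
    m00 = img.sum()
    x_bar = (x * img).sum() / m00          # centroid column
    y_bar = (y * img).sum() / m00          # centroid row
    return (((x - x_bar) ** p) * ((y - y_bar) ** q) * img).sum()

def normalized_moment(img, p, q):
    """Scale-normalized central moment eta_pq."""
    mu00 = central_moment(img, 0, 0)
    return central_moment(img, p, q) / mu00 ** (1 + (p + q) / 2)

def hu_first_invariant(img):
    """First of Hu's (1962) moment invariants: eta_20 + eta_02."""
    return normalized_moment(img, 2, 0) + normalized_moment(img, 0, 2)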

2.4.2.2 Textural Features

Texture is a qualitative description of a surface in terms of properties such as fineness, coarseness, smoothness, and granulation. Researchers have investigated many methods for evaluating the texture of an image, but despite its importance a formal approach to texture description does not exist. Early image texture studies employed autocorrelation functions, power spectra, and the relative frequencies of various gray levels (Haralick et al. 1973). The approaches to texture description are mostly ad hoc and generally utilize the gray-level values of the internal pixels of an image in some way (Haralick 1979).

Haralick et al. (1973) suggested 28 textural features which could be extracted from gray-level images. Sayeed et al. (1995) used a subset of these features for texture evaluation of a snack food. Khotanzad and Lu (1991) analyzed gray-level images for texture classification of several different commodities.
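The sketch below illustrates two of the co-occurrence-based features of Haralick et al. (1973), the angular second moment and contrast, assuming NumPy, 8-bit gray levels, and a single displacement of one pixel to the right; a full implementation would average over several displacements.

import numpy as np

def glcm(gray, levels=16):
    """Gray-level co-occurrence matrix for displacement (0, +1),
    i.e. each pixel paired with its right-hand neighbour."""
    q = (gray.astype(float) / 256 * levels).astype(int)   # quantize to 0..levels-1
    left, right = q[:, :-1].ravel(), q[:, 1:].ravel()
    p = np.zeros((levels, levels))
    np.add.at(p, (left, right), 1)
    return p / p.sum()                                     # normalize to probabilities

def texture_features(gray, levels=16):
    p = glcm(gray, levels)
    i, j = np.indices(p.shape)
    asm = np.sum(p ** 2)                       # angular second moment (uniformity)
    contrast = np.sum(((i - j) ** 2) * p)      # contrast
    return {"asm": asm, "contrast": contrast}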

Gray-level histograms are discrete frequency plots of the number of pixels at each gray level. These plots may be viewed as probability density functions, provided the elements of the image are randomly selected (Levine 1985); the gray-level frequency plots of different classes of objects can then be used as their discrimination functions. Das and Evans (1992) used gray-level histogram data for detecting the fertility of hatching eggs. Han et al. (1992) used gray-level histograms obtained from X-ray images for detecting split-pit peaches.
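A small illustrative sketch, assuming NumPy and 8-bit images, of treating the normalized gray-level histogram as a discrete probability density and deriving simple shape parameters from it.

import numpy as np

def histogram_features(gray):
    """Normalized gray-level histogram plus two simple shape parameters."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                       # discrete probability density
    levels = np.arange(256)
    mean = np.sum(levels * p)
    variance = np.sum(((levels - mean) ** 2) * p)
    return p, mean, variance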

The performance of a classification system depends chiefly on selecting an appropriate set of features that best describe their associated classes. Redundant and irrelevant features may degrade classification performance (Devijver and Kittler 1982). Feature selection speeds up processing and increases the reliability of a classifier by eliminating redundant and irrelevant information. A feature selection procedure should extract the most useful information from the representation vector and present it in the form of a pattern vector of lower dimensionality whose elements represent only the most significant aspects of the input data.

Mathematical feature selection techniques fall into two major categories: feature selection in the measurement space and feature selection in a transformed space. The methods in the first category are referred to as "feature selection" methods and those in the second category as "feature extraction" methods. Feature selection in the measurement space is achieved by eliminating those measurements which are redundant or do not contain enough relevant information. In this process a subset X of n features,

X = \{x_1, x_2, \ldots, x_n\}, \qquad n < N,

is selected from the N-feature set Y = \{y_1, y_2, \ldots, y_N\}.

The subset X should be the combination of features from Y that minimizes the classification error (Kittler 1975). Finding the best possible subset would require trying all possible combinations of features, which in most cases is not practical because the required number of trials,

\binom{N}{n} = \frac{N!}{n!\,(N-n)!},

is very large. For practical situations, computationally feasible procedures have been suggested for selecting a sub-optimal subset from the original features. These methods are explained in detail by Kittler (1975) and Devijver and Kittler (1982).

Mucciardi and Gose (1971) compared several different feature selection techniques. They indicated that the error rates of the features selected by any of the investigated techniques were lower than the error rate of randomly selected features. They concluded that all of those feature selection techniques were applicable to most pattern classification problems, but that the choice of a particular selection method depends on ease of implementation, economic considerations, and the particular application. Three feature selection methods, identified for their possible applicability to this research, are briefly explained below.

    2.5.1 Interclass/Intraclass Method

One class of criteria for feature selection in the measurement space, more heuristic in nature, is based on the Euclidean distance between the elements of the class sets. These criteria originate from the intuitive argument that the greater the distance between the elements of different classes, the better the class separability. Based on this argument, a "good" feature should provide a large distance between elements of different classes (the interclass distance) while keeping the distance between elements within a single class (the intraclass distance) as small as possible. This concept is illustrated for a two-class case with two features (x1 and x2) in Fig. 2.4. In this figure d1 is the intraclass distance between a member of class 1 and the mean of that class, M1. The distance between the means of the two classes, d(M1, M2), is the interclass distance.

Fig. 2.4 The concept of interclass and intraclass distances.

Alternatively, it can be stated that good features should have small scatter within their class and large scatter between classes. One way to select the best features is to use the Fisher criterion, defined by

J = (M_1 - M_2)^T (K_1 + K_2)^{-1} (M_1 - M_2)    (2.4)

where K1 and K2 are the class covariance matrices and M1 and M2 are the respective means of the two classes. The term K1 + K2 in Eq. 2.4 is the within-class scatter matrix and (M_1 - M_2)(M_1 - M_2)^T is the between-class scatter matrix (Therrien 1989). Good features should therefore maximize the Fisher criterion: the greater the ratio of interclass to intraclass scatter, the greater the spatial separation of the classes. Features can thus be ranked by this criterion, and those with a higher ratio can be considered good features. A detailed discussion of this criterion and its mathematical formulation is given by Devijver and Kittler (1982).
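As a sketch of how features can be ranked with the Fisher criterion, the code below (assuming NumPy, with two classes stored as samples-by-features arrays) evaluates the criterion feature by feature; in this one-dimensional case Eq. 2.4 reduces to the squared difference of the class means divided by the sum of the class variances.

import numpy as np

def fisher_scores(class1, class2):
    """Per-feature Fisher criterion for two classes.

    class1, class2: (n_samples, n_features) arrays.
    Returns one score per feature; larger is better.
    """
    m1, m2 = class1.mean(axis=0), class2.mean(axis=0)
    v1, v2 = class1.var(axis=0), class2.var(axis=0)
    return (m1 - m2) ** 2 / (v1 + v2)          # squared mean difference / summed variance

def rank_features(class1, class2):
    """Indices of features ordered from best to worst."""
    return np.argsort(fisher_scores(class1, class2))[::-1]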

2.5.2 Forward Selection

In the sequential forward selection method, features are ranked according to their ability to separate the classes using an appropriate classifier. The feature with the highest rank is selected first. If the classification accuracy obtained with this feature is not satisfactory, the feature from the remaining set that gives the highest classification rate in combination with the current feature is added. The procedure continues until the selected set of features yields an acceptable classification accuracy. There are two main drawbacks to this method: first, there is no mechanism for removing a feature that has already been selected; and second, correlation between features is not taken into account (Devijver and Kittler 1982).
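A sketch of the sequential forward procedure; score(features) is a hypothetical callback supplied by the user that returns the classification accuracy obtained with a given subset of feature indices.

def forward_selection(n_features, score, target_accuracy):
    """Sequential forward selection.

    score: callable taking a list of feature indices and returning the
           classification accuracy obtained with those features
           (hypothetical; supplied by the caller).
    Returns the selected feature indices and the accuracy reached.
    """
    selected, remaining = [], list(range(n_features))
    best_accuracy = 0.0
    while remaining and best_accuracy < target_accuracy:
        # Add the feature that most improves accuracy with the current set.
        trial = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(trial)
        remaining.remove(trial)
        best_accuracy = score(selected)
    return selected, best_accuracy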

2.5.3 Backward Elimination

In the sequential backward elimination procedure, the performance of a classifier is first tested with the whole set of features. The feature that gives the lowest classification rate in combination with the rest of the features is then eliminated. The process of successive elimination continues until any further elimination results in an unacceptable classification rate. This procedure has the same drawbacks as the sequential forward method, and both sequential forward and backward selection are time consuming (Mucciardi and Gose 1971). The STEPDISC procedure provided by SAS (Statistical Analysis System, SAS Institute, Inc., Cary, NC, USA) can be used to perform the forward selection and backward elimination methods.
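A corresponding sketch of backward elimination, using the same hypothetical score(features) callback as in the forward-selection sketch above.

def backward_elimination(n_features, score, min_accuracy):
    """Sequential backward elimination using a user-supplied score callback."""
    selected = list(range(n_features))
    while len(selected) > 1:
        # Find the feature whose removal hurts accuracy the least.
        candidate = max(selected,
                        key=lambda f: score([s for s in selected if s != f]))
        if score([s for s in selected if s != candidate]) < min_accuracy:
            break                       # any further elimination is unacceptable
        selected.remove(candidate)
    return selected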

Feature selection performed in a transformed space is known as "feature extraction". Feature extraction methods eliminate the irrelevant information and redundancy in the pattern Y by mapping it into a lower-dimensional pattern E through a transformation T:

E = T(Y).

In general, the map T can be any vector function of Y that maximizes an appropriate separability measure in the feature space (Kittler 1975).

The Karhunen-Loeve expansion is the most widely used method for feature extraction. The idea behind this method is to reduce the dimensionality of the features by creating new features which are linear combinations of the original features (Mucciardi and Gose 1971). Other methods of feature extraction, such as those based on separability measures and those based on non-orthogonal mapping, are explained by Kittler (1975).
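A sketch of the Karhunen-Loeve transform, assuming NumPy: the new features are the projections of the mean-centred patterns onto the eigenvectors of the covariance matrix with the largest eigenvalues.

import numpy as np

def karhunen_loeve(Y, n_components):
    """Project patterns Y (n_samples x N features) onto the n_components
    eigenvectors of the covariance matrix with the largest eigenvalues."""
    Y_centered = Y - Y.mean(axis=0)
    cov = np.cov(Y_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    T = eigvecs[:, order]                           # N x n_components transform
    return Y_centered @ T                           # lower-dimensional patterns E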

Feature extraction methods were not considered for this research for the following reasons. The transformation performed during feature extraction creates a new set of features from the originals; when the transformed features are used as input to a classifier, it is not easy to investigate the behavior of the classifier with respect to changes in the original features. In addition, the use of a feature extractor as a preprocessing step in on-line classification is computationally expensive.

Classifiers are algorithms implemented on digital computers for the purpose of classification. Classification algorithms are developed and applied in two stages. In the first stage, called the training stage, the required classification parameters are estimated from a set of patterns called the "training set". During the second stage, called the test stage, the algorithm uses these parameters to determine the class of a new set of patterns called the "test set". Once a classifier gives acceptable accuracy on the test data, it can be used in real-world applications.

There are many different types of classifiers, and they are categorized and explained in various pattern recognition books and research papers (e.g., Devijver and Kittler 1982; Nilsson 1990). Determining which classifier works best for a particular application usually involves some degree of trial and error. Most classifiers, when applied to a particular problem, give comparable classification accuracies; the real differences between them lie in their time complexity, storage requirements, and precise degree of accuracy (Hush and Horne 1993). A brief review of different classification methods and their applications is given in the following sections.

2.7.1 Bayesian Classifiers

In statistical pattern recognition, a classifier assigns the unknown represented by the pattern X to a class ω according to the Bayes decision rule. For a two-class case the Bayes decision rule is given by

P(\omega_1)\, p(X \mid \omega_1) > P(\omega_2)\, p(X \mid \omega_2) \;\Rightarrow\; X \in \omega_1, \quad \text{otherwise } X \in \omega_2    (2.6)

where p(X|ωi) is the conditional density function and P(ωi) is the a priori probability of class i; here i is equal to 1 or 2. Equation 2.6 states: assign X to class ω1 if P(ω1)p(X|ω1) is larger than P(ω2)p(X|ω2). It has been shown (Duda and Hart 1973) that this decision rule minimizes the probability of error, i.e., the probability of making an incorrect decision. The Bayes decision rule for multi-class classification problems is presented in Fig. 2.5. When classes are separated by "discrimination functions", e.g., normal distribution functions, these functions are used in the Bayes decision rule instead of the a posteriori probabilities.

Fig. 2.5 Bayes decision rule for multi-class problems.

The Gaussian classifier is one of the most frequently used classifiers in pattern classification problems. This classifier, a special case of the Bayes decision rule, assumes that the individual features have a Gaussian distribution (Therrien 1989). To implement it, one needs only to estimate the mean vector and the covariance matrix of each class; these are the parameters of the a posteriori probabilities used in the Bayes decision rule. The classifier assigns the unknown to the class with the higher probability. When the classes have different covariances, the decision boundary may take the form of an ellipsoid, hyperboloid, paraboloid, or some combination of these (Therrien 1989); when the covariance matrices are equal, the decision boundaries between classes reduce to hyperplanes. More details on this classifier are given in section 3.6.1.
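As an illustration (not the implementation described in section 3.6.1), the sketch below evaluates a two-or-more-class Gaussian classifier in log form for numerical stability, assuming NumPy; each class is summarized by its mean vector, covariance matrix, and prior probability.

import numpy as np

def gaussian_log_score(x, mean, cov, prior):
    """log[ P(w_i) * p(x | w_i) ] for a multivariate normal class model."""
    d = np.asarray(x) - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    log_density = -0.5 * (d @ inv @ d + logdet + d.size * np.log(2 * np.pi))
    return np.log(prior) + log_density

def classify(x, class_params):
    """class_params: list of (mean, cov, prior) tuples, one per class.
    Returns the index of the class with the highest posterior score."""
    scores = [gaussian_log_score(x, m, c, p) for m, c, p in class_params]
    return int(np.argmax(scores))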

Brogan and Edison (1974) used a Bayesian decision rule for classifying six different types of grain. Segerlind and Weinberg (1973) used the Mahalanobis distance (see section 3.6.1) for separating kernels of corn, oats, wheat, barley, rye, soybeans, and navy beans from each other. Singh et al. (1993) applied the technique of Bayes minimum-risk classification for defect grading of stonefruit.

In agricultural research it is usually assumed that the variables are normally distributed, and many researchers have therefore applied Gaussian classifiers to the classification of agricultural commodities. The DISCRIM procedure in SAS has been widely used for this purpose. Examples of the use of discriminant analysis include separating early-split from normal-split pistachio nuts (Pearson et al. 1993), evaluating snack quality (Sayeed et al. 1995), and discriminating canola and mustard seeds (Hehn and Sokhansanj 1990).

    2.7.2 Classifiers Using Discrimination Functions

Discrimination functions have the property that they partition the pattern space into mutually exclusive regions, where each region contains the domain of a given class. In classification applications, the a posteriori distributions can be replaced by discrimination functions (Duda and Hart 1973). A classifier is then viewed as a machine that computes C discriminant functions, g1, ..., gC (Fig. 2.6), and a maximum selector assigns the pattern X to the category associated with the largest discriminant.

    Fig. 2.6 Machine classification of patterns using discrimination functions.

Linear discrimination functions are the simplest form of discrimination functions. The decision boundary formed by a linear discrimination function is a line in a two-dimensional feature space, a plane in a three-dimensional feature space, and a hyperplane in a higher-dimensional feature space. The general form of a linear discrimination function is

g(X) = w_0 + w_1 x_1 + w_2 x_2 + \cdots + w_n x_n

where the w's are the weights of the function and the x_i's are the features constituting the multi-dimensional feature space. The function is completely specified by determining the values of the weights. A classifier which implements linear discrimination functions is sometimes referred to as a linear classifier or linear machine.
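A sketch of a linear machine as in Fig. 2.6, assuming NumPy; the weight vectors and bias terms are assumed to have been obtained beforehand by some training procedure.

import numpy as np

def linear_machine(x, weights, biases):
    """Evaluate C linear discrimination functions and pick the largest.

    weights: (C, n_features) array of weight vectors w_i.
    biases:  (C,) array of bias terms w_i0.
    Returns the index of the winning class.
    """
    g = weights @ np.asarray(x) + biases     # g_i(X) = w_i . X + w_i0 for each class
    return int(np.argmax(g))                 # maximum selector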

The selection of a suitable functional form for a discrimination function is a problem of crucial importance, and the theory for constructing discrimination functions has not reached an effective stage. Prior knowledge about the a posteriori distributions of the classes is always helpful in building proper discrimination functions (Devijver and Kittler 1982), and sometimes reasonable guesses are made on the basis of qualitative knowledge about the patterns (Nilsson 1990). An example of the application of linear discrimination functions is the classification of different industrial objects represented by autoregression models (Persoon and Fu 1986).

Piecewise linear discrimination functions have been used for classification of objects whose classes are separated by non-linear decision boundaries. In this method the classifier partitions the feature space into a number of regions using a set of hyperplanes (Nilsson 1990). Haralick et al. (1973) used piecewise linear discrimination functions for texture classification of five kinds of sandstones.

2.7.3 Nearest Neighbor Classifiers

The nearest neighbor classifier (NNC) makes use of the correspondence between similarity and distance, i.e., the smaller the Euclidean distance between patterns the more similar they are (Batchelor 1974). The nearest neighbor decision rule assigns an unknown U to the class of its nearest neighbor X_i:

U \in \text{class}(i) \quad \text{if} \quad d(U, X_i) = \min_{j = 1, \ldots, C} d(U, X_j)    (2.8)

where d(U, X) is a distance measure between U and X and C is the number of classes. Various distance measures (metrics) can be defined for use in Eq. 2.8; the Euclidean, city block, and Mahalanobis distances are examples. A complete discussion of different types of metrics is given by Devijver and Kittler (1982).

The basic idea behind nearest neighbor rules is that samples which fall close together in feature space are likely to belong to the same class. An NNC stores a number of patterns for each class; an unknown is compared with all of the stored patterns and assigned to the class of the pattern that is most similar to it. The decision surface created by an NNC is piecewise linear (Batchelor 1974).

The k-nearest neighbor (k-NN) classifier is an extension of the NNC. The k-NN rule classifies X by assigning it the class most frequently represented among its k nearest samples; in other words, a decision is made by examining the labels of the k nearest neighbors and taking a vote. A complete description of the NNC and k-NN rules is given by Devijver and Kittler (1982).
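A sketch of the NNC/k-NN rule with a Euclidean metric, assuming NumPy; patterns holds the stored training patterns and labels their class labels (k = 1 gives the plain nearest-neighbor rule).

import numpy as np

def knn_classify(u, patterns, labels, k=1):
    """Assign the unknown u to the class most frequent among its k nearest
    stored patterns.

    patterns: (n_stored, n_features) array of training patterns.
    labels:   array of class labels, one per stored pattern.
    """
    labels = np.asarray(labels)
    distances = np.linalg.norm(patterns - u, axis=1)    # Euclidean metric
    nearest = np.argsort(distances)[:k]
    classes, counts = np.unique(labels[nearest], return_counts=True)
    return classes[np.argmax(counts)]                    # majority vote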

    2.7.4 Minimum Distance

The minimum distance classifier, like the NNC, makes use of the correspondence between similarity and distance. In this method, however, a prototype is considered for each class, and an unknown U is assigned to the class i of the prototype M_i which is at the minimum distance from it. That is,

U \in \text{class}(i) \quad \text{if} \quad d(U, M_i) = \min_{j = 1, \ldots, C} d(U, M_j)

where C is the number of classes. Minimum distance classifiers are appropriate in situations where each class is represented by a single prototype pattern around which all other patterns in that class tend to cluster.

Gupta and Srinath (1987) used the minimum distance rule for classification of 2D shapes using contour sequence moments. Persoon and Fu (1986) also used the minimum distance rule, but with Fourier descriptors for 2D shape representation. String matching techniques, which have been used for 2D shape recognition (Bunk and Buhler 1993; Mase 1991), also use the minimum distance classification method.

    2.7.5 Decision Tree Classifiers

Classification trees constitute an important and popular form of hierarchical classifiers. A decision tree classifier utilizes a series of simple decision functions, usually binary in nature, to determine the class of an unknown pattern (Levin 1981). The evaluation of these decision functions starts at the tree's root node and branches out through the internal nodes toward the terminal nodes, in such a way that the outcome of successive decision functions reduces the uncertainty about the unknown pattern. The most common choice for the node decision function is a threshold comparison on a single component of the feature vector. The thresholds are usually determined by examining a training sample, and their accuracies are validated using a test set.

The classification capability of a tree classifier arises from its ability to partition the feature space into complex regions by making a sequence of simple decisions at each node. Advantages of classification trees include low storage requirements, simple decisions at the nodes, and ease of understanding of the classification process. The disadvantages are the abrupt decisions at the nodes, the comparison of continuous features against a threshold to determine branching, uncertainty about the thresholds, difficulty with missing features, and increasing complexity as the tree grows (Gelfand and Delp 1991).
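For illustration, the sketch below implements a small binary decision tree whose node functions are threshold comparisons on single features; the feature indices, thresholds, and class labels are invented for the example and are not taken from this thesis.

class Node:
    """A decision-tree node: either a terminal class label, or a threshold
    test on one component of the feature vector with two children."""
    def __init__(self, label=None, feature=None, threshold=None,
                 below=None, above=None):
        self.label, self.feature, self.threshold = label, feature, threshold
        self.below, self.above = below, above

    def classify(self, x):
        if self.label is not None:                 # terminal node
            return self.label
        child = self.below if x[self.feature] <= self.threshold else self.above
        return child.classify(x)

# Hypothetical two-level tree: split on feature 0, then on feature 2.
tree = Node(feature=0, threshold=5.0,
            below=Node(label="class A"),
            above=Node(feature=2, threshold=1.2,
                       below=Node(label="class B"),
                       above=Node(label="class C")))
print(tree.classify([7.3, 0.4, 0.9]))   # -> "class B"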

There has been considerable research into applying the principles of machine vision and pattern recognition to the classification of agricultural products. One of the pioneering works in this area is the classification of grain kernels by Segerlind and Weinberg (1973). They investigated the feasibility of identifying grain kernels by analyzing their profiles using a Fourier series expansion of the periphery radius. Tests were performed on samples of corn, oats, wheat, barley, rye, soybeans, and navy beans, and the values of the first ten Fourier coefficients (harmonics) were obtained for a training set and a test set. The authors noted that the application of the method to intraspecies discrimination was only partially successful, with errors ranging from 11 to 25%, and that the implementation of the technique was too tedious for routine use. Considering that this research was performed more than two decades ago, the computational burden of this type of work has since been greatly reduced by faster computers and advances in software.

Neuman et al. (1987) acquired images of wheat kernels in plan-form view and processed them to extract kernel shape and size characteristics for discriminating different classes and varieties of wheat. Feature extraction algorithms based on object contours were developed and implemented in FORTRAN 77. In addition to kernel size and shape parameters, contour curvature was quantified in the frequency domain to obtain Fourier descriptors as shape-specific features. Statistical pattern recognition methods were used for the discrimination analysis, which resulted in an overall performance of 87%. This research is important because determining the variety of an agricultural product is more difficult than separating different kinds of products.

Brogan and Edison (1974) used pattern recognition techniques for automatic classification of six different grains. An algorithm based on a recursive learning technique and a Bayesian decision rule was developed, and a prototype device was built for rapid, accurate, and automatic classification of the grains. Of the six grains, corn and soybeans were identified perfectly; wheat, oats, barley, and rye were much more similar to one another and were more likely to be misclassified. An overall accuracy of about 98% was obtained.

A method using machine vision for detecting the fertility of hatching eggs during the third and fourth days of incubation was developed by Das and Evans (1992). Images of eggs were acquired using back-lighting with a high-intensity candling lamp, and parameters describing the shape of the gray-level histograms were estimated. An algorithm was developed using the estimated parameters to distinguish fertile from infertile eggs. The algorithm gave prediction accuracies of 96% using the fourth-day data and 88% using the third-day data.

Han et al. (1992) developed a method for detecting split-pit peaches using a machine vision system. X-ray films of the peaches were placed under a video camera and the gray-level histograms of the images were analyzed. Using the histogram data, a threshold equation was developed to separate the split-pit peaches from the unsplit-pit peaches, and an accuracy of 98% was reported. The limitation of this method was the need for proper orientation of the peaches under the X-ray camera.

Sarkar and Wolfe (1985) developed a prototype tomato sorting machine. Tomatoes moving on a belt conveyor were detected by a photocell which signaled a computer to grab and analyze two views, the stem end and the blossom end. Based on the algorithm output, the computer activated solenoid-operated pneumatic cylinders; the cylinder action raised the appropriate inclined flap and dropped the tomatoes into their corresponding boxes. The authors indicated that the prototype was not suitable for commercial purposes because of speed limitations, although it was not stated whether the limitation was due to computer processing time or to the hardware implementation.

Gunasekaran et al. (1987) applied image processing techniques to detect stress cracks in corn. The algorithm used for stress crack evaluation was similar to a high-pass filtering process. The pixels representing stress cracks had significantly different gray-level values from the pixels of the rest of the kernel surface; the stress-crack pixels were therefore extracted by creating an image in which the gray levels of the stress-crack region were suppressed and subtracting this newly created image from the original image. The success rate was determined by comparing the visual evaluation of the kernels for stress cracks with the corresponding evaluation by the vision system on the same set of kernels. The algorithm performed satisfactorily, detecting stress cracks in 90% of the examined kernels.

A vision system and algorithm were developed to locate fruit on a tree (Sites and Delwiche 1988). Images were obtained using a solid-state camera. Several bandpass optical filters