Student Mini-Camp Project Report
Pattern Recognition
Participant Students and Affiliations
Patrick Choi, Claremont Graduate University
Joseph McGrath, Univ. of Massachusetts, Lowell
Peizhe Shi, University of Washington
Hem Wadhar, UC Los Angeles
Qin Wu, West Virginia University
Flora Xu, Claremont Graduate University
Advisor
Jen-Mei Chang, CSU Long Beach
Problem Statement
Pattern Recognition
Pattern recognition in data is broadly regarded as a sub-category of machine learning, the scientific discipline concerned with designing algorithms that allow a computer to learn from the information it is given.
We worked on a data set containing distinct images of cats and dogs. The first 160 images are labeled as dogs or cats; the remaining 38 images are unlabeled. Our objective is to build a pattern recognition architecture on the known data (labeled dogs and cats), then use that routine to classify the unknown images (unlabeled dogs and cats) correctly as either dogs or cats.
Pattern Recognition
Can we produce an algorithm/technique/method to train and distinguish between cats and dogs?
Image Pre-Processing: Raw, Canny Filtered, 2-D Wavelet Transform
Model: PCA, LDA
Identification
Image Pre-Processing
Raw
The raw data of each 64x64 pixel image was not manipulated; it went directly to either the PCA or LDA method in the next step of the program.
Using the “imread” command in MATLAB, the original TIF images of 80 cats and 79 dogs were read and written to an 80-by-4096 matrix (cats) and a 79-by-4096 matrix (dogs), one image per row.
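As a small illustration of how a 64x64 image becomes one row of 4096 values, the sketch below flattens an image column by column and restores it. It is written in plain Python rather than the report's MATLAB, the function names are hypothetical, and the column-by-column order is an assumption chosen to mirror how MATLAB's reshape fills a matrix.

```python
def flatten_column_major(image):
    """Flatten a list-of-rows image into one vector, column by column."""
    rows, cols = len(image), len(image[0])
    return [image[r][c] for c in range(cols) for r in range(rows)]


def unflatten_column_major(vector, rows, cols):
    """Inverse of flatten_column_major: rebuild a rows x cols image."""
    if len(vector) != rows * cols:
        raise ValueError("vector length does not match rows * cols")
    return [[vector[c * rows + r] for c in range(cols)] for r in range(rows)]


# Tiny 2x3 example so the ordering is easy to follow.
small = [[1, 2, 3],
         [4, 5, 6]]
small_vector = flatten_column_major(small)

# A 64x64 image yields a vector of 64 * 64 = 4096 entries.
blank = [[0] * 64 for _ in range(64)]
blank_vector = flatten_column_major(blank)
```

Stacking one such vector per image, row after row, gives a matrix with one image per row and 4096 columns.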
Canny Filter Edge Detection Method
Using the matrices created during the raw image pre-processing step, the images were then analyzed using a Canny filter edge detection method.
MATLAB automatically calculates the high and low thresholds, and the Gaussian filter uses a sigma value of 1.
Image Pre-Processing
Canny Edge Detecting
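% Open the matrix "cat", which contains all of the raw cat data
% in one 80x4096 matrix (one image per row).
% Note: the variable name "cat" shadows MATLAB's built-in cat function.
cat = importdata('cat.mat');
cat = cat';                                  % 4096x80: one image per column
[m, n] = size(cat);                          % m = 4096 pixels, n = 80 images
all_cats_edge = zeros(n, m);
for j = 1:n
    cat_j = reshape(cat(:, j), 64, 64);      % back to a 64x64 image
    cat_edge = edge(cat_j, 'canny');         % Canny edge map of image j
    file_name = strcat('cat', num2str(j), '.mat');
    save(file_name, 'cat_edge');             % one file per image
    all_cats_edge(j, :) = reshape(cat_edge, 1, m);
end
save('all_cats_edge.mat', 'all_cats_edge');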
Image Pre-Processing
[Figure: sample images shown raw and after Canny edge detection]
The wavelet decomposition of a 2-D image can be obtained by filtering consecutively along the horizontal and vertical directions (separable filter bank). This is depicted schematically in the following figure.
The wavelet decomposition of a 2-D image
LL: low-freq. components
LH: high freq. components in vertical direction
HL: high freq. components in horizontal direction
HH: high freq. components in diagonal direction
Wavelet decomposition
The wavelet decomposition of a 2-D image
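The four subbands can be illustrated with a one-level separable transform. The sketch below is plain Python rather than the report's MATLAB and assumes the Haar wavelet, since the slides do not say which wavelet was used. The function name is hypothetical. In each subband label, the first letter is taken to be the horizontal filter and the second the vertical filter, which matches the descriptions above (LH carries high frequencies in the vertical direction).

```python
def haar_dwt2(image):
    """One level of a separable 2-D Haar transform.

    `image` is a list of rows with even height and width.
    Returns the four half-size subbands (LL, LH, HL, HH).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            # 2x2 block: a b on the top row, d e on the bottom row.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 2.0)  # low-pass in both directions
            lh_row.append((a + b - d - e) / 2.0)  # low horizontal, high vertical
            hl_row.append((a - b + d - e) / 2.0)  # high horizontal, low vertical
            hh_row.append((a - b - d + e) / 2.0)  # high-pass in both directions
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


# Vertical stripes: intensity changes only along the horizontal direction.
stripes = [[1, 0, 1, 0] for _ in range(4)]
LL, LH, HL, HH = haar_dwt2(stripes)
```

For the striped test image only HL is nonzero among the detail subbands, because the image varies only in the horizontal direction.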
Edge Detection by Wavelet Method
[Figure: HL, LH, and HH subbands]
PCA Method
• PCA transforms many potentially correlated variables into a few uncorrelated ones.
– This reduces the dimension of the problem so that we may more easily compare input images to our training sets.
– The lower-dimensional representation uses the ‘highest energy’ singular vectors as a basis for representation.
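To make the idea of a 'highest energy' direction concrete, the sketch below finds the leading principal direction of a small data set by power iteration on the centered data, then projects a sample onto it. This is plain Python rather than the report's MATLAB, and power iteration is swapped in here for a full SVD to keep the example dependency-free. The function names and toy data are hypothetical.

```python
import math


def top_principal_component(samples, iterations=200):
    """Return (mean, v): the sample mean and the unit vector of largest variance.

    Uses power iteration on X^T X, where X holds the centered samples as rows.
    Assumes the start vector is not orthogonal to the leading direction.
    """
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(dim)]
    centered = [[s[k] - mean[k] for k in range(dim)] for s in samples]
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # proj = X v, then w = X^T proj
        proj = [sum(row[k] * v[k] for k in range(dim)) for row in centered]
        w = [sum(centered[i][k] * proj[i] for i in range(n)) for k in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance left along the current vector
        v = [x / norm for x in w]
    return mean, v


def project(sample, mean, v):
    """Coordinate of one sample along the principal direction v."""
    return sum((sample[k] - mean[k]) * v[k] for k in range(len(v)))


# Toy data lying exactly on the line y = 2x.
points = [(float(t), 2.0 * t) for t in (-2, -1, 0, 1, 2)]
mean, direction = top_principal_component(points)
score = project((1.0, 2.0), mean, direction)
```

Here two correlated coordinates collapse to a single number per sample, which is the dimension reduction the bullets describe, on a much smaller scale than 4096 pixels.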
PCA Method
The first nine singular vectors for raw image data for dogs & cats.
PCA Method
The first nine singular vectors for the canny filter edge data for dogs.
PCA Method
The first nine singular vectors for the Vertical + Horizontal wavelet data for dogs.
PCA Method
Results using the raw data for training into the PCA methodology
PCA Method
Results using the canny edge filter data for training into the PCA methodology
PCA Method
Results using the Wavelet coefficient horizontal + vertical data for training into the PCA methodology
PCA Method
Results using the Wavelet coefficient horizontal + vertical + diagonal data, for training into the PCA methodology
PCA Method
• Results:
– 17 out of 38 unknown test images (about 45%) identified correctly; test images were converted to V + H Wavelets
• Take Away:
– Potential coding or algorithm flaws
Linear Discriminant Analysis (LDA)
• Idea: project the high-dimensional image data linearly into a one-dimensional space, where the data is classified using an optimal threshold.
• Main procedures
– Feature extraction: preprocess the data from the training set.
– Select the optimal direction of projection w.
– Determine the optimal threshold c.
– Identify the unknown data.
LDA – feature extraction
• Advantage
– Lower dimension, faster computation
– Discarding redundant information, more efficient classification
• Singular Value Decomposition (SVD)
– X: preprocessed images for training
– $X = U S V^T$
• Feature selection
– Features: the first $n_f$ columns of U as the principal components.
– New data: the first $n_f$ rows of $S V^T$ as the extracted information of the images.
– The dimension of the data space decreases to $n_f$.
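A short derivation shows why the leading rows of $S V^T$ can serve as the reduced data. It relies on the standard fact that the columns of U are orthonormal, and it assumes each column of X is one training image. $U_f$ denotes the first $n_f$ columns of U, the same symbol used later in the identification step.

```latex
\begin{aligned}
X &= U S V^T \\
U^T X &= U^T U \, S V^T = S V^T \\
U_f^T X &= \left( S V^T \right)_{1:n_f,\,:}
\end{aligned}
```

So the first $n_f$ rows of $S V^T$ are exactly the coordinates of the training images in the basis formed by the first $n_f$ principal components.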
LDA – Optimal direction of projection
• Goal
– maximize the inter-class distance in the projected space
– minimize the intra-class distance in the projected space
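One common way to formalize this goal for two classes is Fisher's criterion. The slide's own formulas are not preserved in this transcript, so the notation below is an assumption rather than the authors' notation: $\mu_c$ and $\mu_d$ are the cat and dog class means, and $\mathcal{C}_c$ and $\mathcal{C}_d$ are the two training sets.

```latex
w^\ast = \arg\max_{w} \frac{w^T S_B \, w}{w^T S_W \, w},
\qquad
S_B = (\mu_d - \mu_c)(\mu_d - \mu_c)^T,
\qquad
S_W = \sum_{k \in \{c, d\}} \; \sum_{x \in \mathcal{C}_k} (x - \mu_k)(x - \mu_k)^T
```

The numerator measures the inter-class separation along w and the denominator the intra-class spread. When $S_W$ is invertible, a maximizer is $w \propto S_W^{-1} (\mu_d - \mu_c)$.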
LDA – Optimal direction of projection
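For two classes, a standard choice of projection direction is the within-class scatter matrix inverted and applied to the difference of the class means. The sketch below computes it for two-dimensional toy points, in plain Python rather than the report's MATLAB. The function names and data are hypothetical, and the report works in a space of 30 to 40 features, where a general linear solver would replace the explicit 2x2 inverse.

```python
def class_mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mean):
    """Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix."""
    sxx = sum((p[0] - mean[0]) ** 2 for p in points)
    sxy = sum((p[0] - mean[0]) * (p[1] - mean[1]) for p in points)
    syy = sum((p[1] - mean[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(class_a, class_b):
    """Direction w = S_W^{-1} (mean_b - mean_a) for two 2-D classes."""
    mean_a, mean_b = class_mean(class_a), class_mean(class_b)
    axx, axy, ayy = scatter(class_a, mean_a)
    bxx, bxy, byy = scatter(class_b, mean_b)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy  # within-class scatter S_W
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det.
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


# Two unit squares of points, the second shifted 4 units along x.
cats = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dogs = [(4.0, 0.0), (5.0, 0.0), (4.0, 1.0), (5.0, 1.0)]
w = fisher_direction(cats, dogs)
```

In this toy case the classes differ only along x, so the resulting direction has no y component.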
LDA – Optimal threshold
• After projecting all training data onto the optimal direction w, pick a threshold c such that
– the total number of errors is minimized
– the numbers of errors for cats and dogs are equal
• Identification
– Feature extraction: project the image x onto the principal components: $x_e = U_f^T x$
– Compute the projection of the extracted data onto the optimal direction w: $v = w^T x_e$
– Compare the projection v with the threshold c to identify the class of the unknown image
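The threshold search and the final comparison can be sketched as follows, in plain Python rather than the report's MATLAB. This is one possible reading of the two criteria: minimize the total number of training errors first, then break ties by how evenly the errors split between the classes. The function names and scores are hypothetical, and the sketch assumes cats project below the threshold and dogs at or above it.

```python
def choose_threshold(cat_scores, dog_scores):
    """Pick a threshold c from the midpoints between sorted projected scores.

    Returns (c, cat_errors, dog_errors) on the training scores.
    """
    values = sorted(set(cat_scores) | set(dog_scores))
    candidates = [(lo + hi) / 2.0 for lo, hi in zip(values, values[1:])]
    if not candidates:
        raise ValueError("need at least two distinct scores")
    best = None
    for c in candidates:
        cat_err = sum(1 for v in cat_scores if v >= c)  # cats labeled as dogs
        dog_err = sum(1 for v in dog_scores if v < c)   # dogs labeled as cats
        key = (cat_err + dog_err, abs(cat_err - dog_err))
        if best is None or key < best[0]:
            best = (key, c, cat_err, dog_err)
    return best[1], best[2], best[3]


def classify(score, threshold):
    """Label one projected score v by comparing it with the threshold c."""
    return "dog" if score >= threshold else "cat"


# Hypothetical projected training scores with some overlap between classes.
cat_scores = [1.0, 2.0, 3.0, 6.0]
dog_scores = [4.0, 7.0, 8.0, 9.0]
threshold, cat_errors, dog_errors = choose_threshold(cat_scores, dog_scores)
```

With overlapping classes no threshold is error-free, so the choice of tie-break decides which class absorbs the unavoidable mistakes.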
LDA - Testing
• Various sizes of training set
• Use the rest for testing
• 30 features
• 10 trials, shuffled images
• Classification rate around 90%
• Training: all 80 dogs and 80 cats
• 40 features
• Threshold: 43.6
• Error: dogs 2, cats 2.
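The shuffled-trial protocol in the first group of bullets can be sketched as a repeated hold-out evaluation: shuffle, train on the first part, test on the rest, and average the classification rate. The code is plain Python rather than the report's MATLAB. The function names are hypothetical, and the toy one-dimensional classifier stands in for the report's SVD plus LDA pipeline, which is not reproduced here.

```python
import random


def repeated_holdout(samples, labels, fit, n_train, trials=10, seed=0):
    """Average test accuracy over `trials` random train/test splits.

    `fit(train_samples, train_labels)` must return a function that maps
    one sample to a predicted label.
    """
    if not 0 < n_train < len(samples):
        raise ValueError("n_train must leave at least one test sample")
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rates = []
    for _ in range(trials):
        rng.shuffle(indices)
        train_idx, test_idx = indices[:n_train], indices[n_train:]
        predict = fit([samples[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct = sum(1 for i in test_idx if predict(samples[i]) == labels[i])
        rates.append(correct / len(test_idx))
    return sum(rates) / len(rates)


def fit_midpoint(train_samples, train_labels):
    """Toy 1-D classifier: threshold halfway between the two class means."""
    cats = [x for x, y in zip(train_samples, train_labels) if y == "cat"]
    dogs = [x for x, y in zip(train_samples, train_labels) if y == "dog"]
    threshold = (sum(cats) / len(cats) + sum(dogs) / len(dogs)) / 2.0
    return lambda x: "dog" if x >= threshold else "cat"


# Well-separated toy scores: cats in 0..9, dogs in 20..29.
samples = [float(v) for v in range(10)] + [float(v) for v in range(20, 30)]
labels = ["cat"] * 10 + ["dog"] * 10
rate = repeated_holdout(samples, labels, fit_midpoint, n_train=12, trials=10)
```

Averaging over several shuffles reduces the dependence of the reported rate on any single lucky or unlucky split.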
LDA - Comparison
LDA – on the secret data
• Missed 3 out of 38: 2 dogs, 1 cat
• Rate of success: 92%