Download - Monitoring%Illegal%Fishing%through%Image%Classiﬁcaon%cs229.stanford.edu/proj2016/poster/GeislerFrostMahajan-Monitoring... · Nature Conservancy, an NGO, monitors illegal ﬁshing

Modeling

SIFT

Fisher Encoding (FE) (GMM Model on SIFT to construct

a visual dic:onary)

Linear SVM

Bag of Visual Words (K-‐Means on SIFT)

Linear SVM

CIFA

R-‐10

CNN

VGG-‐16 Random Forest (15 trees)

SVM (RBF, linear)

Monitoring Illegal Fishing through Image Classifica:on Jason Frost

[email protected] Taylor Geisler

[email protected] Antariksh Mahajan [email protected]

ProblemNature Conservancy, an NGO, monitors illegal fishing by installing cameras on fishing boats. Image classification using computers can help them to speed up and automate the process. We built a multiclass classification system for 8 different species.

Feature Extraction

1.  Scale-Invariant Feature Transform (SIFT)•  Identifies keypoints (shapes with

directional orientations)•  Invariant to rotations and translations•  Dense SIFT (dSIFT) runs SIFT on a grid

2.  VGG-16 Convolutional Neural Network•  16-layer CNN, has won image recognition

competitions by ImageNet•  Pre-trained weights, outputs 1000 features

3.  CIFAR-10 Convolutional Neural Network•  10 convolutional layers•  Trained our own weights

Data•  Original dataset: 3777 images, 8 labels•  Data preprocessing and augmentation: •  Feature-wise centering set input mean to 0•  Random rotations, shifts, flips increased

dataset size to >20,000 images•  Rescaled to 224x224 for easier processing

•  Challenge: fish are tiny part of most images

Label: ALB BET

Baseline models•  Naïve Bayes (by occurrences of colors): 8%•  1-nearest neighbor: 97%!•  But this is because similar boats catch

similar fish. Need more complex model for greater generalizability.

Training image with SIFT keypoints

Image 2(conv-‐64), maxpool, ReLU

2(conv-‐128), maxpool, ReLU



2(FC-‐4096), 1(FC-‐1000)

ResultsExtrac'on Method Model Training Acc. Test Acc.

VGG-‐16 RF 99.5% 83.4%

SVM-‐RBF 44.3% 45.6% SVM-‐Linear 54.1% 53.3%

-‐ CIFAR-‐10 95.6% 77.3%

Conclusions•  VGG-16 to RF performed best•  However, difficult to extract most important

features from VGG-16•  Better and more generalizable results can

be obtained by extracting fish from images, possibly using another CNN

Download - Monitoring%Illegal%Fishing%through%Image%Classiﬁcaon%cs229.stanford.edu/proj2016/poster/GeislerFrostMahajan-Monitoring... · Nature Conservancy, an NGO, monitors illegal ﬁshing

Top Related