deep learning meetup 7 - building a deep learning-powered search engine

Post on 21-Jan-2018

Building a Deep Learning-powered Search Engine

Koby Karp

Deep Learning Paris Meetup #7

I’m Koby - Data Scientist @ Equancy

★ Robotics Engineer (2007-2011)

★ Computer Visioner (2011-2012)

★ Data Scientist, Data Engineer, Data Miner, Data Analyst, ... (2011-2016)

★ Deep Learner (2016-)

★ ?

E-Commerce ♥ Images

★ Catalogue

★ Social Network

★ Marketplace

Three use cases for FASHION:

★ Visual Search Engine

➹ Take pictures with your phone

➹ Search through catalogue using your images

➹ Return most similar or exact products

★ Fashion Object Detection

★ Data Quality

Big City Life = High Exposure to Fashion Daily

Visual Search Engine at a glance

★ Batch Phase: Build

➢ Describe - Encode image into a numeric description (vector)

➢ Index - Apply transformation to all images and store in a DB

★ Online Phase: Deploy

➢ Measure Distance - Apply a distance metric between DB and a new (unseen) image

➢ Ranking - Sort by distance and return first N results
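
The two phases above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the speaker's implementation: `describe` is a hypothetical stand-in descriptor (per-channel mean and standard deviation), the index is a plain in-memory array rather than a DB, and Euclidean distance is an assumed metric, since the slides do not name one.

```python
import numpy as np


def describe(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in descriptor: per-channel mean and std of an HxWx3 image."""
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    return np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])


def build_index(catalogue: list) -> np.ndarray:
    """Batch phase: describe every catalogue image and stack the vectors (one row per image)."""
    return np.stack([describe(img) for img in catalogue])


def search(index: np.ndarray, query_image: np.ndarray, top_n: int = 5) -> list:
    """Online phase: measure distances to the query, then rank and keep the N closest."""
    query = describe(query_image)
    distances = np.linalg.norm(index - query, axis=1)  # Euclidean distance (assumed metric)
    order = np.argsort(distances)[:top_n]
    return [(int(i), float(distances[i])) for i in order]


# Toy catalogue: random images standing in for product photos.
rng = np.random.default_rng(0)
catalogue = [rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8) for _ in range(20)]
index = build_index(catalogue)
results = search(index, catalogue[7], top_n=5)  # list of (catalogue position, distance)
```

Querying with an image that is already in the catalogue should return that image first, at distance zero. Swapping `describe` for a stronger descriptor leaves the rest of the pipeline unchanged.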

Visual Search Engine at a glance: Describe

Encode image into a numeric description (vector)

[Figure: Catalogue Image → Numerical Representation (0.672, 0.510, 0.741, ..., 0.919)]

Visual Search Engine at a glance: Index

Apply transformation to all images and store in a DB

[Figure: Catalogue Images → matrix of image vectors]

0.672 0.435 0.482 ... 0.141

0.510 0.525 0.810 ... 0.241

0.741 0.526 0.210 ... 0.571

... ... ... ... 0.816

0.919 0.552 0.161 0.622 0.412

Visual Search Engine at a glance: Measure Distance

Apply a distance metric between DB and a new (unseen) image

[Figure: User’s Image → vector (0.672, 0.510, 0.741, ..., 0.919), compared against the DB]

Visual Search Engine at a glance: Ranking

Sort by distance and return first N results

[Figure: User’s Image → Top 5]

Focus on the Describe step

Three attributes that we need to describe

Shape Color Texture

How is it done with “classic” Computer Vision?

❖ Edge Detectors

❖ Image Moments

❖ HOG / HOF / SIFT

❖ Fourier / Wavelet

❖ Color Histograms
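
To make one of these classic descriptors concrete, here is a minimal sketch of a color histogram in NumPy. The function name `color_histogram`, the RGB channel order, and the default of 8 bins are assumptions made for illustration; the slides do not specify any of them.

```python
import numpy as np


def color_histogram(image: np.ndarray, bins: int = 8) -> np.ndarray:
    """Concatenate one normalized histogram per channel of an HxWx3 uint8 image."""
    channels = []
    for c in range(image.shape[-1]):
        counts, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        channels.append(counts / counts.sum())  # each channel's histogram sums to 1
    return np.concatenate(channels)


# A pure-red image (assuming RGB order) puts all red mass in the top bin
# and all green and blue mass in the bottom bin.
red = np.zeros((16, 16, 3), dtype=np.uint8)
red[..., 0] = 255
descriptor = color_histogram(red, bins=8)  # vector of length 3 * bins
```

Even this small descriptor has a parameter to tune (`bins`), and it captures color only; shape and texture would need separate descriptors and some way of weighing them against each other.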

Problems with this approach:

1. Too many parameters (difficult to tune)

2. Multiple methods (how to weigh?)

3. Slow (many transformations)

4. Generalizes poorly

Solution: Pre-Trained Convolutional Neural Network (CNN)

Entering: Convolutional Neural Network (CNN)

AlexNet (2012)

1. “The Beatles of the CNNs” -Me

2. Trained on the ILSVRC subset of ImageNet (about 1.2 million training images; the full ImageNet dataset has about 15 million)

3. Used for classification of 1000 categories (Animals, Plants, Urban - No Fashion)

4. Largely robust to translations and horizontal reflections (both were used as data augmentation during training)

5. We also tried other models, such as VGG16.

AlexNet (simplified visualization)

❖ We remove the last Fully connected layer (Soft-Max)

❖ We feed our images and generate CNN codes of size 4096

❖ The weights of the trained CNN contain the feature-engineering mapping that was necessary to discriminate between the 1000 classes

❖ We use the network as a general-purpose descriptor.

Test Time ...

Dataset

M. Manfredi, C. Grana, S. Calderara, R. Cucchiara, "A complete system for garment segmentation and color classification", Machine Vision and Applications, vol. 25, pp. 955-969, 2014

Mix of various clothing and accessories:

❖ 60000 items

❖ Medium Quality

❖ Grey background

❖ Used as a benchmark for garment classification

Image Clustering

❖ Using t-SNE for compression to 2D

❖ Selected a random 10% for visualization
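
A minimal sketch of this visualization step, assuming scikit-learn's `TSNE` and random placeholder vectors in place of the real CNN codes. The slides do not say whether the 10% sample was drawn before or after running t-SNE; sampling first is an assumption here, as are the perplexity and initialization settings.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
codes = rng.normal(size=(1000, 64))  # placeholder for the real CNN codes

# Keep a random 10% of the items for visualization.
sample_idx = rng.choice(len(codes), size=len(codes) // 10, replace=False)
sample = codes[sample_idx]

# Compress the sampled codes to 2D; perplexity must be smaller than the sample size.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(sample)  # one (x, y) point per sampled item
```

Each row of `embedding` can then be plotted as a point, or as a thumbnail of the corresponding image, to inspect which items end up close together.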

Image Clustering: clusters shown

❖ Jewelry & Accessories

❖ T-Shirts

❖ Shoes

❖ Shorts

❖ Jeans, Khakis & Chinos

❖ Trousers

❖ Bags

❖ Jackets

❖ Funky Tops

Search Results ...

Equancy R&D Initiative

Equancy R&D Program

★ Built with our customers: we invite our customers to collaborate, using their data, to build a first prototype

★ Leveraging smart data: selected topics look for an innovative way of using existing data

★ For industrial applications: topics must lead to real, operational applications with added value for the business

★ Cutting-Edge Topics: Equancy selects several topics we consider worth investigating for our yearly program

★ Co-investment: depending on how speculative we judge each topic, Equancy will bear a significant share of the consultants' time costs

Thanks! You were great :)

Equancy is recruiting:

❖ Data Scientist Intern

❖ Data Engineer

kkarp@equancy.com
