DeepFont: Large-Scale Real-World Font Recognition from Images

Zhangyang (Atlas) Wang

Joint work with Jianchao Yang, Hailin Jin, Jon Brandt, Eli Shechtman, Aseem Agarwala, and Thomas Huang

Problem Definition

Seen a font in use and want to identify what it is?

Font recognition: recognize font style (typeface, slope, weight, etc.) automatically from real-world photos

Why does it matter?
• Highly desirable feature for designers
• Design library collection
• Design inspiration
• Text editing

Challenges

An extremely large-scale recognition problem
• myfonts.com claims over 100,000 fonts in its collection

Beyond object recognition: recognizing subtle design styles.

Extremely difficult to collect real-world training data
• Has to rely on synthetic training data
• BIG mismatch between synthetic training and real-world testing

Solution

Deep convolutional neural network?
• Effective at large-scale recognition
• Effective at fine-grained recognition
• Data-driven

Problem: huge mismatch between synthetic training and real-world testing
• Data augmentation
• Decomposition-based deep CNN for domain adaptation

The AdobeVFR Dataset

Synthetic training set
• 2383 fonts from Adobe Type Library (extended to 4052 classes later)
• 1000 synthetic English word images per font
• ~2.4M training images

Real-world testing set
• 4383 real-world labeled images
• Covering 671 fonts out of 2383

• The first large-scale benchmark set for the task of visual font recognition

• Consisting of both synthetic and real-world text images

• Also good for fine-grained classification, domain adaptation, and understanding design styles

Deep Convolutional Neural Network

Following the benchmark structure?

Domain Mismatch

Direct training on synthetic data and testing on real-world data (Top-5 accuracy):

            Synthetic    Real-World
Training    99.16%       NA
Testing     98.97%       49.24%

Need domain adaptation to minimize the gap between synthetic training and real-world testing!

Data Augmentation

Common degradations
• Noise, blur, warping, shading, compression artifacts, etc.

Special degradations
• Aspect ratio squeezing: squeeze the image in the horizontal direction using a random ratio in [1.5, 3.5].
• Random character spacing: render training text images with random character spacing.

Inputs to the network: random 105x105 crops
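
A minimal sketch of part of this pipeline (additive noise, horizontal squeezing, random 105x105 cropping), written with NumPy only. It is an illustration, not the authors' implementation: the function names, the noise level, the nearest-neighbour resampling, and the reading of "squeeze by ratio r" as "divide the width by r" are all assumptions.

```python
import numpy as np

def squeeze_horizontally(img, ratio):
    """Shrink the width by `ratio` via nearest-neighbour column sampling (assumed reading)."""
    height, width = img.shape
    new_width = max(1, int(round(width / ratio)))
    cols = np.minimum((np.arange(new_width) * ratio).astype(int), width - 1)
    return img[:, cols]

def random_crop(img, size, rng):
    """Cut a random size x size patch, edge-padding images that are too small."""
    pad_h = max(0, size - img.shape[0])
    pad_w = max(0, size - img.shape[1])
    img = np.pad(img, ((0, pad_h), (0, pad_w)), mode="edge")
    top = rng.integers(0, img.shape[0] - size + 1)
    left = rng.integers(0, img.shape[1] - size + 1)
    return img[top:top + size, left:left + size]

def augment(img, rng, crop_size=105, noise_std=3.0):
    """Apply Gaussian noise, a random squeeze in [1.5, 3.5], then a random crop."""
    noisy = img + rng.normal(0.0, noise_std, img.shape)  # noise_std is a made-up value
    ratio = rng.uniform(1.5, 3.5)
    squeezed = squeeze_horizontally(noisy, ratio)
    return random_crop(squeezed, crop_size, rng)

rng = np.random.default_rng(0)
word_image = np.full((105, 600), 255.0)  # stand-in for a rendered grayscale word image
patch = augment(word_image, rng)
```

Blur, warping, shading, compression artifacts, and random character spacing are left out here; the last one has to happen at rendering time rather than on the finished image.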

Effects of Data Augmentation

• Synthetic 1-4: common degradations
• Synthetic 5-6: special degradations
• Synthetic 1-6: all degradations

• On the right: maximum mean discrepancy (MMD) between synthetic and real-world data responses
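
For readers unfamiliar with MMD, here is a small sketch of the (biased) squared-MMD estimate between two sets of feature vectors. The slides do not say which kernel or estimator was used, so the RBF kernel, the `gamma` value, and the function names below are assumptions for illustration only.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Pairwise RBF kernel values exp(-gamma * ||a_i - b_j||^2)."""
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def mmd_squared(x, y, gamma=0.1):
    """Biased estimate of squared MMD between samples x and y (rows are samples)."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
synthetic_feats = rng.standard_normal((200, 5))        # toy stand-in for layer responses
matched_feats = rng.standard_normal((200, 5))          # same distribution
shifted_feats = rng.standard_normal((200, 5)) + 3.0    # clearly different distribution

gap_small = mmd_squared(synthetic_feats, matched_feats)
gap_large = mmd_squared(synthetic_feats, shifted_feats)
```

A smaller value means the two sets of responses look more alike, which is the sense in which augmentation is meant to close the synthetic/real-world gap.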

Beyond Data Augmentation

Problems
• Cannot enumerate all possible degradations, e.g., background and font decorations.
• May introduce degradation bias in training

Design the learning algorithm to be robust to domain mismatch?
• Mismatch already happens in the low-level features
• Tons of unlabeled real-world data

Network Decomposition for Domain Adaptation

Unsupervised cross-domain sub-network Cu (N layers)

Supervised domain-specific sub-network Cs (7-N layers)

1) Train sub-network Cu in an unsupervised way using stacked convolutional autoencoders, with both synthetic data and unlabeled real-world data.

2) Fix sub-network Cu, and train sub-network Cs in a supervised way, using the labeled synthetic data.
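
The two-stage schedule can be sketched on toy data as follows. This is deliberately simplified and is not the DeepFont architecture: a single linear (dense) autoencoder stands in for the stacked convolutional autoencoders of Cu, a softmax regression stands in for Cs, and all names, sizes, and learning rates are made up for illustration.

```python
import numpy as np

def train_autoencoder(x, hidden, rng, steps=300, lr=0.01):
    """Stage 1 (stand-in for Cu): fit encoder W and decoder V so that x @ W @ V ~ x."""
    n, d = x.shape
    W = 0.1 * rng.standard_normal((d, hidden))
    V = 0.1 * rng.standard_normal((hidden, d))
    for _ in range(steps):
        code = x @ W
        resid = code @ V - x                      # reconstruction error
        grad_V = 2.0 / n * code.T @ resid
        grad_W = 2.0 / n * x.T @ (resid @ V.T)
        W -= lr * grad_W
        V -= lr * grad_V
    return W, V

def train_classifier(features, labels, n_classes, steps=300, lr=0.1):
    """Stage 2 (stand-in for Cs): softmax regression on the frozen features."""
    n, h = features.shape
    C = np.zeros((h, n_classes))
    onehot = np.eye(n_classes)[labels]
    for _ in range(steps):
        logits = features @ C
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        C -= lr * features.T @ (probs - onehot) / n
    return C

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
synthetic = (2.0 * labels[:, None] - 1.0) + rng.standard_normal((200, 8))  # labeled
hidden_labels = rng.integers(0, 2, size=200)                                # never used for training
real_unlabeled = (0.8 * ((2.0 * hidden_labels[:, None] - 1.0) + rng.standard_normal((200, 8)))
                  + 0.5 * rng.standard_normal((200, 8)))

# Stage 1: unsupervised, on synthetic AND unlabeled real-world data pooled together.
pooled = np.vstack([synthetic, real_unlabeled])
W, V = train_autoencoder(pooled, hidden=4, rng=rng)
W_frozen = W.copy()

# Stage 2: supervised, on labeled synthetic data only; W is never updated here.
C = train_classifier(synthetic @ W, labels, n_classes=2)
accuracy = ((synthetic @ W @ C).argmax(axis=1) == labels).mean()
```

The point of the sketch is the data flow: the low-level features see both domains without labels, while the classifier on top only ever sees labeled synthetic data.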

Quantitative Evaluation

Real-world test accuracy (4383 real-world test images collected from font forums):

Model       Augmentation?   Decomposition?   Top 1     Top 5
LFE         Y               NA               42.56%    60.31%
DeepFont    N               N                42.49%    49.24%
DeepFont    Y               N                66.70%    79.22%
DeepFont    Y               Y                71.42%    81.79%
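
For reference, Top 1 and Top 5 accuracy can be computed from per-class scores as in the sketch below. The function name and the toy scores are illustrative assumptions and are unrelated to the reported numbers.

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy example: 3 samples, 4 classes.
scores = np.array([[0.1, 0.5, 0.3, 0.1],
                   [0.7, 0.1, 0.1, 0.1],
                   [0.2, 0.2, 0.5, 0.1]])
labels = np.array([2, 0, 3])

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

With thousands of visually similar font classes, a Top 5 list is often what a designer actually wants, which is presumably why both numbers are reported.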

Varying the number of layers K in the unsupervised sub-network Cu:

K           0        1        2        3        4        5
Training    91.54%   90.12%   88.77%   87.46%   84.79%   82.12%
Testing     79.28%   79.69%   81.79%   81.04%   77.48%   74.03%

Successful Examples

Failure Examples

Model Compression

For a typical CNN, about 90% of the storage is taken up by the densely connected layers.

Matrix factorization methods can compress the parameters of linear models by exploiting the nearly low-rank structure of the parameter matrices.

The plots of eigenvalues for the fc6 layer weight matrix in DeepFont. This densely connected layer takes up 85% of the total model size.
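
To illustrate the low-rank idea, the sketch below factors a weight matrix with a truncated SVD and reports the storage saving. Note that this is the simpler post-hoc (two-stage) variant, not joint training with a rank constraint; the matrix sizes, the rank, and the function names are made up for the example.

```python
import numpy as np

def low_rank_factors(W, k):
    """Return A (m x k) and B (k x n) such that A @ B is the best rank-k approximation of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * s[:k]      # scale the kept left singular vectors by their singular values
    B = Vt[:k, :]
    return A, B

def compression_ratio(m, n, k):
    """Parameters in the dense m x n matrix divided by parameters in the two factors."""
    return (m * n) / (k * (m + n))

rng = np.random.default_rng(0)
m, n, true_rank = 64, 48, 4
# Toy "nearly low-rank" weight matrix: exact rank-4 product plus small noise.
W = (rng.standard_normal((m, true_rank)) @ rng.standard_normal((true_rank, n))
     + 0.01 * rng.standard_normal((m, n)))

A, B = low_rank_factors(W, k=true_rank)
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
ratio = compression_ratio(m, n, true_rank)
```

At inference time the layer computes `x @ A @ B` instead of `x @ W`, so both storage and multiply-adds shrink when k is much smaller than m and n.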

During training, we add a low-rank constraint (rank < k) on the fc6 layer.

In practice, we adopt very aggressive compression on all fc layers and obtain a mini-model of ~40 MB in storage, with a compression ratio >18 and a (top-5) performance loss of ~3%.

Take-Home Points:

1) FC layers can be highly redundant. Compressing them aggressively MIGHT work well.

2) Joint Training-Compression performs notably better than two-stage.

In Adobe Product: Recognize Fonts from Images

In Adobe Product: Photoshop Prototype

Text Editing Inside Photoshop

In Adobe Product: Discover Similarity between Fonts

Font inspiration, browsing, and organization

For more information

Full paper will be made available quite soon

AdobeVFR Dataset will be available soon

Thank you!