reading street signs€¦ · google mylocation 22. future works • perspective effect •...

24
Reading Street Signs Using a Generic Structured Object Detection and Signature Recognition Approach 1 Recognition Approach Sobhan Naderi Parizi (Speaker) and Alireza Tavakoli Targhi, Omid Aghazadeh and Jan-Olof Eklundh Computer Vision and Active Perception Laboratory Royal Institute of Technology (KTH) Stockholm, Sweden {sobhannp, att, omida, joe}@kth.se

Upload: others

Post on 18-Oct-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Reading Street Signs Using a Generic Structured Object Detection and Signature

Recognition Approach

1

Recognition Approach

Sobhan Naderi Parizi (Speaker)

and Alireza Tavakoli Targhi, Omid Aghazadeh and Jan-Olof Eklundh

Computer Vision and Active Perception Laboratory

Royal Institute of Technology (KTH)

Stockholm, Sweden

{sobhannp, att, omida, joe}@kth.se

Page 2: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Outline

• Overview of our Framework• Detection• Text Segmentation• Recognition

2

• Conclusion and Future Work• Offline Demo

Page 3: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Overview: Mobvis:

3

Page 4: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Overview: Street Plate Detection4

Page 5: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Our Contributions:

• Design an efficient system to combine • Structured region detection• Text segmentation• Text recognition

5

• Text recognition

• Providing a database and evaluation over the database

Page 6: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

6

Introduction to: KTH-Plate Database

1.5 m1.5 m

Scale

Camera View Point (degree)Camera View Point (degree) 0 +45 0 +45 --4545

3m3m

4.5 m4.5 m

Page 7: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Introduction to: KTH-Plate Annotated Database

• 86 low resolution using Mobile phone Camera (1280�960)• 120 high resolution (2448�3264)

7

• 120 high resolution (2448�3264)• 3 scales � 3 orientations• Manual plate segmentation• Locations:

• 1) Graz, Austria

• 2) Stockholm, Sweden

• 3) Theran, Iran

Page 8: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Annotation

• Four corners of the plate• Clockwise starting from up-left

• Ground-truth is generated

8

• Ground-truth is generated

• Ground-truth plates are added to positive samples• Random regions are added to negative set

Page 9: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step 1: Detection

• Boosting method (simple Haar features)• Affine normalization

9

• Generate training samples• Add false positives to negative set• Add the detected plates with over 80% overlap

• Boosting is retrained on the enriched database

Page 10: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step 1: DetectionFiltering• Still many false positives

• Hard samples

10

• Texture-Transform (A. Tavakoli et al.)

• Again boosting

Page 11: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step 1: Detection Final Decision• Weighted merging

11

windowdetected truthground

windowdetected truthground

S S

S S

U

I

Page 12: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step 1: Detection Sample Results

12

Page 13: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step2: Text Segmentation

• Input: detected region

• Output: interested text region

13

• Hard aspects:• All text is not interesting

• Limited detection/object overlap

Page 14: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step2: Text Segmentation

• Pixelwise texture classification• Two class (text/background)• Single Histogram (F. Schroff et al.)• Distance map

14

• Distance map

• Connected component labeling• Multi threshold• Bounding box around the largest component is returned

Page 15: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step3: RecognitionOverview• OCR

• Needs clearly segmented characters

• Template Matching methods• SIFT

15

• SIFT• Dynamic Time Warping (DTW) of a signature

• Real time• No training is needed!• Target models are generated simply

Page 16: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step3: Recognition (DTW scheme)

16

Page 17: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step3: Recognition (DTW scheme)

17

• O(d � )• Sequential data (e.g. time varying)• Projected features

2N

Page 18: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step3: RecognitionProjected features

• Projected along the text

• Different types of features• Sum of intensity values

18

Feature value

• Sum of intensity values• Upper contour• Lower contour

• More sophisticated features• Edge orientation (HOG)

Page 19: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step3: RecognitionOnce again filtering

19

Regions detected as plate

Segmented text regions

Some samples recognized successfully

Page 20: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Step3: Recognition

• Automatic model construction• Bag of characters

g a s sM u r eRequest for the model of

20

• List of street names could be used• Remarkably invariant to font

g a s sM u r emodel of ”Murgasse” st.

Page 21: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

• Recognition Acc: 99.53% for 100 models

• Practically we do not

RecognitionEvaluation

21

need to explore more models

Page 22: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Google MyLocation22

Page 23: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Future Works

• Perspective effect

• Generalized for any sign detection

23

• Generalized for any sign detection• Recognition is good enough. Our bottleneck is on detection

• Needs to be faster• Integrate to mobile application

Page 24: Reading Street Signs€¦ · Google MyLocation 22. Future Works • Perspective effect • Generalized for any sign detection 23 • Recognition is good enough. Our bottleneck is

Reading Street Signs Using a Generic Structured Object Detection and Signature

Recognition Approach

24

Recognition Approach

Sobhan Naderi Parizi (Speaker)

and Alireza Tavakoli Targhi, Omid Aghazadeh and Jan-Olof Eklundh

Computer Vision and Active Perception Laboratory

Royal Institute of Technology (KTH)

Stockholm, Sweden

{sobhannp, att, omida, joe}@kth.se