Fast and Accurate Skew Estimation Based on Distance Transform

An article by: Itay Bar-Yosef, Nate Hagbi, Klara Kedem, Itshak Dinstein Computer Science Department Ben-Gurion University Beer-Sheva, Israel Presented by: Doron Ben-Zion and Michael Wasserstrum Fast and Accurate Skew Estimation Based on Distance Transform


Post on 09-Feb-2016


Page 1: Fast and Accurate Skew Estimation Based on Distance Transform

Fast and Accurate Skew Estimation Based on Distance Transform

An article by: Itay Bar-Yosef, Nate Hagbi, Klara Kedem, Itshak Dinstein
Computer Science Department
Ben-Gurion University
Beer-Sheva, Israel

Presented by: Doron Ben-Zion and Michael Wasserstrum

Page 2: Fast and Accurate Skew Estimation Based on Distance Transform

Distance Transform

A distance transform, also known as a distance map or distance field, is a derived representation of a digital image.

Derived: it is computed by extracting information from the original image.

We label each pixel of the image with the Euclidean distance to the nearest "obstacle pixel", which in our case is a foreground pixel.

For each pixel x: DT(x) = distance(x, y), where y is the foreground pixel nearest to x.
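The definition above can be written down almost literally. The following is a minimal brute-force sketch, not the method used in the article: it assumes the image is a list of rows in which 0 marks a foreground ("obstacle") pixel and 1 marks background, and the name `brute_force_dt` is made up for illustration.

```python
import math

def brute_force_dt(image):
    """Euclidean distance from every pixel to the nearest foreground pixel.

    `image` is a list of rows; 0 marks a foreground ("obstacle") pixel and
    1 marks background (an assumed encoding).
    """
    foreground = [(r, c)
                  for r, row in enumerate(image)
                  for c, value in enumerate(row)
                  if value == 0]
    if not foreground:
        raise ValueError("image has no foreground pixel")
    # DT(x) = min over all foreground pixels y of distance(x, y)
    return [[min(math.hypot(r - fr, c - fc) for fr, fc in foreground)
             for c in range(len(row))]
            for r, row in enumerate(image)]

image = [[1, 1, 1],
         [1, 0, 1],
         [1, 1, 1]]
dt = brute_force_dt(image)
```

This costs one distance evaluation per (pixel, foreground pixel) pair, which is why a faster scheme is worth having for full document images.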

Page 3: Fast and Accurate Skew Estimation Based on Distance Transform

Examples of Distance Transform

Page 4: Fast and Accurate Skew Estimation Based on Distance Transform

The distance transform is sometimes very sensitive to small changes in the object. If, for example, we take this white rectangle with a small black region at its center, the Distance Transform becomes:

Page 5: Fast and Accurate Skew Estimation Based on Distance Transform

The Distance Transform is also very sensitive to noise.

Page 6: Fast and Accurate Skew Estimation Based on Distance Transform

An example of applying the Distance Transform to a real world image is illustrated with:

To obtain a binary input image, we threshold the image at a value of 100, as shown in:

The Distance Transform is:

Page 7: Fast and Accurate Skew Estimation Based on Distance Transform

Simple Example of Using Distance Transform (Manhattan Norm)

Binary image (0 = foreground):

1 1 1 1 1 0 0 1
1 1 1 1 0 1 1 0
1 1 1 1 0 1 1 0
1 1 1 1 1 0 0 1
0 1 1 1 1 1 1 1
1 0 1 1 1 1 1 1
1 1 0 1 1 1 1 1
1 1 1 0 1 1 1 1

Distance transform:

4 4 3 2 1 0 0 1
3 3 2 1 0 1 1 0
2 3 2 1 0 1 1 0
1 2 3 2 1 0 0 1
0 1 2 3 2 1 1 2
1 0 1 2 3 2 2 3
2 1 0 1 2 3 3 4
3 2 1 0 1 2 3 4
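The Manhattan example can be checked with a few lines of code. This is a sketch that assumes 0 marks the foreground and uses a breadth-first search over the four direct neighbors (a swapped-in technique chosen for brevity, not the mask algorithm from the slides); `manhattan_dt` is an illustrative name.

```python
from collections import deque

def manhattan_dt(image):
    """City-block distance to the nearest 0-valued (foreground) pixel."""
    rows, cols = len(image), len(image[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0:
                dist[r][c] = 0
                queue.append((r, c))
    # Breadth-first search from all foreground pixels at once;
    # every step to a 4-connected neighbour costs 1.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

binary = [[1, 1, 1, 1, 1, 0, 0, 1],
          [1, 1, 1, 1, 0, 1, 1, 0],
          [1, 1, 1, 1, 0, 1, 1, 0],
          [1, 1, 1, 1, 1, 0, 0, 1],
          [0, 1, 1, 1, 1, 1, 1, 1],
          [1, 0, 1, 1, 1, 1, 1, 1],
          [1, 1, 0, 1, 1, 1, 1, 1],
          [1, 1, 1, 0, 1, 1, 1, 1]]

expected = [[4, 4, 3, 2, 1, 0, 0, 1],
            [3, 3, 2, 1, 0, 1, 1, 0],
            [2, 3, 2, 1, 0, 1, 1, 0],
            [1, 2, 3, 2, 1, 0, 0, 1],
            [0, 1, 2, 3, 2, 1, 1, 2],
            [1, 0, 1, 2, 3, 2, 2, 3],
            [2, 1, 0, 1, 2, 3, 3, 4],
            [3, 2, 1, 0, 1, 2, 3, 4]]

result = manhattan_dt(binary)
```

Comparing `result` with `expected` confirms that the second matrix on the slide is the Manhattan distance transform of the first.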

Page 8: Fast and Accurate Skew Estimation Based on Distance Transform

Linear Algorithm of Distance Transform

• To estimate the DT of a given binary matrix (which represents an image), we use one of the three "masks" below; each of them represents a different metric.

L1 mask (diamond; the corner neighbors are not used):

0 1 0
1 0 1
0 1 0

L2 mask (Euclidean):

Sqrt(2) 1 Sqrt(2)
1 0 1
Sqrt(2) 1 Sqrt(2)

L∞ mask:

1 1 1
1 0 1
1 1 1

Computers "prefer" working with integers, so to simplify the process we multiply the numbers by 100 while approximately preserving the ratios. The mask entries M[i,j] are indexed by the offsets i (rows) and j (columns), each running from -1 to 1:

        j=-1  j=0  j=1
i=-1    142   100  142
i=0     100   0    100
i=1     142   100  142

Page 9: Fast and Accurate Skew Estimation Based on Distance Transform

Algorithm

For a given image matrix A (m × n) and a given mask M:

Initialize every background pixel to ∞ and every foreground pixel to zero.

Loop 1:
    For k = 1 to m (top down)
        For s = 1 to n (left to right)
            A[k,s] = min{ A[k+i,s+j] + M[i,j] },  -1 ≤ i ≤ 1, -1 ≤ j ≤ 1

Loop 2:
    For k = m down to 1 (bottom up)
        For s = n down to 1 (right to left)
            A[k,s] = min{ A[k+i,s+j] + M[i,j] },  -1 ≤ i ≤ 1, -1 ≤ j ≤ 1

• In our case we will use the L2 (Euclidean) mask.
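The two loops can be sketched in Python as follows. This is an illustrative version under stated assumptions: 0 marks the foreground, neighbors that fall outside the image are simply skipped (the slides do not say how borders are handled), and the names `L2_MASK` and `two_pass_dt` are made up here.

```python
INF = float("inf")

# Integer-scaled Euclidean mask (100 and 142), keyed by the offset (i, j).
L2_MASK = {(-1, -1): 142, (-1, 0): 100, (-1, 1): 142,
           (0, -1): 100,  (0, 0): 0,    (0, 1): 100,
           (1, -1): 142,  (1, 0): 100,  (1, 1): 142}

def two_pass_dt(image, mask=L2_MASK):
    """Two-pass distance transform; 0 = foreground, anything else = background."""
    m, n = len(image), len(image[0])
    # Initialisation: background -> infinity, foreground -> 0.
    a = [[0 if v == 0 else INF for v in row] for row in image]

    def relax(k, s):
        # A[k,s] = min over the mask of A[k+i, s+j] + M[i,j];
        # neighbours outside the image are skipped (an assumption).
        best = a[k][s]
        for (i, j), cost in mask.items():
            kk, ss = k + i, s + j
            if 0 <= kk < m and 0 <= ss < n:
                best = min(best, a[kk][ss] + cost)
        a[k][s] = best          # updated in place

    for k in range(m):                  # Loop 1: top down, left to right
        for s in range(n):
            relax(k, s)
    for k in range(m - 1, -1, -1):      # Loop 2: bottom up, right to left
        for s in range(n - 1, -1, -1):
            relax(k, s)
    return a

dt = two_pass_dt([[1, 1, 1],
                  [1, 0, 1],
                  [1, 1, 1]])
```

For a single foreground pixel in the middle of a 3 × 3 image, `dt` reproduces the integer mask itself, with distances expressed in hundredths of a pixel.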

Page 10: Fast and Accurate Skew Estimation Based on Distance Transform

Explanation

Note that we update A in place and do not create a new matrix (image).

This means that once a pixel's value has changed, its new value is taken into consideration in the following iterations.

Also note that in the first loop we cannot account for values to the right of the pixel, because they may still change later.

That is why we need the second loop!

Page 11: Fast and Accurate Skew Estimation Based on Distance Transform

Distance Transform – even faster!

We can use only part of the mask in each loop!

Distance Transform – more accurate!

If we would like to increase the accuracy of the Distance Transform, we can use a larger mask.

We'll pay more in running time. A 3×3 mask gives "good enough" results.
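One way to read "only part of the mask in each loop" is that the first loop needs only the neighbors it has already visited (above and to the left) and the second loop only those below and to the right. The sketch below assumes that split, the integer L2 mask, and 0 for foreground; `FORWARD`, `BACKWARD`, and `half_mask_dt` are illustrative names.

```python
INF = float("inf")

# Each pass uses only the half of the integer L2 mask that points at
# pixels already processed in that pass (offset -> cost).
FORWARD = {(-1, -1): 142, (-1, 0): 100, (-1, 1): 142, (0, -1): 100}
BACKWARD = {(1, 1): 142, (1, 0): 100, (1, -1): 142, (0, 1): 100}

def half_mask_dt(image):
    """Two-pass distance transform with half masks; 0 = foreground."""
    m, n = len(image), len(image[0])
    a = [[0 if v == 0 else INF for v in row] for row in image]

    def relax(k, s, half):
        for (i, j), cost in half.items():
            kk, ss = k + i, s + j
            if 0 <= kk < m and 0 <= ss < n:
                a[k][s] = min(a[k][s], a[kk][ss] + cost)

    for k in range(m):                  # top down, left to right
        for s in range(n):
            relax(k, s, FORWARD)
    for k in range(m - 1, -1, -1):      # bottom up, right to left
        for s in range(n - 1, -1, -1):
            relax(k, s, BACKWARD)
    return a

dt = half_mask_dt([[1, 1, 1, 1],
                   [1, 1, 0, 1],
                   [1, 1, 1, 1]])
```

Each pixel now inspects four neighbors per pass instead of eight, which roughly halves the work of the inner loop.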

Page 12: Fast and Accurate Skew Estimation Based on Distance Transform

Example of Distance Transform

Page 13: Fast and Accurate Skew Estimation Based on Distance Transform

Skew Estimation:

Document skew estimation is an important step in the process of document analysis.

It affects the performance of subsequent stages of the document capturing process, such as:
• Line extraction.
• Page segmentation.
• OCR (optical character recognition).

We will use the Distance Transform to estimate the skew of a document.

Page 14: Fast and Accurate Skew Estimation Based on Distance Transform

Process Steps

1. Use thresholding to obtain a binarized document.
2. Apply the Distance Transform.
3. Apply a Gaussian blur to smooth the image.
4. Calculate the gradient for each background pixel.
5. Calculate the average orientation for each window.
6. Produce a histogram.
7. Fit a Gaussian on top of the histogram and return the Gaussian's central value!

Page 15: Fast and Accurate Skew Estimation Based on Distance Transform

1. Thresholding

Binarized Documents

A binarized document is a document represented by only two pixel values: 0 and 1. Usually we use 0 for black and 1 for white.

To obtain a binarized document from a grayscale document we simply apply a threshold.

In our case, "Otsu's global thresholding approach" was used.
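Otsu's method picks the gray level that best separates the pixels into two classes by maximizing the between-class variance. The following is a compact sketch of that idea, assuming 8-bit gray levels and the 0 = black, 1 = white convention from this slide; the function names and the tiny sample image are illustrative only.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximises between-class variance.

    `pixels` is a flat list of integer gray levels in 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with level <= t
    sum0 = 0    # sum of the gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

def binarize(gray):
    """Map a 2-D gray image to 0 (black) / 1 (white) using Otsu's threshold."""
    t = otsu_threshold([p for row in gray for p in row])
    return [[1 if p > t else 0 for p in row] for row in gray]

gray = [[200, 200, 30, 200],
        [200, 25, 30, 200],
        [200, 200, 200, 210]]
binary = binarize(gray)
```

The dark "ink" pixels end up as 0, which matches the foreground encoding the distance transform expects.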

Page 16: Fast and Accurate Skew Estimation Based on Distance Transform

Example of Otsu’s Thresholding

Original Image

Binarized Image

Page 17: Fast and Accurate Skew Estimation Based on Distance Transform

Why use Thresholding?

We don't need color information to estimate the document orientation.

A binary image is crucial for the Distance Transform, which is an important step in our skew estimation process.

Page 18: Fast and Accurate Skew Estimation Based on Distance Transform

2. Distance Transform

We use the DT as explained before.

We use the DT because of the observation that the dominant orientation of its gradients accurately reflects the skew of the document.

(a) A portion of a text document image (b) The DT of the document image

Page 19: Fast and Accurate Skew Estimation Based on Distance Transform

3. Gaussian Blur

A Gaussian blur (also known as Gaussian smoothing) is the result of blurring an image by a Gaussian function.
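A Gaussian blur is separable, so it can be applied as a 1-D filter along the rows followed by the same filter along the columns. The sketch below assumes a kernel truncated at three standard deviations and border pixels repeated at the edges; neither choice comes from the slides, and the value of sigma is arbitrary.

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Normalised 1-D Gaussian kernel; radius defaults to ceil(3 * sigma)."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    n = len(image[0])
    return [[sum(kernel[d + radius] * row[min(max(c + d, 0), n - 1)]
                 for d in range(-radius, radius + 1))
             for c in range(n)]
            for row in image]

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma=1.0):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 1.0
smoothed = gaussian_blur(spike, sigma=1.0)
```

A single bright pixel is spread into a symmetric bump, which is the effect that later suppresses small local maxima in the DT.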

Page 20: Fast and Accurate Skew Estimation Based on Distance Transform

Why should we blur the image?

The spaces between characters create local maxima that are irrelevant to the document orientation and disturb the statistics computed later.

We would like to avoid those local maxima. Blurring helps us eliminate the local maxima between characters.

Page 21: Fast and Accurate Skew Estimation Based on Distance Transform

The blurring effect

(c) Gradient orientation field of the DT (d) Gradient orientation field of the smoothed DT

Page 22: Fast and Accurate Skew Estimation Based on Distance Transform

4. Calculate the gradients

ds = the smoothed DT image.

We now have to calculate the gradient for each background pixel.

The gradient direction of ds can be approximated by:
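The formula from this slide did not survive in the transcript, so the following is only one common way to approximate a gradient direction, not necessarily the article's: central differences in x and y, combined with `atan2`. The name `gradient_direction` and the image-coordinate convention (y grows downward) are assumptions.

```python
import math

def gradient_direction(ds, r, c):
    """Approximate gradient direction (radians) of `ds` at row r, column c.

    Uses central differences, falling back to one-sided differences at the
    image border. `ds` must have at least 2 rows and 2 columns.
    Returns None where the gradient vanishes.
    """
    rows, cols = len(ds), len(ds[0])
    r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
    c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
    gx = (ds[r][c1] - ds[r][c0]) / (c1 - c0)   # change along columns (x)
    gy = (ds[r1][c] - ds[r0][c]) / (r1 - r0)   # change along rows (y, downward)
    if gx == 0 and gy == 0:
        return None
    return math.atan2(gy, gx)

ramp_x = [[float(c) for c in range(4)] for _ in range(4)]   # grows to the right
ramp_y = [[float(r) for _ in range(4)] for r in range(4)]   # grows downward
flat = [[1.0] * 4 for _ in range(4)]
```

On `ramp_x` the direction is 0, on `ramp_y` it is π/2, and on `flat` there is no direction to report.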

Page 23: Fast and Accurate Skew Estimation Based on Distance Transform

5. Calculate Orientation

Why: We would like to estimate the orientation robustly.

How: Most methods divide the image into equal-sized windows and average the orientation in each window.

Problem: the dominant gradient vectors between text lines converge on the center of the gap from two opposite directions, so they are expected to cancel each other out when averaged.

For instance:

Page 24: Fast and Accurate Skew Estimation Based on Distance Transform

Solution

Before averaging, we double the angles of all pixels' gradients.

i.e.:

Notice that the gradients now point in the same direction.

Now we can obtain the orientation of a block using:

Where:
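The block formula itself is missing from the transcript. A standard way to average doubled angles, shown here as a sketch rather than as the article's exact expression, is to sum the sines and cosines of 2θ and halve the resulting angle; `block_orientation` is an illustrative name.

```python
import math

def block_orientation(angles):
    """Average orientation (radians, within [-pi/2, pi/2]) of gradient angles.

    Each angle is doubled before averaging so that opposite directions
    (theta and theta + pi) reinforce each other instead of cancelling.
    """
    sin_sum = sum(math.sin(2 * a) for a in angles)
    cos_sum = sum(math.cos(2 * a) for a in angles)
    return 0.5 * math.atan2(sin_sum, cos_sum)

# Gradients on the two sides of a gap between text lines point at each other.
opposite = [math.radians(80), math.radians(80 - 180)]

# Naive vector averaging: the two unit vectors cancel.
naive_x = sum(math.cos(a) for a in opposite)
naive_y = sum(math.sin(a) for a in opposite)

# Doubled-angle averaging recovers the shared orientation.
orientation_deg = math.degrees(block_orientation(opposite))
```

Here the naive sum is numerically zero, while the doubled-angle average returns the 80° orientation shared by both gradients.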

Page 25: Fast and Accurate Skew Estimation Based on Distance Transform

Recall that this orientation is perpendicular to the text lines, and thus:

As mentioned earlier, our method is based on the observation that the dominant orientation of the DT gradient vectors is perpendicular to the text lines.

Now we will produce a histogram…

Page 26: Fast and Accurate Skew Estimation Based on Distance Transform

6. Histogram

In order to estimate the dominant orientation, we calculate a histogram, hθ, of the orientations obtained.

hθ contains 18,000 bins, which determines the angular resolution of the estimate.

The angle that corresponds to the peak of hθ is the estimated skew angle.
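A histogram of this kind can be sketched as below. The 18,000 bins come from the slide; the range of −90° to 90° is an assumption made here (it is the range a halved doubled angle can take), and under that assumption each bin is 0.01° wide. The function names and sample angles are illustrative.

```python
NUM_BINS = 18000
ANGLE_MIN, ANGLE_MAX = -90.0, 90.0     # assumed orientation range, in degrees
BIN_WIDTH = (ANGLE_MAX - ANGLE_MIN) / NUM_BINS

def orientation_histogram(angles_deg):
    """Count how many orientations fall into each bin."""
    hist = [0] * NUM_BINS
    for angle in angles_deg:
        index = int((angle - ANGLE_MIN) / BIN_WIDTH)
        index = min(max(index, 0), NUM_BINS - 1)   # clamp angle == ANGLE_MAX
        hist[index] += 1
    return hist

def peak_angle(hist):
    """Centre (in degrees) of the fullest bin."""
    peak = max(range(NUM_BINS), key=hist.__getitem__)
    return ANGLE_MIN + (peak + 0.5) * BIN_WIDTH

# Mostly consistent window orientations plus a few outliers.
angles = [20.004] * 50 + [19.5, 20.5, -3.0, 41.2]
estimate = peak_angle(orientation_histogram(angles))
```

Taking the peak makes the estimate insensitive to the outlying windows, which a plain mean would not be.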

Page 27: Fast and Accurate Skew Estimation Based on Distance Transform

(a) A document image rotated at 20°. (b) Corresponding histogram hθ.

(c) A document image rotated at −30°. (d) Corresponding histogram hθ.

Page 28: Fast and Accurate Skew Estimation Based on Distance Transform

7. Fit a Gaussian on top of the histogram

Page 29: Fast and Accurate Skew Estimation Based on Distance Transform

And Finally…

The Gaussian allows us to return an accurate value of the skew.

We return the Gaussian's central value!
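The slides do not say how the Gaussian is fitted. One simple possibility, offered only as a sketch, uses the fact that the logarithm of a Gaussian is a parabola: the parabola through the log-counts of the peak bin and its two neighbors gives a sub-bin estimate of the center. The name `gaussian_peak_center` and its parameters are made up for illustration.

```python
import math

def gaussian_peak_center(hist, bin_width=1.0, origin=0.0):
    """Sub-bin centre of the histogram peak, assuming it is Gaussian-shaped.

    Fits a parabola to the log-counts of the peak bin and its two
    neighbours. `origin` is the angle at the centre of bin 0.
    """
    p = max(range(len(hist)), key=hist.__getitem__)
    if p == 0 or p == len(hist) - 1 or min(hist[p - 1], hist[p + 1]) <= 0:
        return origin + p * bin_width      # cannot fit; use the raw peak
    la, lb, lc = (math.log(hist[i]) for i in (p - 1, p, p + 1))
    curvature = la - 2.0 * lb + lc
    if curvature >= 0:                     # flat top: no parabola vertex
        return origin + p * bin_width
    offset = 0.5 * (la - lc) / curvature
    return origin + (p + offset) * bin_width

# Synthetic histogram sampled from a Gaussian centred between two bins.
true_center, sigma = 5.3, 2.0
hist = [1000.0 * math.exp(-((x - true_center) ** 2) / (2 * sigma ** 2))
        for x in range(11)]
center = gaussian_peak_center(hist)
```

The raw peak of this synthetic histogram is bin 5, while the fitted center recovers 5.3, which is how a Gaussian fit can improve on the bin resolution.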

Page 30: Fast and Accurate Skew Estimation Based on Distance Transform

Questions?