solving jigsaw puzzles using computer vision · 2020. 9. 6. · solving jigsaw puzzles using...

SOLVING JIGSAW PUZZLES USING COMPUTER VISION

AUTHOR: AREEJ MAHDI
SUPERVISOR: REIN VAN DEN BOOMGAARD
DATE: JUNE 22, 2005
SIGNED BY:

Bachelor Graduation Project

Solving Jigsaw Puzzles Using Computer Vision Areej Mahdi

University of Amsterdam [email protected]

Abstract

This paper presents a method for solving jigsaw puzzles using shape alone. The method works on scans of a real jigsaw puzzle. The pieces of the puzzle are extracted and segmented from the image using computer vision techniques. Each piece is then split into four curves, one for each side of the piece. The four corners of each piece are detected using the curvature scale space technique, which has not been used in previous work on solving jigsaw puzzles by computer. Procrustes Analysis is used for the local matching. The method is tested on a 24-piece puzzle.

1. Introduction

Solving a jigsaw puzzle is an interesting problem both theoretically and practically. Solving jigsaw puzzles by computer touches on several known problems in computer vision and in the matching of 2D subcurves. A solution can be applied to practical problems such as reconstructing archaeological artefacts or fitting a protein with a known amino acid sequence to a 3D electron density map [1,2]. Even so, solving a jigsaw puzzle is by its nature a very challenging problem. The first paper on solving jigsaw puzzles by shape alone dates back to 1964 [3]. Since then many articles have been published on this subject using only the shape of the pieces. Although quite a few algorithms have been introduced, no known algorithm solves large jigsaw puzzles reliably and efficiently.

The purpose of my research is to build a system that solves apictorial jigsaw puzzles from an image containing all the pieces of the puzzle. Computer vision is necessary for this because the solution depends mainly on the shape of the pieces. Accurate extraction of the information from the image is therefore essential for a good representation of the puzzle pieces and eventually a correct solution of the puzzle.

Section 2 describes related work on jigsaw puzzle solving and the chosen approach. Sections 3 and 4 describe the setup of the jigsaw puzzle solver and its implementation in more detail. The experiments and results are presented in section 5. Section 6 contains the conclusions and section 7 presents recommendations for future work.

2. Related work

This section describes previous work and the chosen approach. The published papers use a number of different approaches to solve jigsaw puzzles. Generally they consist of the same stages: pre-processing the images of the pieces, matching the sides of the pieces (local matching) and finally matching the jigsaw pieces (global matching). However, the way they deal with the puzzle pieces differs, as does their implementation.

Two methods are widely used to represent the boundary of the pieces: the polygon approximation, which is used in most of the published work, and the Freeman chain code [3]. The advantage of chain code is its compactness: each point on the boundary costs about 2 to 3 bits. Chain code is also invariant to translation, and it can be scaled and rotated. In addition, the length of the contour is easy to calculate from the chain code, as is the area of a closed loop, and the contour is easy to check for closure and for multiple loops.

In some papers a piece is treated as a sequence of strings, and the local shape analysis is based on approximate string matching [4]. Others transform a piece into a skeleton and apply most of the processing to the skeleton [5]. The remaining methods use the curves of the pieces themselves to solve the puzzle.
A method that worked very well was the algorithm given by Wolfson. It was able to solve a 104-piece puzzle [6]. A later work based on Wolfson's approach was able to solve a 204-piece puzzle [1]. This method differs in how the interior pieces are placed. However the algorithm was not widely tested. It also required many hand-tuned parameters.

In this paper the same overall approach as that of Wolfson is followed, but some new techniques are used. The following assumptions are made: the jigsaw puzzle must have a rectangular outside border, each piece must have well-defined neighbours and, finally, the pieces must interlock with their neighbours by tabs.

The jigsaw puzzle is solved in a number of phases. The first is the pre-processing step, in which the pieces are segmented one by one from the image. Then the border of each piece is extracted and represented. After each piece has been split into four curves, each curve is matched with every other curve using a local matching algorithm. This results in a score of how well the curves match each other. A global algorithm can then be used to assemble the jigsaw puzzle. This is the general set-up of the research. Each of the phases is described in detail below. The method is tested on a 24-piece puzzle.

3. Pre-processing

Obtaining the pieces

The pieces of the jigsaw puzzle are obtained by scanning them. This produces a black and white image containing all the jigsaw pieces (see figure 1). All subsequent processing is done on a binary image. The pieces are placed on the scanner in random orientations and positions. The image contains noise, which must first be removed so that all the pieces in the image can be recognized and segmented properly. To remove the noise, morphological operations are applied: an 'opening-by-reconstruction' followed by a 'closing-by-reconstruction' removes the noise and fills up the holes in the pieces.
The next step is to segment each of the pieces from the image. The pieces lie rather close to each other in the image and in some positions even touch slightly. By applying the Watershed Transformation we are able to segment the image into areas, each containing one of the pieces. The pieces are then stored separately.

The pieces have been numbered manually to be able to check if the solution is correct.

figure 1: The image obtained from scanning the pieces of the puzzle.

Extracting the borders

The next step is to extract and represent the border of the pieces. To extract the border, the inner pixels of each piece in the image are first removed so that only the border is left. Then a contour algorithm is applied to obtain the x,y coordinates of the boundary of each piece. These coordinates are needed to represent the border. Other methods to extract the border are available, but this approach is both simple and effective. Correctly matching puzzle pieces requires a good boundary description method. We therefore represent the border using the 8-connectivity version of the Freeman chain code, because it also includes diagonal connections.

Detecting the corner points

The next step is to split each piece into four curves so that they can be matched later on. It is therefore necessary to extract the four corners of each piece. For the detection of the corners we use the curvature scale space. The curvature scale space is a multi-scale representation of planar curves. It describes curves at varying levels of detail, which makes it possible to detect relevant information at an appropriate scale. The curvature is defined in terms of the first and second derivatives of the boundary coordinates x(u), y(u), where u is the length of the path along the curve. To express the different levels of detail, the boundary coordinate functions are convolved with the Gaussian function g(u,σ), producing the Gaussian smoothed coordinates X(u,σ) and Y(u,σ) [7]:

\[ X(u,\sigma) = x(u) \otimes g(u,\sigma), \qquad Y(u,\sigma) = y(u) \otimes g(u,\sigma) \]

where

\[ g(u,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-u^{2}/(2\sigma^{2})} \]

The curvature is then calculated according to the following formula:

\[ \kappa(u,\sigma) = \frac{X_u(u,\sigma)\,Y_{uu}(u,\sigma) - X_{uu}(u,\sigma)\,Y_u(u,\sigma)}{\left(X_u(u,\sigma)^{2} + Y_u(u,\sigma)^{2}\right)^{3/2}} \]

where

\[ X_u = x \otimes g_u, \quad X_{uu} = x \otimes g_{uu}, \quad Y_u = y \otimes g_u, \quad Y_{uu} = y \otimes g_{uu} \]

and g_u and g_{uu} are the first and second derivatives of g(u,σ) with respect to u.
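The curvature computation described above can be sketched in a few lines of Python. This is only an illustration and not the Matlab code used in this project. It assumes a Gaussian kernel truncated at 4σ and renormalised, and it takes the derivatives of the smoothed coordinates with central differences instead of convolving with the Gaussian derivatives. All function names, and the threshold passed to `curvature_maxima`, are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Sampled Gaussian, truncated at 4 sigma and renormalised to sum to 1 (assumption)."""
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-(t * t) / (2.0 * sigma * sigma)) for t in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_closed(values, sigma):
    """Circular convolution of one coordinate function of a closed contour."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [sum(kernel[j + radius] * values[(i - j) % n] for j in range(-radius, radius + 1))
            for i in range(n)]

def curvature(xs, ys, sigma):
    """Curvature at every contour point after Gaussian smoothing at scale sigma."""
    X, Y = smooth_closed(xs, sigma), smooth_closed(ys, sigma)
    n = len(X)
    kappa = []
    for i in range(n):
        nxt = (i + 1) % n
        xu = (X[nxt] - X[i - 1]) / 2.0           # first derivatives (central differences)
        yu = (Y[nxt] - Y[i - 1]) / 2.0
        xuu = X[nxt] - 2.0 * X[i] + X[i - 1]     # second derivatives
        yuu = Y[nxt] - 2.0 * Y[i] + Y[i - 1]
        kappa.append((xu * yuu - xuu * yu) / (xu * xu + yu * yu) ** 1.5)
    return kappa

def curvature_maxima(kappa, threshold):
    """Indices where |curvature| is a strict local maximum above the threshold."""
    n = len(kappa)
    mags = [abs(k) for k in kappa]
    return [i for i in range(n)
            if mags[i] > threshold and mags[i] > mags[i - 1] and mags[i] > mags[(i + 1) % n]]

# Usage: a square contour traversed counter-clockwise, one pixel per step.
side = 40
square = ([(i, 0) for i in range(side)] + [(side, i) for i in range(side)]
          + [(side - i, side) for i in range(side)] + [(0, side - i) for i in range(side)])
xs = [float(p[0]) for p in square]
ys = [float(p[1]) for p in square]
corner_candidates = curvature_maxima(curvature(xs, ys, 3.0), 0.1)
```

On a clean contour such as this square the local maxima coincide with the corners. On the noisy scanned pieces many more maxima appear, which is why the selection heuristics described in this section are needed.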

Corners of the pieces can be detected because of the high curvature at the corner points. Other interesting points in a piece can also be detected using the curvature scale space, for example the points that correspond to the inflection points of the tabs, because these points also have high curvature. Unfortunately the pieces of the puzzle have very ragged boundaries and curves, caused by the noisy image. This results in many high-curvature points that are neither corner points nor points of interest in the piece. It is therefore important to choose an appropriate scale, one that yields the smallest number of high-curvature points while still containing all the actual corner points of a piece. A higher scale produces fewer points: smoothing the curve more filters out more of the noisy points. But since some pieces contain rather large corner angles, these corners are lost at high scales. Therefore a method must be used to select the real corner points.

The method used in this paper is the first to combine possible corner points found at two different scales. The chosen scales both have to contain all four true corner points. By comparing the points obtained at the two scales, some false corner points can be eliminated. Points that appear at one scale and disappear at the other are not considered true corner points, so they are deleted. Possible corner points that have almost the same position on the curve at both scales are considered to be the same point. It is not easy to know which of the two is the better candidate, and one does not want to accidentally delete the best corner point, so the two points are replaced with a point that lies between them. The real corner points do not lie close to each other, so possible corner points that lie close to one another are examined and the point with the lowest curvature is deleted.

Although this eliminates many false corners, the number of possible corners is still greater than four. To choose the corner points we take the three points with the highest curvature. These three points are used to construct a parallelogram, and the fourth corner point is expected near its remaining vertex.

4. Solving the jigsaw puzzle

The detected corner points are used to split the chain code of each piece into four sub-chain codes. The sub-chain codes are matched to each other by the local matching algorithm.

Local Matching Algorithm

To match the pieces we use the Procrustes Analysis. This is a method for comparing the shapes of objects. The shape of each object must be described by a finite set of points.
The Procrustes Analysis determines the linear transformation (translation, reflection, orthogonal rotation and scaling) of the points in matrix X of one object that best fits the points in matrix Y of another object, so that [8]:

\[ Y = s X R + \mathbf{1} t^{T} \]

where s is the scale factor, R is the rotation and reflection matrix, t is the translation vector and 1 is a column vector of ones.
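For planar curves this fit has a closed form, which the following Python sketch illustrates. It is not the Matlab routine used in this project. It assumes that the points are treated as complex numbers, so that scaling and rotation together become one complex factor, and that the dissimilarity is the sum of squared errors divided by the spread of the target curve. The function name and the `allow_reflection` flag are hypothetical.

```python
def procrustes_dissimilarity(x_pts, y_pts, allow_reflection=True):
    """Normalised sum of squared errors after fitting x_pts onto y_pts.

    Both arguments are equally long lists of (x, y) tuples. The result lies
    between 0 (identical shape) and 1. Degenerate curves whose points all
    coincide are not handled in this sketch.
    """
    if len(x_pts) != len(y_pts):
        raise ValueError("both curves need the same number of points")

    def centred(points):
        # Subtracting the centroid removes the translation t.
        zs = [complex(px, py) for px, py in points]
        mean = sum(zs) / len(zs)
        return [z - mean for z in zs]

    def residual(zs, ws):
        # The best complex factor a = s * exp(i * angle) minimising sum |w - a z|^2
        # is cross / zz; substituting it gives the normalised residual below.
        cross = sum(z.conjugate() * w for z, w in zip(zs, ws))
        zz = sum(abs(z) ** 2 for z in zs)
        ww = sum(abs(w) ** 2 for w in ws)
        return 1.0 - abs(cross) ** 2 / (zz * ww)

    zs, ws = centred(x_pts), centred(y_pts)
    best = residual(zs, ws)
    if allow_reflection:
        # A reflection corresponds to complex conjugation of one point set.
        best = min(best, residual([z.conjugate() for z in zs], ws))
    return best

# Usage: the same curve after scaling, rotation and translation fits almost perfectly.
curve = [(0, 0), (1, 0), (2, 1), (3, 3), (4, 2)]
moved = [(w.real, w.imag)
         for w in (complex(1.2, 1.6) * complex(x, y) + complex(5, -3) for x, y in curve)]
d = procrustes_dissimilarity(curve, moved)
```

A value of `d` close to zero indicates a good fit. The value returned for a pair of unrelated curves depends on how the curves are sampled, so any threshold has to be tuned on real data.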

The matrices X and Y must have the same number of points. Therefore the curves that we are trying to match must be resampled so that their lengths correspond to each other. If the curves have different lengths, the shorter curve is matched repeatedly with the longer curve, each time with a different part of the longer curve. The goodness-of-fit criterion is the sum of squared errors. Applying the Procrustes Analysis to every pair of curves yields a matrix of minimized dissimilarity values. It is a 4N x 4N matrix, where N is the number of pieces. If the dissimilarity value is smaller than a specified threshold, the match is considered good. To avoid matching straight sides, the straight sides are recognized before the curves are matched with each other.

Global matching

Based on the scores of the local matching, a global algorithm can be implemented to assemble the pieces. It is therefore essential that the local matching algorithm provides good matches. For the assembly, the pieces are divided into two types: frame pieces and interior pieces. A frame piece is a piece that has one or more straight sides. The frame pieces are assembled first, so that the interior pieces can then be fitted into the frame. The frame is assembled by matching the frame pieces with each other so that they form a cycle. The cycle should contain each piece exactly once and have the smallest total dissimilarity between neighbouring pieces. This is the well-known Travelling Salesman problem. Once a cycle is found, the remaining pieces can be matched into this frame. One can use the same approach as Wolfson [6].
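The frame assembly described above can be illustrated with a small Python sketch. The global matching was not implemented in this project, so this is only an assumption of what the search could look like. The matrix `cost` is hypothetical: `cost[i][j]` is taken to be the dissimilarity of placing frame piece j directly after frame piece i along the frame. The exhaustive search is only feasible for a handful of pieces, because the number of cycles grows factorially.

```python
from itertools import permutations

def best_frame_cycle(cost):
    """Return the cheapest cycle through all frame pieces and its total cost.

    Piece 0 is fixed as the start so that rotations of the same cycle are
    not enumerated more than once.
    """
    n = len(cost)
    best_order, best_total = None, float("inf")
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        total = sum(cost[order[k]][order[(k + 1) % n]] for k in range(n))
        if total < best_total:
            best_order, best_total = order, total
    return list(best_order), best_total

# Usage with four hypothetical frame pieces: the cheap links are 0->2, 2->1, 1->3, 3->0.
cost = [
    [9.0, 9.0, 0.01, 9.0],
    [9.0, 9.0, 9.0, 0.02],
    [9.0, 0.01, 9.0, 9.0],
    [0.02, 9.0, 9.0, 9.0],
]
order, total = best_frame_cycle(cost)
```

For a real puzzle frame a heuristic or a branch-and-bound search would be needed instead of enumerating every permutation.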

5. Experiments and results

All the code was implemented in Matlab. The algorithm was tested on a 24-piece jigsaw puzzle that meets the assumptions made in section 2. The image of the jigsaw puzzle was obtained by laying the pieces upside down on their backside on a flatbed scanner at a resolution of 300 dpi. The pieces were placed at random positions and orientations with no overlap.

After the segmentation and representation of each piece, the corners of each piece are determined. The results of the experiments are shown in table 1. This table contains the number of real corner points that are detected at each step of the experiment. The pieces are all given an id number according to their location in the puzzle: they are numbered from left to right, top to bottom. This makes it easy to know which piece we are dealing with.

To detect the corners, the curvature scale space is used at scales 18 and 30. Scale 18 is the highest scale at which all four real corner points can be found for all the pieces. At scale 30, however, some real corner points are lost for some of the pieces. This is the case for pieces number 4 and 5. These pieces contain corner points with a rather large angle, which are hard to recognize because of their low curvature. Using a high scale of 30 makes the detection even more difficult. Nevertheless this scale is chosen because it provides a sufficient level of detail for the corner points to be found easily and it works for nearly all the pieces.

The results of scales 18 and 30 are combined to obtain one set of points. This set is narrowed down throughout the experiments. After applying the heuristics described in section 3, the four corners of 19 of the pieces are still among the points found with high curvature (see table 1). However, when the ten points with the highest curvature are selected for each piece, all four corners are among these points for only fourteen of the pieces. For four of the pieces the number of real corner points in the top ten is two, and for the remaining six pieces it is three. This is because some of these pieces contain corners with rather large angles, so the real corner points do not have the highest curvature compared to other points in the same piece. When the four points with the highest curvature are taken as the four corners of a piece, only one piece is identified successfully. When the three points with the highest curvature are taken, the corners of eight of the pieces are detected correctly.
The top 3 produces better results than the top 4 because the method used to select the fourth corner chooses the candidate point closest to the position that forms a parallelogram together with the other three corners. The expected position of the fourth point is calculated and displayed as a circle in the figures. The distance between each remaining candidate point and the expected position is calculated, and the point with the smallest distance is selected as the fourth corner. For one of the eight pieces this happened to be a point other than the real corner point, see figure 2. As can be seen in the figure, the true corner point lies further from the expected position than the point chosen as the fourth corner. For pieces that have fewer than three true corner points among the points with the highest curvature, this results, as expected, in a wrong estimation of the fourth corner point, see figure 3. The detection of the corners works correctly for seven pieces, see figure 4. These pieces are emphasized in the table. Of these seven pieces, six are connected to each other in the puzzle at five different places.

figure 2: A jigsaw piece with three successfully detected corners. The selection of the fourth corner, however, is incorrect. The red crosses are the points found in the top 3. The circle is the estimated location of the fourth corner and the green cross marks the selected point.

figure 3: A jigsaw piece with only two successfully detected corner points. The red crosses are the points found in the top 3. The circle marks the estimated location of the fourth corner and the green cross marks the selected point.

figure 4: A piece with successfully detected corners. The red crosses are the points found in the top 3. The circle marks the estimated location of the fourth corner and the green cross marks the selected point.
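The selection of the fourth corner shown in these figures can be sketched in Python as follows. Three points leave three possible parallelograms, and the text does not state which one is used. The sketch therefore assumes that the two known corners furthest apart form a diagonal, which holds for roughly rectangular pieces. The function names are hypothetical.

```python
import math

def expected_fourth_corner(a, b, c):
    """Complete the parallelogram, assuming the longest pair of corners is a diagonal."""
    pairs = [((a, b), c), ((a, c), b), ((b, c), a)]
    (p, q), middle = max(pairs, key=lambda item: math.dist(item[0][0], item[0][1]))
    # The fourth vertex lies opposite the corner that is not on the diagonal.
    return (p[0] + q[0] - middle[0], p[1] + q[1] - middle[1])

def select_fourth_corner(a, b, c, candidates):
    """Pick the remaining candidate point closest to the expected position."""
    expected = expected_fourth_corner(a, b, c)
    chosen = min(candidates, key=lambda point: math.dist(point, expected))
    return expected, chosen

# Usage with three hypothetical corners and three remaining candidate points.
expected, chosen = select_fourth_corner(
    (0, 0), (100, 5), (98, 82),
    candidates=[(-1, 80), (50, 90), (40, 40)],
)
```

As figure 2 shows, the closest candidate is not always the true corner, so a distance limit or an additional curvature check could be added to this selection.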

The corner points of some pieces are eliminated in the first rounds because in these pieces the starting point of the curve is very close to a corner. A wrap-around of 5 points is used, but it is difficult to say how long the wrap-around must be. If it is too large, incorrect calculations will be made, for example in the chain code and the curvature of the piece. For pieces whose starting point is far from the corner points the real corners can be detected; this is the case for the eight pieces that were successfully found in the top 3 experiment. The starting point used is always on the top side of the piece. Because the pieces were placed randomly on the scanner platen, this starting point sometimes happens to lie rather close to a corner point. So it is not always possible to know the best wrap-around value, because each piece has a random orientation. Experimenting with the data made clear that if another starting point is chosen, the four corners are successfully detected for pieces other than the previous eight.

Piece ID   18+30   Top 10   Top 4   Top 3   4th point estimation
 1         4       4        3       2       2
 2         4       4        3       2       3
 3         4       4        2       2       2
 4         4       3        2       1       1
 5         3       2        2       2       2
 6         3       3        3       2       2
 7         4       4        3       2       2
 8         4       4        2       2       2
 9 *       4       4        3       3       4
10 *       4       4        3       3       4
11         4       4        2       2       2
12         4       4        2       2       2
13         2       2        1       1       1
14 *       4       3        3       3       4
15 *       4       3        3       3       4
16 *       4       4        3       3       4
17         4       4        4       3       3
18         4       2        2       1       2
19 *       4       4        3       3       4
20         2       2        2       1       1
21         4       3        3       2       2
22 *       4       4        3       3       4
23         4       4        2       2       2
24         3       3        2       1       1

Table 1: Results of all the experiments performed to determine the corner points of each piece. The numbers in the table are the number of real corner points detected in each experiment. Pieces for which all corner points are successfully detected are emphasized (marked with *).

The matching algorithm produces a dissimilarity matrix. A match is considered successful if the dissimilarity value is below a certain threshold value. A threshold value of 0.03 is used, which was determined by trial and error. Table 2 contains the results of the matching algorithm. The first column contains the piece id and the side of the piece that is matched with the piece and side in the second column. The correct matches are emphasized in the table. The local matching algorithm produces nine matches, of which three are correct, see table 2. The results of the local matching thus contain a number of false positive matches. Three of these are impossible matches, between two indents or two outdents. Specifying whether a curve is an indent or an outdent can probably reduce the number of false positive matches.

Matched Pieces
16.1   22.2
9.2    10.4
14.1   10.1
14.2   15.4
14.3   10.1
19.1   22.2
19.2   15.3
19.3   15.1
22.3   10.1

Table 2: Results of the local matching algorithm. The piece and curve number in the first column are considered a successful match with the piece and curve in the second column. The correct matches are emphasized.
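The suggestion to distinguish indents from outdents could be sketched in Python as follows. This is not part of the implemented system. It assumes that each side is a list of points running from one corner to the next, traversed counter-clockwise around the piece with the y axis pointing up, so that the interior of the piece lies to the left of the direction of travel. The function names and the tolerance `straight_tol` are hypothetical.

```python
def classify_side(points, straight_tol=0.05):
    """Label one side of a piece as 'straight', 'indent' or 'outdent'."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    chord = (dx * dx + dy * dy) ** 0.5
    # Signed distance of every point from the corner-to-corner chord;
    # positive values lie to the left of the direction of travel (inside the piece).
    deviations = [(dx * (py - y0) - dy * (px - x0)) / chord for px, py in points]
    extreme = max(deviations, key=abs)
    if abs(extreme) < straight_tol * chord:
        return "straight"
    return "indent" if extreme > 0 else "outdent"

def may_match(side_a, side_b):
    """Only an indent and an outdent can interlock."""
    return {classify_side(side_a), classify_side(side_b)} == {"indent", "outdent"}

# Usage: a bump towards the interior is an indent, its mirror image an outdent.
indent = [(0, 0), (3, 0), (4, 2), (6, 2), (7, 0), (10, 0)]
outdent = [(x, -y) for x, y in indent]
labels = (classify_side(indent), classify_side(outdent))
```

Running such a check before the local matching would skip pairs of two indents or two outdents. With image coordinates, where the y axis points down, the two labels swap.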

To prevent the algorithm from performing impossible matches, such as matching a straight side of a piece with other sides, the straight sides in the puzzle must be recognized. A straight side of one of the frame pieces is matched with every side of the other pieces. If the dissimilarity of a match is below a certain threshold value, that side is considered a straight side and is not matched with sides of other pieces. The threshold value is set to 0.03. This method successfully recognized all frame sides of the pieces. This can later be used in the global matching algorithm so that only frame pieces are matched at first.

The matches produced by the local matching are in fact good, considering that the images are of low resolution and the curves are very noisy. Unfortunately, due to time limitations, the global matching has not been implemented.

6. Conclusion

A method to solve jigsaw puzzles has been presented that uses techniques not used in previous work on solving jigsaw puzzles, such as the curvature scale space. The curvature scale space is applied to detect the corners of each piece. It succeeded in detecting all the corners for some of the pieces. However, some improvements must be made to make full use of the potential of these methods. Better results are probably possible if more scales are combined to detect the corner points. Applying more heuristics may also help. Furthermore, the pieces on which the curvature scale space was used contained very noisy curves.

The implemented local matching algorithm was successful in finding three out of the five correct matches. However, the algorithm also returned a number of false positive matches. A number of these matches are impossible ones, so a strong suggestion is to prevent such matches from happening. This should improve the performance of the matching algorithm.

7. Future work

It is possible to obtain better results by applying some changes.
Finding the correct corner points of the jigsaw pieces can be improved. In this paper we used the curvature scale space at two different scales and combined the results. Using more than two scales may give more accurate findings and thus a better detection of the corner points. It became clear from the experiments that choosing another starting point, one that is not very close to a corner, results in better corner detection.

Furthermore, the method by which the final corner points are chosen can be enhanced. This method first chooses the three possible corner points with the highest curvature as three of the final corners. Applying more heuristics before this step may lead to better results, so that for more pieces the remaining points with the highest curvature are indeed the corner points. The way the fourth corner point is determined can also be improved.

The matching of the curves can produce a better score if we specify whether a curve being matched is an indent or an outdent. This will eliminate many impossible matches, such as matching two indents together, and lead to a more accurate matching algorithm. Because the curves are so noisy, smoothing them a bit before applying the matching algorithm may also improve the matching results.

The images used in this paper are of relatively low resolution. This resulted in a lot of noise in the image and the curves. Another possible reason for the noisy curves is that the jigsaw used in the development of the method described in this paper is hand-made from wood, so the boundaries of the pieces are not entirely smooth.

References

[1] D. Goldberg, C. Malon and M. Bern. A Global Approach to Automatic Solution of Jigsaw Puzzles. SoCG ’02, June 5-7, 2002.

[2] W. Kong and B.B. Kimia. On solving 2D and 3D puzzles using curve matching. Proc. IEEE Computer Vision and Pattern Recognition, 2001.

[3] H. Freeman and L. Gardner. Apictorial jigsaw puzzles: The computer solution of a problem in pattern recognition. IEEE trans. on Electronic Computers 13 (1964) 118-127.

[4] H. Bunke and G. Kaufmann. Jigsaw puzzle solving using approximate string matching and best-first search. In Computer Analysis of Images and Patterns, Chetverikov and Kropatsch, eds., Springer (1993) 299-308.

[5] R.W. Webster, P.S. LaFollette and R.L. Stafford. Isthmus Critical Points for Solving Jigsaw Puzzles in Computer Vision. IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, no. 5, 1991. 1271-1278.

[6] H. Wolfson, E. Schonberg, A. Kalvin and Y. Lamdan. Solving jigsaw puzzles by computer. Annals of Operations Research 12 (1988), 51-64.

[7] F. Mokhtarian, K. Mackworth. A Theory of Multiscale, Curvature-Based Shape Representation for Planar Curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 8, 1992. 789-805.

[8] I. Borg and P. Groenen. Modern Multidimensional Scaling: Theory and Applications. Springer Series in Statistics (1997) 339-355.

Solving Jigsaw Puzzles using Computer Vision

Name: Areej Mahdi

Supervisor: Rein van den Boomgaard

Date: July 1, 2005

Overview

Introduction

Related work

Preprocessing

Local Matching

Global Matching

Discussion

Conclusions

Questions

Introduction

The goal of this project: building a system that solves jigsaw puzzles by shape alone from an image containing all the pieces of the puzzle

Interesting problem

Reconstruction of archeological artifacts

Art restoration

Fitting a protein with known amino acid sequence to a 3D density map

Related Work (1)

First article written on this subject goes back to 1964

Different approaches generally consist of the same phases

Wolfson’s approach

Related Work (2)

Succeeded in solving 104-piece puzzle

Divided border into four sub curves corresponding to the four sides

Local Schwartz-Sharir matching algorithm

Global matching

Overall approach of Wolfson but applying new techniques

Jigsaw Puzzle

Assumptions:

Rectangular outside border

Four well defined neighbors

Pieces interlock with their neighbors by tabs

Preprocessing: Obtaining the pieces

Scanning the pieces

Removing the noise

Segmenting the image

Preprocessing: Extracting the pieces

Scanning the pieces

Removing the noise


Preprocessing: Extracting the border

Scanning the pieces

Removing the noise


Preprocessing: Representing the border

Borders are represented by Chain Code

Chain code describes relative step from previous pixel

8-connectivity Chain Code: 2,1,0,7,7,0,1,1
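A short Python sketch of this representation is given below. It assumes the common Freeman numbering in which code 0 is a step in the +x direction and the codes increase counter-clockwise with the y axis pointing up; the slides do not state which numbering was used. The function names are hypothetical.

```python
import math

# Assumed Freeman 8-connectivity numbering: code k is the step (dx, dy),
# counted counter-clockwise starting from the +x direction.
STEPS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def chain_code(pixels):
    """Encode a list of 8-connected boundary pixels as Freeman chain codes."""
    return [STEPS.index((x1 - x0, y1 - y0))
            for (x0, y0), (x1, y1) in zip(pixels, pixels[1:])]

def decode(start, codes):
    """Rebuild the pixel coordinates from a start pixel and its chain codes."""
    pixels = [start]
    for code in codes:
        dx, dy = STEPS[code]
        x, y = pixels[-1]
        pixels.append((x + dx, y + dy))
    return pixels

def contour_length(codes):
    """Even codes are axis-aligned steps of length 1, odd codes diagonal steps of length sqrt(2)."""
    return sum(1.0 if code % 2 == 0 else math.sqrt(2.0) for code in codes)

# Usage with the example code from the slide.
codes = [2, 1, 0, 7, 7, 0, 1, 1]
pixels = decode((0, 0), codes)
length = contour_length(codes)
```

Only the start pixel and one code per step have to be stored, which is why the representation is compact.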

Preprocessing: Dividing the border

Finding the corners of the pieces: divide the border into 4 subcurves

Use Curvature Scale Space

Curvature Scale Space (1)

Mokhtarian

Describes curves at varying levels of details

Detection of relevant information at an appropriate scale

It is invariant under rotation, uniform scaling and translation

Very suitable for noisy curves

Curvature Scale Space (2)

Curve defined in terms of the derivatives of the boundary coordinates x(u), y(u), where u is the length of the path along the curve

To express the different levels of details the boundary coordinate functions are convolved with the Gaussian function g(u,σ) producing Gaussian smoothed coordinates X(u,σ) and Y(u,σ)

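Curvature Scale Space (3)

The curvature of a curve is calculated as follows:

\[ \kappa(u,\sigma) = \frac{X_u(u,\sigma)\,Y_{uu}(u,\sigma) - X_{uu}(u,\sigma)\,Y_u(u,\sigma)}{\left(X_u(u,\sigma)^{2} + Y_u(u,\sigma)^{2}\right)^{3/2}} \]

where X_u, Y_u and X_{uu}, Y_{uu} are the first and second derivatives with respect to u of the smoothed coordinates X(u,σ) and Y(u,σ)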

Piece nr. 11

Piece nr. 11, scale: 18

Piece nr. 19

Detecting the corners (1)

More than four points are detected with high curvature even at high scales

[Figure: plot, x axis 0 to 250, y axis 20 to 200]

Detecting the corners (2)

Combining different scales to eliminate false corner points

[Figure: two plots, each with x axis 0 to 250 and y axis 20 to 200]
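The combination of two scales can be sketched in Python as follows. The candidate points of each scale are assumed to be given as (contour index, curvature magnitude) pairs on a closed contour of n points. The tolerances `same_tol` and `min_separation`, the averaging of the two curvature values and the function names are assumptions made for this illustration and are not taken from the project code.

```python
def circular_gap(i, j, n):
    """Distance between two contour indices on a closed contour of n points."""
    d = abs(i - j) % n
    return min(d, n - d)

def combine_scales(fine, coarse, n, same_tol=5, min_separation=15):
    """Merge candidate corner points found at two scales.

    1. A fine-scale point without a coarse-scale partner within same_tol is rejected.
    2. Partners are replaced by the point halfway between them.
    3. Of merged points closer than min_separation, the lower curvature one is dropped.
    """
    merged = []
    for i, ki in fine:
        partners = [(j, kj) for j, kj in coarse if circular_gap(i, j, n) <= same_tol]
        if not partners:
            continue
        j, kj = min(partners, key=lambda c: circular_gap(i, c[0], n))
        offset = ((j - i + n // 2) % n) - n // 2      # signed shortest way from i to j
        merged.append(((i + offset // 2) % n, (ki + kj) / 2.0))
    kept = []
    for index, k in sorted(merged, key=lambda c: -c[1]):
        if all(circular_gap(index, other, n) >= min_separation for other, _ in kept):
            kept.append((index, k))
    return sorted(kept)

# Usage with hypothetical candidates on a contour of 200 points.
fine = [(2, 0.30), (50, 0.25), (58, 0.10), (100, 0.28), (130, 0.20), (151, 0.27)]
coarse = [(198, 0.20), (52, 0.20), (60, 0.08), (101, 0.22), (149, 0.21)]
corners = [index for index, _ in combine_scales(fine, coarse, 200)]
```

In this example the point at index 130 is rejected because it appears at one scale only, and the merged point near index 59 is dropped because it lies close to a point with higher curvature.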

Detecting the corners (3)

Choosing the top 3 highest candidate corners and constructing a parallelogram from these points


Piece ID   18+30   Top 4   Top 3   4th point estimation
 1         4       3       2       2
 2         4       3       2       3
 3         4       2       2       2
 4         4       2       1       1
 5         3       2       2       2
 6         3       3       2       2
 7         4       3       2       2
 8         4       2       2       2
 9         4       3       3       4
10         4       3       3       4
11         4       2       2       2
12         4       2       2       2
13         2       1       1       1
14         4       3       3       4
15         4       3       3       4
16         4       3       3       4
17         4       4       3       3
18         4       2       1       2
19         4       3       3       4
20         2       2       1       1
21         4       3       2       2
22         4       3       3       4
23         4       2       2       2
24         3       2       1       1

Detecting the corners (5)

Local Matching

The corners of seven pieces are successfully detected

Local matching is applied only to these pieces

Local Matching (1)

Procrustes Analysis:

Compares shapes of objects

Determines linear transformation of matrix X to best fit into Y:

\[ Y = s X R + \mathbf{1} t^{T} \]

where s is the scale factor, R the rotation and reflection matrix, t the translation vector

Local Matching (2)

Matched curves must have same lengths

The minimized value of the dissimilarity measure between two curves is calculated

Each curve of a piece is matched with all curves of the remaining pieces

This produces a 4N x 4N matrix

Straight sides of pieces are detected to prevent them from being matched with other pieces
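Giving two curves the same number of points can be sketched in Python as follows. The sketch assumes linear interpolation at equal arc-length steps along an open curve; the slides do not state which resampling scheme was used. The function name is hypothetical.

```python
import math

def resample(points, count):
    """Return `count` points (count >= 2) spaced evenly by arc length along an open curve."""
    cumulative = [0.0]
    for p, q in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + math.dist(p, q))
    total = cumulative[-1]
    resampled, segment = [], 0
    for k in range(count):
        target = total * k / (count - 1)
        # Advance to the segment that contains the target arc length.
        while segment < len(points) - 2 and cumulative[segment + 1] < target:
            segment += 1
        span = cumulative[segment + 1] - cumulative[segment]
        t = 0.0 if span == 0 else (target - cumulative[segment]) / span
        (x0, y0), (x1, y1) = points[segment], points[segment + 1]
        resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return resampled

# Usage: an L-shaped curve of total length 8 resampled to 5 points.
side = [(0, 0), (4, 0), (4, 4)]
five = resample(side, 5)
```

After both sides of a candidate pair have been resampled to the same `count`, they can be passed to the Procrustes fit, and each resulting dissimilarity becomes one entry of the 4N x 4N matrix.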

[Figure: dissimilarity matrix; rows and columns are labelled with pieces 16, 9, 14, 19, 22, 15, 10]

Global Matching

Same approach as Wolfson:

Frame assembly

Traveling Salesman Problem

Interior assembly

Discussion

Detection of the corner points:

Higher resolution of jigsaw image

Combining more Curvature scales

Applying better heuristics to select the four corner points

Local Matching: Eliminating false positive matches: indents and outdents

Conclusion

New combination of techniques works, but lots of room for improvement

Better results possible after applying suggestions to current approach

Questions ??

Local Matching (3)

Result table of found matches

Matched Pieces
16.2   22.2
9.2    10.4
14.1   10.1
14.2   15.4
14.3   10.1
19.1   22.2
19.2   15.3
19.3   15.1
22.3   10.1

Detecting the corners (1)

More than four points are detected with high curvature even at high scales

<image of curvature and X & Y in a 3D plot>
