Slides on Photosynth.net, from my MSc at Imperial
3D browsing of a photos dataset: Uncovering Photosynth.net
Markou Nikolas, Romain Dossin, Kevin Keraudren
November 29, 2010
Introduction The Bundler Pipeline Photo Explorer Rendering Running the code Conclusion
Introduction
Flickr search "Rome Coliseum"
34,169 results
Markou Nikolas, Romain Dossin, Kevin Keraudren () 3D browsing of a photos dataset November 29, 2010 2 / 22
Introduction
Huge amount of data on the Web: Flickr > 5 billion photos
How can we browse such an amount?
What can we learn from it?
What if we could turn 2D into 3D?
→ Photosynth.net (University of Washington + Microsoft Research)
1 The Bundler Pipeline
  Extract the focal length from the EXIF tags (extract_focal.pl)
  Find feature points in each image using SIFT
  Match keypoint descriptors between each pair of images
  Structure from motion: recover a set of camera parameters and a 3D location for each track
2 Photo Explorer Rendering
  Render the scene
  Transitions
  View Interpolation
3 Running the code
Extract the focal length from the EXIF tags (extract_focal.pl)
Jhead
ImageMagick: identify -format %[exif:*] image.jpg
focal_pixels = X_resolution * (focal_mm / CCD_width_mm)
→ used later to initialize the bundle adjustment
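The conversion above can be sketched in a few lines of Python. The function name and the sample camera values are hypothetical and chosen only for illustration. In practice the focal length in millimetres would come from the EXIF tags, and the CCD width from a table of known camera models.

```python
def focal_length_in_pixels(image_width_px, focal_mm, ccd_width_mm):
    """Convert an EXIF focal length in millimetres to a focal length in pixels."""
    if ccd_width_mm <= 0:
        raise ValueError("CCD width must be positive")
    return image_width_px * (focal_mm / ccd_width_mm)


# Hypothetical camera: 3072-pixel-wide image, 5.8 mm lens, 5.76 mm wide sensor.
print(focal_length_in_pixels(3072, 5.8, 5.76))
```

The result is only an initial guess: the bundle adjustment refines it later.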
Find feature points in each image using SIFT
SIFT - Scale Invariant Feature Transform
From scale space to feature space
SIFT transforms an image into a large collection of local feature vectors, each of which is invariant to:
image translation
scaling
rotation
and partially invariant to:
illumination changes
affine projections
3D projections
SIFT (continued)
It is based on the highly successful Gaussian pyramid and the simple-to-implement Difference of Gaussians (DoG) technique.
Source: http://fourier.eng.hmc.edu
At each level of the pyramid, the local maxima and minima are kept, and an orientation histogram is built from the pixels around them for extra robustness.
These features can then be matched against those of other images. Objects can also be described as a set of features.
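As a rough illustration of the Difference of Gaussians idea, the sketch below works on a 1D signal rather than a 2D image and on a single pair of scales rather than a full pyramid. Both are simplifications made here for brevity. The scale ratio k = 1.6 and all function names are assumptions, not values taken from the slides.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve the signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def difference_of_gaussians(signal, sigma, k=1.6):
    """Coarser blur minus finer blur: a band-pass response."""
    fine = blur(signal, sigma)
    coarse = blur(signal, k * sigma)
    return [c - f for f, c in zip(fine, coarse)]


def local_extrema(values):
    """Indices of strict local maxima and minima."""
    return [i for i in range(1, len(values) - 1)
            if (values[i] > values[i - 1] and values[i] > values[i + 1])
            or (values[i] < values[i - 1] and values[i] < values[i + 1])]


signal = [0.0] * 20 + [1.0] * 5 + [0.0] * 20   # a small "blob" centred at index 22
dog = difference_of_gaussians(signal, sigma=2.0)
print(local_extrema(dog))
```

The centre of the blob shows up as an extremum of the DoG response, which is the kind of point SIFT keeps as a keypoint candidate.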
Output format of ./sift
<number of keypoints> <descriptor length>
<subpixel row> <subpixel column> <scale> <orientation>
<invariant descriptor vector>
<subpixel row> <subpixel column> <scale> <orientation>
<invariant descriptor vector>
...
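A file in this layout can be read with a short whitespace-based parser. The sketch below assumes that the descriptor entries are integers and that the descriptor may be wrapped over several lines. The function name and the tiny 4-dimensional example are made up for illustration.

```python
def parse_sift_keys(text):
    """Parse the text of a key file into a list of keypoint dictionaries."""
    tokens = text.split()
    num_keypoints, descriptor_length = int(tokens[0]), int(tokens[1])
    keypoints = []
    pos = 2
    for _ in range(num_keypoints):
        # Four header values per keypoint, then the descriptor entries.
        row, col, scale, orientation = (float(t) for t in tokens[pos:pos + 4])
        pos += 4
        descriptor = [int(t) for t in tokens[pos:pos + descriptor_length]]
        pos += descriptor_length
        keypoints.append({"row": row, "col": col, "scale": scale,
                          "orientation": orientation, "descriptor": descriptor})
    return keypoints


# Made-up example: 2 keypoints with 4-dimensional descriptors.
example = """2 4
10.5 20.25 1.5 0.3
 1 2 3 4
30.0 40.0 2.0 -1.2
 5 6 7 8
"""
for keypoint in parse_sift_keys(example):
    print(keypoint)
```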
Match keypoint descriptors between each pair of images
Approximate nearest neighbors to match keypoints between each pair of images
2 images I and J, SIFT keypoints in J → kd-tree
for each keypoint in I, look for nearest neighbor in J
Source: Wikipedia
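The matching step can be sketched with a small kd-tree. Note that this version performs an exact nearest-neighbor search, whereas the slide mentions an approximate one. The 2D points stand in for 128-dimensional SIFT descriptors, and all names are hypothetical.

```python
def build_kdtree(points, depth=0):
    """Recursively build a kd-tree; each node is (point, index, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    point, index = points[mid]
    return (point, index,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, depth=0, best=None):
    """Return (squared distance, index) of the stored point closest to query."""
    if node is None:
        return best
    point, index, left, right = node
    d = squared_distance(point, query)
    if best is None or d < best[0]:
        best = (d, index)
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    if diff * diff < best[0]:   # the other side may still hold a closer point
        best = nearest(far, query, depth + 1, best)
    return best


def match_keypoints(descriptors_i, descriptors_j):
    """For each descriptor of image I, find its nearest neighbor in image J."""
    tree = build_kdtree([(d, idx) for idx, d in enumerate(descriptors_j)])
    return [(i, nearest(tree, d)[1]) for i, d in enumerate(descriptors_i)]


descriptors_j = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
descriptors_i = [(4.5, 5.2), (8.0, 0.0)]
print(match_keypoints(descriptors_i, descriptors_j))
```

Building the tree once for image J and querying it for every keypoint of I avoids comparing every pair of descriptors.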
The fundamental matrix
Corresponding points within stereo-pair images are connected by the fundamental matrix.
Set of corresponding points $x_i \leftrightarrow x'_i$ in two images
F is the fundamental matrix $\iff \forall i,\ x'^{\top}_i F x_i = 0$
This is a linear equation in the unknown entries of F:
If $x = (x, y, 1)^{\top}$ and $x' = (x', y', 1)^{\top}$, then:
\[ x'x f_{11} + x'y f_{12} + x' f_{13} + y'x f_{21} + y'y f_{22} + y' f_{23} + x f_{31} + y f_{32} + f_{33} = 0 \]
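Fundamental matrix estimation using the 8-point algorithm and RANSAC
With n point matches, this can be rewritten:
\[
A f =
\begin{pmatrix}
x'_1 x_1 & x'_1 y_1 & x'_1 & y'_1 x_1 & y'_1 y_1 & y'_1 & x_1 & y_1 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x'_n x_n & x'_n y_n & x'_n & y'_n x_n & y'_n y_n & y'_n & x_n & y_n & 1
\end{pmatrix}
f = 0
\]
where f is the 9-vector made up of the entries of F in row-major order
A non-zero solution exists $\iff \mathrm{rank}(A) \le 8$, and it is unique (f determined up to scale) in the case of equality
If $\mathrm{rank}(A) > 8$ (e.g. because of noise): take the least-squares solution, or run RANSAC and keep the best-fitting model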
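To make the rewriting concrete, the sketch below builds one row of A per correspondence and checks numerically that the row times f equals the epipolar expression x'ᵀFx. The matrix F used here is an arbitrary 3x3 matrix chosen only to check the algebra, not a real fundamental matrix, and the function names are hypothetical. Solving for f, for example through the singular vector of A with the smallest singular value, is left out.

```python
def epipolar_row(x, x_prime):
    """Row of A for one correspondence x <-> x' given as (x, y) pairs."""
    (px, py), (qx, qy) = x, x_prime
    return [qx * px, qx * py, qx,
            qy * px, qy * py, qy,
            px, py, 1.0]


def build_design_matrix(matches):
    """Stack one row per correspondence (needs at least 8 matches to solve)."""
    return [epipolar_row(x, x_prime) for x, x_prime in matches]


def epipolar_residual(F, x, x_prime):
    """Evaluate x'^T F x directly, for comparison with the row form."""
    xh = [x[0], x[1], 1.0]
    xph = [x_prime[0], x_prime[1], 1.0]
    Fx = [sum(F[r][c] * xh[c] for c in range(3)) for r in range(3)]
    return sum(xph[r] * Fx[r] for r in range(3))


# Arbitrary matrix used only to check the algebra.
F = [[0.0, -1.0, 2.0],
     [1.0, 0.0, -3.0],
     [-2.0, 3.0, 0.0]]
f = [value for row in F for value in row]   # row-major 9-vector

x, x_prime = (1.5, -2.0), (0.5, 4.0)
row = epipolar_row(x, x_prime)
row_dot_f = sum(a * b for a, b in zip(row, f))
print(row_dot_f, epipolar_residual(F, x, x_prime))
```

Both printed values agree, which confirms that each correspondence contributes one linear equation in the nine entries of F.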
RANSAC - Random Sample Consensus
Any matching procedure produces some erroneous matches. These mismatched points are called outliers and are usually catastrophic when trying to fit a model to the data.
Source: Wikipedia
Method:
Randomly choose a number of points
Try to fit a model to them
Check how many other points are in consensus with the model
This is repeated and the best fit is kept as the solution.
All points not fitting this solution (outliers) are usually removed from the data set. This process filters out most of the large errors.
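The method above can be sketched on a simpler model than the fundamental matrix: a 2D line fitted to points that include a few gross outliers. The iteration count, inlier threshold and fixed seed are illustrative assumptions.

```python
import random


def fit_line(p, q):
    """Line through two points as (a, b, c) with a*x + b*y + c = 0 and unit normal."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0:
        return None   # degenerate sample: the two points coincide
    return a / norm, b / norm, -(a * x1 + b * y1) / norm


def ransac_line(points, iterations=200, threshold=0.5, seed=0):
    """Return the line with the largest consensus set, and that set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_line(*rng.sample(points, 2))   # minimal sample: 2 points
        if model is None:
            continue
        a, b, c = model
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Ten points on y = 2x + 1, plus three gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(2.0, 30.0), (7.0, -12.0), (4.0, 50.0)]
model, inliers = ransac_line(points)
print(len(inliers), "inliers out of", len(points))
```

For the fundamental matrix the same loop applies, with a minimal sample of point matches instead of two points and a distance to the epipolar constraint instead of a distance to the line.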
Organize the matches into tracks
Source: "Modeling the World from Internet Photo Collections"
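One way to organize pairwise matches into tracks is to treat each (image, keypoint) pair as a node and merge matched nodes with a union-find structure. The sketch below does this and, as an assumed consistency rule, discards any track that contains two keypoints from the same image. Image names and keypoint indices are made up.

```python
def build_tracks(pairwise_matches):
    """pairwise_matches: iterable of ((image_a, key_a), (image_b, key_b))."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    # Merge the two keypoints of every pairwise match.
    for a, b in pairwise_matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Group keypoints by the root of their connected component.
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)

    tracks = []
    for members in groups.values():
        images = [image for image, _ in members]
        # Keep tracks seen in at least two images, at most once per image.
        if len(members) >= 2 and len(images) == len(set(images)):
            tracks.append(sorted(members))
    return sorted(tracks)


matches = [
    (("img0", 12), ("img1", 7)),
    (("img1", 7), ("img2", 3)),
    (("img0", 40), ("img2", 9)),
    (("img1", 5), ("img2", 9)),
    (("img1", 6), ("img2", 9)),   # puts two keypoints of img1 in one track
]
for track in build_tracks(matches):
    print(track)
```

Only the first track survives here, because the second one would contain two different keypoints of img1.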
Structure from motion: recover a set of camera parameters and a 3D location for each track
Start with the two cameras (images) that best match
Estimate their parameters (focal length from EXIF tags, 5-point algorithm)
Recover the 3D position of the points they both observe through a bundle adjustment
Then take the camera that observes the most of the same points
Estimate its parameters using Direct Linear Transformation
Run a bundle adjustment adding only the already known points: only the new camera parameters can change
Run another bundle adjustment adding the points observed by another camera
Iterate with a new camera
Bundle adjustment:
n cameras parametrized by Θi
m tracks parametrized by the 3D points Xj
qij: the observed projection of the j-th track in the i-th camera
P(Θ, X): mapping between a 3D point X and its 2D projection in a camera with parameters Θ
wij: 1 if camera i observes point j, 0 otherwise
Minimize (the unknowns are the camera parameters Θi and the 3D points Xj):
\[ \sum_{i=1}^{n} \sum_{j=1}^{m} w_{ij} \, \lVert q_{ij} - P(\Theta_i, X_j) \rVert^2 \]
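The cost that bundle adjustment minimizes can be evaluated with a few lines of Python. This sketch assumes a plain pinhole camera (rotation R, translation t, focal length, no lens distortion) and sums squared distances, the usual least-squares choice. The camera model, names and numbers are illustrative assumptions, and the optimizer itself is not shown.

```python
def project(camera, point):
    """Pinhole projection P(Theta, X) for camera = (R, t, focal)."""
    R, t, focal = camera
    xc = [sum(R[r][c] * point[c] for c in range(3)) + t[r] for r in range(3)]
    return (focal * xc[0] / xc[2], focal * xc[1] / xc[2])


def reprojection_error(cameras, points, observations):
    """Sum of squared reprojection distances.

    observations maps (i, j) to the observed 2D point q_ij; a missing
    (i, j) key plays the role of w_ij = 0.
    """
    total = 0.0
    for (i, j), (qx, qy) in observations.items():
        px, py = project(cameras[i], points[j])
        total += (qx - px) ** 2 + (qy - py) ** 2
    return total


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cameras = [(identity, [0.0, 0.0, 0.0], 500.0),
           (identity, [-1.0, 0.0, 0.0], 500.0)]
points = [[0.0, 0.0, 5.0], [1.0, 1.0, 10.0]]
observations = {
    (0, 0): (0.0, 0.0),
    (0, 1): (50.0, 50.0),
    (1, 0): (-100.0, 0.0),
    (1, 1): (1.0, 50.0),   # one pixel off in x
}
print(reprojection_error(cameras, points, observations))
```

A bundle adjustment would pass a function like this to a non-linear least-squares solver and let it adjust the cameras and points until the total stops decreasing.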
Render the scene
As frusta
Images
Points and lines
3D rendering
Sources: Wikipedia & "Modeling the World from Internet Photo Collections"
Transitions
Representation accuracy of the real scene
Camera motion
  Linear interpolation
  Timing
  Twinkle
View interpolation
  Triangulated Morphs
  Planar Morphs
Triangulated Morphs
Method
  Projection of the points onto each image
  2D Delaunay triangulation, with edge constraints
  Projection of the triangulation onto an average plane
  Creation of a 3D mesh
  Display depending on camera location
Rendering
  Good geometry
  Artifacts
Planar Morphs
Method
  Projection onto a common plane
  Display depending on camera location
Rendering
  Lower quality of the geometry
  Fewer artifacts
Running the code
One full run: Bundler → Poisson reconstruction
82 photos from Flickr, big images, 44 h, 115,196 points recovered at the Bundler stage...
Figure: Point cloud obtained from Bundler, and Poisson surface reconstruction done after PMVS
Conclusion
Photosynth is only a beginning...
The University of North Carolina aims to reconstruct famous sites in a day on a "normal machine"
Figure: Reconstruction from 2D photos
Quiz
1 SIFT is used on images to find key features. Which transformations is SIFT invariant to, and when does it not perform so well? How does this affect the clusters generated afterwards?
2 If you were given the parameters of a camera (rotation matrix, focal length, position of the center), the associated image and the 3D point cloud it observes, where would you place the 2D image in 3D space?
3 As Photosynth is a web application that must be able to display high-resolution photos to a lot of people simultaneously, how do you think Microsoft optimized this system in order to avoid large data transfers?