
High Performance Imaging Through Occlusion via Energy Minimization-Based Optimal Camera Selection

Abstract: Seeing an object in a cluttered scene with severe occlusion is a significantly challenging task for many computer vision applications. Although camera array synthetic aperture imaging has proven to be an effective way for occluded object imaging, its imaging quality is often significantly decreased by the shadows of the foreground occluder. To overcome this problem, some recent research has been presented to label the foreground occluder via object segmentation or 3D reconstruction. However, these methods usually fail in the case of a complicated occluder or severe occlusion. In this paper, we present a novel optimal camera selection algorithm to handle the problem above. Firstly, in contrast to the traditional synthetic aperture photography methods, we formulate occluded object imaging as a problem of visible light ray selection from the optimal camera view. To the best of our knowledge, this is the first time that a high quality occluded object image has been "mosaicked" by selecting multi-view optimal visible light rays from a camera array or a single moving camera. Secondly, a greedy optimization framework is presented to propagate the visibility information among various depth focus planes. Thirdly, a multiple label energy minimization formulation is designed in each plane to select the optimal camera view. The energy is estimated in the 3D synthetic aperture image volume and integrates multiple view intensity consistency, the previous visibility property and camera view smoothness; it is minimized via graph cuts. Finally, we compare this approach with the traditional synthetic aperture imaging algorithms on the UCSD light field datasets and on our own datasets captured in indoor and outdoor environments. Extensive experimental results demonstrate the effectiveness and superiority of our approach.

Keywords: Occluded Object Imaging, Computational Photography, Synthetic Aperture Imaging, Energy Minimization

International Journal of Advanced Robotic Systems
ARTICLE (Regular Paper)

Tao Yang 1, Yanning Zhang 1,*, Xiaomin Tong 1, Wenguang Ma 1 and Rui Yu 2

1 School of Computer Science, ShaanXi Provincial Key Laboratory of Speech and Information Processing, Northwestern Polytechnical University, Xi'an, China
2 Department of Computer Science, University College London, UK
* Corresponding author E-mail: [email protected]

Received 08 Aug 2013; Accepted 27 Sep 2013
DOI: 10.5772/57175
Int. j. adv. robot. syst., 2013, Vol. 10, 393:2013. www.intechopen.com

© 2013 Yang et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Occluded object imaging is a significantly challenging task in many computer vision application fields, such as video surveillance and monitoring, hidden object detection and recognition, and tracking through occlusion. However, because traditional photography simply captures the 2D projection of the 3D world, it essentially cannot handle occlusion.

Figure 1. Explanation of the principle of multiple camera synthetic aperture imaging via geometric optics.

Recently, computational photography has been changing the traditional way of imaging by capturing additional visual information using generalized optics. Synthetic aperture imaging (SAI) [1-7] is one of the key aspects of computational photography. Figure 1 visualizes the principle of multiple camera synthetic aperture imaging. In a convex lens, rays from the red point on the plane of focus converge after refraction to a single point on the sensor plane, forming a sharp image (Figure 1a). Rays from the blue point, which is not in the plane of focus, form a circle of confusion on the sensor plane, resulting in a blurred image (Figure 1b). A camera array is analogous to a "synthetic" lens aperture, with each camera being a sample point on a virtual lens (Figure 1c). We synthetically focus the camera array by choosing a plane of focus and adding up all the rays corresponding to each point on the chosen plane to get a pixel in a "synthetic aperture" image. By warping and integrating the multiple view images, synthetic aperture imaging can simulate a virtual camera with a large convex lens, and can focus on different frontal-parallel or oblique planes with a narrow depth of field. As a result, occluded objects focused on the virtual focal plane are visible, while objects off that plane are blurred.
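The warp-and-average operation described above can be sketched in a few lines. The sketch below is a minimal illustration, assuming a fronto-parallel focus plane and purely translating camera views (the plane + parallax setting of [7]), so that each warp reduces to an integer pixel shift; real systems compute full planar homographies from calibration. The function name and interface are illustrative, not from the paper.

```python
import numpy as np

def synthetic_aperture_image(views, shifts):
    """Average multi-view images after warping them onto a chosen
    focus plane. Here each warp is a simple integer (dy, dx) shift,
    which models a fronto-parallel focus plane under purely
    translating cameras; real systems use full planar homographies
    obtained from camera calibration.
    """
    h, w = views[0].shape[:2]
    acc = np.zeros((h, w), dtype=np.float64)
    for img, (dy, dx) in zip(views, shifts):
        # shift the view so that points on the focus plane align
        acc += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return acc / len(views)
```

Rays from points on the chosen focus plane land on the same pixel in every shifted view and reinforce each other, while off-plane occluders land at different positions and average into a semi-transparent blur.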

Synthetic aperture imaging photography [1, 2] provides a new way of resolving the occluded object imaging problem; however, it still suffers from the following limitations:

(1) The clarity of the occluded object image is often significantly decreased by the shadows of the foreground occluder. Although some methods have been presented to label the foreground via object segmentation or 3D reconstruction, these methods fail in the case of a complicated occluder or severe occlusion.

(2) Because the state-of-the-art SAI methods use the intensity average of multiple cameras, the varying colour responses of the cameras often significantly reduce the colour smoothness and consistency of the synthetic aperture image.

(3) The occluded object's contour and contrast are sensitive to calibration error, which is especially serious for unstructured light field synthetic aperture imaging with a moving camera.

In this paper, we address the above issues by proposing a new algorithm which, for the first time, formulates occluded object imaging as an optimal camera selection problem. A multiple label energy minimization formulation is designed in each depth plane to select the optimal camera. The energy is estimated in the 3D synthetic aperture image volume, and integrates multiple view intensity consistency clustering, visibility probability propagation and camera view smoothness. When focusing on a hidden object, instead of naively averaging all camera views in the synthetic aperture image, our method actively selects the rays from only one optimal camera via multiple label graph cuts-based energy minimization [8-11].

The organization of this paper is as follows: Section 2 introduces related work. Our algorithm is presented in Section 3. Section 4 presents the datasets, implementation details and experimental results, together with performance analysis and discussion. Finally, we conclude the paper and point to future work in Section 5.

    2. Related Work

In 1999, the first famous camera array setup was devised for the film The Matrix, in which a 1D camera array was used to create the impression of orbiting around a scene frozen in time. Pioneering work on synthetic aperture imaging was proposed by Levoy [1]. They set up the two-dimensional Stanford light field camera array, which consisted of 128 Firewire cameras, and for the first time aligned multiple cameras to approximate a camera with a very large aperture.

The MIT computer graphics group [12] used 64 USB webcams for synthesizing dynamic depth of field effects. Lei et al. [13] developed a cluster-based system for camera array applications, which consisted of eight nodes and 16 cameras. Ding et al. [14] constructed a 3x3 camera array to reconstruct the surface of fluid. Venkataraman et al. [15] presented an ultra-thin high performance monolithic camera array named PiCam (Pelican Imaging Camera-Array) that captures light fields and synthesizes high resolution images. Maitre et al. [16] used a planar camera array to perform surface reconstruction. Schuchert et al. [17] estimated 3D object


Figure 2. Algorithm framework of our approach. The left part shows the captured scene and synthetic aperture photography; the middle part shows the warped images and the energy minimization-based optimal camera selection results on each focus depth i; the right part compares our imaging result with the traditional synthetic aperture imaging result.

structure, motion and rotation based on 4D affine optical flow using a multi-camera array. Yang et al. [18] built a new hybrid synthetic aperture imaging system, which adopted a linear camera array and four top view cameras to continuously track and see people through occlusion. Later, Yang et al. [19] presented a camera array autofocus algorithm that minimizes the temporal and spatial correspondence error subject to a global loop constraint. To reduce the complexity and expense of large scale camera arrays, Davis et al. [2] presented a system for interactively acquiring and rendering light fields using a hand-held commodity camera; this system provides users with real-time feedback and guides them toward under-sampled parts of the light field. The synthetic aperture images constructed by the above light field photography systems have a shallow depth of field, so that objects off the focus plane disappear due to significant blur. This unique characteristic makes synthetic aperture imaging a powerful tool for occluded object imaging.

When focusing on the occluded object, outlier rays that actually hit occluders blur the focused occluded object and decrease the clarity and contrast of the synthesized image. Several methods have been presented to overcome this problem. Vaish et al. [20] studied four cost functions, including colour medians, entropy, focus and stereo, for reconstructing an occluded surface using synthetic apertures. Their method achieved good results under slight occlusion; however, the cost functions may fail under severe occlusion. Pei et al. [21] proposed a background subtraction method for segmenting and removing a foreground occluder before synthetic aperture imaging. Their results were encouraging for a simple static background; however, the performance was very sensitive to the moving foreground segmentation result and may fail in crowded scenes with a dynamic cluttered background. Later, Pei et al. [22] improved the imaging quality via graph cut-based foreground segmentation. Because of the colour variance-based segmentation, the performance of their approach was very sensitive to the illumination and colour responses of the different views in the camera array. Although their results were encouraging in a well-controlled laboratory experiment, their method may fail in complex outdoor environments. In addition, this method [22] cannot handle occluded object imaging through multiple occluders.

    3. Our Approach

In this section we introduce our optimal camera selection-based occluded object imaging method. Instead of averaging all rays from the entire camera array [1] or from the partially visible cameras [22], our approach selects only one optimal camera for each pixel from the camera array, and combines the selected rays to create a synthetic image. The word "optimal" refers to three facts. (1) From the pixel perspective, the pixel should be visible in the corresponding optimal camera. (2) From the local image perspective, adjacent pixels should be more likely to select the same or an adjacent camera from the camera array. (3) From the global perspective, we hope to select as few cameras as possible to create the image of the occluded object.

3.1. Algorithm framework

In this subsection, we give an overview of the optimal camera selection and imaging approach by describing the flow of information with the Stanford light field dataset shown. The overall method mainly includes two parts: (1) Synthetic Aperture Imaging, and (2) Optimal Camera Selection, which are shown in Figure 2.

An imaging cycle begins when multi-view images are captured by multiple cameras or a moving camera. The Synthetic Aperture Imaging Module (Figure 2, left part) takes the multiple view images as input; through camera calibration and synthetic aperture imaging, this module generates a set of multi-view warped images on the given focus plane, in which all the images are precisely aligned without parallax.


The image warping results are then fed as input to the Optimal Camera Selection Module (Figure 2, middle part). For the entire image plane of focus at depth i, this module initializes a multi-label graph grid. A node in this graph denotes a pixel on the plane of focus. The edges between nodes represent the spatial relations between connected pixels, and the label of each node denotes the optimal camera ID for this pixel. Through the above definition, we can formulate the optimal camera selection problem as a multi-label problem, and assign each pixel to a unique optimal camera via graph cuts-based energy minimization. To take advantage of the information in multiple depth planes, we adopt a greedy searching strategy which applies the energy minimization from the close to the distant focus planes, and propagates the previous labelling results to the more distant layers. This greedy searching strategy only needs to scan the observed scene once, which is quite efficient.
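The greedy near-to-far sweep can be sketched as follows. This is a structural sketch only: `warp_views` and `select_cameras` are hypothetical stand-ins for the paper's synthetic aperture warping and graph cuts modules, and the propagation rule (a pixel claimed at a nearer depth is excluded from farther planes) is one simple reading of the visibility propagation described above.

```python
def image_through_occlusion(capture, depth_planes, warp_views, select_cameras):
    """Greedy near-to-far sweep over focus depths. At each depth the
    multi-view images are warped onto the plane, an optimal camera label
    is chosen per pixel, and pixels already explained by a nearer
    (occluding) plane are excluded from later planes. `warp_views` and
    `select_cameras` stand in for the paper's warping and graph cuts
    modules; this loop only illustrates the single-scan structure.
    """
    visibility = {}   # pixel -> depth index of the nearest surface found so far
    labels = {}       # pixel -> selected camera id at that surface
    for i, depth in enumerate(depth_planes):      # scan from close to distant
        warped = warp_views(capture, depth)       # parallax-free stack at this depth
        free = [p for p in warped.pixels if p not in visibility]
        for pixel, cam in select_cameras(warped, free).items():
            visibility[pixel] = i                 # propagate: occludes all farther planes
            labels[pixel] = cam
    return labels, visibility
```

Because each pixel is resolved at most once, the scene is scanned a single time regardless of the number of depth planes.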

Finally, the camera selection results are combined to generate a high quality occluded object image (Figure 2, right top image), which is much better than the traditional synthetic aperture imaging result (Figure 2, right bottom image).

    3.2. Optimal camera selection via multi-labelling graph cut

Let $C$ denote the number of camera views. For a given depth $i$, our goal is to find a labelling function $f : \Omega \to \mathcal{C}$, where $\Omega$ refers to all the pixels in all images and $\mathcal{C}$ represents the set of possible labels of each pixel. For a pixel $x$, if $f(x) = c$, $c \in \mathcal{C}$, then $x$ is considered to be visible in view $c$.

Considering the labelling redundancy of the camera array (the labels in different cameras are highly related), we label all the pixels in the reference camera view instead of in all camera views. Thus, we only seek a more succinct labelling $g : I_{ref} \to \mathcal{C}$, where $I_{ref}$ represents the whole image area of the reference camera.

The objective of choosing the visible view can be formulated as the following energy minimization problem:

$$E(g, i) = E_d(g) + E_s(g) \qquad (1)$$

where the data term $E_d$ is the sum of the data costs of all pixels and the smoothness term $E_s$ is a regularizer that

encourages neighbouring pixels to share the same label.

Data term: If a scene point $p$ is located at the focusing depth $i$, and pixel $x$ is the corresponding pixel in the reference image, then the colour values $I_1(x), I_2(x), \ldots, I_C(x)$ of the corresponding pixels in all camera views should approximately follow a Gaussian distribution $N(\mu, \sigma^2)$, with $\mu$ the true colour value of the point $p$ and $\sigma^2$ the variance of the colour values from the different cameras. Colour values close to $\mu$ usually correspond to the colour of point $p$ seen from different cameras, while those distant from $\mu$ usually come from an occluding object and have low probability. Thus, we define the data cost of labelling each pixel $x$ in the reference image as follows:

$$E_d(g) = \sum_{x \in I_{ref}} \left(1 - N(\mu, \sigma^2, I_{g(x)}(x))\right) \qquad (2)$$

where $N(\mu, \sigma^2, I_{g(x)}(x))$ is the Gaussian probability of labelling pixel $x$ with camera $g(x)$, $g(x) \in \mathcal{C}$. To obtain the Gaussian distribution, we first apply K-means clustering in colour space, and then choose the cluster with the greatest number of samples. Let $M$ and $Var$ denote the sample mean and variance of this cluster. Finally, we obtain the Gaussian distribution $N(\mu, \sigma^2)$ with $\mu = M$ and $\sigma^2 = Var$.
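For a single reference pixel, the data cost of Eq. (2) can be computed as in the sketch below. The choice of two clusters and the peak normalization of the Gaussian density (so that costs lie in [0, 1]) are assumptions of this sketch; the paper does not specify the number of clusters or the normalization.

```python
import numpy as np

def data_costs(colours, n_iter=20):
    """Per-camera data cost for one reference pixel (cf. Eq. 2).
    `colours` is a length-C array holding the pixel's intensity in each
    warped camera view. A 1-D two-cluster k-means (the cluster count is
    an assumption here) finds the dominant mode, whose mean and variance
    define the Gaussian N(mu, s2); the cost of selecting camera c is
    1 - N(mu, s2, colours[c]), with the density rescaled to peak at 1.
    """
    colours = np.asarray(colours, dtype=float)
    centres = np.array([colours.min(), colours.max()])
    for _ in range(n_iter):                                 # tiny k-means, k = 2
        assign = np.abs(colours[:, None] - centres[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                centres[k] = colours[assign == k].mean()
    big = np.argmax(np.bincount(assign, minlength=2))       # dominant cluster
    mu = colours[assign == big].mean()
    s2 = max(colours[assign == big].var(), 1e-6)            # avoid zero variance
    density = np.exp(-(colours - mu) ** 2 / (2 * s2))       # peak-normalized Gaussian
    return 1.0 - density
```

Cameras whose colour agrees with the dominant mode (the unoccluded views of the focused point) receive a cost near 0, while outlier views hitting an occluder receive a cost near 1.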

Smoothness term: The smoothness term $E_s(g)$ at depth $i$ is a prior regularizer that encourages the overall labelling to be smooth. It is based on the fact that two neighbouring pixels have a high probability of choosing the same camera view, as their visibility in any given camera view is similar.

Because of the different colour responses among the camera views and the calibration errors of the camera positions, even for the same visible point, the colour values of the corresponding visible pixels in the multiple camera views always differ. This prior is therefore very important, and it determines the colour smoothness of the final synthetic image. Surprisingly, traditional synthetic aperture imaging methods seldom consider this problem.

Here, we adopt the standard four-connected neighbourhood system and penalize two neighbouring pixels whose labels are different:

$$E_s(g) = \sum_{p \in I_{ref}} \sum_{q \in N_p} S_{p,q}(g(p), g(q)) \qquad (3)$$

$$S_{p,q}(g(p), g(q)) = \begin{cases} 1 & g(p) \neq g(q) \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

The intuition behind the design of the above energy is as follows:

(1) If the point is a fully visible focus point, then its appearance can be modelled as a unimodal distribution. In this case, the cost of choosing any visible camera is small, and therefore the imaging quality will not be greatly influenced by the choice of labelling.

(2) If the point is a partially occluded focus point, we are still likely to get a unimodal distribution, and the case is similar to (1).

(3) If the point is a free point, the distribution will tend to be uniform. In this case, all the cameras have relatively large cost and, as a result, the smoothness term plays the decisive role.

In the experiments, we adopt Boykov's graph cuts methods [8, 9] to solve the above energy minimization problem.
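The paper minimizes Eq. (1) with Boykov's graph cuts; as a lightweight illustration of the same data-plus-Potts energy, the sketch below uses iterated conditional modes (ICM), a much weaker local optimizer substituted here only to keep the example self-contained and dependency-free. It is not the paper's solver.

```python
import numpy as np

def icm_labelling(data_cost, lam=1.0, n_sweeps=10):
    """Approximately minimize E(g) = sum_x D(x, g(x)) + lam * Potts
    on a 4-connected grid. `data_cost` has shape (H, W, C): the cost of
    assigning each pixel to each of C camera labels (Eq. 2). ICM
    repeatedly moves each pixel to its locally cheapest label; it is a
    simple stand-in for the alpha-expansion graph cuts used in the paper.
    """
    H, W, C = data_cost.shape
    g = data_cost.argmin(axis=2)                  # initialize from the data term
    for _ in range(n_sweeps):
        changed = False
        for y in range(H):
            for x in range(W):
                cost = data_cost[y, x].copy()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < H and 0 <= nx < W:
                        # Potts penalty for disagreeing with a neighbour (Eqs. 3-4)
                        cost += lam * (np.arange(C) != g[ny, nx])
                best = int(cost.argmin())
                if best != g[y, x]:
                    g[y, x] = best
                    changed = True
        if not changed:
            break
    return g
```

With `lam=0` this reduces to a per-pixel argmin of the data term; larger `lam` trades data fidelity for label smoothness, suppressing isolated camera switches.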

    4. Experiments

    4.1. Imaging system and datasets

In order to evaluate the performance of the proposed method under various circumstances, we have designed and set up several moving camera-based light field capture systems.

Figure 3(a) displays our moving linear camera array light field capture system. The vertical linear array contains


    (a) Moving linear camera array system

    (b) Single moving camera based indoor virtual camera array system

    (c) Single moving camera based outdoor virtual camera array imaging system

Figure 3. Our moving camera synthetic aperture imaging systems. (a) and (b) are designed for light field capture in an indoor environment. (c) is designed for capturing an unstructured light field in an outdoor environment.

eight Pointgrey Flea3 cameras, and it can move smoothly on a sliding track. The entire system can simulate a virtual camera with a three metre convex lens. In this experiment, we adopt the above moving linear array to capture multiple occluded toys in an indoor environment (as shown in Figure 5).

In order to capture a dense 3D light field of the scene, as well as to avoid the varying colour responses among different cameras, we have also set up a single moving camera-based imaging system (as shown in Figure 3(b)). By moving the camera in different directions, the system can simulate a virtual camera array with 900 camera views and a three metre convex lens (as shown in Figure 3(b), right image).

To evaluate our approach in outdoor scenes, we have also set up an outdoor unstructured light field imaging system based on a moving Canon 5D Mark III camera. Figure 3(c) displays the system and examples of an outdoor building occluded by the trees in front of it. We use Zhang's automatic camera tracking method [24] to estimate the moving camera's pose and position for synthetic aperture imaging. The imaging results of this system are shown in Figures 6 and 7. Besides our own systems and datasets, we also adopt the publicly available UCSD light field datasets [23] to evaluate and compare the imaging performance of our method.

    4.2. UCSD Santa dataset

The UCSD Santa light field dataset was acquired using an eight-camera array and a linear translating gantry [23]. This dataset contains 120 views on a 120x1 grid with an image resolution of 640x512. Figure 4(a) displays five examples from the Santa dataset. We adopt view #57 as the reference camera view (as shown in Figure 4(b1)), and Figure 4(b2)


    View # 35 View # 46 View # 57 View # 68 View # 79

    (b1) Original reference camera view (b2) Result of Vaish et al. [20]

    (b3) Result of our method (b4) Our camera selection result

    (a) Original multiple camera views

    (b) Comparison of imaging result of the chairs through the Santa doll

Figure 4. Imaging results on the UCSD Santa light field dataset [23]. (a) shows examples of the original camera views. (b1) to (b4) display the comparison results of imaging the chair through occlusion.

shows the synthetic aperture imaging result using Vaish's method [20]. Please note that when we focus on the distant chairs and windows, the shadows from the foreground Santa significantly blur the image. By contrast, through multi-label graph cuts-based optimal camera selection (as shown in Figure 4(b4)), our approach successfully removes the false shadows and produces a high quality occluded object image with far greater clarity (as shown in Figure 4(b3)).

    4.3. Our multiple occluded toys dataset

To further test our method in severe occlusion cases, we have conducted another experiment with multiple objects. We use our moving linear camera array (as shown in Figure 3) to capture this set of images. As shown in Figure 5(a), the flower pot, teddy bear and penguin are lined up in a column. It can be seen that the penguin is occluded by the teddy bear, which is in turn occluded by the flower pot in front. In particular, we can see nothing but the feet of the teddy bear from the reference camera view (Figure 5(b1)). The standard synthetic aperture imaging results for the teddy bear and penguin are shown in Figure 5(b2) and Figure 5(c2) respectively. Due to the severe occlusion, Vaish's method [20] can only obtain a blurred image of the occluded object. By contrast, our method successfully selects the optimal camera views via energy minimization (as shown in Figures 5(b4) and 5(c4)), and provides a clear and complete image of the teddy bear (Figure 5(b3)) and the penguin (Figure 5(c3)) through severe occlusion.

    4.4. Our window and building dataset

Figure 6 shows the imaging results for the outdoor scene through a large window. We captured this test dataset with a hand-held single moving camera in our laboratory. Because of occlusion by the black window frame, we cannot get a complete image of the distant building (Figure 6(a)). Figure 6(b2) gives the synthetic aperture imaging result using Vaish's method [20], which is blurry due to the foreground window frame and the depth variation of the distant building. By contrast, our approach virtually


    View # 10 View # 14 View # 18 View # 22 View # 26 View # 30

    Original reference camera view

    (b2) Result of Vaish et al.[20] (b3) Result of our method (b4) Our camera selection result

    (c2) Result of Vaish et al.[20] (c3) Result of our method (c4) Our camera selection result

    (c) Comparison of imaging result through the flower pot and teddy bear

    (b) Comparison of imaging result through the front flower pot

    (a) Original multiple camera views captured by our single moving camera system

Figure 5. Imaging results through multiple occluders. Please note that in this challenging scene, the penguin is occluded by the teddy bear, which is in turn occluded by the flower pot in front. Our approach successfully sees through multiple objects ((b3) and (c3)).

    "removes" the foreground occluder, and creates a completeand clear image with lots of details via optimal cameraselection.

    4.5. Our street dataset

To evaluate our method on a challenging street view, we have conducted another experiment in a complex outdoor scene. As shown in Figure 7, the distant buildings are occluded by the nearby trees (Figure 7(a)). Our aim is to see the building at the rear of the scene through the trees in the foreground. Comparison results using Vaish's method [20] and our method are shown in Figures 7(b2) and 7(b3). As Vaish's method [20] simply averages the intensity values of the multiple view images, it cannot select different camera views for different regions, and the resulting image is significantly blurred. In addition, since Vaish's method only focuses on a given depth plane, the image clarity is reduced by the depth variation of the building's surface (Figure 7(b2)). Please note that, through selecting the optimal camera views, our method accurately produces the desired imaging result even under depth changes and occlusion (Figure 7(b3)).


    View # 10 View #12 View # 14 View # 16 View # 18

    (b1) Original reference camera view (b2) Result of Vaish et al. [20]

    (b3) Result of our method (b4) Our camera selection result

    (a) Original multiple camera views

    (b) Comparison of imaging result of the distant building through the window

Figure 6. Imaging results of the outdoor scene through the window. Please note that the standard synthetic aperture imaging result is very blurred (b2). By contrast, our method successfully removes the window and generates a clear image of the occluded building (b3).

    View # 4 View # 6 View # 8 View # 10 View # 12

    (a) Original multiple camera views

    (b) Comparison of imaging result of the distant building through the front tree in the street

    (b1) Original reference camera view (b2) Result of Vaish et al. [20]

    (b3) Result of our method (b4) Our camera selection result

Figure 7. Imaging results of the distant buildings through the trees. Please note that the standard synthetic aperture imaging result is very blurred (b2). By contrast, our method successfully removes the complex foreground trees and generates a clear image of the occluded building (b3).


    5. Conclusion

A novel occluded object imaging approach has been presented. Experimental results with qualitative and quantitative analysis demonstrate that the proposed method can reliably select the optimal camera and generate a clear image even through severe occlusion. Moreover, the satisfactory imaging results with a moving camera indicate that this approach has great potential for many applications. Our future work will focus on extending the method by developing new applications of occluded object imaging techniques on smartphones.

    6. Acknowledgements

The research in this paper is supported by the Project of the National Natural Science Foundation of China under grant numbers 61272288 and 61231016, the Foundation of the China Scholarship Council under grant number 201303070083, the NPU New People and New Directions Foundation under grant number 13GH014604, the NPU Foundation for Fundamental Research under grant numbers JC201120 and JC201148, and the Soaring Star of NPU under grant number 12GH0311.

    7. References

[1] Levoy M (2006) Light fields and computational imaging. IEEE Computer Magazine, 39:46-55.

[2] Davis A, Levoy M, Durand F (2012) Unstructured light fields. Eurographics, 31(2):305-314.

[3] Vaish V, Garg G, Talvala E, Antunez E, Wilburn B, Horowitz M, Levoy M (2005) Synthetic aperture focusing using a shear-warp factorization of the viewing transform. IEEE Workshop on A3DISS, Computer Vision and Pattern Recognition, 129-135.

[4] Joshi N, Avidan S, Matusik W, Kriegman D J (2007) Synthetic aperture tracking: tracking through occlusions. International Conference on Computer Vision, 1-8.

[5] Wilburn B, Joshi N, Vaish V, Talvala E V, Antunez E, Barth A, Adams A, Horowitz M, Levoy M (2005) High performance imaging using large camera arrays. ACM Transactions on Graphics, 24(3):765-776.

[6] Basha T, Avidan S, Hornung A, Matusik W (2012) Structure and motion from scene registration. Computer Vision and Pattern Recognition, 1426-1433.

[7] Vaish V, Wilburn B, Joshi N, Levoy M (2004) Using plane + parallax for calibrating dense camera arrays. Computer Vision and Pattern Recognition, 1:2-9.

[8] Boykov Y, Veksler O, Zabih R (2001) Efficient approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 20(12):1222-1239.

[9] Boykov Y, Kolmogorov V (2004) An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26(9):1124-1137.

[10] Boykov Y, Kolmogorov V (2010) Basic graph cut algorithms. Advances in Markov Random Fields.

[11] Kolmogorov V, Zabih R (2004) What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26:147-159.

[12] Yang J C, Everett M, Buehler C, McMillan L (2002) A real-time distributed light field camera. Eurographics Symposium on Rendering.

[13] Lei C, Chen X, Yang Y H (2009) A new multi-view spacetime-consistent depth recovery framework for free viewpoint video rendering. International Conference on Computer Vision, 1570-1577.

[14] Ding Y Y, Li F, Ji Y, Yu J Y (2011) Dynamic fluid surface acquisition using a camera array. International Conference on Computer Vision, 2478-2485.

[15] Venkataraman K, Lelescu D, Duparré J, McMahon A, Molina G, Chatterjee P, Mullis R (2013) PiCam: an ultra-thin high performance monolithic camera array. ACM Transactions on Graphics, 32(5):1-13.

[16] Maitre M, Shinagawa Y, Do M N (2008) Symmetric multi-view stereo reconstruction from planar camera arrays. Computer Vision and Pattern Recognition, 1-8.

[17] Schuchert T, Scharr H (2010) Estimation of 3D object structure, motion and rotation based on 4D affine optical flow using a multi-camera array. 11th European Conference on Computer Vision, 596-609.

[18] Yang T, Zhang Y N, Tong X M, Zhang X Q, Yu R (2013) A new hybrid synthetic aperture imaging model for tracking and seeing people through occlusion. IEEE Transactions on Circuits and Systems for Video Technology, 23(9):1461-1475.

[19] Yang T, Zhang Y N, Yu R, Chen T (2013) Exploiting loops in the camera array for automatic focusing depth estimation. International Journal of Advanced Robotic Systems, 10:232. DOI: 10.5772/56321.

[20] Vaish V, Szeliski R, Zitnick C L, Kang S B, Levoy M (2006) Reconstructing occluded surfaces using synthetic apertures: stereo, focus and robust methods. Computer Vision and Pattern Recognition, 2331-2338.

[21] Pei Z, Zhang Y N, Yang T, Zhang X W, Yang Y H (2011) A novel multi-object detection method in complex scene using synthetic aperture imaging. Pattern Recognition, 45(4):1637-1658.

[22] Pei Z, Zhang Y N, Chen X, Yang Y H (2013) Synthetic aperture imaging using pixel labelling via energy minimization. Pattern Recognition, 46(1):174-187.

[23] UCSD/MERL light field dataset (2007) http://vision.ucsd.edu/datasets/lfarchive/lfs.shtml

[24] Zhang G F, Jia J Y, Wong T T, Bao H J (2009) Consistent depth maps recovery from a video sequence. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 31:974-988.
