example-based sup er-resolutionpeople.csail.mit.edu/billf/papers/tr2001-30.pdfexample-based sup...

12

Upload: others

Post on 26-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

MERL { A MITSUBISHI ELECTRIC RESEARCH LABORATORY

http://www.merl.com

Example-Based Super-Resolution

William T. Freeman, Thouis R. Jones, and Egon C. Pasztor�

MERL, Mitsubishi Electric Research Labs.

201 Broadway

Cambridge, MA 02139

TR-2001-30 August 2001

Abstract

Image-based models for computer graphics lack resolution independence: they

cannot be zoomed much beyond the pixel resolution they were sampled at with-

out a degradation of quality. Interpolating images usually results in a blurring

of edges and image details. We describe image interpolation algorithms which

use a database of training images to create plausible high-frequency details in

zoomed images. Image pre-processing steps allow the use of image detail from

regions of the training images which may look quite di�erent from the image

to be processed. These methods preserve �ne details, such as edges, gener-

ate believable textures, and can give good results even after zooming multiple

octaves.

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to

copy in whole or in part without payment of fee is granted for nonpro�t educational and research purposes

provided that all such whole or partial copies include the following: a notice that such copying is by per-

mission of Mitsubishi Electric Information Technology Center America; an acknowledgment of the authors

and individual contributions to the work; and all applicable portions of the copyright notice. Copying,

reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi

Electric Information Technology Center America. All rights reserved.

Copyright c Mitsubishi Electric Information Technology Center America, 2001

201 Broadway, Cambridge, Massachusetts 02139

Page 2: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

1. First printing, TR2001-30, August, 2001.

� Egon Pasztor's present address:

MIT Media Lab

20 Ames St.

Cambridge, MA 02139

Page 3: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

Example-based Super-resolution

William T. Freeman, Thouis R. Jones, Egon C. Pasztor

MitsubishiElectricResearchLaboratories(MERL)201Broadway

Cambridge,MA 02139

(a) (b)Figure 1: An object modelledby traditionalpolygon techniquesmay lack someof the richnessof real-world objects,but behavesproperlyunderzooming,(b).

Abstract

Image-basedmodelsfor computergraphicslack resolutioninde-pendence:they cannotbe zoomedmuchbeyond the pixel resolu-tion they weresampledat without a degradationof quality. Inter-polating imagesusually resultsin a blurring of edgesand imagedetails.We describeimageinterpolationalgorithmswhich useadatabaseof training imagesto createplausiblehigh-frequency de-tails in zoomedimages.Imagepre-processingstepsallow the useof imagedetailfrom regionsof thetrainingimageswhichmaylookquitedifferentfrom theimageto beprocessed.Thesemethodspre-serve fine details,suchasedges,generatebelievabletextures,andcangive goodresultsevenafterzoomingmultipleoctaves.

1 Introduction

As shown in Fig. 1, polygon-basedrepresentationsof 3-dimensionalobjects offer resolution independenceover a widerangeof scales.Objectboundariesremainsharpasonezoomsinon theobjectuntil very closerange,wherefacetingappearsduetofinite polygonsize.

Constructingpolygonmodelsfor complex, real-world objectscanbe difficult. Image-basedrendering(IBR) is a complementaryap-proachfor representingandrenderingobjects,usingcamerasto ob-tain rich modelsdirectly from real-world data.Unfortunately, theserepresentationsno longerhave resolutionindependence.Whenwezoom into a bitmappedimage,we get a blurred image.Figure2shows the problem for an IBR “version” of teapot image, richwith real-world detail.Weknow theteapot’sfeaturesshouldremainsharpaswezoomin onthem,yetstandardpixel interpolationmeth-ods,suchaspixel replication(b, c) andcubic splineinterpolation(d, e), introduceartifactsor blurring of edges.For imageszoomed3 octaves,suchasthese,sharpeningtheinterpolatedresulthaslittleusefuleffect (f, g).

A methodto achieve higherresolutionviews of pixel-basedimagerepresentations,which we will call super-resolution,would havesomeof the bestof both worlds, complex modelsand resolutionindependence.In addition,many otherapplicationsin graphicsorimageprocessingcouldbenefitfrom suchpixel resolutionindepen-dence,suchastexturemapping,enlarging consumerphotographs,

and converting NTSC video contentto HDTV. We don’t expectperfectresolutionindependence—even thepolygonrepresentationdoesn’t have that—but increasingthe resolutionindependenceofpixel-basedrepresentationsis an importanttask for image-basedrendering.Our example-basedsuper-resolutionalgorithm yieldsFig. 2 (h, i).

2 Related approaches

Figure3 shows severalcomplementarywaysto increasetheappar-ent resolutionof an image:(a) sharpening,(b) aggregation frommultiple frames,and (c) single-framesuper-resolution.We feeleachshouldbe usedwherever possible.Sharpeningamplifiesde-tails that arepresentin the image.Integratingresolutioninforma-tion overmultiple framesis sometimescalledsuper-resolution.Forthepurposesof thispaper, wewill alwaysmeansingle-framesuper-resolution.

Super-resolutionrelatesto imageinterpolation—how shouldoneinterpolate betweenthe digital samplesof a photograph?Re-searchershave long studiedthis problem,althoughonly recentlyusingmachinelearningor samplingapproaches,whichoffer muchpower.

Cubic spline interpolation[9] is a very commonimageinterpola-tion function,but suffersfrom blurring of edgesandimagedetails.Recentattemptsto improveoncubicsplineinterpolation[12, 16, 3]havemetwith limitedsuccess.Schreiberandcollaborators[12] pro-poseda sharpenedGaussianinterpolatorfunction to minimize in-formationspilloverbetweenpixelsandoptimizeflatnessin smoothareas.SchultzandStevenson[13] haveusedaBayesianmethodforsuper-resolution,but hypothesizedtheprior probability.

Theseanalyticapproachesoftensuffer from percievedlossof detailin texturedregions.A proprietary, undisclosedalgorithm,AltamiraGenuineFractals2.0 [1] (an Adobe Photoshopplug-in), doesaswell asany of thenon-training-basedmethods,but still suffersfromblur in regionsof textureandatfine lines.

2.1 Example-based approaches

Onewould expectthat therichnessof real-world imageswould bedifficult to captureanalytically. Thismotivatesa learning-basedap-proach:in atrainingset,learnthefinedetailsthatcorrespondto dif-ferentimageregionsseenatalow-resolution;thenusethoselearnedrelationshipsto predictfinedetailsin otherimages.For thepastsev-eralyears[5, 6], wehavebeenexploringthisapproachfor enlargingimages.

To motivatewhy this approachshouldwork at all, notethata col-lection of imagepixels arespecialsignalswhich have much lessvariability thanwould a correspondingsetof completelyrandomvariables.Researchershave studiedtheseregularities to accountfor the early processingstagesof the mammalianvisual systems[4, 15]. Weexploit theseregularitiesin our algorithms,aswell: weusesmall piecesof oneimage,modifiedfor generalizationby the

Page 4: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

(a)

(b) (c)

(d) (e)

(f) (g)

(h) (i)Figure 2: (a) An image(100x100)of a real-world teapotshowsa richnessof texture, but yields a blocky or blurred imagewhenzoomedin by afactorof 8 in eachdimensionby (b,c) pixel replica-tionor (d,e)cubicsplineinterpolation.(Images(b) through(i) were32x32pixel originalsub-images,zoomedby 8 to 256x256images).Sharpeningthecubicsplineinterpolationmaynot helpto increasetheperceptualsharpness(f, g, using“sharpenmore” in AdobePho-toshop).(h, i) show theresultsof our one-passsuper-resolutional-gorithm,maintainingedgeandline sharpness,andplausibletexturedetails.

appropriatepre-processing,to createplausibleimageinformationinasecondimage.Withoutveryspecifictrainingdata,it is notreason-ableto expectto generatethecorrect high-resolutioninformation.Weaimfor themoreattainablegoalof generatingvisually plausibleimagedetails,suchassharpedges,andplausiblelooking texture.

(a)

(b)

(c)Figure 3: Differentcomplementaryapproachesto increasetheper-ceptualresolutionof animage.(a)Showsschematicallythechangein thespatialfrequency amplitudespectrumof animageassociatedwith image sharpening. Existinghigh frequenciesin theimageareamplified.This is oftenusefulto do,providednoiseisn’t amplified,but not thesubjectof super-resolution.(b) Extractingasinglehigh-resolutionframefrom a sequenceof low-resolutionvideo imagesis useful,andis alsosometimesreferredto assuper-resolution.(c)Thesuper-resolutiongoalof thispaperis to estimatemissinghigh-resolutiondetailthatis notpresentin theoriginal image,andwhichcannotbemadevisibleby simplesharpening.

3 Training set generation

To generateourtrainingset,westartfrom acollectionof highreso-lution images,anddegradeeachof themin amannercorrespondingto thedegradationwe planto undoin the imageswe laterprocess.Typically, we blur andsubsamplethemto createa low-resolutionimageof 1

4 thenumberof originalpixels.

We apply an initial analyticinterpolation,suchascubic spline,tothelow-resolutionimage.Weonly needto storethedifferencesbe-tweena cubicsplineinterpolationof the image,andthe true,highresolutionimage.Figure4 (a)and(c) show low andhighresolutionversionsof animage;(b) is theinitial up-interpolation(bilinearwasusedfor thisexample).

We want to storethehigh-resolutionpatchcorrespondingto everypossiblelow-resolutionimagepatch;thesepatchesaretypically5x5and7x7pixels,respectively. Evenrestrictingourselvesto plausibleimageinformation,this is ahugeamountof informationto store,soweneedto pre-processtheimagesto remove variablitity andmakethetrainingsetsasgenerallyapplicableaspossible.

We believe that the highest resolutioncomponentsof the low-resolutionimage(b) aremostimportantin predictingtheextra de-tails presentin (c). We filter out the lowest-frequency componentsof (b), so thatwe don’t have to storeexamplepatchesfor all pos-sible lowestfrequency componentvalues.We alsobelieve that therelationshipbetweenhigh andlow-resolutionimagepatchesis es-sentiallyindependentof local imagecontrast,andwedon’t wanttohave to storeexamplesof thatrelationshipfor all possiblevaluesofthe local imagecontrast.The resultingbandpassfiltered andcon-

Page 5: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

Figure 4: Image pre-processingstepsfor training images.Westartfrom a low-resolutionimage,(a), andits corresopndinghigh-resolutionsource,(c). We form an initial interpolation,(b), of thelow-resolutionimageto thehigherpixel samplingresolution.In thetraining set,we storecorrespondingpairsof patchesfrom (f) and(e), which arethe bandpassor highpassfiltered andcontrastnor-malizedversionsof (b) and(c), respectively. Thisprocessingallowsthesamepatchpair examplesto applyin differentimagecontrastsandlow-frequency offsets.

trastnormalizedimagepairsusedfor trainingareshown in Fig. 4(d) and(e). We undothe contrastnormalizationstepuponrecon-structionof thehigh-resolutionimage.

3.1 Markov network algorithm

If local imageinformationaloneweresufficient to predictthemiss-ing high resolutiondetails,we shouldbe able to usethe trainingsetpatchesby themselves for super-resolution.For a given inputimageto enlarge,we would apply the pre-processingsteps,breakthe imageinto patches,and look-up the missinghigh resolutiondetail.Unfortunately, thatapproachdoesn’t work, asillustratedinFig. 5 (a); thehigh resolutiondetail imagelookslike oatmeal.Thelocalpatchaloneis notsufficient to estimateplausiblelookinghighresolutiondetail.

Fig.5 (b) illustrateswhy. Foragivenlow-resolutioninputpatch,wesearchedatypicaltrainingdatabaseof about100,000patchestofindthe16closestexamplesto theinputpatch,shown in thesecondlineof Fig. 5 (b). Eachof theselooks fairly similar to the input patch.Thebottomrow shows thehigh resolutiondetailcorrespondingtoeachof thesetrainingexamples;eachof thoselooksquitedistinctfrom the other. This illustratesthat local patchinformationaloneis not sufficient for super-resolution;spatialneighborhoodeffectsmustbetakeninto account.

We have modelledthespatialrelationshipsbetweenpatchesusingaMarkov network, for whichwell-known usesin imageprocessinginclude[7]. In Fig. 6, thecirclesrepresentnetwork nodes,andthelines indicatestatisticaldependenciesbetweennodes.We let thelow-resolutionimagepatchesbeobservationnodes,y. Weselectthe16 or soclosestexamplesto eachinputpatchasthedifferentstatesof thehiddennodes,x, thatwe seekto estimate.For this network,the probabilityof any given high-resolutionpatchchoicefor eachnodeis theproductof all setsof compatibility matrices

�relating

the possiblestatesof eachpair of neighboringhiddennodes,and

(a)Nearestneighbor

(b)Figure 5: (a)Estimatedhighfrequenciesfor tiger image(Fig. 4 (e)arethe truehigh frequencies)formedby substitutingthe high fre-quenciesof the closesttraining patch to Fig. 4 (d). The lack ofa recognizableimageindicatesthat a nearestneighboralgorithmis not sufficient; spatialcontext must also be used.(b) An inputpatch,andsimilar low-resolution(middle rows) andpairedhigh-resolution(bottomrows) patches.For many of thesesimilar low-resolutionpatches,the high-resolutionpatchesarequite different,reinforcingthelessonfrom (a) above.

matrices� relatingeachobservationto theunderlyinghiddenstates.

To specifythe�

functionsof theMarkov network, we usea sim-ple trick. We samplethe nodesof the input image so that thehigh-resolutionpatchesoverlap with eachother by one or morepixels. In the region of overlap, the pixel valuesof compatibleneighboringpatchesshould agree.We measuredab

ij , the sum ofsquareddifferencesbetweenpatchcandidatesi and j at nodesaand b. The compatibility matrix betweennodesa and b is then

� ab(i, j) = exp( � (dabij )2

2� ), where � is a noise parameter. We usea similar quadraticpenaltyon differencesbetweenthe observedlow-resolutionimagepatch,andthecandidatelow-resolutionpatchfoundfrom thetrainingset,to specifytheMarkov network compat-ibility function,

�.

Theoptimalhigh-resolutionpatchesat eachnodeis thatcollectionwhich maximizesthe probabiltyof the Markov network. Findingtheexact solutioncanbe computationallyintractible,but we havefoundgoodresultsusingtheapproximatesolutionobtainedby run-ninga fast,iterativealgorithmcalledBelief Propagation.Typically,3 or 4 iterationsof thealgorithmaresufficient,seeFig. 7.

3.2 Single-pass algorithm

The fact that belief propagationconverged to a solution of theMarkov network soquickly led us to believe theproblemwasnota difficult one.We founda simpleone-passalgorithmwhich givesresultsthatarenearlyasgoodastheiterativesolutionto theMarkovnetwork.

Page 6: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

Figure 6: Markov network modelfor super-resolutionproblem.

In our algorithm,we only computehigh resolutionpatchcompati-bilities for neighboringpatchesthatarealreadyfixed,typically thepatchesabove andto theleft, in raster-scanorderprocessing.If wepre-structurethe training dataproperly(Figure11), matchingthelocal low-resolutionimagedataaswell asselectingthecompatiblehigh resolutionpatchcandidatecanall be donein a singleoper-ation: finding the nearestneighborto a given input vector in thetrainingset.Thesimplificationavoidsstepsof finding thecompat-ibility matricesandtheiterative belief propagationalgorithm,withnegligible reductionin imagequality.

Figure8 showsanimagezoomedwith super-resolution,alongwiththe sameimagezoomedwith cubic splineand the true high res-olution image.At the bottomarethe imagesfrom the trainingsetusedin thesuper-resolutionzoom.Thecentersectionshowsthede-tailsof a few patchesin thezoomedimageandtheir correspondingbestmatchesin thetrainingset.Thetop andbottomrowsshow theimagecontentof thepatchesin thesuper-resolutionimageandthetrainingset.Thesecondandfifth rowsshow thelow resolution,con-trastnormalizedpatches.The third row shows the high resolutioncontentof thetruehigh resolutionimage,andthefourthshows thehighresolutionpatchchosenby thesuper-resolutionalgorithm.Al-thoughnotperfect,thematchesbetweenthetrueandestimatedhighresolutionpatchesarereasonablygood.Note that thealgorithmisable to make useof training patchexamplesfrom sourceimageregionsthat look very differentthantheregionswherethey arein-sertedinto the zoomedimage.For example,the orangeborderedpatchcorrespondsto a shadow boundaryon wood in the trainingimage(of 3 girls),but is appliedto zoomupagreenplantocclusionboundary. Thebandpassfiltering andcontrastnormalizationallowsfor this re-use,whichmakesthetrainingsetmorepowerful.

4 Single-pass algorithm details

In thesimplestterms,one-passsuper-resolutiongeneratesthemiss-ing high frequency contentof a zoomedimageas a sequenceofpredictionsfrom local imageinformation.Theinput imageis sub-divided into low-frequency patcheswhich are traversedin rasterscanorder. At eachstep,a high-frequency patchis selectedfromthetrainingsetbasedon thelocal low-frequency detailsaswell asadjacent,previously determinedhigh-frequency patches.

In thealgorithmsdescribedbelow, (nonpredictive)scalingupof im-agesis performedvia cubicsplineinterpolation,andscalingdownby convolving with a [0.25 0.5 0.25] blurring filter and subsam-pling on theeven indices.([6] usedlinear interpolationfor theup-sampling,which putsmoreinterpolationburdenon the restof thealgorithm).

4.1 Prediction

Given the highest frequenciesin an input image, the super-resolutionalgorithmpredictsthenext octaveup,i.e.thefrequenciesmissingfrom an imagezoomedwith cubic interpolation.Theout-put of thealgorithmis thesumof its input andthehigh frequencypredictions(Figure9).

Thehigh frequenciesarepredictedfor NxN pixel patchesatatime,in raster-scanorder. Eachpredictionis basedon two competingre-quirements.First, the high frequency patchshouldcomefrom alocationin thetrainingimagethathasa similar low-frequency ap-pearance.Second,thehigh-frequency predictionshouldagreeattheedgesof thepatchwith its neighbors,to preventdiscontinuities.

Thefirst requirementcanbefulfilled by extractingalow-frequencypatch(MxM, not necessarilythe samesizeasthe high frequencypatch)from the imagewe arezoomingandsearchingfor a matchin the training set madeup of pairs of low- and high-frequencypatches.To meetthesecondcriterion,weoverlappredictedpatchesat their borders(Figure10). Whensearchingthe training set, thehigh-frequency datapreviously predictedis alsousedin selectingthe bestpair. A user-controlledweighting factor � is usedto ad-just the relative importanceof the low frequency patchversustheoverlapwith high frequency patches.

Thesuper-resolutionalgorithmoperatesundertheassumptionthatthepredictive relationshipbetweenlow andhighfrequency patchesis independentof contrast,andwe thereforenormalizepatchpairsby theaverageabsolutevalueof thelow frequency patch,acrossthecolorchannels(plussomesmall � to avoid overflow).

Thepixels in low-frequency patchandthehigh-frequency overlapareconcatenatedto form a searchvector. The training set is alsostoredasa setof vectors,sosearchingfor a matchcanbeaccom-plishedby finding thenearestneighborin thetrainingset.Whenamatchis found,thecontrastnormalizationis reversedon thehigh-frequency patch,andit is addedto theoutputimage(Figure11).

4.1.1 Search algorithm

Wesearchfor matchesusinganL2 norm.Dueto thehighdimensionof thesearchspace,finding theabsolutebestmatchwouldbecom-putationallyprohibitive. Instead,we usea tree-based,approximatenearestneighborsearch.The tree is built by recursively splittingthetrainingsetin thedirectionof highervariation.At eachstepwedivide thesetof tiles in half, to maintainabalancedtree.

We use“best-first” searchof the tree to find a goodmatch.Thisallows for a speed-qualitytradeoff: by searchingmorebranchesofthetreewecanfindabettermatch.Sincebest-firstsearchisunlikelyto give the true bestmatchwithout searchingmost or all of thetree,we improve thebest-firstmatchby agreedywalk in thegraphconnectingapproximatenearestneighborsin thetrainingset.Thisimproves the matchwith negligible cost. In all examplesin thispaper, we connecteachpatchpair to its 32 approximatenearestneighbors,computedby amethodsimilar to [10].

4.2 Training

Trainingsetsfor thesuper-resolutionalgorithmarebuilt from band-passandhighpasspairstaken from a setof training images.Spa-tially correspondingMxM low-frequency andNxN high-frequencypatchesaretaken from imagepairsat a setof samplinglocations(usually, M=7, N=5, andsamplesaretakenatevery pixel).

Patchpairsarecontrastnormalizedasdescribedabove.Thesearchvectorfor a patchpair is createdby the concatenationof the low-frequency patchandtheregion thatwill beoverlappedin thehigh-

Page 7: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

frequency patchduring the predictionphase,adjustedby the con-sistenc� y weightingfactor � (Figure11).

Weusedthesamesetof trainingimagesfor all thesuper-resolutionexamplesin thispaper, shown in Figure12.They weretakenwith aNikon Coolpix 950digital cameraat 640x480resolutionandhigh-estqualitycompressionsettings.

4.3 Parameter settings

Attentionto parametersettingscanimproveimagequality. For bothlevelsof zooming,weused5x5pixel high-resolutionpatches(N=5)with 7x7pixel low-resolutionpatches(M=7). Theoverlapbetweenadjacenthigh-resolutionpatcheswas1. Thesesettingsdo well tocapturesmalldetails.Largerpatchsizescanbeusedfor lessaccen-tuationof smalldetails.

For amoreconservativeestimateof thehigherresolutiondetail,thealgorithmcanbeperformed4 timesat staggeredoffsetsrelative tothepatchsamplinggrid. This gives4 independentestimatesof thehigh frequencies,which canthenbeaveragedtogether, smoothingsomeimagedetails.

The parameter� controlsthe tradeoff betweenmatchingthe low-resolutionpatchdataand finding a high-resolutionpatch that iscompatiblewith its neighbors.Thevalue � = 0.1 M2

2N � 1 gave goodquality resultsin ourexperiments.Thefractionadjustsfor therela-tiveareasof low-frequency patchesandoverlappedhigh-frequencypixels.

5 Results

Figures13and14showsouralgorithmappliedto abrick wall andaman’sface.Thetrainingsetwastakenfromtheimagesin Figure12.Theresultingzoomsaresignificantlysharperthanthosefrom cubicsplineinterpolation,preservingsharpedgesandimagedetails.

Figure15shows anexamplewhereour low-level trainingsetaloneis not enoughto distinguishJPEGcompressionnoisefrom correctimagedata;thealgorithminterpretstheartifactsasimagedataandenhancesthem.Extensionsof specializedhigh-level modelssuchas[2] couldbeneeded.

In a simpleexplorationof therelationof thezoomedimageto thetrainingimages(see[6] for others),weenlargedtheimageof Fig.8usingapathologicaltrainingsetof imagesof text. Nonetheless,thealgorithmdoesitsbesttoexplaintheobservedlow-resolutionimagein its vocabulary of text examples,resultingin a zoomwith highresolutiondetailformedoutof concatenatedcharacters,Fig. 16.

5.1 Discussion

Recentwork by Hertzmannet al [8] hasalsouseda training-basedmethodto performsuper-resolution,in thecontext of analogiesbe-tweenimages.Our methoddiffersin thatit operateson tiles ratherthanper-pixel, providing a performancebenefit.It alsonormalizesthetrainingsetaccordingto contrast,andassumesthat thehighestfrequency detailsin animagecanbepredictedusingonly thenextlower octave. Thesetwo generalizationsallow us to zooma widerclassof imagesusingasingle,generictrainingset,ratherthanbeingrestrictedto operatingonimagesthatareverysimilarto thetrainingimage.

A training-basedapproachwasusedby [11], but no attemptwasmadeto enforcethe spatialconsistency constraintsnecessaryforgoodimagequality.

If well-known objectsaresparselysampledin theimage,animageextrapolationbasedonlocal imageevidencealonewill notproduce

the new details that the viewer expects.Very small face imagesaresuseptableto this problem.To addresstheseproperly, higher-level reasoningwould have to be addedto the algorithm.BakerandKanade[2] have recentlyexploredsuper-resolutionalgorithmstunedto aparticularclassof images,suchasfacesor text.

In thezoomed-upimages,low-contrastdetailsnext to highcontrastedgesmaybe lost, dueto thecontrastnormalizationfixing on thelevel of thehigh contrastedge.Independentcontrastnormalizationfor differentimageorientations,eachzoomedseparately, mightad-dressthis problem.However, it is not clearthata one-passimple-mentationwouldsuffice for thatmodification.

Finally, the algorithm works best when the resolutionor noisedegradationsof the datamatchthoseof the imagesto which it isapplied.

Numerically, the root-mean-squarederror from the true high fre-quenciestendto beapproximatelythesameasfor theoriginal cu-bic splineinterpolation.Unfortunately, thismetrichasonly a loosecorrelationwith percieved imagequality [12]. Typical processingtimes for the single-passalgorithm are two secondsto enlarge a100x100imageup to 200x200pixels.

Wehave focussedon thecaseof enlargingsingleimages.Thecaseof enlargingmoving imagesis differentin two respects:(1) thereismoreinput data;multiple observationsof thesamepixel couldbeusedfor super-resolution.(2) Caremustbe taken to ensurecoher-enceacrosssubsequentframesso that the made-upimagedetailsdonotscintillatein themoving image.

6 Conclusions

Thereis a surprisingregularity acrossimages,suchthata trainingsetmadefrom theimagesof Fig. 12 canbeusedto inventmissingdetailsin many others(all thosein this paper).While a trainingsettunedto theimagesto beprocessedof courseworksbest,a trainingsetof genericimagescanhandleavery broadclassof inputs.

We have built on the training-basedsuper-resolutionalgorithmof[6], andintroduceda faster, simpler, and,we believe, betteralgo-rithm for one-passsuper-resolution.Thealgorithmrequiresonly anearest-neighborsearchin thetrainingsetfor avectorderivedfromeachpatchof localimagedata.Thisone-passsuper-resolutionalgo-rithm is asteptowardachieving resolutionindependencein image-basedrepresentations.

Thesealgorithmsare an instanceof a generaltraining-basedap-proachthatmaybeusefulfor imageprocessingor graphicsappli-cations(see[6, 14]). Trainingsetscanbebuilt to helpenlarge im-ages,remove noise,estimate3-d surfaceshapes,andattackotherimagingapplications.

7 Acknowledgements

We thank Ted Adelson,OwenCarmichael,andJohnHaddonforhelpfuldiscussions.

References

[1] AltamiraGenuineFractals2.0,2000.www.altamira.com.

[2] S.Baker andT. Kanade.Limits onsuper-resolutionandhow to breakthem.In IEEE Conf. Computer Vision and Pattern Recognition, 2000.

[3] F. Fekri, R. M. Mersereau,andR. W. Schafer. A generalizedinter-polative VQ methodfor jointly optimalquantizationandinterpolationof images.In Proc. ICASSP, volume5, pages2657–2660,1998.

[4] D. J.Field. Whatis thegoalof sensorycoding.Neural Computation,6:559–601,1994.

Page 8: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

[5] W. T. FreemanandE. C. Pasztor. Learningto estimatescenesfromimages.

In M. S. Kearns,S. A. Solla,andD. A. Cohn,editors,Adv.

Neural Information Processing Systems, volume11,Cambridge,MA,1999.MIT Press.Seealsohttp://www.merl.com/reports/TR99-05/.

[6] W. T. Freeman,E. C. Pasztor, andO. T. Carmichael.Learninglow-level vision. Intl. J. Computer Vision, 40(1):25–47,2000.

[7] S. GemanandD. Geman. Stochasticrelaxation,Gibbsdistribution,andthe Bayesianrestorationof images. IEEE Pattern Analysis andMachine Intelligence, 6:721–741,1984.

[8] A. Hertzmann,C. E. Jacobs,N. Oliver, B. Curless,andD. H. Salesin.Imageanalogies.In ACM SIGGRAPH, 2001. In Computer GraphicsProceedings,AnnualConferenceSeries.

[9] R. Keys. Bicubic interpolation. IEEE Trans. Acoust. Speech, SignalProcessing, 29:1153–1160,1981.

[10] S. A. NeneandS. K. Nayar. A simplealgorithmfor nearestneigh-bor searchin high dimensions.IEEE Pattern Analysis and MachineIntelligence, 19(9):989–1003,September1997.

[11] A. PentlandandB. Horowitz. A practicalapproachto fractal-basedimagecompression.In A. B. Watson,editor, Digital images and hu-man vision. MIT Press,1993.

[12] W. F. Schreiber. Fundamentals of electronic imaging systems.Springer-Verlag,1986.

[13] R. R. SchultzandR. L. Stevenson. A Bayesianapproachto imageexpansionfor improved definition. IEEE Trans. Image Processing,3(3):233–242,1994.

[14] H. SidenbladhandM. J.Black. Learningimagestatisticsfor Bayesiantracking.In Intl. Conf. on Computer Vision, 2001.

[15] E. P. Simoncelli.Statisticalmodelsfor images:Compression,restora-tion andsynthesis.In 31st Asilomar Conf. on Sig., Sys. and Comput-ers, PacificGrove,CA, 1997.

[16] S.ThurnhoferandS.Mitra. Edge-enhancedimagezooming.OpticalEngineering, 35(7):1862–1870,July1996.

(a)beliefprop.,iter. 0

(b) beliefprop.,iter. 1

(c) beliefprop.,iter. 3

(d) beliefpropagationFigure 7: Beliefpropagationsolutionto Markov network for super-resolution.(a), (b), and(c) aretheestimatedhigh frequenciesafter0, 1, and3 iterationsof beliefpropagation.(d) is theestimatedfull-resolutionimage.(The inverseof the contrastnormalizationusedin Fig. 4 (d) was appliedto Fig. 7 (c). The result was addedtoFig. 4 (b) to obtainFig. 7 (d)). Thetrainingsetfor this imagewastwo categoriesof theCoreldatabase,includingothertigers,but notthis image[6].

Page 9: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

Figure 8: Exampleshowing how patchesin thetrainingimageareusedto createdetailin thetestimage;seetext.

Figure 9: High level view of run time processingin our super-resolutionalgorithm.

Figure 10: Patch centers, low-resolution and high-resolutionpatches,and high-resolutionpatch overlap. The shadedhigh-resolutionpatcheshavealreadybeenprocessed,andtheshadedareaoverlappedby thecurrent(unshaded)patchareusedto enforcespa-tial consistency in thehigh-resolutiondetails.

Figure 11: Block diagramshowing raster-orderper-patchprocess-ing. At eachstep,local low- andhigh-frequency details(shown ingreenandred,respectively) areusedto searchthe training setfora new high-frequency patch,which is addedto thehigh-frequencyimage.

Page 10: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

(a) (b) (c)

(d) (e) (f)Figure 12: Training imagesusedfor the examplesof this paper,unlessotherwisestated.Patcheswere sampledat 1 pixel offsetsover eachof theseimagesand over their syntheticallygeneratedlow-resolutioncounterparts(afterpre-processingsteps).Thesesix200x200imagesyieldedatrainingsetof slightly over200,000highandlow resolutionimagepatchpairs.

(a)(b)

(c)

(d)

(e)

(f)Figure 13: Super-resolutioncanbeusedfor zoominginto texture-mappedsurfaces.(a) original texturein 52x52pixel bitmap.super-resolutionzoomedby 2 (b), by 4 (c) and by 8 (d) times in eachdimension.Samelevel of zooming,using(e) cubicsplineinterpo-lation,and(f) AltamiraGenuineFractalsproprietarysoftware.Oursuper-resolutionalgorithm inventssharp,plausibleimagedetails,usingexamplespreviously observedin thetrainingdatabase.

Page 11: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

(a)

(b)

(c)Figure 14: (a) Original image. (b) cubic spline interplation (c)Super-resolutioninterpolation

(a)

(b) (c)

(d) (e)Figure 15: Failureexample.(a) original image.(b) and(d): cubicsplineinterpolationby factorof 4 in eachdimension.Note JPEGcompressionartifacts madevisible. (c) and (e): one passsuper-resolutioninterpolation.Without high-level information,the algo-rithm treatstheJPEGnoiseassignal,andamplifiesit.

Page 12: Example-Based Sup er-Resolutionpeople.csail.mit.edu/billf/papers/TR2001-30.pdfExample-Based Sup er-Resolution William T. F reeman, Thouis R. Jones, and Egon C. P asztor MERL, Mitsubishi

(a)

(b)

(c)Figure 16: Super-resolutionexampleusing pathologicaltrainingsetcomposedentirelyof text in onefont; (a) is anexampleimagefrom thetrainingset.(b) zoomedimage,and(c) close-up.Notethatthealgorithmdoesasbestasit canto inventplausibledetailfor thisimage,formingcontoursby concatenatedletters.