CHAPTER 3: Extending Film-Like Digital Photography

As a thought experiment, suppose we accept our existing film-like concepts of photography, just as they have stood for well over a century. For the space of this chapter, let's continue to think of any and all photographs, whether captured digitally or on film, as a fixed and static record of a viewed scene, a straightforward copy of the 2-D image formed on a plane behind a lens. How might we improve the results from these traditional cameras and the photographs they produce if we could apply unlimited computing, storage, and communication to any part of the process? The past few years have yielded a wealth of new opportunities, as miniaturization already allows lightweight battery-powered devices such as mobile phones to rival the computing power of the desktop machines of only a few years ago, and as manufacturers can produce millions of low-cost, low-power and compact digital image sensors, high-precision motorized lens systems, bright, full-color displays, and even palm-sized projectors, integrated into virtually any form as low-priced products. How can these computing opportunities improve conventional forms of photography?

Currently, adjustments and tradeoffs dominate film-like photography, and most decisions are locked in once we press the camera's shutter release. Excellent photos are often the result of meticulous and artful adjustments, and the sheer number of adjustments has grown along with our photographic sophistication as digital camera electronics replaced film chemistry; they now include ASA settings, tone scales, flash control, complex multi-zone light metering, color balance, and color saturation. Yet we make all these adjustments before we take the picture, and even our hastiest decisions are usually irreversible. Poor choices lead to poor photos, and an excellent photo may be possible only for an exquisitely narrow combination of settings combined with a shutter-click made at just the right moment. How might we elude these heretofore inescapable tradeoffs? Might it be possible to defer choosing the camera's settings, or to change our minds and re-adjust them later? Can we compute new images that expand the range of settings, such as a month-long exposure time? What new flexibilities might allow us to take a better picture now, and also keep our choices open to create an even better one later?

3.1 Understanding Limitations

This is a single-strategy chapter: since existing digital cameras are already extremely capable and inexpensive, we will explore different ways to combine multiple cameras and/or multiple images. By digitally combining the information from these images, we can compute a picture superior to what any single camera could capture, and can also create interactive display applications that let users adjust and explore settings that were fixed in film-like photography. (FOOTNOTE-1)

(FOOTNOTE-1) For example, HDRShop from Paul Debevec's research group at USC-ICT [http://projects.ict.usc.edu/graphics/HDRShop] helps users construct high-dynamic-range images from bracketed-exposure image sets, then lets users interactively adjust exposure settings to reveal details in brilliant highlights or the darkest shadows; Autostitch from David Lowe's group at UBC [http://www.cs.ubc.ca/~mbrown/autostitch/autostitch.html] and AutoPano-SIFT [http://user.cs.tu-berlin.de/~nowozin/autopano-sift/] let users construct cylindrical or spherical panoramas from overlapped images; and HD View from Microsoft Research [http://research.microsoft.com/ivm/hdview.htm] allows users an extreme form of zoom to explore high-resolution panoramas, varying smoothly from spherical projections for very wide-angle views (e.g. >180 degrees) to planar projections for very narrow, telescopic views (<1 degree), in which the viewer is even permitted to adjust the projection parameters interactively.

This strategy is a generalization of "bracketing" [cite: http://en.wikipedia.org/wiki/Bracketing], already familiar to most photographers. Bracketing lets photographers avoid uncertainty about critical camera settings such as focus or exposure; instead of taking just one photo we think has the correct setting, we make additional exposures at higher and lower settings that "bracket" the chosen one. If our first, best-guess setting was not the correct choice, the bracketed set of photos almost always contains a better one. The methods in this chapter are analogous, but often use a larger set of photos, because multiple settings may be changed, and we may digitally merge desirable image features from multiple images in the set rather than simply selecting a single best photo.

We need to broaden our thinking about photography to avoid missing some opportunities. So many of the limitations and trade-offs of traditional photography have been with us for so long that we tend to assume they are inescapable, a direct consequence of the laws of physics, image formation and light transport. For example, Chapter [chapCameraFundamentals] reviewed how the depth-of-focus of an image formed behind a lens is a direct consequence of the thin-lens law [fig:depthOfFocus--SHOULD BE A SIDE-BAR ILLUSTRATION (SEE JackDepthOfFocusSlides.ppt); make figure with explanatory caption]. While true for a single image, merging multiple images lets us construct an 'all focus' image (as in Fig XX), or vary focus and depth-of-focus arbitrarily throughout the image.

In film-like photography, we cannot adjust a single knob to change the depth-of-focus; instead, we must make several interdependent settings choices, each of which imposes its own trade-offs. We could use a lens with a shorter focal length, but this will make the field-of-view wider; we could compensate for the wider field-of-view by moving the camera closer to the subject, but this changes the foreshortening in the scene, as illustrated in Figure {chap3foreshorten}:

[FIGURE SLIDE: "Focal Length vs. Viewpoint vs. Focus" -- three panels labeled Wide angle, Standard, and Telephoto, with depth of focus ranging from large/deep (wide angle) to small/shallow (telephoto). Slide note: "Wide angle isn't flattering; do you know why?"]

Figure {chap3foreshorten}: Foreshortening Effects (NEED PERMISSION FOR THESE IMAGES!) We can keep the same image size for a photographed subject if we move the camera closer and "zoom out" to a wider-angle lens, or if we move the camera further away and "zoom in" to a narrow-angle telephoto lens, but the appearance of that subject may change dramatically due to foreshortening. Foreshortening is the subjective name for a mathematically simple rule of planar projection: the image size of an object is inversely proportional to its depth, its distance from the camera. In a telephoto image of a face, the distance to the camera is much greater than the small depth differences between forehead, nose, and chin, and all appear properly proportioned in the image. In a wide-angle close-up photo, the nose distance might be half the chin and forehead distance, exaggerating its image size in a very unflattering way.
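To make the foreshortening rule concrete, here is a minimal numerical sketch (in Python; the focal length, distances, and face measurements are illustrative assumptions, not data from any real photograph). It keeps the face the same size on the sensor while swapping a distant telephoto for a close wide-angle view, and shows how much the nose grows relative to the rest of the face:

    # Pinhole projection: a feature of true size s at distance d from the camera
    # projects to an image size of s * f / d  (inversely proportional to distance).
    def projected_size_mm(true_size_m, distance_m, focal_length_mm):
        return true_size_m * focal_length_mm / distance_m

    face_width = 0.15      # meters (illustrative)
    nose_offset = 0.05     # the nose sits ~5 cm closer to the camera than the ears

    # Telephoto far away vs. wide-angle up close; both give the same face size.
    for focal_mm, cam_dist in [(200.0, 4.0), (24.0, 0.48)]:
        face = projected_size_mm(face_width, cam_dist, focal_mm)
        nose = projected_size_mm(face_width, cam_dist - nose_offset, focal_mm)
        print(f"f={focal_mm:.0f}mm at {cam_dist}m: face {face:.1f}mm on sensor, "
              f"nose features enlarged {nose / face:.2f}x")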

We could compensate by cropping the sensed image, using a smaller portion of the image sensor, but this reduces resolution and may make some lens flaws more noticeable, such as focus imperfections, chromatic aberration and coma, and other lens flaws less apparent, such as radial distortion. We could leave the lens unchanged while reducing the size of its limiting aperture, but this decreases the light falling on the image sensor and may increase visible noise. Compensating for the decreased intensity by increasing exposure time would increase the chance of motion-blur or camera-blur. Increasing the sensor's light sensitivity might further increase image noise. [FOOTNOTE: See the section on image noise, which outlines its principal causes: photon-arrival ('shot') noise, fixed-pattern noise due to sensitivity variations, thermal noise in transistors, interference or 'crosstalk' between nearby signal-carrying conductors, A/D and D/A quantization, and lossy encoding/decoding.]

What strategy should we choose to extend film-like photography in the most useful ways? Usually, no one answer is best; instead, we confront a host of interrelated tradeoffs that depend on scene, equipment, the photographer’s intentions, and the ultimate display of the photograph.

What are our assumptions as photographers? Do they remain valid for bracketing, for merged sets of photographs? How might we transcend them by combining, controlling and processing results from multiple cameras, lights, and photographs using computing methods?

Assumptions block progress as we explore computed extensions to film, and we can't be sure we've identified them all. Surely every photo-making process has to employ a high-quality optical system for high-quality results. Surely any good camera must require focusing, adjusting zoom level, choosing the field of view and the best framing of the subject scene. To achieve the results we aspire to, surely we must choose our exposure settings carefully, making optimal tradeoffs among sensitivity, noise, and the length of exposure needed to capture a good image. Surely we must keep the camera stable as we aim it at our subject. Surely we must match the color-balance of our film (or digital sensor) to the color spectrum of our light sources, and later match it to the color spectrum of our display device. Surely we must choose appropriate lighting, adjust the lights well, choose a good viewpoint, and pose and adjust the subject for its most flattering appearance (and "Say cheese!"). Only then are we ready to click the shutter. Right?

Well, no. Not necessarily. Not any longer. We can break each of these conventions with computational methods. The technical constraints change radically for each of these conventions if we're allowed to combine results from multiple photographs and/or multiple cameras. This chapter points out some of those assumptions, describes a few current alternatives, and encourages you to look for more.

A few inescapable limits, though, do remain:
- we cannot measure infinitesimal amounts of light, such as the strength of a single ray, but instead must measure a bundle of rays: a group that impinges on a non-zero area and whose directions span a non-zero solid angle;
- we cannot completely eliminate noise from any real-world sensor that measures a continuum of values (such as the intensity of light on a surface); and
- we cannot create information about the scene not recorded by at least one camera.

Beyond these basic irreducible limits, we can combine multiple photographs to substantially expand nearly all the capabilities of film-like photography.

3.2 Strategies: Fusion of Multiple Images

Tradeoffs in film-like photography usually improve one measurable aspect of a photograph at the expense of another. While we can capture a series of photographs with different settings for each, we can also vary settings within the digital sensors themselves:

3.2.1 "Sort First" versus "Sort Last" Capture

With the "sort first" method, we capture a sequence of photographs with one or more cameras. Each photo forms one complete image, taken with only one complement of camera settings. Each image is ready to use as output, and we need no further "sorting" of the image contents to construct a viewable output image (though we may still merge several photos to make the output even better). Bracketing of any kind is a good example of "sort first" photography: if we photograph at high, moderate, and low exposure times, we "sort" the results by selecting the best whole-photo result; we need no further untangling of measured data to create the best photograph. For example, in 1909-1912 and 1915, commissioned and equipped by Tsar Nicholas II, Sergei Mikhailovich Prokudin-Gorskii (1863-1944) surveyed the Russian Empire in a set of beautiful color photographs gathered by his own "sort first" method for color photography [cite: http://www.loc.gov/exhibits/empire/gorskii.html]. His single-lens customized view-camera took a rapid sequence of three separate photographs, each through a different color filter in front of the lens. In 2003, the U.S. Library of Congress digitized a large set of these negatives and merged them to construct conventional color photographs.
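A minimal sketch of the whole-photo "sorting" step for a bracketed exposure set (Python with NumPy; the clipping thresholds are arbitrary assumptions): score each exposure by how many of its pixels are neither crushed to black nor saturated to white, and keep the single best photo.

    import numpy as np

    def best_bracketed_exposure(images, low=0.02, high=0.98):
        """Return the index of the photo with the most well-exposed pixels.
        images: list of float arrays scaled to [0, 1]."""
        scores = [np.mean((im > low) & (im < high)) for im in images]
        return int(np.argmax(scores))

    # Usage: three simulated exposures of the same (random) scene.
    scene = np.random.rand(100, 100)
    brackets = [np.clip(scene * gain, 0.0, 1.0) for gain in (0.25, 1.0, 4.0)]
    print("best exposure index:", best_bracketed_exposure(brackets))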

Using the "sort last" method, we make several different simultaneous measurements within each photographic image we take. After photography we must "sort" the contents of the photos, rearranging and recombining them somehow to construct a suitable output image. Such multiple simultaneous measurements in each image make sort-last methods less susceptible to scene variations over time, reducing the chance that a changing scene value will escape successful measurement. For example, suppose clouds cover the sun while we perform "sort-first" exposure bracketing: our first, high-exposure photo, taken before the sun went behind the clouds, is over-exposed, while our subsequent mid- and low-exposure photos are darker than they should be because the light level fell, yielding no usable photos at all. Another example of "sort-first" difficulties appeared in merging Prokudin-Gorskii's color plates: unpredictable motions from wind-blown trees and swirling river water during photography caused mismatches among the three negatives, resulting in color fringing and rainbow-like artifacts, as in "Pinkhus Karlinskii. Eighty-four years [old]," viewable online at http://www.loc.gov/exhibits/empire/images/p87-5006.jpg.

The "Bayer" color mosaic pattern found on nearly all single-chip digital cameras is perhaps the most widely used form of "sort-last" measurement, while 3-chip digital cameras follow the "sort first" method. Three-chip cameras (more common for high-quality video applications than still photos) use a dichroic prism [http://en.wikipedia.org/wiki/Dichroic_prism] assembly behind the lens to split the image from the lens into three separate wavelength bands for three separate image sensors. In the patented Bayer mosaic method, individual, pixel-sized color filters cover adjacent pixels on a single "sort-last" sensor, forming a red, green, and blue filter pattern as shown. Even though the sensor loses spatial resolution because of this multiplexing, we can measure all three colors at once, and interpolate sensible values for every pixel location (de-mosaicking) to give the impression of a full-resolution image with all colors measured for every pixel.

[IMAGE: see http://en.wikipedia.org/wiki/Bayer_filter] FIG X: The Bayer mosaic in many modern digital cameras permits "sort-last" sensing for color. 'De-mosaicking' methods employ edge-following, estimation, and interpolation to approximate a full-resolution color image from these measurements. Alternatively, three-chip video cameras follow the "sort-first" method, and sense three complete, independent color images simultaneously.
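A minimal de-mosaicking sketch (Python/NumPy with SciPy) under simplifying assumptions: plain bilinear interpolation on an assumed RGGB layout, without the edge-following refinements that real cameras use.

    import numpy as np
    from scipy.ndimage import convolve

    def demosaic_bilinear(raw):
        """raw: 2-D float array from an RGGB Bayer sensor."""
        h, w = raw.shape
        masks = np.zeros((3, h, w))
        masks[0, 0::2, 0::2] = 1.0                  # R samples: even rows, even columns
        masks[1, 0::2, 1::2] = 1.0                  # G samples form a checkerboard
        masks[1, 1::2, 0::2] = 1.0
        masks[2, 1::2, 1::2] = 1.0                  # B samples: odd rows, odd columns
        kernel = np.array([[1., 2., 1.], [2., 4., 2.], [1., 2., 1.]])
        rgb = np.zeros((h, w, 3))
        for c in range(3):
            values = convolve(raw * masks[c], kernel, mode='mirror')
            weights = convolve(masks[c], kernel, mode='mirror')
            rgb[..., c] = values / weights          # normalized (bilinear) interpolation
        return rgb

    # Usage: demosaic a synthetic 8x8 raw frame into an 8x8x3 color image.
    print(demosaic_bilinear(np.random.rand(8, 8)).shape)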

3.2.2 Time- and Space-multiplexed Capture

In addition to "sort-first" and "sort-last," we can also classify multi-image gathering methods into time-multiplexed and space-multiplexed forms, which are more consistent with the 4D ray-space descriptions we encourage in this book. Time-multiplexed methods use one or more cameras to gather photos whose settings vary in a time-sequence: camera settings may change, the photographed scene may change, or both. Space-multiplexed methods are their complement, gathering a series of photos at the same time, but with camera settings that differ among cameras or within cameras (e.g. "sort first," "sort last").

Like "sort first" methods, time-multiplexed capture can introduce inconsistencies from changing scenes. For example, suppose we wish to capture photographs for assembly into a panoramic image showing a 360-degree view from a single viewpoint. For a time-multiplexed sequence, we could mount a single camera on a tripod, use a lens with a field of view of D degrees, and take a time-multiplexed sequence by rotating the camera D degrees or less between each exposure. An unchanging scene and a camera with little or no radial distortion will produce a set of photographs that match each other perfectly in their overlapped regions, and conventional panorama-making software will produce good results. However, any movement or lighting changes within the scene during this process will introduce inconsistencies that are much more difficult (though not impossible) to resolve; clouds in the first photograph might not align at all with clouds in the last one. Tools in Photoshop CS3 are suitable for manually resolving modest mismatches, and "video panoramas" have proven capable of resolving more challenging scene changes that include flowing water, trees waving in the wind, and lighting changes [Agarwala2005 http://grail.cs.washington.edu/projects/panovidtex/panovidtex.pdf].

A space-multiplexed sequence neatly avoids these time-dependent mismatches. To capture a changing panorama, we can either construct a ring of cameras with aligned or slightly overlapped fields-of-view to capture all views simultaneously (e.g. Kodak's 'Circle-Vision 360' panoramic motion-picture attraction [1967 onwards] at Disney theme parks), or resort to specialized catadioptric (lenses-and-mirrors) optics to map the entire panorama onto a single image sensor [NayarCata97, Benosman2001].

3.2.3 Hybrid Space-Time Multiplexed Systems

Hybrid systems of video or still cameras enable capture of each step of a complicated event over time in order to understand it better, whether captured as a rapid sequence of photos from one camera (a motion picture), a cascade of single photos taken by a set of cameras, or something in between. Even before the first motion pictures, in 1877-9 Eadweard Muybridge (http://www.kingston.gov.uk/browse/leisure/museum/museum_exhibitions/muybridge.htm) devised just such a hybrid, constructing an elaborate multi-camera system of wet-plate (collodion) cameras to take single short-exposure-time photos in rapid-fire sequences.


Muybridge devised a clever electromagnetic shutter-release mechanism triggered by trip-threads to capture action photos of galloping horses. He later refined the system with electromagnetic shutter releases triggered by pressure switches or elapsed time to record walking human figures, dancers, acrobatic performances, and more (see http://www.kingston.gov.uk/browse/leisure/museum/museum_exhibitions/muybridge.htm). His sequences of short-exposure 'freeze-frame' images allowed the first careful examination of the subtleties of motion too fleeting or complex for our eyes to absorb as they happen--a forerunner of slow-motion movies and video. Instead of selecting just one perfect instant for one photograph, these event-triggered image sequences contain valuable visual information that stretches across time and across a sequence of camera positions, and is suitable for several different kinds of computational merging.

Perhaps the simplest form of 'computational merging' of time-multiplexed images is within the camera itself. In his seminal work on fast, high-powered electronic (Xenon) strobe lights, Harold Edgerton showed that a rapid multiple-exposure sequence can be as revealing as a high-speed motion-picture sequence, as illustrated here (IMAGE FOUND HERE: http://libraries.mit.edu/archives/exhibits/bush/img/edgerton-baseball.jpg NEED PERMISSION FOR THIS! Policy found here: http://libraries.mit.edu/archives/research/policy-publication.html). Figure XX: In addition to its visual interest, a photo lit by a precisely timed strobe sequence like this permits easy frame-to-frame measurements; for example, this one confirms that baseballs follow elastic collision dynamics. [FOOTNOTE 2]

[FOOTNOTE 2]:Similarly, you can try your own version of Edgerton’s well-known milk-drop photo sequences (e.g. http://www.vam.ac.uk/vastatic/microsites/photography/photographer.php?photographerid=ph019&row=4 ) with a digital flash camera, an eye dropper, and a bowl of milk; http://www.math.toronto.edu/~drorbn/Gallery/Misc/MilkDrops/index.html

In some of Muybridge's celebrated experiments, two or more cameras were triggered at once to capture multiple views simultaneously. Recent work by Bregler and others (http://movement.nyu.edu/) on motion-capture from video merged these early multi-view image sequences computationally to infer the 3D shapes and movements that caused them. By finding image regions undergoing movements consistent with rigid, jointed 3D shapes in each image set, Bregler et al. [1998Bregler: http://www.debevec.org/IBMR99/75paper.pdf] were able to compute detailed estimates of the 3D position of each body segment in each frame, and re-render the image sets as short movies at any frame rate, viewed from any desired viewpoint.

In another ambitious experiment at Stanford University, more than one hundred years after Muybridge's work, Marc Levoy and colleagues constructed an adaptable array of 128 individual film-like digital video cameras [Wilburn 05] that perform both time-multiplexed and space-multiplexed image capture simultaneously. The reconfigurable array enabled a wide range of computational photography experiments (see Chapters XX, YY, pages AA and BB). Building on lessons from earlier arrays (e.g. [Kanade97], [Yang02], [Matusik04], [Zhang04]), the system's interchangeable lenses, custom control hardware, and refined mounting system permitted adjustment of camera optics, positioning, aiming, arrangement, and spacing between cameras. One configuration packed the cameras just 1 inch apart and staggered the triggering times for each camera within the normal 1/30-second video frame interval. The video cameras all viewed the same scene from almost the same viewpoint, but each viewed the scene during different, overlapped time periods. By assembling the differences between overlapped video frames from different cameras, the team was able to compute the output of a virtual high-speed camera running at multiples of the individual camera frame rates, as high as 3,000 frames per second.

However, at high frame rates these differences were quite small, causing noisy results that would not be acceptable from a conventional high-speed video camera such as [VisionResearch08]. Instead, the team simultaneously computed three low-noise video streams with different tradeoffs using synthetic-aperture techniques [Levoy04]. A spatially-sharp but temporally blurry video Is, made by averaging together multiple staggered video streams, gave high-quality results for stationary items but excessive motion blur for moving objects. A temporally-sharp but spatially blurry (low spatial resolution) video It, made by averaging together spatial neighborhoods within each video frame, eliminated motion blur but induced excessive blur in stationary objects. A temporally- and spatially-blurred video stream Iw holds the joint low-frequency terms, so that the combined stream Is + It - Iw exhibited reduced noise, sharp stationary features, and modest motion blur, as shown in Figure {chap3levoyHiSpeed}:


FIGURE {chap3levoyHiSpeed}: Staggered video frame times permit construction of a virtual high-speed video signal with a much higher frame rate via 'hybrid synthetic aperture photography' [Wilburn2005, Fig 12, reproduced by permission NEED TO GET PERMISSION]. Hybrid synthetic aperture photography combines high depth of field and low motion blur. (a-c) Images of a scene captured simultaneously through three different apertures: a single camera with a long exposure time (a), a large synthetic aperture with a short exposure time (b), and a large synthetic aperture with a long exposure time (c). Computing (a+b-c) yields image (d), which has aliasing artifacts because the synthetic apertures are sampled sparsely from slightly different locations. Masking pixels not in focus in the synthetic aperture images before computing the difference (a+b-c) removes the aliasing (e). For comparison, image (f) shows the image taken with an aperture that is narrow in both space and time. The entire scene is in focus and the fan motion is frozen, but the image is much noisier.
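The stream combination described above can be sketched in a few lines (Python/NumPy with SciPy; the stack shape, the choice of reference frame, and the box-filter size are illustrative assumptions rather than the parameters used in [Wilburn 05]):

    import numpy as np
    from scipy.ndimage import uniform_filter

    def hybrid_synthetic_aperture(frames, blur_size=7):
        """frames: (n_cameras, height, width) array of staggered frames that
        cover one output time slot, scaled to [0, 1]."""
        I_s = frames.mean(axis=0)                        # spatially sharp, temporally blurred average
        I_t = uniform_filter(frames[0], size=blur_size)  # one short-interval frame, spatially averaged
        I_w = uniform_filter(I_s, size=blur_size)        # blurred in both space and time
        return I_s + I_t - I_w                           # sharp statics, low motion blur, reduced noise

    # Usage: eight staggered 64x64 frames produce one combined output frame.
    stack = np.random.rand(8, 64, 64)
    print(hybrid_synthetic_aperture(stack).shape)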

3.3 Improving Dynamic Range

Like any sensor, digital cameras have a limited input range: too much light dazzles the sensor, ruining the image with a featureless white glare, and too little light makes image features indistinguishable from perfect darkness. How can we improve that range, let our cameras see details in the darkest shadows and brightest highlights?


Film-like cameras provide several different mechanisms that let us adjust the camera's overall sensitivity to the amount of light in a viewed scene, and digital cameras can adjust most of them automatically. These include adjusting the aperture size to limit the light admitted through the lens (but this may change the depth-of-field), adjusting the exposure time (but this may allow motion blur), placing dark transparent 'neutral density' filters in the light path through the lens (but this might accidentally displace the camera), or adjusting the sensitivity of the sensor itself: we could change to a film with a different 'ASA' rating (which might change film-grain size), or change the gain-equivalent settings on digital cameras (which might change the amount of noise in the picture). Despite their tradeoffs, these mechanisms combine to give modern cameras an astoundingly wide sensitivity range, one that can rival or exceed that of the human eye, which adapts to sense light over 16 decades of intensity, from the absolute threshold of vision at about 10^-6 cd/m2 up to the threshold of light-induced eye damage near 10^+8 cd/m2.
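These adjustments trade against each other in a simple, countable way: halving the exposure time, closing the aperture by one f-stop, or halving the sensor gain each removes one 'stop' of light. A small bookkeeping sketch (Python; the specific settings are illustrative assumptions):

    import math

    def stops_relative(f_number, shutter_s, iso, ref=(8.0, 1 / 60, 100)):
        """Image brightness, in stops, relative to a reference setting.
        Doubling exposure time or ISO adds one stop; multiplying the
        f-number by sqrt(2) removes one stop."""
        ref_n, ref_t, ref_iso = ref
        return (math.log2(shutter_s / ref_t) + math.log2(iso / ref_iso)
                - 2.0 * math.log2(f_number / ref_n))

    # Three settings that yield (nearly) the same image brightness, but trade
    # depth-of-field, motion blur, and sensor noise differently.
    for n, t, iso in [(8.0, 1 / 60, 100), (5.6, 1 / 125, 100), (11.0, 1 / 60, 200)]:
        print(f"f/{n}, 1/{round(1 / t)} s, ISO {iso}: {stops_relative(n, t, iso):+.2f} stops")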


TUMBLIN MADE/OWNS THE IMAGES SHOWN ABOVE--OK TO USE.

OR USE THE MUCH PRETTIER, HIGHER-RES IMAGES FROM HERE: http://en.wikipedia.org/wiki/Image:HDRI-Example.jpg (NEED TO GET PERMISSION) AND FROM HERE (GET PERMISSION!): http://en.wikipedia.org/wiki/Image:Old_saint_pauls_1.jpg

Figure {fig:hdrExample}: Tone-mapped HDR (high dynamic range) image from [Choud03]. Many back-lit scenes such as this one can easily exceed the dynamic range of most cameras. The bottom row shows the original scene intensities scaled by progressive factors of ten; note that scene intensities in the back-lit cloud regions at left are approximately 10,000 times higher than the shadowed forest details, well beyond the 1000:1 dynamic range typical of conventional CMOS or CCD camera sensors.

However, sensitivity adjustment alone isn't enough to enable cameras to match our eye's ability to sense the intensity variations in every possible scene. Many scenes with plenty of light are still quite difficult to photograph well because their contrasts are too high; the intensity ratio between their brightest and darkest regions overwhelms the dynamic range of the camera, so that it cannot capture a detailed record of the darkest blacks and the brightest whites simultaneously. These troublesome high-contrast scenes often include large visible light sources aimed directly at the camera, strong low-angle back-lighting and deep shadows, reflections, and specular highlights, as in Figure {fig:hdrExample} above. Film-like photography offers us very little recourse except to add light to the shadowy regions with flash or fill-lighting; rather than adjust the camera to fit the scene, we adjust the scene to fit the camera!

Unlike its sensitivity, the camera's maximum contrast ratio, known as its 'dynamic range,' is not adjustable for any ordinary camera. Formally, it is the ratio between the brightest and darkest light intensities a camera can capture from a scene within a single image without saturating at maximum white or minimum black: the maximum ratio between the scene intensities that produce the darkest detailed shadows and the brightest textured brilliance, as shown in Figure {fig:hdrExample}.

No single sensitivity setting (or 'exposure' value) will suffice to capture a 'high dynamic range' (HDR) scene that exceeds the camera's contrast-sensing ability.

Both the lens and the sensor limit the camera's dynamic range. In a high-contrast scene, unwanted light scattering within complex lens structures causes glare and flare effects that depend on the image itself: light from bright parts of the scene 'leaks' into dark image areas, washing out shadow details and limiting the maximum contrast that the lens can form in the image on the sensor, typically between 100,000:1 and 10 million to 1 [cite: McCann07], [Levoy07glare]. The sensor's dynamic range (typically <1000:1) then imposes further limits. Device electronics (e.g. charge transfer rates) typically set the upper bound on the amount of sensed light, and the least amount of light distinguishable from darkness is set by both the sensitivity and the sensor's 'noise floor', the combined effect of all the camera's noise sources (quantization, fixed-pattern, thermal, EMI/RFI, and photon-arrival noise).
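This description translates directly into a back-of-the-envelope calculation: the sensor's dynamic range is the ratio of its largest recordable signal (its full-well capacity) to its noise floor. A small sketch (Python; the electron counts are illustrative assumptions, not measurements of any particular sensor):

    import math

    def dynamic_range(full_well_electrons, noise_floor_electrons):
        ratio = full_well_electrons / noise_floor_electrons
        return ratio, math.log10(ratio), math.log2(ratio)   # ratio, decades, stops

    ratio, decades, stops = dynamic_range(full_well_electrons=30000,
                                          noise_floor_electrons=30)
    print(f"{ratio:.0f}:1 = {decades:.1f} decades = {stops:.1f} stops")
    # ~1000:1, i.e. about 3 decades or 10 stops -- far less than the 5+ decades
    # that a back-lit outdoor scene such as Figure {fig:hdrExample} can span.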


[FIGURE {chap3mismatch}: log-scale diagram of intensity ranges. Domain of human vision: from ~10^-6 to ~10^+8 cd/m2. Film-like cameras: ~100:1 to ~1000:1 typical (0-255 pixel codes). Film-like displays: ~1 to ~100 cd/m2 typical. Axis landmarks: starlight, moonlight, office light, daylight, flashbulb, at roughly 10^-6, 10^-2, 1, 10, 100, 10^+4, and 10^+8 cd/m2.]

Figure {chap3mismatch}: Visual Dynamic Range Mismatch. The range of visible intensities dwarfs the contrast abilities of cameras and displays. When plotted on a logarithmic scale (where distance depicts ratios; each tic marks a factor-of-10 change), the range of human vision spans about 16 decades, but typical film-like cameras and displays span no more than 2-3 decades. For daylight-to-dusk (photopic) intensities (the upper two-thirds of the scale), humans can detect some contrasts as small as 1-2% (1.02:1, which divides a decade into about 116 levels: 1/log10(1.02)). Accordingly, 8-bit image quantization is barely adequate for cameras and displays whose dynamic range may exceed 2 decades (100:1); many use 10-, 12-, or 14-bit internal representations to avoid visible contouring artifacts.

3.3.1 Capturing High Dynamic Range

Film-like photography is frustrating for high-contrast scenes because even the most careful bracketing of camera sensitivity settings will not allow us to capture the whole scene's visible contents in a single picture. Sensitivity set high enough to reveal the shadow details will cause severe over-exposure in the bright parts of the scene; sensitivity set low enough to capture visible details in the brightest scene portions is far too low to capture any visible features in the dark parts of the image. However, there are several practical methods available that let us capture all the scene contents in a usable way.

The resulting image covers a much wider dynamic range (see Figure {chap3mismatch}) than conventional image file formats can express; storing only 8 or 10 bits per color per pixel is inadequate to depict the much wider range of intensities in these high dynamic range (HDR) images. Many early file formats used simple grids of floating-point pixel values, but these used extravagant amounts of memory. One popular solution used 8-8-8-8 bit pixels that featured a 'shared exponent' E and 8-bit mantissas in a compact, easy-to-read "RGBE" format devised by Greg Ward [Ward95rgbe] and popularized by use in his photometrically accurate 3D renderer RADIANCE [Ward1998]. Later, a psychophysically well-motivated extension to the TIFF 6.0 image standard encoded HDR pixels as log-luminance plus chrominance [logLUV98].
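Ward's shared-exponent idea is compact enough to sketch directly (Python/NumPy). This follows the spirit of the published RGBE layout but is an illustrative sketch, not a file-format-complete implementation:

    import numpy as np

    def float_to_rgbe(rgb):
        """Pack an (..., 3) float radiance array into (..., 4) uint8 RGBE pixels."""
        brightest = rgb.max(axis=-1)
        exponent = np.ceil(np.log2(np.maximum(brightest, 1e-32)))
        scale = np.where(brightest > 1e-32, 256.0 / 2.0 ** exponent, 0.0)
        mantissa = np.clip(rgb * scale[..., None], 0, 255).astype(np.uint8)
        e = np.where(brightest > 1e-32, exponent + 128, 0).astype(np.uint8)
        return np.concatenate([mantissa, e[..., None]], axis=-1)

    def rgbe_to_float(rgbe):
        e = rgbe[..., 3].astype(float)
        scale = np.where(e > 0, 2.0 ** (e - 128) / 256.0, 0.0)
        return rgbe[..., :3].astype(float) * scale[..., None]

    # Round trip: the worst error, measured relative to each pixel's brightest
    # channel, stays near the 8-bit mantissa precision (under 1%).
    radiance = np.random.rand(4, 4, 3) * 1e4 + 1.0
    recovered = rgbe_to_float(float_to_rgbe(radiance))
    err = np.abs(recovered - radiance) / radiance.max(axis=-1, keepdims=True)
    print(f"worst relative error: {err.max():.4f}")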

HDR by Multiple Exposures

The 'sort-first' approach is very suitable for capturing HDR images. To capture the finely varied intensities in a high-dynamic-range scene, we can capture multiple images using different exposure settings and a motionless camera that takes perfectly aligned images, and then merge these images. In principle, the merging is simple: we divide each pixel value by the light sensitivity of the camera as it took that picture, and combine the resulting estimates of scene radiance at that pixel across all the pictures we took, discounting badly over- and under-exposed values. The basic idea is that when longer exposures are used, dark regions are well exposed but bright regions are saturated; when short exposures are used, dark regions are too dark but bright regions are well imaged. If exposure varies and multiple pictures are taken of the same scene, the value of a pixel can be taken from those images where it is neither too dark nor saturated. This simple form of merging is quick to compute and found widespread early use as 'exposure bracketing' [Morimura 1993, Burt and Kolczynski 1993, Madden 1993, Tsai 1994], but many methods assumed the linear camera response curves found on instrumentation cameras. Most digital cameras intended for photography introduce intentional nonlinearities in their light response, often mimicking the s-shaped response curves of film when plotted on log-log axes ('H-D' or Hurter-Driffield curves).


These curves enable cameras to capture a wider usable range of intensities and provide a visually pleasing response to HDR scenes, retaining a weak ability to capture intensity changes even at the extremes of over- and under-exposure, and they vary among different cameras. Imaging devices usually contain such nonlinearities, where pixel values are nonlinearly related to the brightness values in the scene. Some authors have proposed using images acquired under different exposures to estimate the radiometric response function of an imaging device, and then using the estimated response function to linearize the images before merging them [Mann and Picard 1995, Debevec and Malik 1997, Mitsunaga and Nayar 1999]. This approach has proven far more robust, and is now widely available in both commercial software tools (Adobe Photoshop CS2 and later, CinePaint) and open-source projects (HDRShop, PFStools, and more: see http://theplaceofdeadroads.blogspot.com/2006/02/high-dynamic-range-photography-with.html).
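A minimal merging sketch (Python/NumPy) under simplifying assumptions: the images are perfectly aligned, the camera response has already been linearized (for example, by one of the methods cited above), and a simple 'hat' weight discounts pixels near the black and white extremes:

    import numpy as np

    def merge_hdr(images, exposure_times):
        """images: list of linearized float arrays in [0, 1] of the same scene,
        taken with the given exposure times (seconds). Returns relative radiance."""
        numerator = np.zeros_like(images[0])
        denominator = np.zeros_like(images[0])
        for img, t in zip(images, exposure_times):
            weight = 1.0 - np.abs(2.0 * img - 1.0)   # 0 at the extremes, 1 at mid-gray
            numerator += weight * img / t            # per-image radiance estimate: value / exposure
            denominator += weight
        return numerator / np.maximum(denominator, 1e-6)

    # Usage: a simulated bracketed set of a scene holding both deep shadow (0.02)
    # and bright highlight (5.0) radiances; the merge recovers both.
    radiance = np.concatenate([np.full((4, 8), 0.02), np.full((4, 8), 5.0)], axis=1)
    times = [1 / 15, 1 / 250, 1 / 4000]
    shots = [np.clip(radiance * t, 0.0, 1.0) for t in times]
    merged = merge_hdr(shots, times)
    print(merged[0, 0], merged[0, -1])   # approximately 0.02 and 5.0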

HDR by Exotic Image Sensors

While quite easy and popular for static scenes, exposure-time bracketing is not the only option available for capturing HDR scenes, and it is not suitable for scenes that vary rapidly over time.

====================================================================================================================

STOP HERE - JACK

====================================================================================================================

Some alternative sensor designs can provide signals proportional to the logarithm of intensity, but like low-contrast films, they gain higher dynamic range by reducing sensitivity to low-contrast details, and often provide poor noise performance.

- Logarithmic sensors
- Assorted Pixels (Nayar)
- Multiple sensors: beamsplitter; camera arrays (only good for distant views)
- Gradient camera
- Nayar whirling-filter camera (from the Comp Photog Overview article)
- Sing Bing Kang HDR video paper

Figure 1 High Dynamic Range imaging using Exposure Bracketing

3.3.2 Tone Mapping

From http://en.wikipedia.org/wiki/Tone_mapping: tone mapping is a technique used in image processing and computer graphics to map one set of colors to another, often to approximate the appearance of high-dynamic-range images in media with a more limited dynamic range. Most conventional digital image output devices, such as plain-paper laser printers, CRT or LCD monitors, and DMD projectors, are unable to directly recreate the full range of contrasts present in most natural scenes, even those that might not be considered 'HDR' images. For example, in a well-lit office environment, Tumblin has measured contrasts between displayed white and displayed black on a variety of high-quality CRT displays, and found values ranging from 50:1 down to 15:1, yet all had approximately the same visual appearance. To avoid losing important visual information, we must devise a 'tone mapping' operator that reduces the contrasts of the image to match the display's abilities, yet somehow maintains the appearance of the original scene as closely as possible, keeping both its high contrasts and its visible details in the displayed image without introducing visually objectionable artifacts. This process of detail- and appearance-preserving contrast reduction is known as tone reproduction. Essentially, tone mapping addresses the problem of strong contrast reduction from the scene values (radiance) to the displayable range while preserving the image details and color appearance important to appreciate the original scene content.

Accurate tone mapping for film-like photography is inherently difficult and ambiguous for several reasons. First, it presumes that one fixed set of displayed intensities will be capable of recreating a good approximation of the visual sensations of the original scene. Unfortunately, this mapping is neither simple nor directly measurable, because human eyes don't capture what we see all at once in a single blink or a shutter release; instead, our eyes wander through the scene, guided by a curious combination of conscious and unconscious eye movements, continually adjusting their sensitivity to help examine and comprehend local image contents.
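As a concrete (and deliberately simple) example, a global operator in the spirit of Reinhard et al.'s photographic tone mapping can be sketched as follows (Python/NumPy; the 'key' value and the purely global curve are simplifying assumptions, and practical operators add local adaptation):

    import numpy as np

    def tonemap_global(luminance, key=0.18):
        """Compress HDR luminance into [0, 1): scale so the scene's log-average
        maps to 'key' (middle gray), then apply the curve L / (1 + L)."""
        log_average = np.exp(np.mean(np.log(luminance + 1e-6)))   # geometric mean
        scaled = key * luminance / log_average
        return scaled / (1.0 + scaled)    # compresses highlights, preserves shadow detail

    # A synthetic scene spanning six decades fits into the display's [0, 1) range.
    scene = np.logspace(-2, 4, num=7)     # 0.01 ... 10,000 (relative luminance)
    print(np.round(tonemap_global(scene), 3))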

3.3.3 Image Compression and Display

[Why GAMMA isn't enough]

[Perceptual basis for tone mapping]

[Display advances: Dolby/Brightside, others with modulated LED backlights]

[Rafal Mantiuk, Karol Myszkowski's MPEG extensions for HDR; YZ's filter-banks-based tone-map and compress/decompress]

[Why GAMMA isn't enough] In film sensitometry, 'gamma' describes the contrast sensitivity: the ratio between scene contrasts and the contrast of the developed film. So-called 'high contrast' films (large gamma) exaggerate scene contrasts, driving displayed images to their black or white extremes; their available dynamic range is low. Conversely, 'low-contrast' films (gamma much less than one) reduce scene contrasts; they can photograph very high-dynamic-range scenes, but with a tradeoff: the wide contrast-sensing range causes a 'washed out' appearance, and reduces the film's ability to record subtle low-contrast details such as skin textures, wood grain, and hints of translucency and self-shadowing that help us discern surface materials. Currently, most CCD and CMOS sensor pixel designs mimic the nice-looking compromise offered by photographic film: high contrast sensitivity for mid-range pixel values (e.g. 8-bit values between about 60 and 190) describes a fairly narrow range of contrasts (typically < 80:1). Outside this range, with each successively brighter or darker pixel value the contrast sensitivity falls, permitting the camera to capture a much larger contrast range (typically up to 1000:1) at its most extreme pixel values.
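The compromise is easy to make concrete (Python/NumPy; the curve and scene range below are illustrative). Encoding a 1000:1 linear scene into 8 bits through a simple power-law 'gamma' of 1/2.2 allocates far more of the 256 codes to the brightest decade than to the darkest one, which is why a global gamma curve alone cannot preserve subtle shadow detail in true HDR content:

    import numpy as np

    def encode_gamma(linear, gamma=1.0 / 2.2):
        """Map linear scene values in [0, 1] to 8-bit codes with a power-law curve."""
        return np.round(255.0 * np.power(np.clip(linear, 0.0, 1.0), gamma)).astype(int)

    # How many distinct 8-bit codes does each decade of a 1000:1 scene receive?
    for low, high in [(0.001, 0.01), (0.01, 0.1), (0.1, 1.0)]:
        codes = np.unique(encode_gamma(np.linspace(low, high, 10000)))
        print(f"decade {low}-{high}: {codes.size} distinct codes")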

(Ramesh's gradient-domain merging; Rafal Mantiuk's HDR video compression; YZ's filter-banks paper; ...)

3.4 Extended Depth of Field

(depth of field vs. depth of focus?) From Wikipedia: in optics, particularly as it relates to film and photography, the depth of field (DOF) is the portion of a scene that appears sharp in the image. Although a lens can precisely focus at only one distance, the decrease in sharpness is gradual on either side of the focused distance, so that within the DOF, the unsharpness is imperceptible under normal viewing conditions. EXCELLENT SOURCE: http://en.wikipedia.org/wiki/Depth_of_field
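The standard geometric-optics approximations behind this definition are compact enough to sketch (Python; the circle-of-confusion value is an assumed full-frame figure, and the formulas are the usual thin-lens approximations given in the Wikipedia article cited above):

    def depth_of_field(focal_mm, f_number, focus_dist_mm, coc_mm=0.03):
        """Approximate near and far limits of acceptable sharpness."""
        hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
        near = (focus_dist_mm * (hyperfocal - focal_mm)
                / (hyperfocal + focus_dist_mm - 2 * focal_mm))
        if focus_dist_mm >= hyperfocal:
            far = float('inf')       # everything beyond the near limit stays sharp
        else:
            far = focus_dist_mm * (hyperfocal - focal_mm) / (hyperfocal - focus_dist_mm)
        return near, far

    # A 50 mm lens focused at 3 m: stopping down from f/2 to f/8 deepens the DOF.
    for n in (2.0, 8.0):
        near, far = depth_of_field(50.0, n, 3000.0)
        print(f"f/{n}: sharp from {near / 1000:.2f} m to {far / 1000:.2f} m")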

3.5 Beyond Tri-Color Sensing

[modified from Ankit's dissertation] At first glance, an increase in the spectral resolution of cameras, lights, and projectors might not seem to offer any significant advantages in photography. Existing photographic methods quite sensibly rely on the well-established trichromatic response of human vision, and use three or more fixed color primaries such as red, green, and blue (RGB) to represent any color in the color gamut of the device.

Figure 1.2: Comparison of the spectral response of a typical color film and a digital camera sensor. (a) Spectral response of FujiFilm Velvia 50 color slide film (from [Fuji2008]: FUJIFILM, FUJICHROME Velvia for Professionals [RVP], Data Sheet AF3-960E). (b) Spectral response of the Nikon D70 sensor [Moh2003].

Fixed-spectrum photography limits our ability to detect or depict several kinds of visually useful spectral differences. In the common phenomenon of metamerism, the spectrum of the lighting used to view or photograph objects can cause materials with notably different reflectance spectra to have the same apparent color, because they evoke equal responses from the broad, fixed color primaries in our eyes or the camera. Metamers are commonly observed in fabric dyes, where two pieces of fabric might appear to have the same color under one light source, and a very different color under another.

Another problem with fixed color primaries is that they impose a hard limit on the gamut of colors that the device can capture or reproduce accurately. As demonstrated in the CIE 1931 chromaticity diagram, each set of fixed color primaries defines a convex hull of perceived colors within the space of all humanly perceptible colors. The device can reliably and accurately reproduce only the colors inside the convex hull defined by its color primaries. In most digital cameras, the Bayer grid of fixed, passive R, G, B filters overlaid on pixel detectors sets the color primaries. Current DMD projectors use broad-band light sources passed through a spinning wheel that holds similar passive R, G, B filters. These filters compromise between narrow spectra that provide a large color gamut, and broad spectra that provide the greatest on-screen brightness.

Metamers and contrast enhancement: Photographers often use yellow, orange, red, and green filters for various effects in black & white photography. For example, white clouds and blue sky are often rendered as roughly the same intensity in a black & white photograph. An orange or red filter placed in front of the lens makes the sky darker than the clouds, thus rendering them as a different shade on the black & white film. A red filter essentially attenuates the wavelength corresponding to blue and green colors in the scene, thus enabling the viewer to distinguish between the clouds and the sky in the resulting photograph. This is a classic case of effectively modifying the illumination to distinguish between metamers. In color photography, use of warming filters to enhance the contrast in a photograph is quite common.

Unfortunately, photographers can carry only a limited number of filters with them, and even these filters are often rather broad-band and useful only for very standard applications. We argue that a camera that allows arbitrary and instantaneous attenuation of specific wavelength ranges in a scene would give increased flexibility to the photographer. The camera could iteratively and quickly work out the best effective filter to achieve a metamer-free, high-contrast photograph for a given scene. Similarly, with an 'agile' light source guided by our camera, we might change the illumination spectrum enough to disrupt a metameric match, or interactively adjust and adapt the illuminant spectrum to maximize the contrasts of a scene, both for human viewing and for capture by a camera.
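The effect is easy to reproduce numerically (Python/NumPy; the sensor curves, reflectances, and illuminants below are made-up toy spectra, not measured data): two reflectance spectra that integrate to identical sensor responses under a flat illuminant are pulled apart by a narrow-band boost to the light.

    import numpy as np

    wavelengths = np.linspace(400, 700, 61)     # nm, 5 nm spacing

    def gaussian(center, width):
        return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

    # Toy broad-band camera sensitivities ("B", "G", "R") and a flat illuminant.
    sensors = np.stack([gaussian(c, 30) for c in (450, 550, 610)])
    flat_light = np.ones_like(wavelengths)

    def response(reflectance, light):
        # response_k = sum over wavelength of sensor_k * illuminant * reflectance
        return sensors @ (light * reflectance)

    # Build a metameric pair: take a narrow spectral spike and subtract the part the
    # sensors can see under flat light, leaving a 'metameric black' they cannot detect.
    visible = sensors * flat_light
    spike = gaussian(570, 10)
    black = spike - visible.T @ np.linalg.solve(visible @ visible.T, visible @ spike)
    fabric_a = 0.3 + 0.4 * gaussian(550, 80)
    fabric_b = fabric_a + 0.4 * black           # same RGB as fabric_a under the flat light

    narrow_light = flat_light + 2.0 * gaussian(570, 10)   # 'agile' light: boost one band
    for name, light in [("flat light  ", flat_light), ("narrow boost", narrow_light)]:
        print(name, np.round(response(fabric_a, light), 2),
              np.round(response(fabric_b, light), 2))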


3.6 Wider Field of View

A key wonder of human vision is its seemingly endless richness and detail; the more we look, the more we see. Most of us with normal or corrected-to-normal vision are almost never conscious of the angular extent or the spatial resolution limits of our eyes, nor are we overly concerned with where we stand as we look at something interesting, such as an ancient artifact behind glass in a display case.

Our visual impressions of our surroundings appear seamless, enveloping, and filled with unlimited detail apparent to us with just the faintest bit of attention. Even at night, when rod-dominated scotopic vision limits spatial resolution and the world looks dim and soft, we do not confuse a tree trunk with the distant grassy field beyond it. Like any optical system, our eye's imperfect lens and photoreceptor array offer little or no resolving ability beyond 60-100 cycles per degree, yet we infer that the edge of a knife blade is discontinuous: it is disjoint from its background and is not optically mixed with it on even the most minuscule scale. Of course we cannot see behind our heads, but we rarely have any sense of our limited field of view [FOOTNOTE 3], which stops abruptly outside a cone spanning about +/-80 degrees away from our direction of gaze [CITE: HPHP handbook of human performance]. This visual richness, and its tenuous connection to viewpoints, geometry, and sensed amounts of light, can make convincing hand-drawn depictions of 3D scenes more difficult to achieve, as 2D marks on a page can seem ambiguous and contradictory (for an intriguing survey, see [Durand02depict]).

By comparison, camera placement, resolution and framing are key governing attributes in many great film-like photographs. How might we achieve a more ‘free-form’ visual record computationally? How might we construct a photograph to better achieve the impression of unlimited, unbounded field of view, limitless visual richness revealed with little more effort than an intent gaze?

[FOOTNOTE 3] Try this to map out the limits of your own peripheral vision (some people have quite a bit more or less than others): gaze straight ahead at a fixed point in front of you, stretch out your arms behind you, wiggle your fingers continually, and without bending your elbows, slowly bring your hands forward until you sense movement in your peripheral vision. Map it out from all directions; is it mostly circular? Is it different for each eye? Is it shaped by your facial features (nose, eyebrows, cheekbones, eye shape)? Your glasses or contact lenses? Do you include these fixed features in your conscious assessment of your surroundings?


3.6.1 Panoramas via Image Stitching

We have no objective measurements that confirm silhouettes as having infinite sharpness, as true visual discontinuity, yet we seldom find our impressions of our surroundings lacking in subtlety and richness; we seek out mountaintops, ocean vistas, spectacular 'big-sky' sunsets, and dramatic weather effects in part because the more we look around, the more we see in these visually rich scenes. With close attention, we almost never exhaust our eyes' ability to discover interesting visual details, from the fine vein structure of a leaf, to the slow boiling formation of a thunderstorm, to the clouds in coffee, to the magnificently complex composition of the luxurious fur on a hare (see Figure {chap3hare}).


Figure {chap3hare}: "A Young Hare," Albrecht Dürer, watercolor, 1502. The depiction of extremely fine whiskers and wisps of fur suggests unbounded resolution.

[ankit's dissertation] Panoramas: A panorama is a composite picture created by stitching together multiple photos of overlapping but distinct parts of a scene. Capturing a panorama requires the user to point the camera manually at the interesting parts of the scene while ensuring adequate overlap among the captured photos. The stitching process works best for scenes far from the camera, which makes it especially useful for capturing landscapes. Panoramas are popular because (a) ultra-wide-angle lenses are expensive and usually not very good, and (b) the stitched composite has a much higher resolution than the digital sensor alone can provide. Additionally, the photographer can select exactly which parts of the scene are photographed and which are skipped. The resulting panorama might not have a regular shape, but it contains all the required "information". The main disadvantages of panoramas are that they require capturing multiple photos, and that stitching may not give perfect results and can produce visible seams.
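The stitching pipeline sketched above (find matching features in the overlap regions, estimate how the views align, then warp and blend) is available off the shelf. Below is a minimal sketch using OpenCV's high-level Stitcher API in Python; the input filenames are hypothetical placeholders, and real captures need generous overlap between neighboring shots.

# Minimal panorama-stitching sketch using OpenCV's high-level Stitcher API.
# Filenames are placeholders; overlapping views of a distant scene work best.
import cv2

paths = ["view_left.jpg", "view_center.jpg", "view_right.jpg"]  # hypothetical inputs
images = [cv2.imread(p) for p in paths]
if any(img is None for img in images):
    raise IOError("could not read one of the input photos")

# The Stitcher matches features in overlapping regions, estimates camera
# rotations, warps each photo onto a common surface, and blends the seams.
stitcher = cv2.Stitcher.create(cv2.Stitcher_PANORAMA)
status, panorama = stitcher.stitch(images)

if status == cv2.Stitcher_OK:
    cv2.imwrite("panorama.jpg", panorama)
else:
    # Typical failure: too little overlap or too little texture to match features.
    print("stitching failed with status", status)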

3.6.2 Extreme Zoom

As resolution and field of view increase together, images not only become very large, but also awkward to display and explore visually. (See the 2007 gigapixel-image system cited below, and other systems, which let the viewer browse with a continuously varied blend between spherical and planar projections.)

SIGGRAPH 2007: "Capturing and Viewing Gigapixel Images", Johannes Kopf (Universität Konstanz), Matt Uyttendaele (Microsoft Research), Oliver Deussen (Universität Konstanz), Michael Cohen (Microsoft Research).
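As a rough illustration of that browsing idea (and only an illustration, not the published system), the sketch below renders a view from an equirectangular panorama while blending, per pixel, between a planar (perspective) mapping and a spherical (angular) mapping. The panorama filename, field of view, and blend weight are assumed values.

# Sketch: view a wide-angle equirectangular panorama with a blend between
# planar (perspective) and spherical (angular) projections.
# 'pano.jpg', the field of view, and the blend weight are illustrative choices.
import numpy as np
import cv2

def render_view(pano, fov_deg=100.0, out_size=(600, 800), blend=0.5):
    """blend = 0 gives a rectilinear (planar) view; blend = 1 gives a spherical view."""
    h_out, w_out = out_size
    half_fov = np.radians(fov_deg) / 2.0
    x = np.linspace(-1, 1, w_out)                       # normalized output coordinates
    y = np.linspace(-1, 1, h_out) * (h_out / w_out)
    xv, yv = np.meshgrid(x, y)

    # Planar mapping: pixel position proportional to tan(angle).
    theta_planar = np.arctan(xv * np.tan(half_fov))
    phi_planar = np.arctan(yv * np.tan(half_fov))
    # Spherical mapping: pixel position proportional to the angle itself.
    theta_sph = xv * half_fov
    phi_sph = yv * half_fov

    # Continuously varied blend between the two projections.
    theta = (1 - blend) * theta_planar + blend * theta_sph
    phi = (1 - blend) * phi_planar + blend * phi_sph

    # Sample the equirectangular source, centered on the panorama's middle.
    h_src, w_src = pano.shape[:2]
    u = (theta / (2 * np.pi) + 0.5) * (w_src - 1)
    v = (phi / np.pi + 0.5) * (h_src - 1)
    return cv2.remap(pano, u.astype(np.float32), v.astype(np.float32),
                     interpolation=cv2.INTER_LINEAR)

pano = cv2.imread("pano.jpg")          # hypothetical equirectangular panorama
view = render_view(pano, blend=0.3)    # partway between planar and spherical
cv2.imwrite("blended_view.jpg", view)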

3.7 Higher Frame Rate

SIGGRAPH 2007: "Factored Time-Lapse Video"

Kalyan Sunkavalli, Wojciech Matusik, Hanspeter Pfister (Mitsubishi Electric Research Laboratories (MERL)), Szymon Rusinkiewicz (Princeton University)

"Computational Time-Lapse Video", Eric P. Bennett, Leonard McMillan (University of North Carolina at Chapel Hill)

3.8 Improving Resolution

Larger lens, larger sensor: early digital cameras; Wolfgang Heidrich's camera using a scanner (we borrowed the idea); the GigaPixl Project.

SIGGRAPH 2007: "Image Deblurring with Blurred/Noisy Image Pairs"

Lu Yuan (The Hong Kong University of Science and Technology), Jian Sun (Microsoft Research Asia), Long Quan (The Hong Kong University of Science and Technology), Heung-Yeung Shum (Microsoft Research Asia)


3.8.1 Mosaics from a Moving Camera

'Pushbroom' camera; Rademacher, 199?.

2006 SIGGRAPH: "Photo Tourism": http://phototour.cs.washington.edu/

AutoCollage Carsten Rother, Lucas Bordeaux, Youssef Hamadi, Andrew Blake (Microsoft Research Cambridge)

Photographing Long Scenes With Multi-Viewpoint Panoramas Aseem Agarwala (University of Washington), Maneesh Agrawala (University of California, Berkeley), Michael F. Cohen, Richard Szeliski (Microsoft Research), David Salesin (University of Washington and Adobe Systems Incorporated)

3.8.2 Super-resolution: Interleaved samples

Super-resolution refers to techniques that improve image resolution by combining multiple low-resolution images [134]. In sensor-shift-based super-resolution, the camera is simply placed on a tripod and captures multiple photos while the image sensor moves by sub-pixel distances. These photos are then fused into a single high-resolution photo. Unfortunately, only a few high-end cameras use this technique, due to the precision required in the image-sensor displacement. A simpler super-resolution approach moves the entire camera by small, randomly jittered amounts while capturing multiple photos. First the relative motion between the camera and the scene is estimated and all images are registered to a reference frame; the images are then fused to obtain a single high-resolution image. While this is simpler to achieve, the results suffer from inaccuracies in shift estimation and from irregular sampling of camera positions. One problem with both panoramas and super-resolution is that they require the capture of multiple photos, which assumes the scene is static or changing very slowly.
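A minimal sketch of the jittered multi-shot approach follows: estimate each frame's sub-pixel shift against a reference with phase correlation, then accumulate the shifted frames onto a finer grid and average. The burst filenames, the 2x scale factor, and the simple accumulate-and-average fusion are illustrative assumptions; practical systems use more careful registration and reconstruction.

# Sketch: multi-frame super-resolution from a jittered burst of photos.
# Assumes small, purely translational shifts between hand-held shots.
import numpy as np
import cv2

SCALE = 2                                              # high-res grid upsampling factor
paths = ["shot_%02d.png" % i for i in range(8)]        # hypothetical burst of photos
frames = [cv2.imread(p, cv2.IMREAD_GRAYSCALE).astype(np.float32) for p in paths]

ref = frames[0]
h, w = ref.shape
accum = np.zeros((h * SCALE, w * SCALE), np.float32)   # running sum on the fine grid
weight = np.zeros_like(accum)                          # samples landing on each fine pixel

for f in frames:
    # Sub-pixel translation of this frame relative to the reference.
    # (The sign convention of the returned shift may need flipping on some versions.)
    (dx, dy), _ = cv2.phaseCorrelate(ref, f)
    # Scale up and undo the shift so all frames land on a common high-res grid.
    M = np.float32([[SCALE, 0, -dx * SCALE],
                    [0, SCALE, -dy * SCALE]])
    accum += cv2.warpAffine(f, M, (w * SCALE, h * SCALE), flags=cv2.INTER_LINEAR)
    weight += cv2.warpAffine(np.ones_like(f), M, (w * SCALE, h * SCALE),
                             flags=cv2.INTER_LINEAR)

# Average the accumulated samples; fine pixels hit by more frames are better constrained.
super_res = accum / np.maximum(weight, 1e-6)
cv2.imwrite("super_resolution.png", np.clip(super_res, 0, 255).astype(np.uint8))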


3.8.3 Apposition Eyes

Shifting from the film-like photography goal of copying an image formed by a lens to the broader topic of 'capturing visual experience' suggests that we should look for broader sets of solutions for gathering visual information, solutions included in the latter that are missing from the former. Just as biological vision systems include many different designs for different purposes, my co-author and I believe that computational photography devices might achieve similarly broad diversity and novelty in design.

In particular, advances in low-cost, high-precision fabrication of lenticular screens, extremely fine-grained Fresnel lenses, fused fiber-optic light guides, lens and sensor arrays, and miniature integrated systems that combine optics, actuators, and sensors are beginning to enable new kinds of experimental imaging systems inspired by the compound eyes found in insects and crustaceans.

For example, the TOMBO project explored very thin, surface-like sensors; see also the mirror-lens project (RAMESH has the citation, from his SIGGRAPH slides).

For example, Duparré et al. [duparre05] explored a very thin multi-layer lenslet-array system that achieves good resolution with a very wide field of view by combining the Gabor superlens idea with ideas from the apposition eyes of insects.

Fabrication techniques have also improved enough to allow limited exploration of materials with features smaller than the wavelength of light. For example (SCIENCE NEWS ARTICLE ON BUTTERFLY WINGS), the structural coloration of butterfly wings and the fabrication of optical elements with sub-wavelength features ("optical crystals", "super-lattice materials", etc.) suggest that future imaging optics need not be limited to conventional lenses.

More broadly, WHY? Because multiple 'telescopes' give redundancy in case of damage, and lenses with modest overlap hint at motion capture; we'll use this to introduce the TOMBO project and the Levoy project described above. http://www.technologyreview.com/Infotech/18772/?a=f

ALSO: Gabor superlens. Dennis Gabor postulated that pairs of lens arrays with slightly different-sized lenses could be combined to emulate the light transport of one large lens.


(curious source: book preview found on google:http://books.google.com/books?id=i6eQphVAoMsC&pg=PA34&lpg=PA34&dq=gabor+superlens&source=web&ots=Fu7M-MWpLn&sig=aWT94f3IdEbF3LFca2mdZkl8S3g&hl=en&sa=X&oi=book_result&resnum=8&ct=result#PPA7,M1)

------------------

Biological organs for visual sensing appeared in the fossil record extremely early, just as the Cambrian era began about 540 million years ago [Land02], but even then showed much of the tremendous variety found in animals today. Single-lens "simple" eyes are the predominant design for vertebrates, as they offer higher spatial resolution for a given volume and are usually steerable to aid in predation and predator avoidance.

The image perceived by the arthropod is a combination of inputs from the numerous ommatidia, which are oriented to point in slightly different directions. Compared with single-aperture eyes, compound eyes have poor image resolution; however, they possess a very large view angle and the ability to detect fast movement and, in some cases, the polarization of light [Volkel2003].
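A rough way to see why apposition compound eyes resolve so poorly, following the simplified model in [Land02]: the interommatidial angle is roughly the facet diameter divided by the eye radius, while diffraction limits each facet's acceptance angle to roughly the wavelength divided by the facet diameter. Balancing the two gives the back-of-envelope calculation below; the wavelength and target resolutions are illustrative numbers, not measurements.

# Back-of-envelope: how large must an apposition compound eye be to reach a
# desired angular resolution? Simplified model (after [Land02]):
#   interommatidial angle   dphi ~ D / R     (facet diameter D, eye radius R)
#   diffraction acceptance  dphi ~ lam / D   (wavelength lam)
# Equating the two gives D ~ lam / dphi and R ~ lam / dphi**2.
import math

lam = 500e-9                              # green light, metres (illustrative)
for arcmin in (60.0, 10.0, 1.0):          # desired resolution in arc-minutes
    dphi = math.radians(arcmin / 60.0)
    D = lam / dphi                        # facet diameter needed
    R = lam / dphi ** 2                   # eye radius needed
    print(f"{arcmin:5.1f} arcmin -> facet {D * 1e6:7.1f} um, eye radius {R * 100:7.2f} cm")

# R grows quadratically as the desired resolution sharpens, which is why fine
# resolution favors single-lens 'simple' eyes over compound designs.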

The design differences and wide variety of arthropod eyes suggest that several eye designs (including the huge simple eyes of squids, where the lens is formed from specialized skin cells) evolved independently multiple times [Land02].

Eye designs range from "simple" eyes with a single large lens to "compound" eyes with arrays of individual lenses.

While most vertebrate eyes follow roughly the same "simple eye" design, a steerable ball containing a single lens that focuses an image on neural tissue lining its inside back wall, non-vertebrates appear to have evolved simple eyes independently as well, such as the basketball-sized eyeball of the giant squid, used to see in the near-total darkness of extreme sea depths.

Neither lenses nor image formation are required for useful visual sensing. The eye spots of larval trematode worms consist of individual receptor cells, each embedded in a pigment cell that blocks light arriving from some directions; their aggregate signal can then indicate light intensity and direction well enough to let the worm navigate and avoid predators. Some mollusks exhibit what appear to be close relatives of the earliest simple eyes: they sense light direction and changing shadows, and avoid threats visually, with receptor-lined pigmented cups that pepper the outer edge of their retractable mantles as they peek out of nearly-closed shells. Honeybees and dragonflies, by contrast, use a wide variety of compound eyes built from vast arrays of ommatidia, with spectral sensitivity that extends into the ultraviolet.

While the film-like camera in some ways resembles the "simple" eye of vertebrates, the compound eye found in many arthropods (the phylum of animals with exoskeletons and segmented bodies, such as insects, spiders (arachnids), and crustaceans) is a different visual organ altogether: many tiny lenses instead of one huge "main" lens. The range of biological designs is remarkable, from neuron-lined hollows in the mantles of clams that sense light direction, to a honeybee's giant hemispheres of ommatidia (tiny light-sensing units, each with its own lens), to the 12-inch eyeballs of the giant squid that gather images in near-total darkness.

Chapter 3 Bibliography Entries

[Benosman2001] "Panoramic Vision: Sensors, Theory, and Applications", Monographs in Computer Science, Benosman, Ryad; Kang, Sing B. (Eds.), 2001, XXIV, 449 p., 300 illus., Hardcover, ISBN: 978-0-387-95111-9.

[Bregler98] Christoph Bregler, Jitendra Malik, "Tracking People with Twists and Exponential Maps", in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1998. http://www.cs.berkeley.edu/~bregler/bregler_malik_cvpr98.ps.gz

[Brown07-Autostitch] Matthew Brown, David Lowe: Automatic Panoramic Image Stitching using Invariant Features. In: International Journal of Computer Vision, Volume 74, Issue 1, August 2007.


[Choud03] Prasun Choudhury, Jack Tumblin, "The Trilateral Filter for High Contrast Images and Meshes", in Proceedings of the 14th Eurographics Workshop on Rendering, Leuven, Belgium, pp. 186-196, 2003. ISSN 1727-3463, ISBN 3-905673-03-7.

[Durand02] Durand, F. An invitation to discuss computer depiction. In Proc. 2nd Intern. Symposium on Non-Photorealistic Animation and Rendering (June 2002).

[duparre05] Jacques Duparré, Peter Schreiber, André Matthes, Ekaterina Pshenay-Severin, Andreas Bräuer, Andreas Tünnermann, Reinhard Völkel, Martin Eisner, and Toralf Scharf, "Microoptical Telescope Compound Eye", Optics Express (Optical Society of America), Vol. 13, Issue 3, pp. 889-903, February 7, 2005. http://www.opticsinfobase.org/abstract.cfm?URI=oe-13-3-889

[Kanade97] Takeo Kanade, Peter Rander, P. J. Narayanan, "Virtualized Reality: Constructing Virtual Worlds from Real Scenes", IEEE MultiMedia, Vol. 4, No. 1, pp. 34-47, January 1997. ISSN 1070-986X, IEEE Computer Society Press, Los Alamitos, CA, USA.

[Land02] Michael F. Land, Dan-Eric Nilsson, "Animal Eyes", Oxford University Press, 240 pp., ISBN-13: 978-0198509684.


[Levoy04] Marc Levoy, Billy Chen, Vaibhav Vaish, Mark Horowitz, Ian McDowall, Mark T. Bolas: Synthetic aperture confocal imaging. ACM Trans. Graph. 23(3): (SIGGRAPH 2004) pp 825-834 (2004)

[Levoy07glare] Eino-Ville Talvala, Andrew Adams, Mark Horowitz, Marc Levoy, "Veiling Glare in High Dynamic Range Imaging", ACM SIGGRAPH 2007, San Diego, California, 2007.

[logLUV98] Gregory Ward Larson, "LogLuv Encoding for Full-Gamut, High-Dynamic Range Images", Journal of Graphics Tools, Vol. 3, No. 1, pp. 15-31, 1998. citeseer.ist.psu.edu/larson98logluv.html

[Matusik04] Wojciech Matusik, Hanspeter Pfister, "3D TV: A Scalable System for Real-Time Acquisition, Transmission, and Autostereoscopic Display of Dynamic Scenes", ACM Transactions on Graphics, 23(3) (SIGGRAPH 2004), 2004.

[McCann07] J. J. McCann and A. Rizzi, (2007) “Camera and visual veiling glare in HDR images”, J. Soc. Information Display, vol. 15(9), 721-730, 2007

[Moh2003] Joan Moh, Hon Mun Low, and Greg Wientjes, "Characterization of the Nikon D-70 Digital Camera", http://scien.stanford.edu/class/psych221/projects/05/joanmoh, 2003.

[NayarStereoBowlLens2006] S. Kuthirummal, S. K. Nayar, "Multiview Radial Catadioptric Imaging for Scene Capture", ACM Transactions on Graphics (also Proc. of ACM SIGGRAPH), July 2006.

[NayarCata97] S. K. Nayar, "Catadioptric Omnidirectional Camera", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 482-488, June 1997.

[assortedPixels2002] S. K. Nayar, S. G. Narasimhan, "Assorted Pixels: Multi-Sampled Imaging With Structural Models", European Conference on Computer Vision (ECCV), Vol. IV, pp. 636-652, May 2002.

[Sand2004] Peter Sand, Seth Teller, "Video Matching", ACM Transactions on Graphics 23(3) (SIGGRAPH 2004), pp. 592-599, 2004.

[VisionResearch08] Phantom 12 Camera, see http://www.visionresearch.com/

[Volkel2003] Völkel, R., Eisner, M., and Weible, K. J., "Miniaturized Imaging Systems", Microelectronic Engineering 67-68(1), pp. 461-472, June 2003. DOI: http://dx.doi.org/10.1016/S0167-9317(03)00102-3

[Yang02] Jason C. Yang, Matthew Everett, Chris Buehler, Leonard McMillan, "A Real-Time Distributed Light Field Camera", in Proceedings of the 13th Eurographics Workshop on Rendering (EGRW '02), Pisa, Italy, pp. 77-86, 2002. ISBN 1-58113-534-3. http://groups.csail.mit.edu/graphics/pubs/egrw2002_rtdlfc.pdf

[Ward94rgbe] Ward, Greg, Real Pixels, Graphics Gems II, p. 80-83. edited by James Arvo, Morgan-Kaufmann 1994

[Ward98radiance] Greg Ward Larson and Rob Shakespeare, Rendering with Radiance, Morgan Kaufmann, 1998. ISBN 1-55860-499-5

[Wilburn05] High Performance Imaging Using Large Camera Arrays Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Eino-Ville (Eddy) Talvala, Emilio Antunez, Adam Barth, Andrew Adams, Marc Levoy, Mark Horowitz Proc. SIGGRAPH 2005. Project page : http://graphics.stanford.edu/papers/CameraArray/

[Ward99] “Rendering with Radiance: The Art and Science of Lighting Visualization” By Greg Ward Larson and Rob A. Shakespeare,Copyright © 1998 by Morgan Kaufmann Publishers

[Zhang04] Zhang, C., and Chen, T., "A Self-Reconfigurable Camera Array", in Proceedings, Eurographics Symposium on Rendering, pp. 243-254, 2004.

CHAPTER 3 Beautiful Gobbidge


3.1.1.1 Measuring Rays of Light:

BLUE TEXT: IGNORE (THESE ARE JACK'S WRITING NOTES)

The previous chapter uses 4-D sets of rays to describe the geometry of image formation, and diagrams such as Figure XX seem to suggest that pixels measure the amount of light arriving along each ray. While intuitive, this notion is not quite correct: a single 'ray' of light passing through the lens carries only an infinitesimal amount of light, described in doubly-differential units; it has no area and no angular extent, and thus conveys no measurable energy from the scene to the camera. Instead, each pixel value measures the light flux gathered over the pixel's finite area, over the solid angle subtended by the lens aperture, and over the exposure time.
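Stated a little more formally, a pixel value is better modeled as an integral of radiance over these finite extents than as a single ray sample. A minimal statement of that model (ignoring color filtering and sensor response nonlinearity) is:

% Pixel value as an integral of scene radiance rather than a single-ray sample:
% L(x, w, t) is the radiance arriving at sensor position x from direction w at
% time t; A_p is the pixel's area, Omega the solid angle subtended by the lens
% aperture, T the exposure interval, and theta the angle to the sensor normal.
\[
  I_{\text{pixel}} \;\propto\;
  \int_{T} \int_{A_p} \int_{\Omega}
  L(\mathbf{x}, \boldsymbol{\omega}, t)\, \cos\theta \; d\boldsymbol{\omega}\, d\mathbf{x}\, dt
\]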

3.1.2 Noise in Images:

TOM's REWRITE: Electronic "noise" at some level is unavoidable; the transmission or reception of any continuous or analog signal entails it. A radio signal, for example, is the oscillating wave carrying audio data broadcast from a transmitter and received by an antenna. In a digital camera, the electronic signal results from the rays of light admitted by the lens and focused on the sensor, which in turn conveys it to the analog-to-digital converter. Though it cannot be zero, noise can be reduced to a level minimal relative to the signal. This relation is expressed as the "signal-to-noise ratio" (SNR). In a system with a high SNR, noise can be scarcely detectable; the lower the ratio, the more apparent the noise.

There are typically three kinds of noise associated with digital cameras: random noise, fixed-pattern noise, and banding noise. The amount of noise varies with the size of the sensor, the ISO setting of the exposure, and of course the make and model of the camera. Noise also varies within a single image: shadow areas are noisier than highlight areas.

3.1.2 Noise in Images: HIGHLIGHTED TEXT SNARFED FROM http://www.cambridgeincolour.com/tutorials/noise.htm

Some degree of noise is always present in any electronic device that transmits or receives a "signal" in continuous or analog form. For televisions this signal is the broadcast data transmitted over cable or received at the antenna; for digital cameras, the signal is at first the light which hits the camera sensor, then the charges and currents caused by that light, and then the voltages read out from the sensor chip on their way to the analog-to-digital converter. Even though noise is unavoidable, it can become so small relative to the signal that it appears to be nonexistent. The signal-to-noise ratio (SNR) is a useful and universal way of comparing the relative amounts of signal and noise for any electronic system; high ratios show very little visible noise, whereas the opposite is true for low ratios. The sequence of images below shows a camera producing a very noisy picture of the word "signal" against a smooth background; the resulting image is shown along with an enlarged 3-D representation depicting the signal above the background noise.

Digital cameras produce several common types of noise: random noise due to photon arrivals, thermal noise due to the nonzero temperature of the sensor, "fixed pattern" noise, and banding noise. The qualitative examples below show pronounced, isolated cases of each type of noise against an ordinarily smooth grey background.

Noise not only changes depending on exposure setting and camera model, but it can also vary within an individual image. For digital cameras, darker regions will contain more noise than the brighter regions; with film the inverse is true.

[SEE FIGURES IN WEBSITE] Note how noise becomes less pronounced as the tones become brighter. Brighter regions have a stronger signal due to more light, resulting in a higher overall SNR. This means that underexposed images will have more visible noise, even if you brighten them up to a more natural level afterwards. On the other hand, overexposed images will have less noise and can actually be advantageous, assuming that you can darken them later and that no region has become solid white where there should be texture (see "Understanding Histograms, Part 1").
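The claim that brighter regions show higher SNR follows directly from photon (shot) noise, whose standard deviation grows only as the square root of the collected signal. The short simulation below illustrates the scaling; the exposure levels and pixel count are arbitrary illustrative choices.

# Illustration: photon shot noise gives brighter regions a higher SNR,
# because the noise grows as sqrt(signal) while the signal grows linearly.
import numpy as np

rng = np.random.default_rng(0)
pixels = 100_000                                  # simulated pixels per patch (arbitrary)

for mean_photons in (10, 100, 1_000, 10_000):     # dark -> bright patches
    # Photon arrivals are Poisson-distributed about the mean exposure level.
    samples = rng.poisson(mean_photons, size=pixels).astype(np.float64)
    snr = samples.mean() / samples.std()          # roughly sqrt(mean_photons)
    print(f"mean {mean_photons:6d} photons: SNR = {snr:7.1f} ({20 * np.log10(snr):5.1f} dB)")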

Resolution (sensor manufacturing limits, sensor size vs. noise): http://www.cambridgeincolour.com/tutorials/camera-lenses.htm

Lenses (resolution and aberrations vs. cost; size vs. aberrations, radial distortion)

http://photo.net/making-photographs/lens