Project 5: High Dynamic Range

Due Date: Friday, April 8th, 11:59pm

Brief

Background

Modern cameras are unable to capture the full dynamic range of commonly encountered real-world scenes. In such scenes, even the best possible photograph will be partially under- or over-exposed. Researchers and photographers commonly get around this limitation by combining information from multiple exposures of the same scene. There are few consumer-friendly HDR pipelines, though. Some cameras can be configured to exposure bracket a scene, but cameras aren't smart enough to automatically combine the resulting exposures. Luckily, you have a computational camera which can quickly and intelligently capture and combine multiple exposures. In this assignment, you will use FCam to take near-total control of the exposure settings.

Requirements

There are two pipelines for this assignment. Each group is required to implement both. You should read the paper that inspired each pipeline. Even though we discussed them in class, there are additional relevant details in the papers.

1) In the spirit of Debevec and Malik 1997, you are required to combine multiple exposures into a high dynamic range radiance map and then use a global tone-mapping operator to create a low dynamic range visualization of your HDR radiance map. Luckily, our Nokia N900 and FCam allow you to capture "raw" images in which pixel values are (nearly) linearly proportional to exposure. This makes it unnecessary to compute g, the inverse of the function mapping exposure to pixel value, and makes the computation of the HDR radiance map easier, though not entirely trivial. You will still want to consult equations 5 and 6 in the Debevec paper to know which pixels to trust in each exposure. Once you have the radiance at each pixel, you need to tone map it to an appropriate range for display. Consider using a global tone-mapping operator, such as Reinhard's, or implement a local one for extra credit.

2) In the spirit of Exposure Fusion (Mertens et al.), you will fuse multiple exposures into a single, detailed composite without ever explicitly computing an HDR radiance map. This approach doesn't care about the exposure times. It doesn't even care if the flash fired. It simply tries to composite the high-contrast, well-exposed pieces of the various exposures. The goal is to decide, using some simple heuristics, which pixels in each exposure are trustworthy, and to create a weighted composite of the exposures according to that trustworthiness. It's one of those things that seems to work poorly in theory, but well in practice. The method is, however, sensitive to the choice of input photos (whereas pipeline 1 is not, assuming every region appears well-exposed in at least one input).

Details

Additional Phone Setup

Before doing anything else you will need to add libraries to your phone. To do this, open your phone's xterm and run the following:

    sudo gainroot
    apt-get install libcv4 libcvaux4 libhighgui4

First Pipeline

[Figure: an HDR radiance map (left) and the tone-mapped result (right)]

We want to build an HDR radiance map from several LDR exposures. The observed pixel value Z_ij for pixel i and exposure j is a function of the unknown scene radiance and the known exposure duration: Z_ij = f(E_i Δt_j). Note that E_i is the scene radiance at pixel i, and scene radiance integrated over some time, E_i Δt_j, is the exposure at a given pixel. In general, f might be a somewhat complicated pixel response curve. Luckily, we can capture raw images and start by assuming that f is an identity function and leave it out.

Rearranging this equation and taking the natural log of each side, we get ln(E_i) = ln(Z_ij) − ln(Δt_j). This is a simplified version of Equation 5 in Debevec.

Each exposure only gives us trustworthy information about certain pixels (i.e., the well-exposed pixels in that image). For dark pixels the relative contribution of noise is high, and for bright pixels the sensor may have been saturated. To make our estimates of E_i more accurate, we need to weight the contribution of each pixel according to Equation 6 in Debevec. An example of a weighting function w is a triangle function that peaks at Z = 127.5 and is zero at Z = 0 and Z = 255.
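
Below is a minimal sketch of this weighted merge (illustrative Python/NumPy only; your actual implementation will run in C++ on the phone, and the names images, times, weight, and radiance_map are placeholders). It assumes the raw values have been scaled to [0, 255] so the triangle weight above applies directly.

    import numpy as np

    def weight(z):
        # Triangle weighting function: peaks at Z = 127.5, zero at 0 and 255
        return 127.5 - np.abs(z - 127.5)

    def radiance_map(images, times, eps=1e-6):
        # images: list of linear raw frames as float arrays scaled to [0, 255]
        # times:  matching exposure durations (Δt_j) in seconds
        num = np.zeros(images[0].shape)
        den = np.zeros(images[0].shape)
        for z, t in zip(images, times):
            w = weight(z)
            # Weighted average of ln(E_i) = ln(Z_ij) - ln(Δt_j), per Debevec Eq. 6
            num += w * (np.log(z + eps) - np.log(t))
            den += w
        return np.exp(num / np.maximum(den, eps))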

Getting the radiance map is only half the battle. You want to be able to show off your image clearly. There are a few global tone-mapping operators to play with, such as log(L), sqrt(L), and L / (1 + L). Regardless of which transform you use, you'll want to stretch the intensity values in the resulting image to fill the [0, 255] range for maximum contrast.
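
As a sketch, any of these operators plus the contrast stretch takes only a few lines (again illustrative Python; E is the hypothetical radiance map from the previous sketch, and any of the operators above can be swapped in for op):

    import numpy as np

    def tone_map(E, op=lambda L: L / (1.0 + L)):
        # Apply a global tone-mapping operator...
        L = op(E)
        # ...then stretch the result to fill [0, 255] for display
        L = (L - L.min()) / (L.max() - L.min())
        return (255.0 * L).astype(np.uint8)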

Second Pipeline

In the Exposure Fusion pipeline, there are two primary algorithmic challenges: 1) constructing weight maps which indicate the relative contribution of pixels in each exposure to the final composite and 2) blending the individual exposures in the gradient domain according to these weight maps.

The Exposure Fusion paper suggests building weight maps from three measures of pixel quality in each input image; each measure can be thought of as producing its own weight map.

The first property, contrast, is measured by filtering the intensities of a given image with the Laplacian filter and taking the absolute value of the response (where will this assign the highest weights?). The next pixel quality measure captures the saturation of pixels in an image by computing the standard deviation of the color channels at each pixel. The final measure tries to capture the degree to which an input image is well exposed. This measure can be implemented by taking, at each pixel, the product over color channels of a Gaussian falloff from 0.5 applied to the channel intensity, i.e. exp(-(p_i - 0.5)^2 / (2σ^2)), where σ = 0.2.

The final weights are constructed by taking the product of the weight maps specified by the quality measures. Within the product, the weights from a given quality measure are raised to the power of a constant to adjust their relative importance. In the example above, all three constants were one. The combined weight maps for all exposures should be normalized such that they sum to one at every pixel location.
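
The sketch below computes the three measures and their normalized product (illustrative Python only; imgs, the exponent names, and fusion_weights are placeholders, and SciPy's laplace stands in for whatever Laplacian filter your C++ code uses):

    import numpy as np
    from scipy.ndimage import laplace

    def fusion_weights(imgs, wc=1.0, ws=1.0, we=1.0, sigma=0.2, eps=1e-12):
        # imgs: (N, H, W, 3) float array of N exposures, values in [0, 1]
        gray = imgs.mean(axis=-1)
        # Contrast: absolute value of the Laplacian response on grayscale
        contrast = np.abs(np.stack([laplace(g) for g in gray]))
        # Saturation: standard deviation across the color channels
        saturation = imgs.std(axis=-1)
        # Well-exposedness: per-channel Gaussian falloff from 0.5, multiplied
        exposedness = np.exp(-(imgs - 0.5) ** 2 / (2 * sigma ** 2)).prod(axis=-1)
        # Exponents adjust each measure's importance; then normalize per pixel
        W = contrast ** wc * saturation ** ws * exposedness ** we + eps
        return W / W.sum(axis=0)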

At this point, one could produce a final composite by summing the input exposures multiplied by their weight maps. However, this approach works poorly (see below). What kinds of problems are we seeing, and why? We can create a better composite by fusing the exposures in the gradient domain, as in project 2, except that the suggested machinery here is Laplacian pyramid blending rather than Poisson blending, because we have real-valued weights instead of binary weights. See section 3.1 in the Exposure Fusion paper for more details.
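
The libraries you installed above are OpenCV, whose pyrDown/pyrUp handle the pyramid resampling. A minimal sketch of the multiresolution blend follows, in Python for clarity (imgs and W are the hypothetical arrays from the previous sketch; gauss_pyr, lap_pyr, and fuse are placeholder names): each exposure's Laplacian pyramid is weighted by the Gaussian pyramid of its weight map, the weighted pyramids are summed, and the result is collapsed.

    import cv2
    import numpy as np

    def gauss_pyr(img, levels):
        pyr = [img]
        for _ in range(levels - 1):
            pyr.append(cv2.pyrDown(pyr[-1]))
        return pyr

    def lap_pyr(img, levels):
        gp = gauss_pyr(img, levels)
        # Each level is a Gaussian level minus the upsampled next-coarser
        # level; the coarsest Gaussian level is kept as-is
        lp = [gp[i] - cv2.pyrUp(gp[i + 1], dstsize=gp[i].shape[1::-1])
              for i in range(levels - 1)]
        lp.append(gp[-1])
        return lp

    def fuse(imgs, W, levels=6):
        blend = None
        for img, w in zip(imgs, W):
            lp = lap_pyr(np.float32(img), levels)
            gp = gauss_pyr(np.float32(w), levels)
            # Weight each Laplacian level by the matching weight-map level
            terms = [l * g[..., None] for l, g in zip(lp, gp)]
            blend = terms if blend is None else [b + t for b, t in zip(blend, terms)]
        # Collapse the fused pyramid from coarse to fine
        out = blend[-1]
        for lev in blend[-2::-1]:
            out = cv2.pyrUp(out, dstsize=lev.shape[1::-1]) + lev
        return np.clip(out, 0.0, 1.0)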

Implementation Tips

Extra Credit

For all extra credit, be sure to demonstrate on your web page cases where your extra credit has improved image quality.

Graduate Credit

Groups contain mixtures of graduate and undergraduate students, so there are no distinct graduate requirements.

Write up

Each group will produce a single handin. Make sure you say who is in your group. Feel free to name your group as well. Only one group member needs to run the handin script.

Describe your algorithm and any decisions you made to write your algorithm a particular way. Show and compare the results of your two algorithms. Also discuss any extra credit you did. Feel free to add any other information you feel is relevant.

Because you are building a (hopefully) interactive system, it would be compelling to show videos of actual usage in different scenes.

Handing in

This is very important: you will lose points if you do not follow instructions. Every time after the first that you do not follow instructions, you will lose 5 points. The folder you hand in must contain the following:

Then run: cs129_handin proj5
If it is not in your path, you can run it directly: /course/cs129/bin/cs129_handin proj5

Rubric

Credits

Project partially based on Noah Snavely's Computer Vision course at Cornell University. Handout written by David Dufresne, Travis Webb, and James Hays.