Lab: Fourier decomposition

Learning Objectives

  1. To gain practical experience of real-time webcam processing in MATLAB.
  2. To become familiar with the practical aspects of Fourier decomposition and editing in the Fourier domain in MATLAB.
  3. To understand and experience the ill-posed problem of deconvolution.

Prerequisites

  1. Basic knowledge of programming in MATLAB.
  2. Understanding of convolution.

Webcams and MATLAB

For this lab, we're going to try some real-time image processing, similar to the demos that James has shown in lecture. To begin, we need to install an additional component in MATLAB: the MATLAB Support Package for USB Webcams. Please follow the instructions here to install it.

Fig 1. Installing webcam support for MATLAB

With that installed, we can now query the connected webcams using webcamlist, and capture image frames from any connected cameras:

% Initialize webcam name in position 1 of the cell array returned by 'webcamlist'
cam = webcam(1);

% Set webcam parameters
% Feel free to query the 'cam' object, e.g., cam.AvailableResolutions
cam.Resolution = '320x240';

% Capture a frame
im = snapshot(cam);

% Show it!
imshow(im);

clear cam;

We can also see a live stream using preview(cam). Try it! Now, to process the images coming from the camera, we need to call snapshot within a loop. Be prepared to press CTRL-C in the Command Window to stop the loop, as we have not programmed a keyboard interrupt to quit it. Using a while loop:

clear cam;
cam = webcam(1);
cam.Resolution = '320x240';

while true
  % Capture a frame
  im = snapshot(cam);

  % Show it!
  imshow(im);
end
Fig 2. Our processing loop.

Now, imshow() is a little slow because it sets up the window properties and image range rescaling every time it is called. Instead, we're going to update just the image data in the figure, using the handle h returned by an initial imshow() call:

clear cam;
cam = webcam(1);
cam.Resolution = '320x240';

% Create figure handle
h = imshow( snapshot(cam) );

while true
  % Capture a frame
  im = snapshot(cam);

  % Update the figure data
  set( h, 'CData', im );
end

Great! Now we're ready to perform some real-time image processing. For fun, spend a few minutes devising a way to fake a long exposure...

Fig 3. James' 'long exposure'. I wrote 100 frames into a constantly-updating cyclic buffer, then took the mean over time.
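If you'd like a starting point, here is one possible sketch of the cyclic-buffer idea described in the caption; the buffer length N = 100 and the use of im2double are our assumptions, not the only reasonable choices:

clear cam;
cam = webcam(1);
cam.Resolution = '320x240';

% Cyclic buffer holding the last N frames (N = 100 is an assumption)
N = 100;
first = im2double( snapshot(cam) );
buf = repmat( first, [1 1 1 N] );   % H x W x 3 x N
idx = 1;

h = imshow( first );

while true
  % Overwrite the oldest frame and advance the write position cyclically
  buf(:,:,:,idx) = im2double( snapshot(cam) );
  idx = mod( idx, N ) + 1;

  % Display the temporal mean of the buffer -- our fake 'long exposure'
  set( h, 'CData', mean( buf, 4 ) );
  drawnow;
end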

Convolution

Before we begin to look at Fourier decompositions, let's apply a convolution using the conv2 function and measure the performance of our system as the kernel size increases.

Tasks:

  1. Let's just work in the intensity domain today. Wrap the output of snapshot() in rgb2gray() before calling imshow(), so that our camera image is grayscale.
  2. Using tic and toc, estimate the frame rate of the system—the number of frames processed per second. Write this value out to the Command Window using disp() or equivalent. Remember that et = toc; will write the elapsed time in seconds to a variable.
  3. Add a convolution with a small 5x5 kernel to your live loop. Use conv2(). Pick five kernel sizes, and estimate the performance hit as the kernel size increases. Go big! Modern PCs are very fast, and we needed kernel sizes in the hundreds to see an impact. Keep the numbers handy. For more precise measurement in the real-time processing loop, take a running average of the frame rate over a set of frames. A minimal timing sketch follows this list.
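As referenced in the tasks above, here is a minimal timing sketch under our assumptions (a box kernel and a cumulative running average of the frame rate); adjust the kernel sizes for your own machine:

clear cam;
cam = webcam(1);
cam.Resolution = '320x240';

k = 5;                       % kernel size -- try 5, 51, 101, 201, 401, ...
kernel = ones(k) / k^2;      % simple box filter

h = imshow( im2double( rgb2gray( snapshot(cam) ) ) );
fps = 0;                     % running average of frames per second
nFrames = 0;

while true
  tic;

  % Capture, convert to grayscale, convolve, and display
  im = im2double( rgb2gray( snapshot(cam) ) );
  im = conv2( im, kernel, 'same' );
  set( h, 'CData', im );
  drawnow;

  % Update the running-average frame rate and print it
  et = toc;
  nFrames = nFrames + 1;
  fps = fps + ( 1/et - fps ) / nFrames;
  disp( fps );
end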

Fourier Transforms: What are they and how are they useful?

Fourier transforms are an important part of image processing: they can be used to decompose an image into a basis of sine and cosine components. In this lab, we will see some of the capabilities that Fourier transforms give us when dealing with images.

Using Fourier Transforms in MATLAB

In MATLAB, it is easy to compute Fourier transforms—we use the fft2() function. To compute the inverse Fourier transform, we can call ifft2(). The '2' here stands for '2D':

im = im2double( imread('rosso.png') );
im_f = fft2(im);
im_r = ifft2(im_f);
imshow( real(im_r) );
Fig 4. 'rosso' test image (here), and its decomposition and reconstruction through the Fourier domain.

Let's investigate im_f. It is of complex double type, which means that it has both real and imaginary components (extractable with the real() and imag() functions, respectively). We saw in class what this means and how this is an elegant representation of the sine and cosine frequency basis via Euler's formula:

Fig 5. Euler's formula and its relation to the parameterization of a circle, showing the sine and cosine coefficients of the frequency basis.

We also defined how these terms can be used to extract the frequency amplitude and phase components of the Fourier decomposition: $$A = \pm \sqrt{Re(\psi)^2+Im(\psi)^2}$$ $$\phi = \tan^{-1}( \frac{Im(\psi)}{Re(\psi)})$$
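In MATLAB, these per-pixel quantities can be computed directly from the formulas, or equivalently with the abs() and angle() built-ins (continuing from the im_f snippet above):

% Amplitude (the non-negative root) and phase from the complex coefficients
A   = sqrt( real(im_f).^2 + imag(im_f).^2 );   % equivalently: abs(im_f)
phi = atan2( imag(im_f), real(im_f) );         % equivalently: angle(im_f)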

Tasks:

Let's look at the decomposition live. Adapt your live loop to also perform Fourier decomposition, and to visualize the input, frequency amplitude, phase, and reconstruction images. A minimal sketch follows the tips and Fig 6 below.

  1. Perform Fourier decomposition on your webcam video (use rgb2gray again for a grayscale image).
  2. Extract the frequency amplitude and phase components. We expect to have an amplitude and phase component for every pixel.
  3. Visualize the frequency amplitude and phase components. The goal is to achieve something as shown below.
  4. Reconstruct the image from the amplitude and phase components, and show this as the fourth component. Use \(Re = A\cos(\phi)\), \(Im = A\sin(\phi)\), and im_f2 = re + 1j*im in MATLAB to return to the im_f complex double structure.
  5. Take your laptop out on the town (though probably just around CIT) and go and find some interesting frequency content! Add your favourite to your writeup by taking a screenshot.

Some tips:

  1. For convenience of display, it helped us to make our input image a fixed square size. We cropped and resized our webcam image to [256 256].
  2. To place all four images into a grid, we used a big matrix that was twice the input image size in each dimension (a 2x2 grid of images), then used cat() liberally.
  3. For both the amplitude and phase, we used the fftshift function to center the offset component (instead of it being at the edge of the image).
  4. The amplitude coefficient values themselves have a large range, and so we took the log and then also divided by 10! Take care to only affect the values of the data shown on the screen, and not the data used in the reconstruction.
  5. Cast your reconstruction back to a real after performing ifft2(), so that we can display it as an image.
Fig 6. Target Fourier decomposition visualization view. Top left: input image. Bottom left: frequency amplitude (rescaled). Bottom right: phase. Top right: Fourier reconstruction of input image.
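To tie the tasks and tips together, here is one possible sketch of the live decomposition view; the [256 256] resize, the log-then-divide-by-10 amplitude rescaling, the remapping of phase to [0, 1] for display, and the 2x2 grid layout are all our assumptions drawn from the tips above:

clear cam;
cam = webcam(1);
cam.Resolution = '320x240';

grab = @(c) imresize( im2double( rgb2gray( snapshot(c) ) ), [256 256] );

h = imshow( zeros(512) );

while true
  im = grab( cam );

  % Decompose into amplitude and phase
  im_f = fft2( im );
  A    = abs( im_f );
  phi  = angle( im_f );

  % Reconstruct from amplitude and phase only
  re   = A .* cos( phi );
  imj  = A .* sin( phi );          % named 'imj' to avoid shadowing imag()
  im_r = real( ifft2( re + 1j*imj ) );

  % Display-only copies: center with fftshift, compress the amplitude range
  A_disp   = log( fftshift( A ) + 1 ) / 10;
  phi_disp = ( fftshift( phi ) + pi ) / ( 2*pi );

  % 2x2 grid: input and reconstruction on top, amplitude and phase below
  panel = cat( 1, cat( 2, im,     im_r     ), ...
                  cat( 2, A_disp, phi_disp ) );
  set( h, 'CData', panel );
  drawnow;
end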

Editing the frequency amplitude and phase images

Now that we have our decomposition in an easily manipulated format, let's edit it! We know that, in the frequency amplitude image, frequencies increase as we move out radially from the center, and horizontal and vertical frequencies lie on the y and x dimensions respectively. Please refer to the lecture slides for example editing operations.

Tasks:

  1. Implement a 'low pass' filter by editing the frequency amplitude image. A low pass filter only lets through low frequencies. A minimal sketch follows this list.
  2. Implement a 'high pass' filter by editing the frequency amplitude image. A high pass filter only lets through high frequencies.
  3. Edit only some frequencies in a particular direction.
  4. Experiment with editing the phase image. What effects can we accomplish?
  5. Implement Gaussian filtering—recall that the Fourier decomposition of a Gaussian (its dual) is another Gaussian, with width proportional to 1/sigma.
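For item 1, here is one possible sketch of a low-pass edit on a single image; the test image and the cutoff radius r are assumptions. Multiplying the complex coefficients by a 0/1 mask scales the amplitude while leaving the phase untouched:

% Load a test image (any reasonably sharp image works)
im = im2double( imread('rosso.png') );
if size( im, 3 ) == 3, im = rgb2gray( im ); end

% Fourier transform with the zero-frequency term moved to the center
im_f = fftshift( fft2( im ) );

% Distance of every coefficient from the center of the spectrum
[H, W] = size( im );
[x, y] = meshgrid( 1:W, 1:H );
radius = sqrt( (x - W/2).^2 + (y - H/2).^2 );

% Keep only the low frequencies inside a disc of radius r (an assumption)
r    = 30;
mask = radius <= r;

im_lp = real( ifft2( ifftshift( im_f .* mask ) ) );
imshow( im_lp );

% A high-pass version keeps the complement instead: im_f .* ~mask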

Fourier decomposition for convolution efficiency

By the convolution theorem of Fourier analysis, we know that convolution in the spatial domain is equivalent to element-wise multiplication in the frequency domain (of the complex Fourier coefficients). This provides an opportunity: the computational cost of a Fourier decomposition is fixed for a given image size, but the computational cost of spatial convolution grows with kernel size (remember the first lab exercise), so we can use multiplication in frequency space to speed up the execution of large kernels. MATLAB already does this for you if you use the imfilter() function, but not if you use conv2().

Tasks

  1. Implement convolution for arbitrary kernels in the frequency domain. Take the FFT of the kernel and the image, element-wise multiply the two transforms, then reconstruct with the inverse FFT. Hint: while it will work just fine, we don't actually need to separate the complex double into separate amplitude and phase variables for this operation. A minimal sketch follows this list.
  2. How fast is your code versus conv2? At what kernel size does it become faster to convolve in the frequency domain than in the spatial domain?
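As referenced in the tasks, here is a minimal sketch of frequency-domain convolution with a rough timing comparison; the test image and kernel size are assumptions, and note that the frequency-domain result is a circular convolution whose output is shifted relative to conv2(..., 'same') unless the kernel is re-centered:

im = im2double( imread('rosso.png') );
if size( im, 3 ) == 3, im = rgb2gray( im ); end
[H, W] = size( im );

k = 101;                     % large kernel -- the size is an assumption
kernel = ones(k) / k^2;

% Frequency-domain convolution: fft2 zero-pads the kernel to H x W,
% then we element-wise multiply the two (complex) transforms
tic;
im_freq = real( ifft2( fft2( im ) .* fft2( kernel, H, W ) ) );
t_freq = toc;

% Spatial-domain convolution for comparison
tic;
im_spat = conv2( im, kernel, 'same' );
t_spat = toc;

fprintf( 'frequency domain: %.3f s, spatial domain: %.3f s\n', t_freq, t_spat );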

Deconvolution

In class, we discussed that, by the convolution theorem, division in the Fourier domain is, in principle, a deblur or deconvolution. If we know the filter kernel, this deconvolution problem is possible (though the solution is unstable to even small amounts of noise). In most cases, we do not know the filter kernel and must estimate it. The Hubble Space Telescope's WFPC camera was famously blurry for the first three years of its operation due to an incorrectly-shaped mirror, but by imaging distant stars NASA could estimate the blur and deconvolve Hubble's images.

Fig 7. Hubble's optical flaw corrected by deconvolution image processing. [NASA; Wikipedia]

Task: Known filter kernels

Let's experiment with deconvolving given a known filter kernel.

  1. Pick an image which is relatively blur free. Here's one from lab 3 if you need one.
  2. Pick a filter kernel, e.g., a Gaussian, and then convolve the image with the kernel.
  3. Using the Fourier decompositions of the filtered image and the kernel, perform 'inverse filtering' in the Fourier domain by element-wise division (as with convolution, there is no need to separate amplitude and phase). A sketch follows this list.
  4. How does your result vary with the sigma of the Gaussian kernel? How does your result vary when you add minor noise to the input image after it has been convolved? In terms of cameras, noise is a sensor artifact and blur is an optical artifact, so the noise is 'added' (so to speak) after the light has been blurred optically.
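As referenced above, here is one possible sketch of inverse filtering with a known kernel; the test image, the kernel size, the sigma, and the noise level are all assumptions to experiment with:

im = im2double( imread('rosso.png') );
if size( im, 3 ) == 3, im = rgb2gray( im ); end
[H, W] = size( im );

% Known blur kernel: a 15x15 Gaussian with sigma = 2 (both assumptions)
kernel = fspecial( 'gaussian', 15, 2 );
K_f    = fft2( kernel, H, W );

% Blur in the frequency domain, then (optionally) add a little 'sensor' noise
blurred = real( ifft2( fft2( im ) .* K_f ) );
% blurred = blurred + 0.001 * randn( H, W );

% Inverse filtering: element-wise division by the kernel's transform
recovered = real( ifft2( fft2( blurred ) ./ K_f ) );
imshow( recovered, [] );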

Is deconvolution a well-posed problem? How many possible scene and kernel combinations could produce a given image?

Important: When the values in the frequency amplitude channel of our Fourier decomposition are small (close to zero), this division causes problems as slight imprecision due to sampling noise causes large differences in output. Unfortunately, most point spread functions in imaging and optics have frequency decompositions with small amplitude values, which requires us to use strong regularization when deconvolving. This is beyond the scope of this lab, but interested parties can refer to Gordon Wetzstein's introduction here.

Potential exercise (not required) or inspiration for final project...

In real imaging applications, we usually don't know the kernel. How might we measure the optical blur (filter kernel) for a camera, to characterize depth of field and potentially deblur? One option is to image light from a single point in the world—a point light (or our best physical approximation of it)—and see how the light is spread across the sensor. This is called the point spread function (for obvious reasons). One way to image the point spread function is to photograph a single LED in a very dark room at a known distance away from the camera, and for known camera parameters (focal length, aperture). In principle, we can image the point spread function for all distances and all camera parameters, and calibrate or characterize the camera's depth of field blur. Then, if we knew the depth of a particular scene point in a new unseen photograph, we could deconvolve with the calibrated point spread function as a kernel.

Fig 8. Point spread function captured via a point light source at increasing distances (defocuses). From Durand et al., 2006.
Fig 9. Hubble's captured kernel or point spread function [NASA; Wikipedia]. Thankfully, stars are pretty good point light approximations!

Phew! Seems hard (and laborious), but imagine performing this for just one depth and one set of camera parameters as an experiment, and then deblurring an image of an object with planar geometry at that fixed depth. In principle, we could image any target that would allow us to measure the point spread function. It could be a point light, or we might try a tiny black dot on a smooth piece of paper or card (without any grain), which we could then invert.

Fig 10. Capturing a point spread function at a fixed depth, and a surface blurred under that point spread function.
Fig 11. Western order: dot, blurred dot, image at ~same depth as dot, blurred image at ~same depth as dot.

Would this approach work? What are the limitations of the approach? Is there a better workflow and computation?


Submission

Please upload your MATLAB code, input/result images, and any notes of interest as a PDF to Gradescope. Please use writeup.tex for your submission.


Acknowledgements

This lab was developed by Alan Yu and the 1290 course staff.