As our noble digital photographer ventures out into the wilderness armed with their trusted cameras, they are repeatedly hounded in their quest for truth by the vile and iniquitous Noise Monster. In this lab we will cover one of the tools the lionhearted photographer can use to combat this fiendish interloper: the bilateral filter.
If you have past experience with filtering, you may be aware of how it can be used to remove unwanted frequencies (a fancy term for noise) from an input signal. What, then, makes the bilateral filter so special? To appreciate the advantages offered by a bilateral filter, it would be useful to first understand why an old-fashioned Gaussian filter proves insufficient when it comes to processing images.
Let's load up a noisy image in Python and convolve it with a Gaussian filter:
from PIL import Image
import numpy as np
from scipy import ndimage

# Load the image as a float array
img = np.array(Image.open('path/to/rubiks_cube.png'), dtype=np.float32)
# Blur with a sigma of 3 pixels along each spatial axis; the 0 for the
# channel axis leaves the color channels unmixed
img_filtered = ndimage.gaussian_filter(img, sigma=[3, 3, 0])
Image.fromarray(img_filtered.astype(np.uint8)).show()
While the amount of noise has certainly been reduced, the astute viewer will notice that so has most of the detail around object edges. Everything appears blurry now. This is why a Gaussian (or any other convolutional filter, for that matter) is not very useful as a noise suppression tool for images: it filters indiscriminately, leading to loss of resolution. What we want, instead, is a filter that works on the 'internal' pixels of an object, but leaves the edges alone.
Q1: The following two figures show a crude decomposition of our image into edge (left) and non-edge regions (right). Do you notice anything different about the color composition of the regions in each case? How many different colors can you count in each region?
In class, we saw how the median filter can handle edge-aware filtering. Let us now see how a bilateral filter handles the same image.
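If you would like a quick preview before we build our own, OpenCV ships a bilateral filter. A minimal sketch, assuming you have opencv-python installed and the same noisy image as before (the sigma values here are just a starting point):

# A quick preview of bilateral filtering using OpenCV; we will build our own below
import cv2
import numpy as np
from PIL import Image

img = np.array(Image.open('path/to/rubiks_cube.png'), dtype=np.float32)
# d=-1 lets OpenCV derive the neighborhood size from sigmaSpace;
# sigmaColor and sigmaSpace are the range and spatial sigmas, respectively
img_filtered = cv2.bilateralFilter(img, d=-1, sigmaColor=25, sigmaSpace=3)
Image.fromarray(img_filtered.astype(np.uint8)).show()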
The result, in this case, is much closer to what we want.
Q2: Based on the color composition of the edge and non-edge regions shown above, what do you think the bilateral filter is doing?
The bilateral filter is a Gaussian that acts strongly on regions of uniform color, and lightly on regions with high color variance. Since we expect edges to have high color variance, the bilateral filter acts as an edge-preserving or edge-aware filter.
Let us dive into the details of how the bilateral filter works. Once again, the Gaussian distribution provides a good launching pad. First, let's remind ourselves of its 2D form, here unnormalized. Assume the Gaussian of standard deviation \(\sigma\) is centered at a point \(p\), and that we are sampling the PDF at a point \(q\) that lies a distance \(d = \|q - p\|\) away from \(p\):
\[ \mathcal{N}(d; \sigma) = \exp\left(-\frac{d^2}{2\sigma^2}\right) \]
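To get a feel for how quickly this weight decays with distance, here is a quick numpy check (the values of d and sigma are just illustrative):

import numpy as np

sigma = 1.0
for d in [0.0, 1.0, 2.0, 3.0]:
    # Unnormalized Gaussian weight at distance d from the center p
    print(d, np.exp(-d**2 / (2 * sigma**2)))
# The weight falls from 1.0 at d=0 to about 0.011 at d=3, i.e., three
# standard deviations out; this motivates the filter size used below.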
The Gaussian filter applied at a pixel index \(p\) in image \(I\) can be written as:
\[ G(p) = \frac{1}{w} \sum_{q \in S} \mathcal{N}(\|p - q\|; \sigma)\, I_q \qquad (1) \]
where \(I_q\) is the value at pixel index \(q\), and \(S\) denotes a small neighborhood of pixels around \(p\). The \(\mathcal{N}\) denotes a Gaussian (which is also called a \(\mathcal{N}\)ormal distribution), and \(w = \sum_{q \in S} \mathcal{N}(\|p - q\|; \sigma)\) is the normalization factor; this preserves image brightness during filtering.
A simple Python implementation of this equation is provided in Listing 2. While this isn't the most efficient implementation of Gaussian filtering (we would typically use a vectorized numpy implementation), it is helpful for understanding the Gaussian filter and the changes we need to make to it to obtain a bilateral filter.
Note: If you have trouble mentally mapping this code to your conceptual understanding of filtering, feel free to review the course slides on filtering.
# Gaussian convolution applied at each pixel in an image
# Parameters:
#   im: Image to filter, provided as a numpy array
#   sigma: The standard deviation of the Gaussian (e.g., 1)
#
import numpy as np

def gaussian(im, sigma):
    height, width, _ = im.shape
    img_filtered = np.zeros([height, width, 3])
    # Define the filter half-width.
    # A Gaussian has infinite support, but most of its mass lies within
    # three standard deviations of the mean.
    n = int(3 * sigma)
    # Iterate over pixel locations p
    for p_y in range(height):
        for p_x in range(width):
            gp = 0
            w = 0
            # Iterate over kernel locations to define pixel q
            for i in range(-n, n + 1):
                for j in range(-n, n + 1):
                    # Make sure no index goes out of bounds of the image
                    q_y = max(0, min(height - 1, p_y + i))
                    q_x = max(0, min(width - 1, p_x + j))
                    # Compute the Gaussian filter weight at this filter pixel
                    g = np.exp(-((q_x - p_x)**2 + (q_y - p_y)**2) / (2 * sigma**2))
                    # Accumulate filtered output
                    gp += g * im[q_y, q_x, :]
                    # Accumulate filter weight for later normalization, to maintain image brightness
                    w += g
            img_filtered[p_y, p_x, :] = gp / (w + np.finfo(np.float32).eps)
    return img_filtered
# Gaussian convolution applied to an image
import numpy as np
from PIL import Image
import filt

img = np.array(Image.open('../img/rubiks_cube.png'), dtype=np.float32)
img_filtered = np.asarray(filt.gaussian(img, 3))
Image.fromarray(img_filtered.astype(np.uint8)).show()
Let's try to speed the code up a bit. We will do this using a Python library called Cython. The main source of latency in the Python code of Listing 2 is the nested for loops. Cython allows us to run these as compiled C code, which is much faster. You can install Cython using pip:
$> source ./../path/to/your/venv/bin/activate
$> pip install cython
The changes that we need to make to the Python code from Listing 2 are shown in Listing 4: we import the C versions of math functions such as exp and sqrt, we add cdef to our variable definitions, and we strongly type our variables with int and double.
# Gaussian convolution applied at each pixel in an image (Cython version)
# Parameters:
#   im: Image to filter, provided as a numpy array
#   sigma: The standard deviation of the Gaussian (e.g., 1)
#
# Tell Cython to use the faster C libraries.
# If we add any additional math functions, we could also add their C versions here.
from libc.math cimport exp
from libc.math cimport sqrt
import numpy as np

def gaussian(float[:, :, :] im, double sigma):
    cdef int height = im.shape[0]  # cdef int tells Cython that this variable should be converted to a C int
    cdef int width = im.shape[1]
    # cdef double[:, :, :] to store this as a 3D array of doubles
    cdef double[:, :, :] img_filtered = np.zeros([height, width, 3])
    # A Gaussian has infinite support, but most of its mass lies within
    # three standard deviations of the mean.
    cdef int n = int(3 * sigma)
    cdef int p_y, p_x, i, j, q_y, q_x
    cdef double g, gpr, gpg, gpb
    cdef double w = 0
    # The rest of the code is similar, only now we have to explicitly assign the r, g, and b channels
    for p_y in range(height):
        for p_x in range(width):
            gpr = 0
            gpg = 0
            gpb = 0
            w = 0
            for i in range(-n, n + 1):
                for j in range(-n, n + 1):
                    q_y = max(0, min(height - 1, p_y + i))
                    q_x = max(0, min(width - 1, p_x + j))
                    g = exp(-((q_x - p_x)**2 + (q_y - p_y)**2) / (2 * sigma**2))
                    gpr += g * im[q_y, q_x, 0]
                    gpg += g * im[q_y, q_x, 1]
                    gpb += g * im[q_y, q_x, 2]
                    w += g
            img_filtered[p_y, p_x, 0] = gpr / (w + 1e-5)
            img_filtered[p_y, p_x, 1] = gpg / (w + 1e-5)
            img_filtered[p_y, p_x, 2] = gpb / (w + 1e-5)
    return img_filtered
Every time we change our Cython code, and before we run it, we must ask Cython to compile those changes.
We've set this up for you: download setup.py
and execute the following command:
$> python setup.py build_ext --inplace
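If you are curious what that file does, a minimal setup.py for this purpose might look roughly like the following sketch; it assumes the Cython source is saved as filt.pyx, so that it compiles to the filt module imported earlier (use the provided setup.py for the lab itself):

# setup.py -- compile filt.pyx into a C extension importable as filt
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("filt.pyx"))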
Try running $> python our_gaussian.py now. You should see a significant improvement in speed.
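If you'd like to put a number on the speedup, you can time the call directly; a quick sketch (absolute timings will vary by machine and image size, and you can time the pure-Python version from Listing 2 the same way for comparison):

# Time the compiled Gaussian filter
import time
import numpy as np
from PIL import Image
import filt  # the compiled Cython module

img = np.array(Image.open('../img/rubiks_cube.png'), dtype=np.float32)
start = time.perf_counter()
filt.gaussian(img, 3)
print('Filtered in %.2f seconds' % (time.perf_counter() - start))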
You can do this for any Python code. Sometimes it's impossible to vectorize code into numpy
functions. Cython gives us another option to make Python code faster.
In Equation 1, the contribution of each \(I_q\) to the filtered value of \(p\), \(G(p)\), is determined by the distance between pixels \(p\) and \(q\), that is, \(\|p - q\|\). To preserve detail around edges, we would like to add an additional term to Equation 1 that accounts for the color variance around \(p\) as well.
It turns out that one controllable way to modulate filtering in edge regions is to weight the contribution of each \(I_q\) to \(G(p)\) in Equation 1 by the color difference between \(p\) and \(q\):
\[ G(p) = \frac{1}{w} \sum_{q \in S} \mathcal{N}(\|p - q\|; \sigma)\, f(|I_p - I_q|)\, I_q \qquad (2) \]
where \(f\) is some weight function that decreases as the color difference \(|I_p - I_q|\) grows, and \(w\) is updated to be the sum of the new weights.
Now, let's go a step further, and make that color-difference weight a Gaussian as well.
Congratulations! You have successfully implemented a bilateral filter:
\[ \textrm{BF}(p) = \frac{1}{w} \sum_{q \in S} \mathcal{N}(\|p - q\|; \sigma_1)\, \mathcal{N}(|I_p - I_q|; \sigma_2)\, I_q \qquad (3) \]
where \(\sigma_1\) is the sigma of the spatial Gaussian, \(\sigma_2\) that of the color (range) Gaussian, and \(w\) again normalizes by the sum of all the weights.
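In code, the only change relative to Listing 2 is that the weight g picks up this second, range Gaussian. Below is a minimal (and, like Listing 2, slow) sketch; sigma_s and sigma_r play the roles of \(\sigma_1\) and \(\sigma_2\), and the color difference is measured as the Euclidean distance between RGB triples:

# Bilateral filter: Listing 2 with one extra (range) Gaussian in the weight
import numpy as np

def bilateral(im, sigma_s, sigma_r):
    height, width, _ = im.shape
    img_filtered = np.zeros([height, width, 3])
    n = int(3 * sigma_s)  # kernel half-width from the spatial sigma
    for p_y in range(height):
        for p_x in range(width):
            gp = 0
            w = 0
            for i in range(-n, n + 1):
                for j in range(-n, n + 1):
                    q_y = max(0, min(height - 1, p_y + i))
                    q_x = max(0, min(width - 1, p_x + j))
                    # Spatial Gaussian, exactly as in Listing 2
                    g_s = np.exp(-((q_x - p_x)**2 + (q_y - p_y)**2)
                                 / (2 * sigma_s**2))
                    # New: range Gaussian on the color difference |I_p - I_q|
                    d_r = np.sum((im[q_y, q_x, :] - im[p_y, p_x, :])**2)
                    g_r = np.exp(-d_r / (2 * sigma_r**2))
                    g = g_s * g_r
                    gp += g * im[q_y, q_x, :]
                    w += g
            img_filtered[p_y, p_x, :] = gp / (w + np.finfo(np.float32).eps)
    return img_filtered

As with the Gaussian, this can be Cythonized in exactly the same way as Listing 4 if you find it too slow.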
Since the standard definition uses a Gaussian as the weight decay function, bilateral filters are commonly specified by the sigma values of the two Gaussians that determine the weights: \(\textrm{BF}(\sigma_1, \sigma_2)\). Varying these sigma values is what makes bilateral filters powerful, controllable edge-aware filters, as we will see in the next section.
Freedom! Now that you understand what role each term in Equation 3 plays, you need not be constrained by convention. You may choose to use any combination of filters (box, tent, or even something novel you've created yourself) depending on the task at hand, and how the filter affects the result. The important thing to keep in mind is that the weight is now a function of both color and space. In fact, you could even come up with a trilateral filter that used a third domain, say, the image gradients, in addition to the color and spatial distance. The last task of this lab encourages you to think about designing such techniques.
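As a small illustration of this freedom, either Gaussian in Equation 3 could be swapped for a different decay function. The tent weight below is one hypothetical alternative (the names gaussian_weight and tent_weight are ours, not part of any library):

import numpy as np

def gaussian_weight(d, sigma):
    # The conventional choice: smooth falloff that never quite reaches zero
    return np.exp(-d**2 / (2 * sigma**2))

def tent_weight(d, radius):
    # A cheaper alternative: linear falloff that reaches zero at `radius`
    return np.maximum(0.0, 1.0 - np.abs(d) / radius)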
In the last section we saw how a bilateral filter is controlled by two parameters: the sigma of the Gaussian in the spatial domain, and the sigma of the Gaussian in the color domain. We will refer to these parameters as the spatial and range sigmas, respectively. Moreover, as the two Gaussians that make up the bilateral filter are multiplied together, a value close to zero for either one of the sigmas makes the whole filter ineffective. On the other hand, if both values are too large, the bilateral filter no longer preserves edges well.
Q3: Based on the image content, we may choose to assign a relatively higher value to the spatial sigma, or to the range sigma. Given our understanding of the bilateral filter, what kind of images do you think would require a higher spatial sigma, and what kind a higher range sigma?
To put our intuition about the effect of each parameter into practice, we will denoise the images in Dataset I. In each case, our goal will be to remove as much noise from the image as possible, while maintaining the original image edges. Our measure of success will be the L2 distance from a ground truth image (also included in the dataset).
Try filtering each image with several combinations of the spatial and range sigmas, and lay your results out in a grid: the pyplot.subplot function in the matplotlib library will help lay out the figure. You might need to experiment with your parameter values. What do you notice about the result?
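For instance, such a parameter sweep might be laid out as follows. This is only a sketch: it assumes the bilateral function sketched earlier, a noisy input image noisy, and its ground truth gt, all as float arrays (we use plt.subplots, a convenience wrapper around pyplot.subplot):

# Lay out a grid of bilateral filter results for different sigma pairs,
# scoring each against the ground truth with the L2 distance
import numpy as np
import matplotlib.pyplot as plt

spatial_sigmas = [1, 3, 9]    # illustrative values; experiment!
range_sigmas = [10, 30, 90]

fig, axes = plt.subplots(len(spatial_sigmas), len(range_sigmas), figsize=(9, 9))
for r, s_s in enumerate(spatial_sigmas):
    for c, s_r in enumerate(range_sigmas):
        out = bilateral(noisy, s_s, s_r)
        l2 = np.sqrt(np.sum((out - gt)**2))  # L2 distance to ground truth
        axes[r, c].imshow(out.astype(np.uint8))
        axes[r, c].set_title('spatial=%g, range=%g, L2=%.0f' % (s_s, s_r, l2))
        axes[r, c].axis('off')
plt.show()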
Note: This is an optional component of the lab that demonstrates a useful application of the bilateral filter beyond denoising.
Task II demonstrated how image content determines the effectiveness of bilateral filtering. In some instances, however, a variant called the cross bilateral filter can be used to double-cross the aforementioned rule. The idea behind cross bilateral filtering is that the range and spatial Gaussian filters can act on two different images! This proves especially useful when we have multiple photographs of a single scene captured with different camera settings.
The cross bilateral filter is not restricted to the task of denoising; it can be used to improve the overall quality of a photo. In fact, your next task explores this very feature.
Images captured in low-light situations tend to be noisy, and lack sharp edge resolution. A common solution to this problem is to use a flash. However, while the flash image does have sharp edge definition, it suffers from an unpleasant direct-lighting effect.
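This is exactly where the cross bilateral filter shines: compute the range weights from the sharp, low-noise flash image, but average the colors of the no-flash image. A sketch of one way this pairing might be wired up, assuming aligned float arrays noflash and flash of the same shape:

# Cross bilateral sketch: denoise the no-flash image, with range weights
# computed from the flash image
import numpy as np

def cross_bilateral(noflash, flash, sigma_s, sigma_r):
    height, width, _ = noflash.shape
    out = np.zeros([height, width, 3])
    n = int(3 * sigma_s)
    for p_y in range(height):
        for p_x in range(width):
            gp = 0
            w = 0
            for i in range(-n, n + 1):
                for j in range(-n, n + 1):
                    q_y = max(0, min(height - 1, p_y + i))
                    q_x = max(0, min(width - 1, p_x + j))
                    g_s = np.exp(-((q_x - p_x)**2 + (q_y - p_y)**2)
                                 / (2 * sigma_s**2))
                    # Range weight from the flash image's sharp edges...
                    d_r = np.sum((flash[q_y, q_x, :] - flash[p_y, p_x, :])**2)
                    g_r = np.exp(-d_r / (2 * sigma_r**2))
                    g = g_s * g_r
                    # ...but the colors being averaged come from the no-flash image
                    gp += g * noflash[q_y, q_x, :]
                    w += g
            out[p_y, p_x, :] = gp / (w + np.finfo(np.float32).eps)
    return out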
We started off with the problem of filtering an image to remove noise while maintaining sharp detail in the edges. We saw how the bilateral filter solves this problem by combining two different sources of information: the weights of the filter were determined by color proximity, in addition to spatial proximity. The cross bilateral filter took the idea of drawing on multiple sources of information a step further. The RGB values for the filter came from two altogether different photos of the same scene.
In fact, this idea recurs throughout the field of computational photography. An imaging problem is solved by drawing on information from different representations, and even modalities. Together, the combined sources make up for their individual shortcomings, and provide a more comprehensive (or perhaps simply a more beautiful) picture of the world.
Please upload your Python code, input/result images, and any notes of interest as a PDF to Gradescope. Please use writeup.tex for your submission.
Bilateral filters are useful for so many image processing applications. Sylvain Paris has a gentle introduction to the topic and its many applications here.
This lab was developed by Numair Khan and the 1290 course staff. Numair created the logos and drew Sir Cameralot, with inspiration from Hergé and Tintin. Thanks to Sylvain Paris for the parameter grid example.