# CSCI1290 : Final Project

## Real Time Depth of Field using the Microsoft Kinect

### rdfong

For this project, I decided to implement real-time depth of field using the Microsoft Kinect and its depth detection capabilities. To achieve a real-time effect I used the OpenGL Shading Language (GLSL). The program also lets you adjust the focal length and aperture on the fly.

### The steps of the algorithm are the following:

1. Get the color and depth information from the Kinect and combine it into a single texture, using the alpha channel as the depth value. Make sure that everything ranges from 0 to 1 (otherwise OpenGL will clamp the values).
2. After some artifact fixing (discussed later), we attach a framebuffer and render the texture onto a quad using an orthographic camera projection.
3. In the shader, we calculate the blur radius for the current pixel using the following formula: `max(0.01/640, min(max_blur, max_blur*abs(focal_length - cur_depth)/aperture))`. Here, `focal_length` is the depth value between 0 and 1 that we want the camera to focus on, `cur_depth` is the depth at the current pixel, and `max_blur` is the largest blur radius we will allow.

The aperture is defined by a near and a far plane that are equidistant from the focal plane. The blur radius spans only 1 pixel when the pixel's depth is at the focal length, then increases linearly toward the maximum blur radius as the depth approaches either plane. If the depth falls outside of the planes, it just gets assigned the maximum blur radius. Note: the first argument to the `max(x, y)` call is there just to make sure that the blur radius can never reach 0.
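The radius computation in step 3 can be sketched as plain C (the real version lives in the GLSL shader; the constants are just the ones quoted above):

```c
#include <math.h>

/* Sketch of the step-3 blur radius computation. focal_length and
 * cur_depth are depths in [0, 1]; aperture is the distance from the
 * focal plane to either the near or the far plane. */
float blur_radius(float focal_length, float cur_depth, float aperture,
                  float max_blur) {
    float r = max_blur * fabsf(focal_length - cur_depth) / aperture;
    if (r > max_blur)
        r = max_blur;           /* depth outside the planes: clamp */
    if (r < 0.01f / 640.0f)
        r = 0.01f / 640.0f;     /* never let the radius reach 0 */
    return r;
}
```

At the focal plane the radius bottoms out at 0.01/640; at either aperture plane it reaches `max_blur`.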

4. Next, I sample a 5-by-5 grid of texture coordinates (spacing and size dependent on the blur radius) around the texture coordinate of the current pixel. For each sample I store its weighted color contribution in an array of 25 colors. If the distance between the sample point and the current pixel is greater than the blur radius, or if the difference between the depths is above some threshold (which prevents color leaking at edges), its weight is 0; otherwise it is that distance divided by the blur radius. In addition, instead of sampling straight from the texture, I do a weighted sampling of multiple levels of detail using OpenGL's mipmapping functionality (essentially a weighted average of different levels of an image pyramid).
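The per-sample weighting rule from step 4 might look like this in C (a sketch, not the shader itself; the depth threshold is an assumed value, since the write-up doesn't give one):

```c
#include <math.h>

/* Assumed edge threshold: samples whose depth differs from the center
 * pixel by more than this get zero weight, preventing color leaking. */
#define DEPTH_THRESHOLD 0.05f

/* dist is the distance from the sample to the current pixel,
 * depth_diff the difference between their depths. */
float sample_weight(float dist, float depth_diff, float blur_radius) {
    if (dist > blur_radius || fabsf(depth_diff) > DEPTH_THRESHOLD)
        return 0.0f;
    return dist / blur_radius;  /* weight as described in step 4 */
}
```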

Side note: Originally I used stochastic sampling to blur. However, to make the sampling truly random I had to pass in a large texture of randomized directions and magnitudes and then sample from it non-sequentially. While it did work, it also dropped my frame rate to about 4 fps, which I deemed unacceptable. Rather than trying to find some hackish way around it, I ditched the idea entirely and used this approach (much faster, about 50 fps) instead.

5. We then normalize the elements of the array and sum them to get the final color. Note that we will be running multiple passes of this shader, so we need to restore the depth value in the alpha channel. See shader code below:
6. We run this shader multiple times, swapping between two framebuffers (I decided the results looked best with 4 additional passes), and then render the final result onto the main buffer on a textured quad.
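The ping-pong structure of step 6 can be mimicked on the CPU: two buffers alternate as source and destination each pass, just as the two framebuffers do. This is a C sketch with a toy 1-D box blur standing in for the shader:

```c
#include <math.h>
#include <string.h>

/* One "shader pass": a simple 1-D box blur from src into dst. */
void blur_pass(const float *src, float *dst, int n) {
    for (int i = 0; i < n; i++) {
        float sum = src[i];
        int cnt = 1;
        if (i > 0)     { sum += src[i - 1]; cnt++; }
        if (i < n - 1) { sum += src[i + 1]; cnt++; }
        dst[i] = sum / cnt;
    }
}

/* Run several passes, swapping the two buffers each time, then make
 * sure the final result ends up back in img. */
void multi_pass_blur(float *img, int n, int passes) {
    float tmp[n];
    float *src = img, *dst = tmp;
    for (int p = 0; p < passes; p++) {
        blur_pass(src, dst, n);
        float *t = src; src = dst; dst = t;  /* swap "framebuffers" */
    }
    if (src != img)
        memcpy(img, src, n * sizeof(float));
}
```

Each small-radius pass compounds with the previous ones, which is how several cheap passes can stand in for one pass with a much larger radius.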

### Problems with the Kinect

Unfortunately, the Kinect isn't perfect. Especially up close, the depth detection doesn't quite work right at the edges of prominent objects. These artifacts manifested in the depth map as regions where the distance was equal to the maximum range of the Kinect. While these regions usually didn't distort the object itself too badly, they destroyed all information about anything in the object's background that they covered (left). I also noticed that the artifact only occurred on one side of objects in the scene (I'm not really sure why). To relieve the issue, I simply started from the side of the image the artifacts were on and marched across the image pixel by pixel, setting the depth at any pixel in such a region to the last valid depth value seen in that row (right). While this isn't a particularly accurate way of solving the problem, it definitely helped. In the common case where the background was something flat like a wall, it certainly did the trick.
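The scanline fix described above might look like this (a sketch; `INVALID_DEPTH` is the normalized value the Kinect reports where detection fails, and the march here goes left to right rather than from whichever side the artifacts are on):

```c
#define INVALID_DEPTH 1.0f  /* max-range value the Kinect reports on failure */

/* March across each row, replacing invalid depths with the last
 * valid depth seen in that row. */
void fill_invalid_depths(float *depth, int width, int height) {
    for (int y = 0; y < height; y++) {
        float last_valid = 0.0f;  /* assumed fallback if a row starts invalid */
        for (int x = 0; x < width; x++) {
            float *d = &depth[y * width + x];
            if (*d >= INVALID_DEPTH)
                *d = last_valid;
            else
                last_valid = *d;
        }
    }
}
```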
Side Note: Because the Kinect uses infrared it also fails at depth detection on transparent or really shiny objects (see white blob next to left arm in image below). You may also notice I have my blinds down in most of my results...

You'll also notice that this made the depth map near edges look blocky and jagged. The issue was partially solved with some amount of blurring. However, if the blur radius got too large, these artifacts became magnified, which makes sense: with a larger blur radius, more pixels are affected by the problem pixels. This was the main reason I decided to run multiple blur passes. Instead of using a large range of blur radius sizes, I decreased the range and ran more passes to achieve the same amount of blurring while keeping artifacts to a minimum. This helped for objects that were farther away, but I was unable to solve the issue as effectively for closer ones.

Next, I ran into problems with the alignment of the depth image, which unfortunately did not line up with the color image perfectly. This was especially noticeable near edges, where, for example, part of the wall behind an object would be perceived to be at the same depth as that object (see the exaggerated example figure below). It was not fixable via a simple translation, as the discrepancy seemed to differ depending on the location in the scene, and it was especially large close to the camera. I tried a few different things to either fix or hide the problem, but I never really managed to find a good solution. You will notice in my results that some objects have areas near their edges that are either too blurred or too focused.