My project is divided into two parts. The first part takes a video, stored as an image sequence, as input and outputs another image sequence representing samples of the light field passing through a plane at evenly spaced intervals. Previous work in this area has relied on more expensive hardware setups, such as microlens arrays or multi-camera arrays. The second part of my project takes these light field samples and, using camera-based head tracking, displays the scene as it would appear from the viewer's head position.
In order to generate samples of the light field at evenly spaced intervals, we need to make several assumptions about the input video. First, we must assume that the scene is static. This ensures that the light field itself is not changing as we sample it. Second, we assume that the video camera is dollying from left to right and does not move up and down. This assumption allows us to arrange the samples in order and to ignore any vertical movement of the camera entirely. This is not an unrealistic assumption, especially given that our goal is to display these scenes in a head-tracked environment, where there is far more lateral movement than vertical movement. An example of an input video is given below.
This video is then fed into a MATLAB program. The program prompts the user to select a point in the first frame, which will be tracked in all subsequent frames. The depth of this point in the scene determines the depth of the plane through which the light field will be sampled. In general, this point should be as close to the camera as possible while still being visible in all of the frames. A small window is created around the selected point. For each subsequent frame, we use that window as a template and compute the normalized cross-correlation; the location of the peak gives the new position of the point. If the point has moved a significant amount from one frame to the next, we also update the template to reflect the change in perspective.
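The tracker itself lives in the MATLAB program, but the same idea can be sketched with OpenCV, which the second part of the project already uses. This is a minimal sketch, not the original implementation; the window size and the movement threshold for refreshing the template are assumptions, and boundary checks are omitted for brevity.

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

// Track a small window around a user-selected point across frames using
// normalized cross-correlation. halfWin and updateThresh are illustrative
// values, not the original program's settings.
std::vector<cv::Point> trackPoint(const std::vector<cv::Mat>& frames,
                                  cv::Point start,
                                  int halfWin = 15,
                                  double updateThresh = 5.0) {
    std::vector<cv::Point> positions{start};
    cv::Rect win(start.x - halfWin, start.y - halfWin,
                 2 * halfWin + 1, 2 * halfWin + 1);
    cv::Mat templ = frames[0](win).clone();

    for (size_t i = 1; i < frames.size(); ++i) {
        cv::Mat response;
        cv::matchTemplate(frames[i], templ, response, cv::TM_CCOEFF_NORMED);
        cv::Point best;  // top-left corner of the best match
        cv::minMaxLoc(response, nullptr, nullptr, nullptr, &best);
        cv::Point center(best.x + halfWin, best.y + halfWin);

        // If the point moved significantly since the last frame, refresh
        // the template to reflect the change in perspective.
        if (std::hypot(center.x - positions.back().x,
                       center.y - positions.back().y) > updateThresh) {
            templ = frames[i](cv::Rect(best.x, best.y,
                                       2 * halfWin + 1,
                                       2 * halfWin + 1)).clone();
        }
        positions.push_back(center);
    }
    return positions;
}
```

cv::TM_CCOEFF_NORMED is one of OpenCV's normalized correlation modes; in the MATLAB pipeline, a function such as normxcorr2 would play the same role.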
Once the point has been located in each frame, the frames are arranged based on their horizontal translation. The translations are rounded to integer pixel values, and duplicate frames are then removed. The user is then shown the middle frame in the sequence and asked to select a rectangle, which corresponds to the virtual window through which we will measure the light field. The final output of the program is an image sequence representing samples of the light field through that virtual window at many different horizontal translations. An example is shown below.
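A sketch of this arrangement step, continuing from the tracking sketch above: frames are keyed by their integer horizontal translation, so duplicates collapse automatically into one entry per offset, and each surviving frame is cropped to the virtual window. Whether the crop rectangle should shift with each frame's translation, as it does here, depends on how the window is registered to the sampling plane; treat that detail as an assumption.

```cpp
#include <opencv2/opencv.hpp>
#include <map>
#include <vector>

// Arrange frames by integer horizontal translation (inserting into a
// std::map orders them left to right and drops duplicate offsets),
// then crop each frame to the user-selected virtual window.
std::vector<cv::Mat> arrangeSamples(const std::vector<cv::Mat>& frames,
                                    const std::vector<cv::Point>& positions,
                                    cv::Rect window) {
    std::map<int, cv::Mat> byTranslation;
    for (size_t i = 0; i < frames.size(); ++i) {
        // Translation relative to the first frame (already integer here;
        // a subpixel tracker would round at this point).
        int tx = positions[i].x - positions[0].x;
        byTranslation.emplace(tx, frames[i]);  // keeps one frame per offset
    }

    std::vector<cv::Mat> samples;
    for (const auto& [tx, frame] : byTranslation) {
        // Shift the crop with the tracked point so it covers the same
        // region of the sampling plane in every frame.
        samples.push_back(frame(window + cv::Point(tx, 0)).clone());
    }
    return samples;
}
```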
The second part of my project is focused on recreating the above scenes as interactive visualizations. To do this, I wrote a program in C++ using OpenGL and OpenCV. The inputs to this program are a sampling of the light field passing through a plane at different horizontal translations, as well as the head position of the viewer. To determine the head position, I used four OptiTrack cameras positioned at the corners of a large 5x8-foot display.
Each camera emits infrared light, which reflects off of infrared markers mounted in "constellations." The cameras report the positions of each marker to a server, which determines the position and orientation of each constellation. This information is then broadcast to my program using VRPN. Given the horizontal head translation, we then display the correct image on the screen as it would be seen from that particular head position. Because each scene is different, the incoming head positions must be offset and scaled by amounts that can be tweaked by the user until the proper effect is achieved.
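As a sketch of how the head position drives the display: a VRPN client registers a callback that receives tracker reports, and the horizontal component of the position, after the per-scene offset and scale, selects which sample to draw. The tracker address and the default offset and scale values below are assumptions.

```cpp
#include <vrpn_Tracker.h>
#include <algorithm>
#include <cmath>

struct Viewer {
    int numSamples;       // number of light field samples
    double offset;        // user-tweakable, per scene
    double scale;         // user-tweakable, per scene
    int currentIndex = 0; // sample currently on screen
};

void VRPN_CALLBACK handleTracker(void* userData, const vrpn_TRACKERCB t) {
    Viewer* v = static_cast<Viewer*>(userData);
    // Map the horizontal head translation (t.pos[0]) to a sample index.
    // Offset and scale are tuned per scene until the parallax looks right.
    int idx = static_cast<int>(std::lround((t.pos[0] - v->offset) * v->scale));
    v->currentIndex = std::clamp(idx, 0, v->numSamples - 1);
}

int main() {
    Viewer viewer{/*numSamples=*/100, /*offset=*/0.0, /*scale=*/1.0};
    vrpn_Tracker_Remote tracker("Tracker0@localhost");  // assumed address
    tracker.register_change_handler(&viewer, handleTracker);
    while (true) {
        tracker.mainloop();  // delivers queued tracker reports
        // ... render the sample at viewer.currentIndex with OpenGL ...
    }
}
```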
Below are some videos created by attaching the head tracking markers to the camera itself and then moving it horizontally in front of the screen. This achieves an effect very similar to looking through a window into the actual scene.