May 18, 2011
The idea of this project was to automatically create panoramas of any length. Using two different collections of images, I was able to produce reasonable hallucinated panoramas composed of images that were not captured in the traditional panoramic set-up, i.e. from a single static location with the camera rotated about a fixed point. In the spirit of the work of Kaneva et al. on infinite images, I used this hallucinated panorama pipeline to create an automatic mural of my vacation photos. The result was a unique pictorial summary of my vacation.
The first dataset I explored was the SUN dataset (Xiao et al.). This huge set of scene images was well-suited for this project because the images show whole scenes rather than close-ups of people or things. Because panoramas, automatic or not, are inherently images of a scene, using a database of scene images made sense.
The second set of images I used was from a vacation I took to Istanbul, Turkey. I selected 35 photos from my set of vacation pictures. These images were selected because they were images of scenes, were visually appealing, and often contained the people I was visiting.
To create a successful panorama, consecutive images had to resemble each other. The panorama was constructed in a loop: the first image was chosen at random from the dataset, and each following image was selected because it lined up well with the previous one. Two visual features were used to select the best-match image for the next step in the panorama.
The Gist feature is a global image feature that characterizes several important statistics about a scene (Oliva and Torralba). The Gist feature can encode the amount or strength of vertical or horizontal lines in an image, which, for example, can help to match scenes with similar horizon lines, textures, or buildings in them.
The Gist feature is computed by convolving an oriented filter with the image at several different orientations and scales. This way the high- and low-frequency repetitive gradient directions of an image can be measured. The scores for the filter convolution at each orientation and scale are stored in an array, which is the Gist feature for that image.
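As a rough illustration of the computation described above, the following Python sketch builds a small bank of Gabor kernels and records one response per orientation and scale. It is not the MATLAB code used in this project: the function names, kernel size, wavelengths, sigma, spatial-domain filtering, and pooling by mean absolute response are all assumptions made for the example.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma):
    """Real (cosine) Gabor kernel as a list of rows, shifted to zero mean.
    `size` is assumed to be odd."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def filter_energy(image, kernel):
    """Mean absolute filter response over all positions where the kernel fits."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            response = sum(kernel[a][b] * image[i + a][j + b]
                           for a in range(k) for b in range(k))
            total += abs(response)
            count += 1
    return total / count

def gist_like_descriptor(image, n_orientations=8,
                         wavelengths=(2.0, 3.0, 4.0, 6.0),
                         size=7, sigma=2.0):
    """One energy value per (scale, orientation) pair: 4 x 8 = 32 values.
    `image` is a grayscale image given as a list of rows of floats."""
    descriptor = []
    for wavelength in wavelengths:          # assumed scales
        for o in range(n_orientations):
            theta = math.pi * o / n_orientations
            kernel = gabor_kernel(size, wavelength, theta, sigma)
            descriptor.append(filter_energy(image, kernel))
    return descriptor
```

On an image of vertical stripes, the entry for the filter that oscillates horizontally should come out larger than the entry for the filter that oscillates vertically at the same scale, which is the kind of orientation statistic the descriptor is meant to capture.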
In this project, the Gist features for the entire SUN dataset were pre-computed, and I used these pre-computed descriptors when working with the SUN images. Because each new image was matched only to the image directly before it, no new Gist descriptor had to be computed for the panorama at an intermediate state of construction.
The vacation images did not have Gist descriptors, so I computed them with the same method used for the SUN dataset. The code for this computation was available at Jianxiong Xiao's website; I used version 2 of the code. The MATLAB functions I wrote to calculate the new Gist descriptors are in computNovelGists.m and gistCalculator.m.
Gist descriptors were calculated using a Gabor filter at 8 orientations and 4 scales. Figure 1 shows the Gabor filters used to calculate the Gist descriptor. The Gist descriptors were calculated on the grayscale version of the subject image. Figure 2 shows an example image in grayscale, its pre-filtered version used to improve the quality of the Gist calculation, and the resulting Gist descriptor for that image. In the next section I will discuss how color was used as a discriminating feature.
Figure 1: Gabor Filter used to calculate Gist Descriptor
Figure 2: Example Gist Descriptor for one of my images
The Gist descriptor was appropriate for selecting matching images that have a similar spatial composition. To improve the likelihood that the match image really did match the search image, I also used a color feature, Tiny Images. Tiny Images uses 32x32 versions of the original images as a comparison feature (Torralba et al.). Because I had already used Gist features to ensure an approximate spatial match, I only wanted to make sure that the color transition between images would be smooth. Tiny Images seemed like a lightweight way to access global color information. The method I wrote for tiny image comparison is in pickBestColorMatch.m.
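A tiny-image comparison of this kind can be sketched in a few lines of Python. This is not the contents of pickBestColorMatch.m: the function names, the block-averaging downsampler, and the sum-of-squared-differences distance are assumptions chosen to keep the example short.

```python
def to_tiny(image, size=32):
    """Shrink an RGB image (rows of (r, g, b) tuples) to size x size
    by averaging the source pixels that fall into each output cell."""
    h, w = len(image), len(image[0])
    tiny = []
    for i in range(size):
        r0 = i * h // size
        r1 = max((i + 1) * h // size, r0 + 1)   # at least one source row
        row = []
        for j in range(size):
            c0 = j * w // size
            c1 = max((j + 1) * w // size, c0 + 1)
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(tuple(sum(p[ch] for p in block) / len(block)
                             for ch in range(3)))
        tiny.append(row)
    return tiny

def color_distance(tiny_a, tiny_b):
    """Sum of squared differences over all pixels and channels."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(tiny_a, tiny_b)
               for pixel_a, pixel_b in zip(row_a, row_b)
               for a, b in zip(pixel_a, pixel_b))

def best_color_match(query, candidates, size=32):
    """Index of the candidate whose tiny image is closest to the query's."""
    tiny_query = to_tiny(query, size)
    distances = [color_distance(tiny_query, to_tiny(c, size))
                 for c in candidates]
    return min(range(len(candidates)), key=distances.__getitem__)
```

Because every image is reduced to the same small grid, the comparison cost does not depend on the original image resolution, which is what makes the feature cheap to use as a second filter after Gist.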
First, a start image is selected at random.
Figure 3: Image selected at random to begin panorama (SUN Image | My Image)
Then the top 10 closest Gist matches are selected using the function findGistMatches.m.
Figure 4: 10 Best Gist matches (SUN Images | My Images)
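The shortlist step amounts to a nearest-neighbor search over descriptors. The sketch below is a Python illustration, not the contents of findGistMatches.m; the function names, the squared Euclidean distance, and the `exclude` parameter are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def top_gist_matches(query, descriptors, k=10, exclude=()):
    """Indices of the k descriptors closest to `query`, nearest first.
    Indices listed in `exclude` (e.g. the query image itself) are skipped."""
    candidates = [i for i in range(len(descriptors)) if i not in exclude]
    candidates.sort(key=lambda i: squared_distance(query, descriptors[i]))
    return candidates[:k]
```

For example, `top_gist_matches(descriptors[0], descriptors, k=10, exclude={0})` would return the ten nearest neighbors of image 0 while leaving image 0 itself out of the shortlist.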
From the best Gist matches, the best Tiny Image match is selected (pickBestColorMatch.m). This best match will be stitched together with the previous image using Graph Cut and Poisson Blending, discussed in the following sections.
Figure 5: Best color match (SUN Image | My Image)
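Taken together, the three steps above form a simple selection loop. The following Python sketch shows one way to organize it; it is not panorama.m. The function name, the distance callables passed in as arguments, and the choice never to reuse an image are assumptions, since the write-up does not say whether images may repeat.

```python
import random

def build_sequence(n_images, length, gist_distance, color_distance,
                   k=10, start=None, seed=None):
    """Return a list of image indices forming the panorama order.

    gist_distance(i, j) and color_distance(i, j) are caller-supplied
    functions returning a non-negative dissimilarity between images i and j.
    """
    rng = random.Random(seed)
    current = rng.randrange(n_images) if start is None else start
    sequence = [current]
    while len(sequence) < length:
        # Assumption: an image is used at most once per panorama.
        unused = [i for i in range(n_images) if i not in sequence]
        if not unused:
            break
        # Step 2: shortlist the k closest Gist matches to the previous image.
        shortlist = sorted(unused, key=lambda i: gist_distance(current, i))[:k]
        # Step 3: among those, keep the closest color match.
        current = min(shortlist, key=lambda i: color_distance(current, i))
        sequence.append(current)
    return sequence
```

Each new image is compared only with the image placed immediately before it, so the loop never needs a descriptor for the partially built panorama.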
To create a more convincing panorama, the transitions between images needed to be smoothed. Graph cut and Poisson blending were used to accomplish this.
To find the optimal seam at which to merge two images for the panorama, I used a graph cut implementation, augmenting the graph cut code from Project 3. This implementation of graph cut follows Kwatra et al. Using graph cut alone improved the transition between images, and the run-time of the panorama-building loop was on the order of 1 minute, depending on the length of the panorama. Below are some examples of panoramas using only graph cut. In panorama.m there is a commented line that can be turned on or off to construct the panorama using graph cut only.
Figure 5: Examples of Panoramas stitched using Graph Cut
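To give a feel for what seam finding does, the sketch below uses a dynamic-programming minimum-cost seam over the overlap region. This is a simpler stand-in and not the graph cut method used in this project: it only finds top-to-bottom seams that move at most one column per row, whereas graph cut can find arbitrary cuts. The function names and the squared-difference cost are assumptions.

```python
def min_cost_seam(left_overlap, right_overlap):
    """Given two equally sized grayscale overlap regions (lists of rows),
    return, for each row, the column where the seam passes."""
    h, w = len(left_overlap), len(left_overlap[0])
    # Per-pixel cost: how much the two images disagree at that pixel.
    cost = [[(left_overlap[r][c] - right_overlap[r][c]) ** 2
             for c in range(w)] for r in range(h)]
    # total[r][c] = cheapest cost of any seam from the top row down to (r, c).
    total = [cost[0][:]]
    for r in range(1, h):
        prev = total[-1]
        total.append([cost[r][c] + min(prev[max(c - 1, 0):min(c + 2, w)])
                      for c in range(w)])
    # Backtrack from the cheapest end point in the last row.
    seam = [0] * h
    seam[-1] = min(range(w), key=total[-1].__getitem__)
    for r in range(h - 2, -1, -1):
        c = seam[r + 1]
        neighbors = range(max(c - 1, 0), min(c + 2, w))
        seam[r] = min(neighbors, key=total[r].__getitem__)
    return seam

def merge_along_seam(left_overlap, right_overlap, seam):
    """Take pixels left of the seam from the left image, the rest from the right."""
    return [left_row[:c] + right_row[c:]
            for left_row, right_row, c in zip(left_overlap, right_overlap, seam)]
```

The seam follows pixels where the two images already agree, so the join is placed where it is least visible.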
To make the transitions between images smoother, Poisson blending was used. The Poisson blending function I used was adapted for this task from the code I wrote for Project 2 (Poisson image editing, Perez et al.). Using Poisson blending greatly increased the run-time, to the order of tens of minutes.
Figure 6: Example of Transition with Poisson Blending
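The idea behind Poisson blending can be shown on a single row of pixels: inside the blended region the result keeps the gradients of the source, while at the region's borders it must agree with the target. The Python sketch below is a 1D illustration solved with Gauss-Seidel iterations, not the 2D code from Project 2; the function name, the iteration count, and the solver choice are assumptions.

```python
def poisson_blend_1d(target, source, start, end, iterations=2000):
    """Blend source[start:end] into target along one row of pixels.

    Requires 1 <= start < end <= len(target) - 1, so that target[start - 1]
    and target[end] act as fixed boundary values.
    """
    result = [float(v) for v in target]
    for _ in range(iterations):
        for i in range(start, end):
            # Discrete Poisson equation at pixel i:
            #   f[i-1] - 2 f[i] + f[i+1] = s[i-1] - 2 s[i] + s[i+1]
            laplacian = source[i - 1] - 2 * source[i] + source[i + 1]
            result[i] = (result[i - 1] + result[i + 1] - laplacian) / 2.0
    return result
```

In 1D the solution is simply the source plus a linear ramp that absorbs the mismatch at the two borders. A 2D implementation has to solve a much larger sparse linear system with one unknown per blended pixel, which is consistent with the large increase in run-time noted above.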
The final panoramas from the SUN dataset images are not perfectly convincing, but they are more visually appealing than previous results I have seen. For short panoramas of about 3 images, the results often include images from very similar scenes, and the results are especially good for outdoor scenes.
For the vacation images, short panoramas often selected images taken in the same geographical location. This resulted in a visually appealing panorama that was much closer to reality. Take, for example, the two sets of sequential images of the Ferry Station (it has a sign with a red crescent and blue wings on top) and the European Side of Istanbul (the buildings have sepia roofs and there is a large crowd of people). These sets of two images make mini-panoramas that are beautiful and almost convincing. I thought the whole panorama of the vacation was a lovely way to visualize my time in Istanbul, even if it wasn't a realistic panorama.
To improve this project, I think that more features could be used to select the best-match images. The SUN Database paper reported much higher accuracy at selecting images of the same scene category when more features were used in combination.
To make this project into a user-friendly application, the Poisson blending stage would have to be greatly optimized.
"SUN Database: Large Scale Scene Recognition from Abbey to Zoo," Jianxiong Xiao, James Hays, Krista Ehinger, Aude Oliva, and Antonio Torralba, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010.
"Building the Gist of a Scene: The Role of Global Image Features in Recognition," A. Oliva and A. Torralba, Visual Perception, Progress in Brain Research, vol 155. 2006.
"Tiny images," Torralba, A. and Fergus, R. and Freeman, W.T., Tech. Rep. MIT-CSAIL-TR-2007-024. 2007.
"Texture optimization for example-based synthesis," Kwatra, V., Essa, I., Bobick, A., and Kwatra, N., ACM Trans. Graph., 795-802. 2005.
"Poisson image editing," Perez, P. and Gangnet, M. and Blake, A., ACM Transactions on Graphics (TOG), Vol 22, Num 3, 313-318, 2003.
"Infinite Images: Creating and Exploring a Large Photorealistic Virtual Space," Kaneva, B. and Sivic, J. and Torralba, A. and Avidan, S. and Freeman, W.T., Proceedings of the IEEE, Vol 98, Num 8, 1391-1407, 2010.