Project 5: Automated Panorama Stitching
Due Date: 9pm on Thursday, November 19th, 2020
Logistics
- Files: Github Classroom.
- Extra disk space: /course/cs1290_students
- Required files: code/, writeup/writeup.pdf (use the createSubmissionZip.py script).
- Hand-in process: Gradescope as a ZIP file. Please submit anonymized materials!
- Due: 9pm on Thursday, November 19th, 2020.
Overview
The general pipeline for this project involves the following steps:
- Define correspondences between the two images.
- Recover the homography using RANSAC.
- Warp one image using the recovered homography.
- Composite the images.
Details
Define correspondences between the two images
This part of the process is based on Brown et al.; it is strongly recommended that you read this paper. It is mostly already implemented for you in define_correspondences() in student.py.
- Harris Interest Point Detector (section 2) - is already done for you. Note: these interest points are being found in find_keypoints_and_correspondences(), as a step of extracting SIFT features for the images. Under the hood, SIFT uses the Harris corner detector at multiple scales.
- Adaptive non-maximal suppression (section 3) - loop through all initial interest points, and for each interest point compare the corner strength to all other interest points. Keep track of the minimum distance to a larger magnitude interest point (not necessarily larger, perhaps within 0.9 as large, see paper). After you've computed this minimum radius for each point, sort the list of interest points by descending radius and take the top N. You do not need to do this, but you are more than welcome to.
- Implement feature descriptor extraction (section 4) - As mentioned above, the provided code currently extracts SIFT features, which are scale invariant feature descriptors that are essentially a histogram of gradient magnitudes over a neighborhood. (I encourage you to read more about SIFT or come to office hours if you're interested.) You could extract simpler features, for example simply take all the intensity values in the neighborhood and slap them in a vector, and if you decide to do this, remember to normalize each of your feature vectors: subtract the mean and divide by the standard deviation. Regardless of the size of the patch sampled from the image, the descriptor should end up as an 8x8 patch.
- Implement feature matching (section 5) - This is the one step you *will* need to implement, because it's a good exercise. Numpy/scipy are your friends! Look at pairs of features and compute the distance between them. After distances between all pairs of points have been computed, you might have 500 interest points, and could consider each point mapped to its first nearest neighbor as a correspondence. But most of these correspondences are spurious. Use Lowe's "ratio test" and keep only correspondences where the ratio between the first and second nearest neighbor distances is below some threshold.
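The matching step above can be sketched in a few lines of NumPy. This is a minimal example, not the stencil code: the function name `match_features` and the assumption that descriptors arrive as rows of an array are my own.

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.8):
    """Nearest-neighbor matching with Lowe's ratio test (hypothetical helper)."""
    # all pairwise Euclidean distances, shape (N1, N2)
    dists = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    # indices of the two nearest neighbors in desc2 for each row of desc1
    nearest = np.argsort(dists, axis=1)[:, :2]
    matches = []
    for i, (j1, j2) in enumerate(nearest):
        # keep the match only if the best distance is clearly smaller
        # than the second-best distance (the ratio test)
        if dists[i, j1] < ratio * dists[i, j2]:
            matches.append((i, j1))
    return matches
```

A vectorized `cdist`-style distance computation like this avoids Python-level double loops, which matters once you have hundreds of interest points.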
Recover the homography using RANSAC
In this step, we recover a projective transformation H. H is a 3x3 matrix with 8 degrees of freedom. It is possible to solve for H from pairs of corresponding points, since q = Hp: each pair gives two linear equations, so 4 pairs give the 8 equations needed for the 8 unknowns of H. Here is a useful resource on projective mappings. If you want to implement this step yourself, your code will go in calculate_transform(). Currently, this is done with opencv's cv2.findHomography() function, which takes in two arrays of corresponding points from the two images, along with other parameters, and computes H.
With the automatic feature extraction, you will have an overdetermined system of equations for H. Unfortunately, some of the points may be false positives: points that matched under the threshold but are not actually matches. Even with an overdetermined system, outliers from false positives make a least-squares solution difficult to use (especially because, with a d^2 penalty, outliers are weighted more heavily). A robust way to solve this problem is the RANSAC algorithm, which examines random subsamples of matches in the hope that some subset will not be confused by outliers. Because you need to solve for 8 values, you will need 4 pairs of corresponding 2D points to create the transformation matrix. The TAs suggest 1000 iterations and an error tolerance of 0.5. This equates to iterating 1000 times: each time, fit a transform to 4 random pairs and count the number of inliers -- transformed points that land within an error of 0.5 (half a pixel) of their actual correspondence points. Return the homography with the most inliers. You may need to invert this homography for the later image warping functions.
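The loop described above might look like the following sketch. The direct-linear-transform solver and both function names are my own, not the stencil's (the stencil currently relies on cv2.findHomography for the equivalent computation):

```python
import numpy as np

def homography_from_pairs(src, dst):
    """Direct Linear Transform: two equations per correspondence (x,y) -> (u,v)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # the solution is the null-space vector: last row of V^T from the SVD
    _, _, vt = np.linalg.svd(np.asarray(A))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def ransac_homography(src, dst, iters=1000, tol=0.5):
    """Keep the homography (fit to 4 random pairs) with the most inliers."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    best_H, best_inliers = None, 0
    rng = np.random.default_rng(0)
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = homography_from_pairs(src[idx], dst[idx])
        # project all source points and measure reprojection error
        p = np.c_[src, np.ones(len(src))] @ H.T
        proj = p[:, :2] / p[:, 2:3]
        err = np.linalg.norm(proj - dst, axis=1)
        inliers = np.sum(err < tol)
        if inliers > best_inliers:
            best_H, best_inliers = H, inliers
    return best_H
```

Note the division by the third homogeneous coordinate when projecting: forgetting it is a common bug, since q = Hp is only defined up to scale.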
Warp one image to the other using the recovered homography
This is simply a matter of using the homography matrix to transform each of the points in image A into the coordinate system of image B. However, a bit of extra work is needed to make sure both images are fully visible post-transformation: you need to compute the expected size of the output image and potentially translate the warped images so that they do not get cropped (which happens when they fall outside the bounds of the original image, particularly at negative coordinates). We'll walk you through it in warp_images().
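The bounds computation can be sketched by transforming just the four image corners. This is a simplified stand-in for what warp_images() walks you through, and the function name `warp_extent` is my own:

```python
import numpy as np

def warp_extent(H, h, w):
    """Bounding box of an h-by-w image's corners under H, plus the
    translation that shifts negative coordinates back into view."""
    corners = np.array([[0, 0, 1], [w, 0, 1], [0, h, 1], [w, h, 1]], float)
    p = corners @ H.T
    xy = p[:, :2] / p[:, 2:3]  # back to inhomogeneous coordinates
    x_min, y_min = np.floor(xy.min(axis=0)).astype(int)
    x_max, y_max = np.ceil(xy.max(axis=0)).astype(int)
    # composing T @ H before warping keeps the whole image in frame
    T = np.array([[1, 0, -x_min], [0, 1, -y_min], [0, 0, 1]], float)
    return (x_max - x_min, y_max - y_min), T
```

Only the corners need to be transformed because a homography maps the image quadrilateral to another quadrilateral, so the warped extent is determined entirely by where the corners land.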
Composite
Finally, you will need to composite the two aligned images. As we have learned in this class, there are many ways of compositing, some smarter and producing better results than others, but almost all involving some sort of masking, alpha, and addition operations. Refer to the compositing lab for a refresher.
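As one of the simpler options, you could average the two aligned images wherever both have content. This sketch (the function name and the zero-pixels-mean-empty assumption are mine) is the kind of masking-plus-addition approach the lab covers; seam finding or blending would improve on it:

```python
import numpy as np

def composite(warped_a, warped_b):
    """Average the two aligned images where they overlap; elsewhere,
    keep whichever image has content. Assumes all-zero pixels are empty."""
    mask_a = (warped_a.sum(axis=-1, keepdims=True) > 0).astype(float)
    mask_b = (warped_b.sum(axis=-1, keepdims=True) > 0).astype(float)
    weight = mask_a + mask_b
    weight[weight == 0] = 1.0  # avoid dividing by zero outside both images
    return (warped_a * mask_a + warped_b * mask_b) / weight
```

A hard average like this leaves a visible seam when exposures differ; feathering the masks toward the overlap boundary is a cheap improvement.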
Write up
As always, we expect you to write up a report of your code, including an explanation of what the code is doing. Please include images of stitched panoramas you make, including one with your own original photos.
Hand in
You will lose points if you do not follow instructions; each infraction after the first will cost 5 points. The folder you hand in must contain the following:
- code/ - directory containing all your code for this assignment
- writeup/writeup.pdf - your report as a PDF file generated with LaTeX.
- results/ - directory containing the outputted panoramas
To create the ZIP file to submit, run:
>> python createSubmissionZip.py
from the command line. Make sure you are in the root project directory, and that you keep the name writeup.pdf for your report. Upload the generated ZIP file to Gradescope.
Extra Credit
You are free to do whatever extra credit you can think of, but here are a few examples.
- (up to +10) Implement your own Harris corner detector to extract interest points
- (up to +10) Implement the RANSAC algorithm to compute the homography.
- (up to +5) Instead of projecting your mosaic onto a plane, project it onto a sphere or cylinder.
- (up to +5) Creative use of image warping and compositing: add graffiti to a wall, project a movie onto a wall, etc.
- (up to +5) Video panorama: combine videos taken from stationary locations
- (up to +5) Add multiscale processing for corner and feature detection.
- (up to +5) Add rotation invariance to the descriptors
- (up to +10) Panorama recognition: given a set of images that might form panoramas, automatically discover and stitch them together.
- (up to +10) Automatically create "globe" panoramas.
- (up to +5) Extrapolate photo boundaries to fill in missing areas of panoramas. Texture synthesis approaches might work well enough.
- (up to +5) Make the image compositing more seamless. For instance, using seam finding and Poisson blending.
Final Advice
- Have fun with this and be creative with your images
Credits
Project derived from Alexei A. Efros' Computational Photography course, with permission.
Project adapted for Python for the 2020 Fall term by Megan Gessner