The goal of this project is to fill a masked region of an input image with content from one or more images that have been deemed related. My approach is to find the region in each matched image that best fills the hole in the input, composite the two, and score the composites. Each matched image is first scaled to the same size as the input image. I then use a Gaussian pyramid built from Canny edge-detected images to find the relative translation that best aligns the two images. Pixels near the masked area of the input image are weighted more heavily than pixels far from it, using a Gaussian peaked at the center of the masked region. The masked region itself is ignored during alignment. Once the best alignment has been found, I use a graph cut to find the best seam along which to combine the images. Every pixel that is fifty pixels or more from the border of the masked region is constrained to come from the input image, and every pixel falling under the mask is constrained to come from the matched image. This forces the graph cut to fill the hole completely while allowing it to cut a small distance into the input image to find a better seam. Finally, I use Poisson blending to composite the match with the input. Composites are scored on the cost of the seam and the size of the shift required for alignment, with the two factors weighted equally.
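The weighted alignment step described above can be sketched as follows. This is a minimal single-scale illustration, not the project's actual code: the function names, the `sigma` and `max_shift` parameters, the use of a weighted sum of squared differences as the cost, and the exhaustive search are all assumptions. A real implementation would run this search coarse-to-fine over the Gaussian pyramid and would handle image borders explicitly instead of letting `np.roll` wrap around.

```python
import numpy as np

def gaussian_weights(mask, sigma):
    """Weights peaked at the centroid of the mask, zero inside the mask.

    mask is a boolean array, True where the input image is missing.
    """
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    yy, xx = np.mgrid[0:mask.shape[0], 0:mask.shape[1]]
    w = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))
    w[mask] = 0.0  # masked pixels are ignored during alignment
    return w

def best_shift(input_edges, match_edges, mask, sigma=20.0, max_shift=4):
    """Exhaustively search for the (dy, dx) translation of match_edges
    that minimizes the Gaussian-weighted squared difference (assumed cost)."""
    w = gaussian_weights(mask, sigma)
    best, best_cost = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # np.roll wraps pixels around the border; acceptable for a sketch
            shifted = np.roll(match_edges, (dy, dx), axis=(0, 1))
            cost = np.sum(w * (input_edges - shifted) ** 2)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost
```

Here `input_edges` and `match_edges` stand for the edge maps of the input and the matched image at one pyramid level. The returned shift and its cost would then feed into the composite score alongside the seam cost.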
Scene completion fails when the mask cuts through objects. Completing half a person is hopeless for an algorithm that has no high-level understanding of the contents of the scene. It works amazingly well for fluids (sky, water, etc.), where it can match against similar fluids and blend them.
Extra Credit: Poisson Fill comparison
I also implemented the pure Poisson fill extra credit. The results from scene completion are universally better than the Poisson fill results. The lack of texture in the filled regions is ruinous. Scene completion fails when the mask cuts through objects, but Poisson fill just looks like there was a greasy spot on the camera lens. Both approaches work reasonably well for fluids, but a smeared, textureless fill is more noticeable than slightly choppy texture, so scene completion is superior there as well.
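To illustrate why the Poisson fill has no texture, here is a minimal sketch. It assumes that "pure Poisson fill" means solving Laplace's equation inside the hole with the surrounding pixels as boundary values, which may differ from the exact formulation used in the project. The function name `laplace_fill`, the `iters` parameter, and the choice of plain Jacobi iteration are illustrative assumptions. The sketch works on a single channel; a color image would be filled one channel at a time.

```python
import numpy as np

def laplace_fill(image, mask, iters=2000):
    """Fill masked pixels of a 2D image by Jacobi iteration on Laplace's equation.

    Each masked pixel is repeatedly replaced by the average of its four
    neighbours, so the result is a smooth interpolation of the hole boundary.
    """
    out = image.astype(float).copy()
    out[mask] = out[~mask].mean()  # initial guess inside the hole
    for _ in range(iters):
        # Edge padding so that pixels on the image border have four neighbours
        p = np.pad(out, 1, mode="edge")
        avg = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0
        out[mask] = avg[mask]  # known pixels are never modified
    return out
```

Because every filled pixel ends up as the average of its neighbours, the solution can contain no detail that is not already implied by the boundary, which is consistent with the smeared look described above. Jacobi iteration converges slowly on large holes; a sparse linear solver would be the more practical choice there.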