CS 195-G: Scene Completion
The goal of this project was to implement the local context matching phase from the paper Scene Completion Using Millions of Photographs.
Algorithm
This algorithm completes unwanted regions of an image using material from another similar image. In the example below, the unwanted region is shown in black.
![]() The original image |
![]() The scene to be completed |
![]() The similar image to use material from |
![]() The trivial composite |
The first step in completing an image is to align the two images as best as possible. The similar image is aligned to the hole using the information just outside the hole. Alignment error is calculated as the sum of squared differences between the two images weighted by the alignment mask, which is one just outside the hole and zero 50 pixels away from the hole. Chessboard distance and an image pyramid are used for speed.
![]() The alignment mask |
![]() The masked original image |
![]() The masked aligned image |
The next step in completing an image is to expand the hole using a graph cut. Graph cut was restricted to the area where the alignment mask was non-zero (from one to 50 pixels away from the original hole). Both regular sum of squared differences and sum of squared gradient differences were used as error metrics, and the one with the minimum flow was chosen. Comparing the gradients instead of the pixels was proposed as a better error metric in the paper, because it cuts through smooth regions where color changes are less noticeable.
![]() The original hole |
![]() The graph cut hole |
![]() The smooth graph cut hole |
![]() The original hole composited |
![]() The graph cut hole composited |
![]() The smooth graph cut hole composited |
The final step is to hide the seams at the edge of the hole using poisson blending.
![]() Poisson blending the original hole |
![]() Poisson blending the graph cut |
![]() Poisson blending the smooth graph cut |
This algorithm was tried on three different scales of the similar image (0.81, 0.9, and 1.0 as suggested in the paper) and the one with the best score was chosen. The score for each completion was one tenth the alignment error plus ten times the graph cut flow.
Problems
This algorithm does have some disadvantages. Scale is not considered at all, which sometimes creates mountain sized people or brick sized cars. Also, since humans are very sensitive to straight lines, a very slight change in perspective can make what would otherwise be very good scene completion look horribly wrong. In addition, the scene completion algorithm does not make any attempt to preserve whole objects, which often results in disconcertingly headless people.
![]() Mismatched scales |
![]() Mismatched perspective |
![]() Partial people |
Results
Each row below contains the original image, a simple poisson fill, and the best three scene completions. The scene completions are more visually pleasing than the poisson fill in virtually all cases because the novel textures provided by scene completion are far better than the absence of texture due to the poisson fill.