image completion teaser

Project 3: Image Completion

Patch-based image completion in the spirit of Criminisi et al. with graph cut seam finding to composite texture patches (Kwatra et al., website).

Due Date: 11:59pm on Wednesday, March 2nd, 2011

Brief

Requirements

You are required to improve the baseline image completion algorithm provided by the stencil code. You must also apply your algorithm to several of your own images to get full credit.

Details

MATLAB stencil code is available in /course/cs129/asgn/proj3/stencil/. You're free to do this project in whatever language you want, but the TAs are only offering support in MATLAB. The stencil code for this project is unusually extensive, offering a (poorly) working image completion algorithm which you will have to improve in specific ways. One of the motivations of this extensive starter code is to give you an algorithm that runs reasonably fast, thus encouraging experimentation with different improvements. There are numerous test cases (photo + mask pairs) provided from the Scene Completion test set, as well as some synthetic test cases, and you can make your own test cases using the interactive function getmask.m.

Image Completion Overview

Image completion is one of the most elementary yet challenging image manipulation operations. Numerous photo editing operations (deleting, moving, or resizing an object; removing imaging artifacts; filling gaps in panoramas) require us to convincingly hallucinate image data. Research on image completion has been active since 2000, sometimes under names such as inpainting and hole filling, and in the last year mainstream tools such as Photoshop have offered their own implementations.

This project will develop a fairly standard greedy image completion pipeline as in Criminisi et al. The algorithm will incrementally fill the incomplete image region using patches of texture found elsewhere in the image. The inner loop of such an algorithm must perform three critical operations: The focus of Criminisi et al. is largely on operation 1. The focus of Kwatra et al. is largely on operation 3. We will combine these approaches. Unlike Criminisi et al. which uses tiny image patches, we will use relatively large image patches and composite them with the graph cut seam finding operation of Kwatra et al. We will address each of these steps in detail below.

1. Hole-filling Priority

Deciding the fill order is a surprisingly challenging aspect of image completion. This is because the fill order has a strong influence on the layout of synthesized scene elements. How do we pick which region to fill at any given iteration?

One reasonable constraint is to only fill regions on the border of the hole -- you certainly wouldn't want to try and fill a region that was completely masked (you'd have no known pixels to condition the search on, so it would basically amount to picking a random patch) and you wouldn't want to fill a region that was completely known. The baseline method in the starter code picks the image region that is most complete but whose center pixel is unknown.

Another baseline, sometimes referred to as "concentric" or "onion peel" ordering, is to fill regions in order based on their distances from the initial boundary. As this ordering is a simple function of the initial mask, it doesn't need to be computed in the main loop. It can instead be predetermined.

Criminisi fill priority terms However, as shown by Criminisi et al. (see Section II), one can improve image completion quality significantly with an image aware priority algorithm. Specifically, Criminisi et al. prioritizes the filling such that structures are propagated first. The structures we're interested in are lines (isophotes) abutting the incomplete region. These lines might correspond to edges of buildings, roads, or the image horizon. Regardless, it is perceptually important that these features remain unbroken and synthesizing them first encourages this.

More formally, Criminisi et al. define the patch priority as the product of two terms, C(p) and D(p), where C(p), the "confidence" term, indicates how complete and trustworthy the neighborhood around p is, and D(p), the "data" term, indicates how much region p contains linear structures pointed towards the unknown region boundary. For each hole-filling iteration, the priority is computed for all potential image fill regions and the region with the highest priority is selected. See Figure 6 in Criminisi et al. for a visualization of these two terms.

C(p) is defined within the entire unknown region. Criminisi et al. define C(p) to encourages concentric ordering, but their method requires storing and updating additional variables. One simple alternative is to define C(p) as the proportion of already known pixels for each possible patch center p (exactly what the starter code does).

D(p) is only defined along the image boundary. It can be computed as the magnitude of the dot product between the boundary direction and the direction of maximum image gradient. See Figure 5 in Criminisi et al. Measuring the image gradient near the boundary is tricky -- don't let the boundary itself influence your computations. You should only trust measurements over neighborhoods that are entirely known (not under the mask).

For this project, graduate students are required to implement some form of image-aware, structure-based hole filling priority and compare it to the baseline priority algorithm. Undergraduate students may do so for extra credit.

2. Search for Matching Patch

Finding the most appropriate matching patch for a query region is fundamentally hard -- it is implicitly a "vision hard" recognition problem. But within the restricted domain of a single photograph SSD matching can do quite well. The baseline search method implemented in the stencil code is a coarse-to-fine SSD search just like project 1. You are not required to improve this aspect of the algorithm, although there are possible improvements to make (see extra credit).

3. Compositing the Patch to fill the Hole

Each iteration, your image completion algorithm must composite a matching patch and an incomplete query region to fill in missing image data. The simplest way to do this is to copy pixels from the patch in to the hole, unmodified, and make no changes to the known pixels in the query region. This is the strategy adopted by Criminisi et al. and the starter code. It sometimes works for easy test cases and small patch sizes, but you can do better! Larger patch sizes and more sophisticated compositing algorithms will produce more convincing results.

If you've taken a graphics course at Brown, you've probably heard of Seam Carving. You may have even implemented it. That algorithm uses dynamic programming to find the minimum error seam of an image along a specific dimension, and then remove it. Dynamic programming has been used in texture synthesis, most famously in Image Quilting, but you're in for a world of hurt if you try to use dynamic programming for image completion. It is difficult to use dynamic programming to find optimal seams when you need to extract closed regions or when the topology of the constraints (which pixels are known and which are missing) is complex.

We take inspiration from texture synthesis method Graphcut Textures (website), which extends textures by overlaying patches and finding the minimum error seam with a max-flow/min-cut operation on a specially constructed graph. It might be easier to think of this graph cut operation less as a seam finding process and more as a labeling process. The output of this labeling process will tell us, at every overlapped pixel, whether we use the existing pixel or we keep the pixel from the matched patch. Depending on the shape of the known and unknown regions and depending on how you construct the graph the solution to this labeling might not be a closed contour, but that's fine.

graph cut

The graph cut compositing algorithm works by creating a (mostly) planar graph, with a node corresponding to each pixel, and with weighted edges connecting all 4-way neighbors (Kwatra et al. calls these "constraint arcs"). The min-cut through this graph will directly imply the labeling / seam.

The neighbor edge weights encode how important it is for two neighbors to share the same label. For instance, if an edge weight is small, that means the min-cut can break the edge and pay little penalty. If the edge weight is large, the min-cut is likely to go between other nodes, instead. When should the weight be small? When a seam passing between two pixels wouldn't be perceptible, because the two images agree at that location.

As with all max-flow/min-cut formulations, there are two special terminal nodes -- the source and the sink. Pixels that must come from the matched patch (e.g. the pixels in the unknown region) are connected to one of these terminal nodes with very high weight, and pixels that must come from the existing, known pixels (e.g. those under the patch boundary) are connected to the other terminal node. For these special cases the label is predetermined, but for all of the other pixels the graph cut algorithm will tell us which label to use.

Kwatra et al. suggests making edge weight proportional to the following sum of squared difference formulation: graph cut You will need to represent these edge weights in an adjacency matrix somewhat reminiscent of what you used in project 2 -- the matrix will have patch_width^2 rows and patch_width^2 columns, and each entry will encode the affinity or edge weight between two pixels. But the vast majority of entries will be zero -- only neighboring pixels will have non-zero affinity.

You will also need to represent the source and sink edge weights in this problem (sometimes referred to as unary weights). One solution is to use a mask that signifies that any pixel under the mask must come from that image. The terminal weight for those pixels is set to infinity (or a large value). Once formulated, you can solve the max-flow/min-cut problem and the output should be a labeling of the pixels as belonging to one side or the other. This is effectively a new mask for the image that you can use for compositing.

The stencil includes a mex compiled max-flow/min-cut algorithm that runs fast and requires a specific type of input. The source code for this can be found here: Maxflow. You are not responsible for implementing a max-flow/min-cut algorithm; this course is not cs157

Other Avenues for Improvement (Possible Extra Credit)

There are a litany of possible improvements over the baseline algorithm beyond the required improvements (graph cut compositing for undergrads, and additionally image-aware prioritization for grad students). Here are some suggestions, broken down by which part of the algorithm they (hopefully) improve.

0. Miscellaneous Improvements

1. Improvements to Priority

2. Improvements in Search

3. Improvements in Compositing

Write up

For this project, just like all other projects, you must do a project report in HTML. In the report you will describe your algorithm and any decisions you made to write your algorithm a particular way. Then you will show and discuss the results of your algorithm. Also discuss any extra credit you did. Feel free to add any other information you feel is relevant.

Graduate Credit

As stated above, graduate students must implement an image-aware priority algorithm.

Handing in

This is very important as you will lose points if you do not follow instructions. Every time after the first that you do not follow instructions, you will lose 5 points. The folder you hand in must contain the following:

Then run: cs129_handin proj3
If it is not in your path, you can run it directly: /course/cs129/bin/cs129_handin proj3

Rubric

Final Advice

Credits

Project inspired by Criminisi et al. and Kwatra, et al.. Project description and code by James Hays and Pat Doran.

Good Luck!