Automatic color aligning and compositing of the Prokudin-Gorskii photo collection

Project 1: Image Alignment with Pyramids

Logistics

Background

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) was a photographer ahead of his time. He saw color photography as the wave of the future and came up with a simple idea to produce color photos: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter, and then project the monochrome pictures with correctly colored light to reproduce the color image. Color printing of photos was very difficult at the time. Due to the fame he received from his color photos, including the only color portrait of Leo Tolstoy (a famous Russian author), he won the Tzar's permission and funding to travel across the Russian Empire and document it in 'color' photographs. His RGB glass plate negatives were purchased in 1948 by the Library of Congress. They are now digitized and available online.

Requirements

Take the digitized Prokudin-Gorskii glass plate images and automatically produce a color image with as few visual artifacts as possible. Your program should:

Potentially useful functions: skimage.transform.resize() or equivalent.

Forbidden functions: Anything that builds an image pyramid for you. Write your own function which blurs an image and then subsamples the pixels.
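As a rough illustration of what "blur, then subsample" can look like, here is a minimal NumPy-only sketch. The function names (blur_and_subsample, build_pyramid), the 5-tap binomial kernel, and the min_size stopping rule are all assumptions made for this example; they are not part of the stencil code, and other blur kernels would work as well.

```python
import numpy as np


def blur_and_subsample(image: np.ndarray) -> np.ndarray:
    """Blur a 2-D grayscale image, then keep every other pixel.

    Hypothetical helper: the 5-tap binomial kernel is an assumed choice,
    used here as a cheap approximation of a Gaussian blur.
    """
    kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    height, width = image.shape
    # Reflect-pad by 2 pixels so the filtered output keeps the input size.
    padded = np.pad(image.astype(np.float64), 2, mode="reflect")
    # Separable filtering: horizontal pass first, then vertical pass.
    horizontal = sum(w * padded[:, i:i + width] for i, w in enumerate(kernel))
    blurred = sum(w * horizontal[i:i + height, :] for i, w in enumerate(kernel))
    # Subsample by a factor of 2 in each dimension.
    return blurred[::2, ::2]


def build_pyramid(image: np.ndarray, min_size: int = 32) -> list:
    """Return [full resolution, half, quarter, ...], finest level first.

    Stops before the shorter side would drop below `min_size` (an assumed
    stopping rule for this sketch).
    """
    levels = [image.astype(np.float64)]
    while min(levels[-1].shape) // 2 >= min_size:
        levels.append(blur_and_subsample(levels[-1]))
    return levels
```

Blurring before subsampling matters because dropping pixels from an unblurred image aliases fine detail into the coarser levels, which can mislead the alignment search.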

Assumptions

Python stencil code is available in code/. You're free to complete this project in any language, but the TAs will only offer support in Python.

Write up

Describe your process and algorithm, show your results as images and output values (e.g., alignment vectors, speed-up factors), describe any extra credit and show its effect, tell us any other information you feel is relevant, and cite your sources. We provide you with a LaTeX template in writeup/writeup.tex. Please compile it into a PDF and submit it along with your code. In class, we will present the project results for interesting cases.

Task: Submit writeup/writeup.pdf

We conduct anonymous TA grading, so please don't include your name or ID in your writeup or code.

Extra Credit

Although the color images resulting from this automatic procedure will often look strikingly real, they are still not nearly as good as the manually restored versions available on the LoC website and from other professional photographers. However, each photograph takes hours of manual Photoshop work, such as adjusting the color levels, removing blemishes, or adding contrast. Can you come up with ways to address these problems automatically? Feel free to devise your own approaches or talk to the Professor or TAs about your ideas. There is no right answer here; just try things out and see what works.

Here are some ideas, but we will give credit for other clever ideas:

Hand in

Use Gradescope to submit your repo directly. When creating a submission, you will be asked to upload your repo. Please do not commit the image files back to the repo! They are too big and will cause problems. Instead, put your results in the writeup, compile it to a PDF, and submit that too.

As such, the repo you hand in must contain the following:

You will lose points if you do not follow instructions. Every time after the first that you do not follow instructions, you will lose 5 points.

Rubric


Hints

example negative

The easiest way to align the parts is to exhaustively search over a window of possible displacements (e.g., [-15,15] pixels), score each one using some image matching metric, and take the displacement with the best score. There are several possible metrics to measure how well images match:

Note that in this particular case, the images to be matched do not actually have the same brightness values (they are different color channels), so other metrics might work better.
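The exhaustive search described above can be sketched as follows. The two metrics shown, sum of squared differences (SSD) and normalized cross-correlation (NCC), are common choices assumed for this example rather than taken from this page's list. The names ssd, neg_ncc, and align_exhaustive are hypothetical, and np.roll is used as a simple stand-in for shifting (it wraps pixels around the edge, so the score is computed on an interior crop only).

```python
import numpy as np


def ssd(a: np.ndarray, b: np.ndarray) -> float:
    """Sum of squared differences: lower means more similar."""
    return float(np.sum((a - b) ** 2))


def neg_ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Negated normalized cross-correlation, so that lower is still better."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return -float(np.sum(a * b) / denom) if denom > 0 else 0.0


def align_exhaustive(moving, reference, radius=15, cost=ssd):
    """Try every (dy, dx) in [-radius, radius]^2 and return the best one.

    The result is the shift such that
    np.roll(moving, (dy, dx), axis=(0, 1)) best matches `reference`.
    """
    height, width = reference.shape
    # Ignore a border as wide as the search radius, because np.roll wraps
    # pixels from the opposite edge into that region.
    crop = (slice(radius, height - radius), slice(radius, width - radius))
    best_cost, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = cost(shifted[crop], reference[crop])
            if score < best_cost:
                best_cost, best_shift = score, (dy, dx)
    return best_shift
```

NCC subtracts each patch's mean and divides by its norm, so it is less sensitive than SSD to the brightness differences between color channels. The cost grows with the square of the radius, which is why this approach alone does not scale to large displacements.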

Exhaustive search will become prohibitively expensive if the displacement search range or image resolution is too large. This will be the case for high-resolution glass plate scans. To avoid this, you will need to implement a coarse-to-fine search strategy using an image pyramid. An image pyramid represents the image at multiple scales (usually scaled by a factor of 2). Start from the coarsest scale (smallest image) and update your displacement estimate as you go down the pyramid. The data/ directory holds the digitized glass plate images in low- and high-resolution versions, so consider trying your alignment algorithm on the low-resolution version first to test its performance more quickly.
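One possible shape for the coarse-to-fine strategy is a short recursion: align at half resolution, double that estimate, then refine it within a small window at the current resolution. The sketch below is illustrative only. The helper names (downsample, search, align_pyramid), the 2x2 box blur, the SSD score, and the default radii are assumptions made for this example, not the stencil's API.

```python
import numpy as np


def downsample(image: np.ndarray) -> np.ndarray:
    """Hypothetical helper: 2x2 box blur followed by 2x subsampling."""
    h, w = (image.shape[0] // 2) * 2, (image.shape[1] // 2) * 2
    img = image[:h, :w].astype(np.float64)
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])


def search(moving, reference, center, radius):
    """Exhaustive SSD search in a window of +/- radius around `center`."""
    cy, cx = center
    height, width = reference.shape
    # Crop away the region that np.roll may have wrapped around, capped so
    # that a usable interior always remains.
    m = min(max(abs(cy), abs(cx)) + radius, min(height, width) // 3)
    crop = (slice(m, height - m), slice(m, width - m))
    best_cost, best_shift = np.inf, (cy, cx)
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = np.sum((shifted[crop] - reference[crop]) ** 2)
            if score < best_cost:
                best_cost, best_shift = score, (dy, dx)
    return best_shift


def align_pyramid(moving, reference, min_size=32,
                  coarse_radius=4, refine_radius=1):
    """Return (dy, dx) aligning `moving` to `reference`, coarse to fine."""
    if min(moving.shape) // 2 < min_size:
        # Coarsest level: the image is small, so a full search is cheap.
        return search(moving, reference, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downsample(moving), downsample(reference),
                           min_size, coarse_radius, refine_radius)
    # One pixel at the coarser level spans two pixels at this level.
    return search(moving, reference, (2 * dy, 2 * dx), refine_radius)
```

With these assumed defaults, each level above the coarsest evaluates only 9 candidate shifts, yet the reachable displacement at full resolution doubles with every added level.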


Credits

Project derived by James Hays from Alexei A. Efros' Computational Photography course, with permission. Converted to Python by Trevor Houchens.