Images of the Russian Empire

Colorizing the Prokudin-Gorskii Photo Collection

All Output Animation

Project Overview

This project focuses on reconstructing color images from the digitized glass plate negatives of Sergei Mikhailovich Prokudin-Gorskii, a Russian photographer who pioneered color photography in the early 1900s. Prokudin-Gorskii captured thousands of color photographs by taking three separate exposures of each scene using red, green, and blue filters, recording them onto glass plates.

The challenge lies in automatically aligning these three color channel images to produce a single, artifact-free RGB color image. For this project, I explored a grid search to find the optimal alignment and an image pyramid to speed up the search on larger images.

Key Challenge: The glass plate images are very large (especially the .tif files), so the alignment procedure must be both accurate and computationally efficient.

Algorithm

The algorithm starts with preprocessing. I first convert the image to floating point so that later arithmetic (mean subtraction, squared differences) does not overflow or truncate, then separate it into its three color channels. With the channels separated, I subtract each channel's mean, crop out the border (roughly 5% on each side), and then find the displacements along each axis that align the blue channel to the green channel. I do the same to align the red channel to the green channel. The green channel serves as the reference channel for alignment.
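The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, and it assumes the scanned plate stacks the three exposures vertically in blue, green, red order.

```python
import numpy as np

def split_plate(plate):
    """Split a scanned plate into (blue, green, red) float channels.

    Assumption: the three exposures are stacked top to bottom as B, G, R.
    """
    plate = np.asarray(plate, dtype=np.float64)
    height = plate.shape[0] // 3  # any leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def crop_border(channel, fraction=0.05):
    """Remove `fraction` of the rows and columns from each side."""
    rows, cols = channel.shape
    dr, dc = int(rows * fraction), int(cols * fraction)
    return channel[dr:rows - dr, dc:cols - dc]

def preprocess(channel, fraction=0.05):
    """Subtract the channel mean, then crop the border, as described above."""
    return crop_border(channel - channel.mean(), fraction)
```

The cropped, mean-subtracted channels would only be used for scoring candidate displacements; the final image is built from the full channels.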

Single-Scale Alignment

For low resolution images, it is sufficient to search over a small window of possible displacements (e.g. [-15, 15] pixels), and compute a similarity metric for each candidate displacement value. The displacement values that perform best with that given metric are chosen, and I align the input channel to the reference channel with those displacement values. Once the blue channel and red channel are sufficiently aligned relative to the reference channel, all three channels can be stacked to form the final color image! We have successfully brought color to the Russian Empire!
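As a rough sketch of this exhaustive search, the snippet below tries every displacement in a square window and keeps the best one. The names `find_shift` and `colorize` are hypothetical, the score is a plain sum of squared differences, and `np.roll` is used for shifting, which wraps pixels around the edges (a simplification).

```python
import numpy as np

def find_shift(channel, reference, radius=15):
    """Return the (dy, dx) shift of `channel` that best matches `reference`."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the image edges.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)  # lower is better
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift

def colorize(red, green, blue, radius=15):
    """Align red and blue to green, then stack the channels as RGB."""
    r_shift = find_shift(red, green, radius)
    b_shift = find_shift(blue, green, radius)
    red_aligned = np.roll(red, r_shift, axis=(0, 1))
    blue_aligned = np.roll(blue, b_shift, axis=(0, 1))
    return np.dstack([red_aligned, green, blue_aligned]), r_shift, b_shift
```

With a window of [-15, 15] pixels this evaluates 31 x 31 = 961 candidates per channel, which is cheap for small images but scales poorly with image size.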

Image Matching Metrics

Picking which metric to use is an important design choice that affects the performance of the algorithm, especially for trickier images like that of the Emir of Bukhara. I first experimented with just the L2 norm (Euclidean distance) to measure the similarity between a channel and the reference channel. (Note that I1 and I2 are the two images being compared.)

$$\text{L2}(I_1, I_2) = \sqrt{\sum_{i,j} (I_1(i,j) - I_2(i,j))^2}$$
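The formula translates almost directly into NumPy. The function name `l2_distance` is a hypothetical one chosen for this sketch.

```python
import numpy as np

def l2_distance(i1, i2):
    """Euclidean distance between two equally sized images."""
    diff = np.asarray(i1, dtype=np.float64) - np.asarray(i2, dtype=np.float64)
    return float(np.sqrt(np.sum(diff ** 2)))
```

Lower values mean a better match, so the search keeps the displacement with the smallest distance.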

While this metric is simple and computationally efficient, for certain images it is not robust enough to accurately align the channels. This can be seen below with a somewhat hazy image of this monastery.

Bad Monastery Reconstruction

To remedy this and make the alignment more robust, I used the Normalized Cross-Correlation (NCC) metric. Choosing the displacement that maximizes NCC gave significantly better results.

$$\text{NCC}(I_1, I_2) = \frac{\sum_{i,j} (I_1(i,j) - \mu_1)(I_2(i,j) - \mu_2)}{\sqrt{\sum_{i,j} (I_1(i,j) - \mu_1)^2 \sum_{i,j} (I_2(i,j) - \mu_2)^2}}$$

Monastery Reconstruction
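A minimal NumPy sketch of the NCC formula is shown below. The function name `ncc` and the zero-denominator guard are assumptions made for this illustration.

```python
import numpy as np

def ncc(i1, i2):
    """Normalized cross-correlation of two equally sized images, in [-1, 1]."""
    a = np.asarray(i1, dtype=np.float64)
    b = np.asarray(i2, dtype=np.float64)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if denom == 0:
        return 0.0  # a constant image has no structure to correlate
    return float(np.sum(a * b) / denom)
```

Because the means are subtracted and the result is divided by both norms, NCC is unchanged when one image is uniformly brightened or its contrast is scaled by a positive factor. That property is one plausible reason it copes better than L2 when the channels differ in overall brightness.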

Multiscale Implementation and Image Pyramid

For large images, exhaustive search becomes prohibitively expensive. To handle high-resolution images, I implemented an image pyramid that processes images at multiple scales, starting from the coarsest (smallest) scale and refining the alignment estimate as it progresses down the pyramid to finer images. In particular, the image dimensions double from one layer of the pyramid to the next, so the displacement estimated at one layer is doubled to seed the search at the next. This means that at each step down the pyramid we start much closer to the optimal displacements, which allows a smaller search window and improves performance dramatically. For this project specifically, I chose to halve the search window each time I move one layer down the pyramid. This allowed for accurate yet computationally efficient alignment of large images.
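The coarse-to-fine idea can be sketched as below. This is an illustration under several assumptions rather than the project's implementation: all function names are hypothetical, downsampling is a simple 2x2 block average, shifting uses `np.roll` (which wraps around the edges), and the parameters `min_size`, `coarse_radius`, and `min_radius` are made-up defaults.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def downsample2(img):
    """Halve each dimension by averaging 2x2 blocks (stand-in for a resize)."""
    rows, cols = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:rows, :cols]
    return img.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))

def search(channel, reference, center, radius):
    """Best (dy, dx) within `radius` of `center`, maximizing NCC."""
    best, best_score = center, -np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def align_pyramid(channel, reference, min_size=64, coarse_radius=16, min_radius=2):
    """Estimate the shift coarse-to-fine, halving the window at each layer."""
    levels = [(channel, reference)]  # index 0 is the full-resolution layer
    while min(levels[-1][0].shape) > min_size:
        c, r = levels[-1]
        levels.append((downsample2(c), downsample2(r)))

    shift, radius = (0, 0), coarse_radius
    for level, (c, r) in enumerate(reversed(levels)):
        if level > 0:
            # One coarse pixel spans two pixels at the next finer layer.
            shift = (2 * shift[0], 2 * shift[1])
            radius = max(radius // 2, min_radius)
        shift = search(c, r, shift, radius)
    return shift
```

For example, `align_pyramid(red, green)` would return the `(dy, dx)` to apply to `red`. The floor on the window size (`min_radius`) is an added safeguard in this sketch so that the finest layers can still correct small errors carried over from coarser ones.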

Note: The image pyramid implementation relies heavily on how images are resized/downsampled. At first, I convolved the original image with a 5x5 Gaussian kernel and then kept every other pixel. I later switched to the resize function from skimage; both approaches gave good results.
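The blur-then-subsample approach could look roughly like this in plain NumPy. The exact kernel is an assumption: this sketch uses the separable binomial weights [1, 4, 6, 4, 1] / 16, a common 5x5 approximation of a Gaussian, and the function names are hypothetical.

```python
import numpy as np

# Assumed kernel: binomial approximation of a 1D Gaussian; the outer product
# of this vector with itself gives the 5x5 kernel.
KERNEL_1D = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur5(img):
    """Blur with the separable 5x5 kernel, reflecting at the borders."""
    img = np.asarray(img, dtype=np.float64)
    rows, cols = img.shape
    padded = np.pad(img, 2, mode="reflect")
    # Filter horizontally, then vertically.
    horiz = sum(w * padded[:, k:k + cols] for k, w in enumerate(KERNEL_1D))
    return sum(w * horiz[k:k + rows, :] for k, w in enumerate(KERNEL_1D))

def gaussian_downsample(img):
    """Blur to limit aliasing, then keep every other pixel in each axis."""
    return blur5(img)[::2, ::2]
```

Blurring before dropping pixels matters because subsampling alone can alias fine detail, which would mislead the alignment at coarse layers.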

Results

Below are the results of my color reconstruction algorithm on various Prokudin-Gorskii images. Each image shows the final aligned color reconstruction along with the calculated displacement vectors for both blue-to-green and red-to-green alignments.

Personal Images

I also tested my algorithm on additional images from the Prokudin-Gorskii collection to demonstrate its robustness across different types of scenes and photographic conditions. The owls are my favorite!

Conclusion

By using modern image processing techniques, it is possible to bring beautiful color to Sergei Prokudin-Gorskii's original collection with relative ease. Further improvements could come from experimenting with different metrics and from exploring better representations of pixel features, but even with very simple tools and techniques we are able to create beautiful color images of the Russian Empire.