CS280A Project 1

Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Overview:


As early as 1907, Sergei Mikhailovich Prokudin-Gorskii envisioned color photography as a possibility for the future. With the Tsar's permission, he traveled across the Russian Empire and photographed thousands of scenes. For each scene he took three exposures, each through a different color filter: red, green, and blue. Although his vision was not realized in his lifetime, his images survive, and today we can use software to align the three exposures and combine them into a single color image.


Approach:

For each image, I first cropped all three sections of the image to exclude the borders. For the small images I removed about 20 pixels from each side, whereas for the large images I removed between 180 and 240 pixels from each side.

Next, I wrote a function that aligns two images given a search window. It shifts one of the images vertically and horizontally over every displacement in the window and, for each displacement, evaluates a structural similarity score computed by comparing every pixel across the two images. The displacement with the best score is returned.

To speed up alignment of the larger images, I implemented an image pyramid, which aligns the images at successively finer scales. The coarsest level aligns the images scaled to 1/8 of their original size. The resulting displacement seeds the next level, which uses images scaled to 1/4, and so on up to full resolution. At the coarsest level the search window is a 50-pixel radius around the center, but each subsequent window is only 4 pixels wide, centered on the optimal displacement from the previous level (scaled to the new resolution). The optimal displacement found at the original resolution is returned.

Once the r and g images are aligned to the b image, the three images become the RGB channels of the final image matrix. The results are shown below.
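The window search and the coarse-to-fine pyramid described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the results below. It assumes NumPy, and it swaps in normalized cross-correlation (`ncc`) for the structural similarity score. The helper names (`ncc`, `align_single_scale`, `downsample2`, `align_pyramid`) are hypothetical, and reading "4 pixels wide" as a radius of 2 is an interpretation. Because `np.roll` wraps pixels around the edges, the sketch also assumes the borders have already been cropped.

```python
import numpy as np


def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays (higher is better)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def align_single_scale(moving, fixed, center=(0, 0), radius=2):
    """Score every (dx, dy) within `radius` of `center` and return the best one."""
    best_score, best_shift = -np.inf, center
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            # np.roll wraps around the edges; assumed harmless after cropping.
            shifted = np.roll(moving, shift=(dy, dx), axis=(0, 1))
            score = ncc(shifted, fixed)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift


def downsample2(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for a library rescale)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def align_pyramid(moving, fixed, levels=4, coarse_radius=50, fine_radius=2):
    """Coarse-to-fine alignment; levels=4 corresponds to scales 1/8, 1/4, 1/2, 1."""
    pyramid = [(moving, fixed)]
    for _ in range(levels - 1):
        m, f = pyramid[-1]
        pyramid.append((downsample2(m), downsample2(f)))

    shift = (0, 0)
    for i, (m, f) in enumerate(reversed(pyramid)):  # coarsest level first
        if i == 0:
            shift = align_single_scale(m, f, (0, 0), coarse_radius)
        else:
            # Each level is twice the size of the previous one, so double the shift.
            center = (2 * shift[0], 2 * shift[1])
            shift = align_single_scale(m, f, center, fine_radius)
    return shift  # (dx, dy) at the original resolution
```

A call such as `align_pyramid(g, b)` would return the (x, y) displacement that moves the g image onto the b image under these assumptions. For the small images, a single call to `align_single_scale` with a larger radius is likely enough.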

One issue I ran into was choosing the window size and the depth of the image pyramid. Too many pyramid levels often produced worse results, and because I initially used the same window size at every level, runtime was slow and the results were still poor. The same parameter configuration also did not work for every image. In particular, Emir was difficult to align and was often misaligned even when the other large images were fine. After some trial and error, I found a configuration that worked for every image.
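To see why keeping the same window at every level is slow, it helps to count candidate displacements. The sketch below is a rough cost model and not a measurement. It assumes a square window of radius r contains (2r + 1)^2 candidates, that each candidate costs one pass over the pixels at that level, and that each pyramid level has a quarter of the pixels of the level below it. The function names are hypothetical.

```python
def candidates(radius):
    """Number of (dx, dy) displacements in a square window of the given radius."""
    return (2 * radius + 1) ** 2


def pyramid_cost(levels, coarse_radius, fine_radius, full_pixels):
    """Approximate pixel comparisons summed over all pyramid levels."""
    total = 0
    for k in range(levels):  # k = 0 is full resolution
        pixels = full_pixels // 4 ** k
        # Only the coarsest level uses the wide window.
        radius = coarse_radius if k == levels - 1 else fine_radius
        total += candidates(radius) * pixels
    return total


# Same 50-pixel window at every level versus a narrow window after the coarsest level.
same_window = pyramid_cost(4, 50, 50, 64)
narrow_window = pyramid_cost(4, 50, 2, 64)
print(same_window / narrow_window)
```

Under this model a 50-pixel radius means 10,201 candidates per level, while a radius of 2 means only 25, so nearly all of the work moves to the smallest images.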


Images:

Cathedral, green-blue offset: x = 2, y = 5, red-blue offset: x = 3, y = 12

Cathedral

Church, green-blue offset: x = 4, y = 25, red-blue offset: x = -4, y = 58

church

Emir, green-blue offset: x = 22, y = 50, red-blue offset: x = 40, y = 105

emir

Harvesters, green-blue offset: x = 16, y = 59, red-blue offset: x = 13, y = 123

harvesters

Icon, green-blue offset: x = 16, y = 39, red-blue offset: x = 23, y = 89

icon

Kapri, green-blue offset: x = -17, y = 32, red-blue offset: x = -25, y = 78

kapri

Lady, green-blue offset: x = 9, y = 56, red-blue offset: x = 12, y = 119

lady

Lastochkino, green-blue offset: x = -2, y = -3, red-blue offset: x = -9, y = 76

lastochkino

Melons, green-blue offset: x = 10, y = 80, red-blue offset: x = 13, y = 177

melons

Monastery, green-blue offset: x = 2, y = -3, red-blue offset: x = 2, y = 3

monastery

Onion Church, green-blue offset: x = 26, y = 52, red-blue offset: x = 35, y = 108

onion_church

Sculpture, green-blue offset: x = -11, y = 33, red-blue offset: x = -27, y = 140

sculpture

Self Portrait, green-blue offset: x = 29, y = 78, red-blue offset: x = 37, y = 175

self_portrait

Three Generations, green-blue offset: x = 17, y = 55, red-blue offset: x = 11, y = 113

three_generations

Tobolsk, green-blue offset: x = 3, y = 3, red-blue offset: x = 3, y = 6

tobolsk

Train, green-blue offset: x = 7, y = 40, red-blue offset: x = 30, y = 85

train

Zakat, green-blue offset: x = -40, y = 75, red-blue offset: x = -67, y = 113

zakat
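
The offsets listed above can be applied to the cropped channels to rebuild each color image. The sketch below is one way to do this with NumPy. The sign convention is an assumption, since it is not stated above: positive x shifts a channel right and positive y shifts it down. The function names `apply_offset` and `compose_rgb` are hypothetical.

```python
import numpy as np


def apply_offset(channel, offset):
    """Shift a 2-D channel by offset = (x, y); assumes +x is right and +y is down."""
    dx, dy = offset
    return np.roll(channel, shift=(dy, dx), axis=(0, 1))


def compose_rgb(r, g, b, g_offset, r_offset):
    """Align r and g to b using the given offsets and stack into an H x W x 3 image."""
    return np.dstack([apply_offset(r, r_offset), apply_offset(g, g_offset), b])
```

For Cathedral, for example, the call would be `compose_rgb(r, g, b, g_offset=(2, 5), r_offset=(3, 12))`.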