All of the metrics I tried were about equally effective, which I
found surprising, since applying the sum of squared differences or
the normalized cross-correlation metric directly to the RGB images
does not compensate for the brightness of each image. There
were 2 images that none of the metrics could align properly without
misaligning other images (the goal being to apply the same
algorithm to all images without user input): image 3 - 00153v.jpg
and image 15 - 01861a.tif. They could be aligned, but only by
tweaking input parameters specifically for those images, and that
ruined more images than it fixed.
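
For reference, the two metrics named above can be sketched as follows.
This is a minimal illustration, not the code used for these results:
the function names are hypothetical, the pixels are assumed to be
flattened into plain sequences of floats, and the normalized
cross-correlation shown is one common definition (the dot product of
the unit-normalized inputs, without mean subtraction).

```python
import math

def ssd(a, b):
    """Sum of squared differences between two equal-length pixel sequences (lower is better)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ncc(a, b):
    """Normalized cross-correlation: dot product of the unit-normalized sequences (higher is better)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero input has no direction to compare
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
```

In this form, multiplying one input by a constant leaves ncc unchanged
but changes ssd, while adding a constant offset changes both.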
The algorithm simply divides the image height by 3, which does not
necessarily separate the three images along their black borders.
If each image were first extracted completely from its black
border, this problem could be resolved, likely improving the
results.
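
A minimal sketch of the split described above, together with one
possible way to trim dark border rows afterwards. Both helper names
and the threshold value are assumptions made for illustration, with
the image represented as a list of non-empty pixel rows holding
intensities in [0, 1]. Columns could be trimmed the same way.

```python
def split_thirds(rows):
    """Split a stacked plate (a list of pixel rows) into three equal-height parts."""
    h = len(rows) // 3  # any leftover rows at the bottom are discarded
    return rows[:h], rows[h:2 * h], rows[2 * h:3 * h]

def trim_dark_rows(rows, threshold=0.1):
    """Drop leading and trailing rows whose mean intensity is below threshold."""
    def is_dark(row):
        return sum(row) / len(row) < threshold

    top, bottom = 0, len(rows)
    while top < bottom and is_dark(rows[top]):
        top += 1
    while bottom > top and is_dark(rows[bottom - 1]):
        bottom -= 1
    return rows[top:bottom]
```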
The algorithm assumes that the camera was only translated; it does
not consider other affine transformations such as rotation. Handling
those seems like a logical next step to possibly improve this
algorithm, though it will be more computationally expensive.
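
To make the cost concrete, a translation-only search can be sketched
as below. This is an illustrative exhaustive search, not necessarily
the one used here: the function name, the search window, and the use
of mean squared difference over the overlapping region are all
assumptions, and images are plain lists of rows.

```python
def best_shift(ref, img, max_shift=2):
    """Return the (dy, dx) within +/-max_shift that best aligns img to ref.

    A shift (dy, dx) moves img pixel (y - dy, x - dx) onto ref pixel (y, x).
    The score is the mean squared difference over the overlapping region.
    """
    h, w = len(ref), len(ref[0])
    best, best_score = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for y in range(max(0, dy), min(h, h + dy)):
                for x in range(max(0, dx), min(w, w + dx)):
                    diff = ref[y][x] - img[y - dy][x - dx]
                    total += diff * diff
                    count += 1
            if count and total / count < best_score:
                best, best_score = (dy, dx), total / count
    return best
```

Searching over rotation as well would add an outer loop over
candidate angles, so the number of comparisons grows by a factor
equal to the number of angles tried.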