CSCI 1430 Project 5
1. Isolate a set of the strongest Harris features in an image.
Harris Feature Keypoints: Harris analysis yields a set of keypoints centered on distinctive image features, which should be easy to recognize and track from frame to frame. Generally speaking, the detected features are patches of distinct texture or, more commonly, distinct corners, which is why the process is also called Harris corner detection. Areas of uniform intensity cannot be tracked because no direction of motion can be determined, and edges suffer from the "aperture problem" illustrated by the barber pole illusion: only the motion component perpendicular to the edge can be observed. Corners are the best features to track because they have strong intensity gradients in two independent directions, so the full 2D motion can be recovered.
Features (500-count) detected in the first frame of the sequence:
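As a rough illustration of the Harris step described above, here is a minimal sketch in plain Python. It is not the project's actual implementation: the function names `harris_response` and `strongest_keypoints`, the constant `k = 0.05`, the 3x3 summation window, and the central-difference gradients are all assumptions chosen for brevity, and non-maximum suppression is left out.

```python
def harris_response(img, k=0.05, radius=1):
    """Return the Harris response R = det(M) - k * trace(M)^2 for each pixel.

    `img` is a list of rows of grayscale intensities (hypothetical input format).
    """
    h, w = len(img), len(img[0])

    # Central-difference gradients; border pixels are left at zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    response = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            # Accumulate the 2x2 structure tensor M over a square window.
            sxx = sxy = syy = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            # Flat patch: R near 0. Edge: R < 0. Corner: R > 0.
            response[y][x] = det - k * trace * trace
    return response


def strongest_keypoints(response, count):
    """Return the (x, y) positions of the `count` largest responses."""
    scored = [(r, x, y) for y, row in enumerate(response) for x, r in enumerate(row)]
    scored.sort(reverse=True)
    return [(x, y) for _, x, y in scored[:count]]
```

Calling `strongest_keypoints(harris_response(img), 500)` would give a 500-point set comparable in spirit to the one in the writeup, although without non-maximum suppression the points tend to cluster around the same corners.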
The KLT method makes several key assumptions that are critical to the success of the algorithm. The brightness constancy constraint ensures that the same patch or feature can be detected in both frames, since it would be exceedingly difficult to track features that change appearance substantially between frames. The small motion constraint ensures that features remain close enough between frames that they can be matched and tracked consistently. The spatial coherence constraint lets the algorithm solve an overconstrained system of equations by least squares to determine the optical flow (u, v), the displacement in x and y, from frame to frame: since the pixel in question and all of its neighbors are assumed to move together, each contributes one equation toward determining u and v. The optical flow is computed between each pair of consecutive frames, and the predicted positions of the Harris keypoints are updated at each frame.
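The least-squares step described above can be sketched for a single window as follows. This is a hedged, single-iteration illustration rather than the project's code: the name `lucas_kanade_flow`, the 5x5 window, the central-difference gradients, and the singularity threshold are assumptions. Each pixel in the window contributes one brightness constancy equation Ix*u + Iy*v = -It, and the 2x2 normal equations are solved directly.

```python
def lucas_kanade_flow(frame0, frame1, cx, cy, radius=2):
    """Estimate the flow (u, v) of the window centered at integer (cx, cy).

    Frames are lists of rows of grayscale intensities (hypothetical format).
    The window must lie at least one pixel inside the image border.
    Returns None when the normal equations are (near-)singular, as happens
    for flat patches and, ideally, for edge-only patches (aperture problem).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            # Spatial gradients from the first frame, temporal difference
            # between the two frames (small motion assumption).
            ix = (frame0[y][x + 1] - frame0[y][x - 1]) / 2.0
            iy = (frame0[y + 1][x] - frame0[y - 1][x]) / 2.0
            it = frame1[y][x] - frame0[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it

    # Normal equations: [sxx sxy; sxy syy] [u; v] = -[sxt; syt]
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None
    u = (-sxt * syy + sxy * syt) / det
    v = (-syt * sxx + sxy * sxt) / det
    return u, v
```

To follow a keypoint through a sequence, one would add the returned (u, v) to its position after each consecutive frame pair; a fuller tracker would also interpolate at sub-pixel positions and iterate the estimate, which this sketch omits.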
Keypoints tracked from the first frame to the last frame (ending positions marked by X's):
With around 500 keypoints, structure can be reasonably extracted from the scene motion. As you can see, the right angle of the building corner is accurately represented, as well as some of the window and roof detail. The red lines represent camera direction vectors, tracing the camera path through the frames. The dark gray areas have an unusual shape because a few stray keypoints fell in those regions, and their motion could not be estimated accurately due to the lack of texture.
Camera direction decomposed into x,y,z components over the image sequence: