[Video: Post-Processing a Rolling Shutter Video Effect from Matt Nichols on Vimeo]
In this process, composing each output frame is simply a matter of incorporating information from past frames, with the time offset increasing along whichever direction the effect is being applied. In my first attempt, I tried building each output frame by iterating backwards through the video, copying in rows from the past as needed. However, this ended up being really slow, since each input frame needs to be queried n times for the n output frames it is included in. I encountered another challenge with framerate: 15 fps is a standard framerate, but even for a slowly walking person, the gaps in position are large enough to be noticeable for small desired time offsets between pixel rows. A 5-15 millisecond offset between rows usually looks good, but 15 fps gives us about 67 milliseconds between frames, so some replication of rows or interpolation between frames is needed (read on if this doesn't make sense). Just copying the same row multiple times to decrease the average offset between single pixel rows creates noticeable jagged edges in the output, so I first tried a linear combination of neighboring frames, and finally a Gaussian weighting of the four nearest frames for a reasonably smooth output. The details of my algorithm follow:
Each segment of each input frame is weighted with a Gaussian g(x) = e^(-(x-c)^2 / (2*s^2)), centered around the exact desired time offset (c) and evaluated at each neighboring frame. (I've found that a standard deviation s = 0.75 works well in most circumstances, though this could be increased for more blur / less jaggedness. It's a tradeoff.) The weighted segment is then written into the output frame f + s, where f is the current frame and s is the current segment. Thus the second segment will be written into the output frame two frames ahead, and so on. There is a rudimentary visualization of this in the video above.
Automatically finding the best offset time: Because a lower offset time usually works better for videos with more motion, this should be as simple as finding some metric for the motion present in the input and choosing the offset time accordingly. I actually tried to do this by writing a motion metric descriptor for videos, which essentially calculates a rolling SSD between frames and normalizes this by the video's framerate and size. In practice, however, it was really difficult to find a good mapping between motion and millisecond offset, so I left it out of my final product. I also think the descriptor could be better implemented using distances between keypoints in each frame instead of an SSD; this would make it more consistent across differently sized moving objects. A simplified sketch of such a metric appears below.
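Something like the following captures the idea; the function name, signature, and exact normalization here are illustrative assumptions rather than the descriptor I actually wrote.

```matlab
% Illustrative sketch of a frame-difference motion metric (not the exact
% descriptor; the normalization shown is one reasonable choice).
function m = motionMetric(vid, fps)
    % vid: H x W x 3 x F video array, fps: frames per second
    [H, W, ~, F] = size(vid);
    ssd = 0;
    for f = 2:F
        d   = double(vid(:,:,:,f)) - double(vid(:,:,:,f-1));
        ssd = ssd + sum(d(:).^2);             % sum of squared differences
    end
    % Normalize by frame pairs and pixel count, then scale by fps so the
    % result reflects motion per second rather than motion per frame.
    m = ssd / ((F - 1) * H * W) * fps;
end
```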
Automatically determining the best direction: Because an object moving left and right will look better with a top-to-bottom effect, and vice versa for a vertically moving object, it could be cool to automatically decide which way to cascade the time offset. This could also be accomplished with some sort of keypoint analysis, as above.
More directions or different offset patterns: Currently the offset can cascade right, left, up and down, though it's theoretically possible to use arbitrary angles, and even arbitrary patterns. Selecting segments for either of these would be much more complicated and probably quite a bit slower (since I wouldn't be able to use MATLAB matrix magic with full rows and columns), but it's potentially doable.
More and better videos: I've currently only made videos with this using relatively low-framerate webcams, recording relatively slow movement. However, given a good chunk of time and access to better video equipment (see: winter break, my nicer camera), I could come up with much more interesting and higher-quality videos to run this effect on. This, for sure, I plan to do.