Thesis Defense


"Light Fields and Synthetic Aperture Photography for Robotic Perception"

John Oberlin

Friday, April 27, 2018, at 1:00 P.M.

Swig Boardroom 241 (CIT 2nd Floor)

Robots work in factories to assemble, deliver, and package goods, but they have yet to see widespread deployment in homes. Most tasks to which they are applied require little or no sensing, so we can take advantage of their strength, speed, and consistency without solving difficult and expensive perceptual problems. Robots can be directed by perception in structured environments with consistent lighting conditions and little clutter, and tags can be applied to objects and landmarks in more complex environments to circumvent these limitations. RGB-D cameras have improved robotic perception to the point where we can begin to study problems such as fine-grained manipulation and motion planning with realistically shaped objects without the need for tags. However, methods that rely on projected light do not scale well, because such cameras interfere with one another and often fail on reflective and transparent surfaces.

While a typical RGB camera measures the intensity of light at each location on its sensor, a light field camera records the intensity of light at each location across a number of angles. By reasoning about the direction of incoming light in addition to its location, we can obtain some of the benefits of depth cameras while using a passive sensor. I will describe our method of passive RGB sensing, which uses a calibrated camera mounted on the end effector of a robotic arm to record a dense light field over a planar surface along a fixed trajectory, turning the robot into a time-lapse light field camera. I will show how to use this light field to render synthetic 2D orthographic projections and to perform 3D reconstruction of the imaged environment. The synthesized photographs show reduced sensor noise, chromatic aberration, and specular highlights compared with the images used to compose them.

I will then describe a novel graphical model over the light field data collected by the robot and show that, using this approach, a robot can robustly perform object segmentation, classification, pose estimation, and grasp inference on objects in a static scene. Next, I will show how our system enables a Baxter robot to autonomously separate previously unencountered objects from an input pile, collect images of and generate models for those objects, and move the objects to an output pile.

Light field photography performed with an eye-in-hand camera can enable a robot to perceive and manipulate most rigid household objects, even those with pathological non-Lambertian surfaces. I will support this position with three contributions. The first is the theory and application of synthetic photography we use on our robots, which allows them to collect light fields and synthesize 2D photographs from those light fields. The second is a collection of methods that form a perception and planning system. This system enables our robots to perform robust pick-and-place in challenging environments, achieving state-of-the-art results. The third is the reactive programming environment Ein, which coordinates the hand and eye movements of the robot to control its behavior through alternating and simultaneous perception and movement. Finally, I will present a collection of ethnographic studies in robotics which mark the way to uncharted realms of autonomy and collaboration.

Host: Professor Stefanie Tellex