Before I'd ever taken a class in computer science I thought I knew how to build an "artificial intelligence." I wrote up my plans and sent a copy to Marvin Minsky, a leading light in the field, thinking he'd immediately recognize my genius and help me to realize my dreams. He wrote back (for which I'll always be grateful) a short but personal note in which he suggested that I study ethology and in particular the works of Tinbergen and Lorenz. I can't remember how I reacted to my first exposure to Tinbergen but I certainly learned a lot about the diversity and sophistication of animal behavior, both individual and group behaviors. It also fueled my reductionist view of the world - I saw programs and computation everywhere in Tinbergen's observations about animal behavior. I was particularly interested in tales about how opposing behaviors were reconciled (or not) such as the conflict between fighting and fleeing.
Hull [1943] suggested that there was a single drive that energized behavior in general and that would explain the modulation of behavior otherwise governed by stimulus-response activity. Lorenz [1950] postulated reservoirs in which the animal would build up some 'action specific energy' for each of its instinct-driven motivational systems (instincts such as aggression, nesting, and thirst [Tinbergen, 1951]). Lorenz's theory accounted for mysterious 'vacuum' activities that occur in the absence of releasing stimuli. Tinbergen's disciples believed that if an animal must eat, drink and mate to survive (basic functions), then a motivation for such activities could be postulated (the mechanism giving rise to the corresponding behaviors).
These very general ideas were refined and subjected to experiment, and cyberneticists and control theorists postulated various ways of characterizing motivations and drives mathematically. One recurring mathematical theme involved using a vector space to represent internal and external parameters, e.g., internal body temperature and external environmental temperature, and then characterizing the response mechanism as the animal following the gradient of the surface defined by the values assigned to points in that space. For example, if your internal temperature is too high, you move in a direction in the vector space that reduces it, possibly by relocating so as to decrease the external temperature. For an isolated behavior this approach was fairly tractable; the problem arose in combining multiple such behaviors.
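The gradient-following idea can be made concrete with a small sketch. Here the "discomfort" surface, its preferred temperatures, and the step size are all invented for illustration, not drawn from any particular model in the literature:

```python
import numpy as np

def discomfort(state):
    """Hypothetical cost surface over (internal_temp, external_temp).

    The animal is assumed most comfortable at an internal temperature of
    37 and an external temperature of 20 (illustrative numbers only)."""
    internal, external = state
    return (internal - 37.0) ** 2 + 0.1 * (external - 20.0) ** 2

def numerical_gradient(f, state, eps=1e-5):
    """Central-difference estimate of the gradient of f at state."""
    grad = np.zeros_like(state)
    for i in range(len(state)):
        step = np.zeros_like(state)
        step[i] = eps
        grad[i] = (f(state + step) - f(state - step)) / (2 * eps)
    return grad

def follow_gradient(f, state, rate=0.1, steps=100):
    """Move downhill on the surface defined by f, as the animal might."""
    state = np.array(state, dtype=float)
    for _ in range(steps):
        state -= rate * numerical_gradient(f, state)
    return state

# Starting too hot, the "animal" descends toward the comfortable region.
print(follow_gradient(discomfort, [41.0, 35.0]))
```

Combining behaviors in this framework amounts to summing several such surfaces, which is exactly where the troubles described below begin.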
In some cases it is relatively easy for a set of independent systems to coordinate their individual activity to achieve some globally desirable behavior. Brooks and Maes [1990] describe a distributed approach to learning to coordinate multiple behaviors, each (unknowingly) contributing to the same global goal of getting a legged robot (Genghis) to walk (see "Learning to Coordinate Behaviors," R. Brooks, P. Maes, in Proceedings of the Eighth National Conference on AI, 1990). In other cases, however, involving multiple and possibly conflicting goals, coordination and arbitration among behaviors has proved elusive, e.g., an animal choosing between fighting and fleeing.
Ideas from ethology have crept into areas well beyond robotics including, for example, human computer interaction. Many such ideas inspired the architecture described in "A Novel Environment for Situated Vision and Behavior," by T. Darrell, P. Maes, B. Blumberg, and A.P. Pentland, in the Proceedings of the IEEE Workshop for Visual Behaviors, CVPR-94, 1994.
In graduate school I was exposed to the ideas of Ulrich Neisser and the notions of schemata and how they are triggered. Neisser coined the term "cognitive psychology" and was in part responsible for the development of cognitive science as a discipline. His ideas about top-down cognitive processes were particularly influential at the time. Later I would hear of Michael Arbib's work on schema theory and its application to robot control. Arbib was interesting to me for the way he combined automata theory, animal behavior, and neuroscience. One of Arbib's students, Ron Arkin of Georgia Tech, is among the most active researchers applying and extending schema theory to robotics (Arkin, Ronald C., Behavior-Based Robotics, MIT Press, 1998).
One of Neisser's colleagues at Cornell, J.J. Gibson, was known for his "ecological approach" to understanding perception and behavior. I didn't know much about his theories at the time, but I learned about optic flow, focus of expansion, and time to contact in a course in machine vision at Yale. I was fascinated with the problem of calculating optic flow fields, and after I came to Brown one of my students, Ted Camus, working with Heinrich Bülthoff (then a Professor of Cognitive and Linguistic Sciences at Brown, later to become the director of the Max Planck Institute for Biological Cybernetics in Tübingen, Germany), developed some very efficient algorithms for calculating such fields to aid in robot navigation. Michael Black here at Brown is very well known for his work in the related field of human motion tracking.
Calculating optic flow fields is an example of extracting information about shape and distance from (motion) sequences of images. In the simplest case, the algorithm is presented with a sequence of images taken from a camera that is moving (say on a robot) or a fixed camera in which objects in the field of view are moving with respect to the camera. The algorithm tries to determine how patches of pixels corresponding to objects in one image map to patches of pixels corresponding to the same objects in subsequent images in the sequence. By tracking such patches through a sequence of images, the algorithm can make predictions regarding the motion of the camera or the objects in the images. An optic flow field is a grid (matrix) of vectors in which the vector at a given location in the grid indicates the best guess as to where the patch centered at that location in the corresponding images is moving. Flow fields can be used in robot navigation to identify and avoid objects in the path of a robot.
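The patch-tracking idea can be sketched with a naive block-matching algorithm: for each patch in the first image, search a small neighborhood in the second image for the best-matching patch and record the displacement. This brute-force version is only illustrative; efficient algorithms like those mentioned above are far cleverer about the search:

```python
import numpy as np

def block_matching_flow(img1, img2, patch=3, search=2):
    """Crude optic-flow estimate by exhaustive block matching.

    For each patch in img1, search a (2*search+1)^2 neighbourhood in
    img2 for the minimum sum-of-squared-differences match, and record
    the winning displacement (dy, dx) as the flow vector there."""
    h, w = img1.shape
    flow = np.zeros((h, w, 2))
    half = patch // 2
    for y in range(half + search, h - half - search):
        for x in range(half + search, w - half - search):
            ref = img1[y - half:y + half + 1, x - half:x + half + 1]
            best, best_dydx = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    cand = img2[y + dy - half:y + dy + half + 1,
                                x + dx - half:x + dx + half + 1]
                    ssd = np.sum((ref - cand) ** 2)
                    if ssd < best:
                        best, best_dydx = ssd, (dy, dx)
            flow[y, x] = best_dydx
    return flow
```

If `img2` is simply `img1` shifted one pixel down and right, every interior flow vector comes out as (1, 1), which is the kind of coherent field a navigation routine can act on.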
The tales of roboticists inspired by biological organisms are legion and I recommend that you follow some of the threads presented in this lecture. One of the primary themes in studying animal behavior concerns the form of primitive, component behaviors (how they arise in nature) and then how they are combined to form more complicated behaviors. How do you string together behaviors so that they are tied to events in the world but coordinated with one another so as to avoid undesirable conflicts? What if two behaviors that control the same effectors are running simultaneously; how are the commands they issue combined? How are multiple percepts combined into a coherent picture of the environment for purposes of guiding action? These are some of the questions that are important to roboticists and animal behaviorists alike.
Here are some very sketchy answers to the above questions. I've included a glossary of sorts below as well as a listing of relevant scientists and animal behavior anecdotes. The basic component is a primitive behavior, which maps sensor inputs to motor outputs to achieve some task. Often behaviors are triggered by specific stimuli. They are typically self-terminating, using some sort of idiothetic (internal) or allothetic (external) sensing, but can also be terminated by other stimuli, internal or external. Behaviors can serve to align the robot to facilitate some other activity, or they can serve some primary activity such as capturing prey. Many animals have innate sequences of behaviors in which one behavior in the sequence triggers the next by employing a so-called releasing mechanism. Some species of wasp mate, build a nest and then lay eggs in the nest, where the nest is thought to be a trigger or releaser for the egg-laying behavior. One can easily imagine a boolean logic in which the releasers play the role of boolean variables and logic circuitry orchestrates the sequencing of behaviors.
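That boolean releaser logic is easy to caricature in code. The wasp behaviors and state flags below are illustrative names, not taken from any real architecture; the point is only that each behavior's effect on the world raises the releaser for the next:

```python
# Toy "releaser" logic for sequencing behaviors, loosely modelled on
# the wasp example: completing one behavior changes the state, and that
# change is the boolean releaser enabling the next behavior.

def mate(state):
    state["mated"] = True            # releaser for nest building

def build_nest(state):
    state["nest_exists"] = True      # the nest itself releases egg laying

def lay_eggs(state):
    state["eggs_laid"] = True

# Each behavior is guarded by a boolean releaser over the current state.
behaviors = {
    "mate": (lambda s: True, mate),
    "build_nest": (lambda s: s.get("mated", False), build_nest),
    "lay_eggs": (lambda s: s.get("nest_exists", False), lay_eggs),
}

def run_sequence(state):
    """Fire any behavior whose releaser holds and which hasn't yet run."""
    done = set()
    progressed = True
    while progressed:
        progressed = False
        for name, (released, behavior) in behaviors.items():
            if name not in done and released(state):
                behavior(state)
                done.add(name)
                progressed = True
    return state

print(run_sequence({}))
# → {'mated': True, 'nest_exists': True, 'eggs_laid': True}
```

Note that nothing here names the sequence explicitly; the ordering emerges from which releasers happen to be satisfied, which is the ethologists' point.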
In addition to sequential behaviors there is the issue of handling parallel and asynchronous behaviors, and in particular what to do when two behaviors wish to control the same outputs, or when one behavior attempts to undo something that another behavior achieved earlier in an attempt to facilitate yet a third behavior later. Many so-called arbitration mechanisms have been proposed, including the two introduced in the subsumption architecture: suppression and inhibition. There are also strategies such as domination, in which the dominant behavior according to some static or dynamic ranking has control of the outputs, and cancellation, in which one behavior can simply cancel another. Computer scientists not concerned with biological plausibility often assume some central arbiter, but this can lead to complex strategies for combining behaviors.
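A priority-based (domination-style) arbiter can be sketched in a few lines. The behavior names and the convention that a silent behavior emits `None` are assumptions for the sake of the example, not the actual subsumption wiring, which works with suppressor and inhibitor nodes on individual signal lines:

```python
def arbitrate(commands):
    """Pick the motor command of the highest-priority active behavior.

    commands is a list of (priority, command_or_None) pairs; behaviors
    with nothing to say emit None.  The highest-priority non-None
    command wins -- a simple 'domination' scheme."""
    for priority, cmd in sorted(commands, reverse=True):
        if cmd is not None:
            return cmd
    return "idle"

# Example: an obstacle-avoidance behavior (priority 2) dominates
# wandering (priority 1); the quiescent behavior at priority 3 defers.
print(arbitrate([(1, "wander"), (2, "turn-left"), (3, None)]))
# → turn-left
```

Suppression in the subsumption sense is stronger than this: the higher layer substitutes its signal on the lower layer's wire for a fixed time, rather than consulting a central list, which is exactly the centralized arbiter the last sentence above warns about.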
Arbib's schema theory is one proposal that works well for combining some types of behaviors, in particular navigational behaviors. A schema is a generic template for carrying out some activity. In schema theory a behavior consists of a perceptual schema that directs and organizes perception and a motor schema that takes the output of the perceptual schema and produces a pattern of motor activity. For example, a frog might have a behavior for snapping at flies; the perceptual schema would generate a frog-centric coordinate space that accounts for the orientation of the frog's snapping apparatus and the position of the fly. The motor schema would align the snapping apparatus and initiate snapping at an appropriate time. One very neat way of combining behaviors in schema theory is to assume that each perceptual schema generates a vector field with the animal at the origin and the vectors indicating the direction and magnitude of forces or velocities relevant to the behavior. Two behaviors concerned with the same sensors and motors would combine the outputs of the corresponding perceptual schemas by summing the vector fields to produce a single combined field and then directing the motor outputs accordingly. This can have unintended consequences when, say, there are two flies near the frog and, instead of aiming for one or the other of the flies, the frog flicks its tongue at a point midway between them.
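The two-flies failure mode falls straight out of the vector summation. In this sketch each fly's perceptual schema is reduced to a single attraction vector (the real theory posits a whole field over space), and the positions and gain are made up for illustration:

```python
import numpy as np

def attraction_field(target, position, gain=1.0):
    """The vector a perceptual schema might emit: pull toward a target."""
    return gain * (np.asarray(target, float) - np.asarray(position, float))

frog = [0.0, 0.0]
fly_a, fly_b = [-1.0, 2.0], [1.0, 2.0]   # two flies, symmetric about the frog

# Summing the two schemas' outputs yields the combined motor command.
combined = attraction_field(fly_a, frog) + attraction_field(fly_b, frog)
print(combined)  # [0. 4.] -- aimed at the midpoint, missing both flies
```

A winner-take-all selection between the two schemas, rather than a sum, is one standard repair for exactly this situation.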