3. Research

The demand for higher quality, faster computer graphics extends throughout virtually all fields of science, types of industry, and areas of government, from mathematics and engineering to the military and entertainment. Such a pressured environment encourages ad hoc approaches and short-term solutions to problems. With the establishment of the Center, we are able to pursue the long-term, discipline-wide goal of strengthening the foundations of computer graphics and scientific visualization. While there has been great progress in the last few years in hardware performance, algorithms, and interaction techniques, far too many systems and applications used in graphics and visualization today still depend on ad hoc techniques rather than scientific foundations. More than ever, we need to have computer graphics that doesn't just ``look right'' but ``is right'': our models and algorithms must be physically-based and subject to experimental validation.

Our strategy is based on coordinated foundational research in four core areas: modeling, rendering, interaction, and performance. Work in these areas is further focused, stimulated, and validated by two key ``driving application'' research areas: scientific visualization and a Center-wide effort on telecollaboration for mechanical CAD. An increasing number of collaborative projects have become essential for realizing the Center's goals. These projects and the sites involved are shown in the diagram below.

Please note: Highlight boxes throughout the following sections (text set off by boxes) mark projects that represent particularly high-profile research or that showcase Center examples of education and outreach.

A. Achievements

A.1 Modeling

Geometric modeling, which plays a key interdisciplinary role in the Center, underlies all computer graphics technologies. Although substantial progress has been made since our original proposal, modeling remains a major bottleneck in graphics with many fundamental issues still only partially understood.

Our research plan is driven by two complementary goals:

1. Improve modeling in current applications, testing the results through the rigors of real-world problems in geometric modeling, CAD/CAM, and scientific visualization.

2. Create fundamentally new modeling methods for the next generation of applications through research focused on geometric modeling, physically-based modeling, biological modeling, and new mathematical shape and behavior representations.

Improvements in modeling extend the domains of application for computer graphics, while at the same time advances in the application areas stimulate fundamental research in modeling. For this reason, modeling research within the Center has been highly leveraged by collaboration with researchers in other areas of computer graphics. Moreover, other research thrusts in the Center, such as those of tracking and display, have made advances that would not have been possible without access to the most recent modeling advances.

The wide range of important modeling domains indicated above provides strong motivation for pursuing modeling research as a Center. We achieve greater coverage through complementary specializations and research activities that reinforce one another. We describe our modeling research in part through its relations to other areas in graphics but our achievements are organized below by Center subareas, with an additional section on fundamental advances.

A.1.1 Center Core Area Achievements

Performance. The UNC and Utah sites collaborated on several joint design-and-manufacture efforts, including the design and rapid production of a head-tracker component, the HiBall, now used in the experimental UNC wide-area ceiling tracker (see Performance, Section A.4, highlight box), and the housing for a new see-through video head-mounted display. These projects serve to drive and test our manufacturing algorithm research (http://www.cs.brown.edu/Center/resea/Center_modeling.html), and have also focused our research in distributed collaborative modeling systems [CUTT97] [RIES97] (discussed further in Telecollaboration, Sections A.6 and B.6), which reflects the increasing geographical distribution of team members.

A Center undergraduate is lead author on a paper describing an animation optimization technique designed to move objects in user-desired ways with minimal interaction [RAMA97] that improves on earlier Center research [BARR92]. The new Euler-Lagrange subdivision technique aids rapid convergence of the minimization method and produces a 1000X speedup.

Interaction. We have recently developed a scalable, interactive multi-resolution tool that operates directly on complex polygonal meshes of arbitrary topology by using wavelet subdivision and smoothing algorithms to extend existing surface representations. The essential core algorithms are local and adaptive, a considerable efficiency advantage [SCHR96a] [ZORI97] (http://www.gg.caltech.edu/~dzorin/multires/meshed/).

Previous interaction/modeling projects have included a time-critical approach to collision detection that approximates object shapes at multiple levels of detail by using sets of hierarchical spheres [HUBB95a][HUBB95b].
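
To convey the flavor of this approach, the sketch below (in Python, with hypothetical names; it is not the implementation of [HUBB95a]) tests two sphere hierarchies for overlap under a fixed budget of sphere-sphere tests, so that accuracy degrades gracefully when time runs short.

    import math

    class SphereNode:
        """A node in a sphere hierarchy: a bounding sphere whose children
        approximate the enclosed geometry more tightly."""
        def __init__(self, center, radius, children=None):
            self.center = center              # (x, y, z)
            self.radius = radius
            self.children = children or []

    def spheres_overlap(a, b):
        dx, dy, dz = (a.center[i] - b.center[i] for i in range(3))
        return math.sqrt(dx * dx + dy * dy + dz * dz) <= a.radius + b.radius

    def collide(a, b, budget):
        """Approximate collision query between two sphere trees.  'budget'
        caps the number of sphere-sphere tests, so the answer is refined only
        as far as the available time allows."""
        stack, tests = [(a, b)], 0
        while stack and tests < budget:
            na, nb = stack.pop()
            tests += 1
            if not spheres_overlap(na, nb):
                continue
            if not na.children and not nb.children:
                return True                   # leaf spheres overlap
            # descend into the coarser (larger) node for a tighter fit
            if na.children and (not nb.children or na.radius >= nb.radius):
                stack.extend((child, nb) for child in na.children)
            else:
                stack.extend((na, child) for child in nb.children)
        return bool(stack)                    # unresolved pairs count as hits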

Scientific Visualization. Magnetic resonance imaging (MRI) research in the Center includes the measurement of biological structures and the creation of geometrically and physically accurate representations of those structures as graphics models. Through the Biological Imaging Center at Caltech the Center has access to an MRI microscope capable of ten-micron resolution, as well as access to clinical MRI machines. Using chemical segmentation and Bayesian classifications, we have developed an approach to tissue classification from volumetric MRI data. Our new family of classification algorithms models the voxel histograms with new mixture basis functions, and accurately locates pieces of geometric surfaces within voxels [LAID96]. See:

A Center research faculty member teaches a design course for mechanical engineering undergraduates that includes designing, building, and racing real vehicles. Students use the Alpha_1 research system for integrated design, process planning, and manufacturing to help them build the custom subsystems they need. While the engines, overall dimensions, and safety requirements are prescribed, the rest of the vehicle is the work of the student teams. The brake system assemblies, suspension, power train encasements, and even specialized gear assemblies, for example, were designed and manufactured in Alpha_1.

One of our two 1997 teams won first place in a four-hour endurance event, first place in braking, and fifth place overall. Out of 68 vehicles, only about 25 actually finished all the events (two of them were ours).

The accompanying image shows the 1996 SAE formula-style entry, whose skin was also modeled in Alpha_1. The skin was manufactured by making a positive form using Alpha_1 and building up the composite material.

To complete student projects of this complexity in this time frame would be impossible without sophisticated modeling and manufacturing support. Even a few years ago, it was not possible to take on so many custom aspects, especially with the small budget available to the teams. Alpha_1 research in design and manufacturing environments, sponsored in part by the Center, has led to the present capabilities. We take pride, and hold a sense of co-achievement, in the teams' successes.

A.1.2 Fundamental Achievements

Construction of a C1 Interpolating Subdivision Scheme for Arbitrary Meshes. Used in many efforts throughout the Center, subdivision is a powerful technique for generating a smooth and visually pleasing surface from a mesh of arbitrary topology. Given an initial triangular mesh, we have developed an interpolating scheme that retains the simplicity of the butterfly scheme of Dyn, Levin, and Gregory but creates smoother surfaces, especially in regions of irregular topology.
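
For reference, the sketch below shows the classical eight-point butterfly rule for the new vertex inserted on an edge of a regular (valence-six) triangle mesh. The variable names and the default tension value are ours, and the Center's scheme further modifies the rule near irregular vertices, which is not shown here.

    import numpy as np

    def butterfly_edge_point(p1, p2, q1, q2, w1, w2, w3, w4, w=1.0 / 16.0):
        """Classical butterfly rule for the vertex inserted on an edge of a
        regular triangle mesh.
        p1, p2 -- endpoints of the edge being split (kept unchanged, since the
                  scheme is interpolating)
        q1, q2 -- the vertices opposite the edge in its two adjacent triangles
        w1..w4 -- the four outer 'wing' vertices of the stencil
        w      -- tension parameter; 1/16 gives the standard weights
                  1/2, 1/8, -1/16."""
        p1, p2, q1, q2 = (np.asarray(v, float) for v in (p1, p2, q1, q2))
        wings = sum(np.asarray(v, float) for v in (w1, w2, w3, w4))
        return 0.5 * (p1 + p2) + 2.0 * w * (q1 + q2) - w * wings

Because the rule only adds new edge vertices and never moves existing ones, the refined surface passes through every vertex of the input mesh.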

Representing Surfaces with Manifolds. We have developed a new graphical representation that uses the structure of manifolds. Multiple overlapping parameterizations, called charts, describe the topology of an object, and functions defined through these multiple parameterizations then associate geometry with the object. Our system generalizes the traditional uniform tensor-product B-spline basis functions to make surfaces of complex topology and arbitrary levels of parametric continuity automatically from polyhedral sketches of arbitrary topology [GRIM95a].

Torn B-Spline Surface Representation. The torn B-spline surface representation [ELLE95] is an approach to designing with partial and nonisoparametric feature curves in tensor product nonuniform rational B-spline (NURBS) surfaces. The torn surfaces can be trimmed and can have multiple intersecting feature curves. We demonstrated its use [ELLE96] on applications such as mathematical data fitting, mechanical modeling, modeling of manufacturing processes such as stamping, and topographical modeling.

Tests of Structured Modeling Principles. We have applied principles of structured modeling [BARZ92] to developmental modeling [FLEI95a] and to cellular texture generation using a biologically-motivated cellular development simulation [FLEI95b]. The resulting developmental models combine elements of chemical, cell-lineage, and mechanical models of morphogenesis. They can represent a wide range of biological phenomena, explain biological mechanisms, and be used for graphics modeling of complex organic phenomena. See:

Linear-Complexity Articulated-Body Models with Dynamic Constraints. By using a structured modeling context, we have developed a much faster and more general implementation of articulated body simulation. The work combines Featherstone-style articulated models with dynamic constraints to create new equations of motion. The algorithm has linear time complexity in the loop-free parts of the articulated system [PFAR96].

Parametrization artifacts are pervasive in parametrically defined curves and surfaces. To improve the resulting interpolants in data-fitting and animation applications, our approach uses nonlinear constraints to produce smoother curves and surfaces while freeing the user from the distracting task of interacting with the parameterizations [DRIS95].

Tangent plane continuity at patch boundaries is important in modeling arbitrary topologies from multiple rectangular patches. We have developed nonlinear (cubic) constraint techniques to create G1 continuity across the edges of datafit tensor product spline surfaces whose edges might change under optimization [SAND96]. These methods have been used to recover and model vascular structures from MRI data [SAND97].

A.2 Rendering

Our primary focus during the past six years has been to develop physically-based lighting models and perceptually based rendering procedures for computer graphics to produce synthetic images that are visually and measurably indistinguishable from real-world images. Physical simulation fidelity is of primary concern.

For several decades now, computer graphics simulations have been used for a wide range of tasks such as pilot training, automotive design, and architectural walkthroughs. The entertainment industry has developed techniques for creating startling special effects and realistic simulations. Even virtual reality games use convincing imagery with great success. But are these images correct? Would they accurately represent the scene if the environment actually existed? In general, the answer is no, although the effects are appealing because the images are believable.

If we can generate simulations that are guaranteed to be correct, they can then be used in a predictive manner. This major paradigm shift will make it possible to use computer graphics algorithms for testing and developing printing technologies, photographic image capture, the design of display devices, and algorithmic development in image processing, robotics and machine vision.

However, simulations used for prediction must be provably correct. Fidelity is the key. This difficult task requires a major multidisciplinary effort among physicists, computer scientists, and perception psychologists. Unfortunately, very little work has been done to date in correlating the results of computer graphics simulations with real scenes. However, with more accurate image acquisition and measurement devices available, and with increased computer processing power, these comparisons can now be achieved.

A.2.1 Global Illumination Research

Over the past two years the Center has articulated and refined a framework for global illumination research that will be presented in a special SIGGRAPH session this August. Our research framework has subdivided the system into three parts: the local light reflection model, the energy-transport simulation, and the visual display algorithms. The first two parts are physically-based and the last is perceptually based.

Real or Simulated?

When the Center began, we started building the light measurement laboratory at Cornell to assess quantitative differences between actual scenes and rendered simulations of those scenes. Previously, comparisons were based on subjective appearance alone. The images shown here represent calibrated quantitative comparisons of real vs. simulated versions of a simple, controllable scene. We have been able to simulate how light scatters off each surface and propagates through the scene, and have verified our results using the Center's measurement facilities. The repercussions of this research are far-reaching: the ability to predict exact lighting and color appearance will not only improve visualization accuracy, but also enhance the development of new technology for color printing, digital photography, and image displays.

Light Reflection. Our ultimate goal in light reflection work is to derive an accurate, physically-based local light reflection model for arbitrary reflectance functions. Models developed approximately 25 years ago at the University of Utah [PHON75] have been improved [BLIN77] [COOK81] but are not sufficiently accurate or general, though they still dominate standard graphics pipelines.

In its first years the Center contributed a sophisticated but computationally expensive model based on physical optics that incorporates the specular, directional diffuse, and uniform diffuse reflections by a surface [HE91] [HE92]. This past year, we introduced a new class of primitive functions with nonlinear parameters for representing reflectance functions. The functions are reciprocal, energy-conserving and expressive, and capture important phenomena such as off-specular reflection, increasing reflectance with angle of incidence, and retroreflection [LAFO97]. Most importantly, the representation is simple, compact, and uniform and has been verified by comparisons to our physically-based model and actual measurements.
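
As an illustration of the kind of compact, nonlinear representation involved, the sketch below evaluates a reflectance function written as a diffuse term plus a sum of generalized cosine lobes, in the spirit of [LAFO97]. The parameter names and the example lobe settings mentioned in the comments are ours, not measured values.

    import numpy as np

    def cosine_lobe_brdf(wi, wo, rho_d, lobes):
        """Reflectance as a diffuse term plus generalized cosine lobes.
        wi, wo -- unit incident/outgoing directions in the local frame
                  (z axis = surface normal)
        rho_d  -- diffuse albedo
        lobes  -- list of (cx, cy, cz, n): per-lobe axis weights and exponent"""
        wi, wo = np.asarray(wi, float), np.asarray(wo, float)
        value = rho_d / np.pi
        for cx, cy, cz, n in lobes:
            dot = cx * wi[0] * wo[0] + cy * wi[1] * wo[1] + cz * wi[2] * wo[2]
            if dot > 0.0:
                value += dot ** n
        return value

    # The form is reciprocal by construction (wi and wo enter symmetrically).
    # A lobe with cx = cy = -1, cz = 1 and a large exponent acts like a
    # specular highlight; unequal weights give off-specular peaks, and
    # positive cx, cy produce retroreflection.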

We have established a sophisticated light measurement laboratory with NSF and industry support. We are now able to measure directional-hemispherical diffuse reflectance and specular reflectance as a function of wavelength and incident angle, and the isotropic bidirectional reflectance functions (BRDFs) over a range of incidence and reflection angles and a range of wavelengths. We can also measure spectral radiance from a point source, the spectral radiance of a scene, and surface roughness parameters for physically-based reflectance models [TORR95] [FOO95] [CHEN95].

The ability to measure physical environments radiometrically greatly improves the Center's capacity to carry out controlled experiments and to compare simulations with real-world environments. Descriptions and data from our ``Cornell Box'' experiments are available at:

http://www.graphics.cornell.edu/cbox

Bidirectional reflectance distribution function (BRDF) measurements are added to the site as they become available, building a materials database for the graphics community.

Light Transport. The second part of our research framework involves simulation of the physical propagation of light energy. The two most common physically-based rendering methods used today are stochastic ray-tracing [WHIT80] and radiosity [GORA84]. Although during the past fifteen years many improvements have been made, both techniques still neglect various significant mechanisms of light transport.

A general formalization of the rendering equation has been well known [KAJI86], but until recently neither the processing power nor a sufficiently accurate reflection model has been available to perform predictive simulations. We are now using a density estimation framework that splits light transport and lighting representation into separate computational stages [WALT97A].

This physically-based method supports characterization of error and can handle complex geometries and general reflectance functions.

In the transport stage, we first simulate the flow of light between surfaces using Monte Carlo particle tracing, without explicitly reconstructing the lighting on them. Since the intensity of the lighting on each surface is proportional to the ultimate density of light particles, local lighting reconstruction is a density estimation problem [SILV86]. Although computational requirements are enormous, both the particle-tracing and density-estimation phases can easily exploit coarse-grained parallelism, thus reducing computation time.
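
The sketch below illustrates the two-stage structure under simplifying assumptions: 'emit' and 'scatter' are hypothetical callbacks standing in for the Center's physically-based emission and reflection models, surfaces are parameterized on the unit square, and a fixed-bandwidth Epanechnikov kernel replaces the estimators of [WALT97A].

    import math

    def trace_particles(emit, scatter, num_particles, max_bounces=5):
        """Transport stage: follow light particles through the scene and record
        where they deposit power on surfaces.  'emit' returns (point, direction,
        power); 'scatter' returns (surface_id, uv, point, direction, power) for
        the next hit, or None if the particle leaves the scene."""
        hits = {}                                  # surface_id -> [(uv, power)]
        for _ in range(num_particles):
            point, direction, power = emit()
            for _ in range(max_bounces):
                result = scatter(point, direction, power)
                if result is None:
                    break
                surface_id, uv, point, direction, power = result
                hits.setdefault(surface_id, []).append((uv, power))
        return hits

    def estimate_illumination(hit_list, u, v, bandwidth=0.05):
        """Reconstruction stage: kernel density estimate of the power deposited
        near parameter point (u, v) on one surface (2D Epanechnikov kernel)."""
        h2 = bandwidth * bandwidth
        total = 0.0
        for (hu, hv), power in hit_list:
            d2 = ((hu - u) ** 2 + (hv - v) ** 2) / h2
            if d2 < 1.0:
                total += power * (1.0 - d2)
        return 2.0 * total / (math.pi * h2)        # power per unit parameter area

Because particles are traced independently and each surface is reconstructed independently, both stages parallelize in exactly the coarse-grained way described above.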

We use the light measurement laboratory again to compare the resulting simulated radiant energy on an image plane with measured values at full dynamic range and resolution. The last stage of realistic image synthesis then incorporates human perceptual factors to generate a visual image.

Perception. While the physically-based rendering methods described above make it possible to simulate accurately the radiometric properties of scenes, physical accuracy does not guarantee that the final images will have a realistic visual appearance. Current display devices are limited in a number of ways, including spatial resolution, temporal resolution, absolute and dynamic range, and color gamuts. In addition, the observer of the physical scene and the observer of the display may be in very different visual states, which can affect how they perceive the visual information before them.

A better understanding of the spatial, temporal, chromatic, and three-dimensional properties of vision can lead to even more realistic and efficient graphics algorithms. Our research framework in this area is based on the idea of a tone reproduction operator, introduced by Tumblin [TUMB93], which incorporates the physical transfer properties of the display device and the visual states of the scene and display observers. Using this visual model to determine the mapping from simulated scene radiances to display radiances that produce a perceptual match between the scene and the displayed image, we can produce images predictive of what observers in the simulated scene would see. This allows the images to be used quantitatively in areas such as illumination engineering, transportation and safety design, and visual ergonomics.
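
As a concrete example of a tone reproduction operator, the sketch below applies a simple global scale factor in the style of Ward's contrast-based operator; it is a stand-in for the richer visual models discussed here, and the display maximum is an assumed value.

    import numpy as np

    def tone_map(scene_luminance, display_max=100.0):
        """Map simulated scene luminances (cd/m^2) to normalized frame-buffer
        values using a single global scale factor chosen so that threshold
        contrasts in the scene remain at threshold on the display."""
        lum = np.asarray(scene_luminance, dtype=float)
        l_wa = np.exp(np.mean(np.log(lum + 1e-6)))     # log-average adaptation
        m = ((1.219 + (display_max / 2.0) ** 0.4) /
             (1.219 + l_wa ** 0.4)) ** 2.5             # Ward-style scale factor
        return np.clip(m * lum, 0.0, display_max) / display_max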

Current global illumination algorithms spend a great deal of time computing unimportant and imperceptible scene features. Algorithms can be substantially accelerated by using error metrics that correctly predict the perceptual thresholds of scene features. The establishment of these techniques will not only allow realistic visual display, but will also provide a feedback loop to improve the efficiency of the global illumination computations.

A.2.2 Inverse Rendering and Image-Based Rendering

While the ``direct'' problem of image synthesis continues to dominate computer graphics, it has become increasingly clear that ``inverse'' problems are also vitally important. In particular, the need to acquire a detailed 3D model of an existing object arises naturally in several contexts, such as augmented reality, in which synthetic imagery is to be fused seamlessly with images of actual scenes. Doing this requires a knowledge of scene geometry, materials, and possibly illumination, any of which may need to be inferred from the images themselves. This problem, called ``inverting the rendering equation'' within graphics, corresponds exactly to the class of problems central to computer vision.

The recovery of shape and material information from images is extremely challenging and is far from being solved in complete generality. However, we are actively pursuing a number of approaches that promise to be extremely useful, albeit not completely general. First, the notion of using a large number of images (a ``sea of cameras'' [FUCH94]) has led to some rather dramatic breakthroughs in geometry capture by removing the constraint of monocular or binocular vision typically imposed in robotics [KAMB96] [KANA96]. A second approach described under Telecollaboration, Sections A.6 and B.6 is the use of ``structured light'' [TRO95] to obtain range information from arbitrary geometries using a small number of images.

A third approach is ``model-based'' recognition, which seeks to match a parameterized model with one or more images that are known a priori to depict an instance of the model; the model may be as simple as a surface of revolution or as complex as a human face [DU94]. The latter example is clearly applicable to telecollaboration, where human faces are the dominant features of the scene.

Image-based methods [LEVO96] [GORT96] offer a wholly new approach using essentially high-dimensional interpolation, which sidesteps most of the problems that make inversion ill-posed. The recent success of these methods suggests that they are a viable alternative to inversion when geometry capture is not the explicit objective, that is, when the goal is simply to generate new images [CHEN95] [MCMI95] [TORB96]. We are exploring the theoretical limits of image-based rendering using a number of tools from approximation theory, in a manner analogous to our previous efforts to analyze global illumination.

Data obtained from Cornell's light measurement laboratory are likely to be instrumental in closing the gap between theory and practice. For example, we will need to characterize several aspects of actual BRDFs, such as the maximum reflectivity and the spread of the specular peak, to verify error bounds for a range of interpolated images constructed from a sequence of images. This is an excellent opportunity to combine measurement and analysis and should provide a better foundation for future algorithms.

A.3 Interaction

Graphical user interfaces (GUIs) have been largely responsible for the commercial success of desktop applications and have allowed expert users as well as novices, even preschoolers, to be effective computer users. In fact, the graphical user interface is as important in making users productive as the application's functionality and performance.

Through steady improvements in hardware as well as in graphics algorithms, three-dimensional (3D) computer graphics is rapidly changing from an esoteric and expensive specialty to a commodity, just as 2D graphics did when Apple released the Macintosh in 1984. 3D graphics has already become an essential tool in both science and industry for analysis, design, production, and exploration, as well as in the latest video games. Nonetheless, 2D GUIs (often called the WIMP interface, for windows, icons, menus, and pointing) are still dominant even for 3D applications. We believe that new user-interface technology, such as the Center's pioneering 3D widgets [CONN92], can be far more productive for 3D applications than WIMP GUIs. Our long-term goal is to make user interaction with computer-based objects at least as easy as interaction with comparable real-world objects, particularly for familiar tasks. The strategy for achieving this goal is threefold:

Center interaction research involves collaboration among multiple sites and with other universities. The Center's efforts have influenced commercial products (e.g., Silicon Graphics' Open Inventor and Caligari's True Space) and have created 3D interface tools now used in the University of Utah's SCIRun system [PARK95] and at NASA.

Sketch: An Innovative New Interface for 3D Modeling

Today's 3D modelers allow the specification and display of complex and intricate geometries, but are difficult to use and do not address many of the distinct stages in the design process. In particular, current modeling applications, tailored for precise specification, fail entirely to address the early stages of design when conceptual ideas must be prototyped quickly, but details are intrusive.

The Sketch system [ZELE96] represents a new paradigm in 3D interaction that takes advantage of natural gestural drawing idioms to bridge the gap between hand sketches and computer-based modeling programs. Using Sketch is like drawing with pencil and paper in that only gestures are used (no menus, dialog boxes, or buttons). Unlike sketching on paper, which produces a static image, Sketch turns the user's hand motions into a dynamic 3D model.

Users "sketch" objects using familiar graphical conventions for representing 3D, such as three lines sharing a vertex to show the corner of a cube. Because objects in Sketch are defined gesturally, not with numerical input, they may be only approximate models of a final idea. To convey the sense of an informal drawing, Sketch uses a non-photorealistic rendering method (see Interaction, Section B.3) that makes models look as if they were drawn by hand instead of a computer. In addition, once learned, the Sketch system is enjoyable to use&emdash; a difficult-to-quantify but important factor.

Sketch was enthusiastically received at SIGGRAPH, the premier computer graphics conference. Its success has led to a host of other Center research projects, including two-handed interaction [ZELEZ97] and non-photorealistic rendering [MARK97], as well as to corporate funding from Autodesk and SGI's Alias/Wavefront. In addition, a new collaborative Center project combining the gestural sketch paradigm with the Alpha_1 system at Utah is a major component of the all-site telecollaboration project. It will also be used for a multi-Center collaboration on virtual prototyping with the Fraunhofer Center for Research in Computer Graphics.

Interestingly, the inspiration for Sketch came not from the research community or industry but from a Center educational outreach program: Sketch's chief designer, Center staff researcher Bob Zeleznik, participated in a Saturday Academy workshop for inner-city school children and thought he would impress them by demonstrating how real, high-end modeling was done. Trying to explain a CAD interface to children accustomed to crayon and pencil interfaces made Bob acutely aware of the limitations of the WIMP interface style for 3D modeling. This realization led him to research gestural interfaces in order to make computer-based 3D modeling more fluid and more reflective of traditional artistic interfaces.

 

Two-Handed Interaction. Computer interfaces have so far been limited to keyboard and mouse-based solutions, even though people naturally work with two hands and in three dimensions. Two-handed interaction is increasingly being used in desktop [BUXT86][KABB94] and immersive VR environments [MAPE95]. Interfaces for compound continuous tasks (e.g., positioning an object in 3D) that use one-handed input are often less natural and less efficient than interfaces that split those tasks into parallel subtasks. For example, with our two-handed Sketch interface [ZELE97] the dominant and non-dominant hands can work together efficiently to simultaneously translate and rotate an object, corresponding more closely to real-world interaction. Other tasks such as viewpoint specification and temporary ungrouping are also simpler and more efficient with a two-handed interface.
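
A schematic sketch of this division of labor is shown below: each frame, the non-dominant hand supplies a coarse translation while the dominant hand supplies an orientation, and the two inputs are composed into a single object transform. The function and its inputs are illustrative assumptions, not the Sketch implementation.

    import numpy as np

    def compose_two_handed_transform(translation, rotation_axis_angle):
        """Combine per-frame input from both hands into one object transform:
        the non-dominant hand coarsely positions (translation) while the
        dominant hand simultaneously orients (axis-angle rotation).
        Returns a 4x4 homogeneous matrix."""
        axis, angle = rotation_axis_angle
        axis = np.asarray(axis, float)
        axis /= np.linalg.norm(axis)
        x, y, z = axis
        c, s = np.cos(angle), np.sin(angle)
        # Rodrigues rotation matrix for the dominant-hand input
        r = np.array([[c + x*x*(1-c),   x*y*(1-c) - z*s, x*z*(1-c) + y*s],
                      [y*x*(1-c) + z*s, c + y*y*(1-c),   y*z*(1-c) - x*s],
                      [z*x*(1-c) - y*s, z*y*(1-c) + x*s, c + z*z*(1-c)]])
        m = np.eye(4)
        m[:3, :3] = r
        m[:3, 3] = translation            # non-dominant-hand input
        return m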

Interaction in Immersive Environments. (see Scientific Visualization, Sections A.5.2 and B.5 and Telecollaboration, Sections A.6.4 and B.6.1)

Usability Studies. We have conducted user studies to identify values for the parameters that influence and are dependent on the accuracy with which a user can point at a target object with our immersive VR interaction techniques [FORS96] [AYER96] [PIER97]. In addition, we are conducting user studies to improve the Sketch interface. The modeling process is accelerated by the gestural nature of the interface, but we believe it is hindered by the indirect nature of the mouse and tablet input devices used to date. We are conducting user studies to compare the accuracy and speed of a mouse, tablet, and light pen for gesture operations. We expect the results to suggest preferred modes of input and influence the future design of the Sketch interface.

A.4 Performance

Performance-related research in the Center concerns a set of enabling technologies that are essential to real-time graphics and visualization, ranging from input to rendering to display. We work on innovative tracking methods for handling user head and hand input, leverage UNC's high-speed graphics engines and other techniques from all the sites to render images faster and more effectively, and develop new 3D displays that provide the proper images to each eye of each user.

The importance of high performance in an interactive system is that it can provide transparency to the user. For example, with high-quality rendering delivered by fast graphics engines, a scientist is not distracted by aliasing artifacts while interacting with molecules. With accurate tracking, a physician need not compensate for delayed display updates when moving a transducer during an ultrasound examination. And improved displays, higher frame rates, reduced lag, and accurate registration of objects allow users to spend more time in a virtual environment without simulator sickness.

A.4.1 Tracking

Precise, unencumbered tracking of a user's head and hands over a room-sized working area has been an elusive goal in modern technology and the weak link in most virtual reality systems. Current commercial offerings based on magnetic technologies perform poorly around such ubiquitous, magnetically noisy computer components as CRTs, while optical-based products have a very small working volume and often need to maintain line-of-sight access between tracking cameras and illuminated beacon targets (LEDs). Lack of an effective tracker has crippled a host of augmented reality applications in which the user's views of the local surroundings are augmented by synthetic data (e.g., location of a tumor in the patient's breast or the removal path of a part from within a complicated piece of machinery).

For years UNC has been pursuing a variety of new approaches to head and hand tracking, principally with DARPA funding but also with Center support. Our latest system, an optical tracker with a golf-ball-sized tracking target that is either worn on the head or held in the hand, consists of a miniature cluster of six optical sensors looking out onto a specially outfitted room whose ceiling tiles have been embedded with infrared LEDs. At this writing, the system is just becoming operational but is already exceeding expectations: it operates anywhere in a large (20 x 20 foot) room at a 1 kHz update rate and exhibits resolution of under 1 mm with excellent stability. The system should make possible a host of formerly stymied applications.

The system results from a satisfying collaboration among UNC, Utah, Brown, and Caltech. Its excellent performance encourages us to develop future trackers that can operate outside instrumented rooms, e.g., outdoors, and whose targets can be miniaturized beyond a golf-ball-sized enclosure and embedded, for instance, within normal-appearing eyeglass frames. The availability of such radically improved trackers may well stimulate a wide variety of new applications, from on-site architecture planning to allowing emergency rescue personnel to see inside buildings.

A.4.2 Rendering Hardware and Software

Graphics Systems. The Center uses the DARPA-funded advanced computer graphics engine at UNC, Pixel-Planes 5, which has achieved record performance in rendering speed. The Center-wide availability of such hardware, together with our infrastructure, enables us to undertake graphics and visualization projects that we would otherwise be unable to tackle. The hardware is used, for instance, in a Utah/UNC collaboration to render complex models in real-time and in a Cornell/UNC collaboration to improve shading interpolation by using higher-order polynomial approximations.

Analog VLSI. Caltech has patented several algorithms that have eliminated barriers to analog VLSI system design. The goal is to create mass-producible low-power silicon structures for real-time graphics operations for modeling, rendering, tracking, model acquisition, and interaction that are several orders of magnitude faster and more reactive than today's digital systems. We are now ready for the design and construction of the first analog VLSI computer graphics system.

Image-Based Rendering. Image-based rendering is an important new approach to rapid rendering of highly realistic images (see Rendering, Section A.2). A typical image-based approach is to use cameras to collect real-world information, and then to use discrete reference images to create new views within a continuous range by interpolation, typically image warping. The advantage of image-based rendering over polygon rendering is the ability to display natural, real-world (as well as synthetic) scenes at rendering speeds that are independent of scene complexity.

The Center became an early leader in image-based rendering with the plenoptic modeling work of Leonard McMillan and Gary Bishop at UNC, and continues to pursue the field vigorously [MCMI95C]. In addition to developing both theory and algorithms, we are beginning work on ImageFlow, a hardware architecture that will display high-resolution, highly detailed views of natural and synthetic scenes at interactive rates using image-based rendering techniques.

A particularly exciting application of image-based rendering is post-rendering warp [MARK97], which promises a cheap but effective 5X - 10X speedup for renderers originally generating only a few frames per second. The fundamental idea is to take the combination of image and z-buffer from the renderer and warp it to nearby user positions while waiting for the next image from the renderer. The most serious problem with such image warping is gaps caused by newly unoccluded regions for which there is no information. Our current solution is to use two range images rather than one, warp each to the current user location, and then merge them. To avoid doubling the burden on the original renderer, the two range images used are the last two it generated; and to avoid the problem of these having been generated for viewpoints far from the user's current location, the renderer is asked to generate each image not for the current location but for the location where the user is predicted to be at the renderer's next frame time. The results from the current software (non-real-time) implementation of post-rendering warp are so striking that viewers typically have a hard time detecting the difference between post-rendering warp and standard per-frame rendering. It appears feasible to implement post-rendering warp on a single chip, allowing such dramatic performance gains at a very modest cost.
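
A minimal sketch of the core warping step appears below: each pixel of the reference image is unprojected using its z-buffer value, moved into the predicted viewpoint, and reprojected; destination pixels that receive no sample are exactly the newly unoccluded gaps that the second reference image is used to fill. The camera model and variable names are assumptions, and a per-pixel Python loop is used only for clarity.

    import numpy as np

    def forward_warp(color, depth, K, R, t):
        """Forward-warp a rendered image plus z-buffer to a nearby viewpoint.
        color -- (H, W, 3) reference image from the renderer
        depth -- (H, W) depth of each pixel along the view axis
        K     -- 3x3 camera intrinsics shared by both views
        R, t  -- rotation and translation from the reference to the new view
        Pixels that receive no sample remain black: these are the gaps."""
        h, w = depth.shape
        ys, xs = np.mgrid[0:h, 0:w]
        pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
        pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # unproject
        pts_new = R @ pts + t.reshape(3, 1)                   # move to new view
        proj = K @ pts_new
        u = np.round(proj[0] / proj[2]).astype(int)
        v = np.round(proj[1] / proj[2]).astype(int)
        out = np.zeros_like(color)
        zbuf = np.full((h, w), np.inf)
        src = color.reshape(-1, 3)
        for i in range(pix.shape[1]):                         # splat with z-test
            if pts_new[2, i] <= 0:
                continue
            ui, vi = u[i], v[i]
            if 0 <= ui < w and 0 <= vi < h and pts_new[2, i] < zbuf[vi, ui]:
                zbuf[vi, ui] = pts_new[2, i]
                out[vi, ui] = src[i]
        return out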

Time-Critical Computation. An important issue in scene generation is the scheduling necessary to balance frame rate and image fidelity appropriately. We call the class of rendering and simulation techniques that addresses these issues time-critical computing; it enables an application to maintain an interactive frame rate by automatically reducing presentation quality, thus preserving interactivity at all times even in graphics environments with different performance characteristics. It relies on degradable algorithms to implement these trade-offs, a scheduler to decide dynamically which trade-off is most appropriate, and a predictor to estimate the performance of the selected trade-off. We have been investigating degradable algorithms for scan-conversion-bound rendering [WLOK95][BISH94], for collision detection [HUBB95a][HUBB95b], and for rendering real-time motion blur [WLOK96] [SCHE97]. We have also been studying performance measures, such as throughput and end-to-end lag, how to measure them, and how time-critical computing improves them [WLOK95][JACO97].
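
The sketch below shows this control structure in miniature: a predictor keeps a smoothed per-level cost estimate, the scheduler picks the best degradable setting whose predicted cost fits the frame budget, and measurements after each frame update the predictor. The 'render' callback and the level names are placeholders for the degradable algorithms cited above.

    import time

    def pick_level(levels, cost, frame_budget):
        """Scheduler: choose the best quality level (levels are ordered from
        cheapest to best) whose predicted cost still fits the frame budget."""
        chosen = levels[0]
        for level in levels:
            if cost[level] <= frame_budget:
                chosen = level
        return chosen

    def time_critical_loop(levels, render, num_frames, frame_budget=1.0 / 30.0):
        """Predictor + scheduler loop: render each frame at the chosen level,
        then update that level's cost estimate by exponential smoothing."""
        cost = {level: frame_budget for level in levels}   # initial guesses
        for _ in range(num_frames):
            level = pick_level(levels, cost, frame_budget)
            start = time.perf_counter()
            render(level)
            elapsed = time.perf_counter() - start
            cost[level] = 0.7 * cost[level] + 0.3 * elapsed

    # Example use: time_critical_loop(["coarse", "medium", "fine"], draw_scene,
    # 600), where draw_scene(level) renders the scene at the requested fidelity.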

A.4.3 Image Display Technologies

We have been pursuing two promising methods of image display for interactive 3D applications: video see-through HMDs and the stereo Multi-Viewer Display. The video see-through augmented reality system, designed specifically for surgical applications in collaboration with Utah (see Telecollaboration, Section A.6, highlight box), is lightweight and open, and provides optically registered real and synthetic objects. It includes two video cameras for real-world capture, LCDs that combine real-world and synthetic imagery, and a hybrid tracking technology that provides accurate registration of the real and synthetic objects the user sees simultaneously.

Multi-Viewer Display is a multiple-viewer, immersive, interactive, large-screen environment. At the heart of the system is a TI Digital Light Projector(TM) that can produce time-multiplexed stereo images for several users without flicker and at a high frame rate. The users wear shuttered glasses that separate the time-multiplexed color channels. Currently we can separate the three color channels, providing a monochrome stereo view for one user and a single-color monocular view for a second user. These results encourage us to extend this projector technology's capabilities to support dramatically higher frame rates, thereby supporting two or more simultaneously active users, each receiving the proper (and thus distinct) stereo image pair during each frame time. This contrasts with traditional CAVE implementations [CRUZ93], which track only one of multiple viewers, and should be an improvement over the Stanford group's recent achievement of tracking two users of the responsive workbench.

A.5 Scientific Visualization

The Center's wide-ranging accomplishments in scientific visualization have had significant impact on the fields of science, medicine, and engineering. Our strategy is twofold: we conduct fundamental research in the field of scientific visualization and we apply the Center's skills in its four core research areas to scientific visualization projects.

The Center has increased its profile in this field by delivering invited talks and publishing papers in scientific visualization journals and conferences, such as the new IEEE Transactions on Visualization and Computer Graphics and the annual IEEE Visualization conference, as well as in domain-specific conferences involving computational fluid dynamics, imaging, and bioengineering. The Center was one of the main exhibitors at the IEEE Visualization'96 Conference last year.

In addition, we maintain a number of software packages online to aid other researchers in this field. These include the SCIRun Computational Steering Software System, which will soon be available to academic institutions (and for commercial purchase) with the book of the same name forthcoming in 1997 from Birkhauser Press [PARK97b].

The Center is involved in a diverse set of scientific visualization projects that substantially further integrative relationships with other scientific institutions and serve to transfer Center research knowledge directly to practical application. Ongoing or newly formed Center collaborative relationships in scientific visualization include work with NASA (two separate projects), the Los Alamos National Laboratory, the Human Brain Project (and others) at the Caltech Biological Imaging Center, the Collaboratory for Microscopic Digital Anatomy, Cedars Sinai Medical Center, and the Nanomanipulator project at the departments of Computer Science and Physics at the University of North Carolina at Chapel Hill. In addition, the Center has an inter-site collaboration project, funded through the Director's Pool, to study wavelet methods for ECG/EEG visualization and computational modeling. Through these relationships, the Center has made significant scientific visualization accomplishments in the areas of user interfaces, scalar and vector field visualization, volume rendering, and image analysis.

A.5.1 Scalar and Vector Field Visualization

The Center has developed scalar and vector field visualization algorithms primarily aimed at large-scale scientific computing applications on unstructured grids. A new isosurface algorithm has been developed that reduces the complexity of the search phase from O(n) to O(sqrt(n)) [LIVN96]. This algorithm has recently been parallelized and run on large data sets (10 - 20 million elements) at the Los Alamos National Laboratory [SHEN96a]. We have developed new local and global vector field visualization algorithms based upon a three-dimensional version of the line integral convolution (LIC) technique [SHEN96b]. We have also introduced a bivariate rendering model to visualize two dependent vectors within a volume.
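
The central idea can be illustrated with a small sketch: treat each cell as a point (min, max) of its data range and report only cells whose interval brackets the isovalue. The version below sorts on the minimum and prunes with a binary search; the published algorithm organizes span space with a search structure that achieves the O(sqrt(n)) bound, which this simplified sketch does not.

    import bisect

    def build_span_index(cells, values):
        """Precompute each cell's (min, max) scalar range, sorted by minimum.
        'cells' is a list of vertex-index tuples; 'values[v]' is the scalar at
        vertex v."""
        spans = []
        for cell_id, verts in enumerate(cells):
            vals = [values[v] for v in verts]
            spans.append((min(vals), max(vals), cell_id))
        spans.sort()
        return spans

    def active_cells(spans, isovalue):
        """Cells intersected by the isosurface satisfy min <= isovalue <= max.
        Sorting on the minimum lets a binary search cut off the scan; the
        remaining filter on the maximum is where span-space search structures
        do better."""
        end = bisect.bisect_right(spans, (isovalue, float("inf"), -1))
        return [cid for mn, mx, cid in spans[:end] if mx >= isovalue]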

Image-Guided Streamline Placement. Accurate control of streamline density is key to producing several effective forms of visualization of two-dimensional vector fields. We have introduced a technique that uses an energy function to guide the placement of streamlines at a specified density. This energy function uses a low-pass-filtered version of the image to measure the difference between the current image and the desired visual density. We reduce the energy (and thereby improve the placement of streamlines) by (1) changing the positions and lengths of streamlines, (2) joining streamlines that almost abut, and (3) creating new streamlines to fill sufficiently large gaps. The entire process is iterated to produce streamlines that are neither too crowded nor too sparse.
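
The sketch below gives a simplified version of this energy measure: sample points of all streamlines are splatted into an image, the image is low-pass filtered (a separable box filter stands in for the paper's filter), and the energy is the squared deviation from the desired visual density. The optimizer described above lowers this energy by moving, lengthening, joining, or adding streamlines.

    import numpy as np

    def placement_energy(streamlines, width, height, target_density, radius=4):
        """Energy guiding streamline placement: rasterize streamline sample
        points, low-pass filter the result, and compare with the desired
        uniform grey level.  'streamlines' is a list of [(x, y), ...] polylines;
        'target_density' is the desired filtered intensity per pixel."""
        image = np.zeros((height, width))
        for line in streamlines:
            for x, y in line:
                xi, yi = int(round(x)), int(round(y))
                if 0 <= xi < width and 0 <= yi < height:
                    image[yi, xi] += 1.0
        kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
        for axis in (0, 1):                      # separable box low-pass filter
            image = np.apply_along_axis(np.convolve, axis, image, kernel, 'same')
        return float(np.sum((image - target_density) ** 2))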

Wavelet Methods for ECG/EEG Visualization and Computational Modeling. This intersite collaboration examines the feasibility of applying advanced multiresolution geometric modeling, numerical simulation, and visualization approaches to improve algorithms in this area. As part of our overall efforts to build a multiresolution mesh technology base, we have constructed ``libmesh'', which manages the discrete and continuous aspects of multiresolution meshes. It supports the entire class of 1-ring subdivision schemes and a fast coarsification strategy. In conjunction with this effort we are building a multiresolution mesh constructor, which uses volumetric data as input and directly generates multiresolution meshes.

Data and Image Analysis. Data interpretation is an important step in visualizing any form of measured data. The Center has been exploring two widely used forms of medical imaging, ultrasound and magnetic resonance imaging (MRI).

Ultrasound Image Data Use. The Center is applying its new real-time, stereoscopic, video-see-through augmented-reality system to ultrasound-guided needle biopsy, a medical procedure. The system combines properly registered live ultrasound data with views of the patient in a head-mounted-display (HMD). Trial experiments with physician collaborators are underway.

Geometric Model Extraction from Magnetic Resonance Volume Data. We have developed a computational framework for creating geometric models and images of physical objects. New algorithms developed within the framework include a goal-based technique for choosing MRI collection protocols and parameters and a family of Bayesian tissue-classification methods. This interdisciplinary work has been carried out in collaboration with the MRI team of the Human Brain Project at the Caltech Biological Imaging Center.

A.5.2 Scientific Visualization User Interfaces

Interface Widgets for NASA. We have developed 3D interaction techniques (or 3D widgets) for NASA, for use in visualizing and navigating scientific visualization environments. New accomplishments include an innovative interaction technique for context-sensitive 3D probing. We have implemented a prototype of this new widget functionality that gives the user finer positioning control of a widget near dynamic areas in a dataset (e.g., near reference surfaces).

Improved Selection and Manipulation in Immersive VR. The Center has created a new interaction style for immersive VR called ``projective manipulation.'' Projective manipulation extends established cursor-based desktop 3D interaction techniques to immersive VR. For example, to select a distant object, the user positions a finger in front of his or her eye so that the fingertip appears to be on top of the distant object. The user's fingertip is equivalent to a 2D cursor in desktop interaction [FORS96]. Part of this work has been done in collaboration with the User Interface Group at the University of Virginia [PIER97]. Preliminary results from pilot usability studies indicate that manipulating projections is more effective than techniques that require the user to intersect objects in 3D.
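
A minimal sketch of the selection test is given below: the chosen object is the one whose bounding sphere is first hit by the ray from the eye through the tracked fingertip, the 3D analogue of a 2D cursor visually covering the object. The bounding-sphere scene description is an assumption for illustration, not the Center's scene representation.

    import numpy as np

    def pick_by_projection(eye, fingertip, objects):
        """Select the object whose bounding sphere is first hit by the ray from
        the user's eye through the tracked fingertip.
        objects -- list of (name, center, radius)
        Returns the name of the nearest hit object, or None."""
        eye, fingertip = np.asarray(eye, float), np.asarray(fingertip, float)
        d = fingertip - eye
        d /= np.linalg.norm(d)
        best, best_t = None, np.inf
        for name, center, radius in objects:
            oc = np.asarray(center, float) - eye
            t = np.dot(oc, d)                    # closest approach along the ray
            if t > 0 and np.dot(oc, oc) - t * t <= radius * radius and t < best_t:
                best, best_t = name, t
        return best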

Lego Toolkit. Another approach to interaction in immersive environments that avoids the selection problem is to construct a physical prop to act as a counterpart to the virtual object. The Lego toolkit [AYER96] was designed to allow rapid prototyping and customization of input hardware using Lego parts and simple electronic sensors. The toolkit demonstrates that it is technologically and economically feasible to create application-specific hardware input devices.

 

A.6 Telecollaboration

Researchers at UNC and Utah have been collaborating since the beginning of the Center on the building of complex mechanical parts. Recently, we designed and built a new compact video-based see-through head-mounted display (HMD). This breakthrough lightweight HMD, which was primarily designed for augmented reality enhancements in surgical procedures [STAT96B], provides the user with unprecedented visual integration of synthetic 3D objects with objects in the real world -- for example integrating an ultrasound image of a tumor with images of the patient's breast. We expect the new HMD to enable a host of new surgical (and other) applications, which have been stymied by lack of such a display device. Besides producing useful results, these joint design and manufacturing efforts teach us what aspects of remote mechanical CAD are more or less useful and help to establish the research agenda in this area.

A.6.1 Our Vision

For the past four years our Center has engaged in multi-site, multidisciplinary distributed collaborative design and prototyping, and we have been learning how to do this more effectively. We have created wide-area tracking systems and video see-through head-mounted displays for augmented reality, and are currently working on very wide-field-of-view camera clusters for telepresence. In each of these projects individuals from a variety of disciplines and multiple sites come together to develop the concepts from initial ideas through detailed design, physical manufacturing, and system integration. Experience using the prototypes leads to iterative refinement of the design. Distributed teams incorporate mechanical, electrical, optical, and manufacturing specialists. Working in this way, we have created complex research prototypes quickly, doing things that haven't been done before and pushing the state of the art further than any of our sites working separately could have done.

During these collaborations, we have become increasingly aware of the severe limitations of current tools for such multi-site, multidisciplinary collaborations. We have become increasingly involved in building our own tools during these collaborations, to the point where we now have an all-site multi-threaded project in the development of collaboration environments for integrated design and prototyping. The Center is particularly well positioned to do this, since we have experience in CAD, manufacturing, rendering, user interaction, immersive environments, and related enabling technologies.

We have also realized how much more productive our team members are when they are in the same room than at their separate sites. Most of us intuitively feel that much of this increase in productivity comes from the sense of shared presence. This intuition has inspired us to make a variety of short- and long-term investigations into increasing the sense of presence in our collaboration environment over that provided by our state-of-the-art T1-based teleconferencing system and network interconnections. Although it is not yet clear what factors most significantly contribute to a sense of presence and increased productivity, we generally agree with the internal and external factors as identified in [Barfield93], particularly quality of the display and feeling of immersion within a shared space. Our working assumption is that even though making telepresence ``as good as being there'' will take decades, the potential payoff in increased productivity is likely to be worth the effort.

Some of our efforts toward achieving shared presence will have short-term payoffs, such as our wide field-of-view camera cluster, tracking technologies, image-based rendering, and 3D scene acquisition using Imperceptible Structured Light (UNC patent pending). Some efforts undertaken now will pay off in the longer term, such as head-mounted displays and automatic 3D full-scene acquisition via inverse rendering (see Rendering, Section A.2). We have great confidence in their ultimate payoff.

We envision a system that gives geographically dispersed participants a visceral sense of working together in a common design space. We want the shared design space to be a composite that incorporates representations of the design participants within a scene [FUCHS94] as well as various representations of the product being designed (2D and 3D sketches, images, parametric feature objects, and shape and function models). This design space could be even better than sharing a physical space because it incorporates design representations such as concept sketches, optical paths, mechanical stresses, and electrical characteristics. Fundamentally, this involves three components: (1) live image capture of each of the participants (and their immediate surroundings); (2) a sophisticated, distributed computer system providing a merged digital design world containing the participants together with the constantly changing product design; and (3) a variety of types of immersive (and non-immersive) displays for use by participants.

A.6.2 The Infrastructure

A handful of sites around the world are developing collaborative 3D mechanical CAD environments. The Fraunhofer Institute in Darmstadt, Germany, Hewlett Packard Research Labs, and MIT have recently established research into networked CAD tools and environments. Likewise, a number of research groups have built shared virtual environment systems in which remote participants can see and interact with other participants, applications, and even virtual objects. Two examples include the DIVE (Distributed Interactive Virtual Environment) system developed at the Swedish Institute of Computer Science, and SPLINE (Scalable Platform for Large Interactive Networked Environments) developed at MERL (Mitsubishi Electric Research Lab). Additionally, the ATR group in Kyoto, Japan has produced a collaborative environment that supports a semi-immersive display (wide projection screen) and, although not working specifically at mechanical design, has explored solutions to many of the same issues as the above CAD environments [YOSH95].

However, both categories of systems have limitations. Shared non-immersive CAD systems provide very little sense of presence for the design participants. On the other hand, the VR systems mentioned above are oriented towards social experiences in fairly simple worlds without the density or complexity of CAD model environments. Little research exists on how to best exploit the combination of and transition between immersive, collaborative virtual environments and 3D mechanical CAD. Finally, no current system provides a compelling feeling of shared presence among the various participants working together.

The system we are building includes state-of-the-art interaction techniques and features a tightly-coupled immersive modeling (mechanical CAD) system as well as technologies for telepresence and scene display linked to a rapid prototyping facility. The critical issue for this research is how to enable widely distributed engineering and design teams to interact, communicate, and work together in the context of complex models without regard to distance or location.

The entire telecollaboration project is depicted in the figure below, which shows the contributions of the individual Center sites, and how all of the pieces fit together to form the complete infrastructure.

A.6.3 Shared Virtual Environment (VRAPP)

The shared virtual reality application VRAPP (center of diagram) provides the central framework for our telecollaboration infrastructure. The software operates over the Center's T1-based communications link, communicating video, audio, and state between sites. Using VRAPP, remote participants can join together in a virtual conference room within which they can see and interact with multiple merged representations of the design in the shared space, together with other participants represented by avatars. The avatar motion in the virtual conference room follows each participant's location and orientation, which is controlled by the participant via a mouse or a tracked display device such as an HMD. In addition, when possible we superimpose live video of participants on their avatars. VRAPP can be used in a variety of environments from non-immersive (desktop) to semi-immersive (Active Desk, a table-size stereo display with head tracking), to fully immersive (BOOM and HMD). This approach lets VRAPP take advantage of each type of environment without impacting the others.
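
To make the state-sharing concrete, the sketch below shows the kind of lightweight, unreliable pose update each site might send so that remote sites can move the corresponding avatar. The packet layout, port, and field choices are our assumptions, not VRAPP's actual protocol; video and audio travel on separate channels.

    import socket
    import struct

    # Hypothetical avatar-state packet: participant id, position (x, y, z), and
    # orientation quaternion (qx, qy, qz, qw), sent many times per second.
    STATE_FMT = "!I3f4f"

    def send_state(sock, address, participant_id, position, orientation):
        """Send this participant's tracked pose to a remote site."""
        packet = struct.pack(STATE_FMT, participant_id, *position, *orientation)
        sock.sendto(packet, address)

    def receive_state(sock):
        """Receive one pose update and return it for driving the remote avatar."""
        data, _ = sock.recvfrom(struct.calcsize(STATE_FMT))
        fields = struct.unpack(STATE_FMT, data)
        return {"id": fields[0],
                "position": fields[1:4],
                "orientation": fields[4:8]}

    # Typical use (assumed address and port):
    # sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # send_state(sock, ("remote.site.edu", 9000), 3, (0.0, 1.6, 2.0), (0, 0, 0, 1))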

A.6.4 Interaction Techniques and Tools

To support natural interaction, our system must reflect various methods normally used to interact and communicate. For example, a simple way for someone to move a chair to the corner of a room is to speak its name and point to where it should be placed; to describe an innovative design for a phone receiver, simply sketch it.

We have developed a set of interaction techniques that address problems unique to collaborative environments such as contention between participants for control of objects [CONN97B] as well as interaction tools that apply well to single-user design situations. For example, our 2D interaction techniques provide an environment that is more ergonomic than most 3D immersive systems for long sessions, leverage many people's ability to sketch geometric shapes, and provide coordinated bimanual input for some tasks. These techniques, which are bundled into the Sketch system [ZELE96][ZELE97], support the creation of 3D geometry. We are currently improving VRAPP by adapting Sketch to be usable from within VRAPP, and updating Sketch to support yet more complex geometry by integrating it with Alpha_1 (see Interaction, Sections A.3 and B.3).

We are also exploring a range of interaction techniques specifically tailored to immersive environments. In particular, we have incorporated virtual laser pointers that make it easier to point out features of a model, and we have developed a suite of ``Through-the-Lens'' techniques for examining, directly manipulating, and navigating about any visible virtual object [PIER97].

In addition to seeing and interacting with one another, VRAPP participants can see and interact with virtual objects in the environment. We have incorporated Utah's Alpha_1 tool into VRAPP so that all users can view and discuss a model, while one of them makes modifications. Seeing and interacting with a CAD model and other collaborators combines several benefits from the HP Shared 3D Viewer, MIT's StudioNet(TM), DIVE, and SPLINE. Unlike these projects however, we are working to tightly couple the CAD tool with the immersive environment: not only can participants display a mechanical model within VRAPP, but they can also use VRAPP to make detailed changes to the model. By choosing Utah's Alpha_1 as our CAD tool we provide an industrial-strength system that is both readily usable by a large group of existing experienced users and accessible and modifiable by our toolbuilders.

A.6.5 Scene Acquisition

Ultimately we would like to be able to continually ``scan'' and reconstruct arbitrary 3D objects, people, and even entire rooms from remote locations. We want 3D video capability; while this should, in theory, be provided by existing image-based rendering techniques, their real-time application is generally hampered by the difficulties of acquiring real-time scene-depth information and of merging image and range information from many camera locations. Approaches to such 3D scene extraction can be characterized along a spectrum from no depth or range information (but a large, dense set of camera images) [LEVO96] to precise range information (but a sparse set of camera images). (Some approaches, such as plenoptic modeling [MCMI95D], can move back and forth along this spectrum, depending on implementation needs.) Our Center has been pursuing methods at many points along this spectrum.

At UNC we are pursuing real-time Imperceptible Structured Light (patent pending) scene acquisition techniques for dynamic environments. Although structured light techniques have been employed effectively for decades to extract depth from scenes, heretofore they have not been practical for scenes with humans, since the changing patterns have been too disturbing. The new technique, which minimizes the disturbing visual effects, consists of projecting in very rapid succession the pattern of interest and then its complement, so that when integrated over even a short period of time (say, 10msec), humans perceive a flat field of light. We project dynamic light patterns briefly into the scene, and then with the aid of synchronized cameras obtain and analyze scene images to determine range (depth). We combine the range information with properly registered color images to obtain dense 3D reconstructions, and then merge these reconstructions into our existing shared virtual environment so that they can be displayed at remote environments. At Caltech we are using similar structured light techniques to obtain range images of a single static object from multiple viewpoints and are working on robust automatic methods to combine that data into a unified surface description. These results will then be applied to the real-time dynamic scene acquisition work at UNC.
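
The complementary-pattern principle can be made concrete with a small numerical sketch: a pattern and its photometric complement sum to a constant, so an observer integrating over both sub-frames sees a flat field, while a camera synchronized to only one sub-frame still sees the structure needed for range extraction. The pattern, image size, and intensity level below are illustrative only.

    # Sketch of the idea behind Imperceptible Structured Light: project a
    # pattern and then its complement in rapid succession.
    import numpy as np

    h, w, level = 480, 640, 255.0

    # A vertical-stripe pattern of interest and its photometric complement.
    pattern = np.tile((np.arange(w) // 8 % 2).astype(float), (h, 1)) * level
    complement = level - pattern

    # Human eye: integrates the two sub-frames (projected within ~10 msec).
    perceived = 0.5 * (pattern + complement)
    assert np.allclose(perceived, level / 2)        # a flat field of light

    # Synchronized camera: exposed only while the pattern sub-frame is lit,
    # so the stripes remain available for depth (range) extraction.
    camera_image = pattern.copy()
    print("perceived std-dev:", perceived.std(), " camera std-dev:", camera_image.std())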

The Center has been collaborating with Professor Ruzena Bajcsy at the University of Pennsylvania to develop completely passive image-based methods that do not rely on controlled light or inherent scene textures. Currently we are able to use the automated depth-extraction methods of UPenn to do non-real-time image-based reconstructions of relatively small scenes or objects. As with the structured light reconstructions, we then merge these reconstructions into our existing shared virtual environment.

Our accomplishments related to display are discussed in Rendering, Sections A.2 and B.2, and Performance, Sections A.4 and B.4. Most relevant to the telecollaboration project is the work on image-based rendering described in those sections.

A.6.6 Mechanical Parts

Our current object of collaboration is a wide-field-of-view, high-resolution video camera cluster. The device we are designing will use six conventional cameras in a single cluster to construct an image with a 180 x 80 degree field of view, with six times the resolution of a single camera. The device will be used to capture high resolution images for 3D scene reconstruction as well as for conventional televideo systems. Even in the near term it will allow televideo systems in which each of many remote viewers can electronically pan and zoom about the cluster's field of view independently. The technical challenge in developing such a cluster lies in the difficulty of properly merging all the views into a seamless whole. This can only be achieved by optically arranging all the cameras to have the same center of projection. Achieving both the same center of projection and sufficient overlap to allow resampling without seams between the individual views has never been demonstrated to our knowledge. As with the HMD previously built, the multidisciplinary nature of the camera cluster means that we have to draw heavily on expertise at multiple sites.
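
The role of a common center of projection can be sketched geometrically: if every camera shares one projection center and differs only by a known rotation, each pixel back-projects to a view ray that can be resampled into a single composite image without parallax seams. The focal length, image size, and rotations below are illustrative, not the cluster's actual optical design.

    # Sketch of seamless resampling given a shared center of projection.
    import numpy as np

    def pixel_to_ray(u, v, f, cx, cy, R):
        """Back-project pixel (u, v) of a camera with focal length f and
        principal point (cx, cy), then rotate the ray into cluster coordinates."""
        d = np.array([u - cx, v - cy, f], dtype=float)
        d /= np.linalg.norm(d)
        return R @ d

    def ray_to_panorama(ray):
        """Map a unit ray to (azimuth, elevation) in the composite image."""
        azimuth = np.degrees(np.arctan2(ray[0], ray[2]))
        elevation = np.degrees(np.arcsin(np.clip(ray[1], -1.0, 1.0)))
        return azimuth, elevation

    def pan(deg):
        """Rotation of a camera panned about the shared center of projection."""
        a = np.radians(deg)
        return np.array([[np.cos(a), 0, np.sin(a)],
                         [0, 1, 0],
                         [-np.sin(a), 0, np.cos(a)]])

    # Two cameras panned +/-30 degrees: their central pixels land 60 degrees
    # apart in the composite, with no parallax to reconcile.
    for R in (pan(-30.0), pan(+30.0)):
        print(ray_to_panorama(pixel_to_ray(320, 240, 500.0, 320, 240, R)))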

B. Plans

B.1 Modeling

Where will modeling be in 20 years? While each current approach to modeling has some significant advantages, they all exhibit intrinsic, fundamental limitations. In the future, representations and methods will combine the advantages of several of today's approaches to form more general and significantly more powerful ones. The potential for such hybrids seems vast, yet today attempts to form them remain ad hoc because we lack adequate understanding, both foundational and experimental. Our integration of modeling with diverse applications in the Center advances our understanding of complex requirements, while our core theoretical research explores new approaches intended to achieve significant breakthroughs in geometric modeling, physically-based modeling, biological modeling, and new mathematical shape and behavior representations. In what follows, we detail concrete steps along this dual path.

B.1.1 Plans for Modeling in Support of Applications

Geometric Modeling, Modeling Design, and Modeling for Manufacturing. Design for manufacturing (DFM) will continue to be an important Center effort as it expands into different manufacturing processes with its integrated approach. We will use a generalized feature-based approach to capture design intent rather than the operational details of any specific process, which will continually change with emerging technologies.

Textbook examples notwithstanding, design for manufacture rarely progresses smoothly through a predetermined series of sequential stages. Especially in the early stages of a design, it is typical to consider more than one alternative design and to keep many versions and ancillary drawings or models active simultaneously. Through the Center, we have brought together experts in areas from sketch input to numerically controlled manufacturing to rethink a design environment for tomorrow's design and manufacturing technology. We will test our research results and advance the system as we use it to design new components unavailable from any commercial source. The integrated design context will support multiple levels of detail, multiple versioning, sketches, and live video of participating domain experts.

Telecollaborative design requires environments in which each user can explore a complex model, immersively and interactively refining it. In addition to research on integrated design, we will investigate how to distribute models across sites, how to create and query the distributed model, and how lag affects collaborative design.

Wavelet Methods for Visualization. We plan to create second-generation wavelet methods to quantify contour maps on irregular surfaces such as topographical maps (geology) and body surface potential maps (cardiology). While simple statistically based methods allow a certain level of comparison between contour maps, we seek to use wavelet methods to quantify the changes in both scalar magnitude and geometry that occur among maps.

A new collaborative project on inverse problems will generate multiresolution wavelet meshes to represent the geometry and create parallel multigrid solvers for ill-posed, large-scale biomedical problems with noisy data. Center research experts in both multiresolution modeling and biomedical inverse problems will participate (see Scientific Visualization, Sections A.5 and B.5).

B.1.2 Plans for Fundamental Research in Computer Graphics Modeling

Model Extraction. One way to construct complex models is to acquire shape and reflectance information from images of real-world scenes and objects, as discussed under Rendering. To assure that gathered data is suitable for the intended applications, we will use an MRI machine to create dynamic models via 3D scanning. We are establishing a teleological pipeline to gather high-resolution anatomical MR data, classify tissues, reduce artifacts, make geometric and dynamic models, and then visualize them.

Physically-Based Modeling. We will create a mathematical foundation for a physically-based modeling (PBM) language and testbed, and develop the elements of the first PBM language for heterogeneous physical structures and media (rigid, flexible, fluid, with constraints). To further our goals in PBM, we have begun a long-term collaboration with Caltech's Control and Dynamical Systems group, supported by MURI funds. See:

To help develop methods for expressive motion of human figures, robust representations of dynamic contact between rigid and flexible objects, and simulations of instantaneous ``impulses'' when composite constrained objects collide with one another, we plan to create a PODE (piecewise ordinary differential equation solver) that applies to general physically-based state machines. Efforts in developmental modeling and artificial life testbeds will be aimed at methods to unify flexible, rigid, and fluid systems of objects.
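
The piecewise-ODE idea can be illustrated with a toy hybrid system: integrate the smooth dynamics of one piece until a discrete event occurs, apply an instantaneous impulse, and continue in the new piece. The bouncing point mass below is only an illustration under simple assumptions; the planned PODE solver targets general physically-based state machines, not this toy system.

    # Minimal sketch of a piecewise ODE with impulse events.
    def simulate_bouncing_mass(y=1.0, v=0.0, g=-9.8, restitution=0.8,
                               dt=1e-3, t_end=2.0):
        t, bounce_times = 0.0, []
        while t < t_end:
            # Smooth piece: ordinary ODE integration (explicit Euler).
            v += g * dt
            y += v * dt
            # Event: contact with the ground plane triggers an instantaneous
            # impulse, switching to the next piece with reversed, damped velocity.
            if y <= 0.0 and v < 0.0:
                y = 0.0
                v = -restitution * v
                bounce_times.append(t)
            t += dt
        return bounce_times

    print("impulse (bounce) times:", simulate_bouncing_mass())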

Mathematical Representations. To understand the complex requirements of modeling and explore more general mathematical approaches, we will develop new methods and modeling primitives and also use known mathematical methods in new computer graphics applications. This relates to projects at all sites of the Center, including:

B.2 Rendering

We are merging our simulation and analytic capabilities with our physical measurement and experimental perception environments. Comparisons among these capabilities and environments will let us provide the fundamental background necessary to improve the accuracy and fidelity of synthetic images with more efficient computational approaches.

We believe that in the next one or two decades, we can make time-varying computer graphics images that are both interactive and objectively realistic. This will be accomplished by removing perceptually secondary computations and leveraging advances in hardware. However, several fundamental issues in scene representation, algorithmic design, hardware, and perception must first be better understood. The boundaries between areas begin to blur as issues of performance and modeling become tightly coupled with issues of rendering, thereby reinforcing the goals of the Center.

Our long-term research goals are:

Light Reflection Models

Light Transport Simulation

Perceptual Research

Rendering and Inverse Rendering

B.3 Interaction

Using Gestures in Constructing Parametric Mechanical CAD Models. We are exploring the use of Sketch's gestural interface for a specific scenario that focuses on the Center's interests in mechanical design. We address modeling issues throughout the design process, from early conceptual design to detailed specifications, and support fluid iterative transitions between these extremes. Since each modeling stage produces models of different types, our model representations must facilitate lossless transitions between stages and blend changes made in one stage into another. We have found that the gestural 3D solids sketching work at Brown (which started with ``broad-brush''-level design and is working toward detailing) and the Alpha_1 [ALPHA1] parametric CAD modeling work at Utah (which started with mechanical detailing and manufacturing and is now working toward early conceptual design) are in fact even more compatible and complementary than we realized. Working together in the Center has made it possible for us to actively pursue synergies between the two approaches.

Our specific scenario has both top-down and bottom-up interface projects. Our top-down projects will use conventional input hardware such as mice and tablets to develop a range of interfaces for mechanical design, object placement, and telecollaboration applications. Initially, we will generalize the Sketch framework to support various 3D geometry engines, including those provided by Alpha_1, Autodesk, and Fraunhofer. Subsequently, we will extend Sketch to support a gesture set specific to mechanical design for constructing features such as pockets, fillets, and extrusions. Our bottom-up projects will explore a range of multimodal interfaces customized to human skills, including speech recognition, tracking various parts of the body (eyes, hands, and fingertips), and haptics (force feedback).

Interfaces for Telecollaboration (see Telecollaboration, Sections A.6 and B.6).

Free-form 3D Modeling. Our planned modeling system goes beyond the gesture-based techniques of Sketch to incorporate algorithms from computer vision (e.g., shape from shading). The goal is to allow a skilled artist to leverage his existing drawing skills to rapidly create and refine free-form shapes. For more details, see:

Time-Critical Support for Interaction (see Performance, Section B.4.2).

Haptic Interfaces. Just as previous researchers have invented specific visual idioms such as icons, windows, and menus, we have now begun to explore haptic idioms for presenting abstract information. Our approach contrasts with most haptic research, which focuses only on information with a literal haptic mapping. For example, using a haptic interface we want to transition seamlessly from constrained to unconstrained manipulation of an object in 3D. In addition, we will extend the real-time direct haptic rendering work [THOM97] to support haptic telecollaboration for mechanical design. For more details, see:
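
By way of illustration only, one simple way to blend constrained and unconstrained manipulation in a haptic loop is to weight a spring force toward a constraint surface by a falloff that releases with distance, so the user slides smoothly from guided into free-space motion. The gains and falloff below are assumptions for this sketch, not parameters from [THOM97].

    # Sketch of a distance-weighted constraint force for a haptic proxy.
    import numpy as np

    def constraint_force(pos, surface_point, surface_normal,
                         k=200.0, falloff=0.05):
        """Spring force attracting the haptic proxy to a plane, weighted so
        the constraint releases gradually with distance."""
        n = surface_normal / np.linalg.norm(surface_normal)
        signed_dist = float(np.dot(pos - surface_point, n))
        weight = np.exp(-abs(signed_dist) / falloff)   # 1 on the surface, -> 0 far away
        return -k * weight * signed_dist * n

    # Near the surface the pull is strong; far away it is nearly zero.
    plane_pt, plane_n = np.zeros(3), np.array([0.0, 1.0, 0.0])
    print(constraint_force(np.array([0.0, 0.01, 0.0]), plane_pt, plane_n))
    print(constraint_force(np.array([0.0, 0.50, 0.0]), plane_pt, plane_n))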

Non-Photorealistic Rendering. Historically, research in computer graphics has focused primarily on producing images that are indistinguishable from photographs. But graphic designers have long understood that photographs are not always the best way to present visual information. A growing body of research addresses the production of nonphotorealistic imagery, but usually at the expense of long rendering times (e.g., [MEIE96] [WINK96]). Over the next few years we will extend our recent accomplishment of producing non-photorealistic renderings at interactive rates [MARK97]. Our goals are to support a broader range of rendering styles (beyond simple line drawings) and to develop more general methods for maintaining both frame-to-frame coherence and a given level of detail of imagery in screen space. For more details, see:
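
As an illustration of the kind of per-frame computation behind interactive line-drawing styles, the sketch below marks an edge as a silhouette when one adjacent face is front-facing and the other back-facing for the current viewpoint. The toy mesh and viewpoint are assumptions for the example; this is not the method of [MARK97].

    # Sketch of silhouette-edge extraction for a simple line-drawing style.
    import numpy as np
    from collections import defaultdict

    def silhouette_edges(verts, faces, eye):
        facing = []
        for f in faces:
            a, b, c = (np.asarray(verts[i], float) for i in f)
            n = np.cross(b - a, c - a)                       # face normal
            facing.append(np.dot(n, np.asarray(eye) - a) > 0.0)
        edge_faces = defaultdict(list)
        for fi, f in enumerate(faces):
            for i in range(3):
                e = tuple(sorted((f[i], f[(i + 1) % 3])))
                edge_faces[e].append(fi)
        # Silhouette: the two faces sharing the edge face opposite ways.
        return [e for e, fs in edge_faces.items()
                if len(fs) == 2 and facing[fs[0]] != facing[fs[1]]]

    # Toy mesh: a tetrahedron with consistently outward-wound faces.
    verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
    faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
    print(silhouette_edges(verts, faces, eye=(2.0, 2.0, 3.0)))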

Interaction in Immersive Environments. Our long-term goal is to develop complete systems for immersive and semi-immersive VR. These systems will make possible rich interaction dialogues that let users work productively in immersive VR environments without fatigue. For more details see:

(In addition, see Scientific Visualization, Section B.5)

Usability Studies. We will continue to conduct usability studies that evaluate, support, and refine our user interaction techniques. For example, we want to identify how and when haptic interfaces are better than alternatives such as a mouse or six-DOF input device (for a 3D placement task). User studies will also help evaluate and refine the end-to-end tasks in immersive VR. We will conduct user studies on the tradeoffs between conventional and non-photorealistic rendering techniques. In addition, both the Sketch-Alpha_1 integration and telecollaboration projects will be driven by user feedback, because both will be used for real applications during their development.

B.4 Performance

B.4.1 Tracking

For large-area tracking our major goal is to reduce our dependence on beacons by using technologies such as inertial sensors. We also plan to capitalize on the images from video cameras. This new method of tracking will be implemented first with known targets, building on the hybrid tracking method [STAT96a]. The more versatile but complex problem of tracking unknown targets will be addressed by finding and tracking correspondences of edges or corners in the environment [BAJU95B]. Our goal is to be able to track outdoors, or in areas not specially outfitted to accommodate an HMD. Our overarching approach is to use sensor information from whatever methods are available and fuse the information into a cohesive model that predicts position one or two frames ahead.
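
The fuse-and-predict idea can be shown with a toy one-dimensional example: dead-reckon from a high-rate inertial estimate, correct with lower-rate vision fixes, and extrapolate one frame ahead. The gains, rates, and noise levels are made up for the sketch; this is not the hybrid tracker of [STAT96a].

    # Toy 1-D sensor-fusion predictor: inertial dead reckoning plus
    # occasional vision corrections, extrapolated one frame ahead.
    import random

    def track(frames=10, dt=1/60, vision_every=3, gain=0.4):
        true_pos, true_vel = 0.0, 1.0          # metres, metres/second
        est_pos = 0.0
        for frame in range(frames):
            true_pos += true_vel * dt
            inertial_vel = true_vel + random.gauss(0, 0.05)   # noisy rate sensor
            est_pos += inertial_vel * dt                      # dead reckoning
            if frame % vision_every == 0:                     # occasional vision fix
                vision_pos = true_pos + random.gauss(0, 0.002)
                est_pos += gain * (vision_pos - est_pos)      # blend the correction
            predicted = est_pos + inertial_vel * dt           # one frame ahead
            print(f"frame {frame}: estimate {est_pos:.4f}  predicted {predicted:.4f}")

    track()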

B.4.2 Rendering Hardware and Software

Graphics Systems. The newest advanced computer graphics engine at UNC, PixelFlow, supports a complete shading language that lets users create complex, dynamic textures and advanced shading models. PixelFlow is beginning to run this year and we expect it to provide the same synergistic advantages for Center-funded projects as did Pixel-Planes 5. We also expect that the design of the successor to PixelFlow, ImageFlow, will be influenced by Center project needs, as was PixelFlow. ImageFlow, funded by an NSF/DARPA grant, will be based on images as well as polygonal primitives.

Analog VLSI. We will continue to design and construct the world's first prototype analog VLSI modeling and rendering engine, which will display computer graphics images in real time on an analog CRT screen. Starting with simpler schemes such as depth buffering and splining, the chips can ultimately be used in more advanced methods, such as high-quality image-making, solving analog versions of the rendering equation, and obtaining high-quality models from real objects in real time.

Image-Based Rendering. Despite promising results in image-based rendering, there is still considerable uncharted territory in this relatively new area. We plan to improve the current techniques used for real-world image acquisition via cameras. We will develop strategies to reduce and compress the potentially explosive number of images required to represent a complex scene: a scene with many rooms, and thus a high depth complexity, requires numerous views of each and every room.

Time-Critical Computing. We will continue development of techniques for time-critical rendering that degrade visual characteristics of lesser perceptual importance in order to maintain performance. We plan to incorporate these time-critical rendering techniques into a modular framework for time-critical applications, using scheduling algorithms to budget time within real-time constraints. We will also continue work on frameless rendering, on identifying the perceptually important contents of a scene, and on the time-critical presentation of this crucial information.
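
One simple form of such budgeting is sketched below: objects are drawn at the highest level of detail whose estimated cost still fits the remaining frame budget, with more important objects served first. The cost numbers and greedy policy are illustrative assumptions, not the planned framework's actual scheduler.

    # Sketch of a greedy time-budget scheduler for time-critical rendering.
    def choose_levels(objects, frame_budget_ms):
        """objects: list of (name, importance, [cost_ms per LOD, coarse->fine])."""
        remaining = frame_budget_ms
        plan = {}
        for name, _, costs in sorted(objects, key=lambda o: -o[1]):
            # Start from the finest level and degrade until the cost fits;
            # fall back to the coarsest level so something is always drawn.
            level = next((i for i in range(len(costs) - 1, -1, -1)
                          if costs[i] <= remaining), 0)
            plan[name] = level
            remaining -= costs[level]
        return plan

    scene = [("engine block", 0.9, [1.0, 3.0, 8.0]),
             ("background",   0.2, [0.5, 2.0, 6.0]),
             ("fixture",      0.6, [0.8, 2.5, 5.0])]
    print(choose_levels(scene, frame_budget_ms=10.0))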

B.4.3 Image Delivery Techniques

A remote physician needs a functioning, compact, lightweight, easy-to-use, and unencumbering video see-through head-mounted display with eye-aligned video cameras and high-resolution, stereoscopic displays. Future video see-through HMD versions will be so small that they can be built into prescription glasses, much like surgeons' magnifying glasses.

We will continue to build multi-viewer displays: full-color stereo displays that accommodate multiple active users simultaneously. Each user will be individually tracked and will see perspectively correct images.

B.5 Scientific Visualization

Space restrictions preclude a full explanation of plans for all of the Center's 12 ongoing and newly funded projects in scientific visualization. We emphasize new directions here.

Library of User Interface Techniques for NASA. Continuing four years of work with NASA, we are abstracting the desktop and immersive environment interfaces we have developed into a library of interaction techniques. This library, comprising selection, manipulation, and navigation techniques, will be integrated with NASA's Virtual Wind Tunnel system [BRYS91] and could also be integrated with other systems requiring similar 3D interaction solutions.

Remote Microscope Control, with the Collaboratory for Microscopic Digital Anatomy (CMDA). The Center has been participating in the development of a collaborative research environment, the Collaboratory for Microscopic Digital Anatomy (CMDA), designed to provide remote access to the sophisticated instrumentation located at the National Center for Microscopy and Imaging Research (NCMIR). This project has been extended by the NSF for full five-year National Challenge Project support. We are expanding our research efforts to provide significantly improved EM tomography visualization.

Improvement of Tissue Classification Methods, with the Caltech Biological Imaging Center and Cedars Sinai Medical Center. Building on several years of previous Center work, we will develop a wider variety of tissue classification methods and goal-based methods. We are also beginning a collaborative effort with Cedars Sinai Medical Center to apply the classification techniques to clinical medicine in order to identify particular geometric structures for non-invasive diagnostic and exploratory purposes.

Interactive Steering of Biological Image Acquisition, with the Human Brain Project. With support from the Human Brain Project, we are developing an interactive environment for goal-directed steering of magnetic resonance microscopy data. Goals include minimizing acquisition time and collecting images with material signature targets that allow effective tissue classification. The interactivity will let imaging users apply the technique to new situations and validate the algorithm, suggest further directions for its development, and acquire more effective imaging data in shorter times.

An Improved Interface for Scanning Probe Microscopes, with the Nanomanipulator (nM) Project. The Nanomanipulator (nM) project is an NSF-funded collaboration between the departments of Computer Science and Physics at the University of North Carolina at Chapel Hill that uses the Center-related PixelFlow supercomputer. This project will study interface issues with the goal of making the nM a more powerful and accessible tool for the scientific community.

Wavelet Methods for ECG/EEG Visualization and Computational Modeling. Second-generation wavelet methods are of practical interest for quantifying contour maps on irregular surfaces such as topographical maps and body surface potential maps. While simple statistically based methods such as RMS error and correlation coefficients allow a certain level of comparison between contour maps, we seek to use wavelet methods to quantify both the scalar magnitude and the geometric changes that occur between maps. We will also investigate the use of multiresolution wavelet methods to perform accurate visualization computations and compare them to existing methods, and the use of the analysis features to guide the user to interesting features.
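
As a simplified illustration of why a wavelet decomposition can separate what a single RMS number cannot, the sketch below compares two maps band by band on a regular grid, reporting an overall-magnitude change and a shape (geometry) change separately. A first-generation Haar transform on a regular grid stands in for the planned second-generation wavelets on irregular surfaces, and the two summary measures are assumptions for this example only.

    # Sketch: separating magnitude change from geometric change between maps.
    import numpy as np

    def haar2(a):
        """One level of a 2-D Haar transform: returns (coarse, (lh, hl, hh))."""
        a = a.astype(float)
        ll = (a[0::2, 0::2] + a[0::2, 1::2] + a[1::2, 0::2] + a[1::2, 1::2]) / 4
        lh = (a[0::2, 0::2] - a[0::2, 1::2] + a[1::2, 0::2] - a[1::2, 1::2]) / 4
        hl = (a[0::2, 0::2] + a[0::2, 1::2] - a[1::2, 0::2] - a[1::2, 1::2]) / 4
        hh = (a[0::2, 0::2] - a[0::2, 1::2] - a[1::2, 0::2] + a[1::2, 1::2]) / 4
        return ll, (lh, hl, hh)

    def compare_maps(map_a, map_b):
        ca, da = haar2(map_a)
        cb, db = haar2(map_b)
        coeffs_a = np.concatenate([ca.ravel()] + [d.ravel() for d in da])
        coeffs_b = np.concatenate([cb.ravel()] + [d.ravel() for d in db])
        # Magnitude change: difference in overall coefficient energy.
        magnitude = abs(np.linalg.norm(coeffs_a) - np.linalg.norm(coeffs_b))
        # Geometry change: distance between energy-normalized coefficient patterns.
        geometry = np.linalg.norm(coeffs_a / np.linalg.norm(coeffs_a)
                                  - coeffs_b / np.linalg.norm(coeffs_b))
        return magnitude, geometry

    x, y = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
    base = np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)
    print(compare_maps(base, 1.2 * base))                 # magnitude change only
    print(compare_maps(base, np.roll(base, 16, axis=0)))  # shape change only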

Haptic Rendering and User Interaction with the NSF, ARO, and NASA. To augment research proposed by Utah researchers to NSF and the ARO, we are collaborating with researchers at NASA Ames to develop graphical and haptic rendering methods for local visualization that can operate simultaneously with a global visualization method. A haptics-derived position locates a point in the simulation data much as does a multi-dimensional computer mouse, but unlike a mouse, which is purely a positioning device, the haptic interface exerts forces or torques on the user that are mapped to provide quantitative information about the simulation data. This collaboration will draw upon the Center's expertise in local and global visualization techniques, as well as on its significant experience in three-dimensional widgets.
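
One way to picture the data-to-force mapping is sketched below: the haptic probe's position samples a scalar field, and the local gradient is scaled and clamped into a force so the user feels how the quantity varies around the probe. The stand-in field, scaling, and clamping are assumptions for this sketch, not the mapping planned with NASA Ames.

    # Sketch of mapping a local data gradient to a haptic force.
    import numpy as np

    def field(p):
        """A stand-in scalar field (e.g., pressure) defined over the volume."""
        x, y, z = p
        return np.exp(-(x**2 + y**2 + z**2))

    def probe_force(p, scale=5.0, max_force=4.0, h=1e-4):
        """Central-difference gradient of the field, mapped to a clamped force."""
        g = np.array([(field(p + h * e) - field(p - h * e)) / (2 * h)
                      for e in np.eye(3)])
        f = scale * g
        norm = np.linalg.norm(f)
        return f if norm <= max_force else f * (max_force / norm)

    print(probe_force(np.array([0.3, 0.0, 0.0])))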

B.6 Telecollaboration

We shall vigorously pursue technologies that improve the sense of presence while simultaneously working to improve interaction and to better support collaborative design and visualization. Our efforts will focus on new 3D interaction techniques and 3D scene acquisition.

B.6.1 Interaction Techniques and Tools

We plan to extend the existing capabilities of our interaction techniques and tools in several ways. We are seeking new interaction paradigms that are better suited to collaborative mechanical design. In the Clear Board project [ISHI94], two users manipulated a shared 2D workspace directly on a display screen the size and shape of a large drafting table. We envision extending Clear Board into 3D so that a 3D virtual mechanical part appears in front of a 3D reconstruction of a remote collaborator. Each participant would see the others in the context of the shared environment, as if they were together, while also being able to manipulate the mechanical part being designed with both hands.

Our goal is to provide a continuum of CAD tool interaction techniques ranging from simple gestural sketching of rough ideas, to two-handed interaction with intermediate models, to detailed final model specification. For example, we plan to develop various high-resolution shared annotation tools. In particular, we want to explore gestural techniques such as using fingers as arrows to point to parts of a model, or waving a hand in a circle to highlight a region of the model.

In addition, multiple users should be able to simultaneously use various interaction techniques on this continuum. We want to have ``cooperative interaction'', which is interaction dependent on multiple participants simultaneously acting together. This contrasts with most existing multi-user systems where many users can interact at the same time, but the effects of their actions are independent of each other. Users should be able to work together on closely related modifications in a natural way without interfering with the manipulations of other users. In other words, the interaction of multiple users should be as quick and natural as if they were in the same room.
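
One small example of interaction that depends on multiple participants acting together: each user grabs one handle of a shared part, and the part's translation, rotation, and scale are solved jointly from both hands, something neither user could specify alone. The 2-D similarity-transform formulation below is an illustrative simplification, not a feature of the current system.

    # Sketch of a two-user, two-handle cooperative manipulation.
    import cmath

    def two_handed_transform(grab_a, grab_b, hand_a, hand_b):
        """Return (scale, rotation_radians, translation) mapping the two grabbed
        points onto the two current hand positions (points as complex numbers)."""
        src = grab_b - grab_a
        dst = hand_b - hand_a
        z = dst / src                      # combined scale and rotation
        scale, angle = abs(z), cmath.phase(z)
        translation = hand_a - z * grab_a
        return scale, angle, translation

    # User A holds one corner still while user B pulls the other corner outward
    # and around; the shared part stretches and turns accordingly.
    s, a, t = two_handed_transform(0 + 0j, 1 + 0j, 0 + 0j, 1 + 1j)
    print(f"scale {s:.2f}, rotation {a:.2f} rad, translation {t}")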

B.6.2 Scene Acquisition

One technology driven directly by telecollaboration is specialized display systems. In particular, we intend to explore tiling together multiple digital light projectors (DLPs) to form a very large, high-resolution display, to be used, for example, to show images from our high-resolution, wide-field-of-view camera cluster. We are also interested in exploiting the speed and control of DLP technology to support more than two individualized stereo views via time-division image multiplexing. A major plan is to drive the DLP mirrors directly in order to achieve the dramatically greater frame rates this requires, allocated among multiple viewers. The same techniques will also enhance Imperceptible Structured Light and other techniques.
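
The frame-rate arithmetic behind time-division multiplexing is simple to sketch: the projector's frame slots are divided among viewers, two slots (left and right eye) per viewer, so each viewer's effective stereo refresh rate is the projector rate divided by twice the number of viewers. The projector rate below is hypothetical.

    # Back-of-the-envelope sketch of time-division image multiplexing.
    def per_viewer_refresh(projector_hz, viewers):
        slots_per_stereo_pair = 2 * viewers
        return projector_hz / slots_per_stereo_pair

    for viewers in (1, 2, 4):
        print(viewers, "viewers ->", per_viewer_refresh(1440, viewers), "Hz per eye pair")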

While we are working to merge 3D reconstructions of remote users and objects into our existing shared virtual environment, ultimately we would like to reverse that relationship and instead work in a primarily reconstructed environment augmented with some virtual objects. On one front, our collaborators at the University of Pennsylvania are pushing ahead with parallelized versions of their passive (image-based) depth-extraction algorithms. We hope that such real-time depth extraction may eventually make possible high-quality, real-time, image-based 3D acquisition and display of large scenes. We will also continue to explore a novel approach to the problem that involves effectively inverting the process normally used to render realistic scenes from known geometry, surface, and lighting properties. This effort, which we call ``inverse rendering'', is described in Rendering, Section A.2.

Finally, we plan to use Cornell's physically-based renderer (see Rendering, Section A.2, highlight box) in two ways for 3D scene acquisition research. First, it will be the basis for a controlled experimental testbed for developing new algorithms for scene extraction from multiple photographs. Second, we hope to use it as the inner loop of a predictor/corrector scene extraction system in which renderings of successive approximations of a scene description move progressively closer to the acquired photographs (see model-based recognition, Rendering, Section A.2.2). Crucial in both these uses is the ability to vary any of many parameters in a scene description (geometry, lighting, and material properties) with confidence that the resulting renderings will be a near-perfect approximation to photographs of the scene. No other research group of which we are aware possesses all the related resources and knowledge (in photometric measurement, calibration, rendering, image analysis, etc.) to move forward decisively with such a comprehensive approach.
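
The predictor/corrector loop can be sketched in miniature: render the current scene estimate, compare it with the acquired photograph, and nudge the scene parameters to reduce the difference until the synthetic image matches the photograph. The toy ``renderer'' below (a single light intensity shading a Lambertian patch grid) merely stands in for Cornell's physically-based renderer; the parameterization and update rule are assumptions for this sketch.

    # Sketch of an analysis-by-synthesis (predictor/corrector) loop.
    import numpy as np

    rng = np.random.default_rng(1)
    normals_dot_light = rng.uniform(0.2, 1.0, size=(16, 16))   # fixed geometry term

    def render(light_intensity):
        """Toy Lambertian 'renderer': image = intensity * (N . L) per pixel."""
        return light_intensity * normals_dot_light

    photograph = render(3.0)        # the "acquired" photograph we try to match

    def correct(estimate, iterations=100, step=0.5):
        """Re-render, compare with the photograph, and nudge the parameter."""
        for _ in range(iterations):
            residual = render(estimate) - photograph
            estimate -= step * float(np.mean(residual * normals_dot_light))
        return estimate

    print("recovered light intensity:", round(correct(1.0), 3))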

B.6.3 Mechanical Parts

The experience and excitement generated by our successes to date only whet our appetite for building further electro-mechanical devices. Among the many items we would like to build are:

B.6.4 Remote Scientific Visualization

An exciting possible use for our system is remote shared scientific collaboration. For example, the Nanomanipulator project is working on providing a graphical interface to atomic-scale scanning probe microscopes. A planned experiment with Nanomanipulator researchers uses VRAPP to visualize live imagery from the microscope (in principle not much different from our standard live video feeds of participants' faces). Researchers across the country could then examine ``live'' data together in the shared virtual environment. The microscope could then be controlled from within VRAPP at a remote location, providing remote steering of scientific experiments such as manipulating buckytubes or other nanostructures.