Supporting Software Engineering with Open Hypermedia
Department of Computer Science Web: http://www.cs.colorado.edu/
ECOT 717, Campus Box 430, Boulder, CO 80309-0430, USA
Software engineering is a discipline addressing the large-scale development of software systems. These development projects are hindered by the complexity intrinsic to software, as described by Fred Brooks in "No Silver Bullet" [Brooks 1987]. One aspect of this complexity is the heterogeneous nature of the artifacts produced by software development. For instance, a typical large-scale project will produce feasibility studies, business plans, requirements documents, architecture specifications, module interface documentation, and test plans (not to mention source code). Software engineers are tasked with the formidable goal of managing this collection such that information is accessible, arranged cohesively, and organized consistently.
Unfortunately, three factors combine to make this task nearly impossible. First, a team will typically consist of multiple people (for large-scale projects this number can be in the hundreds and even thousands [Brooks 1995]) spanning multiple organizations. Each organization is likely to have a different computing environment with different procedures and tools for managing documents. Thus requirements documents might be created using Microsoft Word in one environment, FrameMaker in another, and LaTeX in a third. This heterogeneity of applications hinders document interchange and information accessibility. It also increases the work of organizations by forcing teams to address issues related to information interchange. Note that historical and budgetary issues will hinder the adoption of standard environments and practices across organizations.
Second, each artifact will be related both explicitly and implicitly to many other artifacts. Explicit relationships are typically captured in the form of manually maintained cross-references and indexes. The lack of automation for these relationships allows errors to creep in gradually, and these errors eventually hinder engineers conducting development-related tasks. The larger problem, however, is that explicit relationships are typically far outnumbered by implicit relationships. Implicit relationships can take many forms: missing cross-reference or index entries; semantic dependencies, such as the relationship between all artifacts that document a subsystem; time dependencies, such as when a particular set of artifacts is used to document a particular configuration of a software system; and ad hoc dependencies, which occur when a developer discovers an unexpected relationship between two or more artifacts that aids the completion of a particular task. Ad hoc relationships are typically ephemeral, being useful for only a short time, but are nonetheless critical to the tasks they support.
One problem with implicit relationships is that engineers must employ ad hoc procedures, including memorization, to manage them. These ad hoc procedures increase the conceptual burden on each engineer and make it difficult for new engineers to discover important information. Another problem is associated with keeping artifacts consistent. If one document depends upon another via an implicit relationship, then an engineer making a change in one will not necessarily modify the other since he or she may be unaware of the dependency.
Finally, a third factor hindering the management of artifacts in a development project is the large-scale nature of industrial software projects. A single subsystem in large-scale software systems can literally have hundreds of artifacts consisting of tens of thousands of pages with hundreds of thousands of explicit and implicit relationships [Anderson 1999a]. The amount of information that requires management is staggering and quickly overwhelms the engineers tasked with maintaining it.
In order to address this relationship management problem, a solution is needed that can handle the application and document heterogeneity of software development organizations, can provide principles and tools to capture implicit relationships (making them explicit) and can automate the management of these relationships in order to address the scalability demands of large-scale projects. Open hypermedia [Østerbye 1996] is one such approach and open hypermedia systems have been used to address the complexity and heterogeneity of large-scale software development.
2 Open Hypermedia Approach
Open hypermedia possesses two key characteristics: externally managed relationships and third-party application integration. The latter is important since it addresses the first concern raised above. By providing techniques to integrate hypermedia into existing applications [Davis 1994], [Whitehead 1997], a development organization is not forced to give up familiar applications or business processes when adopting open hypermedia. The former characteristic is important for two reasons. First, legacy documents can have hypermedia anchors and links attached to them, without modification, since all anchor and link information is stored externally. Second, external relationships can be grouped into sets which can be made active independently of each other. This allows, for instance, one document to have multiple hypermedia configurations that support distinct tasks. If a user switches tasks, then the same document is displayed but with a different set of anchors and links that support the new task. There are many open hypermedia systems in existence (see [Østerbye 1996] for a good overview); one in particular, Chimera [Anderson 1997], [Anderson 1994], has focused exclusively on supporting software engineering. However, before we discuss Chimera, we describe other hypermedia techniques that have been applied to software engineering.
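The idea of externally managed relationships grouped into independently activated sets can be illustrated with a minimal sketch. The class and method names below (Anchor, Link, LinkStore, anchors_for) are hypothetical and chosen only for illustration; they are not the interface of Chimera or of any other open hypermedia system. The sketch assumes anchors are identified by character offsets into an unmodified document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Anchor:
    """A region of interest in a document, stored outside the document itself."""
    document: str  # identifier of the unmodified artifact
    start: int     # character offset where the region begins
    end: int       # character offset where the region ends (exclusive)


@dataclass(frozen=True)
class Link:
    """A named relationship among two or more anchors."""
    name: str
    anchors: tuple


@dataclass
class LinkStore:
    """Holds named sets of links; only links in active sets are visible."""
    link_sets: dict = field(default_factory=dict)
    active: set = field(default_factory=set)

    def add(self, set_name, link):
        self.link_sets.setdefault(set_name, []).append(link)

    def activate(self, set_name):
        self.active.add(set_name)

    def deactivate(self, set_name):
        self.active.discard(set_name)

    def anchors_for(self, document):
        """Return the anchors shown in `document` under the active sets."""
        found = []
        for set_name in sorted(self.active):
            for link in self.link_sets.get(set_name, []):
                found.extend(a for a in link.anchors if a.document == document)
        return found
```

Under these assumptions, switching tasks amounts to deactivating one set and activating another: the document is never touched, and anchors_for simply returns a different set of anchors for it.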
2.1 Applying Hypermedia to Software Engineering
Hypermedia was first used to support the construction of software systems in Neptune [Delisle 1986]. Neptune was built on top of the Hypertext Abstract Machine (HAM) [Campbell 1987] and featured integrated support for hypermedia within custom-built CAD applications. A software engineer using Neptune could, for instance, create a link between a requirements document and a design diagram, as long as both artifacts were managed by the CAD applications built on top of the HAM. This approach has seen further elaboration within CASE tools, where research has been performed on integrating hypermedia functionality [Bieber 2000b] into all of the tools of an integrated development environment (IDE) [Oinas-Kukkonen 1997], [Oinas-Kukkonen 2000]. In addition, Østerbye has examined how hypermedia services can influence the act of programming itself [Østerbye 1995]. Finally, Amsellem has examined issues of hypermedia-enabled programming environments within the context of Smalltalk-80 [Amsellem 1995]. These efforts examine the range of hypermedia services required within an IDE and how those services can be put to good use creating cohesive software designs. The problem with this approach from the standpoint of open hypermedia is that each application must store both data and hypermedia information in a centralized repository; this constraint limits the technique's scalability. Thus, in order to add a new hypermedia-aware tool to the environment, it must be built from the ground up to utilize the hypermedia engine for both data storage and hypermedia services. This restriction is in contrast to the open hypermedia approach where the open hypermedia system focuses solely on providing hypermedia services to integrated applications and does not restrict where an application stores its persistent data.
Another interesting approach to using hypermedia to support software engineering is demonstrated by the Kiosk system [Creech 1991]. Kiosk was produced to aid software engineers in selecting reusable software components. Kiosk came with a set of classification tools that grouped software "work products" into linked hierarchies that could then be browsed by Kiosk's browsing tools. The purpose of Kiosk was to help engineers with two technical problems faced by software development projects: (1) how does an engineer discover and select useful components in a large component library and (2) how do engineers organize their components to aid in the selection process. Thus, Kiosk was using hypermedia in an attempt to raise the level of abstraction at which software engineers perform their software development tasks. Similar techniques have been applied towards the documentation of object-oriented frameworks [Meusel 1997], [Odenthal 1997].
We now examine work that employs the open hypermedia approach directly.
3 The Chimera Open Hypermedia System
Chimera's support for software engineering flows from its support for heterogeneity. In particular, Chimera defines a flexible set of hypermedia concepts well suited for software engineering tasks. Of particular importance is Chimera's first-class support for views. Project artifacts can often be viewed or manipulated by many tools. Chimera allows each of these views to be explicitly represented and to have a set of anchors and links tailored to the view's visual appearance and interaction style. Chimera also provides application programming interfaces in a variety of programming languages (Ada, C, C++, and Java). This increases the chance that a developer has direct access to Chimera's services in the programming language used to develop his or her application. Finally, Chimera is implemented using the Java programming language and is configured using an external preference file; both characteristics facilitate the use of Chimera on multiple computing platforms.
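To make the notion of first-class views concrete, the following sketch models a view as the pairing of an artifact with the tool that displays it, so that each view carries its own anchors. This modelling choice, and the names View and ViewRegistry, are assumptions made for illustration; the code does not reproduce any of Chimera's actual programming interfaces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class View:
    """One tool's presentation of one artifact (hypothetical model)."""
    artifact: str
    viewer: str


@dataclass
class ViewRegistry:
    """Keeps a separate list of anchor identifiers for every view."""
    anchors: dict = field(default_factory=dict)

    def add_anchor(self, view, anchor_id):
        self.anchors.setdefault(view, []).append(anchor_id)

    def anchors_in(self, view):
        """Anchors tailored to this particular view, in insertion order."""
        return list(self.anchors.get(view, []))

    def views_of(self, artifact):
        """All registered views of one artifact, ordered by viewer name."""
        views = (v for v in self.anchors if v.artifact == artifact)
        return sorted(views, key=lambda v: v.viewer)
```

For example, a source file shown in a text editor might anchor a range of lines, while the same file shown in a diagram browser might anchor a class box; each view keeps the anchors that suit its appearance and interaction style.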
3.1 Integration with the Web
With the advent of the World Wide Web (WWW), software engineering projects have turned to storing some of their artifacts on-line. Chimera has been integrated with the Web in order to support the linking of these artifacts with the remaining artifacts that cannot be easily moved onto the Web [Anderson 1997]. In fact, one Chimera user reported that several attempts to move all of a software project's artifacts onto the Web failed due to the difficulties of converting legacy document formats into HTML in a cost-effective manner. Two problems were encountered. First, there were simply too many types of documents to be converted. Second, the layouts of very complex tables could not be reproduced in HTML with acceptable fidelity. The user preferred using Chimera's integration with the Web to link his legacy artifacts with his HTML-based artifacts, rather than attempt the conversion process to a complete Web-based solution. Additional work on integrating open hypermedia systems with the WWW is reported in [Carr 1995] and [Grønbæk 1997].
One recent technique to address scalability is reported in [Anderson 1999]. Chimera has the ability to import and export hypermedia information in XML [Bray 1998]. One industrial user leveraged this feature by creating parsers for several key artifact types. These parsers scan project artifacts for implicit anchor and link instances and then produce XML files containing information about the discovered anchors and links. Using this technique, over 500,000 anchors and links were created after scanning more than 30,000 pages of project artifacts. An interesting side-effect of the XML import/export files is that they facilitate the configuration management of hypermedia information.
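The parser-based technique described above can be sketched as follows. The sketch assumes a hypothetical convention in which artifacts mention requirements by identifiers of the form REQ-<number>; every identifier mentioned in two or more places becomes a link whose anchors are the mentions. The XML element and attribute names (hyperweb, link, anchor) are invented for this example and are not Chimera's actual import format.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical naming convention for requirement identifiers.
REQ_ID = re.compile(r"\bREQ-\d+\b")


def scan_artifacts(artifacts):
    """Map each requirement ID to the (artifact, start, end) spans mentioning it."""
    mentions = {}
    for name, text in artifacts.items():
        for match in REQ_ID.finditer(text):
            span = (name, match.start(), match.end())
            mentions.setdefault(match.group(), []).append(span)
    return mentions


def to_xml(mentions):
    """Emit one <link> per ID mentioned at least twice, one <anchor> per mention."""
    root = ET.Element("hyperweb")
    for req_id in sorted(mentions):
        spans = mentions[req_id]
        if len(spans) < 2:
            continue  # a single mention does not relate two places
        link = ET.SubElement(root, "link", name=req_id)
        for artifact, start, end in spans:
            ET.SubElement(link, "anchor", artifact=artifact,
                          start=str(start), end=str(end))
    return ET.tostring(root, encoding="unicode")
```

Because the result is plain text, a file produced this way could be regenerated whenever the artifacts change and stored alongside them under version control, which is one way the import/export files might ease configuration management.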
This paper has identified several issues encountered by software engineers in managing the relationships that exist between their software artifacts and has described why the open hypermedia approach is well-suited to address this problem domain.
References
[Amsellem 1995] Maurice Amsellem. "ChyPro: A Hypermedia Programming Environment for Smalltalk-80" in Proceedings of the ECOOP'95, Aarhus, Denmark, August 1995.
[Anderson 1994] Kenneth M. Anderson, Richard N. Taylor, and E. James Whitehead. "Chimera: Hypertext for Heterogeneous Software Environments" in Proceedings of the ACM European Conference on Hypermedia Technology (ECHT '94), Edinburgh, Scotland, 94-106, [Online: http://acm.org/pubs/citations/proceedings/hypertext/192757/p94-anderson/], September 1994.
[Anderson 1997] Kenneth M. Anderson. "Integrating Open Hypermedia Systems with the World Wide Web" in Proceedings of ACM Hypertext '97, Southampton UK, 157-167, [Online: http://acm.org/pubs/citations/proceedings/hypertext/267437/p157-anderson/], April 1997.
[Anderson 1999] Kenneth M. Anderson. "Data Scalability in Open Hypermedia Systems" in Proceedings of ACM Hypertext '99, Darmstadt, Germany, 27-36, February 1999.
[Anderson 1999a] Kenneth M. Anderson. "Supporting Industrial Hyperwebs: Lessons in Scalability" in Proceedings of the 21st International Conference on Software Engineering, Los Angeles, CA, 573-582, May 1999.
[Bieber 2000b] Michael Bieber, Harri Oinas-Kukkonen, and V. Balasubramanian. "Hypertext Functionality" in ACM Computing Surveys Symposium on Hypertext and Hypermedia, 2000.
[Bray 1998] Tim Bray, Jean Paoli, and C. Michael Sperberg-McQueen (editors). Extensible Markup Language (XML) 1.0, Cambridge, Massachusetts: World Wide Web Consortium, [Online: http://www.w3.org/TR/REC-xml], February 1998.
[Brooks 1987] Frederick P. Brooks, Jr. "No Silver Bullet: Essence and Accidents of Software Engineering" in IEEE Computer, 20(4), 10-19, 1987.
[Brooks 1995] Frederick P. Brooks, Jr. The Mythical Man-Month, 20th Anniversary Edition, Addison-Wesley, 1995.
[Campbell 1987] Bruce Campbell and Joseph M. Goodman. "HAM: A General-Purpose Hypertext Abstract Machine" in Proceedings of ACM Hypertext '87, Chapel Hill, NC, 21-32, November 1987.
[Carr 1995] Leslie A. Carr, David C. DeRoure, Wendy Hall, and Gary J. Hill. "The Distributed Link Service: A Tool for Publishers, Authors, and Readers" in Proceedings of the Fourth International World Wide Web Conference, Boston, MA, 647-656, [Online: http://www.staff.ecs.soton.ac.uk/lac/dls/link_service.html], December 1995.
[Creech 1991] Michael L. Creech, Dennis F. Freeze, and Martin L. Griss. "Using Hypertext in Selecting Reusable Software Components" in Proceedings of ACM Hypertext '91, San Antonio, TX, 25-38, December 1991.
[Davis 1994] Hugh C. Davis, Simon Knight, and Wendy Hall. "Light Hypermedia Link Services: A Study of Third-Party Application Integration" in Proceedings of ACM European Conference on Hypermedia Technologies (ECHT)'94, Edinburgh, Scotland, 41-50, September 1994.
[Delisle 1986] Norman M. Delisle and Mayer D. Schwartz. "Neptune: A Hypertext System for CAD Applications" in Proceedings of ACM SIGMOD'86, Washington DC, 132-142, May 1986.
[Grønbæk 1997] Kaj Grønbæk, Niels Olof Bouvin, and Lennert Sloth. "Designing Dexter-Based Hypermedia Services for the World Wide Web" in Proceedings of ACM Hypertext '97, Southampton, UK, 146-156, [Online: http://acm.org/pubs/citations/proceedings/hypertext/267437/p146-gronbaek/], 1997.
[Meusel 1997] Matthias Meusel, Krzysztof Czarnecki, and Wolfgang Kopf. "A Model for Structuring User Documentation of Object-Oriented Frameworks using Patterns and Hypertext" in Proceedings of the ECOOP'97, Jyvaskyla, Finland, July 1997.
[Odenthal 1997] Georg Odenthal and Klaus Quibeldey-Cirkel. "Using Patterns for Design and Documentation" in Proceedings of the ECOOP'97, Jyvaskyla, Finland, July 1997.
[Oinas-Kukkonen 1997] Harri Oinas-Kukkonen. "Towards greater flexibility in software design systems through hypermedia functionality" in Information and Software Technology, 39(6), 391-397, June 1997.
[Oinas-Kukkonen 2000] Harri Oinas-Kukkonen. "Flexible CASE and Hypertext" in ACM Computing Surveys Symposium on Hypertext and Hypermedia, 2000.
[Østerbye 1995] Kasper Østerbye. "Literate Smalltalk Programming Using Hypertext" in IEEE Transactions on Software Engineering, 21(2), 138-145, 1995.
[Østerbye 1996] Kasper Østerbye and Uffe K. Wiil. "The Flag Taxonomy of Open Hypermedia Systems" in Proceedings of ACM Hypertext '96, Washington DC, 129-139, [Online: http://acm.org/pubs/citations/proceedings/hypertext/234828/p129-osterbye/], March 1996.
[Whitehead 1997] E. James Whitehead, Jr. "An Architectural Model for Application Integration in Open Hypermedia Environments" in Proceedings of ACM Hypertext '97, Southampton, UK, 1-12, April 1997.