Modifications

Added component testing, modified the requirements and non-functional requirements, clarified the component descriptions, and updated the schedule.

1. Descriptions/Features

Thread Debugger is a tool used for debugging multithreaded code in an intuitive graphical user interface. It has common debugger features such as stepping through code, setting breakpoints, and examining variable values and state of the program using backtraces.

The multithreaded component is a key portion of the project. The debugger must be able to display and debug several threads simultaneously, with all the features of a single-threaded debugger.

The debugger will utilize the existing debugging capabilities of GDB. The main difficulty of this project will be providing an effective, easy-to-use front-end.

For the first time in CS 190, Thread Debugger will be implemented in OCaml. OCaml's features and language design make it the best choice. This decision was made based on a number of factors:

The primary user interface feature will be a debugger pane. There can be several debugger panes visible at once, one for each running thread, tiled horizontally down to a minimum width for a single pane.

The primary feature of a pane is the code display, which shows the source file for the running code. If the code is paused, the line at which execution is halted is highlighted. The user may left-click in the margin to set a breakpoint in the code, and a breakpoint icon appears wherever there is a breakpoint.

Panes also have a toolbar of controls at the bottom. Each thread has Break, Continue, Step, Next, and Finish buttons, which function similarly to those of GDB. The pane may also have several other status icons indicating the thread's status, such as running, blocked on a synchronization device, or in a system call. These status icons are modular and can be easily added and/or customized by advanced users.

When a thread stops, a mini-pane extends from the bottom of the code window displaying relevant data about the thread, including local variables and a backtrace. This mini-pane disappears when the user interacts with the pane, but it can be redisplayed through a button on the pane's toolbar.

The system will have features for easily displaying variable values and interacting with GDB directly. As the user moves the cursor over symbols in the code, the debugger highlights features of interest related to that symbol. For example, when the user moves the cursor over a variable, the debugger highlights other instances of that variable in the code; when the user moves the cursor over a delimiter, the debugger highlights the matching delimiter. When the user left-clicks a variable, a mini-pane appears, displaying the variable's value. The user can left-click a toolbar button to enter a command mode where they can enter a command to GDB and get the result.

There will be many keyboard shortcuts available for advanced users. The UI maintains the concept of an active thread, to which keyboard input is directed. Users can take advantage of shortcuts to activate almost all UI capabilities (for instance, pressing n performs the next action on the current thread).

2. System Model

Design diagram

3. System Model Description of Major Components

GUI: handles initialization of program and placement of all UI elements (when program starts and upon thread creation/destruction)

FileLoader: manages all open source files

GUI Elements: manage display and interaction with user.

  • GUI -> GUI elements

  • Debugger: is the abstract interface for a debugger backend. It has methods for program control, breakpoints, monitoring of values, and implements many callbacks to let interested parties know what is going on

    GDB Backend: the Debugger is implemented with a GDB backend. It opens a separate GDB process and talks to it using the GDB/MI interface.

    GDBParser: is a module that understands GDB/MI and translates it into the debugger's structures.

    POpen: just knows how to open a GDB process and attach its output file descriptors to the GTK event loop, then provides callbacks (synchronous in the GTK event loop) when data is ready

  • GUI elements -> GTK
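
The abstract Debugger interface in the list above (program control, breakpoints, and callbacks for interested parties) could be captured as an OCaml module signature. What follows is only a minimal sketch: every name in it (DEBUGGER, thread_id, event, on_event, Dummy, and so on) is hypothetical and not taken from the design, and the toy Dummy module exists only to show that the signature can be satisfied.

```ocaml
(* Hypothetical sketch of the abstract Debugger interface. All names are
   illustrative assumptions, not the project's actual API. *)
type thread_id = int

type event =
  | Stopped of thread_id * string * int   (* thread, source file, line *)
  | Running of thread_id
  | Exited of thread_id

module type DEBUGGER = sig
  type t
  val create : unit -> t
  (* Program control *)
  val continue : t -> thread_id -> unit
  val step : t -> thread_id -> unit
  (* Breakpoints *)
  val set_breakpoint : t -> file:string -> line:int -> unit
  val breakpoints : t -> (string * int) list
  (* Callbacks: interested parties subscribe to events *)
  val on_event : t -> (event -> unit) -> unit
end

(* A toy in-memory implementation, only to show the signature is usable.
   It talks to no real backend and reports a made-up stop location. *)
module Dummy : DEBUGGER = struct
  type t = {
    mutable bps : (string * int) list;
    mutable listeners : (event -> unit) list;
  }
  let create () = { bps = []; listeners = [] }
  let notify t ev = List.iter (fun f -> f ev) t.listeners
  let continue t tid = notify t (Running tid)
  let step t tid = notify t (Stopped (tid, "main.c", 1))
  let set_breakpoint t ~file ~line = t.bps <- (file, line) :: t.bps
  let breakpoints t = List.rev t.bps
  let on_event t f = t.listeners <- f :: t.listeners
end
```

Keeping the type t abstract in the signature is one way to let the GUI code be written against DEBUGGER while the GDB backend is swapped for a dummy during testing.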

4. GUI Design

    GUI

    5. GUI Description

    Program Control Buttons

    The program control buttons manage the debugging of the entire application. They allow you to load a new binary into the debugger, reload the current binary, and start and stop execution of all threads. Eventually, these buttons will have icons to make them more recognizable.

    Thread Columns

    Each running thread has one corresponding column in the GUI. Each column is responsible for displaying important data about its thread and allows the user to control the debugging process in that thread. Only one thread column is active at a given time. The active thread column is the one that has keyboard focus.

    Eventually, advanced features can be implemented to deal with hiding these thread columns, but at the moment they will be held in a generic container which just renders them side by side. To implement more advanced scenarios, the container would have to be replaced with a more sophisticated and intuitive one.

    When the entire application is running, most features of the thread columns will be disabled. The status will continue to be displayed, but everything else will be hidden and will not be updated. To view details about a thread, the user must stop the program and step through it.

    Thread Status:

    Displays the current status of the thread. Possible status categories include Running, Stopped, Blocked, and Terminated. Eventually, this text will be replaced with a symbol that best represents the thread's status.

    Thread Control Buttons:

    These buttons control the process of stepping through their corresponding thread. Like the thread status displays, the buttons will eventually contain icons for clarity.

    Variables Pane:

    The variables pane displays a number of recently examined variables. When the user left-clicks on a variable name in the code pane, that variable will be added to the variables pane and its value will be displayed and updated as the user continues to debug. If the number of variables examined exceeds the number that can fit, the least recently examined variable will be removed from the display.
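
The eviction rule for the variables pane can be illustrated with a small list-based sketch. The function name examine and the capacity parameter are assumptions made for illustration; the design does not say how the pane will be implemented.

```ocaml
(* Sketch of the recently-examined variable list. [capacity] and the
   function name are assumptions made for illustration. *)

(* [examine ~capacity name vars] moves [name] to the front of [vars]
   (most recent first) and drops the least recently examined entries
   once more than [capacity] names are held. *)
let examine ~capacity name vars =
  let without = List.filter (fun v -> v <> name) vars in
  let rec take n = function
    | [] -> []
    | _ when n <= 0 -> []
    | x :: rest -> x :: take (n - 1) rest
  in
  take capacity (name :: without)
```

Re-examining a variable that is already shown moves it back to the front rather than duplicating it, so it is the last candidate for removal.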

    Backtrace Pane:

    This displays the current backtrace of the thread. The user can left-click on any of these lines to bring up the code in its immediate vicinity in the code pane.

    Code Pane:

    This displays the line of code where the thread is currently executing, as well as the immediately preceding and following lines. The user can set breakpoints in the code pane or left-click on a variable to examine its value. If the user desires a larger window to view the code, he or she can left-click the "Open in Code Browser" button to open that file using the Code Browser.

    Terminal Pane:

    At the bottom of each thread column is a small terminal for communicating with GDB. This is so that the user can type commands directly to GDB and access features that are not well represented in the GUI. Exactly what commands will be allowed in the terminal pane and what feedback the user can receive is still under debate.

    Code Browser Pane:

    The code browser is a simple tabbed source-code viewer. It allows the user to load files to view, and to set breakpoints in them. Having this feature gives the user a convenient way of setting breakpoints in a section of code that is not currently in the scope of one of the paused threads or during execution of the program. The user cannot edit code in the code browser. It is a read-only text viewer, and will automatically update its contents if the file has been changed or removed.

    Program Output Pane:

    The program output pane displays the output of the program while it is running.

    Final Note:

    Although this mockup was designed using the Qt GUI builder, the actual program will use GTK instead, since it is better supported by OCaml. The overall appearance and layout of the final GUI should nevertheless be similar to this mockup.

    6. Requirements

    7. Non-functional Requirements

    In order to implement the desired features in section II while meeting the requirements detailed in section VII, Thread Debugger must ensure that the following four attributes are met:

    Performance: If a response from the debugger is delayed by more than 1 second, the thread that issued the operation should become completely grayed out, so that the user can see that the thread is busy. That said, setting a breakpoint in GDB with a large number of source files can take several seconds, so it is hard to place a numerical value on exactly how long the GUI should take. Given a small amount of source code, setting a breakpoint should not take more than 1-2 seconds through the GUI.

    Reliability: The debugger must not crash or exit unexpectedly, even on unexpected and/or invalid input from the user or GDB.

    Ease of use: A programmer with significant GDB experience or significant graphical debugger experience should be able to use the debugger without documentation or help, i.e. they should be able to sit down and load a program, set breakpoints, step through it and print out the variable values without needing help.

    Portability: The program must build and run on Linux (Debian for now, other distros if we have time).

    8. Special Challenges/Risks

    If GDB does not have sufficient thread support built in, Thread Debugger's functionality could be compromised. For example, it is unclear whether one thread could be running while others are stopped. This risk could be reduced by researching the thread capabilities of GDB in detail. Few users take advantage of GDB's multithreading features, and those who do use them from a text console, so it is vital that they be investigated further. If it turns out that something important is not implemented, it might be necessary to modify GDB.

    The usability of having several thread panes visible at once is untested. The current belief is that it would be feasible, but if it is determined that having multiple thread views simultaneously does not work from a usability perspective (because, for example, there is not enough screen space horizontally to view the code), then more work will be necessary to update the interface paradigm.

    If having multiple thread views proves usable, we must then develop the code with a robust design in mind so that we can accommodate an arbitrary number of threads. After testing the code with different numbers of threads for performance and reliability, we can set the upper limit on the number of threads our debugger will support.

    In order to examine the contents of common data structures, such as blocking queues, used in concurrent programs (as outlined in Section VII) we must investigate whether a standard implementation exists.

    9. Testing Strategy

    One advantage of the requirements for the Thread Debugger project is that testing is relatively straightforward. There are only two external dependencies (GDB and GTK), both of which are maintained and thoroughly tested by an avid community. These external dependencies will run on the same machine and are isolated in such a way that they are accessed by a single user. As such, they are particularly stable, especially when contrasted with other dependencies like web services. Additionally, communications between components act as closed circuits: there is a single user interacting with a single copy of the application, which is interacting with a single copy of GDB. Once the interfaces between the components have been set, testing should be quite systematic and thorough.

    As such, testing will be focused primarily on the individual components and the communication bridges between them.

    Responsibilities

    While most of the testing responsibilities will clearly fall on the testing team, every member of the team will have some responsibilities with regard to testing.

    Individual Coders

    Unit Testing

    Every developer will be responsible for creating their own unit tests. There are a number of unit testing frameworks written for OCaml, such as OUnit. Choosing one of these frameworks will be one of the first tasks for the testing team.

    Bug Tracking

    Every member of the project will be involved in bug tracking. Anyone who stumbles upon a bug (tester or not) will be responsible for reporting it.

    Bug trackers are used to keep track of known bugs in a code base and their status. For a given bug, most systems track an assigned owner (a person or group responsible for the bug in question), status, information surrounding its discovery and how to reproduce it, its impact, how it manifests itself in the application, and a few other things. Some tracker systems also integrate with whatever source control system is in use, for example CVS or Subversion. In these cases developers can frequently add specially formatted sections to their commit messages, which the bug tracker picks up in order to update the bug status at the same time that a fix (or partial fix) is committed to the repository. Generally, if a bug appears, it is a good indicator that a unit test is needed to detect regressions after the fix. Using bug tracking software helps developers manage the bugs they encounter without having to keep track of them themselves.

    Code Comments and Conventions

    Every developer must maintain excellent comments for her own code, including module and function contracts as well as internal comments. While this might initially sound like a documentation task, comments are extremely helpful for QA development. When a bug is found, contracts can help narrow down the potential sources to blame. Also, comments can illuminate the difference between what code is actually doing and what it is supposed to be doing.

    Style conventions are also useful for testing. It is very likely that people will need to read each other's code for a number of reasons, whether in the middle of a bug hunt, looking for clarification of an interface contract, or simply because they have nothing better to do on a Saturday night. If the ten people working on this project had ten different coding style conventions, it would be obnoxious to sift through the code. The coding convention should also cover the style for tests.

    Testing Team

    Managing Regression Tests

    Testing Framework Development

    Component Testing

    This stage of testing is designed to guarantee that each component (GUI, Debugger, GDB Interface, and GDB Pipe) behaves properly according to the interfaces that have been agreed on during development. It will rely primarily on unit testing and dummy components.

    All dummy components should have at least two modes: friendly and pathological. The friendly mode should provide interactions that are reasonable and legal. The pathological mode should either provide randomized interaction or a sequence of commands specifically designed to be disruptive. For instance, the dummy debugger could send a break event to the GUI. Then the GUI could request a variable value. The friendly debugger might return a randomly generated value for the variable, while the pathological debugger could return another break event.
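
The two modes can be sketched with the example from the paragraph above, in which the GUI requests a variable value after a break event. The type and function names here (mode, reply, request_value) are hypothetical; only the friendly/pathological split comes from the design.

```ocaml
(* Sketch of a dummy debugger with two modes. The type and function names
   are hypothetical; only the friendly/pathological split is from the text. *)
type mode = Friendly | Pathological

type reply =
  | Value of string * int      (* variable name and a made-up value *)
  | Break_event of int         (* thread id that "hit a breakpoint" *)

(* Answer a request for the value of [var] made on behalf of [thread]. *)
let request_value mode ~thread var =
  match mode with
  | Friendly -> Value (var, Random.int 100)   (* a legal, plausible answer *)
  | Pathological -> Break_event thread        (* an out-of-protocol answer *)
```

A GUI test would then assert that the pathological reply is handled (logged or ignored) without the GUI crashing or losing track of the thread's state.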

    Debugger

    The debugger will have to interact with a dummy GUI and a dummy GDB Interface, which will have to be coordinated with each other. The dummy GUI will have to subscribe to various events and log callbacks. The dummy GDB interface will respond to instructions and events appropriately.

    We will create a testing module signature and then create various testing modules that implement that signature such that they can be run easily from a list of tests. Some subset of these tests can be set up to run automatically upon checkin.
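
One possible shape for the testing module signature and the list of tests is sketched below using first-class modules. The names TEST, name, run, and run_all are assumptions, and the two example tests are placeholders rather than real debugger tests.

```ocaml
(* Sketch of a common signature for test modules, run from a list.
   TEST, name and run are hypothetical names chosen for illustration. *)
module type TEST = sig
  val name : string
  val run : unit -> bool   (* true when the test passes *)
end

module Addition_test : TEST = struct
  let name = "addition"
  let run () = 1 + 1 = 2
end

module Failing_test : TEST = struct
  let name = "deliberate failure"
  let run () = false
end

(* First-class modules let separately written test modules share one list. *)
let all_tests : (module TEST) list =
  [ (module Addition_test); (module Failing_test) ]

(* Run every test, treating an exception as a failure, and return the
   names of the tests that failed. *)
let run_all tests =
  List.filter_map
    (fun (module T : TEST) ->
       let ok = try T.run () with _ -> false in
       if ok then None else Some T.name)
    tests
```

A check-in hook could call run_all on a chosen subset of the list and reject the commit when the result is non-empty.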

    Debugger Component:

    We will manually construct several canned parse trees of the type that will be returned by the Parser, and then verify that the debugger correctly maintains state. Our test will register some dummy callbacks that just print messages, and make sure that they are called correctly.

    GTKPopen Component (formerly Fancy Popen):

    We will test this component by piping in preconstructed input in the place of gdb. We could do this using cat in place of gdb. We should also test what happens if the gdb process exits unexpectedly (and the pipe is broken).

    Parser Component:

    Create a module that feeds the parser canned strings of GDB-MI text, and then check the resulting gdbmi_in value against what it should be. We should also test the unparse method by feeding it preconstructed gdbmi_out values, and verify that it generates valid and correct GDB-MI text. We can print warnings when the parser's output differs from the expected output.
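
A canned-string test of this kind might look like the sketch below. The kind type and the classify function are hypothetical stand-ins for the real parser, which would produce full gdbmi_in values; the sketch only looks at the leading character that GDB/MI uses to distinguish record types, and it reports unrecognized input as Unknown instead of failing.

```ocaml
(* Sketch of a canned-string test for a GDB/MI front end. The [kind] type
   and [classify] are hypothetical stand-ins for the real parser. *)
type kind =
  | Result_record  (* "^done", "^error", ... *)
  | Exec_async     (* "*stopped", "*running" *)
  | Status_async   (* "+..." *)
  | Notify_async   (* "=..." *)
  | Console        (* "~" console stream output *)
  | Target         (* "@" target stream output *)
  | Log            (* "&" log stream output *)
  | Prompt         (* "(gdb)" *)
  | Unknown        (* anything else: reported, never fatal *)

let classify line =
  let line = String.trim line in
  if line = "(gdb)" then Prompt
  else
    (* Skip the optional numeric token that may prefix a record. *)
    let n = String.length line in
    let rec skip i =
      if i < n && line.[i] >= '0' && line.[i] <= '9' then skip (i + 1) else i
    in
    let i = skip 0 in
    if i >= n then Unknown
    else
      match line.[i] with
      | '^' -> Result_record
      | '*' -> Exec_async
      | '+' -> Status_async
      | '=' -> Notify_async
      | '~' -> Console
      | '@' -> Target
      | '&' -> Log
      | _ -> Unknown

(* Canned inputs paired with the expected classification. *)
let cases =
  [ "^done", Result_record;
    "42^error,msg=\"No symbol table\"", Result_record;
    "*stopped,reason=\"breakpoint-hit\"", Exec_async;
    "~\"hello\\n\"", Console;
    "(gdb)", Prompt;
    "complete garbage", Unknown ]

(* Return the inputs whose classification differs from the expectation. *)
let failures () =
  List.filter (fun (input, expected) -> classify input <> expected) cases
  |> List.map fst
```

Printing the contents of failures () would give the warnings mentioned above, one per canned string whose output differs from the expectation.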

    GDB Interface / Parser

    The GDB interface and parser are the most straightforward components to test. The parser needs to be able to robustly handle all sorts of inputs. It should gracefully ignore unimplemented extensions to GDB/MI. Additionally, it should be able to tolerate bogus input without crashing.

    Rational Input Testing

    Irrational Testing

    GDB Pipe

    The pipe is the most precarious of all the components as it is a connection to a forked process. We will need to run GDB in order to perform the tests properly. Tests should focus on attempting to break the pipeline connection and how it behaves when it does break. Additionally, the tests should run some sample threaded code with an automated sequence of instructions to ensure GDB is returning the output that is expected. This will require a dummy GDB Interface.

    GUI

    The GUI needs both unit testing and a harness for interactive testing. The unit tests should handle testing the components of the GUI. Interactive testing should be supported by a dummy debugger, which should handle event subscription for GUI components.

    Integration Testing

    If component testing is successful then the actual integration should be relatively simple. The primary task will be to replace each of the dummy components with the actual components and run the same test suites. There will also be new test suites designed specifically to test chains of components. Whereas component testing would only test the boundary behavior of a single component, these will test the boundary behaviors for groups of components. It's clear to see how the success of the integration testing stage is firmly rooted in the thoroughness of the component testing, and the development team should act accordingly.

    The integration stage will also allow for a more thorough exploration of user interface behavior cases. This will be the first opportunity for the testers to freely explore the full system using the sample threaded projects.

    Stress Testing

    Stress testing is designed to test a system under extreme cases of abuse. It should test for cases that are outside of normal operational capacity. In this project, the pressure points for stress are the communications between components. These tests should handle cases like spamming communication lines (e.g. sending an unreasonable quantity of step requests to the GDB pipe before waiting for a response) and sending massive data chunks between components (e.g. a huge output from the GDB pipe to the parser). Any tests concerning malformed data or pathological corner cases should be handled by the regular component and unit testing.

    This testing is considered separate from component testing because it lies outside of normal specified behavior. In fact, many bugs that arise from stress testing might not even be considered true bugs. Stress tests are designed to observe how the system behaves in circumstances beyond the requirements. As such, any bugs that are revealed by these tests do not necessarily need to be fixed. They should, however, at the very least be caught and handled gracefully. For instance, a system that expects to have only 1,000 concurrent users should not necessarily be considered defective if it fails to handle 100,000 users. Instead, it should be tested against a set of relaxed requirements, such as not corrupting or losing data, or displaying a pleasant error message instead of violently crashing.

    Stress testing should be done both in parallel with and following integration testing. Individual components can be stress tested before the entire project has been integrated; but, once integration is complete, the entire system should be subjected to stress tests (for the same rationale as the integration tests).

    User Testing

    For tdb, user testing will focus on usability. Since it is a simple desktop application, we do not need to attempt to test many users stressing our application at once -- there is never more than one user using tdb at a time. There are two primary groups of users we should do testing with.

    First, we should invite users who are familiar with debugging multi-threaded C/C++ code using a graphical IDE such as Microsoft Visual Studio. They will be presented with some multi-threaded code and asked to debug it using our software. We will observe how they use our interface and what parts of it are confusing. We will also ask for their feedback regarding the application. One potential challenge is that these users may be less familiar with pthreads than with Win32 threads. Based on the feedback from these users, we will hopefully be able to refine the interface, and learn more about users' expectations of graphical debugging. We should give a demo to some of these users and not give a demo to others to see how easy our interface is to figure out intuitively.

    Second, we should do testing with users who are experienced at debugging multi-threaded C/C++ code using gdb. These users should be very experienced with pthreads. We will need to spend some time training these users about how our GUI works, making sure to show them how to access all of the commonly-used features of GDB via our GUI. We will present these users with the same multi-threaded code as the first group, and ask them to debug it using tdb. The feedback we receive from these users will be particularly important, since they are basically our target user base (UNIX programmers).

    It is worth noting that by the time we have reached this stage of testing, many students in the CS department will be working on final projects for other tasks. It is not unreasonable to assume that they could provide real case tests (instead of contrived) for tdb.

    While we will likely encounter some bugs during user testing, the primary goal is to make sure that programmers can use our application effectively.

    10. External Dependencies

    GDB/MI:

    GDB/MI is a text-based interface to GDB designed for use by software rather than humans. It has a fairly predictable grammar, and the GDB developers use care when changing its definition. It provides access to the GDB functionality one would expect: breakpoints, watchpoints, printing expression values, etc. It is ideal for our current needs.

    Functionality with Regards to Threads:

    GDB/MI provides commands to allow for the enumeration and selection of threads, as well as means to control GDB's breaking and resuming settings for multi-threaded use.
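
As an illustration, the requests the backend needs to send could be modeled as a variant type and rendered into GDB/MI command text. The command type and to_mi function are hypothetical names; the command strings themselves are standard GDB/MI commands. Quoting of file names and the choice of which thread an execution command applies to are left out of this sketch.

```ocaml
(* Sketch of turning debugger requests into GDB/MI command text. The
   [command] type is hypothetical; the command strings themselves are
   standard GDB/MI commands. *)
type command =
  | List_thread_ids
  | Select_thread of int
  | Break_insert of string * int   (* source file, line *)
  | Continue
  | Step
  | Next
  | Finish

let to_mi = function
  | List_thread_ids -> "-thread-list-ids"
  | Select_thread id -> Printf.sprintf "-thread-select %d" id
  | Break_insert (file, line) -> Printf.sprintf "-break-insert %s:%d" file line
  | Continue -> "-exec-continue"
  | Step -> "-exec-step"
  | Next -> "-exec-next"
  | Finish -> "-exec-finish"
```

Under this sketch, stepping a particular thread would be expressed as a Select_thread command followed by Step, which mirrors the per-thread Step button in the GUI.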

    Associated Risks:

    GDB/MI is currently under development. While existing functionality should not change, GDB/MI can change in the following ways (from docs):

    To mitigate these risks the parser must be robustly designed and as independent from the remainder of the debugger as possible.

    Implementation:

    We'll use OCaml's excellent built-in parser generator tools, ocamllex and ocamlyacc, to make a parser for GDB/MI.

    GTK:

    GTK provides two major things: a library of GUI widgets, and an event loop. The GUI widget library is very much like any such library (though GTK is of particularly high quality). The same can be said of the way it handles events.

    Using the GUI widgets:

    Not much to say here; we've all done GUIs, and they're all about the same at some level.

    Using the Event loop:

    GTK allows one to "watch" a file descriptor in its main event loop. Basically, it calls select(2), and by adding a file descriptor to watch, it will select() over it as well, and deliver an event to you when I/O can be done. We'll be using that to do all of our file I/O asynchronously. This is done via GMain.Io.add_watch().
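
Because a watch callback fires whenever some bytes are readable, output from GDB can arrive in arbitrary chunks rather than whole lines. The sketch below shows one way those chunks could be reassembled into complete lines before they reach the parser. It is deliberately independent of GTK so that it can be tested on its own; the helper name feed and the use of Buffer are assumptions, not part of the design.

```ocaml
(* Sketch: reassemble arbitrary chunks of pipe output into whole lines.
   [feed] is a hypothetical helper and does not depend on GTK. *)

(* [feed pending chunk] appends [chunk] to the partial line held in
   [pending] and returns every line completed by this chunk, in order.
   Any trailing partial line stays in [pending] for the next call. *)
let feed pending chunk =
  Buffer.add_string pending chunk;
  let data = Buffer.contents pending in
  match List.rev (String.split_on_char '\n' data) with
  | [] -> []
  | last :: complete_rev ->
    (* The last piece has no terminating newline yet: keep it pending. *)
    Buffer.clear pending;
    Buffer.add_string pending last;
    List.rev complete_rev
```

In the real program, the callback registered with the event loop would read whatever bytes are available and pass them to a function like feed, handing each returned line to the GDB/MI parser.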

    11. Schedule

    Schedule

    12. Task Breakdown / Group Organization

    1. Team Lead: Tara

      The Team Lead will make sure that everything stays on schedule and will help coordinate the members of the group. She will also help with coding for anyone who falls behind schedule, if need be.

    2. Coders:
      1. GUI - Nate(lead), Lincoln, Tara
      2. GDB Backend - Sean(lead), Colin, Brendan, Owen

      The coders will be responsible for coding their respective components according to the system design.

    3. Documentation Master: Dominic

      The Documentation Master will work closely with the coders and architects to produce high-quality documentation of both internal APIs and end-user features. In addition, he will be responsible for creating a course webpage and taking notes at meetings and posting them to the webpage. He will be in charge of the final README document when the software is released.

    4. Architect: Lincoln

      The Architect will be responsible for having a good understanding of how the overall system is designed and how the pieces fit together. Questions and conflicts about the interfaces of the various components will be resolved by the architect. He will also play a support role during integration.

    5. Tools Tzar: Colin

      The Tools Tzar will be responsible for making sure that tools like SVN are set up, and will answer questions that other group members have about the tools.

    6. Lead Tester: Josh, Brendan

      The lead testers will be responsible for running unit and system tests, and may deploy an automated testing system. They will also be responsible for finding users for user testing.