CS190: Thread Debugger Detailed Design

Modifications

Added component testing, modified the requirements and non-functional requirements, clarified the component descriptions, and updated the schedule.

1. Descriptions/Features

Thread Debugger is a tool used for debugging multithreaded code in an intuitive graphical user interface. It has common debugger features such as stepping through code, setting breakpoints, and examining variable values and state of the program using backtraces.

The multithreaded component is a key portion of the project. The debugger must be able to display and debug several threads simultaneously, with all the features of the single-threaded debugger.

The debugger will utilize the existing debugging capabilities of GDB. The main difficulty of this project will be providing an effective, easy-to-use front-end.

For the first time in CS 190, Thread Debugger will be implemented in OCaml. OCaml's features and language design make it the best choice. This decision was made based on a number of factors:

Static type-checking and strong interface contracts. For a project of this size, it makes sense to enforce as many constraints at compile-time as possible in order to increase the likelihood of success at run-time, especially during integration.
Strong built-in libraries, specifically for parsing. OCaml has excellent parser-construction libraries built in, which will be vital to communicating with GDB.
Functional style. Functional programming is preferred among CS 190 students because of its significant productivity improvements over strictly imperative styles.

The primary user interface feature will be a debugger pane. There can be several debugger panes visible at once, one for each running thread, tiled horizontally down to a minimum width for a single pane.

The primary feature of a pane is the code display, displaying the source file for the running code. If the code is paused, the line of code that execution is halted at is highlighted. The user may left-click in the margin to set a breakpoint in the code, and a breakpoint icon appears wherever there is a breakpoint.

Panes also have a toolbar of controls at the bottom. Each thread has Break, Continue, Step, Next, and Finish buttons, which function similarly to the corresponding GDB commands. The pane may also have several other status icons indicating the thread's status, such as running, blocked on a synchronization device, or in a system call. These status icons are modular and can be easily added and/or customized by advanced users.

When a thread stops, a mini-pane extends from the bottom of the code window displaying relevant data about the thread, including local variables and backtracing. This mini-pane disappears when the user interacts with the pane, but it can be redisplayed through a button on the pane's toolbar.

The system will have features for easily displaying variable values and interacting with GDB directly. As the user moves the cursor over symbols in the code, the debugger highlights features of interest related to that symbol. For example, when the user moves the cursor over a variable, the debugger highlights other instances of that variable in the code; when the user moves the cursor over a delimiter, the debugger highlights the matching delimiter. When the user left-clicks a variable, a mini-pane appears, displaying the variable's value. The user can left-click a toolbar button to enter a command mode where they can enter a command to GDB and get the result.

There will be many keyboard shortcuts available for advanced users. The UI maintains the concept of an active thread, to which keyboard input is directed. Users can take advantage of shortcuts to activate almost all UI capabilities (for instance, pressing n performs the next action on the current thread).

2. System Model

Design diagram

3. System Model Description of Major Components

GUI: handles initialization of program and placement of all UI elements (when program starts and upon thread creation/destruction)

GUI to Debugger

createDebugger
reloadProgram: reloads the program being debugged
callbacks

programLoaded
programEnded
debuggerReady
threadCreated
threadExited

FileLoader: manages all open source files

GUI -> FileLoader

openFile

GUI Elements: manage display and interaction with user.

Elements include:

CodeView - displays code & breakpoints & current instruction, allows user to set bp, click on code to get info
History - displays recently viewed variable values and updates with stepping
Backtrace - displays backtrace and lets user browse frames of execution
Toolbar - buttons for stepping through code
CommandLine - allows user to type commands to GDB and get result

GUI -> GUI elements

construction and placement

Debugger: the abstract interface for a debugger backend. It has methods for program control, breakpoints, and monitoring of values, and provides many callbacks to let interested parties know what is going on.

GUI elements -> Debugger

continue/break/step/next/finish
setBreakpoint
monitorValue: display command of gdb
stopMonitoringValue
callbacks:

programLoaded
programEnded
debuggerReady
threadCreated
threadExited
programStepped
breakHit
newValue

GDB Backend: the Debugger interface is implemented with a GDB backend. It opens a separate GDB process and talks to it using the GDB/MI interface.

Debugger (GDB) -> POpen

openProcess
callbacks:
- newData

GDBParser: is a module that understands GDB/MI and translates it into debugger's structures.

Debugger (GDB) -> GDBParser

parseData
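
As a rough illustration of the first step parseData might take, the sketch below classifies a single line of GDB/MI output by its leading character, after skipping the optional numeric token. The prefix characters follow the GDB/MI output syntax; the type record_kind and the function classify are hypothetical names, and this hand-written function is a stand-in for illustration, not the ocamllex/ocamlyacc parser the project plans to build.

```ocaml
(* Kinds of GDB/MI output records, keyed by their prefix character. *)
type record_kind =
  | Result          (* ^done, ^running, ^error, ... *)
  | Exec_async      (* *stopped, *running *)
  | Status_async    (* + *)
  | Notify_async    (* = *)
  | Console_stream  (* ~ *)
  | Target_stream   (* @ *)
  | Log_stream      (* & *)
  | Prompt          (* (gdb) *)
  | Unknown

let classify (line : string) : record_kind =
  let n = String.length line in
  (* Skip the optional numeric token that may precede a record. *)
  let rec skip i =
    if i < n && line.[i] >= '0' && line.[i] <= '9' then skip (i + 1) else i
  in
  let i = skip 0 in
  if i >= n then Unknown
  else
    match line.[i] with
    | '^' -> Result
    | '*' -> Exec_async
    | '+' -> Status_async
    | '=' -> Notify_async
    | '~' -> Console_stream
    | '@' -> Target_stream
    | '&' -> Log_stream
    | '(' when String.trim line = "(gdb)" -> Prompt
    | _ -> Unknown
```

Anything the function does not recognize maps to Unknown rather than raising, so unexpected input cannot crash the caller.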

POpen: just knows how to open a GDB process and attach its output file descriptors to the GTK event loop, then provides callbacks (synchronous in the GTK event loop) when data is ready

POpen -> EventLoop

monitorFD
- newData

GUI elements -> GTK

create gtk widgets and put them on screen
set up callbacks for when they are clicked

4. GUI Design

GUI

5. GUI Description

Program Control Buttons

The program control buttons manage the debugging of the entire application. They allow you to load a new binary into the debugger, reload the current binary, and start and stop execution of all threads. Eventually, these buttons will have icons to make them more recognizable.

Thread Columns

Each running thread has one corresponding column in the GUI. Each column is responsible for displaying important data about its thread and allows for controlling the debugging process in that thread. Only one thread column is active at a given time. The active thread column is the one that has keyboard focus.

Eventually, advanced features can be implemented to deal with hiding these thread columns, but at the moment they will be held in a generic container which just renders them side by side. To implement more advanced scenarios, the container would have to be replaced with a more sophisticated and intuitive one.

When the entire application is running, most features of the thread columns will be disabled. The status will continue to be displayed, but everything else will be hidden and will not be updated. To view details about a thread, the user must stop the program and step through it.

Thread Status:

Displays the current status of the thread. Possible status categories include Running, Stopped, Blocked, and Terminated. Eventually, this text will be replaced with a symbol that best represents the type of thread status.

Thread Control Buttons:

These buttons control the process of stepping through their corresponding thread. Like the thread status displays, the buttons will eventually contain icons for clarity.

Variables Pane:

The variables pane displays a number of recently examined variables. When the user left-clicks on a variable name in the code pane, that variable will be added to the variables pane and its value will be displayed and updated as the user continues to debug. If the number of variables examined exceeds the number that can fit, the least recently examined variable will be removed from the display.
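
The eviction rule described above amounts to a small least-recently-used list. A minimal sketch, assuming the pane tracks variables by name only and that the display limit is passed in as capacity (both the function name examine and that parameter are hypothetical):

```ocaml
(* [recent] holds variable names, most recently examined first.
   Examining a variable moves it to the front; once the list would
   exceed [capacity], the least recently examined entry is dropped. *)
let examine ~capacity (name : string) (recent : string list) : string list =
  let without = List.filter (fun v -> v <> name) recent in
  let rec take n = function
    | [] -> []
    | _ when n <= 0 -> []
    | x :: rest -> x :: take (n - 1) rest
  in
  take capacity (name :: without)
```

For example, with a capacity of 3 and a list of c, b, a, examining d yields d, c, b, while examining b again yields b, c, a with nothing evicted.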

Backtrace Pane:

This displays the current backtrace of the thread. The user can left-click on any of these lines to show the code in the immediate vicinity of that frame in the code pane.

Code Pane:

This displays the line of code where the thread is currently executing, as well as the immediately preceding and following lines. The user can set breakpoints in the code pane or left-click on a variable to examine its value. If the user desires a larger window to view the code, he or she can left-click the "Open in Code Browser" button to open that file using the Code Browser.

Terminal Pane:

At the bottom of each thread column is a small terminal for communicating with GDB. This is so that the user can type commands directly to GDB and access features that are not well represented in the GUI. Exactly what commands will be allowed in the terminal pane and what feedback the user can receive is still under debate.

Code Browser Pane:

The code browser is a simple tabbed source-code viewer. It allows the user to load files to view, and to set breakpoints in them. Having this feature gives the user a convenient way of setting breakpoints in a section of code that is not currently in the scope of one of the paused threads or during execution of the program. The user cannot edit code in the code browser. It is a read-only text viewer, and will automatically update its contents if the file has been changed or removed.

Program Output Pane:

The program output pane displays the output of the program while it is running.

Final Note:

Although this mockup was designed using the Qt GUI builder, the actual program will use GTK instead, since GTK is better supported by OCaml. The overall appearance and layout of the final GUI should nevertheless be similar to the mockup.

6. Requirements

(Rank 1)Display and debug several threads simultaneously:

Room for at least three code listings on-screen at once.
Each code listing highlights to the user the current statement being executed by that thread, and provides backtrace capability

(Rank 1)Allow users to control threads:

Break, Continue, Step, Next, Finish
Setting breakpoints by clicking in the GUI

(Rank 1/4)Allow users to inspect threads:

(Rank 1)Inspect variables on-screen by clicking on them
(Rank 4)Inspect the value of expressions by typing them

(Rank 3)Provide information on common multithreading constructs:

If a thread is waiting on a lock (i.e., a mutex), identify the thread that holds the lock
Examine the contents of some common data structures used in concurrent programs, such as blocking queues

Thread Debugger should (in decreasing order of priority):

(Rank 4)Provide a way for the user to type commands rather than clicking if they are experienced (i.e., interact with the underlying GDB layer, if present)

(Rank 3)Display several "collapsed" threads

Collapsed threads have no code listing
Indicate in summary what a thread is doing
Contain Break, Continue, Step, Next, and Finish buttons

7. Non-functional Requirements

In order to implement the desired features in Section 1 while meeting the requirements detailed in Section 6, Thread Debugger must ensure that the following four attributes are met:

Performance: If the graphical debugger experiences a delay of more than 1 second in a response from the debugger, the thread that issued the given operation should become completely grayed out, so that the user can see that the thread is busy. That said, setting a breakpoint in GDB with a large number of source files can take several seconds, so it is hard to place a numerical value on exactly how long the GUI should take. Given a small amount of source code, setting a breakpoint should not take more than 1-2 seconds through the GUI.
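
The gray-out rule can be reduced to a check on how long each outstanding operation has been waiting. The sketch below assumes the GUI keeps a list of pending operations with the time each was issued; the record type pending, the constant busy_threshold, and both functions are hypothetical names for illustration, and the current time is passed in explicitly so the logic can be tested without a clock.

```ocaml
(* One outstanding request to the debugger backend. *)
type pending = { thread : int; issued_at : float }

(* Delay, in seconds, after which a thread's column is grayed out. *)
let busy_threshold = 1.0

let is_busy ~now (p : pending) : bool =
  now -. p.issued_at > busy_threshold

(* Threads whose columns should currently be grayed out. *)
let threads_to_gray ~now (ops : pending list) : int list =
  List.filter (is_busy ~now) ops |> List.map (fun p -> p.thread)
```

A periodic timer in the event loop could call threads_to_gray with the current time and update the affected columns; which timer mechanism to use is left open here.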

Reliability: The debugger must not crash or exit unexpectedly, even on unexpected and/or invalid input from the user or GDB.

Ease of use: A programmer with significant GDB experience or significant graphical debugger experience should be able to use the debugger without documentation or help, i.e. they should be able to sit down and load a program, set breakpoints, step through it and print out the variable values without needing help.

Portability: The program must build and run on Linux (Debian for now, other distros if we have time).

8. Special Challenges/Risks

If GDB does not have sufficient thread support built in, Thread Debugger's functionality could be compromised. For example, it is unclear whether one thread could be running while others are stopped. This risk could be reduced by researching the thread capabilities of GDB in detail. Few users take advantage of GDB's multithreading features, and those who do use them from a text console, so it is vital that they be investigated further. If it turns out that something important is not implemented, it might be necessary to modify GDB.

The usability of having several thread panes visible at once is untested. The current belief is that it would be feasible, but if it is determined that having multiple thread views simultaneously does not work from a usability perspective (because, for example, there is not enough screen space horizontally to view the code), then more work will be necessary to update the interface paradigm.

If having multiple thread views proves usable, we must then develop the code with a robust design in mind so that we can accommodate an arbitrary number of threads. After testing the code with different numbers of threads and measuring performance and reliability, we can then set the upper limit on the number of threads our debugger will support.

In order to examine the contents of common data structures used in concurrent programs, such as blocking queues (as outlined in Section 6), we must investigate whether a standard implementation exists.

9. Testing Strategy

One advantage of the requirements for the Thread Debugger project is that testing is relatively straightforward. There are only two external dependencies (GDB and GTK), both of which are maintained and thoroughly tested by an avid community. These external dependencies will run on the same machine and are isolated in such a way that they are accessed by a single user. As such, they are particularly stable, especially when contrasted with other dependencies like web services. Additionally, communications between components act as closed circuits. There is a single user interacting with a single copy of the application, which is interacting with a single copy of GDB. Once the interfaces between the components have been set, testing should be quite systematic and thorough.

As such, testing will be focused primarily on the individual components and the communication bridges between them.

Responsibilities

While most of the testing responsibilities will clearly fall on the testing team, every member of the team will have some responsibilities with regard to testing.

Individual Coders

Unit Testing

Every developer will be responsible for creating their own unit tests. There are a number of unit testing frameworks written for OCaml, such as OUnit. Choosing one of these frameworks will be one of the first tasks for the testing team.

Bug Tracking

Every member of the project will be involved in bug tracking. Anyone who stumbles upon a bug (tester or not) will be responsible for reporting a bug.

Bug trackers are used to keep track of known bugs in a code base, and their status. For a given bug, most systems track an assigned owner (a person or group responsible for the bug in question), status, information surrounding its discovery and how to reproduce it, its impact, how it manifests itself in the application, and a few other things. Some tracker systems also integrate with whatever source control system is in use, for example CVS or Subversion. In these cases developers can frequently add specially formatted sections to their commit messages, which the bug tracker can pick up in order to update the bug status at the same time that a fix (or partial fix) is committed to the repository. Generally, if a bug appears, it is a good indicator that a unit test is needed to detect regressions after the fix. Using bug tracking software helps developers manage the bugs they encounter without having to keep track themselves.

Code Comments and Conventions

Every developer must maintain excellent comments for her own code, including module and function contracts as well as internal comments. While this might initially sound like a documentation task, comments are extremely helpful for QA development. When a bug is found, contracts can help narrow the potential sources to blame. Also, comments can illuminate the difference between what code is actually doing and what it is supposed to be doing.

Style conventions are also useful for testing. It is very likely that people will need to read each other's code for a number of reasons, whether it's in the middle of a bug hunt, looking for clarification of an interface contract, or simply because they have nothing better to do on a Saturday night. If the ten people working on this project had ten different coding style conventions, it would be obnoxious to sift through the code. This coding convention should also cover the style for tests.

Testing Team

Managing Regression Tests

Enforce / Assist in the development of individual unit testing

Monitoring the bug-tracker and incorporating new unit tests into regression testing

Testing Framework Development

Harnesses and Dummy Components

Create Sample Threaded Code Projects used for testing the full product and also to be cannibalized in the dummy components

Component Testing

This stage of testing is designed to guarantee that each component (GUI, Debugger, GDB Interface, and GDB Pipe) behaves properly according to the interfaces that have been agreed on during development. It will rely primarily on unit testing and dummy components.

All dummy components should have at least two modes: friendly and pathological. The friendly mode should provide interactions that are reasonable and legal. The pathological mode should either provide randomized interaction or a sequence of commands specifically designed to be disruptive. For instance, the dummy debugger could send a break event to the GUI. Then the GUI could request a variable value. The friendly debugger might return a randomly generated value for the variable, while the pathological debugger could return another break event.
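
The two modes can be captured as a variant that selects how a dummy component answers each request. The sketch below covers only the variable-value example from the paragraph above; the types mode and reply and the function request_value are hypothetical names, and the random range used in friendly mode is arbitrary.

```ocaml
type mode = Friendly | Pathological

(* What the dummy debugger sends back to the GUI. *)
type reply =
  | Value of string * string  (* variable name, value *)
  | Break_hit of int          (* thread id *)

(* Answer a "what is the value of [name]?" request from the GUI. *)
let request_value (m : mode) ~(thread : int) (name : string) : reply =
  match m with
  | Friendly ->
    (* A legal answer: some randomly generated value for the variable. *)
    Value (name, string_of_int (Random.int 100))
  | Pathological ->
    (* A disruptive answer: another break event instead of a value. *)
    Break_hit thread
```

A GUI test would run the same interaction script once per mode and check that the pathological replies are ignored or reported without crashing.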

Debugger

The debugger will have to interact with a dummy GUI and a dummy GDB Interface, which will have to be coordinated with each other. The dummy GUI will have to subscribe to various events and log callbacks. The dummy GDB interface will respond to instructions and events appropriately.

We will create a testing module signature and then create various testing modules that implement that signature such that they can be run easily from a list of tests. Some subset of these tests can be set up to run automatically upon checkin.
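
One way to realize a testing module signature with a runnable list of tests is through first-class modules. This is a minimal sketch, not a chosen framework: the signature TEST, the two example test modules, and run_all are all hypothetical, and the tests themselves are trivial placeholders.

```ocaml
(* Every test module provides a name and a function that returns
   true on success. *)
module type TEST = sig
  val name : string
  val run : unit -> bool
end

module Addition_test : TEST = struct
  let name = "addition"
  let run () = 1 + 1 = 2
end

module Reverse_test : TEST = struct
  let name = "list reverse"
  let run () = List.rev [1; 2; 3] = [3; 2; 1]
end

(* The list of tests to run, packed as first-class modules. *)
let all_tests : (module TEST) list =
  [ (module Addition_test); (module Reverse_test) ]

(* Run every test and return the names of those that failed.
   A test that raises an exception counts as a failure. *)
let run_all (tests : (module TEST) list) : string list =
  List.filter_map
    (fun (module T : TEST) ->
      let ok = try T.run () with _ -> false in
      if ok then None else Some T.name)
    tests
```

A check-in hook could call run_all on a chosen subset of the list and reject the commit when the result is non-empty.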

Debugger Component:

We will manually construct several canned parse trees of the type that will be returned by the Parser, and then verify that the debugger correctly maintains state. Our test will register some dummy callbacks that just print messages, and make sure that they are called correctly.

GTKPopen Component (formerly Fancy Popen):

We will test this component by piping in preconstructed input in the place of gdb. We could do this using cat in place of gdb. We should also test what happens if the gdb process exits unexpectedly (and the pipe is broken).

Parser Component:

Create a module that feeds the parser canned strings of GDB-MI text, and then check the resulting gdbmi_in value against what it should be. We should also test the unparse method by feeding it preconstructed gdbmi_out values, and verify that it generates valid and correct GDB-MI text. We can print warnings when the parser's output differs from the expected output.

GDB Interface / Parser

The GDB interface and parser are the most straightforward components to test. The parser needs to be able to robustly handle all sorts of inputs. It should gracefully ignore unimplemented extensions to GDB/MI. Additionally, it should be able to tolerate bogus input without crashing.

Rational Input Testing

Standardized GDB/MI: Parse a corpus of real GDB/MI output

Expanded GDB/MI: Parse a corpus of GDB/MI output modified per the specification of how GDB/MI may be extended in the future.

Randomized GDB/MI output: Parse a set of randomly generated GDB/MI statements.

Irrational Testing

Quasimodo: Parse a corpus of subtly malformed GDB/MI output

Standard-Random: Parse the corpus used in the standardized test, followed by random input.

/dev/urandom: The parser should be able to accept input from /dev/urandom without crashing.
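
The random-input tests can be driven by a small harness that feeds arbitrary bytes to the parser and counts escaped exceptions. In this sketch, parse_line is only a stand-in for the real parser, and fuzz, random_bytes, and their parameters are hypothetical names. It uses OCaml's Random module with a fixed seed instead of reading /dev/urandom, so that a failing run can be reproduced.

```ocaml
(* Stand-in for the real parser: accepts a record prefix and returns
   the rest of the line, or reports an error without raising. *)
let parse_line (s : string) : (char * string, string) result =
  if s = "" then Error "empty record"
  else
    match s.[0] with
    | ('^' | '*' | '+' | '=' | '~' | '@' | '&') as c ->
      Ok (c, String.sub s 1 (String.length s - 1))
    | _ -> Error "unrecognized record prefix"

(* A string of [len] uniformly random bytes. *)
let random_bytes (len : int) : string =
  String.init len (fun _ -> Char.chr (Random.int 256))

(* Feed [rounds] random inputs of length 0..max_len to [parse] and
   return how many of them raised an exception. *)
let fuzz ~seed ~rounds ~max_len parse =
  Random.init seed;
  let crashes = ref 0 in
  for _i = 1 to rounds do
    let input = random_bytes (Random.int (max_len + 1)) in
    match parse input with
    | Ok _ | Error _ -> ()
    | exception _ -> incr crashes
  done;
  !crashes
```

The test passes when fuzz returns 0; returning an Error value for bad input is fine, only an uncaught exception counts as a crash.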

GDB Pipe

The pipe is the most precarious of all the components as it is a connection to a forked process. We will need to run GDB in order to perform the tests properly. Tests should focus on attempting to break the pipeline connection and how it behaves when it does break. Additionally, the tests should run some sample threaded code with an automated sequence of instructions to ensure GDB is returning the output that is expected. This will require a dummy GDB Interface.

GUI

The GUI needs both unit testing and a harness for interactive testing. The unit tests should handle testing the components of the GUI. Interactive testing should be supported by a dummy debugger, which should handle event subscription for GUI components.

Integration Testing

If component testing is successful then the actual integration should be relatively simple. The primary task will be to replace each of the dummy components with the actual components and run the same test suites. There will also be new test suites designed specifically to test chains of components. Whereas component testing would only test the boundary behavior of a single component, these will test the boundary behaviors for groups of components. It's clear to see how the success of the integration testing stage is firmly rooted in the thoroughness of the component testing, and the development team should act accordingly.

The integration stage will also allow for a more thorough exploration of user interface behavior cases. This will be the first opportunity for the testers to freely explore the full system using the sample threaded projects.

Stress Testing

Stress testing is designed to test a system under extreme cases of abuse. It should test for cases that are outside of normal operational capacity. In this project, the pressure points for stress are the communications between components. These tests should handle cases like spamming communication lines (e.g. sending an unreasonable quantity of step requests to the GDB pipe before waiting for a response) and sending massive data chunks between components (e.g. a huge output from the GDB pipe to the parser). Any tests concerning malformed data or pathological corner cases should be handled by the regular component and unit testing.

The reason this testing is considered separate from component testing is that it lies outside of normal specified behavior. In fact, many bugs that arise from stress testing might not even be considered true bugs. Stress tests are designed to observe how the system behaves in circumstances beyond the requirements. As such, any bugs that are revealed by these tests do not necessarily need to be fixed. They should, however, at the very least be caught and handled gracefully. For instance, a system that expects to have only 1,000 concurrent users should not necessarily be considered defective if it fails to handle 100,000 users. Instead, it should be tested against a set of relaxed requirements, such as not corrupting or losing data, or displaying a pleasant error message instead of violently crashing.

Stress testing should be done both in parallel with and following integration testing. Individual components can be stress tested before the entire project has been integrated; but, once integration is complete, the entire system should be subjected to stress tests (for the same rationale as the integration tests).

User Testing

For tdb, user testing will focus on usability. Since it is a simple desktop application, we do not need to attempt to test many users stressing our application at once -- there is never more than one user using tdb at a time. There are two primary groups of users we should do testing with.

First, we should invite users who are familiar with debugging multi-threaded C/C++ code using a graphical IDE such as Microsoft Visual Studio. They will be presented with some multi-threaded code and asked to debug it using our software. We will observe how they use our interface and what parts of it are confusing. We will also ask for their feedback regarding the application. One potential challenge is that these users may be less familiar with pthreads than with Win32 threads. Based on the feedback from these users, we will hopefully be able to refine the interface, and learn more about users' expectations of graphical debugging. We should give a demo to some of these users and not give a demo to others to see how easy our interface is to figure out intuitively.

Second, we should do testing with users who are experienced at debugging multi-threaded C/C++ code using gdb. These users should be very experienced with pthreads. We will need to spend some time training these users about how our GUI works, making sure to show them how to access all of the commonly-used features of GDB via our GUI. We will present these users with the same multi-threaded code as the first group, and ask them to debug it using tdb. The feedback we receive from these users will be particularly important, since they are basically our target user base (UNIX programmers).

It is worth noting that by the time we have reached this stage of testing, many students in the CS department will be working on final projects for other tasks. It is not unreasonable to assume that they could provide real case tests (instead of contrived) for tdb.

While we will likely encounter some bugs during user testing, the primary goal is to make sure that programmers can use our application effectively.

10. External Dependencies

GDB/MI:

GDB/MI is a text based interface to GDB designed for use by software rather than humans. It has a fairly predictable grammar and the GDB developers use care when changing its definition. It provides full access to GDB's functionality that one would expect: breakpoints, watchpoints, printing expression values, etc. It is ideal for our current needs.

Functionality with Regards to Threads:

GDB/MI provides commands to allow for the enumeration and selection of threads, as well as means to control GDB's breaking and resuming settings for multi-threaded use.

Associated Risks:

GDB/MI is currently under development. While existing functionality should not change, GDB/MI can change in the following ways (from docs):

New MI commands may be added.
New fields may be added to the output of any MI command.
The range of values for fields with specified values, e.g., in_scope (see -var-update) may be extended.

To mitigate these risks the parser must be robustly designed and as independent from the remainder of the debugger as possible.
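
One way to stay robust against added fields is to look up only the fields the debugger needs and ignore everything else. The sketch below assumes the parser produces results as name/value pairs; the types value and frame and the functions const_field and frame_of_fields are hypothetical, while the field names func, file, and line are those GDB/MI uses when describing a stack frame.

```ocaml
(* A simplified shape for parsed GDB/MI values. *)
type value =
  | Const of string
  | Tuple of (string * value) list
  | Value_list of value list

(* The subset of frame information this sketch cares about. *)
type frame = { func : string; file : string option; line : int option }

(* Look up a string-valued field; absent or non-string fields give None. *)
let const_field (name : string) (fields : (string * value) list) : string option =
  match List.assoc_opt name fields with
  | Some (Const s) -> Some s
  | _ -> None

(* Build a frame from whatever fields are present. Unknown fields are
   ignored; only a missing "func" makes the result None. *)
let frame_of_fields (fields : (string * value) list) : frame option =
  match const_field "func" fields with
  | None -> None
  | Some func ->
    Some
      { func;
        file = const_field "file" fields;
        line = Option.bind (const_field "line" fields) int_of_string_opt }
```

Because the lookup never enumerates the full field list, a future GDB release that adds a field to the frame tuple would not change the result.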

Implementation:

We'll use OCaml's excellent built-in parser generator tools, ocamllex and ocamlyacc, to make a parser for GDB/MI.

GTK:

GTK provides two major things: a library of GUI widgets, and an event loop. The GUI widget library is very much like any such library (though GTK is of particularly high quality). The same can be said of the way it handles events.

Using the GUI widgets:

Not much to say here; we've all done GUIs, and they're all about the same at some level.

Using the Event loop:

GTK allows one to "watch" a file descriptor in its main event loop. Basically, it calls select(2), and by adding a file descriptor to watch, it will select() over it as well, and deliver an event to you when I/O can be done. We'll be using that to do all of our file I/O asynchronously. This is done via GMain.Io.add_watch().

11. Schedule

Schedule

12. Task Breakdown / Group Organization

Team Lead: Tara
The Team Lead will make sure that everything stays on schedule and will help coordinate the members of the group. She will also help with coding for any person or people who fall behind schedule, if need be.
Coders:
1. GUI - Nate(lead), Lincoln, Tara
2. GDB Backend - Sean(lead), Colin, Brendan, Owen
The coders will be responsible for coding their respective components according to the system design
Documentation Master: Dominic
The Documentation Master will work closely with the coders and architects to produce high-quality documentation of both internal APIs and end-user features. In addition, he will be responsible for creating a course webpage and taking notes at meetings and posting them to the webpage. He will be in charge of the final README document when the software is released.
Architect: Lincoln
The Architect will be responsible for having a good understanding of how the overall system is designed and how the pieces fit together. Questions and conflicts about the interfaces of the various components will be resolved by the architect. He will also play a support role during integration.
Tools Tzar: Colin
The Tools Tzar will be responsible for making sure that tools like SVN are set up, and will answer questions that other group members have about the tools.
Lead Tester: Josh, Brendan
The lead tester will be responsible for running unit and system tests, and possibly will deploy an automated testing system. Also, he will be responsible for finding users to do user-testing.

Thread Debugger Detailed Design

3/9/07 CS190
