Dynamic analysis, part I: what is a debugger?

Terms

A program is a state machine (frequently, a nondeterministic one). A bug is a bad state transition rule.⁰^,¹ An execution witness is the states and state transitions encountered on a particular run of a program, which serves to demonstrate some interesting property.

Debugging

If an execution witness exhibits a bad state, then there must be a bug, but it is not always obvious where the bug is, because it is not always obvious at what point the state became bad. Frequently, a bug produces a subtly bad state, which a benign transition later turns into a more overtly bad state. Hence, it is difficult to debug.

A debugger is a query engine over the states and transitions of a buggy execution witness.

Existing debuggers were not designed with this in mind. They were, I guess, designed for smaller, simpler programs running on computers with few resources. They tend to have a sequential bent; this is cheap to implement, and for small programs, the sequentiality serves to create a narrative explanation—it tells a story—which is very intuitive.

For large programs, however, the sequential format only obscures—a narrative view becomes superfluous as the narrative becomes inordinately large—the sequential debugger is a hindrance rather than a help.

Nondeterminism

Most programs behave nondeterministically. Frequently, a bug can be isolated to a deterministic component. The extent to which this is possible is to some extent a function of programming style; a style which attempts to encapsulate nondeterminism has advantages similar to the strictly stronger ‘functional style’ which attempts to encapsulate all side effects.

However, it is sometimes necessary to debug concurrent code.² In these cases, an execution witness to a bug is a very valuable thing. A traditional debugger will allow you to single-step through such an execution witness once and once only; if you would like to step through it again, you must re-run the program and hope you get lucky. Solutions to this have been developed, under the names ‘time-travel debugging’ and ‘deterministic replay’, with the most prominent recent example being rr.

On the user interface side of things, these include an important generalisation of classic sequential debugging: you can step backwards. However, they do not, generally, to my knowledge, allow for the debugging of a live process. Another limitation of rr is that, although it can record multithreaded programs with fairly little overhead, it can only use a single processor at a time; this limits scalability and means that certain bugs will never manifest and others will have a much lower likelihood of manifesting than they might. I suspect that it is possible to implement scalable deterministic replay with fairly low time overhead in practice on commodity hardware by exploiting thread-locality (c.f.); this should be explored.

Experimentation

The debugger should allow the program to be changed while debugging. Attendant are:

User interface considerations
It should be possibly to apply a rule, change it, and then continue to another application of the now-changed rule
An interesting intermediate state may be deterministically reached, but expensive to recreate
An interesting intermediate state may be valid, but rare due to nondeterminism
It should be possible to compare two continuations from some intermediate state following different changes to the code; a tree structure results, which should be uniformly introspectible
If, following a fork in the program specification, nondeterminism is encountered, this much be managed somehow; it may not be possible to make the behaviour of the two branches accord, but the debugger should still try to be as useful as it can
When nondeterminism is encountered, the user should have the ability to control it, forking the execution in just the same fashion as when they change the program³
Runtime changes to the program should follow clear, centrally-defined rules about late binding, so their behaviour can be robust and understandable⁴

Tracing and logging

What are logs and traces? Whereas an execution witness comprises the full body of states and state transitions, a log contains only a subset of them. Another way to view this is that a log corresponds to a class of execution witnesses that contain the log. Logs aid debugging by constraining the state-space search for an execution witness containing a particular bug.⁵

Further, the queries performed in a given debugging session will consider only a subset of states and state transitions—i.e., a log—hence, given a log, we do not search for an execution witness, but for another log. Sometimes, the second log will be completely contained within the first.

This suggests an alternate definition of ‘bug’: a program is buggy if it admits an execution witness which exhibits a bad state transition. This definition, which borrows from the concurrency semantics literature, is attractive because it may be easier to reason about bad states or state transitions than bad state transition rules.

This also suggests an optimisation strategy for deterministic replay. In order to debug a particular bug, we don't need a particular execution witness in order to debug it; we just need any execution witness that exhibits the same bug. This allows the execution recording to throw away unnecessary information and nondeterministically recreate something isomorphic on demand. It can be difficult to specify when two witnesses witness the ‘same’ bug; but it can be more or less reasonable in case of a crash or obvious invariant violation. (The alternate witness may not exhibit the same crash, but so long as it exhibits some crash, it still helps solve some bug!)

Incremental-practical

I should suggest some incremental steps that debuggers could take to move towards a more holistic view of state (though I suspect a ground-up reimagining of both the technology and the user interface is warranted, to get the best possible results).

Debuggers should support deterministic replay.
- Ideally, it should be scalable and work on live applications.
- Functionality such as ‘continue to breakpoint’ or ‘continue to watchpoint’ should display a list of points in the program execution that satisfy the requested criteria, rather than always seeking to the next or previous one.
- The user interface should allow multiple concurrent views into the state of the application.
Even if they fail to achieve full integration (such that the distinction is erased), debuggers should at least feature interoperation with frameworks for logging and tracing.
Functionality such as ‘breakpoints’ or ‘watchpoints’ should be unified and extended to include more sophisticated queries (for example, existential queries or temporal relations). This is related to ‘design by contract’ (because we can query for states in which a contract is broken).
Debuggers should allow for logging some information at each watchpoint, and allowing the user to inspect the logs in aggregate, rather than interrupting the execution whenever a watchpoint is satisfied and waiting for user input.
- This can be done manually with the scripting functionality of some debuggers, but it is cumbersome. The functionality should be placed front and centre. A good example is SLY’s stickers.
After single-stepping through a patch of straight-line code (or before, in a time-travelling debugger), the debugger should show the values all intermediate terms took on.
- In the presence of branches and joins, this should apply uniformly and also show which branches were taken and which not.
- In the presence of a loop, the debugger should show all loop-borne state in a tabular format, with one row per iteration.
- Absent a time-travelling debugger, these data mightn’t be fully relevant; for example, one intermediate value might be a pointer, including a weak pointer (such as an array index), for which the data pointed to has changed. Providing the information is useful regardless; but this can be integrated with the aforementioned logging support.

Appendix: What about compiler bugs?

Herein, I have broadly assumed that the user’s code is in error, and that the language implementation, the debugger, and the rest of the system are collaborating to try to help find the error. Usually, this is the case, but not always; what should we do when it is not?

I do not have much to add here to the existing body of knowledge on the topic, but I will make a couple of observations.

First, traditional designs for optimising compilers are not very good, and one of their faults is propensity for bugs. In the future, I will write about how to make optimising compilers that have fewer bugs (among many other virtues).

Second, system bugs tend not to be self-concealing, so more expressive debugging tools are frequently effective in debugging themselves or other system components.

Third, having more effective tools for analysing programs’ runtime states makes it easier to identify inconsistencies in those states, and therefore to identify system bugs.

Interesting links

Notes

(back)
A state transition rule is a piece of code that corresponds to a number (typically infinitely many) of state transitions in the state machine.
(back)
Identifying and fixing a single bad transition rule is not always sufficient. Some bugs result from different portions of a codebase’s making differing assumptions; suppose rule A disagrees with B and C, and we identify B as faulty and so fix it to agree with A. We aren’t done, because we still have to fix C. Sometimes, the existence of C is obvious, but sometimes it is not, and it would be good if our tools could help us identify potential Cs (this starts to get at the interplay between static and dynamic analysis).
(back)
Most interesting cases of nondeterminism are really concurrency issues.
(back)
This too starts to interplay with static analysis.
(back)
For example, Common Lisp does not specify when global function definitions are bound unless they are explicitly marked notinline; this is bad.
(back)
Again with the static analysis.