Dynamic Binary Instrumentation

From TracingWiki

Dynamic binary instrumentation techniques execute additional instructions at chosen locations in a program. Because the instrumentation is added at run time, tracing overhead is incurred only while the program is actually instrumented to produce a trace. This is unlike source-level instrumentation, where the decision to instrument is taken at compile time and some overhead is present whenever tracing is compiled in, even if it is not activated at execution time.
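The contrast can be illustrated with a loose, function-level analogy in Python. This is only a sketch: real dynamic binary instrumentation patches machine code rather than Python attributes, and every name here (TRACE_ENABLED, compute_static, compute_plain, instrument) is hypothetical.

```python
import functools
from types import SimpleNamespace

trace_log = []
TRACE_ENABLED = False  # switch for the compiled-in tracepoint


def compute_static(x):
    # Source-level tracepoint: this test runs on every call,
    # even while tracing is disabled.
    if TRACE_ENABLED:
        trace_log.append(("compute_static", (x,)))
    return x * 2


def compute_plain(x):
    # No instrumentation in the source at all.
    return x * 2


def instrument(target, name):
    """Wrap target.<name> at run time; return a function that undoes it."""
    original = getattr(target, name)

    @functools.wraps(original)
    def wrapper(*args):
        trace_log.append((name, args))  # the added "instrumentation code"
        return original(*args)

    setattr(target, name, wrapper)

    def remove():
        setattr(target, name, original)

    return remove


# Hypothetical usage: the cost of the wrapper exists only between
# instrument() and remove().
app = SimpleNamespace(compute=compute_plain)
remove = instrument(app, "compute")
app.compute(3)
remove()
```

Before instrument() is called and after remove() runs, app.compute is the unmodified function, so nothing is paid for tracing. compute_static, in contrast, evaluates its flag on every call.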

Valgrind, DTrace, SystemTap (based on kprobes), Frysk and GDB are examples of dynamic binary instrumentation tools. DTrace, SystemTap, Frysk and GDB overwrite the program with trap instructions at the locations where instrumentation code must be executed. When a trap instruction is encountered, an exception is raised and the additional instrumentation code can be executed. The original instruction (overwritten by the trap) is then restored, the processor is single stepped inline (SSIL) over the restored instruction, and the trap is put back in place before program execution continues. Because of this trap mechanism, the overhead of each execution of an instrumented site is larger than for static, source-level instrumentation.
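The trap, restore, single step and re-arm sequence described above can be sketched with a toy interpreter. This is a simplified model under stated assumptions, not how any of the listed tools is implemented: the three-instruction machine and all names (ToyCPU, insert_probe, TRAP) are hypothetical.

```python
TRAP = ("trap",)  # stand-in for a breakpoint instruction


class ToyCPU:
    """Tiny machine: 'add n', 'dec_jnz target' and 'halt'."""

    def __init__(self, program, counter=0):
        self.mem = list(program)  # instruction memory, patched in place
        self.pc = 0
        self.acc = 0
        self.counter = counter
        self.saved = {}     # address -> original instruction
        self.handlers = {}  # address -> instrumentation callback

    def insert_probe(self, addr, handler):
        # Save the original instruction and overwrite it with a trap.
        self.saved[addr] = self.mem[addr]
        self.handlers[addr] = handler
        self.mem[addr] = TRAP

    def step(self):
        """Execute one instruction; return False once 'halt' is reached."""
        op = self.mem[self.pc]
        if op[0] == "add":
            self.acc += op[1]
            self.pc += 1
        elif op[0] == "dec_jnz":
            self.counter -= 1
            self.pc = op[1] if self.counter != 0 else self.pc + 1
        elif op[0] == "halt":
            return False
        return True

    def run(self):
        running = True
        while running:
            addr = self.pc
            if self.mem[addr] == TRAP:
                self.handlers[addr](self)          # run instrumentation code
                self.mem[addr] = self.saved[addr]  # restore the original
                running = self.step()              # single step inline
                self.mem[addr] = TRAP              # re-arm the trap
            else:
                running = self.step()


# Hypothetical usage: record the accumulator each time address 0 executes.
seen = []
cpu = ToyCPU([("add", 5), ("dec_jnz", 0), ("halt",)], counter=3)
cpu.insert_probe(0, lambda c: seen.append(c.acc))
cpu.run()
```

Each visit to the probed address costs a handler call plus two writes to instruction memory, which mirrors why trap-based probes are more expensive per hit than a compiled-in call.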

Static and dynamic instrumentation are therefore complementary, each having specific advantages. For example, frequently used tracepoints in the Linux kernel are most efficiently implemented with Kernel Tracepoints, the cost of calling a probe being that of a simple function call. Moreover, a static tracepoint ensures that all its arguments (e.g. local variables) remain available at runtime and are not optimized out by the compiler. On the other hand, adding ad hoc tracepoints at runtime is not possible with Kernel Tracepoints, but can conveniently be achieved with SystemTap via kprobes, albeit at the higher cost of a trap instruction. The probe connected to either a Kernel Tracepoint or a SystemTap tracepoint can then, for instance, write an event to the LTTng trace buffers for tracing purposes.

GDB and Frysk use the ptrace programming interface to interact with the process being debugged. Ptrace can be used to monitor and control another process: receive notification of signals, traps and system calls, read and write the process memory and registers, and start, stop and single step the execution. Ptrace is used by a number of tools, including strace, a system call tracer, and ltrace, a library function call tracer. The method used by GDB to continue after hitting a breakpoint, single stepping inline (SSIL), is adequate if all threads are stopped. Otherwise, other running threads may execute the restored instruction while the trap is temporarily removed and thus miss the breakpoint. On some processors, replacing even a single byte of the instruction stream may lead to inconsistencies between the memory and the instruction cache. For these reasons, more sophisticated techniques are often required to quiesce all threads before inserting a breakpoint, and to single step out of line (SSOL) when continuing after a breakpoint. The utrace in-kernel programming interface has been proposed to address these issues and, more generally, to provide a more flexible and efficient mechanism to support debugging and tracing tools.
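The missed-breakpoint problem and the out-of-line alternative can be illustrated with a scripted interleaving of two toy threads that share the same code memory. This is a minimal sketch and does not use ptrace itself. The names (SharedCode, ssil_with_race, step) are hypothetical, and the interleaving is forced by hand rather than produced by real concurrency.

```python
TRAP = ("trap",)


class SharedCode:
    """Code memory shared by all threads, with one probe installed."""

    def __init__(self, program, probe_addr):
        self.mem = list(program)
        self.probe_addr = probe_addr
        self.original = self.mem[probe_addr]  # saved copy of the instruction
        self.mem[probe_addr] = TRAP
        self.hits = 0


def new_thread():
    return {"pc": 0, "acc": 0}  # private registers of one thread


def execute(op, thread):
    """Apply one toy instruction to a thread's private registers."""
    if op[0] == "add":
        thread["acc"] += op[1]
    thread["pc"] += 1


def step(code, thread):
    """Fetch and run one instruction; on a trap, single step out of line."""
    op = code.mem[thread["pc"]]
    if op == TRAP:
        code.hits += 1
        op = code.original  # run the saved copy; the trap stays in memory
    execute(op, thread)


def ssil_with_race(code, a, b):
    """Thread a hits the trap; thread b runs while the trap is lifted."""
    code.hits += 1                             # a's probe fires
    code.mem[code.probe_addr] = code.original  # trap lifted for ALL threads
    step(code, b)                              # b slips through the window
    execute(code.mem[a["pc"]], a)              # a single steps inline
    code.mem[code.probe_addr] = TRAP           # trap re-armed


def demo():
    program = [("add", 1), ("halt",)]
    results = {}

    code, a, b = SharedCode(program, 0), new_thread(), new_thread()
    ssil_with_race(code, a, b)
    results["ssil"] = (code.hits, a["acc"], b["acc"])

    code, a, b = SharedCode(program, 0), new_thread(), new_thread()
    step(code, a)
    step(code, b)
    results["ssol"] = (code.hits, a["acc"], b["acc"])
    return results
```

In this model both threads always execute the probed instruction, but the inline variant records one hit while the out-of-line variant records two, because the trap never leaves the shared code memory.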

Valgrind uses a different approach, similar to that of static binary instrumentation packages. Before each linear instruction block is executed, its binary code is decompiled into a higher-level intermediate representation. This representation is then instrumented as needed, and the result is recompiled and executed in place of the original block.
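The decompile, instrument and recompile pipeline can be sketched on a toy instruction block. This is an illustration under stated assumptions and does not reproduce Valgrind's real intermediate representation or code generator: the textual instruction format and the functions lift, instrument and recompile are hypothetical.

```python
def lift(block):
    """'Decompile' toy instructions such as 'load r0 100' into a small IR."""
    ir = []
    for text in block:
        opcode, *args = text.split()
        ir.append({"op": opcode, "args": args})
    return ir


def instrument(ir):
    """Insert a 'trace' statement before every memory access in the IR."""
    out = []
    for stmt in ir:
        if stmt["op"] in ("load", "store"):
            address = stmt["args"][1]
            out.append({"op": "trace", "args": [stmt["op"], address]})
        out.append(stmt)
    return out


def recompile(ir):
    """Generate Python source for the IR and compile it into a function."""
    lines = ["def block(regs, mem, log):"]
    for stmt in ir:
        op, a = stmt["op"], stmt["args"]
        if op == "load":     # register = memory[address]
            lines.append(f"    regs[{a[0]!r}] = mem.get({int(a[1])}, 0)")
        elif op == "addi":   # register += immediate
            lines.append(f"    regs[{a[0]!r}] = regs.get({a[0]!r}, 0) + {int(a[1])}")
        elif op == "store":  # memory[address] = register
            lines.append(f"    mem[{int(a[1])}] = regs[{a[0]!r}]")
        elif op == "trace":  # inserted instrumentation
            lines.append(f"    log.append(({a[0]!r}, {int(a[1])}))")
    lines.append("    return None")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["block"]


# Hypothetical usage: trace every memory access made by one block.
original_block = ["load r0 100", "addi r0 1", "store r0 104"]
run_block = recompile(instrument(lift(original_block)))
memory, access_log = {100: 41}, []
run_block({}, memory, access_log)
```

Because the instrumentation is woven into the regenerated block rather than reached through a trap, this style suits probes that fire on every instruction of a given kind.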

DTrace, SystemTap and GDB are typically used to add a few tracepoints or breakpoints, whereas Valgrind is more commonly used for pervasive monitoring of a program, for instance tracing every memory access.
