Tracepoints and Markers

From TracingWiki


Tracepoints and Markers are hooking mechanisms that provide static instrumentation. The instrumentation can be enabled dynamically at runtime and has a very small footprint when disabled. A Marker is a tracepoint variant that generates LTTng events directly from a generic printf-like function call. This document describes both Tracepoints and Markers, their implementation in the Linux kernel, and the ongoing implementation for Linux user-space processes.

Tracepoints are an instrumentation mechanism based on C function prototypes, meant to provide a simple and flexible instrumentation API. Developers define instrumentation prototypes in include (.h) files, which are then used in the C code. The prototype definitions rely on preprocessor macros that add an entry to a table for each site where a tracepoint is used in the C code, which makes the tracepoints usable at runtime. A probe (a C function whose argument type list matches the tracepoint) can be connected to a tracepoint, and the tracepoint can then be activated. When an activated tracepoint is hit in the program, the connected probe is called, and execution continues when the probe returns. A naive implementation of tracepoints could result in code roughly equivalent to the following.

if (tracepoint_1_activated)
  (*tracepoint_1_probe)(arg1, arg2);

In practice, several optimizations and enhancements are applied. For instance, several probes can be connected to the same tracepoint, and the various operations (probe connection, tracepoint activation) are thread safe. Furthermore, a faster immediate boolean value may be used instead of a global boolean variable to check the activation status. Other optimizations include marking the activation test as an unlikely branch and placing the probe call setup instructions at the end of the function to improve instruction cache usage. Thus, as a first approximation, the cost of an inactive tracepoint is that of an if on a boolean variable, and the cost of an active tracepoint is that of the if plus a call to the connected probe through a function pointer.
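As a rough illustration of two of these points, the following user-space C sketch supports several probes per tracepoint and marks the activation test as an unlikely branch. The names used here (struct tracepoint, tracepoint_connect, tracepoint_hit, MAX_PROBES and the two example probes) are assumptions made for this example and are not the kernel API. Unlike the real implementation, the sketch is not thread safe and uses a plain variable rather than an immediate value.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hint: a GCC/Clang builtin, a plain expression elsewhere. */
#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

#define MAX_PROBES 4  /* arbitrary limit chosen for this sketch */

typedef void (*probe_fn)(int arg1, int arg2);

struct tracepoint {
    int active;                   /* activation flag tested at each site */
    probe_fn probes[MAX_PROBES];  /* connected probes, called in order */
    size_t nr_probes;
};

/* Connect a probe; returns 0 on success, -1 if the probe table is full. */
static int tracepoint_connect(struct tracepoint *tp, probe_fn probe)
{
    assert(tp != NULL && probe != NULL);
    if (tp->nr_probes == MAX_PROBES)
        return -1;
    tp->probes[tp->nr_probes++] = probe;
    return 0;
}

/* Instrumentation site: one flag test when inactive, probe calls otherwise. */
static void tracepoint_hit(const struct tracepoint *tp, int arg1, int arg2)
{
    if (unlikely(tp->active)) {
        for (size_t i = 0; i < tp->nr_probes; i++)
            tp->probes[i](arg1, arg2);
    }
}

/* Two example probes that only record what they saw. */
static int call_count;
static int arg_sum;

static void count_probe(int arg1, int arg2)
{
    (void)arg1;
    (void)arg2;
    call_count++;
}

static void sum_probe(int arg1, int arg2)
{
    arg_sum += arg1 + arg2;
}
```

After both probes are connected and active is set to 1, each call to tracepoint_hit runs them in connection order. While active is 0, the site costs only the flag test.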

One possible use for a tracepoint is to call a probe for tracing purposes. The probe then uses the received arguments to write an event to the trace buffers. The LTTng tracer provides two different APIs to write an event to the trace buffers. The first is based on a format string and a variable argument list. The format string is registered with LTTng at initialization time and written to a trace only once; it can later be retrieved from the trace metadata to help a tracer understand the event types. The variable argument list is accessed based on the format string, and the individual arguments are efficiently written in binary form in the trace buffers. The second API is lower level but more efficient: it lets the probe reserve space for an event and write each argument in binary form directly in the trace buffers.
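The difference between the two styles can be sketched in plain C. Everything below (struct trace_buf, trace_reserve, trace_write, trace_event_fmt, probe_direct) is hypothetical and is not the LTTng API. The sketch only handles the %d and %ld conversions, and it leaves out the one-time registration of the format string in the trace metadata.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRACE_BUF_SIZE 256  /* arbitrary size chosen for this sketch */

struct trace_buf {
    unsigned char data[TRACE_BUF_SIZE];
    size_t used;
};

/* Lower-level API: reserve len bytes; returns NULL when the buffer is full. */
static void *trace_reserve(struct trace_buf *buf, size_t len)
{
    void *slot;

    assert(buf != NULL);
    if (len > TRACE_BUF_SIZE - buf->used)
        return NULL;
    slot = buf->data + buf->used;
    buf->used += len;
    return slot;
}

/* Reserve space and copy one argument in binary form. */
static int trace_write(struct trace_buf *buf, const void *src, size_t len)
{
    void *slot = trace_reserve(buf, len);

    if (slot == NULL)
        return -1;
    memcpy(slot, src, len);
    return 0;
}

/* Format-string API: the format drives how the variable arguments are read. */
static int trace_event_fmt(struct trace_buf *buf, const char *fmt, ...)
{
    va_list ap;
    int rc = 0;

    va_start(ap, fmt);
    for (const char *p = fmt; rc == 0 && *p != '\0'; p++) {
        if (*p != '%')
            continue;  /* literal text describes the event; it is not payload */
        p++;
        if (*p == 'd') {
            int32_t v = (int32_t)va_arg(ap, int);
            rc = trace_write(buf, &v, sizeof(v));
        } else if (*p == 'l' && p[1] == 'd') {
            int64_t v = (int64_t)va_arg(ap, long);
            rc = trace_write(buf, &v, sizeof(v));
            p++;
        } else {
            rc = -1;  /* conversion not supported by this sketch */
        }
    }
    va_end(ap);
    return rc;
}

/* Direct API: a probe that knows its argument types writes them itself. */
static int probe_direct(struct trace_buf *buf, int32_t irq, int64_t timestamp)
{
    unsigned char *slot = trace_reserve(buf, sizeof(irq) + sizeof(timestamp));

    if (slot == NULL)
        return -1;
    memcpy(slot, &irq, sizeof(irq));
    memcpy(slot + sizeof(irq), &timestamp, sizeof(timestamp));
    return 0;
}
```

Both paths leave the same bytes in the buffer. The direct path reserves space once and skips the format string parsing, which is where the efficiency gain described above comes from.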

Markers are very much like Tracepoints, except that they declare a format string and export the data through a variable argument list. There is a small overhead associated with variable argument lists and the interpretation of the format string. The types of the arguments are also normally restricted to scalar types, which can easily be described by the format string. The advantage of Markers is that they are self-describing. They do not require a prior declaration in an include (.h) file, and they can be processed by a generic probe that expects a printf-like variable argument list. For tracing purposes, the LTTng tracer provides such a generic probe. LTTng can then directly connect to and activate any Marker.
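A minimal user-space sketch of the idea follows. One generic probe can serve any marker because the format string describes the arguments. The names (struct marker, marker_call, MARK, generic_probe, last_event) are assumptions made for this example, not the kernel or LTTng API. For simplicity the sketch declares the marker descriptor explicitly, whereas the text above notes that real Markers need no prior declaration. The MARK macro also requires at least one argument after the marker.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct marker {
    const char *name;
    const char *format;  /* describes the variable argument list */
    int active;
    void (*probe)(const struct marker *m, va_list ap);
};

/* Forward the variable arguments to whichever probe is connected. */
static void marker_call(const struct marker *m, ...)
{
    va_list ap;

    assert(m != NULL && m->probe != NULL);
    va_start(ap, m);
    m->probe(m, ap);
    va_end(ap);
}

/* Marker site: a single flag test when the marker is inactive. */
#define MARK(m, ...)                          \
    do {                                      \
        if ((m)->active)                      \
            marker_call((m), __VA_ARGS__);    \
    } while (0)

/* Last event rendered by the generic probe (stands in for a trace buffer). */
static char last_event[128];

/* Generic probe: needs only the marker's own name and format string. */
static void generic_probe(const struct marker *m, va_list ap)
{
    size_t off;

    snprintf(last_event, sizeof(last_event), "%s: ", m->name);
    off = strlen(last_event);
    vsnprintf(last_event + off, sizeof(last_event) - off, m->format, ap);
}
```

A marker described as { "subsys_event", "value %d from %s", 0, generic_probe } produces nothing until its active field is set. After that, MARK(&m, 42, "cpu0") hands both arguments to the generic probe without any marker-specific probe code.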

Markers are thus used for simple ad hoc instrumentation. Tracepoints are typically used when a formal hook is desired at an important location in the code. A tracepoint in the code is less visually invasive than a marker since it only contains the relevant arguments (no format string). Furthermore, a tracepoint is a general hooking mechanism which may be used for different purposes, one of which is tracing. The disadvantage is that, for each tracepoint, the developer must provide a prior definition and a corresponding probe. For tracing purposes, the probe connected to a tracepoint may either call the format string based event writing function or, when more performance is desired (e.g. for very frequent events), the lower-level functions for directly writing each argument.
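The prior definition required for a tracepoint can be pictured as a macro that generates a typed hook from a prototype. The sketch below is a simplified user-space analogue: SKETCH_DECLARE_TRACE, PROTO, ARGS and record_switch are hypothetical names, only one probe can be connected, and there is no separate activation step. The generated register_trace_ and trace_ prefixes are a naming choice of this example.

```c
#include <assert.h>
#include <stddef.h>

/* Helpers that let a comma-separated list travel as one macro argument. */
#define PROTO(...) __VA_ARGS__
#define ARGS(...)  __VA_ARGS__

/* Generate a probe type, a registration function and a call site helper. */
#define SKETCH_DECLARE_TRACE(name, proto, args)                  \
    typedef void (*name##_probe_t)(proto);                       \
    static name##_probe_t name##_probe;                          \
    static int register_trace_##name(name##_probe_t probe)       \
    {                                                            \
        assert(probe != NULL);                                   \
        if (name##_probe != NULL)                                \
            return -1; /* a probe is already connected */        \
        name##_probe = probe;                                    \
        return 0;                                                \
    }                                                            \
    static void trace_##name(proto)                              \
    {                                                            \
        if (name##_probe != NULL)                                \
            name##_probe(args);                                  \
    }

/* Would normally live in an include (.h) file. */
SKETCH_DECLARE_TRACE(sched_switch,
                     PROTO(int prev_pid, int next_pid),
                     ARGS(prev_pid, next_pid))

/* Example probe; its argument list must match the declared prototype. */
static int last_prev;
static int last_next;

static void record_switch(int prev_pid, int next_pid)
{
    last_prev = prev_pid;
    last_next = next_pid;
}
```

The instrumented code then contains only trace_sched_switch(prev, next), with no format string. Passing a probe whose argument list does not match the prototype to register_trace_sched_switch is diagnosed by the compiler, which is the benefit of the prototype-based approach.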

Markers have been implemented in the mainline Linux kernel since version 2.6.24, and Tracepoints were added later in 2.6.28-rc1. They are used as a generic mechanism to statically insert hooks. These hooks are used by several kernel components including the LTTng tracer and SystemTap.

Kernel Tracepoints represent key locations in the kernel where hooking is desirable, and they require community approval before they are accepted. They are never exposed to user space, to avoid making kernel code instrumentation part of the user-space API. Instead, the tracing probes connected to the tracepoints provide the interface to the user-space trace analysis tools. Kernel Markers are typically used within Kernel Tracepoint probes, for ad hoc tracing, or for regular tracing in less important locations, possibly replacing printk statements.

Markers and Tracepoints have not yet been implemented for user space tracing. The API discussions and refinements were conducted for the Kernel implementation and are being complemented by the corresponding work on the LTTng kernel tracer. Once this is completed, Markers and Tracepoints will be implemented in a similar manner for user-space tracing and LTTng will be extended accordingly from its current limited user-space tracing facility.

Google proposed a set of tracepoints to the Linux community, explaining that these core kernel tracepoints have in many cases been sufficient to solve most of the problems they faced. The list is as follows:

  • syscall_entry/exit
  • irq_entry/exit
  • irq_softirq_entry/exit
  • irq_softirq_raise
  • irq_tasklet_{low,high}_entry/exit
  • sched_kthread_stop
  • sched_wait_task
  • sched_wakeup
  • sched_switch
  • sched_migrate_task
  • sched_process_free
  • sched_process_exit
  • sched_process_wait
  • sched_process_fork
  • sched_signal_send
  • timer_itimer_expired
  • timer_itimer_set
  • timer_set
  • timer_update_time
  • timer_timeout
  • wait_on_page_start/end
  • memory_handle_fault_entry/exit
  • page_alloc
  • page_free
  • net_dev_xmit
  • net_dev_receive
  • swap_in
  • swap_out

Detailed thread: http://lwn.net/Articles/290203/
