What is tracing?

Tracing is a technique used to understand what is going on in a system in order to debug or monitor it. A tracer is the software used for tracing. Tracing can help diagnose a wide range of bugs that are otherwise extremely difficult to track down, such as performance problems in complex parallel systems or real-time systems.

Tracing is similar to logging: it consists of recording events that happen in a system. However, compared to logging, it usually records much lower-level events that occur much more frequently. Tracers must therefore be optimized to handle a lot of data while having a small impact on the system. Tracing typically generates thousands of events per second. Traces frequently contain millions of events and range in size from many megabytes to tens of gigabytes.
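
To make the idea of low-overhead event recording concrete, here is a minimal sketch in Python. It assumes one common approach, a fixed-size in-memory ring buffer that overwrites the oldest events when full; the text above does not prescribe this design, and the names RingTracer, record, snapshot and dropped are hypothetical, chosen only for illustration. A production tracer would be far more optimized than this.

```python
import time
from collections import deque


class RingTracer:
    """Hypothetical in-memory tracer that keeps only the newest events."""

    def __init__(self, capacity=1024):
        # A deque with maxlen discards the oldest item when a new one
        # is appended to a full buffer, so memory use stays bounded.
        self._events = deque(maxlen=capacity)
        self.dropped = 0  # number of events overwritten so far

    def record(self, name, **fields):
        """Store one event with a monotonic timestamp in nanoseconds."""
        if len(self._events) == self._events.maxlen:
            self.dropped += 1  # the oldest event is about to be overwritten
        self._events.append((time.monotonic_ns(), name, fields))

    def snapshot(self):
        """Return the buffered events, oldest first."""
        return list(self._events)


# Illustrative use: six events recorded into a buffer of four slots.
tracer = RingTracer(capacity=4)
for i in range(6):
    tracer.record("syscall_entry", nr=i)
```

Bounding the buffer keeps the cost of each recorded event small and predictable, at the price of losing the oldest events when the consumer cannot keep up. Counting the overwritten events lets a reader of the trace know that data is missing.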

Traces may include events from the operating system kernel (IRQ handler entry/exit, system call entry/exit, scheduling activity, network activity, etc.). They may also include events from any application.

The list of events in a trace may be read manually, like a log file, for the maximum level of detail. However, trace analyzers and viewers are available to produce graphs and statistics from this enormous amount of data. These programs must be specially designed to process the data that traces contain quickly.
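
As a rough illustration of what a trace analyzer does, the following Python sketch computes two simple statistics from a tiny trace: the number of events of each type, and the duration of each system call obtained by pairing entry and exit events per thread. The event layout (timestamp in nanoseconds, event name, thread id), the sample data and the function name summarize are all assumptions made for this example; real trace formats and analyzers differ.

```python
from collections import Counter

# Hypothetical trace: (timestamp_ns, event_name, thread_id), sorted by time.
trace = [
    (1000, "syscall_entry", 1),
    (1200, "sched_switch", 1),
    (1500, "syscall_entry", 2),
    (1800, "syscall_exit", 1),
    (2100, "syscall_exit", 2),
]


def summarize(events):
    """Return (event counts by name, list of system call durations in ns)."""
    counts = Counter(name for _, name, _ in events)
    pending = {}    # thread id -> timestamp of its unmatched syscall_entry
    durations = []  # one entry per matched entry/exit pair
    for ts, name, tid in events:
        if name == "syscall_entry":
            pending[tid] = ts
        elif name == "syscall_exit" and tid in pending:
            durations.append(ts - pending.pop(tid))
    return counts, durations


counts, durations = summarize(trace)
```

The function makes a single pass over the events and keeps only one pending timestamp per thread, which is the kind of streaming design that stays practical when a trace holds millions of events.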