Sampling

From TracingWiki


Sampling can collect a large amount of data about a running program while disturbing it very little. At regular intervals, the unmodified program is interrupted by an external source (e.g. a programmable interval timer) and a number of parameters are sampled, such as the currently executing program, the execution address, and the execution mode. The overhead of sampling can be controlled by changing the sampling frequency. There is, however, a trade-off: a higher sampling frequency yields a more accurate profile but also a higher overhead.
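As a rough illustration of timer-driven sampling, the following Python sketch asks the operating system for a signal after every `interval` seconds of CPU time and, on each signal, counts the function that was interrupted. It is a minimal sketch under several assumptions: a Unix-like system (it relies on `signal.setitimer`), a single-threaded program running on the main thread, and sampling of Python function names rather than machine addresses. The names `profile` and `busy` are hypothetical and only serve the example.

```python
import collections
import signal
import time


def profile(workload, interval=0.001):
    """Run workload() and count which function each timer tick interrupts."""
    samples = collections.Counter()

    def on_tick(signum, frame):
        # `frame` is the Python frame that was executing when the signal
        # was handled; it plays the role of the sampled execution address.
        if frame is not None:
            samples[frame.f_code.co_name] += 1

    previous = signal.signal(signal.SIGPROF, on_tick)
    # ITIMER_PROF counts CPU time used by the process (user and system).
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        workload()
    finally:
        # Stop the timer before restoring the old handler, so that no
        # SIGPROF arrives while the default (terminating) action is active.
        signal.setitimer(signal.ITIMER_PROF, 0)
        if previous is not None:
            signal.signal(signal.SIGPROF, previous)
    return samples


def busy(seconds=0.2):
    """Hypothetical workload: spin until `seconds` of CPU time are used."""
    end = time.process_time() + seconds
    total = 0
    while time.process_time() < end:
        total += 1
    return total
```

Calling `profile(busy)` returns a counter whose entries should be dominated by `busy`. Lowering `interval` gives more samples per second of CPU time at the cost of more interruptions, which is the trade-off described above; the effective resolution is also bounded by the kernel's timer granularity.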

GProf, part of the GNU binary utilities, uses the operating system's virtual timer to get interrupted every 0.01 s or 0.001 s of CPU time. At every interrupt, the address of the currently executing instruction is read and the counter associated with the region containing that address is incremented. Typically, each region covers 4 bytes of the program's instruction address space. This requires one 4-byte counter per 4 bytes of program text, so the sampling data takes as much memory as the program's binary code itself. GProf can also instrument each function entry in order to produce a call graph. As an example of the cost, running Gzip 1.2.4 on a 64 MB log file takes 28.16 s of elapsed time. With sampling, the same task takes 28.30 s (about 0.5% more), and with both sampling and function-entry instrumentation, the time rises to 29.88 s (about 6.1% more).
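The mapping from a sampled address to its counter can be sketched as follows. This is an illustrative model of the scheme described above, not GProf's actual source code: the class name `PcHistogram`, its methods, and the example addresses are assumptions made for the example.

```python
BYTES_PER_REGION = 4  # region size quoted in the text
COUNTER_SIZE = 4      # bytes per counter, also from the text


class PcHistogram:
    """One counter per fixed-size region of the program text."""

    def __init__(self, low_pc, high_pc):
        # [low_pc, high_pc) is the address range of the program text.
        self.low_pc = low_pc
        self.high_pc = high_pc
        size = high_pc - low_pc
        # Round up so that a partial last region still gets a counter.
        regions = (size + BYTES_PER_REGION - 1) // BYTES_PER_REGION
        self.counts = [0] * regions

    def record(self, pc):
        """Increment the counter of the region containing address pc."""
        if self.low_pc <= pc < self.high_pc:
            self.counts[(pc - self.low_pc) // BYTES_PER_REGION] += 1
        # Samples outside the program text are ignored in this sketch.

    def memory_bytes(self):
        """Memory used by the counters themselves."""
        return len(self.counts) * COUNTER_SIZE


# Hypothetical 16-byte text segment starting at 0x1000.
hist = PcHistogram(0x1000, 0x1010)
for pc in (0x1000, 0x1003, 0x1004, 0x2000):
    hist.record(pc)
```

Here the first two samples fall in the same 4-byte region, the third in the next region, and the last lies outside the text and is dropped. With 4-byte counters and 4-byte regions, `memory_bytes()` equals the size of the text segment, which matches the memory estimate given above.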

Most modern CPUs provide performance counters which may be used for sampling. Time sampling is achieved by counting clock cycles and requesting an interrupt after 1,000,000 clock cycles or so. Similarly, it is possible to count instruction or data cache misses, branch delays and many other metrics which may impact performance. By sampling the address of the currently executing instruction at each interrupt, one eventually gets a histogram of the relevant metric (time, cache misses, branch delays, etc.) over the different program regions. It then becomes easy to determine which program section consumes the most CPU time or causes the most cache misses. OProfile is a performance analysis tool for Linux based on performance-counter sampling. A graphical frontend to OProfile, oprofileui, is available.
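The principle of counter-overflow sampling can be simulated in a few lines. The sketch below is a software model only: it does not program any real performance counter, and the function `sample_on_overflow`, the trace format, and the addresses and event counts are all hypothetical. Each trace entry gives an instruction address and the number of counted events (cycles, cache misses, and so on) attributed to it; every time the running count reaches `period`, one sample is charged to the address executing at that moment.

```python
from collections import Counter


def sample_on_overflow(trace, period):
    """Return a histogram {address: samples} for an event trace.

    trace  -- iterable of (pc, events) pairs, in execution order
    period -- number of events between two simulated interrupts
    """
    histogram = Counter()
    pending = 0  # events counted since the last simulated interrupt
    for pc, events in trace:
        pending += events
        while pending >= period:
            # The counter overflowed while this instruction was running:
            # charge one sample to its address and restart the count.
            pending -= period
            histogram[pc] += 1
    return histogram


# Hypothetical loop body executed 300 times; the instruction at 0x404
# generates most of the events (for instance, most of the cache misses).
loop_body = [(0x400, 1), (0x404, 7), (0x408, 2)]
histogram = sample_on_overflow(loop_body * 300, period=3)
hottest_pc, hottest_samples = histogram.most_common(1)[0]
```

In this model the histogram ends up proportional to the events each address generates, so `hottest_pc` identifies the instruction at `0x404`. With a much larger `period`, fewer samples are taken and the estimate becomes coarser, which mirrors the accuracy versus overhead trade-off of real counters.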
