The Linux Trace Toolkit: next generation is an open source software
toolkit which you can use to trace the Linux kernel, user applications,
and user libraries at the same time.
LTTng consists of:
Kernel modules to trace the Linux kernel.
Shared libraries to trace C/C++ user applications.
Java packages to trace Java applications which use java.util.logging
or Apache log4j 1.2.
A Python package to trace Python applications which use the standard
logging package.
A kernel module to trace shell scripts and other user applications
without a dedicated instrumentation mechanism.
Daemons and a command-line tool, lttng, to control the
LTTng tracers.
Open source documentation
Note:This documentation is open: its source is available in a
public Git repository.
Should you find any error in the content of this text, any grammatical
mistake, or any dead link, we would be very grateful if you would file a
GitHub issue for it or, even better, contribute a patch to this
documentation by creating a pull request.
Target audience
The material of this documentation is appropriate for intermediate to
advanced software developers working in a Linux environment and
interested in efficient software tracing. LTTng is also worth a
try for students interested in the inner mechanics of their systems.
If you do not have a programming background, you may wish to skip
everything related to instrumentation, which often requires at least
some programming language skills.
What’s in this documentation?
The LTTng Documentation is divided into the following sections:
Nuts and bolts explains the
rudiments of software tracing and the rationale behind the
LTTng project.
Skip this section if you’re familiar with software tracing and with the
LTTng project.
Installation describes the steps to
install the LTTng packages on common Linux distributions and from
their sources.
Skip this section if you already properly installed LTTng on your target
system.
Quick start is a concise guide to
getting started quickly with LTTng kernel and user space tracing.
We recommend this section if you’re new to LTTng or to software tracing
in general.
Skip this section if you’re not new to LTTng.
Core concepts explains the concepts at
the heart of LTTng.
It’s a good idea to become familiar with the core concepts
before attempting to use the toolkit.
Components of LTTng describes the various components
of the LTTng machinery, like the daemons, the libraries, and the
command-line interface.
Instrumentation shows different ways to
instrument user applications and the Linux kernel.
Instrumenting source code is essential to provide a meaningful
source of events.
Skip this section if you don’t have a programming background.
Tracing control is divided into topics
which demonstrate how to use the vast array of features that
LTTng 2.12 offers.
Glossary is a specialized dictionary of terms related
to LTTng or to the field of software tracing.
Convention
Function names, parameter names, variable names, command names, argument
names, file system paths, file names, and other literal strings are
written using a monospace typeface in this document. An italic
word within such a block is a placeholder, usually described in the
following sentence.
Practical tips and notes are given throughout the document using the
following style:
Tip:Read the tips.
Terminal boxes are used to show command lines:
$ echo Command line as a regular user
# echo Command line as the `root` user
Command lines which you need to execute as a privileged user start with
the # prompt or with sudo. Other command lines start with the $
prompt.
Acknowledgements
A few people made the online LTTng Documentation possible.
Philippe Proulx wrote most of the content, created
the diagrams, and formatted the document. He’s the current maintainer
of the LTTng Documentation.
LTTng 2.12 bears the name Ta Meilleure, a Northeast IPA
beer brewed by Lagabière. Translating to “Your
best one”, this beer gives out strong aromas of passion fruit, lemon,
and peaches. Tastewise, expect a lot of fruit, a creamy texture, and a
smooth lingering hop bitterness.
New features and changes in LTTng 2.12:
Tracing control
Clear the contents of one or more tracing sessions
without having to destroy and reconfigure them
with the new lttng-clear(1) command.
This is especially useful to clear the tracing data of a tracing session
between attempts to reproduce a problem.
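For example, to clear the tracing data of a tracing session named
my-session (the name here is only a placeholder):
$ lttng clear my-session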
Before LTTng 2.12, the lttng-track(1) and
lttng-untrack(1) commands used to add and remove process IDs
(PIDs) to a whitelist so that LTTng would only trace processes with
specific PIDs.
LTTng 2.12 adds Unix user IDs (UIDs) and Unix group IDs
(GIDs) to the available process attributes to track.
You can specify numeric user/group IDs and user/group names to track,
for example:
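The following is one possible invocation; the user and group names and
IDs below are placeholders:
$ lttng track --kernel --vuid=jdoe,1001 --vgid=admins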
While you can also track UIDs and GIDs with the
--filter option of the enable-event command,
this dedicated process attribute tracking approach reduces tracing
overhead and prevents the creation of sub-buffers for
the users and groups which LTTng doesn’t track.
In the command manual pages, the term “whitelist” is renamed to
“inclusion set” to clarify the concept.
The relay daemon can now maintain many files
virtually opened without using as many file descriptors (FD). It does
so by closing and reopening FDs as needed.
This feature is meant as a workaround for users who can’t bump the
system limit because of permission restrictions.
The new --fd-pool-size relay daemon option
sets the maximum number of simultaneously opened file descriptors
(using the soft RLIMIT_NOFILE resource limit of the process by
default; see getrlimit(2)).
By default, the relay daemon writes its traces under a predefined
directory hierarchy,
$LTTNG_HOME/lttng-traces/host/session/domain, with:
host: Hostname of the traced system.
session: Tracing session name.
domain: Tracing domain (kernel or ust).
Change this hierarchy to group traces by tracing session name rather
than by hostname
($LTTNG_HOME/lttng-traces/session/host/domain) with the
new --group-output-by-session option of the
relay daemon.
This feature is especially useful if you’re tracing two or more hosts,
having different hostnames, which share the same tracing session name as
part of their configuration. In this scenario, you can use a descriptive
tracing session name (for example, connection-hang) across a fleet of
machines streaming to a single relay daemon.
The relay daemon has a new --working-directory
option to override its working directory.
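As a sketch, the relay daemon options above can be combined on a single
command line (the values and path here are arbitrary):
$ lttng-relayd --fd-pool-size=4096 --group-output-by-session --working-directory=/tmp/relayd-cwd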
Linux kernel tracing
New instrumentation hooks to trace the entry and exit tracepoints of
the network reception code paths of the Linux kernel.
Use the resulting event records to identify the bounds of a network
reception and link the events that occur in the interim (for example,
wake-ups) to a specific network reception instance. You can also
analyze the latency of the network stack thanks to those event records.
The thread field of the irqaction structure, which specifies the
process to wake up when a threaded interrupt request (IRQ) occurs, is
now part of the lttng_statedump_interrupt event record.
Use this information to discover which processes handle the various
IRQs. You can also associate the events occurring in the context of
those processes with their respective IRQ.
New lttng_statedump_cpu_topology tracepoint to record the active
CPU/NUMA topology.
Use this information to discover which CPUs are SMT siblings or part of
the same socket. As of LTTng 2.12, only the x86 architecture
is supported since all architectures describe their topologies
differently.
The architecture field of the tracepoint is statically defined and
exists for all architecture implementations. Analysis tools can
therefore anticipate the layout of the event record.
Event record example:
lttng_statedump_cpu_topology:
    architecture: x86
    cpu_id: 0
    vendor: GenuineIntel
    family: 6
    model: 142
    model_name: Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz
    physical_id: 0
    core_id: 0
    cores: 2
New product UUID metadata environment field, product_uuid,
which LTTng copies from the
Desktop
Management Interface (DMI).
Use this environment field to uniquely identify a machine (virtual or
physical) in order to correlate traces from multiple virtual machines.
What is LTTng? As its name suggests, the Linux Trace Toolkit: next
generation is a modern toolkit for tracing Linux systems and
applications. So your first question might be:
what is tracing?
As the history of software engineering progressed and led to what
we now take for granted—complex, numerous and
interdependent software applications running in parallel on
sophisticated operating systems like Linux—the authors of such
components, software developers, began feeling a natural
urge to have tools that would ensure the robustness and good performance
of their masterpieces.
One major achievement in this field is, inarguably, the
GNU debugger (GDB),
an essential tool for developers to find and fix bugs. But even the best
debugger won’t help make your software run faster, and nowadays, faster
software means either more work done by the same hardware, or cheaper
hardware for the same work.
A profiler is often the tool of choice to identify performance
bottlenecks. Profiling is suitable to identify where performance is
lost in a given software. The profiler outputs a profile, a statistical
summary of observed events, which you may use to discover which
functions took the most time to execute. However, a profiler won’t
report why some identified functions are the bottleneck. Bottlenecks
might only occur when specific conditions are met, conditions that are
sometimes impossible to capture by a statistical profiler, or impossible
to reproduce with an application altered by the overhead of an
event-based profiler. For a thorough investigation of software
performance issues, a history of execution is essential, with the
recorded values of variables and context fields you choose, and
with as little influence as possible on the instrumented software. This
is where tracing comes in handy.
Tracing is a technique used to understand what goes on in a running
software system. The software used for tracing is called a tracer,
which is conceptually similar to a tape recorder. When recording,
specific instrumentation points placed in the software source code
generate events that are saved on a giant tape: a trace file. You
can trace user applications and the operating system at the same time,
opening the possibility of resolving a wide range of problems that would
otherwise be extremely challenging.
Tracing is often compared to logging. However, tracers and loggers are
two different tools, serving two different purposes. Tracers are
designed to record much lower-level events that occur much more
frequently than log messages, often in the range of thousands per
second, with very little execution overhead. Logging is more appropriate
for a very high-level analysis of less frequent events: user accesses,
exceptional conditions (errors and warnings, for example), database
transactions, instant messaging communications, and such. Simply put,
logging is one of the many use cases that can be satisfied with tracing.
The list of recorded events inside a trace file can be read manually
like a log file for the maximum level of detail, but it is generally
much more interesting to perform application-specific analyses to
produce reduced statistics and graphs that are useful to resolve a
given problem. Trace viewers and analyzers are specialized tools
designed to do this.
In the end, this is what LTTng is: a powerful, open source set of
tools to trace the Linux kernel and user applications at the same time.
LTTng is composed of several components actively maintained and
developed by its community.
Other software tracers exist for Linux; notable ones include the
following.
DTrace
The dtrace tool interprets user scripts and is responsible for
loading code into the Linux kernel for further execution and collecting
the outputted data.
eBPF
A subsystem in the Linux kernel in which a virtual machine can
execute programs passed from the user space to the kernel.
You can attach such programs to tracepoints and kprobes thanks to a
system call, and they can output data to the user space when executed
thanks to different mechanisms (pipe, VM register values, and eBPF maps,
to name a few).
sysdig
Like SystemTap, uses scripts to analyze Linux kernel events.
You write scripts, or chisels in the jargon of sysdig, in Lua and
sysdig executes them while it traces the system or afterwards. The
interface of sysdig is the sysdig command-line tool as well as the
text UI-based csysdig tool.
SystemTap
A Linux kernel and user space tracer which uses custom user scripts
to produce plain text traces.
SystemTap converts the scripts to the C language, and then compiles them
as Linux kernel modules which are loaded to produce trace data. The
primary user interface of SystemTap is the stap command-line tool.
The main distinctive features of LTTng are that it produces correlated
kernel and user space traces, and that it does so with lower overhead
than other solutions. It produces trace files in the
CTF format, a file format optimized
for the production and analysis of multi-gigabyte data.
LTTng is the result of more than 10 years of active open source
development by a community of passionate developers. LTTng is currently
available on major desktop and server Linux distributions.
The main interface for tracing control is a single command-line tool
named lttng. The latter can create several tracing sessions, enable
and disable events on the fly, filter events efficiently with custom
user expressions, start and stop tracing, and much more. LTTng can
record the traces on the file system or send them over the network, and
keep them totally or partially. You can view the traces once tracing
becomes inactive or in real-time.
LTTng is a set of software components which interact to
instrument the Linux kernel and user applications, and
to control tracing (start and stop
tracing, enable and disable event rules, and the rest). Those
components are bundled into the following packages:
LTTng-tools
Libraries and command-line interface to control tracing.
LTTng-modules
Linux kernel modules to instrument and trace the kernel.
LTTng-UST
Libraries and Java/Python packages to instrument and trace user
applications.
Most distributions mark the LTTng-modules and LTTng-UST packages as
optional when installing LTTng-tools (which is always required). In the
following sections, we always provide the steps to install all three,
but note that:
You only need to install LTTng-modules if you intend to trace the
Linux kernel.
You only need to install LTTng-UST if you intend to trace user
applications.
Availability of LTTng 2.12 for major Linux distributions as of 3 November 2023.
To install LTTng-tools 2.12 and LTTng-UST 2.12 on
Alpine Linux 3.12, Alpine Linux 3.13, Alpine Linux 3.14,
or Alpine Linux 3.15:
Add the LTTng packages:
# apk add lttng-tools
# apk add lttng-ust-dev
Download, build, and install the latest LTTng-modules 2.12:
$ cd $(mktemp -d) &&
wget http://lttng.org/files/lttng-modules/lttng-modules-latest-2.12.tar.bz2 &&
tar -xf lttng-modules-latest-2.12.tar.bz2 &&
cd lttng-modules-2.12.* &&
make &&
sudo make modules_install &&
sudo depmod -a
LTTng 2.12 recipes are available in the
openembedded-core
layer for Yocto Project 3.2 Gatesgarth and Yocto Project 3.3 Hardknott
under the following names:
lttng-tools
lttng-modules
lttng-ust
With BitBake, the simplest way to include LTTng recipes in your target
image is to add them to IMAGE_INSTALL_append in conf/local.conf:
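A minimal conf/local.conf addition could look like this (the recipe
names are the ones listed above; note the leading space, which BitBake
requires in appended values):
IMAGE_INSTALL_append = " lttng-tools lttng-modules lttng-ust"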
Download, build, and install the latest LTTng-modules 2.12:
$ cd $(mktemp -d) &&
wget http://lttng.org/files/lttng-modules/lttng-modules-latest-2.12.tar.bz2 &&
tar -xf lttng-modules-latest-2.12.tar.bz2 &&
cd lttng-modules-2.12.* &&
make &&
sudo make modules_install &&
sudo depmod -a
Download, build, and install the latest LTTng-UST 2.12:
$ cd $(mktemp -d) &&
wget http://lttng.org/files/lttng-ust/lttng-ust-latest-2.12.tar.bz2 &&
tar -xf lttng-ust-latest-2.12.tar.bz2 &&
cd lttng-ust-2.12.* &&
./configure &&
make &&
sudo make install &&
sudo ldconfig
Add --disable-numa to ./configure if you don’t have
numactl.
Java and Python application tracing
Important:If you need to instrument and trace Java applications, pass the --enable-java-agent-jul,
--enable-java-agent-log4j, or --enable-java-agent-all options to the
configure script, depending on which Java logging framework you use.
If you need to instrument and trace Python applications, pass the --enable-python-agent option to the
configure script. You can set the PYTHON environment variable to the
path to the Python interpreter for which to install the LTTng-UST Python
agent package.
Note:By default, LTTng-UST libraries are installed to
/usr/local/lib, which is the de facto directory in which to
keep self-compiled and third-party libraries.
If your linker or dynamic loader doesn't search /usr/local/lib by
default, do one of the following:
Append /usr/local/lib to the LD_LIBRARY_PATH environment
variable.
Pass the -L/usr/local/lib and -Wl,-rpath,/usr/local/lib options to
gcc(1), g++(1), or clang(1).
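For example, assuming a single-file application app.c instrumented with
LTTng-UST (file names are placeholders):
$ export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
$ gcc -o app app.c -L/usr/local/lib -Wl,-rpath,/usr/local/lib -llttng-ust -ldl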
Download, build, and install the latest LTTng-tools 2.12:
$ cd $(mktemp -d) &&
wget http://lttng.org/files/lttng-tools/lttng-tools-latest-2.12.tar.bz2 &&
tar -xf lttng-tools-latest-2.12.tar.bz2 &&
cd lttng-tools-2.12.* &&
./configure &&
make &&
sudo make install &&
sudo ldconfig
Tip:The vlttng tool can do all the
previous steps automatically for a given version of LTTng and confine
the installed files in a specific directory. This can be useful to test
LTTng without installing it on your system.
Linux kernel module signature enforcement applies when either of the
following is true:
The Linux kernel which boots is configured with
CONFIG_MODULE_SIG_FORCE.
The Linux kernel boots with a command line containing
module.sig_enforce=1.
Example:root user running lttng-sessiond which fails to load a required kernel module due to the signature enforcement policies.
# lttng-sessiond
Warning: No tracing group detected
modprobe: ERROR: could not insert 'lttng_ring_buffer_client_discard': Key was rejected by service
Error: Unable to load required module lttng-ring-buffer-client-discard
Warning: No kernel tracer available
There are several methods to enroll trusted keys for signing modules
that are built from source. The precise details vary from one Linux
version to another, and distributions may have their own mechanisms. For
example, DKMS may autogenerate a key and
sign modules, but the key isn’t automatically enrolled.
See
Kernel
module signing facility and the documentation of your distribution
to learn more about signing Linux kernel modules.
The following command lines start with the # prompt because you need
root privileges to trace the Linux kernel. You can also trace the kernel
as a regular user if your Unix user is a member of the
tracing group.
Create a tracing session which writes its traces
to /tmp/my-kernel-trace:
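The following command is one way to do this; the session name
my-kernel-session is only a suggestion:
# lttng create my-kernel-session --output=/tmp/my-kernel-trace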
List the available kernel tracepoints and system calls:
# lttng list --kernel
# lttng list --kernel --syscall
Create event rules which match the desired instrumentation
point names, for example the sched_switch and sched_process_fork
tracepoints, and the open(2) and close(2) system calls:
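The following commands are a possible way to do this with the
instrumentation point names listed above:
# lttng enable-event --kernel sched_switch,sched_process_fork
# lttng enable-event --kernel --syscall open,close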
hello.c
#include <stdio.h>
#include "hello-tp.h"

int main(int argc, char *argv[])
{
    int x;

    puts("Hello, World!\nPress Enter to continue...");

    /*
     * The following getchar() call is only placed here for the purpose
     * of this demonstration, to pause the application in order for
     * you to have time to list its tracepoints. It's not needed
     * otherwise.
     */
    getchar();

    /*
     * A tracepoint() call.
     *
     * Arguments, as defined in hello-tp.h:
     *
     * 1. Tracepoint provider name   (required)
     * 2. Tracepoint name            (required)
     * 3. my_integer_arg             (first user-defined argument)
     * 4. my_string_arg              (second user-defined argument)
     *
     * Notice the tracepoint provider and tracepoint names are
     * NOT strings: they are in fact parts of variables that the
     * macros in hello-tp.h create.
     */
    tracepoint(hello_world, my_first_tracepoint, 23, "hi there!");

    for (x = 0; x < argc; ++x) {
        tracepoint(hello_world, my_first_tracepoint, x, argv[x]);
    }

    puts("Quitting now!");
    tracepoint(hello_world, my_first_tracepoint, x * x, "x^2");

    return 0;
}
Build the application:
$ gcc -c hello.c
Link the application with the tracepoint provider package,
liblttng-ust, and libdl:
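Assuming the tracepoint provider package was compiled to hello-tp.o,
the link command could be:
$ gcc -o hello hello.o hello-tp.o -llttng-ust -ldl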
By default, LTTng saves the traces in
$LTTNG_HOME/lttng-traces/name-date-time,
where name is the tracing session name. The
LTTNG_HOME environment variable defaults to $HOME if not set.
Babeltrace 2
A rich, flexible trace manipulation toolkit which includes
a versatile command-line interface
(babeltrace2),
a C library,
and Python 3 bindings
so that you can easily process or convert an LTTng trace with
your own script.
The Babeltrace 2 project ships with a
plugin
which supports the format of the traces which LTTng produces,
CTF.
Trace Compass
A graphical user interface for viewing and analyzing any type of
logs or traces, including those of LTTng.
Note:This section assumes that LTTng saved the traces it recorded
during the previous tutorials to their default location, in the
$LTTNG_HOME/lttng-traces directory. The LTTNG_HOME
environment variable defaults to $HOME if not set.
The text output of babeltrace2 is useful to isolate events by simple matching using
grep(1) and similar utilities. However, more elaborate filters, such
as keeping only event records with a field value falling within a
specific range, are not trivial to write using a shell. Moreover,
reductions and even the most basic computations involving multiple event
records are virtually impossible to implement.
Fortunately, Babeltrace 2 ships with
Python 3 bindings
which make it easy to read the event records of an LTTng trace
sequentially and compute the desired information.
The following script accepts an LTTng Linux kernel trace path as its
first argument and prints the short names of the top five running
processes on CPU 0 during the whole trace:
top5proc.py
import bt2
import sys
import collections


def top5proc():
    # Get the trace path from the first command-line argument.
    it = bt2.TraceCollectionMessageIterator(sys.argv[1])

    # This counter dictionary will hold execution times:
    #
    #     Task command name -> Total execution time (ns)
    exec_times = collections.Counter()

    # This holds the last `sched_switch` timestamp.
    last_ts = None

    for msg in it:
        # We only care about event messages.
        if type(msg) is not bt2._EventMessageConst:
            continue

        # Event of the event message.
        event = msg.event

        # Keep only `sched_switch` events.
        if event.cls.name != 'sched_switch':
            continue

        # Keep only events which occurred on CPU 0.
        if event.packet.context_field['cpu_id'] != 0:
            continue

        # Event timestamp (ns).
        cur_ts = msg.default_clock_snapshot.ns_from_origin

        if last_ts is None:
            # We start here.
            last_ts = cur_ts

        # (Short) name of the previous task command.
        prev_comm = str(event.payload_field['prev_comm'])

        # Initialize an entry in our dictionary if not yet done.
        if prev_comm not in exec_times:
            exec_times[prev_comm] = 0

        # Compute previous command execution time.
        diff = cur_ts - last_ts

        # Update execution time of this command.
        exec_times[prev_comm] += diff

        # Update last timestamp.
        last_ts = cur_ts

    # Print top 5.
    for name, ns in exec_times.most_common(5):
        print('{:20}{} s'.format(name, ns / 1e9))


if __name__ == '__main__':
    top5proc()
Run this script:
$ python3 top5proc.py /tmp/my-kernel-trace/kernel
Output example:
swapper/0           48.607245889 s
chromium            7.192738188 s
pavucontrol         0.709894415 s
Compositor          0.660867933 s
Xorg.bin            0.616753786 s
Note that swapper/0 is the “idle” process of CPU 0 on Linux;
since we weren’t using the CPU that much when tracing, its first
position in the list makes sense.
From a user’s perspective, the LTTng system is built on a few concepts,
or objects, on which the lttng command-line tool
operates by sending commands to the session daemon.
Understanding how those objects relate to each other is key to mastering
the toolkit.
Those attributes and objects are completely isolated between different
tracing sessions.
A tracing session is analogous to a cash machine session:
the operations you do on the banking system through the cash machine do
not alter the data of other users of the same system. In the case of
the cash machine, a session lasts as long as your bank card is inside.
In the case of LTTng, a tracing session lasts from the lttng create
command to the lttng destroy command.
LTTng can send the generated trace data to different locations. The
tracing session mode dictates where to send it. The following modes
are available in LTTng 2.12:
Local mode
LTTng writes the traces to the file system of the machine it traces
(target system).
Network streaming mode
LTTng sends the traces over the network to a
relay daemon running on a remote system.
Snapshot mode
LTTng doesn’t write the traces by default.
Instead, you can request LTTng to take a snapshot,
that is, a copy of the current sub-buffers of the tracing session, and
to write it to the file system of the target or to send it over the
network to a relay daemon running on a remote system.
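As a sketch, you select a mode when you create the tracing session (the
session names and the relay daemon URL below are placeholders):
$ lttng create my-local-session
$ lttng create my-streaming-session --set-url=net://remote-host
$ lttng create my-snapshot-session --snapshot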
A tracing domain is a namespace for event sources. A tracing domain
has its own properties and features.
There are currently five available tracing domains:
Linux kernel
User space
java.util.logging (JUL)
log4j
Python
You must specify a tracing domain when using some commands to avoid
ambiguity. For example, since all the domains support named tracepoints
as event sources (instrumentation points that you manually insert in the
source code), you need to specify a tracing domain when
creating an event rule because all the
tracing domains could have tracepoints with the same names.
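For example, the lttng enable-event command takes a domain option to
remove that ambiguity (the instrumentation point names below are
placeholders):
$ lttng enable-event --kernel sched_switch
$ lttng enable-event --userspace my_provider:my_tracepoint
$ lttng enable-event --jul my_java_logger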
You can create channels in the Linux kernel and user space
tracing domains. The other tracing domains have a single default
channel.
A channel is an object which is responsible for a set of ring buffers.
Each ring buffer is divided into multiple sub-buffers. When an LTTng
tracer emits an event, it can record it to one or more
sub-buffers. The attributes of a channel determine what to do when
there’s no space left for a new event record because all sub-buffers
are full, where to send a full sub-buffer, and other behaviours.
A channel is always associated to a tracing domain. The
java.util.logging (JUL), log4j, and Python tracing domains each have
a default channel which you can’t configure.
A channel also owns event rules. When an LTTng tracer emits
an event, it records it to the sub-buffers of all
the enabled channels with a satisfied event rule, as long as those
channels are part of active tracing sessions.
In the user space tracing domain, two buffering schemes are available
when you create a channel:
Per-user buffering
Allocate one set of ring buffers—one per CPU—shared by all the
instrumented processes of each Unix user.
Per-process buffering
Allocate one set of ring buffers—one per CPU—for each
instrumented process.
The per-process buffering scheme tends to consume more memory than the
per-user option because systems generally have more instrumented
processes than Unix users running instrumented processes. However, the
per-process buffering scheme ensures that one process having a high
event throughput won’t fill all the shared sub-buffers of the same
user, only its own.
The Linux kernel tracing domain has only one available buffering scheme
which is to allocate a single set of ring buffers for the whole system.
This scheme is similar to the per-user option, but with a single, global
user “running” the kernel.
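As a sketch, the lttng enable-channel command selects the buffering
scheme of a user space channel (the channel names below are
placeholders):
$ lttng enable-channel --userspace --buffers-uid my-per-user-channel
$ lttng enable-channel --userspace --buffers-pid my-per-process-channel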
When an event occurs, LTTng records it to a specific sub-buffer (yellow
arc in the following animations) of the ring buffer of a specific
channel. When there’s no space left in a sub-buffer, the tracer marks it
as consumable (red) and another, empty sub-buffer starts receiving the
following event records. A consumer daemon
eventually consumes the marked sub-buffer (returns to white).
In an ideal world, sub-buffers are consumed faster than they are filled,
as it is the case in the previous animation. In the real world,
however, all sub-buffers can be full at some point, leaving no space to
record the following events.
By default, LTTng-modules and LTTng-UST are non-blocking tracers: when
no empty sub-buffer is available, it is acceptable to lose event records
when the alternative would be to cause substantial delays in the
execution of the instrumented application. LTTng privileges performance
over integrity; it aims at perturbing the target system as little as
possible in order to make tracing of subtle race conditions and rare
interrupt cascades possible.
Since LTTng 2.10, the LTTng user space tracer, LTTng-UST, supports
a blocking mode. See the blocking timeout example to learn how to use the blocking mode.
When it comes to losing event records because no empty sub-buffer is
available, or because the blocking timeout is
reached, the event record loss mode of the channel determines what to
do. The available event record loss modes are:
Discard mode
Drop the newest event records until the tracer releases a sub-buffer.
This is the only available mode when you specify a
blocking timeout.
Overwrite mode
Clear the sub-buffer containing the oldest event records and start
writing the newest event records there.
This mode is sometimes called flight recorder mode because it’s
similar to a
flight recorder:
always keep a fixed amount of the latest data.
Which mechanism you should choose depends on your context: do you
prioritize the newest or the oldest event records in the ring buffer?
Beware that, in overwrite mode, the tracer abandons a whole sub-buffer
as soon as there's no space left for a new event record, whereas in
discard mode, the tracer only discards the event record that doesn't
fit.
In discard mode, LTTng increments a count of lost event records when an
event record is lost and saves this count to the trace. Since
LTTng 2.8, in overwrite mode, LTTng writes to a given sub-buffer
its sequence number within its data stream. With a local,
network streaming, or live tracing session, a trace reader can use such
sequence numbers to report lost packets. In overwrite mode, LTTng
doesn’t write to the trace the exact number of lost event records in
those lost sub-buffers.
Trace analyses can use saved discarded event record and sub-buffer
(packet) counts of the trace to decide whether or not to perform the
analyses even if trace data is known to be missing.
There are a few ways to decrease your probability of losing event
records.
Sub-buffer count and size shows
how to fine-tune the sub-buffer count and size of a channel to virtually
stop losing event records, though at the cost of greater memory usage.
When you create a channel, you can
set its number of sub-buffers and their size.
Note that there is noticeable CPU overhead introduced when
switching sub-buffers (marking a full one as consumable and switching
to an empty one for the following events to be recorded). Knowing this,
the following list presents a few practical situations along with how
to configure the sub-buffer count and size for them:
High event throughput: In general, prefer bigger sub-buffers to
lower the risk of losing event records.
The number of sub-buffers is only meaningful if you create the channel
in overwrite mode: in this case, if a sub-buffer overwrite happens, the
other sub-buffers are left unaltered.
Low event throughput: In general, prefer smaller sub-buffers
since the risk of losing event records is low.
Because events occur less frequently, the sub-buffer switching frequency
should remain low and thus the overhead of the tracer shouldn’t be a
problem.
Low memory system: If your target system has a low memory
limit, prefer fewer sub-buffers first, then smaller ones.
Even if the system is limited in memory, you want to keep the
sub-buffers as big as possible to avoid a high sub-buffer switching
frequency.
Note that LTTng uses CTF as its trace format,
which means event data is very compact. For example, the average
LTTng kernel event record weighs about 32 bytes. Thus, a
sub-buffer size of 1 MiB is considered big.
The previous situations highlight the major trade-off between a few big
sub-buffers and more, smaller sub-buffers: sub-buffer switching
frequency vs. how much data is lost in overwrite mode. Assuming a
constant event throughput and using the overwrite mode, the two
following configurations have the same ring buffer total size:
Two sub-buffers of 4 MiB each: Expect a very low sub-buffer
switching frequency, but if a sub-buffer overwrite happens, half of
the event records so far (4 MiB) are definitely lost.
Eight sub-buffers of 1 MiB each: Expect four times the
overhead of the tracer as the previous configuration, but if a
sub-buffer overwrite happens, only an eighth of the event records so
far (1 MiB) is definitely lost.
In discard mode, the sub-buffer count parameter is pointless: use two
sub-buffers and set their size according to the requirements of your
situation.
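For example, a possible channel configuration with an explicit
sub-buffer count and size (the channel name and values are arbitrary):
$ lttng enable-channel --userspace --num-subbuf=4 --subbuf-size=2M my-channel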
The switch timer period is an important configurable attribute of
a channel to ensure periodic sub-buffer flushing.
When the switch timer expires, a sub-buffer switch happens. Set
the switch timer period attribute when you
create a channel to ensure that LTTng
consumes and commits trace data to trace files or to a distant relay
daemon periodically in case of a low event throughput.
This attribute is also convenient when you use big sub-buffers to cope
with a sporadic high event throughput, even if the throughput is
normally low.
By default, the LTTng tracers use a notification mechanism to signal a
full sub-buffer so that a consumer daemon can consume it. When such
notifications must be avoided, for example in real-time applications,
use the read timer of the channel instead. When the read timer fires,
the consumer daemon checks for full, consumable
sub-buffers.
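A sketch combining both timers when creating a kernel channel (the
periods are in microseconds and arbitrary here):
$ lttng enable-channel --kernel --switch-timer=500000 --read-timer=200000 my-channel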
By default, trace files can grow as large as needed. Set the maximum
size of each trace file that a channel writes when you
create a channel. When the size of a
trace file reaches the fixed maximum size of the channel, LTTng creates
another file to contain the next event records. LTTng appends a file
count to each trace file name in this case.
If you set the trace file size attribute when you create a channel, the
maximum number of trace files that LTTng creates is unlimited by
default. To limit them, set a maximum number of trace files. When the
number of trace files reaches the fixed maximum count of the channel,
the oldest trace file is overwritten. This mechanism is called trace
file rotation.
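For example, a possible trace file rotation setup (the values are
arbitrary; the size is in bytes here):
$ lttng enable-channel --kernel --tracefile-size=1048576 --tracefile-count=10 my-channel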
Important:Even if you don’t limit the trace file count, you can’t assume that
LTTng doesn’t manage any trace file.
In other words, there is no safe way to know if LTTng still holds a
given trace file open with the trace file rotation feature.
The only way to obtain an unmanaged, self-contained LTTng trace before
you destroy the tracing session
is with the tracing session rotation feature
(available since LTTng 2.11).
You always attach an event rule to a channel when you create
it.
When an event passes the conditions of an event rule, LTTng records it
in one of the sub-buffers of the attached channel.
The available conditions, as of LTTng 2.12, are:
The event rule is enabled.
The type of the instrumentation point is T.
The name of the instrumentation point (sometimes called event name)
matches N, but isn’t E.
The log level of the instrumentation point is as severe as L, or
is exactly L.
The fields of the payload of the event satisfy a filter
expression F.
As you can see, all the conditions but the dynamic filter are related to
the status of the event rule or to the instrumentation point, not to the
occurring events. This is why, without a filter, checking if an event
passes an event rule isn’t a dynamic task: when you create or modify an
event rule, all the tracers of its tracing domain enable or disable the
instrumentation points themselves once. This is possible because the
attributes of an instrumentation point (type, name, and log level) are
defined statically. In other words, without a dynamic filter, the tracer
doesn’t evaluate the arguments of an instrumentation point unless it
matches an enabled event rule.
Note that, for LTTng to record an event, the channel to
which a matching event rule is attached must also be enabled, and the
tracing session owning this channel must be active
(started).
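As a sketch, a single lttng enable-event invocation can combine several
of the conditions above (all names and the filter expression below are
placeholders):
$ lttng enable-event --userspace 'my_provider:*' --exclude=my_provider:internal \
        --loglevel=TRACE_WARNING --filter='len > 100'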
The second T in LTTng stands for toolkit: it would be wrong
to call LTTng a simple tool since it is composed of multiple
interacting components. This section describes those components,
explains their respective roles, and shows how they connect together to
form the LTTng ecosystem.
The following diagram shows how the most important components of LTTng
interact with user applications, the Linux kernel, and you:
The LTTng project incorporates:
LTTng-tools: Libraries and command-line interface to
control tracing sessions.
The LTTng control library, liblttng-ctl, is used to communicate
with a session daemon using a C API that hides the
underlying details of the protocol. liblttng-ctl is part of LTTng-tools.
Use liblttng-ctl in C or C++ source code by including its
“master” header:
#include <lttng/lttng.h>
Some objects are referenced by name (C string), such as tracing
sessions, but most of them require you to create a handle first using
lttng_create_handle().
As of LTTng 2.12, the best available developer documentation for
liblttng-ctl is its installed header files. Every function and structure is
thoroughly documented.
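As a minimal sketch only (check the installed headers for the exact
signatures and fields), a program could list the existing tracing
sessions like this:
#include <stdio.h>
#include <stdlib.h>

#include <lttng/lttng.h>

int main(void)
{
    struct lttng_session *sessions;

    /* lttng_list_sessions() allocates the array and returns the number
     * of sessions found, or a negative LTTng error code. */
    int count = lttng_list_sessions(&sessions);

    if (count < 0) {
        fprintf(stderr, "Error: %s\n", lttng_strerror(count));
        return 1;
    }

    for (int i = 0; i < count; i++) {
        printf("%s (%s)\n", sessions[i].name, sessions[i].path);
    }

    free(sessions);
    return 0;
}
Link such a program with liblttng-ctl, for example with the -llttng-ctl
option.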
The user space tracing library, liblttng-ust (see lttng-ust(3)),
is the LTTng user space tracer. It receives commands from a
session daemon, for example to
enable and disable specific instrumentation points, and writes event
records to ring buffers shared with a
consumer daemon.
liblttng-ust is part of LTTng-UST.
Public C header files are installed beside liblttng-ust to
instrument any C or C++ application.
LTTng-UST agents, which are regular Java and Python
packages, use their own library providing tracepoints which is
linked with liblttng-ust.
An application or library doesn’t have to initialize liblttng-ust
manually: its constructor does the necessary tasks to properly register
to a session daemon. The initialization phase also enables the
instrumentation points matching the event rules that you
already created.
The LTTng-UST Java and Python agents are regular Java and Python
packages which add LTTng tracing capabilities to the
native logging frameworks. The LTTng-UST agents are part of LTTng-UST.
In the case of Python, the standard
logging package
is supported. Both Python 2 and Python 3 modules can import the
LTTng-UST Python agent package.
The applications using the LTTng-UST agents are in the
java.util.logging (JUL),
log4j, and Python tracing domains.
Both agents use the same mechanism to trace the log statements. When an
agent initializes, it creates a log handler that attaches to the root
logger. The agent also registers to a session daemon.
When the application executes a log statement, the root logger passes it
to the log handler of the agent. The log handler of the agent calls a
native function in a tracepoint provider package shared library linked
with liblttng-ust, passing the formatted log message and
other fields, like its logger name and its log level. This native
function contains a user space instrumentation point, hence tracing the
log statement.
The log level condition of an
event rule is considered when tracing
a Java or a Python application, and it’s compatible with the standard
JUL, log4j, and Python log levels.
The LTTng kernel modules are a set of Linux kernel modules
which implement the kernel tracer of the LTTng project. The LTTng
kernel modules are part of LTTng-modules.
The LTTng kernel modules include:
A set of probe modules.
Each module attaches to a specific subsystem
of the Linux kernel using its tracepoint instrumentation points. There are
also modules to attach to the entry and return points of the Linux
system call functions.
Ring buffer modules.
A ring buffer implementation is provided as kernel modules. The LTTng
kernel tracer writes to the ring buffer; a
consumer daemon reads from the ring buffer.
The LTTng kernel tracer module.
The LTTng logger module.
The LTTng logger module implements the special /proc/lttng-logger
(and /dev/lttng-logger since LTTng 2.11) files so that any
executable can generate LTTng events by opening and writing to those
files.
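For example, a shell script could emit an LTTng event like this (the
message is arbitrary):
$ echo 'Hello from a shell script' > /proc/lttng-logger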
Generally, you don’t have to load the LTTng kernel modules manually
(using modprobe(8), for example): a root session daemon loads the necessary modules when starting. If you have extra
probe modules, you can specify to load them to the session daemon on the
command line. See also
Linux kernel module signature.
The LTTng kernel modules are installed in
/usr/lib/modules/release/extra by default, where release is
the kernel release (see uname --kernel-release).
The session daemon, lttng-sessiond(8), is a daemon responsible for
managing tracing sessions and for controlling the various components of
LTTng. The session daemon is part of LTTng-tools.
The session daemon sends control requests to and receives control
responses from:
Any instance of the user space tracing library first registers to
a session daemon. Then, the session daemon can send requests to
this instance, such as:
Get the list of tracepoints.
Share an event rule so that the user space tracing library
can enable or disable tracepoints. Amongst the possible conditions
of an event rule is a filter expression which liblttng-ust evaluates
when an event occurs.
Share channel attributes and ring buffer locations.
The session daemon and the user space tracing library use a Unix
domain socket for their communication.
The session daemon doesn’t send and receive trace data: this is the
role of the consumer daemon and
relay daemon. It does, however, generate the
CTF metadata stream.
Each Unix user can have its own session daemon instance. The
tracing sessions which different session daemons manage are completely
independent.
The root user’s session daemon is the only one which is
allowed to control the LTTng kernel tracer, and its spawned consumer
daemon is the only one which is allowed to consume trace data from the
LTTng kernel tracer. Note, however, that any Unix user which is a member
of the tracing group is allowed
to create channels in the
Linux kernel tracing domain, and thus to trace the Linux
kernel.
The lttng command-line tool automatically starts a
session daemon when using its create command if none is currently
running. You can also start the session daemon manually.
The consumer daemon, lttng-consumerd, is a daemon which shares
ring buffers with user applications or with the LTTng kernel modules to
collect trace data and send it to some location (on disk or to a
relay daemon over the network). The consumer daemon
is part of LTTng-tools.
You don’t start a consumer daemon manually: a consumer daemon is always
spawned by a session daemon as soon as you create an
event rule, that is, before you start tracing. When you kill
its owner session daemon, the consumer daemon also exits because it is
the child process of the session daemon. Command-line options of
lttng-sessiond(8) target the consumer daemon process.
There are up to two running consumer daemons per Unix user, whereas only
one session daemon can run per user. This is because each process can be
either 32-bit or 64-bit: if the target system runs a mixture of 32-bit
and 64-bit processes, it is more efficient to have separate
corresponding 32-bit and 64-bit consumer daemons. The root user is an
exception: it can have up to three running consumer daemons: 32-bit
and 64-bit instances for its user applications, and one more
reserved for collecting kernel trace data.
The relay daemon, lttng-relayd(8), is a daemon acting as a bridge
between remote session and consumer daemons, local trace files, and a
remote live trace viewer. The relay daemon is part of LTTng-tools.
The main purpose of the relay daemon is to implement a receiver of
trace data over the network.
This is useful when the target system doesn’t have much file system
space to record trace files locally.
The relay daemon is also a server to which a
live trace viewer can
connect. The live trace viewer sends requests to the relay daemon to
receive trace data as the target system emits events. The
communication protocol is named LTTng live; it is used over TCP
connections.
Note that you can start the relay daemon on the target system directly.
This is the setup of choice when the use case is to view events as
the target system emits them without the need of a remote system.
There are many examples of tracing and monitoring in our everyday life:
You have access to real-time and historical weather reports and
forecasts thanks to weather stations installed around the country.
You know your heart is safe thanks to an electrocardiogram.
You make sure not to drive your car too fast and to have enough fuel
to reach your destination thanks to gauges visible on your dashboard.
All the previous examples have something in common: they rely on
instruments. Without the electrodes attached to the surface of your
body skin, cardiac monitoring is futile.
LTTng, as a tracer, is no different from those real life examples. If
you’re about to trace a software system or, in other words, record its
history of execution, you better have instrumentation points in the
subject you’re tracing, that is, the actual software.
Various ways were developed to instrument a piece of software for LTTng
tracing. The most straightforward one is to manually place
instrumentation points, called tracepoints, in the source code of the
software. It is also possible to add instrumentation points dynamically
in the Linux kernel tracing domain.
If you’re only interested in tracing the Linux kernel, your
instrumentation needs are probably already covered by the built-in
Linux kernel tracepoints of LTTng. You may also wish
to trace a user application which is already instrumented for LTTng
tracing. In such cases, skip this whole section and read the topics of
the Tracing control section.
Many methods are available to instrument a piece of software for LTTng
tracing; the following sections describe them.
A tracepoint provider is a set of compiled functions which provide
tracepoints to an application, the type of instrumentation point
supported by LTTng-UST. Those functions can emit events with
user-defined fields and serialize those events as event records to one
or more LTTng-UST channel sub-buffers. The tracepoint()
macro, which you insert in the source code of a user application, calls those functions.
A tracepoint provider package is an object file (.o) or a shared
library (.so) which contains one or more tracepoint providers.
Its source files are one or more tracepoint provider header files and a
tracepoint provider package source file.
Your tracepoint provider name must be unique amongst all the possible
tracepoint provider names used on the same target system. We
suggest to include the name of your project or company in the name,
for example, org_lttng_my_project_tpp.
A tracepoint definition defines, for a given tracepoint:
Its input arguments. They are the macro parameters that the
tracepoint() macro accepts for this particular tracepoint
in the source code of the user application.
Its output event fields. They are the sources of event fields
that form the payload of any event that the execution of the
tracepoint() macro emits for this particular tracepoint.
Create a tracepoint definition by using the
TRACEPOINT_EVENT() macro below the #include <lttng/tracepoint.h>
line in the
tracepoint provider header file template.
The syntax of the TRACEPOINT_EVENT() macro is:
TRACEPOINT_EVENT() macro syntax.
TRACEPOINT_EVENT(
    /* Tracepoint provider name */
    provider_name,

    /* Tracepoint name */
    tracepoint_name,

    /* Input arguments */
    TP_ARGS(
        arguments
    ),

    /* Output event fields */
    TP_FIELDS(
        fields
    )
)
This tracepoint emits events named provider_name:tracepoint_name.
Event name length limitation
Important:The concatenation of the tracepoint provider name and the
tracepoint name must not exceed 254 characters. If it does, the
instrumented application compiles and runs, but LTTng throws multiple
warnings and you could experience serious issues.
The syntax of the TP_ARGS() macro is:
TP_ARGS() macro syntax.
TP_ARGS(type, arg_name)
Replace:
type with the C type of the argument.
arg_name with the argument name.
You can repeat type and arg_name up to 10 times to have more
than one argument.
Example:TP_ARGS() usage with three arguments.
TP_ARGS(
    int, count,
    float, ratio,
    const char *, query
)
The TP_ARGS() and TP_ARGS(void) forms are valid to create a
tracepoint definition with no input arguments.
The TP_FIELDS() macro contains a list of
ctf_*() macros. Each ctf_*() macro defines one event field. See
lttng-ust(3) for a complete description of the available ctf_*()
macros. A ctf_*() macro specifies the type, size, and byte order of
one event field.
Each ctf_*() macro takes an argument expression parameter. This is a
C expression that the tracer evaluates at the tracepoint() macro site
in the source code of the application. This expression provides the
source of data of a field. The argument expression can include input
argument names listed in the TP_ARGS() macro.
Each ctf_*() macro also takes a field name parameter. Field names
must be unique within a given tracepoint definition.
Here’s a complete tracepoint definition example:
Example:Tracepoint definition.
The following tracepoint definition defines a tracepoint which takes
three input arguments and has four output event fields.
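As a sketch consistent with that description (the provider, tracepoint,
argument, and field names below are hypothetical):
TRACEPOINT_EVENT(
    my_provider,
    my_tracepoint,

    /* Three input arguments */
    TP_ARGS(
        int, userid,
        size_t, len,
        const char *, path
    ),

    /* Four output event fields */
    TP_FIELDS(
        ctf_integer(int, userid_field, userid)
        ctf_integer(size_t, len_field, len)
        ctf_integer_hex(size_t, len_hex_field, len)
        ctf_string(path_field, path)
    )
)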
A tracepoint class is a class of tracepoints which share the same
output event field definitions. A tracepoint instance is one
instance of such a defined tracepoint class, with its own tracepoint
name.
The TRACEPOINT_EVENT() macro is actually a
shorthand which defines both a tracepoint class and a tracepoint
instance at the same time.
When you build a tracepoint provider package, the C or C++ compiler
creates one serialization function for each tracepoint class. A
serialization function is responsible for serializing the event fields
of a tracepoint to a sub-buffer when tracing.
For various performance reasons, when your situation requires multiple
tracepoint definitions with different names, but with the same event
fields, we recommend that you manually create a tracepoint class
and instantiate as many tracepoint instances as needed. One positive
effect of such a design, amongst other advantages, is that all
tracepoint instances of the same tracepoint class reuse the same
serialization function, thus reducing
cache pollution.
Example:Use a tracepoint class and tracepoint instances.
Consider the following three tracepoint definitions:
In this case, we create three tracepoint classes, with one implicit
tracepoint instance for each of them: get_account, get_settings, and
get_transaction. However, they all share the same event field names
and types. Hence three identical, yet independent serialization
functions are created when you build the tracepoint provider package.
A better design choice is to define a single tracepoint class and three
tracepoint instances:
/* The tracepoint class */
TRACEPOINT_EVENT_CLASS(
    /* Tracepoint provider name */
    my_app,

    /* Tracepoint class name */
    my_class,

    /* Input arguments */
    TP_ARGS(
        int, userid,
        size_t, len
    ),

    /* Output event fields */
    TP_FIELDS(
        ctf_integer(int, userid, userid)
        ctf_integer(size_t, len, len)
    )
)

/* The tracepoint instances */
TRACEPOINT_EVENT_INSTANCE(
    /* Tracepoint provider name */
    my_app,

    /* Tracepoint class name */
    my_class,

    /* Tracepoint name */
    get_account,

    /* Input arguments */
    TP_ARGS(
        int, userid,
        size_t, len
    )
)

TRACEPOINT_EVENT_INSTANCE(my_app, my_class, get_settings,
    TP_ARGS(int, userid, size_t, len))

TRACEPOINT_EVENT_INSTANCE(my_app, my_class, get_transaction,
    TP_ARGS(int, userid, size_t, len))
Assigning different levels of severity to tracepoint definitions can
be useful: when you create an event rule,
you can target tracepoints having a log level as severe as a specific
value.
The concept of LTTng-UST log levels is similar to the levels found
in typical logging frameworks:
In a logging framework, the log level is given by the function
or method name you use at the log statement site: debug(),
info(), warn(), error(), and so on.
In LTTng-UST, you statically assign the log level to a tracepoint
definition; any tracepoint() macro invocation which refers to
this definition has this log level.
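As a sketch, you assign a log level with the TRACEPOINT_LOGLEVEL()
macro, placed after the corresponding tracepoint definition in the
tracepoint provider header file (the names below are hypothetical):
TRACEPOINT_LOGLEVEL(my_provider, my_tracepoint, TRACE_WARNING)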
A tracepoint provider package source file is a C source file which
includes a tracepoint provider header file to expand its
macros into event serialization and other functions.
Use the following tracepoint provider package source file template:
Tracepoint provider package source file template.
#define TRACEPOINT_CREATE_PROBES

#include "tp.h"
Replace tp.h with the name of your tracepoint provider header file. You
may also include more than one tracepoint provider header file here to
create a tracepoint provider package holding more than one tracepoint
provider.
The tracepoint() macro takes at least two parameters: the tracepoint
provider name and the tracepoint name. The corresponding tracepoint
definition defines the other parameters.
Example:tracepoint() usage.
The following tracepoint definition defines a
tracepoint which takes two input arguments and has two output event
fields.
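As a sketch matching that description (hypothetical names; the
tracepoint provider header boilerplate is omitted):
Tracepoint provider header file excerpt (tp.h).
TRACEPOINT_EVENT(
    my_provider,
    my_tracepoint,
    TP_ARGS(
        int, my_int_arg,
        const char *, my_str_arg
    ),
    TP_FIELDS(
        ctf_integer(int, my_int_field, my_int_arg)
        ctf_string(my_str_field, my_str_arg)
    )
)
Application source file (app.c).
#include "tp.h"

int main(void)
{
    tracepoint(my_provider, my_tracepoint, 23, "Hello, World!");
    return 0;
}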
Note how the source code of the application includes
the tracepoint provider header file containing the tracepoint
definitions to use, tp.h.
Example:tracepoint() usage with a complex tracepoint definition.
Consider this complex tracepoint definition, where multiple event
fields refer to the same input arguments in their argument expression
parameter:
Tracepoint provider header file.
/* For `struct stat` */
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

TRACEPOINT_EVENT(
    my_provider,
    my_tracepoint,
    TP_ARGS(
        int, my_int_arg,
        char *, my_str_arg,
        struct stat *, st
    ),
    TP_FIELDS(
        ctf_integer(int, my_constant_field, 23 + 17)
        ctf_integer(int, my_int_arg_field, my_int_arg)
        ctf_integer(int, my_int_arg_field2, my_int_arg * my_int_arg)
        ctf_integer(int, sum4_field, my_str_arg[0] + my_str_arg[1] +
                    my_str_arg[2] + my_str_arg[3])
        ctf_string(my_str_arg_field, my_str_arg)
        ctf_integer_hex(off_t, size_field, st->st_size)
        ctf_float(double, size_dbl_field, (double) st->st_size)
        ctf_sequence_text(char, half_my_str_arg_field, my_str_arg,
                          size_t, strlen(my_str_arg) / 2)
    )
)
Refer to this tracepoint definition with the tracepoint() macro in
the source code of your application like this:
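The following sketch is consistent with the event record values shown
below; it assumes the tracepoint provider package above is built and
linked separately:
/* For stat() and `struct stat` */
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#include "tp.h"

int main(void)
{
    struct stat st;

    if (stat("/etc/fstab", &st) < 0) {
        return 1;
    }

    tracepoint(my_provider, my_tracepoint, 23, "Hello, World!", &st);
    return 0;
}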
If you look at the event record that LTTng writes when tracing this
program, assuming the file size of /etc/fstab is 301 bytes,
it should look like this:
Event record fields

Field name               Field value
my_constant_field        40
my_int_arg_field         23
my_int_arg_field2        529
sum4_field               389
my_str_arg_field         Hello, World!
size_field               0x12d
size_dbl_field           301.0
half_my_str_arg_field    Hello,
Sometimes, the arguments you pass to tracepoint() are expensive to
compute—they use the call stack, for example. To avoid this computation
when the tracepoint is disabled, use the tracepoint_enabled() and
do_tracepoint() macros.
The syntax of the tracepoint_enabled() and do_tracepoint() macros
is:
tracepoint_enabled() and do_tracepoint() macros syntax.
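In general form (see lttng-ust(3)):
tracepoint_enabled(provider_name, tracepoint_name)
do_tracepoint(provider_name, tracepoint_name, ...)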
tracepoint_enabled() returns a non-zero value if the tracepoint named
tracepoint_name from the provider named provider_name is enabled
at run time.
do_tracepoint() is like tracepoint(), except that it doesn’t check
if the tracepoint is enabled. Using tracepoint() with
tracepoint_enabled() is dangerous since tracepoint() also contains
the tracepoint_enabled() check, thus a race condition is
possible in this situation:
Possible race condition when using tracepoint_enabled() with tracepoint().
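A sketch of the problematic pattern (the names and the prepare_stuff()
helper below are hypothetical):
if (tracepoint_enabled(my_provider, my_tracepoint)) {
    stuff = prepare_stuff();
}

tracepoint(my_provider, my_tracepoint, stuff);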
If the tracepoint is enabled after the condition, then stuff isn’t
prepared: the emitted event will either contain wrong data, or the whole
application could crash (segmentation fault, for example).
Note:Neither tracepoint_enabled() nor do_tracepoint() have an
STAP_PROBEV() call. If you need it, you must emit
this call yourself.
Once you have one or more tracepoint provider header files and a tracepoint provider package source file,
create the tracepoint provider package by compiling its source
file. From here, multiple build and run scenarios are possible. The
following table shows common application and library configurations
along with the required command lines to achieve them.
In the following diagrams, we use the following file names:
app
Executable application.
app.o
Application object file.
tpp.o
Tracepoint provider package object file.
tpp.a
Tracepoint provider package archive file.
libtpp.so
Tracepoint provider package shared object file.
emon.o
User library object file.
libemon.so
User library shared object file.
We use the following symbols in the diagrams of the table below:
We assume that . is part of the LD_LIBRARY_PATH environment
variable in the following instructions.
Common tracepoint provider package scenarios.
The instrumented application is statically linked with
the tracepoint provider package object.
To build the tracepoint provider package object file:
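Assuming the tracepoint provider package source file is tpp.c in the
current directory, one possibility is:
$ gcc -c -I. tpp.c
You can then compile the application and link it with the resulting
tpp.o, liblttng-ust, and libdl (for example, gcc -c app.c followed by
gcc -o app app.o tpp.o -llttng-ust -ldl).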
The --export-dynamic option passed to the linker is necessary for the
dynamically loaded library to “see” the tracepoint symbols defined in
the application.
If your instrumented application calls fork(2), clone(2),
or BSD’s rfork(2), without a following exec(3)-family
system call, you must preload the liblttng-ust-fork.so shared
object when you start the application.
$
LD_PRELOAD=liblttng-ust-fork.so ./my-app
If your tracepoint provider package is
a shared library which you also preload, you must put both
shared objects in LD_PRELOAD:
If your instrumented application closes one or more file descriptors
which it did not open itself, you must preload the
liblttng-ust-fd.so shared object when you start the application:
$
LD_PRELOAD=liblttng-ust-fd.so ./my-app
Typical use cases include closing all the file descriptors after
fork(2) or rfork(2) and buggy applications doing
“double closes”.
On some distributions, LTTng-UST ships with a
pkg-config
metadata file. If this is your case, then use pkg-config to
build an application on the command line:
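For example, a sketch assuming the application and tracepoint provider
package object files are app.o and tpp.o:
$
gcc -o app app.o tpp.o $(pkg-config --cflags --libs lttng-ust)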
In order to trace a 32-bit application running on a 64-bit system,
LTTng must use a dedicated 32-bit
consumer daemon.
The following steps show how to build and install a 32-bit consumer
daemon, which is not part of the default 64-bit LTTng build, how to
build and install the 32-bit LTTng-UST libraries, and how to build and
link an instrumented 32-bit application in that context.
To build a 32-bit instrumented application for a 64-bit target system,
assuming you have a fresh target system with no installed Userspace RCU
or LTTng packages:
Download, build, and install a 32-bit version of Userspace RCU:
$
cd $(mktemp -d) &&
wget http://lttng.org/files/urcu/userspace-rcu-latest-0.9.tar.bz2 &&
tar -xf userspace-rcu-latest-0.9.tar.bz2 &&
cd userspace-rcu-0.9.* &&
./configure --libdir=/usr/local/lib32 CFLAGS=-m32 &&
make &&
sudo make install &&
sudo ldconfig
Using the package manager of your distribution, or from source,
install the 32-bit versions of the following dependencies of
LTTng-tools and LTTng-UST:
Download, build, and install a 32-bit version of the latest
LTTng-UST 2.12:
$
cd $(mktemp -d) &&
wget http://lttng.org/files/lttng-ust/lttng-ust-latest-2.12.tar.bz2 &&
tar -xf lttng-ust-latest-2.12.tar.bz2 &&
cd lttng-ust-2.12.* &&
./configure --libdir=/usr/local/lib32 \
CFLAGS=-m32 CXXFLAGS=-m32 \
LDFLAGS='-L/usr/local/lib32 -L/usr/lib32' &&
make &&
sudo make install &&
sudo ldconfig
Note:Depending on your distribution,
32-bit libraries could be installed at a different location than
/usr/lib32. For example, Debian is known to install
some 32-bit libraries in /usr/lib/i386-linux-gnu.
In this case, make sure to set LDFLAGS to all the
relevant 32-bit library paths, for example:
$
LDFLAGS='-L/usr/lib/i386-linux-gnu -L/usr/lib32'
Download the latest LTTng-tools 2.12, build, and install
the 32-bit consumer daemon:
$
cd $(mktemp -d) &&
wget http://lttng.org/files/lttng-tools/lttng-tools-latest-2.12.tar.bz2 &&
tar -xf lttng-tools-latest-2.12.tar.bz2 &&
cd lttng-tools-2.12.* &&
./configure --libdir=/usr/local/lib32 CFLAGS=-m32 CXXFLAGS=-m32 \
LDFLAGS='-L/usr/local/lib32 -L/usr/lib32' \
--disable-bin-lttng --disable-bin-lttng-crash \
--disable-bin-lttng-relayd --disable-bin-lttng-sessiond &&
make &&
cd src/bin/lttng-consumerd &&
sudo make install &&
sudo ldconfig
From your distribution or from source,
install the 64-bit versions of
LTTng-UST and Userspace RCU.
Download, build, and install the 64-bit version of the
latest LTTng-tools 2.12:
$
cd $(mktemp -d) &&
wget http://lttng.org/files/lttng-tools/lttng-tools-latest-2.12.tar.bz2 &&
tar -xf lttng-tools-latest-2.12.tar.bz2 &&
cd lttng-tools-2.12.* &&
./configure --with-consumerd32-libdir=/usr/local/lib32 \
--with-consumerd32-bin=/usr/local/lib32/lttng/libexec/lttng-consumerd &&
make &&
sudo make install &&
sudo ldconfig
Pass the following options to gcc(1), g++(1), or clang(1)
when linking your 32-bit application:
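A sketch of the expected options (the exact 32-bit library paths depend
on your distribution, as noted above):
-m32
-L/usr/lib32 -L/usr/local/lib32
-Wl,-rpath,/usr/lib32,-rpath,/usr/local/lib32
-llttng-ust -ldl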
tracef(3) is a small LTTng-UST API designed for quick,
printf(3)-like instrumentation without the burden of
creating and
building
a tracepoint provider package.
To use tracef() in your application:
In the C or C++ source files where you need to use tracef(),
include <lttng/tracef.h>:
#include<lttng/tracef.h>
In the source code of the application, use tracef() like you would
use printf(3):
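For example (my_integer and my_string are hypothetical variables of
your application):
tracef("my message: %d (%s)", my_integer, my_string);
Then link the application with liblttng-ust as usual, for example with
the -llttng-ust linker option.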
Important:The tracef() utility function was developed to make user space tracing
super simple, albeit with notable disadvantages compared to
user-defined tracepoints:
All the emitted events have the same tracepoint provider and
tracepoint names, respectively lttng_ust_tracef and event.
There is no static type checking.
The only event record field you actually get, named msg, is a string
potentially containing the values you passed to tracef()
using your own format string. This also means that you can’t filter
events with a custom expression at run time because there are no
isolated fields.
Since tracef() uses the vasprintf(3) function of the
C standard library behind the scenes to format the strings at run
time, its expected performance is lower than with user-defined
tracepoints, which don’t require a conversion to a string.
Taking this into consideration, tracef() is useful for some quick
prototyping and debugging, but you shouldn’t consider it for any
permanent and serious applicative instrumentation.
The liblttng-ust-libc-wrapper.so and
liblttng-ust-pthread-wrapper.so helpers
add instrumentation to some C standard library and POSIX
threads functions.
Functions instrumented by preloading liblttng-ust-libc-wrapper.so.
When you preload the shared object, it replaces the functions listed
in the previous tables by wrappers which contain tracepoints and call
the replaced functions.
The liblttng-ust-cyg-profile*.so helpers can add instrumentation
to the entry and exit points of functions.
gcc(1) and clang(1) have an option named
-finstrument-functions
which generates instrumentation calls for entry and exit to functions.
The LTTng-UST function tracing helpers,
liblttng-ust-cyg-profile.so and
liblttng-ust-cyg-profile-fast.so, take advantage of this feature
to add tracepoints to the two generated functions (which contain
cyg_profile in their names, hence the name of the helper).
To use the LTTng-UST function tracing helper, the source files to
instrument must be built using the -finstrument-functions compiler
flag.
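For example, a sketch of building an application with function
instrumentation and then preloading the helper when you start it:
$
gcc -finstrument-functions -o app app.c
$
LD_PRELOAD=liblttng-ust-cyg-profile.so ./app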
There are two versions of the LTTng-UST function tracing helper:
liblttng-ust-cyg-profile-fast.so is a lightweight variant
that you should only use when it can be guaranteed that the
complete event stream is recorded without any lost event record.
Any kind of duplicate information is left out.
Assuming no event record is lost, having only the function addresses on
entry is enough to create a call graph, since an event record always
contains the ID of the CPU that generated it.
Use a tool like addr2line(1) to convert function addresses back to
source file names and line numbers.
liblttng-ust-cyg-profile.so is a more robust variant
which also works in use cases where event records might get discarded or
not recorded from application startup.
In these cases, the trace analyzer needs more information to be
able to reconstruct the program flow.
All the tracepoints that this helper provides have the
log level TRACE_DEBUG_FUNCTION (see lttng-ust(3)).
Tip:It’s sometimes a good idea to limit the number of source files that
you compile with the -finstrument-functions option to prevent LTTng
from writing an excessive amount of trace data at run time. When using
gcc(1), use the
-finstrument-functions-exclude-function-list option to avoid
instrumenting the entries and exits of specific functions.
Apache log4j 1.2, since
LTTng 2.6. Note that Apache Log4j 2 isn’t supported.
Note that the methods described below are new in LTTng 2.8.
Previous LTTng versions use another technique.
Note:We use OpenJDK 8 for development
and continuous integration, thus this version is
directly supported. However, the LTTng-UST Java agent is also tested
with OpenJDK 7.
This isn’t strictly necessary, but it is recommended for a clean
disposal of the resources of the handler.
Include the common and JUL-specific JAR files of the LTTng-UST Java agent,
lttng-ust-agent-common.jar and lttng-ust-agent-jul.jar,
in the
class
path when you build the Java application.
The JAR files are typically located in /usr/share/java.
Important:The LTTng-UST Java agent must be
installed for the logging framework your
application uses.
Example:Use the LTTng-UST Java agent for java.util.logging.
Test.java
import java.io.IOException;
import java.util.logging.Handler;
import java.util.logging.Logger;
import org.lttng.ust.agent.jul.LttngLogHandler;

public class Test
{
    private static final int answer = 42;

    public static void main(String[] argv) throws Exception
    {
        // Create a logger
        Logger logger = Logger.getLogger("jello");

        // Create an LTTng-UST log handler
        Handler lttngUstLogHandler = new LttngLogHandler();

        // Add the LTTng-UST log handler to our logger
        logger.addHandler(lttngUstLogHandler);

        // Log at will!
        logger.info("some info");
        logger.warning("some warning");
        Thread.sleep(500);
        logger.finer("finer information; the answer is " + answer);
        Thread.sleep(123);
        logger.severe("error!");

        // Not mandatory, but cleaner
        logger.removeHandler(lttngUstLogHandler);
        lttngUstLogHandler.close();
    }
}
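A sketch of building and running this example, assuming the agent JAR
files are in /usr/share/java:
$
javac -cp /usr/share/java/lttng-ust-agent-common.jar:/usr/share/java/lttng-ust-agent-jul.jar Test.java
$
java -cp /usr/share/java/lttng-ust-agent-common.jar:/usr/share/java/lttng-ust-agent-jul.jar:. Test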
This isn’t strictly necessary, but it is recommended for a clean
disposal of the resources of the appender.
Include the common and log4j-specific JAR
files of the LTTng-UST Java agent, lttng-ust-agent-common.jar and
lttng-ust-agent-log4j.jar, in the
class
path when you build the Java application.
The JAR files are typically located in /usr/share/java.
Important:The LTTng-UST Java agent must be
installed for the logging framework your
application uses.
Example:Use the LTTng-UST Java agent for Apache log4j.
Test.java
import org.apache.log4j.Appender;
import org.apache.log4j.Logger;
import org.lttng.ust.agent.log4j.LttngLogAppender;

public class Test
{
    private static final int answer = 42;

    public static void main(String[] argv) throws Exception
    {
        // Create a logger
        Logger logger = Logger.getLogger("jello");

        // Create an LTTng-UST log appender
        Appender lttngUstLogAppender = new LttngLogAppender();

        // Add the LTTng-UST log appender to our logger
        logger.addAppender(lttngUstLogAppender);

        // Log at will!
        logger.info("some info");
        logger.warn("some warning");
        Thread.sleep(500);
        logger.debug("debug information; the answer is " + answer);
        Thread.sleep(123);
        logger.fatal("error!");

        // Not mandatory, but cleaner
        logger.removeAppender(lttngUstLogAppender);
        lttngUstLogAppender.close();
    }
}
Build this example ($LOG4JPATH is the path to the Apache log4j JAR
file):
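A sketch of the build and run commands, assuming the agent JAR files
are in /usr/share/java:
$
javac -cp /usr/share/java/lttng-ust-agent-common.jar:/usr/share/java/lttng-ust-agent-log4j.jar:$LOG4JPATH Test.java
$
java -cp /usr/share/java/lttng-ust-agent-common.jar:/usr/share/java/lttng-ust-agent-log4j.jar:$LOG4JPATH:. Test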
A Java application-specific context field is a piece of state provided
by the application which you can add, using the
lttng-add-context(1) command, to each event record
produced by the log statements of this application.
For example, a given object might have a current request ID variable.
You can create a context information retriever for this object and
assign a name to this current request ID. You can then, using the
lttng-add-context(1) command, add this context field by name to
the JUL or log4j channel.
To provide application-specific context fields in a Java application:
In the source code of the Java application, import the LTTng-UST
Java agent context classes and interfaces:
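import org.lttng.ust.agent.context.ContextInfoManager;
import org.lttng.ust.agent.context.IContextInfoRetriever;
Then create a context information retriever class which implements the
IContextInfoRetriever interface, for example (a sketch taken from the
complete example below):
private static class MyContextInfoRetriever
implements IContextInfoRetriever
{
    @Override
    public Object retrieveContextInfo(String key)
    {
        if (key.equals("intCtx")) {
            return (short) 17;
        } else if (key.equals("strContext")) {
            return "context value!";
        } else {
            return null;
        }
    }
}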
This retrieveContextInfo() method is the only member of the
IContextInfoRetriever interface. Its role is to return the current
value of a state by name to create a context field. The names of the
context fields and which state variables they return depends on your
specific scenario.
All primitive types and objects are supported as context fields.
When retrieveContextInfo() returns an object, the context field
serializer calls its toString() method to add a string field to
event records. The method can also return null, which means that
no context field is available for the required name.
Register an instance of your context information retriever class to
the context information manager singleton:
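For example (names taken from the complete example below):
IContextInfoRetriever cir = new MyContextInfoRetriever();
ContextInfoManager cim = ContextInfoManager.getInstance();
cim.registerContextInfoRetriever("myRetriever", cir);
When the application is done using the retriever, unregister it from
the context information manager:
cim.unregisterContextInfoRetriever("myRetriever");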
This isn’t strictly necessary, but it is recommended for a clean
disposal of some resources of the manager.
Build your Java application with LTTng-UST Java agent support as
usual, following the procedure for either the JUL or
Apache log4j framework.
Example:Provide application-specific context fields in a Java application.
Test.java
import java.util.logging.Handler;
import java.util.logging.Logger;
import org.lttng.ust.agent.jul.LttngLogHandler;
import org.lttng.ust.agent.context.ContextInfoManager;
import org.lttng.ust.agent.context.IContextInfoRetriever;

public class Test
{
    // Our context information retriever class
    private static class MyContextInfoRetriever
    implements IContextInfoRetriever
    {
        @Override
        public Object retrieveContextInfo(String key)
        {
            if (key.equals("intCtx")) {
                return (short) 17;
            } else if (key.equals("strContext")) {
                return "context value!";
            } else {
                return null;
            }
        }
    }

    private static final int answer = 42;

    public static void main(String args[]) throws Exception
    {
        // Get the context information manager instance
        ContextInfoManager cim = ContextInfoManager.getInstance();

        // Create and register our context information retriever
        IContextInfoRetriever cir = new MyContextInfoRetriever();
        cim.registerContextInfoRetriever("myRetriever", cir);

        // Create a logger
        Logger logger = Logger.getLogger("jello");

        // Create an LTTng-UST log handler
        Handler lttngUstLogHandler = new LttngLogHandler();

        // Add the LTTng-UST log handler to our logger
        logger.addHandler(lttngUstLogHandler);

        // Log at will!
        logger.info("some info");
        logger.warning("some warning");
        Thread.sleep(500);
        logger.finer("finer information; the answer is " + answer);
        Thread.sleep(123);
        logger.severe("error!");

        // Not mandatory, but cleaner
        logger.removeHandler(lttngUstLogHandler);
        lttngUstLogHandler.close();
        cim.unregisterContextInfoRetriever("myRetriever");
    }
}
You can instrument a Python 2 or Python 3 application which
uses the standard
logging package.
Each log statement emits an LTTng event once the
application module imports the
LTTng-UST Python agent package.
To use the LTTng-UST Python agent:
In the source code of the Python application, import the LTTng-UST
Python agent:
import lttngust
The LTTng-UST Python agent automatically adds its logging handler to the
root logger at import time.
Any log statement that the application executes before this import does
not emit an LTTng event.
Important:The LTTng-UST Python agent must be
installed.
Use log statements and logging configuration as usual.
Since the LTTng-UST Python agent adds a handler to the root
logger, you can trace any log statement from any logger.
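For example, a minimal sketch of a traced Python module (the logger
name my-logger is only illustrative):
import lttngust
import logging

logging.basicConfig()
logger = logging.getLogger('my-logger')
logger.info('some message')
logger.debug('some debug message: %d', 23)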
Note:logging.basicConfig(), which adds to the root logger a basic
logging handler which prints to the standard error stream, isn’t
strictly required for LTTng-UST tracing to work, but in versions of
Python preceding 3.2, you could see a warning message which indicates
that no handler exists for the logger my-logger.
When an application imports the LTTng-UST Python agent, the agent tries
to register to a session daemon. Note that you must
start the session daemon before you run the Python
application. If a session daemon is found, the agent tries to register
to it during five seconds, after which the application continues
without LTTng tracing support. Override this timeout value with
the LTTNG_UST_PYTHON_REGISTER_TIMEOUT environment variable
(milliseconds).
If the session daemon stops while a Python application with an imported
LTTng-UST Python agent runs, the agent retries to connect and to
register to a session daemon every three seconds. Override this
delay with the LTTNG_UST_PYTHON_REGISTER_RETRY_DELAY environment
variable.
The lttng-tracer Linux kernel module, part of
LTTng-modules, creates the special LTTng logger files
/proc/lttng-logger and /dev/lttng-logger (since
LTTng 2.11) when it’s loaded. Any application can write text data
to any of those files to emit an LTTng event.
The LTTng logger is the quickest method—not the most efficient,
however—to add instrumentation to an application. It is designed
mostly to instrument shell scripts:
$
echo "Some message, some $variable" > /dev/lttng-logger
Any event that the LTTng logger emits is named lttng_logger and
belongs to the Linux kernel tracing domain. However, unlike
other instrumentation points in the kernel tracing domain, any Unix
user can create an event rule which
matches its event name, not only the root user or users in the
tracing group.
To use the LTTng logger:
From any application, write text data to the /dev/lttng-logger
file.
The msg field of lttng_logger event records contains the
recorded message.
Note:The maximum message length of an LTTng logger event is
1024 bytes. Writing more than this makes the LTTng logger emit more
than one event to contain the remaining data.
You shouldn’t use the LTTng logger to trace a user application which
can be instrumented in a more efficient way, namely:
Note:This section shows how to add instrumentation points to the
Linux kernel. The subsystems of the kernel are already thoroughly
instrumented at strategic places for LTTng when you
install the LTTng-modules
package.
The following procedure assumes that your ftrace tracepoints are
correctly defined in their own header and that they are created in
one source file using the CREATE_TRACE_POINTS definition.
To add an LTTng layer over an existing ftrace tracepoint:
Make sure the following kernel configuration options are
enabled:
CONFIG_MODULES
CONFIG_KALLSYMS
CONFIG_HIGH_RES_TIMERS
CONFIG_TRACEPOINTS
Build the Linux source tree with your custom ftrace tracepoints.
Boot the resulting Linux image on your target system.
Confirm that the tracepoints exist by looking for their names in the
/sys/kernel/debug/tracing/events/subsys directory, where subsys
is your subsystem name.
Get a copy of the latest LTTng-modules 2.12:
$
cd $(mktemp -d) &&
wget http://lttng.org/files/lttng-modules/lttng-modules-latest-2.12.tar.bz2 &&
tar -xf lttng-modules-latest-2.12.tar.bz2 &&
cd lttng-modules-2.12.*
In instrumentation/events/lttng-module, relative to the root
of the LTTng-modules source tree, create a header file named
subsys.h for your custom subsystem subsys and write your
LTTng-modules tracepoint definitions using the LTTng-modules
macros in it.
Start with this template:
instrumentation/events/lttng-module/my_subsys.h
#undef TRACE_SYSTEM
#define TRACE_SYSTEM my_subsys

#if !defined(_LTTNG_MY_SUBSYS_H) || defined(TRACE_HEADER_MULTI_READ)
#define _LTTNG_MY_SUBSYS_H

#include "../../../probes/lttng-tracepoint-event.h"
#include <linux/tracepoint.h>

LTTNG_TRACEPOINT_EVENT(
    /*
     * Format is identical to the TRACE_EVENT() version for the three
     * following macro parameters:
     */
    my_subsys_my_event,
    TP_PROTO(int my_int, const char *my_string),
    TP_ARGS(my_int, my_string),

    /* LTTng-modules specific macros */
    TP_FIELDS(
        ctf_integer(int, my_int_field, my_int)
        ctf_string(my_string_field, my_string)
    )
)

#endif /* !defined(_LTTNG_MY_SUBSYS_H) || defined(TRACE_HEADER_MULTI_READ) */

#include "../../../probes/define_trace.h"
The entries in the TP_FIELDS() section are the list of fields for the
LTTng tracepoint. This is similar to the TP_STRUCT__entry() part of
the TRACE_EVENT() ftrace macro.
$
make KERNELDIR=/path/to/linux
#
make modules_install && depmod -a
Replace /path/to/linux with the path to the Linux source tree where
you defined and used tracepoints with the TRACE_EVENT() ftrace macro.
Note that you can also use the
LTTNG_TRACEPOINT_EVENT_CODE() macro
instead of LTTNG_TRACEPOINT_EVENT() to use custom local variables and
C code that need to be executed before the event fields are recorded.
The best way to learn how to use the previous LTTng-modules macros is to
inspect the existing LTTng-modules tracepoint definitions in the
instrumentation/events/lttng-module header files. Compare them
with the Linux kernel mainline versions in the
include/trace/events directory of the Linux source tree.
Although we recommended to always use the
LTTNG_TRACEPOINT_EVENT() macro to describe
the arguments and fields of an LTTng-modules tracepoint when possible,
sometimes you need a more complex process to access the data that the
tracer records as event record fields. In other words, you need local
variables and multiple C statements instead of simple
argument-based expressions that you pass to the
ctf_*() macros of TP_FIELDS().
Use the LTTNG_TRACEPOINT_EVENT_CODE() macro instead of
LTTNG_TRACEPOINT_EVENT() to declare custom local variables and define
a block of C code to be executed before LTTng records the fields.
The structure of this macro is:
LTTNG_TRACEPOINT_EVENT_CODE() macro syntax.
LTTNG_TRACEPOINT_EVENT_CODE(
    /*
     * Format identical to the LTTNG_TRACEPOINT_EVENT()
     * version for the following three macro parameters:
     */
    my_subsys_my_event,
    TP_PROTO(int my_int, const char *my_string),
    TP_ARGS(my_int, my_string),

    /* Declarations of custom local variables */
    TP_locvar(
        int a = 0;
        unsigned long b = 0;
        const char *name = "(undefined)";
        struct my_struct *my_struct;
    ),

    /*
     * Custom code which uses both tracepoint arguments
     * (in TP_ARGS()) and local variables (in TP_locvar()).
     *
     * Local variables are actually members of a structure pointed
     * to by the special variable tp_locvar.
     */
    TP_code(
        if (my_int) {
            tp_locvar->a = my_int + 17;
            tp_locvar->my_struct = get_my_struct_at(tp_locvar->a);
            tp_locvar->b = my_struct_compute_b(tp_locvar->my_struct);
            tp_locvar->name = my_struct_get_name(tp_locvar->my_struct);
            put_my_struct(tp_locvar->my_struct);

            if (tp_locvar->b) {
                tp_locvar->a = 1;
            }
        }
    ),

    /*
     * Format identical to the LTTNG_TRACEPOINT_EVENT()
     * version for this, except that tp_locvar members can be
     * used in the argument expression parameters of
     * the ctf_*() macros.
     */
    TP_FIELDS(
        ctf_integer(unsigned long, my_struct_b, tp_locvar->b)
        ctf_integer(int, my_struct_a, tp_locvar->a)
        ctf_string(my_string_field, my_string)
        ctf_string(my_struct_name, tp_locvar->name)
    )
)
Important:The C code defined in TP_code() must not have any side
effects when executed. In particular, the code must not allocate
memory or get resources without deallocating this memory or putting
those resources afterwards.
Note:In the following subsections, we refer to an lttng(1) command
using its man page name. For example, instead of Run the create
command to…, we use Run the lttng-create(1) command to….
You will see the following error when you run a command while no session
daemon is running:
Error: No session daemon is available
The only command that automatically runs a session daemon is
lttng-create(1), which you use to
create a tracing session. While
this is most of the time the first operation that you do, sometimes it’s
not. Some examples are:
Each Unix user must have its own running session
daemon to trace user applications. The session daemon that the root user
starts is the only one allowed to control the LTTng kernel tracer. Users
that are part of the tracing group can control the root session
daemon. The default tracing group name is tracing; set it to something
else with the --group option when you start the
root session daemon.
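For example, to start your user session daemon, then the root session
daemon, in background (a sketch using lttng-sessiond(8)):
$
lttng-sessiond --daemonize
#
lttng-sessiond --daemonize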
In both cases, remove the --daemonize option to
start the session daemon in foreground.
To stop a session daemon, use kill(1) on its process ID (standard
TERM signal).
Note that some Linux distributions could manage the LTTng session daemon
as a service. In this case, you should use the service manager to
start, restart, and stop session daemons.
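For example, a sketch of creating a tracing session with a custom name
using lttng-create(1):
$
lttng create my-session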
Replace my-session with the specific tracing session name.
LTTng appends the creation date to the name of the created tracing
session.
LTTng writes the traces of a tracing session in
$LTTNG_HOME/lttng-traces/name by default, where name is the
name of the tracing session. Note that the LTTNG_HOME environment
variable defaults to $HOME if not set.
When you create a tracing session, it is set as
the current tracing session. The following lttng(1) commands
operate on the current tracing session when you don’t specify one:
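To switch to another current tracing session, a sketch using
lttng-set-session(1):
$
lttng set-session new-session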
Replace new-session by the name of the new current tracing session.
When you’re done tracing in a given tracing session, destroy it. This
operation frees the resources taken by the tracing session to destroy;
it doesn’t destroy the trace data that LTTng wrote for this tracing
session (see Clear a tracing session for one way to do this).
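For example, a sketch of destroying the current tracing session with
lttng-destroy(1):
$
lttng destroy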
The session daemon can query the running instrumented
user applications and the Linux kernel to get a list of available
instrumentation points. For the Linux kernel tracing domain,
they are tracepoints and system calls. For the user space tracing
domain, they are tracepoints. For the other tracing domains, they are
logger names.
To list the available instrumentation points:
Use the lttng-list(1) command with the option of the requested
tracing domain amongst:
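For example, a sketch of listing the available Linux kernel tracepoints
and the user space tracepoints of the running instrumented
applications:
$
lttng list --kernel
$
lttng list --userspace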
With the --probe,
--function, and
--userspace-probe options, this is a custom
name given to the event rule. With the JUL, log4j, and Python domains,
this is a logger name.
With a tracepoint, logger, or system call name, use the special
* globbing character to match anything (for example, sched_*,
my_comp*:*msg_*).
All.
One of:
--loglevel=LEVEL
Match only tracepoints or log statements with a logging level at
least as severe as LEVEL.
--loglevel-only=LEVEL
Match only tracepoints or log statements with a logging level
equal to LEVEL.
When you use a * character at the end of the tracepoint or logger
name (first positional argument), exclude the specific names in the
comma-delimited list EXCLUSIONS.
User space, JUL, log4j, and Python.
--filter=EXPR
Match only events which satisfy the expression EXPR.
You attach an event rule to a channel on creation. If you do
not specify the channel with the --channel
option, and if the event rule to create is the first in its
tracing domain for a given tracing session, then LTTng
creates a default channel for you. This default channel is reused in
subsequent invocations of the lttng-enable-event(1) command for the
same tracing domain.
An event rule is always enabled at creation time.
The following examples show how to combine the previous
command-line options to create simple to more complex event rules.
Example:Create an event rule targeting a Linux kernel tracepoint (default channel).
$
lttng enable-event --kernel sched_switch
Example:Create an event rule matching four Linux kernel system calls (default channel).
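For example, a sketch matching the open, write, read, and close system
calls (the exact system call names depend on your architecture):
$
lttng enable-event --kernel --syscall open,write,read,close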
The event rules of a given channel form a whitelist: as soon as an
emitted event passes one of them, LTTng can record the event. For
example, an event named my_app:my_tracepoint emitted from a user space
tracepoint with a TRACE_ERROR log level passes both of the following
rules:
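For example, a sketch of two such rules (the tracepoint name and the
log level are illustrative):
$
lttng enable-event --userspace my_app:my_tracepoint
$
lttng enable-event --userspace my_app:my_tracepoint --loglevel=TRACE_INFO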
To disable an event rule that you created
previously, use the lttng-disable-event(1) command. This command
disables all the event rules (of a given tracing domain and channel)
which match an instrumentation point. The other conditions aren’t
supported as of LTTng 2.12.
The LTTng tracer doesn’t record an emitted event which passes
a disabled event rule.
Example:Disable an event rule matching a Python logger (default channel).
$
lttng disable-event --python my-logger
Example:Disable an event rule matching all java.util.logging loggers (default channel).
$
lttng disable-event --jul '*'
Example:Disable all the event rules of the default channel.
The --all-events option isn’t, like the
--all option of lttng-enable-event(1), the
equivalent of the event name * (wildcard): it disables all the event
rules of a given channel.
$
lttng disable-event --jul --all-events
Note:You can’t delete an event rule once you create it.
LTTng is very flexible: you can launch user applications before
or after you start the tracers. The tracers only record the events
if they pass enabled event rules and if they occur while the tracers are
started.
You might need to remove all the current tracing data of one or more
tracing sessions between multiple attempts to
reproduce a problem without interrupting the LTTng tracing activity.
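A sketch of clearing the tracing data of a tracing session named
my-session with lttng-clear(1):
$
lttng clear my-session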
Note that LTTng automatically creates a default channel when, for a
given tracing domain, no channels exist and you
create the first event rule. This default
channel is named channel0 and its attributes are set to reasonable
values. Therefore, you only need to create a channel when you need
non-default attributes.
You specify each non-default channel attribute with a command-line
option when you use the lttng-enable-channel(1) command. The
available command-line options are:
Set the timeout of user space applications which load LTTng-UST
in blocking mode to TIMEOUTUS:
0 (default)
Never block (non-blocking mode).
inf
Block forever until space is available in a sub-buffer to record
the event.
n, a positive value
Wait for at most n µs when trying to write into a sub-buffer.
Note that, for this option to have any effect on an instrumented
user space application, you need to run the application with a set
LTTNG_UST_ALLOW_BLOCKING environment variable.
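For example, a sketch of running an instrumented application with
blocking mode allowed:
$
LTTNG_UST_ALLOW_BLOCKING=1 ./my-app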
--output=TYPE (Linux kernel tracing domain only)
Set the output type of the channel to TYPE, either mmap or
splice.
You can only create a channel in the Linux kernel and user space
tracing domains: other tracing domains have their own channel
created on the fly when creating event rules.
Important:Because of a current LTTng limitation, you must create all channels
before you start tracing in a given
tracing session, that is, before the first time you run
lttng-start(1).
Since LTTng automatically creates a default channel when you use the
lttng-enable-event(1) command with a specific tracing domain, you
can’t, for example, create a Linux kernel event rule, start tracing,
and then create a user space event rule, because no user space channel
exists yet and it’s too late to create one.
For this reason, make sure to configure your channels properly
before starting the tracers for the first time!
The following examples show how to combine the previous
command-line options to create simple to more complex channels.
Example:Create a Linux kernel channel with default attributes.
$
lttng enable-channel --kernel my-channel
Example:Create a user space channel with four sub-buffers of 1 MiB each, per CPU, per instrumented process.
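A sketch of such a command line (option names from
lttng-enable-channel(1)):
$
lttng enable-channel --userspace --num-subbuf=4 --subbuf-size=1M \
                     --buffers-pid my-channel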
The state of a channel precedes the individual states of event rules
attached to it: event rules which belong to a disabled channel, even if
they are enabled, are also considered disabled.
Event record fields in trace files provide important information about
events that occurred previously, but sometimes some external context may
help you solve a problem faster.
Examples of context fields are:
The process ID, thread ID, process name, and
process priority of the thread in which the event occurs.
The hostname of the system on which the event occurs.
The Linux kernel and user call stacks (since
LTTng 2.11).
The current values of many possible performance counters using
perf, for example:
CPU cycles, stalled cycles, idle cycles, and the other cycle types.
Cache misses.
Branch instructions, misses, and loads.
CPU faults.
Any context defined at the application level (supported for the
JUL and log4j tracing domains).
To get the full list of available context fields, see
lttng add-context --list. Some context fields are reserved for a
specific tracing domain (Linux kernel or user space).
You add context fields to channels. All the events
that a channel with added context fields records contain those fields.
To add context fields to one or all the channels of a given tracing
session:
Example:Add context fields to all the channels of the current tracing session.
The following command line adds the virtual process identifier and
the per-thread CPU cycles count fields to all the user space channels
of the
current tracing session.
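A sketch of this command line (context type names from
lttng add-context --list):
$
lttng add-context --userspace --type=vpid --type=perf:thread:cpu-cycles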
The following command line adds the thread identifier and user call
stack context fields to the Linux kernel channel named my-channel in
the current tracing session.
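A sketch of this command line:
$
lttng add-context --kernel --channel=my-channel --type=tid --type=callstack-user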
Example:Add an application-specific context field to a specific channel.
The following command line adds the cur_msg_id context field of the
retriever context retriever for all the instrumented
Java applications recording event records
in the channel named my-channel:
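A sketch of this command line (the $app. prefix denotes an
application-specific context type):
$
lttng add-context --jul --channel=my-channel --type='$app.retriever:cur_msg_id'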
It’s often useful to only allow processes with specific attributes to
emit events. For example, you may wish to record all the system calls
which a given process makes (à la
strace).
The lttng-track(1) and lttng-untrack(1) commands serve this
purpose. Both commands operate on inclusion sets of process attribute
values. The available process attribute types are:
Each tracing domain has one inclusion set per process attribute type:
the Linux kernel tracing domain has six while the user space tracing
domain has three.
For a given event which passes an enabled event rule to be
recorded, all the attributes of its executing process must be part of
the inclusion sets of the tracing domain of the event rule.
Add entries to an inclusion set with the lttng-track(1) command and
remove entries with the lttng-untrack(1) command. A process
attribute is tracked when it’s part of an inclusion set and
untracked otherwise.
Note:The process attribute values are numeric.
Should a process with a given tracked process ID, for example, exit, and
then a new process be given this ID, then the latter would also be
allowed to emit events.
With the lttng track command, you can add Unix user and group names
to the user and group inclusion sets: the session daemon finds the corresponding UID, VUID, GID, or VGID once on
addition to the inclusion set. This means that if you rename the user
or group after you run lttng track, its user/group ID remains tracked.
Example:Track and untrack virtual process IDs.
For the sake of the following example, assume the target system has
16 possible VPIDs.
When you
create a tracing session,
the user space VPID inclusion set contains all the possible VPIDs:
When the inclusion set is full and you use the lttng-track(1)
command to specify some VPIDs to track, LTTng first clears the inclusion
set, and then it adds the specific VPIDs to track. After:
$
lttng track --userspace --vpid=3,4,7,10,13
the VPID inclusion set is:
Add more VPIDs to the inclusion set afterwards:
$
lttng track --userspace --vpid=1,15,16
The result is:
The lttng-untrack(1) command removes entries from process attribute
inclusion sets. Given the previous example, the following command:
$
lttng untrack --userspace --vpid=3,7,10,13
leads to this VPID inclusion set:
LTTng can track all the possible VPIDs again using the
--all option:
$
lttng track --userspace --vpid --all
The result is, again:
Example:Track only specific process attributes.
A typical use case with process attribute tracking is to start with an
empty inclusion set, then start the tracers, and then add entries manually while the tracers are active.
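For example, a sketch of emptying the Linux kernel UID inclusion set
before starting the tracers:
$
lttng untrack --kernel --uid --all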
If you trace with this inclusion set configuration, the LTTng kernel
tracer records no events within the current tracing session because it doesn’t track any UID. Use the
lttng-track(1) command as usual to track specific UIDs when you need
to, for example:
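A sketch (the user name http and the UID 11 are illustrative):
$
lttng track --kernel --uid=http,11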
If you use LTTng to solve real world problems, chances are you have to
record events using the same tracing session setup over and over,
modifying a few variables each time in your instrumented program
or environment. To avoid constant tracing session reconfiguration,
the lttng(1) command-line tool can save and load tracing session
configurations to/from XML files.
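For example, a sketch of saving a tracing session configuration with
lttng-save(1):
$
lttng save my-session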
Replace my-session with the name of the tracing session to save.
LTTng saves tracing session configurations to
$LTTNG_HOME/.lttng/sessions by default. Note that the
LTTNG_HOME environment variable defaults to $HOME if not set. Use
the --output-path option to change this destination
directory.
LTTng saves all configuration parameters, for example:
The tracing session name.
The trace data output path.
The channels with their state and all their attributes.
The context fields you added to channels.
The event rules with their state, log level and filter conditions.
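To load a saved tracing session configuration, a sketch using
lttng-load(1):
$
lttng load my-session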
Replace my-session with the name of the tracing session to load.
When LTTng loads a configuration, it restores your saved tracing session
as if you just configured it manually.
See lttng-load(1) for the complete list of command-line options. You
can also save and load many sessions at a time, and decide in which
directory to output the XML files.
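To stream the trace data of a tracing session to a relay daemon running
on a remote system, a sketch of the lttng-create(1) command line:
$
lttng create my-session --set-url=net://remote-system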
Replace remote-system by the host name or IP address of the
remote system. See lttng-create(1) for the exact URL format.
On the target system, use the lttng(1) command-line tool as usual.
When tracing is active, the consumer daemon of the target sends
sub-buffers to the relay daemon running on the remote system instead
of flushing them to the local file system. The relay daemon writes the
received packets to the local file system.
The relay daemon writes trace files to
$LTTNG_HOME/lttng-traces/hostname/session by default, where
hostname is the host name of the target system and session
is the tracing session name. Note that the LTTNG_HOME environment
variable defaults to $HOME if not set. Use the
--output option of lttng-relayd(8) to write
trace files to another base directory.
LTTng live is a network protocol implemented by the relay daemon (lttng-relayd(8)) to allow compatible trace viewers to
display events as LTTng emits them on the target system while tracing is
active.
The relay daemon creates a tee: it forwards the trace data to both
the local file system and to connected live viewers:
To use LTTng live:
On the target system, create a tracing session
in live mode:
$
lttng create my-session --live
This spawns a local relay daemon.
Start the live viewer and configure it to connect to the relay
daemon. For example, with
babeltrace2:
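A sketch of the viewer command line, assuming the relay daemon runs on
the local host and hostname is the host name of the target system:
$
babeltrace2 net://localhost/host/hostname/my-session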
Replace my-session with the name of the tracing session to view.
Configure the tracing session as usual with the lttng(1)
command-line tool, and start tracing.
List the available live tracing sessions with Babeltrace 2:
$
babeltrace2 net://localhost
You can start the relay daemon on another system. In this case, you need
to specify the URL of the relay daemon when you create the tracing
session with the --set-url option. You also need to
replace localhost in the procedure above with the host name of the
system on which the relay daemon is running.
The normal behavior of LTTng is to append full sub-buffers to growing
trace data files. This is ideal to keep a full history of the events
that occurred on the target system, but it can
represent too much data in some situations. For example, you may wish
to trace your application continuously until some critical situation
happens, in which case you only need the latest few recorded
events to perform the desired analysis, not multi-gigabyte trace files.
With the lttng-snapshot(1) command, you can take a snapshot of the
current sub-buffers of a given tracing session.
LTTng can write the snapshot to the local file system or send it over
the network.
If you wish to create unmanaged, self-contained, non-overlapping
trace chunk archives instead of a simple copy of the current
sub-buffers, see the tracing session rotation
feature (available since LTTng 2.11).
Configure the tracing session as usual with the lttng(1)
command-line tool, and start tracing.
Optional: When you need to take a snapshot,
stop tracing.
You can take a snapshot when the tracers are active, but if you stop
them first, you’re sure that the data in the sub-buffers doesn’t
change before you actually take the snapshot.
Take a snapshot:
$
lttng snapshot record --name=my-first-snapshot
LTTng writes the current sub-buffers of all the channels of the
current tracing session to
trace files on the local file system. Those trace files have
my-first-snapshot in their name.
There is no difference between the format of a normal trace file and the
format of a snapshot: viewers of LTTng traces also support LTTng
snapshots.
By default, LTTng writes snapshot files to the path shown by
lttng snapshot list-output. You can change this path or decide to send
snapshots over the network using either:
A snapshot output path or URL that you add using
lttng snapshot add-output.
An output path or URL that you provide directly to the
lttng snapshot record command.
Method 3 overrides method 2, which overrides method 1. When
you specify a URL, a relay daemon must listen on a remote system (see
Send trace data over the network).
The snapshot user guide shows how to dump the
current sub-buffers of a tracing session to the file system or send them
over the network. When you take a snapshot, LTTng doesn’t clear the ring
buffers of the tracing session: if you take another snapshot immediately
after, both snapshots could contain overlapping trace data.
Inspired by log rotation,
tracing session rotation is a feature which appends the content of the
ring buffers to what’s already on the file system or sent over the
network since the creation of the tracing session or since the last
rotation, and then clears those ring buffers to avoid trace data
overlaps.
What LTTng is about to write when performing a tracing session rotation
is called the current trace chunk. When this current trace chunk is
written to the file system or sent over the network, it becomes a trace
chunk archive. Therefore, a tracing session rotation archives the
current trace chunk.
A trace chunk archive is a self-contained LTTng trace which LTTng
doesn’t manage anymore: you can read it, modify it, move it, or remove
it.
There are two methods to perform a tracing session rotation: immediately
or with a rotation schedule.
To perform an immediate tracing session rotation:
Create a tracing session
in normal mode or network streaming mode
(only those two creation modes support tracing session rotation):
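A sketch of these steps:
$
lttng create my-session
Then, when you need to archive the current trace chunk, rotate the
tracing session:
$
lttng rotate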
The lttng-rotate command prints the path to the created trace
chunk archive. See lttng-rotate(1) to learn about the format
of trace chunk archive directory names.
Perform other immediate rotations while the tracing session is
active. It is guaranteed that all the trace chunk archives don’t
contain overlapping trace data. You can also perform an immediate
rotation once you have stopped the
tracing session.
With any command of the lttng(1) command-line tool, set the
--mi option to xml (before the command name) to get an
XML machine interface output, for example:
$
lttng --mi=xml enable-event --kernel --syscall open
A schema definition (XSD) is
available
to ease the integration with external tools as much as possible.
An LTTng trace, which is a CTF trace, has both
data stream files and a metadata file. This metadata file contains,
amongst other things, information about the offset of the clock sources
used to timestamp event records when tracing.
If, once a tracing session is
started, a major
NTP correction
happens, the clock offset of the trace also needs to be updated. Use
the metadata item of the lttng-regenerate(1) command to do so.
The main use case of this command is to allow a system to boot with
an incorrect wall time and trace it with LTTng before its wall time
is corrected. Once the system is known to be in a state where its
wall time is correct, it can run lttng regenerate metadata.
The LTTng kernel and user space tracers generate state dump
event records when the application starts or when you
start a tracing session. An analysis
can use the state dump event records to set an initial state before it
builds the rest of the state from the following event records.
Trace Compass is a notable example of an
application which uses the state dump of an LTTng trace.
When you take a snapshot, it’s possible that the
state dump event records aren’t included in the snapshot because they
were recorded to a sub-buffer that has been consumed or overwritten
already.
Use the lttng regenerate statedump command to emit the state
dump event records again.
To regenerate the state dump of the current tracing session, provided
you created it in snapshot mode, before you take a snapshot:
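A sketch of the two steps:
$
lttng regenerate statedump
$
lttng snapshot record --name=my-snapshot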
Depending on the event throughput, you should run steps 1 and 2
as closely as possible.
Note:To record the state dump events, you need to
create event rules which enable them.
LTTng-UST state dump tracepoints start with lttng_ust_statedump:.
LTTng-modules state dump tracepoints start with lttng_statedump_.
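For example, a sketch of event rules which match all the state dump
events:
$
lttng enable-event --userspace 'lttng_ust_statedump:*'
$
lttng enable-event --kernel 'lttng_statedump_*'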
Non-volatile random-access memory
(NVRAM) is random-access memory that retains its information when power
is turned off (non-volatile). Systems with such memory can store data
structures in RAM and retrieve them after a reboot, without flushing
to typical storage.
Linux supports NVRAM file systems thanks to either
DAX + pmem
(requires Linux 4.1+) or PRAMFS (requires Linux < 4).
This section doesn’t describe how to operate such file systems;
we assume that you have a working persistent memory file system.
When you create a tracing session, you can specify
the path of the shared memory holding the sub-buffers. If you specify a
location on an NVRAM file system, then you can retrieve the latest
recorded trace data when the system reboots after a crash.
To record trace data on a persistent memory file system and retrieve the
trace data after a system crash:
Create a tracing session with a sub-buffer shared memory path located
on an NVRAM file system:
$
lttng create my-session --shm-path=/path/to/shm
Configure the tracing session as usual with the lttng(1)
command-line tool, and start tracing.
After a system crash, use the lttng-crash(1) command-line tool to
view the trace data recorded on the NVRAM file system:
$
lttng-crash /path/to/shm
The binary layout of the ring buffer files isn’t exactly the same as
the trace files layout. This is why you need to use lttng-crash(1)
instead of your preferred trace viewer directly.
To convert the ring buffer files to LTTng trace files:
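A sketch using the --extract option of lttng-crash(1) (the output path
is illustrative):
$
lttng-crash --extract=/path/to/trace /path/to/shm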
With the C/C++ notification and trigger API of LTTng, your user
application can get notified when the buffer usage of one or more
channels becomes too low or too high. Use this API
and enable or disable event rules during tracing to avoid
discarded event records.
Example:Have a user application get notified when the buffer usage of an LTTng channel is too high.
In this example, we create and build an application which gets notified
when the buffer usage of a specific LTTng channel is higher than
75 %. We only print that it is the case in the example, but we
could as well use the API of liblttng-ctl to
disable event rules when this happens.
Create the C source file of application:
notif-app.c
#include <stdio.h>
#include <assert.h>
#include <lttng/domain.h>
#include <lttng/action/action.h>
#include <lttng/action/notify.h>
#include <lttng/condition/condition.h>
#include <lttng/condition/buffer-usage.h>
#include <lttng/condition/evaluation.h>
#include <lttng/notification/channel.h>
#include <lttng/notification/notification.h>
#include <lttng/trigger/trigger.h>
#include <lttng/endpoint.h>

int main(int argc, char *argv[])
{
    int exit_status = 0;
    struct lttng_notification_channel *notification_channel;
    struct lttng_condition *condition;
    struct lttng_action *action;
    struct lttng_trigger *trigger;
    const char *tracing_session_name;
    const char *channel_name;

    assert(argc >= 3);
    tracing_session_name = argv[1];
    channel_name = argv[2];

    /*
     * Create a notification channel. A notification channel
     * connects the user application to the LTTng session daemon.
     * This notification channel can be used to listen to various
     * types of notifications.
     */
    notification_channel = lttng_notification_channel_create(
        lttng_session_daemon_notification_endpoint);

    /*
     * Create a "high buffer usage" condition. In this case, the
     * condition is reached when the buffer usage is greater than or
     * equal to 75 %. We create the condition for a specific tracing
     * session name, channel name, and for the user space tracing
     * domain.
     *
     * The "low buffer usage" condition type also exists.
     */
    condition = lttng_condition_buffer_usage_high_create();
    lttng_condition_buffer_usage_set_threshold_ratio(condition, .75);
    lttng_condition_buffer_usage_set_session_name(condition,
        tracing_session_name);
    lttng_condition_buffer_usage_set_channel_name(condition,
        channel_name);
    lttng_condition_buffer_usage_set_domain_type(condition,
        LTTNG_DOMAIN_UST);

    /*
     * Create an action (get a notification) to take when the
     * condition created above is reached.
     */
    action = lttng_action_notify_create();

    /*
     * Create a trigger. A trigger associates a condition to an
     * action: the action is executed when the condition is reached.
     */
    trigger = lttng_trigger_create(condition, action);

    /* Register the trigger to LTTng. */
    lttng_register_trigger(trigger);

    /*
     * Now that we have registered a trigger, a notification will be
     * emitted everytime its condition is met. To receive this
     * notification, we must subscribe to notifications that match
     * the same condition.
     */
    lttng_notification_channel_subscribe(notification_channel,
        condition);

    /*
     * Notification loop. Put this in a dedicated thread to avoid
     * blocking the main thread.
     */
    for (;;) {
        struct lttng_notification *notification;
        enum lttng_notification_channel_status status;
        const struct lttng_evaluation *notification_evaluation;
        const struct lttng_condition *notification_condition;
        double buffer_usage;

        /* Receive the next notification. */
        status = lttng_notification_channel_get_next_notification(
            notification_channel, &notification);

        switch (status) {
        case LTTNG_NOTIFICATION_CHANNEL_STATUS_OK:
            break;
        case LTTNG_NOTIFICATION_CHANNEL_STATUS_NOTIFICATIONS_DROPPED:
            /*
             * The session daemon can drop notifications if a monitoring
             * application isn't consuming the notifications fast
             * enough.
             */
            continue;
        case LTTNG_NOTIFICATION_CHANNEL_STATUS_CLOSED:
            /*
             * The notification channel has been closed by the
             * session daemon. This is typically caused by a session
             * daemon shutting down.
             */
            goto end;
        default:
            /* Unhandled conditions or errors. */
            exit_status = 1;
            goto end;
        }

        /*
         * A notification provides, amongst other things:
         *
         * * The condition that caused this notification to be
         *   emitted.
         * * The condition evaluation, which provides more
         *   specific information on the evaluation of the
         *   condition.
         *
         * The condition evaluation provides the buffer usage
         * value at the moment the condition was reached.
         */
        notification_condition = lttng_notification_get_condition(
            notification);
        notification_evaluation = lttng_notification_get_evaluation(
            notification);

        /* We're subscribed to only one condition. */
        assert(lttng_condition_get_type(notification_condition) ==
            LTTNG_CONDITION_TYPE_BUFFER_USAGE_HIGH);

        /*
         * Get the exact sampled buffer usage from the
         * condition evaluation.
         */
        lttng_evaluation_buffer_usage_get_usage_ratio(
            notification_evaluation, &buffer_usage);

        /*
         * At this point, instead of printing a message, we
         * could do something to reduce the buffer usage of the channel,
         * like disable specific events.
         */
        printf("Buffer usage is %f %% in tracing session \"%s\", "
               "user space channel \"%s\".\n", buffer_usage * 100,
               tracing_session_name, channel_name);
        lttng_notification_destroy(notification);
    }

end:
    lttng_action_destroy(action);
    lttng_condition_destroy(condition);
    lttng_trigger_destroy(trigger);
    lttng_notification_channel_destroy(notification_channel);
    return exit_status;
}
Build the notif-app application, linking it to liblttng-ctl:
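For example:
$
gcc -o notif-app notif-app.c -llttng-ctl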
If you create the channel manually with the lttng-enable-channel(1)
command, control how frequently LTTng samples the current values of the
channel properties to evaluate user conditions with the
--monitor-timer option.
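For example, a sketch of a tracing session setup which results in the
default channel0 channel used below (the event name my_app:'*' is only
illustrative):
$
lttng create my-session
$
lttng enable-event --userspace my_app:'*'
$
lttng start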
Run the notif-app application. This program accepts the
tracing session name and the user space channel
name as its two first arguments. The channel which LTTng automatically
creates with the lttng-enable-event(1) command above is named
channel0:
$
./notif-app my-session channel0
In another terminal, run an application with a very high event
throughput so that the 75 % buffer usage condition is reached.
In the first terminal, the application should print lines like this:
Buffer usage is 81.45197 % in tracing session "my-session", user space
channel "channel0".
If you don’t see anything, try modifying the condition in
notif-app.c to a lower value (0.1, for example), rebuilding it
(step 2) and running it again (step 4).
Null-terminated string; undefined behavior if e is NULL.
n
Field name.
e
Argument expression.
ctf_array(t, n, e, s)
ctf_array_nowrite(t, n, e, s)
ctf_user_array(t, n, e, s)
ctf_user_array_nowrite(t, n, e, s)
Statically-sized array of integers.
t
Array element C type.
n
Field name.
e
Argument expression.
s
Number of elements.
ctf_array_bitfield(t, n, e, s)
ctf_array_bitfield_nowrite(t, n, e, s)
ctf_user_array_bitfield(t, n, e, s)
ctf_user_array_bitfield_nowrite(t, n, e, s)
Statically-sized array of bits.
The type of e must be an integer type. s is the number
of elements of such type in e, not the number of bits.
t
Array element C type.
n
Field name.
e
Argument expression.
s
Number of elements.
ctf_array_text(t, n, e, s)
ctf_array_text_nowrite(t, n, e, s)
ctf_user_array_text(t, n, e, s)
ctf_user_array_text_nowrite(t, n, e, s)
Statically-sized array, printed as text.
The string doesn’t need to be null-terminated.
t
Array element C type (always char).
n
Field name.
e
Argument expression.
s
Number of elements.
ctf_sequence(t, n, e, T, E)
ctf_sequence_nowrite(t, n, e, T, E)
ctf_user_sequence(t, n, e, T, E)
ctf_user_sequence_nowrite(t, n, e, T, E)
Dynamically-sized array of integers.
The type of E must be unsigned.
t
Array element C type.
n
Field name.
e
Argument expression.
T
Length expression C type.
E
Length expression.
ctf_sequence_hex(t, n, e, T, E)
ctf_user_sequence_hex(t, n, e, T, E)
Dynamically-sized array of integers, displayed in base 16.
The type of E must be unsigned.
t
Array element C type.
n
Field name.
e
Argument expression.
T
Length expression C type.
E
Length expression.
ctf_sequence_network(t, n, e, T, E)
Dynamically-sized array of integers in network byte order (big-endian),
displayed in base 10.
The type of E must be unsigned.
t
Array element C type.
n
Field name.
e
Argument expression.
T
Length expression C type.
E
Length expression.
ctf_sequence_bitfield(t, n, e, T, E)
ctf_sequence_bitfield_nowrite(t, n, e, T, E)
ctf_user_sequence_bitfield(t, n, e, T, E)
ctf_user_sequence_bitfield_nowrite(t, n, e, T, E)
Dynamically-sized array of bits.
The type of e must be an integer type. E is the number
of elements of such type in e, not the number of bits.
The type of E must be unsigned.
t
Array element C type.
n
Field name.
e
Argument expression.
T
Length expression C type.
E
Length expression.
ctf_sequence_text(t, n, e, T, E)
ctf_sequence_text_nowrite(t, n, e, T, E)
ctf_user_sequence_text(t, n, e, T, E)
ctf_user_sequence_text_nowrite(t, n, e, T, E)
Dynamically-sized array, displayed as text.
The string doesn’t need to be null-terminated.
The type of E must be unsigned.
The behaviour is undefined if e is NULL.
t
Sequence element C type (always char).
n
Field name.
e
Argument expression.
T
Length expression C type.
E
Length expression.
Use the _user versions when the argument expression, e, is
a user space address. In the cases of ctf_user_integer*() and
ctf_user_float*(), &e must be a user space address, thus e must
be addressable.
The _nowrite versions omit themselves from the session trace, but are
otherwise identical. This means the _nowrite fields won’t be written
in the recorded trace. Their primary purpose is to make some
of the event context available to the
event filters without having to
commit the data to sub-buffers.
consumer daemon
A process which is responsible for consuming the full
sub-buffers and writing them to a file system or
sending them over the network.
current trace chunk
A trace chunk which includes the current content
of all the sub-buffers of the
tracing session and the stream files
produced since the latest event amongst:
LTTng live
A communication protocol between the relay daemon and
live viewers which makes it possible to see event records “live”, as they are received by the
relay daemon.
trace chunk archive
LTTng doesn’t manage any trace chunk archive, even if its
containing tracing session is still active: you
are free to read it, modify it, move it, or remove it.