Writing babeltrace 2 plugins
This is a write-up of Simon Marchi's talk, "Writing babeltrace 2 plugins", which he gave at the Tracing Summit 2019. Links to the video are here, and the slides can be found here.
Babeltrace 2 is a framework for viewing, converting, and analyzing trace data. The original Babeltrace project was started in 2010, but over the years developers noticed a number of shortcomings. For one, the Intermediate Representation (IR) was tightly coupled to the Common Trace Format (CTF), and even though the CTF is an expressive trace format used by popular tracing tools such as LTTng and barectf, the goal of Babeltrace was always to support as many formats as possible. The second shortcoming was that the original Babeltrace project didn't have any support for plugins which, again, made it difficult to support new input and output formats.
The Babeltrace 2 release is a complete redesign of the library, Python bindings, and CLI. The new design uses a graph architecture which consists of sink, source, and filter components. This modularity makes babeltrace independent from any one trace format, and while you can still feed CTF trace data to babeltrace, the rework means you can now also write plugins to handle reading other trace formats and potentially write that data using a different output format.
The Babeltrace project includes both a library – with bindings for C and Python – as well as a command-line tool for working with traces. The library, libbabeltrace2, is extremely flexible and allows you to build any kind of transformation or analysis you want. babeltrace2, the command-line tool, is built using libbabeltrace2.
Babeltrace can be extended by plugins, which provide one or more component classes. You can see exactly which plugins are available and the component classes they provide by running the babeltrace2 list-plugins command.
Core Concepts
Before you can start writing your own plugins for Babeltrace, you need to understand the core concepts. A graph is a collection of connected components, where each component can either receive data (source), transform data (filter), or write data out (sink). This source and sink design is the same concept used in PulseAudio and FFmpeg (sometimes called a filter graph), and it's so popular because it's highly customizable – by connecting different components in different configurations you can build a pipeline to do whatever you want.
Here's a diagram that illustrates this design.
Sources receive data from the outside world, for example by reading it from a trace file or getting it from LTTng live input. Once received, the data is sent to downstream components such as any number of optional filters. Filters are used to transform the input data format or perform some kind of analysis. For example, Babeltrace 2 ships with a filter (filter.utils.trimmer) to discard events with timestamps outside a given range.
Then, once any filters have been applied the data is sent further downstream to the sink components which are responsible for sending that data out to the outside world. For example, if you're using babeltrace to convert between two file formats your sink component will write the converted data to the new file format, or if you're creating a visualization from the input trace data your sink component might pop up a window to display the graphical data.
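To make the source, filter, and sink roles concrete, here's a minimal sketch that models the pipeline with plain Python generators. It does not use the bt2 API at all: the source(), trimmer(), and sink() functions and the (name, timestamp) event tuples are hypothetical stand-ins, and the trimmer only imitates the idea behind filter.utils.trimmer.

```python
from typing import Iterable, Iterator, List, Tuple

# A toy event is just (name, timestamp); real Babeltrace messages are richer.
Event = Tuple[str, int]


def source() -> Iterator[Event]:
    """Source: produce events (hard-coded here instead of read from a trace)."""
    yield from [("open", 10), ("read", 20), ("write", 30), ("close", 40)]


def trimmer(events: Iterable[Event], begin: int, end: int) -> Iterator[Event]:
    """Filter: drop events whose timestamp falls outside [begin, end]."""
    for name, ts in events:
        if begin <= ts <= end:
            yield name, ts


def sink(events: Iterable[Event]) -> List[str]:
    """Sink: format each event; a real sink would write or display it."""
    return ["event {}, timestamp {}".format(name, ts) for name, ts in events]


# Connect the three stages: source -> filter -> sink.
lines = sink(trimmer(source(), begin=20, end=30))
print("\n".join(lines))
```

Swapping in a different source or sink function leaves the other stages untouched, which is the property the component architecture is after.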
Communication between components is done using messages. Various message types are used which signify the stage of the event stream, e.g. stream beginning message, event message, stream end message. Messages are sent between components using connection points known as ports.
Somewhat counterintuitively, Babeltrace starts processing trace data at the sink, and gradually works its way upstream. Sinks request messages from their immediate upstream components, which in turn request messages from their upstream components. This goes on all the way until you hit the sources which receive trace data.
The process of using a graph can be summarized as:
- User adds components and connects them together
- User starts the graph execution, sinks create iterators on their input ports
- The graph asks the sinks to consume from their iterators
- When all iterators and sinks have reached the end, the graph execution has completed successfully.
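The steps above can be sketched as a toy model in plain Python. None of these classes belong to libbabeltrace2 or the bt2 bindings; ToyGraph, ToySink, and ToySourceIterator are hypothetical names that only mirror the control flow, with the graph driving the sinks and the sinks pulling from upstream.

```python
class ToySourceIterator:
    """Yields a fixed message sequence, then signals the end."""

    def __init__(self):
        self._messages = iter(["stream beginning", "event", "stream end"])

    def __next__(self):
        # Raises StopIteration once the sequence is exhausted.
        return next(self._messages)


class ToySink:
    def __init__(self, upstream_factory):
        self._upstream_factory = upstream_factory
        self._it = None
        self.seen = []

    def graph_is_configured(self):
        # Step 2: the sink creates an iterator on its input.
        self._it = self._upstream_factory()

    def consume(self):
        # Step 3: pull one message from upstream and record it.
        self.seen.append(next(self._it))


class ToyGraph:
    def __init__(self):
        self._sinks = []

    def add_sink(self, sink):
        # Step 1: the user adds components to the graph.
        self._sinks.append(sink)

    def run(self):
        for sink in self._sinks:
            sink.graph_is_configured()
        active = list(self._sinks)
        while active:
            for sink in list(active):
                try:
                    sink.consume()
                except StopIteration:
                    # Step 4: this sink's iterator reached the end.
                    active.remove(sink)


graph = ToyGraph()
sink = ToySink(ToySourceIterator)
graph.add_sink(sink)
graph.run()
print(sink.seen)
```

Notice that the source never pushes anything: it only produces a message when the sink asks for one.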
The babeltrace2-intro man page provides a more in-depth explanation of how Babeltrace works, but with just the concepts we've covered above it's possible to start writing your first component class. You can either create your own component classes using the C API or Python bindings in your application (using libbabeltrace2) or as a plugin that you can distribute and load with another application that uses libbabeltrace2 (including the babeltrace2 command-line tool).
For example, the ctf plugin contains three component classes: source.ctf.fs, source.ctf.lttng-live, and sink.ctf.fs.
There's no correlation between the programming language of your application and
the language you write your plugin with, meaning you can write your plugin in C
and use it in a Python application and vice-versa.
Example using Python
Here's an example of a sink component class that reads each message passed to it and prints what kind of message it is. The Python plugin files need to be named bt_plugin_*.py to be considered by Babeltrace.
import bt2

bt2.register_plugin(__name__, "demo")


@bt2.plugin_component_class
class MyFirstSink(bt2._UserSinkComponent):
    def __init__(self, config, params, obj):
        self._port = self._add_input_port("some-name")

    def _user_graph_is_configured(self):
        self._it = self._create_message_iterator(self._port)

    def _user_consume(self):
        # Consume one message and print it.
        msg = next(self._it)

        if type(msg) is bt2._StreamBeginningMessageConst:
            print("Stream beginning")
        elif type(msg) is bt2._PacketBeginningMessageConst:
            print("Packet beginning")
        elif type(msg) is bt2._EventMessageConst:
            ts = msg.default_clock_snapshot.value
            name = msg.event.name
            print("event {}, timestamp {}".format(name, ts))
        elif type(msg) is bt2._PacketEndMessageConst:
            print("Packet end")
        elif type(msg) is bt2._StreamEndMessageConst:
            print("Stream end")
        else:
            raise RuntimeError("Unhandled message type", type(msg))
Your component needs at least one input port with an assigned name. This is usually done in the __init__() function which is called when a component is added to the graph. This function also accepts parameters in order to make the behavior of the component parametrized. We'll mention the params parameter again in the Advanced Steps section.
_user_graph_is_configured() is a callback that's invoked when the user has connected all the components together and the graph is about to be executed. The example above creates an iterator on the component's input port which we'll use later to consume messages.
_user_consume() is a callback that's invoked when Babeltrace wants our component to consume some messages. We consume a message from the iterator created in the previous function and do something useful with it. In this example, that's printing the type of message.
Using the example plugin is straightforward with the --plugin-path option. You can instantiate components using the -c command-line switch:
babeltrace2 --plugin-path=. -c sink.demo.MyFirstSink /home/user/lttng-traces/auto-20190805-105005
Stream beginning
Stream beginning
Stream beginning
Stream beginning
Packet beginning
Packet beginning
Packet beginning
Packet beginning
event libc_malloc, timestamp 1800983054553
event libc_malloc, timestamp 1800983061401
event libc_malloc, timestamp 1800983066193
event libc_malloc, timestamp 1800983070414
event libc_malloc, timestamp 1800988924162
event libc_malloc, timestamp 1800988933077
event libc_malloc, timestamp 1800988959826
event libc_malloc, timestamp 1800988964099
Packet end
Packet end
Packet end
Packet end
Stream end
Stream end
Stream end
Stream end
Here, the events recorded in /home/user/lttng-traces/auto-20190805-105005 are calls to the malloc() function in the C library, and there are four streams (denoted by "Stream beginning" and "Stream end") because the trace was captured on a machine with four CPUs.
Now let's take a look at a source component.
import bt2

bt2.register_plugin(__name__, "demo-source")


class MyFirstSourceIter(bt2._UserMessageIterator):
    def __init__(self, config, output_port):
        self._event_class = output_port.user_data
        stream_class = self._event_class.stream_class
        trace_class = stream_class.trace_class
        trace = trace_class()
        self._stream = trace.create_stream(stream_class)
        self._state = 0

    def __next__(self):
        if self._state == 0:
            msg = self._create_stream_beginning_message(self._stream)
        elif self._state == 1:
            msg = self._create_event_message(
                self._event_class, self._stream, default_clock_snapshot=123
            )
        elif self._state == 2:
            msg = self._create_stream_end_message(self._stream)
        else:
            raise StopIteration

        self._state += 1
        return msg


@bt2.plugin_component_class
class MyFirstSource(bt2._UserSourceComponent, message_iterator_class=MyFirstSourceIter):
    def __init__(self, config, params, obj):
        tc = self._create_trace_class()
        cc = self._create_clock_class()
        sc = tc.create_stream_class(default_clock_class=cc)
        ec = sc.create_event_class(name="my-event")
        self._add_output_port("some-name", ec)
In the source's __init__() function, we need to create a port just like in the sink example, but this time it's an output port because the source component needs to produce messages. Additionally, we also need to create three other things: a trace class, a stream class, and an event class. The event classes we create define the type of events our source can send. If we want our events to have a clock snapshot (often known as a timestamp), we also need a clock class.
We need an iterator class which is responsible for sending messages to the downstream components that will iterate on our component's ports. Any iteration state needs to be encapsulated entirely within the iterator so that multiple iterators can be used simultaneously and independently without conflicts. The iterator's __next__ callback needs to return some useful messages.
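Here's a small plain-Python sketch of why the iteration state belongs in the iterator rather than in the component. ToySource and CountingIterator are hypothetical names, not bt2 classes; the point is only that each iterator keeps its own position, so two of them can walk the same source independently.

```python
class CountingIterator:
    """Each iterator owns its position; nothing is stored on the component."""

    def __init__(self, names):
        self._names = names  # shared, read-only data
        self._index = 0      # per-iterator state

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._names):
            raise StopIteration
        name = self._names[self._index]
        self._index += 1
        return name


class ToySource:
    """Holds only what every iterator can safely share."""

    def __init__(self):
        self._names = ("first", "second", "third")

    def create_iterator(self):
        return CountingIterator(self._names)


src = ToySource()
it_a = src.create_iterator()
it_b = src.create_iterator()

first_from_a = next(it_a)
second_from_a = next(it_a)
first_from_b = next(it_b)  # unaffected by how far it_a has advanced
print(first_from_a, second_from_a, first_from_b)
```

Had the index lived on ToySource instead, advancing it_a would have silently skipped messages for it_b.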
babeltrace2 --plugin-path=. -c source.demo-source.MyFirstSource
[01:00:00.000000123] (+?.?????????) my-event:
The above output shows the single event created by the component. You might see a different timestamp because the sink.text.pretty component class, which is implicitly used here, converts the timestamp to wall-clock time using your local timezone.
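To see where 01:00:00.000000123 can come from, here's a rough sketch of the conversion. It assumes that the clock snapshot value 123 counts nanoseconds since the Unix epoch (we take that to be the clock class default, but treat it as an assumption) and that the machine is one hour ahead of UTC. format_clock_snapshot() is a hypothetical helper, not how sink.text.pretty is implemented.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_clock_snapshot(ns_since_epoch: int, utc_offset_hours: int) -> str:
    """Format nanoseconds since the Unix epoch as HH:MM:SS.nnnnnnnnn."""
    seconds, nanoseconds = divmod(ns_since_epoch, 1_000_000_000)
    tz = timezone(timedelta(hours=utc_offset_hours))
    wall = (EPOCH + timedelta(seconds=seconds)).astimezone(tz)
    return "{}.{:09d}".format(wall.strftime("%H:%M:%S"), nanoseconds)


print(format_clock_snapshot(123, 1))   # a machine at UTC+1
print(format_clock_snapshot(123, -5))  # a machine at UTC-5
```

The same snapshot therefore prints with a different hour depending on where the command is run.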
If you want to take advantage of the flexibility of the component architecture, you can combine both source and sink components like this:
babeltrace2 --plugin-path=. -c source.demo-source.MyFirstSource -c sink.demo.MyFirstSink
Stream beginning
event my-event, timestamp 123
Stream end
Here you can see that the source example creates three messages that are printed by the sink component.
Advanced Steps
The examples we've just covered are all that you need to get up and running with writing your own plugin to perform analysis and transformation of trace data with Babeltrace. If you want to take things a step further, there are some ways to make your component classes more flexible and easier to use.
Passing parameters to your components gives you a way to configure them on the fly. The params parameter to your component's __init__ method allows you to pass things like paths to trace files.
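As a rough illustration, a component's __init__ method might hand params to a small validation helper like the one below. This is a sketch under assumptions: read_params() and the "path" and "limit" parameter names are hypothetical, and we assume params behaves like a read-only mapping whose values can be converted with str() and int(). A plain dict stands in for it here.

```python
from typing import Any, Mapping


def read_params(params: Mapping[str, Any]) -> dict:
    """Validate user parameters and fill in defaults."""
    if "path" not in params:
        raise ValueError("missing mandatory 'path' parameter")

    path = str(params["path"])
    # "limit" is optional; 0 means "no limit" in this sketch.
    limit = int(params["limit"]) if "limit" in params else 0
    if limit < 0:
        raise ValueError("'limit' must not be negative")

    return {"path": path, "limit": limit}


settings = read_params({"path": "/tmp/my-trace", "limit": 10})
print(settings)
```

Failing early with a clear message in __init__ means a misconfigured component is reported before the graph starts running.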
Babeltrace can automatically discover input files that can be consumed by your source component class. All you need to do is add support for the babeltrace2.support-info query object to your source component class. This object lets Babeltrace ask your source component class whether it recognizes a given file as a trace it can handle. The biggest benefit of adding support for this query is that it makes babeltrace2 <mytrace> work out of the box for your trace format.
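The decision at the heart of such a query can be quite small. The sketch below assumes that the query gives the component class an input string together with its type (string, file, or directory) and expects a confidence weight between 0 and 1 in return; check the babeltrace2 man pages for the exact contract. support_info_weight() and the .mytrace extension are hypothetical, and the wiring into the component class's query callback is left out.

```python
import os


def support_info_weight(input_type: str, input_value: str) -> float:
    """Return a confidence between 0.0 (not ours) and 1.0 (certainly ours)."""
    # This hypothetical format is always a single file.
    if input_type != "file":
        return 0.0

    # Recognize the trace by its (made-up) file extension.
    _, ext = os.path.splitext(input_value)
    if ext == ".mytrace":
        return 0.75

    return 0.0


print(support_info_weight("file", "/tmp/run.mytrace"))
print(support_info_weight("directory", "/tmp"))
```

A weight below 1 leaves room for another component class that is more certain about the same input, for example one that also checks the file's contents.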
And lastly, the new Babeltrace 2 error system provides a way to send user-friendly error messages when something goes wrong. In fact, this was one of the main goals behind the Babeltrace 2 redesign. The original Babeltrace 1 error messages were unclear which made it difficult for users to troubleshoot. When Babeltrace 2 has trouble parsing your Python plugin file, for example, you now get a descriptive error message that makes it much simpler to diagnose the root cause.
For example, here's the helpful output from Babeltrace 2 when a Python source component iterator's __next__ function raises an unexpected exception:
CAUSED BY [libbabeltrace2] (iterator.c:907)
Component input port message iterator's "next" method failed: iter-addr=0x55e475c7ba90, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=INFO, iter-upstream-comp-class-type=FILTER,
iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
CAUSED BY [libbabeltrace2] (iterator.c:907)
Component input port message iterator's "next" method failed: iter-addr=0x55e475c8aa00, iter-upstream-comp-name="source.demo.MyFirstSource", iter-upstream-comp-log-level=INFO,
iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="MyFirstSource", iter-upstream-comp-class-partial-descr="", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="some-name", status=ERROR
CAUSED BY [source.demo.MyFirstSource (some-name): 'source.demo.MyFirstSource'] (bt2/native_bt_log_and_append_error.h:102)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/bt2/message_iterator.py", line 201, in _bt_next_from_native
msg = next(self)
File "/home/user/babeltrace-2-plugins/bt_plugin_foo.py", line 25, in __next__
this_is_an_error()
NameError: name 'this_is_an_error' is not defined
We want to hear from you!
We'd love to answer any questions you might have. You can contact us on IRC and on the lttng-dev mailing list.