Writing babeltrace 2 plugins

The EfficiOS team on 21 July 2020

This is a write-up of Simon Marchi's talk, "Writing babeltrace 2 plugins", which he gave at the Tracing Summit 2019. Links to the video are here, and the slides can be found here.

Babeltrace 2 is a framework for viewing, converting, and analyzing trace data. The original Babeltrace project was started in 2010, but over the years developers noticed a number of shortcomings. For one, the Intermediate Representation (IR) was tightly coupled to the Common Trace Format (CTF), and even though the CTF is an expressive trace format used by popular tracing tools such as LTTng and barectf, the goal of Babeltrace was always to support as many formats as possible. The second shortcoming was that the original Babeltrace project didn't have any support for plugins which, again, made it difficult to support new input and output formats.

The Babeltrace 2 release is a complete redesign of the library, Python bindings, and CLI. The new design uses a graph architecture which consists of source, filter, and sink components. This modularity makes Babeltrace independent of any one trace format. You can still feed CTF trace data to Babeltrace, but the rework means you can now also write plugins that read other trace formats and, potentially, write that data out in a different format.

The Babeltrace project includes both a library – with bindings for C and Python – as well as a command-line tool for working with traces. The library, libbabeltrace2, is extremely flexible and allows you to build any kind of transformation or analysis you want. babeltrace2, the command-line tool, is built using libbabeltrace2.

Babeltrace can be extended by plugins, which provide one or more component classes. You can see exactly which plugins are available and the component classes they provide by running the babeltrace2 list-plugins command.

Core Concepts

Before you can start writing your own plugins for Babeltrace, you need to understand the core concepts. A graph is a collection of connected components, where each component can either receive data (source), transform data (filter), or write data out (sink). This source and sink design is the same concept used in PulseAudio and FFmpeg (where it is sometimes called a filter graph), and it's popular because it's highly customizable: by connecting different components in different configurations you can build a pipeline to do whatever you want.

Here's a diagram that illustrates this design.

Sources receive data from the outside world, for example by reading it from a trace file or getting it from LTTng live input. Once received, the data is sent to downstream components such as any number of optional filters. Filters are used to transform the input data format or perform some kind of analysis. For example, Babeltrace 2 ships with a filter (filter.utils.trimmer) to discard events with timestamps outside a given range.
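To make the filter idea concrete, here is a minimal pure-Python sketch of what a trimming filter does conceptually. It does not use the bt2 API: the Event tuple and the trim() generator are hypothetical names introduced only for illustration, and the real filter.utils.trimmer works on Babeltrace messages rather than on tuples.

```python
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical stand-in for an event message (not a bt2 type)."""

    name: str
    timestamp: int


def trim(upstream: Iterable[Event], begin: int, end: int) -> Iterator[Event]:
    """Yield only the events whose timestamp lies within [begin, end]."""
    for event in upstream:
        if begin <= event.timestamp <= end:
            yield event


events = [Event("a", 5), Event("b", 10), Event("c", 15), Event("d", 20)]
kept = list(trim(events, begin=10, end=15))
print([e.name for e in kept])  # prints ['b', 'c']
```

Because trim() is a generator, it only pulls an event from its upstream when something downstream asks for one, which is the same direction of control that Babeltrace uses.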

Once any filters have been applied, the data is sent further downstream to the sink components, which are responsible for sending that data out to the outside world. For example, if you're using Babeltrace to convert between two file formats, your sink component will write the converted data in the new file format. If you're creating a visualization from the input trace data, your sink component might pop up a window to display the graphical data.

Components communicate using messages. Different message types mark the different stages of an event stream, e.g. a stream beginning message, an event message, and a stream end message. Messages travel between components through connection points known as ports.

Somewhat counterintuitively, Babeltrace starts processing trace data at the sink, and gradually works its way upstream. Sinks request messages from their immediate upstream components, which in turn request messages from their upstream components. This goes on all the way until you hit the sources which receive trace data.

The process of using a graph can be summarized as:

1. The user adds components and connects them together.
2. The user starts the graph execution, and the sinks create iterators on their input ports.
3. The graph asks the sinks to consume from their iterators.
4. When all iterators and sinks have reached the end, the graph execution has completed successfully.
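The steps above can be pictured with a toy model in plain Python. This is only a sketch of the pull-based control flow, not the bt2 API: Source, Sink, and run_graph() are hypothetical names invented for illustration, and a real graph also deals with ports, filters, and error statuses.

```python
class Source:
    """Toy source: hands out a fixed list of messages, one per request."""

    def __init__(self, messages):
        self._messages = messages

    def create_iterator(self):
        # Each call returns a fresh, independent iterator.
        return iter(self._messages)


class Sink:
    """Toy sink: pulls one message per consume() call and records it."""

    def __init__(self, upstream):
        self._upstream = upstream
        self._it = None
        self.received = []

    def graph_is_configured(self):
        # The sink creates an iterator on its upstream component.
        self._it = self._upstream.create_iterator()

    def consume(self):
        # Consume one message; return False once the upstream has ended.
        try:
            self.received.append(next(self._it))
        except StopIteration:
            return False
        return True


def run_graph(sinks):
    """Ask each sink to consume until all of them have reached the end."""
    for sink in sinks:
        sink.graph_is_configured()
    active = list(sinks)
    while active:
        # A sink whose iterator is exhausted drops out of the loop.
        active = [sink for sink in active if sink.consume()]


# The user creates the components and connects them together.
source = Source(["stream beginning", "event", "stream end"])
sink = Sink(source)
run_graph([sink])
print(sink.received)  # prints ['stream beginning', 'event', 'stream end']
```

Note that the source never pushes anything: nothing happens until the sink asks for the next message.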

The babeltrace2-intro man page provides a more in-depth explanation of how Babeltrace works, but the concepts covered above are enough to start writing your first component class. You can create your own component classes with the C API or the Python bindings, either directly in your application (using libbabeltrace2) or as a plugin that you can distribute and load with any application that uses libbabeltrace2 (including the babeltrace2 command-line tool). For example, the ctf plugin contains three component classes: source.ctf.fs, source.ctf.lttng-live, and sink.ctf.fs. There's no correlation between the programming language of your application and the language you write your plugin in, meaning you can write your plugin in C and use it in a Python application, and vice versa.

Example using Python

Here's an example of a sink component class that reads each message passed to it and prints what kind of message it is. The Python plugin files need to be named bt_plugin_*.py to be considered by Babeltrace.

import bt2

bt2.register_plugin(__name__, "demo")

@bt2.plugin_component_class
class MyFirstSink(bt2._UserSinkComponent):
    def __init__(self, config, params, obj):
        self._port = self._add_input_port("some-name")

    def _user_graph_is_configured(self):
        self._it = self._create_message_iterator(self._port)

    def _user_consume(self):
        # Consume one message and print it.
        msg = next(self._it)

        if type(msg) is bt2._StreamBeginningMessageConst:
            print("Stream beginning")
        elif type(msg) is bt2._PacketBeginningMessageConst:
            print("Packet beginning")
        elif type(msg) is bt2._EventMessageConst:
            ts = msg.default_clock_snapshot.value
            name = msg.event.name
            print("event {}, timestamp {}".format(name, ts))
        elif type(msg) is bt2._PacketEndMessageConst:
            print("Packet end")
        elif type(msg) is bt2._StreamEndMessageConst:
            print("Stream end")
        else:
            raise RuntimeError("Unhandled message type", type(msg))

Your component needs at least one input port with an assigned name. This is usually done in the __init__() method, which is called when the component is added to the graph. This method also accepts parameters that let users configure the component's behavior. We'll mention the params parameter again in the Advanced Steps section.

_user_graph_is_configured() is a callback that's invoked when the user has connected all the components together and the graph is about to be executed. The example above creates an iterator on the component's input port which we'll use later to consume messages.

_user_consume() is a callback that's invoked when Babeltrace wants our component to consume some messages. We take the next message from the iterator created in the previous callback and do something useful with it. In this example, that's printing the type of message.

Using the example plugin is straightforward: point the --plugin-path option at the directory that contains your plugin file, and instantiate components using the -c command-line switch:

babeltrace2 --plugin-path=. -c sink.demo.MyFirstSink /home/user/lttng-traces/auto-20190805-105005

Stream beginning
Stream beginning
Stream beginning
Stream beginning
Packet beginning
Packet beginning
Packet beginning
Packet beginning
event libc_malloc, timestamp 1800983054553
event libc_malloc, timestamp 1800983061401
event libc_malloc, timestamp 1800983066193
event libc_malloc, timestamp 1800983070414
event libc_malloc, timestamp 1800988924162
event libc_malloc, timestamp 1800988933077
event libc_malloc, timestamp 1800988959826
event libc_malloc, timestamp 1800988964099
Packet end
Packet end
Packet end
Packet end
Stream end
Stream end
Stream end
Stream end

Here, the events recorded in /home/user/lttng-traces/auto-20190805-105005 are calls to the malloc() function in the C library, and there are four streams (denoted by "Stream beginning" and "Stream end") because the trace was captured on a machine with four CPUs.

Now let's take a look at a source component.

import bt2

bt2.register_plugin(__name__, "demo-source")


class MyFirstSourceIter(bt2._UserMessageIterator):
    def __init__(self, config, output_port):
        self._event_class = output_port.user_data
        stream_class = self._event_class.stream_class
        trace_class = stream_class.trace_class

        trace = trace_class()
        self._stream = trace.create_stream(stream_class)
        self._state = 0

    def __next__(self):
        if self._state == 0:
            msg = self._create_stream_beginning_message(self._stream)
        elif self._state == 1:
            msg = self._create_event_message(
                self._event_class, self._stream, default_clock_snapshot=123
            )
        elif self._state == 2:
            msg = self._create_stream_end_message(self._stream)
        else:
            raise StopIteration

        self._state += 1

        return msg


@bt2.plugin_component_class
class MyFirstSource(bt2._UserSourceComponent, message_iterator_class=MyFirstSourceIter):
    def __init__(self, config, params, obj):
        tc = self._create_trace_class()
        cc = self._create_clock_class()
        sc = tc.create_stream_class(default_clock_class=cc)
        ec = sc.create_event_class(name="my-event")

        self._add_output_port("some-name", ec)

In the source's __init__() function, we need to create a port just like in the sink example, but this time it's an output port because the source component needs to produce messages. Additionally, we also need to create three other things: a trace class, a stream class, and an event class. The event classes we create define the type of events our source can send. If we want our events to have a clock snapshot (often known as a timestamp), we also need a clock class.

We also need an iterator class, which is responsible for sending messages to the downstream components that iterate on our component's ports. Any iteration state needs to be encapsulated entirely within the iterator so that multiple iterators can be used simultaneously and independently without conflicts. The iterator's __next__ callback returns the next message each time it is called, and raises StopIteration once there is nothing left to send.
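Why the state has to live in the iterator is easiest to see with plain Python objects. The sketch below is not bt2 code, and CountdownSource and CountdownIterator are hypothetical names. The component object holds only configuration, while each iterator tracks its own position.

```python
class CountdownSource:
    """Toy component: holds only configuration, never iteration state."""

    def __init__(self, start):
        self.start = start

    def create_iterator(self):
        return CountdownIterator(self)


class CountdownIterator:
    """Each iterator keeps its own position, so iterators never interfere."""

    def __init__(self, source):
        self._remaining = source.start

    def __iter__(self):
        return self

    def __next__(self):
        if self._remaining == 0:
            raise StopIteration
        value = self._remaining
        self._remaining -= 1
        return value


source = CountdownSource(3)
first = source.create_iterator()
second = source.create_iterator()

# Advancing one iterator leaves the other untouched.
assert next(first) == 3
assert next(first) == 2
assert next(second) == 3
print(list(first), list(second))  # prints [1] [2, 1]
```

Had the counter been stored on CountdownSource instead, the two iterators would have shared it and each would have seen only part of the sequence.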

babeltrace2 --plugin-path=. -c source.demo-source.MyFirstSource

[01:00:00.000000123] (+?.?????????) my-event:

The above output shows the single event created by the component. You might see a different timestamp because the sink.text.pretty component class, which is implicitly used here, converts the timestamp to wall-clock time using your local time zone.

If you want to take advantage of the flexibility of the component architecture, you can combine both source and sink components like this:

babeltrace2 --plugin-path=. -c source.demo-source.MyFirstSource -c sink.demo.MyFirstSink

Stream beginning
event my-event, timestamp 123
Stream end

Here you can see that the source example creates three messages that are printed by the sink component.

Advanced Steps

The examples we've just covered are all that you need to get up and running with writing your own plugin to perform analysis and transformation of trace data with Babeltrace. If you want to take things a step further, there are some ways to make your component classes more flexible and easier to use.

Passing parameters to your component gives you a way to configure it on the fly. The params parameter of your component's __init__ method allows you to pass things like paths to trace files.
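As a rough sketch of how this could look, the sink below reads two parameters with defaults. The parameter names "prefix" and "limit", the plugin name "demo-params", and the read_settings() helper are all assumptions made up for this example. The helper is kept separate and works on any mapping, so it can be tried without Babeltrace installed, which is also why the import is guarded.

```python
try:
    import bt2  # Needs the Babeltrace 2 Python bindings.
except ImportError:
    bt2 = None


def read_settings(params):
    """Read the hypothetical "prefix" and "limit" parameters, with defaults."""
    if params is None:
        params = {}
    prefix = str(params["prefix"]) if "prefix" in params else "event"
    limit = int(params["limit"]) if "limit" in params else 10
    return prefix, limit


if bt2 is not None:
    bt2.register_plugin(__name__, "demo-params")

    @bt2.plugin_component_class
    class ParamSink(bt2._UserSinkComponent):
        def __init__(self, config, params, obj):
            # params is a map-like object holding the user's parameters.
            self._prefix, self._limit = read_settings(params)
            self._printed = 0
            self._port = self._add_input_port("in")

        def _user_graph_is_configured(self):
            self._it = self._create_message_iterator(self._port)

        def _user_consume(self):
            msg = next(self._it)
            # Print at most `limit` event names, each with the chosen prefix.
            if type(msg) is bt2._EventMessageConst and self._printed < self._limit:
                print("{}: {}".format(self._prefix, msg.event.name))
                self._printed += 1


print(read_settings({"prefix": "malloc", "limit": 2}))  # prints ('malloc', 2)
```

On the command line, parameters for a component are normally given with the --params option placed after the corresponding -c option; check the babeltrace2 man pages for the exact syntax.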

Babeltrace can automatically discover input files that can be consumed by your source component class. All you need to do is add support for the babeltrace2.support-info query object to your source component class. This object lets Babeltrace ask your source component class whether it recognizes a given file as a trace it can handle. The biggest benefit of adding support for this query is that it makes babeltrace2 <mytrace> work out of the box for your trace format.
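Here is a hedged sketch of what answering that query might look like in Python. The ".mytrace" extension, the plugin name "demo-discovery", and the support_weight() helper are hypothetical; the query parameters "type" and "input" and the "weight" result follow the babeltrace2.support-info description as we understand it, so check the corresponding man page before relying on the details. The iterator is a stub that produces no messages.

```python
import os

try:
    import bt2  # Needs the Babeltrace 2 Python bindings.
except ImportError:
    bt2 = None


def support_weight(input_type, input_value):
    """Return a weight in [0, 1]: how sure we are that we can read the input.

    The ".mytrace" extension is a made-up format marker for this sketch.
    """
    if input_type == "file" and os.path.splitext(input_value)[1] == ".mytrace":
        return 1.0
    return 0.0


if bt2 is not None:
    bt2.register_plugin(__name__, "demo-discovery")

    class MyTraceIter(bt2._UserMessageIterator):
        def __next__(self):
            # A real iterator would read the file and create messages here.
            raise StopIteration

    @bt2.plugin_component_class
    class MyTraceSource(bt2._UserSourceComponent, message_iterator_class=MyTraceIter):
        def __init__(self, config, params, obj):
            self._add_output_port("out")

        @staticmethod
        def _user_query(priv_query_exec, obj, params, method_obj):
            if obj == "babeltrace2.support-info":
                weight = support_weight(str(params["type"]), str(params["input"]))
                return {"weight": weight}
            # Any other query object is unknown to this component class.
            raise bt2.UnknownObject


print(support_weight("file", "/tmp/run.mytrace"))  # prints 1.0
```

A weight of 0 tells Babeltrace that the component class does not recognize the input, while higher weights express more confidence. A real implementation would usually inspect the file's contents rather than trust its extension.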

Lastly, the new Babeltrace 2 error system provides a way to send user-friendly error messages when something goes wrong. In fact, this was one of the main goals behind the Babeltrace 2 redesign. The original Babeltrace 1 error messages were unclear, which made it difficult for users to troubleshoot problems. When Babeltrace 2 has trouble parsing your Python plugin file, for example, you now get a descriptive error message that makes it much simpler to diagnose the root cause.

For example, here's the helpful output from Babeltrace 2 when a Python source component iterator's __next__ function raises an unexpected exception:

CAUSED BY [libbabeltrace2] (iterator.c:907)
  Component input port message iterator's "next" method failed: iter-addr=0x55e475c7ba90, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=INFO, iter-upstream-comp-class-type=FILTER,
  iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
CAUSED BY [libbabeltrace2] (iterator.c:907)
  Component input port message iterator's "next" method failed: iter-addr=0x55e475c8aa00, iter-upstream-comp-name="source.demo.MyFirstSource", iter-upstream-comp-log-level=INFO,
  iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="MyFirstSource", iter-upstream-comp-class-partial-descr="", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="some-name", status=ERROR
CAUSED BY [source.demo.MyFirstSource (some-name): 'source.demo.MyFirstSource'] (bt2/native_bt_log_and_append_error.h:102)
  Traceback (most recent call last):
    File "/usr/local/lib/python3.6/dist-packages/bt2/message_iterator.py", line 201, in _bt_next_from_native
      msg = next(self)
    File "/home/user/babeltrace-2-plugins/bt_plugin_foo.py", line 25, in __next__
      this_is_an_error()
  NameError: name 'this_is_an_error' is not defined

We want to hear from you!

We'd love to answer any questions you might have. You can contact us on IRC and on the lttng-dev mailing list.
