QNX Momentics
From TracingWiki
QNX Software Systems
Company
QNX Software Systems offers several products aimed primarily at the embedded systems market. Its proprietary operating system, QNX Neutrino, is a real-time operating system. As of 12 September 2007, QNX Neutrino is available under a free license for non-commercial use, together with its source code. The company also provides a development suite called QNX Momentics, which is described briefly later in this document. Other products include middleware for multimedia applications and acoustic processing, as well as 2D and 3D graphics rendering.
Kernel Tracing
The instrumented QNX microkernel is equipped with an event-gathering module and runs at 98% of the speed of the regular microkernel. When tracing is active, system activity is intercepted and recorded as time-stamped and CPU-stamped events, which are stored in a circular linked list of small buffers. This allows the data capture program to write filled buffers to disk while the remaining buffers are still available to log more events. The kernel activities that generate events are system calls, scheduling activities, interrupt handling, thread and process creation and destruction, and state changes.
The instrumented kernel cannot flush a buffer or switch to another one from within an interrupt. To work around this, it requests a buffer flush as soon as the buffer becomes 70% full. Each buffer holds 1024 event slots of 16 bytes each, and most interrupt routines need fewer than 300 slots, which is roughly the 30% of the buffer that is still free when the flush is requested.
Most events (simple events) fit in a single event buffer slot, but this is not always the case. Events that hold more information (combine events) consume two or more slots. In fast mode, only one slot is allocated per event, so combine events are recorded incompletely. The traceevent_t structure is 16 bytes long, and a combine event is represented by several traceevent_t elements. The structure includes a 2-bit flag indicating whether the event is a simple event or part of a combine event.
Data is written in raw binary format to a device or file for offline processing. The libtraceparser library provides a set of API functions that let the user register callback functions for each event; the callbacks are invoked once complete buffer slots of event data have been read and assembled from the binary event stream. The libtraceparser API transparently assembles combine events and sorts the events by timestamp.
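The buffer figures above can be checked with a small model. This is only an illustrative sketch, not the kernel's data structure: the class name `TraceBuffer`, its methods, and the choice to round the 70% mark up to a whole slot are assumptions made for this example, while the slot count, slot size and 70% threshold come from the text.

```python
import math

SLOTS_PER_BUFFER = 1024      # event slots per buffer (from the text)
SLOT_SIZE_BYTES = 16         # size of one traceevent_t (from the text)
HIGH_WATER_FRACTION = 0.70   # a flush is requested at 70% full (from the text)

# Assumption: the threshold is rounded up to a whole number of slots.
HIGH_WATER_SLOT = math.ceil(HIGH_WATER_FRACTION * SLOTS_PER_BUFFER)


class TraceBuffer:
    """Toy model of one buffer in the ring (hypothetical, for illustration)."""

    def __init__(self):
        self.used = 0
        self.flush_requested = False

    def log(self, slots=1):
        """Consume `slots` slots; a combine event needs two or more."""
        if self.used + slots > SLOTS_PER_BUFFER:
            raise OverflowError("buffer filled up before it could be flushed")
        self.used += slots
        if self.used >= HIGH_WATER_SLOT:
            self.flush_requested = True
        return self.flush_requested


buffer_bytes = SLOTS_PER_BUFFER * SLOT_SIZE_BYTES
headroom = SLOTS_PER_BUFFER - HIGH_WATER_SLOT

print(f"buffer size: {buffer_bytes} bytes")
print(f"flush requested at slot {HIGH_WATER_SLOT}, leaving {headroom} free slots")
```

Under these assumptions, the slots left after the flush request slightly exceed the 300 slots that most interrupt routines are said to need, which is the margin the 70% threshold is meant to provide.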
The timestamp uses only the 32 least significant bits of the 64-bit clock to reduce the amount of data in the trace. When this portion rolls over, a control event is issued. A provided tool called traceprinter uses this library to output all of the trace events in a human-readable format, ordered linearly by timestamp.
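The rollover handling can be illustrated with a short sketch that rebuilds full-width timestamps from 32-bit values. The event representation used here (tuples of a kind and a 32-bit value, with a `"ROLLOVER"` marker standing in for the control event) is an assumption made for illustration; the layout of the real control event is not described in this document and is not modeled.

```python
MASK32 = 0xFFFFFFFF


def extend_timestamps(events):
    """Yield (kind, 64-bit timestamp) for each non-control event.

    `events` is an iterable of (kind, lsb32) tuples in stream order.
    A tuple whose kind is "ROLLOVER" is a hypothetical stand-in for the
    control event issued when the low 32 bits wrap around.
    """
    high = 0  # number of rollovers seen so far, used as the upper 32 bits
    for kind, lsb32 in events:
        if kind == "ROLLOVER":
            high += 1
            continue
        yield kind, (high << 32) | (lsb32 & MASK32)


stream = [
    ("THREAD_READY", 0xFFFFFFF0),
    ("ROLLOVER", 0),
    ("THREAD_RUNNING", 0x00000010),
]
for kind, timestamp in extend_timestamps(stream):
    print(f"{kind:15s} {timestamp:#018x}")
```

Without the rollover marker, the second event would appear to precede the first; with it, the reconstructed timestamps stay monotonic.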
Measurements
The aim of this section is to determine the cost in time per traced event. The benchmark consists of compiling the gcc compiler on a 3.0 GHz Pentium 4 processor with 2 GB of RAM. Only kernel events were traced, to avoid losing events when writing to the trace file. The compilation generated around 13 million events on average and took around 24 additional seconds to complete. This gives an approximate cost of 1.85 × 10⁻⁶ seconds per event.
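The per-event cost follows directly from the two measured figures. The snippet below simply repeats that division; the conversion to clock cycles is an added back-of-the-envelope estimate that assumes the processor ran at its nominal 3.0 GHz for the whole benchmark.

```python
OVERHEAD_SECONDS = 24.0     # extra run time with tracing enabled (from the text)
EVENT_COUNT = 13_000_000    # events generated on average (from the text)
CLOCK_HZ = 3.0e9            # nominal Pentium 4 clock (assumed constant)

cost_per_event = OVERHEAD_SECONDS / EVENT_COUNT
cycles_per_event = cost_per_event * CLOCK_HZ

print(f"{cost_per_event:.2e} s per event")
print(f"about {cycles_per_event:.0f} cycles per event")
```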
QNX Momentics Development Suite
QNX Momentics Development Suite is an Eclipse-based IDE for C, C++ and embedded C++ development. It supports three host platforms: Linux, QNX Neutrino and Windows. It has built-in support for the CVS source-control protocol, with both remote pserver and secure SSH repository access. It also provides many tools for trace analysis and application debugging.
Trace Analysis
The System Profiler Perspective, in the QNX Momentics IDE, groups different views for trace analysis. Some of these views are described below:
- The Summary view shows the time the system spent idle, processing interrupts, executing kernel code and executing user code.
- The CPU Activity view shows the total amount of CPU time a thread or a process took over the whole trace file or during a selected interval.
- The Timeline view shows the succession of events for every process, with the different thread states shown in different colors.
- The CPU Migration pane displays a chart of the number of CPU scheduling migrations over time. The count is incremented every time a thread migrates from one CPU to another. Another chart shows the number of messages sent between a client and a server running on two different processors.
- The Bookmarks view can be used to bookmark events of particular interest.
- In the Client/Server CPU Statistics view, the thread running time is divided into self time and imposed time. For instance, if a client X requests a service from server Y, the time Y consumes serving that request is counted as time imposed by X. This makes it easier to find which clients are the most demanding and are causing a bottleneck.
- The Overview view has two charts. The first displays the CPU usage over the period of the trace file, and the second shows the event distribution over time.
- The Trace Event Log view shows all the fields of an event, including its number, timestamp, class and type.
- The Why Running tool lets the user trace back through the relevant events leading up to the execution of a selected thread.
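The split between self time and imposed time can be sketched as a simple accumulation. This is not how the IDE computes its statistics; the input format (tuples naming the running thread, the client it is serving if any, and a duration in microseconds) and the function name `split_running_time` are assumptions made for this example.

```python
from collections import defaultdict


def split_running_time(intervals):
    """Return (self_time, imposed_by) dictionaries, in microseconds.

    `intervals` is an iterable of (thread, on_behalf_of, duration_us).
    `on_behalf_of` is None when the thread runs for its own purposes,
    otherwise it names the client whose request is being served.
    """
    self_time = defaultdict(int)
    imposed_by = defaultdict(int)
    for thread, on_behalf_of, duration_us in intervals:
        if on_behalf_of is None:
            self_time[thread] += duration_us
        else:
            # Time the server spends here is charged to the requesting client.
            imposed_by[on_behalf_of] += duration_us
    return dict(self_time), dict(imposed_by)


intervals = [
    ("client_X", None, 200),
    ("server_Y", "client_X", 900),
    ("client_Z", None, 500),
    ("server_Y", "client_Z", 100),
    ("server_Y", None, 50),
]
self_time, imposed_by = split_running_time(intervals)
print("self time:", self_time)
print("imposed time:", imposed_by)
```

In this made-up trace, client_X runs for less time than client_Z itself but imposes far more work on the server, which is the kind of pattern the view is meant to expose.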
Application Analysis Tools
- Memory analysis tools
- Application profiler
- Code coverage
Memory Analysis tools
When a program is launched with the Memory Analysis tools, it uses the debug version of the malloc library (libmalloc_g.so). This library tracks the history of every allocation and deallocation the program performs and provides cover functions that validate the parameters of the corresponding functions before using them. This makes it possible to detect memory leaks and memory errors such as overruns, underruns and freeing the same memory twice. Whenever a memory error is encountered at runtime, the developer can either request a process core dump file or switch to the debugger view to identify the erroneous lines of code. Offline analysis is also possible. Once the information is available, a graph showing the requested, allocated and freed memory can be displayed in the Memory Analysis perspective. A list of memory errors is also provided, showing the type, the pointer, the timestamp, the pid, the tid and the memory operation requested.
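The idea of tracking every allocation and deallocation can be illustrated with a toy tracker. This is a conceptual sketch only and does not reflect how libmalloc_g.so is implemented; the class `AllocationTracker` and its method names are hypothetical, and pointers are represented as plain integers.

```python
class AllocationTracker:
    """Toy allocation history used to spot leaks and double frees."""

    def __init__(self):
        self.live = {}       # pointer -> size of blocks currently allocated
        self.freed = set()   # pointers that were freed and not reused since
        self.errors = []     # (error type, pointer) tuples, in order

    def on_malloc(self, ptr, size):
        self.freed.discard(ptr)  # the allocator may hand an address out again
        self.live[ptr] = size

    def on_free(self, ptr):
        if ptr in self.live:
            del self.live[ptr]
            self.freed.add(ptr)
        elif ptr in self.freed:
            self.errors.append(("double free", ptr))
        else:
            self.errors.append(("free of unknown pointer", ptr))

    def leaks(self):
        """Blocks still allocated; at program exit these are leaks."""
        return dict(self.live)


tracker = AllocationTracker()
tracker.on_malloc(0x1000, 64)
tracker.on_malloc(0x2000, 128)
tracker.on_free(0x1000)
tracker.on_free(0x1000)   # second free of the same block
print("errors:", tracker.errors)
print("leaks:", tracker.leaks())
```

Detecting overruns and underruns needs additional bookkeeping around each block, which this sketch leaves out.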
Application profiler
The main goal of the Application Profiler is to identify the most heavily executed sections of code, which are the best candidates for optimization. It provides two types of profiling: statistical and instrumented. In statistical profiling, the tool samples the running code every millisecond and records the address being executed. For this type of profiling to be accurate, the program must run for a sufficiently long time. The advantage of this method is that no instrumentation or recompilation is required, and the profiler can be attached dynamically to a running process on the target system. In instrumented profiling, the compiler inserts snippets of code into the functions to report the addresses of the called and calling functions. If the program is to be run on an embedded system, the profiler update interval can be set accordingly. For instance, a short interval results in continuous, but low, network traffic, whereas a long interval may force the embedded system to buffer a relatively large amount of profiling data.
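The post-processing side of statistical profiling, turning sampled addresses into per-function hit counts, can be sketched as follows. The symbol table, the sample addresses and the function name `attribute_samples` are made up for illustration, and the sketch assumes each function spans from its start address up to the start of the next one.

```python
import bisect
from collections import Counter


def attribute_samples(samples, symbols):
    """Count how many sampled addresses fall inside each function.

    `symbols` is a list of (start_address, name) pairs; `samples` is an
    iterable of addresses recorded at each sampling tick.
    """
    symbols = sorted(symbols)
    starts = [start for start, _ in symbols]
    hits = Counter()
    for address in samples:
        # Index of the last function whose start is <= address.
        index = bisect.bisect_right(starts, address) - 1
        name = symbols[index][1] if index >= 0 else "<unknown>"
        hits[name] += 1
    return hits


symbols = [(0x1000, "main"), (0x1400, "parse"), (0x2000, "emit")]
samples = [0x1004, 0x1450, 0x1460, 0x2000, 0x0FFF]
for name, count in attribute_samples(samples, symbols).most_common():
    print(f"{name:10s} {count}")
```

With only a handful of samples the counts are mostly noise, which is why the program has to run long enough for the distribution to become meaningful.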
Code coverage
The code coverage tool in the Momentics IDE is a visual front end to the gcov metrics generated by the gcc compiler. A list of coverage information is generated showing lines fully covered, not covered and partially covered. The coverage tool also highlights the lines of code that were not executed during the testing phase. Branch coverage is not yet provided by the IDE although this metric is already produced by gcc. Finally, a report can be generated and saved for later analysis or comparison with other reports. For each function, the report contains the number of lines that were not covered, that were partially covered and that were fully covered.
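A per-file summary of the kind described above can be approximated from gcov's annotated text output. This sketch is not the IDE's implementation. It assumes the usual `count:line_number:source` layout of `.gcov` files, where `-` marks non-executable lines and `#####` or `=====` mark lines that were never executed; treating a count with a trailing `*` as partially covered is an assumption that applies only to gcov versions that emit that marker.

```python
def summarize_gcov(lines):
    """Count covered, partially covered and uncovered executable lines."""
    summary = {"covered": 0, "partial": 0, "not_covered": 0}
    for line in lines:
        parts = line.split(":", 2)
        if len(parts) < 3:
            continue  # not an annotated source line (e.g. branch details)
        count = parts[0].strip()
        if count == "-":
            continue  # non-executable line (comment, blank, declaration)
        if count in ("#####", "====="):
            summary["not_covered"] += 1
        elif count.endswith("*"):
            summary["partial"] += 1  # assumed: some block on the line never ran
        elif count.isdigit():
            summary["covered"] += 1
    return summary


sample = [
    "        -:    0:Source:demo.c",
    "        3:    5:    total += i;",
    "       1*:    7:    if (a && b) return 1;",
    "    #####:    9:    abort();",
]
print(summarize_gcov(sample))
```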
