Finding the Root Cause of a Web Request Latency

When trying to solve complex performance issues, a kernel trace can be the last resort: capture everything and then try to understand what is going on. With LTTng, that goes something like this:

lttng create
lttng enable-event -k -a
lttng start
...wait for the problem to appear...
lttng stop
lttng destroy
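
If you record several captures, you can also name the session and choose where the trace is written. For example, the first command could become something like this (the session name and output path below are arbitrary examples); the rest of the commands stay the same:

lttng create web-latency --output=/home/julien/lttng-traces/web-latency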

Once this is done, depending on how long the capture is, there are probably a lot of events in the resulting trace (around 50,000 events per second on my mostly idle laptop). Now it is time to make sense of it!

This post is the first in a series presenting the LTTng analyses scripts. In this one, we will try to solve an unusual I/O latency issue.

Our test case is a PHP image board (tinyib-mappy) and we want to understand why some client requests are significantly slower than others. In this example, we manually call sync to force the kernel to flush the dirty buffers to disk, which keeps the example simple, but the methodology is the same with more complex use-cases.

On the server, we are recording a kernel LTTng session as explained earlier. On the client, we test and measure with this command:

$ while true; do time wget -q -p http://10.2.0.1/; sleep 0.5; done
...
real	0m0.017s
real	0m0.016s
real	0m0.454s
real	0m0.016s
real	0m0.015s
...

The output here shows that one request took much longer than the usual delay (around 430ms more). Now we will try to understand the source of this latency.

I/O usage

The resulting trace covers around 16 seconds. Our first step, to get an idea of the I/O activity in the trace, is to look at the output of the lttng-iousagetop script:

$ ./lttng-iousagetop analysis-20150115-120942/
Timerange: [2015-01-15 12:18:37.216484041, 2015-01-15 12:18:53.821580313]
Per-process I/O Read
###############################################################################
██████████████████████████████████████████████████    528.0 KB lttng-consumerd (19964)        0 B  file      0 B  net  528.0 KB unknown
███████                                                80.0 KB lttng-consumerd (19964)        0 B  file      0 B  net   80.0 KB unknown
███                                                   39.45 KB apache2 (31544)             33.7 KB file   5.33 KB net  424.0 B  unknown
██                                                    29.58 KB apache2 (31591)            25.28 KB file    4.0 KB net  318.0 B  unknown
██                                                    29.58 KB apache2 (31592)            25.28 KB file    4.0 KB net  318.0 B  unknown
██                                                    29.58 KB apache2 (31545)            25.28 KB file    4.0 KB net  318.0 B  unknown
██                                                    29.58 KB apache2 (31517)            25.28 KB file    4.0 KB net  318.0 B  unknown
██                                                    29.58 KB apache2 (31520)            25.28 KB file    4.0 KB net  318.0 B  unknown
██                                                    29.58 KB apache2 (31584)            25.28 KB file    4.0 KB net  318.0 B  unknown
██                                                    29.58 KB apache2 (31589)            25.28 KB file    4.0 KB net  318.0 B  unknown
Per-process I/O Write
###############################################################################
██████████████████████████████████████████████████     2.24 MB find (22785)                   0 B  file      0 B  net   2.24 MB unknown
███████████                                           528.0 KB lttng-consumerd (19964)        0 B  file      0 B  net  528.0 KB unknown
████                                                 225.24 KB apache2 (31544)                0 B  file  221.3 KB net   3.94 KB unknown
███                                                  168.93 KB apache2 (31591)                0 B  file 165.98 KB net   2.95 KB unknown
███                                                  168.93 KB apache2 (31592)                0 B  file 165.98 KB net   2.95 KB unknown
███                                                  168.93 KB apache2 (31545)                0 B  file 165.98 KB net   2.95 KB unknown
███                                                  168.93 KB apache2 (31517)                0 B  file 165.98 KB net   2.95 KB unknown
███                                                  168.93 KB apache2 (31520)                0 B  file 165.98 KB net   2.95 KB unknown
███                                                  168.93 KB apache2 (31584)                0 B  file 165.98 KB net   2.95 KB unknown
███                                                  168.93 KB apache2 (31589)                0 B  file 165.98 KB net   2.95 KB unknown

In this first section, we see the top 10 processes performing I/O system calls. From the information in the trace, we try to identify whether the requests are disk or network related, but sometimes the trace does not have all the information we need (especially with FD passing between processes). Of course LTTng itself is doing some I/O to store the trace, but besides that, we see the apache2 threads receiving our HTTP GET requests, reading some files on disk and replying on the network. We also see a find process that was launched on the system during the capture of the trace, but so far, nothing alarming.

Note: you can see that the apache2 processes read less than they write. That is not because Apache magically creates new data; it is a current limitation of the syscall instrumentation. When Apache sends a complete file, it opens it, mmaps it and passes the address of the memory mapping in a vector to writev, but the current instrumentation does not show us the content of these vectors, so we cannot assume the data relates to the last opened file. This can be fixed in the instrumentation and should be soon.

Next, we see the same information, but grouped by file, so we can see the files shared by multiple processes (and the respective FD number in each process):

Files Read
###############################################################################
██████████████████████████████████████████████████    304.0 KB anon_inode:[lttng_stream] (lttng-consumerd) 'fd 28 in lttng-consumerd (19964)'
██████████████████████████████████████████           259.87 KB /var/www/html/tinyib.db 'fd 14 in apache2 (31591)', 'fd 14 in apache2 (31592)', 'fd 14 in apache2 (31545)', 'fd 14 in apache2 (31517)', 'fd 14 in apache2 (31520)', 'fd 14 in apache2 (31544)', 'fd 14 in apache2 (31584)', 'fd 14 in apache2 (31589)', 'fd 14 in apache2 (31588)', 'fd 14 in apache2 (31590)'
██                                                    12.85 KB socket (lttng-sessiond) 'fd 44 in lttng-sessiond (19839)'
                                                        6.0 KB /proc/interrupts 'fd 3 in irqbalance (1337)'
                                                       4.88 KB /lib/x86_64-linux-gnu/libc.so.6 'fd 3 in find (22785)', 'fd 3 in bash (22784)', 'fd 3 in bash (22790)', 'fd 3 in sync (22791)', 'fd 3 in bash (22796)', 'fd 3 in lttng (22797)'
                                                       4.82 KB socket:[168432] (ssh) 'fd 3 in ssh (10368)'
                                                       4.45 KB /proc/stat 'fd 3 in irqbalance (1337)'
                                                       3.25 KB /lib/x86_64-linux-gnu/libz.so.1 'fd 3 in bash (22784)', 'fd 3 in bash (22790)', 'fd 3 in bash (22796)', 'fd 3 in lttng (22797)'
                                                       3.25 KB /lib/x86_64-linux-gnu/libpthread.so.0 'fd 3 in bash (22784)', 'fd 3 in bash (22790)', 'fd 3 in bash (22796)', 'fd 3 in lttng (22797)'
                                                       2.51 KB /usr/share/locale/locale.alias 'fd 3 in find (22785)'
Files Write
###############################################################################
██████████████████████████████████████████████████     2.24 MB /dev/null 'fd 1 in find (22785)', 'fd 2 in bash (22784)', 'fd 2 in bash (22790)', 'fd 2 in bash (22796)'
██████                                                304.0 KB /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata 'fd 32 in lttng-consumerd (19964)'
██████                                                304.0 KB pipe:[180336] (lttng-consumerd) 'fd 18 in lttng-consumerd (19964)'
█                                                     55.33 KB unknown (origin not found) (apache2) 'fd 12 in apache2 (31590)'
                                                      30.52 KB /var/log/apache2/access.log 'fd 8 in apache2 (31591)', 'fd 8 in apache2 (31592)', 'fd 8 in apache2 (31545)', 'fd 8 in apache2 (31517)', 'fd 8 in apache2 (31520)', 'fd 8 in apache2 (31544)', 'fd 8 in apache2 (31584)', 'fd 8 in apache2 (31589)', 'fd 8 in apache2 (31588)', 'fd 8 in apache2 (31590)'
                                                      12.85 KB socket (lttng) 'fd 3 in lttng (22797)'
                                                       2.54 KB socket:[164921] (sshd) 'fd 3 in sshd (10339)'
                                                       1.69 KB socket:[164888] (sshd) 'fd 3 in sshd (10236)'
                                                       1.39 KB /dev/pts/4 'fd 5 in ssh (10368)'
                                                      604.0 B  /dev/pts/1 'fd 0 in bash (10237)', 'fd 2 in bash (10237)', 'fd 1 in lttng (22752)'

Once again, we see some LTTng-related I/O, our SQLite database accessed by Apache, and the Apache log files; we also now understand where find is writing (find / > /dev/null).

Next, we see the actual block I/O performed, that is, how much data really touched the disk during the trace period (not cached or buffered):

Block I/O Read
###############################################################################
██████████████████████████████████████████████████    38.47 MB find (pid=22785)
                                                      128.0 KB  (pid=-1)
                                                       28.0 KB sync (pid=22791)
Block I/O Write
###############################################################################
██████████████████████████████████████████████████     7.57 MB jbd2/sda1-8 (pid=245)
█████████████████████████████████████████              6.24 MB  (pid=-1)
███                                                   572.0 KB kworker/u32:1 (pid=21668)
█                                                     264.0 KB lttng-consumerd (pid=19964)
                                                       56.0 KB sync (pid=22791)
                                                       40.0 KB lttng-consumerd (pid=19964)

Here, we are starting to see some interesting things: even though we didn't do much I/O as far as the system calls are concerned, some kernel threads wrote around 14 MB of data directly to disk. So there was definitely some I/O activity, but we don't know which process started it.

Finally, the same script gives us the breakdown of network usage per network interface:

Network recv_bytes
###############################################################################
████████████████████████████████████████████████████████  103.9 KB eth1
███                                                       6.65 KB eth0
Network sent_bytes
###############################################################################
████████████████████████████████████████████████████████  1.72 MB eth1
                                                          8.03 KB eth0

OK, so in this phase we saw the overall I/O usage in the trace: no obvious process was saturating our resources, but we know actual data was written to disk. Now let's have a look at the I/O latencies.

I/O latency

The I/O latency is the delay an I/O system call takes to complete (the time elapsed between the syscall entry and exit events), or the delay for a block request to complete (between block_rq_issue and block_rq_complete). My bash process is always waiting to read one character from my keyboard, so in all the examples I will filter out the syscalls that transfer less than 2 bytes with the --minsize 2 parameter.

Latency statistics

The first step is to get some statistics:

$ ./lttng-iolatencystats analysis-20150115-120942/ --minsize 2
Syscalls latency statistics (usec):
Type      Count       Min        Average            Max          Stdev
----------------------------------------------------------------------
Open      14724     0.250         32.173     437697.753       3607.107
Read       1034     0.360          2.383        482.860         15.337
Write      1036     0.298          7.778       2667.346         82.972
Sync        153    13.393       6601.007     994745.622      80413.186

Disk latency statistics (usec):
Name                    Count            Min        Average            Max          Stdev
-----------------------------------------------------------------------------------------
sda1                    10211          0.002          0.507         24.260       1485.700

Now we are getting somewhere! If we look at the Max column, Open and Sync show particularly huge numbers compared to the other statistics.

Latency frequency distribution

To illustrate this, let's look at the frequency distribution of just the open operations, which include most of the syscalls that return a new file descriptor (open, openat, connect, accept, etc.).

$ ./lttng-iolatencyfreq prototyping/analysis-20150115-120942 --minsize 2
Open latency distribution (usec)
###############################################################################
     0.250 ████████████████████████████████████████████████████████████  14723
 21885.125                                                                   0
 43770.000                                                                   0
 65654.875                                                                   0
 87539.751                                                                   0
109424.626                                                                   0
131309.501                                                                   0
153194.376                                                                   0
175079.251                                                                   0
196964.126                                                                   0
218849.002                                                                   0
240733.877                                                                   0
262618.752                                                                   0
284503.627                                                                   0
306388.502                                                                   0
328273.377                                                                   0
350158.252                                                                   0
372043.128                                                                   0
393928.003                                                                   0
415812.878                                                                   1

As we can see, most of the requests (14723) have a latency between 0.250 and 21885.125 usec; only one is way above. If we want to "zoom in" on this graph, we can just use the --min and --max parameters to filter the data, as shown below.
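
For example, to focus on the bulk of the distribution, something along these lines should work; this is only a sketch, assuming --min and --max take latency bounds in usec like the values displayed above:

$ ./lttng-iolatencyfreq analysis-20150115-120942/ --minsize 2 --max 1000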

Top latencies

Now that we know we have some outliers in the Open and Sync operations, let's look at the worst I/O latencies found in this trace (limited here to the top 4 per category):

$ ./lttng-iolatencytop analysis-20150115-120942/ --minsize 2 --limit 4
Top open syscall latencies (usec)
Begin               End                  Name             Duration (usec)         Size  Proc                     PID      Filename
[12:18:50.432950815,12:18:50.870648568]  open                  437697.753          N/A  apache2                  31517    /var/lib/php5/sess_0ifir2hangm8ggaljdphl9o5b5 (fd=13)
[12:18:52.946080165,12:18:52.946132278]  open                      52.113          N/A  apache2                  31588    /var/lib/php5/sess_mr9045p1k55vin1h0vg7rhgd63 (fd=13)
[12:18:46.800846035,12:18:46.800874916]  open                      28.881          N/A  apache2                  31591    /var/lib/php5/sess_r7c12pccfvjtas15g3j69u14h0 (fd=13)
[12:18:51.389797604,12:18:51.389824426]  open                      26.822          N/A  apache2                  31520    /var/lib/php5/sess_4sdb1rtjkhb78sabnoj8gpbl00 (fd=13)

Top read syscall latencies (usec)
Begin               End                  Name             Duration (usec)         Size  Proc                     PID      Filename
[12:18:37.256073107,12:18:37.256555967]  read                     482.860        7.0 B  bash                     10237    unknown (origin not found) (fd=3)
[12:18:52.000209798,12:18:52.000252304]  read                      42.506       1.0 KB  irqbalance               1337     /proc/interrupts (fd=3)
[12:18:37.256559439,12:18:37.256601615]  read                      42.176        5.0 B  bash                     10237    unknown (origin not found) (fd=3)
[12:18:42.000281918,12:18:42.000320016]  read                      38.098       1.0 KB  irqbalance               1337     /proc/interrupts (fd=3)

Top write syscall latencies (usec)
Begin               End                  Name             Duration (usec)         Size  Proc                     PID      Filename
[12:18:49.913241516,12:18:49.915908862]  write                   2667.346       95.0 B  apache2                  31584    /var/log/apache2/access.log (fd=8)
[12:18:37.472823631,12:18:37.472859836]  writev                    36.205     21.97 KB  apache2                  31544    unknown (origin not found) (fd=12)
[12:18:37.991578372,12:18:37.991612724]  writev                    34.352     21.97 KB  apache2                  31589    unknown (origin not found) (fd=12)
[12:18:39.547778549,12:18:39.547812515]  writev                    33.966     21.97 KB  apache2                  31584    unknown (origin not found) (fd=12)

Top sync syscall latencies (usec)
Begin               End                  Name             Duration (usec)         Size  Proc                     PID      Filename
[12:18:50.162776739,12:18:51.157522361]  sync                  994745.622          N/A  sync                     22791    None (fd=None)
[12:18:37.227867532,12:18:37.232289687]  sync_file_range         4422.155          N/A  lttng-consumerd          19964    /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32)
[12:18:37.238076585,12:18:37.239012027]  sync_file_range          935.442          N/A  lttng-consumerd          19964    /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32)
[12:18:37.220974711,12:18:37.221647124]  sync_file_range          672.413          N/A  lttng-consumerd          19964    /home/julien/lttng-traces/analysis-20150115-120942/kernel/metadata (fd=32)

If we look at the first line in the Open category, we clearly see that a PHP session file took 437 ms to open, whereas the other opens in this trace take tens of microseconds. If we remember the beginning of this example, we had a request around 430 ms longer than usual, so we are on the right track! Now we need to understand why this latency occurred. To narrow down the investigation, we can center the analysis around the interesting timestamp range:

$ ./lttng-iolatencytop analysis-20150115-120942/ --minsize 2  --timerange [12:18:50.432950815,12:18:50.870648568]
Top open syscall latencies (usec)
Begin               End                  Name             Duration (usec)         Size  Proc                     PID      Filename
[12:18:50.432950815,12:18:50.870648568]  open                  437697.753          N/A  apache2                  31517    /var/lib/php5/sess_0ifir2hangm8ggaljdphl9o5b5 (fd=13)

Top sync syscall latencies (usec)
Begin               End                  Name             Duration (usec)         Size  Proc                     PID      Filename
[12:18:50.162776739,12:18:51.157522361]* sync                  994745.622*         N/A  sync                     22791    None (fd=None)
*: Syscalls started and/or completed outside of the range specified

Now here is a very interesting line: we know that while our open was running, a sync was also in progress; it started before and completed after the range we specified. We could now widen the range to match the sync call and, for example, see the actual disk usage recorded while this syscall was active, or the list of the other syscalls affected by it (see the commands below).
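
For instance, re-running the latency top over the full duration of the sync call is straightforward, and the same range could be passed to the usage script; the second command is a sketch, assuming lttng-iousagetop accepts the same --timerange option as the latency scripts:

$ ./lttng-iolatencytop analysis-20150115-120942/ --minsize 2 --timerange [12:18:50.162776739,12:18:51.157522361]
$ ./lttng-iousagetop analysis-20150115-120942/ --timerange [12:18:50.162776739,12:18:51.157522361]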

Note: instead of using the --timerange option, we can control the beginning and the end of the range individually with --begin and/or --end (see the example after these notes).

Note 2: the time is relative to the timezone of the machine running the analysis script, so if you are exchanging results with people in a different timezone, add the --gmt option to have a common time reference.

Note 3: if you are working on a trace that spans multiple days, you will have to specify the date as well.
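
For example, the query above could be expressed with explicit bounds instead of --timerange; this is only a sketch, assuming --begin and --end accept the same timestamp format as --timerange:

$ ./lttng-iolatencytop analysis-20150115-120942/ --minsize 2 --begin 12:18:50.432950815 --end 12:18:50.870648568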

I/O requests log

Finally, we can also look at the I/O events as a timeline (instead of a top or frequency view). For that, we have the lttng-iolog script (or the --log option in the other scripts), which can give us an understanding of the sequence that led to our problem. Here is an example centered around the sync period:

$ ./lttng-iolog analysis-20150115-120942/ --timerange [12:18:50.162776739,12:18:51.157522361]
Log of all I/O system calls
Begin               End                  Name             Duration (usec)         Size  Proc                     PID      Filename
[12:18:50.162776739,12:18:51.157522361]  sync                  994745.622           N/A  sync                     22791    None (fd=None)
[12:18:50.432904513,12:18:50.432912248]  open                       7.735           N/A  apache2                  31517    /dev/urandom (fd=13)
[12:18:50.432912958,12:18:50.432923604]  read                      10.646        32.0 B  apache2                  31517    /dev/urandom (fd=13)
[12:18:50.432950815,12:18:50.870648568]  open                  437697.753           N/A  apache2                  31517    /var/lib/php5/sess_0ifir2hangm8ggaljdphl9o5b5 (fd=13)
[12:18:50.870685082,12:18:50.870689642]  open                       4.560           N/A  apache2                  31517    /var/www/html/inc/header.html (fd=14)
[12:18:50.870696308,12:18:50.870698097]  read                       1.789         6.0 B  apache2                  31517    /var/www/html/inc/header.html (fd=14)
[12:18:50.870785224,12:18:50.870788336]  open                       3.112           N/A  apache2                  31517    /var/www/html/tinyib.db (fd=14)
[12:18:50.870795212,12:18:50.870796527]  read                       1.315       100.0 B  apache2                  31517    /var/www/html/tinyib.db (fd=14)
[12:18:50.870866830,12:18:50.870867961]  read                       1.131        1.0 KB  apache2                  31517    /var/www/html/tinyib.db (fd=14)
...

Conclusion

Through this example, we identified an unusual latency and tracked it down to its root cause: someone (or a script) launched the sync command, which committed all the dirty buffers in the page cache to disk. This caused a high I/O load (for our old desktop hard drive). At the same time, our client tried to access the index page; when the client connected, PHP tried to open a temporary session file on disk, which stalled for 437 ms because of the ongoing disk activity.

One particularity of this workflow compared to traditional debugging practices is that we only recorded the trace once and then investigated it through various tools and options. So we definitely understood this exact latency, not a look-alike. For recurring problems it does not make much difference, but if you have ever investigated a weird case that happens only once a month, you know the value of this approach. Moreover, reproducing this exact example would have been really difficult, considering that it depends on the state of the page cache when the sync is started.

If you want to reproduce this example, the trace is available here. Feel free to experiment with the other LTTng analyses scripts; all the setup instructions are there.

Finally, we appreciate any feedback, suggestions and contributions, as well as ideas for new use-cases and analyses, so experiment with the scripts and help us make great low-level performance debugging and monitoring tools.

Next time: an IRQ analysis.