Linux strace command – A magnificent troubleshooter

By | 06/10/2012

Trying to run a program but facing weird errors? Well, most of the Linux users might have faced this situation. Many of us either spend hours searching on internet about the problem or wait for days on forums to get a reply. Sometimes we even give up and un-install the program. Very few of us know that there exists a Linux command line utility that can well be a good starting point for debugging these kind of problems. I am talking about the Linux strace command.

Linux strace command

 

What is strace

The Linux strace command is a utility that can be used to trace down the interaction between a program and Linux kernel. The interaction here means the Linux system calls that the program uses and the Linux signals that the program receives.

From the man page of Linux strace command :

In the simplest case strace runs the specified command/program until it exits. It intercepts and records the system calls which are called by a process and the signals which are received by a process. The name of each system call, its arguments and its return value are printed on standard error or to the file specified with the -o option.

The basic syntax of strace command is :

strace [command or program][arguments]

So we see that the strace command can be run by just typing it’s name along with the program and its arguments from the command line.

Some capabilities of strace command

Here are some of the capabilities of the Linux strace command :

1. A basic example

In its very basic form, the strace command can be used with an executable.

For example :

$ sudo strace ls
execve("/bin/ls", ["ls"], [/* 18 vars */]) = 0
brk(0)                                  = 0x8c8f000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775f000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=71704, ...}) = 0
mmap2(NULL, 71704, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb774d000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@A\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=120748, ...}) = 0
mmap2(NULL, 125852, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb772e000
mmap2(0xb774b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c) = 0xb774b000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\30\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=30684, ...}) = 0
mmap2(NULL, 33360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7725000
mmap2(0xb772c000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6) = 0xb772c000
...
...
...

In the output above, there is lots of information related to the system calls being used and their return values. You can see that the calls to access system call fail with No such file or directory error. Although this failure does not have much impact on the execution of ls command but this example should give you a good idea as to why Linux strace command is so helpful.

2. Produce a summary report

The strace command has an option -c. Through this option, the strace command counts time, calls, and errors for each  system  call  and  reports a summary on program exit.

For example :

$ strace -c ls
Desktop  Documents  Downloads  examples.desktop  grof  groff.txt  Music  Pictures  practice  Public  Templates	Ubuntu One  Videos
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  -nan    0.000000           0         9           read
  -nan    0.000000           0         1           write
  -nan    0.000000           0        10           open
  -nan    0.000000           0        13           close
  -nan    0.000000           0         1           execve
  -nan    0.000000           0         9         9 access
  -nan    0.000000           0         3           brk
  -nan    0.000000           0         2           ioctl
  -nan    0.000000           0         3           munmap
  -nan    0.000000           0         1           uname
  -nan    0.000000           0        10           mprotect
  -nan    0.000000           0         2           rt_sigaction
  -nan    0.000000           0         1           rt_sigprocmask
  -nan    0.000000           0         1           getrlimit
  -nan    0.000000           0        25           mmap2
  -nan    0.000000           0        11           fstat64
  -nan    0.000000           0         2           getdents64
  -nan    0.000000           0         1         1 futex
  -nan    0.000000           0         1           set_thread_area
  -nan    0.000000           0         1           set_tid_address
  -nan    0.000000           0         1           statfs64
  -nan    0.000000           0         1           openat
  -nan    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   110        10 total

So we see that a summary table is produced in the output that counts the number of times a system call is made, the number of times a particular system call returned error and the timing information for a system call.

3. Prefix each line with time of day

Each line of the strace output can be prefixed with the time of day. This can be done using the -t option.

For example:

~$ strace -t ls
14:05:49 execve("/bin/ls", ["ls"], [/* 38 vars */]) = 0
14:05:49 brk(0)                         = 0x8b91000
14:05:49 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
14:05:49 mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb76f4000
14:05:49 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
14:05:49 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
14:05:49 fstat64(3, {st_mode=S_IFREG|0644, st_size=71704, ...}) = 0
14:05:49 mmap2(NULL, 71704, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb76e2000
14:05:49 close(3)                       = 0
14:05:49 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
14:05:49 open("/lib/i386-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
14:05:49 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@A\0\0004\0\0\0"..., 512) = 512
14:05:49 fstat64(3, {st_mode=S_IFREG|0644, st_size=120748, ...}) = 0
14:05:49 mmap2(NULL, 125852, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76c3000
14:05:49 mmap2(0xb76e0000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c) = 0xb76e0000
14:05:49 close(3)                       = 0
14:05:49 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
14:05:49 open("/lib/i386-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3
14:05:49 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\30\0\0004\0\0\0"..., 512) = 512
14:05:49 fstat64(3, {st_mode=S_IFREG|0644, st_size=30684, ...}) = 0
14:05:49 mmap2(NULL, 33360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76ba000
14:05:49 mmap2(0xb76c1000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6) = 0xb76c1000
14:05:49 close(3)
...
...
...

In the output above as you can see, all the lines were prefixed with time of the day.  Similarly, the options -tt and -ttt can be used to prefix various other time units like microseconds etc.

4. Get the time spent in each system call

If you are a developer of a program and want to know the time spent in each system call, then the -T option can be used with Linux strace command.

For example :

$ strace -T ls
execve("/bin/ls", ["ls"], [/* 38 vars */]) = 0 <0.000399>
brk(0)                                  = 0x84f3000 <0.000036>
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory) <0.000047>
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77a6000 <0.000063>
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory) <0.000066>
...
...
...

So we see that time spent in each system call is printed at the end of each line.

5. Trace only specified system calls

This is an extremely useful feature of this command. Through this one can trace only those sytem calls that are required. This makes the analysis fast and easy. This is be achived using the ‘-e trace’ option.

For example :

$ strace -e trace=open ls
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib/i386-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib/i386-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib/i386-linux-gnu/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib/i386-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib/i386-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
open("/lib/i386-linux-gnu/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/proc/filesystems", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
...
...
...

So we see that all the open system calls were traced.

6. Redirect the output to a file

Usually the output of strace command is a bit messy and that is the reason why mostly it is preferred to have the output in a text file so that detailed analysis can be done easily. This is possible using the -o option.

For example :

$ strace -o strace.out -T ls
Desktop    Downloads	     grof	Music	  practice  strace.out	Ubuntu One
Documents  examples.desktop  groff.txt	Pictures  Public    Templates	Videos

$ cat strace.out 
execve("/bin/ls", ["ls"], [/* 38 vars */]) = 0 
brk(0)                                  = 0x8115000 
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory) 
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77b1000 
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)

So we see that all the strace output was redirected to the file strace.out and the ls command output was produced on stdout.

These are only some of the capabilities of strace command. There are lots and lots of other options in the man page of Linux strace command.

Can I use strace

Yes. Any one using Linux can use this command. It doesn’t really matter if you are a developer, system administrator, network administrator or just an end user. As an when you face a problem and think that strace can be helpful you must use this command.

System administrators can use this command to interrogate what a command or program is  trying to access internally. This helps them to keep malicious programs at bay.

Developers can use this command to check the time spent by their program in system calls. This helps them to optimize their code. Also they can use Linux strace command to study the programs for which the source code is not available. For study purposes there is another friend of strace, the ltrace command, that can be used.

End users can use this command in case they are facing any issue related to a program which is not executing properly. Even if they are not able to understand the output of strace, at least they can share this output with the experienced people on forums and can get their answers fast.

A practical example

In this section, I will present a practical examples to give you an idea how magnificent troubleshooter the Linux strace command is.

Example

In this example, I will show you how one can use strace to debug a problem. Note that I have simplified the problem to give you an idea about how it can be used to solve a practical development related problem.

Consider the following code :

#include<stdio.h>

int main(void)
{
    if(NULL == fopen("MyLinuxBook","rw"))
    {
        printf("\n program failed\n");
    }
    else
    {
        printf("\n program successful\n");
    }

    return 0;
}

The program above tries to open a file named ‘MyLinuxBook’. If file opening fails then program fails else program is successful. Now suppose that you do not have this code with you and you just try to execute the program binary :

$ ./test_strace 

 program failed

Now you see the output says that the program failed but it does not give you an idea as to why the program failed?

Lets use strace command to debug this :

$ strace ./test_strace 
execve("./test_strace", ["./test_strace"], [/* 39 vars */]) = 0
brk(0)                                  = 0x8a8b000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7768000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=71704, ...}) = 0
mmap2(NULL, 71704, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7756000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\226\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75ac000
mprotect(0xb774f000, 4096, PROT_NONE)   = 0
mmap2(0xb7750000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7750000
mmap2(0xb7753000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7753000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75ab000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb75ab900, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb7750000, 8192, PROT_READ)   = 0
mprotect(0x8049000, 4096, PROT_READ)    = 0
mprotect(0xb778b000, 4096, PROT_READ)   = 0
munmap(0xb7756000, 71704)               = 0
brk(0)                                  = 0x8a8b000
brk(0x8aac000)                          = 0x8aac000
open("MyLinuxBook", O_RDONLY) = -1 ENOENT (No such file or directory)
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7767000
write(1, "\n", 1
)                       = 1
write(1, " program failed\n", 16 program failed
)       = 16
exit_group(0)

When you analyse the output of the strace command, you will see that the open() system call failed to open file ‘MyLinuxBook’ with error ‘No such file or directory’. So now you check the file ‘MyLinuxBook’ in current directory and you will see that because this file does not exist so the program failed.

This was a small demo as to how strace helps to debug problems. Many complex problems can be sorted out using Linux strace command.

Useful links

4 thoughts on “Linux strace command – A magnificent troubleshooter

  1. Duncan

    As a user/sysadmin, my most common use of strace is to see what where a particular program is storing its config files, and/or what data files it’s accessing.

    One often useful option you didn’t mention is -f, trace thru program forks, since often, a program will launch and then fork, with the fork doing the actual real work. I’ve gotten in the habit of throwing in the -f in any case, since it’s just easier to start with it than to examine the output and then decide I need to rerun with it.

    Let’s say I want to figure out where my mail program, claws-mail, is storing its mail folders, so I can back them up.

    strace -feopen claws-mail

    But particularly for X related programs, the list of files it tries to open is HUGE! However, I’m not interested in the libraries, fonts, icons, etc, it’s opening. And I’m only interested in what it’s accessing in my home dir. So I’ll pipe the output to grep. However, strace prints the interesting output to STDERR, so STDOUT can still be used to interact with the program you’re stracing. So STDERR must be piped.

    strace -feopen claws-mail |& grep -v ‘icons\|cursors\|ENOENT’ | grep /home/user

    That straces thru forks (which in this case don’t matter as it doesn’t fork, if it did the PID of the fork making each call would be printed as well), only the file-open call. |& pipes STDERR to grep.

    The first grep uses -v, print lines that do NOT contain, since I’m NOT interested in lines containing “icons” or “cursors”, nor in attempted opens when there was no file to open (no directory entry error, ENOENT).

    The single-quotes are necessary to keep bash from trying to interpret the \|, since those are logical ORs, intended for grep.

    The output of the first grep is piped to a second, grepping for my home dir, since those are the only entries I’m interested in.

    Now, the output is only about 50 lines, mostly files in a single dir, /home/user/.claws-mail/. That’s the config dir. There’s a couple entries for /home/user/config/local/mail/, however, the mail storage dir I was looking for. (There’s also entries for my gtkrc since claws is a gtk-based program, the mime-cache, .fonts.conf, etc, but those aren’t what I’m interested in either. If there were too many such entries, I could add them to the grep -v list, but there’s only a few, so it’s easier to just ignore them.)

    Here’s a short segment. (Oviously I’ve “genericized” the user in the grep and output, my real user isn’t “user”. Also, claws’ standard mail directory location would be different, I’ve customized it, here.):

    open(“/home/user/.fonts.conf”, O_RDONLY) = 13
    open(“/home/user/.claws-mail/accountrc”, O_RDONLY) = 10
    open(“/home/user/.claws-mail/folderlist.xml”, O_RDONLY) = 14
    open(“/home/user/config/local/mail/queue/.claws_cache”, O_RDONLY) = 14
    open(“/home/user/config/local/mail/queue/.claws_mark”, O_RDONLY) = 14
    open(“/home/user/.claws-mail/tagsdb/#mh/mail/queue/.claws_tags”, O_RDONLY) = 14
    open(“/home/user/.claws-mail/matcherrc”, O_RDONLY) = 14

    Reply
    1. Tarun Post author

      Thanks a lot for sharing your valuable experience.

      Reply
  2. Daniel Stavrovski

    hey Tarun Thakur, thanks for sharing these great examples of ‘strace’ usage. some of them will come handy to me, no doubt.

    anyhow, I wanted to add that I find the usage of the ‘-p’ switch quite handy as it allows strace to attach to an already running PID.

    keep up the great work.

    take care,

    - d

    Reply
  3. Ardi

    Owh, thanks god, you have seen people who share about strace.

    This is really helpful for me.

    Great thanks.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *