Troubleshooting with strace
The Tracer
© Photo by Marc Sendra Martorell on Unsplash
The strace command-line utility helps you diagnose complex problems by revealing details about the interaction between applications and the Linux kernel.
Every Linux admin or developer has encountered mysterious problems: applications crashing without clear error messages, network connections failing unexpectedly, or system resources disappearing for no apparent reason. These issues can be frustrating and time-consuming to debug – unless you have the right tools. In these situations, system tracing is often the next step, which makes strace [1] one of the most valuable tools in a Linux professional's toolkit.
Tracing in Linux allows you to observe the system calls an application makes to the kernel, revealing the hidden interactions that occur beneath the surface of your code. In this article, I'll explore four scenarios that demonstrate how to get to the source of a problem with strace.
System Calls and Tracing
When a program executes, it frequently requests services from the kernel, such as file operations, network access, or memory allocation. These requests are made through system calls (syscalls), which form the interface between user applications and the kernel.
The strace utility captures these system calls, showing you exactly what your application is asking the kernel to do and how the kernel responds. This visibility is invaluable when troubleshooting because it reveals both the sequence of operations and any failures that might occur.
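To make the request-and-response idea concrete, here is a minimal Python sketch, not taken from strace itself. It asks the kernel to open a file and reports the error code the kernel sends back. The function name try_open and the example path are my own choices for illustration. On a modern Linux system, a call like this typically shows up in strace output as an openat() line.

```python
import errno
import os


def try_open(path):
    """Ask the kernel to open `path` read-only and report how it answered.

    Returns ("ok", None) on success or ("error", ERRNO_NAME) on failure.
    """
    try:
        # os.open is a thin wrapper around the open/openat system call
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        # exc.errno carries the error code reported for the failed call
        return "error", errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok", None


if __name__ == "__main__":
    # Hypothetical path, chosen only for illustration
    print(try_open("/tmp/does-not-exist.txt"))
```

The symbolic name returned here (for example, ENOENT) is the same one strace prints after the return value of a failed call.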
To use strace with any command, the basic syntax is:
strace [options] command [arguments]
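If you want to launch traces from a script, the same syntax can be assembled programmatically. The following sketch is one possible approach under my own assumptions: the helper names build_strace_command and run_traced are hypothetical, and the only strace options used are -f (follow child processes), -e trace= (filter by system call), and -o (write the trace to a file).

```python
import shutil
import subprocess


def build_strace_command(command, syscalls=None, follow_forks=False, output_file=None):
    """Return an argv list of the form: strace [options] command [arguments]."""
    argv = ["strace"]
    if follow_forks:
        argv.append("-f")  # also trace child processes
    if syscalls:
        argv += ["-e", "trace=" + ",".join(syscalls)]  # no spaces in the list
    if output_file:
        argv += ["-o", output_file]  # write the trace to a file
    return argv + list(command)


def run_traced(command, **options):
    """Run `command` under strace if strace is available on this system."""
    if shutil.which("strace") is None:
        raise RuntimeError("strace is not installed or not on PATH")
    return subprocess.run(build_strace_command(command, **options), check=False)


if __name__ == "__main__":
    print(build_strace_command(["python3", "missing_file.py"], syscalls=["openat"]))
```

Passing the command as a list avoids shell quoting problems, and joining the system call names with commas only keeps the filter in the single-argument form that -e expects.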
The best way to learn about strace is to see it in action. Consider the following scenarios.
Scenario 1: Tracking Down Missing Files
One of the most common issues developers face is an application failing because it can't find a required file. The cause could be an incorrect path, a misunderstood working directory, or an installation problem.
The Python script in Listing 1 attempts to open a file that doesn't exist.
Listing 1
missing_file.py
def main():
    try:
        with open('/tmp/does-not-exist.txt', 'r') as f:
            print("File opened successfully")
    except FileNotFoundError as e:
        print(f"Failed to open file: {e}")

if __name__ == "__main__":
    main()
When I run this script with strace, I can see exactly which file the application is looking for:
$ strace python3 missing_file.py
The relevant output will contain lines like:
openat(AT_FDCWD, "/tmp/does-not-exist.txt", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
This example is very simple, but in complex applications with many dependencies, strace can reveal exactly which files are missing, even when error messages are vague or nonexistent. Figure 1 shows typical strace output for missing file errors.
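In a long trace, failed lookups are easy to miss among hundreds of successful calls. As a rough sketch, assuming the trace has been saved as text in strace's usual one-call-per-line format, the following Python helper collects the paths of open() and openat() calls that failed with ENOENT. The function name missing_files and the regular expression are my own and cover only these two calls.

```python
import re

# Matches failed open/openat lines such as:
#   openat(AT_FDCWD, "/tmp/x", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
# Group 1 is the first quoted argument, which is the path for both calls.
_FAILED_OPEN = re.compile(
    r'\b(?:open|openat)\([^"]*"((?:[^"\\]|\\.)*)"[^)]*\)\s*=\s*-1\s+ENOENT'
)


def missing_files(trace_text):
    """Return the unique paths that could not be opened, in order of appearance."""
    paths = []
    for line in trace_text.splitlines():
        match = _FAILED_OPEN.search(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


if __name__ == "__main__":
    sample = (
        'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3\n'
        'openat(AT_FDCWD, "/tmp/does-not-exist.txt", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)\n'
    )
    print(missing_files(sample))
```

Not every ENOENT is a problem: programs routinely probe several candidate locations for libraries and configuration files, so the list is a starting point for investigation rather than a list of errors.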
Scenario 2: Memory Allocation Failures
Memory issues are among the most challenging problems to diagnose. An application might crash with an out-of-memory error, or worse, fail silently when memory allocations are denied. Listing 2 is an example of a program with a memory issue.
Listing 2
memory_issue.py
def main():
    try:
        # Try to allocate a very large amount of memory
        size = 1024 * 1024 * 1024 * 4  # 4GB
        print(f"Attempting to allocate {size} bytes")
        # Create a large bytearray
        large_array = bytearray(size)
        # Try to use the memory
        large_array[0] = 1
        large_array[size-1] = 1
        print("Memory allocated and initialized successfully")
    except MemoryError as e:
        print(f"Memory allocation failed: {e}")

if __name__ == "__main__":
    main()
To observe memory allocation calls, use strace with specific filters for memory-related system calls:
$ strace -e trace=brk,mmap,munmap python3 memory_issue.py
When memory allocation fails, you'll see output like the following:
mmap(NULL, 524292096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
This output clearly shows that the kernel responded to the application's mmap() system call with ENOMEM (cannot allocate memory). With this information, I can adjust the application's memory requirements or diagnose why the system doesn't have enough available memory. Figure 2 illustrates how memory allocation failures appear in strace output.
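To judge how much memory the application was actually asking for, it helps to pull the requested sizes out of the failed calls. The following Python sketch assumes the trace is available as text in strace's usual format and looks only at mmap() calls that returned ENOMEM. The function name failed_mmap_sizes is hypothetical.

```python
import re

# Matches failed mmap lines such as:
#   mmap(NULL, 524292096, PROT_READ|PROT_WRITE, ...) = -1 ENOMEM (Cannot allocate memory)
# Group 1 is the second argument: the requested length in bytes.
_FAILED_MMAP = re.compile(r'\bmmap\(\s*[^,]+,\s*(\d+),.*\)\s*=\s*-1\s+ENOMEM')


def failed_mmap_sizes(trace_text):
    """Return the requested sizes (in bytes) of mmap calls that failed with ENOMEM."""
    sizes = []
    for line in trace_text.splitlines():
        match = _FAILED_MMAP.search(line)
        if match:
            sizes.append(int(match.group(1)))
    return sizes


if __name__ == "__main__":
    sample = (
        "mmap(NULL, 524292096, PROT_READ|PROT_WRITE, "
        "MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)"
    )
    for size in failed_mmap_sizes(sample):
        print(f"{size} bytes (about {size / 2**20:.0f} MiB) could not be mapped")
```

Comparing these sizes with the memory actually available, or with any limits placed on the process, is usually the next diagnostic step.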
