TL;DR
strace is a tool that traces system calls and signals while a program is running. The output is verbose and can be hard to comprehend at a glance. I built explain-strace, a Python tool that parses strace output and adds human-readable descriptions for each system call, categorizes them (filesystem, network, memory, etc.), and provides summary statistics. The tool evolved from a simple parser to a maintainable system that generates syscall metadata directly from Linux kernel source, keeping pace with kernel updates automatically.
The Problem: strace is Powerful but Cryptic
When debugging system-level issues, strace is a tool I often reach for. It intercepts and logs all system calls made by a process, showing exactly what the program is asking the kernel to do. However, the output can be overwhelming:
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=88784, ...}) = 0
mmap(NULL, 88784, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8b2c8e1000
close(3) = 0
Unless you work with these syscalls daily, you’re constantly context-switching to man pages to understand what each call does. Even experienced developers can struggle to see patterns in hundreds of lines of syscall output.
Building the First Version
The initial goal was straightforward: parse strace output and add one-line descriptions for each system call. The first implementation handled:
- Reading from stdin or files
- Parsing the standard syscall(args) = retval format
- Displaying the original line plus a human-readable description
- Multiple verbosity levels (adding man page links in verbose mode)
This immediately made strace output more accessible:
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
- Category: filesystem
- Description: Open file relative to a directory file descriptor
- Returned: 3
- Documentation: https://man7.org/linux/man-pages/man2/openat.2.html
The tool could handle streaming input (useful when attaching to running processes) and interrupted sessions with Ctrl-C, displaying a summary at the end.
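As a rough illustration of the parsing step, here is a minimal sketch that matches the common syscall(args) = retval shape and looks up a description. The regular expression, the DESCRIPTIONS table, and the function names are hypothetical, not taken from explain-strace itself, and the sketch ignores unfinished/resumed calls, timestamps, and PID prefixes.

```python
import re

# Hypothetical pattern for the common "name(args) = retval" shape.
LINE_RE = re.compile(r"^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>.+)$")

# Tiny illustrative lookup table, not the tool's real data.
DESCRIPTIONS = {
    "openat": ("filesystem", "Open file relative to a directory file descriptor"),
    "close": ("filesystem", "Close a file descriptor"),
}


def explain_line(line):
    """Return (name, category, description, retval), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    name = match.group("name")
    category, description = DESCRIPTIONS.get(
        name, ("unknown", "No description available")
    )
    return name, category, description, match.group("ret")


def explain_stream(lines):
    """Process lines one at a time; on Ctrl-C, stop and return what was collected."""
    results = []
    try:
        for line in lines:
            parsed = explain_line(line)
            if parsed is not None:
                results.append(parsed)
    except KeyboardInterrupt:
        pass  # fall through so the caller can still print a summary
    return results
```

Passing sys.stdin to explain_stream would give the line-by-line behaviour described above, since iterating over a file object yields lines as they arrive.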
Adding Categories and Filtering
As I used the tool, I noticed it would be useful to quickly identify types of system calls, for example: “is this program making network connections?” Filtering by category would make it much easier to focus on specific subsystems.
I added the following categories (this is easy to extend/change in the JSON if needed):
- async_io - Asynchronous I/O operations
- device - Device control
- filesystem - File and directory operations
- ipc - Inter-process communication
- memory - Memory management
- network - Socket and network operations
- process - Process/thread management
- scheduling - CPU scheduling and priority
- security - Security and permissions
- signal - Signal handling
- system - System information and configuration
- time - Time and timers
- unimplemented - Unimplemented system calls
The --filter option allows focusing on specific categories:
# See only filesystem operations
strace ls 2>&1 | explain-strace --filter filesystem
# Focus on network calls
strace wget http://example.com 2>&1 | explain-strace --filter network
This filtering happens at the display level, not at strace’s level, which means you still capture all syscalls but only display what’s relevant. This preserves context that might be important for understanding the full picture.
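Display-level filtering can be sketched in a few lines. This is only an illustration under assumed names: parse_filter, visible, and the record dictionaries are hypothetical, and the real option handling in explain-strace may differ.

```python
def parse_filter(value):
    """Turn a comma-separated value such as "filesystem,memory" into a set of names."""
    return {part.strip() for part in value.split(",") if part.strip()}


def visible(records, wanted):
    """Return the records to display; the input records are left untouched."""
    if not wanted:
        return list(records)  # no filter given: show everything
    return [record for record in records if record["category"] in wanted]
```

Because the filter only selects what to print, the full list of parsed calls remains available to the rest of the program.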
Summary statistics now group by category:
======================================================================
SUMMARY BY CATEGORY
======================================================================
Category Count
----------------------------------------------------------------------
filesystem 38
memory 27
process 5
system 2
device 1
ipc 1
----------------------------------------------------------------------
Total: 74 calls across 6 categories
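A per-category summary like the one above could be built with collections.Counter. The function name and column widths below are assumptions for illustration; the ordering (highest count first, ties broken alphabetically) simply mirrors the sample output.

```python
from collections import Counter


def summarize(categories):
    """Return summary lines for an iterable of category names, one per call."""
    counts = Counter(categories)
    lines = [f"{'Category':<20}{'Count':>6}"]
    # Highest count first; ties are broken alphabetically by category name.
    for category, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"{category:<20}{count:>6}")
    lines.append(
        f"Total: {sum(counts.values())} calls across {len(counts)} categories"
    )
    return lines
```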
Preparing the Code for Release
After the core features worked, I focused on making this a proper Python project rather than just a script. This involved:
Converting to a package structure with pyproject.toml, proper module organization, and entry points that create an explain-strace command after installation.
Adding tests - This involved testing the parser with various strace output formats (unfinished calls, resumed calls, timestamps, PIDs) and ensuring the filtering and categorization logic worked correctly.
Linting and formatting using ruff and black to ensure code quality and consistency.
Automating with a Makefile with self-documenting targets for common tasks like make test, make lint, make format, and make check.
Addressing deprecation warnings by updating to modern Python packaging standards (SPDX license identifiers, updated setuptools requirements, dropping Python 3.8 support).
These changes transformed the project from a useful script to something that could be easily maintained and contributed to.
Updating from the Kernel Source
Having created a hard-coded list of syscalls and descriptions, I wanted to make it easy to keep the program up to date with changes in the Linux kernel. So I modified explain-strace to use a JSON data store for its syscall information, and created scripts/generate_syscalls.py, which parses the kernel source and updates that JSON data:
python scripts/generate_syscalls.py /path/to/linux-6.11.0
The script:
- Parses kernel syscall tables from architecture-specific files like arch/x86/entry/syscalls/syscall_64.tbl
- Extracts version information from the kernel Makefile
- Merges with existing data to preserve human-curated descriptions and categories
- Detects changes by comparing syscall numbers and names across kernel versions
- Marks status for each syscall (active, new, removed, obsolete)
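The first two steps can be sketched as follows. This is not the code from scripts/generate_syscalls.py; the function names are hypothetical. It relies on two things about the kernel tree: syscall_64.tbl rows are whitespace-separated as number, ABI, name, and an optional entry point, with # starting a comment, and the top-level Makefile defines VERSION, PATCHLEVEL, and SUBLEVEL.

```python
import re


def parse_syscall_table(text, abis=("common", "64")):
    """Map syscall name -> number from the text of a syscall_64.tbl-style file."""
    syscalls = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # not a table row
        number, abi, name = int(fields[0]), fields[1], fields[2]
        if abi in abis:  # skip other ABIs such as x32
            syscalls[name] = number
    return syscalls


def parse_kernel_version(makefile_text):
    """Build "VERSION.PATCHLEVEL.SUBLEVEL" from the top-level kernel Makefile text."""
    parts = []
    for key in ("VERSION", "PATCHLEVEL", "SUBLEVEL"):
        match = re.search(rf"^{key}\s*=\s*(\d+)\s*$", makefile_text, re.MULTILINE)
        parts.append(match.group(1) if match else "0")
    return ".".join(parts)
```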
The generated syscalls.json file contains:
{
"metadata": {
"kernel_version": "6.11.0",
"generated_date": "2025-11-20",
"architecture": "x86_64"
},
"syscalls": {
"map_shadow_stack": {
"number": 453,
"description": "Map shadow stack for control-flow integrity",
"category": "memory",
"status": "new",
"first_seen_version": "6.6.0"
}
}
}
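The merge and status steps might look roughly like this. It is a sketch under assumptions: merge_syscalls is a hypothetical name, only the "new", "active", and "removed" statuses are handled, and how the real generator decides that a syscall is "obsolete" is not shown here.

```python
def merge_syscalls(existing, parsed, kernel_version):
    """Merge freshly parsed name -> number data into existing curated entries.

    existing: name -> entry dict as stored under "syscalls" in the JSON file
    parsed:   name -> syscall number from the kernel table
    """
    merged = {}
    for name, number in parsed.items():
        old = existing.get(name)
        if old is None:
            # Not seen before: add a placeholder for a human to describe later.
            merged[name] = {
                "number": number,
                "description": "",
                "category": "unknown",
                "status": "new",
                "first_seen_version": kernel_version,
            }
        else:
            entry = dict(old)  # copy so curated fields are preserved, not mutated
            entry["number"] = number
            entry.setdefault("status", "active")
            merged[name] = entry
    for name, old in existing.items():
        if name not in parsed:
            entry = dict(old)
            entry["status"] = "removed"  # known before, absent from this kernel
            merged[name] = entry
    return merged
```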
The tool now shows warnings in verbose mode when encountering new or deprecated syscalls:
map_shadow_stack(...) = 0x7f8b2c8e1000
- Category: memory
- Description: Map shadow stack for control-flow integrity
- Returned: 0x7f8b2c8e1000
⚠️ Status: new (first seen in kernel 6.6.0)
Updating to Linux 6.11
With the generator in place, updating to Linux 6.11 became straightforward:
- Run the generator against the new kernel source
- Review the detected new syscalls
- Research and categorize them appropriately
- Update descriptions using kernel documentation
The 6.11 update brought 21 new syscalls across multiple categories:
- Filesystem: cachestat, fchmodat2, listmount, quotactl_fd, statmount
- IPC: futex_requeue, futex_wait, futex_waitv, futex_wake
- Security: Landlock and LSM syscalls for fine-grained access control
- Memory: map_shadow_stack, memfd_secret, mseal
I also recategorized several older syscalls that had been marked as “unknown”, properly identifying unimplemented stubs, obsolete calls, and architecture-specific operations.
Design Decisions and Tradeoffs
Several key design decisions shaped the tool:
Stream processing over buffering: The parser handles input line-by-line, making it work with both files and live streaming from running processes.
Display-level filtering over strace filtering: Users can use strace -e trace=... to filter at collection time, but this loses context. explain-strace --filter preserves all data while narrowing what is displayed.
JSON-based syscall database: Moving from hard-coded dictionaries to JSON made updates trivial and enabled version tracking.
The --once flag: When analyzing programs with thousands of repeated syscalls, --once shows full explanations only for the first occurrence of each type, dramatically reducing output verbosity.
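The idea behind --once can be illustrated with a small generator that tracks which syscall names have already been seen. The function name is hypothetical and this is not the tool's actual implementation.

```python
def mark_first_occurrences(names):
    """Yield (name, is_first) pairs so full explanations can be limited to first sightings."""
    seen = set()
    for name in names:
        is_first = name not in seen
        seen.add(name)
        yield name, is_first
```

A display loop could then print the full explanation when is_first is true and a shorter form otherwise.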
Fallback to hard-coded data: If the JSON file is missing or corrupted, the tool falls back to embedded syscall data, ensuring it always works.
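A fallback loader along these lines would cover both the missing-file and corrupted-file cases. FALLBACK_SYSCALLS and load_syscalls are illustrative names, and the embedded table here is a one-entry stand-in rather than the tool's real data.

```python
import json
from pathlib import Path

# Stand-in for the embedded data; the real table would be much larger.
FALLBACK_SYSCALLS = {
    "read": {"description": "Read from a file descriptor", "category": "filesystem"},
}


def load_syscalls(path):
    """Load the "syscalls" mapping from a JSON file, or fall back to embedded data."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        syscalls = data["syscalls"]
        if not isinstance(syscalls, dict):
            raise ValueError("'syscalls' must be a JSON object")
        return syscalls
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, invalid JSON, or unexpected structure: use embedded data.
        return FALLBACK_SYSCALLS
```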
Example Usage
Here are some examples:
Debug a program’s filesystem access:
strace ./myprogram 2>&1 | explain-strace --filter filesystem
Monitor a running process:
strace -p 1234 2>&1 | explain-strace -v
# Press Ctrl-C to stop and see summary
Analyze with minimal output:
strace ls 2>&1 | explain-strace --once --filter filesystem,memory
List available categories:
explain-strace --catlist
Try It Out
The tool is available on GitHub:
# From source
git clone https://github.com/bjdean/explain-strace.git
cd explain-strace
python3 -m venv venv
. venv/bin/activate
pip install -e .
# Basic usage
strace ls /tmp 2>&1 | explain-strace
The repository includes comprehensive documentation, tests, and the syscall generator for keeping data current with new kernel releases.
Reference: explain-strace on GitHub