Performance Analysis with strace and ltrace
Introduction
strace and ltrace are powerful diagnostic tools for analyzing application behavior at the system level. strace traces system calls (kernel-level operations), while ltrace traces library calls (user-space function calls). These tools are invaluable for performance analysis, debugging, and understanding why applications are slow, where they spend time, and what resources they access.
When applications perform poorly without obvious cause, strace and ltrace can reveal hidden bottlenecks: excessive system calls, slow I/O operations, failed operations retried repeatedly, or inefficient library usage. They provide visibility into the black box of application execution, showing exactly what operations are performed and how long each takes.
This comprehensive guide covers both tools' usage, performance profiling techniques, bottleneck identification, and real-world optimization examples. You'll learn how to diagnose performance issues that other tools miss.
Understanding System and Library Calls
System Calls vs Library Calls
System Calls (strace):
- Kernel-level operations
- Process management (fork, exec, wait)
- File operations (open, read, write, close)
- Network operations (socket, connect, send, recv)
- Memory management (mmap, brk, munmap)
- Examples: read(), write(), open(), close(), socket()
Library Calls (ltrace):
- User-space function calls
- Standard C library (libc)
- Dynamic library calls
- Examples: malloc(), printf(), strlen(), strcmp()
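To see the split concretely, you can run both tools against the same command and compare their summaries. A minimal sketch (the output file names are arbitrary):
# Kernel-level view (system calls) vs user-space view (library calls) of the same command
strace -c -o syscalls-summary.txt ls -la > /dev/null
ltrace -c -o libcalls-summary.txt ls -la > /dev/null
head -n 8 syscalls-summary.txt libcalls-summary.txt
# The strace summary lists read/mmap/openat and friends; the ltrace summary lists strlen, malloc, and other libc calls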
Why Performance Issues Happen
Common patterns that cause slowdowns:
- Excessive system calls: Thousands of small reads instead of buffered I/O (see the dd comparison after this list)
- Failed operations: Repeated failed attempts (file not found, connection refused)
- Blocking I/O: Waiting for slow disk or network
- Inefficient algorithms: Excessive allocations, string operations
- Lock contention: Waiting for mutexes or file locks
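The first pattern is easy to demonstrate: copying the same data with a 1-byte buffer versus a 64 KB buffer produces wildly different system call counts. A quick sketch (the sample file path is arbitrary):
# Create a 64 KB test file, then compare syscall counts for unbuffered vs buffered copies
dd if=/dev/zero of=/tmp/sample.bin bs=64k count=1 status=none
strace -c -o unbuffered.txt dd if=/tmp/sample.bin of=/dev/null bs=1 status=none
strace -c -o buffered.txt dd if=/tmp/sample.bin of=/dev/null bs=64k status=none
grep -E "(read|write)$" unbuffered.txt buffered.txt
# bs=1 issues roughly one read()/write() pair per byte; bs=64k needs only a handful of calls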
strace: System Call Tracing
Installation
# Ubuntu/Debian
apt-get install strace -y
# CentOS/Rocky Linux
dnf install strace -y
# Verify installation
strace -V
Basic Usage
# Trace a command
strace ls -la
# Output shows every system call:
# execve("/bin/ls", ["ls", "-la"], 0x7ffde2a3ef40 /* 24 vars */) = 0
# brk(NULL) = 0x55d8a3c4e000
# access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
# openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
# fstat(3, {st_mode=S_IFREG|0644, st_size=27002, ...}) = 0
# ...hundreds more lines...
# Attach to running process
strace -p <PID>
# Example: trace web server process
strace -p "$(pidof nginx)"   # quotes keep multiple worker PIDs together as one -p argument
Performance-Focused Options
# Count system calls and time (-c)
strace -c ls -la
# Output summary:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
34.21 0.000234 12 19 mmap
18.42 0.000126 16 8 read
15.79 0.000108 14 8 openat
10.53 0.000072 24 3 write
...
# Show timestamps (-t, -tt, -ttt)
strace -tt ls -la # HH:MM:SS.microseconds
# Show time spent in each call (-T)
strace -T ls -la
# read(3, "..."..., 832) = 832 <0.000045> # 45 microseconds
# Combine timing options
strace -c -T -p <PID>
# Trace specific system calls (-e)
strace -e trace=open,openat,read,write ls -la   # modern glibc opens files via openat()
strace -e trace=network nc google.com 80
strace -e trace=file ls -la
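One more option worth knowing: multi-process applications (web servers, shells spawning subcommands) need -f so that forked children are traced too; without it, work done in child processes is invisible. A small illustration:
# Follow child processes (-f); the ls and cat children appear in the summary alongside bash
strace -f -c -e trace=file bash -c 'ls /etc > /dev/null; cat /etc/hostname > /dev/null'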
Real-World Performance Analysis
Example 1: Slow File Access
# Trace file operations
strace -e trace=open,openat,read,close -T php slow-script.php
# Output reveals:
openat(AT_FDCWD, "config.php", O_RDONLY) = 3 <0.000234>
read(3, "<?php\n..."..., 4096) = 4096 <0.000045>
close(3) = 0 <0.000012>
openat(AT_FDCWD, "config.php", O_RDONLY) = 3 <0.000187> # Opened again!
read(3, "<?php\n..."..., 4096) = 4096 <0.000038>
close(3) = 0 <0.000011>
# Repeated 500 times!
# Problem: config file opened 500 times instead of once
# Solution: Cache file contents
# After fix:
openat(AT_FDCWD, "config.php", O_RDONLY) = 3 <0.000234>
read(3, "<?php\n..."..., 4096) = 4096 <0.000045>
close(3) = 0 <0.000012>
# Only once! 99.8% reduction in file operations
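A quick way to spot this pattern without reading the whole trace is to count how often each path is opened. A sketch using the same hypothetical script (the trace file name is arbitrary):
# Which files are opened most often?
strace -e trace=open,openat -o opens.txt php slow-script.php
grep -o '"[^"]*"' opens.txt | sort | uniq -c | sort -rn | head
# A path with a count of 500 at the top of this list is the caching candidate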
Example 2: Database Connection Issues
# Trace network calls
strace -e trace=network -T python app.py
# Output shows:
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3 <0.000145>
connect(3, {sa_family=AF_INET, sin_port=htons(3306), sin_addr=inet_addr("10.0.1.5")}, 16) = -1 ETIMEDOUT <3.015234> # 3 seconds!
close(3) = 0 <0.000034>
# Connection timeout repeated 20 times = 60 seconds wasted
# Problem: Database server unreachable, no timeout configured
# Solution: Set connection timeout, add connection pooling
# After fix:
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3 <0.000145>
connect(3, {sa_family=AF_INET, sin_port=htons(3306), sin_addr=inet_addr("127.0.0.1")}, 16) = 0 <0.000234>
# Connects immediately to local database
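To quantify how much wall time the failed connects cost, sum the -T timings from a saved trace. A minimal sketch (the trace file name is arbitrary):
# Total time spent in connect() calls that timed out
strace -e trace=network -T -o net.txt python app.py
grep 'connect(' net.txt | grep ETIMEDOUT | grep -o '<[0-9.]*>' | tr -d '<>' \
  | awk '{total += $1} END {printf "%.2f seconds lost to connect() timeouts\n", total}'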
Profiling Script
#!/bin/bash
# profile-syscalls.sh - Analyze system call patterns
PID="$1"
DURATION=30
if [ -z "$PID" ]; then
echo "Usage: $0 <PID>"
exit 1
fi
echo "Profiling process $PID for $DURATION seconds..."
# Collect system call statistics
strace -c -p $PID 2>&1 &
STRACE_PID=$!
sleep $DURATION
# Stop strace
kill -INT $STRACE_PID 2>/dev/null
wait $STRACE_PID 2>/dev/null
echo
echo "Analysis complete. Look for:"
echo "1. High % time calls (slow operations)"
echo "2. High call counts (excessive operations)"
echo "3. Many errors (failed operations)"
ltrace: Library Call Tracing
Installation
# Ubuntu/Debian
apt-get install ltrace -y
# CentOS/Rocky Linux
dnf install ltrace -y
# Verify installation
ltrace -V
Basic Usage
# Trace library calls
ltrace ls -la
# Output shows library function calls:
# __libc_start_main(0x55a7c79a3170, 2, 0x7ffe32fa7598, 0x55a7c79a9e50...
# strrchr("ls", '/') = nil
# setlocale(LC_ALL, "") = "en_US.UTF-8"
# bindtextdomain("coreutils", "/usr/share/locale") = "/usr/share/locale"
# ...
# Trace specific functions
ltrace -e malloc,free ls -la
# Count calls (-c)
ltrace -c ls -la
# Show timestamps (-t, -tt, -ttt)
ltrace -tt ls -la
# Show time spent (-T)
ltrace -T ls -la
Performance Analysis with ltrace
Example 1: Memory Allocation Issues
# Trace memory operations
ltrace -e malloc,realloc,free -c python script.py
# Output summary:
% time seconds usecs/call calls function
------ ----------- ----------- --------- --------------------
45.67 0.234567 23 10000 malloc
32.15 0.165234 165 1000 realloc
22.18 0.114000 11 10000 free
------ ----------- ----------- --------- --------------------
100.00 0.513801 21000 total
# Analysis:
# - 10,000 malloc calls (excessive)
# - 1,000 realloc calls (resizing arrays repeatedly)
# Problem: Not pre-allocating memory, causing many reallocations
# Solution: Pre-allocate with expected size
# After optimization:
% time seconds usecs/call calls function
------ ----------- ----------- --------- --------------------
52.34 0.023456 23 1000 malloc # 90% reduction!
25.87 0.011592 116 100 realloc # 90% reduction!
21.79 0.009762 10 1000 free
------ ----------- ----------- --------- --------------------
100.00 0.044810 2100 total
# Result: 91% reduction in memory operations, 10x faster
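One way to validate an optimization like this is to capture a summary before and after the change and compare the counts. A minimal sketch (before.txt and after.txt are placeholder names; -o sends the output to a file rather than stderr):
ltrace -c -o before.txt python script.py
# ...apply the pre-allocation fix, then re-run...
ltrace -c -o after.txt python script.py
diff -y --width=120 before.txt after.txt | head -20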
Example 2: String Operations
# Trace string functions
ltrace -e 'strlen,strcpy,strcat,strcmp' -T ./app
# Output reveals:
strlen("very_long_string_here...") = 1024 <0.000012>
strlen("very_long_string_here...") = 1024 <0.000011>
strlen("very_long_string_here...") = 1024 <0.000012>
# strlen() called 50,000 times on same string!
# Problem: Not caching string length
# Solution: Store length in variable
# After optimization:
strlen("very_long_string_here...") = 1024 <0.000012>
# Only once! 99.998% reduction
Combined Analysis: strace + ltrace
Comprehensive Performance Profile
#!/bin/bash
# full-profile.sh - Complete performance analysis
APP="$1"
OUTPUT_DIR="profile-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$OUTPUT_DIR"
echo "Starting comprehensive performance profile..."
# 1. System call profile
echo "Profiling system calls (30 seconds)..."
timeout 30 strace -c -o "$OUTPUT_DIR/syscalls.txt" "$APP" 2>&1
# 2. Library call profile
echo "Profiling library calls (30 seconds)..."
timeout 30 ltrace -c -o "$OUTPUT_DIR/libcalls.txt" "$APP" 2>&1
# 3. Detailed system call timing
echo "Capturing detailed system call timing..."
timeout 30 strace -T -tt -o "$OUTPUT_DIR/syscalls-detailed.txt" "$APP" 2>&1
# 4. File operations analysis
echo "Analyzing file operations..."
timeout 30 strace -e trace=file -T -tt -o "$OUTPUT_DIR/file-ops.txt" "$APP" 2>&1
# 5. Network operations analysis
echo "Analyzing network operations..."
timeout 30 strace -e trace=network -T -tt -o "$OUTPUT_DIR/network-ops.txt" "$APP" 2>&1
# Generate report
cat > "$OUTPUT_DIR/report.md" << EOF
# Performance Analysis Report
Date: $(date)
Application: $APP
## Summary
### Top System Calls by Time
\`\`\`
$(head -15 "$OUTPUT_DIR/syscalls.txt")
\`\`\`
### Top Library Calls by Time
\`\`\`
$(head -15 "$OUTPUT_DIR/libcalls.txt")
\`\`\`
## Recommendations
EOF
# Analyze patterns
echo "Analyzing for common issues..."
# Check for excessive file opens
FILE_OPENS=$(grep -c "openat" "$OUTPUT_DIR/file-ops.txt" 2>/dev/null) || FILE_OPENS=0
if [ "$FILE_OPENS" -gt 1000 ]; then
echo "- HIGH: Excessive file opens ($FILE_OPENS). Consider caching." >> "$OUTPUT_DIR/report.md"
fi
# Check for malloc/free imbalance
MALLOCS=$(grep -c "malloc" "$OUTPUT_DIR/libcalls.txt" 2>/dev/null) || MALLOCS=0
FREES=$(grep -c "free" "$OUTPUT_DIR/libcalls.txt" 2>/dev/null) || FREES=0
if [ "$MALLOCS" -gt $((FREES * 2)) ]; then
echo "- WARNING: Memory allocation/free imbalance. Potential memory leak." >> "$OUTPUT_DIR/report.md"
fi
echo
echo "Profile complete. Results in $OUTPUT_DIR/"
echo "Review report.md for findings and recommendations."
Real-World Case Studies
Case Study 1: Slow Application Startup
Problem: Application takes 15 seconds to start
# Trace application startup
strace -T -tt -o startup-trace.txt ./app
# Analyze trace
grep "<[0-9]\." startup-trace.txt | sort -t'<' -k2 -n | tail -20
# Top slow operations found:
19:45:32.123456 connect(3, ...) = -1 ETIMEDOUT <5.015234>
19:45:37.138690 connect(4, ...) = -1 ETIMEDOUT <5.012345>
19:45:42.151035 connect(5, ...) = -1 ETIMEDOUT <5.011234>
# 15 seconds wasted on 3 failed connection attempts
# Root cause: Trying to connect to unavailable service on startup
# Solution: Make service connection async or fail faster
# After fix:
# Startup time: 0.5 seconds (96% improvement)
Case Study 2: High CPU Usage
Problem: Process using 80% CPU, unclear why
# Profile running process
strace -c -p <PID> &
TRACE_PID=$!
sleep 60
kill -INT $TRACE_PID
# Results show:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- --------
89.45 12.456789 25 498234 stat
5.23 0.728456 18 40567 read
3.12 0.434567 22 19823 write
2.20 0.306789 34 9012 open
# 498,234 stat() calls in 60 seconds = 8,304 per second!
# Investigate with detailed trace:
strace -e trace=stat -p <PID> 2>&1 | head -100   # strace writes to stderr, so redirect before piping
# Shows repeatedly checking if file exists:
stat("/var/www/html/cache/user_123", {st_mode=S_IFREG|0644, ...}) = 0
stat("/var/www/html/cache/user_124", {st_mode=S_IFREG|0644, ...}) = 0
stat("/var/www/html/cache/user_125", {st_mode=S_IFREG|0644, ...}) = 0
# Thousands of times per second!
# Root cause: Cache system checking file existence before every read
# Solution: Keep file descriptors open or use different caching strategy
# After fix:
# CPU usage: 15% (81% reduction)
# stat() calls: 45 per second (99.5% reduction)
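With a timestamped trace (-tt) you can also measure the call rate per second directly, which makes the before/after comparison concrete. A sketch (the trace file name is arbitrary; strace writes to stderr, hence the 2> redirect):
# Count stat() calls per wall-clock second over a 10-second window
timeout 10 strace -tt -e trace=stat -p <PID> 2> stat-trace.txt
awk '/^[0-9]/ {split($1, t, "."); count[t[1]]++} END {for (s in count) print s, count[s]}' stat-trace.txt | sort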
Case Study 3: Memory Leak Detection
# Trace memory operations over time
ltrace -e malloc,free -p <PID> -o mem-trace.txt &   # ltrace writes to stderr by default, so use -o
LTRACE_PID=$!
sleep 300 # 5 minutes
kill -INT $LTRACE_PID
# Count allocations and frees
MALLOCS=$(grep -c "malloc(" mem-trace.txt)
FREES=$(grep -c "free(" mem-trace.txt)
echo "Mallocs: $MALLOCS"
echo "Frees: $FREES"
echo "Leak: $((MALLOCS - FREES)) unfreed allocations"
# Result:
# Mallocs: 45,678
# Frees: 23,456
# Leak: 22,222 unfreed allocations
# Identify leak pattern
grep "malloc(" mem-trace.txt | awk '{print $2}' | sort | uniq -c | sort -rn | head -10
# Shows most common allocation sizes:
# 8,543 malloc(4096) # 4KB blocks not being freed
# 6,234 malloc(1024)
# 3,456 malloc(512)
# Find allocation context with full trace
ltrace -T -e malloc,free -p <PID> 2>&1 | grep "malloc(4096)"
# Shows allocations from specific function
# 0x00007f8b... malloc(4096) = 0x... <0.000023> # From cache_add_entry()
# Root cause: cache_add_entry() allocating but never freeing
# Solution: Add free() in cache_remove_entry()
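For a longer leak hunt it also helps to sample the malloc/free balance periodically while the capture above is still running: the gap stays roughly constant for a healthy process and grows steadily for a leaking one. A sketch against the same mem-trace.txt:
# Print a running allocation balance every 60 seconds
while sleep 60; do
    m=$(grep -c "malloc(" mem-trace.txt)
    f=$(grep -c "free(" mem-trace.txt)
    echo "$(date +%T) malloc=$m free=$f outstanding=$((m - f))"
done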
Advanced Techniques
Filtering and Focus
# Trace only file operations taking > 10ms
strace -e trace=file -T ./app 2>&1 | awk '$NF ~ /<0\.0[1-9]/ || $NF ~ /<0\.[1-9]/ || $NF ~ /<[1-9]/'
# Trace failures only
strace -e trace=file ./app 2>&1 | grep "= -1"
# Count failures by type
strace -e trace=file ./app 2>&1 | grep "= -1" | awk '{print $NF}' | sort | uniq -c | sort -rn
# Example output:
# 1,234 ENOENT (No such file or directory)
# 567 EACCES (Permission denied)
# 89 EMFILE (Too many open files)
Automated Analysis Script
#!/bin/bash
# analyze-bottlenecks.sh - Automated bottleneck detection
PID="$1"
DURATION=60
if [ -z "$PID" ]; then
echo "Usage: $0 <PID>"
exit 1
fi
TMPFILE=$(mktemp)
echo "Analyzing process $PID for $DURATION seconds..."
# Collect trace
timeout $DURATION strace -c -T -p $PID > "$TMPFILE" 2>&1
# Parse results
echo
echo "=== PERFORMANCE ANALYSIS ==="
echo
# Check for excessive system calls
TOTAL_CALLS=$(awk '$NF == "total" {print $4}' "$TMPFILE")   # calls column of the summary's total row
TOTAL_CALLS=${TOTAL_CALLS:-0}
CALLS_PER_SEC=$((TOTAL_CALLS / DURATION))
if [ "$CALLS_PER_SEC" -gt 10000 ]; then
echo "WARNING: High system call rate ($CALLS_PER_SEC per second)"
echo "Consider batching operations or using buffered I/O"
echo
fi
# Check for slow operations
echo "Slowest Operations (% time):"
grep -A 20 "% time" "$TMPFILE" | tail -n +3 | head -5   # summary is already sorted by % time
echo
# Check for errors
ERROR_COUNT=$(awk '$NF != "total" && NF == 6 {sum += $5} END {print sum+0}' "$TMPFILE")   # errors column only appears on rows with failures
if [ "$ERROR_COUNT" -gt 0 ]; then
echo "WARNING: $ERROR_COUNT failed system calls detected"
echo "Review errors for potential issues"
echo
fi
# Recommendations
echo "=== RECOMMENDATIONS ==="
echo
# Check for file I/O issues
FILE_OPS=$(grep -E "read|write|open|close|stat" "$TMPFILE" | awk '{sum += $4} END {print sum+0}')
if [ "$FILE_OPS" -gt $((TOTAL_CALLS / 2)) ]; then
echo "- File I/O is dominant ($(echo "$FILE_OPS * 100 / $TOTAL_CALLS" | bc)% of calls)"
echo " Consider: caching, buffering, or async I/O"
fi
rm "$TMPFILE"
Best Practices
1. Use Appropriate Scope
# Bad: Tracing entire application (too much data)
strace ./long-running-app
# Good: Trace specific time window
timeout 30 strace -c ./long-running-app
# Better: Attach to already-running process
strace -c -p <PID> &
STRACE_PID=$!
sleep 60
kill -INT "$STRACE_PID"   # interrupt only this strace, not every strace on the system
2. Focus Analysis
# Don't trace everything
# Focus on likely problem areas:
# File I/O issues:
strace -e trace=file -T -c -p <PID>
# Network issues:
strace -e trace=network -T -c -p <PID>
# Memory issues:
ltrace -e malloc,free,realloc -c -p <PID>
3. Minimize Performance Impact
# strace/ltrace ADD overhead
# Don't use in production unless necessary
# Measure impact:
# Without tracing:
time ./benchmark
# real: 2.345s
# With strace:
time strace -o /dev/null ./benchmark
# real: 4.567s (95% overhead)
# With strace -c (summary only):
time strace -c -o /dev/null ./benchmark
# real: 2.678s (14% overhead)
# Use -c flag for minimal impact
4. Combine with Other Tools
# Use strace/ltrace WITH other tools
# CPU profiling:
perf record ./app &
strace -c -p $(pidof app)
# Memory profiling:
valgrind --tool=massif ./app &
ltrace -e malloc,free -c -p $(pidof app)
# I/O profiling:
iotop -p $(pidof app) &
strace -e trace=file -T -p $(pidof app)
Troubleshooting Common Issues
Permission Denied
# Error: Operation not permitted
# Solution: Run with sudo or as root
sudo strace -p <PID>
Process Stops When Traced
# strace pauses the target at each traced call via ptrace, so it runs noticeably slower while traced
# The process returns to normal speed as soon as strace detaches
# Stop tracing with Ctrl+C so strace can detach cleanly; avoid killing strace with -9
Too Much Output
# Redirect to file
strace ./app 2> trace.log
# Or use summary mode
strace -c ./app
# Filter output
strace ./app 2>&1 | grep -E "read|write"
Conclusion
strace and ltrace are indispensable tools for performance analysis:
Key Benefits:
- Reveal hidden bottlenecks invisible to other tools
- Identify excessive system/library calls
- Detect failed operations
- Understand application behavior
- Validate optimization effectiveness
Common Issues Detected:
- Excessive file opens (cache misses)
- Repeated failed connection attempts
- Memory allocation storms
- Inefficient string operations
- Missing buffering
- Lock contention
Usage Guidelines:
- Use for targeted analysis, not continuous monitoring
- Focus on specific call types (-e flag)
- Use -c for summary (less overhead)
- Combine with other profiling tools
- Test in development/staging first
Performance Improvements:
- System call volume: 50-95% reductions are typical once caching or buffering is added
- Response times: commonly 2-10x faster
- Resource usage: 30-80% less CPU and I/O
By mastering strace and ltrace, you gain deep visibility into application behavior, enabling identification and resolution of performance issues that are invisible to conventional monitoring tools.


