performance-debug
Diagnose system performance issues including CPU, memory, disk, and network. Use when the user says "server is slow", "high CPU", "out of memory", "disk full", "performance issues", or asks to debug system performance.
When & Why to Use This Skill
This Claude skill provides a structured workflow for diagnosing system performance bottlenecks on Linux. It works through CPU utilization, memory usage, disk I/O, and network traffic using standard utilities such as top, iostat, and vmstat, then ties the observed symptoms to specific processes and resource constraints so that the resulting recommendations are actionable.
Use Cases
- Troubleshooting 'Server Slow' reports: Rapidly identifying whether high load averages are caused by CPU-bound processes or I/O wait times.
- Memory Leak Detection: Monitoring per-process memory growth and swap usage to prevent Out-Of-Memory (OOM) errors and system crashes.
- Storage Performance Analysis: Identifying disk-heavy applications and large files that are saturating throughput or causing high latency.
- Network Connectivity & Bandwidth Debugging: Analyzing active connections, bandwidth consumption, and packet loss to resolve service delivery issues.
- Root Cause Analysis (RCA): Correlating system symptoms with specific process IDs (PIDs) to provide precise recommendations for optimization or scaling.
| name | performance-debug |
|---|---|
| description | Diagnose system performance issues including CPU, memory, disk, and network. Use when the user says "server is slow", "high CPU", "out of memory", "disk full", "performance issues", or asks to debug system performance. |
| allowed-tools | Bash, Read, Grep |
Performance Debug
Diagnose CPU, memory, disk, and network performance bottlenecks.
Instructions
- Get system overview: top, htop, or vmstat
- Identify the bottleneck type (CPU, memory, disk, network)
- Drill down with specific tools
- Identify the root cause process/resource
- Recommend solutions
Quick overview
# System summary
uptime
free -h
df -h
top -bn1 | head -20
# All-in-one view
htop # or top
CPU analysis
# High-level CPU usage
mpstat 1 5
top -bn1 -o %CPU | head -15
# Per-process CPU
pidstat 1 5
ps aux --sort=-%cpu | head -10
# Find CPU-intensive process
top -bn1 | grep -A10 "PID USER"
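The load average from uptime is easier to interpret once it is divided by the number of cores. The following is a minimal sketch of that calculation; the function name load_per_core is hypothetical and not part of the skill itself.

```shell
# Hypothetical helper: normalize a load average by the number of CPU cores.
# A result well above 1.00 suggests more runnable (or blocked) tasks than cores.
load_per_core() {
    # $1 = load average, $2 = number of CPU cores (must be > 0)
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (cores <= 0) { print "cores must be > 0" > "/dev/stderr"; exit 1 }
        printf "%.2f\n", load / cores
    }'
}

# Usage on a live Linux system (reads the 1-minute load average):
#   load_per_core "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)"
```

On Linux the load average also counts tasks in uninterruptible sleep, so a high ratio should be read together with %wa before concluding that the CPU is the bottleneck.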
Memory analysis
# Memory overview
free -h
vmstat 1 5
# Per-process memory
ps aux --sort=-%mem | head -10
pidstat -r 1 5
# Memory details
cat /proc/meminfo
smem -tk # if available
# Find memory leaks (growth over time)
watch -n 5 'ps aux --sort=-%mem | head -10'
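Watching the list by eye works for a quick check. To quantify growth for a single process, one option is to collect resident set size (RSS) samples and summarize them. This is only a sketch: rss_trend is a hypothetical helper, and the sampling interval and the PID are assumptions to adapt.

```shell
# Hypothetical helper: summarize RSS samples (KiB, one value per line) from stdin.
# Prints the first and last sample, the difference, and the number of samples.
rss_trend() {
    awk 'NR == 1 { first = $1 }
         { last = $1 }
         END {
             if (NR > 0)
                 printf "first=%d last=%d delta=%d samples=%d\n", first, last, last - first, NR
         }'
}

# Usage (pid is a placeholder for the process under suspicion):
#   for i in 1 2 3 4 5 6; do ps -o rss= -p "$pid"; sleep 10; done | rss_trend
```

A steadily positive delta across repeated runs is a hint rather than proof of a leak, since caches and warm-up can also grow RSS.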
Disk analysis
# Disk space
df -h
du -sh /* 2>/dev/null | sort -hr | head -10
# Disk I/O
iostat -x 1 5
iotop -b -n 5 # requires root
# Find large files (-xdev stays on one filesystem and skips /proc; repeat per mount point)
find / -xdev -type f -size +100M 2>/dev/null
# Find disk-heavy processes
pidstat -d 1 5
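For "disk full" reports it can help to filter df output down to the filesystems that are actually near capacity. The sketch below assumes POSIX-format input from df -P and a hypothetical helper name, fs_over_threshold; the 90% default is an arbitrary choice.

```shell
# Hypothetical helper: list mount points at or above a usage threshold.
# Input: `df -P` output on stdin. Mount points containing spaces are not handled.
fs_over_threshold() {
    # $1 = usage threshold in percent (defaults to 90)
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # strip the trailing percent sign
        if (use + 0 >= limit + 0)
            print $6, $5               # mount point and usage
    }'
}

# Usage on a live system:
#   df -P | fs_over_threshold 85
```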
Network analysis
# Connections and bandwidth
ss -tuln
ss -tunap | grep ESTAB
nethogs # per-process bandwidth
# Network statistics
netstat -i
ip -s link
# Check for connection issues
ping -c 4 8.8.8.8
mtr --report google.com
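A per-state count of TCP sockets is a quick way to spot connection pile-ups such as many TIME-WAIT or CLOSE-WAIT entries. The following sketch assumes the input comes from ss -tan, where the first column is the socket state; count_tcp_states is a hypothetical name.

```shell
# Hypothetical helper: count TCP sockets per state.
# Input: `ss -tan` output on stdin (header line first, state in column 1).
count_tcp_states() {
    awk 'NR > 1 { count[$1]++ }
         END { for (state in count) print count[state], state }' | sort -rn
}

# Usage on a live system:
#   ss -tan | count_tcp_states
```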
Common bottlenecks
| Symptom | Indicator | Solution |
|---|---|---|
| Load > cores, low %wa | CPU-bound | Identify hot process, optimize it or scale horizontally |
| High %wa in top | I/O wait | Check disk, move to SSD, optimize queries |
| Low available + high swap | Memory | Find leak, increase RAM, tune OOM killer |
| High %si/%hi | Interrupts | NIC issue, driver problem |
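The table can be turned into a rough first-pass classifier. This is a sketch only: the function name classify_bottleneck, the order of the checks, and every numeric threshold (20% I/O wait, 50% swap used, 10% interrupt time) are assumptions chosen for illustration, not values taken from the skill.

```shell
# Hypothetical helper: map a few headline metrics to a bottleneck type.
# All thresholds below are illustrative assumptions; tune them per system.
classify_bottleneck() {
    # $1 = load average, $2 = CPU cores, $3 = %wa,
    # $4 = swap used (%), $5 = %si + %hi
    awk -v load="$1" -v cores="$2" -v wa="$3" -v swap="$4" -v irq="$5" 'BEGIN {
        if (wa >= 20)          print "io-wait"
        else if (swap >= 50)   print "memory"
        else if (irq >= 10)    print "interrupts"
        else if (load > cores) print "cpu-bound"
        else                   print "no-obvious-bottleneck"
    }'
}

# Usage with values read off uptime, nproc, top, and free:
#   classify_bottleneck 9.5 4 2 0 1
```

I/O wait is checked before load because blocked tasks also raise the load average on Linux. The result is a starting point for the drill-down steps, not a diagnosis.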
Rules
- MUST check all four resources (CPU, memory, disk, network)
- MUST identify specific processes causing issues
- MUST provide actionable recommendations
- Never kill processes without user approval
- Always check if symptoms correlate with specific times (cron, traffic)