How to See Slowest Units → systemd-analyze blame
Table of Contents
1. [Introduction](#introduction)
2. [Prerequisites](#prerequisites)
3. [Understanding systemd-analyze blame](#understanding-systemd-analyze-blame)
4. [Basic Usage and Syntax](#basic-usage-and-syntax)
5. [Interpreting the Output](#interpreting-the-output)
6. [Advanced Usage Examples](#advanced-usage-examples)
7. [Related systemd-analyze Commands](#related-systemd-analyze-commands)
8. [Troubleshooting Common Issues](#troubleshooting-common-issues)
9. [Best Practices and Optimization Tips](#best-practices-and-optimization-tips)
10. [Real-World Use Cases](#real-world-use-cases)
11. [Conclusion](#conclusion)
Introduction
System administrators and Linux users frequently encounter systems that take longer than expected to boot or start services. The `systemd-analyze blame` command identifies which systemd units consume the most time during system initialization. This guide shows how to use it to diagnose performance issues, optimize boot times, and keep systems running efficiently.
The command lists the startup time of each unit, so you can pinpoint bottlenecks and make informed optimization decisions, whether you manage production servers or a personal Linux workstation.
Prerequisites
Before diving into the specifics of `systemd-analyze blame`, ensure you have the following:
System Requirements
- A Linux system running systemd (most modern distributions)
- Administrative privileges (root or sudo access) for changing unit configuration; `systemd-analyze blame` itself normally runs as a regular user
- Basic familiarity with command-line interface
- Understanding of systemd concepts (units, services, targets)
Knowledge Prerequisites
- Basic Linux command-line skills
- Understanding of system boot process
- Familiarity with systemd service management
- Basic knowledge of system performance concepts
Verification Steps
To verify your system supports `systemd-analyze`, run:
```bash
# Check that systemd is installed and print its version
systemctl --version
# Verify systemd-analyze is available
command -v systemd-analyze
# Confirm the tool runs and list its subcommands
systemd-analyze --help
```
Understanding systemd-analyze blame
What is systemd-analyze blame?
The `systemd-analyze blame` command is part of the systemd suite of tools designed to analyze system and service manager performance. It specifically focuses on identifying which units took the longest time to initialize during the last system boot or service startup sequence.
How It Works
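When systemd starts services during boot, it records timestamps for various stages of each unit's lifecycle. The `blame` subcommand processes this timing information and presents it in a human-readable format, sorted by initialization time in descending order. Two caveats apply when reading the list: a unit can appear slow simply because it was waiting for another unit to finish initializing, and services with `Type=simple` are not shown because systemd considers them started immediately, so there is no initialization delay to measure.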
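To make the timestamp idea concrete, here is a minimal sketch that derives one unit's activation time from its properties. It assumes the figure corresponds to the gap between the `InactiveExitTimestampMonotonic` and `ActiveEnterTimestampMonotonic` properties (both in microseconds), which matches our understanding of the tool but is not spelled out in this guide. The helper name `unit_activation_ms` is hypothetical.

```bash
# Hypothetical helper: read KEY=VALUE lines (as printed by `systemctl show`)
# on stdin and print the activation time in milliseconds.
# Assumption: activation time = ActiveEnterTimestampMonotonic
#                               - InactiveExitTimestampMonotonic (microseconds).
unit_activation_ms() {
    awk -F= '
        $1 == "InactiveExitTimestampMonotonic" { start  = $2 }
        $1 == "ActiveEnterTimestampMonotonic"  { active = $2 }
        END {
            # A value of 0 means the unit never reached that state.
            if (start > 0 && active >= start)
                printf "%d\n", (active - start) / 1000
            else
                print "n/a"
        }'
}
```

A possible invocation is `systemctl show -p InactiveExitTimestampMonotonic -p ActiveEnterTimestampMonotonic ssh.service | unit_activation_ms`, where `ssh.service` is only an example unit.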
Key Features
- Time-sorted output: Services listed from slowest to fastest
- Precise timing: Millisecond-level accuracy
- Unit identification: Clear service and unit names
- Historical data: Based on the most recent boot cycle
- No system impact: Read-only analysis with no performance overhead
Basic Usage and Syntax
Command Syntax
The basic syntax for `systemd-analyze blame` is straightforward:
```bash
systemd-analyze [OPTIONS...] blame
```
Simple Usage Example
```bash
# Basic blame command (root privileges are normally not required)
systemd-analyze blame
```
Sample Output
```
8.123s NetworkManager-wait-online.service
2.945s mysql.service
2.234s apache2.service
1.876s postgresql.service
1.234s docker.service
987ms plymouth-quit-wait.service
654ms systemd-networkd-wait-online.service
432ms accounts-daemon.service
321ms gdm.service
234ms ssh.service
123ms systemd-logind.service
89ms dbus.service
45ms systemd-resolved.service
23ms systemd-timesyncd.service
12ms systemd-tmpfiles-setup.service
```
Understanding the Output Format
Each line in the output follows this pattern:
```
[TIME] [UNIT_NAME]
```
Where:
- TIME: Duration the unit took to start (seconds, milliseconds)
- UNIT_NAME: The systemd unit identifier
Interpreting the Output
Time Units and Formatting
The output displays time in the most appropriate unit:
- Minutes (min): Combined with seconds for long times, such as `1min 30.123s`
- Seconds (s): For times ≥ 1 second
- Milliseconds (ms): For times < 1 second
- Microseconds (μs): For very short times (rare in blame output)
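Because the unit changes from line to line, scripts usually need to normalize these values before comparing them. The following is a small sketch of such a conversion. The function name `span_to_seconds` is hypothetical, and it assumes microseconds are written with the ASCII suffix `us`.

```bash
# Hypothetical helper: convert a systemd time span such as "1min 2.345s",
# "8.123s" or "987ms" to seconds. Tokens with unknown suffixes are ignored.
span_to_seconds() {
    echo "$*" | awk '{
        total = 0
        for (i = 1; i <= NF; i++) {
            if      ($i ~ /^[0-9.]+min$/) total += substr($i, 1, length($i) - 3) * 60
            else if ($i ~ /^[0-9.]+ms$/)  total += substr($i, 1, length($i) - 2) / 1000
            else if ($i ~ /^[0-9.]+us$/)  total += substr($i, 1, length($i) - 2) / 1000000
            else if ($i ~ /^[0-9.]+h$/)   total += substr($i, 1, length($i) - 1) * 3600
            else if ($i ~ /^[0-9.]+s$/)   total += substr($i, 1, length($i) - 1)
        }
        printf "%.3f\n", total
    }'
}
```

For example, `span_to_seconds 1min 2.345s` prints `62.345`.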
Identifying Problem Services
Services to investigate typically include:
- Network services: Often slow due to timeout periods
- Database services: May have lengthy initialization procedures
- Web servers: Can be slow if checking configurations
- Custom applications: May have inefficient startup scripts
Normal vs. Concerning Times
Typical startup times (rough guidelines that vary with hardware and configuration):
- System services: 10-500ms
- Network services: 1-3 seconds
- Database services: 1-5 seconds
Concerning startup times:
- Any service > 10 seconds
- Multiple services > 5 seconds
- Services that previously started faster
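The thresholds above can be applied mechanically. The sketch below reads blame output on standard input and prints only the units above a limit given in seconds. The function name `slow_units` and the default limit of 10 are illustrative choices, not part of systemd.

```bash
# Hypothetical helper: read `systemd-analyze blame` output on stdin and print
# the units whose start time exceeds a threshold in seconds (default: 10).
slow_units() {
    awk -v limit="${1:-10}" '{
        secs = 0
        # All fields except the last form the time span; the last is the unit.
        for (i = 1; i < NF; i++) {
            # "$i + 0" keeps the leading number and drops the unit suffix.
            if      ($i ~ /^[0-9.]+min$/) secs += ($i + 0) * 60
            else if ($i ~ /^[0-9.]+ms$/)  secs += ($i + 0) / 1000
            else if ($i ~ /^[0-9.]+s$/)   secs += ($i + 0)
        }
        if (secs > limit) printf "%.3f %s\n", secs, $NF
    }'
}
```

Typical use would be `systemd-analyze blame | slow_units 5` to list everything slower than five seconds.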
Advanced Usage Examples
Filtering and Processing Output
Show Only Top 10 Slowest Services
```bash
systemd-analyze blame | head -10
```
Filter Services Taking More Than 1 Second
```bash
systemd-analyze blame | grep -E "^[[:space:]]*([0-9]+min )?[0-9]+(\.[0-9]+)?s "
```
Search for Specific Service Types
```bash
# Find network-related services
systemd-analyze blame | grep -i network
# Find database services
systemd-analyze blame | grep -E "(mysql|postgres|mongodb|redis)"
# Find web server services
systemd-analyze blame | grep -E "(apache|nginx|httpd)"
```
Combining with Other Commands
Save Output for Analysis
```bash
# Save to file with timestamp
systemd-analyze blame > boot_analysis_$(date +%Y%m%d_%H%M%S).txt
# Compare before and after optimization
systemd-analyze blame > before_optimization.txt
# ... make changes ...
sudo reboot
# After the reboot, capture the new timings and compare
systemd-analyze blame > after_optimization.txt
diff before_optimization.txt after_optimization.txt
```
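A plain `diff` of two blame listings tends to flag almost every line, because both the timings and the ordering change between boots. As an alternative, this sketch lines the two files up by unit name. The function names `normalize_blame` and `compare_blame` are hypothetical, and the file names in the usage note are the ones from the example above.

```bash
# Hypothetical helper: turn "  1min 2.345s foo.service" into
# "foo.service 1min2.345s" and sort by unit name.
normalize_blame() {
    awk '{ unit = $NF; $NF = ""; gsub(/ /, ""); print unit, $0 }' "$1" | LC_ALL=C sort -k1,1
}

# Print "unit before after" for every unit found in either file
# ("-" marks a unit that is missing from one side).
compare_blame() {
    local before_sorted
    before_sorted=$(mktemp) || return 1
    normalize_blame "$1" > "$before_sorted"
    normalize_blame "$2" | LC_ALL=C join -a 1 -a 2 -e '-' -o 0,1.2,2.2 "$before_sorted" -
    rm -f "$before_sorted"
}
```

Running `compare_blame before_optimization.txt after_optimization.txt` then gives one line per unit with both timings side by side.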
Monitor Changes Over Time
```bash
#!/bin/bash
# Monitoring script: append the five slowest units to a log
echo "Boot Analysis - $(date)" >> boot_times.log
systemd-analyze blame | head -5 >> boot_times.log
echo "---" >> boot_times.log
```
Related systemd-analyze Commands
systemd-analyze time
Shows overall boot time breakdown:
```bash
systemd-analyze time
```
Output example:
```
Startup finished in 2.345s (kernel) + 1.234s (initrd) + 15.678s (userspace) = 19.257s
graphical.target reached after 15.234s in userspace
```
systemd-analyze critical-chain
Shows the critical path of service dependencies:
```bash
systemd-analyze critical-chain
```
systemd-analyze plot
Generates an SVG timeline of the boot process:
```bash
systemd-analyze plot > boot_timeline.svg
```
systemd-analyze dump
Provides detailed information about all units:
```bash
systemd-analyze dump > system_dump.txt
```
Troubleshooting Common Issues
Permission Denied Errors
Problem: Command fails with permission errors
Solution:
```bash
# Use sudo if your system restricts access to the system manager
sudo systemd-analyze blame
# Check that systemd is running and report its overall state
systemctl is-system-running
```
No Data Available
Problem: Command returns empty or minimal output
Causes and Solutions:
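1. Boot has not finished yet (`systemd-analyze` typically reports that bootup is not yet finished while jobs are still pending):
```bash
# Check whether boot has completed
systemd-analyze time
# List jobs that are still running or queued
systemctl list-jobs
# Reboot to generate fresh data if the problem persists
sudo reboot
```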
2. Systemd logging disabled:
```bash
# Check systemd configuration
systemctl status systemd-journald
# Ensure logging is enabled in /etc/systemd/system.conf
grep -i "LogLevel" /etc/systemd/system.conf
```
Inconsistent Results
Problem: Results vary significantly between runs
Investigation Steps:
```bash
# Record the current top 10
systemd-analyze blame | head -10
# Wait and check again (figures are fixed once boot has finished, so also compare across reboots)
sleep 60
systemd-analyze blame | head -10
# Look for network-dependent services, whose timing often varies
systemd-analyze critical-chain | grep -i network
```
Services Not Listed
Problem: Expected services don't appear in output
Possible Causes:
- Service failed to start
- Service is socket-activated
- Service starts very quickly
Investigation:
```bash
# Check service status (replace service_name with the unit in question)
systemctl status service_name
# List failed units
systemctl list-units --failed
# Check socket-activated services
systemctl list-sockets
```
Best Practices and Optimization Tips
Regular Monitoring
Establish Baseline Measurements
```bash
# Create baseline after fresh installation
systemd-analyze blame > baseline_boot_times.txt
systemd-analyze time > baseline_overall.txt
# Document system configuration
uname -a >> baseline_system_info.txt
systemctl list-unit-files --state=enabled >> baseline_enabled_services.txt
```
Periodic Analysis
```bash
#!/bin/bash
# Weekly boot time check script
LOGFILE="/var/log/boot-performance.log"
echo "=== Boot Analysis $(date) ===" >> "$LOGFILE"
systemd-analyze time >> "$LOGFILE"
echo "Top 10 slowest services:" >> "$LOGFILE"
systemd-analyze blame | head -10 >> "$LOGFILE"
echo "" >> "$LOGFILE"
```
Optimization Strategies
Disable Unnecessary Services
```bash
# Identify services you don't need
systemctl list-unit-files --type=service --state=enabled
# Disable unwanted services
sudo systemctl disable service_name
sudo systemctl mask service_name # Prevent any start, manual or as a dependency
```
Optimize Network Services
```bash
# Reduce NetworkManager-wait-online timeout
sudo systemctl edit NetworkManager-wait-online.service
# Add these lines in the editor that opens:
#   [Service]
#   TimeoutStartSec=30
```
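`systemctl edit` opens an interactive editor, which is awkward in provisioning scripts. The same override can be written directly as a drop-in file under `/etc/systemd/system/<unit>.d/`. The sketch below shows one way to do that. The function name `write_dropin`, the file name `override.conf`, and the optional third argument (an alternative root directory for dry runs) are illustrative choices.

```bash
# Hypothetical helper: write a drop-in override for a unit.
# Usage: write_dropin UNIT "CONTENT" [ROOT]
# ROOT defaults to /etc/systemd/system; pass a scratch directory to dry-run.
write_dropin() {
    local unit="$1" content="$2" root="${3:-/etc/systemd/system}"
    local dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    printf '%s\n' "$content" > "$dir/override.conf"
    # Print the path so the caller can inspect or log it.
    echo "$dir/override.conf"
}
```

When writing to the real location, run the function as root and follow it with `systemctl daemon-reload` so systemd picks up the change. A call such as `write_dropin NetworkManager-wait-online.service "$(printf '[Service]\nTimeoutStartSec=30')"` reproduces the override shown above.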
Parallel Service Startup
```bash
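# Open a drop-in for the service to adjust its ordering
sudo systemctl edit service_name
# Note: drop-ins can only add dependencies. To remove an existing After=
# line, override the whole unit with "systemctl edit --full service_name".
# Unit files do not support trailing comments, so keep comments on their own line:
#   [Unit]
#   # Order after network.target instead of network-online.target
#   After=network.target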
```
Service-Specific Optimizations
Database Services
```bash
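# MySQL optimization example
sudo systemctl edit mysql.service
# Add these lines (mysql_optimize_startup.sh stands for a site-specific script
# that, for example, shortens the innodb_buffer_pool_load_at_startup phase):
#   [Service]
#   ExecStartPre=/usr/bin/mysql_optimize_startup.sh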
```
Web Servers
```bash
# Apache optimization
sudo systemctl edit apache2.service
# Add these lines to validate the configuration before each start:
#   [Service]
#   ExecStartPre=/usr/sbin/apache2ctl configtest
```
Real-World Use Cases
Case Study 1: Server Boot Optimization
Scenario: Production server taking 45 seconds to boot
Analysis:
```bash
systemd-analyze blame | head -5
25.432s NetworkManager-wait-online.service
8.234s mysql.service
4.567s apache2.service
2.345s docker.service
1.234s ssh.service
```
Solution Applied:
```bash
# Reduce network wait timeout
sudo systemctl edit NetworkManager-wait-online.service
#   [Service]
#   TimeoutStartSec=15
# Optimize MySQL startup
sudo systemctl edit mysql.service
#   [Service]
#   ExecStartPre=/opt/mysql_quick_start.sh
# Result: Boot time reduced to 18 seconds
```
Case Study 2: Desktop Performance
Scenario: Linux desktop slow to reach graphical interface
Analysis Process:
```bash
# Check overall timing
systemd-analyze time
# Result: 23.456s to reach graphical.target
# Identify bottlenecks
systemd-analyze blame | grep -E "(gdm|plymouth|nvidia|graphics)"
5.678s plymouth-quit-wait.service
3.456s gdm.service
2.345s nvidia-persistenced.service
# Check critical path
systemd-analyze critical-chain graphical.target
```
Optimization Steps:
```bash
# Disable plymouth if not needed (if the unit is static, "systemctl mask" is needed instead)
sudo systemctl disable plymouth-quit-wait.service
# Optimize graphics drivers
sudo systemctl edit nvidia-persistenced.service
#   [Service]
#   Type=forking
#   TimeoutStartSec=10
```
Case Study 3: Container Host Optimization
Scenario: Docker host with slow container startup affecting boot time
Investigation:
```bash
systemd-analyze blame | grep -i docker
12.345s docker.service
3.456s docker-containerd.service
# Check what's causing Docker delays during the current boot
journalctl -b -u docker.service
```
Resolution:
```bash
# Optimize Docker daemon startup
sudo systemctl edit docker.service
# The empty ExecStart= clears the packaged command before the new one is set:
#   [Service]
#   ExecStart=
#   ExecStart=/usr/bin/dockerd --storage-driver=overlay2 --log-driver=journald
# Configure Docker to start after network is ready
#   [Unit]
#   After=network-online.target
#   Wants=network-online.target
```
Advanced Analysis Techniques
Correlation Analysis
Comparing Multiple Boots
```bash
#!/bin/bash
# Multi-boot analysis script.
# A loop cannot survive "sudo reboot", so run this once after each boot
# (for example from a systemd timer or a cron @reboot entry) and reboot
# between runs until enough samples are collected.
OUT="multi_boot_analysis.txt"
BOOT_ID=$(cat /proc/sys/kernel/random/boot_id)
{
    echo "=== Boot $BOOT_ID ($(date)) ==="
    systemd-analyze blame | head -10
    echo ""
} >> "$OUT"
```
Service Dependency Analysis
```bash
# Find services that might benefit from parallelization
systemd-analyze critical-chain | grep -A5 -B5 "slow_service.service"
# Check what services are waiting for slow ones
systemctl list-dependencies --reverse slow_service.service
```
Performance Trending
Historical Data Collection
```bash
#!/bin/bash
# Advanced monitoring script
DATE=$(date +%Y%m%d_%H%M%S)
LOGDIR="/var/log/systemd-performance"
mkdir -p "$LOGDIR"
# Collect comprehensive data
systemd-analyze time > "$LOGDIR/time_$DATE.txt"
systemd-analyze blame > "$LOGDIR/blame_$DATE.txt"
systemd-analyze critical-chain > "$LOGDIR/critical_$DATE.txt"
# Generate summary report
echo "Performance Summary - $DATE" > "$LOGDIR/summary_$DATE.txt"
echo "Total boot time: $(systemd-analyze time | grep 'Startup finished')" >> "$LOGDIR/summary_$DATE.txt"
echo "Slowest service: $(systemd-analyze blame | head -1)" >> "$LOGDIR/summary_$DATE.txt"
```
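Once several snapshots have accumulated, a quick way to spot a trend is to list the slowest unit from each one. The sketch below assumes the `blame_<timestamp>.txt` naming used by the script above. The function name `slowest_per_snapshot` is hypothetical.

```bash
# Hypothetical helper: for each saved blame_*.txt in a directory, print the
# file name followed by its first (slowest) entry.
slowest_per_snapshot() {
    local dir="${1:-/var/log/systemd-performance}" f
    for f in "$dir"/blame_*.txt; do
        [ -e "$f" ] || continue   # directory holds no snapshots yet
        printf '%s: %s\n' "$(basename "$f")" "$(head -n 1 "$f" | sed 's/^[[:space:]]*//')"
    done
}
```

Because the timestamps sort lexicographically, the output is in chronological order, so a unit that keeps reappearing at the top is easy to see.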
Integration with Monitoring Systems
Automated Alerting
```bash
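#!/bin/bash
# Boot time monitoring with alerting
THRESHOLD=30 # seconds
# Take the total after "= " (for example "1min 19.257s") and convert it to seconds
TOTAL=$(systemd-analyze time | sed -n 's/^Startup finished in .*= //p')
BOOT_TIME=$(echo "$TOTAL" | awk '{ for (i = 1; i <= NF; i++) s += ($i ~ /min$/) ? $i * 60 : $i + 0 } END { print s + 0 }')
if (( $(echo "$BOOT_TIME > $THRESHOLD" | bc -l) )); then
    echo "WARNING: Boot time ${BOOT_TIME}s exceeds threshold ${THRESHOLD}s"
    systemd-analyze blame | head -5 | mail -s "Slow boot detected" admin@company.com
fi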
```
Integration with Prometheus
```bash
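#!/bin/bash
# Export metrics for Prometheus (node_exporter textfile collector)
METRICS_FILE="/var/lib/node_exporter/textfile_collector/boot_time.prom"
# Convert a time span such as "1min 19.257s" or "987ms" to seconds
to_seconds() {
    echo "$1" | awk '{ for (i = 1; i <= NF; i++) s += ($i ~ /min$/) ? $i * 60 : ($i ~ /ms$/) ? $i / 1000 : $i + 0 } END { print s + 0 }'
}
# Extract total boot time (the span after "= ")
BOOT_TIME=$(to_seconds "$(systemd-analyze time | sed -n 's/^Startup finished in .*= //p')")
echo "system_boot_time_seconds $BOOT_TIME" > "$METRICS_FILE"
# Export top 5 unit start times; the last field of each line is the unit name
systemd-analyze blame | head -5 | while read -r line; do
    SERVICE=${line##* }
    TIME=$(to_seconds "${line% *}")
    echo "service_start_time_seconds{service=\"$SERVICE\"} $TIME" >> "$METRICS_FILE"
done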
```
Security Considerations
Protecting Performance Data
```bash
# Secure log directory
sudo mkdir -p /var/log/systemd-performance
sudo chown root:adm /var/log/systemd-performance
sudo chmod 750 /var/log/systemd-performance
# Rotate logs to prevent disk space issues
sudo tee /etc/logrotate.d/systemd-performance << EOF
/var/log/systemd-performance/*.txt {
weekly
rotate 4
compress
delaycompress
missingok
notifempty
create 644 root adm
}
EOF
```
Audit Trail
```bash
# Log all systemd-analyze commands
sudo tee -a /etc/audit/rules.d/systemd-analyze.rules << EOF
-w /usr/bin/systemd-analyze -p x -k systemd_analysis
EOF
sudo service auditd restart
```
Conclusion
The `systemd-analyze blame` command is an essential tool for Linux system administrators and users who want to optimize boot performance and identify service bottlenecks. This guide has covered basic usage, output interpretation, filtering, optimization techniques, and real-world applications.
Key Takeaways
1. Regular Monitoring: Use `systemd-analyze blame` regularly to establish baselines and detect performance regressions early.
2. Systematic Approach: Always analyze the complete picture using multiple systemd-analyze commands rather than focusing solely on blame output.
3. Targeted Optimization: Focus optimization efforts on services that genuinely impact user experience or system functionality.
4. Documentation: Keep records of your analysis and optimizations to track improvements and avoid repeating unsuccessful changes.
5. Holistic View: Remember that boot time is just one aspect of system performance; balance optimization efforts with system stability and functionality.
Next Steps
After mastering `systemd-analyze blame`, consider exploring:
- Advanced systemd configuration: Custom service files and targets
- System profiling tools: perf, strace, and other performance analysis utilities
- Automated monitoring: Integration with monitoring systems and alerting
- Container optimization: Applying similar principles to containerized environments
- Network performance: Analyzing and optimizing network-dependent services
Final Recommendations
1. Start with the services consuming the most time, but always investigate why they're slow before making changes.
2. Test all optimizations in non-production environments first.
3. Document your baseline performance metrics before making any changes.
4. Remember that some services legitimately require time to start safely – not all slow services need optimization.
5. Consider the trade-offs between boot time and system functionality when making optimization decisions.
By following the practices and techniques outlined in this guide, you'll be well-equipped to diagnose, analyze, and optimize your Linux systems for peak performance while maintaining stability and reliability.