How to Use vmstat to Monitor Linux System Performance
System monitoring is a critical aspect of Linux system administration, and understanding how your system performs under various conditions is essential for maintaining optimal performance. The `vmstat` command is one of the most powerful and versatile tools available for monitoring virtual memory statistics, system processes, and overall system performance in Linux environments.
This comprehensive guide will walk you through everything you need to know about using `vmstat` effectively, from basic usage to advanced monitoring techniques. Whether you're a beginner system administrator or an experienced Linux professional, this article will provide you with the knowledge and practical skills needed to leverage `vmstat` for comprehensive system monitoring.
What is vmstat?
The `vmstat` (Virtual Memory Statistics) command is a standard Linux utility that reports on processes, memory usage, paging activity, block I/O, and CPU activity. It's part of the procps package (named procps-ng on many distributions), not sysstat, which ships separate tools such as `sar` and `iostat`. It comes pre-installed on most Linux distributions.
Unlike other monitoring tools that focus on specific aspects of system performance, `vmstat` provides a holistic view of your system's health, making it an invaluable tool for:
- Identifying performance bottlenecks
- Monitoring system resource utilization
- Troubleshooting memory-related issues
- Analyzing I/O performance
- Tracking CPU usage patterns
- Planning system capacity and upgrades
Prerequisites and Requirements
Before diving into `vmstat` usage, ensure you have the following prerequisites:
System Requirements
- Any Linux distribution (Ubuntu, CentOS, RHEL, Debian, etc.)
- Terminal access with basic user privileges
- No special installation required (vmstat is typically pre-installed)
Knowledge Prerequisites
- Basic understanding of Linux command line
- Familiarity with system administration concepts
- Understanding of memory management basics
- Knowledge of CPU and I/O concepts
Verification of vmstat Installation
To verify that `vmstat` is available on your system, run:
```bash
vmstat --version
```
If `vmstat` is not installed, you can install it using your distribution's package manager:
Ubuntu/Debian:
```bash
sudo apt-get update
sudo apt-get install procps
```
CentOS/RHEL/Fedora:
```bash
sudo yum install procps-ng
# or, on newer versions
sudo dnf install procps-ng
```
Understanding vmstat Output Format
Before exploring various `vmstat` options, it's crucial to understand the standard output format. The basic `vmstat` command displays information in several columns grouped by category:
```bash
vmstat
```
Sample Output:
```
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 1847516  92708 1055648    0    0    45    12  180  350  5  2 92  1  0
```
Column Explanations
Process Information (procs)
- r: Number of runnable processes (running or waiting for run time)
- b: Number of processes in uninterruptible sleep (blocked)
Memory Information (memory)
- swpd: Amount of swap space in use (KB)
- free: Amount of idle memory (KB)
- buff: Amount of memory used as buffers (KB)
- cache: Amount of memory used as cache (KB)
Swap Information (swap)
- si: Amount of memory swapped in from disk per second (KB/s)
- so: Amount of memory swapped out to disk per second (KB/s)
I/O Information (io)
- bi: Blocks received from a block device per second (blocks/s)
- bo: Blocks sent to a block device per second (blocks/s)
System Information (system)
- in: Number of interrupts per second, including clock interrupts
- cs: Number of context switches per second
CPU Information (cpu)
- us: Time spent running non-kernel code (user time, including nice time)
- sy: Time spent running kernel code (system time)
- id: Time spent idle
- wa: Time spent waiting for I/O
- st: Time stolen from a virtual machine
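To keep the column positions straight while reading raw output, the mapping above can be sketched as a small shell helper. `label_vmstat_fields` is a hypothetical name, and the field order is an assumption based on the default 17-column layout shown in the sample; modes such as `-a` or `-d` change the columns.

```bash
# label_vmstat_fields: hypothetical helper that prints name=value pairs
# for each data row of default-layout vmstat output read from stdin.
# Assumes the default 17-column order (r b swpd ... wa st).
label_vmstat_fields() {
  awk '
    BEGIN {
      n = split("r b swpd free buff cache si so bi bo in cs us sy id wa st", name, " ")
    }
    # Data rows start with a number; the two header rows do not.
    $1 ~ /^[0-9]+$/ {
      line = ""
      for (i = 1; i <= n && i <= NF; i++) {
        sep = (i > 1) ? " " : ""
        line = line sep name[i] "=" $i
      }
      print line
    }
  '
}
```

A typical call would be `vmstat 1 3 | label_vmstat_fields`, which prints one labelled line per sample.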
Basic vmstat Usage Examples
Single Snapshot
The simplest way to use `vmstat` is to run it without any arguments:
```bash
vmstat
```
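Without an interval, `vmstat` prints a single report whose rate-based values (the swap, I/O, system, and CPU columns) are averages since the last reboot, not a reading of current activity.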
Continuous Monitoring
To monitor system performance continuously, specify an interval in seconds:
```bash
vmstat 2
```
This command updates the display every 2 seconds. To stop the monitoring, press `Ctrl+C`.
Limited Sample Collection
To collect a specific number of samples:
```bash
vmstat 2 5
```
This collects 5 samples with a 2-second interval between each sample.
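The first report `vmstat` prints covers the period since boot rather than the most recent interval, so it is usually dropped before a fixed-size sample is analysed. A minimal sketch of such a filter follows; `drop_boot_average` is a hypothetical name, and the filter assumes data rows begin with a number, as in the default layout.

```bash
# drop_boot_average: hypothetical filter that removes the first data row
# (the since-boot averages) from vmstat output on stdin, keeping the
# header rows and every later interval sample.
drop_boot_average() {
  awk '
    $1 ~ /^[0-9]+$/ && !seen { seen = 1; next }  # skip the first numeric row only
    { print }
  '
}
```

With this in place, `vmstat 2 5 | drop_boot_average` leaves four interval samples under the usual headers.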
Monitoring with Timestamps
To include timestamps in your output:
```bash
vmstat -t 2
```
Sample Output:
```
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st                 UTC
 1  0      0 1825432  93156 1058764    0    0    45    12  181  352  5  2 92  1  0 2023-12-07 10:30:15
```
Advanced vmstat Options and Features
Memory Statistics in Different Units
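By default, `vmstat` displays memory information in units of 1024 bytes. The `-S` option switches the unit: `k` (1000 bytes), `K` (1024 bytes), `m` (1,000,000 bytes), or `M` (1,048,576 bytes):
```bash
vmstat -S M # Display in units of 1,048,576 bytes
vmstat -S K # Display in units of 1024 bytes (default)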
```
Active and Inactive Memory
To view active and inactive memory statistics:
```bash
vmstat -a
```
Sample Output:
```
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 1820544 456789 1234567    0    0    45    12  182  354  5  2 92  1  0
```
Disk Statistics
To display disk I/O statistics:
```bash
vmstat -d
```
This shows detailed disk activity for each disk device on your system.
Partition Statistics
To view statistics for specific partitions:
```bash
vmstat -p /dev/sda1
```
Fork Statistics
To display the number of forks since boot:
```bash
vmstat -f
```
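A total since boot is hard to interpret on its own; the difference between two readings gives a fork rate. The sketch below assumes the `<number> forks` output format printed by procps `vmstat -f`, and both function names are hypothetical.

```bash
# fork_count: hypothetical helper that extracts the number from
# `vmstat -f` style output (for example "       123456 forks") on stdin.
fork_count() {
  awk '$2 == "forks" { print $1 }'
}

# fork_rate: forks per second between two counts taken a known number of
# seconds apart. Uses integer arithmetic, so the result is rounded down.
fork_rate() {
  local before=$1 after=$2 seconds_apart=$3
  echo $(( (after - before) / seconds_apart ))
}

# Example usage (commented out so the block only defines functions):
#   before=$(vmstat -f | fork_count); sleep 10; after=$(vmstat -f | fork_count)
#   fork_rate "$before" "$after" 10
```

A sustained high rate often points to a script or service spawning short-lived processes in a loop.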
Slab Information
To view kernel slab allocator information (this reads `/proc/slabinfo`, which typically requires root privileges):
```bash
vmstat -m
```
Practical Monitoring Scenarios
Scenario 1: Identifying Memory Pressure
When investigating potential memory issues, monitor these key metrics:
```bash
vmstat -t 1 10
```
What to look for:
- High values in the `swpd` column indicate swap usage
- Increasing `si` and `so` values suggest active swapping
- Low `free` memory combined with high swap activity indicates memory pressure
Example Analysis:
```
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st                 UTC
 3  1  45678  12456   2345  89012  150   75    25    45  890 1245 25 15 45 15  0 2023-12-07 10:35:22
```
In this example, the system shows signs of memory pressure with active swapping (`si=150`, `so=75`) and low free memory.
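Spotting non-zero `si`/`so` values by eye gets tedious over many samples. As a rough sketch, the check can be automated with a filter like the one below; `flag_swapping` is a hypothetical name, and the field positions (4 for `free`, 7 for `si`, 8 for `so`) assume the default layout, which `-t` extends without reordering.

```bash
# flag_swapping: hypothetical filter for default-layout vmstat output on stdin.
# Prints one line per sample where si (field 7) or so (field 8) is non-zero.
flag_swapping() {
  awk '
    $1 ~ /^[0-9]+$/ && ($7 > 0 || $8 > 0) {
      printf "swapping: si=%s so=%s free=%s KB\n", $7, $8, $4
    }
  '
}
```

Run as `vmstat -t 1 10 | flag_swapping`, it stays silent on a healthy system and prints a line for each interval with swap traffic.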
Scenario 2: CPU Performance Analysis
To analyze CPU performance patterns:
```bash
vmstat 2 30
```
Key metrics to monitor:
- us (user): High values indicate CPU-intensive applications
- sy (system): High values suggest kernel-level processing or system calls
- wa (wait): High values indicate I/O bottlenecks
- id (idle): Low values suggest high CPU utilization
Performance Interpretation:
- `us + sy > 80%`: High CPU utilization
- `wa > 20%`: Potential I/O bottleneck
- `r > number of CPUs`: CPU queue buildup
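The rules of thumb above can be applied to each sample mechanically. The following is a minimal sketch rather than a tuned alerting rule: `assess_cpu` is a hypothetical name, the thresholds are the ones listed above, and the field positions (1 for `r`, 13 for `us`, 14 for `sy`, 16 for `wa`) assume the default layout.

```bash
# assess_cpu: hypothetical filter applying the rules of thumb above to
# default-layout vmstat output on stdin. Pass the CPU count as the first
# argument (for example "$(nproc)"); it defaults to 1.
assess_cpu() {
  awk -v ncpu="${1:-1}" '
    $1 ~ /^[0-9]+$/ {
      msg = ""
      if ($13 + $14 > 80) msg = msg " high-cpu"   # us + sy above 80%
      if ($16 > 20)       msg = msg " io-wait"    # wa above 20%
      if ($1 > ncpu)      msg = msg " run-queue"  # more runnable tasks than CPUs
      if (msg == "") print "ok"
      else           print "warn:" msg
    }
  '
}
```

For example, `vmstat 2 30 | assess_cpu "$(nproc)"` prints `ok` or a `warn:` line for every sample, including the first since-boot report.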
Scenario 3: I/O Performance Monitoring
For I/O-intensive applications:
```bash
vmstat -d 5
```
Combined with regular vmstat monitoring:
```bash
vmstat 1 | awk '{print strftime("%Y-%m-%d %H:%M:%S"), $0}'
```
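This adds timestamps to help correlate I/O patterns with specific times. Note that `strftime` is an awk extension (available in gawk, for example) rather than part of POSIX awk; `vmstat -t` is an alternative that needs no extra tooling.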
Scenario 4: Long-term Performance Trending
For long-term monitoring and logging:
```bash
vmstat 60 > /var/log/vmstat_$(date +%Y%m%d).log &
```
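This runs vmstat in the background, collecting samples every minute and logging to a dated file. Writing under `/var/log` normally requires root privileges, and the file name is fixed when the command starts, so the log does not roll over at midnight.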
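Once a log like this exists, a quick summary helps when comparing days. The sketch below assumes the log holds default-layout output (`us`, `sy`, and `wa` in fields 13, 14, and 16); `summarize_vmstat_log` is a hypothetical name.

```bash
# summarize_vmstat_log: hypothetical helper that reads a vmstat log file
# (default layout) and prints the sample count plus average us, sy and wa.
# Rows that do not start with a number, such as headers, are ignored.
summarize_vmstat_log() {
  awk '
    $1 ~ /^[0-9]+$/ { n++; us += $13; sy += $14; wa += $16 }
    END {
      if (n == 0) { print "no samples"; exit }
      printf "samples=%d avg_us=%.1f avg_sy=%.1f avg_wa=%.1f\n", n, us / n, sy / n, wa / n
    }
  ' "$1"
}
```

Keep in mind that the first row of each log is the since-boot average, so it slightly skews the result for short logs.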
Interpreting vmstat Results for Performance Tuning
Memory Performance Indicators
Healthy Memory System:
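- `free` plus reclaimable `buff`/`cache` memory comfortably above roughly 10% of total RAM (a low `free` value alone is normal, because Linux uses spare RAM for cache)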
- Minimal or zero swap usage (`swpd`, `si`, `so`)
- `buff` and `cache` showing reasonable values
Memory Issues:
- Consistently high swap usage
- Frequent swap in/out activity
- Very low `free` memory together with shrinking `buff`/`cache` and ongoing swap activity
CPU Performance Indicators
Balanced CPU Usage:
- `us` + `sy` < 80% consistently
- `wa` < 10% most of the time
- `r` (run queue) < number of CPU cores
CPU Bottlenecks:
- Sustained high `us` or `sy` values
- High `wa` values indicating I/O wait
- Run queue (`r`) consistently higher than CPU count
I/O Performance Indicators
Good I/O Performance:
- Reasonable `bi` and `bo` values for workload
- Low `wa` (I/O wait) percentage
- Minimal process blocking (`b` column)
I/O Bottlenecks:
- High `wa` values (>20%)
- Many blocked processes (`b`)
- Disproportionate `bi` or `bo` values
Combining vmstat with Other Tools
vmstat with sar
Combine vmstat with `sar` for comprehensive monitoring:
```bash
# Terminal 1
vmstat 5
# Terminal 2
sar -u 5
```
vmstat with iostat
For detailed I/O analysis:
```bash
# Terminal 1
vmstat 2
# Terminal 2
iostat -x 2
```
Creating Comprehensive Monitoring Scripts
Here's a practical monitoring script that combines vmstat with other utilities:
```bash
#!/bin/bash
# system_monitor.sh
LOGFILE="/var/log/system_monitor_$(date +%Y%m%d_%H%M%S).log"
DURATION=${1:-300} # Default 5 minutes
INTERVAL=${2:-5} # Default 5 seconds
echo "Starting system monitoring for $DURATION seconds..." | tee -a "$LOGFILE"
echo "Interval: $INTERVAL seconds" | tee -a "$LOGFILE"
echo "Log file: $LOGFILE"
echo "========================================" >> "$LOGFILE"
# Function to log system info
log_system_info() {
    echo "=== $(date) ===" >> "$LOGFILE"
    echo "--- vmstat ---" >> "$LOGFILE"
    # Take two samples and keep the second; the first is the since-boot average
    vmstat 1 2 | tail -1 >> "$LOGFILE"
    echo "--- Load Average ---" >> "$LOGFILE"
    uptime >> "$LOGFILE"
    echo "--- Memory Usage ---" >> "$LOGFILE"
    free -h >> "$LOGFILE"
    echo "" >> "$LOGFILE"
}
# Monitor for specified duration
END_TIME=$(( $(date +%s) + DURATION ))
while [ "$(date +%s)" -lt "$END_TIME" ]; do
    log_system_info
    sleep "$INTERVAL"
done
echo "Monitoring completed. Check $LOGFILE for results."
```
Troubleshooting Common Issues
Issue 1: vmstat Command Not Found
Problem: `bash: vmstat: command not found`
Solution:
```bash
# Ubuntu/Debian
sudo apt-get install procps
# CentOS/RHEL
sudo yum install procps-ng
```
Issue 2: Permission Denied Errors
Problem: Cannot access certain system statistics
Solution:
- Run with appropriate privileges
- Check if `/proc` filesystem is mounted
- Verify user permissions for system monitoring
Issue 3: Inconsistent or Unusual Values
Problem: vmstat showing unexpected values
Troubleshooting Steps:
1. Verify system time and date
2. Check for system load anomalies
3. Compare with other monitoring tools
4. Review system logs for errors
```bash
# Comprehensive system check
date
uptime
free -h
df -h
dmesg | tail -20
```
Issue 4: High I/O Wait Times
Problem: Consistently high `wa` values in CPU section
Investigation Steps:
```bash
# Check disk usage
df -h
# Check for disk errors
dmesg | grep -i error
# Monitor disk I/O
iostat -x 2 5
# Check for high I/O processes (usually requires root)
iotop -o
```
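When `wa` is high and the `b` column is non-zero, it also helps to see which processes are actually blocked. One possible approach, sketched below, filters `ps` output for the uninterruptible-sleep state (`D`) that the `b` column counts; `list_blocked` is a hypothetical name, and the input is assumed to come from `ps -eo state,pid,comm`.

```bash
# list_blocked: hypothetical filter for `ps -eo state,pid,comm` output on stdin.
# Keeps only processes whose state starts with D (uninterruptible sleep),
# which is the state the vmstat b column counts.
list_blocked() {
  awk '$1 ~ /^D/'
}

# Example usage (commented out so the block only defines the function):
#   ps -eo state,pid,comm | list_blocked
```

Because the `D` state is often short-lived, running the command several times in a row gives a more reliable picture than a single call.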
Best Practices for vmstat Usage
1. Establish Baseline Measurements
Before troubleshooting performance issues, establish baseline measurements during normal operations:
```bash
# Collect baseline data during different periods
vmstat 300 > baseline_normal_hours.log &
vmstat 60 > baseline_peak_hours.log &
```
2. Use Appropriate Monitoring Intervals
Choose monitoring intervals based on your needs:
- Real-time troubleshooting: 1-2 seconds
- General monitoring: 5-10 seconds
- Long-term trending: 60-300 seconds
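Because `vmstat` prints its first report immediately and then one report per interval, a run of N reports spans (N - 1) intervals. The sketch below turns a desired duration into a sample count on that basis; `samples_for` is a hypothetical helper name.

```bash
# samples_for: hypothetical helper that prints how many reports to request
# so that `vmstat INTERVAL COUNT` spans roughly DURATION seconds.
# vmstat prints the first report immediately, hence the +1.
samples_for() {
  local duration=$1 interval=$2
  echo $(( duration / interval + 1 ))
}

# Example usage (commented out so the block only defines the function):
#   vmstat 5 "$(samples_for 300 5)"   # about five minutes at 5-second intervals
```

The division is integer division, so a duration that is not a multiple of the interval is rounded down.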
3. Combine Multiple Monitoring Approaches
Don't rely solely on vmstat; combine it with other tools:
```bash
# Comprehensive monitoring command
{
    echo "=== System Overview ==="
    uptime
    echo "=== Memory Usage ==="
    free -h
    echo "=== vmstat Sample ==="
    vmstat 1 3
    echo "=== Disk Usage ==="
    df -h
} > "system_health_$(date +%Y%m%d_%H%M%S).txt"
```
4. Automate Regular Health Checks
Create automated health check scripts:
```bash
#!/bin/bash
# daily_health_check.sh
ALERT_THRESHOLD_CPU=80
ALERT_THRESHOLD_MEM=90
LOGFILE="/var/log/daily_health_$(date +%Y%m%d).log"
# Get current stats (second vmstat sample; field 15 is the idle percentage)
CPU_USAGE=$(vmstat 1 2 | tail -1 | awk '{print 100-$15}')
MEM_USAGE=$(free | grep Mem | awk '{printf "%.0f", $3/$2 * 100}')
echo "$(date): CPU Usage: ${CPU_USAGE}%, Memory Usage: ${MEM_USAGE}%" >> "$LOGFILE"
# Alert if thresholds exceeded
if (( $(echo "$CPU_USAGE > $ALERT_THRESHOLD_CPU" | bc -l) )); then
    echo "ALERT: High CPU usage detected: ${CPU_USAGE}%" | mail -s "CPU Alert" admin@example.com
fi
if [ "$MEM_USAGE" -gt "$ALERT_THRESHOLD_MEM" ]; then
    echo "ALERT: High memory usage detected: ${MEM_USAGE}%" | mail -s "Memory Alert" admin@example.com
fi
```
5. Document and Share Findings
Maintain documentation of your monitoring findings:
```bash
# Create monitoring reports
vmstat_report() {
    echo "System Performance Report - $(date)"
    echo "=================================="
    echo "System Uptime:"
    uptime
    echo ""
    echo "Memory Summary:"
    free -h
    echo ""
    echo "Current vmstat snapshot:"
    vmstat
    echo ""
    echo "5-minute performance sample:"
    vmstat 10 30
}
```
Advanced Use Cases and Automation
Performance Alerting System
Create an advanced alerting system using vmstat:
```bash
#!/bin/bash
# performance_monitor.sh
CONFIG_FILE="/etc/performance_monitor.conf"
LOG_FILE="/var/log/performance_monitor.log"
# Default thresholds
CPU_THRESHOLD=80
MEMORY_THRESHOLD=85
LOAD_THRESHOLD=5.0
IO_WAIT_THRESHOLD=20
# Load configuration if exists
[ -f "$CONFIG_FILE" ] && source "$CONFIG_FILE"
monitor_performance() {
    while true; do
        # Get current metrics (second sample; field 15 is id, field 16 is wa)
        VMSTAT_OUTPUT=$(vmstat 1 2 | tail -1)
        CPU_IDLE=$(echo "$VMSTAT_OUTPUT" | awk '{print $15}')
        CPU_USAGE=$((100 - CPU_IDLE))
        IO_WAIT=$(echo "$VMSTAT_OUTPUT" | awk '{print $16}')
        LOAD_AVG=$(uptime | awk -F'load average:' '{print $2}' | awk '{print $1}' | sed 's/,//')
        # Memory usage
        MEM_USAGE=$(free | grep Mem | awk '{printf "%.1f", $3/$2 * 100}')
        # Check thresholds and alert
        check_and_alert "CPU" "$CPU_USAGE" "$CPU_THRESHOLD"
        check_and_alert "Memory" "$MEM_USAGE" "$MEMORY_THRESHOLD"
        check_and_alert "I/O Wait" "$IO_WAIT" "$IO_WAIT_THRESHOLD"
        # Log current status
        echo "$(date): CPU:${CPU_USAGE}% MEM:${MEM_USAGE}% IO_WAIT:${IO_WAIT}% LOAD:${LOAD_AVG}" >> "$LOG_FILE"
        sleep 60
    done
}
check_and_alert() {
    local metric="$1"
    local current="$2"
    local threshold="$3"
    if (( $(echo "$current > $threshold" | bc -l) )); then
        alert_message="ALERT: $metric usage is ${current}% (threshold: ${threshold}%)"
        echo "$alert_message" | logger -t performance_monitor
        # Add email notification here if needed
    fi
}
# Start monitoring
monitor_performance
```
Integration with System Monitoring Tools
Integrate vmstat with popular monitoring solutions:
Nagios Plugin Example:
```bash
#!/bin/bash
# check_vmstat.sh - Nagios plugin for vmstat monitoring
WARNING_CPU=70
CRITICAL_CPU=90
WARNING_MEM=80
CRITICAL_MEM=95
# Get vmstat data (second sample; field 15 is the idle percentage)
VMSTAT_DATA=$(vmstat 1 2 | tail -1)
CPU_USAGE=$(echo "$VMSTAT_DATA" | awk '{print 100-$15}')
MEM_TOTAL=$(free | grep Mem | awk '{print $2}')
MEM_USED=$(free | grep Mem | awk '{print $3}')
MEM_USAGE=$(echo "scale=0; $MEM_USED * 100 / $MEM_TOTAL" | bc)
# Check CPU
if (( $(echo "$CPU_USAGE > $CRITICAL_CPU" | bc -l) )); then
    echo "CRITICAL - CPU usage is ${CPU_USAGE}%"
    exit 2
elif (( $(echo "$CPU_USAGE > $WARNING_CPU" | bc -l) )); then
    echo "WARNING - CPU usage is ${CPU_USAGE}%"
    exit 1
fi
# Check Memory
if [ "$MEM_USAGE" -gt "$CRITICAL_MEM" ]; then
    echo "CRITICAL - Memory usage is ${MEM_USAGE}%"
    exit 2
elif [ "$MEM_USAGE" -gt "$WARNING_MEM" ]; then
    echo "WARNING - Memory usage is ${MEM_USAGE}%"
    exit 1
fi
echo "OK - CPU: ${CPU_USAGE}%, Memory: ${MEM_USAGE}%"
exit 0
```
Conclusion and Next Steps
The `vmstat` command is an indispensable tool for Linux system administrators and performance analysts. Its comprehensive reporting capabilities make it essential for understanding system behavior, identifying performance bottlenecks, and maintaining optimal system performance.
Key Takeaways
1. Comprehensive Monitoring: vmstat provides a holistic view of system performance including CPU, memory, I/O, and process information
2. Flexible Usage: From quick snapshots to long-term monitoring, vmstat adapts to various monitoring needs
3. Integration Capabilities: Works well with other monitoring tools and can be integrated into automated monitoring solutions
4. Performance Insights: Proper interpretation of vmstat output enables proactive performance management
Recommended Next Steps
1. Practice Regular Monitoring: Implement regular vmstat monitoring in your daily system administration routine
2. Develop Custom Scripts: Create monitoring scripts tailored to your specific environment and requirements
3. Learn Complementary Tools: Expand your monitoring toolkit with tools like `iostat`, `sar`, `top`, and `htop`
4. Establish Baselines: Document normal performance patterns for your systems to better identify anomalies
5. Automate Alerting: Implement automated alerting systems based on vmstat metrics
Additional Resources
To further enhance your system monitoring skills, consider exploring:
- Advanced shell scripting for automation
- System performance tuning techniques
- Integration with monitoring platforms like Nagios, Zabbix, or Prometheus
- Log analysis and correlation techniques
- Capacity planning methodologies
By mastering `vmstat` and implementing the practices outlined in this guide, you'll be well-equipped to maintain high-performing Linux systems and quickly identify and resolve performance issues as they arise. Remember that effective system monitoring is an ongoing process that requires consistent attention and continuous learning to adapt to evolving system requirements and technologies.