# How to Tune I/O Scheduler in Linux
The I/O scheduler is a critical component of the Linux kernel that determines how input/output operations are queued, prioritized, and dispatched to storage devices. Proper I/O scheduler tuning can dramatically improve system performance, reduce latency, and optimize throughput for specific workloads. This comprehensive guide will walk you through understanding, selecting, and optimizing I/O schedulers for various scenarios, from desktop systems to high-performance servers.
## Table of Contents
1. [Understanding I/O Schedulers](#understanding-io-schedulers)
2. [Prerequisites and Requirements](#prerequisites-and-requirements)
3. [Available I/O Schedulers](#available-io-schedulers)
4. [Checking Current I/O Scheduler](#checking-current-io-scheduler)
5. [Changing I/O Schedulers](#changing-io-schedulers)
6. [Scheduler-Specific Tuning Parameters](#scheduler-specific-tuning-parameters)
7. [Performance Testing and Benchmarking](#performance-testing-and-benchmarking)
8. [Use Case Optimization](#use-case-optimization)
9. [Troubleshooting Common Issues](#troubleshooting-common-issues)
10. [Best Practices and Tips](#best-practices-and-tips)
11. [Advanced Configuration](#advanced-configuration)
12. [Conclusion](#conclusion)
## Understanding I/O Schedulers
I/O schedulers act as intermediaries between applications requesting disk operations and the actual storage hardware. They manage the order in which read and write requests are sent to storage devices, attempting to optimize for factors such as:
- Throughput: Maximum data transfer rate
- Latency: Response time for individual requests
- Fairness: Equal access for competing processes
- Power consumption: Minimizing disk activity for mobile devices
The choice of I/O scheduler significantly impacts system performance, especially under heavy disk usage scenarios. Different schedulers excel in different situations, making proper selection and tuning crucial for optimal system performance.
### How I/O Schedulers Work
When an application requests disk I/O, the request doesn't immediately go to the hardware. Instead, it enters a queue managed by the I/O scheduler. The scheduler then:
1. Queues requests based on its algorithm
2. Merges adjacent requests when possible (observable via the counter snippet after this list)
3. Reorders requests to minimize seek times
4. Prioritizes requests based on process priority or other factors
5. Dispatches requests to the hardware in optimized order
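The merging step is easy to observe from userspace. A minimal sketch, assuming a device named sda (fields 5 and 9 of /proc/diskstats are the kernel's counters of merged reads and merged writes):

```bash
# Print how many requests the scheduler has merged for sda so far
awk '$3 == "sda" { printf "reads merged: %s, writes merged: %s\n", $5, $9 }' /proc/diskstats
```

Run it before and after a sequential workload and the counters should climb noticeably.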
## Prerequisites and Requirements
Before tuning I/O schedulers, ensure you have:
### System Requirements
- Linux kernel 2.6 or later (available schedulers vary by version; kernels 5.0 and later offer only the multi-queue schedulers none, mq-deadline, bfq, and kyber)
- Root or sudo privileges
- Basic understanding of storage devices (HDD vs SSD)
- Familiarity with command-line operations
### Required Tools
```bash
# Install necessary tools for monitoring and testing (Debian/Ubuntu)
sudo apt-get update
sudo apt-get install sysstat iotop hdparm fio

# For Red Hat/CentOS systems
sudo yum install sysstat iotop hdparm fio
```
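A quick sanity check that the tools are in place; each command prints a version string on success:

```bash
# Verify the monitoring and benchmarking tools are installed
iostat -V
fio --version
hdparm -V
```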
### Knowledge Prerequisites
- Understanding of block devices in Linux
- Basic knowledge of file systems
- Familiarity with performance monitoring concepts
## Available I/O Schedulers
Linux offers several I/O schedulers, each designed for specific use cases. The first three below are legacy single-queue schedulers (removed from the kernel in 5.0); the last three are their multi-queue successors:
### 1. CFQ (Completely Fair Queuing)
- Best for: Desktop systems, general-purpose servers
- Characteristics: Provides fairness between processes
- Pros: Good balance of throughput and latency
- Cons: Can be suboptimal for SSDs
### 2. Deadline Scheduler
- Best for: Database servers, real-time applications
- Characteristics: Guarantees maximum latency bounds
- Pros: Excellent for read-heavy workloads
- Cons: May sacrifice some throughput for latency guarantees
### 3. NOOP (No Operation)
- Best for: SSDs, virtualized environments, RAID arrays
- Characteristics: Minimal scheduling overhead
- Pros: Low CPU usage, ideal for random I/O
- Cons: No optimization for traditional HDDs
### 4. BFQ (Budget Fair Queuing)
- Best for: Interactive systems, mobile devices
- Characteristics: Focuses on responsiveness
- Pros: Excellent interactive performance
- Cons: Higher CPU overhead
### 5. mq-deadline (Multi-queue Deadline)
- Best for: Modern multi-core systems with fast storage
- Characteristics: Multi-queue version of deadline
- Pros: Scales well with multiple CPU cores
- Cons: Requires modern hardware for benefits
### 6. Kyber
- Best for: NVMe SSDs, high-performance storage
- Characteristics: Designed for very fast storage devices
- Pros: Low latency for fast storage
- Cons: Limited tuning options
## Checking Current I/O Scheduler
Before making changes, identify your current I/O scheduler configuration:
### View Current Scheduler for All Devices
```bash
# List all block devices and their schedulers
for device in /sys/block/*/queue/scheduler; do
    echo -n "$(basename $(dirname $(dirname $device))): "
    cat $device
done
```
### Check Specific Device
```bash
# Replace 'sda' with your device name
cat /sys/block/sda/queue/scheduler

# Output example: noop deadline [cfq]
# The scheduler in brackets is currently active
```
### View Available Schedulers
```bash
# See all available schedulers for a device
cat /sys/block/sda/queue/scheduler

# List scheduler modules loaded in the kernel
# (schedulers compiled into the kernel will not appear here)
lsmod | grep -E "(cfq|deadline|noop|bfq)"
```
## Changing I/O Schedulers
You can change I/O schedulers temporarily (until reboot) or permanently:
### Temporary Change (Runtime)
```bash
# Change scheduler for a specific device
echo deadline | sudo tee /sys/block/sda/queue/scheduler

# Verify the change
cat /sys/block/sda/queue/scheduler
# Output: noop [deadline] cfq
```
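On kernels that use the multi-queue block layer (the default since kernel 5.0), the legacy names above are not registered; the equivalent choices are none, mq-deadline, bfq, and kyber:

```bash
# Multi-queue equivalent of the change above
echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler
```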
### Permanent Change Methods
#### Method 1: Kernel Boot Parameters
Edit the GRUB configuration to set the scheduler at boot time. Note that the elevator= parameter selects only legacy single-queue schedulers and no longer has any effect on kernels 5.0 and later; on modern systems prefer udev rules or a systemd unit:
```bash
# Edit GRUB configuration
sudo nano /etc/default/grub

# Add or modify GRUB_CMDLINE_LINUX_DEFAULT:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash elevator=deadline"

# Update GRUB (on Red Hat-based systems: sudo grub2-mkconfig -o /boot/grub2/grub.cfg)
sudo update-grub

# Reboot to apply changes
sudo reboot
```
#### Method 2: udev Rules
Create persistent rules for specific devices:
```bash
# Create udev rule file
sudo nano /etc/udev/rules.d/60-ioscheduler.rules

# Add rules for different device types.
# For SSDs, use noop or deadline:
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="noop"
# For HDDs, use cfq or deadline:
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="cfq"

# Reload udev rules
sudo udevadm control --reload-rules
sudo udevadm trigger
```
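After reloading, confirm each device picked up the scheduler you expected:

```bash
# Show the rotational flag and active scheduler for every sd* device
grep . /sys/block/sd*/queue/rotational /sys/block/sd*/queue/scheduler
```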
#### Method 3: systemd Service
Create a systemd service for scheduler management:
```bash
# Create service file
sudo nano /etc/systemd/system/ioscheduler.service

# Add service configuration:
[Unit]
Description=Set I/O Scheduler
After=local-fs.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'echo deadline > /sys/block/sda/queue/scheduler'
ExecStart=/bin/bash -c 'echo noop > /sys/block/sdb/queue/scheduler'

[Install]
WantedBy=multi-user.target

# Enable and start service
sudo systemctl enable ioscheduler.service
sudo systemctl start ioscheduler.service
```
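Then verify that the unit ran cleanly and the settings stuck:

```bash
# Check service status and the resulting schedulers
systemctl status ioscheduler.service --no-pager
cat /sys/block/sda/queue/scheduler /sys/block/sdb/queue/scheduler
```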
## Scheduler-Specific Tuning Parameters
Each scheduler offers tunable parameters for optimization:
### CFQ Scheduler Parameters
```bash
# View current CFQ settings
ls /sys/block/sda/queue/iosched/

# Key CFQ parameters
echo 64 | sudo tee /sys/block/sda/queue/iosched/quantum            # Requests dispatched per queue round
echo 6 | sudo tee /sys/block/sda/queue/iosched/fifo_expire_sync    # Sync request timeout (ms)
echo 42 | sudo tee /sys/block/sda/queue/iosched/fifo_expire_async  # Async request timeout (ms)
echo 300 | sudo tee /sys/block/sda/queue/iosched/slice_sync        # Time slice for sync requests (ms)
echo 40 | sudo tee /sys/block/sda/queue/iosched/slice_async        # Time slice for async requests (ms)
```
### Deadline Scheduler Parameters
```bash
# Deadline scheduler tuning
echo 50 | sudo tee /sys/block/sda/queue/iosched/read_expire     # Read request deadline (ms)
echo 500 | sudo tee /sys/block/sda/queue/iosched/write_expire   # Write request deadline (ms)
echo 16 | sudo tee /sys/block/sda/queue/iosched/writes_starved  # Read batches allowed before a write batch
echo 2 | sudo tee /sys/block/sda/queue/iosched/fifo_batch       # Requests processed per batch
```
### BFQ Scheduler Parameters
```bash
# BFQ scheduler tuning
echo 8 | sudo tee /sys/block/sda/queue/iosched/slice_idle         # Idle time slice (ms)
echo 125 | sudo tee /sys/block/sda/queue/iosched/timeout_sync     # Sync queue timeout (ms)
echo 0 | sudo tee /sys/block/sda/queue/iosched/strict_guarantees  # Disable strict latency guarantees
```
### Queue Depth and Read-Ahead Tuning
```bash
# Adjust queue depth
echo 32 | sudo tee /sys/block/sda/queue/nr_requests

# Tune read-ahead (KB)
echo 128 | sudo tee /sys/block/sda/queue/read_ahead_kb

# Set maximum I/O size per request (KB)
echo 512 | sudo tee /sys/block/sda/queue/max_sectors_kb
```
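Before changing any of these, record the current values so you can revert; a minimal sketch for sda:

```bash
# Print the current value of each tunable touched above
for f in nr_requests read_ahead_kb max_sectors_kb; do
    echo "$f = $(cat /sys/block/sda/queue/$f)"
done
```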
## Performance Testing and Benchmarking
Proper testing is essential to validate scheduler changes:
### Using fio for I/O Testing
```bash
# Random read test
fio --name=random-read --ioengine=libaio --iodepth=32 --rw=randread \
    --bs=4k --direct=1 --size=1G --numjobs=1 --runtime=60 --group_reporting

# Sequential write test
fio --name=sequential-write --ioengine=libaio --iodepth=1 --rw=write \
    --bs=64k --direct=1 --size=1G --numjobs=1 --runtime=60 --group_reporting

# Mixed workload test
fio --name=mixed-workload --ioengine=libaio --iodepth=16 --rw=randrw \
    --rwmixread=70 --bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --group_reporting
```
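For comparing runs programmatically, fio can emit JSON. The sketch below extracts read IOPS and mean completion latency with jq (assumed installed; exact field names vary slightly across fio versions):

```bash
# Machine-readable variant of the random read test
fio --name=random-read --ioengine=libaio --iodepth=32 --rw=randread \
    --bs=4k --direct=1 --size=1G --runtime=60 --output-format=json \
  | jq '.jobs[0].read.iops, .jobs[0].read.lat_ns.mean'
```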
### Monitoring I/O Performance
```bash
# Monitor extended I/O statistics, refreshed every second
iostat -x 1

# Watch I/O in real time (only processes actually doing I/O)
iotop -o

# Detailed block device statistics
cat /proc/diskstats

# Monitor with sar (one-second samples, ten reports)
sar -d 1 10
```
### Creating Test Scripts
```bash
#!/bin/bash
# scheduler-test.sh - Benchmark each scheduler in turn

DEVICE="sda"
SCHEDULERS=("noop" "deadline" "cfq")
TEST_FILE="/tmp/iotest"

for scheduler in "${SCHEDULERS[@]}"; do
    echo "Testing scheduler: $scheduler"
    echo $scheduler | sudo tee /sys/block/$DEVICE/queue/scheduler

    # Run the test against a fixed file so runs are comparable
    fio --name=test --filename="$TEST_FILE" --ioengine=libaio --iodepth=32 \
        --rw=randread --bs=4k --direct=1 --size=100M --numjobs=1 --runtime=30 \
        --group_reporting --output="results_$scheduler.txt"
    sleep 5
done
```
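Once the loop finishes, a quick grep across the result files makes the comparison easy:

```bash
# Pull the IOPS summary line from each scheduler's report
grep -H "IOPS" results_*.txt
```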
## Use Case Optimization
Different workloads require different scheduler optimizations:
### Database Servers
```bash
# Optimize for database workloads
echo deadline | sudo tee /sys/block/sda/queue/scheduler

# Tune deadline parameters for databases (favor low read latency)
echo 5 | sudo tee /sys/block/sda/queue/iosched/read_expire
echo 250 | sudo tee /sys/block/sda/queue/iosched/write_expire
echo 8 | sudo tee /sys/block/sda/queue/iosched/writes_starved

# Increase queue depth for better throughput
echo 64 | sudo tee /sys/block/sda/queue/nr_requests
```
### Web Servers
```bash
# CFQ with optimized parameters for web servers
echo cfq | sudo tee /sys/block/sda/queue/scheduler

# Favor read operations
echo 4 | sudo tee /sys/block/sda/queue/iosched/fifo_expire_sync
echo 25 | sudo tee /sys/block/sda/queue/iosched/fifo_expire_async
echo 200 | sudo tee /sys/block/sda/queue/iosched/slice_sync
```
### SSD Optimization
```bash
# NOOP scheduler for SSDs (use "none" on multi-queue kernels)
echo noop | sudo tee /sys/block/sda/queue/scheduler

# Disable features that only help rotational media
echo 0 | sudo tee /sys/block/sda/queue/rotational
echo 0 | sudo tee /sys/block/sda/queue/add_random

# Optimize read-ahead for SSDs
echo 8 | sudo tee /sys/block/sda/queue/read_ahead_kb
```
### Virtual Machines
```bash
# Optimize for virtualized environments (let the hypervisor schedule I/O)
echo noop | sudo tee /sys/block/vda/queue/scheduler

# Reduce queue depth in VMs
echo 16 | sudo tee /sys/block/vda/queue/nr_requests

# Minimal read-ahead for virtualized storage
echo 32 | sudo tee /sys/block/vda/queue/read_ahead_kb
```
## Troubleshooting Common Issues
### Performance Degradation After Changes
Problem: System becomes slower after scheduler change
Solution:
```bash
# Revert to the original scheduler
echo cfq | sudo tee /sys/block/sda/queue/scheduler

# Check for I/O bottlenecks (accumulated I/O per process)
iotop -a

# Monitor system load
vmstat 1 10
```
### Scheduler Not Available
Problem: Desired scheduler not available
Solution:
```bash
# Check available schedulers
cat /sys/block/sda/queue/scheduler

# Load scheduler module if needed
sudo modprobe bfq

# Verify kernel support for I/O schedulers
grep -i iosched /boot/config-$(uname -r)
```
### High CPU Usage with BFQ
Problem: BFQ causing high CPU utilization
Solution:
```bash
# Switch to a lower-overhead scheduler
echo deadline | sudo tee /sys/block/sda/queue/scheduler

# Reduce BFQ complexity if keeping it
echo 0 | sudo tee /sys/block/sda/queue/iosched/strict_guarantees
```
### Inconsistent Performance
Problem: Performance varies significantly
Solution:
```bash
# Check for competing processes
ps aux --sort=-%cpu | head -10

# Monitor I/O wait
top -b -n1 | grep "Cpu(s)"

# Verify scheduler persistence
cat /sys/block/sda/queue/scheduler
```
## Best Practices and Tips
### General Guidelines
1. Test Before Production: Always benchmark changes in a test environment
2. Monitor Continuously: Use monitoring tools to track performance metrics
3. Document Changes: Keep records of configuration changes and their effects (a snapshot script like the one below helps)
4. Consider Workload Patterns: Match scheduler to actual usage patterns
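A lightweight way to follow guideline 3 is to append a timestamped snapshot of the relevant sysfs settings before each change; a sketch, assuming logs are kept under ~/io-tuning-logs:

```bash
# Append the current queue settings for a device to a per-device log
DEV=sda
mkdir -p ~/io-tuning-logs
{ date; grep . /sys/block/$DEV/queue/{scheduler,nr_requests,read_ahead_kb}; } \
    >> ~/io-tuning-logs/$DEV.log
```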
### Scheduler Selection Guidelines
| Storage Type | Workload | Recommended Scheduler | Alternative |
|--------------|----------|----------------------|-------------|
| HDD | General desktop | CFQ | Deadline |
| HDD | Database | Deadline | CFQ |
| HDD | File server | CFQ | Deadline |
| SSD | Any | NOOP/None | mq-deadline |
| NVMe SSD | High performance | Kyber | mq-deadline |
| VM Storage | Any | NOOP | Deadline |
### Performance Tuning Tips
```bash
#!/bin/bash
# optimal-io-setup.sh - Comprehensive tuning script
# Usage: ./optimal-io-setup.sh <device> <hdd|ssd>

DEVICE=$1
STORAGE_TYPE=$2 # hdd or ssd

if [ "$STORAGE_TYPE" == "ssd" ]; then
    echo "Optimizing for SSD..."
    echo noop | sudo tee /sys/block/$DEVICE/queue/scheduler
    echo 0 | sudo tee /sys/block/$DEVICE/queue/rotational
    echo 8 | sudo tee /sys/block/$DEVICE/queue/read_ahead_kb
    echo 1 | sudo tee /sys/block/$DEVICE/queue/nomerges
elif [ "$STORAGE_TYPE" == "hdd" ]; then
    echo "Optimizing for HDD..."
    echo deadline | sudo tee /sys/block/$DEVICE/queue/scheduler
    echo 1 | sudo tee /sys/block/$DEVICE/queue/rotational
    echo 256 | sudo tee /sys/block/$DEVICE/queue/read_ahead_kb
    echo 0 | sudo tee /sys/block/$DEVICE/queue/nomerges
fi

echo "Optimization complete for $DEVICE ($STORAGE_TYPE)"
```
### Monitoring and Alerting
```bash
#!/bin/bash
# io-monitor.sh - Alert when device utilization stays high

while true; do
    # %util is the last column of extended iostat output
    UTIL=$(iostat -x 1 2 | awk '/^sd/ {print $NF}' | tail -1)
    if (( $(echo "$UTIL > 80" | bc -l) )); then
        echo "$(date): High I/O utilization detected: $UTIL%"
        # Add alerting logic here
    fi
    sleep 60
done
```
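To keep the monitor running after logout, launch it detached (adjust the log path if not running as root):

```bash
# Run in the background; alerts accumulate in the log file
chmod +x io-monitor.sh
nohup ./io-monitor.sh >> /var/log/io-monitor.log 2>&1 &
```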
## Advanced Configuration
### Multi-Queue Block Layer
For modern systems with multiple CPU cores:
```bash
# Enable multi-queue support (on many kernels this is only honored at boot
# via the kernel parameter scsi_mod.use_blk_mq=1; kernels 5.0+ always use blk-mq)
echo Y | sudo tee /sys/module/scsi_mod/parameters/use_blk_mq

# Check multi-queue status
cat /sys/block/sda/queue/scheduler
# Should show mq-deadline, kyber, bfq, or none
```
### NUMA Considerations
For NUMA systems, consider CPU affinity:
```bash
# Check NUMA topology
numactl --hardware

# Set CPU affinity for I/O-intensive processes
numactl --cpunodebind=0 --membind=0 your_io_intensive_app
```
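The same pinning applies to benchmarking: running fio bound to a single node keeps cross-node memory traffic from skewing scheduler comparisons:

```bash
# Run the earlier random read test pinned to NUMA node 0
numactl --cpunodebind=0 --membind=0 fio --name=numa-test --ioengine=libaio \
    --iodepth=32 --rw=randread --bs=4k --direct=1 --size=1G --runtime=30
```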
### Container Optimization
For containerized environments:
```bash
# Run a Docker container with per-device I/O limits
docker run --device-read-iops /dev/sda:1000 \
--device-write-iops /dev/sda:800 \
--device-read-bps /dev/sda:50mb \
your_container
```
### Automated Tuning with Tuned
Use the tuned daemon for automatic optimization:
```bash
# Install tuned
sudo apt-get install tuned

# List available profiles
tuned-adm list

# Apply throughput-performance profile
sudo tuned-adm profile throughput-performance

# Create custom profile
sudo mkdir /etc/tuned/custom-io
sudo nano /etc/tuned/custom-io/tuned.conf
```
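A minimal custom profile might inherit from throughput-performance and override only the disk settings. A sketch, assuming tuned's standard disk plugin (its elevator and readahead options set the scheduler and read-ahead):

```bash
# Write a custom profile that extends throughput-performance
sudo tee /etc/tuned/custom-io/tuned.conf > /dev/null <<'EOF'
[main]
include=throughput-performance

[disk]
elevator=mq-deadline
readahead=256
EOF

# Activate the profile
sudo tuned-adm profile custom-io
```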
## Conclusion
I/O scheduler tuning is a powerful technique for optimizing Linux system performance. The key to successful tuning lies in understanding your specific workload requirements, storage hardware characteristics, and system constraints.
### Key Takeaways
1. No Universal Solution: Different schedulers excel in different scenarios
2. Testing is Critical: Always benchmark changes before production deployment
3. Monitor Continuously: Performance can change over time with workload evolution
4. Consider the Whole Stack: I/O scheduling is just one part of storage optimization
### Next Steps
After implementing I/O scheduler tuning:
1. Expand Monitoring: Implement comprehensive I/O monitoring
2. Explore File System Tuning: Optimize file system parameters
3. Consider Storage Hardware: Evaluate storage hardware upgrades
4. Learn Advanced Topics: Study kernel I/O subsystem internals
5. Automate Management: Develop scripts for consistent configuration
### Additional Resources
- Linux kernel documentation on block layer
- Storage vendor optimization guides
- Performance analysis tools and techniques
- Community forums and mailing lists for specific schedulers
Remember that I/O scheduler tuning is an iterative process. Start with conservative changes, measure their impact, and gradually refine your configuration based on observed performance characteristics. With proper understanding and careful implementation, I/O scheduler tuning can provide significant performance improvements for your Linux systems.
The investment in learning and implementing proper I/O scheduler tuning pays dividends in improved system responsiveness, better resource utilization, and enhanced user experience across all types of Linux deployments, from embedded systems to enterprise servers.