How to scrub/balance → btrfs scrub start /mnt; btrfs balance start /mnt

# How to Scrub and Balance Btrfs Filesystems: Complete Guide to `btrfs scrub start` and `btrfs balance start`

## Table of Contents

1. [Introduction](#introduction)
2. [Prerequisites](#prerequisites)
3. [Understanding Btrfs Scrub Operations](#understanding-btrfs-scrub-operations)
4. [Understanding Btrfs Balance Operations](#understanding-btrfs-balance-operations)
5. [Step-by-Step Guide to Scrubbing](#step-by-step-guide-to-scrubbing)
6. [Step-by-Step Guide to Balancing](#step-by-step-guide-to-balancing)
7. [Practical Examples and Use Cases](#practical-examples-and-use-cases)
8. [Monitoring and Managing Operations](#monitoring-and-managing-operations)
9. [Common Issues and Troubleshooting](#common-issues-and-troubleshooting)
10. [Best Practices and Professional Tips](#best-practices-and-professional-tips)
11. [Advanced Configuration Options](#advanced-configuration-options)
12. [Performance Considerations](#performance-considerations)
13. [Conclusion](#conclusion)

## Introduction

Maintaining the integrity and performance of a Btrfs filesystem requires regular maintenance operations, with scrubbing and balancing being two of the most critical tasks. The `btrfs scrub start` and `btrfs balance start` commands are essential tools for ensuring your Btrfs filesystem remains healthy, performs optimally, and maintains data integrity over time.

This comprehensive guide will teach you how to properly execute these maintenance operations, understand their purposes, monitor their progress, and troubleshoot common issues. Whether you're a system administrator managing production servers or a home user with a Btrfs setup, mastering these commands is crucial for long-term filesystem health.

By the end of this article, you'll have a thorough understanding of when, why, and how to use these commands effectively, along with the knowledge to avoid common pitfalls and optimize your maintenance routines.

## Prerequisites

Before proceeding with Btrfs scrub and balance operations, ensure you have:

### System Requirements

- A Linux system with a mounted Btrfs filesystem
- Root or sudo privileges
- The btrfs-progs package installed (named btrfs-tools on some older distributions)
- Sufficient unallocated space (especially for balance operations)
- Stable power supply (UPS recommended for critical systems)

### Knowledge Requirements

- Basic understanding of the Linux command line
- Familiarity with filesystem concepts
- Understanding of your current Btrfs configuration

### Verification Commands

Check your Btrfs installation and mounted filesystems:

```bash
# Verify btrfs tools installation
btrfs --version

# List mounted Btrfs filesystems
mount | grep btrfs

# Check filesystem information
btrfs filesystem show
```

## Understanding Btrfs Scrub Operations

### What is Btrfs Scrub?

Btrfs scrub is a maintenance operation that reads all data and metadata from the filesystem and verifies checksums to detect corruption. Where a redundant copy exists (for example with the DUP or RAID1 profiles), scrub repairs the damaged block from the good copy. Unlike traditional filesystem check tools that work on unmounted filesystems, Btrfs scrub operates on live, mounted filesystems without disrupting normal operations.

### Key Benefits of Scrubbing

1. Data Integrity Verification: Validates checksums for all data blocks
2. Silent Corruption Detection: Identifies bit rot and storage device errors
3. Automatic Repair: Fixes correctable errors using redundant copies
4. Proactive Maintenance: Prevents data loss before it becomes critical
5. Online Operation: Works while the filesystem is actively used

### When to Perform Scrub Operations

- Regular Schedule: Monthly or quarterly for most systems
- After Hardware Changes: Following disk replacements or additions
- Storage Concerns: When suspecting hardware issues
- Post-Incident: After power outages or system crashes
- Before Critical Operations: Prior to major system changes

## Understanding Btrfs Balance Operations

### What is Btrfs Balance?

Btrfs balance is a maintenance operation that rewrites data and metadata chunks and redistributes them across the available devices. It can also convert between different RAID profiles, return unused allocated space to the unallocated pool, and even out usage across the devices of a multi-device filesystem.

### Key Benefits of Balancing

1. Space Reclamation: Frees up unused allocated space
2. RAID Conversion: Changes RAID levels without data loss
3. Performance Optimization: Redistributes data for better access patterns
4. Device Utilization: Balances data across multiple devices
5. Chunk Optimization: Consolidates partially filled chunks

### When to Perform Balance Operations

- Space Issues: When seeing "No space left on device" despite available space
- RAID Changes: Converting between RAID levels
- Device Management: After adding or removing devices
- Performance Tuning: Optimizing data distribution
- Regular Maintenance: Periodic cleanup of filesystem structures

## Step-by-Step Guide to Scrubbing

### Basic Scrub Operation

The simplest form of the scrub command is:

```bash
sudo btrfs scrub start /mnt
```

### Detailed Scrub Process

#### Step 1: Check Current Status

Before starting a scrub, verify no other scrub is running:

```bash
sudo btrfs scrub status /mnt
```

#### Step 2: Start the Scrub Operation

```bash
# Start a scrub in the background (the default)
sudo btrfs scrub start /mnt

# Run in the foreground and print statistics when finished
sudo btrfs scrub start -B /mnt

# Read-only mode: report errors without repairing them
sudo btrfs scrub start -r /mnt
```

#### Step 3: Monitor Progress

```bash
# Check scrub progress
sudo btrfs scrub status /mnt

# Watch progress continuously
watch -n 5 'sudo btrfs scrub status /mnt'
```

### Advanced Scrub Options

```bash
# Set the I/O priority class (-c 3 = idle); note that -c is not a bandwidth limit
sudo btrfs scrub start -B -c 3 /mnt

# Scrub a single device of the filesystem
sudo btrfs scrub start /dev/sdb1

# Resume an interrupted scrub
sudo btrfs scrub resume /mnt
```

## Step-by-Step Guide to Balancing

### Basic Balance Operation

The basic balance command rewrites every chunk in the filesystem:

```bash
sudo btrfs balance start /mnt
```
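
Because the plain command rewrites every chunk, maintenance scripts sometimes wrap it so that an unfiltered run cannot start by accident. The sketch below is one way to do that and is not a btrfs-progs feature: `guarded_balance` and the `DRY_RUN` variable are hypothetical names introduced here, and the helper only accepts filter arguments of the form `-d...` or `-m...` (the usage filters covered in the steps that follow). By default it prints the command instead of executing it.

```bash
#!/usr/bin/env bash
# guarded_balance is a hypothetical helper, not part of btrfs-progs.
# It refuses to build a balance command unless at least one data (-d...)
# or metadata (-m...) filter is supplied after the mount point.
# Usage: guarded_balance <mount-point> <filter> [more filters...]
guarded_balance() {
    if [ "$#" -lt 2 ]; then
        echo "refusing to run an unfiltered balance" >&2
        return 1
    fi
    local mount_point="$1"
    shift

    # Reject bare -d / -m (which select all chunks) and anything unexpected.
    local arg
    for arg in "$@"; do
        case "$arg" in
            -d?*|-m?*) ;;
            *) echo "unexpected argument: $arg" >&2; return 1 ;;
        esac
    done

    # DRY_RUN defaults to 1, which only prints the command.
    # Set DRY_RUN=0 to actually start the balance.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "btrfs balance start $* $mount_point"
    else
        btrfs balance start "$@" "$mount_point"
    fi
}
```

For example, `guarded_balance /mnt -dusage=50` prints `btrfs balance start -dusage=50 /mnt`, and prefixing the call with `DRY_RUN=0` runs it. With no filter the helper refuses, which leaves the unfiltered `btrfs balance start /mnt` as a deliberate, manual step.
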
Warning: A full balance can take a very long time and consume significant system resources.

### Recommended Balance Approach

#### Step 1: Check Filesystem Usage

```bash
sudo btrfs filesystem usage /mnt
```

#### Step 2: Start Selective Balance

Instead of balancing everything, target specific chunks:

```bash
# Balance only data chunks that are less than 50% full
sudo btrfs balance start -dusage=50 /mnt

# Balance metadata chunks less than 75% full
sudo btrfs balance start -musage=75 /mnt

# Balance system chunks (acting on system chunks requires -f)
sudo btrfs balance start -f -susage=50 /mnt
```

#### Step 3: Monitor Balance Progress

```bash
# Check balance status
sudo btrfs balance status /mnt

# Show detailed progress
sudo btrfs balance status -v /mnt
```

### Advanced Balance Operations

```bash
# Convert data to a different RAID profile
sudo btrfs balance start -dconvert=raid1 /mnt

# Balance only data chunks that have a stripe on device ID 2
sudo btrfs balance start -ddevid=2 /mnt

# Process at most 10 data chunks
sudo btrfs balance start -dlimit=10 /mnt
```

## Practical Examples and Use Cases

### Example 1: Regular Maintenance Routine

```bash
#!/bin/bash
# Monthly maintenance script (run as root)

MOUNT_POINT="/home"
LOG_FILE="/var/log/btrfs-maintenance.log"

echo "$(date): Starting Btrfs maintenance for $MOUNT_POINT" >> "$LOG_FILE"

# Start scrub and wait for it to finish (-B)
echo "Starting scrub operation..." >> "$LOG_FILE"
btrfs scrub start -B "$MOUNT_POINT"

# Check scrub results
SCRUB_STATUS=$(btrfs scrub status "$MOUNT_POINT")
echo "$SCRUB_STATUS" >> "$LOG_FILE"

# Selective balance for space optimization
echo "Starting balance operation..." >> "$LOG_FILE"
btrfs balance start -dusage=75 -musage=85 "$MOUNT_POINT"

echo "$(date): Maintenance completed" >> "$LOG_FILE"
```

### Example 2: Post-Hardware Change Maintenance

```bash
# After adding a new drive to a RAID array
MOUNT_POINT="/data"

# First, add the device
sudo btrfs device add /dev/sdc "$MOUNT_POINT"

# Convert data and metadata to raid1, which also spreads chunks across all devices
sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 "$MOUNT_POINT"

# Scrub to verify integrity across all devices
sudo btrfs scrub start "$MOUNT_POINT"
```

### Example 3: Space Reclamation

```bash
# When facing space issues
MOUNT_POINT="/var"

# Check current usage
sudo btrfs filesystem usage "$MOUNT_POINT"

# Balance to reclaim space
sudo btrfs balance start -dusage=50 "$MOUNT_POINT"

# Check results
sudo btrfs filesystem usage "$MOUNT_POINT"
```

## Monitoring and Managing Operations

### Real-Time Monitoring

```bash
#!/bin/bash
# Monitoring script (run as root)

MOUNT_POINT="/mnt"

while true; do
    clear
    echo "=== Btrfs Operations Status ==="
    echo "Scrub Status:"
    btrfs scrub status "$MOUNT_POINT"
    echo
    echo "Balance Status:"
    btrfs balance status "$MOUNT_POINT"
    echo
    echo "Filesystem Usage:"
    btrfs filesystem usage "$MOUNT_POINT"
    sleep 30
done
```

### Canceling Operations

```bash
# Cancel running scrub
sudo btrfs scrub cancel /mnt

# Pause balance operation
sudo btrfs balance pause /mnt

# Resume paused balance
sudo btrfs balance resume /mnt

# Cancel balance operation
sudo btrfs balance cancel /mnt
```

### System Resource Management

```bash
# Lower I/O priority with ionice (honored only by I/O schedulers that support priorities, such as BFQ)
sudo ionice -c 3 btrfs balance start -dusage=50 /mnt

# Use nice to lower CPU priority
sudo nice -n 19 btrfs scrub start /mnt
```

## Common Issues and Troubleshooting

### Issue 1: "No Space Left on Device" Error

Problem: Balance fails due to insufficient space.

Solution:

```bash
# Check actual usage vs allocated
sudo btrfs filesystem usage /mnt

# Start with a very selective balance
sudo btrfs balance start -dusage=5 /mnt

# Gradually increase the usage percentage
sudo btrfs balance start -dusage=25 /mnt
```

### Issue 2: Scrub Finds Uncorrectable Errors

Problem: Scrub reports errors that cannot be automatically fixed.

Solution:

```bash
# Check detailed per-device scrub status
sudo btrfs scrub status -d /mnt

# Identify affected files: the kernel log records details of each checksum error
sudo dmesg | grep -i btrfs

# Check filesystem structures (run only on an unmounted filesystem)
sudo btrfs check --readonly /dev/sdX

# Last resort: copy files off a filesystem that no longer mounts
# (/path/to/recovery is a placeholder on a different filesystem)
sudo btrfs restore /dev/sdX /path/to/recovery
```

### Issue 3: Balance Operation Stuck

Problem: Balance appears to hang or make no progress.

Solution:

```bash
# Check if balance is actually running
sudo btrfs balance status -v /mnt

# If stuck, pause and resume
sudo btrfs balance pause /mnt
sudo btrfs balance resume /mnt

# If still stuck, cancel and try a selective approach
sudo btrfs balance cancel /mnt
sudo btrfs balance start -dusage=10,limit=5 /mnt
```

### Issue 4: High System Load During Operations

Problem: Scrub or balance causes system performance issues.

Solution:

```bash
# Use resource limiting
sudo ionice -c 3 nice -n 19 btrfs scrub start /mnt

# Implement throttling in scripts
for usage in 10 20 30 40 50; do
    sudo btrfs balance start "-dusage=${usage},limit=3" /mnt
    sleep 300  # Wait 5 minutes between operations
done
```

### Issue 5: Device Errors During Operations

Problem: Storage device errors interrupt maintenance operations.

Solution:

```bash
# Check kernel logs
sudo dmesg | grep -i error
sudo journalctl -k | grep -i btrfs

# Verify device health
sudo smartctl -a /dev/sdX

# If the device is failing, replace it immediately
sudo btrfs replace start /dev/old_device /dev/new_device /mnt
```

## Best Practices and Professional Tips

### Scheduling Maintenance

1. Regular Scrubs: Schedule monthly scrubs during low-usage periods
2. Selective Balancing: Use usage-based filters instead of full balance
3. Monitoring: Implement automated monitoring and alerting
4. Documentation: Keep logs of all maintenance operations

### Resource Management

```bash
# Create a systemd template service for controlled maintenance (run as root)
cat > /etc/systemd/system/btrfs-scrub@.service << 'EOF'
[Unit]
Description=Btrfs scrub on %f
After=local-fs.target

[Service]
Type=oneshot
# %f expands the instance name to a path, e.g. btrfs-scrub@mnt.service -> /mnt
ExecStart=/bin/bash -c 'ionice -c 3 nice -n 19 btrfs scrub start -B %f'
IOSchedulingClass=idle
CPUSchedulingPolicy=idle
EOF
```

### Automation Scripts

```bash
#!/bin/bash
# Professional maintenance script with error handling (run as root)

MOUNT_POINTS=("/home" "/var" "/data")
LOG_FILE="/var/log/btrfs-maintenance.log"
EMAIL_ALERT="admin@example.com"

log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S'): $1" | tee -a "$LOG_FILE"
}

send_alert() {
    echo "$1" | mail -s "Btrfs Maintenance Alert" "$EMAIL_ALERT"
}

for mount_point in "${MOUNT_POINTS[@]}"; do
    if [ -d "$mount_point" ]; then
        log_message "Starting maintenance for $mount_point"

        # Scrub operation (-B waits for completion; a non-zero exit means failure or uncorrectable errors)
        if btrfs scrub start -B "$mount_point"; then
            scrub_result=$(btrfs scrub status "$mount_point")
            log_message "Scrub completed: $scrub_result"

            # Check for errors by summing the raw uncorrectable_errors counter printed by -R
            uncorrectable=$(btrfs scrub status -R "$mount_point" |
                awk '$1 == "uncorrectable_errors:" { n += $2 } END { print n + 0 }')
            if [ "$uncorrectable" -gt 0 ]; then
                send_alert "Uncorrectable errors found in $mount_point"
            fi
        else
            log_message "Scrub failed for $mount_point"
            send_alert "Scrub failed for $mount_point"
        fi

        # Selective balance
        if btrfs balance start -dusage=75 -musage=85 "$mount_point"; then
            log_message "Balance completed for $mount_point"
        else
            log_message "Balance failed for $mount_point"
        fi
    fi
done
```

### Performance Optimization

1. Timing: Run maintenance during off-peak hours
2. Resource Limits: Use ionice and nice for I/O and CPU limiting
3. Incremental Approach: Use selective balance with usage filters
4. Monitoring: Track system performance during operations

### Safety Measures

1. Backup First: Always ensure current backups before maintenance
2. Test Environment: Validate scripts in non-production environments
3. Power Protection: Use UPS for critical systems
4. Documentation: Maintain detailed logs and procedures

## Advanced Configuration Options

### Custom Balance Strategies

```bash
# Usage-based balancing, capped at 10 chunks
sudo btrfs balance start -dusage=50,limit=10 /mnt

# Device-specific operations
sudo btrfs balance start -ddevid=1,limit=5 /mnt

# Combined filters
sudo btrfs balance start -dusage=75,devid=2,limit=3 /mnt
```

### Scrub Optimization

```bash
# Scrub individual devices in parallel (a scrub of the mount point already covers all devices)
for device in /dev/sd{b,c,d}1; do
    sudo btrfs scrub start -B "$device" &
done
wait

# Lower-priority scrub: -c sets the I/O priority class (3 = idle), not a bandwidth limit
sudo btrfs scrub start -B -c 3 /mnt

# On kernels 5.14 and later, throughput can be capped per device through sysfs.
# <UUID> and <devid> are placeholders; the value is in bytes per second (here 1 MiB/s).
echo 1048576 | sudo tee "/sys/fs/btrfs/<UUID>/devinfo/<devid>/scrub_speed_max"
```

### Integration with Monitoring Systems

```bash
#!/bin/bash
# Nagios/Icinga check script

MOUNT_POINT="$1"
# -R prints raw counters such as corrected_errors and uncorrectable_errors
SCRUB_STATS=$(btrfs scrub status -R "$MOUNT_POINT" 2>/dev/null)

# Sum one raw counter across all reported devices
count() {
    echo "$SCRUB_STATS" | awk -v key="$1:" '$1 == key { n += $2 } END { print n + 0 }'
}

if [ "$(count uncorrectable_errors)" -gt 0 ]; then
    echo "CRITICAL: Uncorrectable errors found"
    exit 2
elif [ "$(count corrected_errors)" -gt 0 ]; then
    echo "WARNING: Corrected errors found"
    exit 1
else
    echo "OK: No errors detected"
    exit 0
fi
```

## Performance Considerations

### Impact Assessment

1. I/O Load: Both operations are I/O intensive
2. CPU Usage: Checksum verification requires CPU resources
3. Memory Usage: Metadata operations consume RAM
4. Network: May affect network storage performance

### Optimization Strategies

```bash
#!/bin/bash
# Staggered execution

FILESYSTEMS=("/home" "/var" "/data")

for fs in "${FILESYSTEMS[@]}"; do
    echo "Processing $fs..."
    # -B keeps the scrub in the foreground, so the loop waits for completion
    ionice -c 3 nice -n 19 btrfs scrub start -B "$fs"

    # Brief pause between filesystems
    sleep 300
done
```

### Resource Monitoring

```bash
#!/bin/bash
# Monitor system resources during operations

LOG_FILE="/var/log/btrfs-performance.log"

while true; do
    {
        echo "Timestamp: $(date)"
        echo "Load Average: $(uptime | awk -F'load average:' '{print $2}')"
        echo "Memory Usage: $(free -h | grep Mem)"
        echo "I/O Stats: $(iostat -x 1 1 | tail -n +4)"
        echo "---"
    } >> "$LOG_FILE"
    sleep 60
done
```

## Conclusion

Mastering Btrfs scrub and balance operations is essential for maintaining a healthy, high-performing filesystem. The `btrfs scrub start` and `btrfs balance start` commands provide powerful tools for ensuring data integrity and optimizing space utilization, but they must be used thoughtfully and systematically.

Key takeaways from this comprehensive guide:

1. Regular Maintenance: Implement scheduled scrub operations to detect and prevent data corruption
2. Selective Balancing: Use targeted balance operations instead of full filesystem balance to minimize resource impact
3. Monitoring: Continuously monitor operations and system performance to ensure smooth execution
4. Resource Management: Employ proper resource limiting to prevent maintenance operations from impacting system performance
5. Error Handling: Implement robust error detection and alerting mechanisms for proactive issue resolution

Remember that both scrub and balance operations can be resource-intensive and time-consuming. Always plan these operations during maintenance windows, ensure adequate system resources, and maintain current backups before beginning any maintenance procedures.

By following the best practices, monitoring strategies, and troubleshooting techniques outlined in this guide, you'll be well-equipped to maintain your Btrfs filesystems effectively and prevent common issues before they become critical problems. Regular, well-planned maintenance using these tools will ensure your Btrfs filesystems continue to provide reliable, high-performance storage for years to come. The investment in proper maintenance procedures will pay dividends in system reliability, data integrity, and overall performance. Start implementing these practices gradually, beginning with less critical systems to gain experience before applying them to production environments.
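
The scheduled scrubs recommended throughout this guide can be driven by a systemd timer paired with the `btrfs-scrub@.service` template from the Resource Management section. What follows is a minimal sketch rather than a definitive setup: the helper name `write_scrub_timer`, the `monthly` default, and the one-hour randomized delay are illustrative assumptions, and it presumes the service resolves its instance name to a mount path (for example through systemd's `%f` specifier).

```bash
#!/usr/bin/env bash
# write_scrub_timer is a hypothetical helper that writes a template timer
# named btrfs-scrub@.timer, which activates btrfs-scrub@.service for the
# same instance.
#   $1: unit directory (assumed default: /etc/systemd/system)
#   $2: OnCalendar schedule (assumed default: monthly)
write_scrub_timer() {
    local unit_dir="${1:-/etc/systemd/system}"
    local schedule="${2:-monthly}"

    cat > "${unit_dir}/btrfs-scrub@.timer" << EOF
[Unit]
Description=Periodic Btrfs scrub on %i

[Timer]
OnCalendar=${schedule}
# Run a missed scrub at the next boot
Persistent=true
# Spread start times so several filesystems do not start together
RandomizedDelaySec=1h

[Install]
WantedBy=timers.target
EOF
}
```

To try it, run `write_scrub_timer` as root, reload systemd with `systemctl daemon-reload`, and enable one instance per filesystem, for example `systemctl enable --now "btrfs-scrub@$(systemd-escape -p /mnt).timer"`.
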