How to view disk usage with du

How to View Disk Usage with du The `du` (disk usage) command is one of the most essential tools in Linux and Unix systems for monitoring and analyzing disk space consumption. Whether you're a system administrator managing server storage, a developer optimizing application resources, or a regular user trying to free up space on your personal computer, understanding how to effectively use the `du` command is crucial for maintaining system performance and preventing storage-related issues. This comprehensive guide will walk you through everything you need to know about the `du` command, from basic usage to advanced techniques that will help you become proficient in disk space analysis and management. Table of Contents 1. [Prerequisites and Requirements](#prerequisites-and-requirements) 2. [Understanding the du Command](#understanding-the-du-command) 3. [Basic du Command Syntax](#basic-du-command-syntax) 4. [Essential du Options and Flags](#essential-du-options-and-flags) 5. [Step-by-Step Usage Examples](#step-by-step-usage-examples) 6. [Advanced du Techniques](#advanced-du-techniques) 7. [Practical Use Cases](#practical-use-cases) 8. [Common Issues and Troubleshooting](#common-issues-and-troubleshooting) 9. [Best Practices and Professional Tips](#best-practices-and-professional-tips) 10. [Performance Considerations](#performance-considerations) 11. [Integration with Other Commands](#integration-with-other-commands) 12. [Conclusion and Next Steps](#conclusion-and-next-steps) Prerequisites and Requirements Before diving into the `du` command, ensure you have the following: System Requirements - A Linux, Unix, macOS, or Unix-like operating system - Access to a terminal or command-line interface - Basic familiarity with command-line navigation - Understanding of file system concepts (directories, files, paths) Permissions Considerations - Read permissions for directories you want to analyze - Some system directories may require elevated privileges (sudo) - Understanding of file ownership and permission concepts Knowledge Prerequisites - Basic command-line navigation skills - Familiarity with file system hierarchy - Understanding of storage units (bytes, KB, MB, GB, TB) Understanding the du Command The `du` command stands for "disk usage" and is designed to estimate file space usage within directories and subdirectories. Unlike the `df` command, which shows filesystem-level disk usage, `du` provides detailed information about individual directories and files, making it invaluable for identifying space consumption patterns and locating large files or directories. How du Works The `du` command traverses directory structures recursively, calculating the total size of files and subdirectories. It reads file metadata to determine sizes and presents the information in various formats depending on the options specified. Key Differences from Other Disk Commands - du vs df: `df` shows filesystem-level usage, while `du` shows directory-level usage - du vs ls: `ls -l` shows individual file sizes, while `du` shows cumulative directory sizes - du vs find: `find` can locate files by criteria, while `du` focuses specifically on size analysis Basic du Command Syntax The fundamental syntax for the `du` command is: ```bash du [OPTIONS] [DIRECTORY/FILE] ``` Default Behavior When run without any options or arguments, `du` displays the disk usage of the current directory and all its subdirectories: ```bash du ``` This command will output something like: ``` 4 ./subdirectory1 8 ./subdirectory2/nested 12 ./subdirectory2 24 . ``` The numbers represent disk usage in kilobytes (KB) by default, and the rightmost column shows the directory path. Essential du Options and Flags Understanding the various options available with `du` is crucial for effective disk usage analysis. Here are the most important flags: Size Display Options -h, --human-readable Displays sizes in human-readable format (K, M, G, T): ```bash du -h ``` Output example: ``` 4.0K ./documents 1.2M ./images 15G ./videos 16G . ``` -k, --kilobytes Forces output in kilobytes (default behavior): ```bash du -k /home/user/documents ``` -m, --megabytes Displays output in megabytes: ```bash du -m /var/log ``` -g, --gigabytes Shows output in gigabytes: ```bash du -g /home ``` Depth and Recursion Control -s, --summarize Shows only the total size for specified directories: ```bash du -sh /home/user ``` Output: ``` 2.5G /home/user ``` -d, --max-depth=N Limits the depth of directory traversal: ```bash du -h -d 1 /var ``` This shows only the immediate subdirectories of `/var` without going deeper. --max-depth=0 Equivalent to `-s`, shows only the total: ```bash du -h --max-depth=0 /home/user ``` Filtering and Selection Options -a, --all Includes individual files in the output, not just directories: ```bash du -ah /etc/ssh ``` -c, --total Adds a grand total at the end: ```bash du -ch /home/user/documents ``` Output: ``` 4.0K /home/user/documents/file1.txt 8.0K /home/user/documents/file2.pdf 12K /home/user/documents 12K total ``` --exclude=PATTERN Excludes files and directories matching the specified pattern: ```bash du -h --exclude="*.tmp" /var/log ``` -x, --one-file-system Stays within the same filesystem, avoiding mounted directories: ```bash du -x / ``` Sorting and Output Options -0, --null Ends each output line with a null character instead of newline (useful for scripting): ```bash du -0 /home | sort -zn ``` --time Shows the last modification time for directories: ```bash du -h --time /var/log ``` Step-by-Step Usage Examples Let's explore practical examples that demonstrate how to use `du` effectively in real-world scenarios. Example 1: Basic Directory Analysis To analyze disk usage in your home directory: ```bash Step 1: Navigate to your home directory cd ~ Step 2: Check total usage du -sh . Step 3: See breakdown of immediate subdirectories du -sh */ Step 4: Get detailed view with human-readable sizes du -h --max-depth=2 ``` Example 2: Finding Large Directories To identify the largest directories in a system: ```bash Find top 10 largest directories in /var du -h /var 2>/dev/null | sort -hr | head -10 Alternative using numeric sort for more accuracy du -k /var 2>/dev/null | sort -nr | head -10 | awk '{print $2}' | xargs du -sh ``` Example 3: Analyzing Log Files System administrators often need to monitor log file sizes: ```bash Check total log directory size du -sh /var/log Analyze individual log files and subdirectories du -ah /var/log | sort -hr | head -20 Monitor specific log types du -h /var/log/*.log 2>/dev/null ``` Example 4: User Directory Analysis For multi-user systems, analyzing user directories: ```bash Check all user directories (requires appropriate permissions) sudo du -sh /home/* Detailed analysis of a specific user du -h --max-depth=3 /home/username Find users consuming the most space sudo du -s /home/* | sort -nr | head -5 ``` Advanced du Techniques Combining du with Other Commands Using du with find for Complex Queries ```bash Find directories larger than 1GB find / -type d -exec du -s {} \; 2>/dev/null | awk '$1 > 1048576 {print $2}' | xargs du -sh Find files larger than 100MB within a directory find /home/user -type f -size +100M -exec du -h {} \; ``` Creating Custom Disk Usage Reports ```bash #!/bin/bash disk_usage_report.sh echo "=== Disk Usage Report ===" echo "Date: $(date)" echo "" echo "Top 10 Largest Directories:" du -h /home 2>/dev/null | sort -hr | head -10 echo "" echo "Total Home Directory Usage:" du -sh /home ``` Regular Expression Patterns with du ```bash Exclude multiple file types du -h --exclude=".log" --exclude=".tmp" --exclude="cache" /var Complex exclusion patterns du -h --exclude-from=exclude_list.txt /home/user ``` Where `exclude_list.txt` contains: ``` *.tmp *.log cache/ .git/ node_modules/ ``` Time-Based Analysis ```bash Show modification times with sizes du -h --time=mtime /var/log Sort by modification time du -h --time /home/user | sort -k2 ``` Practical Use Cases System Administration Scenarios Server Maintenance ```bash Daily disk usage check script #!/bin/bash THRESHOLD=80 USAGE=$(df / | awk 'NR==2 {print $5}' | sed 's/%//') if [ $USAGE -gt $THRESHOLD ]; then echo "Disk usage is ${USAGE}% - investigating large directories" du -h / --max-depth=2 | sort -hr | head -10 fi ``` Database Directory Monitoring ```bash Monitor database growth du -sh /var/lib/mysql/*/ | sort -hr Track changes over time echo "$(date): $(du -sh /var/lib/mysql)" >> db_growth.log ``` Development Environment Management Project Size Analysis ```bash Analyze project directories find ~/projects -maxdepth 1 -type d -exec du -sh {} \; | sort -hr Exclude development artifacts du -h --exclude="node_modules" --exclude=".git" --exclude="target" ~/projects/* ``` Cleanup Identification ```bash Find old cache directories find ~/.cache -type d -mtime +30 -exec du -sh {} \; Identify temporary files du -ah /tmp | grep -E '\.(tmp|temp|cache)$' | sort -hr ``` Personal Computer Maintenance Media File Management ```bash Analyze media directories du -h ~/Pictures ~/Videos ~/Music | sort -hr Find large media files find ~/Pictures -size +10M -exec du -h {} \; | sort -hr ``` Backup Planning ```bash Calculate backup requirements echo "Documents: $(du -sh ~/Documents | cut -f1)" echo "Pictures: $(du -sh ~/Pictures | cut -f1)" echo "Total Home: $(du -sh ~ | cut -f1)" ``` Common Issues and Troubleshooting Permission Denied Errors Problem: `du: cannot read directory '/root': Permission denied` Solution: ```bash Use sudo for system directories sudo du -sh /root Redirect error messages du -sh /home 2>/dev/null Combine both approaches sudo du -sh /root /var/log 2>/dev/null ``` Performance Issues with Large Filesystems Problem: `du` command taking too long on large directories Solutions: ```bash Limit depth to improve performance du -h --max-depth=2 / Use timeout to prevent hanging timeout 300 du -sh /large/directory Run in background for very large operations nohup du -h / > disk_usage.txt 2>&1 & ``` Inconsistent Results Problem: Different results between `du` and `df` Explanation: This is normal behavior because: - `df` shows filesystem-level usage including metadata - `du` shows actual file sizes - Deleted files held open by processes affect `df` but not `du` Investigation: ```bash Compare filesystem vs directory usage df -h /home du -sh /home Check for deleted files held by processes lsof +L1 /home ``` Memory Usage Issues Problem: `du` consuming too much memory on systems with many files Solutions: ```bash Use ulimit to control memory usage ulimit -v 1048576 # Limit to 1GB virtual memory du -h /large/directory Process in smaller chunks find /large/directory -maxdepth 1 -type d -exec du -sh {} \; ``` Special Character Handling Problem: Issues with filenames containing spaces or special characters Solutions: ```bash Use null separator for safe parsing du -0 directory_with_spaces | sort -z Quote paths properly du -sh "directory with spaces" Use find with proper handling find . -type d -print0 | xargs -0 du -sh ``` Best Practices and Professional Tips Efficient Disk Usage Monitoring Regular Monitoring Scripts Create automated scripts for regular disk usage monitoring: ```bash #!/bin/bash /usr/local/bin/disk-monitor.sh LOG_FILE="/var/log/disk-usage.log" DATE=$(date '+%Y-%m-%d %H:%M:%S') { echo "[$DATE] Disk Usage Report" echo "Root filesystem: $(du -sh / 2>/dev/null)" echo "Home directories: $(du -sh /home 2>/dev/null)" echo "Var directory: $(du -sh /var 2>/dev/null)" echo "---" } >> "$LOG_FILE" ``` Cron Job Integration ```bash Add to crontab for daily monitoring 0 6 * /usr/local/bin/disk-monitor.sh ``` Performance Optimization Techniques Smart Exclusions Always exclude unnecessary directories to improve performance: ```bash Comprehensive exclusion list for system analysis du -h --exclude="/proc" --exclude="/sys" --exclude="/dev" \ --exclude="/run" --exclude="/tmp" / ``` Parallel Processing For very large filesystems, consider parallel processing: ```bash Process multiple directories in parallel (du -sh /home &); (du -sh /var &); (du -sh /opt &); wait ``` Data Presentation Best Practices Formatted Output Create readable reports: ```bash Professional disk usage report printf "%-30s %10s\n" "Directory" "Size" printf "%-30s %10s\n" "----------" "----" du -sh /home/* 2>/dev/null | while read size dir; do printf "%-30s %10s\n" "$dir" "$size" done ``` Threshold-Based Reporting Only report directories above certain thresholds: ```bash Report only directories larger than 1GB du -h /var 2>/dev/null | awk '$1 ~ /[0-9]+G/ {print}' ``` Security Considerations Safe Scripting Practices ```bash Always validate input if [[ -d "$1" ]]; then du -sh "$1" else echo "Error: '$1' is not a valid directory" exit 1 fi Use absolute paths in scripts DU_COMMAND="/usr/bin/du" $DU_COMMAND -sh /home ``` Permission Management ```bash Check if directory is readable before processing if [[ -r "$directory" ]]; then du -sh "$directory" else echo "Warning: Cannot read $directory" fi ``` Performance Considerations System Impact The `du` command can be resource-intensive on large filesystems. Consider these factors: I/O Impact - `du` reads directory metadata, causing disk I/O - On busy systems, use `nice` or `ionice` to reduce impact: ```bash Reduce CPU priority nice -n 19 du -sh /large/directory Reduce I/O priority (Linux) ionice -c 3 du -sh /large/directory Combine both nice -n 19 ionice -c 3 du -sh /large/directory ``` Memory Considerations - Large directory structures can consume significant memory - Monitor memory usage during operation: ```bash Monitor du process memory usage watch 'ps aux | grep du' Use memory-efficient alternatives for very large operations find /large/directory -type f -printf "%s\n" | awk '{sum+=$1} END {print sum}' ``` Optimization Strategies Selective Analysis Instead of analyzing entire filesystems, focus on specific areas: ```bash Analyze only user-modifiable areas for dir in /home /var/log /opt /tmp; do echo "$dir: $(du -sh $dir 2>/dev/null | cut -f1)" done ``` Caching Results For frequently accessed information, cache results: ```bash Cache disk usage results CACHE_FILE="/tmp/du_cache_$(date +%Y%m%d)" if [[ ! -f "$CACHE_FILE" || $(find "$CACHE_FILE" -mtime +1) ]]; then du -sh /home/* > "$CACHE_FILE" 2>/dev/null fi cat "$CACHE_FILE" ``` Integration with Other Commands Combining du with System Tools Integration with df ```bash Comprehensive disk analysis echo "Filesystem Usage:" df -h echo "" echo "Directory Usage:" du -sh /home /var /opt ``` Working with find ```bash Find and analyze large files find /var -size +100M -exec du -h {} \; | sort -hr Find directories modified recently and check their size find /home -type d -mtime -7 -exec du -sh {} \; ``` Combining with awk for Advanced Processing ```bash Calculate average file size in directories find /home/user -type f -exec du -k {} \; | awk '{ sum += $1; count++ } END { if (count > 0) print "Average file size:", sum/count "K" }' ``` Scripting Integration Function Libraries Create reusable functions: ```bash disk_utils.sh check_directory_size() { local dir="$1" local threshold="$2" if [[ ! -d "$dir" ]]; then echo "Error: Directory $dir does not exist" return 1 fi local size_kb=$(du -sk "$dir" 2>/dev/null | cut -f1) if [[ $size_kb -gt $threshold ]]; then echo "Warning: $dir is $(du -sh "$dir" | cut -f1)" return 2 fi return 0 } Usage check_directory_size "/var/log" 1048576 # 1GB in KB ``` JSON Output for Modern Applications ```bash Generate JSON output for web applications generate_disk_usage_json() { echo "{" echo ' "timestamp": "'$(date -Iseconds)'",' echo ' "directories": [' first=true du -sh /home/* 2>/dev/null | while IFS=$'\t' read size path; do [[ $first == true ]] && first=false || echo "," echo -n " {\"path\": \"$path\", \"size\": \"$size\"}" done echo "" echo " ]" echo "}" } ``` Conclusion and Next Steps The `du` command is an indispensable tool for disk usage analysis and system maintenance. Throughout this comprehensive guide, we've covered everything from basic usage to advanced techniques that will help you effectively manage disk space across various scenarios. Key Takeaways 1. Basic Usage: Start with simple commands like `du -sh` for quick directory size checks 2. Advanced Options: Leverage flags like `--max-depth`, `--exclude`, and `--human-readable` for targeted analysis 3. Performance: Consider system impact and use optimization techniques for large filesystems 4. Automation: Implement regular monitoring through scripts and cron jobs 5. Integration: Combine `du` with other commands for comprehensive system analysis 6. Troubleshooting: Handle common issues like permissions and performance problems effectively Recommended Next Steps 1. Practice: Start using `du` regularly in your daily system administration tasks 2. Scripting: Develop custom scripts for your specific monitoring needs 3. Automation: Set up automated disk usage reporting for critical systems 4. Advanced Tools: Explore complementary tools like `ncdu` for interactive disk usage analysis 5. Monitoring: Implement threshold-based alerting for proactive disk space management Additional Resources - Manual Pages: Use `man du` for complete option documentation - System Monitoring: Learn about `df`, `lsof`, and `iostat` for comprehensive system analysis - Scripting: Develop bash scripting skills for advanced automation - Performance Tuning: Study I/O optimization techniques for better system performance Final Recommendations - Always test commands in safe environments before using them in production - Regularly monitor disk usage to prevent system issues - Document your custom scripts and procedures for team knowledge sharing - Stay updated with new features and options in newer versions of the `du` command By mastering the `du` command and implementing the techniques covered in this guide, you'll be well-equipped to handle disk usage analysis and management tasks efficiently and professionally. Remember that effective disk space management is an ongoing process that requires regular attention and proactive monitoring to maintain optimal system performance.