How to Check Directory/File Size → du
The `du` (disk usage) command reports how much disk space files and directories occupy. System administrators, developers, and everyday Linux users rely on it to find out where disk space is going, whether they are troubleshooting a full disk or auditing storage over time.
This guide covers the `du` command from basic syntax through advanced usage, common troubleshooting cases, and best practices for disk space monitoring.
Table of Contents
1. [Introduction to the du Command](#introduction-to-the-du-command)
2. [Prerequisites and Requirements](#prerequisites-and-requirements)
3. [Basic du Command Syntax](#basic-du-command-syntax)
4. [Essential du Command Options](#essential-du-command-options)
5. [Practical Examples and Use Cases](#practical-examples-and-use-cases)
6. [Advanced du Usage Scenarios](#advanced-du-usage-scenarios)
7. [Combining du with Other Commands](#combining-du-with-other-commands)
8. [Common Issues and Troubleshooting](#common-issues-and-troubleshooting)
9. [Best Practices and Professional Tips](#best-practices-and-professional-tips)
10. [Performance Considerations](#performance-considerations)
11. [Alternative Tools and Comparisons](#alternative-tools-and-comparisons)
12. [Conclusion](#conclusion)
Introduction to the du Command
The `du` command, short for "disk usage," is a standard Unix and Linux utility that displays the amount of disk space used by files and directories. Unlike `df`, which reports usage for whole filesystems, `du` reports usage per file and per directory tree, which makes it the right tool for identifying what is actually consuming the space.
The command works by recursively traversing directory structures and calculating the total disk space occupied by each file and subdirectory. This makes it particularly useful for:
- Identifying large files and directories consuming excessive disk space
- Monitoring disk usage trends over time
- Cleaning up storage by locating unnecessary files
- Analyzing directory structures for optimization
- Troubleshooting disk space issues
- Generating reports for system auditing
Prerequisites and Requirements
Before diving into the `du` command usage, ensure you have the following:
System Requirements
- Unix-like operating system (Linux, macOS, BSD, etc.)
- Terminal or command-line access
- Basic understanding of file system navigation
- Appropriate permissions to access target directories
Knowledge Prerequisites
- Familiarity with command-line interface
- Understanding of file system hierarchy
- Basic knowledge of file permissions
- Comfort with terminal navigation commands (`cd`, `ls`, `pwd`)
Tools and Access
- Terminal emulator or SSH access
- Text editor (for creating scripts)
- Administrative privileges (for system directories)
Basic du Command Syntax
The fundamental syntax of the `du` command is straightforward:
```bash
du [OPTIONS] [FILE|DIRECTORY]...
```
Simple Usage Examples
```bash
# Check current directory size
du
# Check specific directory size
du /home/user/documents
# Check multiple directories
du /var/log /tmp /home
```
Default Behavior
When executed without options, `du` displays:
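- Disk usage for each directory and subdirectory (individual files are listed only with `-a`)
- Values in blocks: 1024-byte units with GNU `du` on Linux, 512-byte units on macOS/BSD or when `POSIXLY_CORRECT` is set
- Recursive traversal of all subdirectories
- The total for each directory argument on the last line of its listing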
```bash
$ du /home/user/projects
4 /home/user/projects/project1/src
8 /home/user/projects/project1
12 /home/user/projects/project2
24 /home/user/projects
```
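Because the default block size differs between implementations, scripts that compare sizes are safer with an explicit unit. The following is a minimal sketch, assuming a hypothetical helper name `dir_kib`; it relies only on `-s` and `-k`, which force a single total in 1024-byte units.

```bash
# dir_kib is a hypothetical helper name, not part of du itself.
# Prints the total size of a directory in KiB, independent of the
# platform's default block size.
dir_kib() {
    du -sk -- "$1" | cut -f1
}

# Example: size of the current directory in KiB
dir_kib .
```

Using a fixed unit means the number can be compared against a threshold without guessing whether the system counts in 512-byte or 1024-byte blocks.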
Essential du Command Options
Understanding the key options available with `du` is crucial for effective usage. Here are the most important flags and their applications:
Human-Readable Format (-h)
The `-h` option displays sizes in human-readable format using appropriate units (K, M, G, T):
```bash
# Human-readable output
du -h /var/log
156K /var/log/apache2
2.3M /var/log/mysql
45M /var/log/system
47M /var/log
```
Summary Mode (-s)
Use `-s` to display only the total size without showing subdirectory details:
```bash
# Show only total size
du -sh /home/user
2.5G /home/user
# Multiple directories summary
du -sh /var/log /tmp /home
47M /var/log
156K /tmp
15G /home
```
Maximum Depth (-d)
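Limit how many directory levels are printed with the `-d N` option (GNU long form: `--max-depth=N`). `du` still walks the whole tree to compute the totals; only the output is trimmed: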
```bash
# Show only immediate subdirectories
du -h -d 1 /home/user
500M /home/user/documents
1.2G /home/user/downloads
800M /home/user/projects
2.5G /home/user
# Limit to 2 levels deep
du -h -d 2 /var
```
Show All Files (-a)
Include individual files in the output, not just directories:
```bash
# Show all files and directories
du -ah /home/user/documents
4.0K /home/user/documents/readme.txt
156K /home/user/documents/report.pdf
2.3M /home/user/documents/presentation.pptx
2.5M /home/user/documents
```
Exclude Patterns (--exclude)
Exclude files or directories that match a shell pattern from the calculation (`--exclude` is a GNU option; BSD and macOS `du` use `-I` instead):
```bash
# Exclude specific patterns
du -h --exclude="*.log" /var/log
du -h --exclude="node_modules" /home/user/projects
du -h --exclude="*.tmp" --exclude="cache" /home/user
```
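Since the exclusion flag differs between GNU and BSD `du`, a script meant to run on both can pick the flag at runtime. This is only a sketch: `du_excluding` is a hypothetical wrapper name, and it assumes that any `du` that rejects `--version` is a BSD-style one that accepts `-I`.

```bash
# du_excluding is a hypothetical wrapper name.
# Usage: du_excluding PATTERN DIRECTORY
# Prints the directory total in KiB, ignoring entries that match PATTERN.
du_excluding() {
    if du --version >/dev/null 2>&1; then
        # GNU du understands --version and --exclude
        du -sk --exclude="$1" -- "$2"
    else
        # Assumption: a BSD/macOS du, which uses -I for the same purpose
        du -sk -I "$1" -- "$2"
    fi
}

# Example: size of the current directory without *.log files
du_excluding '*.log' .
```

Other implementations (BusyBox, for example) may support neither flag, so treat the fallback branch as an assumption to verify on your platform.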
Practical Examples and Use Cases
Finding the Largest Directories
Identify the top space-consuming directories in your system:
```bash
# Find top 10 largest directories in /home
du -h /home | sort -hr | head -10
# Find largest directories in current location
du -h -d 1 . | sort -hr
# More comprehensive analysis
du -h /var | sort -hr | head -20
```
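The `-h` flag on `sort` is what lets these pipelines order `K`, `M`, and `G` suffixes correctly, but not every `sort` implementation supports it. A sketch of a fallback that sorts raw KiB counts numerically instead; `top_dirs` is a hypothetical helper name:

```bash
# top_dirs is a hypothetical helper name.
# Usage: top_dirs [DIRECTORY] [COUNT]
# Lists the largest entries one level below DIRECTORY, sizes in KiB.
top_dirs() {
    du -k -d 1 -- "${1:-.}" 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Example: the five largest entries under the current directory
top_dirs . 5
```

The first line of the output is normally the directory itself, because its total includes everything beneath it.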
Monitoring Specific File Types
Analyze disk usage for specific file types:
```bash
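# Total size of all image files (GNU du reads the NUL-delimited list from stdin)
find /home -type f \( -name "*.jpg" -o -name "*.png" -o -name "*.gif" \) -print0 | du -ch --files0-from=-
# Total size of all video files (grand total only)
find /home -type f \( -name "*.mp4" -o -name "*.avi" -o -name "*.mkv" \) -print0 | du -ch --files0-from=- | tail -1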
```
System Cleanup Analysis
Identify potential cleanup targets:
```bash
# Check temporary directories
du -sh /tmp /var/tmp ~/.cache
# Analyze log file usage
du -sh /var/log/*
# Check package cache (Ubuntu/Debian)
du -sh /var/cache/apt/archives
```
Project Directory Analysis
For developers managing multiple projects:
```bash
# Analyze project directories
du -h -d 2 ~/projects | sort -hr
# Exclude common build artifacts
du -h --exclude="node_modules" --exclude="target" --exclude=".git" ~/projects
# Compare project sizes
for dir in ~/projects/*/; do
    echo "$(du -sh --exclude=node_modules "$dir" | cut -f1) - $(basename "$dir")"
done
```
Database and Application Monitoring
Monitor application-specific disk usage:
```bash
# Check database sizes
du -sh /var/lib/mysql/*
# Web server content analysis
du -h -d 2 /var/www
# Application log monitoring
du -sh /var/log/apache2/* | sort -hr
```
Advanced du Usage Scenarios
Time-Based Analysis
Combine `du` with time-based filtering for temporal analysis:
```bash
# Total size of files modified in the last 7 days (NUL-delimited, GNU du)
find /home/user -mtime -7 -type f -print0 | du -ch --files0-from=- | tail -1
# Files last modified more than 30 days ago, largest first
find /var/log -mtime +30 -type f -exec du -h {} + | sort -hr
```
Cross-Filesystem Handling
Control how `du` handles different filesystems:
```bash
# Stay within a single filesystem (skip other mount points)
du -x -h /home
# Include all mounted filesystems
du -h /
```
Scripting and Automation
Create automated disk monitoring scripts:
```bash
#!/bin/bash
# disk_monitor.sh - Monitor directory sizes
THRESHOLD=1048576 # 1 GiB expressed in KiB
DIRECTORIES=("/home" "/var/log" "/tmp")
for dir in "${DIRECTORIES[@]}"; do
    # -k forces 1024-byte units regardless of the platform default
    size=$(du -sk "$dir" 2>/dev/null | cut -f1)
    if [ "${size:-0}" -gt "$THRESHOLD" ]; then
        echo "WARNING: $dir is using $(du -sh "$dir" 2>/dev/null | cut -f1)"
        du -h -d 1 "$dir" 2>/dev/null | sort -hr | head -5
    fi
done
```
Network and Remote Usage
Use `du` with remote systems:
```bash
# SSH remote disk usage check
ssh user@remote-server "du -sh /var/log"
# Remote directory analysis
ssh user@server "du -h /home | sort -hr | head -10"
```
Combining du with Other Commands
Integration with find
Powerful combinations for targeted analysis:
```bash
# Find and size large files
find /home -size +100M -type f -exec du -h {} \; | sort -hr
# Files larger than 1GB modified recently
find / -size +1G -mtime -30 -type f -exec du -h {} \;
```
Piping to sort and head/tail
Organize output for better analysis:
```bash
# Top 20 largest directories
du -h /var | sort -hr | head -20
# Smallest directories (excluding zero-size)
du -h /etc | sort -h | grep -v "^0" | head -10
# Middle-range sizes (entries 11-20 by size)
du -h /usr/share | sort -hr | tail -n +11 | head -10
```
Using with grep for Filtering
Filter results based on patterns:
```bash
# Only show directories with specific patterns
du -h /var/log | grep -E "(error|access|debug)"
# Exclude certain patterns from output
du -h /home | grep -v -E "(cache|tmp|\.git)"
```
Integration with awk for Processing
Process and format output:
```bash
# Show only sizes above 100 MiB (-m reports MiB, so the comparison is purely numeric)
du -m /home | awk '$1 > 100'
# Format output with custom messages
du -sh /var/log/* | awk '{print "Directory " $2 " uses " $1 " of space"}'
```
Common Issues and Troubleshooting
Permission Denied Errors
When encountering permission issues:
```bash
# Problem: Permission denied errors
# du: cannot read directory '/root': Permission denied
# Solution: Use sudo for system directories
sudo du -sh /root
# Alternative: Discard all error messages
du -sh /home 2>/dev/null
# Better: Hide only the permission errors; other errors stay visible
du -sh /var 2>&1 | grep -v "Permission denied"
```
Symbolic Link Handling
Understanding how `du` handles symbolic links:
```bash
# By default, du doesn't follow symbolic links; it counts the link itself
du -sh /usr/bin
# Follow symbolic links with -L (counts the size of each target)
du -shL /usr/bin
# Never follow symbolic links with -P (the default)
du -shP /usr/bin
```
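The difference between the default behavior and `-L` is easiest to see on a small test tree. The sketch below builds a throwaway directory with `mktemp`; all paths and variable names in it are illustrative only.

```bash
# Self-contained demonstration in a throwaway directory.
demo=$(mktemp -d)
mkdir "$demo/real" "$demo/links"

# 1 MiB of data that lives outside the directory we will measure
dd if=/dev/urandom of="$demo/real/big.bin" bs=1024 count=1024 2>/dev/null
ln -s "$demo/real/big.bin" "$demo/links/big.bin"

no_follow=$(du -sk "$demo/links" | cut -f1)   # counts only the link itself
follow=$(du -skL "$demo/links" | cut -f1)     # counts the 1 MiB target
echo "without -L: ${no_follow} KiB, with -L: ${follow} KiB"

rm -rf "$demo"
```

The second figure should be larger by roughly the size of the target file, which is why `-L` can noticeably inflate totals for directories full of links.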
Large Directory Performance
Optimizing performance for large directories:
```bash
# Problem: du takes too long on large directories
# Solution: Trim the output, scan in parallel, or run in the background
# Limit output depth (du still walks the full tree, but there is less to sort and read)
du -h -d 2 /usr | sort -hr
# Use parallel processing for multiple directories
printf '%s\n' /var /usr /home | xargs -I {} -P 3 du -sh {}
# Background processing for large scans
nohup du -h /large-directory > du-results.txt 2>&1 &
```
Disk Space Discrepancies
Resolving differences between `du` and `df`:
```bash
# Check both du and df results
df -h /home
du -sh /home
# Reasons for discrepancies:
# 1. Open deleted files (use lsof to check)
lsof | grep deleted
# 2. Hard links (du counts a multiply-linked file once per run; -l counts every link)
find /home -links +1 -type f
# 3. Sparse files (du shows allocated blocks; --apparent-size shows the nominal length)
du --apparent-size -h file.sparse
du -h file.sparse
```
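Hard links and sparse files can both be reproduced in a scratch directory, which makes their effect on the numbers concrete. This is a sketch under the assumption that the temporary directory sits on a filesystem that supports hard links and sparse files; the file names are made up for the demonstration.

```bash
demo=$(mktemp -d)

# Hard links: two names, one set of data blocks
dd if=/dev/urandom of="$demo/original" bs=1024 count=512 2>/dev/null
ln "$demo/original" "$demo/hardlink"
once=$(du -sk "$demo" | cut -f1)     # shared blocks counted once
each=$(du -skl "$demo" | cut -f1)    # -l counts them for every link
echo "default: ${once} KiB, with -l: ${each} KiB"

# Sparse file: 10 MiB nominal length, (almost) nothing allocated
dd if=/dev/zero of="$demo/sparse" bs=1 count=0 seek=10485760 2>/dev/null
allocated=$(du -k "$demo/sparse" | cut -f1)
nominal=$(wc -c < "$demo/sparse")
echo "sparse file: ${allocated} KiB allocated, ${nominal} bytes long"

rm -rf "$demo"
```

Tools that report file length (`ls -l`, `wc -c`) and tools that report allocated blocks (`du`, `df`) will disagree on files like these, which accounts for many apparent mismatches.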
Memory Usage Issues
Managing memory consumption during large scans:
```bash
# Problem: du uses too much memory
# Solution: Process directories individually
# Instead of: du -h /
# Use:
for dir in /*; do
    [ -d "$dir" ] && du -sh "$dir"
done
```
Best Practices and Professional Tips
Regular Monitoring Strategies
Implement systematic disk monitoring:
```bash
#!/bin/bash
# daily_disk_report.sh - Create daily disk usage reports
DATE=$(date +%Y-%m-%d)
REPORT_DIR="/var/log/disk-reports"
mkdir -p "$REPORT_DIR"
{
    echo "Disk Usage Report - $DATE"
    echo "=========================="
    echo
    echo "Top 20 Largest Directories:"
    du -h /home | sort -hr | head -20
    echo
    echo "System Directory Usage:"
    du -sh /var/log /tmp /var/cache /usr/share
} > "$REPORT_DIR/disk-usage-$DATE.txt"
```
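Dated reports become more useful when two of them can be compared to see what grew. The following is a rough sketch of that idea; `snapshot_usage` and `usage_growth` are hypothetical helper names, and the snapshot file format (tab-separated KiB and path, sorted by path) is an assumption of this example rather than anything `du` prescribes.

```bash
# snapshot_usage and usage_growth are hypothetical helper names.
TAB=$(printf '\t')

# Record per-subdirectory usage in KiB, sorted by path so snapshots can be joined.
# Usage: snapshot_usage DIRECTORY SNAPSHOT_FILE
snapshot_usage() {
    du -k -d 1 -- "$1" 2>/dev/null | LC_ALL=C sort -t "$TAB" -k 2 > "$2"
}

# Print growth in KiB between two snapshots, largest increase first.
# Paths present in only one snapshot are skipped.
# Usage: usage_growth OLD_SNAPSHOT NEW_SNAPSHOT
usage_growth() {
    LC_ALL=C join -t "$TAB" -1 2 -2 2 "$1" "$2" |
        awk -F '\t' '{ printf "%d\t%s\n", $3 - $2, $1 }' |
        sort -rn
}

# Example (paths are placeholders):
#   snapshot_usage /home /var/tmp/usage-monday.txt
#   snapshot_usage /home /var/tmp/usage-friday.txt
#   usage_growth /var/tmp/usage-monday.txt /var/tmp/usage-friday.txt | head
```

Storing raw KiB values rather than `-h` output keeps the arithmetic simple; the numbers can be formatted for humans at display time.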
Efficient Directory Scanning
Optimize scanning strategies:
```bash
# Use appropriate depth limits
du -h -d 3 /usr # Usually sufficient for analysis
# Exclude unnecessary directories (an array keeps each option a separate word)
EXCLUDE_OPTS=(--exclude=.git --exclude=node_modules --exclude=.cache)
du -h "${EXCLUDE_OPTS[@]}" /home/user/projects
# Batch processing for multiple targets
DIRS=("/var/log" "/tmp" "/var/cache")
printf "%s\n" "${DIRS[@]}" | xargs -I {} du -sh {}
```
Creating Useful Aliases
Set up convenient aliases:
```bash
# Add to ~/.bashrc or ~/.zshrc
alias duh='du -h -d 1 | sort -hr'
alias dus='du -sh'
alias dutop='du -h | sort -hr | head -20'
alias dudir='du -h -d 1'
# Function for interactive directory analysis
analyze_dir() {
    local dir="${1:-.}"
    echo "Analyzing directory: $dir"
    echo "Total size: $(du -sh "$dir" | cut -f1)"
    echo "Largest subdirectories:"
    du -h -d 1 "$dir" | sort -hr | head -10
}
```
Documentation and Reporting
Maintain proper documentation:
```bash
# Generate comprehensive reports
generate_disk_report() {
    local output_file="disk-report-$(date +%Y%m%d).txt"
    {
        echo "=== DISK USAGE ANALYSIS REPORT ==="
        echo "Generated: $(date)"
        echo "Hostname: $(hostname)"
        echo
        echo "=== FILESYSTEM OVERVIEW ==="
        df -h
        echo
        echo "=== TOP 20 LARGEST DIRECTORIES ==="
        du -h / 2>/dev/null | sort -hr | head -20
        echo
        echo "=== SYSTEM DIRECTORIES ==="
        for dir in /var/log /tmp /var/cache /usr/share; do
            [ -d "$dir" ] && echo "$dir: $(du -sh "$dir" 2>/dev/null | cut -f1)"
        done
    } > "$output_file"
    echo "Report generated: $output_file"
}
```
Performance Considerations
Optimizing Large Scans
Strategies for handling large directory structures:
```bash
# Use ionice to reduce I/O impact (Linux only)
ionice -c 3 du -sh /large-directory
# Combine with nice for CPU priority
nice -n 19 ionice -c 3 du -sh /massive-dataset
# Parallel processing for independent directories (requires GNU parallel)
parallel "du -sh {} 2>/dev/null" ::: /var/* | sort -hr
```
Memory-Efficient Approaches
Minimize memory usage during scans:
```bash
# Process directories incrementally
find /large-dir -mindepth 1 -maxdepth 1 -type d | while IFS= read -r dir; do
    size=$(du -sh "$dir" 2>/dev/null | cut -f1)
    printf '%s\t%s\n' "$size" "$dir"
done | sort -hr
# Use streaming approach for very large datasets
# (GNU find -printf; sums apparent file sizes, not allocated blocks)
find /huge-directory -type f -printf "%s %p\n" | \
awk '{size+=$1; files++} END {print "Total:", size/1024/1024 " MB in", files, "files"}'
```
Alternative Tools and Comparisons
Comparison with Other Disk Usage Tools
Understanding when to use different tools:
```bash
# du - Detailed directory analysis
du -h -d 2 /home
# df - Filesystem-level usage
df -h
# ncdu - Interactive disk usage analyzer
ncdu /home
# tree - Directory structure with sizes
tree -h -L 2 /home/user
# ls - File listing with sizes
ls -lah /home/user/
```
Modern Alternatives
Explore enhanced tools:
```bash
# dust - Modern du replacement
dust /home
# duf - Better df alternative
duf
# gdu - Fast disk usage analyzer with TUI
gdu /home
# baobab - Graphical disk usage analyzer (GUI)
baobab
```
Conclusion
The `du` command is an indispensable tool for effective disk space management in Unix and Linux environments. Throughout this comprehensive guide, we've explored everything from basic syntax to advanced usage scenarios, troubleshooting techniques, and best practices.
Key Takeaways
1. Master the Essential Options: The `-h`, `-s`, `-d`, and `-a` options cover most common use cases
2. Combine with Other Tools: Integrate `du` with `sort`, `grep`, `find`, and other utilities for powerful analysis
3. Handle Permissions Properly: Use appropriate privileges and error handling for system directories
4. Optimize for Performance: Consider exclusions, staying on one filesystem (`-x`), and parallel processing for large datasets; depth limits shorten the output but not the scan
5. Automate Monitoring: Create scripts and reports for regular disk usage tracking
6. Understand Limitations: Know when to use alternative tools for specific scenarios
Next Steps
To further enhance your disk management skills:
1. Practice Regular Monitoring: Implement automated disk usage reporting in your systems
2. Explore Advanced Scripting: Create custom tools combining `du` with other utilities
3. Learn Complementary Tools: Familiarize yourself with `ncdu`, `dust`, and other modern alternatives
4. Develop Cleanup Strategies: Use `du` insights to implement effective disk cleanup procedures
5. Monitor Performance Impact: Understand how disk usage affects system performance
Final Recommendations
- Always test commands in non-production environments first
- Keep regular backups before performing large cleanup operations
- Document your disk monitoring procedures for team consistency
- Stay updated with new tools and techniques in the ecosystem
- Consider implementing automated alerting for disk usage thresholds
By mastering the `du` command and following these best practices, you'll be well-equipped to handle disk space management challenges effectively, whether you're a system administrator, developer, or power user working with Unix-like systems.