How to Sync Files Between Servers with rsync
File synchronization between servers is a routine task for system administrators, developers, and IT professionals managing distributed systems. The `rsync` utility is one of the most capable tools for transferring and synchronizing files efficiently across systems. This guide covers how to use rsync to sync files between servers, from basic concepts to advanced configurations.
Table of Contents
1. [Introduction to rsync](#introduction-to-rsync)
2. [Prerequisites and Requirements](#prerequisites-and-requirements)
3. [Basic rsync Syntax and Options](#basic-rsync-syntax-and-options)
4. [Setting Up SSH Key Authentication](#setting-up-ssh-key-authentication)
5. [Step-by-Step File Synchronization](#step-by-step-file-synchronization)
6. [Advanced rsync Configurations](#advanced-rsync-configurations)
7. [Practical Examples and Use Cases](#practical-examples-and-use-cases)
8. [Troubleshooting Common Issues](#troubleshooting-common-issues)
9. [Best Practices and Security Considerations](#best-practices-and-security-considerations)
10. [Performance Optimization](#performance-optimization)
11. [Monitoring and Logging](#monitoring-and-logging)
12. [Conclusion](#conclusion)
Introduction to rsync
rsync (remote sync) is a fast, versatile, and feature-rich file copying tool that can synchronize files and directories between two locations over a network or locally. Unlike simple copy commands, rsync only transfers the portions of files that have changed, making it extremely efficient for regular backups and synchronization tasks.
Key Features of rsync
- Incremental transfers: Only changed parts of files are transferred
- Preservation of file attributes: Maintains permissions, timestamps, and ownership
- Compression: Reduces bandwidth usage during transfers
- Secure transfers: Works seamlessly with SSH for encrypted connections
- Flexible filtering: Include or exclude files based on patterns
- Bandwidth limiting: Control transfer speeds to avoid network congestion
Prerequisites and Requirements
Before diving into rsync implementation, ensure you have the following prerequisites in place:
System Requirements
- Source and destination servers running Linux, Unix, or macOS
- rsync installed on both systems (usually pre-installed on most Unix-like systems)
- Network connectivity between servers
- Appropriate user permissions on both systems
- SSH access configured between servers (recommended)
Checking rsync Installation
Verify rsync is installed on both servers:
```bash
rsync --version
```
If rsync is not installed, install it using your system's package manager:
```bash
# Ubuntu/Debian
sudo apt-get install rsync
# CentOS/RHEL/Fedora
sudo yum install rsync
# or for newer versions
sudo dnf install rsync
# macOS
brew install rsync
```
Basic rsync Syntax and Options
Understanding rsync's syntax is crucial for effective file synchronization. The basic syntax follows this pattern:
```bash
rsync [OPTIONS] SOURCE DESTINATION
```
Essential rsync Options
| Option | Description |
|--------|-------------|
| `-a, --archive` | Archive mode; recurses and preserves permissions, timestamps, ownership, and symbolic links |
| `-v, --verbose` | Increase verbosity for detailed output |
| `-z, --compress` | Compress file data during transfer |
| `-h, --human-readable` | Output numbers in human-readable format |
| `-P` | Same as `--partial --progress`: show progress and keep partially transferred files |
| `-n, --dry-run` | Perform trial run without making changes |
| `-r, --recursive` | Recurse into directories |
| `--delete` | Delete extraneous files from destination |
| `--exclude` | Exclude files matching pattern |
| `--include` | Include files matching pattern |
Common Option Combinations
The most frequently used combination is `-avzh`:
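- `-a`: Archive mode (recursion plus permissions, timestamps, ownership, and symlinks; hard links, ACLs, and extended attributes need `-H`, `-A`, and `-X`)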
- `-v`: Verbose output
- `-z`: Compression
- `-h`: Human-readable output
Setting Up SSH Key Authentication
For secure and automated synchronization, configure SSH key authentication between servers.
Generate SSH Key Pair
On the source server, generate an SSH key pair:
```bash
ssh-keygen -t rsa -b 4096 -C "rsync-sync-key"
```
When prompted, save the key to a specific location:
```
Enter file in which to save the key: /home/username/.ssh/rsync_key
```
Copy Public Key to Destination Server
Transfer the public key to the destination server:
```bash
ssh-copy-id -i ~/.ssh/rsync_key.pub username@destination-server.com
```
Test SSH Connection
Verify the key-based authentication works:
```bash
ssh -i ~/.ssh/rsync_key username@destination-server.com
```
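To avoid repeating the `-i` option on every command, one option is a host alias in `~/.ssh/config`. The sketch below only prints a stanza for review; the alias `sync-dest` and the helper name `ssh_config_stanza` are illustrative assumptions, while `Host`, `HostName`, `User`, `IdentityFile`, and `IdentitiesOnly` are standard OpenSSH client options.

```bash
#!/bin/bash
# Print an ~/.ssh/config stanza for a host alias (helper name is hypothetical).
# Usage: ssh_config_stanza ALIAS HOSTNAME USER KEYFILE
ssh_config_stanza() {
    local alias_name="$1" host_name="$2" user_name="$3" key_file="$4"
    printf 'Host %s\n' "$alias_name"
    printf '    HostName %s\n' "$host_name"
    printf '    User %s\n' "$user_name"
    printf '    IdentityFile %s\n' "$key_file"
    # Offer only this key instead of every key loaded in the agent
    printf '    IdentitiesOnly yes\n'
}

# Review the output, then append it to ~/.ssh/config by hand.
ssh_config_stanza sync-dest destination-server.com username '~/.ssh/rsync_key'
```

With such a stanza in place, `ssh sync-dest` and `rsync -avzh /local/source/ sync-dest:/remote/destination/` should pick up the key without an explicit `-e "ssh -i ..."`.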
Step-by-Step File Synchronization
Let's walk through the process of synchronizing files between servers using practical examples.
Step 1: Basic Local to Remote Sync
Synchronize a local directory to a remote server:
```bash
rsync -avzh /local/source/directory/ username@remote-server:/remote/destination/directory/
```
Important Note: The trailing slash (`/`) in the source path affects behavior:
- With trailing slash: Syncs contents of the directory
- Without trailing slash: Syncs the directory itself
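The trailing-slash rule is easy to check locally before touching a remote server. This is a minimal sketch that uses throwaway directories created with `mktemp`; the directory names are illustrative assumptions.

```bash
#!/bin/bash
# Local demonstration of the trailing-slash rule; no remote host needed.
# All paths live under a throwaway temp directory (names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/with_slash" "$demo/without_slash"
echo "hello" > "$demo/src/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # Trailing slash: copies the contents of src/ into with_slash/
    rsync -a "$demo/src/" "$demo/with_slash/"
    # No trailing slash: copies the directory src itself into without_slash/
    rsync -a "$demo/src" "$demo/without_slash/"
    # List the resulting files, relative to the temp directory
    (cd "$demo" && find with_slash without_slash -type f | sort)
fi
```

The first copy should produce `with_slash/file.txt`, the second `without_slash/src/file.txt`.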
Step 2: Remote to Local Sync
Pull files from a remote server to local system:
```bash
rsync -avzh username@remote-server:/remote/source/directory/ /local/destination/directory/
```
Step 3: Server to Server Sync
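rsync cannot copy directly between two remote hosts: the source and destination cannot both be remote. Run rsync on one of the servers instead, for example by starting it on server1 over SSH:
```bash
ssh username1@server1 "rsync -avzh /path/to/source/ username2@server2:/path/to/destination/"
```
This requires that server1 can authenticate to server2, for example with a key stored on server1 or with SSH agent forwarding (`ssh -A`).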
Step 4: Using SSH Keys
Specify SSH key for authentication:
```bash
rsync -avzh -e "ssh -i ~/.ssh/rsync_key" /local/source/ username@remote-server:/remote/destination/
```
Advanced rsync Configurations
Exclusion and Inclusion Patterns
Exclude specific files or directories:
```bash
rsync -avzh --exclude='*.log' --exclude='temp/' /source/ user@server:/destination/
```
Use exclusion file for complex patterns:
```bash
# Create exclusion file
cat > exclude-list.txt << EOF
*.log
*.tmp
temp/
cache/
*.bak
EOF
# Use exclusion file
rsync -avzh --exclude-from=exclude-list.txt /source/ user@server:/destination/
```
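An exclusion file can be tried out on local throwaway directories before it is used against a real server. The following sketch assumes hypothetical file names and a two-pattern list; it copies for real, but only inside a temp directory.

```bash
#!/bin/bash
# Try an exclusion file on local temp directories (all names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src/temp" "$work/dst"
touch "$work/src/app.conf" "$work/src/debug.log" "$work/src/temp/scratch.dat"

# One pattern per line; a trailing slash matches directories only
printf '%s\n' '*.log' 'temp/' > "$work/exclude-list.txt"

if command -v rsync >/dev/null 2>&1; then
    rsync -a --exclude-from="$work/exclude-list.txt" "$work/src/" "$work/dst/"
    ls -A "$work/dst"   # only files that survived the filter
fi
```

Only `app.conf` should arrive in the destination; `debug.log` and the whole `temp/` directory are skipped.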
Delete Operations
Synchronize and delete files that don't exist in source:
```bash
rsync -avzh --delete /source/ user@server:/destination/
```
Warning: Use `--delete` carefully as it permanently removes files from the destination.
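One way to reduce the risk is to preview deletions with `--dry-run` (`-n`) first. The sketch below does this on throwaway local directories; the file names are illustrative assumptions.

```bash
#!/bin/bash
# Preview what --delete would remove, using throwaway local directories.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src" "$sandbox/dst"
touch "$sandbox/src/current.txt" "$sandbox/dst/current.txt" "$sandbox/dst/stale.txt"

if command -v rsync >/dev/null 2>&1; then
    # -n (--dry-run) reports the planned deletions without touching dst/
    rsync -avn --delete "$sandbox/src/" "$sandbox/dst/"
    ls "$sandbox/dst"   # stale.txt is still present after the dry run
fi
```

Running the same command without `-n` would then remove `stale.txt` from the destination.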
Bandwidth Limiting
Limit bandwidth usage to avoid network congestion:
```bash
rsync -avzh --bwlimit=1000 /source/ user@server:/destination/
```
This limits transfer speed to roughly 1000 KiB/s; without a suffix, rsync reads the value in units of 1024 bytes per second.
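When choosing a limit, it can help to estimate how long a transfer would take at that rate. The helper below is a hypothetical convenience function, not part of rsync, and it ignores compression and delta-transfer savings, so treat the result as an upper bound.

```bash
#!/bin/bash
# Rough transfer-time estimate for a given --bwlimit (helper is hypothetical).
# size_mib: data to send in MiB; limit_kib: --bwlimit value in KiB/s.
estimate_seconds() {
    local size_mib="$1" limit_kib="$2"
    # 1 MiB = 1024 KiB; integer division rounds down
    echo $(( size_mib * 1024 / limit_kib ))
}

# 10 GiB (10240 MiB) at --bwlimit=1000
secs=$(estimate_seconds 10240 1000)
echo "about $(( secs / 3600 ))h $(( secs % 3600 / 60 ))m"
```

For this input the estimate is 10485 seconds, which the script prints as `about 2h 54m`.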
Practical Examples and Use Cases
Example 1: Website Deployment
Deploy website files from development to production server:
```bash
#!/bin/bash
# Website deployment script
SOURCE_DIR="/var/www/html/staging/"
DEST_SERVER="production-server.com"
DEST_DIR="/var/www/html/production/"
SSH_KEY="$HOME/.ssh/deploy_key"  # $HOME, because the shell does not expand "~" inside quotes

echo "Starting website deployment..."
if rsync -avzh \
    --delete \
    --exclude='*.log' \
    --exclude='cache/' \
    --exclude='.git/' \
    -e "ssh -i $SSH_KEY" \
    "$SOURCE_DIR" \
    "deploy@$DEST_SERVER:$DEST_DIR"; then
    echo "Deployment completed successfully!"
else
    echo "Deployment failed with exit code $?" >&2
    exit 1
fi
```
Example 2: Database Backup Synchronization
Sync database backups between servers:
```bash
#!/bin/bash
# Database backup sync script
BACKUP_SOURCE="/var/backups/mysql/"
BACKUP_DEST="backup-server.com:/storage/mysql-backups/"
LOG_FILE="/var/log/backup-sync.log"

echo "$(date): Starting backup synchronization" >> "$LOG_FILE"
# Filter rules apply in order: keep *.sql.gz files, skip everything else
rsync -avzh \
    --delete \
    --include='*.sql.gz' \
    --exclude='*' \
    --log-file="$LOG_FILE" \
    "$BACKUP_SOURCE" \
    "backup@$BACKUP_DEST"
echo "$(date): Backup synchronization finished with exit code $?" >> "$LOG_FILE"
```
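rsync applies filter rules in order and the first match wins, so the include and exclude rules in the script above only pick up `.sql.gz` files at the top level: any subdirectory matches `--exclude='*'` and is never entered. The sketch below illustrates this on local throwaway directories and shows one way to reach nested backups by including `*/` first; the directory and file names are illustrative assumptions.

```bash
#!/bin/bash
# Local demonstration of filter-rule order; paths are throwaway temp directories.
base=$(mktemp -d)
mkdir -p "$base/src/daily" "$base/flat" "$base/nested"
touch "$base/src/top.sql.gz" "$base/src/notes.txt" "$base/src/daily/monday.sql.gz"

if command -v rsync >/dev/null 2>&1; then
    # Same rules as the script above: daily/ matches --exclude='*' and is skipped
    rsync -a --include='*.sql.gz' --exclude='*' "$base/src/" "$base/flat/"

    # Including '*/' first lets rsync descend into subdirectories
    rsync -a --include='*/' --include='*.sql.gz' --exclude='*' \
        "$base/src/" "$base/nested/"

    # Compare the two results
    (cd "$base" && find flat nested -type f | sort)
fi
```

The `flat` copy should contain only `top.sql.gz`, while the `nested` copy should also contain `daily/monday.sql.gz`. Adding `--prune-empty-dirs` to the second command avoids creating directories that end up with no matching files.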
Example 3: Configuration Management
Synchronize configuration files across multiple servers:
```bash
#!/bin/bash
# Configuration sync script
CONFIG_SOURCE="/etc/myapp/"
SERVERS=("server1.com" "server2.com" "server3.com")
SSH_KEY="$HOME/.ssh/config_key"  # $HOME, because the shell does not expand "~" inside quotes

for server in "${SERVERS[@]}"; do
    echo "Syncing configuration to $server..."
    # --checksum compares file contents rather than size and modification time
    if rsync -avzh \
        --checksum \
        -e "ssh -i $SSH_KEY" \
        "$CONFIG_SOURCE" \
        "admin@$server:/etc/myapp/"; then
        echo "Successfully synced to $server"
    else
        echo "Failed to sync to $server" >&2
    fi
done
```
Troubleshooting Common Issues
Issue 1: Permission Denied Errors
Problem: rsync fails with permission denied errors.
Solutions:
1. Check SSH key permissions:
```bash
chmod 600 ~/.ssh/rsync_key
chmod 644 ~/.ssh/rsync_key.pub
```
2. Verify destination directory permissions:
```bash
ssh user@server "ls -la /path/to/destination/"
```
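3. Use sudo on the remote side for privileged operations (this requires passwordless sudo for rsync on the destination, because no terminal is available for a password prompt):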
```bash
rsync -avzh /source/ user@server:/destination/ --rsync-path="sudo rsync"
```
Issue 2: Connection Timeouts
Problem: rsync connections time out on large transfers.
Solutions:
1. Send SSH keepalives so that idle connections are not dropped:
```bash
rsync -avzh -e "ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=3" /source/ user@server:/destination/
```
2. Use compression to reduce transfer time:
```bash
rsync -avzh --compress-level=9 /source/ user@server:/destination/
```
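On unreliable links, another option is to wrap the transfer in a retry loop with increasing pauses. This is a minimal sketch: `retry` is a hypothetical helper, not an rsync feature, and the attempt count and delays are arbitrary assumptions to tune for your environment.

```bash
#!/bin/bash
# Retry a command up to MAX times, doubling the pause after each failure.
# Usage: retry MAX INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry() {
    local max="$1" delay="$2" attempt=1
    shift 2
    until "$@"; do
        if (( attempt >= max )); then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$(( attempt + 1 ))
        delay=$(( delay * 2 ))
    done
    return 0
}

# Example (hypothetical host and paths); --partial lets each retry resume
# interrupted files instead of starting them over:
# retry 5 10 rsync -avzh --partial --timeout=300 /source/ user@server:/destination/
```

Here `--timeout=300` makes rsync itself give up after 300 seconds without I/O, so a stalled connection turns into a failure that the loop can retry.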
Issue 3: Partial Transfer Failures
Problem: Large file transfers fail partway through and have to restart from the beginning.
Solution: Use partial transfers and progress monitoring:
```bash
rsync -avzhP --partial-dir=.rsync-partial /source/ user@server:/destination/
```
Issue 4: SSH Host Key Verification
Problem: SSH host key verification failures.
Solutions:
1. Add the host's key to known_hosts (verify the fingerprint through a trusted channel before relying on it):
```bash
ssh-keyscan -H server.com >> ~/.ssh/known_hosts
```
2. Temporarily disable host key checking (not recommended for production):
```bash
rsync -avzh -e "ssh -o StrictHostKeyChecking=no" /source/ user@server:/destination/
```
Best Practices and Security Considerations
Security Best Practices
1. Use SSH Keys: Always prefer key-based authentication over passwords
2. Restrict SSH Access: Configure SSH to only allow specific users and disable root login
3. Use Dedicated Keys: Create separate SSH keys for different synchronization tasks
4. Regular Key Rotation: Periodically rotate SSH keys for enhanced security
Operational Best Practices
1. Test with Dry Run: Always test with `--dry-run` before executing:
```bash
rsync -avzh --dry-run /source/ user@server:/destination/
```
2. Use Logging: Implement comprehensive logging for audit trails:
```bash
rsync -avzh --log-file=/var/log/rsync.log /source/ user@server:/destination/
```
3. Monitor Disk Space: Ensure adequate disk space on destination:
```bash
df -h /destination/path
```
4. Implement Error Handling: Create robust scripts with error handling:
```bash
#!/bin/bash
set -e # Exit on any error
rsync -avzh /source/ user@server:/destination/ || {
    echo "Sync failed at $(date)" | mail -s "Sync Failure" admin@company.com
    exit 1
}
```
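Error handling is more useful when the exit code is translated into something readable. The helper below is a hypothetical sketch that covers a handful of codes; the descriptions follow the EXIT VALUES section of the rsync man page, and anything not listed falls through to a generic message.

```bash
#!/bin/bash
# Map common rsync exit codes to short descriptions (helper name is hypothetical).
describe_rsync_exit() {
    case "$1" in
        0)  echo "success" ;;
        1)  echo "syntax or usage error" ;;
        12) echo "error in rsync protocol data stream" ;;
        23) echo "partial transfer due to error" ;;
        24) echo "partial transfer due to vanished source files" ;;
        30) echo "timeout in data send/receive" ;;
        *)  echo "other error (see 'man rsync', EXIT VALUES)" ;;
    esac
}

# Example (hypothetical paths); "|| code=$?" captures the status even under set -e:
# code=0
# rsync -avzh /source/ user@server:/destination/ || code=$?
# echo "rsync exited with $code: $(describe_rsync_exit "$code")"
```

Code 24 is a common candidate for special treatment, since files that vanish during a sync of a live directory are often harmless.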
Performance Optimization
1. Use Appropriate Compression: Balance compression level with CPU usage:
```bash
# Light compression (faster)
rsync -avz --compress-level=1 /source/ user@server:/destination/
# Heavy compression (slower but less bandwidth)
rsync -avz --compress-level=9 /source/ user@server:/destination/
```
2. Parallel Transfers: For multiple directories, use parallel rsync:
```bash
#!/bin/bash
dirs=("dir1" "dir2" "dir3")
for dir in "${dirs[@]}"; do
    rsync -avzh "/source/$dir/" "user@server:/destination/$dir/" &
done
wait # Wait for all background jobs to complete
```
3. Optimize for Network Conditions: Adjust based on network quality:
```bash
# For fast local networks where bandwidth exceeds disk speed:
# skip the delta-transfer algorithm and compression
rsync -avh --whole-file /source/ user@server:/destination/
# For low-bandwidth networks
rsync -avzh --compress-level=9 --bwlimit=500 /source/ user@server:/destination/
```
Monitoring and Logging
Comprehensive Logging Setup
Create detailed logging for rsync operations:
```bash
#!/bin/bash
# Advanced rsync with logging
LOG_DIR="/var/log/rsync"
LOG_FILE="$LOG_DIR/sync-$(date +%Y%m%d-%H%M%S).log"
ERROR_LOG="$LOG_DIR/errors.log"
mkdir -p "$LOG_DIR"

{
    echo "=== Sync started at $(date) ==="
    rsync -avzh \
        --stats \
        --log-file="$LOG_FILE" \
        /source/ \
        user@server:/destination/ 2>&1
    RSYNC_EXIT_CODE=$?
    echo "=== Sync completed at $(date) with exit code $RSYNC_EXIT_CODE ==="
    if [ "$RSYNC_EXIT_CODE" -ne 0 ]; then
        echo "ERROR: Sync failed with code $RSYNC_EXIT_CODE" >> "$ERROR_LOG"
    fi
} | tee -a "$LOG_FILE"
```
Monitoring Script
Create a monitoring script to track sync operations:
```bash
#!/bin/bash
# Rsync monitoring script
check_sync_status() {
    local log_file="$1"
    local max_age_hours=24

    if [ ! -f "$log_file" ]; then
        echo "WARNING: Log file not found"
        return 1
    fi

    # Check if log file is recent
    if [ -n "$(find "$log_file" -mmin +$((max_age_hours * 60)))" ]; then
        echo "WARNING: Last sync older than $max_age_hours hours"
        return 1
    fi

    # Check for errors in log
    if grep -Eq "ERROR|failed|denied" "$log_file"; then
        echo "ERROR: Sync errors detected in log"
        return 1
    fi

    echo "OK: Sync status normal"
    return 0
}

# Usage
check_sync_status "/var/log/rsync/latest.log"
```
Automation with Cron
Automate regular synchronization using cron jobs:
```bash
# Edit crontab
crontab -e
# Add entries for automated sync
# Daily backup at 2 AM
0 2 * * * /usr/local/bin/backup-sync.sh >> /var/log/cron-rsync.log 2>&1
# Hourly config sync
0 * * * * /usr/local/bin/config-sync.sh >> /var/log/cron-rsync.log 2>&1
```
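cron starts each job on schedule whether or not the previous run has finished, so a slow sync can overlap with the next one. One portable way to guard against that is a lock directory, since `mkdir` either creates the directory or fails atomically. The helper name `with_lock` and the lock path are illustrative assumptions.

```bash
#!/bin/bash
# Run a command only if no other run holds the lock (helper is hypothetical).
# mkdir is atomic, so only one process can create the lock directory.
# Usage: with_lock LOCK_DIR COMMAND [ARGS...]
with_lock() {
    local lock_dir="$1" status
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another run holds $lock_dir; skipping" >&2
        return 1
    fi
    "$@"
    status=$?
    rmdir "$lock_dir"
    return "$status"
}

# Example inside a sync script (hypothetical lock path, host, and paths):
# with_lock /tmp/backup-sync.lock rsync -avzh /source/ user@server:/destination/
```

If the script is killed before `rmdir` runs, the stale lock directory has to be removed by hand; on Linux, `flock` from util-linux is an alternative that releases the lock automatically.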
Advanced Use Cases
Incremental Backups with Hardlinks
Create space-efficient incremental backups:
```bash
#!/bin/bash
# Incremental backup script with hardlinks
BACKUP_SOURCE="/data/"
BACKUP_DEST="/backups"
DATE=$(date +%Y-%m-%d_%H-%M-%S)
LATEST_LINK="$BACKUP_DEST/latest"

# Create new backup directory
mkdir -p "$BACKUP_DEST/$DATE"

# Perform incremental backup; unchanged files become hard links
# into the previous snapshot instead of new copies
rsync -avzh \
    --delete \
    --link-dest="$LATEST_LINK" \
    "$BACKUP_SOURCE" \
    "$BACKUP_DEST/$DATE/"

# Update latest symlink
ln -sfn "$BACKUP_DEST/$DATE" "$LATEST_LINK"
echo "Backup completed: $BACKUP_DEST/$DATE"
```
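The space saving comes from `--link-dest`: a file that is unchanged since the previous snapshot is hard-linked rather than copied. The following sketch checks this on throwaway local directories, with hypothetical snapshot names `snap1` and `snap2`.

```bash
#!/bin/bash
# Local demonstration of --link-dest; all paths are throwaway temp directories.
root=$(mktemp -d)
mkdir -p "$root/data" "$root/backups"
echo "unchanged" > "$root/data/keep.txt"
echo "version 1" > "$root/data/edit.txt"

if command -v rsync >/dev/null 2>&1; then
    # First snapshot: a full copy
    rsync -a "$root/data/" "$root/backups/snap1/"

    # Change one file, then take a second snapshot against the first
    echo "version 2 with more text" > "$root/data/edit.txt"
    rsync -a --link-dest="$root/backups/snap1" "$root/data/" "$root/backups/snap2/"

    # -ef is true when both paths refer to the same inode (a hard link)
    [ "$root/backups/snap1/keep.txt" -ef "$root/backups/snap2/keep.txt" ] \
        && echo "keep.txt is shared between snapshots"
    [ "$root/backups/snap1/edit.txt" -ef "$root/backups/snap2/edit.txt" ] \
        || echo "edit.txt was stored again"
fi
```

Each snapshot still looks like a complete copy of the source, and deleting an old snapshot does not affect files that newer snapshots link to.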
Multi-Server Synchronization
Synchronize files across multiple servers:
```bash
#!/bin/bash
# Multi-server sync script
SOURCE="/data/shared/"
SERVERS_FILE="/etc/rsync-servers.conf"

while IFS= read -r server; do
    [ -z "$server" ] && continue  # skip blank lines
    echo "Syncing to $server..."
    # </dev/null keeps ssh from consuming the server list on stdin
    rsync -avzh \
        --timeout=300 \
        "$SOURCE" \
        "$server:/data/shared/" < /dev/null &
    # Limit concurrent syncs (wait -n requires bash 4.3 or later)
    if (( $(jobs -r | wc -l) >= 3 )); then
        wait -n  # Wait for any job to complete
    fi
done < "$SERVERS_FILE"

wait  # Wait for all remaining jobs
echo "Multi-server sync completed"
```
Conclusion
rsync is an incredibly powerful tool for file synchronization between servers, offering flexibility, efficiency, and reliability. Throughout this comprehensive guide, we've covered everything from basic syntax to advanced configurations, practical examples, and troubleshooting techniques.
Key Takeaways
1. Start Simple: Begin with basic rsync commands and gradually incorporate advanced features
2. Security First: Always use SSH key authentication and follow security best practices
3. Test Thoroughly: Use dry-run mode to test configurations before implementing
4. Monitor and Log: Implement comprehensive logging and monitoring for production systems
5. Automate Wisely: Use cron jobs and scripts for regular synchronization tasks
Next Steps
To further enhance your rsync skills:
1. Experiment with Different Options: Test various rsync flags to understand their impact
2. Implement Monitoring: Set up alerting for failed synchronizations
3. Optimize Performance: Fine-tune settings based on your network and storage characteristics
4. Explore Alternatives: Consider tools like rclone for cloud storage synchronization
5. Study Advanced Patterns: Learn about more complex synchronization scenarios
By mastering rsync, you'll have a robust solution for file synchronization that can handle everything from simple backups to complex multi-server deployments. Remember to always test your configurations in a safe environment before deploying to production systems.
The power of rsync lies not just in its feature set, but in its reliability and efficiency. With proper implementation and monitoring, rsync can become the backbone of your file synchronization infrastructure, ensuring data consistency and availability across your server environment.