How to snapshot/replicate ZFS → zfs snapshot pool/fs@now; zfs send|receive

# How to Snapshot and Replicate ZFS Data Using zfs send and receive

## Table of Contents

1. [Introduction](#introduction)
2. [Prerequisites](#prerequisites)
3. [Understanding ZFS Snapshots](#understanding-zfs-snapshots)
4. [Basic Snapshot Creation](#basic-snapshot-creation)
5. [ZFS Send and Receive Fundamentals](#zfs-send-and-receive-fundamentals)
6. [Step-by-Step Replication Process](#step-by-step-replication-process)
7. [Advanced Replication Scenarios](#advanced-replication-scenarios)
8. [Incremental Snapshots and Replication](#incremental-snapshots-and-replication)
9. [Remote Replication Over SSH](#remote-replication-over-ssh)
10. [Troubleshooting Common Issues](#troubleshooting-common-issues)
11. [Best Practices and Performance Tips](#best-practices-and-performance-tips)
12. [Monitoring and Maintenance](#monitoring-and-maintenance)
13. [Conclusion](#conclusion)

## Introduction

ZFS (Zettabyte File System) provides powerful snapshot and replication capabilities that enable administrators to create point-in-time copies of data and efficiently transfer them between storage systems. The combination of `zfs snapshot`, `zfs send`, and `zfs receive` commands forms the foundation of ZFS backup and disaster recovery strategies.

This comprehensive guide will teach you how to master ZFS snapshot creation and replication, from basic single snapshots to complex incremental replication scenarios. You'll learn to protect your data, create efficient backup workflows, and implement robust disaster recovery solutions using ZFS's built-in capabilities.

Whether you're managing a single server or a complex multi-site infrastructure, understanding ZFS snapshot and replication techniques is essential for maintaining data integrity and ensuring business continuity.
## Prerequisites

Before diving into ZFS snapshot and replication, ensure you have:

### System Requirements

- A system running ZFS (OpenZFS, Oracle Solaris ZFS, or FreeBSD ZFS)
- Root or sudo privileges for ZFS operations
- Sufficient storage space for snapshots and replicas
- Network connectivity for remote replication scenarios

### Knowledge Requirements

- Basic understanding of ZFS pools and datasets
- Familiarity with command-line operations
- Understanding of filesystem concepts and data backup principles

### Verification Commands

```bash
# Verify ZFS installation
zfs version

# Check available pools
zpool list

# Verify dataset structure
zfs list
```

## Understanding ZFS Snapshots

ZFS snapshots are read-only, point-in-time copies of datasets. Because ZFS is copy-on-write, a new snapshot consumes almost no space; it grows only as the live dataset diverges from it. Snapshots provide several key benefits:

### Snapshot Characteristics

- Instantaneous creation: Snapshots are created immediately regardless of dataset size
- Space efficient: Only changed data consumes additional space
- Atomic consistency: Snapshots capture a consistent state of the entire dataset
- Persistent: Snapshots survive system reboots and remain until explicitly destroyed

### Snapshot Naming Convention

ZFS snapshots follow a specific naming pattern:

```
pool/dataset@snapshot_name
```

Where:

- `pool` is the ZFS pool name
- `dataset` is the dataset path (can include nested datasets)
- `@snapshot_name` is the user-defined snapshot identifier

## Basic Snapshot Creation

### Creating Your First Snapshot

The basic syntax for creating a ZFS snapshot is:

```bash
zfs snapshot pool/dataset@snapshot_name
```

### Practical Examples

```bash
# Snapshot the pool's root dataset only (add -r to include child datasets)
zfs snapshot mypool@backup-2024-01-15

# Create a snapshot of a specific dataset
zfs snapshot mypool/data@before-update

# Create a snapshot with timestamp
zfs snapshot mypool/home@$(date +%Y%m%d-%H%M%S)

# Create recursive snapshots (includes all child datasets)
zfs snapshot -r mypool/data@full-backup-2024-01-15
```

### Viewing Snapshots
```bash
# List all snapshots
zfs list -t snapshot

# List snapshots for a specific dataset
zfs list -t snapshot mypool/data

# Show snapshot details with space usage
zfs list -t snapshot -o name,used,refer,creation
```

### Managing Snapshot Properties

```bash
# View snapshot properties
zfs get all mypool/data@backup-2024-01-15

# Check space usage of snapshots
zfs list -t snapshot -o name,used,refer mypool/data
```

## ZFS Send and Receive Fundamentals

The `zfs send` and `zfs receive` commands work together to transfer ZFS datasets and snapshots between systems or storage locations. This mechanism forms the backbone of ZFS replication.

### How ZFS Send Works

`zfs send` creates a stream representation of a snapshot that can be:

- Piped to `zfs receive` on the same system
- Transferred over a network to another system
- Saved to a file for later restoration
- Compressed and encrypted during transfer (for example, by piping it through SSH)

### How ZFS Receive Works

`zfs receive` reconstructs a dataset from a ZFS send stream, creating:

- An exact replica of the original snapshot
- Dataset properties and metadata, when the stream includes them (`-p` or `-R`)
- A checksum-verified copy: the stream carries checksums, so a stream corrupted in transit causes the receive to fail

### Basic Send/Receive Syntax

```bash
# Basic local replication
zfs send pool/dataset@snapshot | zfs receive destination/dataset

# Send to a file
zfs send pool/dataset@snapshot > backup.zfs

# Receive from a file
zfs receive destination/dataset < backup.zfs
```

## Step-by-Step Replication Process

### Step 1: Prepare Source and Destination

First, ensure both source and destination systems are ready:

```bash
# On source system - verify dataset exists
zfs list mypool/data

# On destination system - verify pool exists
zpool list backuppool

# Create destination parent dataset if needed
zfs create backuppool/replica
```

### Step 2: Create Initial Snapshot

```bash
# Create the initial snapshot for replication
zfs snapshot mypool/data@initial-replica
```

### Step 3: Perform Initial Full Send

For a full send, the target dataset (`backuppool/replica/data` here) must not exist yet; `zfs receive` creates it.

```bash
# Local replication
zfs send mypool/data@initial-replica | zfs receive backuppool/replica/data

# Verify replication success
zfs list backuppool/replica/data
zfs list -t snapshot backuppool/replica/data
```

### Step 4: Verify Data Integrity

```bash
# The checksum property only names the checksum algorithm; it is not a hash of the data.
# To confirm both sides hold the same snapshot, compare snapshot GUIDs,
# which send/receive preserves
zfs get -H -o value guid mypool/data@initial-replica
zfs get -H -o value guid backuppool/replica/data@initial-replica

# Compare dataset properties
zfs get all mypool/data@initial-replica
zfs get all backuppool/replica/data@initial-replica
```

## Advanced Replication Scenarios

### Recursive Replication

For complex dataset hierarchies, use recursive replication:

```bash
# Create recursive snapshot
zfs snapshot -r mypool/data@full-backup-2024-01-15

# Recursive send/receive
# -F rolls the target back and, with -R streams, destroys target snapshots
# and datasets that do not exist on the source
zfs send -R mypool/data@full-backup-2024-01-15 | \
    zfs receive -F backuppool/replica/data
```

The `-R` flag includes:

- All descendant datasets
- All snapshots
- Dataset properties
- Clones and their relationships

### Compressed Replication

Reduce network bandwidth and storage requirements:

```bash
# Compressed send stream (blocks are sent as they are stored on disk, without decompressing)
zfs send -c mypool/data@backup | zfs receive backuppool/replica/data

# Raw send (preserves encryption and compression)
zfs send -w mypool/encrypted@backup | zfs receive backuppool/replica/encrypted
```

### Large Block Support

For datasets with large blocks:

```bash
# Enable large block support during send
zfs send -L mypool/data@backup | zfs receive backuppool/replica/data
```

## Incremental Snapshots and Replication

Incremental replication is crucial for efficient ongoing backups, transferring only the changes between snapshots.
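To make "only the changes between snapshots" concrete, the sketch below turns an ordered list of snapshot names into a chain of `zfs send -i` commands, one per consecutive pair. It is a minimal illustration under stated assumptions: `plan_incremental_sends` is a hypothetical helper name (not part of ZFS), the snapshot names are assumed to be supplied oldest first, and the function only prints commands instead of running them, so it can be tried on a machine without ZFS.

```bash
#!/usr/bin/env bash
# Sketch only: prints the commands instead of running them.
# plan_incremental_sends is a hypothetical helper name, not a ZFS command.
#
# Usage: plan_incremental_sends DATASET SNAP1 SNAP2 [SNAP3 ...]
# Snapshot names must be given oldest first.
plan_incremental_sends() {
    if [ "$#" -lt 3 ]; then
        echo "usage: plan_incremental_sends DATASET SNAP1 SNAP2 [SNAP3 ...]" >&2
        return 1
    fi

    local dataset="$1"
    shift
    local prev="$1"
    shift

    local next
    for next in "$@"; do
        # Each stream carries only the blocks that changed between prev and next.
        printf 'zfs send -i %s@%s %s@%s\n' "$dataset" "$prev" "$dataset" "$next"
        prev="$next"
    done
}
```

Calling `plan_incremental_sends mypool/data initial-replica incremental-2024-01-16` prints a single `zfs send -i` command for that pair. Each printed command would still need to be piped into `zfs receive` on the destination, in the same order, because every increment builds on the snapshot received before it.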
### Creating Incremental Snapshots

```bash
# Create new snapshot for incremental backup
zfs snapshot mypool/data@incremental-2024-01-16

# Perform incremental send
zfs send -i mypool/data@initial-replica mypool/data@incremental-2024-01-16 | \
    zfs receive backuppool/replica/data
```

### Advanced Incremental Scenarios

A transfer can only be resumed if the interrupted receive was started with `zfs receive -s`, which saves a resume token on the destination dataset.

```bash
# Get the resume token left behind by an interrupted "zfs receive -s"
TOKEN=$(zfs get -H -o value receive_resume_token backuppool/replica/data)

# Resume the interrupted send from that token
zfs send -t "$TOKEN" | zfs receive -s backuppool/replica/data
```

### Automated Incremental Backup Script

```bash
#!/bin/bash
# automated-backup.sh
set -euo pipefail

POOL="mypool"
DATASET="data"
BACKUP_POOL="backuppool"
DATE=$(date +%Y%m%d-%H%M%S)

# Create new snapshot
zfs snapshot "${POOL}/${DATASET}@auto-${DATE}"

# Find the most recent snapshot on destination (empty if the dataset does not exist yet)
LAST_SNAP=$(zfs list -H -t snapshot -o name -s creation "${BACKUP_POOL}/replica/${DATASET}" 2>/dev/null | \
    tail -1 | cut -d@ -f2 || true)

if [ -n "$LAST_SNAP" ]; then
    # Incremental send; assumes @${LAST_SNAP} still exists on the source
    echo "Performing incremental backup from @${LAST_SNAP}"
    zfs send -i "${POOL}/${DATASET}@${LAST_SNAP}" "${POOL}/${DATASET}@auto-${DATE}" | \
        zfs receive "${BACKUP_POOL}/replica/${DATASET}"
else
    # Full send (first time)
    echo "Performing initial full backup"
    zfs send "${POOL}/${DATASET}@auto-${DATE}" | \
        zfs receive "${BACKUP_POOL}/replica/${DATASET}"
fi

echo "Backup completed: ${POOL}/${DATASET}@auto-${DATE}"
```

## Remote Replication Over SSH

Remote replication enables off-site backups and disaster recovery capabilities.
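Whether replication is local or remote, an incremental send needs a base snapshot that exists on both sides. The backup script above takes the newest destination snapshot as the base and assumes it still exists on the source. A more defensive variant picks the newest snapshot name present on both systems. The sketch below is one way to do that, under stated assumptions: `latest_common_snapshot` is a hypothetical helper name, both arguments are newline-separated snapshot names (the part after `@`), and the source list is ordered oldest first. Matching by name is a simplification; comparing the snapshots' `guid` property would be stricter.

```bash
#!/usr/bin/env bash
# Sketch only: latest_common_snapshot is a hypothetical helper, not a ZFS command.
#
# Usage: latest_common_snapshot SOURCE_NAMES DEST_NAMES
#   SOURCE_NAMES  newline-separated snapshot names on the source, oldest first
#   DEST_NAMES    newline-separated snapshot names on the destination
# Prints the newest source snapshot name that also exists on the destination.
# Returns 1 if the two sides share no snapshot (a full send is needed).
latest_common_snapshot() {
    local src_names="$1" dst_names="$2"
    local snap common=""

    while IFS= read -r snap; do
        [ -n "$snap" ] || continue
        # -F: fixed string, -x: whole-line match, so "auto-1" does not match "auto-10"
        if printf '%s\n' "$dst_names" | grep -Fx -- "$snap" >/dev/null; then
            common="$snap"   # later (newer) matches overwrite earlier ones
        fi
    done <<EOF
$src_names
EOF

    [ -n "$common" ] || return 1
    printf '%s\n' "$common"
}
```

The two lists could be produced with the same listing command the scripts in this guide use, for example `zfs list -H -t snapshot -o name -s creation mypool/data | cut -d@ -f2` on the source and the equivalent command run through `ssh` on the destination. A non-zero return value signals that no common base exists and a full send is required.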
### SSH Setup for ZFS Replication

```bash
# Generate SSH key for automated backups
ssh-keygen -t ed25519 -f ~/.ssh/zfs-backup -N ""

# Copy public key to remote system
ssh-copy-id -i ~/.ssh/zfs-backup.pub backup-server

# Test SSH connection
ssh -i ~/.ssh/zfs-backup backup-server "zpool list"
```

### Remote Replication Commands

```bash
# Basic remote replication
zfs send mypool/data@backup | ssh backup-server "zfs receive backuppool/replica/data"

# Compressed remote replication
zfs send -c mypool/data@backup | ssh backup-server "zfs receive backuppool/replica/data"

# Remote incremental replication
zfs send -i mypool/data@old mypool/data@new | \
    ssh backup-server "zfs receive backuppool/replica/data"
```

### Remote Replication Script with Error Handling

```bash
#!/bin/bash
# remote-backup.sh
set -euo pipefail

SOURCE_POOL="mypool"
SOURCE_DATASET="data"
REMOTE_HOST="backup-server"
REMOTE_POOL="backuppool"
# Use $HOME: a quoted "~" is not expanded by the shell
SSH_KEY="${HOME}/.ssh/zfs-backup"
DATE=$(date +%Y%m%d-%H%M%S)

# Function for logging
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a /var/log/zfs-backup.log
}

# Create snapshot
log "Creating snapshot: ${SOURCE_POOL}/${SOURCE_DATASET}@${DATE}"
zfs snapshot "${SOURCE_POOL}/${SOURCE_DATASET}@${DATE}"

# Get the newest snapshot on the remote side (sorted by creation time)
LAST_SNAP=$(ssh -i "${SSH_KEY}" "${REMOTE_HOST}" \
    "zfs list -H -t snapshot -o name -s creation ${REMOTE_POOL}/replica/${SOURCE_DATASET} 2>/dev/null | tail -1" || echo "")

if [ -n "$LAST_SNAP" ]; then
    LAST_SNAP_NAME=$(echo "$LAST_SNAP" | cut -d@ -f2)
    log "Performing incremental backup from @${LAST_SNAP_NAME}"
    zfs send -i "${SOURCE_POOL}/${SOURCE_DATASET}@${LAST_SNAP_NAME}" \
        "${SOURCE_POOL}/${SOURCE_DATASET}@${DATE}" | \
        ssh -i "${SSH_KEY}" "${REMOTE_HOST}" \
        "zfs receive ${REMOTE_POOL}/replica/${SOURCE_DATASET}"
else
    log "Performing initial full backup"
    zfs send "${SOURCE_POOL}/${SOURCE_DATASET}@${DATE}" | \
        ssh -i "${SSH_KEY}" "${REMOTE_HOST}" \
        "zfs receive ${REMOTE_POOL}/replica/${SOURCE_DATASET}"
fi

log "Backup completed successfully"
```

## Troubleshooting Common Issues

### Permission and Privilege Issues

Problem: "Permission denied" errors during replication

```bash
# Solution: Ensure proper ZFS permissions
zfs allow -u backup-user send,receive,create,mount mypool
zfs allow -u backup-user send,receive,create,mount backuppool
```

Problem: SSH authentication failures

```bash
# Debug SSH connection
ssh -v -i ~/.ssh/zfs-backup backup-server

# Check SSH key permissions
chmod 600 ~/.ssh/zfs-backup
chmod 644 ~/.ssh/zfs-backup.pub
```

### Snapshot and Replication Errors

Problem: "Dataset already exists" error

```bash
# Solution: Use force flag (rolls back the destination and discards its changes)
zfs receive -F destination/dataset

# Or rename existing dataset first
zfs rename destination/dataset destination/dataset-old
```

Problem: "Stream is not supported" error

```bash
# Check ZFS versions on both systems
zfs version

# Use send options the receiver supports; an older receiver may reject
# streams produced with -c, -L, or -w
zfs send -p mypool/data@snapshot  # Include properties
zfs send -R mypool/data@snapshot  # Recursive send
```

Problem: Interrupted transfers

```bash
# Check for a resume token (present only if the receive used -s)
TOKEN=$(zfs get -H -o value receive_resume_token destination/dataset)

# Resume interrupted receive
zfs send -t "$TOKEN" | zfs receive -s destination/dataset
```

### Space and Performance Issues

Problem: Running out of space during replication

```bash
# Monitor space usage during transfer
watch 'zpool list; zfs list -o space'

# Clean up old snapshots
zfs destroy mypool/data@old-snapshot

# Keep blocks compressed in the stream; space used on the destination
# still depends on the destination's own compression setting
zfs send -c mypool/data@snapshot | zfs receive destination/dataset
```

Problem: Slow replication performance

```bash
# Monitor transfer progress
zfs send mypool/data@snapshot | pv | zfs receive destination/dataset

# Use mbuffer for better performance
zfs send mypool/data@snapshot | mbuffer -s 128k -m 1G | zfs receive destination/dataset

# Enable compression
zfs send -c mypool/data@snapshot | zfs receive destination/dataset
```

### Network and Connectivity Issues

Problem: Network timeouts during remote replication

```bash
# Use SSH keepalive settings
ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=3 backup-server

# Add to ~/.ssh/config
Host backup-server
    ServerAliveInterval 60
    ServerAliveCountMax 3
    Compression yes
```

## Best Practices and Performance Tips

### Snapshot Management Best Practices

#### 1. Consistent Naming Convention

```bash
# Use descriptive, sortable names
zfs snapshot mypool/data@$(hostname)-$(date +%Y%m%d-%H%M%S)
zfs snapshot mypool/data@manual-before-upgrade-2024-01-15
```

#### 2. Automated Cleanup

```bash
#!/bin/bash
# Script to retain only last 7 daily snapshots
# (head -n -7 requires GNU coreutils)
zfs list -H -t snapshot -o name -s creation mypool/data | \
    grep "@daily-" | head -n -7 | \
    while read -r snap; do
        echo "Destroying old snapshot: $snap"
        zfs destroy "$snap"
    done
```

#### 3. Snapshot Scheduling

```bash
# Add to crontab for automated snapshots
# Daily at 2 AM
0 2 * * * /usr/local/bin/zfs-snapshot.sh daily

# Hourly during business hours
0 9-17 * * 1-5 /usr/local/bin/zfs-snapshot.sh hourly
```

### Replication Performance Optimization

#### 1. Use Compression

```bash
# Enable compression on datasets
zfs set compression=lz4 mypool/data

# Use compressed send streams
zfs send -c mypool/data@snapshot
```

#### 2. Optimize Network Settings

```bash
# Increase SSH cipher performance
ssh -c aes128-ctr backup-server

# Compress the stream on multiple CPU cores with pigz for large datasets
zfs send mypool/data@snap | pigz | ssh backup-server "pigz -d | zfs receive dest/data"
```

#### 3. Monitor and Tune

```bash
# Monitor replication progress
zfs send -v mypool/data@snapshot | pv | zfs receive dest/data

# Check ARC and L2ARC statistics (Linux path)
cat /proc/spl/kstat/zfs/arcstats
```

### Security Considerations

#### 1. Encryption in Transit

```bash
# Use SSH for all remote transfers
zfs send mypool/data@snap | ssh -C backup-server "zfs receive dest/data"

# For additional security, use VPN or dedicated network
```

#### 2. Access Control

```bash
# Limit ZFS permissions for backup users
zfs allow -u backup-user send mypool/data
zfs allow -u backup-user receive,create,mount backuppool/replica
```

#### 3. Audit and Logging

```bash
# Log send progress messages (stderr only; the stream itself is binary
# data on stdout and must not be written to the log)
zfs send -v mypool/data@snap 2>> /var/log/zfs-backup.log | \
    ssh backup-server "zfs receive dest/data"
```

## Monitoring and Maintenance

### Health Monitoring

```bash
# Check pool health
zpool status -v

# Monitor snapshot space usage
zfs list -t snapshot -o name,used,refer

# Check replication lag (compare the newest snapshot on each side)
SOURCE_SNAP=$(zfs list -H -t snapshot -o name,creation -s creation mypool/data | tail -1)
DEST_SNAP=$(ssh backup-server "zfs list -H -t snapshot -o name,creation -s creation backuppool/replica/data | tail -1")
echo "Source: $SOURCE_SNAP"
echo "Destination: $DEST_SNAP"
```

### Automated Maintenance Tasks

```bash
#!/bin/bash
# weekly-maintenance.sh
# Weekly maintenance script

# Scrub pools
zpool scrub mypool
ssh backup-server "zpool scrub backuppool"

# Clean old snapshots (keep last 30 days); date -d requires GNU date
CUTOFF_DATE=$(date -d "30 days ago" +%Y%m%d)
zfs list -H -t snapshot -o name mypool/data | \
    while read -r snap; do
        # Take the first 8-digit run in the snapshot name as its YYYYMMDD date
        SNAP_DATE=$(echo "${snap#*@}" | grep -o '[0-9]\{8\}' | head -1)
        # Skip snapshots without a date so they are never destroyed by accident
        if [[ -n "$SNAP_DATE" && "$SNAP_DATE" < "$CUTOFF_DATE" ]]; then
            echo "Destroying old snapshot: $snap"
            zfs destroy "$snap"
        fi
    done

# Verify recent backups
RECENT_BACKUP=$(ssh backup-server "zfs list -H -t snapshot -o creation -s creation backuppool/replica/data | tail -1")
echo "Most recent backup: $RECENT_BACKUP"
```

### Disaster Recovery Testing

```bash
#!/bin/bash
# test-restore.sh
# Test restore procedure
set -euo pipefail

TEST_POOL="testpool"
BACKUP_SNAP="backuppool/replica/data@latest"

# Create test environment
zpool create "$TEST_POOL" /dev/disk/by-id/test-disk

# Restore from backup
ssh backup-server "zfs send $BACKUP_SNAP" | zfs receive "$TEST_POOL/restored-data"

# Verify the restore: the dataset exists and its snapshot GUID matches the backup's
# (the checksum property only names the algorithm, so it cannot verify data)
zfs list "$TEST_POOL/restored-data"
ssh backup-server "zfs get -H -o value guid $BACKUP_SNAP"
zfs get -H -o value guid "$TEST_POOL/restored-data@${BACKUP_SNAP#*@}"

# Cleanup
zpool destroy "$TEST_POOL"
echo "Disaster recovery test completed successfully"
```

## Conclusion

ZFS snapshot and replication capabilities provide a robust foundation for data protection and disaster recovery strategies.
By mastering the `zfs snapshot`, `zfs send`, and `zfs receive` commands, you can implement efficient backup solutions that scale from single systems to complex multi-site infrastructures.

Key takeaways from this guide include:

- Snapshot Creation: Use consistent naming conventions and automate snapshot creation for regular data protection
- Replication Strategies: Implement both full and incremental replication to balance storage efficiency and recovery objectives
- Remote Backup: Leverage SSH-based remote replication for off-site data protection
- Performance Optimization: Use compression, proper network configuration, and monitoring to optimize replication performance
- Maintenance: Establish regular maintenance routines including snapshot cleanup, health monitoring, and disaster recovery testing

The combination of ZFS's copy-on-write architecture, efficient snapshot mechanisms, and powerful send/receive capabilities creates a comprehensive data protection solution. Whether you're protecting critical business data, implementing compliance requirements, or ensuring disaster recovery capabilities, ZFS replication provides the tools necessary for enterprise-grade data management.

As you implement these techniques in your environment, remember to start with simple scenarios and gradually build complexity. Regular testing of your backup and recovery procedures ensures that your data protection strategy will perform when needed most. The investment in understanding and implementing ZFS replication will pay dividends in data security, operational efficiency, and peace of mind.

Continue exploring advanced ZFS features such as encryption, deduplication, and automated management tools to further enhance your data protection capabilities. The robust ecosystem around ZFS ensures that your replication infrastructure can evolve with your organization's growing needs.