How to set up PostgreSQL replication in Linux

How to Set Up PostgreSQL Replication in Linux PostgreSQL replication is a critical feature that allows you to create and maintain multiple copies of your database across different servers. This comprehensive guide will walk you through the complete process of setting up PostgreSQL replication in Linux environments, covering everything from basic streaming replication to advanced configurations. Table of Contents 1. [Introduction to PostgreSQL Replication](#introduction) 2. [Prerequisites and Requirements](#prerequisites) 3. [Understanding Replication Types](#replication-types) 4. [Setting Up the Primary Server](#primary-server-setup) 5. [Configuring the Standby Server](#standby-server-setup) 6. [Testing and Verification](#testing-verification) 7. [Monitoring Replication](#monitoring-replication) 8. [Troubleshooting Common Issues](#troubleshooting) 9. [Best Practices and Security](#best-practices) 10. [Advanced Configurations](#advanced-configurations) 11. [Conclusion](#conclusion) Introduction to PostgreSQL Replication {#introduction} PostgreSQL replication enables you to maintain synchronized copies of your database across multiple servers, providing high availability, load distribution, and disaster recovery capabilities. This process involves a primary server (master) that handles write operations and one or more standby servers (replicas) that receive and apply changes from the primary. Replication offers several key benefits: - High Availability: Automatic failover capabilities ensure minimal downtime - Load Distribution: Read queries can be distributed across multiple servers - Disaster Recovery: Geographic distribution of data protects against site failures - Backup Operations: Perform backups on standby servers without impacting primary performance Prerequisites and Requirements {#prerequisites} Before beginning the PostgreSQL replication setup, ensure you have the following prerequisites in place: System Requirements - Operating System: Linux distribution (Ubuntu 18.04+, CentOS 7+, RHEL 7+, or Debian 9+) - PostgreSQL Version: PostgreSQL 10 or higher (this guide uses PostgreSQL 13) - Network Connectivity: Reliable network connection between primary and standby servers - Storage: Sufficient disk space on both servers (standby should have at least as much space as primary) Hardware Specifications - CPU: Multi-core processor recommended - RAM: Minimum 4GB, 8GB+ recommended for production - Network: Low-latency connection between servers - Storage: SSD storage recommended for optimal performance User Privileges - Root or sudo access on both servers - PostgreSQL superuser privileges - Network firewall configuration permissions Network Configuration Ensure the following ports are accessible between servers: - PostgreSQL port (default: 5432) - SSH port (default: 22) for administrative access Understanding Replication Types {#replication-types} PostgreSQL supports several replication methods: Streaming Replication The most common and recommended approach, where WAL (Write-Ahead Log) records are streamed in real-time from primary to standby servers. This method provides: - Near real-time synchronization - Minimal data loss potential - Efficient network utilization Logical Replication Allows selective replication of specific tables or databases, useful for: - Cross-version replication - Partial database synchronization - Data warehousing scenarios File-Based Log Shipping Traditional method using archived WAL files, suitable for: - High-latency networks - Batch-oriented replication requirements This guide focuses on streaming replication as it's the most widely used and efficient method. Setting Up the Primary Server {#primary-server-setup} Step 1: Install PostgreSQL First, install PostgreSQL on your primary server. For Ubuntu/Debian systems: ```bash Update package repository sudo apt update Install PostgreSQL sudo apt install postgresql postgresql-contrib Start and enable PostgreSQL service sudo systemctl start postgresql sudo systemctl enable postgresql ``` For CentOS/RHEL systems: ```bash Install PostgreSQL repository sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm Install PostgreSQL sudo yum install -y postgresql13-server postgresql13 Initialize database sudo /usr/pgsql-13/bin/postgresql-13-setup initdb Start and enable service sudo systemctl start postgresql-13 sudo systemctl enable postgresql-13 ``` Step 2: Configure PostgreSQL for Replication Edit the main PostgreSQL configuration file: ```bash sudo nano /etc/postgresql/13/main/postgresql.conf ``` Or for CentOS/RHEL: ```bash sudo nano /var/lib/pgsql/13/data/postgresql.conf ``` Add or modify the following parameters: ```conf Connection settings listen_addresses = '*' port = 5432 Replication settings wal_level = replica max_wal_senders = 10 max_replication_slots = 10 hot_standby = on WAL settings wal_keep_segments = 64 # For PostgreSQL < 13 wal_keep_size = 1GB # For PostgreSQL 13+ Archive settings (optional but recommended) archive_mode = on archive_command = 'cp %p /var/lib/postgresql/13/main/archive/%f' Logging settings log_replication_commands = on ``` Step 3: Configure Client Authentication Edit the host-based authentication file: ```bash sudo nano /etc/postgresql/13/main/pg_hba.conf ``` Add the following lines to allow replication connections: ```conf Replication connections host replication replicator 192.168.1.0/24 md5 host replication replicator standby_server_ip/32 md5 Replace 192.168.1.0/24 with your actual network range Replace standby_server_ip with your standby server's IP address ``` Step 4: Create Replication User Connect to PostgreSQL and create a dedicated replication user: ```bash sudo -u postgres psql ``` ```sql -- Create replication user CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'strong_password_here'; -- Grant necessary privileges ALTER USER replicator CREATEDB; -- Exit PostgreSQL \q ``` Step 5: Create Archive Directory Create a directory for WAL archive files: ```bash sudo mkdir -p /var/lib/postgresql/13/main/archive sudo chown postgres:postgres /var/lib/postgresql/13/main/archive sudo chmod 700 /var/lib/postgresql/13/main/archive ``` Step 6: Restart PostgreSQL Service Apply the configuration changes: ```bash sudo systemctl restart postgresql ``` Verify the service is running: ```bash sudo systemctl status postgresql ``` Configuring the Standby Server {#standby-server-setup} Step 1: Install PostgreSQL on Standby Server Follow the same installation steps as the primary server, ensuring you install the same PostgreSQL version. Step 2: Stop PostgreSQL Service ```bash sudo systemctl stop postgresql ``` Step 3: Remove Default Data Directory ```bash sudo rm -rf /var/lib/postgresql/13/main/* ``` Step 4: Create Base Backup from Primary Use `pg_basebackup` to create an initial copy of the primary database: ```bash sudo -u postgres pg_basebackup -h primary_server_ip -D /var/lib/postgresql/13/main -U replicator -P -v -R -X stream -C -S standby_slot ``` Parameters explanation: - `-h`: Primary server hostname or IP address - `-D`: Target data directory - `-U`: Replication user - `-P`: Show progress - `-v`: Verbose output - `-R`: Create standby.signal file and append connection settings - `-X stream`: Stream WAL files during backup - `-C`: Create replication slot - `-S`: Replication slot name Enter the replication user password when prompted. Step 5: Configure Standby Server The `-R` option in pg_basebackup automatically creates a `standby.signal` file and adds connection information to `postgresql.auto.conf`. Verify these settings: ```bash sudo -u postgres cat /var/lib/postgresql/13/main/standby.signal sudo -u postgres cat /var/lib/postgresql/13/main/postgresql.auto.conf ``` The `postgresql.auto.conf` should contain: ```conf primary_conninfo = 'user=replicator host=primary_server_ip port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any' primary_slot_name = 'standby_slot' ``` Step 6: Configure Additional Settings Edit the main PostgreSQL configuration file on the standby server: ```bash sudo nano /etc/postgresql/13/main/postgresql.conf ``` Ensure these settings are configured: ```conf Hot standby settings hot_standby = on max_standby_streaming_delay = 30s max_standby_archive_delay = 30s Connection settings listen_addresses = '*' port = 5432 Logging log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ' ``` Step 7: Start Standby Server ```bash sudo systemctl start postgresql sudo systemctl enable postgresql ``` Verify the service is running: ```bash sudo systemctl status postgresql ``` Testing and Verification {#testing-verification} Step 1: Verify Replication Status on Primary Connect to the primary server and check replication status: ```bash sudo -u postgres psql ``` ```sql -- Check replication slots SELECT slot_name, active, restart_lsn FROM pg_replication_slots; -- Check connected standby servers SELECT client_addr, state, sync_state FROM pg_stat_replication; -- View WAL sender processes SELECT pid, usename, application_name, client_addr, state, sync_state FROM pg_stat_replication; ``` Step 2: Verify Replication Status on Standby Connect to the standby server: ```bash sudo -u postgres psql ``` ```sql -- Check if server is in recovery mode SELECT pg_is_in_recovery(); -- Check replication status SELECT status, receive_start_lsn, received_lsn, last_msg_send_time, last_msg_receipt_time FROM pg_stat_wal_receiver; ``` Step 3: Test Data Replication On the primary server, create a test database and table: ```sql -- Create test database CREATE DATABASE test_replication; -- Connect to test database \c test_replication -- Create test table CREATE TABLE replication_test ( id SERIAL PRIMARY KEY, message TEXT, created_at TIMESTAMP DEFAULT NOW() ); -- Insert test data INSERT INTO replication_test (message) VALUES ('First test message'), ('Second test message'), ('Third test message'); -- View data SELECT * FROM replication_test; ``` On the standby server, verify the data has been replicated: ```sql -- Connect to test database \c test_replication -- View replicated data SELECT * FROM replication_test; ``` Step 4: Test Read-Only Nature of Standby Attempt to insert data on the standby server (this should fail): ```sql -- This should produce an error INSERT INTO replication_test (message) VALUES ('This should fail'); ``` Expected error: `ERROR: cannot execute INSERT in a read-only transaction` Monitoring Replication {#monitoring-replication} Monitoring Replication Lag Create a monitoring script to check replication lag: ```bash sudo nano /usr/local/bin/check_replication_lag.sh ``` ```bash #!/bin/bash Check replication lag on standby server STANDBY_LAG=$(sudo -u postgres psql -t -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()));" 2>/dev/null) if [ -z "$STANDBY_LAG" ]; then echo "ERROR: Unable to determine replication lag" exit 1 fi LAG_SECONDS=$(echo "$STANDBY_LAG" | tr -d ' ') echo "Replication lag: $LAG_SECONDS seconds" Alert if lag is greater than 60 seconds if (( $(echo "$LAG_SECONDS > 60" | bc -l) )); then echo "WARNING: Replication lag is high!" exit 1 fi echo "Replication lag is acceptable" exit 0 ``` Make the script executable: ```bash sudo chmod +x /usr/local/bin/check_replication_lag.sh ``` Setting Up Log Monitoring Configure PostgreSQL to log replication events by adding to `postgresql.conf`: ```conf Logging configuration log_min_messages = info log_checkpoints = on log_connections = on log_disconnections = on log_replication_commands = on ``` Creating Monitoring Queries Useful monitoring queries for the primary server: ```sql -- Monitor WAL generation rate SELECT pg_current_wal_lsn(), pg_walfile_name(pg_current_wal_lsn()); -- Check replication slot usage SELECT slot_name, active, restart_lsn, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as lag_size FROM pg_replication_slots; -- Monitor connected replicas SELECT client_addr, application_name, state, sync_state, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn)) as lag FROM pg_stat_replication; ``` Troubleshooting Common Issues {#troubleshooting} Issue 1: Connection Refused Errors Symptoms: Standby server cannot connect to primary server Solutions: 1. Check firewall settings: ```bash Ubuntu/Debian sudo ufw allow 5432/tcp CentOS/RHEL sudo firewall-cmd --permanent --add-port=5432/tcp sudo firewall-cmd --reload ``` 2. Verify `pg_hba.conf` configuration on primary server 3. Check `listen_addresses` in `postgresql.conf` 4. Test network connectivity: ```bash telnet primary_server_ip 5432 ``` Issue 2: Authentication Failures Symptoms: "FATAL: password authentication failed for user replicator" Solutions: 1. Verify replication user exists and has correct privileges: ```sql SELECT rolname, rolreplication FROM pg_roles WHERE rolname = 'replicator'; ``` 2. Check password in connection string 3. Verify `pg_hba.conf` entry allows the replication user from standby IP Issue 3: High Replication Lag Symptoms: Standby server falls behind primary server Solutions: 1. Check network bandwidth and latency 2. Increase `wal_keep_segments` or `wal_keep_size` 3. Monitor disk I/O on standby server 4. Consider using replication slots: ```sql -- On primary server SELECT pg_create_physical_replication_slot('standby_slot'); ``` Issue 4: WAL Segment Not Found Symptoms: "requested WAL segment has already been removed" Solutions: 1. Increase WAL retention settings: ```conf wal_keep_size = 2GB # Increase size ``` 2. Set up WAL archiving: ```conf archive_mode = on archive_command = 'cp %p /path/to/archive/%f' ``` 3. Use replication slots to prevent WAL removal Issue 5: Standby Server Won't Start Symptoms: PostgreSQL service fails to start on standby Solutions: 1. Check PostgreSQL logs: ```bash sudo tail -f /var/log/postgresql/postgresql-13-main.log ``` 2. Verify file permissions: ```bash sudo chown -R postgres:postgres /var/lib/postgresql/13/main/ ``` 3. Check if `standby.signal` file exists: ```bash ls -la /var/lib/postgresql/13/main/standby.signal ``` Diagnostic Commands Use these commands to diagnose replication issues: ```bash Check PostgreSQL process status ps aux | grep postgres Monitor PostgreSQL logs in real-time sudo tail -f /var/log/postgresql/postgresql-13-main.log Check network connections netstat -an | grep 5432 Verify disk space df -h Check system resources top iostat -x 1 ``` Best Practices and Security {#best-practices} Security Best Practices 1. Use Strong Passwords: Create complex passwords for replication users ```sql ALTER USER replicator PASSWORD 'ComplexPassword123!@#'; ``` 2. Enable SSL/TLS Encryption: ```conf In postgresql.conf ssl = on ssl_cert_file = 'server.crt' ssl_key_file = 'server.key' ``` Update connection string: ```conf primary_conninfo = 'user=replicator host=primary_server_ip port=5432 sslmode=require' ``` 3. Restrict Network Access: Use specific IP addresses in `pg_hba.conf` ```conf host replication replicator 192.168.1.100/32 md5 ``` 4. Regular Security Updates: Keep PostgreSQL and system packages updated ```bash sudo apt update && sudo apt upgrade ``` Performance Optimization 1. Tune WAL Settings: ```conf wal_buffers = 16MB wal_writer_delay = 200ms checkpoint_completion_target = 0.9 ``` 2. Optimize Network Settings: ```conf tcp_keepalives_idle = 600 tcp_keepalives_interval = 30 tcp_keepalives_count = 3 ``` 3. Configure Standby Settings: ```conf hot_standby_feedback = on max_standby_streaming_delay = 30s ``` Backup Strategies 1. Regular Base Backups: Schedule periodic full backups ```bash #!/bin/bash pg_basebackup -h primary_server_ip -D /backup/$(date +%Y%m%d) -U replicator -z -P ``` 2. WAL Archiving: Implement robust WAL archiving ```conf archive_command = 'rsync %p backup_server:/archive/%f' ``` 3. Point-in-Time Recovery: Configure PITR capabilities ```bash Recovery configuration restore_command = 'cp /archive/%f %p' recovery_target_time = '2023-12-01 12:00:00' ``` Maintenance Procedures 1. Regular Monitoring: Implement automated monitoring ```bash Add to crontab /5 * /usr/local/bin/check_replication_lag.sh ``` 2. Log Rotation: Configure log rotation ```bash Add to /etc/logrotate.d/postgresql /var/log/postgresql/*.log { daily rotate 7 compress delaycompress missingok notifempty create 640 postgres adm } ``` 3. Vacuum and Analyze: Regular maintenance on primary ```sql -- Schedule regular maintenance SELECT schemaname, tablename, last_vacuum, last_autovacuum FROM pg_stat_user_tables; ``` Advanced Configurations {#advanced-configurations} Synchronous Replication For zero data loss scenarios, configure synchronous replication: ```conf On primary server synchronous_standby_names = 'standby1,standby2' synchronous_commit = on ``` Cascading Replication Set up multi-tier replication: ```conf On intermediate standby server primary_conninfo = 'host=primary_server ...' Allow connections from downstream standbys ``` Logical Replication Setup Configure selective table replication: ```sql -- On primary server CREATE PUBLICATION my_publication FOR TABLE table1, table2; -- On subscriber CREATE SUBSCRIPTION my_subscription CONNECTION 'host=primary_server_ip dbname=mydb user=replicator' PUBLICATION my_publication; ``` Automatic Failover with Patroni Consider implementing Patroni for automatic failover: ```bash Install Patroni pip install patroni[etcd] Configure patroni.yml scope: postgres-cluster name: node1 ``` Load Balancing Read Queries Configure connection pooling with PgBouncer: ```ini pgbouncer.ini [databases] mydb = host=standby_server_ip port=5432 dbname=mydb [pgbouncer] pool_mode = transaction max_client_conn = 100 ``` Conclusion {#conclusion} Setting up PostgreSQL replication in Linux provides a robust foundation for high availability, load distribution, and disaster recovery. This comprehensive guide has covered the essential steps from basic streaming replication setup to advanced configurations and troubleshooting. Key Takeaways 1. Streaming replication is the most efficient and widely-used replication method 2. Proper configuration of both primary and standby servers is crucial for reliable replication 3. Monitoring and maintenance are essential for long-term success 4. Security considerations should be implemented from the beginning 5. Testing and verification ensure your replication setup works as expected Next Steps After successfully implementing PostgreSQL replication, consider these additional enhancements: 1. Implement automated failover using tools like Patroni or repmgr 2. Set up monitoring and alerting using Prometheus, Grafana, or similar tools 3. Configure backup strategies including point-in-time recovery 4. Optimize performance based on your specific workload requirements 5. Plan disaster recovery procedures and test them regularly Additional Resources - PostgreSQL Official Documentation: Replication - PostgreSQL Wiki: Replication Tutorials - Community forums and mailing lists for ongoing support - Professional PostgreSQL administration courses By following this guide and implementing the best practices outlined, you'll have a reliable PostgreSQL replication setup that can handle production workloads while providing the high availability and disaster recovery capabilities your applications require. Remember to regularly test your replication setup, monitor performance metrics, and stay updated with the latest PostgreSQL releases and security patches. Proper maintenance and monitoring will ensure your replication infrastructure continues to serve your needs effectively over time.