How to Configure Keepalived in Linux
Keepalived is routing software that provides high availability and load balancing for Linux systems. Built around the Virtual Router Redundancy Protocol (VRRP), it enables automatic failover between servers so that a service stays reachable when an individual node fails. This guide walks through installing, configuring, and managing Keepalived on Linux.
Table of Contents
- [Introduction to Keepalived](#introduction-to-keepalived)
- [Prerequisites and Requirements](#prerequisites-and-requirements)
- [Installation Process](#installation-process)
- [Understanding Keepalived Configuration](#understanding-keepalived-configuration)
- [Step-by-Step Configuration](#step-by-step-configuration)
- [Practical Examples and Use Cases](#practical-examples-and-use-cases)
- [Testing and Validation](#testing-and-validation)
- [Troubleshooting Common Issues](#troubleshooting-common-issues)
- [Best Practices and Security](#best-practices-and-security)
- [Advanced Configuration Options](#advanced-configuration-options)
- [Monitoring and Maintenance](#monitoring-and-maintenance)
- [Conclusion](#conclusion)
Introduction to Keepalived
Keepalived is a framework that provides both high availability and load balancing for Linux-based infrastructure. It implements VRRP, which lets multiple routers or servers share a virtual IP address: one node acts as the master and holds the address, while the others stand by as backups.
Key Features and Benefits
High Availability: Keepalived ensures service continuity by automatically promoting backup servers to master status when failures occur. This seamless failover process minimizes downtime and maintains service accessibility.
Load Balancing: Keepalived configures and manages the Linux Virtual Server (LVS/IPVS) framework in the kernel, enabling load balancing across multiple backend servers.
Health Checking: Keepalived continuously monitors the health of services and servers, automatically removing failed components from the active pool and restoring them when they recover.
VRRP Implementation: The VRRP implementation manages virtual IP addresses and elects a single master among peers that can reach each other. If the peers lose contact, more than one node can claim the master role (see Split-Brain Scenarios under Troubleshooting).
Prerequisites and Requirements
Before beginning the Keepalived configuration process, ensure your environment meets the following requirements:
System Requirements
- Operating System: Linux distribution (Ubuntu 18.04+, CentOS 7+, RHEL 7+, Debian 9+)
- Kernel Version: Linux kernel 2.6 or higher
- Memory: Minimum 512MB RAM (1GB+ recommended for production)
- Network: Multiple network interfaces or VLAN support
- Root Access: Administrative privileges for installation and configuration
Network Prerequisites
- IP Address Planning: Dedicated virtual IP addresses for each service
- Network Segmentation: Proper VLAN or subnet configuration
- Firewall Rules: Configured to allow VRRP traffic (protocol 112)
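- Multicast Support: Network infrastructure supporting multicast communication (VRRP advertisements are sent to 224.0.0.18 by default), or unicast peers configured where multicast is unavailable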
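Where the network cannot carry multicast, as in many cloud networks, Keepalived can send advertisements directly to named peers instead. The fragment below is a sketch of that option, not a complete deployment. The addresses 192.168.1.10 (this node) and 192.168.1.11 (its peer) are assumptions chosen for illustration; the other settings mirror a regular multicast instance.

```bash
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    unicast_src_ip 192.168.1.10    # assumed address of this node
    unicast_peer {
        192.168.1.11               # assumed address of the other node
    }
    virtual_ipaddress {
        192.168.1.100/24
    }
}
```

With unicast, each node lists every other node under `unicast_peer`, so the peer list has to be kept in sync when nodes are added or removed.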
Software Dependencies
```bash
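# Essential packages for compilation (if building from source).
# Package names vary by distribution. Debian-style names are shown,
# except kernel-headers, which is the RHEL-family name.
gcc
make
autoconf
libnl-3-dev
libssl-dev
libpopt-dev
kernel-headers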
```
Installation Process
Installing from Package Repositories
Most modern Linux distributions include Keepalived in their official repositories, making installation straightforward.
Ubuntu/Debian Installation
```bash
# Update package repositories
sudo apt update

# Install Keepalived
sudo apt install keepalived

# Verify installation
keepalived --version
```
CentOS/RHEL Installation
```bash
# Install EPEL repository (only needed if your configured repositories do not provide keepalived)
sudo yum install epel-release

# Install Keepalived
sudo yum install keepalived

# For CentOS 8/RHEL 8 and later
sudo dnf install keepalived

# Verify installation
keepalived --version
```
Compiling from Source
For the latest features or custom configurations, you may choose to compile Keepalived from source:
```bash
# Download source code
wget https://www.keepalived.org/software/keepalived-2.2.8.tar.gz
tar -xzf keepalived-2.2.8.tar.gz
cd keepalived-2.2.8

# Configure compilation
./configure --prefix=/usr/local/keepalived

# Compile and install
make
sudo make install

# Create systemd service file
sudo cp /usr/local/keepalived/etc/systemd/keepalived.service /etc/systemd/system/
sudo systemctl daemon-reload
```
Understanding Keepalived Configuration
The Keepalived configuration file (`/etc/keepalived/keepalived.conf`) uses a structured format with three main sections:
Global Definitions
This section contains global parameters affecting the entire Keepalived instance:
```bash
global_defs {
    # Unique identifier for this Keepalived instance
    router_id LVS_DEVEL

    # Email notifications
    notification_email {
        admin@example.com
        support@example.com
    }
    notification_email_from keepalived@example.com
    smtp_server 192.168.1.1
    smtp_connect_timeout 30

    # Script execution user
    script_user root
    enable_script_security
}
```
VRRP Instance Configuration
VRRP instances define the high availability behavior:
```bash
vrrp_instance VI_1 {
    state MASTER              # Initial state (MASTER or BACKUP)
    interface eth0            # Network interface
    virtual_router_id 51      # Unique ID (1-255), identical on all peers
    priority 100              # Priority (higher = preferred master)
    advert_int 1              # Advertisement interval in seconds
    authentication {
        auth_type PASS
        auth_pass mypassword  # Only the first 8 characters are used
    }
    virtual_ipaddress {
        192.168.1.100/24
    }
}
```
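How quickly a backup notices a dead master follows from `advert_int` and the backup's own priority. The helper below is a sketch based on the VRRPv2 timers in RFC 3768 (skew time = (256 - priority) / 256 seconds, master-down interval = 3 × advert_int + skew time). The function name `master_down_interval` is made up for this example, and real failover time also includes health-check and gratuitous ARP delays.

```bash
#!/bin/bash
# master_down_interval ADVERT_INT PRIORITY
# Prints the time in seconds a backup waits without advertisements
# before it declares the master down (VRRPv2 formula).
master_down_interval() {
    LC_ALL=C awk -v adv="$1" -v prio="$2" \
        'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

master_down_interval 1 100   # prints 3.609
master_down_interval 1 254   # prints 3.008
```

A higher-priority backup has a smaller skew time, so it times out first and wins the election when several backups are waiting.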
Virtual Server Configuration
Virtual servers define load balancing behavior:
```bash
virtual_server 192.168.1.100 80 {
    delay_loop 6              # Health check interval
    lb_algo rr                # Load balancing algorithm
    lb_kind NAT               # Load balancing method
    persistence_timeout 50    # Session persistence
    protocol TCP              # Protocol type

    real_server 192.168.1.10 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3    # Older keyword; Keepalived 2.x prefers "retry"
            delay_before_retry 3
        }
    }
}
```
Step-by-Step Configuration
Step 1: Basic Network Setup
Before configuring Keepalived, ensure your network interfaces are properly configured:
```bash
# Check current network configuration
ip addr show

# Configure network interfaces (example for Ubuntu)
sudo nano /etc/netplan/01-network-manager-all.yaml

# Example netplan configuration (file contents, not shell commands)
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      addresses:
        - 192.168.1.10/24
      gateway4: 192.168.1.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]

# Apply network configuration
sudo netplan apply
```
Step 2: Create Master Server Configuration
Create the main configuration file for the master server:
```bash
sudo nano /etc/keepalived/keepalived.conf
```
```bash
# Master server configuration
global_defs {
    router_id MASTER_SERVER
    notification_email {
        admin@company.com
    }
    notification_email_from keepalived@company.com
    smtp_server localhost
    smtp_connect_timeout 30
    script_user root
    enable_script_security
}

# Health check script
vrrp_script chk_nginx {
    script "/usr/local/bin/check_nginx.sh"
    interval 2
    weight -20    # Must exceed the priority gap between the nodes (110 vs 100) to trigger failover
    fall 3
    rise 2
}

# VRRP instance for web service
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 110
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass SecurePassword123    # Only the first 8 characters are used
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0
    }
    track_script {
        chk_nginx
    }
    notify_master "/usr/local/bin/master.sh"
    notify_backup "/usr/local/bin/backup.sh"
    notify_fault "/usr/local/bin/fault.sh"
}
```
Step 3: Create Backup Server Configuration
Configure the backup server with lower priority:
```bash
# Backup server configuration
global_defs {
    router_id BACKUP_SERVER
    notification_email {
        admin@company.com
    }
    notification_email_from keepalived@company.com
    smtp_server localhost
    smtp_connect_timeout 30
    script_user root
    enable_script_security
}

vrrp_script chk_nginx {
    script "/usr/local/bin/check_nginx.sh"
    interval 2
    weight -20    # Must exceed the priority gap between the nodes (110 vs 100)
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100    # Lower priority than master
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass SecurePassword123    # Must match master
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0
    }
    track_script {
        chk_nginx
    }
    notify_master "/usr/local/bin/master.sh"
    notify_backup "/usr/local/bin/backup.sh"
    notify_fault "/usr/local/bin/fault.sh"
}
```
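For a tracked script to move the virtual IP, the priority drop it causes has to be larger than the gap between the two nodes' priorities (110 and 100 here). The sketch below models how a `track_script` weight changes the effective priority. The function name `effective_priority` and the "ok"/"fail" labels are invented for this illustration and are not part of Keepalived.

```bash
#!/bin/bash
# effective_priority BASE WEIGHT RESULT
# RESULT is "ok" or "fail" (the outcome of the tracked script).
effective_priority() {
    local base=$1 weight=$2 result=$3
    if [ "$weight" -lt 0 ] && [ "$result" = "fail" ]; then
        echo $(( base + weight ))    # negative weight: subtracted on failure
    elif [ "$weight" -gt 0 ] && [ "$result" = "ok" ]; then
        echo $(( base + weight ))    # positive weight: added on success
    else
        echo "$base"
    fi
}

# Master (110) with nginx down versus a healthy backup (100)
echo "weight -2:  master=$(effective_priority 110 -2 fail) backup=$(effective_priority 100 -2 ok)"
echo "weight -20: master=$(effective_priority 110 -20 fail) backup=$(effective_priority 100 -20 ok)"
```

With a weight of -2 the master still advertises 108 and keeps the address. With -20 it drops to 90, below the backup's 100, and the backup takes over.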
Step 4: Create Health Check Scripts
Develop custom health check scripts to monitor service availability:
```bash
# Create health check script
sudo nano /usr/local/bin/check_nginx.sh
```
```bash
#!/bin/bash
# Nginx health check script

# Check if nginx is running (-x matches the exact process name,
# so the check does not match this script's own name)
if pgrep -x nginx > /dev/null; then
    # Check if nginx responds to HTTP requests
    if curl -fs --max-time 2 http://localhost > /dev/null 2>&1; then
        exit 0    # Service is healthy
    else
        exit 1    # Service is not responding
    fi
else
    exit 1    # Service is not running
fi
```
```bash
# Make script executable
sudo chmod +x /usr/local/bin/check_nginx.sh
```
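The `fall` and `rise` settings in `vrrp_script` decide how many consecutive results are needed before the script's state flips. The simulation below is a simplified model of that behavior, assuming the script starts in the "up" state. `simulate_checks` is a made-up helper for illustration, not a Keepalived command.

```bash
#!/bin/bash
# simulate_checks FALL RISE RESULT...
# Each RESULT is 0 (check passed) or 1 (check failed), in order.
# Prints the script state after every check.
simulate_checks() {
    local fall=$1 rise=$2
    shift 2
    local state=up ok_run=0 fail_run=0 rc
    for rc in "$@"; do
        if [ "$rc" -eq 0 ]; then
            ok_run=$(( ok_run + 1 )); fail_run=0
            if [ "$state" = down ] && [ "$ok_run" -ge "$rise" ]; then state=up; fi
        else
            fail_run=$(( fail_run + 1 )); ok_run=0
            if [ "$state" = up ] && [ "$fail_run" -ge "$fall" ]; then state=down; fi
        fi
        echo "$state"
    done
}

# fall 3, rise 2: a single failed check is ignored, three in a row are not
simulate_checks 3 2 0 1 0 1 1 1 0 0 | paste -sd' ' -
# prints: up up up up up down down up
```

Larger `fall` values trade slower failure detection for fewer spurious failovers.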
Step 5: Create Notification Scripts
Implement notification scripts for state changes:
```bash
# Master notification script
sudo nano /usr/local/bin/master.sh
```
```bash
#!/bin/bash
# Actions to perform when becoming master
echo "$(date): Becoming MASTER" >> /var/log/keepalived-state.log

# Start services that should only run on master
systemctl start nginx
systemctl start mysql

# Update DNS or load balancer configuration
/usr/local/bin/update_dns.sh master

# Send notification
echo "Server $(hostname) is now MASTER" | mail -s "Keepalived State Change" admin@company.com
```
```bash
# Backup notification script
sudo nano /usr/local/bin/backup.sh
```
```bash
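#!/bin/bash
# Actions to perform when becoming backup
echo "$(date): Becoming BACKUP" >> /var/log/keepalived-state.log

# Stop services that should only run on master.
# Caution: if a track_script on this node checks nginx, stopping nginx here
# makes that check fail and lowers this node's priority, which can prevent
# it from taking over. Stop only services that are not tracked.
systemctl stop nginx
systemctl stop mysql

# Send notification
echo "Server $(hostname) is now BACKUP" | mail -s "Keepalived State Change" admin@company.com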
```
```bash
# Make scripts executable
# (fault.sh is referenced by notify_fault and must exist before this step)
sudo chmod +x /usr/local/bin/master.sh
sudo chmod +x /usr/local/bin/backup.sh
sudo chmod +x /usr/local/bin/fault.sh
```
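The configuration references `/usr/local/bin/fault.sh` through `notify_fault`, but this step shows only the master and backup scripts. One way to fill the gap is a shared logging helper that a minimal `fault.sh` could call. The sketch below assumes the fault script only needs to record the transition; `log_state` is a hypothetical name, and the demonstration writes to a temporary file rather than `/var/log`.

```bash
#!/bin/bash
# log_state STATE [LOG_FILE]
# Appends a timestamped state change to the log file.
log_state() {
    local state=$1
    local log_file=${2:-/var/log/keepalived-state.log}
    echo "$(date): Becoming $state on $(hostname)" >> "$log_file"
}

# A minimal fault.sh would contain the function above plus:
#   log_state FAULT

# Demonstration using a temporary file
tmp_log=$(mktemp)
log_state FAULT "$tmp_log"
cat "$tmp_log"
rm -f "$tmp_log"
```

Keeping the fault script this small is deliberate: a node in FAULT state may have a broken network or mail setup, so anything beyond local logging can hang or fail.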
Practical Examples and Use Cases
Example 1: Web Server High Availability
This example demonstrates setting up high availability for web servers using Nginx:
```bash
# Complete configuration for web server HA
global_defs {
    router_id WEB_HA_CLUSTER
    notification_email {
        webadmin@company.com
    }
    notification_email_from keepalived@web-cluster.company.com
    smtp_server mail.company.com
    smtp_connect_timeout 30
}

vrrp_script chk_nginx {
    script "/bin/bash -c 'curl -f http://localhost:80/ || exit 1'"
    interval 3
    timeout 3
    weight -20    # Must exceed the priority gap to the backup node
    fall 2
    rise 1
}

vrrp_instance WEB_SERVERS {
    state MASTER
    interface eth0
    virtual_router_id 100
    priority 110
    advert_int 1
    authentication {
        auth_type PASS
        # Only the first 8 characters are used; avoid "!" and "#",
        # which start comments in keepalived.conf
        auth_pass WebCluster2023
    }
    virtual_ipaddress {
        10.0.1.100/24 dev eth0
        10.0.1.101/24 dev eth0
    }
    track_script {
        chk_nginx
    }
    notify "/usr/local/bin/notify_state_change.sh"
}

# Load balancing configuration
virtual_server 10.0.1.100 80 {
    delay_loop 10
    lb_algo wrr
    lb_kind DR
    persistence_timeout 300
    protocol TCP

    real_server 10.0.1.10 80 {
        weight 3
        HTTP_GET {
            url {
                path /health
                status_code 200
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 2
        }
    }

    real_server 10.0.1.11 80 {
        weight 3
        HTTP_GET {
            url {
                path /health
                status_code 200
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 2
        }
    }
}
```
Example 2: Database High Availability
Configuration for database server failover:
```bash
# Database HA configuration
global_defs {
    router_id DB_HA_CLUSTER
    script_user root
    enable_script_security
}

vrrp_script chk_mysql {
    script "/usr/local/bin/check_mysql.sh"
    interval 5
    timeout 3
    weight -10    # Must exceed the priority gap to the backup node
    fall 2
    rise 1
}

vrrp_instance DATABASE {
    # preempt_delay only takes effect when the initial state is BACKUP;
    # the higher priority still makes this node the preferred master
    state BACKUP
    interface eth1
    virtual_router_id 200
    priority 120
    advert_int 1
    preempt_delay 300
    authentication {
        auth_type PASS
        # Only the first 8 characters are used; avoid "!" and "#"
        auth_pass DBCluster2023
    }
    virtual_ipaddress {
        192.168.10.100/24 dev eth1
    }
    track_script {
        chk_mysql
    }
    notify_master "/usr/local/bin/mysql_master.sh"
    notify_backup "/usr/local/bin/mysql_backup.sh"
    notify_stop "/usr/local/bin/mysql_stop.sh"
}
```
MySQL health check script:
```bash
#!/bin/bash
# MySQL health check script
MYSQL_USER="healthcheck"
MYSQL_PASS="password"
MYSQL_HOST="localhost"
MYSQL_PORT="3306"

# Test MySQL connectivity and basic functionality
if mysql -u"${MYSQL_USER}" -p"${MYSQL_PASS}" -h"${MYSQL_HOST}" -P"${MYSQL_PORT}" \
    -e "SELECT 1" > /dev/null 2>&1; then
    # Additional checks can be added here
    # Check replication status, disk space, etc.
    exit 0
else
    exit 1
fi
```
Testing and Validation
Initial Configuration Testing
Before deploying to production, thoroughly test your Keepalived configuration:
```bash
# Test configuration syntax
sudo keepalived -t -f /etc/keepalived/keepalived.conf

# Start Keepalived in the foreground with detailed logging to the console
sudo keepalived -n -l -D -f /etc/keepalived/keepalived.conf

# Check VRRP advertisements
sudo tcpdump -i eth0 -n vrrp

# Monitor system logs (Debian/Ubuntu path; use journalctl or /var/log/messages elsewhere)
sudo tail -f /var/log/syslog | grep -i keepalived
```
Failover Testing
Systematically test failover scenarios:
```bash
# Run each test separately and restore the node before starting the next one

# Test 1: Stop Keepalived service on master
sudo systemctl stop keepalived

# Test 2: Simulate network failure
sudo iptables -A INPUT -p vrrp -j DROP
sudo iptables -A OUTPUT -p vrrp -j DROP

# Restore network rules after Test 2
sudo iptables -D INPUT -p vrrp -j DROP
sudo iptables -D OUTPUT -p vrrp -j DROP

# Test 3: Simulate service failure
sudo systemctl stop nginx

# Test 4: Server reboot simulation
sudo reboot
```
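To put a number on each failover test, you can ping the virtual IP from a third host while the test runs and convert lost replies into an approximate outage. The helper below is a sketch that assumes the summary line format printed by Linux iputils `ping`; `estimate_outage` is a made-up name, and the result is only as precise as the ping interval.

```bash
#!/bin/bash
# estimate_outage INTERVAL_SECONDS
# Reads ping output on stdin and prints "<lost packets> <approx. seconds>".
estimate_outage() {
    LC_ALL=C awk -v interval="$1" '
        /packets transmitted/ {
            sent = $1
            for (i = 2; i <= NF; i++) if ($i == "received,") recv = $(i - 1)
            lost = sent - recv
            printf "%d %.1f\n", lost, lost * interval
        }'
}

# Usage during a failover test, run from a host outside the cluster:
#   ping -c 50 -i 0.2 192.168.1.100 | estimate_outage 0.2

# Demonstration with a captured summary line
echo "50 packets transmitted, 41 received, 18% packet loss, time 9871ms" | estimate_outage 0.2
# prints: 9 1.8
```

Using a fixed `-c` count lets `ping` finish on its own, so the summary line reaches the pipe without an interrupt.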
Monitoring Commands
Essential commands for monitoring Keepalived status:
```bash
# Check virtual IP assignment
ip addr show | grep -A 2 -B 2 "192.168.1.100"

# Monitor VRRP state
sudo journalctl -u keepalived -f

# Check process status
ps aux | grep keepalived

# Network connectivity testing
ping -c 3 192.168.1.100
telnet 192.168.1.100 80
```
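When scanning the journal by eye gets tedious, a small filter can report the last state a node entered. The sketch below assumes Keepalived logs transitions with the wording "Entering MASTER STATE", "Entering BACKUP STATE", or "Entering FAULT STATE". Check your own logs first, since message formats can differ between versions. `last_vrrp_state` is a hypothetical helper name.

```bash
#!/bin/bash
# last_vrrp_state
# Reads log lines on stdin and prints the most recent state entered.
last_vrrp_state() {
    grep -oE 'Entering (MASTER|BACKUP|FAULT) STATE' | tail -n 1 | awk '{ print $2 }'
}

# Usage on a live system:
#   sudo journalctl -u keepalived --no-pager | last_vrrp_state

# Demonstration with sample lines (the exact format is an assumption)
printf '%s\n' \
    'Keepalived_vrrp[812]: (VI_1) Entering BACKUP STATE' \
    'Keepalived_vrrp[812]: (VI_1) Entering MASTER STATE' | last_vrrp_state
# prints: MASTER
```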
Troubleshooting Common Issues
Issue 1: Split-Brain Scenarios
Symptoms: Multiple masters exist simultaneously, causing IP conflicts.
Diagnosis:
```bash
# Check for duplicate virtual IPs
ip addr show | grep "192.168.1.100"

# Monitor VRRP traffic on both servers
sudo tcpdump -i eth0 -n vrrp
```
Solutions:
- Verify network connectivity between VRRP peers
- Check firewall rules allowing VRRP traffic (protocol 112)
- Ensure consistent authentication passwords
- Review network switch configuration for multicast support
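Mismatched settings between peers are easy to miss when comparing two files by eye. The sketch below compares the values that have to agree on both nodes. It assumes one `vrrp_instance` per file and one setting per line. The function names and the file names in the usage comment are hypothetical.

```bash
#!/bin/bash
# vrrp_setting FILE KEY
# Prints the first value of KEY found in FILE.
vrrp_setting() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# compare_vrrp_peers FILE_A FILE_B
# Reports settings that must match between peers but do not.
# Values are not printed, so the password stays out of the output.
compare_vrrp_peers() {
    local key a b status=0
    for key in virtual_router_id advert_int auth_type auth_pass; do
        a=$(vrrp_setting "$1" "$key")
        b=$(vrrp_setting "$2" "$key")
        if [ "$a" != "$b" ]; then
            echo "MISMATCH $key"
            status=1
        fi
    done
    return "$status"
}

# Usage, with copies of both nodes' configuration in the current directory:
#   compare_vrrp_peers master.conf backup.conf && echo "peer settings agree"
```

The function returns a non-zero status on any mismatch, so it can also gate a deployment script.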
Issue 2: Service Not Starting
Symptoms: Keepalived fails to start or immediately stops.
Diagnosis:
```bash
# Check configuration syntax
sudo keepalived -t

# Review system logs
sudo journalctl -u keepalived -n 50

# Check file permissions
ls -la /etc/keepalived/keepalived.conf
```
Solutions:
- Fix configuration syntax errors
- Verify script permissions and paths
- Check SELinux/AppArmor policies
- Ensure required kernel modules are loaded
Issue 3: Health Check Failures
Symptoms: Frequent failovers or services marked as down incorrectly.
Diagnosis:
```bash
# Test health check script manually
sudo /usr/local/bin/check_nginx.sh
echo $?

# Review script execution logs
sudo tail -f /var/log/syslog | grep "check_nginx"
```
Solutions:
- Adjust health check intervals and thresholds
- Improve script error handling and logging
- Consider network latency in timeout values
- Implement more sophisticated health checks
Issue 4: Virtual IP Not Accessible
Symptoms: Virtual IP assigned but not reachable from network.
Diagnosis:
```bash
# Verify IP assignment
ip addr show eth0

# Check routing table
ip route show

# Test local connectivity
ping -c 1 -I eth0 192.168.1.100

# Check ARP table on other hosts
arp -a | grep 192.168.1.100
```
Solutions:
- Verify network interface configuration
- Check VLAN and subnet settings
- Review firewall rules on both servers and network
- Ensure proper gratuitous ARP configuration
Best Practices and Security
Security Considerations
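Authentication: VRRP PASS authentication sends the password in cleartext and uses only its first 8 characters, so treat it as a guard against misconfiguration rather than against attackers. Keepalived does not expand shell syntax such as `$(...)` in its configuration file, so generate a value separately and paste it in as a literal:
```bash
authentication {
    auth_type PASS
    # Paste a literal 8-character value, for example from: openssl rand -base64 6
    auth_pass Xk3vQ9pL
}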
```
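A value for `auth_pass` can be produced outside the configuration file and then pasted in on every peer. The sketch below is one way to do that with standard tools. It limits the output to 8 alphanumeric characters because only the first 8 characters are used and because `#` and `!` start comments in `keepalived.conf`. `generate_auth_pass` is a made-up helper name.

```bash
#!/bin/bash
# generate_auth_pass
# Prints 8 random alphanumeric characters.
generate_auth_pass() {
    # 256 random bytes leave far more than 8 characters after filtering
    head -c 256 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c1-8
}

generate_auth_pass
```

Run it once and copy the same value to all peers; generating it separately on each node would leave the passwords mismatched.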
Script Security: Implement proper script validation and permissions:
```bash
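# In keepalived.conf (global_defs): use a dedicated user for scripts
# (the keepalived_script user and group must already exist)
script_user keepalived_script
enable_script_security

# In the shell: set restrictive permissions
sudo chmod 750 /usr/local/bin/check_*.sh
sudo chown root:keepalived_script /usr/local/bin/check_*.sh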
```
Network Security: Configure firewall rules appropriately:
```bash
# Allow VRRP traffic between cluster members
sudo iptables -A INPUT -s 192.168.1.0/24 -p vrrp -j ACCEPT
sudo iptables -A OUTPUT -d 192.168.1.0/24 -p vrrp -j ACCEPT

# Allow health check traffic
sudo iptables -A INPUT -s 192.168.1.10,192.168.1.11 -p tcp --dport 80 -j ACCEPT
```
Performance Optimization
Resource Management: Configure appropriate resource limits:
```bash
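# Systemd service limits (drop-in file, for example created with: sudo systemctl edit keepalived)
[Service]
LimitNOFILE=65536
LimitNPROC=32768
# MemoryMax replaces the deprecated MemoryLimit on current systemd versions
MemoryMax=512M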
```
Network Tuning: Optimize network parameters:
```bash
# Increase network buffers (tee runs with sudo so the append to a root-owned file succeeds)
echo 'net.core.rmem_max = 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.core.wmem_max = 134217728' | sudo tee -a /etc/sysctl.conf

# Apply changes
sudo sysctl -p
```
Configuration Management
Version Control: Maintain configuration files in version control:
```bash
# Initialize git repository for configurations
cd /etc/keepalived
git init
git add keepalived.conf
git commit -m "Initial Keepalived configuration"
```
Backup Strategy: Implement regular configuration backups:
```bash
#!/bin/bash
# Backup script
DATE=$(date +%Y%m%d_%H%M%S)
cp /etc/keepalived/keepalived.conf /backup/keepalived_${DATE}.conf
find /backup -name "keepalived_*.conf" -mtime +30 -delete
```
Advanced Configuration Options
Multi-Instance Setup
Configure multiple VRRP instances for different services:
```bash
# Web service instance
vrrp_instance WEB_SERVICE {
    state MASTER
    interface eth0
    virtual_router_id 10
    priority 110
    virtual_ipaddress {
        10.0.1.100/24
    }
}

# Database service instance
vrrp_instance DB_SERVICE {
    state BACKUP
    interface eth1
    virtual_router_id 20
    priority 100
    virtual_ipaddress {
        10.0.2.100/24
    }
}
```
Advanced Load Balancing
Implement sophisticated load balancing with persistence:
```bash
virtual_server 10.0.1.100 443 {
    delay_loop 15
    lb_algo sh                       # Source hash for persistence
    lb_kind TUN                      # IP tunneling
    persistence_timeout 3600         # 1-hour session persistence
    persistence_granularity 255.255.255.0
    protocol TCP
    sorry_server 10.0.1.200 443      # Sorry server for maintenance

    real_server 10.0.1.10 443 {
        weight 100
        inhibit_on_failure
        SSL_GET {
            url {
                path /api/health
                status_code 200
            }
            connect_timeout 5
            connect_port 443
        }
    }
}
```
Integration with Monitoring Systems
Configure integration with external monitoring:
```bash
# Nagios/Icinga integration
vrrp_instance VI_1 {
    # ... standard configuration ...
    notify "/usr/local/bin/notify_monitoring.sh"
}
```
Notification script for monitoring integration:
```bash
#!/bin/bash
# Monitoring integration script
# Keepalived calls notify scripts with four arguments:
#   $1 = "GROUP" or "INSTANCE", $2 = name, $3 = new state, $4 = priority
TYPE=$1
NAME=$2
STATE=$3
PRIORITY=$4

case "$STATE" in
    "MASTER")
        # Update monitoring system
        curl -X POST "http://monitoring.company.com/api/update" \
            -d "host=$(hostname)&state=master&service=keepalived"
        ;;
    "BACKUP")
        curl -X POST "http://monitoring.company.com/api/update" \
            -d "host=$(hostname)&state=backup&service=keepalived"
        ;;
    "FAULT")
        curl -X POST "http://monitoring.company.com/api/alert" \
            -d "host=$(hostname)&state=fault&service=keepalived&priority=high"
        ;;
esac
```
Monitoring and Maintenance
Log Management
Configure comprehensive logging for troubleshooting:
```bash
# Rsyslog configuration for Keepalived
# (Keepalived logs to the daemon facility by default; start it with -S 0 to use local0)
echo "local0.* /var/log/keepalived.log" | sudo tee -a /etc/rsyslog.d/49-keepalived.conf
sudo systemctl restart rsyslog

# Logrotate configuration
sudo tee /etc/logrotate.d/keepalived > /dev/null << 'EOF'
/var/log/keepalived.log {
    daily
    missingok
    rotate 52
    compress
    delaycompress
    notifempty
    postrotate
        /bin/kill -HUP `cat /var/run/rsyslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}
EOF
```
Performance Monitoring
Implement monitoring for Keepalived performance:
```bash
#!/bin/bash
# Performance monitoring script

# Check memory usage
MEMORY_USAGE=$(ps -o pid,vsz,rss,comm -C keepalived --no-headers)
echo "Keepalived Memory Usage: $MEMORY_USAGE"

# Check file descriptors of the parent process
# (pgrep -o selects the oldest match; Keepalived also runs child processes)
PID=$(pgrep -o keepalived)
FD_COUNT=$(ls -1 "/proc/$PID/fd" | wc -l)
echo "File Descriptors: $FD_COUNT"

# Check network statistics (may be empty if the system exposes no VRRP counters here)
VRRP_PACKETS=$(netstat -s | grep -i vrrp)
echo "VRRP Statistics: $VRRP_PACKETS"
```
Automated Health Monitoring
Create comprehensive health monitoring:
```bash
#!/bin/bash
# Comprehensive health check script (run as root; tcpdump needs capture privileges)
KEEPALIVED_CONFIG="/etc/keepalived/keepalived.conf"

# Collect the addresses listed inside virtual_ipaddress { ... } blocks
mapfile -t VIRTUAL_IPS < <(awk '
    /virtual_ipaddress[[:space:]]*\{/ { in_block = 1; next }
    in_block && /\}/                  { in_block = 0 }
    in_block && $1 ~ /^[0-9]+\./      { sub(/\/.*/, "", $1); print $1 }
' "$KEEPALIVED_CONFIG")

# Check Keepalived process
if ! pgrep keepalived > /dev/null; then
    echo "CRITICAL: Keepalived process not running"
    exit 2
fi

# Check virtual IP assignment
for VIP in "${VIRTUAL_IPS[@]}"; do
    if ! ip addr show | grep -qF "inet $VIP/"; then
        echo "WARNING: Virtual IP $VIP not assigned"
    else
        echo "OK: Virtual IP $VIP assigned"
    fi
done

# Check VRRP advertisements
VRRP_COUNT=$(timeout 10 tcpdump -i any -n -c 5 vrrp 2>/dev/null | wc -l)
if [ "$VRRP_COUNT" -lt 3 ]; then
    echo "WARNING: Low VRRP advertisement count: $VRRP_COUNT"
else
    echo "OK: VRRP advertisements detected: $VRRP_COUNT"
fi

echo "Health check completed"
```
Conclusion
Keepalived provides a robust solution for implementing high availability and load balancing in Linux environments. Through proper configuration, testing, and monitoring, you can achieve reliable service continuity and automatic failover capabilities.
Key Takeaways
1. Proper Planning: Success with Keepalived begins with careful network planning and IP address allocation
2. Security First: Always implement strong authentication and secure script execution practices
3. Thorough Testing: Comprehensive testing of failover scenarios prevents production issues
4. Continuous Monitoring: Regular monitoring and maintenance ensure optimal performance
5. Documentation: Maintain detailed documentation of configurations and procedures
Next Steps
After implementing Keepalived, consider these additional enhancements:
- Integration with Configuration Management: Use tools like Ansible, Puppet, or Chef for configuration management
- Advanced Monitoring: Implement comprehensive monitoring with tools like Prometheus, Nagios, or Zabbix
- Disaster Recovery: Develop and test disaster recovery procedures
- Performance Tuning: Continuously optimize performance based on monitoring data
- Security Hardening: Regular security audits and updates
Additional Resources
- Official Documentation: [keepalived.org](https://keepalived.org)
- VRRP RFCs: RFC 3768 (VRRPv2) and RFC 5798 (VRRPv3) for detailed protocol specifications
- Linux Virtual Server: [linuxvirtualserver.org](http://www.linuxvirtualserver.org)
- Community Support: Keepalived mailing lists and forums
With the steps in this guide, you have a foundation for implementing and managing Keepalived in your Linux infrastructure. High availability is not just about technology; it also requires ongoing attention, monitoring, and maintenance to keep services reliable.