How to Configure Galera Cluster on Linux
Introduction
Galera Cluster is a powerful synchronous multi-master database clustering solution that provides high availability, data consistency, and automatic failover for MySQL and MariaDB databases. Unlike traditional master-slave replication, Galera enables true multi-master replication where all nodes can accept both read and write operations simultaneously, ensuring data remains consistent across all cluster members.
In this comprehensive guide, you will learn how to configure a robust Galera Cluster on Linux systems, understand the underlying concepts, implement best practices, and troubleshoot common issues. Whether you're a database administrator seeking high availability solutions or a developer working with distributed applications, this tutorial will provide you with the knowledge needed to successfully deploy and manage Galera Cluster in production environments.
What You'll Learn
By the end of this article, you will understand:
- Core concepts and architecture of Galera Cluster
- Prerequisites and system requirements
- Step-by-step installation and configuration process
- Network and firewall configuration
- Cluster initialization and node joining procedures
- Monitoring and maintenance techniques
- Troubleshooting common issues and performance optimization
Prerequisites and Requirements
System Requirements
Before configuring Galera Cluster, ensure your environment meets the following requirements:
Hardware Requirements:
- Minimum 3 nodes (odd number recommended for split-brain prevention)
- At least 2GB RAM per node (4GB+ recommended for production)
- Sufficient disk space with good I/O performance (SSD recommended)
- Reliable network connectivity between nodes with low latency
Software Requirements:
- Linux distribution (Ubuntu 18.04+, CentOS 7+, RHEL 7+, or Debian 9+)
- MariaDB 10.1+ (Galera support is built in) or a Galera-enabled MySQL 5.7+ build
- Root or sudo access on all nodes
- Network Time Protocol (NTP) for time synchronization
Network Requirements:
- Static IP addresses for all cluster nodes
- Open ports: 3306 (MySQL), 4567 (Galera replication), 4568 (IST), 4444 (SST)
- Low network latency between nodes (preferably < 5ms)
- Sufficient bandwidth for data synchronization
Planning Your Cluster
Before implementation, plan your cluster architecture:
```
Node 1 (bootstrap node): 192.168.1.10
Node 2: 192.168.1.11
Node 3: 192.168.1.12
```
Choose descriptive hostnames and ensure DNS resolution or proper `/etc/hosts` entries exist.
Step-by-Step Installation and Configuration
Step 1: Prepare the Environment
Configure Hostnames and DNS Resolution
On each node, set up proper hostname resolution:
```bash
# Edit /etc/hosts on all nodes
sudo nano /etc/hosts
# Add entries for all cluster nodes
192.168.1.10 galera-node1
192.168.1.11 galera-node2
192.168.1.12 galera-node3
```
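After saving the file, confirm that each node can resolve and reach its peers. A quick check from any node:
```bash
# Verify that every cluster hostname resolves to the expected address
for host in galera-node1 galera-node2 galera-node3; do
    getent hosts "$host"
done
# Spot-check reachability and latency to a peer
ping -c 3 galera-node2
```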
Synchronize System Time
Install and configure NTP to ensure time synchronization:
```bash
# Ubuntu/Debian
sudo apt update
sudo apt install ntp
# CentOS/RHEL
sudo yum install ntp
# Start and enable the NTP service (the unit is named "ntpd" on CentOS/RHEL)
sudo systemctl start ntp
sudo systemctl enable ntp
# Verify time synchronization
ntpq -p
```
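Note that newer distributions often ship chrony or systemd-timesyncd instead of the classic ntp package. If that applies to your systems, you can verify synchronization without installing ntp:
```bash
# Works on any systemd-based distribution
timedatectl status
# If chrony is the active time service
chronyc tracking
```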
Configure Firewall Rules
Open necessary ports for Galera communication:
```bash
# Ubuntu/Debian (UFW)
sudo ufw allow 3306
sudo ufw allow 4567
sudo ufw allow 4568
sudo ufw allow 4444
# CentOS/RHEL (firewalld)
sudo firewall-cmd --permanent --add-port=3306/tcp
sudo firewall-cmd --permanent --add-port=4567/tcp
sudo firewall-cmd --permanent --add-port=4568/tcp
sudo firewall-cmd --permanent --add-port=4444/tcp
sudo firewall-cmd --reload
```
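Rather than opening these ports to the world, you can limit them to the cluster subnet. A sketch with UFW, assuming the 192.168.1.0/24 network from the planning section (firewalld offers equivalent rich rules):
```bash
# Allow Galera-related ports only from the cluster subnet
for port in 3306 4567 4568 4444; do
    sudo ufw allow from 192.168.1.0/24 to any port "$port" proto tcp
done
```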
Step 2: Install MariaDB with Galera Support
Ubuntu/Debian Installation
```bash
# Update the package repository
sudo apt update
# Install MariaDB server and Galera packages
sudo apt install mariadb-server mariadb-client galera-4
# Stop the MariaDB service (we'll configure it before starting)
sudo systemctl stop mariadb
```
CentOS/RHEL Installation
```bash
# Option 1: install from the distribution repositories
# (CentOS 7 base repositories may ship a MariaDB build without Galera support)
sudo yum install mariadb-server mariadb galera
# Option 2 (recommended): add the official MariaDB repository
curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash
sudo yum install MariaDB-server MariaDB-client galera-4
# Stop the MariaDB service
sudo systemctl stop mariadb
```
Step 3: Configure Galera Cluster Settings
Create Galera Configuration File
Create or modify the Galera configuration file on each node (the path below is for Ubuntu/Debian; on CentOS/RHEL use `/etc/my.cnf.d/galera.cnf`):
```bash
sudo nano /etc/mysql/mariadb.conf.d/60-galera.cnf
```
Configuration for Node 1 (the bootstrap node):
```ini
[galera]
# Mandatory settings
wsrep_on=ON
wsrep_cluster_name="production_cluster"
wsrep_cluster_address="gcomm://192.168.1.10,192.168.1.11,192.168.1.12"
wsrep_node_address="192.168.1.10"
wsrep_node_name="galera-node1"
# On CentOS/RHEL the provider is typically /usr/lib64/galera-4/libgalera_smm.so
wsrep_provider="/usr/lib/galera/libgalera_smm.so"
# Replication settings
wsrep_sst_method=rsync
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
# Performance optimizations
wsrep_slave_threads=4
wsrep_certify_nonPK=1
wsrep_max_ws_rows=0
wsrep_max_ws_size=2147483647
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
# Optional: SSL for replication traffic (requires the certificates to exist;
# see the SSL/TLS Encryption section below)
# wsrep_provider_options="socket.ssl_key=/etc/mysql/server-key.pem;socket.ssl_cert=/etc/mysql/server-cert.pem;socket.ssl_ca=/etc/mysql/ca-cert.pem"

[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

[mysqld]
bind-address=0.0.0.0
```
Configuration for Node 2:
```ini
[galera]
wsrep_on=ON
wsrep_cluster_name="production_cluster"
wsrep_cluster_address="gcomm://192.168.1.10,192.168.1.11,192.168.1.12"
wsrep_node_address="192.168.1.11"
wsrep_node_name="galera-node2"
wsrep_provider="/usr/lib/galera/libgalera_smm.so"
# Copy the remaining settings from the Node 1 configuration
# (identical apart from wsrep_node_address and wsrep_node_name)
```
Configuration for Node 3:
```ini
[galera]
wsrep_on=ON
wsrep_cluster_name="production_cluster"
wsrep_cluster_address="gcomm://192.168.1.10,192.168.1.11,192.168.1.12"
wsrep_node_address="192.168.1.12"
wsrep_node_name="galera-node3"
wsrep_provider="/usr/lib/galera/libgalera_smm.so"
# Copy the remaining settings from the Node 1 configuration
# (identical apart from wsrep_node_address and wsrep_node_name)
```
Step 4: Initialize the Cluster
Bootstrap the First Node
Initialize the cluster by starting the first node in bootstrap mode:
```bash
# On Node 1 only
sudo galera_new_cluster
# Verify the service is running
sudo systemctl status mariadb
# Check cluster status
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep%';"
```
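At this stage the cluster consists of a single member, so the key status variables should read as follows. A quick verification:
```bash
# Expected right after bootstrapping the first node:
#   wsrep_cluster_size         1
#   wsrep_cluster_status       Primary
#   wsrep_local_state_comment  Synced
mysql -u root -p -e "SHOW STATUS WHERE Variable_name IN
  ('wsrep_cluster_size','wsrep_cluster_status','wsrep_local_state_comment');"
```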
Secure the Installation
Run the security script on the first node:
```bash
sudo mysql_secure_installation
```
Follow the prompts to:
- Set root password
- Remove anonymous users
- Disable remote root login
- Remove test database
- Reload privilege tables
Create Cluster User
Create a dedicated user for cluster operations:
```sql
-- Connect first with: mysql -u root -p
CREATE USER 'cluster_user'@'%' IDENTIFIED BY 'secure_password';
GRANT ALL PRIVILEGES ON *.* TO 'cluster_user'@'%';
FLUSH PRIVILEGES;
EXIT;
```
Step 5: Join Additional Nodes
Start Node 2
```bash
# On Node 2
sudo systemctl start mariadb
sudo systemctl enable mariadb
# Verify cluster membership
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
```
Start Node 3
```bash
# On Node 3
sudo systemctl start mariadb
sudo systemctl enable mariadb
# Verify all nodes are connected
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep%';" | grep -E "(cluster_size|local_state_comment|ready)"
```
Step 6: Verify Cluster Operation
Test Data Synchronization
Create a test database on Node 1:
```sql
-- On Node 1 (connect with: mysql -u root -p)
CREATE DATABASE test_replication;
USE test_replication;
CREATE TABLE test_table (
    id INT AUTO_INCREMENT PRIMARY KEY,
    data VARCHAR(255),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO test_table (data) VALUES ('Test from Node 1');
```
Verify replication on other nodes:
```sql
-- On Node 2 and Node 3 (connect with: mysql -u root -p)
USE test_replication;
SELECT * FROM test_table;
```
Monitor Cluster Status
Check important cluster metrics:
```sql
-- Cluster size (should be 3)
SHOW STATUS LIKE 'wsrep_cluster_size';
-- Local state (should be 'Synced')
SHOW STATUS LIKE 'wsrep_local_state_comment';
-- Cluster status (should be 'Primary')
SHOW STATUS LIKE 'wsrep_cluster_status';
-- Ready status (should be 'ON')
SHOW STATUS LIKE 'wsrep_ready';
```
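During maintenance it can help to poll these values continuously from the shell; a minimal sketch (substitute your own credentials):
```bash
# Refresh the key wsrep counters every two seconds
watch -n 2 "mysql -u root -psecure_password -e \"SHOW STATUS LIKE 'wsrep%'\" | grep -E 'cluster_size|cluster_status|local_state_comment|ready'"
```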
Advanced Configuration Options
SSL/TLS Encryption
Enable encryption for inter-node communication:
```ini
# Add to the Galera configuration
wsrep_provider_options="socket.ssl=yes;socket.ssl_cert=/etc/mysql/server-cert.pem;socket.ssl_key=/etc/mysql/server-key.pem;socket.ssl_ca=/etc/mysql/ca-cert.pem"
```
State Snapshot Transfer (SST) Methods
Configure different SST methods based on your requirements:
```ini
# For faster transfers with large datasets
wsrep_sst_method=mariabackup
wsrep_sst_auth="backup_user:backup_password"
# Note: rsync SST is unencrypted; for encrypted transfers prefer
# mariabackup with SSL, or tunnel rsync traffic (e.g., via stunnel)
```
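If you switch to `wsrep_sst_method=mariabackup`, the donor node runs mariabackup as the account named in `wsrep_sst_auth`, so that account must exist with backup privileges before any SST occurs. A sketch, using the placeholder credentials from the configuration above:
```bash
# Create the SST/backup account once; it replicates to all joined nodes
mysql -u root -p -e "
CREATE USER 'backup_user'@'localhost' IDENTIFIED BY 'backup_password';
GRANT RELOAD, PROCESS, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'backup_user'@'localhost';
FLUSH PRIVILEGES;"
```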
Performance Tuning
Optimize cluster performance:
```ini
# Increase parallel applying threads
wsrep_slave_threads=8
# Flow control and write-set cache tuning; note that only one
# wsrep_provider_options line takes effect, so combine all options in it
wsrep_provider_options="gcs.fc_limit=128;gcs.fc_factor=0.8;gcache.size=1G"
```
Monitoring and Maintenance
Essential Monitoring Queries
Create monitoring scripts to track cluster health:
```sql
-- Comprehensive cluster status
SELECT
VARIABLE_NAME,
VARIABLE_VALUE
FROM INFORMATION_SCHEMA.GLOBAL_STATUS
WHERE VARIABLE_NAME IN (
'wsrep_cluster_size',
'wsrep_local_state_comment',
'wsrep_cluster_status',
'wsrep_ready',
'wsrep_connected',
'wsrep_local_recv_queue',
'wsrep_local_send_queue'
);
```
Automated Health Checks
Create a monitoring script:
```bash
#!/bin/bash
# galera_health_check.sh
MYSQL_USER="monitor_user"
MYSQL_PASS="monitor_password"
# Check if the MariaDB service is running
if ! systemctl is-active --quiet mariadb; then
    echo "ERROR: MariaDB service is not running"
    exit 1
fi
# Check cluster size
CLUSTER_SIZE=$(mysql -u$MYSQL_USER -p$MYSQL_PASS -e "SHOW STATUS LIKE 'wsrep_cluster_size';" -s -N | awk '{print $2}')
if [ "$CLUSTER_SIZE" -lt 3 ]; then
    echo "WARNING: Cluster size is $CLUSTER_SIZE (expected 3)"
fi
# Check node state
NODE_STATE=$(mysql -u$MYSQL_USER -p$MYSQL_PASS -e "SHOW STATUS LIKE 'wsrep_local_state_comment';" -s -N | awk '{print $2}')
if [ "$NODE_STATE" != "Synced" ]; then
    echo "ERROR: Node state is $NODE_STATE (expected Synced)"
    exit 1
fi
echo "Cluster health check passed"
```
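To run the check on a schedule, install the script and add a cron entry; the path below is an assumption, adjust to taste:
```bash
# Make the script executable and run it every five minutes via cron
sudo chmod +x /usr/local/bin/galera_health_check.sh
( crontab -l 2>/dev/null; echo "*/5 * * * * /usr/local/bin/galera_health_check.sh >> /var/log/galera_health.log 2>&1" ) | crontab -
```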
Troubleshooting Common Issues
Split-Brain Scenarios
Symptoms: Cluster splits into multiple components, nodes can't communicate.
Solution:
```bash
# Identify the primary component
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_status';"
# If no primary component exists, bootstrap from the most up-to-date node
sudo systemctl stop mariadb
# Mark this node as safe to bootstrap (verify it has the highest seqno first)
sudo sed -i 's/^safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /var/lib/mysql/grastate.dat
sudo galera_new_cluster
# Restart the other nodes normally
sudo systemctl start mariadb
```
Node Synchronization Issues
Symptoms: Node stuck in "Joining" or "Donor/Desynced" state.
Diagnosis:
```sql
-- Check local state
SHOW STATUS LIKE 'wsrep_local_state_comment';
-- Check receive queue
SHOW STATUS LIKE 'wsrep_local_recv_queue';
-- Check for conflicts
SHOW STATUS LIKE 'wsrep_local_cert_failures';
```
Solutions:
```bash
# For persistent synchronization issues
sudo systemctl stop mariadb
# Remove grastate.dat to force a full SST on restart
sudo rm /var/lib/mysql/grastate.dat
# Restart the node
sudo systemctl start mariadb
```
Performance Issues
Common causes and solutions:
1. High queue sizes:
```sql
-- Monitor queue sizes
SHOW STATUS LIKE 'wsrep_local_recv_queue';
SHOW STATUS LIKE 'wsrep_local_send_queue';
```
2. Flow control activation:
```sql
-- Check flow control events
SHOW STATUS LIKE 'wsrep_flow_control_paused';
```
Optimization strategies:
```ini
# Increase slave threads
wsrep_slave_threads=16
# Adjust flow control and gcache together; only one
# wsrep_provider_options line takes effect, so combine the options
wsrep_provider_options="gcs.fc_limit=256;gcs.fc_factor=0.99;gcache.size=2G"
```
Network Connectivity Issues
Diagnosis commands:
```bash
# Test network connectivity to a peer's Galera port
telnet 192.168.1.11 4567
# Check that Galera is listening
ss -tlnp | grep :4567
# Monitor replication traffic (adjust the interface name as needed)
sudo tcpdump -i eth0 port 4567
```
Data Inconsistency
Detection:
```sql
-- Check for certification failures
SHOW STATUS LIKE 'wsrep_local_cert_failures';
-- Monitor conflicts
SHOW STATUS LIKE 'wsrep_local_bf_aborts';
```
Prevention strategies:
1. Use proper transaction isolation levels
2. Implement application-level conflict resolution
3. Design schema to minimize conflicts
4. Use optimistic locking patterns
Best Practices and Tips
Security Best Practices
1. Use SSL/TLS encryption:
```ini
wsrep_provider_options="socket.ssl=yes"
```
2. Create dedicated cluster users:
```sql
CREATE USER 'galera_sst'@'localhost' IDENTIFIED BY 'secure_password';
GRANT RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT ON *.* TO 'galera_sst'@'localhost';
```
3. Restrict network access:
```bash
# Use iptables to limit access to cluster ports
iptables -A INPUT -p tcp --dport 4567 -s 192.168.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 4567 -j DROP
```
Performance Optimization
1. Configure appropriate gcache size:
```ini
wsrep_provider_options="gcache.size=2G"
```
2. Optimize InnoDB settings:
```ini
innodb_buffer_pool_size=4G
innodb_log_file_size=256M
innodb_flush_log_at_trx_commit=2
```
3. Use SSD storage for better I/O performance
4. Monitor and tune slave threads:
```ini
wsrep_slave_threads=8 # Adjust based on CPU cores
```
Operational Best Practices
1. Always maintain an odd number of nodes
2. Implement proper backup strategies
3. Use consistent hardware configurations
4. Monitor cluster health continuously
5. Plan maintenance windows for updates
6. Test failover scenarios regularly
Application Design Considerations
1. Handle deadlock exceptions gracefully
2. Use smaller transactions when possible
3. Implement retry logic for failed transactions (see the sketch after this list)
4. Avoid long-running transactions
5. Use appropriate isolation levels
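To make point 3 concrete, here is a minimal retry sketch using the mysql client from the shell; the database, table, and credentials are hypothetical placeholders. A production version would inspect the error output and retry only on certification conflicts and deadlocks (MariaDB error 1213) rather than on any failure:
```bash
#!/bin/bash
# retry_tx.sh -- hypothetical example: retry a transaction with backoff
SQL="START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
COMMIT;"
for attempt in 1 2 3; do
    if mysql -u app_user -papp_password app_db -e "$SQL"; then
        echo "Transaction committed on attempt $attempt"
        exit 0
    fi
    echo "Attempt $attempt failed; backing off before retrying..."
    sleep "$attempt"   # simple linear backoff
done
echo "Transaction failed after 3 attempts" >&2
exit 1
```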
Backup and Recovery Strategies
Consistent Backup Methods
Using MariaBackup:
```bash
# Create a consistent cluster backup
mariabackup --backup --target-dir=/backup/full_backup \
    --user=backup_user --password=backup_password \
    --galera-info
# Prepare the backup for restoration
mariabackup --prepare --target-dir=/backup/full_backup
```
Using mysqldump:
```bash
# Dump with a consistent snapshot
mysqldump --single-transaction --routines --triggers \
--all-databases > cluster_backup.sql
```
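Restoring a mysqldump backup is the reverse: load it into one running node (typically a freshly bootstrapped one) and Galera replicates the writes to the remaining members:
```bash
# Restore on a single node; the data replicates to the rest of the cluster
mysql -u root -p < cluster_backup.sql
```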
Disaster Recovery Procedures
Complete cluster failure recovery:
1. Identify the most recent node:
```bash
# Check grastate.dat on all nodes
cat /var/lib/mysql/grastate.dat
```
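For reference, grastate.dat is a small text file; the node with the highest seqno holds the most recent data, and safe_to_bootstrap: 1 marks the node Galera last considered safe to start from. Example output (values are illustrative):
```bash
cat /var/lib/mysql/grastate.dat
# GALERA saved state
# version: 2.1
# uuid:    8bcf4a34-aa12-11ee-0000-1b1b1b1b1b1b
# seqno:   1932
# safe_to_bootstrap: 1
```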
2. Bootstrap from the most recent node (setting safe_to_bootstrap: 1 in its grastate.dat first if needed):
```bash
sudo galera_new_cluster
```
3. Restore other nodes from backup if necessary
Conclusion
Configuring Galera Cluster on Linux provides a robust foundation for high-availability database deployments. This comprehensive guide has covered the essential aspects of Galera Cluster setup, from initial installation through advanced configuration, monitoring, and troubleshooting.
Key takeaways from this implementation:
- Proper planning is crucial for successful Galera Cluster deployment
- Network configuration and connectivity are fundamental requirements
- Monitoring and maintenance ensure long-term cluster stability
- Understanding common issues enables quick problem resolution
- Following best practices prevents many operational challenges
Next Steps
After successfully implementing your Galera Cluster:
1. Set up comprehensive monitoring using tools like Prometheus, Grafana, or Nagios
2. Implement automated backup solutions for data protection
3. Create runbooks for common operational procedures
4. Test disaster recovery scenarios in a controlled environment
5. Consider load balancing solutions like HAProxy or ProxySQL
6. Plan for capacity scaling as your application grows
Additional Resources
- Official Galera Documentation: Comprehensive technical reference
- MariaDB Knowledge Base: Detailed configuration examples
- Community Forums: Peer support and troubleshooting assistance
- Professional Support: Consider commercial support for production environments
By following this guide and implementing the recommended practices, you'll have a reliable, high-performance Galera Cluster that can handle your application's database requirements while providing the availability and consistency your users expect.
Remember that database clustering is an ongoing process that requires continuous monitoring, maintenance, and optimization. Stay updated with the latest Galera releases and security patches to ensure your cluster remains secure and performs optimally.