How to Configure Elasticsearch Cluster in Linux
Elasticsearch is a powerful, distributed search and analytics engine that forms the backbone of many modern applications. Setting up an Elasticsearch cluster in Linux provides high availability, fault tolerance, and improved performance through distributed computing. This comprehensive guide will walk you through the entire process of configuring a production-ready Elasticsearch cluster on Linux systems.
Table of Contents
1. [Introduction](#introduction)
2. [Prerequisites and Requirements](#prerequisites-and-requirements)
3. [System Preparation](#system-preparation)
4. [Installing Elasticsearch](#installing-elasticsearch)
5. [Cluster Configuration](#cluster-configuration)
6. [Node Configuration](#node-configuration)
7. [Security Configuration](#security-configuration)
8. [Starting and Managing the Cluster](#starting-and-managing-the-cluster)
9. [Monitoring and Health Checks](#monitoring-and-health-checks)
10. [Troubleshooting Common Issues](#troubleshooting-common-issues)
11. [Best Practices](#best-practices)
12. [Conclusion](#conclusion)
Introduction
An Elasticsearch cluster consists of multiple nodes working together to store, index, and search data. Unlike a single-node setup, a cluster provides redundancy, scalability, and improved performance. Each node in the cluster can serve different roles: master-eligible nodes manage cluster state, data nodes store and process data, and coordinating nodes handle client requests.
This guide covers setting up a multi-node Elasticsearch cluster with proper security, monitoring, and optimization configurations suitable for production environments.
Prerequisites and Requirements
System Requirements
Before beginning the installation, ensure your Linux systems meet the following requirements:
Hardware Requirements:
- RAM: Minimum 8GB per node (16GB+ recommended for production)
- CPU: Multi-core processor (4+ cores recommended)
- Storage: SSD storage recommended for better I/O performance
- Network: Reliable network connectivity between nodes
Software Requirements:
- Operating System: Ubuntu 18.04+, CentOS 7+, RHEL 7+, or similar Linux distribution
- Java: Elasticsearch 7.x and later ships with a bundled JDK, so a separate Java installation is optional (Elasticsearch 8.x requires Java 17 or later if you supply your own)
- Root or sudo access on all cluster nodes
Network Configuration
Ensure the following network requirements are met:
- All nodes can communicate with each other on ports 9200 (HTTP) and 9300 (transport)
- Firewall rules allow traffic between cluster nodes
- Each node has a static IP address or reliable hostname resolution
- Network latency between nodes should be minimal (preferably < 1ms)
System Preparation
Step 1: Update System Packages
On each node, update the system packages:
```bash
# Ubuntu/Debian
sudo apt update && sudo apt upgrade -y
# CentOS/RHEL
sudo yum update -y
# or for newer versions
sudo dnf update -y
```
Step 2: Configure System Limits
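Elasticsearch needs raised limits on open files, threads, and locked memory. Note that `/etc/security/limits.conf` is not applied to services started by systemd; the DEB and RPM packages set most limits in the systemd unit, and `memlock` must be raised there as well (for example with `LimitMEMLOCK=infinity` in a unit override). For other launch methods, edit the limits configuration: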
```bash
sudo vim /etc/security/limits.conf
```
Add the following lines:
```bash
elasticsearch soft nofile 65536
elasticsearch hard nofile 65536
elasticsearch soft nproc 4096
elasticsearch hard nproc 4096
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
```
Step 3: Configure Virtual Memory
Set the virtual memory map count:
```bash
sudo sysctl -w vm.max_map_count=262144
```
Make this setting permanent:
```bash
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
```
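To confirm that the setting took effect on a node, the live value can be compared against the required minimum. The following is a minimal sketch of such a check; the function names `meets_minimum` and `check_max_map_count` are illustrative and not part of any Elasticsearch tooling.

```bash
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names) for verifying a kernel setting.

# Succeeds when the first integer is greater than or equal to the second.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Reads vm.max_map_count and reports whether it meets the 262144 minimum.
# The optional first argument overrides the file to read (useful for testing).
check_max_map_count() {
  local source_file="${1:-/proc/sys/vm/max_map_count}"
  local required=262144
  local current
  current="$(cat "$source_file")" || return 2
  if meets_minimum "$current" "$required"; then
    echo "OK: vm.max_map_count=$current"
  else
    echo "TOO LOW: vm.max_map_count=$current (need at least $required)"
    return 1
  fi
}
```

Running `check_max_map_count` with no arguments on each node after a reboot is a quick way to verify that the `/etc/sysctl.conf` entry persisted.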
Step 4: Disable Swap
Disable swap to prevent performance issues:
```bash
sudo swapoff -a
```
Comment out swap entries in `/etc/fstab`:
```bash
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
```
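A quick way to verify that no swap device is still active is to count the entries the kernel lists in `/proc/swaps`. The sketch below assumes the standard layout of that file (one header line followed by one line per device); `active_swap_count` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Prints the number of active swap devices in a /proc/swaps-style file.
# The first line is a column header, so only later non-empty lines are counted.
# The optional first argument overrides the file to read (useful for testing).
active_swap_count() {
  local swaps_file="${1:-/proc/swaps}"
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$swaps_file"
}
```

After `swapoff -a`, calling `active_swap_count` should print `0`; any other value means a swap device is still in use.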
Installing Elasticsearch
Method 1: Using Package Repository (Recommended)
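Import the Elasticsearch GPG key on Ubuntu/Debian. The `apt-key` tool is deprecated, so store the key in a dedicated keyring instead:
```bash
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
```
Add the repository:
```bash
# Ubuntu/Debian
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
# CentOS/RHEL
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
```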
For CentOS/RHEL, create the repository file:
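```bash
sudo tee /etc/yum.repos.d/elasticsearch.repo <<'EOF'
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md
EOF
```
Then install the package with `sudo apt update && sudo apt install elasticsearch` on Ubuntu/Debian, or `sudo yum install --enablerepo=elasticsearch elasticsearch` on CentOS/RHEL.
Cluster Configuration
Each node in the cluster can take one or more of the following roles:
- Master-eligible nodes: Manage cluster state (minimum 3 for high availability)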
- Data nodes: Store and process data
- Coordinating nodes: Handle client requests and distribute queries
- Ingest nodes: Pre-process documents before indexing
Step 1: Configure Cluster Discovery
Create or edit the main configuration file `/etc/elasticsearch/elasticsearch.yml`:
```yaml
# Cluster configuration
cluster.name: production-cluster
node.name: node-1
# Network configuration
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Discovery configuration
discovery.seed_hosts:
  - "192.168.1.10:9300"
  - "192.168.1.11:9300"
  - "192.168.1.12:9300"
# Only used when bootstrapping a brand-new cluster; remove once the cluster has formed
cluster.initial_master_nodes:
  - "node-1"
  - "node-2"
  - "node-3"
# Path configuration
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
# Memory configuration
bootstrap.memory_lock: true
# Security configuration (for Elasticsearch 8.x)
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
```
Step 2: Configure JVM Settings
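Set JVM options in a custom file under `/etc/elasticsearch/jvm.options.d/` (for example `heap.options`); Elastic advises against editing the root `/etc/elasticsearch/jvm.options` file directly: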
```bash
# Heap size (no more than 50% of available RAM, and below 32GB)
-Xms4g
-Xmx4g
# GC configuration
-XX:+UseG1GC
-XX:G1HeapRegionSize=16m
-XX:MaxGCPauseMillis=50
# Encoding and temporary directory
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=${ES_TMPDIR}
```
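The heap sizing rule used here (half of RAM, kept under 32GB) can be expressed as a small calculation. The sketch below is one possible reading of that rule: it caps the heap at 31GB, which is an assumption chosen to stay just below the 32GB ceiling, and the function names are illustrative.

```bash
#!/usr/bin/env bash
# Prints a suggested heap size in whole gigabytes for a node with the given
# amount of RAM (also in whole gigabytes): half of RAM, capped at 31GB.
recommended_heap_gb() {
  local total_ram_gb="$1"
  local max_heap_gb=31   # assumption: stay just under the 32GB ceiling
  local heap_gb=$(( total_ram_gb / 2 ))
  if [ "$heap_gb" -gt "$max_heap_gb" ]; then
    heap_gb="$max_heap_gb"
  fi
  if [ "$heap_gb" -lt 1 ]; then
    heap_gb=1
  fi
  echo "$heap_gb"
}

# Prints the two JVM option lines for that heap size.
# The leading "--" stops printf from treating "-Xms" as an option.
heap_options() {
  local heap_gb
  heap_gb="$(recommended_heap_gb "$1")"
  printf -- '-Xms%sg\n-Xmx%sg\n' "$heap_gb" "$heap_gb"
}
```

For an 8GB node, `heap_options 8` prints the same `-Xms4g` and `-Xmx4g` pair shown in the example configuration.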
Node Configuration
Master-Eligible Node Configuration
For dedicated master nodes, configure as follows:
```yaml
# Node roles (a dedicated master lists only the master role)
node.roles: [ master ]
# The legacy node.data and node.ingest settings were removed in 8.0
# and cannot be combined with node.roles, so they are not set here
# Node identification
node.name: master-node-1
cluster.name: production-cluster
# Network settings
network.host: 192.168.1.10
http.port: 9200
transport.port: 9300
# Discovery settings
discovery.seed_hosts:
  - "192.168.1.10:9300"
  - "192.168.1.11:9300"
  - "192.168.1.12:9300"
cluster.initial_master_nodes:
  - "master-node-1"
  - "master-node-2"
  - "master-node-3"
```
Data Node Configuration
For dedicated data nodes:
```yaml
# Node roles (the generic data role covers every tier; list tier roles
# such as data_hot or data_warm instead only for a tiered architecture)
node.roles: [ data ]
# The legacy node.data and node.master settings were removed in 8.0
# and cannot be combined with node.roles, so they are not set here
# Node identification
node.name: data-node-1
cluster.name: production-cluster
# Storage paths
path.data: ["/data1/elasticsearch", "/data2/elasticsearch"]
# Network settings
network.host: 192.168.1.20
http.port: 9200
transport.port: 9300
# Discovery settings
discovery.seed_hosts:
  - "192.168.1.10:9300"
  - "192.168.1.11:9300"
  - "192.168.1.12:9300"
```
Coordinating Node Configuration
For dedicated coordinating nodes:
```yaml
# Node roles (an empty list makes this a coordinating-only node)
node.roles: []
# The legacy node.data, node.master, and node.ingest settings were removed
# in 8.0 and cannot be combined with node.roles, so they are not set here
# Node identification
node.name: coordinating-node-1
cluster.name: production-cluster
# Network settings
network.host: 192.168.1.30
http.port: 9200
transport.port: 9300
```
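Because the per-role configurations differ only in node name, role list, and bind address, they can be generated from one template. The sketch below shows one way to do that; `render_node_config` is a hypothetical helper, and the cluster name and seed hosts are simply the example values used in this guide.

```bash
#!/usr/bin/env bash
# Prints a minimal elasticsearch.yml for one node.
# Arguments: node name, comma-separated roles (may be empty), bind address.
render_node_config() {
  local node_name="$1" roles="$2" host="$3"
  cat <<EOF
cluster.name: production-cluster
node.name: ${node_name}
node.roles: [ ${roles} ]
network.host: ${host}
http.port: 9200
transport.port: 9300
discovery.seed_hosts:
  - "192.168.1.10:9300"
  - "192.168.1.11:9300"
  - "192.168.1.12:9300"
EOF
}
```

For example, `render_node_config coordinating-node-1 "" 192.168.1.30` prints a coordinating-only configuration with an empty role list. Review the output before writing it to `/etc/elasticsearch/elasticsearch.yml`, since it omits the security and path settings shown earlier.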
Security Configuration
Step 1: Enable X-Pack Security
For Elasticsearch 8.x, security is enabled by default. For older versions, enable it manually:
```yaml
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
```
Step 2: Generate Certificates
Generate certificates for secure communication:
```bash
# Generate CA certificate
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil ca --out /etc/elasticsearch/certs/elastic-stack-ca.p12 --pass ""
# Generate node certificates (--ca-pass supplies the CA password without prompting)
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca /etc/elasticsearch/certs/elastic-stack-ca.p12 --ca-pass "" --out /etc/elasticsearch/certs/elastic-certificates.p12 --pass ""
# Set proper permissions
sudo chown elasticsearch:elasticsearch /etc/elasticsearch/certs/*
sudo chmod 660 /etc/elasticsearch/certs/*
```
Step 3: Configure SSL/TLS
Update the configuration file:
```yaml
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/elastic-certificates.p12
  truststore.path: certs/elastic-certificates.p12
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/elastic-certificates.p12
```
Step 4: Set Up Authentication
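Generate a password for the built-in `elastic` user. In Elasticsearch 8.x the `elasticsearch-setup-passwords` tool is deprecated in favor of `elasticsearch-reset-password`:
```bash
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic
```
Save the generated password securely. You can also choose the password interactively:
```bash
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic -i
```
On 7.x, the equivalent commands are `elasticsearch-setup-passwords auto` and `elasticsearch-setup-passwords interactive`.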
Starting and Managing the Cluster
Step 1: Enable and Start Elasticsearch Service
On each node, enable and start the Elasticsearch service:
```bash
# Enable service to start on boot
sudo systemctl enable elasticsearch
# Start the service
sudo systemctl start elasticsearch
# Check service status
sudo systemctl status elasticsearch
```
Step 2: Verify Cluster Formation
Check if nodes have joined the cluster:
```bash
# Check cluster health
curl -X GET "localhost:9200/_cluster/health?pretty"
# List cluster nodes
curl -X GET "localhost:9200/_cat/nodes?v"
# Check cluster state
curl -X GET "localhost:9200/_cluster/state?pretty"
```
If security is enabled, use authentication:
```bash
curl -u elastic:password -X GET "https://localhost:9200/_cluster/health?pretty" -k
```
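For scripted checks, the `status` field of the health response is usually the only value needed. The sketch below extracts it with `sed` so that no JSON tooling is required; `health_status` and `is_green` are hypothetical helper names, and the pattern assumes the response contains a single `"status"` field, as the cluster health API returns.

```bash
#!/usr/bin/env bash
# Reads a _cluster/health JSON response on stdin and prints the status value
# (green, yellow, or red). Handles both compact and ?pretty output.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Succeeds only when the health response read from stdin reports green.
is_green() {
  [ "$(health_status)" = "green" ]
}
```

A typical invocation pipes the API response into the helper, for example `curl -s -k -u elastic:password "https://localhost:9200/_cluster/health" | health_status`.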
Step 3: Configure Service Management
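If you installed Elasticsearch manually rather than from a package, create a systemd unit file at `/etc/systemd/system/elasticsearch.service` so that the service can be managed with `systemctl`.
Troubleshooting Common Issues
Issue 1: Node Not Joining the Cluster
Symptoms: Node starts but doesn't appear in cluster node list.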
Solutions:
1. Check network connectivity:
```bash
# Replace the address with the IP of another cluster node
telnet 192.168.1.11 9300
```
2. Verify discovery configuration:
```yaml
discovery.seed_hosts:
- "correct-ip:9300"
```
3. Check firewall rules:
```bash
sudo ufw allow 9200
sudo ufw allow 9300
```
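Where `telnet` is not installed, the connectivity check from the first solution can be done with bash alone. The sketch below relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command; `port_open` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Succeeds when a TCP connection to host:port opens within the timeout.
# Arguments: host, port, optional timeout in seconds (default 3).
# Inside the child shell, $0 is the host and $1 is the port.
port_open() {
  local host="$1" port="$2" timeout_s="${3:-3}"
  timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `port_open 192.168.1.11 9300 && echo reachable` run from one node tests the transport port of another. A failure points to a firewall rule, a wrong `network.host` binding, or a node that is not running.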
Issue 2: Split-Brain Prevention
Symptoms: Multiple master nodes elected simultaneously.
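Solution: On Elasticsearch 7.0 and later the cluster manages its voting quorum automatically, so no quorum setting is needed; run an odd number of master-eligible nodes (at least three) and use `cluster.initial_master_nodes` only when bootstrapping a new cluster. The `discovery.zen.minimum_master_nodes` setting is ignored in 7.x and no longer accepted in 8.x. On 6.x and earlier, set it to a majority of the master-eligible nodes:
```yaml
# Elasticsearch 6.x and earlier only
# For 3 master-eligible nodes
discovery.zen.minimum_master_nodes: 2
# For 5 master-eligible nodes, use this value instead
# discovery.zen.minimum_master_nodes: 3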
```
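The values above follow from simple majority arithmetic: a quorum is more than half of the master-eligible nodes. The sketch below works through that calculation; the function names are illustrative.

```bash
#!/usr/bin/env bash
# Prints the number of master-eligible nodes that form a majority.
master_quorum() {
  local master_eligible="$1"
  echo $(( master_eligible / 2 + 1 ))
}

# Prints how many master-eligible nodes can fail while a majority remains.
tolerated_failures() {
  local master_eligible="$1"
  echo $(( master_eligible - $(master_quorum "$master_eligible") ))
}
```

The second helper shows why two master-eligible nodes give no extra resilience: `tolerated_failures 2` prints `0`, while three nodes tolerate one failure and five tolerate two.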
Issue 3: Memory Issues
Symptoms: OutOfMemoryError or high GC pressure.
Solutions:
1. Adjust heap size:
```bash
# In jvm.options
-Xms8g
-Xmx8g
```
2. Enable memory lock:
```yaml
bootstrap.memory_lock: true
```
3. Monitor field data usage:
```bash
curl -X GET "localhost:9200/_nodes/stats/indices/fielddata?pretty"
```
Issue 4: Disk Space Issues
Symptoms: Indices become read-only (writes are rejected) because a node's disk usage has exceeded the flood-stage watermark.
Solutions:
1. Check disk watermarks:
```yaml
cluster.routing.allocation.disk.watermark.low: 85%
cluster.routing.allocation.disk.watermark.high: 90%
cluster.routing.allocation.disk.watermark.flood_stage: 95%
```
2. Clean up old indices:
```bash
# Delete old indices (8.x rejects wildcard deletes unless
# action.destructive_requires_name is set to false)
curl -X DELETE "localhost:9200/old-index-*"
# Use Index Lifecycle Management (ILM)
curl -X PUT "localhost:9200/_ilm/policy/cleanup-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'
```
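To see how close a node is to each watermark, its disk usage can be classified against the thresholds listed in the first solution. The sketch below hard-codes the 85%, 90%, and 95% values from that example and treats a watermark as crossed when usage is above it; `watermark_state` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Classifies a disk usage percentage (whole number) against the watermarks
# low 85%, high 90%, and flood stage 95%.
watermark_state() {
  local used_pct="$1"
  if [ "$used_pct" -gt 95 ]; then
    echo "flood_stage"   # indices on this node are blocked for writes
  elif [ "$used_pct" -gt 90 ]; then
    echo "high"          # shards are relocated away from this node
  elif [ "$used_pct" -gt 85 ]; then
    echo "low"           # no new shards are allocated to this node
  else
    echo "ok"
  fi
}
```

On systems with GNU `df`, the current usage of the data path can be fed in with `watermark_state "$(df --output=pcent /var/lib/elasticsearch | tail -n 1 | tr -dc '0-9')"`.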
Issue 5: SSL/TLS Certificate Problems
Symptoms: SSL handshake failures or certificate errors.
Solutions:
1. Regenerate certificates:
```bash
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12 --out elastic-certificates.p12
```
2. Verify certificate configuration:
```yaml
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12
```
3. Check certificate permissions:
```bash
sudo chown elasticsearch:elasticsearch /etc/elasticsearch/certs/*
sudo chmod 660 /etc/elasticsearch/certs/*
```
Best Practices
Hardware and Infrastructure
1. Use SSD storage for better I/O performance
2. Separate master and data nodes in large clusters
3. Use dedicated coordinating nodes for heavy query loads
4. Implement proper network segmentation for security
5. Use load balancers for client connections
Configuration Optimization
1. Set appropriate heap sizes (50% of RAM, max 32GB)
2. Configure proper thread pools:
```yaml
thread_pool:
  search:
    size: 30
    queue_size: 1000
  write:
    size: 30
    queue_size: 200
```
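3. Optimize index settings (asynchronous translog durability improves indexing throughput, but the most recent acknowledged writes can be lost if a node crashes):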
```json
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "refresh_interval": "30s",
    "index.translog.durability": "async"
  }
}
```
Security Best Practices
1. Enable X-Pack Security with proper authentication
2. Use TLS/SSL for all communications
3. Implement role-based access control (RBAC)
4. Apply security updates and patches regularly
5. Restrict network access with firewalls and VPNs
Monitoring and Maintenance
1. Implement comprehensive monitoring with tools like Metricbeat
2. Set up alerting for critical metrics
3. Take regular backups using snapshot repositories
4. Use index lifecycle management for automated cleanup
5. Run performance tests and plan capacity ahead of growth
Backup Strategy
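Implement automated backups. A shared filesystem (`fs`) repository requires its location to be listed under `path.repo` in `elasticsearch.yml` on every master and data node, and the path must be accessible from all of them: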
```bash
# Create snapshot repository
curl -X PUT "localhost:9200/_snapshot/backup_repository" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/backup/elasticsearch"
  }
}'
# Create snapshot
curl -X PUT "localhost:9200/_snapshot/backup_repository/snapshot_1"
```
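Scheduled backups need a unique name per run, since a repository cannot hold two snapshots with the same name. The sketch below derives a dated name and the matching request URL; the helper names and the `snapshot-YYYY.MM.DD` naming scheme are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Prints a snapshot name such as snapshot-2024.01.15 for the given date
# (YYYY-MM-DD). With no argument, or an empty one, today's date is used.
snapshot_name() {
  local day="${1:-$(date +%F)}"
  echo "snapshot-${day//-/.}"
}

# Prints the URL for creating that snapshot in the given repository.
# Arguments: repository name, optional date (YYYY-MM-DD).
snapshot_url() {
  local repository="$1" day="${2:-}"
  echo "localhost:9200/_snapshot/${repository}/$(snapshot_name "$day")"
}
```

A daily cron job could then run `curl -X PUT "$(snapshot_url backup_repository)"`. Elasticsearch also provides snapshot lifecycle management (SLM), which schedules snapshots on the server side and may be preferable to an external scheduler.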
Conclusion
Setting up an Elasticsearch cluster in Linux requires careful planning and attention to detail. This guide has covered cluster configuration from initial system preparation through node roles, security, troubleshooting, and backups.
Key takeaways:
1. Proper planning is essential for cluster architecture and node roles
2. Security configuration should be implemented from the beginning
3. Monitoring and maintenance are crucial for production stability
4. Performance optimization requires ongoing tuning and adjustment
5. Backup strategies ensure data protection and disaster recovery
Next Steps
After successfully setting up your Elasticsearch cluster:
1. Implement monitoring solutions like Kibana and Metricbeat
2. Set up index templates and lifecycle policies
3. Configure client applications to use the cluster
4. Plan for scaling as data and query volumes grow
5. Establish operational procedures for maintenance and troubleshooting
With proper configuration and maintenance, your Elasticsearch cluster will provide reliable, scalable search and analytics capabilities for your applications. Regular monitoring, updates, and optimization will ensure optimal performance and stability in production environments.
Remember to stay updated with the latest Elasticsearch releases and security patches, and always test configuration changes in a development environment before applying them to production systems.