How to Configure Nagios Monitoring in Linux
Nagios is one of the most widely used open-source monitoring systems for Linux environments. This guide walks through installing, configuring, and tuning Nagios Core for system monitoring. It applies whether you administer a small network or an enterprise infrastructure.
Table of Contents
1. [Introduction to Nagios](#introduction-to-nagios)
2. [Prerequisites and Requirements](#prerequisites-and-requirements)
3. [Installing Nagios Core](#installing-nagios-core)
4. [Configuring Nagios](#configuring-nagios)
5. [Setting Up Host and Service Monitoring](#setting-up-host-and-service-monitoring)
6. [Configuring Notifications](#configuring-notifications)
7. [Web Interface Setup](#web-interface-setup)
8. [Advanced Configuration](#advanced-configuration)
9. [Troubleshooting Common Issues](#troubleshooting-common-issues)
10. [Best Practices](#best-practices)
11. [Conclusion](#conclusion)
Introduction to Nagios
Nagios is a comprehensive monitoring system that enables organizations to identify and resolve IT infrastructure problems before they affect critical business processes. It monitors hosts and services, alerting users when things go wrong and when they get better. The system provides complete monitoring of applications, services, operating systems, network protocols, system metrics, and network infrastructure.
Key features of Nagios include:
- Real-time monitoring of hosts and services
- Flexible notification system via email, SMS, or custom scripts
- Web-based interface for monitoring status and historical data
- Plugin architecture supporting thousands of community-developed add-ons
- Event handling for proactive problem resolution
- Performance data collection and trending
Prerequisites and Requirements
Before beginning the Nagios installation and configuration process, ensure your system meets the following requirements:
System Requirements
- Operating System: Linux distribution (Ubuntu 18.04+, CentOS 7+, RHEL 7+, Debian 9+)
- RAM: Minimum 1GB (2GB+ recommended for larger environments)
- Disk Space: At least 2GB free space
- CPU: Single core minimum (multi-core recommended)
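The requirements above can be checked with a short preflight script before installing anything. This is a minimal sketch, not an official Nagios tool: the function names are hypothetical, and the 900 MB RAM limit is an assumed allowance for the memory the kernel reserves on a nominal 1 GB machine.

```bash
#!/usr/bin/env bash
# Preflight sketch: compare this machine against the minimums listed above.
# The 900 MB and 2048 MB limits are illustrative assumptions.

# meets_minimum ACTUAL REQUIRED: succeed when ACTUAL >= REQUIRED (integers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Total RAM in MB, read from /proc/meminfo (Linux only).
total_ram_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Free space in MB on the filesystem that holds the given path.
free_disk_mb() {
  df -Pm "$1" | awk 'NR == 2 { print $4 }'
}

# preflight [PATH]: report each requirement that is not met.
preflight() {
  local target=${1:-/usr/local} ram disk status=0
  ram=$(total_ram_mb)
  disk=$(free_disk_mb "$target")
  meets_minimum "$ram" 900 || { echo "RAM: ${ram} MB is below the 1 GB minimum"; status=1; }
  meets_minimum "$disk" 2048 || { echo "Disk: ${disk} MB free on ${target} is below 2 GB"; status=1; }
  [ "$status" -eq 0 ] && echo "Preflight checks passed"
  return "$status"
}
```

Calling `preflight` with no arguments checks the filesystem that holds `/usr/local`, which is where this guide installs Nagios.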
Software Dependencies
The following packages are required:
```bash
# For Ubuntu/Debian systems
sudo apt-get update
sudo apt-get install -y apache2 apache2-utils php libapache2-mod-php php-gd
sudo apt-get install -y build-essential unzip openssl libssl-dev
# For CentOS/RHEL systems
sudo yum update
sudo yum install -y httpd httpd-tools php php-gd gcc glibc glibc-common
sudo yum install -y gd gd-devel make net-snmp openssl-devel unzip
```
User Account Setup
Create dedicated users for Nagios:
```bash
# Create the nagios user and the nagcmd command group
sudo useradd nagios
sudo groupadd nagcmd
sudo usermod -a -G nagcmd nagios
# Add the web server user to nagcmd (run only the line for your distribution)
sudo usermod -a -G nagcmd www-data # Ubuntu/Debian
sudo usermod -a -G nagcmd apache # CentOS/RHEL
```
Installing Nagios Core
Step 1: Download Nagios Core
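Download Nagios Core. This guide uses version 4.4.6; check the project's releases page for a newer stable version and adjust the file names accordingly: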
```bash
cd /tmp
wget https://github.com/NagiosEnterprises/nagioscore/archive/nagios-4.4.6.tar.gz
tar xzf nagios-4.4.6.tar.gz
cd nagioscore-nagios-4.4.6/
```
Step 2: Compile and Install Nagios
Configure and compile the Nagios source code:
```bash
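# Configure the build (on CentOS/RHEL, use --with-httpd-conf=/etc/httpd/conf.d)
# --with-command-group makes the external command file use the nagcmd group created earlier
sudo ./configure --with-httpd-conf=/etc/apache2/sites-enabled --with-command-group=nagcmd
# Compile Nagios
sudo make all
# Install binaries, the init script, command-file permissions, sample configuration, and the Apache configuration
sudo make install
sudo make install-init
sudo make install-commandmode
sudo make install-config
sudo make install-webconf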
```
Step 3: Install Nagios Plugins
Nagios plugins are essential for monitoring functionality:
```bash
cd /tmp
wget https://github.com/nagios-plugins/nagios-plugins/archive/release-2.3.3.tar.gz
tar xzf release-2.3.3.tar.gz
cd nagios-plugins-release-2.3.3/
# Configure and compile plugins
sudo ./tools/setup
sudo ./configure
sudo make
sudo make install
```
Step 4: Set Permissions
Configure proper permissions for Nagios files:
```bash
sudo chown nagios:nagios /usr/local/nagios
sudo chown -R nagios:nagios /usr/local/nagios/libexec
sudo chmod +x /usr/local/nagios/libexec/*
```
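After the installation steps, it can help to confirm that the files later sections rely on are actually in place. The helper below is a small sketch; `report_missing` is a hypothetical name, and the paths in the usage note are the ones this guide references elsewhere.

```bash
#!/usr/bin/env bash
# report_missing PATH...: print each path that does not exist.
# Returns 1 if at least one path is missing, 0 otherwise.
report_missing() {
  local path missing=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `report_missing /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg /usr/local/nagios/libexec/check_http` prints nothing and returns 0 when Nagios Core and the plugins were installed under the default prefix.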
Configuring Nagios
Main Configuration File
The primary Nagios configuration file is located at `/usr/local/nagios/etc/nagios.cfg`. Open it in an editor:
```bash
# Edit the main configuration file
sudo nano /usr/local/nagios/etc/nagios.cfg
```
Important settings to verify:
```ini
# Directory of object configuration files to load
cfg_dir=/usr/local/nagios/etc/servers
# Log file location
log_file=/usr/local/nagios/var/nagios.log
# Object cache file
object_cache_file=/usr/local/nagios/var/objects.cache
# Command check interval
command_check_interval=15s
# Enable notifications
enable_notifications=1
# Check external commands
check_external_commands=1
# Command file
command_file=/usr/local/nagios/var/rw/nagios.cmd
```
Creating Configuration Directories
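Organize configuration files by creating dedicated directories. Nagios reads only the directories named in a `cfg_dir` directive in `nagios.cfg`, so add a `cfg_dir` line for each directory you use:
```bash
sudo mkdir -p /usr/local/nagios/etc/servers
sudo mkdir -p /usr/local/nagios/etc/services
sudo mkdir -p /usr/local/nagios/etc/contacts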
```
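A `cfg_dir` entry that points at a directory that does not exist causes the configuration check to fail, and a directory with no matching `cfg_dir` entry is silently ignored. The sketch below lists the `cfg_dir` entries in a configuration file and reports any that are missing on disk; both function names are hypothetical helpers, not part of Nagios.

```bash
#!/usr/bin/env bash
# list_cfg_dirs FILE: print every directory named by an active cfg_dir= directive.
# Lines starting with '#' are comments and are skipped.
list_cfg_dirs() {
  sed -n 's/^[[:space:]]*cfg_dir[[:space:]]*=[[:space:]]*//p' "$1"
}

# check_cfg_dirs FILE: report cfg_dir entries that do not exist on disk.
# Returns 1 if any entry is missing.
check_cfg_dirs() {
  local missing
  missing=$(list_cfg_dirs "$1" | while IFS= read -r dir; do
    [ -d "$dir" ] || echo "cfg_dir not found: $dir"
  done)
  if [ -n "$missing" ]; then
    printf '%s\n' "$missing"
    return 1
  fi
}
```

Running `check_cfg_dirs /usr/local/nagios/etc/nagios.cfg` after editing the file gives a quick answer before the full `nagios -v` validation.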
Contacts Configuration
Configure notification contacts in `/usr/local/nagios/etc/objects/contacts.cfg`:
```ini
define contact {
contact_name nagiosadmin
use generic-contact
alias Nagios Admin
email admin@yourdomain.com
host_notification_period 24x7
service_notification_period 24x7
host_notification_options d,u,r,f,s
service_notification_options w,u,c,r,f,s
host_notification_commands notify-host-by-email
service_notification_commands notify-service-by-email
}
define contactgroup {
contactgroup_name admins
alias Nagios Administrators
members nagiosadmin
}
```
Setting Up Host and Service Monitoring
Host Configuration
Create host configuration files in `/usr/local/nagios/etc/servers/`:
```bash
# Create a host configuration file
sudo nano /usr/local/nagios/etc/servers/web-server.cfg
```
Example host configuration:
```ini
define host {
use linux-server
host_name web-server-01
alias Web Server 01
address 192.168.1.100
max_check_attempts 5
check_period 24x7
notification_interval 30
notification_period 24x7
contact_groups admins
}
```
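When many hosts share the same template, the definition above can be generated rather than typed by hand. This is a minimal sketch: `make_host_def` is a hypothetical helper, it emits only the three host-specific fields, and it assumes the remaining settings are inherited from the `linux-server` template.

```bash
#!/usr/bin/env bash
# make_host_def HOST_NAME ALIAS ADDRESS: print a host object definition.
# Settings not listed here are assumed to come from the linux-server template.
make_host_def() {
  cat <<EOF
define host {
    use         linux-server
    host_name   $1
    alias       $2
    address     $3
}
EOF
}
```

For example, `make_host_def db-server-01 "Database Server 01" 192.168.1.101 | sudo tee /usr/local/nagios/etc/servers/db-server-01.cfg` writes a new host file; the host name and address in that command are placeholders.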
Service Configuration
Define services to monitor on each host:
```ini
define service {
use generic-service
host_name web-server-01
service_description HTTP
check_command check_http
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 60
notification_period 24x7
}
define service {
use generic-service
host_name web-server-01
service_description SSH
check_command check_ssh
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
}
define service {
use generic-service
host_name web-server-01
service_description Current Load
check_command check_nrpe!check_load
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
}
```
Command Definitions
Define custom commands in `/usr/local/nagios/etc/objects/commands.cfg`:
```ini
define command {
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
define command {
command_name check_http_url
command_line $USER1$/check_http -H $HOSTADDRESS$ -u $ARG1$
}
define command {
command_name check_mysql
command_line $USER1$/check_mysql -H $HOSTADDRESS$ -u $ARG1$ -p $ARG2$
}
```
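A service's `check_command` value is the command name followed by arguments separated by `!`, and Nagios substitutes those arguments for `$ARG1$`, `$ARG2$`, and so on in the `command_line`. The sketch below imitates that substitution so you can preview the command Nagios would run. It is an illustration only, not Nagios code: `expand_command` is a hypothetical helper, it handles just the three macro kinds shown, and it assumes `$USER1$` is set to `/usr/local/nagios/libexec` in `resource.cfg`.

```bash
#!/usr/bin/env bash
# expand_command COMMAND_LINE CHECK_COMMAND HOST_ADDRESS
# Prints COMMAND_LINE with $USER1$, $HOSTADDRESS$, and $ARGn$ filled in.
# Assumption: $USER1$ is /usr/local/nagios/libexec.
expand_command() {
  awk -v line="$1" -v check="$2" -v addr="$3" -v plugins="/usr/local/nagios/libexec" '
    # Replace every literal occurrence of macro in text (no regex involved).
    function replace_all(text, macro, value,    pos, out) {
      out = ""
      while ((pos = index(text, macro)) > 0) {
        out = out substr(text, 1, pos - 1) value
        text = substr(text, pos + length(macro))
      }
      return out text
    }
    BEGIN {
      # parts[1] is the command name; parts[2..n] are $ARG1$..$ARGn-1$.
      n = split(check, parts, "!")
      line = replace_all(line, "$USER1$", plugins)
      line = replace_all(line, "$HOSTADDRESS$", addr)
      for (i = 2; i <= n; i++)
        line = replace_all(line, "$ARG" (i - 1) "$", parts[i])
      print line
    }'
}

expand_command '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$' 'check_nrpe!check_load' 192.168.1.100
```

The final line previews the `Current Load` service defined earlier: the `check_load` argument after the `!` lands where `$ARG1$` appears in the `check_nrpe` command line.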
Configuring Notifications
Email Notifications
Configure email notifications by setting up the mail command:
```bash
# Install mail utilities
sudo apt-get install mailutils # Ubuntu/Debian
sudo yum install mailx # CentOS/RHEL
```
Update notification commands in `/usr/local/nagios/etc/objects/commands.cfg`:
```ini
define command {
command_name notify-host-by-email
command_line /usr/bin/printf "%b" " Nagios \n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s " $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ " $CONTACTEMAIL$
}
define command {
command_name notify-service-by-email
command_line /usr/bin/printf "%b" " Nagios \n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /usr/bin/mail -s " $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ " $CONTACTEMAIL$
}
```
SMS Notifications
For SMS notifications, integrate with SMS gateways or services:
```bash
# Create SMS notification script
sudo nano /usr/local/nagios/libexec/notify-by-sms.sh
```
```bash
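#!/bin/bash
# SMS notification script
PHONE="$1"
MESSAGE="$2"
# Example using a placeholder SMS gateway API; replace the URL and key with your provider's values
# --data-urlencode keeps spaces and special characters in the message intact
curl -X POST "https://api.smsgateway.com/send" \
--data-urlencode "phone=$PHONE" \
--data-urlencode "message=$MESSAGE" \
--data-urlencode "api_key=YOUR_API_KEY"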
```
Web Interface Setup
Apache Configuration
Configure Apache to serve the Nagios web interface:
```bash
# Enable Apache modules (Ubuntu/Debian only; a2enmod is not available on CentOS/RHEL)
sudo a2enmod rewrite
sudo a2enmod cgi
# Create Nagios web user
sudo htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
# Restart Apache
sudo systemctl restart apache2 # Ubuntu/Debian
sudo systemctl restart httpd # CentOS/RHEL
```
Web Interface Access
Configure web interface settings in `/usr/local/nagios/etc/cgi.cfg`:
```ini
# Users authorized for system information
authorized_for_system_information=nagiosadmin
# Users authorized for configuration information
authorized_for_configuration_information=nagiosadmin
# Users authorized for system commands
authorized_for_system_commands=nagiosadmin
# Users authorized for all services
authorized_for_all_services=nagiosadmin
# Users authorized for all hosts
authorized_for_all_hosts=nagiosadmin
```
Starting Nagios Services
Enable and start Nagios services:
```bash
# Check configuration syntax before starting
sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# Start and enable Nagios
sudo systemctl start nagios
sudo systemctl enable nagios
# Verify Nagios is running
sudo systemctl status nagios
```
Advanced Configuration
NRPE Configuration
Install and configure NRPE (Nagios Remote Plugin Executor) for remote monitoring:
```bash
# Install NRPE on monitored hosts
cd /tmp
wget https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-4.0.3/nrpe-4.0.3.tar.gz
tar xzf nrpe-4.0.3.tar.gz
cd nrpe-4.0.3/
sudo ./configure --enable-command-args --with-ssl-dir=/usr/lib/ssl/
sudo make all
sudo make install
sudo make install-config
sudo make install-init
```
Configure NRPE on remote hosts (`/usr/local/nagios/etc/nrpe.cfg`):
```ini
# Allowed hosts (include the Nagios server's address)
allowed_hosts=127.0.0.1,192.168.1.10
# Command definitions
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_procs]=/usr/local/nagios/libexec/check_procs -w 250 -c 400 -s RSZDT
```
Performance Data Collection
Enable performance data collection for trending:
```ini
# In nagios.cfg
process_performance_data=1
service_perfdata_command=process-service-perfdata
host_perfdata_command=process-host-perfdata
# Performance data command (an object definition, for example in commands.cfg)
define command {
command_name process-service-perfdata
command_line /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /usr/local/nagios/var/service-perfdata.out
}
```
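The `process-service-perfdata` command above appends one tab-separated record per check, with the performance data in the tenth field. A short sketch of reading that file back follows; `summarize_perfdata` is a hypothetical helper, and it assumes the field order of the command definition shown above.

```bash
#!/usr/bin/env bash
# summarize_perfdata FILE: print "host<TAB>service<TAB>perfdata" for each record
# that carries performance data.
# Field order follows the process-service-perfdata command line:
#   2 = $HOSTNAME$, 3 = $SERVICEDESC$, 10 = $SERVICEPERFDATA$
summarize_perfdata() {
  awk -F '\t' 'NF >= 10 && $10 != "" { print $2 "\t" $3 "\t" $10 }' "$1"
}
```

Running `summarize_perfdata /usr/local/nagios/var/service-perfdata.out` gives a compact view that can be fed to a graphing or trending tool.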
Custom Plugin Development
Create custom monitoring plugins:
```bash
#!/bin/bash
# Custom disk usage plugin
# Save as /usr/local/nagios/libexec/check_custom_disk.sh and make it executable
# df -P keeps each filesystem on a single line so the awk field numbers stay stable
DISK_USAGE=$(df -P / | awk 'NR==2 {print $5}' | sed 's/%//')
WARNING_THRESHOLD=80
CRITICAL_THRESHOLD=90
# Exit codes follow the plugin convention: 0 = OK, 1 = WARNING, 2 = CRITICAL
if [ "$DISK_USAGE" -ge "$CRITICAL_THRESHOLD" ]; then
echo "CRITICAL - Disk usage is ${DISK_USAGE}%"
exit 2
elif [ "$DISK_USAGE" -ge "$WARNING_THRESHOLD" ]; then
echo "WARNING - Disk usage is ${DISK_USAGE}%"
exit 1
else
echo "OK - Disk usage is ${DISK_USAGE}%"
exit 0
fi
```
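Nagios reads a plugin's result from its exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN) and takes the first line of output as the status text. The threshold logic in the plugin above can be factored into a reusable function, sketched here under the assumption that values and thresholds are non-negative integers; `check_threshold` is a hypothetical name.

```bash
#!/usr/bin/env bash
# check_threshold VALUE WARN CRIT
# Prints a status line and returns the matching plugin exit code:
#   0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN (VALUE is not an integer)
check_threshold() {
  local value=$1 warn=$2 crit=$3
  case $value in
    '' | *[!0-9]*)
      echo "UNKNOWN - '$value' is not a non-negative integer"
      return 3
      ;;
  esac
  if [ "$value" -ge "$crit" ]; then
    echo "CRITICAL - value is $value (critical at $crit)"
    return 2
  elif [ "$value" -ge "$warn" ]; then
    echo "WARNING - value is $value (warning at $warn)"
    return 1
  fi
  echo "OK - value is $value"
  return 0
}
```

A plugin script would end with `check_threshold "$DISK_USAGE" 80 90; exit $?` so that the function's return code becomes the plugin's exit code. Returning UNKNOWN for unparseable input keeps a broken `df` call from being reported as a healthy disk.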
Troubleshooting Common Issues
Configuration Validation
Always validate configuration before restarting Nagios:
```bash
# Check configuration syntax
sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# Common validation errors and solutions:
# Error: Could not read object configuration data!
# Solution: Check file permissions and syntax in configuration files
# Error: Host 'hostname' is not defined anywhere!
# Solution: Ensure host is defined in a configuration file included in nagios.cfg
```
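The validate-then-restart rule can be enforced with a small wrapper so that a broken configuration never reaches a running Nagios. This is a sketch under stated assumptions: `safe_restart` and the `NAGIOS_BIN`, `NAGIOS_CFG`, and `RESTART_CMD` variables are hypothetical names, with defaults taken from the paths used in this guide.

```bash
#!/usr/bin/env bash
# safe_restart: restart Nagios only when the configuration check passes.
# NAGIOS_BIN, NAGIOS_CFG, and RESTART_CMD can be overridden for a dry run.
safe_restart() {
  local nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
  local nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
  local restart_cmd=${RESTART_CMD:-systemctl restart nagios}
  local output
  if output=$("$nagios_bin" -v "$nagios_cfg" 2>&1); then
    # Intentionally unquoted so the command and its arguments are split.
    $restart_cmd
  else
    printf '%s\n' "$output" >&2
    echo "Configuration check failed; Nagios was not restarted." >&2
    return 1
  fi
}
```

Run it as root (for example `sudo bash -c 'source ./safe_restart.sh; safe_restart'`, where the file name is a placeholder). Setting `RESTART_CMD='echo would restart'` shows what would happen without touching the service.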
Permission Issues
Fix common permission problems:
```bash
# Fix ownership issues
sudo chown -R nagios:nagios /usr/local/nagios/var
sudo chown -R nagios:nagcmd /usr/local/nagios/var/rw
# Fix command file permissions
sudo chmod 664 /usr/local/nagios/var/rw/nagios.cmd
```
Web Interface Issues
Resolve web interface access problems:
```bash
# Check Apache error logs
sudo tail -f /var/log/apache2/error.log # Ubuntu/Debian
sudo tail -f /var/log/httpd/error_log # CentOS/RHEL
# Common solutions:
# 1. Verify htpasswd file exists and has correct permissions
# 2. Check Apache configuration includes Nagios virtual host
# 3. Ensure CGI modules are enabled
```
Plugin Execution Issues
Debug plugin execution problems:
```bash
# Test plugins manually
sudo -u nagios /usr/local/nagios/libexec/check_http -H www.google.com
# Check plugin permissions
ls -la /usr/local/nagios/libexec/
# Verify NRPE connectivity
/usr/local/nagios/libexec/check_nrpe -H target_host
```
Log Analysis
Monitor Nagios logs for troubleshooting:
```bash
# Main Nagios log
sudo tail -f /usr/local/nagios/var/nagios.log
# Common log entries to watch for:
# - "Error: Could not create external command file"
# - "Warning: Check of service 'X' on host 'Y' timed out"
# - "Error: Unable to send notifications"
```
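Beyond tailing the log, a per-state count of service alerts shows quickly whether a problem is isolated or widespread. The sketch below assumes the standard log line layout `[timestamp] SERVICE ALERT: host;service;state;state_type;attempt;output`; `count_service_alerts` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# count_service_alerts FILE: count SERVICE ALERT log entries per state.
# With ';' as the separator, the third field of an alert line is the state.
count_service_alerts() {
  awk -F ';' '/ SERVICE ALERT: / { count[$3]++ }
              END { for (state in count) print state, count[state] }' "$1" | sort
}
```

For example, `sudo bash -c 'source ./alerts.sh; count_service_alerts /usr/local/nagios/var/nagios.log'` prints one line per state with its count; the script file name is a placeholder.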
Best Practices
Security Considerations
Implement security best practices:
1. Secure Web Interface Access
```bash
# Use HTTPS for web interface
sudo a2enmod ssl
# Configure SSL certificate
# Restrict access by IP address if possible
```
2. File Permissions
```bash
# Restrict configuration file access
# The web server user must still read cgi.cfg and nagios.cfg, so grant group access through nagcmd
sudo chmod 640 /usr/local/nagios/etc/*.cfg
sudo chown nagios:nagcmd /usr/local/nagios/etc/*.cfg
```
3. Network Security
- Use firewalls to restrict NRPE access
- Consider VPN for remote monitoring
- Regularly update Nagios and plugins
Performance Optimization
Optimize Nagios performance for larger environments:
1. Check Interval Tuning
```ini
# Adjust check intervals based on criticality
# (inline comments in object definitions start with ';', not '#')
normal_check_interval 5 ; Critical services
normal_check_interval 10 ; Important services
normal_check_interval 30 ; Non-critical services
```
2. Resource Optimization
```ini
# In nagios.cfg
max_concurrent_checks=20
check_result_reaper_frequency=10
max_check_result_reaper_time=30
```
3. Database Integration
- Consider using NDOUtils for database storage
- Implement log rotation policies
- Archive old performance data
Monitoring Strategy
Develop comprehensive monitoring strategies:
1. Layered Monitoring Approach
- Network connectivity (ping)
- Service availability (HTTP, SSH, etc.)
- Performance metrics (CPU, memory, disk)
- Application-specific checks
2. Escalation Procedures
```ini
define hostescalation {
host_name web-server-01
first_notification 1
last_notification 3
notification_interval 60
contact_groups admins
}
```
3. Maintenance Windows
```bash
# Schedule downtime via web interface or external commands
echo "[$(date +%s)] SCHEDULE_HOST_DOWNTIME;web-server-01;$(date +%s);$(($(date +%s) + 3600));1;0;0;nagiosadmin;Maintenance window" > /usr/local/nagios/var/rw/nagios.cmd
```
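The external command above calls `date` three times, so the start and end times can drift by a second. A variant that computes the timestamps once is sketched below. `downtime_command` is a hypothetical helper; it produces a fixed downtime (the `1` after the end time), and Nagios only uses the duration field for flexible downtime.

```bash
#!/usr/bin/env bash
# downtime_command HOST START DURATION AUTHOR COMMENT
# Prints a SCHEDULE_HOST_DOWNTIME external command for a fixed downtime window.
# START is a Unix timestamp; DURATION is in seconds.
downtime_command() {
  local host=$1 start=$2 duration=$3 author=$4 comment=$5
  local end=$((start + duration))
  # Fields: host;start;end;fixed;trigger_id;duration;author;comment
  printf '[%s] SCHEDULE_HOST_DOWNTIME;%s;%s;%s;1;0;%s;%s;%s\n' \
    "$start" "$host" "$start" "$end" "$duration" "$author" "$comment"
}
```

To schedule one hour of downtime starting now: `downtime_command web-server-01 "$(date +%s)" 3600 nagiosadmin "Maintenance window" > /usr/local/nagios/var/rw/nagios.cmd`.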
Documentation and Change Management
Maintain proper documentation:
1. Configuration Management
- Use version control for configuration files
- Document all custom plugins and commands
- Maintain inventory of monitored systems
2. Runbooks and Procedures
- Create response procedures for common alerts
- Document escalation paths
- Maintain contact information
Conclusion
Configuring Nagios monitoring in Linux requires careful planning, proper installation, and ongoing maintenance. This comprehensive guide has covered the essential steps from initial installation through advanced configuration options. Key takeaways include:
1. Proper Planning: Understanding your monitoring requirements before implementation ensures effective coverage and resource utilization.
2. Security First: Implementing proper security measures protects your monitoring infrastructure and sensitive data.
3. Scalable Architecture: Using organized configuration files and proper directory structures facilitates growth and maintenance.
4. Regular Maintenance: Keeping Nagios updated, monitoring logs, and validating configurations ensures reliable operation.
5. Continuous Improvement: Regularly reviewing and optimizing monitoring coverage, thresholds, and notification procedures enhances effectiveness.
Next Steps
After successfully implementing Nagios monitoring:
1. Expand Monitoring Coverage: Gradually add more hosts and services to your monitoring infrastructure
2. Integrate Additional Tools: Consider integrating with tools like Grafana for visualization or PagerDuty for advanced alerting
3. Automate Deployment: Implement configuration management tools like Ansible or Puppet for automated Nagios deployment
4. Develop Custom Solutions: Create custom plugins and integrations specific to your environment
5. Training and Documentation: Ensure team members are trained on Nagios administration and maintain comprehensive documentation
By following this guide and implementing the best practices outlined, you'll have a robust monitoring solution that provides visibility into your infrastructure's health and performance, enabling proactive problem resolution and improved system reliability.