How to Monitor RAID Arrays with mdadm --monitor --scan --daemonise
Table of Contents
1. [Introduction](#introduction)
2. [Prerequisites](#prerequisites)
3. [Understanding mdadm Monitor Options](#understanding-mdadm-monitor-options)
4. [Basic Monitoring Setup](#basic-monitoring-setup)
5. [Advanced Configuration Options](#advanced-configuration-options)
6. [Practical Examples and Use Cases](#practical-examples-and-use-cases)
7. [Troubleshooting Common Issues](#troubleshooting-common-issues)
8. [Best Practices and Professional Tips](#best-practices-and-professional-tips)
9. [Integration with System Services](#integration-with-system-services)
10. [Conclusion](#conclusion)
Introduction
RAID (Redundant Array of Independent Disks) monitoring is a critical aspect of system administration that ensures data integrity and availability. The `mdadm` utility in Linux provides powerful monitoring capabilities through its `--monitor` functionality, which can continuously watch RAID arrays for failures, degradation, and other important events.
The command `mdadm --monitor --scan --daemonise` starts a background process that watches every configured array. This guide walks through setting up, configuring, and maintaining that monitor so it alerts you to problems before they become critical failures.
By the end of this article, you'll understand how to implement automated RAID monitoring, configure appropriate alerting mechanisms, troubleshoot common issues, and follow industry best practices for maintaining healthy RAID arrays in production environments.
Prerequisites
Before implementing mdadm monitoring, ensure you have the following requirements met:
System Requirements
- Linux system with mdadm installed (version 3.0 or higher recommended)
- Root or sudo privileges for system configuration
- Active RAID arrays configured with mdadm
- Basic understanding of Linux command line operations
Software Dependencies
```bash
# Verify mdadm installation
mdadm --version

# Install mdadm if not present (Ubuntu/Debian)
sudo apt-get update && sudo apt-get install mdadm

# Install mdadm (CentOS/RHEL/Fedora)
sudo yum install mdadm
# or, on newer releases
sudo dnf install mdadm
```
Configuration Files
Ensure the following files and services are in place:
- `/etc/mdadm/mdadm.conf` (Debian/Ubuntu) or `/etc/mdadm.conf` (CentOS/RHEL)
- `/proc/mdstat` accessible for reading array status
- Mail system configured for notifications (optional but recommended)
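The checklist above can be sketched as a small read-only preflight script. The function names `find_mdadm_conf` and `preflight` are hypothetical, and the check for a `mail` command is only one assumed way of detecting a configured mail system.

```bash
#!/bin/sh
# Print the first readable file among the candidate paths given as
# arguments; return 1 if none of them is readable.
find_mdadm_conf() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Report which prerequisites are present. Nothing is modified.
preflight() {
    if conf=$(find_mdadm_conf /etc/mdadm/mdadm.conf /etc/mdadm.conf); then
        echo "config: $conf"
    else
        echo "config: not found"
    fi
    if [ -r /proc/mdstat ]; then
        echo "mdstat: readable"
    else
        echo "mdstat: missing"
    fi
    if command -v mail >/dev/null 2>&1; then
        echo "mail: available"
    else
        echo "mail: not installed"
    fi
}

preflight
```

Because the script only reads, it should be safe to run on any host before changing the monitoring setup.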
Understanding mdadm Monitor Options
Core Monitoring Parameters
The `mdadm --monitor` command accepts several key parameters that control its behavior:
--monitor
The `--monitor` flag puts mdadm into monitoring mode, where it continuously watches specified RAID arrays for changes in their status. This mode is designed to run as a long-running process that can detect and respond to various RAID events.
--scan
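The `--scan` option instructs mdadm to monitor all RAID arrays listed in the configuration file (`/etc/mdadm/mdadm.conf` or `/etc/mdadm.conf`), plus any other md devices that appear in `/proc/mdstat`, so you do not need to specify each array device manually. With `--scan`, an alert destination (a mail address or a program) must be set on the command line or in the configuration file; otherwise mdadm does not monitor anything.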
--daemonise
The `--daemonise` (or `--daemonize` in American spelling) option causes mdadm to run as a background daemon process, detaching from the terminal and continuing to run even after the user logs out.
Additional Important Options
```bash
# Common monitoring options
--delay=seconds          # Time between checks (default: 60 seconds)
--mail=email@domain.com  # Email address for notifications
--program=script         # Custom script to run on events
--syslog                 # Report all events through syslog
--test                   # Generate a TestMessage alert for every array found at startup
```
Basic Monitoring Setup
Step 1: Configure mdadm.conf
First, ensure your `/etc/mdadm/mdadm.conf` file is properly configured:
```bash
# Generate or update mdadm configuration
# (tee runs as root; a plain ">>" redirect would be opened by your unprivileged shell)
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

# Example mdadm.conf content
ARRAY /dev/md0 metadata=1.2 name=server:0 UUID=12345678:90abcdef:12345678:90abcdef
ARRAY /dev/md1 metadata=1.2 name=server:1 UUID=87654321:fedcba09:87654321:fedcba09

# Monitoring configuration
MAILADDR admin@example.com
PROGRAM /usr/local/bin/raid-alert.sh
```
Step 2: Basic Monitor Command
Start monitoring with the basic command:
```bash
# Basic monitoring command
sudo mdadm --monitor --scan --daemonise

# Verify the daemon is running (the [m] keeps grep from matching itself)
ps aux | grep '[m]dadm'
pgrep -f "mdadm.*monitor"
```
Step 3: Verify Monitoring Status
Check that monitoring is active and working:
```bash
# Check RAID array status
cat /proc/mdstat

# View mdadm process details
sudo systemctl status mdmonitor   # On systemd systems
sudo service mdmonitor status     # On SysV init systems

# Check system logs for mdadm messages
sudo journalctl -u mdmonitor -f   # systemd
sudo tail -f /var/log/syslog | grep mdadm
```
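Reading `/proc/mdstat` by eye gets tedious with many arrays. As a rough sketch, the helper below (the name `degraded_arrays` is hypothetical) prints each array whose member-status field, such as `[U_]`, contains an underscore, which marks a missing or failed member. It assumes the usual layout in which the status field follows the `mdX :` line, and it is a convenience for spot checks, not a replacement for the monitor daemon.

```bash
#!/bin/sh
# Read /proc/mdstat-style text on stdin and print the name of every array
# whose member-status field (for example [UU] or [U_]) contains "_".
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }             # remember the current array
        name != "" && match($0, /\[[U_]+\]/) {
            status = substr($0, RSTART, RLENGTH)
            if (index(status, "_") > 0) print name
            name = ""                          # one status field per array
        }
    '
}

# Check the live system when /proc/mdstat is available.
if [ -r /proc/mdstat ]; then
    degraded_arrays < /proc/mdstat
fi
```

No output means no array reported a missing member at the time of the check.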
Advanced Configuration Options
Custom Monitoring Scripts
Create custom notification scripts for specific events:
```bash
#!/bin/bash
# /usr/local/bin/raid-alert.sh
# Custom RAID alert script. mdadm passes the event name, the array device,
# and (when relevant) the component device.
EVENT="$1"
DEVICE="$2"
COMPONENT="$3"
LOGFILE="/var/log/raid-alerts.log"
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

# Log the event
echo "$TIMESTAMP - Event: $EVENT, Device: $DEVICE, Component: $COMPONENT" >> "$LOGFILE"

case "$EVENT" in
    "Fail")
        # Critical failure - immediate notification
        echo "CRITICAL: RAID device $DEVICE has failed component $COMPONENT" | \
            mail -s "URGENT: RAID Failure on $(hostname)" admin@example.com
        # Send to monitoring system (e.g., Nagios, Zabbix)
        /usr/local/bin/send_alert.sh "CRITICAL" "RAID failure: $DEVICE/$COMPONENT"
        ;;
    "DegradedArray")
        # Array is degraded but functional
        echo "WARNING: RAID array $DEVICE is running in degraded mode" | \
            mail -s "RAID Degraded: $(hostname)" admin@example.com
        ;;
    "SpareActive")
        # Spare drive activated
        echo "INFO: Spare drive activated for $DEVICE" | \
            mail -s "RAID Spare Activated: $(hostname)" admin@example.com
        ;;
esac

# Make the script executable (run this once from your shell, not inside the script):
#   sudo chmod +x /usr/local/bin/raid-alert.sh
```
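The script above reacts to three events, but mdadm reports more. A minimal sketch of a severity mapper covering the event names documented in the mdadm manual page might look like this. The function name `event_severity` and the severity assigned to each event are assumptions chosen for illustration; adjust them to your own escalation policy.

```bash
#!/bin/sh
# Map an mdadm monitor event name to a severity label.
# The event names come from mdadm(8); the severity levels are an
# illustrative choice, not something mdadm defines.
event_severity() {
    case "$1" in
        Fail|DegradedArray|DeviceDisappeared)
            echo CRITICAL ;;
        FailSpare|SparesMissing)
            echo WARNING ;;
        RebuildStarted|Rebuild[0-9][0-9]|RebuildFinished|SpareActive|NewArray|MoveSpare|TestMessage)
            echo INFO ;;
        *)
            echo UNKNOWN ;;
    esac
}

# Example: classify the event given as the first argument.
event_severity "${1:-TestMessage}"
```

Routing on the returned label keeps the notification logic in one place, so a new event name only needs one extra pattern.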
Fine-tuning Monitor Intervals
Adjust monitoring frequency based on your requirements:
```bash
# High-frequency monitoring (every 30 seconds)
sudo mdadm --monitor --scan --daemonise --delay=30

# Low-frequency monitoring (every 5 minutes)
sudo mdadm --monitor --scan --daemonise --delay=300

# Monitor only specific arrays, with their own interval
sudo mdadm --monitor /dev/md0 /dev/md1 --delay=60 --daemonise
```
Email Configuration
Set up comprehensive email notifications:
```bash
# Configure in mdadm.conf. mdadm.conf(5) expects one MAILADDR line with one address;
# to reach several people, point it at a mail alias or distribution list.
echo "MAILADDR admin@example.com" | sudo tee -a /etc/mdadm/mdadm.conf

# Test email functionality
echo "Test RAID monitoring email" | mail -s "RAID Monitor Test" admin@example.com

# Configure mail relay if needed:
# edit /etc/postfix/main.cf or the equivalent MTA configuration
```
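Since mdadm.conf(5) asks for a single `MAILADDR` line carrying a single address, a quick lint of the configuration can catch a list of recipients that would not all be used. This is a sketch only: `mailaddr_entries` and `check_mailaddr` are hypothetical helper names, and the match covers the full keyword (case-insensitively), not abbreviated forms.

```bash
#!/bin/sh
# Print every address found on MAILADDR lines of the given file, one per line.
mailaddr_entries() {
    awk 'toupper($1) == "MAILADDR" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# Summarise the MAILADDR configuration of the given file.
check_mailaddr() {
    count=$(mailaddr_entries "$1" | wc -l | tr -d ' ')
    case "$count" in
        0) echo "no MAILADDR configured" ;;
        1) echo "ok: $(mailaddr_entries "$1")" ;;
        *) echo "warning: $count addresses found, expected 1" ;;
    esac
}

# Check whichever standard config file exists on this host.
for conf in /etc/mdadm/mdadm.conf /etc/mdadm.conf; do
    if [ -r "$conf" ]; then
        check_mailaddr "$conf"
    fi
done
```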
Practical Examples and Use Cases
Example 1: Production Server Setup
For a production server with multiple RAID arrays:
```bash
#!/bin/bash
# Production RAID monitoring setup script

# Update mdadm configuration (this overwrites the existing file)
sudo mdadm --detail --scan | sudo tee /etc/mdadm/mdadm.conf

# Add monitoring configuration (one MAILADDR address; use an alias for several recipients)
cat << EOF | sudo tee -a /etc/mdadm/mdadm.conf
MAILADDR sysadmin@company.com
PROGRAM /opt/monitoring/raid-alert.sh
EOF

# Start monitoring now with appropriate settings
sudo mdadm --monitor --scan --daemonise --delay=60 --syslog

# Enable the packaged service so monitoring returns after a reboot
# (not started here: it would conflict with the daemon launched above)
sudo systemctl enable mdmonitor

echo "RAID monitoring configured for production environment"
```
Example 2: Development Environment
For development systems with less critical monitoring needs:
```bash
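# Development environment monitoring: poll every 5 minutes and send a
# TestMessage alert for each array at startup to confirm delivery works
sudo mdadm --monitor --scan --daemonise --delay=300 --test

# Alternative (run instead of the command above): hand events to a logging script
sudo mdadm --monitor --scan --daemonise --delay=300 --program=/usr/local/bin/log-only.sh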
```
Example 3: Integration with Monitoring Systems
Integrate with external monitoring systems:
```bash
#!/bin/bash
# /usr/local/bin/monitoring-integration.sh
# Integration script for external monitoring
EVENT="$1"
DEVICE="$2"
COMPONENT="$3"

# Send to Nagios/Icinga. send_nsca reads tab-separated fields: host, service,
# return code, output. The service name "RAID" and return code 2 are examples.
printf '%s\t%s\t%s\t%s\n' "$(hostname)" "RAID" 2 "$EVENT on $DEVICE ($COMPONENT)" | \
    /usr/sbin/send_nsca -H monitoring.example.com

# Send to Zabbix
zabbix_sender -z zabbix.example.com -s "$(hostname)" -k "raid.status" -o "$EVENT:$DEVICE"

# Send to Prometheus Pushgateway (the request body must end with a newline)
printf 'raid_event{device="%s",component="%s"} 1\n' "$DEVICE" "$COMPONENT" | \
    curl --data-binary @- "http://pushgateway.example.com:9091/metrics/job/raid_monitor/instance/$(hostname)"

# Log locally
logger -p local0.warning "RAID Event: $EVENT on $DEVICE ($COMPONENT)"
```
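When pushing to a Pushgateway, the body has to be valid Prometheus text format: each sample line ends with a newline, and backslashes and double quotes inside label values are escaped. A minimal sketch of a payload builder follows; `escape_label` and `pushgateway_payload` are hypothetical names, and the metric name `raid_event` is taken from the example above with an extra `event` label added as an assumption.

```bash
#!/bin/sh
# Escape backslashes and double quotes for use in a Prometheus label value.
escape_label() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build one newline-terminated sample line for the given event,
# array device, and component device.
pushgateway_payload() {
    printf 'raid_event{event="%s",device="%s",component="%s"} 1\n' \
        "$(escape_label "$1")" "$(escape_label "$2")" "$(escape_label "$3")"
}

# Example: print the payload instead of sending it.
pushgateway_payload "${1:-TestMessage}" "${2:-/dev/md0}" "${3:-}"
```

Printing the payload first makes it easy to inspect before piping it into `curl --data-binary @-`.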
Troubleshooting Common Issues
Issue 1: Monitoring Daemon Not Starting
Symptoms:
- mdadm monitor process doesn't appear in process list
- No monitoring alerts received
- Service fails to start
Solutions:
```bash
# Exercise the configuration: run one monitoring pass and send a TestMessage for each array
sudo mdadm --monitor --scan --oneshot --test --config=/etc/mdadm/mdadm.conf

# Verify RAID arrays are properly configured
sudo mdadm --detail --scan

# Stop any conflicting monitor processes
# (the [m] keeps the pattern from matching the sudo command line itself)
sudo pkill -f "[m]dadm.*monitor"
sudo systemctl stop mdmonitor

# Run a single check in the foreground for debugging
sudo mdadm --monitor --scan --verbose --oneshot

# Check system logs
sudo journalctl -u mdmonitor -n 50
sudo tail -f /var/log/messages | grep mdadm
```
Issue 2: Missing or Delayed Notifications
Symptoms:
- RAID events occur but no notifications received
- Delayed notification delivery
- Notifications sent to wrong addresses
Solutions:
```bash
# Test email configuration
echo "Test message" | mail -s "Test Subject" your-email@example.com

# Check mail queue
mailq
sudo postqueue -f   # Flush mail queue

# Verify mdadm configuration
grep -E "MAILADDR|PROGRAM" /etc/mdadm/mdadm.conf

# Test custom notification scripts
sudo /usr/local/bin/raid-alert.sh "TestMessage" "/dev/md0" "test-component"

# Check script permissions, then run it the way the daemon would:
# as root, with a minimal environment
ls -la /usr/local/bin/raid-alert.sh
sudo env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin /usr/local/bin/raid-alert.sh "TestMessage" "/dev/md0" "test"
```
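To check a notification script against more than one event, a small harness can call it once per event name and report each exit status. This is a sketch under assumptions: `exercise_alert_program` is a hypothetical helper, the component path `/dev/sdX1` is a placeholder, and running it against a real alert script will send real notifications.

```bash
#!/bin/sh
# exercise_alert_program PROGRAM DEVICE
# Calls PROGRAM once per event with the same argument order mdadm uses
# (event, array device, component device) and prints each result.
exercise_alert_program() {
    for event in TestMessage DegradedArray Fail SpareActive; do
        if "$1" "$event" "$2" "/dev/sdX1" >/dev/null 2>&1; then
            echo "$event: ok"
        else
            echo "$event: failed (exit $?)"
        fi
    done
}

# Usage (sends real alerts, so warn the recipients first):
#   exercise_alert_program /usr/local/bin/raid-alert.sh /dev/md0
```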
Issue 3: High CPU Usage from Monitor Process
Symptoms:
- mdadm monitor process consuming high CPU
- System performance degradation
- Frequent disk I/O from monitoring
Solutions:
```bash
# Increase monitoring delay
# (the [m] keeps the pattern from matching the sudo command line itself)
sudo pkill -f "[m]dadm.*monitor"
sudo mdadm --monitor --scan --daemonise --delay=300   # 5 minutes

# Monitor system resources (-d, joins multiple PIDs with commas for top)
top -p "$(pgrep -d, -f 'mdadm.*monitor')"
iostat -x 1 10

# Check for underlying RAID issues
cat /proc/mdstat
sudo mdadm --detail /dev/md0

# Review system logs for errors
sudo dmesg | grep -E "(md|raid)"
sudo journalctl -f | grep mdadm
```
Issue 4: False Positive Alerts
Symptoms:
- Frequent unnecessary alerts
- Alerts for normal operations
- Monitoring script triggering incorrectly
Solutions:
```bash
#!/bin/bash
# Add filtering to notification scripts:
# enhanced raid-alert.sh with filtering
EVENT="$1"
DEVICE="$2"
COMPONENT="$3"

# Filter out routine events
case "$EVENT" in
    "NewArray"|"DeviceDisappeared")
        # Log but don't alert for routine array changes during maintenance
        logger "RAID Info: $EVENT on $DEVICE"
        exit 0
        ;;
    "TestMessage")
        # Skip test messages
        exit 0
        ;;
esac

# Continue with normal alert processing...
```
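Filtering by event type does not help when the same legitimate event fires repeatedly. One possible complement is to suppress repeats of the same event and device inside a time window. The sketch below assumes a hypothetical helper `should_alert` and a caller-chosen state directory; it relies on `date +%s`, which is widely available but not strictly POSIX.

```bash
#!/bin/sh
# should_alert KEY WINDOW_SECONDS STATE_DIR
# Returns 0 if KEY has not been seen within the last WINDOW_SECONDS
# (and records the current time), or 1 if the alert should be suppressed.
should_alert() {
    key=$(printf '%s' "$1" | tr -c 'A-Za-z0-9_.-' '_')   # safe file name
    stamp="$3/$key"
    now=$(date +%s)
    if [ -f "$stamp" ]; then
        last=$(cat "$stamp")
        if [ $((now - last)) -lt "$2" ]; then
            return 1
        fi
    fi
    mkdir -p "$3"
    printf '%s\n' "$now" > "$stamp"
    return 0
}

# Usage inside an alert script (paths and window are examples):
#   if should_alert "$EVENT:$DEVICE" 3600 /var/lib/raid-alert; then
#       ...send the notification...
#   fi
```

Keying on both event and device means a failure on a second array still gets through while repeats for the first are held back.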
Best Practices and Professional Tips
Security Considerations
```bash
# Run monitoring with appropriate permissions
# Create dedicated user for mdadm monitoring
sudo useradd -r -s /bin/false -d /var/lib/mdadm mdadm-monitor

# Set proper file permissions
sudo chown root:root /etc/mdadm/mdadm.conf
sudo chmod 644 /etc/mdadm/mdadm.conf

# Secure notification scripts
sudo chown root:root /usr/local/bin/raid-alert.sh
sudo chmod 755 /usr/local/bin/raid-alert.sh
```
Performance Optimization
```bash
# Optimize monitoring intervals based on environment
# Production: 60-120 seconds
# Development: 300-600 seconds
# Critical systems: 30-60 seconds

# Report events through syslog as well
sudo mdadm --monitor --scan --daemonise --delay=60 --syslog

# Monitor monitoring performance
#!/bin/bash
# Monitor the monitor script
while true; do
    # -d, joins multiple PIDs with commas so ps accepts them as one list
    MONITOR_PID=$(pgrep -d, -f "mdadm.*monitor")
    if [ -n "$MONITOR_PID" ]; then
        ps -o pid,pcpu,pmem,time,cmd -p "$MONITOR_PID"
    else
        echo "$(date): mdadm monitor not running!" >> /var/log/monitor-check.log
    fi
    sleep 300
done
```
Backup and Recovery
```bash
# Backup mdadm configuration
sudo cp /etc/mdadm/mdadm.conf /etc/mdadm/mdadm.conf.backup.$(date +%Y%m%d)

# Create configuration recovery script
#!/bin/bash
# /usr/local/bin/recover-mdadm-config.sh
echo "Recovering mdadm configuration..."
# --detail --scan emits only ARRAY lines: re-add MAILADDR and PROGRAM
# to the recovered file before relying on alerts
sudo mdadm --detail --scan > /tmp/mdadm.conf.recovered
sudo cp /tmp/mdadm.conf.recovered /etc/mdadm/mdadm.conf
sudo systemctl restart mdmonitor
echo "Configuration recovered and monitoring restarted"
```
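A dated copy per day can still pile up identical backups, or miss a second change on the same day. As a sketch, the helper below (hypothetical name `backup_if_changed`, with a caller-chosen backup directory) saves a timestamped copy only when the configuration differs from the most recent backup.

```bash
#!/bin/sh
# backup_if_changed CONF BACKUP_DIR
# Copies CONF into BACKUP_DIR under a timestamped name unless it is
# byte-identical to the newest existing backup.
backup_if_changed() {
    conf=$1
    dir=$2
    mkdir -p "$dir" || return 1
    latest=""
    # Glob results sort lexically, so the last match has the newest timestamp.
    for f in "$dir"/mdadm.conf.*; do
        if [ -e "$f" ]; then
            latest=$f
        fi
    done
    if [ -n "$latest" ] && cmp -s "$conf" "$latest"; then
        echo "unchanged"
        return 0
    fi
    cp -p "$conf" "$dir/mdadm.conf.$(date +%Y%m%d-%H%M%S)" && echo "saved"
}

# Usage (run as root; the backup directory is an example):
#   backup_if_changed /etc/mdadm/mdadm.conf /var/backups/mdadm
```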
Monitoring the Monitor
Implement meta-monitoring to ensure your RAID monitoring is working:
```bash
#!/bin/bash
# /usr/local/bin/monitor-health-check.sh
# Cron job to verify RAID monitoring is functioning
MONITOR_PID=$(pgrep -f "mdadm.*monitor")
LOGFILE="/var/log/monitor-health.log"
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

if [ -z "$MONITOR_PID" ]; then
    echo "$TIMESTAMP: ERROR - mdadm monitor not running" >> "$LOGFILE"
    # Restart monitoring
    sudo systemctl restart mdmonitor
    echo "$TIMESTAMP: Attempted to restart mdmonitor" >> "$LOGFILE"
    # Send alert
    echo "RAID monitoring daemon was down and has been restarted on $(hostname)" | \
        mail -s "RAID Monitor Service Alert" admin@example.com
else
    echo "$TIMESTAMP: OK - mdadm monitor running (PID: $MONITOR_PID)" >> "$LOGFILE"
fi

# Add to root's crontab (runs every 5 minutes):
# */5 * * * * /usr/local/bin/monitor-health-check.sh
```
Integration with System Services
Systemd Integration
Create a custom systemd service for enhanced control:
```bash
# /etc/systemd/system/mdadm-monitor.service
[Unit]
Description=MD Array Monitor
After=multi-user.target

[Service]
Type=forking
ExecStart=/sbin/mdadm --monitor --scan --daemonise --delay=60 --syslog
KillMode=process
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target

# Enable and start the service
sudo systemctl daemon-reload
sudo systemctl enable mdadm-monitor.service
sudo systemctl start mdadm-monitor.service
```
Log Rotation
Configure log rotation for monitoring logs:
```bash
# /etc/logrotate.d/mdadm-monitor
# No postrotate step is needed: raid-alert.sh reopens the log on every event
/var/log/raid-alerts.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 644 root root
}
```
Startup Scripts
For systems without systemd:
```bash
#!/bin/bash
# /etc/init.d/mdadm-monitor
# SysV init script for mdadm monitoring
# The process pattern is "mdadm --monitor" so that pkill and pgrep do not
# match this script's own name (mdadm-monitor).
case "$1" in
    start)
        echo "Starting mdadm monitor..."
        /sbin/mdadm --monitor --scan --daemonise --delay=60 --syslog
        ;;
    stop)
        echo "Stopping mdadm monitor..."
        pkill -f "mdadm --monitor"
        ;;
    restart)
        $0 stop
        sleep 2
        $0 start
        ;;
    status)
        if pgrep -f "mdadm --monitor" > /dev/null; then
            echo "mdadm monitor is running"
        else
            echo "mdadm monitor is not running"
        fi
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
esac

# Make executable and add to system startup (run once from your shell):
#   chmod +x /etc/init.d/mdadm-monitor
#   update-rc.d mdadm-monitor defaults
```
Conclusion
Implementing robust RAID monitoring with `mdadm --monitor --scan --daemonise` is essential for maintaining data integrity and system reliability in Linux environments. This comprehensive guide has covered everything from basic setup to advanced configuration, troubleshooting, and best practices.
Key Takeaways
1. Automated Monitoring: The `--scan --daemonise` combination provides hands-off monitoring of all configured RAID arrays
2. Flexible Alerting: Custom scripts and multiple notification methods ensure you're informed of issues promptly
3. Proactive Maintenance: Regular monitoring helps identify potential problems before they cause data loss
4. Integration Capabilities: mdadm monitoring integrates well with existing system monitoring infrastructure
Next Steps
After implementing mdadm monitoring:
1. Test Your Setup: Regularly test notification systems and recovery procedures
2. Document Procedures: Maintain clear documentation of your monitoring configuration
3. Review and Optimize: Periodically review monitoring logs and adjust settings as needed
4. Plan for Scaling: Consider how monitoring will scale as you add more RAID arrays
Final Recommendations
- Always test monitoring configurations in non-production environments first
- Implement redundant notification methods (email, SMS, monitoring systems)
- Regularly review and update monitoring scripts and configurations
- Keep mdadm and related tools updated to the latest stable versions
- Maintain comprehensive backup strategies alongside RAID monitoring
By following the practices outlined in this guide, you'll have a robust, reliable RAID monitoring system that helps protect your data and ensures system availability. Remember that monitoring is just one part of a comprehensive data protection strategy – regular backups, proper hardware maintenance, and documented procedures are equally important for maintaining a healthy storage infrastructure.