# How to Generate Reports with Shell Scripts

Shell scripting provides a powerful and flexible approach to generating automated reports from a variety of data sources. Whether you need system monitoring reports, log analysis summaries, or business intelligence dashboards, shell scripts offer an efficient way to collect, process, and format data into meaningful reports. This comprehensive guide covers everything from basic report generation to advanced automation techniques.

## Table of Contents

1. [Prerequisites and Requirements](#prerequisites-and-requirements)
2. [Understanding Report Generation Fundamentals](#understanding-report-generation-fundamentals)
3. [Setting Up Your Environment](#setting-up-your-environment)
4. [Basic Report Structure and Components](#basic-report-structure-and-components)
5. [Data Collection Techniques](#data-collection-techniques)
6. [Formatting and Presentation](#formatting-and-presentation)
7. [Advanced Report Features](#advanced-report-features)
8. [Automation and Scheduling](#automation-and-scheduling)
9. [Real-World Examples](#real-world-examples)
10. [Troubleshooting Common Issues](#troubleshooting-common-issues)
11. [Best Practices and Tips](#best-practices-and-tips)
12. [Conclusion](#conclusion)

## Prerequisites and Requirements

Before diving into shell script report generation, make sure you have the following prerequisites.

### System Requirements

- Linux, macOS, or another Unix-like operating system
- Bash shell (version 4.0 or higher recommended)
- Basic command-line tools: `awk`, `sed`, `grep`, `sort`, `cut`
- A text editor (vim, nano, or your preferred editor)
- Appropriate permissions for accessing data sources

### Knowledge Prerequisites

- Basic understanding of shell scripting concepts
- Familiarity with command-line operations
- Understanding of file systems and permissions
- Basic knowledge of text processing tools

### Optional Tools

- `jq` for JSON processing
- `xmlstarlet` for XML parsing
- `csvkit` for CSV manipulation
- Database clients (`mysql`, `psql`) if working with databases
- Email utilities (`mailx`, `sendmail`) for report distribution

## Understanding Report Generation Fundamentals

Report generation with shell scripts involves several key components that work together to form a complete, automated reporting pipeline.

### Core Components of Shell Script Reports

1. **Data Collection**: Gathering information from various sources
2. **Data Processing**: Cleaning, filtering, and transforming raw data
3. **Data Analysis**: Performing calculations and generating insights
4. **Formatting**: Creating readable, professional output
5. **Distribution**: Delivering reports to stakeholders

### Types of Reports You Can Generate

Shell scripts excel at creating many kinds of reports:

- **System monitoring reports**: CPU usage, memory consumption, disk space
- **Log analysis reports**: Error summaries, access patterns, security events
- **Performance reports**: Application metrics, response times, throughput
- **Business reports**: Sales summaries, user activity, inventory status
- **Compliance reports**: Security audits, configuration checks, policy violations

## Setting Up Your Environment

Let's establish a proper working environment for report generation.

### Creating the Directory Structure

```bash
#!/bin/bash
# Create directory structure for report generation

mkdir -p ~/reports/{scripts,templates,output,logs,config}
mkdir -p ~/reports/output/{daily,weekly,monthly}
mkdir -p ~/reports/templates/{html,text,csv}

echo "Report generation environment created successfully!"
```
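To make the five core components concrete before we build real scripts, here is a minimal, hypothetical sketch of the whole pipeline. Every function body is a stand-in chosen for illustration; only the directory path comes from the layout created above:

```bash
#!/bin/bash
# Minimal sketch of the five-stage reporting pipeline.
# All function bodies are illustrative placeholders.

collect()    { uptime; df -h; }                              # 1. Data collection
process()    { grep -v '^$'; }                               # 2. Data processing (drop blank lines)
analyze()    { awk 'END { print "Lines analyzed: " NR }'; }  # 3. Data analysis
format_out() { sed 's/^/  /'; }                              # 4. Formatting (indent)
distribute() { cat > "$HOME/reports/output/daily/pipeline_demo.txt"; } # 5. Distribution

raw=$(collect | process)
summary=$(printf '%s\n' "$raw" | analyze)
printf '%s\n%s\n' "$raw" "$summary" | format_out | distribute
```

The rest of this guide fleshes out each stage with real collectors, formatters, and delivery mechanisms.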
### Basic Configuration File

Create a configuration file to store common settings:

```bash
#!/bin/bash
# config/report_config.sh -- report configuration

REPORT_DIR="$HOME/reports"
OUTPUT_DIR="$REPORT_DIR/output"
LOG_DIR="$REPORT_DIR/logs"
TEMPLATE_DIR="$REPORT_DIR/templates"

# Email settings
SMTP_SERVER="smtp.company.com"
FROM_EMAIL="reports@company.com"
ADMIN_EMAIL="admin@company.com"

# Database settings (if applicable)
DB_HOST="localhost"
DB_USER="report_user"
DB_NAME="analytics"

# Date formats
DATE_FORMAT="%Y-%m-%d"
DATETIME_FORMAT="%Y-%m-%d %H:%M:%S"
```

## Basic Report Structure and Components

Every effective report script should follow a consistent structure:

### Template Report Script

```bash
#!/bin/bash
# Basic report template
# Description: Template for generating system reports
# Author: Your Name

# Source configuration
source ~/reports/config/report_config.sh

# Global variables
SCRIPT_NAME=$(basename "$0")
REPORT_DATE=$(date +"$DATE_FORMAT")
LOG_FILE="$LOG_DIR/${SCRIPT_NAME%.sh}_${REPORT_DATE}.log"

# Logging function (timestamps each message at the moment it is logged)
log_message() {
    local level="$1"
    local message="$2"
    echo "[$(date +"$DATETIME_FORMAT")] [$level] $message" | tee -a "$LOG_FILE"
}

# Error handling
error_exit() {
    log_message "ERROR" "$1"
    exit 1
}

# Header function
generate_header() {
    cat << EOF
=====================================
System Report - $REPORT_DATE
Generated: $(date +"$DATETIME_FORMAT")
=====================================
EOF
}

# Footer function
generate_footer() {
    cat << EOF
=====================================
Report completed at: $(date +"$DATETIME_FORMAT")
Generated by: $SCRIPT_NAME
=====================================
EOF
}

# Main report function
generate_report() {
    local output_file="$OUTPUT_DIR/daily/system_report_$REPORT_DATE.txt"

    log_message "INFO" "Starting report generation"

    {
        generate_header
        # Add your report content here
        generate_footer
    } > "$output_file"

    log_message "INFO" "Report saved to: $output_file"
}

# Main execution
main() {
    log_message "INFO" "Script started"
    generate_report
    log_message "INFO" "Script completed successfully"
}

# Execute main function
main "$@"
```

## Data Collection Techniques

Effective report generation requires robust data collection. Here are several techniques for gathering information.

### System Information Collection

```bash
#!/bin/bash
# System information collector

collect_system_info() {
    cat << EOF
SYSTEM INFORMATION
==================
Hostname: $(hostname)
Operating System: $(uname -o)
Kernel Version: $(uname -r)
Architecture: $(uname -m)
Uptime: $(uptime -p)
Current Users: $(who | wc -l)
Load Average: $(uptime | awk -F'load average:' '{print $2}')
EOF
}

collect_resource_usage() {
    cat << EOF
RESOURCE USAGE
==============
Memory Usage:
$(free -h)

Disk Usage:
$(df -h | grep -E '^/dev/')

Top 5 CPU Consuming Processes:
$(ps aux --sort=-%cpu | head -6)

Top 5 Memory Consuming Processes:
$(ps aux --sort=-%mem | head -6)
EOF
}
```
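A quick way to smoke-test these collectors is to source them and redirect their combined output into a dated file. The `collectors.sh` filename below is illustrative; place the functions wherever suits your layout:

```bash
#!/bin/bash
# Assumes collect_system_info and collect_resource_usage live in
# scripts/collectors.sh (a hypothetical filename).
source ~/reports/scripts/collectors.sh

out=~/reports/output/daily/snapshot_$(date +%Y-%m-%d).txt
{
    collect_system_info
    echo
    collect_resource_usage
} > "$out"

echo "Snapshot written to $out"
```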
### Log File Analysis

```bash
#!/bin/bash
# Log analysis functions

analyze_apache_logs() {
    local log_file="/var/log/apache2/access.log"
    local start_date="$1"

    if [[ ! -f "$log_file" ]]; then
        echo "Log file not found: $log_file"
        return 1
    fi

    cat << EOF
APACHE LOG ANALYSIS - $start_date
=================================
Total Requests: $(wc -l < "$log_file")
Unique Visitors: $(awk '{print $1}' "$log_file" | sort -u | wc -l)

Top 10 Requested URLs:
$(awk '{print $7}' "$log_file" | sort | uniq -c | sort -nr | head -10)

Top 10 IP Addresses:
$(awk '{print $1}' "$log_file" | sort | uniq -c | sort -nr | head -10)

HTTP Status Code Summary:
$(awk '{print $9}' "$log_file" | sort | uniq -c | sort -nr)
EOF
}

analyze_error_logs() {
    local error_log="/var/log/apache2/error.log"
    local today=$(date +%Y-%m-%d)

    cat << EOF
ERROR LOG ANALYSIS - $today
===========================
Today's Errors: $(grep "$today" "$error_log" | wc -l)

Error Types:
$(grep "$today" "$error_log" | awk -F'] ' '{print $3}' | awk '{print $1}' | sort | uniq -c | sort -nr)

Recent Critical Errors:
$(grep -i "critical\|fatal\|emergency" "$error_log" | tail -5)
EOF
}
```

### Database Data Collection

```bash
#!/bin/bash
# Database data collection

collect_mysql_stats() {
    local db_host="$1"
    local db_user="$2"
    local db_pass="$3"

    cat << EOF
DATABASE STATISTICS
===================
$(mysql -h "$db_host" -u "$db_user" -p"$db_pass" -e "
    SELECT 'Total Databases' AS Metric, COUNT(*) AS Value
    FROM information_schema.SCHEMATA
    UNION ALL
    SELECT 'Total Tables', COUNT(*)
    FROM information_schema.TABLES
    UNION ALL
    SELECT 'Database Size (MB)', ROUND(SUM(data_length + index_length) / 1024 / 1024, 2)
    FROM information_schema.TABLES;
")

Top 10 Largest Tables:
$(mysql -h "$db_host" -u "$db_user" -p"$db_pass" -e "
    SELECT table_schema AS 'Database',
           table_name AS 'Table',
           ROUND(((data_length + index_length) / 1024 / 1024), 2) AS 'Size (MB)'
    FROM information_schema.TABLES
    ORDER BY (data_length + index_length) DESC
    LIMIT 10;
")
EOF
}
```

## Formatting and Presentation

Professional presentation has a significant impact on a report's readability and usefulness. Here are several formatting techniques.

### HTML Report Generation

```bash
#!/bin/bash
# HTML report generator

generate_html_header() {
    local title="$1"

    cat << EOF
<!DOCTYPE html>
<html>
<head>
    <title>$title</title>
</head>
<body>
<h1>$title</h1>
EOF
}

generate_html_footer() {
    cat << EOF
</body>
</html>
EOF
}

create_html_table() {
    local data="$1"
    local headers="$2"

    echo "<table border=\"1\">"
    echo "<tr>"
    for header in $headers; do
        echo "<th>$header</th>"
    done
    echo "</tr>"

    echo "$data" | while IFS= read -r line; do
        echo "<tr>"
        for field in $line; do
            echo "<td>$field</td>"
        done
        echo "</tr>"
    done

    echo "</table>"
}
```

### CSV Report Generation

```bash
#!/bin/bash
# CSV report functions

generate_csv_report() {
    local output_file="$1"

    # CSV header
    echo "Metric,Value,Timestamp" > "$output_file"

    # System metrics
    echo "CPU Usage,$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1),$(date)" >> "$output_file"
    echo "Memory Usage,$(free | grep Mem | awk '{printf "%.2f", $3/$2 * 100.0}'),$(date)" >> "$output_file"
    echo "Disk Usage,$(df / | tail -1 | awk '{print $5}' | cut -d'%' -f1),$(date)" >> "$output_file"
    echo "Load Average,$(uptime | awk -F'load average:' '{print $2}' | cut -d',' -f1),$(date)" >> "$output_file"
}

csv_to_html_table() {
    local csv_file="$1"

    echo "<table border=\"1\">"

    # Header row
    head -1 "$csv_file" | sed 's/,/<\/th><th>/g; s/^/<tr><th>/; s/$/<\/th><\/tr>/'

    # Data rows
    tail -n +2 "$csv_file" | sed 's/,/<\/td><td>/g; s/^/<tr><td>/; s/$/<\/td><\/tr>/'

    echo "</table>"
}
```
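Putting the HTML pieces together, a complete page is just the three functions chained in a group command. This short sketch feeds `create_html_table` some live `df` output; the column selection is arbitrary:

```bash
{
    generate_html_header "Disk Summary - $(hostname)"
    create_html_table "$(df -h | awk 'NR>1 {print $1, $5, $6}')" "Filesystem UsePct Mount"
    generate_html_footer
} > ~/reports/output/daily/disk_summary.html
```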
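A typical flow writes the CSV first, so it can be archived or imported elsewhere, and derives the HTML view from it. This sketch assumes the functions above are sourced into one script:

```bash
csv_file=~/reports/output/daily/metrics_$(date +%Y-%m-%d).csv
generate_csv_report "$csv_file"

{
    generate_html_header "Daily Metrics"
    csv_to_html_table "$csv_file"
    generate_html_footer
} > "${csv_file%.csv}.html"
```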
### Advanced Formatting with Charts

```bash
#!/bin/bash
# Simple ASCII chart generation

generate_bar_chart() {
    local title="$1"
    local max_width=50
    local data
    data=$(cat)   # read "label value" pairs from stdin

    echo "$title"
    echo "$(printf '=%.0s' $(seq 1 ${#title}))"
    echo

    # Find the maximum value for scaling
    local max_val=$(echo "$data" | awk '{print $2}' | sort -n | tail -1)

    echo "$data" | while read -r label value; do
        local bar_length=$(( value * max_width / max_val ))
        local bar=$(printf '#%.0s' $(seq 1 $bar_length))
        printf "%-15s |%s %d\n" "$label" "$bar" "$value"
    done
}

# Usage example:
echo -e "CPU\t85\nMemory\t67\nDisk\t45\nNetwork\t23" | generate_bar_chart "Resource Usage (%)"
```

## Advanced Report Features

### Conditional Alerts and Notifications

```bash
#!/bin/bash
# Alert system for reports

check_system_health() {
    local alerts=()

    # Check CPU usage
    local cpu_usage=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
    if (( $(echo "$cpu_usage > 80" | bc -l) )); then
        alerts+=("HIGH CPU USAGE: ${cpu_usage}%")
    fi

    # Check memory usage
    local mem_usage=$(free | grep Mem | awk '{printf "%.1f", $3/$2 * 100.0}')
    if (( $(echo "$mem_usage > 85" | bc -l) )); then
        alerts+=("HIGH MEMORY USAGE: ${mem_usage}%")
    fi

    # Check disk usage
    local disk_usage=$(df / | tail -1 | awk '{print $5}' | cut -d'%' -f1)
    if [[ $disk_usage -gt 90 ]]; then
        alerts+=("HIGH DISK USAGE: ${disk_usage}%")
    fi

    # Generate alert section
    if [[ ${#alerts[@]} -gt 0 ]]; then
        echo "⚠️ SYSTEM ALERTS"
        echo "=================="
        for alert in "${alerts[@]}"; do
            echo "πŸ”΄ $alert"
        done
        echo

        # Send email notification
        send_alert_email "${alerts[@]}"
    else
        echo "βœ… SYSTEM STATUS: ALL NORMAL"
        echo "============================"
        echo
    fi
}

send_alert_email() {
    local alerts=("$@")
    local subject="System Alert - $(hostname) - $(date +%Y-%m-%d)"
    local body="The following alerts were detected:\n\n"

    for alert in "${alerts[@]}"; do
        body+="- $alert\n"
    done

    body+="\nPlease investigate immediately.\n\nGenerated by: $SCRIPT_NAME"

    echo -e "$body" | mail -s "$subject" "$ADMIN_EMAIL"
}
```

### Historical Data Comparison

```bash
#!/bin/bash
# Historical data comparison

compare_with_previous() {
    local current_value="$1"
    local metric_name="$2"
    local history_file="$LOG_DIR/${metric_name}_history.log"

    # Save the current value
    echo "$(date +%s),$current_value" >> "$history_file"

    # Get the most recent value recorded at least a day ago
    local yesterday=$(date -d "yesterday" +%s)
    local previous_value=$(awk -F',' -v target="$yesterday" '
        $1 <= target { closest = $2 }
        END { print closest }
    ' "$history_file")

    if [[ -n "$previous_value" ]]; then
        local change=$(echo "scale=2; $current_value - $previous_value" | bc)
        local percent_change=$(echo "scale=2; ($change / $previous_value) * 100" | bc 2>/dev/null || echo "0")

        echo "$metric_name: $current_value (Change: $change, ${percent_change}%)"

        # Trend indicator
        if (( $(echo "$change > 0" | bc -l) )); then
            echo "  πŸ“ˆ Trending UP"
        elif (( $(echo "$change < 0" | bc -l) )); then
            echo "  πŸ“‰ Trending DOWN"
        else
            echo "  ➑️ No change"
        fi
    else
        echo "$metric_name: $current_value (No historical data)"
    fi
}
```

### Multi-Source Data Aggregation

```bash
#!/bin/bash
# Multi-source data aggregation

aggregate_server_data() {
    local servers=("web1.company.com" "web2.company.com" "db1.company.com")
    local temp_dir="/tmp/report_$$"

    mkdir -p "$temp_dir"

    echo "MULTI-SERVER REPORT"
    echo "==================="
    echo

    for server in "${servers[@]}"; do
        echo "Collecting data from $server..."

        # Collect data via SSH; escaped "\$(...)" substitutions run remotely
        ssh "$server" "
            echo 'SERVER: $server'
            echo 'CPU: '\$(top -bn1 | grep \"Cpu(s)\" | awk '{print \$2}')
            echo 'Memory: '\$(free | grep Mem | awk '{printf \"%.1f%%\", \$3/\$2 * 100.0}')
            echo 'Disk: '\$(df / | tail -1 | awk '{print \$5}')
            echo 'Uptime: '\$(uptime -p)
            echo '---'
        " > "$temp_dir/${server}.data" 2>/dev/null

        if [[ $? -eq 0 ]]; then
            cat "$temp_dir/${server}.data"
        else
            echo "❌ Failed to connect to $server"
        fi
        echo
    done

    # Cleanup
    rm -rf "$temp_dir"
}
```

## Automation and Scheduling

### Cron Integration

```bash
#!/bin/bash
# Cron setup script

setup_report_cron() {
    local cron_file="/tmp/report_cron"

    cat << 'EOF' > "$cron_file"
# Daily system report at 6 AM
0 6 * * * /home/user/reports/scripts/daily_report.sh

# Weekly report every Monday at 7 AM
0 7 * * 1 /home/user/reports/scripts/weekly_report.sh

# Monthly report on the 1st at 8 AM
0 8 1 * * /home/user/reports/scripts/monthly_report.sh

# Hourly monitoring report
0 * * * * /home/user/reports/scripts/hourly_monitor.sh
EOF

    # Note: this replaces the current user's entire crontab
    crontab "$cron_file"
    rm "$cron_file"

    echo "Cron jobs installed successfully!"
    echo "Current crontab:"
    crontab -l
}
```

### Report Rotation and Cleanup

```bash
#!/bin/bash
# Report cleanup and rotation

cleanup_old_reports() {
    local retention_days=30
    local archive_days=7

    echo "Starting report cleanup..."

    # Compress reports older than the archive threshold
    find "$OUTPUT_DIR" -name "*.txt" -mtime +$archive_days -mtime -$retention_days -exec gzip {} \;

    # Remove reports past the retention period
    find "$OUTPUT_DIR" -name "*.txt.gz" -mtime +$retention_days -delete
    find "$OUTPUT_DIR" -name "*.txt" -mtime +$retention_days -delete

    # Clean log files
    find "$LOG_DIR" -name "*.log" -mtime +$retention_days -delete

    echo "Cleanup completed."

    # Report cleanup statistics
    echo "Current storage usage:"
    du -sh "$REPORT_DIR"/*
}
```
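One scheduling pitfall worth guarding against: cron happily starts a new run even if the previous one is still going. A common safeguard, sketched below with the hypothetical `daily_report.sh` from the crontab above, is to take a `flock` lock so overlapping runs exit immediately:

```bash
#!/bin/bash
# Either wrap the crontab entry itself:
#   0 6 * * * flock -n /tmp/daily_report.lock /home/user/reports/scripts/daily_report.sh
#
# Or guard inside the script:
exec 9> /tmp/daily_report.lock
if ! flock -n 9; then
    echo "Previous run still active; exiting." >&2
    exit 0
fi
# ... the rest of the report script runs under the lock ...
```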

## Real-World Examples

### Complete System Monitoring Report

```bash
#!/bin/bash
# Complete system monitoring report
# File: system_monitor.sh

source ~/reports/config/report_config.sh

REPORT_FILE="$OUTPUT_DIR/daily/system_monitor_$(date +%Y%m%d).html"

generate_system_report() {
    {
        generate_html_header "System Monitoring Report - $(hostname)"

        cat << EOF
<h2>System Monitoring Report</h2>
<p><strong>Server:</strong> $(hostname)</p>
<p><strong>Generated:</strong> $(date)</p>

<h3>System Overview</h3>
<ul>
    <li>Uptime: $(uptime -p)</li>
    <li>Load Average: $(uptime | awk -F'load average:' '{print $2}')</li>
    <li>Users Online: $(who | wc -l)</li>
</ul>

<h3>Resource Usage</h3>
EOF

        # Check for high resource usage and apply the matching CSS class
        local cpu_usage=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
        local mem_usage=$(free | grep Mem | awk '{printf "%.1f", $3/$2 * 100.0}')

        if (( $(echo "$cpu_usage > 80" | bc -l) )); then
            echo '<div class="alert">'
        else
            echo '<div class="normal">'
        fi
        echo "CPU Usage: ${cpu_usage}%</div>"

        if (( $(echo "$mem_usage > 85" | bc -l) )); then
            echo '<div class="alert">'
        else
            echo '<div class="normal">'
        fi
        echo "Memory Usage: ${mem_usage}%</div>"

        cat << EOF
<h3>Disk Usage</h3>
<table border="1">
<tr><th>Filesystem</th><th>Size</th><th>Used</th><th>Available</th><th>Use%</th><th>Mounted on</th></tr>
EOF

        df -h | grep -E '^/dev/' | while read -r filesystem size used avail percent mount; do
            local usage_num=$(echo "$percent" | cut -d'%' -f1)
            if [[ $usage_num -gt 90 ]]; then
                echo '<tr class="alert">'
            else
                echo '<tr>'
            fi
            echo "<td>$filesystem</td><td>$size</td><td>$used</td><td>$avail</td><td>$percent</td><td>$mount</td></tr>"
        done

        cat << EOF
</table>

<h3>Top Processes</h3>
<h4>CPU Usage</h4>
<table border="1">
<tr><th>PID</th><th>User</th><th>CPU%</th><th>Memory%</th><th>Command</th></tr>
EOF

        ps aux --sort=-%cpu | head -6 | tail -5 | while read -r user pid cpu mem vsz rss tty stat start time command; do
            echo "<tr><td>$pid</td><td>$user</td><td>$cpu</td><td>$mem</td><td>$command</td></tr>"
        done

        echo "</table>"

        generate_html_footer
    } > "$REPORT_FILE"

    echo "System report generated: $REPORT_FILE"
}

# Execute report generation
generate_system_report

# Email the report ("-a file" attaches with bsd-mailx/s-nail; GNU mailutils uses -A)
if [[ -f "$REPORT_FILE" ]]; then
    mail -s "Daily System Report - $(hostname)" -a "$REPORT_FILE" "$ADMIN_EMAIL" < /dev/null
fi
```

### Log Analysis Report

```bash
#!/bin/bash
# Web server log analysis report
# File: web_log_analyzer.sh

source ~/reports/config/report_config.sh

APACHE_LOG="/var/log/apache2/access.log"
ERROR_LOG="/var/log/apache2/error.log"
REPORT_DATE=$(date +%Y-%m-%d)
YESTERDAY=$(date -d "yesterday" +%Y-%m-%d)

analyze_web_logs() {
    local output_file="$OUTPUT_DIR/daily/web_analysis_$REPORT_DATE.txt"

    {
        echo "WEB SERVER LOG ANALYSIS REPORT"
        echo "=============================="
        echo "Date: $REPORT_DATE"
        echo "Server: $(hostname)"
        echo

        # Yesterday's statistics
        echo "YESTERDAY'S STATISTICS ($YESTERDAY)"
        echo "==================================="

        local yesterday_requests=$(grep "$YESTERDAY" "$APACHE_LOG" | wc -l)
        local yesterday_unique_ips=$(grep "$YESTERDAY" "$APACHE_LOG" | awk '{print $1}' | sort -u | wc -l)
        local yesterday_bandwidth=$(grep "$YESTERDAY" "$APACHE_LOG" | awk '{sum += $10} END {printf "%.2f MB", sum/1024/1024}')

        echo "Total Requests: $yesterday_requests"
        echo "Unique Visitors: $yesterday_unique_ips"
        echo "Bandwidth Used: $yesterday_bandwidth"
        echo

        # Top URLs
        echo "TOP 10 REQUESTED URLS"
        echo "====================="
        grep "$YESTERDAY" "$APACHE_LOG" | awk '{print $7}' | sort | uniq -c | sort -nr | head -10
        echo

        # Top IP addresses
        echo "TOP 10 IP ADDRESSES"
        echo "==================="
        grep "$YESTERDAY" "$APACHE_LOG" | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
        echo

        # HTTP status codes
        echo "HTTP STATUS CODE SUMMARY"
        echo "========================"
        grep "$YESTERDAY" "$APACHE_LOG" | awk '{print $9}' | sort | uniq -c | sort -nr
        echo

        # Error analysis
        echo "ERROR ANALYSIS"
        echo "=============="
        local error_count=$(grep "$YESTERDAY" "$ERROR_LOG" | wc -l)
        echo "Total Errors: $error_count"

        if [[ $error_count -gt 0 ]]; then
            echo
            echo "Error Types:"
            grep "$YESTERDAY" "$ERROR_LOG" | awk -F'] ' '{print $3}' | awk '{print $1}' | sort | uniq -c | sort -nr
            echo
            echo "Recent Critical Errors:"
            grep -i "critical\|fatal\|emergency" "$ERROR_LOG" | grep "$YESTERDAY" | tail -5
        fi

        echo
        echo "Report generated at: $(date)"
    } > "$output_file"

    echo "Web log analysis completed: $output_file"
}

# Execute analysis
if [[ -f "$APACHE_LOG" ]]; then
    analyze_web_logs
else
    echo "Error: Apache log file not found at $APACHE_LOG"
    exit 1
fi
```

## Troubleshooting Common Issues

### Permission Problems

```bash
#!/bin/bash
# Permission checker and fixer

check_permissions() {
    local issues=()

    # Check script permissions
    if [[ ! -x "$0" ]]; then
        issues+=("Script is not executable: $0")
    fi

    # Check directory permissions
    for dir in "$REPORT_DIR" "$OUTPUT_DIR" "$LOG_DIR"; do
        if [[ ! -w "$dir" ]]; then
            issues+=("Directory not writable: $dir")
        fi
    done

    # Check log file access
    for log_file in "/var/log/apache2/access.log" "/var/log/syslog"; do
        if [[ -f "$log_file" && ! -r "$log_file" ]]; then
            issues+=("Log file not readable: $log_file")
        fi
    done

    if [[ ${#issues[@]} -gt 0 ]]; then
        echo "Permission Issues Found:"
        for issue in "${issues[@]}"; do
            echo "❌ $issue"
        done
        return 1
    else
        echo "βœ… All permissions OK"
        return 0
    fi
}

fix_permissions() {
    echo "Attempting to fix permission issues..."

    # Make the script executable
    chmod +x "$0"

    # Create directories with proper permissions
    mkdir -p "$REPORT_DIR" "$OUTPUT_DIR" "$LOG_DIR"
    chmod 755 "$REPORT_DIR" "$OUTPUT_DIR" "$LOG_DIR"

    # Set up log rotation for large log files
    setup_log_rotation

    echo "Permission fixes completed."
}

setup_log_rotation() {
    local logrotate_config="/tmp/report_logrotate"

    cat << EOF > "$logrotate_config"
$LOG_DIR/*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    create 644
}
EOF

    sudo cp "$logrotate_config" "/etc/logrotate.d/reports"
    rm "$logrotate_config"
}
```
### Data Validation

```bash
#!/bin/bash
# Data validation functions

validate_numeric() {
    local value="$1"
    local field_name="$2"

    if [[ ! "$value" =~ ^[0-9]+\.?[0-9]*$ ]]; then
        log_message "WARNING" "Invalid numeric value for $field_name: $value"
        return 1
    fi
    return 0
}

validate_date() {
    local date_string="$1"

    if ! date -d "$date_string" >/dev/null 2>&1; then
        log_message "ERROR" "Invalid date format: $date_string"
        return 1
    fi
    return 0
}

sanitize_input() {
    local input="$1"
    # Remove potentially dangerous shell metacharacters
    echo "$input" | sed 's/[;&|`$(){}[\]*?<>]//g'
}

validate_file_exists() {
    local file_path="$1"
    local description="$2"

    if [[ ! -f "$file_path" ]]; then
        log_message "ERROR" "$description not found: $file_path"
        return 1
    fi

    if [[ ! -r "$file_path" ]]; then
        log_message "ERROR" "$description not readable: $file_path"
        return 1
    fi
    return 0
}

check_disk_space() {
    local required_mb="$1"
    local path="$2"
    local available_mb=$(df "$path" | tail -1 | awk '{print int($4/1024)}')

    if [[ $available_mb -lt $required_mb ]]; then
        log_message "ERROR" "Insufficient disk space. Required: ${required_mb}MB, Available: ${available_mb}MB"
        return 1
    fi
    return 0
}
```
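These validators compose naturally into a guard block at the top of a report script. The `preflight` helper below is a hypothetical name; it assumes `log_message`, `error_exit`, and the validators above have already been sourced:

```bash
preflight() {
    validate_file_exists "/var/log/apache2/access.log" "Apache access log" || return 1
    check_disk_space 100 "$OUTPUT_DIR" || return 1
    validate_date "$1" || return 1
    return 0
}

preflight "$(date +%Y-%m-%d)" || error_exit "Preflight checks failed"
```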
### Error Recovery and Resilience

```bash
#!/bin/bash
# Error recovery functions

retry_command() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local command=("$@")
    local attempt=1

    while [[ $attempt -le $max_attempts ]]; do
        if "${command[@]}"; then
            return 0
        else
            log_message "WARNING" "Command failed (attempt $attempt/$max_attempts): ${command[*]}"
            if [[ $attempt -lt $max_attempts ]]; then
                sleep "$delay"
            fi
            ((attempt++))
        fi
    done

    log_message "ERROR" "Command failed after $max_attempts attempts: ${command[*]}"
    return 1
}

safe_remote_execution() {
    local server="$1"
    local command="$2"
    local timeout="$3"

    timeout "$timeout" ssh -o ConnectTimeout=10 -o StrictHostKeyChecking=no "$server" "$command" 2>/dev/null
    local exit_code=$?

    case $exit_code in
        0)
            return 0
            ;;
        124)
            log_message "WARNING" "Command timed out on $server"
            return 1
            ;;
        255)
            log_message "WARNING" "SSH connection failed to $server"
            return 1
            ;;
        *)
            log_message "WARNING" "Command failed on $server with exit code $exit_code"
            return 1
            ;;
    esac
}

create_backup() {
    local source_file="$1"
    local backup_dir="$2"

    if [[ -f "$source_file" ]]; then
        local backup_file="$backup_dir/$(basename "$source_file").$(date +%Y%m%d_%H%M%S).backup"
        cp "$source_file" "$backup_file"
        log_message "INFO" "Backup created: $backup_file"
    fi
}
```
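As a usage example, wrapping a flaky collection step in `retry_command` looks like this (three attempts, five seconds apart; the server name and paths are illustrative):

```bash
retry_command 3 5 scp web1.company.com:/var/log/apache2/access.log /tmp/access.log \
    || error_exit "Could not fetch remote access log"
```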
## Best Practices and Tips

### Code Organization and Maintainability

Following these practices keeps your reporting scripts maintainable, reliable, and scalable.

#### Modular Design

```bash
#!/bin/bash
# Modular report design example

# Source common functions
source "${SCRIPT_DIR}/lib/logging.sh"
source "${SCRIPT_DIR}/lib/data_collection.sh"
source "${SCRIPT_DIR}/lib/formatting.sh"
source "${SCRIPT_DIR}/lib/email.sh"

# Main report orchestrator
main() {
    init_logging
    validate_environment

    collect_data || error_exit "Data collection failed"
    process_data || error_exit "Data processing failed"
    format_report || error_exit "Report formatting failed"
    distribute_report || error_exit "Report distribution failed"

    cleanup_temp_files
    log_message "INFO" "Report generation completed successfully"
}
```

#### Configuration Management

```bash
#!/bin/bash
# Advanced configuration management

load_config() {
    local config_file="$1"
    local env="${2:-production}"

    # Load base configuration
    if [[ -f "$config_file" ]]; then
        source "$config_file"
    else
        error_exit "Configuration file not found: $config_file"
    fi

    # Load environment-specific configuration
    local env_config="${config_file%.sh}_${env}.sh"
    if [[ -f "$env_config" ]]; then
        source "$env_config"
        log_message "INFO" "Loaded environment configuration: $env"
    fi
}

validate_config() {
    local required_vars=("REPORT_DIR" "OUTPUT_DIR" "LOG_DIR" "ADMIN_EMAIL")

    for var in "${required_vars[@]}"; do
        if [[ -z "${!var}" ]]; then
            error_exit "Required configuration variable not set: $var"
        fi
    done
}
```

### Security Considerations

#### Secure Credential Handling

```bash
#!/bin/bash
# Secure credential management

load_credentials() {
    local cred_file="$HOME/.config/reports/credentials"

    # Refuse to load credentials that other users could read
    # (stat -c is GNU coreutils; BSD/macOS stat uses different flags)
    if [[ -f "$cred_file" ]]; then
        local perms=$(stat -c %a "$cred_file")
        if [[ "$perms" != "600" ]]; then
            error_exit "Credentials file has incorrect permissions. Expected 600, found $perms"
        fi
        source "$cred_file"
    else
        error_exit "Credentials file not found: $cred_file"
    fi
}

# Prefer environment variables or secret files for sensitive data
get_db_password() {
    if [[ -n "$DB_PASSWORD" ]]; then
        echo "$DB_PASSWORD"
    elif [[ -f "/var/run/secrets/db_password" ]]; then
        cat "/var/run/secrets/db_password"
    else
        error_exit "Database password not available"
    fi
}
```

#### Input Sanitization

```bash
#!/bin/bash
# Input sanitization functions

sanitize_sql_input() {
    local input="$1"
    # Strip common SQL injection characters and comments
    echo "$input" | sed "s/[';\\\\]//g" | sed "s/--.*$//"
}

sanitize_filename() {
    local filename="$1"
    # Remove path traversal and dangerous characters
    echo "$filename" | sed 's/[^a-zA-Z0-9._-]/_/g' | sed 's/^\.*//' | cut -c1-255
}

validate_email() {
    local email="$1"

    if [[ "$email" =~ ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ ]]; then
        return 0
    else
        return 1
    fi
}
```

### Performance Optimization

#### Efficient Data Processing

```bash
#!/bin/bash
# Performance optimization techniques

# Use process substitution for large datasets
process_large_dataset() {
    local data_source="$1"

    while IFS= read -r line; do
        # Process each line efficiently
        process_line "$line"
    done < <(get_data_stream "$data_source")
}

# Parallel processing for multiple servers
collect_parallel_data() {
    local servers=("$@")
    local temp_dir="/tmp/parallel_$$"

    mkdir -p "$temp_dir"

    # Start one background job per server
    for server in "${servers[@]}"; do
        {
            collect_server_data "$server" > "$temp_dir/$server.data"
        } &
    done

    # Wait for all jobs to complete
    wait

    # Combine results
    cat "$temp_dir"/*.data > "$temp_dir/combined.data"
    cat "$temp_dir/combined.data"

    rm -rf "$temp_dir"
}

# Memory-efficient log processing (PROCINFO sorting requires gawk)
analyze_large_log() {
    local log_file="$1"

    # Stream the file once instead of loading it into memory
    awk '
    {
        # Count HTTP status codes
        status_codes[$9]++

        # Count IP addresses
        ip_addresses[$1]++

        # Track hourly requests
        hour = substr($4, 14, 2)
        hourly_requests[hour]++
    }
    END {
        print "Status Codes:"
        for (code in status_codes) {
            printf "%s: %d\n", code, status_codes[code]
        }

        print "\nTop 10 IP Addresses:"
        PROCINFO["sorted_in"] = "@val_num_desc"
        count = 0
        for (ip in ip_addresses) {
            printf "%s: %d\n", ip, ip_addresses[ip]
            if (++count >= 10) break
        }

        print "\nHourly Distribution:"
        for (hour in hourly_requests) {
            printf "%02d:00: %d\n", hour, hourly_requests[hour]
        }
    }' "$log_file"
}
```
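An alternative to hand-rolled background jobs is `xargs -P`, which caps how many collections run at once. This sketch reuses the same hypothetical `collect_server_data` helper from above, exported with `export -f` so the `bash -c` subshells can see it:

```bash
#!/bin/bash
collect_with_xargs() {
    local temp_dir="/tmp/parallel_$$"
    mkdir -p "$temp_dir"

    # Run at most 4 collections concurrently, one server per invocation
    printf '%s\n' "$@" | xargs -P 4 -I {} bash -c \
        'collect_server_data "$1" > "'"$temp_dir"'/$1.data"' _ {}

    cat "$temp_dir"/*.data
    rm -rf "$temp_dir"
}

export -f collect_server_data
collect_with_xargs web1.company.com web2.company.com db1.company.com
```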
### Testing and Quality Assurance

#### Unit Testing Framework

```bash
#!/bin/bash
# Simple testing framework for shell scripts

run_tests() {
    local test_count=0
    local passed_count=0
    local failed_tests=()

    echo "Running report generation tests..."

    # Test data collection
    if test_data_collection; then
        ((passed_count++))
    else
        failed_tests+=("data_collection")
    fi
    ((test_count++))

    # Test formatting
    if test_formatting; then
        ((passed_count++))
    else
        failed_tests+=("formatting")
    fi
    ((test_count++))

    # Test validation
    if test_validation; then
        ((passed_count++))
    else
        failed_tests+=("validation")
    fi
    ((test_count++))

    # Print results
    echo "Test Results: $passed_count/$test_count passed"

    if [[ ${#failed_tests[@]} -gt 0 ]]; then
        echo "Failed tests: ${failed_tests[*]}"
        return 1
    fi
    return 0
}

test_data_collection() {
    local test_data=$(collect_system_info 2>/dev/null)
    [[ -n "$test_data" ]] && [[ "$test_data" =~ "SYSTEM INFORMATION" ]]
}

test_formatting() {
    local test_input="Test,Data,123"
    local formatted=$(echo "$test_input" | format_csv_line 2>/dev/null)
    [[ "$formatted" =~ "<td>" ]]
}

test_validation() {
    validate_numeric "123.45" "test_field" >/dev/null 2>&1
}
```

### Documentation and Maintenance

#### Self-Documenting Scripts

```bash
#!/bin/bash
# Self-documenting report script
#
# Purpose: Generate comprehensive system monitoring reports
# Author: Your Name
# Version: 2.1.0
# Created: 2024-01-15
# Modified: 2024-01-20
#
# Dependencies:
#   - bash 4.0+
#   - awk, sed, grep
#   - mailx (for email notifications)
#   - bc (for calculations)
#
# Configuration:
#   - Edit ~/reports/config/report_config.sh
#   - Set up a cron job for automation
#   - Ensure proper permissions on log files
#
# Usage: ./system_report.sh [options] (see show_help below)

show_help() {
    cat << 'EOF'
System Report Generator

USAGE:
    system_report.sh [OPTIONS]

OPTIONS:
    -h, --help         Show this help message
    -v, --verbose      Enable verbose output
    -d, --debug        Enable debug mode
    -o, --output DIR   Specify output directory
    -f, --format FMT   Output format (html, text, csv)
    -e, --email ADDR   Email address for report delivery

EXAMPLES:
    # Generate standard HTML report
    ./system_report.sh

    # Generate text report with verbose output
    ./system_report.sh -v -f text

    # Generate report and email to admin
    ./system_report.sh -e admin@company.com

    # Debug mode for troubleshooting
    ./system_report.sh --debug

CONFIGURATION:
    Edit ~/reports/config/report_config.sh to customize:
    - Output directories
    - Email settings
    - Alert thresholds
    - Database connections

AUTOMATION:
    Add to crontab for scheduled execution:
    0 6 * * * /path/to/system_report.sh -e admin@company.com

For more information, visit: https://company.com/docs/reports
EOF
}
```
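The header documents the options, but the script still needs to parse them. A minimal sketch using `getopts` follows (short options only; long-option handling is left out for brevity, and the variable names are illustrative):

```bash
VERBOSE=0
DEBUG=0
OUTPUT_DIR_OVERRIDE=""
FORMAT="html"
EMAIL_TO=""

while getopts ":hvdo:f:e:" opt; do
    case $opt in
        h) show_help; exit 0 ;;
        v) VERBOSE=1 ;;
        d) DEBUG=1; set -x ;;
        o) OUTPUT_DIR_OVERRIDE="$OPTARG" ;;
        f) FORMAT="$OPTARG" ;;
        e) EMAIL_TO="$OPTARG" ;;
        \?) echo "Unknown option: -$OPTARG" >&2; show_help; exit 1 ;;
        :)  echo "Option -$OPTARG requires an argument" >&2; exit 1 ;;
    esac
done
shift $((OPTIND - 1))
```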
### Advanced Monitoring and Alerting

#### Health Check Integration

```bash
#!/bin/bash
# Health check and monitoring integration

perform_health_check() {
    local health_score=100
    local issues=()

    # Check CPU usage
    local cpu_usage=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
    if (( $(echo "$cpu_usage > 90" | bc -l) )); then
        health_score=$((health_score - 30))
        issues+=("Critical CPU usage: ${cpu_usage}%")
    elif (( $(echo "$cpu_usage > 80" | bc -l) )); then
        health_score=$((health_score - 15))
        issues+=("High CPU usage: ${cpu_usage}%")
    fi

    # Check memory usage
    local mem_usage=$(free | grep Mem | awk '{printf "%.1f", $3/$2 * 100.0}')
    if (( $(echo "$mem_usage > 95" | bc -l) )); then
        health_score=$((health_score - 35))
        issues+=("Critical memory usage: ${mem_usage}%")
    elif (( $(echo "$mem_usage > 85" | bc -l) )); then
        health_score=$((health_score - 20))
        issues+=("High memory usage: ${mem_usage}%")
    fi

    # Check disk usage on every mounted device
    while read -r filesystem percent; do
        local usage_num=$(echo "$percent" | cut -d'%' -f1)
        if [[ $usage_num -gt 95 ]]; then
            health_score=$((health_score - 25))
            issues+=("Critical disk usage on $filesystem: $percent")
        elif [[ $usage_num -gt 85 ]]; then
            health_score=$((health_score - 10))
            issues+=("High disk usage on $filesystem: $percent")
        fi
    done < <(df -h | grep -E '^/dev/' | awk '{print $1, $5}')

    # Generate the health report
    {
        echo "SYSTEM HEALTH CHECK"
        echo "==================="
        echo "Health Score: $health_score/100"
        echo "Timestamp: $(date)"
        echo

        if [[ $health_score -ge 90 ]]; then
            echo "Status: βœ… EXCELLENT"
        elif [[ $health_score -ge 75 ]]; then
            echo "Status: ⚠️ GOOD"
        elif [[ $health_score -ge 50 ]]; then
            echo "Status: ⚠️ WARNING"
        else
            echo "Status: πŸ”΄ CRITICAL"
        fi

        if [[ ${#issues[@]} -gt 0 ]]; then
            echo
            echo "Issues Detected:"
            for issue in "${issues[@]}"; do
                echo "- $issue"
            done
        fi
    } | tee "$OUTPUT_DIR/health_check_$(date +%Y%m%d_%H%M%S).txt"

    # Callers can read the score from $?
    return $health_score
}
```

## Conclusion

Shell script report generation provides a powerful, flexible, and cost-effective solution for automating data collection, analysis, and presentation. Throughout this guide, we've explored the fundamental concepts, practical techniques, and advanced features that make shell scripting an excellent choice for report automation.

### Key Takeaways

1. **Flexibility and Integration**: Shell scripts excel at integrating data from multiple sources, including system commands, log files, databases, and remote servers. This versatility makes them ideal for comprehensive reports that span multiple systems and data sources.

2. **Automation Capabilities**: With proper scheduling through cron jobs and robust error handling, shell script reports can run unattended, providing consistent and reliable reporting without manual intervention.

3. **Cost-Effectiveness**: Unlike expensive commercial reporting tools, shell scripts leverage existing system tools and require no additional licensing costs, making them an economical choice for organizations of all sizes.

4. **Customization**: The programmatic nature of shell scripts allows for unlimited customization, enabling you to create reports that exactly match your organization's needs and formatting requirements.

5. **Performance**: When properly optimized, shell scripts can process large datasets efficiently, especially when combined with powerful Unix tools like `awk`, `sed`, and `grep`.
### Implementation Strategy

When implementing shell script reporting in your organization, consider this phased approach:

**Phase 1: Foundation**
- Set up the directory structure and configuration management
- Implement basic logging and error handling
- Create simple reports for critical systems

**Phase 2: Enhancement**
- Add HTML formatting and visual improvements
- Implement alert systems and notifications
- Create historical data tracking

**Phase 3: Automation**
- Set up automated scheduling with cron
- Implement report distribution via email
- Add cleanup and maintenance routines

**Phase 4: Advanced Features**
- Integrate multiple data sources
- Add trend analysis and historical comparisons
- Implement advanced security measures

### Future Considerations

As your reporting needs evolve, consider these potential enhancements:

- Integration with monitoring tools like Nagios, Zabbix, or Prometheus
- API integration for cloud services and modern applications
- Container-based deployment for scalability and consistency
- Integration with configuration management tools like Ansible or Puppet
- Advanced visualization using tools like gnuplot, or integration with web-based dashboards

### Final Recommendations

1. **Start Simple**: Begin with basic reports and gradually add complexity as your needs grow and your expertise develops.

2. **Focus on Reliability**: Invest time in proper error handling, logging, and testing to ensure your reports are dependable.

3. **Document Everything**: Maintain comprehensive documentation for configuration, usage, and troubleshooting procedures.

4. **Regular Maintenance**: Schedule regular reviews of your reporting scripts to ensure they remain effective and secure.

5. **Community Engagement**: Leverage online communities, forums, and documentation to stay current with best practices and new techniques.

Shell script report generation represents a powerful intersection of system administration, data analysis, and automation. By mastering these techniques, you'll be equipped to create robust, efficient, and maintainable reporting solutions that provide valuable insights into your systems and applications.

The techniques and examples provided in this guide offer a solid foundation for your reporting journey. Remember that effective reporting is not just about collecting data: it's about transforming raw information into actionable insights that drive better decision-making and improved system management.

Whether you're monitoring server performance, analyzing web traffic, tracking application metrics, or generating compliance reports, shell scripts provide the tools and flexibility needed to create professional, automated reporting solutions that scale with your organization's needs.