Disk-full events are among the most common causes of database crashes and service outages. When a disk reaches 100% usage, applications can no longer write logs, database engines halt to protect their data, and the system can become unresponsive.
In this guide, we will write a production-ready Bash script that runs periodically, monitors all mounted disks, filters out virtual filesystems (like loops or ramdisks), checks usage against a configurable threshold, and dispatches a Slack webhook alert when space is low.
DevOps best practice: Don't wait for your server to crash. Standardize monitoring scripts across your fleet, automated via Cron or Ansible playbooks.
The Bash Script
Let's look at the script. It relies on standard utilities: df to retrieve disk statistics, text tools such as grep to parse them, and curl to POST a JSON message to the Slack webhook.
#!/usr/bin/env bash
# ---------------------------------------------------------------------
# Description: Production disk space monitor script with Slack webhook.
# Author: Dinesh Sandilyan (Senior DevOps Engineer)
# ---------------------------------------------------------------------

set -euo pipefail

# --- CONFIGURATION ---
THRESHOLD=90
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
HOSTNAME=$(hostname)

# --- ALERTS FUNCTION ---
send_slack_alert() {
  local partition="$1"
  local use_percent="$2"
  local payload

  # Unquoted heredoc: variables expand, and the JSON double quotes must NOT
  # be backslash-escaped (a heredoc would keep the backslashes literally).
  payload="$(cat <<EOF
{
  "text": "🚨 *Disk Space Alert* on \`${HOSTNAME}\`",
  "attachments": [
    {
      "color": "#f7768e",
      "fields": [
        { "title": "Partition", "value": "\`${partition}\`", "short": true },
        { "title": "Current Usage", "value": "*${use_percent}%* (Threshold: ${THRESHOLD}%)", "short": true }
      ]
    }
  ]
}
EOF
)"

  # Push alert payload to Slack via curl; report a failed delivery on stderr
  curl -sS --fail --max-time 10 -X POST -H 'Content-type: application/json' \
    --data "${payload}" "${SLACK_WEBHOOK_URL}" > /dev/null \
    || echo "Slack alert failed for ${partition}" >&2
}

# --- MAIN EXECUTION ---
# Parse df output. grep drops the header row, tmpfs/devtmpfs, cdrom, and the
# /dev/loop mounts used by Snap packages (read-only, always reported as 100%).
df -Ph | grep -vE '^Filesystem|tmpfs|cdrom|/dev/loop' |
  while read -r partition _ _ _ use_percent mount_point; do
    # Strip the trailing "%" and skip rows without a numeric usage value
    use_percent="${use_percent%\%}"
    [[ "${use_percent}" =~ ^[0-9]+$ ]] || continue

    # Compare numeric values
    if [ "${use_percent}" -ge "${THRESHOLD}" ]; then
      send_slack_alert "${partition} (mounted on ${mount_point})" "${use_percent}"
    fi
  done
How It Works
- Robust settings (set -euo pipefail): -e exits as soon as a command fails, -u treats an unset variable as an error, and pipefail makes a pipeline fail if any stage in it fails. The script stops instead of continuing in an inconsistent state.
- Filesystem filtering: df -Ph prints statistics in the POSIX layout, one filesystem per line. grep -vE drops the header row and every line matching the exclusion pattern, which is where virtual filesystems such as tmpfs are removed. Snap loop mounts (/dev/loop*) are read-only and always report 100% usage, so the pattern needs to cover them to avoid constant false alerts.
- Formatting the Slack JSON payload: a heredoc builds a Slack message with one attachment, shown with a red color bar (#f7768e, the red from our Tokyo Night theme).
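The filtering and threshold steps described above can be tried without sending anything to Slack. The following is a minimal sketch, not part of the original script: the function name list_full_filesystems is hypothetical, and the grep pattern includes /dev/loop on the assumption that Snap mounts should be ignored.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper (not part of the original script).
# Reads `df -P` style output on stdin and prints "<mount_point> <use_percent>"
# for every non-virtual filesystem at or above the given threshold.
list_full_filesystems() {
  local threshold="$1"
  local partition size used avail use_percent mount_point

  # Drop the header row and virtual filesystems, same idea as the main script
  grep -vE '^Filesystem|tmpfs|cdrom|/dev/loop' |
    while read -r partition size used avail use_percent mount_point; do
      use_percent="${use_percent%\%}"   # "93%" -> "93"
      if [ "${use_percent}" -ge "${threshold}" ]; then
        printf '%s %s\n' "${mount_point}" "${use_percent}"
      fi
    done
}

# Usage on a live system:
#   df -P | list_full_filesystems 90
```

Feeding it saved df output from a problem host is a quick way to check that the exclusion pattern and threshold behave as expected before the cron job goes live.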
Automating the Monitor with Cron
To run this check automatically, we can hook it into the system cron scheduler. We'll set it to run once every hour.
First, make the script executable:
dinesh@prod-srv ~ ❯ chmod +x /opt/scripts/disk_alert.sh
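Next, open the crontab with crontab -e and append the entry below. Note that crontab -e edits the crontab of the user who runs it, so use root (for example via sudo crontab -e) or another account that is allowed to write to /var/log/disk_alert.log: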
# Run the disk space monitor script at the top of every hour (0 * * * *)
0 * * * * /opt/scripts/disk_alert.sh >> /var/log/disk_alert.log 2>&1
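When the same entry has to be rolled out to many servers, editing the crontab by hand does not scale. As a rough sketch, assuming you prefer plain shell to a configuration management tool, the entry can be appended only when it is missing. The helper name add_cron_entry is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original article).
# Prints the crontab read from stdin, appending ENTRY only when an
# identical line is not already present, so repeated runs are harmless.
add_cron_entry() {
  local entry="$1"
  local current
  current="$(cat)"

  # Re-emit the existing crontab unchanged
  if [ -n "${current}" ]; then
    printf '%s\n' "${current}"
  fi

  # -F fixed string, -x whole-line match, -q quiet
  if ! printf '%s\n' "${current}" | grep -Fxq -- "${entry}"; then
    printf '%s\n' "${entry}"
  fi
}

# Usage (installs the result as the current user's crontab):
#   entry='0 * * * * /opt/scripts/disk_alert.sh >> /var/log/disk_alert.log 2>&1'
#   { crontab -l 2>/dev/null || true; } | add_cron_entry "${entry}" | crontab -
```

The function only transforms text, so it can be tested safely. The actual crontab is touched only by the final crontab - in the usage line.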
Conclusion
This automated Bash task serves as a reliable guard against unexpected disk exhaustion. Container logs should still be shipped to a log aggregator, and server metrics should still feed monitoring dashboards (like Grafana), but a lightweight local cron alert remains a valuable second line of defense for your most important cloud machines.
Customize the threshold percentage and Slack URL in the script variables to match your team's environments!
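If different environments need different values, one option is to read the settings from environment variables and fall back to the defaults used in this guide. This is a sketch under assumptions: the variable names DISK_ALERT_THRESHOLD and DISK_ALERT_WEBHOOK and the function load_config are hypothetical and do not appear in the script above.

```shell
#!/usr/bin/env bash
# Hypothetical configuration loader (names are illustrative assumptions).
# Falls back to the defaults from the guide when no override is set.
load_config() {
  THRESHOLD="${DISK_ALERT_THRESHOLD:-90}"
  SLACK_WEBHOOK_URL="${DISK_ALERT_WEBHOOK:-https://hooks.slack.com/services/YOUR/WEBHOOK/URL}"

  # Reject empty or non-numeric thresholds
  case "${THRESHOLD}" in
    ''|*[!0-9]*)
      echo "Invalid threshold: ${THRESHOLD}" >&2
      return 1
      ;;
  esac

  # Reject values outside 1..100
  if [ "${THRESHOLD}" -lt 1 ] || [ "${THRESHOLD}" -gt 100 ]; then
    echo "Threshold out of range: ${THRESHOLD}" >&2
    return 1
  fi
}

# Usage: override per host or per cron entry, for example
#   DISK_ALERT_THRESHOLD=80 /opt/scripts/disk_alert.sh
```

Validating the threshold up front means a typo in an override fails loudly at startup rather than silently disabling the numeric comparison later.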