HotDisk
When you have a NAS with several drives sitting in a laundry room, temperatures can quickly rise.
Hard drives are very sensitive to heat and can suffer serious damage if they exceed a certain temperature threshold for too long.
After a particularly hot summer that caused a few cold sweats while monitoring my drives’ temperatures, I started looking for a way to automatically shut down the server when disk temperatures stay above their safe limit for an extended period.
Since I couldn’t find a convincing solution, I decided to build my own.
- The script reads SMART temperature data from all SATA drives every minute.
- It counts the number of consecutive minutes the temperature stays above or below the threshold.
- It sends Discord notifications if the threshold is exceeded or when the temperature cools down.
- It triggers a system shutdown if the temperature stays above the limit for the configured duration.
- It logs all temperatures and counter states, and automatically rotates log files.
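
To make the counting logic above concrete, here is a minimal Python sketch, not the actual script. The `State` class, the `step` function, and the action names are hypothetical. It assumes that "above" means strictly greater than `MAX_TEMP`, that the hottest drive decides, and that a brief cool spell does not clear the hot counter until `COOL_RESET_DURATION` consecutive cool minutes have passed.

```python
from dataclasses import dataclass

MAX_TEMP = 60            # °C, default from the configuration table
HOT_DURATION = 5         # consecutive hot minutes before shutdown
COOL_RESET_DURATION = 5  # consecutive cool minutes before counters reset


@dataclass
class State:
    """Counters persisted between two one-minute runs (hypothetical layout)."""
    hot_minutes: int = 0
    cool_minutes: int = 0


def step(state: State, temps: list[int]) -> str:
    """Advance the counters by one minute and return the action to take.

    `temps` holds one reading per drive and must not be empty.
    """
    if max(temps) > MAX_TEMP:
        state.hot_minutes += 1
        state.cool_minutes = 0
        if state.hot_minutes >= HOT_DURATION:
            return "shutdown"
        # Notify only on the first hot minute to avoid repeated alerts.
        return "alert" if state.hot_minutes == 1 else "wait"

    # All drives are at or below the limit.
    if state.hot_minutes > 0:
        state.cool_minutes += 1
        if state.cool_minutes >= COOL_RESET_DURATION:
            state.hot_minutes = 0
            state.cool_minutes = 0
            return "cooled"
    return "wait"
```

A caller would map `"alert"` and `"cooled"` to Discord messages and `"shutdown"` to the actual power-off command.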
While I was at it, I also added an installation script that installs the main script, makes it executable, creates a systemd service and timer, and enables them automatically.
The installer also lets you configure various parameters:
| Variable | Description | Default Value |
|---|---|---|
| `MAX_TEMP` | Maximum allowed temperature (°C) before the shutdown countdown starts | 60 |
| `HOT_DURATION` | Consecutive minutes above `MAX_TEMP` before shutdown | 5 |
| `COOL_RESET_DURATION` | Consecutive minutes below `MAX_TEMP` to reset all counters | 5 |
| `LOG_FILE` | Path to the main log file | `/var/log/hdd_temp_monitor.log` |
| `LOG_ROTATE_COUNT` | Number of log files to keep | 7 |
| `LOG_ROTATE_PERIOD` | Log rotation period (`daily` or `weekly`) | `daily` |
| `DISCORD_WEBHOOK` | Discord webhook URL for notifications | Required |
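
As an illustration of how these parameters and their defaults could be validated, here is a small Python sketch. It assumes the values arrive as environment-style key/value strings, which is a guess about the mechanism rather than how the installer actually stores them. The `load_config` function is hypothetical.

```python
import os

# Defaults taken from the table above; DISCORD_WEBHOOK has none because it is required.
DEFAULTS = {
    "MAX_TEMP": "60",
    "HOT_DURATION": "5",
    "COOL_RESET_DURATION": "5",
    "LOG_FILE": "/var/log/hdd_temp_monitor.log",
    "LOG_ROTATE_COUNT": "7",
    "LOG_ROTATE_PERIOD": "daily",
}
INT_KEYS = ("MAX_TEMP", "HOT_DURATION", "COOL_RESET_DURATION", "LOG_ROTATE_COUNT")


def load_config(env=None):
    """Merge user-supplied values with the defaults and validate them."""
    env = os.environ if env is None else env
    cfg = {key: env.get(key, default) for key, default in DEFAULTS.items()}

    for key in INT_KEYS:
        cfg[key] = int(cfg[key])  # raises ValueError on non-numeric input
        if cfg[key] <= 0:
            raise ValueError(f"{key} must be a positive integer")

    if cfg["LOG_ROTATE_PERIOD"] not in ("daily", "weekly"):
        raise ValueError("LOG_ROTATE_PERIOD must be 'daily' or 'weekly'")

    webhook = env.get("DISCORD_WEBHOOK", "")
    if not webhook:
        raise ValueError("DISCORD_WEBHOOK is required")
    cfg["DISCORD_WEBHOOK"] = webhook
    return cfg
```

Failing fast on a missing webhook or a malformed number keeps a misconfigured monitor from running silently.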
The installer also runs a second script that configures logrotate with the rotation parameters defined above.
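
For reference, a logrotate stanza built from those parameters might be rendered as in the Python sketch below. The `render_logrotate` function is hypothetical, and the `missingok`, `notifempty`, and `compress` directives are my own assumptions about sensible options, not something taken from the real configuration script.

```python
def render_logrotate(log_file: str, count: int, period: str) -> str:
    """Return a logrotate stanza for one log file."""
    if period not in ("daily", "weekly"):
        raise ValueError("period must be 'daily' or 'weekly'")
    return (
        f"{log_file} {{\n"
        f"    {period}\n"           # rotation frequency (LOG_ROTATE_PERIOD)
        f"    rotate {count}\n"     # number of old logs kept (LOG_ROTATE_COUNT)
        "    missingok\n"           # assumed: do not fail if the log is absent
        "    notifempty\n"          # assumed: skip rotation of empty logs
        "    compress\n"            # assumed: gzip rotated logs
        "}\n"
    )


if __name__ == "__main__":
    print(render_logrotate("/var/log/hdd_temp_monitor.log", 7, "daily"))
```

The resulting text would typically be written to a file under `/etc/logrotate.d/`.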
Finally, the installer can be run directly with a single curl command, followed by one last setup script: perfect for the laziest of us.
I also had to handle several tricky cases: running as root without sudo, using sudo directly, running as a non-sudo user, missing dependencies, permission issues, file creation errors, disk data reading errors, and more.
Concurrent access to the status file also had to be managed carefully.
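
One common way to serialize access to a small state file on Linux is an advisory lock. The Python sketch below shows the idea with `fcntl.flock`. It is an illustration only: the JSON format, the `update_status` function, and the choice of `flock` are assumptions, and the actual script may handle this differently.

```python
import fcntl
import json
import os


def update_status(path, update):
    """Read, modify, and rewrite a JSON status file under an exclusive lock.

    `update` is a callable that receives the current dict and returns the new one.
    """
    # O_CREAT creates the file on first use without truncating existing content.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            raw = fh.read()
            status = json.loads(raw) if raw.strip() else {}
            status = update(status)
            # Rewrite the file in place while still holding the lock.
            fh.seek(0)
            fh.truncate()
            json.dump(status, fh)
            fh.flush()
            os.fsync(fh.fileno())
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
    return status
```

Because the lock is advisory, it only protects the file if every reader and writer goes through the same locking path.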
More details are available directly on the repository: