HotDisk


When you have a NAS with several drives sitting in a laundry room, temperatures can quickly rise.
Hard drives are very sensitive to heat and can suffer serious damage if they exceed a certain temperature threshold for too long.
After a particularly hot summer that caused a few cold sweats while monitoring my drives’ temperatures, I started looking for a way to automatically shut down the server when disk temperatures stay above their safe limit for an extended period.

Since I couldn’t find a convincing solution, I decided to build my own.

  • The script reads SMART temperature data from all SATA drives every minute.
  • It counts the number of consecutive minutes the temperature stays above or below the threshold.
  • It sends Discord notifications if the threshold is exceeded or when the temperature cools down.
  • It triggers a system shutdown if the temperature stays above the limit for the configured duration.
  • It logs all temperatures and counter states, and automatically rotates log files.
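
To make the counter behaviour concrete, here is a minimal Python sketch of one per-minute step. It is an illustration only, not the script's actual code: the names `State` and `step` are hypothetical, "above" is assumed to mean strictly greater than `MAX_TEMP`, and a short cool spell is assumed to leave the hot counter untouched until the cool-down duration is reached.

```python
from dataclasses import dataclass


@dataclass
class State:
    hot_minutes: int = 0    # minutes counted above the threshold
    cool_minutes: int = 0   # consecutive minutes at or below the threshold
    alerted: bool = False   # True once a "hot" notification has been sent


def step(state, temps, max_temp=60, hot_duration=5, cool_reset_duration=5):
    """Advance the counters by one minute and return (new_state, actions).

    `temps` holds the temperatures (in °C) read from all drives this minute.
    `actions` is a list drawn from "notify_hot", "notify_cool", "shutdown".
    """
    if not temps:
        # Assumption: a failed read leaves the state unchanged.
        return state, []

    actions = []
    if max(temps) > max_temp:
        hot = state.hot_minutes + 1
        if not state.alerted:
            actions.append("notify_hot")
        if hot >= hot_duration:
            actions.append("shutdown")
        return State(hot, 0, True), actions

    cool = state.cool_minutes + 1
    if cool >= cool_reset_duration:
        # Enough cool minutes in a row: reset every counter.
        if state.alerted:
            actions.append("notify_cool")
        return State(0, 0, False), actions
    return State(state.hot_minutes, cool, state.alerted), actions
```

With the default values, five readings in a row with any drive above 60 °C produce `"notify_hot"` on the first minute and `"shutdown"` on the fifth.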

While I was at it, I also added an installation script that installs the main script, makes it executable, creates a systemd service and timer, and enables them automatically.
The installer also lets you configure various parameters:

Variable            | Description                                                            | Default Value
--------------------|------------------------------------------------------------------------|------------------------------
MAX_TEMP            | Maximum allowed temperature (°C) before the shutdown countdown starts | 60
HOT_DURATION        | Consecutive minutes above MAX_TEMP before shutdown                     | 5
COOL_RESET_DURATION | Consecutive minutes below MAX_TEMP to reset all counters               | 5
LOG_FILE            | Path to the main log file                                              | /var/log/hdd_temp_monitor.log
LOG_ROTATE_COUNT    | Number of log files to keep                                            | 7
LOG_ROTATE_PERIOD   | Log rotation period (daily or weekly)                                  | daily
DISCORD_WEBHOOK     | Discord webhook URL for notifications                                  | Required
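
As an illustration of how these parameters and their defaults fit together, the sketch below loads them from a mapping of strings and validates them. Reading them from environment variables is an assumption made for the example, and `Config` and `load_config` are hypothetical names; the real installer may collect the values differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    max_temp: int
    hot_duration: int
    cool_reset_duration: int
    log_file: str
    log_rotate_count: int
    log_rotate_period: str
    discord_webhook: str


def load_config(env=None):
    """Build a Config from a mapping of strings (defaults to os.environ)."""
    env = os.environ if env is None else env

    webhook = env.get("DISCORD_WEBHOOK", "")
    if not webhook:
        # The only parameter without a default value.
        raise ValueError("DISCORD_WEBHOOK is required")

    period = env.get("LOG_ROTATE_PERIOD", "daily")
    if period not in ("daily", "weekly"):
        raise ValueError("LOG_ROTATE_PERIOD must be 'daily' or 'weekly'")

    return Config(
        max_temp=int(env.get("MAX_TEMP", "60")),
        hot_duration=int(env.get("HOT_DURATION", "5")),
        cool_reset_duration=int(env.get("COOL_RESET_DURATION", "5")),
        log_file=env.get("LOG_FILE", "/var/log/hdd_temp_monitor.log"),
        log_rotate_count=int(env.get("LOG_ROTATE_COUNT", "7")),
        log_rotate_period=period,
        discord_webhook=webhook,
    )
```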

The installer also runs a second script that configures logrotate with the parameters defined above.
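
For illustration, a logrotate stanza can be generated from LOG_FILE, LOG_ROTATE_COUNT, and LOG_ROTATE_PERIOD roughly as follows. The function name `render_logrotate` is hypothetical, and the `missingok`, `notifempty`, and `compress` directives are assumptions added for the example; the project's actual configuration may differ.

```python
def render_logrotate(log_file, count, period):
    """Return the text of a logrotate stanza for one log file."""
    if period not in ("daily", "weekly"):
        raise ValueError("period must be 'daily' or 'weekly'")
    if count < 0:
        raise ValueError("count must not be negative")
    return (
        f"{log_file} {{\n"
        f"    {period}\n"            # rotation frequency
        f"    rotate {count}\n"      # number of rotated files to keep
        "    missingok\n"            # assumption: no error if the log is absent
        "    notifempty\n"           # assumption: skip rotation of empty logs
        "    compress\n"             # assumption: gzip rotated files
        "}\n"
    )


if __name__ == "__main__":
    print(render_logrotate("/var/log/hdd_temp_monitor.log", 7, "daily"))
```

The result would typically be written to a file under `/etc/logrotate.d/`, which requires root privileges.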
Finally, the installer can even be executed directly via a simple curl command followed by one last setup script, which is perfect for the laziest of us.

I also had to handle several tricky cases: running as root without sudo, using sudo directly, running as a non-sudo user, missing dependencies, permission issues, file creation errors, disk data reading errors, and more.
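
Two of these cases, privilege detection and missing dependencies, can be sketched in Python as shown below. This is only an illustration of the checks involved: `privilege_prefix` and `missing_dependencies` are hypothetical names, and the commands an installer would check for (for example `smartctl` and `curl`) are assumptions.

```python
import os
import shutil


def privilege_prefix(euid=None, have_sudo=None):
    """Return the command prefix needed to run privileged commands.

    [] when already root, ["sudo"] when sudo is available, and a
    PermissionError when neither applies.
    """
    if euid is None:
        euid = os.geteuid()
    if have_sudo is None:
        have_sudo = shutil.which("sudo") is not None

    if euid == 0:
        return []
    if have_sudo:
        return ["sudo"]
    raise PermissionError("root privileges are required and sudo was not found")


def missing_dependencies(commands, which=shutil.which):
    """Return the commands from `commands` that are not found on PATH."""
    return [cmd for cmd in commands if which(cmd) is None]


if __name__ == "__main__":
    # Assumed dependency list, for demonstration only.
    print("missing:", missing_dependencies(["smartctl", "curl"]))
```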

Concurrent access to the status file also had to be managed carefully.
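
One common way to handle this kind of concurrent access is an exclusive advisory lock held for the whole read-modify-write cycle. The sketch below uses Python's `fcntl.flock` on Linux. It is not the project's actual mechanism: the JSON status format and the name `update_status` are assumptions made for the example.

```python
import fcntl
import json
import os


def update_status(path, mutate):
    """Read, modify, and rewrite a JSON status file under an exclusive lock.

    `mutate` receives the current status dict and returns the new one.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as f:
        # Blocks until any other process holding the lock releases it.
        fcntl.flock(f, fcntl.LOCK_EX)
        raw = f.read()
        status = json.loads(raw) if raw.strip() else {}
        status = mutate(status)
        # Rewrite the file in place while still holding the lock.
        f.seek(0)
        f.truncate()
        json.dump(status, f)
        f.flush()
        os.fsync(f.fileno())
        return status
    # The lock is released when the file is closed.
```

Because the lock covers both the read and the write, two overlapping runs cannot both read the same counter value and then overwrite each other's update.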

More details are available directly on the repository: