HP ProLiant E-mail HDD Failure Notification

HP offers several support tools for its ProLiant servers running Linux. One of the must-have utilities is hpacucli. You can use it to configure your RAID settings, and also to check whether your drives are running correctly or failing. Of course, you don’t want to check this by hand every now and then, so I use a script that sends me an e-mail whenever one or more drives fail.

You can use the following command to verify if your drives are OK:

[root@mylinuxserver /]# hpacucli controller slot=0 physicaldrive all show

Smart Array P400i in Slot 0 (Embedded)

array A

physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 146 GB, OK)
physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 146 GB, OK)

Both drives show up as “OK” which is great. Now, whenever one of the drives shows up as “Failed” or “Rebuilding,” I want to receive an e-mail. You can use the following script for this:

#!/bin/bash
###
# If something went wrong with the HP Smart Array disks, this script sends an error e-mail
###
MAIL=my@mailaddress.com
HPACUCLI=$(which hpacucli)
HPACUCLI_TMP=/tmp/hpacucli.log

# grep needs -E for the (Failed|Rebuilding) alternation; -c counts the matching lines
if [ "$("$HPACUCLI" controller slot=0 physicaldrive all show | grep -cE '(Failed|Rebuilding)')" -gt 0 ]
then
    msg="RAID Controller Errors"
    #echo $msg
    #msg2=$("$HPACUCLI" controller slot=1 physicaldrive all show)
    # Record the problem in syslog, then mail the full drive listing
    logger -p syslog.err -t RAID "$msg"
    "$HPACUCLI" controller slot=0 physicaldrive all show > "$HPACUCLI_TMP"
    mail -s "$HOSTNAME [ERROR] - $msg" "$MAIL" < "$HPACUCLI_TMP"
    rm -f "$HPACUCLI_TMP"
#else
    #echo "Everything Good"
fi
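
The test in the `if` line can be tried without waiting for a drive to fail. The following is a minimal sketch, not part of the original script: it pipes sample text through the same kind of extended-regex grep. The function name `failing_drives` is hypothetical, and the sample is modelled on the output shown earlier with bay 2 changed to "Failed" by hand, so it is not output captured from a real failure.

```shell
#!/bin/bash
# Hypothetical helper: reads "physicaldrive all show" output on stdin and
# prints only the lines that mention a Failed or Rebuilding drive.
failing_drives() {
    grep -E '(Failed|Rebuilding)'
}

# Hand-edited sample based on the listing in this post (assumption:
# a failed drive reports "Failed" where a healthy one reports "OK").
sample='physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 146 GB, OK)
physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 146 GB, Failed)'

# Only the bay 2 line should come out.
printf '%s\n' "$sample" | failing_drives
```

Without `-E`, grep treats the parentheses and the `|` as literal characters, so the pattern would not match either status word.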

Now, whenever a drive in the RAID fails or starts rebuilding, you will receive an e-mail, so you know you have a hard disk to swap. I saved the script above in /etc/cron.hourly so that the hard disks are checked every hour.
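
Putting the script in place could look roughly like the sketch below. The function name `install_hourly_check` and the file name `hp-raid-check.sh` are assumptions for illustration; the post does not say what the script was called. The function marks the script executable and drops a trailing `.sh`, because run-parts on Debian-style systems skips file names that contain a dot.

```shell
#!/bin/bash
# Hypothetical installer: copies a check script into an hourly cron directory.
# $1 = path to the script, $2 = target directory (defaults to /etc/cron.hourly).
install_hourly_check() {
    local src=$1
    local cron_dir=${2:-/etc/cron.hourly}
    local name
    # Strip a trailing ".sh" so Debian-style run-parts does not skip the file.
    name=$(basename "$src" .sh)
    # install copies the file and sets mode 0755 (executable) in one step.
    install -m 0755 "$src" "$cron_dir/$name"
}
```

Run as root, `install_hourly_check ./hp-raid-check.sh` would leave an executable `/etc/cron.hourly/hp-raid-check`.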


Forum Replies

  1. I found this post really useful; with a few small modifications I am rolling it out to all my servers. I will also be writing one that e-mails on the contents of the IML as read by hplog, another essential HP utility for monitoring your server health so you don't get surprised!

    #!/bin/bash
    
    # Server Health script
    # Verifies Server health by querying IML and ACU
    # and emails if a disk has failed or Caution/Critical message
    # is found in Server Event Log (IML)
    #
    # Requires hp-health and hpacucli rpms to be installed
    #################################################
    
    # Global variab
    ... Continue reading in our forum

  2. I saw that my code above got mangled when it was posted; some characters are missing, so it can't be copied and pasted!

  3. Hi Matt,

    Thanks for sharing this! I’ll try it the next time I’m installing a ProLiant server.

    Rene
