HP offers a number of support tools for its HP ProLiant servers when you are running Linux. One of the must-have utilities is hpacucli. You can use it to configure your RAID settings, but also to verify whether your drives are running properly or failing. Of course you don’t want to check this manually every now and then, so I am using a script that sends me an e-mail whenever one or more drives fail.
You can use the following command to verify if your drives are OK:
[root@mylinuxserver /]# hpacucli controller slot=0 physicaldrive all show
Smart Array P400i in Slot 0 (Embedded)
   array A
      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 146 GB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 146 GB, OK)
Both drives show up as “OK”, which is great. Whenever one of the drives shows up as “Failed” or “Rebuilding”, I want to receive an e-mail. You can use the following script for this:
#!/bin/bash
###
# If something went wrong with the HP Smart Array disks, this script sends an error e-mail
###
MAIL=email@example.com
HPACUCLI=`which hpacucli`
HPACUCLI_TMP=/tmp/hpacucli.log

# grep needs -E here, otherwise "(" and "|" are matched literally and the check never fires
if [ `hpacucli controller slot=0 physicaldrive all show | grep -E '(Failed|Rebuilding)' | wc -l` -gt 0 ]
then
    msg="RAID Controller Errors"
    #echo $msg
    #msg2=`hpacucli controller slot=1 physicaldrive all show`
    # write the error to syslog as well
    logger -p syslog.err -t RAID "$msg"
    # mail the full drive listing so the failing bay is visible in the message
    $HPACUCLI controller slot=0 physicaldrive all show > $HPACUCLI_TMP
    mail -s "$HOSTNAME [ERROR] - $msg" "$MAIL" < $HPACUCLI_TMP
    rm -f $HPACUCLI_TMP
#else
    #echo "Everything Good"
fi
Now whenever a drive fails or starts rebuilding you will receive an e-mail, so you know you have a hard disk to swap. I saved the script above in /etc/cron.hourly so that the hard disks are checked every hour. Make sure the script is executable, otherwise cron will skip it.
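The check in the script only reacts to the words “Failed” and “Rebuilding”. As a possible variation, here is a minimal sketch that flags every drive whose status is anything other than “OK”, so other non-healthy states would be caught too. It assumes each drive line looks like the sample output earlier in this post, with the status as the last field before the closing parenthesis. The function name non_ok_drives is my own invention for this example and not part of hpacucli. The sample data is canned text, so you can try it without a controller.

```shell
#!/bin/bash
# Sketch only: assumes drive lines look like
#   physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 146 GB, OK)

# non_ok_drives (hypothetical helper): reads "physicaldrive all show" output
# on stdin and prints the drive lines whose status is not "OK",
# with leading whitespace removed.
non_ok_drives() {
    awk '/physicaldrive/ && !/, OK\)[ \t]*$/ { sub(/^[ \t]+/, ""); print }'
}

# Canned output standing in for a live controller, with one drive marked Failed.
sample='Smart Array P400i in Slot 0 (Embedded)
   array A
      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 146 GB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 146 GB, Failed)'

bad=$(printf '%s\n' "$sample" | non_ok_drives)
if [ -n "$bad" ]; then
    # Prints only the line for drive 1I:1:2
    echo "$bad"
fi
```

On a real server you would pipe the hpacucli output into the function instead of the sample text, and send the mail when the result is not empty.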