Fixing a Degraded RAID 5 Array After a Failed Disk
Posted by Adam Hayes
A failed hard drive on a Dell PowerEdge server left a RAID 5 array on a PERC 4/di controller in a degraded state. Here are the steps we took to rebuild the array.
We had a hard drive crash on one of our Dell PowerEdge servers this week. Fortunately, we had no downtime because it only took out one drive of our RAID 5 configuration. Once we installed the new drive everything rebuilt, but the array disks still showed a degraded state. After trying a lot of options, like reseating the hot-swap drive, rebuilding the RAID array again, and running a lot of diagnostics on the drives, we were still getting a predictive drive failure with error code 2094 in Dell's OpenManage alert log.
We decided to try another drive. It worked. The first "new" drive we put in was failing the S.M.A.R.T. test and showing a predictive failure, so replacing the broken "new" drive was the trick. When it was all done, these were the steps to get the array disk out of degraded mode on the PERC 4/di controller:
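We could have caught the bad replacement drive sooner by querying S.M.A.R.T. health from the OS with smartctl. This is a hedged sketch: the smartctl invocation is commented out because the drive path and megaraid device number are assumptions for this controller, and the small helper below just shows how a health line can be reduced to a pass/fail flag for monitoring scripts:

```shell
# Hedged sketch: the actual query would look something like this (path and
# megaraid device number are assumptions, so it is left as a comment):
#   smartctl -H -d megaraid,0 /dev/sda

# parse_smart_health: reduce a "smartctl -H" result line to OK or FAILING.
parse_smart_health() {
    if printf '%s\n' "$1" | grep -q 'PASSED'; then
        echo "OK"
    else
        echo "FAILING"
    fi
}

parse_smart_health "SMART overall-health self-assessment test result: PASSED"   # prints OK
parse_smart_health "SMART overall-health self-assessment test result: FAILED!"  # prints FAILING
```

A cron job wrapping a check like this would have flagged the predictive failure before we spent time on reseats and rebuilds.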
- Remove the bad disk (it's hot-swappable, so there's no need to power down the machine)
- Replace it with a good disk of the same size or larger
- The controller automatically rebuilds the array
- Once the rebuild reaches 100%, the array may still show a degraded state
- Clear the logs
- Clear the alert log
- Do a global rescan
- Everything should now show a good status and run without problems
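We worked through the steps above in the OpenManage GUI, but they have rough command-line equivalents in OpenManage Server Administrator's omreport/omconfig tools. The exact subcommands and controller=0 below are assumptions and need OMSA installed, so they are left as comments; the runnable helper just shows how a status line could be checked automatically:

```shell
# Hedged OMSA CLI sketch of the steps above (controller id 0 is an assumption;
# the commands are commented out since they require Dell OpenManage):
#   omreport storage vdisk controller=0                     # check virtual disk state
#   omconfig system alertlog action=clear                   # clear the alert log
#   omconfig storage controller action=rescan controller=0  # global rescan

# vdisk_needs_attention: flag a degraded or failed state from a status line.
vdisk_needs_attention() {
    case "$1" in
        *Degraded*|*Failed*) echo "ATTENTION" ;;
        *)                   echo "OK" ;;
    esac
}

vdisk_needs_attention "State : Degraded"   # prints ATTENTION
vdisk_needs_attention "State : Ready"      # prints OK
```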
I was very nervous about pulling drives on the fly, but everything went well.
We didn't see much of a performance drop during the rebuild, even though most articles I've read warned of a huge hit; our website log statistics showed only a modest dip. Other than that, everything went very smoothly for a hard drive failure.
The lesson here: always have backups, and check your servers often for failing hardware. Preventive checks are much easier to handle than rebuilding a machine from the ground up.