Thursday, February 12, 2015

How to replace a broken hard drive in a Linux software RAID mirror (RAID1)

Check current status:

cat /proc/mdstat

You should see something like this:

976758841 blocks super 1.2 [2/1] [U_]

The underscore ( _ ) means one member of the array is missing or has failed.
The two characters inside the square brackets mean the RAID has 2 members, and [2/1] says only 1 of those 2 is active.
In the example above, the 2nd member is not up.
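
As a rough sketch, this check can be scripted. The function name degraded_arrays and the second array (md1) in the sample text are made up for illustration; the function only looks for an underscore in the [UU]-style status field.

```shell
#!/bin/sh
# Sketch: list md arrays that have a missing or failed member.
# Reads /proc/mdstat-style text from the files given as arguments, or from stdin.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # start of an array entry
        name != "" && /\[[U_]+\]/ {          # status line such as "... [2/1] [U_]"
            if (index($NF, "_") > 0) print name
            name = ""
        }
    ' "$@"
}

# Hypothetical sample: md0 is degraded, md1 is healthy.
sample='md0 : active raid1 sdb1[0]
      976758841 blocks super 1.2 [2/1] [U_]
md1 : active raid1 sda1[0] sdd1[1]
      1048576 blocks super 1.2 [2/2] [UU]'

printf '%s\n' "$sample" | degraded_arrays    # prints: md0
```

On a real system, 'degraded_arrays /proc/mdstat' would run the same check against the live status.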


Identify which drive is bad:
In this example, we will assume that your server uses SATA drives and the RAID array members are /dev/sdb and /dev/sdc. Since these instructions are about mirrored software RAID, the replacement drive must be the same size as the remaining drive, or larger.

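Since the current status is [U_], the 2nd member is down; in this example that is /dev/sdc. If you are not sure which physical drive that is, confirm with 'mdadm --detail /dev/md0' before removing anything.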
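
The position of the underscore does not always map one-to-one onto drive letters, so it may be worth confirming which member failed before pulling a drive. /proc/mdstat marks a failed member with (F) on the array line. The sketch below prints each member and whether it carries that flag. The function name array_members and the sample line are assumptions for illustration, and a drive that has dropped off the bus entirely will simply not be listed.

```shell
#!/bin/sh
# Sketch: print each member of one md array and whether mdstat flags it as failed.
# $1 = array name (e.g. md0); /proc/mdstat-style text is read from stdin.
array_members() {
    awk -v arr="$1" '
        $1 == arr && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i !~ /\[[0-9]+\]/) continue     # skip words like "active", "raid1"
                state = ($i ~ /\(F\)/) ? "failed" : "ok"
                dev = $i
                sub(/\[.*/, "", dev)                  # "sdc1[1](F)" becomes "sdc1"
                print dev, state
            }
        }
    '
}

# Hypothetical array line in which sdc1 has been marked as failed.
printf '%s\n' 'md0 : active raid1 sdc1[1](F) sdb1[0]' | array_members md0
# prints:
#   sdc1 failed
#   sdb1 ok
```

On a live system this would be 'array_members md0 < /proc/mdstat'.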


Replace the bad drive:
If your server / computer does not support hot-swap, you have to shut down the computer before replacing the bad hard drive with a good one.


Use fdisk to make sure the replacement drive has been detected:

fdisk -l

Make sure /dev/sdc has been detected.


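Copy the partition table from the good drive (/dev/sdb) to the replacement drive (/dev/sdc). The first command wipes the first sector of /dev/sdc, so double-check the device names before running it: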

dd if=/dev/zero of=/dev/sdc bs=512 count=1
sfdisk -d /dev/sdb | sfdisk --force /dev/sdc
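
Before adding the drive back, it can help to confirm that both drives now carry the same layout. This is a minimal sketch, assuming sdX-style device names and an 'sfdisk -d' dump with one '/dev/sdXN : start=..., size=...' line per partition. The function name partition_layout and the sample dump values are made up.

```shell
#!/bin/sh
# Sketch: reduce an "sfdisk -d" dump (read from stdin) to its partition lines,
# with the drive name removed so that dumps of two drives can be compared.
partition_layout() {
    sed -n 's|^/dev/[a-z]*\([0-9][0-9]*\) :|\1 :|p'
}

# Hypothetical dumps of the good drive and the replacement drive.
dump_sdb='label: dos
device: /dev/sdb
unit: sectors

/dev/sdb1 : start=        2048, size=  1953521664, type=fd'

dump_sdc='label: dos
device: /dev/sdc
unit: sectors

/dev/sdc1 : start=        2048, size=  1953521664, type=fd'

if [ "$(printf '%s\n' "$dump_sdb" | partition_layout)" = \
     "$(printf '%s\n' "$dump_sdc" | partition_layout)" ]; then
    echo "partition layouts match"
else
    echo "partition layouts differ"
fi
```

On a real system, the sample text would be replaced by the output of 'sfdisk -d /dev/sdb' and 'sfdisk -d /dev/sdc'.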


Add the replacement drive to the RAID array:

mdadm --manage /dev/md0 --add /dev/sdc1


Check the status of the recovery progress:

cat /proc/mdstat


You should see something like this:

md0 : active raid1 sdc1[3] sdb1[2]
      976758841 blocks super 1.2 [2/1] [U_]
      [==============>......]  recovery = 70.0% (683798720/976758841) finish=67.0min speed=72872K/sec


My rebuild had been running for about 1 hour when I captured this output, so right after adding the drive your percentage will be much lower.

The larger your hard drive, the longer the recovery takes. You can re-run 'cat /proc/mdstat' at any time to check the progress.
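
Instead of reading the whole output each time, the percentage can be pulled out on its own. This is a small sketch; the function name recovery_percent is an assumption, and it only matches lines containing 'recovery = NN.N%' like the one in the example output.

```shell
#!/bin/sh
# Sketch: print the rebuild percentage found in /proc/mdstat-style text on stdin.
# Prints nothing when no recovery is in progress.
recovery_percent() {
    sed -n 's/.*recovery = *\([0-9.]*\)%.*/\1/p'
}

# Progress line taken from the example output in this post.
line='      [==============>......]  recovery = 70.0% (683798720/976758841) finish=67.0min speed=72872K/sec'
printf '%s\n' "$line" | recovery_percent     # prints: 70.0
```

To poll a live rebuild, something like 'while grep -q recovery /proc/mdstat; do recovery_percent < /proc/mdstat; sleep 60; done' should print the percentage once a minute until the rebuild finishes. Where the watch utility is installed, 'watch cat /proc/mdstat' is a simpler alternative.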
