Index of /anton/mergedisks

Name                Last modified       Size   Description

Parent Directory                        -
mergedisks.c        02-Dec-2004 15:25   2.5K   Source
mergedisks          02-Dec-2004 16:20   420K   Linux-i386 statically linked binary

What to do if both disks of a RAID1 fail?

We recently shut down one of our machines that has a RAID1 across two disks (using Linux's md driver). When we turned it on again, it had problems with the disks. Later investigation showed that both disks were damaged: the first disk could not read the superblocks of two partitions, and the second disk could not even read the partition table.

Fortunately, the unreadable parts were in different places, so we could copy the disks to a third (empty) disk by reading from one disk until we hit an error, then switching to the other disk, and so on. The mergedisks program above does this.
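
The core idea looks roughly like the following sketch in C. This is not the actual mergedisks.c linked above, just an illustration of the approach; the 4KB block size matches the block numbers mergedisks reports, and the error handling is simplified.

#define _FILE_OFFSET_BITS 64   /* large-file support on 32-bit systems */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

#define BLOCKSIZE 4096

int main(int argc, char **argv)
{
  int src[2], dst, cur = 0;
  char buf[BLOCKSIZE];
  off_t block;

  if (argc != 4) {
    fprintf(stderr, "usage: %s source1 source2 target\n", argv[0]);
    exit(1);
  }
  src[0] = open(argv[1], O_RDONLY);
  src[1] = open(argv[2], O_RDONLY);
  /* no O_CREAT: a target that is a regular file must already exist */
  dst = open(argv[3], O_WRONLY);
  if (src[0] < 0 || src[1] < 0 || dst < 0) {
    perror("open");
    exit(1);
  }
  for (block = 0; ; block++) {
    ssize_t r = pread(src[cur], buf, BLOCKSIZE, block * BLOCKSIZE);
    if (r == 0)          /* EOF on the current source: done */
      break;
    if (r < 0) {         /* read error: try this block on the other disk */
      cur = 1 - cur;
      fprintf(stderr, "switching to %s at %lld\n",
              argv[cur + 1], (long long)block);
      r = pread(src[cur], buf, BLOCKSIZE, block * BLOCKSIZE);
      if (r < 0) {       /* error on both disks: leave this block out */
        fprintf(stderr, "error on both disks at %lld\n", (long long)block);
        continue;
      }
      if (r == 0)
        break;
    }
    if (pwrite(dst, buf, r, block * BLOCKSIZE) != r) {
      perror("write");
      exit(1);
    }
  }
  return 0;
}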

Using mergedisks

Connect the two broken disks and the new disk to a different machine (you probably don't trust the machine in which both disks broke simultaneously, do you?). Boot from an existing system disk on that machine, or from a rescue CD (e.g., RIP). Use the kernel option raid=noautodetect to avoid autodetection and automatic synchronization of the disks by the kernel. Then wget mergedisks if you don't have it already.

Type

mergedisks

to get usage information. The way we used it is this:

mergedisks /dev/hdc /dev/hdd /dev/hdb

where hdc and hdd were the broken disks and hdb was the empty disk.

Since the broken disks were partitioned in exactly the same way, and all the partitions were used in RAID1 fashion, we could do this with the whole disks. Otherwise we would have had to do this partition by partition, and would have had to somehow find out where the partitions start on the disk with the broken partition table (mergedisks has no support for that yet, but use the source).
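
If one disk's partition table is still readable, the partition starts can be recovered from its MBR. Here is a small, untested sketch (not part of mergedisks) that prints the four primary partition entries; the offsets follow the standard MBR layout:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
  unsigned char mbr[512];
  int fd, i;

  if (argc != 2) {
    fprintf(stderr, "usage: %s device\n", argv[0]);
    exit(1);
  }
  fd = open(argv[1], O_RDONLY);
  if (fd < 0 || read(fd, mbr, 512) != 512) {
    perror(argv[1]);
    exit(1);
  }
  for (i = 0; i < 4; i++) {
    unsigned char *e = mbr + 446 + 16 * i;  /* primary partition entry */
    uint32_t start = e[8] | e[9] << 8 | e[10] << 16 | (uint32_t)e[11] << 24;
    uint32_t size  = e[12] | e[13] << 8 | e[14] << 16 | (uint32_t)e[15] << 24;
    if (e[4] != 0)                          /* type 0 means unused entry */
      printf("partition %d: type 0x%02x, start sector %u, %u sectors\n",
             i + 1, e[4], start, size);
  }
  return 0;
}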

If you have both partition tables available, you should be able to use mergedisks with individual partitions, but we have not tested this; if you want the target to be a file, you have to touch it first.
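
For example (again untested, and the device and file names are just placeholders), a per-partition run with a file as target might look like this:

touch part1.img

mergedisks /dev/hdc1 /dev/hdd1 part1.img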

Mergedisks reports any special conditions (switching disks due to I/O errors etc.), and also occasionally outputs an indication of its progress (positions are usually given as 4KB block numbers).

If you get "error on both disks" messages, it's not possible to fully restore the disk (there is an error at the same place on both source disks), but if you are lucky, the error will be in free space, or in non-essential data. mergedisks does not write anything for such blocks to the target disk, but continues with the next block.

Eventually (for our 200GB disks after about two hours) mergedisks finishes. The output for our run looked like this:

switching to /dev/hdd at 81
  49787136/dev/hdd EOF at 49787136, done
This means that we found the first error on hdc at 4KB block #81; we found no later errors on hdd, and eventually finished at block #49787136 (i.e., at byte 49787136 × 4096 = 203928109056).

We then disconnected the source drives, put in another new drive, partitioned it the same way, added the partitions to the RAID1s (with raidhotadd), and let Linux recover the RAIDs. Finally, we put both disks into the original machine; we had to run LILO there from the rescue CD, but otherwise there were no problems. The machine has now worked for several days without further trouble.

Why did both disks fail at the same time?

The disks in question were Maxtor 6Y200P0 200GB disks. There were no warning signs, and smartctl tells us that both failed drives still feel good (they apparently ignore the unrecoverable read errors completely). Another odd thing is that the unreadable sectors cannot be written to, either (normally a drive is supposed to relocate such sectors to the spare area on writing).

My current theory is that something bad (maybe a box-internal brownout) happened shortly after starting the box, which made the disks go berserk and somehow destroy the low-level format of the blocks they were reading at the time (the superblocks on the first disk and the partition table on the second); apparently the disk firmware does not relocate such blocks on later writes.

We low-level formatted both disks. One seems to be ok again, as tested with a SMART extended self-test (supporting my theory that it's not a hardware failure). However, the other disk had read errors after the low-level format; the first error was at a different place after each low-level format, so the problem is probably not a damaged disk surface (maybe some intermittent error in the controller?).

The moral of the story: If you do RAID1, use disks from different manufacturers. This should reduce the probability that a systematic problem (e.g., bad behaviour on brownouts) will destroy both disks at the same time.


Anton Ertl