[Coco] OT: Help with horrible bad NAS purchase, Linux EXT3 file system blindly used.
lost at l-w.ca
Sun Dec 6 03:42:37 EST 2009
Stephen H. Fischer wrote:
> Is the Mirror software part of Linux EXT3 or must it be Maxtor /
> Seagate's code? The web interface surely is. My only knowledge of Linux
> is that it is something like OS-9.
On all modern and most not-so-modern systems, the actual mirroring/RAID
operations are completely independent of the file system. The
RAID/mirroring subsystem presents an interface to the operating system
that looks just like a regular block device.
File systems operate on top of block devices. In the case of a mirror
configuration, it would be operating on top of the block device
presented by the mirroring/RAID subsystem.
Thus, EXT3 probably never saw the error that the NAS device reported. In
fact, it was probably never bothered by it at all. The mirroring
subsystem would have handled all of that confusion transparently,
assuming correct operation.
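
For anyone who finds the layering easier to see in code, here is a rough sketch of the idea. Everything in it (Disk, Mirror, fs_read, the 512-byte block size) is a made-up illustration, not how this NAS or the Linux kernel actually does it. The point is only that the mirror offers the same read/write interface as a single disk, so the file system on top cannot tell the difference and never sees a failure that one healthy member can cover.

```python
class ReadError(Exception):
    """Raised when a device cannot return a block."""


class Disk:
    """Hypothetical in-memory stand-in for one physical drive."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.bad = set()  # block numbers that fail on read

    def read_block(self, n):
        if n in self.bad:
            raise ReadError(f"unreadable block {n}")
        return self.blocks[n]

    def write_block(self, n, data):
        self.blocks[n] = data
        self.bad.discard(n)  # assumption: a fresh write makes the block readable


class Mirror:
    """Presents the same read/write interface as a single Disk."""

    def __init__(self, first, second):
        self.members = [first, second]
        self.block_size = first.block_size

    def write_block(self, n, data):
        # Every write goes to all members.
        for disk in self.members:
            disk.write_block(n, data)

    def read_block(self, n):
        for i, disk in enumerate(self.members):
            try:
                data = disk.read_block(n)
            except ReadError:
                continue
            # Rewrite the block on any member that failed before this one.
            for failed in self.members[:i]:
                failed.write_block(n, data)
            return data
        # Only now does an error reach whatever sits on top of the mirror.
        raise ReadError(f"block {n} unreadable on every member")


def fs_read(device, n):
    """Stand-in for a file system: it only ever sees a block device."""
    return device.read_block(n)
```

With two Disk objects wrapped in a Mirror, marking a block bad on one member still lets fs_read return the data; it raises only when the block is bad on both.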
Now for a few comments on EXT3. I operate servers for a living. I have
been using EXT3 on them for many years without issues. EXT3 is quite
reliable. In almost every case when EXT3 went wrong, it was because the
underlying storage medium (hard drive, RAID array, etc.) failed. In
every other case, it was bad memory in the server. EXT3 is not the best
file system ever created but it works quite well. Of course, if the
implementation is bad, all bets are off, but assuming a recent or even
not-so-recent Linux kernel, the implementation is good.
As far as whether the mirroring is Maxtor/Seagate's gimmick, Linux doing
it, or something else altogether, that is impossible to say without
knowing how the NAS is built. It could be a hardware gimmick doing it
(ideal) but unlikely for cost reasons. It could be some proprietary
Seagate thing but that, too, is unlikely simply because it would be more
expensive to do that. They may have Linux doing the mirror but that
seems unlikely as well since mirroring in the Linux kernel does not
handle ECC errors in a useful manner. Most likely, they're using some
sort of "software RAID card" which provides some assistance for the
mirroring but leaves most of the work to the driver. The same
possibilities apply if they're using a different OS inside the NAS
(Windows, BSD, etc.).
In any event, the mirroring bit is not part of EXT3. EXT3 doesn't even
know what mirroring is so it is almost 100% certain that EXT3 has
nothing to do with the problem.
Also, as has been said, current EXT3 implementations do not manage the
bad block list automatically. It is usually managed via fsck and/or the
Most likely, in the NAS, no errors will percolate up to the EXT3 file
system unless both drives fail simultaneously.
I'll close with a note about what could be causing problems that allow
rebuilds to succeed:
One drive may be having trouble reading a block. If it gives up on a
block and remaps it before it is written to, it will read as all zeroes.
That will almost certainly corrupt data. If the NAS is doing things
right, it will detect a CRC error and attempt to correct it. It seems
this NAS is fairly naïve about it and simply rebuilds the mirror, which
is a valid option. The same thing could happen if the drive eventually
reads the sector but does not detect that it read incorrectly. In that
case, though, there will likely be a subsequent failure and you will
see additional "failures". One of these two is the most likely case. As
disconcerting as it is, it is not necessarily a problem for it to happen
once in a while - that is expected operation for most drives. If it
happens regularly, then you likely have a drive that is "weak" and
probably should not be trusted.
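
As a rough illustration of the "detect a CRC error and attempt to correct it" case, here is a small sketch. It assumes a checksum for each block is kept somewhere outside the two mirror copies; where a real NAS or drive keeps such a thing, if at all, is unknown here. The names checksum and repair_mirror and the 512-byte block size are invented for the example.

```python
import zlib

BLOCK_SIZE = 512  # assumed block size, for illustration only


def checksum(data):
    """CRC-32 of a block, as an unsigned 32-bit value."""
    return zlib.crc32(data) & 0xFFFFFFFF


def repair_mirror(copies, stored_crc):
    """Overwrite any copy that fails the stored CRC with a good copy.

    copies is a list holding each mirror member's version of one block.
    Returns the indexes that were repaired. Raises ValueError when no
    copy matches, since nothing can be trusted in that case.
    """
    good = [i for i, c in enumerate(copies) if checksum(c) == stored_crc]
    if not good:
        raise ValueError("no copy matches the stored CRC")
    repaired = [i for i in range(len(copies)) if i not in good]
    for i in repaired:
        copies[i] = copies[good[0]]
    return repaired


# One member remapped the block before it was rewritten, so it reads as
# all zeroes; the other member still holds the real data.
original = b"CoCo" * (BLOCK_SIZE // 4)
stored_crc = checksum(original)
copies = [bytes(BLOCK_SIZE), original]
print(repair_mirror(copies, stored_crc))
```

A NAS that simply rebuilds the whole mirror is doing a much blunter version of the same repair. Counting how often a given drive shows up in the repaired list would be one way to spot the "weak" drive described above.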
That's enough off-topic rambling. Hopefully it's useful to some folks.