[Beowulf] Big storage
    Leif Nixon 
    nixon at nsc.liu.se
       
    Fri Sep 14 02:21:14 PDT 2007
    
    
  
Loic Tortay <tortay at cc.in2p3.fr> writes:
> During the last HEPiX meeting, Peter Kelemen mentioned something told 
> to him by a ZFS developer (Jeff Bonwick, if I'm not mistaken) about 
> data corrupted by a Fibre Channel HBA during transfer between disk and 
> host.  ZFS, reportedly, detected (and corrected) the corruption.
> Of course a ZFS developer may be biased.
AFAIU, ZFS is designed specifically to handle such situations, but I'd
like to see large scale tests over a range of different hardware.
> I'm probably mis-remembering some of the technical details about this, 
> since they seem quite unlikely now (something about the laser beam 
> being somehow "corrupted", but I think this would be detected by the 
> Fibre Channel link protocols or upper-layer checksums).
Yeah, I guess it should. But we recently lost 11 TB of data due to a FC
switch port silently trashing a small proportion of the data passing
through it. (Quite possibly ZFS would have saved us.) And I've seen
three similar incidents at other places in the last few months. So I
have turned up my cynicism knob a few more notches.
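
To make the end-to-end checksum idea concrete, here is a minimal Python
sketch of how a checksum kept apart from the data can catch (and, given a
redundant copy, repair) silent corruption introduced anywhere along the
path. This is an illustration only, not ZFS's actual implementation: the
names MirroredStore and flip_bit are hypothetical, SHA-256 is a stand-in
checksum, and the two dictionaries merely simulate two mirrored devices.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Stand-in checksum (assumption: SHA-256 chosen for illustration)."""
    return hashlib.sha256(block).digest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate silent corruption by flipping a single bit."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


class MirroredStore:
    """Hypothetical two-way mirror with checksums stored apart from data."""

    def __init__(self):
        self.copies = [{}, {}]   # two simulated devices
        self.checksums = {}      # kept separately from the data blocks

    def write(self, key: str, data: bytes) -> None:
        self.checksums[key] = checksum(data)
        for copy in self.copies:
            copy[key] = data

    def read(self, key: str) -> bytes:
        expected = self.checksums[key]
        for copy in self.copies:
            data = copy.get(key)
            if data is not None and checksum(data) == expected:
                # Found a good copy: rewrite any copy that differs from it.
                for other in self.copies:
                    if other.get(key) != data:
                        other[key] = data
                return data
        # No copy matches: the corruption is detected but cannot be repaired.
        raise IOError(f"all copies of {key!r} failed checksum verification")


store = MirroredStore()
store.write("block0", b"important physics data")
# Corrupt one copy, as a faulty HBA or switch port might.
store.copies[0]["block0"] = flip_bit(store.copies[0]["block0"], 3)
recovered = store.read("block0")
```

With only one copy damaged, the read returns the intact data and rewrites
the bad copy. If every copy is damaged, the read fails loudly instead of
handing back bad data, which is the property we were missing.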
-- 
Leif Nixon                       -            Systems expert
------------------------------------------------------------
National Supercomputer Centre    -      Linkoping University
------------------------------------------------------------
    
    