[Beowulf] Re: failure trends in a large disk drive population
Jim Lux
James.P.Lux at jpl.nasa.gov
Fri Feb 16 14:15:49 PST 2007
At 12:50 PM 2/16/2007, David Mathog wrote:
>Eugen Leitl <eugen at leitl.org> wrote:
>
> > http://labs.google.com/papers/disk_failures.pdf
>
>Interesting. However google apparently uses:
>
> serial and parallel ATA consumer-grade hard disk drives,
> ranging in speed from 5400 to 7200 rpm
>
>Not quite clear what they meant by "consumer-grade", but I'm assuming
>that it's the cheapest disk in that manufacturer's line. I don't
>typically buy those kinds of disks, as they have only a 1 year
>warranty but rather purchase those with 5 year warranties.

But this is potentially a very interesting trade-off, and one right
in line with the Beowulf concept of leveraging cheap consumer gear.

Say you need 100 widgets' worth of horsepower. Are you better off
buying 103 pro widgets at $500 each with a 3% failure rate, or 110
consumer widgets at $450 each with a 10% failure rate? That's $51.5K
vs $49.5K, so the cheap drives win. And, in fact, if the drives fail
randomly during the year (not a valid assumption in general, but easy
to calculate on the back of an envelope), then you actually get more
compute power with the cheap drives (about 105 widgets on average vs
101.5 over the year).

This also assumes that the failure rate is "small" and "independent"
(that is, you don't wind up with a bad batch that all fail
simultaneously from some systemic flaw, the bane of a reliability
calculation).
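
For anyone who wants to redo the envelope arithmetic with their own prices and failure rates, here is a minimal sketch in Python. The function name and its parameters are made up for illustration, and it bakes in the same assumptions as above: failures are independent and spread evenly over the year, and failed units are not replaced. Done exactly, the averages come out at 104.5 and about 101.46, which the figures above round to 105 and 101.5.

```python
def fleet_cost_and_capacity(units_bought, unit_price, failure_rate):
    """Back-of-envelope purchase cost and capacity for one year.

    Assumes failures are independent, spread evenly over the year,
    and that failed units are not replaced (all assumptions).
    Returns (cost, units alive at year end, average units alive).
    """
    cost = units_bought * unit_price
    failed = units_bought * failure_rate
    alive_at_end = units_bought - failed
    # With evenly spread failures, the count alive declines linearly,
    # so the yearly average is the midpoint of start and end.
    average_alive = (units_bought + alive_at_end) / 2
    return cost, alive_at_end, average_alive


pro = fleet_cost_and_capacity(103, 500, 0.03)
consumer = fleet_cost_and_capacity(110, 450, 0.10)

for label, (cost, end, avg) in (("pro", pro), ("consumer", consumer)):
    print(f"{label}: cost ${cost}, {end:.2f} alive at year end, {avg:.2f} average")
```

Changing the three inputs per option is enough to see where the break-even point sits for a given price gap.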

One failing I see in many cluster applications is that they are quite
brittle: they depend on a particular number of processors toiling on
the task, and on the complement of processors not changing during the
"run". But this sort of thing makes a 100-node cluster no different
from depending on one 100x-speed supercomputer.

I think it's pretty obvious that Google has figured out how to
partition their workload in a "can use any number of processors" sort
of way, in which case they probably should be buying the cheap
drives and just letting them fail (and stay failed; it's probably
cheaper to replace the whole node than to try to service one).
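
To make the "can use any number of processors" idea concrete, here is a toy sketch in Python. It is not how Google does it (I have no idea of the details); it is just a shared work queue where a task held by a node that dies goes back on the queue. The function name, the round-by-round model, and the `fail_at` failure schedule are all invented for illustration.

```python
from collections import deque


def run_with_failures(tasks, workers, fail_at):
    """Process tasks in rounds while workers drop out and stay failed.

    fail_at maps a round number to the set of workers that die in that
    round (a hypothetical failure schedule). A task held by a dying
    worker is re-queued, so the job completes as long as at least one
    worker survives.
    Returns (tasks in completion order, number of rounds used).
    """
    queue = deque(tasks)
    alive = list(workers)
    done = []
    round_no = 0
    while queue:
        if not alive:
            raise RuntimeError("all workers failed before the job finished")
        dying = fail_at.get(round_no, set())
        # Hand one task to each live worker, as far as the queue allows.
        assigned = [(w, queue.popleft()) for w in alive[:len(queue)]]
        for worker, task in assigned:
            if worker in dying:
                queue.append(task)  # lost work goes back on the queue
            else:
                done.append(task)
        # Dead workers are never repaired or replaced in this model.
        alive = [w for w in alive if w not in dying]
        round_no += 1
    return done, round_no


done, rounds = run_with_failures(range(10), ["n0", "n1", "n2", "n3"], {1: {"n2"}})
print(sorted(done), rounds)
```

The point is only that nothing in the job depends on the node count staying fixed: losing "n2" partway through costs one re-queued task, not the whole run.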
James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875