[Beowulf] 32 nodes cluster price
Joe Landman
landman at scalableinformatics.com
Sun Oct 7 13:23:24 PDT 2007
Bill Rankin wrote:
> Let me offer up a somewhat concrete example of a problem with hardware
> raid.
>
> A local group around here kept some Very Important Data on a hardware
> raid array. Due to several factors, a backup was not made of certain
> data. The device lost a drive and started an automagic rebuild on one
Let me state the obvious here. And yes, I know I am likely "preaching
to the choir."
RAID is not a backup solution. Again, RAID is not a backup solution.
If you run without a backup, we can pretty much guarantee that you are
going to lose data at some point in time. Again, RAID is not a backup
solution.
I don't know if I mentioned it, but RAID is not a backup solution.
Anyone who believes otherwise is begging for trouble. RAID is not a
backup solution.
Backing up your data is *ALWAYS* important, RAID or not. Even if it is
just a mirror of the data.
> of the hot spares. The sudden beating that the other drives took
> (because of the rebuild) caused a second hard drive to fail (always a
> concern with RAID5).
[... anecdote elided ...]
RAID is not a backup solution. Anyone mistakenly using it as such *will*
be burned.
> Now while this is kind of a "perfect storm" in terms of hardware and
> data failure, it does illustrate the extent of control that you give up
> when going with a hardware raid solution. I think that the higher end
Er... with all due respect, this wasn't a hardware issue. This was a
policy issue.
If your data is important, back it up. It doesn't matter if it is on a
hardware or software RAID; you absolutely, positively must do a
cost-benefit analysis of the value of the data and the time/effort/money
it would cost to recover when (not if) something goes bump in the night.
RAID is not a backup solution. Not sure I mentioned this.
All hardware has failure modes. All software has bugs. Your choice is
which set of problems is easier to deal with. We have seen crappy
hardware, and abominable software. Bugs in the Linux kernel (no, there
couldn't be any, nah... impossible ...) could just as easily wreck your
day as a misguided firmware/hardware bug.
Backups are a risk mitigation strategy. If you have important data, you
need to back it up. Moreover, I argue that you need multiple modalities
of backup/restore. Call this 20+ years of experience in losing data and
thinking (naively) that the backup that I have will actually restore...
properly.
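
For what it is worth, here is a minimal sketch of one way to test that a
backup actually restores: do a trial restore into a scratch directory and
compare checksums against the live data. The function names and the two
directory arguments are hypothetical, not from any particular backup tool.
It assumes the original tree has not changed since the backup was taken,
and it only compares file contents, not ownership, permissions, or
timestamps.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each regular file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(original: Path, restored: Path) -> dict[str, list[str]]:
    """Compare an original tree against a trial restore of its backup."""
    src = tree_digests(original)
    dst = tree_digests(restored)
    return {
        # Files in the original that the restore did not bring back.
        "missing": sorted(set(src) - set(dst)),
        # Files in the restore that are not in the original.
        "extra": sorted(set(dst) - set(src)),
        # Files present in both whose contents differ.
        "corrupt": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

Something like `verify_restore(Path("/data"), Path("/scratch/restore_test"))`
(both paths made up for illustration) should come back with three empty
lists. Anything else means the backup would not have saved you.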
> vendors (ie. NetApp, EMC, et al) have their reliability up to the point
> where this is much less of a risk. But for the low-end beer budget
Er... ah... ok. All of them have similar issues. I occasionally hear
how vendor X's (make the appropriate substitution for X) item, such as a
network card, or disk drive is *obviously* much better than what is
available in the mass market, which is why they charge so much more for
it. The last time a customer noted that about one of the above named
vendors (network card as it turned out), I asked them to pull back the
label on the card and see what was underneath it. Turns out it was a
plain old mass market card with a (vendor X) label slapped on it. I am
sorry to report that for the vast majority of cases of which I am aware,
they (the above named vendors X) use generally the same mass market
stuff you and I do.
Don't mistake this: EMC, NetApp and others *do* offer value. It just
isn't in slapping a new label on something, charging 10x for it, and
somehow convincing the people paying for it that it is magically special
(that is, unless their label maker has some serious undocumented mojo in
that label ...). Their value is in hyperactive support.
> cluster, software raid is probably still the way to go. As for the
> "mid-tier" vendors, I would be very cautious and pay close attention to
> the worst case data loss scenario.
What we tell all our customers (aside from "RAID is not a backup
solution") is that they want to minimize risk. Where is the risk? Well,
you can trace it out. There are many ways to mitigate risk and reduce
downtime. RAIN is a great example.
But you can build RAIN out of software RAID as easily as hardware RAID.
Remember, all have bugs; your job is to figure out (or work with
someone who does this for you) how to reduce the impact of potential
bugs. RAID is not a backup, and if you run without one, well, ...
> Good luck,
... yeah.
>
> -bill
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615