[Beowulf] Register article on Opteron - disagree
Patrick Geoffray
patrick at myri.com
Mon Nov 22 01:30:06 PST 2004
Hi Robert,
Robert G. Brown wrote:
> Sure. We can start with the fact that the Top 500 list is irrelevant.
> It is a hardware vendor pissing contest.
No, it's a statistical study. The Top500 list is relevant for what it
was designed for: tracking the evolution of the HPC market, i.e. where
people put their money.
Don't look at the top 10, don't look at the order; look at the 500
entries as a snapshot of the 500 biggest known machines. In this
context, you are looking for trends: vector vs. scalar, SMP vs.
MPP, industry vs. academia, etc.
Sure, the vendors pay attention to it, and it's certainly a marketing
fight for the top 10, but they cannot really influence the whole list, IMHO.
> a) It lists identical hardware configurations as many times as they
> are submitted. Thus "Geotrace" lists positions 109 through 114 with
> identical hardware. If one ran uniq on the list, it would probably end
> up being the top 200. At most. Arguably 100, since some clusters
> differ at most by the number of processors. What's the point, if the
> site's purpose is to encourage and display alternative engineering?
> None at all.
When you take a snapshot, you get everybody in the picture, duplicates
included. If "Geotrace" really has 15 identical clusters (it is quite
common in industry to have several identical clusters, BTW), they
should be on the list; otherwise your snapshot of the market is wrong.
Why would you expect *all* of the entries to be different?
> b) It focusses on a single benchmark useful only to a (small) fraction
> of all HPC applications. I mean, c'mon. Linpack? Have we learned
> >>nothing<< from the last twenty years of developing benchmarks? Don't
Haven't you learned rule #1 of benchmarking?
Rule #1: no set of benchmarks is representative enough.
Linpack is a yardstick that measures three things:
1) the system computes mostly right;
2) the system stays up long enough to be useful;
3) the system is big enough to be in the snapshot.
For these requirements, Linpack is just fine.
> get me wrong -- linpack is doubtless a useful measure to at least some
> folks. However, why not actually present a full suite of tests instead
> of just one? Vendors would hate this, just like they hate Larry McVoy's
> benchmark suite, because it makes it so difficult to cook up a cluster
> that does just one thing (runs Linpack) well...
I will tell you why the vendors would hate this: it takes a lot of time,
and they are not paid for it. Tuning Linpack on a large machine is
already very time-consuming. You want 10 more benchmarks to run? Sure,
but who will run them?
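To give an idea of what "tuning" even means here, one common rule of
thumb is to size the problem so the matrix fills most of memory. A
purely illustrative back-of-the-envelope sketch, in C, where every
figure is an assumption and not any real machine:

/* Back-of-the-envelope sizing for a Linpack (HPL) run: pick the
 * problem size N so that the N x N matrix of doubles fills roughly
 * 80% of aggregate memory (a common rule of thumb; all figures below
 * are illustrative assumptions). */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double mem_per_node = 4e9;   /* assumed: 4 GB of RAM per node */
    int    nodes        = 512;   /* assumed: a 512-node cluster   */
    double usable = 0.8 * mem_per_node * nodes; /* leave room for OS/MPI */
    long   n = (long)sqrt(usable / 8.0);        /* 8 bytes per double    */
    printf("suggested N ~ %ld\n", n);
    return 0;
}

And that only gives a starting point for N; the block size NB and the
process grid P x Q still have to be swept empirically, one multi-hour
run at a time.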
BTW, how do you cook up a cluster to run Linpack well?
> c) It totally neglects the cost of the clusters. If you have to ask,
What is the price of a machine? List price? Including software? The
machine room? Service? Support? For how long? You would need to define
a pricing benchmark to get comparable numbers, and trust the vendors to
comply with it.
> best the cost after the vendor discounted the system heavily for the
> advertising benefit of getting a system into the top whatever.
You mean I cannot really buy the VT cluster for $5M? What a
pity :-)
> I could go on. I mean, look at the banner ads on the site. Vendors
> love this site. If it didn't exist, they'd go and invent it.
Who is paying for the website, the bandwidth, the booth at SC, the
operational expenses? The banner ads...
> If they want me to take the top500 list seriously, they could start by
> de-commercializing it, running a pretty stringent unique-ing process on
Maybe you mean they should commercialize it?
In what way is it commercialized today? It's based on trust and
voluntary submissions from anyone. Commercializing it would mean paying
to submit an entry. That would take care of the 15 identical entries
from Geotrace, but then only big, rich vendors would be on the list.
> the submissions and accepting only the first of a given design or
> architecture, especially for clusters that are more or less turnkey and
> mass produced.
There is no such thing as a "turnkey and mass-produced" cluster, except
when a customer buys several instances of the same configuration. Most
configs are different.
> Then they could run a SERIOUS suite of benchmarkS (note
> plural) on the clusters, one which (like SPEC) attempts to provide
Let me say it again: the Top500 is not designed to find the best
machine for your needs! That is the job of an RFP and the associated
suite of acceptance tests!!!
When you want to spend enough dollars to buy a 1 TFlops machine, you
do your homework: you identify the set of benchmarks you have
confidence in, you ask vendors to bid, you choose one, and you test it
for acceptance. You don't tape the Top500 list to the wall and throw a
dart at it. If you do, I have a nice bridge to sell you :-)
> useful information about things like latency, bandwidth, interconnect
> saturation for various communications patterns, speed for a variety of
> actual applications including several with very different
> computation/communication patterns (ranging from embarrassingly parallel
> to fine grained synchronous). Scaling CURVES (rather than a single
> silly number) would be really useful.
Sure, and you do it all again every 6 months.
Seriously, take something as simple as bandwidth: is it pipelined or
not? Cold-start or not? Back-to-back or through a switch? How many
stages in the switch? Do you count a megabit as 1,000,000 bits or
1,048,576 bits? Now imagine the mess when we talk about communication
patterns...
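To make it concrete, here is a minimal ping-pong sketch (my own
illustration, not any standard benchmark; the message size and
iteration count are arbitrary assumptions). Drop the warm-up exchange
and you time cold-start; keep it and you time steady-state; and the
very same run prints two different "bandwidths" depending on which
megabit you divide by:

/* Minimal MPI ping-pong sketch (an illustration, not a standard
 * benchmark). The message size, iteration count and warm-up policy
 * are arbitrary choices, and every one of them changes the number
 * you end up reporting. Run with exactly 2 ranks. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define SIZE  (1 << 20)   /* 1 MB messages: one arbitrary choice */
#define ITERS 100

static void pingpong(char *buf, int rank)
{
    if (rank == 0) {
        MPI_Send(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else {
        MPI_Recv(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    }
}

int main(int argc, char **argv)
{
    int rank;
    char *buf = calloc(1, SIZE);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    pingpong(buf, rank);  /* warm-up: skip it and you time cold-start */

    double t0 = MPI_Wtime();
    for (int i = 0; i < ITERS; i++)
        pingpong(buf, rank);
    double t = (MPI_Wtime() - t0) / (2.0 * ITERS);  /* one-way time */

    if (rank == 0) {
        double bits = 8.0 * SIZE;
        printf("%.1f Mb/s if 1 Mb = 10^6 bits\n", bits / t / 1e6);
        printf("%.1f Mb/s if 1 Mb = 2^20 bits\n", bits / t / 1048576.0);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}

Same wire, same run: the two printed numbers already differ by almost
5% from the unit definition alone, before you even get to switch stages
or traffic patterns.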
> I mean, this site is "sponsored" by some presumably serious computer
> science and research groups (although you'd never know it to look at all
The Top500 team is 4 people, with the same guy doing most of the work
since the list was created. There are no "groups" behind it.
> sponsoring institutions are listed). If they want to do us a real
> public service, they could do some actual computer science and see if
> they couldn't come up with some measure a bit richer than just R_max and
> R_peak....
BTW, the 4 aforementioned people know quite a bit of computer science...
The fact is that the Top500 is a statistical study, not a public
service. You can use it as it is or not; it does not really matter.
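(For what it's worth, the pair is not meaningless: Rpeak is the
theoretical peak floating-point rate and Rmax is the rate actually
achieved on Linpack, so Rmax/Rpeak is an efficiency figure. A purely
illustrative example: a machine reporting Rmax = 8 TFlops against
Rpeak = 10 TFlops runs Linpack at 8/10 = 80% efficiency, which already
says something about how well its BLAS and interconnect hold up.)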
> AMD has more or less "owned" the price/performance sweet spot for the
> last two years. If you have LIMITED money to spend and want to get the
What is the "price/performance sweet spot" ?
How do you measure it ? How do you know that AMD owned it ?
The Top500 list tells you that people put their money on Intel more than
on AMD. If you assume that people do they homework when they buy such
machines, you may deduct that Intel was more attractive in the bidding
process.
I can understand that the Top500 does not match your expectations, but
the truth is that it was never designed to match them. What you are
looking for just does not exist. It's not utopia, though: you could
imagine a company following the SPEC model, where vendors pay to submit
(a lot of) results and the results are seriously reviewed. All you need
is to convince vendors to spend a lot of money on it. Is the HPC market
big enough for that? I doubt it.
Patrick
--
Patrick Geoffray
Myricom, Inc.
http://www.myri.com