[Beowulf] Network considerations for new generation cheap beowulf cluster
Jess Cannata
jac67 at georgetown.edu
Mon May 21 07:11:36 PDT 2007
Mark Hahn wrote:
>> I agree that all of the options (Infiniband, Myrinet, and 10 Gb
>> Ethernet) are too expensive.
>
> I'm curious what kinds of costs you're seeing (per-port) for each of
> these.
>
By "too expensive" I mean much more expensive than Gig-E, which is "free"
on the NIC side and quite cheap on the switch side.
>> I have been looking into the low latency 10 Gb Ethernet cards from
>> NetEffect, which use the iWARP specifications to provide low latency. I
>
> why do you think iWARP is necessary to provide low latency?
I'll admit that I don't have a great understanding of how the NetEffect
cards work. I do know that they are using the iWARP protocol (Remote
Direct Memory Access, etc.) to reduce latency, but that isn't the only
thing they are using.
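To make the RDMA part concrete, here is a minimal sketch at the MPI
level. This is not NetEffect's iWARP interface (I haven't touched their
cards), just an illustration of the one-sided, remote-direct-memory-access
style of transfer that such NICs are meant to accelerate:

/* rdma_style_put.c -- illustration only, not the NetEffect/iWARP API.
 * Build: mpicc rdma_style_put.c -o rdma_style_put
 * Run:   mpirun -np 2 ./rdma_style_put
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    double buf = 0.0;   /* target buffer exposed to remote ranks */
    double val = 42.0;  /* origin buffer; must stay valid until the epoch closes */
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Expose one double on every rank; an RDMA-capable interconnect can
       service remote writes to this window without waking the target CPU. */
    MPI_Win_create(&buf, sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 0 && size > 1)
        /* One-sided put: rank 0 writes straight into rank 1's buffer. */
        MPI_Put(&val, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
    MPI_Win_fence(0, win);   /* the put is not complete until this fence */

    if (rank == 1)
        printf("rank 1 got %.1f without posting a receive\n", buf);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

The point is just that the data lands in the target's memory without the
target CPU running a receive path, which is where the latency (and CPU
overhead) savings are supposed to come from.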
>
>> haven't done any testing, yet, but the numbers that they are
>> releasing show them competitive with Infiniband/Myrinet as the number
>> of processes increase.
>
> do you mean as you increase number of processes on a single node (that
> is,
> sharing a single interconnect port), or number of processes in the
> whole job or cluster?
What I should have said is that the NetEffect card is competitive as the
number of network connections to a process on a single node increases.
Unfortunately, the HPCWire article did not include these numbers. I saw
them in a presentation given by NetEffect.
>
>> Plus, I expect 10 Gb switches to rapidly drop in price. I believe
>> that the
>
> I hope for that as well, but am not sanguine. expensive optical
> transceivers preclude commoditization of small (~20 pt) switches, and
> I've heard people say bad things about the practicality of
> mass-produced/cheap 10G-baseT.
> (mainly complaining about complexity and power requirements.)
I heard the same things said about Gig-E. I'm confident a solution will
be found, whether through better manufacturing, better design, or new
technology.
>
>
>> and post some numbers. Here is a link to some of the numbers that
>> NetEffect is publishing:
>>
>> http://www.hpcwire.com/hpc/716435.html
>
> no usable latency numbers there. if you squint, it looks like they're
> claiming latency of around 7 us, which is _not_ competitive with even
> myri 2G (nor recent IB nor myri 10G.)
The numbers that I saw are not on HPCWire. I didn't realize that when I
sent the link. I recommend that people check out the NetEffect cards and
similar interconnects (low latency 10 Gb Ethernet with iWARP) to see if
the claims that they make are valid. I'm not sure that they are, but I
am interested to find out. AMD's developer site has a new cluster,
called "Smith," with both Infiniband and NetEffect's low latency 10 Gb
cards installed, so you will be able to do direct comparisons between
low latency 10 Gb Ethernet and Infiniband. You can find it at
https://devcenter.amd.com/about/systems.php. I haven't tested it yet,
but I plan to, and I'd be interested in hearing about others' experiences.
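When I do get time on Smith, I plan to start with something like the
crude ping-pong microbenchmark below, run once over the NetEffect cards
and once over the Infiniband fabric, and compare the half round-trip
times. Nothing here is vendor-specific; it is just a rough sketch:

/* pingpong.c -- crude 1-byte latency check.
 * Build: mpicc pingpong.c -o pingpong
 * Run:   mpirun -np 2 ./pingpong   (one rank per node)
 */
#include <mpi.h>
#include <stdio.h>

#define WARMUP 1000
#define ITERS  10000

int main(int argc, char **argv)
{
    int rank, i;
    char byte = 0;
    double t0 = 0.0, t1;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < WARMUP + ITERS; i++) {
        if (i == WARMUP) {          /* start timing after warm-up */
            MPI_Barrier(MPI_COMM_WORLD);
            t0 = MPI_Wtime();
        }
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        /* half the round trip is the usual "latency" number */
        printf("1-byte latency: %.2f us\n",
               (t1 - t0) * 1e6 / (2.0 * ITERS));

    MPI_Finalize();
    return 0;
}

On plain Gig-E I would expect this to report a few tens of microseconds;
the interesting question is how close the iWARP cards get to the
single-digit numbers that IB and Myri 10G deliver.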
>
>>> the cheapest cable i see is 1 meter and $70
>
> nothing wrong with $70 cables - you need to quote the whole per-port
> price,
> including nic, cable and switch port. it looks to me as if Myri 10G
> is around $1500/port; I've never had a good read on IB prices
> (deconvolved from vendor/discount pricing issues.)
>
>>> Cheapest card i see is $715
>
> nothing wrong with $715, even if the all-in per-port price is $1500 -
> it just means you won't be using $1000 desktop-spec nodes. that's OK,
> since if you're worried about ~3 us latency and 1GB bandwidth, you
> should also be using multiple cores, ECC memory, and probably a few
> GB/node,
> and therefore can easily amortize $715/node.
>
>>> So the node price starts at $765, which is already way way more than
>>> the total price of 1 node.
>
> only if you're looking at extremely low-end nodes. for such nodes,
> the only viable option is zero-cost Gb nics, of course, and
> mass-market switches (ie, not high-end chassis switches, etc).
>
I thought (though I may be mistaken) that this was the point of the
original post: what is, or will be, the new low-cost network solution?
It doesn't seem to be Infiniband or Myri-10G, since their prices don't
seem to be dropping much.
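Just to keep the pricing argument concrete, here is the
back-of-the-envelope arithmetic I have in mind, using the rough figures
quoted above; the per-port switch share and the node prices are my
guesses, not quotes:

/* perport.c -- rough cost amortization sketch; all figures are
 * assumptions from this thread, not verified pricing. */
#include <stdio.h>

int main(void)
{
    double nic          = 715.0;  /* 10G NIC, as quoted above */
    double cable        = 70.0;   /* 1 m cable, as quoted above */
    double switch_share = 715.0;  /* guessed per-port share of the switch */

    double per_port = nic + cable + switch_share;

    double desktop_node = 1000.0; /* low-end desktop-spec node */
    double hpc_node     = 4000.0; /* assumed multi-core, ECC, multi-GB node */

    printf("all-in per-port cost:  $%.0f\n", per_port);
    printf("vs. $%.0f node: %.0f%% of node cost\n",
           desktop_node, 100.0 * per_port / desktop_node);
    printf("vs. $%.0f node: %.0f%% of node cost\n",
           hpc_node, 100.0 * per_port / hpc_node);
    return 0;
}

Against a $1000 desktop-class box the interconnect costs more than the
node itself, while against a properly specced compute node it is a large
but arguably tolerable fraction, which I take to be Mark's amortization
point. Either way, it is nowhere near the effectively-zero cost of
onboard Gig-E.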
> 5 years ago, the low-end approach was 100bT; now it's 1000bT. the
> prime target for that approach (serial or EP) has simply gotten broader;
> I don't see this as anything to complain about. for "real" parallel,
> you have to pay for the network you need. there as well, you now get
> more for your money, no complaints. complaining that you can't get 1
> us, 1GBps interconnect for $50/port is just silliness.
>
I disagree on this last point. Why can't low latency interconnects
become the standard? It is not just HPC applications that are demanding
low latency networks.
Jess