[Beowulf] interconnect and compiler ?
Mark Hahn
hahn at mcmaster.ca
Thu Jan 29 16:22:10 PST 2009
> But the HTX product is retired,
> and the current DDR PCIe Infiniband chip has overhead similar to the
> HTX chip.
Interesting. The latency numbers I could find were 1.29 µs (HTX) vs 1.7 µs.
Did the latency improve as well as the overhead? Also, what changed
the overhead?
> I guess you must run a lot of 2-node 2-core jobs, if you're so
> concerned with ping-pong latency ;-) ;-)
I'll bite: suppose I run large MPI jobs (say, 1K ranks)
with 8 cores/node and 1 NIC/node. Under what circumstances
would a node be primarily worried about message rate rather than latency?
Does the latency/rate distinction lead to high fan-out when implementing
barriers/reductions/manycasts? I'm just not sure what circumstances lead
to a node generating a very high rate of concurrent messages. The codes
I see don't typically have a lot of in-flight nonblocking messages,
at least not explicitly. It seems like topology-aware code (or hybrid
MPI/threaded) would have even fewer.
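
(For concreteness, here's a minimal sketch of the kind of pattern I take
"message rate" to mean; it isn't from any code I actually run, and the
64-peer fan-out and 8-double payload are invented numbers. Each rank
posts dozens of small nonblocking sends and receives at once, so with 8
such ranks behind one NIC it's the NIC's injection rate, not its
ping-pong latency, that gates progress.)

/* Minimal sketch (not from the thread): each rank posts many small
 * nonblocking sends/receives at once.  The 64-peer fan-out and the
 * 8-double payload are made-up illustrative values.  With 8 ranks per
 * node sharing one NIC, the NIC sees ~8 * 64 concurrent small messages,
 * so its messages-per-second (injection) rate limits progress more
 * than its one-at-a-time ping-pong latency does. */
#include <mpi.h>

#define NPEERS 64   /* assumed fan-out per rank */
#define MSGLEN 8    /* small payload: rate-bound, not bandwidth-bound */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double sendbuf[MSGLEN], recvbuf[NPEERS][MSGLEN];
    MPI_Request reqs[2 * NPEERS];
    int nreq = 0;

    for (int j = 0; j < MSGLEN; j++)
        sendbuf[j] = (double)rank;

    /* Post everything up front, then wait: the classic "lots of
     * messages in flight" pattern that stresses message rate. */
    for (int i = 1; i <= NPEERS && i < nranks; i++) {
        int peer = (rank + i) % nranks;
        MPI_Irecv(recvbuf[i - 1], MSGLEN, MPI_DOUBLE, MPI_ANY_SOURCE, 0,
                  MPI_COMM_WORLD, &reqs[nreq++]);
        MPI_Isend(sendbuf, MSGLEN, MPI_DOUBLE, peer, 0,
                  MPI_COMM_WORLD, &reqs[nreq++]);
    }
    MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);

    MPI_Finalize();
    return 0;
}

Whether real codes ever look like this is exactly what I'm asking.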
thanks, mark hahn.