[Beowulf] IB vs. Ethernet

Lawrence Stewart stewart at serissa.com
Sat Feb 21 12:34:20 UTC 2026



> On Feb 21, 2026, at 3:28 AM, Greg Lindahl <lindahl at pbm.com> wrote:
> 
> On Thu, Jan 15, 2026 at 08:28:36PM -0500, Lawrence Stewart wrote:
> 
>> I think a 64 byte store at a core should directly become a packet.  No on-die-network, no coherence, no root complex, no host-fabric adapter.  Incoming short messages should be delivered directly to a fifo in the relevant core.
> 
> I think that's a great idea!
> 
> — greg
> 


As Greg, I think, is hinting, this is something the QLogic HFIs did, using the core’s write-combining buffers to good effect.  It also seems to be the basic idea behind MOVDIR64B, which specifies that a 64-byte write will be atomic all the way down.

Using core registers for messaging is much older: Transputers, Tilera, Dally’s J-Machine, and arguably Cray E-registers.

What this is really about is end-to-end latency. We’ve been stuck at 1 microsecond since the Cray T3D 30 years ago, in spite of 100x improvements in link speed.  If we can eliminate all the middlemen and get switches back to 50 ns forwarding, I think we should be able to get 300 ns end to end in a good-sized system.
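
To make the arithmetic concrete, here is a rough budget sketch. Only the 50 ns forwarding time and the 300 ns target come from the paragraph above; the endpoint costs, the cable length, and the names used are assumptions picked purely for illustration, not measurements of any real system.

```python
# Rough end-to-end latency budget. Only SWITCH_NS and TARGET_NS come from
# the text above; every other number is an illustrative assumption.

SWITCH_NS = 50        # per-switch forwarding time (from the text)
TARGET_NS = 300       # end-to-end goal (from the text)

SEND_NS = 20          # assumed: core store -> packet on the wire
RECV_NS = 20          # assumed: wire -> FIFO in the receiving core
CABLE_NS_PER_M = 5    # roughly 5 ns per metre in copper or fibre
CABLE_M = 10          # assumed total cable length along the path


def end_to_end_ns(hops: int) -> int:
    """One-way latency for a path crossing `hops` switches."""
    return SEND_NS + RECV_NS + CABLE_NS_PER_M * CABLE_M + hops * SWITCH_NS


def max_hops(target_ns: int = TARGET_NS) -> int:
    """Largest switch count that still meets the target."""
    fixed = end_to_end_ns(0)
    return max(0, (target_ns - fixed) // SWITCH_NS)


if __name__ == "__main__":
    for hops in range(1, 6):
        print(f"{hops} switch hops: {end_to_end_ns(hops)} ns")
    print("max hops within target:", max_hops())
```

Under these assumed numbers the budget leaves room for four switch hops (290 ns); a fifth pushes it to 340 ns. The switch term dominates quickly, which is why the forwarding time matters so much.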

-Larry


