[Beowulf] New member, upgrading our existing Beowulf cluster

Thu Dec 3 23:29:29 PST 2009

Hi,

On Dec 4, 2009, at 3:34 , Chris Samuel wrote:
>
> How does it deal with pinned DMA memory on NICs ?

What we did in Platform (Scali) MPI was to drain the HPC
interconnect and then close it down. That reduced the problem to
checkpointing N processes (e.g. using BLCR). Both continuing after the
checkpoint and restarting from it would then re-open the HPC fabric,
possibly on a different physical medium. You could take the checkpoint
on IB and restart using GbE.
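
To make the ordering concrete, here is a toy model of the sequence
(drain, close, checkpoint, re-open). It is only a sketch: ToyRank and
all of its methods are hypothetical names made up for illustration,
pickle stands in for a real process checkpointer such as BLCR, and
none of this is the actual Platform MPI or BLCR interface.

```python
import pickle


class ToyRank:
    """Hypothetical stand-in for one MPI process; not a real MPI API."""

    def __init__(self, rank, fabric):
        self.rank = rank
        self.fabric = fabric          # "IB", "GbE", or None when closed
        self.in_flight = []           # messages not yet delivered
        self.state = {"delivered": []}

    def drain(self):
        # Complete all outstanding traffic so nothing lives only in the NIC.
        self.state["delivered"].extend(self.in_flight)
        self.in_flight = []

    def close_fabric(self):
        if self.in_flight:
            raise RuntimeError("drain before closing the fabric")
        self.fabric = None            # no pinned buffers or NIC state remain

    def checkpoint(self):
        if self.fabric is not None:
            raise RuntimeError("close the fabric before checkpointing")
        # Assumption: pickle plays the role of the process checkpointer.
        return pickle.dumps((self.rank, self.state))

    @classmethod
    def restart(cls, blob, fabric):
        # Re-open on whatever fabric is available at restart time.
        rank, state = pickle.loads(blob)
        restored = cls(rank, fabric)
        restored.state = state
        return restored


r = ToyRank(0, "IB")
r.in_flight.append("msg-1")
r.drain()
r.close_fabric()
blob = r.checkpoint()
r2 = ToyRank.restart(blob, "GbE")     # checkpoint on IB, restart on GbE
```

The point of the ordering is that the checkpoint image never contains
interconnect state, which is why the fabric chosen at restart does not
have to match the one in use when the checkpoint was taken.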

Combined with interconnect-agnostic support, this feature allows you,
in the case of a failing IB HCA (or a failing switch port or cable),
to restart from the last checkpoint with M-1 nodes each communicating
with the other M-2 IB-capable nodes using IB, and the last node
communicating with those M-1 nodes using GbE.
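
As an illustration of the mixed-fabric restart, the per-pair choice can
be sketched as below. This assumes a simple rule (use IB only when both
endpoints have working IB, otherwise fall back to GbE); the function
names and the node names are hypothetical, and this is not how Platform
MPI selects transports internally.

```python
from itertools import combinations


def pick_transport(a_has_ib, b_has_ib):
    # Assumed rule: IB needs a working HCA on both ends, else use GbE.
    return "IB" if a_has_ib and b_has_ib else "GbE"


def transport_map(ib_ok):
    """Map each unordered node pair to a transport.

    ib_ok: dict of node name -> True if its IB path is usable.
    """
    return {
        (a, b): pick_transport(ib_ok[a], ib_ok[b])
        for a, b in combinations(sorted(ib_ok), 2)
    }


# M = 4 nodes; n3 has lost its HCA (or switch port, or cable).
nodes = {"n0": True, "n1": True, "n2": True, "n3": False}
links = transport_map(nodes)

for pair, transport in sorted(links.items()):
    print(pair, transport)
```

With M = 4, each of the three healthy nodes talks to the other two over
IB, and n3 reaches all three over GbE.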

Traditional checkpointing requires a snapshot of the file system in the
general case (and a restore of the correct snapshot at restart),
whereas checkpoint-and-kill (for migration or preemptive batch
scheduling) does not require integration with the file system.

Håkon