[Beowulf] [EXTERNAL] Re: Is Crowd Computing the Next Big Thing?

Chris Samuel chris at csamuel.org
Sat Nov 30 18:53:26 PST 2019


On 30/11/19 6:27 pm, Douglas Eadline wrote:

> The most interesting thing I learned was how well
> some laptops functioned for a "users needs" while technically
> in a state of "brokenness" There is a larger lesson there.

This is why I'm a big, big fan of compute nodes booting from a set image
each time. We did it at VLSCI with xCAT and its "statelite" target (so
we could keep GPFS metadata & other state on an NFS mount from the mgmt
node for easy booting) with our SGI and IBM hardware, and it worked
really nicely.

At least then everything should be identically broken. ;-)
(and you only need to fix something in one place)

Similar approach here at NERSC with Cray Ansible (convergent evolution).
We keep our recipes/definitions/etc in git and reuse them across systems 
(as much as possible) with config information abstracted out to define 
personalities for image builds and for boot.
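
As a rough illustration of "shared recipe plus abstracted-out
personality", here is a minimal Python sketch. It is not the NERSC or
Cray tooling, just the general idea under my own assumptions: every
name in it (BASE_RECIPE, PERSONALITIES, the system and package names,
merge, build_definition) is hypothetical.

```python
from copy import deepcopy

# Hypothetical shared recipe, the sort of thing that would live in git
# and be reused across systems.
BASE_RECIPE = {
    "packages": ["pkg-common", "pkg-scheduler"],
    "sysctl": {"vm.swappiness": 10},
    "boot": {"root": "nfs", "console": "ttyS0"},
}

# Hypothetical per-system personalities: only the differences from the
# shared recipe are written down.
PERSONALITIES = {
    "system-a": {"boot": {"console": "ttyS1"}},
    "system-b": {
        "packages": ["pkg-common", "pkg-other-fs"],
        "sysctl": {"vm.swappiness": 0},
    },
}


def merge(base, overlay):
    """Return a copy of base with overlay applied on top.

    Nested dicts are merged key by key; any other value in the overlay
    (including a list) replaces the base value outright.
    """
    result = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def build_definition(system):
    """Combine the shared recipe with one system's personality."""
    return merge(BASE_RECIPE, PERSONALITIES[system])


if __name__ == "__main__":
    for name in PERSONALITIES:
        print(name, build_definition(name))
```

The point of the split is the same as with the set image: a fix to the
shared recipe lands in one place and every system picks it up, while
the personality stays small enough to review by eye.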

All the best,
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

