[Beowulf] [EXTERNAL] Re: Is Crowd Computing the Next Big Thing?
Chris Samuel
chris at csamuel.org
Sat Nov 30 18:53:26 PST 2019
On 30/11/19 6:27 pm, Douglas Eadline wrote:
> The most interesting thing I learned was how well
> some laptops functioned for a "user's needs" while technically
> in a state of "brokenness". There is a larger lesson there.
This is why I'm a big, big fan of compute nodes booting from a set image
each time. We did it at VLSCI with xCAT and its "statelite" target (so
we could keep GPFS metadata & other state on an NFS mount from the mgmt
node for easy booting) on our SGI and IBM hardware, and it worked
really nicely.
At least then everything should be identically broken. ;-)
(and you only need to fix something in one place)
Similar approach here at NERSC with Cray Ansible (convergent evolution).
We keep our recipes, definitions, etc. in git and reuse them across
systems (as much as possible), with config information abstracted out to
define personalities for image builds and for boot.
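
A minimal sketch of that "shared definitions plus personalities" idea,
in Python rather than Ansible and not the actual NERSC setup. Every name
here (BASE_IMAGE, PERSONALITIES, the package and system names) is made up
for illustration; the only point is that one common definition is kept in
one place and each system contributes a small set of overrides.

```python
from copy import deepcopy

# Hypothetical shared definition, reused by every system.
BASE_IMAGE = {
    "packages": ["base-tools", "monitoring-agent"],
    "services": {"sshd": True, "cron": False},
    "mounts": {},
}

# Hypothetical per-system "personalities": only the differences live here.
PERSONALITIES = {
    "system_a": {
        "packages": ["vendor-a-tools"],
        "mounts": {"/scratch": "lustre"},
    },
    "system_b": {
        "services": {"cron": True},
    },
}


def merge(base, override):
    """Return a new dict: base with override layered on top.

    Nested dicts are merged recursively, lists are extended without
    duplicates, and any other value in override replaces the base value.
    """
    result = deepcopy(base)
    for key, value in override.items():
        current = result.get(key)
        if isinstance(value, dict) and isinstance(current, dict):
            result[key] = merge(current, value)
        elif isinstance(value, list) and isinstance(current, list):
            result[key] = current + [v for v in value if v not in current]
        else:
            result[key] = deepcopy(value)
    return result


def build_config(system):
    """Combine the shared definition with one system's personality."""
    return merge(BASE_IMAGE, PERSONALITIES[system])


if __name__ == "__main__":
    for name in PERSONALITIES:
        print(name, build_config(name))
```

Because the base is copied rather than modified, a fix made in
BASE_IMAGE reaches every system the next time its config is built.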
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA