[Beowulf] Contents of Compute Nodes Images vs. Login Node Images
Chris Samuel
chris at csamuel.org
Tue Oct 23 23:34:34 PDT 2018
On Wednesday, 24 October 2018 3:15:51 AM AEDT Ryan Novosielski wrote:
> I realize this may not apply to all cluster setups, but I’m curious what
> other sites do with regard to software (specifically distribution packages,
> not a shared software tree that might be remote mounted) for their login
> nodes vs. their compute nodes.
At VLSCI we had separate xCAT package lists for both, but basically the login
node's list was a superset of the compute node list. These built RAMdisk
images, so keeping them lean (on top of what xCAT automatically strips out for
you) was important.
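The trick to keeping the two lists in step is that xCAT pkglists can include
one another, so the login list just pulls in the compute one and adds to it.
Something along these lines (the file paths and package names here are made
up for illustration):

  # compute.pkglist - lean set for the RAMdisk image
  slurm
  lustre-client

  # login.pkglist - everything compute gets, plus user-facing extras
  #INCLUDE:/install/custom/netboot/compute.pkglist#
  emacs
  subversion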
Here at Swinburne we run the same image on both, but that's a root filesystem
chroot on Lustre, so size doesn't impact memory usage (the node boots a
patched oneSIS RAMdisk that brings up OPA and mounts Lustre, then pivots over
onto the image there for the rest of the boot). The kernel has a patched
overlayfs2 module that does clever things for that part of the tree to avoid
constantly stat()ing Lustre for things it has already cached (IIRC, that's a
colleague's code).
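For those who've not seen that style of boot, the early userspace does
roughly this before the normal init takes over (the NID, fsname and paths
below are invented for the example):

  # Bring up the fabric, mount Lustre, then pivot onto the image there.
  modprobe hfi1                                   # Omni-Path HFI driver
  mount -t lustre 10.1.1.1@o2ib:/images /lustre   # invented MGS NID/fsname
  mount --bind /lustre/rootfs /newroot            # the shared root chroot
  exec switch_root /newroot /sbin/init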
We install things into the master copy of the chroot (tracked with git), then
have a script that turns the cache mode off across the cluster, rsyncs things
into the actual chroot area, does a drop_caches, and then turns the cache mode
on again.
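That script is essentially this shape. pdsh and rsync are the real tools (and
pdsh -g assumes a genders-style group for the nodes), but the knob for the
cache mode is part of our local patch, so the path below is just a stand-in:

  #!/bin/bash
  set -e
  MASTER=/cluster/master-chroot   # git-tracked master copy
  LIVE=/lustre/rootfs             # live chroot the nodes booted from

  # Stop nodes trusting their cached view of the tree (stand-in path).
  pdsh -g compute 'echo 0 > /sys/fs/overlay_cache/enabled'

  # Push the changes out, then drop cached dentries/inodes so the
  # nodes see the new files rather than stale cached ones.
  rsync -a --delete "$MASTER"/ "$LIVE"/
  pdsh -g compute 'echo 3 > /proc/sys/vm/drop_caches'

  # Caching back on.
  pdsh -g compute 'echo 1 > /sys/fs/overlay_cache/enabled'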
Hope that helps!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC