[Beowulf] experience with HPC running on OpenStack
Chris Samuel
chris at csamuel.org
Tue Jun 30 22:05:11 PDT 2020
On 29/6/20 5:09 pm, Jörg Saßmannshausen wrote:
> we are currently planning a new cluster and this time around the idea was to
> use OpenStack for the HPC part of the cluster as well.
>
> I was wondering if somebody has some first hand experiences on the list here.
At $JOB-2 I helped a group set up a cluster on OpenStack (they were
resource constrained, they had access to OpenStack nodes and that was
it). In my experience it was just another layer of complexity with no
added benefit, and it resulted in a number of outages due to failures in
the OpenStack layers underneath.
Given that Slurm, which was being used there, already had mature cgroups
support, there really was no advantage to them in having a layer of
virtualisation on top of the hardware. That was especially so as (if I'm
remembering properly) in the early days the virtualisation layer didn't
properly understand the Intel CPUs we had, and so didn't expose the
correct capabilities to the VM.
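If you want to check for that sort of capability mismatch yourself, one rough way (a sketch only, not what was done on that cluster) is to compare the CPU feature flags the guest sees with those on the underlying host. The Python below assumes x86 Linux, where /proc/cpuinfo carries a "flags" line; the function names cpu_flags and missing_in_guest are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from the first 'flags' line.

    Assumes the x86 Linux /proc/cpuinfo layout ("flags : fpu vme ...").
    Returns an empty set if no such line is present.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_in_guest(host_cpuinfo, guest_cpuinfo):
    """List flags the host reports that the guest does not, sorted by name."""
    return sorted(cpu_flags(host_cpuinfo) - cpu_flags(guest_cpuinfo))


# Example use, with copies of /proc/cpuinfo saved from each side
# (the file names here are placeholders):
#   with open("host.cpuinfo") as h, open("guest.cpuinfo") as g:
#       print(" ".join(missing_in_guest(h.read(), g.read())))
```

Anything it lists (AVX variants, say) is a feature that code built for the host CPU may expect but the VM does not advertise.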
All that said, these days it has likely improved, and I know that back
then people were thinking about OpenStack "Ironic", which was a way for
it to manage bare metal nodes.
But I do know the folks in question eventually managed to move to a
purely bare metal solution and seemed a lot happier for it.
As for IB, I suspect that depends on the capabilities of your
virtualisation layer, but I do believe that is quite possible. This
cluster didn't have IB (when they started getting bare metal nodes they
went with RoCE instead).
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA