[Beowulf] Contents of Compute Nodes Images vs. Login Node Images
Prentice Bisbal
pbisbal at pppl.gov
Mon Oct 29 08:17:22 PDT 2018
On 10/28/2018 09:33 AM, Jörg Saßmannshausen wrote:
> Hi Prentice,
>
> that sounds somewhat similar to what I did back in my days at UCL:
> - login node with development packages
> - compute nodes with only what is really needed in terms of software and services
>
> However, if you are removing packages from the comps.xml file manually, how can
> you be sure you are not breaking dependencies?
Two ways:
1. If you omit a package that is a dependency but don't explicitly say
not to install it in your kickstart file, then anaconda should install
it when it resolves dependencies. Whenever I go through this process,
there are always more RPMs installed on the final system than I list,
which is a result of this dependency resolution process.
2. Testing, testing, testing! If you create a situation where a
dependency cannot be resolved, the kickstart will fail with an error,
which you should see when testing your kickstart process. A sketch of
both kinds of check follows below.
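For instance (package names here are just examples, not my actual
list), you can sanity-check a package's dependency edges with repoquery
from yum-utils before dropping it, and run the kickstart file through
ksvalidator from pykickstart:

    # what does this package need, and what needs it?
    repoquery --requires openssh-clients
    repoquery --whatrequires openssh-clients

    # catch kickstart syntax errors before a test install
    ksvalidator ks.cfg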
>
> As I was using Debian, I simply did a bare installation and then installed what
> I needed. Once I got a running OS, I rsynced that to a folder on the head node
> which was the 'image' of the compute nodes. Depending on the cluster, I built
> the software on the login node and copied it to the folder where the image
> was. So during installation via PXE boot, that image folder was copied to the
> compute nodes and I only had to install the bootloader by hand (I never really
> looked into how to script that as well). It worked quite well.
> Upgrading a software package simply meant installing it inside the image
> folder (either via chroot if it was a .deb package, or just copying the files
> over) and rsyncing it to the compute nodes.
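>
> To illustrate (paths and package names here are made up), the update
> cycle was essentially:
>
>     # install or update a package inside the image tree
>     chroot /srv/images/compute apt-get update
>     chroot /srv/images/compute apt-get install -y openmpi-bin
>
>     # push the image to a node; exclude the pseudo-filesystems and
>     # anything node-specific
>     rsync -aHAX --delete --exclude=/proc --exclude=/sys --exclude=/dev \
>         /srv/images/compute/ node001:/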
>
>
> It was a robust system and I managed to handle the 112 compute nodes I had
> quite well. I could even take care of older and newer nodes and install highly
> optimised packages on them as well. So nodes which only had AVX got only the
> AVX-enabled software, and the ones which had AVX2 got the AVX2 builds.
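>
> Selecting the right tree per node can be as simple as this (paths
> illustrative):
>
>     # pick the software tree matching the node's CPU features
>     if grep -qw avx2 /proc/cpuinfo; then
>         SWROOT=/usr/local/avx2
>     else
>         SWROOT=/usr/local/avx
>     fi
>     export PATH=$SWROOT/bin:$PATH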
>
> It might not be the most flashy solution, but it was KIS: Keep It Simple!
>
> All the best from a rainy London
>
> Jörg
>
> On Tuesday, 23 October 2018 at 13:43:43 GMT, Prentice Bisbal via
> Beowulf wrote:
>> Ryan,
>>
>> When I was at IAS, I pared down what was on the compute nodes
>> tremendously. I went through the comps.xml file practically line-by-line
>> and reduced the number of packages installed on the compute nodes to
>> only about 500 RPMs. I can't remember all the details, but I remember
>> omitting the following groups of packages:
>>
>> 1. Anything related to desktop environments, graphics, etc.
>> 2. -devel packages
>> 3. Any RPMs for wireless or bluetooth support.
>> 4. Any kind of service that wasn't strictly needed by the compute nodes.
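>>
>> As a sketch of what such a %packages section looks like (group and
>> package names are illustrative, not the actual IAS list):
>>
>>     %packages
>>     @core
>>     -@x11                # 1. no desktop environments or graphics
>>     -gcc                 # 2. no compilers or -devel packages
>>     -bluez               # 3. no bluetooth support
>>     -NetworkManager      # 3./4. no wireless, no unneeded services
>>     -avahi               # 4. no unneeded services
>>     %end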
>>
>> In this case, the users' desktops mounted the same home and project
>> directories and the shared application directory (/usr/local), so the
>> users had all the GUI, post-processing, and devel packages they needed
>> right on their desktops. The cluster was used purely for running
>> non-interactive batch jobs; in fact, there was no way for a user to even
>> get an interactive session on the cluster. IAS was a small environment
>> where I had complete control over the desktops and the cluster, so I was
>> able to do this. I would do it all again just like that, given a similar
>> environment.
>>
>> I'm currently managing a cluster with PU, and PU only puts the -devel
>> packages, etc. on the login nodes, so users can compile their apps
>> there.
>>
>> So yes, this is still being done.
>>
>> There are definitely benefits to providing specialized package lists
>> like this:
>>
>> 1. On the IAS cluster, a kickstart installation, including configuration
>> with the post-install script, was very quick - I think it was 5 minutes
>> at most.
>> 2. You generally want as few services running on your compute nodes as
>> possible. The easiest way to keep services from running on your cluster
>> nodes is to not install those services in the first place (a quick
>> audit sketch follows this list).
>> 3. Less software installed = smaller attack surface for security exploits.
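>>
>> On point 2, a quick way to see what is actually running on a
>> systemd-based node:
>>
>>     systemctl list-units --type=service --state=running   # active services
>>     ss -tlnp                                              # listening TCP sockets
>>
>> Anything in that output the node doesn't strictly need is a candidate
>> for removal from the package list.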
>>
>> Does this mean you are moving away from Warewulf, or are you creating
>> different Warewulf images for login vs. compute nodes?
>>
>>
>> Prentice
>>
>> On 10/23/2018 12:15 PM, Ryan Novosielski wrote:
>>> Hi there,
>>>
>>> I realize this may not apply to all cluster setups, but I’m curious what
>>> other sites do with regard to software (specifically distribution
>>> packages, not a shared software tree that might be remote mounted) for
>>> their login nodes vs. their compute nodes. From what I knew of the
>>> conventional wisdom, sites generally place pared-down images on their
>>> compute nodes, containing only the runtime. I’m curious to see if that’s
>>> still true, or if there are people doing something else entirely, etc.
>>>
>>> Thanks.
>>>
>>> --
>>> ____
>>>
>>> || \\UTGERS,     |---------------------------*O*---------------------------
>>> ||_// the State  |         Ryan Novosielski - novosirj at rutgers.edu
>>> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
>>> ||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
>>>      `'
>>>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf