[Beowulf] HPC meets agentic AI tools, any advice / thoughts?

John Hearns hearnsj at gmail.com
Tue May 19 17:54:31 UTC 2026


Posit workbench? Damn... I thought you were talking about the Posit number system.

On Tue, 19 May 2026 at 14:24, Chris Dag via Beowulf <beowulf at beowulf.org>
wrote:

> I'm finding the agentic stuff (Claude Code within VSCode IDE)
> surprisingly useful for HPC operations, support and troubleshooting,
> especially when paired with a RAG that I stuffed full of HPC specific
> documentation on slurm, relion, cryosparc, schrodinger, posit workbench and
> AWS parallelcluster details -- basically the RAG is full of the stuff I
> need to do every day.
>
> My guardrails are usually:
>
>  - The https://hpc-mcp.apps.bioteam.cloud/ custom RAG; I force my agent
> to challenge itself against the RAG for every determination, config
> suggestion or parameter setting
>
>  - A read-only mode for running slurm commands or otherwise poking at the
> filesystem and logs; sudo acts require manual human review before
> proceeding
>
>  - A limited write-allowed mode so it can submit slurm jobs as me, run a
> schrodinger job with a license request or pull a schrodinger job failure
> postmortem file out of the cluster
>
>  - When I want the agent to actually "do stuff" on an HPC system I'll have
> it write out a local ansible playbook or bash script or .md file
> containing instructions. I'll run local linters and static security
> scanners against bash, terraform and ansible files and then have the agent
> stage them to git or the HPC filesystem where I can manually review and
> run them.  For more complex deployments involving lots of playbooks or
> terraform I'll invoke a 5-agent "review committee" to audit the files
> before they go anywhere.
>
>  - In all cases I manually run the terraform, bash script or ansible
> playbook
>
>  - In all cases, my instructions force the LLM to query my custom hpc-docs
> MCP to challenge and verify all commands. 90% of the time, the RAG catches
> a mistake or hallucination involving parallelcluster-, slurm- or
> schrodinger-specific settings or commands and forces the agent to fix it.
>
>  - I use Claude Teams so that Anthropic does not train on our data.  I've got
> an M4 Pro mac mini connected to tailscale running a local LLM that I'll
> sometimes offload jobs to, but it's nowhere near as good as the frontier
> models and it's super slow
>
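
For anyone wanting to try the read-only / limited-write / manual-review split
described in the guardrail list above, here is a minimal Python sketch. It is
not the setup described in the quoted message: the command lists, the
`classify` function name and the three verdict strings are all assumptions
made for illustration, and a real gate would need a site-specific policy.

```python
import shlex

# Hypothetical policy tables; the thread does not list the actual commands.
READ_ONLY = {"squeue", "sinfo", "sacct", "sstat", "ls", "cat", "tail", "grep"}
WRITE_LIMITED = {"sbatch"}          # e.g. submitting jobs as the calling user
SCONTROL_READ_VERBS = {"show"}      # scontrol subcommands that only read state
SHELL_METACHARS = ";|&><`$\n"       # pipelines, redirects, substitutions


def classify(command: str) -> str:
    """Return 'allow', 'allow-write' or 'review' for one command line."""
    # Do not try to reason about compound shell syntax; send it to a human.
    if any(ch in command for ch in SHELL_METACHARS):
        return "review"
    try:
        argv = shlex.split(command)
    except ValueError:              # e.g. unbalanced quotes
        return "review"
    if not argv:
        return "review"
    prog = argv[0]
    if prog == "sudo":              # sudo always needs manual human review
        return "review"
    if prog == "scontrol":
        is_read = len(argv) > 1 and argv[1] in SCONTROL_READ_VERBS
        return "allow" if is_read else "review"
    if prog in READ_ONLY:
        return "allow"
    if prog in WRITE_LIMITED:
        return "allow-write"
    return "review"                 # default deny: unknown commands get reviewed


if __name__ == "__main__":
    for cmd in ("squeue -u alice", "sbatch job.sh", "sudo reboot"):
        print(f"{cmd!r}: {classify(cmd)}")
```

Anything the gate does not recognise, including absolute paths, pipelines and
redirects, falls through to "review", so the failure mode is an extra human
check rather than an unreviewed command.
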
> My hpc-docs rag is online here with lots of technical documentation on
> sources, ingest and architecture -- https://hpc-mcp.apps.bioteam.cloud/
>
> The actual RAG content itself is gated behind our Okta SSO server because
> I've stuffed the RAG with content that is not public (some consulting
> notes, some vendor stuff), so it's not useful to anyone else, but I'd
> love to learn what others are doing in this space.
>
> I admit I'm kinda terrified of what user-space people would do if they
> weren't paranoid and careful. I think the main risk is data loss or
> mangling on the local HPC, but there is also the data leakage risk of
> end-users talking to remote LLMs and sending info there that they should not.
>
> my $.02 only
>
>
>
>
> On Tue, May 19, 2026 at 8:14 AM Peter Clapham <pc7 at sanger.ac.uk> wrote:
>
>> OK, so sticking my neck out a little here
>>
>> How are people covering the risks from AI agentic tools across their HPC
>> platforms?
>>
>> Ducks and listens…
>>
>> Pete
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>>

