Add configure target and sample jobscript for Helma CPU cluster. #635

Merged: fassaad merged 2 commits into master from helma-cpu-cluster on Jun 3, 2026

@jonasschwab

Member

Also adds libfakeintel.so to spoof Intel CPUs for MKL. See e.g. https://danieldk.eu/Software/Misc/Intel-MKL-on-AMD-Zen

Copilot AI left a comment

Contributor

Pull request overview

Adds support for building/running ALF on the NHR@FAU Helma CPU cluster and introduces an optional MKL “fake Intel CPU” preload library to improve MKL performance on AMD CPUs.

Changes:

  • Add HELMA machine configuration to configure.sh (module setup + IntelLLVM toolchain + MKL/HDF5 wiring).
  • Add Scripts_and_Parameters_files/JobfileHelma.sh sample Slurm job script (incl. optional LD_PRELOAD for MKL spoofing).
  • Add Libraries/Modules/fakeintel.c + Makefile rule to build libfakeintel.so; also update .gitignore and fix typos in existing job scripts.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| configure.sh | Adds HELMA machine case to set modules/compiler flags and HDF5 handling. |
| Scripts_and_Parameters_files/JobfileSuperMUC.sh | Comment typo fixes. |
| Scripts_and_Parameters_files/JobfileSuperMUC-NG.sh | Comment typo fix. |
| Scripts_and_Parameters_files/JobfileFritz.sh | Comment typo fix. |
| Scripts_and_Parameters_files/JobfileHelma.sh | New Helma Slurm job script + optional LD_PRELOAD guidance for MKL spoofing. |
| Libraries/Modules/fakeintel.c | New C shim exporting MKL “Intel CPU true” symbols for preloading. |
| Libraries/Modules/Makefile | Builds libfakeintel.so and adds it to the lib target; cleans it. |
| .gitignore | Ignores generated *.so artifacts. |
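
For readers unfamiliar with this kind of shim, the following is a rough sketch of how such a library could be written and built by hand. The C source follows the approach described in the article linked in the PR description and is an assumption, not a copy of ALF's fakeintel.c (which may export additional symbols). The helper names write_fakeintel_source and build_fakeintel, and the default compiler cc, are hypothetical.

```shell
# Write a minimal shim source to the file given as $1.
# Assumed content: a single function that MKL consults to decide
# whether it is running on an Intel CPU.
write_fakeintel_source() {
    cat > "$1" <<'EOF'
/* Report "Intel CPU" to MKL so that it selects its optimized code paths. */
int mkl_serv_intel_cpu_true(void) { return 1; }
EOF
}

# Compile source file $1 into shared object $2.
# CC is an assumed override; it falls back to the system cc.
build_fakeintel() {
    "${CC:-cc}" -shared -fPIC -o "$2" "$1"
}
```

Calling `write_fakeintel_source fakeintel.c` followed by `build_fakeintel fakeintel.c libfakeintel.so` would produce the shared object. In ALF itself, the Makefile rule added by this PR takes care of the build.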

@johanneshofmann87

Contributor

Hi @jonasschwab I just stumbled across this PR, and we might be suffering similar performance issues. Quick questions: As I understand it, one needs to use the preload mechanism, which I couldn't find in the HELMA script. If we want to adapt our local cluster scripts, where do I have to put the preloading? Or is something else taking care of it?

@jonasschwab

Member Author

> Hi @jonasschwab I just stumbled across this PR, and we might be suffering similar performance issues. Quick questions: As I understand it, one needs to use the preload mechanism, which I couldn't find in the HELMA script. If we want to adapt our local cluster scripts, where do I have to put the preloading? Or is something else taking care of it?

To preload the shared object file, one just needs to set the environment variable LD_PRELOAD to the absolute path of the file. The Helma sample jobscript contains the following line, commented out:

export LD_PRELOAD=/absolute/path/to/ALF/Libraries/Modules/libfakeintel.so
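
When adapting this to another cluster's jobscript, it may help to guard the preload so the job still runs if the shim has not been built. The following is a minimal sketch under that assumption; the function name fakeintel_preload_value is hypothetical and not part of ALF, and the path is the placeholder from the sample jobscript.

```shell
# Print the LD_PRELOAD value to use: the shim ($1) first, followed by any
# existing entries ($2, optional), separated by a colon.
# Prints a warning and returns non-zero if the shim file does not exist.
fakeintel_preload_value() {
    local lib="$1" existing="${2:-}"
    if [ ! -f "$lib" ]; then
        echo "warning: $lib not found; not preloading" >&2
        return 1
    fi
    printf '%s\n' "${lib}${existing:+:${existing}}"
}

# Possible use in a jobscript, before the line that launches the program:
# if value="$(fakeintel_preload_value /absolute/path/to/ALF/Libraries/Modules/libfakeintel.so "${LD_PRELOAD:-}")"; then
#     export LD_PRELOAD="$value"
# fi
```

The variable has to be exported before the executable is launched, since the dynamic linker reads LD_PRELOAD at program start.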

@jonasschwab

Member Author

Since I already have you here @johanneshofmann87: Where do you think is the best location to put fakeintel.c / libfakeintel.so? I put it in Libraries/Modules and it gets automatically compiled with all the rest, but conceptually it might better fit into Scripts_and_Parameters_files/.

@johanneshofmann87

Contributor

> Since I already have you here @johanneshofmann87: Where do you think is the best location to put fakeintel.c / libfakeintel.so? I put it in Libraries/Modules and it gets automatically compiled with all the rest, but conceptually it might better fit into Scripts_and_Parameters_files/.

Hmm, I see your point, but I'm not sure I like Scripts_and_Parameters_files/ better. While it is kind of a runtime file, it is more of a (fake) library than a script (or parameter file). We could keep it in Libraries/, but move it one level up? Not sure if this is sooo much better...

@fassaad

fassaad commented Apr 29, 2026

Member

Helma is down for the next few weeks, so we will wait until the machine is up again to test and then merge.

@fassaad fassaad self-assigned this Jun 3, 2026
@fassaad

fassaad commented Jun 3, 2026

Member

The group here has been using this, and it works just fine. Hence I think that we could merge.

@fassaad fassaad self-requested a review June 3, 2026 15:16
@fassaad fassaad enabled auto-merge June 3, 2026 15:17

@fassaad fassaad left a comment

Member

All good.

@fassaad fassaad merged commit a80f979 into master Jun 3, 2026
25 checks passed
@fassaad fassaad deleted the helma-cpu-cluster branch June 3, 2026 15:19

4 participants