Our CrayPEToolchain EasyBlock allows for many different scenarios to generate the cpeCray/cpeGNU/cpeAOCC/cpeAMD modules:
- The cpe module can be loaded first, last or not at all. If the module is not loaded at all, it may be wise to have a different way of setting the default versions for the Cray PE modules.

    On LUMI, in the LUMI software stacks, these default versions are already set by the LUMIstack_<yy.mm>_modulerc.lua files.

    ??? Note "Early days of LUMI, 21.04"
        In version 21.04 of the CPE, the cpe modules still had several problems, partly due to an LMOD restriction and partly due to bugs in those modules:

        - The `cpe` modules set `LMOD_MODULERCFILE` through `setenv` rather than `prepend_path` or `append_path`, so they overwrite any file from other sources that sets system-wide defaults and visibility, which is not desirable.
        - A change to `LMOD_MODULERCFILE` only takes effect the next time a module command is executed. This is a restriction not only of LMOD version `8.3.x` used in the 21.04 CPE, but also of versions in the 8.4 and 8.5 series. Hence loading the `cpe/yy.mm` module before loading other versionless modules for the CPE components will not have the desired effect of loading the versions for that specific version of the CPE.
        - The `cpe` modules do contain code to reload already loaded modules from the CPE in the correct version, but that code is also broken in the `21.04` version: a module that has already been reloaded in the correct version may get reloaded once more with a versionless load, which may reload the wrong version. This is because a Lua loop with `pairs` does not go over the entries of a Lua table in a fixed order. The order should be such that no module reloads another module that has already been reloaded in the correct version.

    Note that loading the cpe module in a single module load command with other modules does not always have the effect that one would expect at first. The change of the defaults for modules only takes effect at the next module command.

- The matching PrgEnv-* module can either be loaded, or its loading can be emulated by only setting the environment variable that this module sets, while relying on the cray_targets variable and the dependencies list to load all Cray PE components.

    The reason to avoid loading the PrgEnv-* module is reproducibility. That module depends on a file in /etc to define the components that will be loaded, and that file cannot distinguish between versions of the CPE. Hence any change to that file affects all cpe* modules that EasyBuild may already have generated.

    However, in more recent versions of the Cray PE the compiler modules themselves also force a load of the corresponding PrgEnv-* module, so there is no other choice than to also load these modules. (It is not clear when this started as it went unnoticed at first, but this was definitely the case from 24.03 onwards.)

- It is possible to specify target modules via cray_targets. This is a list, just like the dependencies. They will be loaded after the PrgEnv-* module (if the latter is loaded) but before the other dependencies specified by dependencies. They do not need to be defined in the EasyBuild external modules file. We chose to load them after the PrgEnv-* module (if the latter is loaded) to be able to overwrite Cray targeting modules loaded by the latter.

- Dependencies in this case will be external modules. It is possible to specify versions by using, e.g.,

        ('gcc/9.3.0', EXTERNAL_MODULE)

    Versions should be specified if the cpe module is not loaded, even on LUMI. If a user executes

        module load LUMI/21.04 cpeGNU/21.04

    the wrong versions of CPE components might be loaded because of the same LMOD restriction that causes the problems with the Cray cpe/yy.mm modules: The LUMI/yy.mm module will add a file that sets the default versions of CPE components for the requested LUMI software stack and matching CPE version, but those changes only take effect at the following module command, so the cpeGNU/21.04 module which is loaded in the above example will not yet see the correct default versions of the modules.

    Note also that if versions are specified but the cpe module is loaded at the end, modules might be reloaded in a different version.

- The default values for the various parameters are chosen to generate module files that are as similar as possible to those used at CSCS (or at least those used for their 20.04 environment), but are not the defaults initially used on LUMI.
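The delayed effect of changed default versions can be made concrete with a toy model. The sketch below is not LMOD code; it only assumes, as described above, that default versions are read at the start of a module command and that changes made during the command become visible at the next one. The class `ToyModuleSystem`, the table `STACK_DEFAULTS` and the system default `10.2.0` are made up for illustration, and a versionless `gcc` load stands in for what a cpeGNU module without hard-coded versions would do.

```python
# Hypothetical data: defaults that a software stack module puts in place.
STACK_DEFAULTS = {"LUMI/21.04": {"gcc": "9.3.0"}}


class ToyModuleSystem:
    """Toy model: default versions are frozen for the duration of one command."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)  # defaults currently in effect
        self.loaded = []

    def command_load(self, *names):
        """Model a single `module load a b c` command."""
        snapshot = dict(self.defaults)  # read once, at the start of the command
        pending = {}
        for name in names:
            if "/" not in name:  # versionless load: resolve via the snapshot
                name = f"{name}/{snapshot[name]}"
            self.loaded.append(name)
            pending.update(STACK_DEFAULTS.get(name, {}))
        # Changed defaults only become visible to the next command.
        self.defaults.update(pending)
        return self.loaded


# One command: the versionless load still sees the old default.
one = ToyModuleSystem({"gcc": "10.2.0"})  # 10.2.0 is a made-up system default
print(one.command_load("LUMI/21.04", "gcc"))  # ['LUMI/21.04', 'gcc/10.2.0']

# Two commands: the second one sees the defaults set by the stack module.
two = ToyModuleSystem({"gcc": "10.2.0"})
two.command_load("LUMI/21.04")
print(two.command_load("gcc"))  # ['LUMI/21.04', 'gcc/9.3.0']
```

Hard-coding versions in the dependencies avoids relying on the defaults at all, which is why it is recommended when the cpe module is not loaded.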
The CrayPEToolchain EasyBlock supports the following parameters:
- PrgEnv: Sets the PrgEnv-* module to load or emulate.

    - The default is to derive the value from the name of the module to generate:

        - PrgEnv = 'cray' for cpeCray
        - PrgEnv = 'gnu' for cpeGNU
        - PrgEnv = 'aocc' for cpeAOCC
        - PrgEnv = 'amd' for cpeAMD
        - PrgEnv = 'intel' for cpeIntel
        - PrgEnv = 'nvidia' for cpeNVIDIA (not tested as we have no access to a machine with a fully working version of this environment)

        This list is no longer complete as we have no way to fully test the environments for NVIDIA, which have also changed during the lifetime of LUMI.

    - It is also possible to specify any of these values, or even a different value for a PrgEnv-* module that is not yet recognized by the EasyBlock.

- PrgEnv_load: Boolean value that indicates whether the PrgEnv module should be loaded explicitly (True) or not (False).

    Default is True.

    If you want to hard-code a version, you can do so by specifying the module with the version in the dependencies.

    It is important that all cpe* modules available in the system at the same time are generated with the same setting for PrgEnv_load, as otherwise the conflict resolution between those modules would not work correctly.

- PrgEnv_family:

    - If cpeToolchain, the module will declare itself a member of the cpeToolchain family. If all cpe* modules are generated that way, this will ensure that no two different cpe* modules will be loaded simultaneously, which wouldn't work correctly anyway with the Cray compiler wrappers.

        If PrgEnv_load is False, it will also force unload all PrgEnv-* modules to ensure that none is loaded. Otherwise it relies on the family mechanism used in the LMOD PrgEnv-* modules to do the job.

        This is the most robust option when explicitly loading a PrgEnv-* module and using LMOD, as LMOD will then ensure that no two cpe* modules will be loaded simultaneously and the family mechanism used in the Cray PrgEnv-* modules will do the same for those modules.

    - If PrgEnv, the module will declare itself a member of the PrgEnv family. This will generate an error if PrgEnv_load is True as one cannot load two modules of the same family, but it is the most robust option when using LMOD and emulating the PrgEnv-* module.

        The LMOD family feature will take care of unloading all other PrgEnv-* or cpe* modules as they would conflict with the current module.

    - If None (default), which is the only setting that works when TCL-based modules are used and is therefore the default, the module will start with unload commands for all known PrgEnv-* and all cpe* modules except itself and the PrgEnv-* module that it uses (if it uses one).

    It is important that all cpe* modules available in the system at the same time are generated with the same setting for PrgEnv_family, as otherwise the conflict resolution between those modules would not work correctly.

- CPE_compiler specifies the (versionless) compiler to load. Possible values are:

    - None (default): Derive the name of the compiler module from the name of the module to generate. This may not yet work for cpeNVIDIA as it is not clear what the name of the compiler module will be.

        It will not add an additional load if that compiler module is already specified in the dependencies.

        Note that this will load the module without specifying the version, so it only makes sense to rely on the autodetect feature if the cpe module is loaded (and if the bugs with that one are fixed).

    - Any other value will be considered the name of the compiler module to load. The module should be versionless. If you want to specify a version, you can do so via dependencies.

        No separate load will be generated if the compiler module is also found in the list of dependencies.

- CPE_version: Version of the cpe module to use (if it is used). Possible values:

    - None (default): Determine the version from the version of the module to generate, i.e., the version parameter in the EasyConfig.
    - Any other value is interpreted as the version to load.

- CPE_load: Possible values:

    - first (default): Load as the very first module. This does not make sense until the LMOD problems with LMOD_MODULERCFILE are fixed.
    - after: Load immediately after loading PrgEnv-* but before loading any other module. This does not make too much sense until the LMOD problems with LMOD_MODULERCFILE are fixed, but it could be a way to first load modules the Cray way and then correct by manually loading correct versions via the cray_targets and dependencies parameters.

        This value will produce an error message when PrgEnv_load is set to False.

    - last: Load as the last module. In earlier versions of the programming environment, this did not make sense due to issues with the cpe module and, on LUMI, an issue with overwriting LMOD_MODULERCFILE.
    - None: Do not load the cpe module but rely on explicit dependencies specified in the list of dependencies instead.

- cray_targets: A list of Cray targeting modules to load.

- dependencies: This is a standard EasyConfig parameter. The versions of the selected PrgEnv, compiler and craype module can be specified through dependencies but those modules will still be loaded according to the scheme below. Any redefinition of the cpe module is discarded.
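The defaults and the two error conditions described for these parameters can be summarised in a few lines of Python. This is a sketch of the documented behaviour only, not the code of the EasyBlock; the function names `derive_prgenv` and `check_parameters` and the error texts are hypothetical.

```python
# Mapping documented for the PrgEnv parameter (module name -> PrgEnv value).
KNOWN_PRGENV = {
    "cpeCray": "cray",
    "cpeGNU": "gnu",
    "cpeAOCC": "aocc",
    "cpeAMD": "amd",
    "cpeIntel": "intel",
    "cpeNVIDIA": "nvidia",
}


def derive_prgenv(module_name, prgenv=None):
    """Return the PrgEnv value: the explicit one if given, else derived from the name."""
    if prgenv is not None:
        return prgenv  # any value is accepted, also ones not in KNOWN_PRGENV
    if module_name not in KNOWN_PRGENV:
        raise ValueError(f"cannot derive PrgEnv from module name {module_name!r}")
    return KNOWN_PRGENV[module_name]


def check_parameters(prgenv_load=True, prgenv_family=None, cpe_load="first"):
    """Return a list of problems with a parameter combination (empty if none)."""
    problems = []
    if prgenv_family not in (None, "cpeToolchain", "PrgEnv"):
        problems.append(f"unknown PrgEnv_family {prgenv_family!r}")
    if cpe_load not in ("first", "after", "last", None):
        problems.append(f"unknown CPE_load {cpe_load!r}")
    # Two modules of the same family cannot be loaded together.
    if prgenv_family == "PrgEnv" and prgenv_load:
        problems.append("PrgEnv_family = 'PrgEnv' requires PrgEnv_load = False")
    # 'after' means "after PrgEnv-*", so that module has to be loaded.
    if cpe_load == "after" and not prgenv_load:
        problems.append("CPE_load = 'after' requires PrgEnv_load = True")
    return problems


print(derive_prgenv("cpeGNU"))  # gnu
print(check_parameters(prgenv_load=False, prgenv_family="PrgEnv", cpe_load=None))  # []
```

The keyword defaults in `check_parameters` mirror the documented defaults (PrgEnv_load True, PrgEnv_family None, CPE_load first), so calling it without arguments checks the default configuration.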
The modules are loaded in the following order:

- The cpe/<CPE_version> module, if CPE_load is first.

    If LMOD would be modified to honour changes to LMOD_MODULERCFILE immediately, as it does with changes to MODULEPATH, this would be the best moment to load the cpe module as it ensures that all other packages would be loaded with the correct version number immediately.

- The PrgEnv-<PrgEnv> module, if PrgEnv_load is True.

- The targeting modules specified by cray_targets. Hence they can overwrite the targets set by the PrgEnv-* module, which may be useful on a heterogeneous system should there only be a single configuration for the PrgEnv-* modules for all hardware partitions in the system, or to build a cpe* module for cross-compiling.

    Note that changes to the targeting modules may trigger reloads of other modules loaded by the PrgEnv-* module.

- The CPE_compiler module (or the autodetected one), unless both PrgEnv-* is loaded explicitly and the module is not in the list of dependencies (in which case we rely on the PrgEnv-* module to do the proper job).

- The craype module (compiler wrappers), unless both PrgEnv-* is loaded explicitly and the module is not in the list of dependencies (in which case we rely on the PrgEnv-* module to do the proper job).

- The specified dependencies, minus the cpe/*, PrgEnv-* and craype/* modules.

- The cpe/<CPE_version> module, if CPE_load is last.

    In principle this should reload any module loaded before in a version that does not match the selected Cray PE version, and hence will also overwrite versions set in the dependencies. However, in the Cray PE 21.04 release (which was used for early development and testing) the module did not always do the reloads in the proper order to always ensure the right version, and one might even end up with a version that is neither the one specified in the dependencies nor the one specified by the cpe/* module.
This is the default configuration for this EasyBlock.
A minimal EasyConfig (omitting some mandatory parts such as the homepage and description
parameters) is
easyblock = 'CrayPEToolchain'
name = 'cpeGNU'
version = "21.04"
toolchain = SYSTEM
moduleclass = 'toolchain'
This generates a module file that activates the toolchain by only loading the
cpe/21.04 and PrgEnv-gnu modules (in that order). Unfortunately, this scheme
does not work as expected, as LMOD_MODULERCFILE is only
honoured at the next module call. If the effect of LMOD_MODULERCFILE were
immediate, this would probably be the most efficient way of activating a particular
release of a particular PrgEnv. The module does not belong to any family. Instead it
explicitly unloads other cpe* modules.
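The load order described for this EasyBlock can be checked against an EasyConfig with a small helper. The sketch below is a simplified reading of the documented scheme, not the EasyBlock implementation. The function `load_sequence` and its arguments are hypothetical, the compiler module name is passed in explicitly instead of being derived, and skipping dependencies that were already loaded in an earlier step is an assumption made to avoid duplicates.

```python
def base_name(module):
    """'gcc/9.3.0' -> 'gcc'; 'xpmem' -> 'xpmem'."""
    return module.split("/", 1)[0]


def pick(dependencies, name):
    """Return the dependency entry for `name` (with its version, if any), else None."""
    for dep in dependencies:
        if base_name(dep) == name:
            return dep
    return None


def load_sequence(cpe_version, prgenv, compiler, prgenv_load=True,
                  cpe_load="first", cray_targets=(), dependencies=()):
    """Return the module names in the order in which they would be loaded."""
    seq = []
    cpe = f"cpe/{cpe_version}"
    if cpe_load == "first":
        seq.append(cpe)
    if prgenv_load:
        name = f"PrgEnv-{prgenv}"
        seq.append(pick(dependencies, name) or name)  # versioned if in dependencies
        if cpe_load == "after":
            seq.append(cpe)
    seq.extend(cray_targets)
    # Compiler and craype: skipped only if PrgEnv-* is loaded AND they are
    # not listed in the dependencies.
    for name in (compiler, "craype"):
        dep = pick(dependencies, name)
        if dep is not None:
            seq.append(dep)
        elif not prgenv_load:
            seq.append(name)
    # Remaining dependencies; any cpe entry is discarded (assumption: entries
    # already loaded above are not loaded a second time).
    for dep in dependencies:
        name = base_name(dep)
        if name == "cpe" or name.startswith("PrgEnv-") or dep in seq:
            continue
        seq.append(dep)
    if cpe_load == "last":
        seq.append(cpe)
    return seq


# The minimal EasyConfig: all defaults, no targets, no dependencies.
print(load_sequence("21.04", "gnu", "gcc"))  # ['cpe/21.04', 'PrgEnv-gnu']
```

With `prgenv_load=False`, `cpe_load=None` and versioned dependencies, the same helper returns only the targeting modules followed by the versioned components, which is the fully hard-coded style of setup.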
Now we first load a PrgEnv-* module and only subsequently the cpe/yy.mm module
that fixes versions for the modules.
easyblock = 'CrayPEToolchain'
name = 'cpeGNU'
version = "21.04"
toolchain = SYSTEM
PrgEnv_family = 'cpeToolchain'
CPE_load = 'after'
moduleclass = 'toolchain'
This generates a module file that activates the toolchain by first loading the PrgEnv-gnu
module and then correcting the versions by loading cpe/21.04. This doesn't work
reliably either due to the current design of the module reloading process in the cpe/21.04
module combined with the delayed impact of changes to LMOD_MODULERCFILE.
The module will belong to the cpeToolchain family. That family will take care of
unloading any other cpe* module that would be loaded (provided the PrgEnv_family
parameter was set the same way in their EasyConfigs), while the PrgEnv-gnu module
will take care of unloading other PrgEnv-* modules through the PrgEnv family.
On LUMI, due to the problems with LMOD and the cpe modules, we initially used a setup
without PrgEnv-* or cpe module. One of the functions of the cpe module,
setting the default versions of the Cray PE components, is already done by the LUMI
module that loads the software stack. The other is replaced by hard-coding the necessary
versions in the EasyConfig. One of the functions of the PrgEnv-* modules, setting
an environment variable that tells the compiler wrappers which PE is selected, is
taken over by the EasyBlock which sets the variable in the module file that it generates.
The other, loading the correct targets and other PE modules, is taken over by the cray_targets
parameter and the dependency list. This is the most reproducible setup as it only depends
on versioned components (the partition module already ensures that a particular version
of the Cray targeting modules is made available).
easyblock = 'CrayPEToolchain'
name = 'cpeGNU'
version = "21.04"
toolchain = SYSTEM
PrgEnv_load = False
PrgEnv_family = 'PrgEnv'
CPE_load = None
cray_targets = [
'craype-x86-rome',
'craype-accel-host',
'craype-network-ofi'
]
dependencies = [
('gcc/9.3.0', EXTERNAL_MODULE),
('craype/2.7.6', EXTERNAL_MODULE),
('cray-mpich/8.1.4', EXTERNAL_MODULE),
('cray-libsci/21.04.1.1', EXTERNAL_MODULE),
('cray-dsmml/0.1.4', EXTERNAL_MODULE),
('perftools-base/21.02.0', EXTERNAL_MODULE),
('xpmem', EXTERNAL_MODULE),
]
moduleclass = 'toolchain'
The cpeGNU module generated by this EasyConfig will be unloaded if the user would
load a PrgEnv-* module as it is also a member of the PrgEnv family. As such
it is a full replacement of the Cray PrgEnv-gnu module.
In recent versions of the PE (definitely from 24.03 on but likely earlier already)
this no longer works as expected, as the compiler module still forces the load
of the corresponding PrgEnv-* module. In fact, this is a case where Lmod could be
convinced to load two modules of the same family, which is why it went unnoticed at first.
A compromise solution that will work around the problems with LMOD and the cpe
modules yet retain much of the spirit of the Cray PE, and that also can correct the
targeting modules should the PrgEnv-* module not take the ones that you want
(or ensure that at least certain other modules are loaded, even if they would be
removed from the list of modules loaded by PrgEnv-gnu in an update of the system), is
the following setup:
easyblock = 'CrayPEToolchain'
name = 'cpeGNU'
version = '21.04'
toolchain = SYSTEM
CPE_load = 'first'
PrgEnv_load = True
PrgEnv_family = 'cpeToolchain'
cray_targets = [
'craype-x86-rome',
'craype-accel-host',
'craype-network-ofi'
]
dependencies = [
('PrgEnv-gnu/8.0.0', EXTERNAL_MODULE),
('gcc/9.3.0', EXTERNAL_MODULE),
('craype/2.7.6', EXTERNAL_MODULE),
('cray-mpich/8.1.4', EXTERNAL_MODULE),
('cray-libsci/21.04.1.1', EXTERNAL_MODULE),
('cray-dsmml/0.1.4', EXTERNAL_MODULE),
('perftools-base/21.02.0', EXTERNAL_MODULE),
('xpmem', EXTERNAL_MODULE),
]
moduleclass = 'toolchain'
This setup will first load the cpe/21.04 and PrgEnv-gnu/8.0.0 modules to stay in
the Cray PE spirit. Next the indicated targeting modules will be loaded, one for the
CPU, one for the accelerator architecture and one for the network. This may trigger
reloads of some other modules and will overwrite targeting modules of the same type
loaded by PrgEnv-gnu. Finally, the gcc compiler module, the craype module and all
other modules from the dependency list are loaded with the versions specified.
This setup is a compromise that on one hand stays close to the Cray PE spirit by using
the cpe and PrgEnv-gnu modules, yet works around some problems, namely:
- Setting LMOD_MODULERCFILE does not work immediately.
- Any corrective action when loading cpe after PrgEnv-gnu does not work.
- On a heterogeneous cluster, the targeting modules loaded by PrgEnv-gnu may not be the ones you want when cross-compiling or when the system would use the same file defining the modules for the whole system.
- The list of modules loaded by PrgEnv-gnu may change as it is determined by a single file on the system that does not depend on the version of the Cray PE. In this case, you can always be sure that at least the modules mentioned in the dependency list and cray_targets parameter will be loaded.
A variant of this would set CPE_load = 'after' which would load the cpe/21.04
module immediately after loading PrgEnv-gnu rather than just before, but with the
current flaws of the cpe/21.04 module this still does not solve all problems:
easyblock = 'CrayPEToolchain'
name = 'cpeGNU'
version = '21.04'
toolchain = SYSTEM
CPE_load = 'after'
PrgEnv_load = True
PrgEnv_family = 'cpeToolchain'
cray_targets = [
'craype-x86-rome',
'craype-accel-host',
'craype-network-ofi'
]
dependencies = [
('PrgEnv-gnu/8.0.0', EXTERNAL_MODULE),
('gcc/9.3.0', EXTERNAL_MODULE),
('craype/2.7.6', EXTERNAL_MODULE),
('cray-mpich/8.1.4', EXTERNAL_MODULE),
('cray-libsci/21.04.1.1', EXTERNAL_MODULE),
('cray-dsmml/0.1.4', EXTERNAL_MODULE),
('perftools-base/21.02.0', EXTERNAL_MODULE),
('xpmem', EXTERNAL_MODULE),
]
moduleclass = 'toolchain'
This is yet another compromise scenario:
- Loading cpe/yy.mm first ensures that further modules a user might load after loading the cpe* module will load in the proper versions if a user does a versionless load.
- Mimicking PrgEnv-* and loading modules explicitly ensures reproducibility over time as the list of modules loaded does not depend on a single file elsewhere in the system configuration which is not specific to a particular release of the PE.
- Hard-coding the versions ensures that we avoid the problems caused by the implementation of the cpe/yy.mm modules (certainly in releases up to and including 21.06).
easyblock = 'CrayPEToolchain'
name = 'cpeGNU'
version = '21.04'
toolchain = SYSTEM
PrgEnv_load = False
PrgEnv_family = 'PrgEnv'
CPE_load = 'first'
cray_targets = [
'craype-x86-rome',
'craype-accel-host',
'craype-network-ofi'
]
dependencies = [
('PrgEnv-gnu/8.0.0', EXTERNAL_MODULE),
('gcc/9.3.0', EXTERNAL_MODULE),
('craype/2.7.6', EXTERNAL_MODULE),
('cray-mpich/8.1.4', EXTERNAL_MODULE),
('cray-libsci/21.04.1.1', EXTERNAL_MODULE),
('cray-dsmml/0.1.4', EXTERNAL_MODULE),
('perftools-base/21.02.0', EXTERNAL_MODULE),
('xpmem', EXTERNAL_MODULE),
]
moduleclass = 'toolchain'
This would be a valid scenario once the cpe/yy.mm modules have been corrected and
work as they should. In this scenario,
- We mimic PrgEnv-* by setting the necessary environment variables and then loading a list of versionless modules. This avoids a problem with the actual PrgEnv modules as the list of modules they load depends on a single system file which is the same for all releases of the PE and hence may change over time.
- At the end the relevant cpe/yy.mm module is loaded to fix the versions of all already loaded modules.
The corresponding EasyConfig file (minus help etc.) is:
easyblock = 'CrayPEToolchain'
name = 'cpeGNU'
version = '21.04'
toolchain = SYSTEM
PrgEnv_load = False
PrgEnv_family = 'PrgEnv'
CPE_load = 'last'
cray_targets = [
'craype-x86-rome',
'craype-accel-host',
'craype-network-ofi'
]
dependencies = [
('gcc', EXTERNAL_MODULE),
('craype', EXTERNAL_MODULE),
('cray-mpich', EXTERNAL_MODULE),
('cray-libsci', EXTERNAL_MODULE),
('cray-dsmml', EXTERNAL_MODULE),
('perftools-base', EXTERNAL_MODULE),
('xpmem', EXTERNAL_MODULE),
]
moduleclass = 'toolchain'
As there is no longer any way to avoid using the PrgEnv-* modules, we
switched to the following scheme from 24.11 onwards:
easyblock = 'CrayPEToolchain'
name = 'cpeGNU'
version = '21.04'
toolchain = SYSTEM
PrgEnv_load = True
PrgEnv_family = 'cpeToolchain'
CPE_load = None
cray_targets = [
'craype-x86-rome',
'craype-accel-host',
'craype-network-ofi'
]
dependencies = [
('PrgEnv-gnu/8.6.0', EXTERNAL_MODULE),
('gcc-native/14.2', EXTERNAL_MODULE),
('craype/2.7.34', EXTERNAL_MODULE),
('cray-mpich/8.1.32', EXTERNAL_MODULE),
('cray-libsci/25.03.0', EXTERNAL_MODULE),
('cray-dsmml/0.3.1', EXTERNAL_MODULE),
('perftools-base/25.03.0', EXTERNAL_MODULE),
('xpmem', EXTERNAL_MODULE),
]
moduleclass = 'toolchain'
(Note that this is a simplified file.)
- We avoid using the cpe modules as they continue to behave a bit strangely and also print a confusing message at unload.
- We now explicitly load the PrgEnv-* module that is needed as the first module in the dependencies.
- But we still also explicitly load all other modules that we really want, all with the correct version (except for xpmem as its version changes with system updates and not PE updates), to ensure that we always get the expected version of at least these modules.
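Since this scheme relies on every dependency except xpmem carrying an explicit version, a quick check of a dependency list can catch a forgotten version. The helper below is an illustration only: `unversioned` is a hypothetical function, and `EXTERNAL_MODULE` is defined here as a plain stand-in for the constant that is available inside a real EasyConfig.

```python
# Stand-in for the constant that EasyBuild provides inside an EasyConfig file.
EXTERNAL_MODULE = "EXTERNAL_MODULE"


def unversioned(dependencies, allowed=("xpmem",)):
    """Return the names of dependencies without a version, ignoring `allowed`."""
    return [
        name
        for name, _marker in dependencies
        if "/" not in name and name not in allowed
    ]


# Dependency list taken from the EasyConfig above.
dependencies = [
    ("PrgEnv-gnu/8.6.0", EXTERNAL_MODULE),
    ("gcc-native/14.2", EXTERNAL_MODULE),
    ("craype/2.7.34", EXTERNAL_MODULE),
    ("cray-mpich/8.1.32", EXTERNAL_MODULE),
    ("cray-libsci/25.03.0", EXTERNAL_MODULE),
    ("cray-dsmml/0.3.1", EXTERNAL_MODULE),
    ("perftools-base/25.03.0", EXTERNAL_MODULE),
    ("xpmem", EXTERNAL_MODULE),
]

print(unversioned(dependencies))  # []
```

An empty list means that every module other than xpmem is pinned to a version, so the generated module does not depend on the default versions in effect when it is loaded.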