
Replace costly call to 'scontrol show config' with pmix env vars #204

Open
nvcastet wants to merge 1 commit into NVIDIA:main from nvcastet:rm_scontrol_config


nvcastet commented Aug 8, 2024

scontrol show config issues RPC calls to the master node.
When Slurm is configured with per-user RPC rate limiting (rl_enable), the command can be throttled. Because it is called in a hook that runs once per rank, this throttling causes a large variance in rank start times.
This PR uses environment variables to get the information the PMI hook needs instead of calling scontrol show config.
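
To illustrate the idea only, here is a minimal Python sketch of reading values from the environment rather than issuing a controller RPC. It is not the code in this PR: the variable names `PMIX_SERVER_TMPDIR` and `PMIX_SYSTEM_TMPDIR` and the helper `pmix_dirs_from_env` are assumptions chosen for the example, since the description above only says "pmix env vars".

```python
import os
from typing import Dict, Mapping, Optional

# Assumed variable names for illustration; the PR description does not list them.
ASSUMED_PMIX_VARS = ("PMIX_SERVER_TMPDIR", "PMIX_SYSTEM_TMPDIR")


def pmix_dirs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the assumed PMIx variables that are set and non-empty.

    Reading the process environment is a purely local operation, so it
    cannot be throttled by per-user RPC rate limiting on the controller.
    """
    env = os.environ if env is None else env
    return {name: env[name] for name in ASSUMED_PMIX_VARS if env.get(name)}


if __name__ == "__main__":
    # Hypothetical hook usage: print whatever is available in this rank's environment.
    for name, value in pmix_dirs_from_env().items():
        print(f"{name}={value}")
```

A lookup like this costs the same on every rank, which is the property the description relies on to remove the start-time variance.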

CC @flx42 @3XX0

3XX0 (Member) commented Oct 24, 2024

I would have to look at it again, but there were good reasons not to do this.
