Ticket 16278 - ntasks not automatically determined
Summary: ntasks not automatically determined
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Configuration
Version: 23.02.0
Hardware: Linux Linux
Importance: --- 4 - Minor Issue
Assignee: Oscar Hernández
Reported: 2023-03-15 10:56 MDT by Trung Nguyen
Modified: 2023-07-04 15:15 MDT
Site: University of Chicago


Attachments
job_submit.lua (3.12 KB, text/x-lua)
2023-03-18 17:41 MDT, Trung Nguyen
slurm.conf (31.17 KB, text/plain)
2023-03-18 17:42 MDT, Trung Nguyen
job_submit.lua with if num_tasks not defined (3.48 KB, text/x-lua)
2023-03-21 09:42 MDT, Trung Nguyen

Description Trung Nguyen 2023-03-15 10:56:59 MDT
Dear SchedMD support,

We have upgraded SLURM to 23.02.0 on our site, and overall everything operates as expected. However, users have reported that in the updated version `--ntasks` (SLURM_NTASKS) no longer seems to be determined automatically from $SLURM_JOB_NUM_NODES * $SLURM_NTASKS_PER_NODE.

Users now need to specify `--nodes` and/or `--ntasks-per-node` explicitly and determine ntasks inside their job scripts. In many cases, they have `--ntasks` specified in their job scripts, but neither `--nodes` nor `--ntasks-per-node`, and the jobs crashed.


Are there ways to modify SLURM settings/configuration to handle this situation without having to recompile SLURM and deploy again? Any suggestion is appreciated.

Thank you,
-Trung
 
Trung D. Nguyen, Ph.D. (he/him)
Sr. Computational Scientist, Molecular Engineering and Scientific Computing
Research Computing Center, The University of Chicago
6054 S Drexel Ave., Chicago, IL 60637
Comment 2 Oscar Hernández 2023-03-17 06:05:54 MDT
Hi Trung,

You are right: in 23.02 there has been a slight change in the way the variable SLURM_NTASKS is defined. Slurm no longer sets it in the job's environment unless the user explicitly requests the number of tasks (with -n/--ntasks). The idea is to prevent job steps from inheriting ntasks when it was not defined in the job submission[1].
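
For job scripts that cannot rely on SLURM_NTASKS being present, a script-side fallback is one option. The following is only a minimal sketch: `resolve_ntasks` is a hypothetical helper name (not part of Slurm), and it assumes SLURM_JOB_NUM_NODES and SLURM_NTASKS_PER_NODE are plain integers in the job environment, as in the original report.

```shell
#!/bin/bash
# Hypothetical helper (not provided by Slurm): print a task count,
# preferring SLURM_NTASKS and falling back to nodes * tasks-per-node.
resolve_ntasks() {
    if [ -n "${SLURM_NTASKS:-}" ]; then
        # The user passed -n/--ntasks, so Slurm exported the value.
        echo "$SLURM_NTASKS"
    elif [ -n "${SLURM_JOB_NUM_NODES:-}" ] && [ -n "${SLURM_NTASKS_PER_NODE:-}" ]; then
        # Derive it the way users expected before 23.02.
        echo $(( SLURM_JOB_NUM_NODES * SLURM_NTASKS_PER_NODE ))
    else
        # Not enough information to derive a task count.
        return 1
    fi
}

# Possible use inside a job script:
#   NTASKS=$(resolve_ntasks) || { echo "cannot determine ntasks" >&2; exit 1; }
```

This keeps the job script working whether or not the site applies a server-side workaround.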

>Are there ways to modify SLURM settings/configuration to handle this situation 
>without having to recompile SLURM and deploy again? Any suggestion is 
>appreciated.
Yes. If you want to restore the previous behavior without users noticing, I would suggest configuring "JobSubmitPlugins=lua"[2] in slurm.conf. Inside the plugin script, you can set "ntasks = nodes * tasks_per_node", so that Slurm will act as if it had been explicitly requested by the user.

I do not know whether you already use this functionality. If you are new to it, there is detailed documentation here[3]. The idea is to create a slurm_job_submit() function that does something like the following (you can adapt it to your needs):
####
function slurm_job_submit(job_desc, part_list, submit_uid)
        -- check if ntasks was defined by the user
        if job_desc.num_tasks == slurm.NO_VAL then
                -- check if ntasks_per_node and num_nodes were requested
                if job_desc.ntasks_per_node ~= slurm.NO_VAL16 and job_desc.min_nodes ~= slurm.NO_VAL then
                        -- set ntasks to the corresponding value 
                        job_desc.num_tasks=job_desc.ntasks_per_node*job_desc.min_nodes
                end
        end
        return slurm.SUCCESS
end
####

For the plugin to work, both "slurm_job_submit" and "slurm_job_modify" need to be defined. If you want no action to be taken when a job is modified, you can just define it as an empty function, e.g.:
####
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)

        return slurm.SUCCESS
end
####

After setting the new slurm.conf option and the script, running "scontrol reconfigure" should be enough for the changes to take effect.
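
Before reconfiguring, it can help to confirm that the option and the script are both in place. This is a rough sketch only: `check_job_submit_lua` is a hypothetical helper, it assumes job_submit.lua lives in the same directory as slurm.conf, and it does not follow Include directives.

```shell
#!/bin/bash
# Hypothetical pre-flight check; pass the path to slurm.conf as $1.
check_job_submit_lua() {
    conf="$1"
    dir=$(dirname "$conf")
    # slurm.conf keys are case-insensitive; look for lua in JobSubmitPlugins.
    if ! grep -Eiq '^[[:space:]]*JobSubmitPlugins[[:space:]]*=.*lua' "$conf"; then
        echo "JobSubmitPlugins does not include lua in $conf" >&2
        return 1
    fi
    # Assumption: the script sits next to slurm.conf.
    if [ ! -f "$dir/job_submit.lua" ]; then
        echo "job_submit.lua not found in $dir" >&2
        return 1
    fi
    echo "ok: lua plugin configured, script at $dir/job_submit.lua"
}

# Possible use (the path is site-specific):
#   check_job_submit_lua /etc/slurm/slurm.conf && scontrol reconfigure
```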

Let me know if you think that could cover your needs. And please, feel free to ask if you have any doubt in this regard.

Cheers,
Oscar

[1] https://github.com/SchedMD/slurm/commit/ef513023ad87a3870bf575efd2329672819c59f0
[2] https://slurm.schedmd.com/slurm.conf.html#OPT_lua
[3] https://slurm.schedmd.com/job_submit_plugins.html#lua
Comment 3 Trung Nguyen 2023-03-18 17:41:51 MDT
Created attachment 29408 [details]
job_submit.lua
Comment 4 Trung Nguyen 2023-03-18 17:42:10 MDT
Created attachment 29409 [details]
slurm.conf
Comment 5 Trung Nguyen 2023-03-18 17:58:16 MDT
Hi Oscar,

thanks for your detailed response. 

We have already specified "JobSubmitPlugins=lua" in our slurm.conf file (attached). The job_submit.lua script is attached here as well. By "Inside it", did you mean that we should set "ntasks = nodes * tasks_per_node" in this job_submit.lua script so that Slurm will act as if it was explicitly requested by the user?
And the way to set ntasks is to implement the slurm_job_submit and slurm_job_modify functions in the job_submit.lua, as you suggested?

Or should we create another lua script?

Sorry if I misunderstood anything.

Thanks,
-Trung
Comment 6 Oscar Hernández 2023-03-20 03:20:00 MDT
Hi Trung,

> We have already specified "JobSubmitPlugins=lua" in our slurm.conf file
> (attached). The job_submit.lua script is attached here as well. By "Inside
> it", did you mean that we should set "ntasks = nodes * tasks_per_node" in
> this job_submit.lua script so that Slurm will act as if it was explicitly
> requested by the user?
Yes. By setting ntasks inside the script, Slurm will behave as if it had been requested by the user and will define the variable. I tested the suggested code from the previous comment to confirm it.

> And the way to set ntasks is to implement the slurm_job_submit and
> slurm_job_modify functions in the job_submit.lua, as you suggested?
Yes, you should modify the existing script. As you already seem to have a job_submit.lua script working right now, I would suggest just to add a new 'if' block after all the other stuff you have defined in the "slurm_job_submit" function. There is no need to modify the "slurm_job_modify" function.

Some other information that might help:

- If you modify the script and by mistake save it in a broken or badly formatted state, Slurm will use the previous working version (cached version). Slurm does that to avoid any disruption in job submissions while this script is being modified. There is no need to restart/reconfigure Slurm every time the script is modified.

- With my suggested 'if' snippet, ntasks will only be set if it was not already defined by the user, and if both nnodes and ntasks_per_node are defined by the user. Feel free to modify it to fit your own rules, but take into account that all submitted jobs will go through this new logic. So please test it carefully after the modification.

My simple test script was:

###
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
##SBATCH --ntasks=8

echo "SLURM JOB NODES is $SLURM_JOB_NUM_NODES"
echo "NTASKS PER NODE is $SLURM_NTASKS_PER_NODE"
echo "NTASKS is          $SLURM_NTASKS"
###

Output with the job_submit.lua set:
SLURM JOB NODES is 2
NTASKS PER NODE is 4
NTASKS is          8

Output without having it set (your current behavior):
SLURM JOB NODES is 2
NTASKS PER NODE is 4
NTASKS is          

> Sorry if I misunderstood anything.
You got things right! Please, do feel free to ask me to clarify anything if there is any other doubt.

Kind regards,
Oscar
Comment 7 Trung Nguyen 2023-03-21 09:42:39 MDT
Created attachment 29442 [details]
job_submit.lua with if num_tasks not defined

Hi Oscar,

thanks very much for the detailed instructions. One question: the comment said "check if ntasks_per_node and num_nodes were requested" but I notice you put job_desc.min_nodes instead of job_desc.num_nodes, was it intentional?

We first added your suggested 'if' snippet to the slurm_job_submit function with job_desc.min_nodes. After running scontrol reconfigure and using a simple job script like yours, we still see SLURM_NTASKS undefined.

We also tried replacing job_desc.min_nodes with job_desc.num_nodes (as in the attached job_submit.lua script) and ran scontrol reconfigure. SLURM_NTASKS is still undefined.

Do you have any suggestion for us to better debug the issue? The code snippet looks straightforward to me, and we don't know what was missing here. Is there any difference between slurm.NO_VAL16 and slurm.NO_VAL?

Thanks,
-Trung
Comment 8 Oscar Hernández 2023-03-21 10:15:56 MDT
Hi Trung,

> thanks very much for the detailed instructions. One question: the comment
> said "check if ntasks_per_node and num_nodes were requested" but I notice
> you put job_desc.min_nodes instead of job_desc.num_nodes, was it intentional?
Yes, it is intentional. job_desc.min_nodes should reflect the nodes requested with -N.

> We first included your suggested if snippet to the slurm_job_submit function
> with job_desc.min_nodes. After running scontrol reconfigure, using a simple
> job script like yours, we still see SLURM_NTASKS undefined.
That is strange. I am running 23.02 and sent you the exact code snippet that I used, which was working for me. Could you check whether there is any related error such as "error: job_submit/lua:" in the slurmctld.log?
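
One way to do that check is a fixed-string search of the controller log. A minimal sketch, where `find_lua_errors` is a hypothetical helper and the log location is an assumption (it is whatever SlurmctldLogFile points to on your site):

```shell
#!/bin/bash
# Hypothetical helper: print the most recent job_submit/lua errors
# found in the slurmctld log file given as $1.
find_lua_errors() {
    log="$1"
    # -F treats the pattern as a literal string, not a regex.
    grep -F 'error: job_submit/lua:' "$log" | tail -n 20
}

# Possible use (the path is site-specific):
#   find_lua_errors /var/log/slurm/slurmctld.log
```

No output means no matching error lines were found in that file.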

> Do you have any suggestion for us to better debug the issue? The code
> snippet looks straightforward to me, and we don't know what was missing
> here. Is there any difference between slurm.NO_VAL16 and slurm.NO_VAL?
Yes, NO_VAL (32-bit) and NO_VAL16 (16-bit) are used according to each variable's type. Each field must be compared against its corresponding NO_VAL, otherwise the checks would fail. For now, I would suggest checking the logs just in case.
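
To illustrate why the matching constant matters, here is a small sketch in plain shell arithmetic. It assumes the sentinel values from slurm.h (NO_VAL = 0xfffffffe, NO_VAL16 = 0xfffe); the helper names are hypothetical and only mimic the comparison done in the Lua script.

```shell
#!/bin/bash
# Sentinel values assumed from slurm.h, written in decimal.
NO_VAL=4294967294    # 0xfffffffe, for 32-bit fields such as num_tasks
NO_VAL16=65534       # 0xfffe, for 16-bit fields such as ntasks_per_node

# Hypothetical helpers: succeed when the field holds its "unset" sentinel.
is_unset32() { [ "$1" -eq "$NO_VAL" ]; }
is_unset16() { [ "$1" -eq "$NO_VAL16" ]; }

# An unset 16-bit field holds 65534. Comparing it against the 32-bit
# sentinel never matches, so the field would wrongly look "set":
#   is_unset16 65534   -> succeeds
#   is_unset32 65534   -> fails
```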

I would like to point out that my suggested workaround sets ntasks only when both -N (number of nodes) and --ntasks-per-node were specified, not when only one of them is. What job are you using for testing?

Let me double check using your job_submit.lua.

Kind regards,
Oscar
Comment 9 Trung Nguyen 2023-03-21 11:01:37 MDT
Hi Oscar,

thanks for your detailed explanation. I have reverted it to min_nodes, and after we reran scontrol reconfigure, the tests show SLURM_NTASKS defined as expected now.

I appreciate your support very much.

Best,
-Trung