Bug 6120

Summary: Missing trailing space with squeue -o %b (GRES)
Product: Slurm
Reporter: Stephane Thiell <sthiell>
Component: User Commands
Assignee: Albert Gil <albert.gil>
Status: RESOLVED FIXED
Severity: 4 - Minor Issue
Priority: ---
CC: kaylea.nelson, mrodgers, tdockendorf
Version: 18.08.3
Hardware: Linux
OS: Linux
See Also: https://bugs.schedmd.com/show_bug.cgi?id=6141
Site: Stanford
Version Fixed: 18.08.4 19.05.0pre2
Attachments: squeue tres print fix
Fixes missing suffixes for TRES.

Description Stephane Thiell 2018-11-28 18:06:22 MST
Hello Tim,

Minor issue: one of our monitoring scripts broke today after migrating from Slurm 17 to 18 on Sherlock. It looks like the -o '%b' (GRES) format option of squeue no longer adds a trailing space. See below:


$ squeue -rh -o '%g %u %P %b %T %C %D %R' -j 33003323
rzia emmagg rzia N/ARUNNING 1 1 sh-106-12
$ squeue -rh -o '%g %u %P %b %T %C %D %R' -j 33236568
epop gabourie gpu gpu:1RUNNING 1 1 sh-112-04

Our current workaround is to specify a format length which is long enough to cover all gres types on the cluster:

$ squeue -rh -o '%g %u %P %10b %T %C %D %R' -j 33236568
epop gabourie gpu gpu:1     RUNNING 1 1 sh-112-04
$ squeue -rh -o '%g %u %P %10b %T %C %D %R' -j 33003323
rzia emmagg rzia N/A       RUNNING 1 1 sh-106-12
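
For monitoring scripts that cannot rely on a fixed width, one possible approach is to repair the fused GRES/state token while parsing. The following is only a sketch under stated assumptions: the helper names and the `KNOWN_STATES` list are hypothetical and not exhaustive, and the parser assumes the eight-field format string shown above with no whitespace inside any field.

```python
# Job states to recognise at the end of a fused token. RUNNING comes from the
# report above; the others are common Slurm states. Illustrative, not exhaustive.
KNOWN_STATES = ("RUNNING", "PENDING", "COMPLETING", "SUSPENDED")

EXPECTED_FIELDS = 8  # %g %u %P %b %T %C %D %R


def split_gres_state(token):
    """Split a fused '<gres><STATE>' token such as 'gpu:1RUNNING'.

    Returns (gres, state), or None if the token does not end in a known state.
    """
    for state in KNOWN_STATES:
        if token.endswith(state) and len(token) > len(state):
            return token[: -len(state)], state
    return None


def parse_squeue_line(line):
    """Return the 8 fields of one output line, repairing a fused %b/%T pair."""
    fields = line.split()
    if len(fields) == EXPECTED_FIELDS:
        return fields
    if len(fields) == EXPECTED_FIELDS - 1:
        repaired = split_gres_state(fields[3])
        if repaired is not None:
            return fields[:3] + list(repaired) + fields[4:]
    raise ValueError(f"unexpected squeue line: {line!r}")
```

The same function accepts both the fused output and the padded `%10b` output, so a script using it would not need to change again once squeue prints the space.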

Thanks!
Stephane
Comment 3 Trey Dockendorf 2018-11-29 13:54:56 MST
The patch supplied in bug report 6141 fixes the issue with N/A printing; I am curious whether it also fixes the actual issue when TRES is present.
Comment 6 Jason Booth 2018-12-03 09:00:13 MST
*** Bug 6141 has been marked as a duplicate of this bug. ***
Comment 7 Jason Booth 2018-12-03 09:01:50 MST
Created attachment 8496
squeue tres print fix

Trey Dockendorf (tdockendorf@osc.edu) fix / contribution from 6141.
Comment 8 Albert Gil 2018-12-03 09:17:22 MST
Created attachment 8497
Fixes missing suffixes for TRES.
Comment 11 Albert Gil 2018-12-06 04:41:37 MST
Hi Stephane, Trey,

> Supplied patch in bug report 6141 fixes the issue with N/A printing, curious
> if fixes actual issue when TRES is present.

Yes, the patch fixes both the trailing-space and the "DELIM" issues.
In fact, any text after %b and before the next format specifier was being ignored.
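
To illustrate the effect described here, the following is a simplified model of rendering a `-o` format string. It is not Slurm's actual implementation: the `render` function, its `drop_suffix_after` parameter, and the regular expression are hypothetical, and literal text before the first specifier is ignored for brevity. Each specifier carries the literal text that follows it as a suffix, and the bug is modelled by discarding that suffix for `%b`.

```python
import re

# One specifier: '%', optional '.', optional width, a type letter,
# then the literal suffix up to the next '%'.
SPEC = re.compile(r"%(\.?)(\d*)([A-Za-z])([^%]*)")


def render(fmt, values, drop_suffix_after=frozenset()):
    """Render fmt using values keyed by type letter.

    Letters listed in drop_suffix_after lose their literal suffix,
    which models the behaviour reported for %b.
    """
    out = []
    for right, width, letter, suffix in SPEC.findall(fmt):
        text = values[letter]
        if width:
            size = int(width)
            # '.' requests right justification; otherwise pad on the right.
            text = text[:size].rjust(size) if right else text[:size].ljust(size)
        out.append(text)
        if letter not in drop_suffix_after:
            out.append(suffix)
    return "".join(out)


values = {"P": "gpu", "b": "gpu:1", "T": "RUNNING"}
print(render("%P %b %T", values))                            # suffix kept
print(render("%P %b %T", values, drop_suffix_after={"b"}))   # fused fields
print(render("%P %10b %T", values, drop_suffix_after={"b"})) # width workaround
```

In this model the explicit width still separates the fields because the padding belongs to the field itself rather than to the dropped suffix, which matches the workaround in the original description.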

Trey's patch has been merged and properly authored at:
https://github.com/SchedMD/slurm/commit/9b0399b84373b3360a61b68c6725b0770c7f9dab

Thanks Trey!


Albert
Comment 12 Danny Auble 2018-12-06 11:55:48 MST
*** Bug 6144 has been marked as a duplicate of this bug. ***
Comment 13 Stephane Thiell 2018-12-07 12:02:25 MST
Thanks!
Comment 14 Kaylea Nelson 2019-06-07 11:35:36 MDT
We are currently on Slurm 18.08.5 and we are still seeing the issue when using the SQUEUE_FORMAT2/-O/--Format formatting options. -o works as expected after the fix.

# -o, works as expected
$ squeue -o "%j,%i"
NAME,JOBID
mesau,32994447_[77-399%20]
schedule-,30983558
schedule-,30983557
schedule-,30983556
LFeHNTMSFeL_-1_m1_BP86_def2-TZVP_OptHess.mpi,32999232
LFeHNTMSFeL_-1_m9_BP86_def2-TZVP_OptHess.mpi,32999246
LFeHCTMSFeL_-1_m2_BP86_def2-TZVP_OptHess.mpi,32999250
LFeHCTMSFeL_-1_m10_BP86_def2-TZVP_OptHess.mpi,32999251
LFeSCHTMS_0_m1_BP86_def2-TZVP_OptHess_NotTight.mpi,33001227

# -O, missing delimiter
$ squeue -O "name,jobid"
NAME                JOBID
mesau               32994447
schedule-           30983558
schedule-           30983557
schedule-           30983556
LFeHNTMSFeL_-1_m1_BP32999232
LFeHNTMSFeL_-1_m9_BP32999246
LFeHCTMSFeL_-1_m2_BP32999250
LFeHCTMSFeL_-1_m10_B32999251
LFeSCHTMS_0_m1_BP86_33001227

# -o, works as expected
$ squeue -o "%j %i"
NAME JOBID
mesau 32994447_[77-399%20]
schedule- 30983558
schedule- 30983557
schedule- 30983556
LFeHNTMSFeL_-1_m1_BP86_def2-TZVP_OptHess.mpi 32999232
LFeHNTMSFeL_-1_m9_BP86_def2-TZVP_OptHess.mpi 32999246
LFeHCTMSFeL_-1_m2_BP86_def2-TZVP_OptHess.mpi 32999250
LFeHCTMSFeL_-1_m10_BP86_def2-TZVP_OptHess.mpi 32999251
LFeSCHTMS_0_m1_BP86_def2-TZVP_OptHess_NotTight.mpi 33001227

# -O, fails completely with any delimiter other than ","
$ squeue -O "name jobid"
squeue: error: Invalid job format specification: name jobid
Comment 16 Albert Gil 2019-06-10 05:15:50 MDT
Hi Kaylea,

> We are currently on Slurm 18.08.5 and we are still seeing the issue when
> using the SQUEUE_FORMAT2/-O/--Format formatting options. -o works as
> expected after the fix.

Actually, what you are reporting in comment 14 is the expected behavior (or at least a known limitation). As the manpage notes, one of the differences between -o and -O is that -O "Requests a comma separated list of job information to be displayed".

So with -O we can specify only the desired fields, not any other delimiter or character between them: just each field, its justification, and its size.

The same happens with sinfo's pair of options.
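
Given that limitation, a script consuming -O output could request explicit widths and slice the columns by position. The sketch below assumes the `field:size` syntax from the squeue manpage and the 20-character default column width visible in comment 14; the helper names are hypothetical.

```python
DEFAULT_WIDTH = 20  # column width observed in the -O output in comment 14


def build_format(fields):
    """Build a -O/--Format argument from (name, width) pairs.

    A width of None leaves the field at squeue's default width.
    """
    parts = []
    for name, width in fields:
        parts.append(name if width is None else f"{name}:{width}")
    return ",".join(parts)


def split_fixed_width(line, widths):
    """Slice one fixed-width output line into stripped column values."""
    values, start = [], 0
    for width in widths:
        values.append(line[start:start + width].strip())
        start += width
    return values


# Request a wider name column so long job names are not cut at 20 characters.
print(build_format([("name", 60), ("jobid", None)]))
# Slicing by position separates columns even when no space lies between them.
print(split_fixed_width("LFeHNTMSFeL_-1_m1_BP32999232", [DEFAULT_WIDTH, DEFAULT_WIDTH]))
```

Slicing by position recovers the two columns even when they touch, but a name truncated at the column width stays truncated, so a width large enough for the longest expected job name is still needed.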

I can see that you may want this limitation removed so that -O allows the same flexibility with delimiters as -o. Since this is not actually related to the original bug, if you are still interested please file a new bug as 5-Enhancement, referencing your comment here, and we will discuss it further there.
Is that OK with you?

Thanks,
Albert
Comment 17 Kaylea Nelson 2019-06-10 08:42:40 MDT
Sounds good, will do!
Comment 18 Albert Gil 2019-06-10 09:23:03 MDT
Closing as fixed again.