Ticket 6380

Summary: Seff reports incorrect usage
Product: Slurm
Reporter: Steve Ford <fordste5>
Component: User Commands
Assignee: Felip Moll <felip.moll>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: 3 - Medium Impact
Priority: ---
CC: jurij.pecar
Version: 18.08.4   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=6332
https://bugs.schedmd.com/show_bug.cgi?id=6382
Site: MSU
Ticket Depends on: 6004    
Ticket Blocks:    

Description Steve Ford 2019-01-18 07:40:07 MST
Hello SchedMD,

We updated SLURM to 18.08.4 to fix an issue we encountered using seff (https://bugs.schedmd.com/show_bug.cgi?id=5882).

Now that we're updated, we are experiencing a different issue. For the same job, seff now prints Perl warnings and reports 0.00 MB of memory used:

17.11.8 seff:
$ seff 6609166
Job ID: 6609166
Cluster: msuhpcc
User/Group: xxxxx/xxxxx
State: TIMEOUT (exit code 0)
Nodes: 1
Cores per node: 16
CPU Utilized: 11176-20:09:13
CPU Efficiency: 9979.29% of 112-00:00:32 core-walltime
Job Wall-clock time: 7-00:00:02
Memory Utilized: 19.99 GB
Memory Efficiency: 24.99% of 80.00 GB
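
For reference, the percentages in the output above can be reproduced from the other fields. The following is a minimal Python sketch, not seff's own code (seff is a Perl script); the helper names `slurm_time_to_seconds` and `percent` are hypothetical, and the formulas are inferred from the output itself (efficiency = used / available).

```python
def slurm_time_to_seconds(text: str) -> int:
    """Convert a '[D-]HH:MM:SS' duration, as printed by seff, to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def percent(used: float, total: float) -> float:
    """Return used/total as a percentage."""
    return 100.0 * used / total


# Values copied from the 17.11.8 output for job 6609166.
cores = 16
wall_seconds = slurm_time_to_seconds("7-00:00:02")
core_walltime = cores * wall_seconds          # cores x wall-clock time
cpu_used = slurm_time_to_seconds("11176-20:09:13")

cpu_efficiency = percent(cpu_used, core_walltime)
mem_efficiency = percent(19.99, 80.00)

print(f"core-walltime: {core_walltime} s")
print(f"CPU Efficiency: {cpu_efficiency:.2f}%")
print(f"Memory Efficiency: {mem_efficiency:.2f}%")
```

The core-walltime matches the reported 112-00:00:32, and the two percentages match the output. A CPU efficiency above 100% means the accounted CPU time exceeds what 16 cores could deliver in 7 days, so the "CPU Utilized" figure is itself suspect in both versions.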

18.08.4 seff:
$ seff 6609166
Use of uninitialized value $lmem in numeric lt (<) at /usr/bin/seff line 130, <DATA> line 624.
Use of uninitialized value $lmem in numeric lt (<) at /usr/bin/seff line 130, <DATA> line 624.
Use of uninitialized value $lmem in numeric lt (<) at /usr/bin/seff line 130, <DATA> line 624.
Job ID: 6609166
Cluster: msuhpcc
User/Group: xxxxx/xxxxx
State: TIMEOUT (exit code 0)
Nodes: 1
Cores per node: 16
CPU Utilized: 11176-20:09:13
CPU Efficiency: 9979.29% of 112-00:00:32 core-walltime
Job Wall-clock time: 7-00:00:02
Memory Utilized: 0.00 MB (estimated maximum)
Memory Efficiency: 0.00% of 80.00 GB (80.00 GB/node)
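
The warnings and the 0.00 MB figure are consistent with each other: in Perl, an undefined value used in a numeric comparison is treated as 0 and, with warnings enabled, triggers the "Use of uninitialized value" message. The Python sketch below only mimics that behaviour. It is not seff's source; the running-maximum loop and the names `numeric` and `max_memory` are assumptions made for illustration, since the ticket does not show what line 130 of seff does.

```python
import warnings
from typing import Optional, Sequence


def numeric(value: Optional[float], name: str) -> float:
    """Mimic Perl's numeric context: None (undef) counts as 0 and warns."""
    if value is None:
        warnings.warn(f"Use of uninitialized value {name} in numeric lt (<)")
        return 0.0
    return value


def max_memory(samples: Sequence[Optional[float]]) -> float:
    """Hypothetical running maximum over per-step memory values."""
    peak = 0.0
    for lmem in samples:
        lmem = numeric(lmem, "$lmem")
        if peak < lmem:
            peak = lmem
    return peak


# Three missing values, as suggested by the three warning lines above.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    peak = max_memory([None, None, None])

print(f"warnings: {len(caught)}, peak memory: {peak:.2f} MB")
```

If every memory value arrives undefined, the result stays at 0 and one warning is emitted per value, which would match the symptom reported here.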


This looks like the same issue described in https://bugs.schedmd.com/show_bug.cgi?id=6004 and https://bugs.schedmd.com/show_bug.cgi?id=6315. Patches are suggested in https://bugs.schedmd.com/show_bug.cgi?id=6004. Can your developers take a look at them before I apply them?

Thank you,
Steve
Comment 2 Felip Moll 2019-01-21 08:48:07 MST
Hi Steve,

I'm discussing this issue with the dev team.

Will keep you posted.
Comment 3 Jurij Pečar 2019-01-21 09:20:26 MST
We have the same problem and I just tried out that patch, which solved it for us.
Comment 4 Felip Moll 2019-01-23 03:11:28 MST
Hi, this particular issue has already been fixed in bug 6004 for
18.08.5 and 19.05.0pre2. I am marking this bug as a duplicate.

In bug 6004 comment 6 you will also find a note about the limitations of seff.

We have another internal bug to address these limitations.

*** This ticket has been marked as a duplicate of ticket 6004 ***
Comment 5 Felip Moll 2019-01-23 03:13:32 MST
Commits are:

 18952af9413636120e708db9327d9c9530bbd473
 3bf91e0960f01ea055d89fa7de0f88262d6b3482
 46f967891d6eac78205e3a5ef02ad48cd31a0f7