Ticket 6380 - Seff reports incorrect usage
Summary: Seff reports incorrect usage
Status: RESOLVED DUPLICATE of ticket 6004
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 18.08.4
Hardware: Linux Linux
Severity: 3 - Medium Impact
Assignee: Felip Moll
QA Contact:
URL:
Depends on: 6004
Blocks:
Reported: 2019-01-18 07:40 MST by Steve Ford
Modified: 2019-01-23 03:13 MST

See Also:
Site: MSU


Description Steve Ford 2019-01-18 07:40:07 MST
Hello SchedMD,

We updated Slurm to 18.08.4 to fix an issue we encountered using seff (https://bugs.schedmd.com/show_bug.cgi?id=5882).

Now that we're updated, we are experiencing a different issue. Seff has errors and reports incorrect usage:

17.11.8 seff:
$ seff 6609166
Job ID: 6609166
Cluster: msuhpcc
User/Group: xxxxx/xxxxx
State: TIMEOUT (exit code 0)
Nodes: 1
Cores per node: 16
CPU Utilized: 11176-20:09:13
CPU Efficiency: 9979.29% of 112-00:00:32 core-walltime
Job Wall-clock time: 7-00:00:02
Memory Utilized: 19.99 GB
Memory Efficiency: 24.99% of 80.00 GB

18.08.4 seff:
$ seff 6609166
Use of uninitialized value $lmem in numeric lt (<) at /usr/bin/seff line 130, <DATA> line 624.
Use of uninitialized value $lmem in numeric lt (<) at /usr/bin/seff line 130, <DATA> line 624.
Use of uninitialized value $lmem in numeric lt (<) at /usr/bin/seff line 130, <DATA> line 624.
Job ID: 6609166
Cluster: msuhpcc
User/Group: xxxxx/xxxxx
State: TIMEOUT (exit code 0)
Nodes: 1
Cores per node: 16
CPU Utilized: 11176-20:09:13
CPU Efficiency: 9979.29% of 112-00:00:32 core-walltime
Job Wall-clock time: 7-00:00:02
Memory Utilized: 0.00 MB (estimated maximum)
Memory Efficiency: 0.00% of 80.00 GB (80.00 GB/node)
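
As a sanity check on the figures above, here is a minimal Python sketch that recomputes the CPU efficiency from the reported values. It assumes seff derives the percentage as CPU time divided by (cores x wall-clock time), which is consistent with the "core-walltime" figure it prints; the helper names `parse_slurm_time` and `cpu_efficiency` are hypothetical and not part of seff.

```python
def parse_slurm_time(text: str) -> int:
    """Convert a Slurm duration such as '7-00:00:02' or '00:05:30' to seconds."""
    # The optional day count is separated from the clock part by a dash.
    days, _, clock = text.rpartition("-")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days or 0) * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(cpu_used: str, wall_clock: str, cores: int) -> float:
    """Percentage of the allocated core-walltime that was spent on CPU."""
    # Assumption: core-walltime is wall-clock time multiplied by allocated cores.
    core_walltime = parse_slurm_time(wall_clock) * cores
    return 100.0 * parse_slurm_time(cpu_used) / core_walltime


# Values copied from the seff output for job 6609166.
efficiency = cpu_efficiency("11176-20:09:13", "7-00:00:02", cores=16)
print(f"{efficiency:.2f}%")  # prints 9979.29%
```

Under that assumption the percentage is arithmetically consistent with the other fields, so the implausible value appears to come from the "CPU Utilized" figure itself: 16 cores cannot accumulate more than 112-00:00:32 of CPU time in a wall-clock time of 7-00:00:02.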


This looks like the same issue described in https://bugs.schedmd.com/show_bug.cgi?id=6004 and https://bugs.schedmd.com/show_bug.cgi?id=6315. Patches are suggested in https://bugs.schedmd.com/show_bug.cgi?id=6004; can your developers take a look at them before I apply them?

Thank you,
Steve
Comment 2 Felip Moll 2019-01-21 08:48:07 MST
Hi Steve,

I'm discussing this issue with the dev team.

Will keep you posted.
Comment 3 Jurij Pečar 2019-01-21 09:20:26 MST
We have the same problem and I just tried out that patch, which solved it for us.
Comment 4 Felip Moll 2019-01-23 03:11:28 MST
Hi, this particular issue has already been fixed in bug 6004 for
18.08.5 and 19.05.0pre2. I am marking this bug as a duplicate.

In bug 6004 comment 6 you will also find a note about the limitations of seff.

We have another internal bug for addressing these limitations.

*** This ticket has been marked as a duplicate of ticket 6004 ***
Comment 5 Felip Moll 2019-01-23 03:13:32 MST
Commits are:

 18952af9413636120e708db9327d9c9530bbd473
 3bf91e0960f01ea055d89fa7de0f88262d6b3482
 46f967891d6eac78205e3a5ef02ad48cd31a0f7