We are running into "sacct: error: slurmdbd: Query result exceeds size limit" when trying to query accounting data:

sacct -b -a --parsable2 --starttime=2019-02-21T12:00:00 --endtime=2019-02-21T13:00:00

Here is the log message during that timeframe on the slurmdbd:

==> /var/log/slurmdbd.log <==
[2019-03-18T09:54:01.513] error: slurm_pack_list: size limit exceeded

Here are the MySQL settings we have for innodb_buffer_pool:

innodb_buffer_pool_size=4096M
innodb_buffer_pool_instances=8
innodb_lock_wait_timeout=100

Can you advise us whether this is due to innodb_buffer_pool_size, and recommend settings?
This size limit is a built-in limit of 3 GB of data. If a query result exceeds 3 GB, slurmdbd returns the error ESLURM_RESULT_TOO_LARGE, which produces the error message you see. Because the limit is built into slurmdbd itself, it is unrelated to your innodb_buffer_pool settings. This is mentioned in the slurmdbd.conf man page under MaxQueryTimeRange: https://slurm.schedmd.com/slurmdbd.conf.html#OPT_MaxQueryTimeRange

Even though you're using -b, that flag (and all other sacct formatting flags) doesn't reduce the amount of data sent from slurmdbd to sacct: sacct retrieves the entire job record, then prints only the fields it cares about. We're aware of this limitation.

Normally I'd recommend reducing the time period of your query, but I notice it's already just a single hour of data, and I'm curious how you have more than 3 GB of job and step records in a single hour. A job record is usually between 500 bytes and 1 kB (it can vary, especially with large comment strings), but step records are a lot heavier than job records, and sacct returns step records as well. Maybe there are a ton of steps. Can you try running that same query, but with the -X flag as well (which excludes steps from the query)?

sacct -X -b -a --parsable2 --starttime=2019-02-21T12:00:00 --endtime=2019-02-21T13:00:00
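As an illustration of the "reduce the time period" workaround, a small wrapper can split a long range into fixed windows and issue one query per window. This is only a sketch: the function name and window size are hypothetical, it assumes GNU date, and it prints the sacct commands instead of running them.

```shell
#!/bin/bash
# Hypothetical sketch: split a long accounting query into one-hour windows
# so each slurmdbd response stays under the built-in 3 GB limit.
# Assumes GNU date; prints the sacct commands instead of executing them.
gen_windows() {
    local start="$1" hours="$2" i s e
    for i in $(seq 0 $((hours - 1))); do
        # "${start/T/ }" turns the ISO timestamp into a form GNU date
        # accepts alongside a "+ N hour" relative adjustment.
        s=$(date -d "${start/T/ } + $i hour" '+%Y-%m-%dT%H:%M:%S')
        e=$(date -d "${start/T/ } + $((i + 1)) hour" '+%Y-%m-%dT%H:%M:%S')
        # Swap 'echo' for the real command once the windows look right.
        echo sacct -X -b -a --parsable2 --starttime="$s" --endtime="$e"
    done
}

gen_windows "2019-02-21T00:00:00" 24
```

Shrinking the window further (or dropping -X) is a judgment call based on how step-heavy the workload is.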
I narrowed this down to a time where sacct hangs.

# sacct -X -b -a --parsable2 --starttime=2019-02-21T12:23:45 --endtime=2019-02-21T12:23:46 --format=jobid,elapsed,ncpus,ntasks,state | wc -l
397

[root@emgmt1 ~]# sacct -b -a --parsable2 --starttime=2019-02-21T12:23:45 --endtime=2019-02-21T12:23:46 --format=jobid,elapsed,ncpus,ntasks,state | wc -l
1019653

There seems to be a ton of job steps within that time frame. Is it usual for Slurm to record that many job steps in one second?

588497.39993|00:00:03|8|8|COMPLETED|588497.39993|COMPLETED|0:0
588497.39994|00:00:02|8|8|COMPLETED|588497.39994|COMPLETED|0:0
588497.39995|00:00:03|8|8|COMPLETED|588497.39995|COMPLETED|0:0
588497.39996|00:00:03|8|8|COMPLETED|588497.39996|COMPLETED|0:0
588497.39997|00:00:02|8|8|COMPLETED|588497.39997|COMPLETED|0:0
588497.39998|00:00:03|8|8|COMPLETED|588497.39998|COMPLETED|0:0
588497.39999|00:00:03|8|8|COMPLETED|588497.39999|COMPLETED|0:0
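For triage like this, one quick way to see which job produced the flood of steps is to group the --parsable2 output by the job portion of the JobID field (the part before the dot). A rough sketch; the helper name and the inline sample lines are made up for illustration, and in practice you would pipe real sacct output into it:

```shell
#!/bin/bash
# Sketch: count step records per job in `sacct --parsable2` output.
# Step rows have a JobID like 588497.39993 (or 588497.batch); rows
# without a dot are job records and are skipped.
steps_per_job() {
    awk -F'|' '$1 ~ /\./ { split($1, a, "."); n[a[1]]++ }
               END { for (j in n) print j, n[j] }' | sort
}

# Illustrative sample input (not real cluster data):
printf '%s\n' \
    '588497.39998|00:00:03|8|8|COMPLETED' \
    '588497.39999|00:00:03|8|8|COMPLETED' \
    '588496.batch|00:00:01|1|1|COMPLETED' \
    | steps_per_job
```

A job showing tens of thousands of steps in the output is the likely culprit for an oversized query result.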
Nice find. I don't often get to see how many job steps people are actually running, but if all of those steps finished at the same time, it's entirely possible for 40k steps to share the same end time (which apparently happened here). Each step records its own end time, which is eventually passed to the database, so the end time is not when the step record lands in the database, but when the step says "I'm done."

When you see the "Query result exceeds size limit" error in the future, I recommend the procedure you just followed: reduce the time span of the query and/or use the -X flag to exclude steps. Is this a sufficient workaround? It's unfortunately not a very elegant solution. We are aware that we send back the entire job and step records instead of just what was requested; however, changing this is not trivial and would likely require sponsorship.
I will check with the team and get back to you on whether this works as a workaround. Also, what is the difference between the batch and extern steps?
The batch step runs the actual batch script that the user submitted; each srun launches another step. The extern step is created when PrologFlags=contain is set in slurm.conf and is used for pam_slurm_adopt: it creates an additional cgroup so that a user ssh'ing to the node gets adopted into the extern step's cgroup.
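For reference, a minimal slurm.conf fragment that enables the extern step (illustrative; the rest of the configuration file is elided):

```
# slurm.conf fragment (illustrative): creates an extern step per job,
# which pam_slurm_adopt uses to adopt ssh sessions into the job's cgroup.
PrologFlags=contain
```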
Closing as infogiven. Please re-open if you have further issues.