Looks like there is another variant of this bug. Namely, if there are jobs pending for a reservation and the reservation is full, the primary scheduler stops and does not schedule any other jobs for that partition. This may be a variation on bug 595.
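To make the reported symptom concrete, here is a minimal toy model of a priority-ordered scheduling pass. It is only a sketch and not Slurm's actual code: the `Job` class, the `schedule` function, and the `skip_full_reservations` flag are all hypothetical names invented for illustration, and the "fixed" branch shows one plausible behavior rather than what the commit actually does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    # Hypothetical, simplified job record (not a Slurm structure).
    name: str
    partition: str
    nodes: int
    reservation: Optional[str] = None


def schedule(queue, free_nodes, resv_free, skip_full_reservations):
    """Toy priority-order pass; returns the names of jobs that start.

    queue      -- jobs, assumed already sorted by priority
    free_nodes -- idle node count per partition
    resv_free  -- idle node count per reservation
    """
    free = dict(free_nodes)
    resv = dict(resv_free)
    blocked = set()  # partitions where a higher-priority job could not start
    started = []
    for job in queue:
        if job.partition in blocked:
            continue
        if job.reservation is not None:
            if resv.get(job.reservation, 0) >= job.nodes:
                resv[job.reservation] -= job.nodes
                started.append(job.name)
            elif not skip_full_reservations:
                # Reported symptom: a job waiting on a full reservation
                # blocks every later job in the same partition.
                blocked.add(job.partition)
            continue
        if free.get(job.partition, 0) >= job.nodes:
            free[job.partition] -= job.nodes
            started.append(job.name)
        else:
            blocked.add(job.partition)
    return started


queue = [
    Job("resv_job", "batch", 4, reservation="maint"),
    Job("normal_job", "batch", 2),
]
print(schedule(queue, {"batch": 8}, {"maint": 0}, skip_full_reservations=False))  # []
print(schedule(queue, {"batch": 8}, {"maint": 0}, skip_full_reservations=True))   # ['normal_job']
```

In the first call the job waiting on the full reservation holds up the rest of the partition even though 8 nodes are idle; in the second, the reservation job is simply skipped and the lower-priority job starts.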
Fixed in commit 08f0f57cc18824d0984bbbbba39ad27d4e796a62
One more note: these changes are more extensive than what I would want to put into version 2.6, which we want to keep really stable, so they will only be in version 14.03. The same goes for the fixes for keeping the node CPU load average current.
(In reply to comment #2 from Moe Jette, Sun, Mar 16, 2014) I was just about to backport this patch so it would apply cleanly, when I realized we've already got all the patches this depends on in our local build of 2.6.5, so it applies cleanly already. :-) Is there a bug open for the stale node load average problem? I poked through Bugzilla but didn't see one. Thanks, Moe. john
Hi John, I don't recall any problem with stale node load average. Could you please log a new bug? Thanks, David
Created attachment 704: attachment-18387-0.html. (In reply to comment #4 from David Bigagli, http://bugs.schedmd.com/show_bug.cgi?id=630) The CPU load issue is fixed in version 14.03; the changes are too extensive for v2.6. Not sure if I opened a bug on it, though.