MaxArraySize was increased from 1000 to 10000 in slurm.conf; the new config file was distributed to all nodes, and "scontrol reconfigure" returned with no errors or warnings. Thereafter, "scontrol show config" did show "MaxArraySize=10000" when queried. However, users submitting array jobs with indices higher than 1000 (e.g. "0-2449%108") were rejected with "Invalid job array specification" errors.

The problem is in the _valid_array_inx() function. The first time that function is called, it caches the value of MaxArraySize AT THAT TIME in the static global "max_array_size" (declared in slurmctld/job_mgr.c). In this case, that was the original value of 1000. Later in _valid_array_inx(), some code checks the slurmctld config timestamp and, if the config is newer, pulls the new value of MaxArraySize into the static local variable "max_task_cnt". The function nevertheless continues to use "max_array_size" to allocate and index-check the array specification, effectively ignoring the new value of MaxArraySize. The error our users hit stems from

    valid = _parse_array_tok(tok, job_desc->array_bitmap, max_array_size);

for which the third argument remains the originally-configured MaxArraySize (1000) despite the successful "scontrol reconfigure".

Other code inside slurmctld appears to do the same thing -- caching the initial value of MaxArraySize and ignoring updates -- e.g. _xlate_array_dep() in slurmctld/job_scheduler.c. This implies that, until all such functions are fixed, "scontrol reconfigure" should at least warn that this configuration change requires a restart of slurmctld to take effect.
I should also mention that this code is unchanged at the head of the git source tree, so the issue is still present there.