Created attachment 25930: patch

slurmd fails to start if cgroup.conf is missing, even when no cgroup plugin is specified in slurm.conf.

This is the output of slurmd -D with debug5 enabled:

slurmd: debug3: Trying to load plugin /usr/lib/x86_64-linux-gnu/slurm-wlm/cgroup_v2.so
slurmd: debug3: plugin_load_from_file->_verify_syms: found Slurm plugin name:Cgroup v2 plugin type:cgroup/v2 version:0x160502
slurmd: error: cannot read (null)/user.slice/user-0.slice/session-10.scope/cgroup.controllers: No such file or directory
slurmd: error: Couldn't load specified plugin name for cgroup/v2: Plugin init() callback failed
slurmd: error: cannot create cgroup context for cgroup/v2
slurmd: error: Unable to initialize cgroup plugin
slurmd: error: slurmd initialization failed

This is the slurm.conf file:

ClusterName=cluster
SlurmctldHost=slurmctld
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/run/slurmctld.pid
SlurmdPidFile=/run/slurmd.pid
SlurmdSpoolDir=/var/lib/slurm/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
AccountingStorageHost=slurmdbd
AccountingStorageType=accounting_storage/slurmdbd
AccountingStoreFlags=job_comment
JobCompType=jobcomp/linux
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=debug5
SlurmdLogFile=/var/log/slurm/slurmd.log
NodeName=slurmd CPUs=2 State=UNKNOWN
PartitionName=debug Nodes=slurmd Default=YES MaxTime=INFINITE State=UP

Attached is a quick workaround I added to the Debian package to solve the issue. The patch sets the default cgroup basedir, making slurmd behave as if cgroup.conf existed and were empty. Note the "(null)" prefix in the first error line above: without cgroup.conf, the base directory is never set.
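
For readers without the attachment, the general idea (populate defaults first, then treat the config file as optional) might look roughly like the sketch below. This is not the attached patch and not Slurm's actual code: the struct, the function names, and the choice of fields are hypothetical, and the default values shown are assumptions for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of cgroup settings; names are illustrative only. */
typedef struct {
    char mountpoint[256];   /* base directory of the cgroup hierarchy */
    bool constrain_cores;
} cgroup_conf_sketch_t;

/* Fill every field with a default so no pointer or path is left unset. */
void sketch_set_defaults(cgroup_conf_sketch_t *conf)
{
    assert(conf != NULL);
    memset(conf, 0, sizeof(*conf));
    /* Assumed default path, chosen for this example. */
    snprintf(conf->mountpoint, sizeof(conf->mountpoint), "%s",
             "/sys/fs/cgroup");
    conf->constrain_cores = false;
}

/*
 * Load settings from `path`. A missing file is not an error: the
 * defaults simply stay in place, as if the file existed and were empty.
 * Returns 0 on success, -1 if the file exists but cannot be opened.
 */
int sketch_load_conf(const char *path, cgroup_conf_sketch_t *conf)
{
    char line[512];
    FILE *fp;

    sketch_set_defaults(conf);

    fp = fopen(path, "r");
    if (!fp)
        return (errno == ENOENT) ? 0 : -1;

    /* Minimal parser: only overrides the mountpoint if it is present. */
    while (fgets(line, sizeof(line), fp)) {
        char value[256];
        if (sscanf(line, " CgroupMountpoint=%255s", value) == 1)
            snprintf(conf->mountpoint, sizeof(conf->mountpoint),
                     "%s", value);
    }
    fclose(fp);
    return 0;
}
```

The point of calling the defaults routine unconditionally is that the plugin's init path never sees an unset base directory, whether or not the file is present.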
Hi Gennaro,

Setting the defaults is the correct approach, but we wanted to do it for all the default settings, not just one. So we added the following commits, which fix the issue:

|\
| * 2228ca18a7 Set cgroup.conf defaults even without cgroup.conf
| * e369902e4c Sort variables
| * 741578cd66 Fix function signature
|/

These commits have been applied to master and will be in the next 23.02 release, but we decided not to include them in 22.05.

There is more work under way to make the cgroup.conf file optional. We are tracking that in an internal bug, but it will probably land in 23.11.

That said, I appreciate your contribution, and I am closing this bug.

Thanks a lot!