Hi there. Just installed Slurm 17.02.2 and tested the pam_slurm_adopt module. It compiles, loads and works fine, but it doesn't put the adopted processes in all the configured cgroups. During my tests, it correctly put the SSH-adopted process in the "cpuset" and "freezer" subsystems, but missed "devices", "cpuacct" and "memory". Illustration:

* Initial job submission:

sh-ln01 $ srun -w sh-101-59 -p test --pty bash
sh-101-59 $ cat /proc/$$/cgroup
11:hugetlb:/
10:freezer:/slurm/uid_215845/job_11354/step_0
9:memory:/slurm/uid_215845/job_11354/step_0
8:cpuacct,cpu:/slurm/uid_215845/job_11354/step_0/task_0
7:blkio:/
6:devices:/slurm/uid_215845/job_11354/step_0
5:pids:/
4:cpuset:/slurm/uid_215845/job_11354/step_0
3:net_prio,net_cls:/
2:perf_event:/
1:name=systemd:/system.slice/slurmd.service
[kilian@sh-101-59 ~]$

* SSH'ing to the node:

sh-ln01 $ ssh sh-101-59
sh-101-59 $ cat /proc/$$/cgroup
11:hugetlb:/
10:freezer:/slurm/uid_215845/job_11354/step_extern
9:memory:/
8:cpuacct,cpu:/
7:blkio:/
6:devices:/
5:pids:/
4:cpuset:/slurm/uid_215845/job_11354/step_extern
3:net_prio,net_cls:/
2:perf_event:/
1:name=systemd:/user.slice/user-215845.slice/session-2724.scope

And I can confirm that the shell from the SSH process is confined to the CPUs allocated to the job, but it can freely consume all the memory it wants.

We have the following config:

JobAcctGatherType = jobacct_gather/cgroup
ProctrackType     = proctrack/cgroup
TaskPlugin        = task/cgroup
PrologFlags       = Alloc,Contain

Thanks!
-- Kilian
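For context, a sketch of how the module is typically wired into the SSH PAM stack (the exact path and ordering may vary by distribution; see the pam_slurm_adopt README for the authoritative instructions):

```
# /etc/pam.d/sshd  (path may differ by distribution)
# Adopt incoming SSH sessions into the user's extern job step cgroups:
account    required    pam_slurm_adopt.so
```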
Did you disable pam_systemd in the various PAM configs? The output you've given suggests it may still be enabled.
(In reply to Tim Wickberg from comment #1)
> Did you disable pam_systemd in the various PAM configs? The output you've
> given suggests it may still be enabled.

Wow, spot on, thanks Tim! That's exactly it: disabling pam_systemd fixes the issue. And I see there's a note about it in the README too, which I missed.

Thanks!
-- Kilian
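For anyone landing here later: "disabling pam_systemd" usually means commenting out its session line in the relevant PAM stack(s), so SSH sessions are not moved into a per-user systemd scope (which is what pulls processes back out of the job's "memory", "devices" and "cpuacct" cgroups). A sketch; the file name varies by distribution (e.g. password-auth on RHEL/CentOS, common-session on Debian/Ubuntu):

```
# /etc/pam.d/password-auth  (file name varies by distribution)
# Comment out the pam_systemd session line so adopted SSH processes
# stay in the Slurm step_extern cgroups:
#-session   optional    pam_systemd.so
```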
Forgot to close the ticket. Done.