Ticket 4095 - salloc should remove groups when dropping privileges
Summary: salloc should remove groups when dropping privileges
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands (show other tickets)
Version: 17.02.7
Hardware: Linux Linux
: --- 6 - No support contract
Assignee: Jacob Jenson
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2017-08-21 18:17 MDT by Philip Kovacs
Modified: 2017-08-28 14:39 MDT (History)
0 users

See Also:
Site: -Other-
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed: 17.02.8
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments
salloc drop privileges patch (657 bytes, patch)
2017-08-21 18:17 MDT, Philip Kovacs
Details | Diff

Note You need to log in before you can comment on or make changes to this ticket.
Description Philip Kovacs 2017-08-21 18:17:31 MDT
Created attachment 5122 [details]
salloc drop privileges patch

Here's my patch for salloc to drop groups before the setuid.  This closes the POSIX POS36-C error that programs such as rpmlint catch easily.
Comment 1 Moe Jette 2017-08-22 09:08:42 MDT
Thank you for your contribution. It will be in our next release. The commit is here:
https://github.com/SchedMD/slurm/commit/c7e6d864dea813b2d34334a6c3ef17ca4583bd28
Comment 4 Tim Wickberg 2017-08-22 13:40:17 MDT
One note - commit a5b47f7b6 drops the EPERM part of the conditional, otherwise if root was running without CAP_SETGID for some reason they would still persist.

I am noticing that salloc does not immediately reject a job submission with --uid set from a non-root account, and will only exit once the allocation request has been granted resulting in stray jobs. A separate bug 4101 has been opened to track that.
Comment 5 Philip Kovacs 2017-08-22 16:11:08 MDT
I had added the EPERM clause because all accounts were hitting that code block, with or without an explicit --uid, so if you were to run `salloc bash` as an ordinary user, you would get a setgroups EPERM error.  The EPERM check lets non-privileged accounts continue.  Bug 4101 is fine, but not strong enough, i.e. all users are issuing setuid in all cases with or without --uid.  Regards, Phil
Comment 6 Moe Jette 2017-08-24 14:21:30 MDT
(In reply to Philip Kovacs from comment #5)
> I had added the EPERM clause because all accounts were hitting that code
> block, with or without an explicit --uid, so if you were to run `salloc
> bash` as an ordinary user, you would get a setgroups EPERM error.  The EPERM
> check lets non-privileged accounts continue.  Bug 4101 is fine, but not
> strong enough, i.e. all users are issuing setuid in all cases with or
> without --uid.  Regards, Phil

Thanks for pointing that out. While the UID is the data structure was initially set to -1, that quickly changed that that user's ID. I changed the "if" test here, which addresses that problem:
https://github.com/SchedMD/slurm/commit/608cfb0b810a4ac4a6cb3509e2c163a40e368e5f
Comment 7 Philip Kovacs 2017-08-24 14:42:19 MDT
Also, by moving the setuid block up a bit, just after the similar setgid block, but before the slurm allocation, you avoid the stranded allocation from ctld in the event setgroups or setuid fail on the EPERM.  It's (probably) better that way since the uid/gid change is finalized before you message ctld for the slurm allocation.  See my patch on bug 4101.
Comment 8 Moe Jette 2017-08-28 09:35:52 MDT
(In reply to Tim Wickberg from comment #4)
> One note - commit a5b47f7b6 drops the EPERM part of the conditional,
> otherwise if root was running without CAP_SETGID for some reason they would
> still persist.
> 
> I am noticing that salloc does not immediately reject a job submission with
> --uid set from a non-root account, and will only exit once the allocation
> request has been granted resulting in stray jobs. A separate bug 4101 has
> been opened to track that.

That change prevented me from being able to execute salloc at all. salloc was always executing setgroups, even for a regular user without the --uid option and exiting.

I've changed the salloc logic to only execute the setgroups call if not running as self, something that I had earlier added to the version 17.11 branch and now backported to version 17.02.
https://github.com/SchedMD/slurm/commit/d03e393058a3b6c8b67eafde2f60dd6101b9774f
Comment 9 Philip Kovacs 2017-08-28 14:34:30 MDT
Are you patching the same code as me?.  The patch I posted on bug 4101

https://bugs.schedmd.com/attachment.cgi?id=5128

was against the 17.02.7 tarball not any more recent branch in git.  I've been running it for quite a while as regular user.  I find it fixes both 4095 and 4101.
Comment 10 Moe Jette 2017-08-28 14:39:44 MDT
(In reply to Philip Kovacs from comment #9)
> Are you patching the same code as me?.  The patch I posted on bug 4101
> 
> https://bugs.schedmd.com/attachment.cgi?id=5128
> 
> was against the 17.02.7 tarball not any more recent branch in git.  I've
> been running it for quite a while as regular user.  I find it fixes both
> 4095 and 4101.

That bug is assigned to someone else. I'll probably leave it in Tim's hands, but I did alert him to this update.