Ticket 3158 - Question about using srun to start a application daemon
Summary: Question about using srun to start a application daemon
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands (show other tickets)
Version: 16.05.4
Hardware: Linux Linux
: --- 4 - Minor Issue
Assignee: Tim Wickberg
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2016-10-11 13:36 MDT by Brian Haymore
Modified: 2016-10-17 16:55 MDT (History)
0 users

See Also:
Site: University of Utah
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments

Note You need to log in before you can comment on or make changes to this ticket.
Description Brian Haymore 2016-10-11 13:36:10 MDT
So we are in process of working up our new cluster OS images for Centos 7.  In this step we have been asking ourselves about NOT enabling ssh trust between nodes and instead try to rely upon slurm/munge via srun for all internode access.  A key driver in this interest is keeping all processes under slurm/cgroup control so that we do not have any escape.  We are aware of BYU's PAM contrib module that does a best effort to capture remote ssh processes and apply them to a cgroup (where a best guess is applied) however we are "seeing" if we can survive with just srun as the tool.

Today in a test setup we were running through some commercial applications and ran into an issue with one that we think we are going to run into with others.  That is that these applications were not built for the usual 'mpirun' style start up.  Instead they depend on a daemon being fired up on all allocated nodes and then you run the application on top of that.

So in our previuos/current state where ssh is also allowed between nodes they have simply run a for loop through the node list running something like 'ssh nodeXYZZ .../ansoftrmservice start &' and this issues a startup that daemonizes things on each node.  Then after that for loop finishes they issues the job startup command.

So we tried to use srun in place of ssh to start up the above command with something like this 'srun -n1 -w nodeXYZZ .../ansoftrmservice start' and this looks like it's going the right direction but as soon as the above startup script finishes and allows the srun to exit and collapse back it seems to clean up any process that was running.  So in short daemonizing things doesn't seem to work with srun.

So what advice do you all have on how we might be able to use srun to startup and leave processes on compute nodes via 'init' style startup scripts (such as the example above) or would you steer me that I can't get away from ssh trust?  Thanks!
Comment 1 Tim Wickberg 2016-10-11 13:56:31 MDT
(In reply to Brian Haymore from comment #0)
> So we are in process of working up our new cluster OS images for Centos 7. 
> In this step we have been asking ourselves about NOT enabling ssh trust
> between nodes and instead try to rely upon slurm/munge via srun for all
> internode access.  A key driver in this interest is keeping all processes
> under slurm/cgroup control so that we do not have any escape.  We are aware
> of BYU's PAM contrib module that does a best effort to capture remote ssh
> processes and apply them to a cgroup (where a best guess is applied) however
> we are "seeing" if we can survive with just srun as the tool.

It does much better than "best-effort" within the cluster. When connections originate from other nodes under Slurm's control it will always attach it to the correct job. This is handled through a new "callerid" API that the nodes use to check which process under which cgroup/jobid on the originating side initiated the SSH connection, and will then attaches the job to that cgroup on the destination. So SSH between nodes is always handled perfectly.

The "guess" bit only comes in for connections from nodes that aren't under Slurm's control, e.g. login nodes, in which case there are configurable ways to handle that. See the "action_unknown" option to the pam module:

https://github.com/SchedMD/slurm/blob/master/contribs/pam_slurm_adopt/README

I'd also strongly encourage you to move to 16.05.5 when running CentoOS 7 and cgroups; that release removes the requirement for the ReleaseAgent which can conflict with systemd's own cgroup mount options.

> Today in a test setup we were running through some commercial applications
> and ran into an issue with one that we think we are going to run into with
> others.  That is that these applications were not built for the usual
> 'mpirun' style start up.  Instead they depend on a daemon being fired up on
> all allocated nodes and then you run the application on top of that.
> 
> So in our previuos/current state where ssh is also allowed between nodes
> they have simply run a for loop through the node list running something like
> 'ssh nodeXYZZ .../ansoftrmservice start &' and this issues a startup that
> daemonizes things on each node.  Then after that for loop finishes they
> issues the job startup command.
> 
> So we tried to use srun in place of ssh to start up the above command with
> something like this 'srun -n1 -w nodeXYZZ .../ansoftrmservice start' and
> this looks like it's going the right direction but as soon as the above
> startup script finishes and allows the srun to exit and collapse back it
> seems to clean up any process that was running.  So in short daemonizing
> things doesn't seem to work with srun.

Correct; the step would need to block on something - preferably the application itself. If there's no way to prevent it from daemonizing, you could wrap that daemon launch in a script that just sleeps indefinitely after launch, but that's not exactly elegant.

> So what advice do you all have on how we might be able to use srun to
> startup and leave processes on compute nodes via 'init' style startup
> scripts (such as the example above) or would you steer me that I can't get
> away from ssh trust?  Thanks!

I think using pam_slurm_adopt is the better option here - it handles this scenario almost perfectly, whereas launching daemonized jobs with srun conflicts with the cleanup enforcement mechanisms build in to slurmstepd.
Comment 2 Brian Haymore 2016-10-11 17:40:27 MDT
Just a clarification on this point you made:

"I'd also strongly encourage you to move to 16.05.5 when running CentoOS 7 and cgroups; that release removes the requirement for the ReleaseAgent which can conflict with systemd's own cgroup mount options."

Does this mean that after moving to 16.05.5 that I should comment out or remove the release agent lines in my cgroup.conf file for slurm?
Comment 3 Tim Wickberg 2016-10-11 17:47:33 MDT
> Does this mean that after moving to 16.05.5 that I should comment out or remove
> the release agent lines in my cgroup.conf file for slurm?

Yes. Although it won't hurt anything if they're still there.
Comment 4 Brian Haymore 2016-10-13 14:46:16 MDT
Tim, have you all tested/used the pam slurm adopt module with RHEL/Cent7 yet?  I had Ryan on the phone a bit ago and I have things setup as he suggests but my processes that are created from an ssh end up in a systemd cgroup slice for sshd.  Ryan has not yet used the module on a RHEL/Cent7 env so this is new territory for him.
Comment 5 Tim Wickberg 2016-10-13 15:00:44 MDT
(In reply to Brian Haymore from comment #4)
> Tim, have you all tested/used the pam slurm adopt module with RHEL/Cent7
> yet?  I had Ryan on the phone a bit ago and I have things setup as he
> suggests but my processes that are created from an ssh end up in a systemd
> cgroup slice for sshd.  Ryan has not yet used the module on a RHEL/Cent7 env
> so this is new territory for him.

I have not tested it under RHEL7, but will look into it further. If you're testing this at the moment, I believe if you launch sshd directly (not from a service file through systemd) things will start working as expected. (Or for testing you may want to launch a second ssh server on an alternate port, as I'm guessing you need to SSH into the node to manage it.)

I believe there's a way to tell systemd to not contain certain services through cgroups, although I haven't found that in the documentation yet.
Comment 6 Brian Haymore 2016-10-13 17:39:31 MDT
So I just fired up sshd on port 2022 to test this and I"m still just landing in the sshd systemd cgroup setup.  See below:

[u0104663@sp004 ~]$ ps auxwww |grep ^u0
u0104663 23719  0.0  0.0 149532  2184 ?        S    17:06   0:00 sshd: u0104663@pts/1
u0104663 23720  0.1  0.0 115056  3656 pts/1    Ss   17:06   0:00 -bash
u0104663 23937  0.0  0.0 149124  1768 pts/1    R+   17:08   0:00 ps auxwww
u0104663 23938  0.0  0.0 112656   988 pts/1    S+   17:08   0:00 grep --color=auto ^u0

[u0104663@sp004 ~]$ cat /proc/23720/cgroup 
10:blkio:/
9:hugetlb:/
8:net_cls:/
7:cpuset:/
6:perf_event:/
5:devices:/
4:memory:/
3:cpuacct,cpu:/
2:freezer:/
1:name=systemd:/user.slice/user-174091.slice/session-3540.scope


From the system logs though I'm seeing:

Oct 13 17:06:48 sp004 sshd[23696]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.242.10.3  user=u0104663
Oct 13 17:06:48 sp004 sshd[23696]: pam_krb5[23696]: TGT verified
Oct 13 17:06:48 sp004 sshd[23696]: pam_krb5[23696]: authentication succeeds for 'u0104663' (u0104663@AD.UTAH.EDU)
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: debug:  Reading cgroup.conf file /uufs/scrubpeak.peaks/sys/var/slurm/etc/cgroup.conf
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: debug:  Reading slurm.conf file: /uufs/scrubpeak.peaks/sys/var/slurm/etc/slurm.conf
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: debug4: found jobid = 32679, stepid = 4294967295
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: debug3: Trying to load plugin /uufs/scrubpeak.peaks/sys//pkg/slurm/16.05.5/lib/slurm/auth_munge.so
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: debug:  auth plugin for Munge (http://code.google.com/p/munge/) loaded
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: debug3: Success.
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: Connection by user u0104663: user has only one job 32679
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: debug:  _adopt_process: trying to get 32679.4294967295 to adopt 23696
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: debug:  Leaving stepd_add_extern_pid
Oct 13 17:06:48 sp004 pam_slurm_adopt[23696]: Process 23696 adopted into job 32679
Oct 13 17:06:48 sp004 sshd[23694]: Accepted keyboard-interactive/pam for u0104663 from 10.242.10.3 port 59784 ssh2
Oct 13 17:06:48 sp004 sshd[23694]: pam_unix(sshd:session): session opened for user u0104663 by (uid=0)


One thing I've noticed is that the logs show a PID, in this case 23696, but that PID is never present when I do ps so I'm left thinking it's from some intermediate process during the login process.

Anyway I'm just passing this along that simply running sshd by hand on a different port isn't enough.
Comment 7 Tim Wickberg 2016-10-13 18:03:12 MDT
Ah - thanks for testing. I think I had the fix backwards, although still haven't confirmed this locally yet.

If you remove the pam_systemd.so entries in /etc/pam.d/ I think that'll avoid this issue. Otherwise the user login process is getting relocated under systemd's hierarchy as you're seeing. I'll continue to test further, and it looks like we should definitely add some notes for RHEL7 users to the README.
Comment 8 Brian Haymore 2016-10-13 18:23:42 MDT
Ryan had me do that so all of my tests today have been with the systemd pan entries commented out.

--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C

On Oct 13, 2016 6:04 PM, bugs@schedmd.com wrote:

Comment # 7<https://bugs.schedmd.com/show_bug.cgi?id=3158#c7> on bug 3158<https://bugs.schedmd.com/show_bug.cgi?id=3158> from Tim Wickberg<mailto:tim@schedmd.com>

Ah - thanks for testing. I think I had the fix backwards, although still
haven't confirmed this locally yet.

If you remove the pam_systemd.so entries in /etc/pam.d/ I think that'll avoid
this issue. Otherwise the user login process is getting relocated under
systemd's hierarchy as you're seeing. I'll continue to test further, and it
looks like we should definitely add some notes for RHEL7 users to the README.

________________________________
You are receiving this mail because:

  *   You reported the bug.
Comment 9 Tim Wickberg 2016-10-14 13:42:40 MDT
So... I'm not sure entirely what's happening on your system.

My setup is a little bit different - Debian strech versus RHEL7 - but by removing the pam_systemd module entirely things are working as they should.

I've enabled pam_slurm_adopt by adding a single line in to /etc/pam.d/sshd:

account    required pam_slurm_adopt.so

That's immediately following an "@include common-account" block.

tim@zoidberg:~$ cat /proc/14126/cgroup 
9:perf_event:/
8:net_cls,net_prio:/
7:devices:/slurm_node001/uid_1000/job_16686/step_extern
6:cpuset:/slurm_node001/uid_1000/job_16686/step_extern
5:blkio:/system.slice/ssh.service
4:pids:/system.slice/ssh.service
3:cpu,cpuacct:/system.slice/ssh.service
2:freezer:/slurm_node001/uid_1000/job_16686/step_extern
1:name=systemd:/system.slice/ssh.service

So it is still inheriting a few systemd controllers from the ssh process (which I still have launched as a systemd service), although that shouldn't cause any issues - the device, cpuset, and freezer are all being set properly.

I'll note this is with my cgroup.conf having:
CgroupMountPoint="/cgroup"
ConstrainCores=yes
ConstrainDevices=yes
TaskAffinity=yes

Your output in Comment 7 leaves me wondering if you have any Constrain settings enabled - you'd need them for the devices/cpuset controllers to kick in? And you you have ProctrackType=proctrack/cgroup? That's required for the freezer controller to be set.

If you're still seeing issues I'll see if I can get a CentOS7 box setup shortly. And if you could attach the current slurm.conf, cgroup.conf, and contents of /etc/pam.d/ that'd help as well.

And thank you for you patience on this - when pam_slurm_adopt is working properly it's a really nice feature, but the integration varies a bit between systems and isn't well documented. I will take some notes from this issue and either expand the current README, or start writing up a new documentation page covering the installation and setup.

- Tim
Comment 10 Brian Haymore 2016-10-14 13:45:46 MDT
I do have constrain on.  Let me kidn of start over again and do a better job of documenting things so I can share them back with you.  Thanks for digging into this with me. :)

--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C
________________________________
From: bugs@schedmd.com [bugs@schedmd.com]
Sent: Friday, October 14, 2016 1:42 PM
To: Brian Haymore
Subject: [Bug 3158] Question about using srun to start a application daemon


Comment # 9<redir.aspx?REF=zCmb_nZFpV91s_LX94G5qIo-gsNu5wdRBhh1DHrgO8Fgf0WOavTTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTgjYzk.> on bug 3158<redir.aspx?REF=jNk9aQTe887f5SBqvikRQD0Bu2xOuZmrWuH_eLqwrhhgf0WOavTTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTg.> from Tim Wickberg<redir.aspx?REF=n7YLRfL03m9mcFgcug72dIjk-V1u29rrdn9-TrHqwR2GpUWOavTTCAFtYWlsdG86dGltQHNjaGVkbWQuY29t>

So... I'm not sure entirely what's happening on your system.

My setup is a little bit different - Debian strech versus RHEL7 - but by
removing the pam_systemd module entirely things are working as they should.

I've enabled pam_slurm_adopt by adding a single line in to /etc/pam.d/sshd:

account    required pam_slurm_adopt.so

That's immediately following an "@include common-account" block.

tim@zoidberg:~$ cat /proc/14126/cgroup
9:perf_event:/
8:net_cls,net_prio:/
7:devices:/slurm_node001/uid_1000/job_16686/step_extern
6:cpuset:/slurm_node001/uid_1000/job_16686/step_extern
5:blkio:/system.slice/ssh.service
4:pids:/system.slice/ssh.service
3:cpu,cpuacct:/system.slice/ssh.service
2:freezer:/slurm_node001/uid_1000/job_16686/step_extern
1:name=systemd:/system.slice/ssh.service

So it is still inheriting a few systemd controllers from the ssh process (which
I still have launched as a systemd service), although that shouldn't cause any
issues - the device, cpuset, and freezer are all being set properly.

I'll note this is with my cgroup.conf having:
CgroupMountPoint="/cgroup"
ConstrainCores=yes
ConstrainDevices=yes
TaskAffinity=yes

Your output in Comment 7<redir.aspx?REF=8v7P58cV1_jCo62tLQ_rrqcNEaj74zNkZlNO1mgyh1iGpUWOavTTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTgjYzc.> leaves me wondering if you have any Constrain settings
enabled - you'd need them for the devices/cpuset controllers to kick in? And
you you have ProctrackType=proctrack/cgroup? That's required for the freezer
controller to be set.

If you're still seeing issues I'll see if I can get a CentOS7 box setup
shortly. And if you could attach the current slurm.conf, cgroup.conf, and
contents of /etc/pam.d/ that'd help as well.

And thank you for you patience on this - when pam_slurm_adopt is working
properly it's a really nice feature, but the integration varies a bit between
systems and isn't well documented. I will take some notes from this issue and
either expand the current README, or start writing up a new documentation page
covering the installation and setup.

- Tim

________________________________
You are receiving this mail because:

  *   You reported the bug.
Comment 11 Brian Haymore 2016-10-14 14:26:58 MDT
OK Some progress but new questions.

Since I had not setup ssh host based trust between nodes before reaching out to you all about this I may have put the wrong foot forward first.  Yesterday's work I did was without host based trust between nodes.  Meaning I had to type in my password to ssh between nodes.

Today I had setup the ssh trust between nodes before I got your email and set out to try again.  Now things are working with an exception.

If I start a job on a node or nodes I can then from any of the ssh host trust node ssh around and when I ssh into a node with a job I am inherited and I work.  When I ssh to a node without a job I'm given the error note that I have no job and so the adopt plugin rejects me.  Though I have it fall back to pam_access per our setup and I'm still able to log in via the host trust as I'm in the admin group allowed in.

Now up to this point things look great, but when I ssh in from a box that is not part of the ssh host trust, ie it requies me to put in my password I fail to get adopted right...  Now this is where thinking about PAM a bit we use kerberose and that would at least add in needing to pass through pam_krb5.so.  I'm not sure that's the issue, but just reaching for differences.

Thoughts on this front?

--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C
________________________________
From: bugs@schedmd.com [bugs@schedmd.com]
Sent: Friday, October 14, 2016 1:42 PM
To: Brian Haymore
Subject: [Bug 3158] Question about using srun to start a application daemon


Comment # 9<redir.aspx?REF=EvKHoKL4XDtSZjsQg_ENpz5PYW7fxqrQ4D4rgAs-4mIUTjBjb_TTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTgjYzk.> on bug 3158<redir.aspx?REF=hKATC7DnkswU124atLyOYdJKUgMDCZf2GH3kfnL0KYkUTjBjb_TTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTg.> from Tim Wickberg<redir.aspx?REF=uHmv_hSAY9bk5ThwhyJV2wWjqfSOxpFT6J9CjaSAWMQ6dDBjb_TTCAFtYWlsdG86dGltQHNjaGVkbWQuY29t>

So... I'm not sure entirely what's happening on your system.

My setup is a little bit different - Debian strech versus RHEL7 - but by
removing the pam_systemd module entirely things are working as they should.

I've enabled pam_slurm_adopt by adding a single line in to /etc/pam.d/sshd:

account    required pam_slurm_adopt.so

That's immediately following an "@include common-account" block.

tim@zoidberg:~$ cat /proc/14126/cgroup
9:perf_event:/
8:net_cls,net_prio:/
7:devices:/slurm_node001/uid_1000/job_16686/step_extern
6:cpuset:/slurm_node001/uid_1000/job_16686/step_extern
5:blkio:/system.slice/ssh.service
4:pids:/system.slice/ssh.service
3:cpu,cpuacct:/system.slice/ssh.service
2:freezer:/slurm_node001/uid_1000/job_16686/step_extern
1:name=systemd:/system.slice/ssh.service

So it is still inheriting a few systemd controllers from the ssh process (which
I still have launched as a systemd service), although that shouldn't cause any
issues - the device, cpuset, and freezer are all being set properly.

I'll note this is with my cgroup.conf having:
CgroupMountPoint="/cgroup"
ConstrainCores=yes
ConstrainDevices=yes
TaskAffinity=yes

Your output in Comment 7<redir.aspx?REF=GcAQ97ySD6hN7MdBXYTjz52NwJ8sIst8GJR1OeBNAS86dDBjb_TTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTgjYzc.> leaves me wondering if you have any Constrain settings
enabled - you'd need them for the devices/cpuset controllers to kick in? And
you you have ProctrackType=proctrack/cgroup? That's required for the freezer
controller to be set.

If you're still seeing issues I'll see if I can get a CentOS7 box setup
shortly. And if you could attach the current slurm.conf, cgroup.conf, and
contents of /etc/pam.d/ that'd help as well.

And thank you for you patience on this - when pam_slurm_adopt is working
properly it's a really nice feature, but the integration varies a bit between
systems and isn't well documented. I will take some notes from this issue and
either expand the current README, or start writing up a new documentation page
covering the installation and setup.

- Tim

________________________________
You are receiving this mail because:

  *   You reported the bug.
Comment 12 Tim Wickberg 2016-10-14 14:36:09 MDT
(In reply to Brian Haymore from comment #11)
> OK Some progress but new questions.
> 
> Since I had not setup ssh host based trust between nodes before reaching out
> to you all about this I may have put the wrong foot forward first. 
> Yesterday's work I did was without host based trust between nodes.  Meaning
> I had to type in my password to ssh between nodes.
> 
> Today I had setup the ssh trust between nodes before I got your email and
> set out to try again.  Now things are working with an exception.
> 
> If I start a job on a node or nodes I can then from any of the ssh host
> trust node ssh around and when I ssh into a node with a job I am inherited
> and I work.  When I ssh to a node without a job I'm given the error note
> that I have no job and so the adopt plugin rejects me.  Though I have it
> fall back to pam_access per our setup and I'm still able to log in via the
> host trust as I'm in the admin group allowed in.
> 
> Now up to this point things look great, but when I ssh in from a box that is
> not part of the ssh host trust, ie it requies me to put in my password I
> fail to get adopted right...  Now this is where thinking about PAM a bit we
> use kerberose and that would at least add in needing to pass through
> pam_krb5.so.  I'm not sure that's the issue, but just reaching for
> differences.
> 
> Thoughts on this front?

That sounds quite promising... almost there I think. I can say the adoption from outside is what I've been testing mostly at this point, and I have SSH pre-shared keys in place any rely on that for authentication. I don't have pam_krb5 setup, and it sounds like that may be a factor here - I'm not sure what the correct order of the different account lines in the pam config files should be for you, but I can take a rough guess as to what's happening:

If pam_krb5 is setup as 'sufficient', and is tested before pam_slurm_adopt I think what you've described could happen - the 'sufficient' returns immediately which would preclude pam_slurm_adopt from having a chance to adopt the process.
Comment 13 Brian Haymore 2016-10-14 14:53:01 MDT
Not sure if I sent it but this is the /etc/pam.d/password-auth-ac file where I put the entry in.  This differs from where you suggests in sshd, but I also tried in /etc/pam.d/sshd.  As you noted the krb5 is listed earlier in the password-auth-ac and is sufficient. Let me illustrate below:

/etc/pam.d/password-auth-ac (showing my line as it is right now at the top of the ACCOUNT list)
#%PAM-1.0
# This file is auto-generated.
# User changes will be destroyed the next time authconfig is run.
auth        required      pam_env.so
auth        sufficient    pam_unix.so nullok try_first_pass
auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
auth        sufficient    pam_krb5.so use_first_pass
auth        required      pam_deny.so

account     sufficient    pam_slurm_adopt.so log_level=debug5 action_unknown=newest
account     sufficient    pam_access.so
#DISABLED#account     required      pam_slurm.so
account     required      pam_unix.so broken_shadow
account     sufficient    pam_localuser.so
account     sufficient    pam_succeed_if.so uid < 1000 quiet
account     [default=bad success=ok user_unknown=ignore] pam_krb5.so
account     required      pam_permit.so

password    requisite     pam_pwquality.so try_first_pass local_users_only retry=3 authtok_type=
password    sufficient    pam_unix.so sha512 shadow nis nullok try_first_pass use_authtok
password    sufficient    pam_krb5.so use_authtok
password    required      pam_deny.so

session     optional      pam_keyinit.so revoke
session     required      pam_limits.so
#DISABLED#-session     optional      pam_systemd.so
session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
session     required      pam_unix.so
session     optional      pam_krb5.so



/etc/pam.d/sshd (again as the file is now, though I did remove it for a test from the file above and insert the line at the top of the ACCOUNT section in this file with no change to behavior)
#%PAM-1.0
auth       required     pam_sepermit.so
auth       substack     password-auth
auth       include      postlogin
# Used with polkit to reauthorize users in remote sessions
-auth      optional     pam_reauthorize.so prepare
account    required     pam_nologin.so
account    include      password-auth
password   include      password-auth
# pam_selinux.so close should be the first session rule
session    required     pam_selinux.so close
session    required     pam_loginuid.so
# pam_selinux.so open should only be followed by sessions to be executed in the user context
session    required     pam_selinux.so open env_params
session    required     pam_namespace.so
session    optional     pam_keyinit.so force revoke
session    include      password-auth
session    include      postlogin
# Used with polkit to reauthorize users in remote sessions
-session   optional     pam_reauthorize.so prepare


Does this all steer you in a direction?  Can I share anything more that would help?  Thanks again!

--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C
________________________________
From: bugs@schedmd.com [bugs@schedmd.com]
Sent: Friday, October 14, 2016 2:36 PM
To: Brian Haymore
Subject: [Bug 3158] Question about using srun to start a application daemon


Comment # 12<redir.aspx?REF=yGmeVSOO5KW7c8oBNEqzxvVh7wixBubwUluFV90_PvwOggH-cvTTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTgjYzEy> on bug 3158<redir.aspx?REF=RNif574ZS815Tl8W_oKwdbNmRSYR5BblSLU0vFTSW8s0qAH-cvTTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTg.> from Tim Wickberg<redir.aspx?REF=DvqauKNWQ4E3ejeUC4_vGQPr_0uDWV-5OUkkIS9vXcE0qAH-cvTTCAFtYWlsdG86dGltQHNjaGVkbWQuY29t>

(In reply to Brian Haymore from comment #11<redir.aspx?REF=QvnDTGSxzU8h_X5OkgcgX3iDPHvy9aEeY9nwJxExkjY0qAH-cvTTCAFodHRwczovL2J1Z3Muc2NoZWRtZC5jb20vc2hvd19idWcuY2dpP2lkPTMxNTgjYzEx>)
> OK Some progress but new questions.
>
> Since I had not setup ssh host based trust between nodes before reaching out
> to you all about this I may have put the wrong foot forward first.
> Yesterday's work I did was without host based trust between nodes.  Meaning
> I had to type in my password to ssh between nodes.
>
> Today I had setup the ssh trust between nodes before I got your email and
> set out to try again.  Now things are working with an exception.
>
> If I start a job on a node or nodes I can then from any of the ssh host
> trust node ssh around and when I ssh into a node with a job I am inherited
> and I work.  When I ssh to a node without a job I'm given the error note
> that I have no job and so the adopt plugin rejects me.  Though I have it
> fall back to pam_access per our setup and I'm still able to log in via the
> host trust as I'm in the admin group allowed in.
>
> Now up to this point things look great, but when I ssh in from a box that is
> not part of the ssh host trust, ie it requies me to put in my password I
> fail to get adopted right...  Now this is where thinking about PAM a bit we
> use kerberose and that would at least add in needing to pass through
> pam_krb5.so.  I'm not sure that's the issue, but just reaching for
> differences.
>
> Thoughts on this front?

That sounds quite promising... almost there I think. I can say the adoption
from outside is what I've been testing mostly at this point, and I have SSH
pre-shared keys in place any rely on that for authentication. I don't have
pam_krb5 setup, and it sounds like that may be a factor here - I'm not sure
what the correct order of the different account lines in the pam config files
should be for you, but I can take a rough guess as to what's happening:

If pam_krb5 is setup as 'sufficient', and is tested before pam_slurm_adopt I
think what you've described could happen - the 'sufficient' returns immediately
which would preclude pam_slurm_adopt from having a chance to adopt the process.

________________________________
You are receiving this mail because:

  *   You reported the bug.
Comment 14 Tim Wickberg 2016-10-14 15:09:50 MDT
(In reply to Brian Haymore from comment #13)
> Not sure if I sent it but this is the /etc/pam.d/password-auth-ac file where
> I put the entry in.  This differs from where you suggests in sshd, but I
> also tried in /etc/pam.d/sshd.  As you noted the krb5 is listed earlier in
> the password-auth-ac and is sufficient. Let me illustrate below:
>

I'm assuming password-auth is a symlink to password-auth-ac?

I'm trimming this down to just the 'account' sections as they're the only ones relevant here. (Each of account / auth / password / session have their own decision trees.)

password-auth-ac:
> account     sufficient    pam_slurm_adopt.so log_level=debug5
> action_unknown=newest
> account     sufficient    pam_access.so
> #DISABLED#account     required      pam_slurm.so
> account     required      pam_unix.so broken_shadow
> account     sufficient    pam_localuser.so
> account     sufficient    pam_succeed_if.so uid < 1000 quiet
> account     [default=bad success=ok user_unknown=ignore] pam_krb5.so
> account     required      pam_permit.so

sshd:

> account    required     pam_nologin.so
> account    include      password-auth

So, this combines into (trimming some options for clarity):

> account    required     pam_nologin.so
> account     sufficient    pam_slurm_adopt.so
> account     sufficient    pam_access.so
> account     required      pam_unix.so broken_shadow
> account     sufficient    pam_localuser.so
> account     sufficient    pam_succeed_if.so uid < 1000 quiet
> account     [default=bad success=ok user_unknown=ignore] pam_krb5.so
> account     required      pam_permit.so

I believe this is allowing krb5 to grant access to accounts even if pam_slurm_adopt.so fails. (I'm reading through pam.conf(5) trying to wrap my head around this at the moment...)
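
For what it's worth, the short-circuit behavior of 'sufficient' versus 'required' can be sketched with a toy model. This is a deliberate simplification of the real Linux-PAM evaluation (the bracketed pam_krb5 control is approximated as 'required', and the pass/fail results are invented for illustration):

```python
# Toy model of how an "account" stack is evaluated. Simplified from
# the real Linux-PAM logic: 'sufficient' grants access immediately if
# the module passes (and no earlier 'required' module has failed);
# 'required' records a failure but keeps evaluating. The bracketed
# pam_krb5 control [default=bad success=ok ...] is approximated here
# as 'required'.

def eval_account_stack(stack, results):
    """stack: list of (control, module); results: module -> passed?"""
    required_failed = False
    for control, module in stack:
        ok = results.get(module, False)
        if control == "sufficient" and ok and not required_failed:
            return "granted"        # short-circuits the rest of the stack
        if control == "required" and not ok:
            required_failed = True  # remembered, but evaluation continues
    return "denied" if required_failed else "granted"

# The combined stack from this ticket, options trimmed:
stack = [
    ("sufficient", "pam_slurm_adopt"),
    ("sufficient", "pam_access"),
    ("required",   "pam_unix"),
    ("sufficient", "pam_localuser"),
    ("sufficient", "pam_succeed_if"),
    ("required",   "pam_krb5"),     # approximation, see above
    ("required",   "pam_permit"),
]

# pam_slurm_adopt fails (nothing to adopt into), yet the login is
# still granted because pam_unix/pam_krb5/pam_permit all pass:
print(eval_account_stack(stack, {
    "pam_slurm_adopt": False,
    "pam_unix": True,
    "pam_krb5": True,
    "pam_permit": True,
}))  # -> granted
```

In this model, access is denied only when a 'required' module fails, which is why a failing 'sufficient' pam_slurm_adopt on its own never blocks the login.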

So those unconstrained logins should have only occurred during your testing when there was no active job on the node. 

Based on my loose understanding of the behavior here, I think you can drop the "account ... pam_krb5" line entirely. In this context it's only used to verify that a user account exists on the node, and if you only want users to be able to log in when they have an active job, the "sufficient pam_slurm_adopt" line already covers that case.
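
Concretely, the trimmed-down 'account' section might look something like this (a sketch based on the stack quoted above, not a tested configuration; adjust module options to your site):

```
account     sufficient    pam_slurm_adopt.so log_level=debug5 action_unknown=newest
account     sufficient    pam_access.so
account     required      pam_unix.so broken_shadow
account     sufficient    pam_localuser.so
account     sufficient    pam_succeed_if.so uid < 1000 quiet
account     required      pam_permit.so
```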

Note that this shouldn't affect how they authenticate to the node - krb5 should still be able to handle that through the 'auth' sections.
Comment 15 Brian Haymore 2016-10-14 16:57:58 MDT
Yes password-auth is a link to password-auth-ac.

The unconstrained logins were to nodes I had a job on.  The point was more that if I came from somewhere that had ssh trust, I would end up within the cgroup; if I had to authenticate with my password (e.g. from a login node) to nodes allocated as part of a running job, then I was not in the cgroup.

Comment 16 Brian Haymore 2016-10-17 16:55:24 MDT
Thanks for the response again on this one.  We have moved forward with ssh trust again, using pam_slurm_adopt.  We have a few quirks with that, but there is a separate ticket for those, so I'm marking this one closed.  Again, thanks!