Summary: | missing group users has on login node | ||
---|---|---|---|
Product: | Slurm | Reporter: | Xing Huang <x.huang> |
Component: | Other | Assignee: | Tim McMullan <mcmullan> |
Status: | RESOLVED INFOGIVEN | QA Contact: | |
Severity: | 3 - Medium Impact | ||
Priority: | --- | ||
Version: | 21.08.2 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | WA St. Louis | Alineos Sites: | --- |
Atos/Eviden Sites: | --- | Confidential Site: | --- |
Coreweave sites: | --- | Cray Sites: | --- |
DS9 clusters: | --- | HPCnow Sites: | --- |
HPE Sites: | --- | IBM Sites: | --- |
NOAA SIte: | --- | OCF Sites: | --- |
Recursion Pharma Sites: | --- | SFW Sites: | --- |
SNIC sites: | --- | Linux Distro: | --- |
Machine Name: | | CLE Version: | |
Version Fixed: | | Target Release: | --- |
DevPrio: | --- | Emory-Cloud Sites: | --- |
Description
Xing Huang
2022-01-18 14:25:27 MST
Comment 1
Tim McMullan

Hi Xing,

How many groups is this user a member of? Based on those groups, it looks like you are authenticating against a Windows domain; are you using sssd to accomplish this? If so, do you have "Enumerate=yes" set in /etc/sssd/sssd.conf for the domain?

At the moment it looks like we aren't getting the full list of groups internally, but I'm looking for some other options as well.

Thanks!
--Tim

Comment 2
Xing Huang

Hi Tim,

This user is a member of 54 groups, but this is different for each user. Another user is a member of 76 groups so this can be quite large.

We are using SSSD to authenticate against a Windows Active Directory domain. We are not using the Enumerate=yes option; however, everything appears to be working properly outside of SLURM. It's only inside of a SLURM job where the group list appears to be truncated.

Best,
Xing

(In reply to Tim McMullan from comment #1, sent Wednesday, January 19, 2022 7:10 AM)

The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature.
If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail.

Comment 3
Tim McMullan

(In reply to Xing Huang from comment #2)
> Hi Tim,
>
> This user is a member of 54 groups, but this is different for each user.
> Another user is a member of 76 groups so this can be quite large.

Ok, those are somewhat long lists but looking at how we handle this and your configuration, I don't expect this number of groups to be an issue.

> We are using SSSD to authenticate against a Windows Active Directory domain.
> We are not using the
> Enumerate=yes
> option, however, everything appears to be working properly outside of SLURM.
> It's only inside of a SLURM job where the group list appears to be truncated.

The way Slurm and (for example) id handle picking up group lists are different by necessity. There are off-and-on reports of problems when enumeration is disabled, which may or may not apply to you in this case.

What version of SSSD are you currently running? Is enabling enumeration something you could try to see if the issue is resolved with that setting enabled?

Thanks!
--Tim

Comment 4
Xing Huang

Tim,

We tried and it did not fix the issue.

Best,
Xing

(In reply to Tim McMullan from comment #3, sent Thursday, January 20, 2022 12:28 PM)
Comment 5
Tim McMullan

Hi Xing,

Are there any other similarities between the groups that are missing? Are they groups the users were recently added to? Is it the same missing group for multiple people?

Would you also provide an example of how they are starting the interactive session?

Thanks!
--Tim

Comment 6
Xing Huang

Tim,

Below is a comparison of the IDs reported before and after launching the interactive job in Slurm, taken on the same node.
The file with slurm in the name was captured after launching the interactive job that was assigned to a particular node, while the file with ssh in the name was captured after logging in as a normal user over SSH directly to that same node. The example below is shown for two users on our cluster.

[chen.ruiqi@node17 chen.ruiqi]$ diff id_ssh_node17 id_slurm_node17
30a31
> 1220779
37a39
> 1255818
53,54d54
< 1359170(nrg-mirrir-biobank)
< 1359173(nrg-mirrir-ukb-neuro)

[janine.bijsterbosch@node15 ~]$ diff /tmp/id_ssh_node15 /tmp/id_slurm_node15
35a36
> 1220779
50a52,54
> 1255817
> 1255818
> 1255819
56a61
> 1304581
63a69
> 1336310(wuit_eus_9999_user_securew2certificate_targeted)
67,71d72
< 1359170(nrg-mirrir-biobank)
< 1359171(nrg-mirrir-ukb-cardiac)
< 1359172(nrg-mirrir-ukb-genomic)
< 1359173(nrg-mirrir-ukb-neuro)
< 1359174(nrg-mirrir-ukb-pheno)

So it seems like the NRG groups are the ones most likely to be missing from SLURM. These are the very ones that we need to control access to the storage.

I'm first launching an interactive job using srun:

[janine.bijsterbosch@login01 ~]$ srun -N 1 -n 1 --nodelist node15 --mem 100M --time=00:20:00 --pty bash

Then I SSH into the node as the user and compare the results of the id command. In this way we're doing the comparison on the very same node.

We have no way of knowing in what order the user was added to the various groups. It does seem that SLURM is 'masking' out some of the groups though. Just let me know if I can provide anything else to help with the debugging.
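The SSH-versus-Slurm comparison described in this comment can also be done on numeric group IDs alone, which avoids trouble with group names that contain spaces. The following is only a minimal sketch, not a SchedMD-provided tool: `normalize_gids` is a hypothetical helper name, and the node name and `/tmp/gids_*` paths in the commented usage are assumptions. It relies on `id -G`, which prints all group IDs of the calling user on one line.

```shell
#!/bin/sh
# normalize_gids (hypothetical helper): read the space-separated numeric GIDs
# printed by `id -G` on stdin and write them one per line, sorted numerically,
# so that two captures can be compared with diff.
normalize_gids() {
    tr -s ' ' '\n' | sort -n
}

# Example use on a cluster (commented out; node name and paths are assumptions):
#   ssh node15 id -G | normalize_gids > /tmp/gids_ssh_node15
#   srun -N 1 -n 1 --nodelist node15 --mem 100M --time=00:05:00 id -G \
#       | normalize_gids > /tmp/gids_slurm_node15
#   diff /tmp/gids_ssh_node15 /tmp/gids_slurm_node15
```

An empty diff would mean both sessions report the same set of GIDs; any `<` or `>` lines are GIDs seen in only one of the two sessions.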
(In reply to Tim McMullan from comment #5, sent Monday, January 24, 2022 7:32 AM)

Comment 7
Tim McMullan

Thank you for the additional information, I'm doing some more digging to see what might go wrong.

Something I'd like to try as a debugging step is to add "LaunchParameters=disable_send_gids" to your slurm.conf. This should force the groups to come from a more local lookup instead of from the ctld. If this does fix the issue it will narrow down the places where the error is likely being introduced.

Thanks!
--Tim

Comment 8
Xing Huang

Tim,

Thank you. I just did the test for one of the users, chen.ruiqi.

[chen.ruiqi@node15 tmp]$ diff id_ssh_node15_new id_slurm_node15_new

Looks like adding the parameter you suggested fixed the problem.
Best,
Xing

(In reply to Tim McMullan from comment #7, sent Tuesday, January 25, 2022 10:13 AM)

Comment 9
Tim McMullan

(In reply to Xing Huang from comment #8)
> Thank you. I just did the test for one of the users, chen.ruiqi.
> [chen.ruiqi@node15 tmp]$ diff id_ssh_node15_new id_slurm_node15_new
> Looks like add the parameter you suggested fixed the problem.

Thanks for testing that! You can leave that option enabled for now, but I'd still like to track down why it wasn't working before. Having that option specified can generate more load on the domain controller since we do more lookup operations, but as long as you don't see problems you should be OK.
I'll let you know if I need any more information to help track that down!

Thanks again!
--Tim

Comment 10
Tim McMullan

Hi Xing,

I've been looking around for the source of the error and one question has come to mind: is this only happening with interactive sessions? If you run a batch job with the same "id" command, does that also return an incorrect group list?

Thanks!
--Tim

Comment 11
Xing Huang

Tim,

Yes, we saw the issue in both cases.

Best,
Xing

(In reply to Tim McMullan from comment #10, sent Monday, January 31, 2022 8:26 AM)

Comment 12
Tim McMullan

Thanks for the clarification! I'm still looking into this.

--Tim

Comment 13
Xing Huang

Hi Tim,

Any progress on your side on this issue?
Best,
Xing

(In reply to Tim McMullan from comment #12, sent Wednesday, February 2, 2022 7:56 AM)

Comment 14
Tim McMullan

Hi Xing,

I've been digging around for what might cause this, and so far it seems most likely that either the host the slurmctld is running on isn't returning the full list of groups, or somehow the group cache isn't getting flushed properly... which it really should be. I'm not seeing any changes in your config related to it, and by default the cache refreshes every 10 minutes.

To confirm the settings in the controller, can you run "scontrol show config | grep GroupUpdate"?

Would you mind running, as root on the slurmctld node, "groups $user", where $user is one of the users you were seeing missing groups with, and checking whether that group list is complete?

Thanks,
--Tim

Comment 15
Xing Huang

Tim,

This is what I get when checking the config settings on the mgt node.
[root@mgt slurm]# scontrol show config | grep GroupUpdate
GroupUpdateForce = 1
GroupUpdateTime = 600 sec

On the mgt node, using the groups command gets the same result as using the id command (from Active Directory). However, I remember the problem is not on the mgt node, but on the compute node after launching batch jobs or interactive jobs.

[root@mgt slurm]# groups chen.ruiqi
chen.ruiqi : domain users wuit_eus_9999_user_securew2certificate_targeted storage-jdquirk-small_animal_mr_facility-ro wuit_eus_9999_sccm_microsoft_office_mix wuit_eus_9999_netaccess_users_high idm-netaccess-high-clients-hc_cm storage-engineering-licenses-ro wuit_eus_2620_gp_dbbs_shortcut wuit_eus_2620_jss_appexclusion idm-staff-studentworkers-dbbs wuit_eus_2620_files_dbbs_list wuit_eus_9999_printing_access storage-wucci-visiopharm-rw wuit_eus_2620_jss_printers storage-dspencer-shared-ro storage-engineering-bin-ro storage-mcallawa-shared-ro storage-bga-site-locks-rw wuit-si-basicauth-bypass storage-wucci-scratch-rw wuit_eus_2620_printers storage-bga-gmsroot-ro wuitglobal require mfa wustlkey_active_users storage-bga-shared-ro storage-home1-home-ro nrg-mirrir-ukb-neuro students_artsci_pri nrg-mirrir-biobank sharepointauthonly storage-ris-sas-ro ad.adm.wukey.auth danforth_students crm stage access wustlkeystudents crm prod access crm test access compute-shinung storage-shinung wustlkeygroups cc_artsci_vphd univcreditonly wustlkeystaff wuit_eus_9999_sccm_microsoft_expression_encoder la_papercut la_students papercut students spwukey compute staff pwp2 wuit_eus_9999_mdm_users wuit_eus_2620_dbbs_all_users janine_bijsterbosch

[root@mgt slurm]# id chen.ruiqi
uid=2005565(chen.ruiqi) gid=1000070(domain users) groups=1000070(domain users),1336310(wuit_eus_9999_user_securew2certificate_targeted),1208168(storage-jdquirk-small_animal_mr_facility-ro),1022021(wuit_eus_9999_sccm_microsoft_office_mix),1189246(wuit_eus_9999_netaccess_users_high),1259428(idm-netaccess-high-clients-hc_cm),1358928(storage-engineering-licenses-ro),1228112(wuit_eus_2620_gp_dbbs_shortcut),1228113(wuit_eus_2620_jss_appexclusion),1314237(idm-staff-studentworkers-dbbs),1228111(wuit_eus_2620_files_dbbs_list),1021875(wuit_eus_9999_printing_access),1305070(storage-wucci-visiopharm-rw),1228114(wuit_eus_2620_jss_printers),1327201(storage-dspencer-shared-ro),1358962(storage-engineering-bin-ro),1304906(storage-mcallawa-shared-ro),1304616(storage-bga-site-locks-rw),1358910(wuit-si-basicauth-bypass),1305068(storage-wucci-scratch-rw),1228115(wuit_eus_2620_printers),1304231(storage-bga-gmsroot-ro),1000996(wuitglobal require mfa),1004319(wustlkey_active_users),1304619(storage-bga-shared-ro),1254277(storage-home1-home-ro),1255818(nrg-mirrir-ukb-neuro),1000363(students_artsci_pri),1220779(nrg-mirrir-biobank),1182034(sharepointauthonly),1313191(storage-ris-sas-ro),1002932(ad.adm.wukey.auth),1000123(danforth_students),1201356(crm stage access),1181899(wustlkeystudents),1201355(crm prod access),1201357(crm test access),1356700(compute-shinung),1204283(storage-shinung),1182075(wustlkeygroups),1000248(cc_artsci_vphd),1000050(univcreditonly),1181924(wustlkeystaff),1021904(wuit_eus_9999_sccm_microsoft_expression_encoder),1193110(la_papercut),1000030(la_students),1193107(papercut),1000009(students),1004164(spwukey),1208826(compute),1000007(staff),1000083(pwp2),1022213(wuit_eus_9999_mdm_users),1228107(wuit_eus_2620_dbbs_all_users),1012(janine_bijsterbosch)

Best,
Xing

(In reply to Tim McMullan from comment #14, sent Thursday, February 10, 2022 7:27 AM)

Comment 16
Tim McMullan

(In reply to Xing Huang from comment #15)
> Tim,
>
> This is what I get when checking config setting on the mgt node.
> [root@mgt slurm]# scontrol show config | grep GroupUpdate
> GroupUpdateForce = 1
> GroupUpdateTime = 600 sec
>
> On the mgt node, using groups command gets the same result as using id
> command (from active directory). However, I remember the problem is not on
> the mgt node, but on the compute node after launching batch jobs or
> interactive jobs.

Thank you!
Yes, I'm aware that the issue appears on the nodes; however, the ctld is involved in the actual user lookup and sends the gids it thinks the user has along with the job (which is the feature we disabled to handle the problem). I wanted to make sure that the ctld and the compute/front end nodes are all giving us the same group list from the system. If the output on the ctld and the nodes matches, it's more likely that something is happening in the ctld itself.

I'm continuing to look for a source of the problem!

Thanks,
--Tim
Comment 17
Xing Huang

Tim,

I know it will take time to find the best solution for the bug I reported. However, this is preventing our users from using our queuing system and is affecting their research projects. It has been a month since I reported the issue. Is there a temporary solution I can implement until we find the ultimate solution? Meanwhile, is it possible to escalate the severity of the ticket?

Thanks for your time and help!

Best,
Xing

(In reply to Tim McMullan from comment #16, sent Thursday, February 10, 2022 9:42 AM)
Comment 18
Tim McMullan

Hey Xing,

I thought that adding "LaunchParameters=disable_send_gids" had fixed the problem? Are you not running with that now? If you aren't, please do run with it; I had assumed it was left in place since it seemed to fix the issue.

Comment 19
Xing Huang

Tim,

Yes, I did try it and it worked at the time. However, after the test, you asked me to comment out this parameter while you continued to dig for the root cause of the problem. Is this a temporary solution or a permanent fix?

Best,
Xing

(In reply to Tim McMullan from comment #18, sent Monday, February 14, 2022 12:32 PM)
Comment 20
Tim McMullan

(In reply to Xing Huang from comment #19)
> Tim,
>
> Yes, I did try it and it once worked. However, after the test, you asked me
> to comment out this parameter and you would continue dig out the root cause
> for the problem.
> Is this a temporary solution or a permanent fix?

I'm so sorry I wasn't clear on this! My intentions on this are as follows:

If running with "LaunchParameters=disable_send_gids" is working, I'm happy for you to be running with that option.

I would like to understand why that option is necessary in your environment. It doesn't seem like it should be, but apparently it is... however, I don't want you to be in a broken state until we figure that out. As you say, it can take some time.

If you are able to work with me for a while on why it's required, I'd certainly appreciate it. I might ask you to disable it and run with a debugging patch or try some other settings, then re-enable it if things don't work. If you have a test system that exhibits the same behavior, that's much better for testing when I can't reproduce the issue myself.

Thanks!
--Tim

Comment 21
Tim McMullan

I just wanted to reach out and see if you have re-added "LaunchParameters=disable_send_gids" and if it was still working for you.

Thanks,
--Tim

Comment 22
Xing Huang

Tim,

Thanks for reaching out to me! We're good now with this option added. If you want to close the ticket, please go ahead. Again, thanks a lot for your help.
Best,
Xing

(In reply to Tim McMullan from comment #21, sent Thursday, February 17, 2022 7:59 AM)

Comment 23
Tim McMullan

Thank you Xing! I'm glad to hear that everything is working with that option in place.

I'll resolve this now, thanks again!
--Tim
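For reference, the workaround adopted in this ticket is the single slurm.conf setting "LaunchParameters=disable_send_gids". A minimal sketch for confirming that a running controller reports the option is shown here. `has_disable_send_gids` is a hypothetical helper name, and the sketch assumes that `scontrol show config` prints LaunchParameters in the same `Key = value` layout as the GroupUpdate lines quoted in comment 15.

```shell
#!/bin/sh
# has_disable_send_gids (hypothetical helper): read `scontrol show config`
# output on stdin and succeed only if the LaunchParameters line lists
# disable_send_gids (possibly alongside other comma-separated values).
has_disable_send_gids() {
    grep -Eiq '^[[:space:]]*LaunchParameters[[:space:]]*=.*disable_send_gids'
}

# Example use on the controller (commented out):
#   if scontrol show config | has_disable_send_gids; then
#       echo "disable_send_gids is active"
#   else
#       echo "disable_send_gids is not set"
#   fi
```

As noted in comment 9, running with this option can put more load on the domain controller because more lookups are performed, so it may be worth watching for that after enabling it.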