Summary: | SLURM JobAcctGatherType=jobacct_gather/linux reflection procedure of sstat command. | ||
---|---|---|---|
Product: | Slurm | Reporter: | toru matsuoka <tmatsuoka> |
Component: | slurmctld | Assignee: | Danny Auble <da> |
Status: | RESOLVED INFOGIVEN | QA Contact: | |
Severity: | 3 - Medium Impact | ||
Priority: | --- | CC: | da |
Version: | 2.6.2 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | CRAY | Alineos Sites: | --- |
Description
toru matsuoka
2014-06-04 19:35:49 MDT
Hello, SLURM Support!

I am waiting for the reply to this inquiry.

Best Regards,
Toru Matsuoka

The JobAcctGatherType determines the plugin to be used to collect resource usage information about user jobs, which are run by the slurmd daemon (on the compute nodes) and the slurmstepd daemon (which starts the application; one is started for each srun command).

While the JobAcctGatherType configuration parameter is not used by the slurmctld daemon, we strongly recommend the same slurm.conf file be used on every node. If the slurmctld and slurmd daemons are running with different slurm.conf files, the slurmctld will report the error:

error: Node <name> appears to have a different slurm.conf than the slurmctld. This could cause issues with communication and functionality. Please review both files and make sure they are the same. If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.

The slurmdbd does not use the slurm.conf file and will not need to be restarted.

You can change the slurm.conf file and restart the daemons in any order without losing any running or pending jobs; however, the sstat program will fail until both the slurmd and slurmstepd are running with the desired JobAcctGatherType.

Slurm version 14.03 works better with JobAcctGatherType=jobacct_gather/none (the sstat command just returns zeros).

Hello, Slurm Support Team!

> The JobAcctGatherType determines the plugin to be used to collect resource
> usage information about user jobs, which are run by the slurmd daemon (on the
> compute nodes) and the slurmstepd daemon (which starts the application, one is
> started for each srun command).

→ Thank you. I understand.

> While the JobAcctGatherType configuration parameter is not used by the
> slurmctld daemon, we strongly recommend the same slurm.conf file be used on
> every node. If the slurmctld and slurmd daemons are running with different
> slurm.conf files, the slurmctld will report the error
> error: Node <name> appears to have a different slurm.conf than the slurmctld.
> This could cause issues with communication and functionality. Please review
> both files and make sure they are the same. If this is expected ignore, and
> set DebugFlags=NO_CONF_HASH in your slurm.conf.

→ The customer can use the same slurm.conf file on every node, so this does not look like a problem.

> The slurmdbd does not use the slurm.conf file and will not need to be restarted.

→ Thank you. I understand.

> You can change the slurm.conf file and restart the daemons in any order
> without losing any running or pending jobs, however the sstat program will
> fail until both the slurmd and slurmstepd are running with the desired
> JobAcctGatherType.

→ In that case, can the JobAcctGatherType parameter be set to JobAcctGatherType=jobacct_gather/linux?
And are both the slurmd and slurmstepd always running?

> Slurm version 14.03 works better with JobAcctGatherType=jobacct_gather/none (the sstat command just returns zeros).

→ We cannot upgrade Slurm from 2.6.2 to 14.03.
With Slurm version 2.6.2, is there any problem concerning the JobAcctGatherType parameter?

Best Regards,
Toru Matsuoka

(In reply to toru matsuoka from comment #3)
> > You can change the slurm.conf file and restart the daemons in any order
> > without losing any running or pending jobs, however the sstat program will
> > fail until both the slurmd and slurmstepd are running with the desired
> > JobAcctGatherType.
>
> → In that case, can the JobAcctGatherType parameter be set to
> JobAcctGatherType=jobacct_gather/linux?

If you want to collect accounting information about jobs on a Linux cluster, then JobAcctGatherType=jobacct_gather/linux must be set.

> And are both the slurmd and slurmstepd always running?

The slurmd should always be running on every compute node. A slurmstepd is running whenever a job step is running on the compute node, with one slurmstepd for each job step.

> > Slurm version 14.03 works better with JobAcctGatherType=jobacct_gather/none (the sstat command just returns zeros).
>
> → We cannot upgrade Slurm from 2.6.2 to 14.03.
> With Slurm version 2.6.2, is there any problem concerning the
> JobAcctGatherType parameter?

The only problem with Slurm version 2.6 is the sstat errors that occur when the JobAcctGatherType value seen by sstat differs from the one the slurmd is running with.

You should at least consider upgrading from version 2.6.2 to 2.6.8. Version 2.6.2 is known to contain about 100 bugs that were fixed in later releases of version 2.6. There will be no loss of jobs or command changes when upgrading, only bug fixes.

> Best Regards,
> Toru Matsuoka

Hello, Slurm Support Team!

I understand the above.

I would like to plan an upgrade from Slurm 2.6.2 to 2.6.8.

If possible, please tell me a simple procedure for upgrading Slurm from 2.6.2 to 2.6.8.

Best Regards,
Toru Matsuoka

(In reply to toru matsuoka from comment #5)
> Hello, Slurm Support Team!
>
> I understand the above.
>
> I would like to plan an upgrade from Slurm 2.6.2 to 2.6.8.
>
> If possible, please tell me a simple procedure for upgrading Slurm from 2.6.2
> to 2.6.8.

Slurm is upgraded the same way as any other Linux package. Just install the new RPMs and restart the daemons. There is some more information here:
http://slurm.schedmd.com/quickstart_admin.html#upgrade

Hello, Slurm Support Team!

Thank you for your support.

I checked the page at this URL and understand the Slurm upgrade procedure.

However, in the customer's situation, upgrading from 2.6.2 to 2.6.8 will be difficult for a while.

I would like to ask for your support for the case where we stay on 2.6.2 and sstat is not used.

Best Regards,
Toru Matsuoka

(In reply to toru matsuoka from comment #7)
> Hello, Slurm Support Team!
>
> Thank you for your support.
>
> I checked the page at this URL and understand the Slurm upgrade procedure.
>
> However, in the customer's situation, upgrading from 2.6.2 to 2.6.8 will be
> difficult for a while.
>
> I would like to ask for your support for the case where we stay on 2.6.2
> and sstat is not used.
>
> Best Regards,
> Toru Matsuoka

Then just change the configuration to collect accounting information. Set:
JobAcctGatherType=jobacct_gather/linux

Please open a new ticket if you need more information.

Understood. Please close this case.
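For readers following this thread, the advice to keep JobAcctGatherType identical in every copy of slurm.conf can be spot-checked with a small script. The following is only a minimal sketch: the helper name `read_param` and the sample file contents are hypothetical, the parser handles just simple `Name=Value` lines with `#` comments, and it is not how slurmctld performs its own configuration check.

```python
def read_param(conf_text, key):
    """Return the last value assigned to `key` in slurm.conf-style text, or None.

    Assumptions (illustrative only): one Name=Value pair per line,
    '#' starts a comment, and parameter names are matched case-insensitively.
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, val = line.split("=", 1)
        if name.strip().lower() == key.lower():
            value = val.strip()  # a later assignment overrides an earlier one
    return value


# Hypothetical contents of two slurm.conf copies (controller and one compute node).
controller_conf = "ClusterName=test\nJobAcctGatherType=jobacct_gather/linux\n"
node_conf = "ClusterName=test\nJobAcctGatherType=jobacct_gather/none\n"

ctl = read_param(controller_conf, "JobAcctGatherType")
node = read_param(node_conf, "JobAcctGatherType")
if ctl != node:
    print(f"Mismatch: controller has {ctl}, node has {node}")
```

In practice the two strings would be read from the actual slurm.conf files on each host. A mismatch like the one in this sample is the situation in which, per the replies above, sstat fails until the daemons are running with the same value.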