Summary: | Exploring Hyper-Threading CPU nodes | ||
---|---|---|---|
Product: | Slurm | Reporter: | Damien <damien.leong> |
Component: | Configuration | Assignee: | Marshall Garey <marshall> |
Status: | RESOLVED INFOGIVEN | QA Contact: | |
Severity: | 4 - Minor Issue | ||
Priority: | --- | CC: | brian |
Version: | 16.05.4 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | Monash University | | |
Description
Damien
2017-07-19 01:40:12 MDT
Damien,

You're correct in assuming Slurm will report the number of processors differently depending on whether hyperthreading is turned on or off. To see your actual hardware configuration, use `slurmd -C`. If you disable hyperthreading in the BIOS, Slurm will correctly report only 1 thread per core.

Here's a machine with hyperthreading disabled:

```
marshall@knc:~/slurm/master/byu/sbin$ ./slurmd -C
NodeName=knc CPUs=12 Boards=1 SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=1 RealMemory=23883 TmpDisk=51175
UpTime=90-21:57:37
```

Hyperthreading is enabled on this machine:

```
marshall@smd-server:~$ slurmd -C
NodeName=smd-server CPUs=24 Boards=1 SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=7966 TmpDisk=555165
UpTime=7-02:57:48
```

Here's some of the `lstopo` output on the machine with hyperthreading enabled:

```
Machine (7966MB)
  Package L#0 + L3 L#0 (12MB)
    L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
      PU L#0 (P#0)
      PU L#1 (P#12)
    ...
    L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
      PU L#12 (P#1)
      PU L#13 (P#13)
```

Core 0 has CPUs 0 and 12, Core 6 has CPUs 1 and 13, and so on.

With the following configuration:

```
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
TaskPlugin=task/affinity   # or task/cgroup
```

`srun -n2` will launch 2 tasks packed onto 1 core if both threads are available, but won't allow multiple jobs to share the core:

```
marshall@smd-server:~/byu/slurm/17.02/smd$ srun -n2 cat /proc/self/status | grep -i cpus_allowed_list
Cpus_allowed_list:   0
Cpus_allowed_list:   12
```

To force a task to use a whole core, use `--ntasks-per-core=1`.
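As an aside, the thread-sibling numbering shown by `lstopo` above can be sketched in a few lines of Python. This is a minimal illustration, assuming the common Linux enumeration on 2-way SMT nodes where the second hardware thread of physical core *n* gets logical CPU *n* + (total physical cores); `sibling_pairs` is an illustrative helper, not part of Slurm:

```python
# Sketch: which logical CPUs are hardware-thread siblings on a 2-way SMT
# node, assuming the enumeration seen in the lstopo output above (the
# sibling of logical CPU n, for n < total_cores, is n + total_cores).

def sibling_pairs(sockets: int, cores_per_socket: int):
    """Return (cpu, sibling_cpu) pairs for a 2-way SMT node."""
    total_cores = sockets * cores_per_socket
    return [(cpu, cpu + total_cores) for cpu in range(total_cores)]

# The smd-server node above: 2 sockets x 6 cores x 2 threads = 24 CPUs.
pairs = sibling_pairs(sockets=2, cores_per_socket=6)
print(pairs[0])  # (0, 12) -- the pair on Core L#0 above
print(pairs[1])  # (1, 13) -- the pair on Core L#6 above
```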
```
marshall@smd-server:~/byu/slurm/17.02/smd$ srun -n2 --ntasks-per-core=1 cat /proc/self/status | grep -i cpus_allowed_list
Cpus_allowed_list:   1,13
Cpus_allowed_list:   0,12
```

If you want a task to be bound to only a single CPU on a core, use `--cpu_bind=threads`:

```
marshall@smd-server:~/byu/slurm/17.02/smd$ srun -n2 --ntasks-per-core=1 --cpu_bind=threads cat /proc/self/status | grep -i cpus_allowed_list
Cpus_allowed_list:   1
Cpus_allowed_list:   0
marshall@smd-server:~/byu/slurm/17.02/smd$ scontrol show job 117 | grep NumCPUs
   NumNodes=1 NumCPUs=4 NumTasks=2 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
```

Another option is to use `--hint=nomultithread`, which is functionally equivalent (but it only works with the task/affinity plugin):

```
marshall@smd-server:~/byu/slurm/17.02/smd$ srun -n2 --hint=nomultithread cat /proc/self/status | grep -i cpus_allowed_list
Cpus_allowed_list:   1
Cpus_allowed_list:   0
marshall@smd-server:~/byu/slurm/17.02/smd$ scontrol show job 117 | grep NumCPUs
   NumNodes=1 NumCPUs=4 NumTasks=2 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
```

On hyperthreaded nodes, the allocated CPU count is rounded up to include both threads on each core, as shown above.

To make one partition a higher priority than another, set its PriorityTier in slurm.conf. Jobs in a partition with a higher priority tier will be scheduled before jobs in a partition with a lower priority tier.

I suggest reading our page on CPU management: https://slurm.schedmd.com/cpu_management.html

Does this do what you need?

---

Hi Damien, I'd just like to follow up with you. Were you able to figure out how to do what you wanted? Do you still have questions?

---

Damien, I'm going to close this as resolved/info given. Please let me know if you have any further questions or problems.

Thanks,
Marshall

---

Thanks for the information.

Cheers,
Damien
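For reference, the PriorityTier suggestion above could look like the following slurm.conf fragment. This is a sketch only: the partition names, node list, and tier values are illustrative assumptions, not taken from this ticket.

```
# Illustrative slurm.conf fragment: jobs in "urgent" are scheduled
# before jobs in "batch" because of the higher PriorityTier value.
PartitionName=urgent Nodes=node[01-04] PriorityTier=10
PartitionName=batch  Nodes=node[01-04] PriorityTier=1 Default=YES
```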