Ticket 21688

Summary: scontrol and sinfo disagree on TmpDisk size
Product: Slurm Reporter: griznog <john.hanks>
Component: Configuration Assignee: Benjamin Witham <benjamin.witham>
Status: RESOLVED FIXED QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: benjamin.witham, felip.moll
Version: 24.05.3   
Hardware: Linux   
OS: Linux   
Site: CZ Biohub
Version Fixed: 24.11.1

Description griznog 2024-12-18 11:23:34 MST
When I originally added these nodes, I incorrectly set TmpDisk=1000000, although each node has 10TB for TmpDisk. Today someone noticed and I corrected the node config in slurm.conf to TmpDisk=10000000, but sinfo still reports 1000000:

[I am root!@frankie:(bare):gitolite3@localhost:slurm-config-bruno(master):etc]# scontrol show node cpu-a-1 | grep TmpDisk
   State=IDLE ThreadsPerCore=1 TmpDisk=10000000 Weight=200 Owner=N/A MCS_label=N/A
[I am root!@frankie:(bare):gitolite3@localhost:slurm-config-bruno(master):etc]# sinfo -No "%n %d" | grep cpu-a-1
cpu-a-1 1000000
cpu-a-1 1000000
cpu-a-1 1000000
[I am root!@frankie:(bare):gitolite3@localhost:slurm-config-bruno(master):etc]# sinfo -NO 'NodeList,Disk' | grep cpu-a-1
cpu-a-1             1000000             
cpu-a-1             1000000             
cpu-a-1             1000000             

The Slurm config for these nodes is:

NodeName=cpu-a-1 CPUs=128 Boards=1 SocketsPerBoard=2 CoresPerSocket=64 ThreadsPerCore=1 RealMemory=4096000 TmpDisk=10000000 Feature=cpu,compute,largemem,amd_7h12 Weight=200 State=UNKNOWN
NodeName=cpu-a-2 CPUs=128 Boards=1 SocketsPerBoard=2 CoresPerSocket=64 ThreadsPerCore=1 RealMemory=4096000 TmpDisk=10000000 Feature=cpu,compute,largemem,amd_7h12 Weight=200 State=UNKNOWN


I've tried changing the format options for sinfo, but I always wind up a 0 short of the actual setting. What am I doing wrong here?
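
For anyone wanting to check other nodes for the same mismatch, here is a minimal sketch that compares the two outputs shown above. The helper names are hypothetical, and it only parses text already captured from `scontrol show node <node>` and `sinfo -No "%n %d"`; it does not call Slurm itself.

```python
import re


def tmpdisk_from_scontrol(output: str) -> int:
    """Extract the TmpDisk value from `scontrol show node` output."""
    match = re.search(r"\bTmpDisk=(\d+)", output)
    if match is None:
        raise ValueError("no TmpDisk field found")
    return int(match.group(1))


def tmpdisk_from_sinfo(output: str, node: str) -> set[int]:
    """Collect the distinct disk values that `sinfo -No "%n %d"` lists for a node.

    A node appears once per partition, so the same value may repeat.
    """
    values = set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == node:
            values.add(int(fields[1]))
    return values


# Sample text copied from the transcript in this ticket.
scontrol_out = "   State=IDLE ThreadsPerCore=1 TmpDisk=10000000 Weight=200 Owner=N/A MCS_label=N/A"
sinfo_out = "cpu-a-1 1000000\ncpu-a-1 1000000\ncpu-a-1 1000000"

expected = tmpdisk_from_scontrol(scontrol_out)
reported = tmpdisk_from_sinfo(sinfo_out, "cpu-a-1")
if reported != {expected}:
    print(f"mismatch: scontrol={expected} sinfo={sorted(reported)}")
```
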
Comment 1 griznog 2024-12-18 11:37:07 MST
Looks like I'm late to this party: this is probably a duplicate of 16248 and may have nothing to do with my originally setting the TmpDisk value incorrectly.
Comment 3 Benjamin Witham 2024-12-19 17:18:52 MST
Hello griznog,

I can replicate this and I'm working on it. I'll keep you updated.
Comment 5 Benjamin Witham 2024-12-30 12:06:29 MST
Hello griznog, 

The character buffer that sinfo uses to display TmpDisk values was a little too short. We've increased the buffer size, and the fix is in commit b8c6c9c0, which landed ahead of 24.11.1. I'll be closing this ticket now.
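
To illustrate why a short buffer drops the last digit, here is a rough sketch that mimics how a C-style bounded format call truncates its output. This is not the actual sinfo code: the function name is hypothetical, and the buffer size of 8 is an assumption chosen only because it reproduces the symptom reported above.

```python
def format_into_buffer(value: int, buf_size: int) -> str:
    """Mimic a bounded C format call such as snprintf(buf, buf_size, "%d", value).

    One byte is reserved for the terminating NUL, so at most
    buf_size - 1 characters of the formatted number survive.
    """
    text = str(value)
    return text[: max(buf_size - 1, 0)]


tmp_disk = 10000000  # value from slurm.conf, 8 digits

# Assumed (hypothetical) buffer size of 8: room for only 7 digits.
print(format_into_buffer(tmp_disk, 8))   # 1000000

# A larger buffer keeps every digit.
print(format_into_buffer(tmp_disk, 16))  # 10000000
```

Under that assumption, the original 7-digit value 1000000 would have displayed correctly, which is consistent with the problem only appearing after the config was corrected to 8 digits.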