Ticket 7332 - sinfo bug for memory for multi terabytes
Summary: sinfo bug for memory for multi terabytes
Status: RESOLVED INVALID
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 18.08.7
Hardware: Linux Linux
: 6 - No support contract
Assignee: Jacob Jenson
Reported: 2019-07-01 15:19 MDT by Dave Turner
Modified: 2019-07-01 15:19 MDT

Site: -Other-


Description Dave Turner 2019-07-01 15:19:24 MDT
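On Bridges at PSC there are 4 compute nodes with 12 TB of memory each.  scontrol correctly shows RealMemory=12385927 (in MB) as defined in slurm.conf, but sinfo drops the last digit of the memory field and prints 1238592, as shown below. The truncation follows the memory field regardless of its position in the --Format list.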

Bridges scontrol show nodes xl001
NodeName=xl001 Arch=x86_64 CoresPerSocket=18
   CPUAlloc=255 CPUTot=288 CPULoad=7.98
   AvailableFeatures=EGRESS,PERF,ESM,E7-8880,E7-8880v3,PH1
   ActiveFeatures=EGRESS,PERF,ESM,E7-8880,E7-8880v3,PH1
   Gres=(null)
   NodeAddr=xl001 NodeHostName=xl001 Version=18.08
   OS=Linux 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018
   RealMemory=12385927 AllocMem=12288000 FreeMem=12199341 Sockets=16 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1000 Owner=N/A MCS_label=N/A
   Partitions=LM,XLM
   BootTime=2019-06-06T05:48:28 SlurmdStartTime=2019-06-06T05:37:06
   CfgTRES=cpu=288,mem=12385927M,billing/gpu=288
   AllocTRES=cpu=255,mem=12000G
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


Bridges sinfo -a --Format=nodehost,freemem,allocmem,memory | grep xl001
xl001               12199302            12288000            1238592
Bridges sinfo -a --Format=nodehost,freemem,memory,allocmem | grep xl001
xl001               12199302            1238592             12288000
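
For anyone who wants to check other large-memory nodes for the same symptom, here is a minimal Python sketch. It only compares the two values pasted above and flags the case where the sinfo value is a leading prefix of the scontrol value. The helper names (real_memory_mb, sinfo_memory_mb, looks_truncated) are made up for illustration and are not part of Slurm; the sample strings are copied from the output in this report, and the sketch says nothing about why sinfo drops the digit.

```python
import re

# Sample text copied from the scontrol and sinfo output in this report.
SCONTROL_LINE = "RealMemory=12385927 AllocMem=12288000 FreeMem=12199341 Sockets=16 Boards=1"
SINFO_LINE = "xl001               12199302            12288000            1238592"


def real_memory_mb(scontrol_text):
    """Return the RealMemory value (MB) found in `scontrol show nodes` output."""
    match = re.search(r"\bRealMemory=(\d+)", scontrol_text)
    if match is None:
        raise ValueError("RealMemory not found")
    return int(match.group(1))


def sinfo_memory_mb(sinfo_line, column):
    """Return one whitespace-separated column of a sinfo output line as an int."""
    return int(sinfo_line.split()[column])


def looks_truncated(expected, reported):
    """True if `reported` has fewer digits than `expected` and is its leading prefix."""
    exp, rep = str(expected), str(reported)
    return len(rep) < len(exp) and exp.startswith(rep)


expected = real_memory_mb(SCONTROL_LINE)
# Column 3 is the memory field for --Format=nodehost,freemem,allocmem,memory.
reported = sinfo_memory_mb(SINFO_LINE, 3)
print(expected, reported, looks_truncated(expected, reported))
```

With the values from this ticket, the check reports a truncation for xl001; a node whose sinfo memory matches its RealMemory would not be flagged.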