| Summary: | scontrol show burst displaying incorrect sizes | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | David Paul <dpaul> |
| Component: | Burst Buffers | Assignee: | Moe Jette <jette> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | djbard, dmjacobsen, dpaul, tim |
| Version: | 15.08.7 | ||
| Hardware: | Cray XC | ||
| OS: | Linux | ||
| Site: | NERSC | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | Cori | CLE Version: | |
| Version Fixed: | 15.08.9 | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
Can you attach the output for "dw_wlm_cli --function pools" ? Slurm parses that to construct the internal BB pool state, and I'm curious if there's a mismatch between that output and dwstat.
Were the reservations with the size mismatch created before the update? I'm wondering if it's the granularity conversion that messed things up, in Slurm and/or DataWarp.
I just reviewed the code in Slurm. We don't save burst buffer allocation sizes (except when emulating a Cray). The information all comes from the Cray APIs when Slurm starts up. That leads me to suspect that the Cray software didn't handle the granularity change well. The APIs report allocation sizes in terms of "quantity" (could of blocks, each having a size of "granularity", where the "granularity" is associated with the pool).
Typo in previous message ("could" should read "count"):
(In reply to Moe Jette from comment #3)
> I just reviewed the code in Slurm. We don't save burst buffer allocation
> sizes (except when emulating a Cray). The information all comes from the
> Cray APIs when Slurm starts up. That leads me to suspect that the Cray
> software didn't handle the granularity change well. The APIs report
> allocation sizes in terms of "quantity" (could of blocks, each having a size
> of "granularity", where the "granularity" is associated with the pool).
RE: Can you attach the output for "dw_wlm_cli --function pools" ?
nid00837:~ # /opt/cray/dw_wlm/default/bin/dw_wlm_cli --function pools
{"pools": [{"free": 381545, "granularity": 16777216, "id": "test_pool", "quantity": 381545, "units": "bytes"}, {"free": 3936, "granularity": 228606345216, "id": "wlm_pool", "quantity": 4004, "units": "bytes"}]}
RE: Were the reservations with the size mismatch created before the update?
Both. All reservations were created AFTER changing the granularity (which was changed prior to the software updates). The "presv1" PR was created prior to the software updates (1/20). The other PRs were created after the software updates.
Is it correct that Slurm deals with units of MBs (i.e. 1,048,576 bytes)?
(In reply to David Paul from comment #5)
> RE: Can you attach the output for "dw_wlm_cli --function pools" ?
>
> nid00837:~ # /opt/cray/dw_wlm/default/bin/dw_wlm_cli --function pools
> {"pools": [{"free": 381545, "granularity": 16777216, "id": "test_pool",
> "quantity": 381545, "units": "bytes"}, {"free": 3936, "granularity":
> 228606345216, "id": "wlm_pool", "quantity": 4004, "units": "bytes"}]}
Doing the math, for "wlm_pool", that works out to:
Granularity=218016M TotalSpace=872936064M UsedSpace=14825088M
While Slurm reported (from your initial ticket):
Granularity=218016M TotalSpace=872936064M UsedSpace=21588704M
So that is likely correct.
> RE: Were the reservations with the size mismatch created before the update?
>
> Both. All reservation were created AFTER changing the granularity (which was
> changed prior to the software updates). The "presv1" PR was created prior
> to the software updates (1/20). The other PRs were created after the
> software updates.
>
> Is it correct the Slurm deals with units of MBs (i.e. 1,048,576 bytes)?
Slurm works in units of bytes, but adds a suffix of "M", "G", "T", etc as appropriate.
Could you please attach the output of the following 2 commands. This is what Slurm is working from to determine current buffer state:
dw_wlm_cli -v --function show_sessions
dw_wlm_cli -v --function show_instances
> > Is it correct the Slurm deals with units of MBs (i.e. 1,048,576 bytes)?
>
> Slurm works in units of bytes, but adds a suffix of "M", "G", "T", etc as
> appropriate.
PS: Slurm does not display burst buffer size information with a decimal point. It only promotes the suffix (e.g. from "M" to "G") if the value can be evenly divided by 1024.
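The arithmetic behind the "Doing the math" figures and the suffix rule can be sketched as follows. This is only an illustration, not Slurm's code: the JSON is the pools output pasted in this ticket, while `format_size` and `pool_summary` are hypothetical helper names, and the behavior for values that are not a multiple of 1024 bytes is an assumption.

```python
import json

# Output of "dw_wlm_cli --function pools", copied from this ticket.
POOLS_JSON = (
    '{"pools": [{"free": 381545, "granularity": 16777216, "id": "test_pool", '
    '"quantity": 381545, "units": "bytes"}, {"free": 3936, "granularity": '
    '228606345216, "id": "wlm_pool", "quantity": 4004, "units": "bytes"}]}'
)


def format_size(num_bytes):
    """Hypothetical helper: promote the suffix only while the value divides
    evenly by 1024, so no decimal point is ever needed."""
    value, suffix = num_bytes, ""
    for next_suffix in ("K", "M", "G", "T", "P"):
        if value == 0 or value % 1024 != 0:
            break
        value //= 1024
        suffix = next_suffix
    return f"{value}{suffix}"


def pool_summary(pool):
    """Hypothetical helper: sizes are quantity (a block count) times granularity."""
    gran = pool["granularity"]
    total = pool["quantity"] * gran
    used = (pool["quantity"] - pool["free"]) * gran
    return {
        "Granularity": format_size(gran),
        "TotalSpace": format_size(total),
        "UsedSpace": format_size(used),
    }


pools = {p["id"]: p for p in json.loads(POOLS_JSON)["pools"]}
wlm = pool_summary(pools["wlm_pool"])
print(" ".join(f"{key}={value}" for key, value in wlm.items()))
```

For "wlm_pool" this reproduces the Granularity, TotalSpace and UsedSpace values computed by hand above. The 25-block "presv1" reservation (5715158630400 bytes) formats as 5450400M under the same rule.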
I have removed and recreated the "presv1" PR. The size (5TB) is now displayed correctly (Size=5450400M).
One that is still inconsistent is djbTest: dwstat=212.91GiB, Slurm Size=928M
nid00837:~ # dwstat most
pool units quantity free gran
test_pool bytes 5.82TiB 5.82TiB 16MiB
wlm_pool bytes 832.5TiB 818.15TiB 212.91GiB
sess state token creator owner created expiration nodes
1374 CA--- djbTest CLI 61692 2016-01-23T17:03:19 never 0
inst state sess bytes nodes created expiration intact label public confs
1193 CA--- 1374 212.91GiB 1 2016-01-23T17:03:19 never true djbTest true 1
[dpaul@cori09]==> scontrol show burst
Name=cray DefaultPool=wlm_pool Granularity=218016M TotalSpace=872936064M UsedSpace=27249280M
StageInTimeout=86400 StageOutTimeout=86400 Flags=EnablePersistent,TeardownFailure
GetSysState=/opt/cray/dw_wlm/default/bin/dw_wlm_cli
Allocated Buffers:
Name=djbTest CreateTime=2016-01-23T17:03:19 Size=928M State=allocated UserID=djbard(61692)
Here are the outputs from the requested commands:
nid00837:~ # /opt/cray/dw_wlm/default/bin/dw_wlm_cli --function show_sessions
{"sessions": [{"created": 1453369801, "creator": "CLI", "expiration": 0, "expired": false, "id": 1342, "links": {"client_nodes": []}, "owner": 69266, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "andreyBB"}, {"created": 1453597399, "creator": "CLI", "expiration": 0, "expired": false, "id": 1374, "links": {"client_nodes": []}, "owner": 61692, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "djbTest"}, {"created": 1453664981, "creator": "CLI", "expiration": 0, "expired": false, "id": 1396, "links": {"client_nodes": []}, "owner": 61845, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "pbbcombine7"}, {"created": 1453683395, "creator": "SLURM", "expiration": 0, "expired": false, "id": 1397, "links": {"client_nodes": ["nid00043"]}, "owner": 61692, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "987616"}, {"created": 1453739246, "creator": "SLURM", "expiration": 0, "expired": false, "id": 1410, "links": {"client_nodes": []}, "owner": 60891, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "1003072"}, {"created": 1453739246, "creator": "SLURM", "expiration": 0, "expired": false, "id": 1411, "links": {"client_nodes": []}, "owner": 60891, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "1003077"}, {"created": 1453740856, "creator": "SLURM", "expiration": 0, "expired": false, "id": 1416, "links": {"client_nodes": ["nid00804", "nid00805", "nid00806", "nid00807", "nid00828", "nid00829", "nid00830", "nid00831"]}, "owner": 60891, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "1003080"}, 
{"created": 1453741406, "creator": "SLURM", "expiration": 0, "expired": false, "id": 1417, "links": {"client_nodes": ["nid00416", "nid00417", "nid00418", "nid00494", "nid00884", "nid00885", "nid00886", "nid00887"]}, "owner": 60891, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "1003081"}, {"created": 1453745502, "creator": "SLURM", "expiration": 0, "expired": false, "id": 1427, "links": {"client_nodes": ["nid02231"]}, "owner": 61692, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "1005760"}, {"created": 1453745949, "creator": "SLURM", "expiration": 0, "expired": false, "id": 1431, "links": {"client_nodes": ["nid01106"]}, "owner": 61692, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "1005806"}, {"created": 1453760704, "creator": "CLI", "expiration": 0, "expired": false, "id": 1457, "links": {"client_nodes": []}, "owner": 61692, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "djbTest2"}, {"created": 1453920069, "creator": "CLI", "expiration": 0, "expired": false, "id": 1511, "links": {"client_nodes": []}, "owner": 15448, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}, "token": "presv1"}]}
nid00837:~ # /opt/cray/dw_wlm/default/bin/dw_wlm_cli --function show_instances
{"instances": [{"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453369801, "expiration": 0, "expired": false, "id": 1162, "intact": true, "label": "andreyBB", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1417], "session": 1342}, "public": true, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453597399, "expiration": 0, "expired": false, "id": 1193, "intact": true, "label": "djbTest", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1450], "session": 1374}, "public": true, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 1143031726080, "nodes": 5}, "created": 1453664981, "expiration": 0, "expired": false, "id": 1200, "intact": true, "label": "pbbcombine7", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1457], "session": 1396}, "public": true, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453683395, "expiration": 0, "expired": false, "id": 1201, "intact": true, "label": "I1397-0", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1458], "session": 1397}, "public": false, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453739246, "expiration": 0, "expired": false, "id": 1213, "intact": true, "label": "I1410-0", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1471], "session": 1410}, "public": false, "state": {"actualized": true, 
"fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453739246, "expiration": 0, "expired": false, "id": 1214, "intact": true, "label": "I1411-0", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1470], "session": 1411}, "public": false, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453740856, "expiration": 0, "expired": false, "id": 1219, "intact": true, "label": "I1416-0", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1476], "session": 1416}, "public": false, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453741406, "expiration": 0, "expired": false, "id": 1220, "intact": true, "label": "I1417-0", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1477], "session": 1417}, "public": false, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453745502, "expiration": 0, "expired": false, "id": 1230, "intact": true, "label": "I1427-0", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1487], "session": 1427}, "public": false, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453745949, "expiration": 0, "expired": false, "id": 1234, "intact": true, "label": "I1431-0", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1491], 
"session": 1431}, "public": false, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 228606345216, "nodes": 1}, "created": 1453760704, "expiration": 0, "expired": false, "id": 1257, "intact": true, "label": "djbTest2", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1514], "session": 1457}, "public": true, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}, {"capacity": {"bytes": 5715158630400, "nodes": 25}, "created": 1453920069, "expiration": 0, "expired": false, "id": 1297, "intact": true, "label": "presv1", "limits": {"write_window_length": 86400, "write_window_multiplier": 10}, "links": {"configurations": [1554], "session": 1511}, "public": true, "state": {"actualized": true, "fuse_blown": false, "goal": "create", "mixed": false, "transitioning": false}}]}
nid00837:~ #
There is a variable without a sufficient number of bits, so some high-order bits are getting dropped. I need to review the code for more issues of this sort. I should be able to get you a patch within a couple of days.
Fortunately, only 2 lines need to change, increasing a couple of variables from 32 to 64 bits. The patch is at the location below:
https://github.com/SchedMD/slurm/commit/214b3abe9a41895adabc8168f03d4619c92932fc.patch
Buffers allocated while the slurmctld daemon is running should be fine. When the daemon restarts, buffer sizes (expressed in bytes) over 32 bits will get truncated. This fix will be in version 15.08.9 when released, likely mid-February.
Thanks for the quick turnaround, much appreciated! |
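As a quick check of the diagnosis, keeping only the low 32 bits of the byte counts reported by show_instances reproduces the sizes that scontrol displayed. The sketch below is not taken from the Slurm source; it assumes the truncation behaves like a plain 32-bit mask, and `truncate_to_32_bits` is a hypothetical name.

```python
MIB = 1024 * 1024
UINT32_MASK = 0xFFFFFFFF  # low 32 bits, i.e. what a 32-bit variable can hold

# Instance capacities in bytes, copied from the show_instances output in this ticket.
INSTANCE_BYTES = {
    "djbTest": 228606345216,   # 1 block of wlm_pool granularity
    "presv1": 5715158630400,   # 25 blocks of wlm_pool granularity
}


def truncate_to_32_bits(num_bytes):
    """Hypothetical helper: drop the high-order bits, as a 32-bit variable would."""
    return num_bytes & UINT32_MASK


for label, size in INSTANCE_BYTES.items():
    truncated = truncate_to_32_bits(size)
    print(f"{label}: actual={size // MIB}M truncated={truncated // MIB}M")
```

Under that assumption djbTest comes out as 928M and presv1 as 2720M, matching the Size values shown by scontrol show burst in this ticket, while the untruncated values are 218016M and 5450400M.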
At least 3 persistent reservation sizes display a mismatch between dwstat (correct creation size) and the output of scontrol show burst. Last week we changed the wlm_pool granularity from 400GB to 200GB, updated to 15.08.7, and added some Cray DataWarp patches. Here is one example, dwstat = 5.2TiB vs. Slurm = 2720M:
<dwstat all snipped>
sess state token creator owner created expiration nodes
1293 CA--- presv1 CLI 15448 2016-01-19T14:59:21 never 0
inst state sess bytes nodes created expiration intact label public confs
1114 CA--- 1293 5.2TiB 25 2016-01-19T14:59:21 never true presv1 true 1
conf state inst type access_type activs
1368 CA--- 1114 scratch stripe 0
frag state inst capacity gran node
29347 CA-- 1114 212.91GiB 4MiB nid00913
29348 CA-- 1114 212.91GiB 4MiB nid02062
29349 CA-- 1114 212.91GiB 4MiB nid01418
29350 CA-- 1114 212.91GiB 4MiB nid01994
29351 CA-- 1114 212.91GiB 4MiB nid00785
29352 CA-- 1114 212.91GiB 4MiB nid00782
29353 CA-- 1114 212.91GiB 4MiB nid00142
29354 CA-- 1114 212.91GiB 4MiB nid02189
29355 CA-- 1114 212.91GiB 4MiB nid00781
29356 CA-- 1114 212.91GiB 4MiB nid00457
29357 CA-- 1114 212.91GiB 4MiB nid01098
29358 CA-- 1114 212.91GiB 4MiB nid00146
29359 CA-- 1114 212.91GiB 4MiB nid01865
29360 CA-- 1114 212.91GiB 4MiB nid01481
29361 CA-- 1114 212.91GiB 4MiB nid01737
29362 CA-- 1114 212.91GiB 4MiB nid00854
29363 CA-- 1114 212.91GiB 4MiB nid00653
29364 CA-- 1114 212.91GiB 4MiB nid02253
29365 CA-- 1114 212.91GiB 4MiB nid01225
29366 CA-- 1114 212.91GiB 4MiB nid00269
29367 CA-- 1114 212.91GiB 4MiB nid01237
29368 CA-- 1114 212.91GiB 4MiB nid01678
29369 CA-- 1114 212.91GiB 4MiB nid01802
29370 CA-- 1114 212.91GiB 4MiB nid01033
29371 CA-- 1114 212.91GiB 4MiB nid00853
scontrol show burst
Name=cray DefaultPool=wlm_pool Granularity=218016M TotalSpace=872936064M UsedSpace=21588704M
StageInTimeout=86400 StageOutTimeout=86400 Flags=EnablePersistent,TeardownFailure
GetSysState=/opt/cray/dw_wlm/default/bin/dw_wlm_cli
Allocated Buffers:
Name=presv1 CreateTime=2016-01-19T14:59:21 Size=2720M State=allocated UserID=dpaul(15448)
Per User Buffer Use:
UserID=dpaul(15448) Used=2720M