Ticket 1589 - gres count underflow
Summary: gres count underflow
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Other
Version: 14.11.5
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: Moe Jette
 
Reported: 2015-04-08 04:44 MDT by Kilian Cavalotti
Modified: 2019-01-16 13:07 MST

Site: Stanford
Version Fixed: 14.11.6


Attachments
change logs from "error" to "debug" (905 bytes, patch)
2015-04-10 06:50 MDT, Moe Jette

Description Kilian Cavalotti 2015-04-08 04:44:11 MDT
Hi,

I've recently noticed error messages related to GRES in our slurmctld.log. It doesn't seem to have much impact on jobs, though.
-- 8< -----------------------------------------------------------------------------------------------------
[2015-04-08T09:33:55.918] debug2: cons_res: _will_run_test, job 2031886: overlap=1
[2015-04-08T09:33:55.918] debug3: cons_res: _rm_job_from_res: job 2031886 action 0
[2015-04-08T09:33:55.918] error: gres/gpu: job 2031886 dealloc node gpu-9-1 topo gres count underflow
[2015-04-08T09:33:55.918] error: gres/gpu: job 2031886 dealloc node gpu-9-1 type tesla gres count underflow
[2015-04-08T09:33:55.918] debug3: cons_res: removed job 2031886 from part gpu row 0
-- 8< -----------------------------------------------------------------------------------------------------

Could you please explain what those messages mean? 
They affect most of our GPU nodes (but not all, although their software configuration is strictly the same).

The slurmd.log on the nodes contains some errors about cgroup memory limits (though I'm not sure they're related), but nothing about GRES:
-- 8< -----------------------------------------------------------------------------------------------------
[2015-04-08T09:29:49.610] Gres Name=gpu Type=tesla Count=4 ID=7696487 File=/dev/nvidia[0-3] CPUs=[0-7] CpuCnt=16
[2015-04-08T09:29:49.610] Gres Name=gpu Type=tesla Count=4 ID=7696487 File=/dev/nvidia[4-7] CPUs=[8-15] CpuCnt=16
[2015-04-08T09:29:49.610] gpu 0 is device number 0
[2015-04-08T09:29:49.610] gpu 1 is device number 1
[2015-04-08T09:29:49.610] gpu 2 is device number 2
[2015-04-08T09:29:49.610] gpu 3 is device number 3
[2015-04-08T09:29:49.610] gpu 4 is device number 4
[2015-04-08T09:29:49.610] gpu 5 is device number 5
[2015-04-08T09:29:49.610] gpu 6 is device number 6
[2015-04-08T09:29:49.610] gpu 7 is device number 7
[2015-04-08T09:29:49.613] No specialized cores configured by default on this node
[2015-04-08T09:29:49.614] Resource spec: system memory limit not configured for this node
[2015-04-08T09:29:49.641] slurmd version 14.11.5 started
[2015-04-08T09:29:49.643] slurmd started on Wed, 08 Apr 2015 09:29:49 -0700
[2015-04-08T09:29:49.644] CPUs=16 Boards=1 Sockets=2 Cores=8 Threads=1 Memory=258319 TmpDisk=192080 Uptime=355490 CPUSpecList=(null)
[2015-04-08T09:30:07.589] error: Error reading step 2031886.0 memory limits
[2015-04-08T09:30:07.607] error: Error reading step 2031886.4294967294 memory limits
-- 8< -----------------------------------------------------------------------------------------------------

I don't know where those errors come from either, since I can manually read the memory limit:

-- 8< -----------------------------------------------------------------------------------------------------
# cat /cgroup/memory/slurm/uid_18306/job_2031886/step_0/memory.memsw.limit_in_bytes
16777216000
-- 8< -----------------------------------------------------------------------------------------------------
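The raw byte value can be checked against the job's configured limit with a quick conversion (a throwaway helper for illustration, not anything in Slurm):

```python
# Hypothetical helper (not part of Slurm): convert the decimal byte count
# read from a cgroup memory.memsw.limit_in_bytes file into megabytes.
def limit_in_mb(raw: str) -> float:
    return int(raw.strip()) / (1024 * 1024)

# The value read above is exactly the job's 16000MB memsw limit.
print(limit_in_mb("16777216000\n"))  # 16000.0
```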


Our gres.conf file looks like this:
-- 8< -----------------------------------------------------------------------------------------------------
# cat /etc/slurm/gres.conf
# 4 GPUs nodes
NodeName=gpu-9-[6-10] Name=gpu Type=gtx File=/dev/nvidia[0-1] CPUs=[0-7]
NodeName=gpu-9-[6-10] Name=gpu Type=gtx File=/dev/nvidia[2-3] CPUs=[8-15]
# 8 GPUs nodes
NodeName=gpu-9-[1-2]  Name=gpu Type=tesla File=/dev/nvidia[0-3] CPUs=[0-7]
NodeName=gpu-9-[1-2]  Name=gpu Type=tesla File=/dev/nvidia[4-7] CPUs=[8-15]
NodeName=gpu-9-[3-5],gpu-13-[1-2],gpu-14-[1-9] Name=gpu Type=gtx File=/dev/nvidia[0-3] CPUs=[0-7]
NodeName=gpu-9-[3-5],gpu-13-[1-2],gpu-14-[1-9] Name=gpu Type=gtx File=/dev/nvidia[4-7] CPUs=[8-15]
-- 8< -----------------------------------------------------------------------------------------------------
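As a sanity check on a gres.conf like this, the bracket ranges can be expanded to verify the per-node GPU counts. This is a rough sketch: it only handles a single `[a-b]` range per name, so it covers the simple lines above but not the comma-separated hostlists, and it is not Slurm's actual hostlist parser.

```python
import re

def expand(pattern: str) -> list[str]:
    """Expand a simple bracket range like 'gpu-9-[1-2]' or '/dev/nvidia[0-3]'.
    Illustrative only: handles at most one [a-b] range per name."""
    m = re.search(r"\[(\d+)-(\d+)\]", pattern)
    if not m:
        return [pattern]
    lo, hi = int(m.group(1)), int(m.group(2))
    return [re.sub(r"\[\d+-\d+\]", str(i), pattern, count=1)
            for i in range(lo, hi + 1)]

def gpu_counts(lines):
    """Sum device counts per node from (NodeName, File) pattern pairs."""
    counts = {}
    for nodes, devs in lines:
        n_devs = len(expand(devs))
        for node in expand(nodes):
            counts[node] = counts.get(node, 0) + n_devs
    return counts

# The single-range lines from the gres.conf above.
conf = [
    ("gpu-9-[6-10]", "/dev/nvidia[0-1]"),
    ("gpu-9-[6-10]", "/dev/nvidia[2-3]"),
    ("gpu-9-[1-2]",  "/dev/nvidia[0-3]"),
    ("gpu-9-[1-2]",  "/dev/nvidia[4-7]"),
]
print(gpu_counts(conf)["gpu-9-6"])  # 4 (a 4-GPU node)
print(gpu_counts(conf)["gpu-9-1"])  # 8 (an 8-GPU node)
```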

and our slurm.conf contains:
-- 8< -----------------------------------------------------------------------------------------------------
JobAcctGatherType       = jobacct_gather/cgroup
ProctrackType           = proctrack/cgroup
TaskPlugin              = task/cgroup
-- 8< -----------------------------------------------------------------------------------------------------

Let me know what other information would be useful.

Thanks!
-- 
Kilian
Comment 1 Moe Jette 2015-04-08 08:34:04 MDT
(In reply to Kilian Cavalotti from comment #0)
> Hi,
> 
> I've recently noticed error messages related to GRES in our slurmctld.log.
> It doesn't seem to have much impact on jobs, though.
> -- 8<
> -----------------------------------------------------------------------------
> ------------------------
> [2015-04-08T09:33:55.918] debug2: cons_res: _will_run_test, job 2031886:
> overlap=1
> [2015-04-08T09:33:55.918] debug3: cons_res: _rm_job_from_res: job 2031886
> action 0
> [2015-04-08T09:33:55.918] error: gres/gpu: job 2031886 dealloc node gpu-9-1
> topo gres count underflow
> [2015-04-08T09:33:55.918] error: gres/gpu: job 2031886 dealloc node gpu-9-1
> type tesla gres count underflow
> [2015-04-08T09:33:55.918] debug3: cons_res: removed job 2031886 from part
> gpu row 0

These messages indicate that the bookkeeping with respect to those GRES on that node is wrong.
Did they all start at a time that matches some other event like configuration change, node reboot, daemon restart, etc?
I would expect this to be self-healing.  Are these messages continuing?
It might be helpful if you could send some logs from around the time this started, from both the slurmctld and the relevant slurmd.



> [2015-04-08T09:29:49.610] gpu 7 is device number 7
> [2015-04-08T09:29:49.613] No specialized cores configured by default on this
> node
> [2015-04-08T09:29:49.614] Resource spec: system memory limit not configured
> for this node
> [2015-04-08T09:29:49.641] slurmd version 14.11.5 started
> [2015-04-08T09:29:49.643] slurmd started on Wed, 08 Apr 2015 09:29:49 -0700
> [2015-04-08T09:29:49.644] CPUs=16 Boards=1 Sockets=2 Cores=8 Threads=1
> Memory=258319 TmpDisk=192080 Uptime=355490 CPUSpecList=(null)
> [2015-04-08T09:30:07.589] error: Error reading step 2031886.0 memory limits
> [2015-04-08T09:30:07.607] error: Error reading step 2031886.4294967294
> memory limits


I've made some changes to the log message for greater clarity:
> Resource spec: system memory limit not configured for this node
Changed to
> Resource spec: Reserved system memory limit not configured for this node

And
> error: Error reading step 2031886.0 memory limits
Changed to
> error: Error reading step 2031886.0 memory limits from slurmstepd
So for some reason the slurmd daemon on the compute node was able to connect to the slurmstepd (job step shepherd), but not communicate with it: the slurmd sends a message to the slurmstepd via a named socket asking for its memory limits, but the slurmstepd does not respond. I'm wondering if there is a slurmstepd that is somehow wedged: out of memory, stopped, whatever. It would be helpful if you could log in to that node and see if the slurmstepd is still around and, if so, what its state is. You might also grep for that job id, "2031886", in the slurmd log file for any clues.
Comment 2 Kilian Cavalotti 2015-04-09 06:03:19 MDT
Hi Moe, 

(In reply to Moe Jette from comment #1)
> These messages indicate that the bookkeeping with respect to those GRES on
> that node is wrong.
> Did they all start at a time that matches some other event like
> configuration change, node reboot, daemon restart, etc?

I'm not sure, but I've restarted them all a few times since I noticed the messages, and they still appear, pretty much continuously.

-- 8< ------------------------------------------------------------------------
[2015-04-09T10:53:55.796] error: gres/gpu: job 2041855 dealloc node gpu-9-3 topo gres count underflow
[2015-04-09T10:53:55.796] error: gres/gpu: job 2041855 dealloc node gpu-9-3 type gtx gres count underflow
[2015-04-09T10:53:55.797] error: gres/gpu: job 2041855 dealloc node gpu-9-3 topo gres count underflow
[2015-04-09T10:53:55.797] error: gres/gpu: job 2041855 dealloc node gpu-9-3 type gtx gres count underflow
[... many more identical topo/type underflow messages, milliseconds apart ...]
-- 8< ------------------------------------------------------------------------

This has the bad consequence of filling up the logs pretty fast.

> I would expect this to be self-healing.  Are these messages continuing?

They are continuing, yes.

> It might be helpful if you could send some logs around the time this started
> from both the slurmctld and the relevant slurmd.

Unfortunately, it looks like it started more than a week ago, but logs are purged on a weekly basis, so I'm afraid I don't have those logs anymore.

> So for some reason the slurmd daemon on the compute node was able to connect
> to the slurmstepd (job step shepherd), but not communicate. Then the slurmd
> sends a message to the slurmstepd via named socket asking for its memory
> limits, but the slurmstepd does not respond. I'm wondering if there is a
> slurmstepd that is somehow wedged: no memory, stopped, whatever. It would be
> helpful if you could login to that node and see if the slurmstepd is still
> around and if so what its state is. You might also grep for that job id,
> "2031886", in the slurmd log file for any clues.

The slurmstepd process seems to still be alive. The original job is now gone, but on another node with a similar situation, I got this:

slurmctld.log:
-- 8< ------------------------------------------------------------------------
[2015-04-09T11:00:31.636] error: gres/gpu: job 2041855 dealloc node gpu-9-3 type gtx gres count underflow
-- 8< ------------------------------------------------------------------------

slurmd.log on the node:
-- 8< ------------------------------------------------------------------------
# grep  2041855 /var/log/slurm/slurmd.log
[2015-04-08T02:08:59.357] _run_prolog: prolog with lock for job 2041855 ran for 0 seconds
[2015-04-08T02:08:59.357] Launching batch job 2041855 for UID 18306
[2015-04-08T02:08:59.422] [2041855] task/cgroup: /slurm/uid_18306/job_2041855: alloc=16000MB mem.limit=16000MB memsw.limit=16000MB
[2015-04-08T02:08:59.423] [2041855] task/cgroup: /slurm/uid_18306/job_2041855/step_batch: alloc=16000MB mem.limit=16000MB memsw.limit=16000MB
[2015-04-08T02:09:00.420] launch task 2041855.0 request from 18306.46526@10.210.31.108 (port 5045)
[2015-04-08T02:09:01.566] [2041855.0] task/cgroup: /slurm/uid_18306/job_2041855: alloc=16000MB mem.limit=16000MB memsw.limit=16000MB
[2015-04-08T02:09:01.566] [2041855.0] task/cgroup: /slurm/uid_18306/job_2041855/step_0: alloc=16000MB mem.limit=16000MB memsw.limit=16000MB
[2015-04-08T09:40:19.998] error: Error reading step 2041855.0 memory limits
[2015-04-08T09:40:20.054] error: Error reading step 2041855.4294967294 memory limits
[2015-04-09T10:55:25.821] error: Error reading step 2041855.0 memory limits
[2015-04-09T10:55:25.846] error: Error reading step 2041855.4294967294 memory limits
-- 8< ------------------------------------------------------------------------

slurm processes on the node
-- 8< ------------------------------------------------------------------------
[root@gpu-9-3 ~]# ps aux | grep -i slurm
root      6515  0.0  0.0 276284 18816 ?        SLl  10:53   0:00 /usr/sbin/slurmd -M
root      7480  0.0  0.0 105316   912 pts/0    S+   11:02   0:00 grep -i slurm
root      9200  0.0  0.0 289696  5188 ?        Sl   00:35   0:01 slurmstepd: [2046478]
ajvenkat  9205  0.0  0.0 108176  1448 ?        S    00:35   0:00 /bin/bash /var/spool/slurmd/job2046478/slurm_script
root     18923  0.0  0.0 289448  5072 ?        Sl   Apr08   0:03 slurmstepd: [2045902]
yanmingw 18928  0.0  0.0 108164  1400 ?        S    Apr08   0:00 /bin/bash /var/spool/slurmd/job2045902/slurm_script
root     18978  0.0  0.0 431848  5036 ?        Sl   Apr08   0:02 slurmstepd: [2045902.0]
root     24411  0.0  0.0 289424  5016 ?        Sl   Apr08   0:03 slurmstepd: [2041850]
hasantos 24416  0.0  0.0 108164  1432 ?        S    Apr08   0:00 /bin/bash /var/spool/slurmd/job2041850/slurm_script
root     27004  0.0  0.0 289448  5068 ?        Sl   Apr08   0:06 slurmstepd: [2041855]
lmthang  27020  0.0  0.0 108168  1428 ?        S    Apr08   0:00 /bin/bash /var/spool/slurmd/job2041855/slurm_script
root     27066  0.0  0.0 431852  5160 ?        Sl   Apr08   0:06 slurmstepd: [2041855.0]
-- 8< ------------------------------------------------------------------------


Thanks!
-- 
Kilian
Comment 3 Moe Jette 2015-04-09 10:25:33 MDT
There's definitely something very wrong with job 2041855, but it's not clear to me yet.

I would recommend draining that node gpu-9-3 ("scontrol update nodename=gpu-9-3 state=drain reason=gpu_accounting"). Once all of the other user jobs are gone, set the node state to DOWN ("scontrol update nodename=gpu-9-3 state=down reason=gpu_accounting") and see if you can remove all vestiges of the job, including the slurmstepd. If the application or slurmstepd will not go away with SIGKILL, you might need to reboot the node. Once the node is cleaned up, restart the slurmd and run "scontrol update nodename=gpu-9-3 state=resume".
Comment 4 Moe Jette 2015-04-09 10:35:17 MDT
Could you send the output of "scontrol show job 2041855" and "scontrol show job 2041886" (both jobs seem to have fallen into the same type of black hole).
Comment 5 Kilian Cavalotti 2015-04-09 10:39:50 MDT
(In reply to Moe Jette from comment #4)
> Could you send the output of "scontrol show job 2041855" and "scontrol show
> job 2041886" (both jobs seem to have fallen into the same type of black
> hole).

2041886 is done now, but 2041855 is still running:
-- 8< ----------------------------------------------------------------------------
# scontrol show job 2041855
JobId=2041855 JobName=lstm.de2en.50000.d1000.lr1.max5.d4.init0.1.noClip.longer.reverse.dropout0.7
   UserId=lmthang(18306) GroupId=manning(46526)
   Priority=1184 Nice=0 Account=manning QOS=gpu
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=1-13:29:01 TimeLimit=2-00:00:00 TimeMin=N/A
   SubmitTime=2015-04-08T02:08:43 EligibleTime=2015-04-08T02:08:43
   StartTime=2015-04-08T02:08:59 EndTime=2015-04-10T02:08:59
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=gpu AllocNode:Sid=sherlock-ln01:46077
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=gpu-9-3
   BatchHost=gpu-9-3
   NumNodes=1 NumCPUs=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=16000M MinTmpDiskNode=0
   Features=(null) Gres=gpu:1 Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/lmthang/lmthang-dl/lstm/lstm.de2en.50000.d1000.lr1.max5.d4.init0.1.noClip.longer.reverse.dropout0.7/job.sh
   WorkDir=/home/lmthang
   StdErr=/home/lmthang/lmthang-dl/lstm/lstm.de2en.50000.d1000.lr1.max5.d4.init0.1.noClip.longer.reverse.dropout0.7/stderr
   StdIn=/dev/null
   StdOut=/home/lmthang/lmthang-dl/lstm/lstm.de2en.50000.d1000.lr1.max5.d4.init0.1.noClip.longer.reverse.dropout0.7/stdout
-- 8< ----------------------------------------------------------------------------

(In reply to Moe Jette from comment #3)
> There's definitely something very wrong with job 2041855, but it's not clear
> to me yet.
> 
> I would recommend draining that node "gpu-9-3" ("scontrol update
> nodename=gpu-9-3 state=drain reason=gpu_accounting"). Once all of the other
> user jobs are gone, I would then set the node state to DOWN  ("scontrol
> update nodename=gpu-9-3 state=down reason=gpu_accounting")and see if you can
> remove all vestiges of the job including the slurmstepd. If the app or
> slurmstepd will not go away with SIGKILL, you might need to reboot that
> node. Once the node is cleared up, restart the slurmd and run "scontrol
> update nodename=gpu-9-3 state=resume".

Ok, I'm gonna drain and reboot the nodes, thanks.

Cheers,
-- 
Kilian
Comment 6 Moe Jette 2015-04-10 05:28:29 MDT
I understand why there are so many log messages. Slurm's backfill scheduler makes a copy of the system state, then simulates the termination of jobs going forward through time to determine when and where the pending jobs will start. If the system state is bad to start with, each time a job is terminated in that simulated state could trigger the underflow message, which can happen a lot.
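That amplification can be sketched with a toy model (illustrative only, not Slurm source): the scheduler takes a private copy of an already-inconsistent state, and every simulated job termination trips the same underflow check.

```python
import copy

# Toy model (not Slurm source): the backfill scheduler works on a copy of
# node state and simulates deallocating each running job in turn.
def dealloc(state, job, log, gres=1):
    """Remove one job's GRES from a node; log an underflow if the
    bookkeeping is already inconsistent (count would go negative)."""
    if state["gres_alloc"] < gres:
        log.append(f"error: gres/gpu: job {job} dealloc gres count underflow")
        state["gres_alloc"] = 0
    else:
        state["gres_alloc"] -= gres

# Bad starting state: running jobs hold GPUs the controller no longer
# accounts for (gres_alloc should be 3 here, but was rebuilt as 0).
real_state = {"gres_alloc": 0}
running_jobs = [2041855, 2041856, 2041857]

log = []
sim = copy.deepcopy(real_state)   # backfill's private copy of the state
for job in running_jobs:          # simulate each job's termination
    dealloc(sim, job, log)
print(len(log))  # 3: one underflow message per simulated termination
```

Every backfill pass repeats the simulation, so the same messages recur on each scheduling cycle until the state is consistent again.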

I will see what can be done to avoid printing the same error message repeatedly.

I will also study the logic that could trigger the bad state in the first place.

Are you still seeing the errors or have things cleared up?
Comment 7 Moe Jette 2015-04-10 06:02:14 MDT
Also, did you happen to recently change the Slurm configuration with respect to GPUs? I'm wondering if the job started under one configuration and then the configuration was changed.
Comment 8 Kilian Cavalotti 2015-04-10 06:09:24 MDT
(In reply to Moe Jette from comment #6)
> I understand why there are so many log messages. Slurm's backfill scheduler
> makes a copy of the system state, then simulates the termination of jobs
> going forward through time to determine when and where the pending jobs will
> start. If the system state is bad to start with, each time a job is
> terminated in that simulated state could trigger the underflow message,
> which can happen a lot.
> 
> I will see what can be done to avoid printing the same error message
> repeatedly.
> 
> I will also study the logic that could trigger the bad state in the first
> place.

Great, thanks!

> Are you still seeing the errors or have things cleared up?

Yes, but I haven't had a chance to reboot the nodes yet; jobs are still running on them. They're drained, though, and our maximum runtime is 2 days, so that should be over soon.
Comment 9 Kilian Cavalotti 2015-04-10 06:10:52 MDT
(In reply to Moe Jette from comment #7)
> Also, did you happen to recently change the Slurm configuration with respect
> to GPUs? I'm wondering if the job started under one configuration and then
> the configuration was changed.

Our GRES config changed slightly when we upgraded to Slurm 14.11 in March, but those jobs started after that. And I'm also seeing those messages for jobs that started today on another GPU node that wasn't drained (yet).
Comment 10 Moe Jette 2015-04-10 06:50:11 MDT
Created attachment 1813 [details]
change logs from "error" to "debug"

The attached patch will not fix the problem, but will stop the logging unless you run with a configuration of "SlurmctldDebug=debug" (last I checked, yours was set to "info"). It's just a "band aid" and I am still working on this, but it's all I can offer at this point.
Comment 11 Kilian Cavalotti 2015-04-10 06:53:23 MDT
(In reply to Moe Jette from comment #10)
> Created attachment 1813 [details]
> change logs from "error" to "debug"
> 
> The attached patch will not fix the problem, but will stop the logging
> unless you run with a configuration of "SlurmctldDebug=debug" (last I
> checked it was set to "info". It's just a "band aid" and I am still working
> on this, but it's all that I can offer at this point.

Thanks for the patch Moe, much appreciated.
Comment 12 Moe Jette 2015-04-10 08:57:53 MDT
I was able to reproduce the "gres underflow" error by changing the number of GPUs on a compute node in the gres.conf file and restarting the slurmd. This could also happen if the gres.conf file were not readable when the slurmd restarts. If this is on the right track, you should see a specific message in your slurmctld log file. Could you search for messages of this sort:
"gres/gpu: count changed for node" ...

That would cause the slurmctld to build a new gres/gpu data structure, but not populate it with job data. When the job record is removed, it causes the underflow. Things return to normal once the job has terminated (i.e. the data structure shows no allocated GPUs, which is correct because there are no active jobs).
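A toy model of that sequence (illustrative only, not the slurmctld code): allocate a GPU, rebuild the node's GRES structure without repopulating it, then deallocate the job.

```python
# Illustrative model of the failure mode (not Slurm source).
class NodeGres:
    def __init__(self, count):
        self.count = count   # configured GPUs on the node
        self.alloc = 0       # GPUs currently allocated to jobs

    def alloc_job(self, n):
        self.alloc += n

    def dealloc_job(self, n, errors):
        if self.alloc < n:
            errors.append("gres count underflow")
            self.alloc = 0   # clamp, as the error message implies
        else:
            self.alloc -= n

errors = []
node = NodeGres(count=8)
node.alloc_job(1)            # a job is allocated one GPU

# slurmd reports a changed GRES config: slurmctld rebuilds the structure
# but does not repopulate it with the running job's allocation.
node = NodeGres(count=8)

node.dealloc_job(1, errors)  # job ends -> underflow is logged
print(errors)                # ['gres count underflow']
print(node.alloc)            # 0 -> consistent again once the job is gone
```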

I was also able to induce an error "Error reading step 2031886.4294967294 memory limits". I believe this was caused by shutting down the slurmd when communications between slurmd and slurmstepd were in progress. When slurmd restarted there were some problems getting the communications back in sync, but they did get back in sync later.
Comment 13 Kilian Cavalotti 2015-04-10 09:19:40 MDT
(In reply to Moe Jette from comment #12)
> I was able to reproduce the "gres underflow" error by changing the number of
> GPUs on a compute node in the gres.conf file and restarting the slurmd. This
> could also happen if the gres.conf file were not readable when the slurmd
> restarts. If this is on the right track, you should see a specific message
> in your slurmctld log file. Could you search for messages of this sort:
> "gres/gpu: count changed for node" ...

Just did, and didn't find anything, but again, I think we don't have the logs anymore for when this started, unfortunately.

> That would cause the slurmctld to build a new gres/gpu data structure, but
> not populated it with job data. When the job record is removed, it causes
> the underflow. Things return to normal when the job is terminated (i.e. the
> data structure shows now allocated GPUs and that is correct because there
> are no active jobs).

So a reboot of the node once all the started jobs are done would definitely clear this, right?

> I was also able to induce an error "Error reading step 2031886.4294967294
> memory limits". I believe this was caused by shutting down the slurmd when
> communications between slurmd and slurmstepd were in progress. When slurmd
> restarted there were some problems getting the communications back in sync,
> but they did get back in sync later.

That could very well be, since I'm under the impression that this was happening for jobs that were recovered when slurmd restarted.
Comment 14 Moe Jette 2015-04-10 09:27:25 MDT
(In reply to Kilian Cavalotti from comment #13)
> (In reply to Moe Jette from comment #12)
> > That would cause the slurmctld to build a new gres/gpu data structure, but
> > not populated it with job data. When the job record is removed, it causes
> > the underflow. Things return to normal when the job is terminated (i.e. the
> > data structure shows now allocated GPUs and that is correct because there
> > are no active jobs).
> 
> So a reboot of the node once all the started jobs are done would definitely
> clear this, right?

If this problem is the cause (only the logs could tell for sure), then it would suffice to drain the node. Once no jobs remain on the node, you can just return it to service ("scontrol update nodename=... state=resume").

I'm working on a fix that should prevent this from happening again.
Comment 15 Moe Jette 2015-04-10 11:15:47 MDT
I believe that I have both of these problems fixed.

The "Error reading step 2031886.0 memory limits" was bad Slurm logic. It was logging success as an error. This bug is definitely fixed in the commit:
https://github.com/SchedMD/slurm/commit/79658fae831fe390329956941f34265d9f6f0240

The underflow I am not certain about, but if the slurmd reports to the slurmctld a change in the GRES configuration (say it can't read the gres.conf file, the count is reset lower in slurm.conf, GRES "type" values are added, or some other possibility), then under some conditions the slurmctld will clear the GRES data structures. If a job was actually allocated GRES, that results in an underflow when deallocating the job's resources. I changed the logic so that if the GRES configuration changes while a job is allocated to the node, the event is logged, the data structures are NOT changed, and the node is DRAINED. If there is no active job, or this happens at slurmctld startup, then the data structures are rebuilt appropriately. Again, I'm not sure if this explains what you saw, but the fix is in the commit:
https://github.com/SchedMD/slurm/commit/3a6fd83ccebd81ce4eb2780bad40ff353f5273c9
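The new behavior can be sketched as follows (a simplified model, not the code in the commit; the field names are made up for illustration).

```python
# Sketch of the decision described above (illustrative, not the commit's code).
def handle_gres_config_change(node, new_count, at_startup, log):
    """If the GRES config changes while jobs hold GRES on the node, keep
    the existing data structures and drain the node; otherwise rebuild."""
    if node["alloc_jobs"] and not at_startup:
        log.append(f"gres count changed on {node['name']}; draining")
        node["state"] = "DRAIN"          # structures left untouched
    else:
        node["gres_count"] = new_count   # safe to rebuild
    return node

log = []
busy = {"name": "gpu-9-3", "gres_count": 8, "alloc_jobs": [2041855], "state": "IDLE"}
idle = {"name": "gpu-9-4", "gres_count": 8, "alloc_jobs": [], "state": "IDLE"}

busy = handle_gres_config_change(busy, 4, at_startup=False, log=log)
idle = handle_gres_config_change(idle, 4, at_startup=False, log=log)
print(busy["state"], busy["gres_count"])  # DRAIN 8
print(idle["state"], idle["gres_count"])  # IDLE 4
```

Draining instead of rebuilding trades availability for consistency: the node stops taking new work, but the controller's accounting for the running jobs is never invalidated.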

Both of these changes will be in the v14.11.6 release, probably in late April.
Comment 16 Kilian Cavalotti 2015-04-10 11:25:10 MDT
(In reply to Moe Jette from comment #15)
> I believe that I have both of these problems fixed.
> 
> Both of these changes will be in v14.11.6 release, probably in the late
> April.

Excellent, thank you so much!
Comment 17 Kilian Cavalotti 2015-04-13 06:04:48 MDT
Hi Moe, 

> The underflow I am not certain about, but if the slurmd reports to slurmctld
> a change in the GRES configuration (say it can't read the gres.conf file or
> the count is reset lower in slurm.conf, there are GRES "type" values added,
> or some other possibilities), then under some conditions the slurmctld will
> clear the GRES data structures. If a job was actually allocated GRES, that
> results in an underflow when deallocating the job's resources. I changed the
> logic so that if the GRES configuration changes while a job is allocated to
> the node, the event is logged, the data structures are NOT changed, and the
> node is DRAINED. If there is no active job or this happens at slurmctld
> startup, then the data structures are built appropriately. Again, I'm not
> sure if this explains what you saw, but this fix is in the commit:
> https://github.com/SchedMD/slurm/commit/
> 3a6fd83ccebd81ce4eb2780bad40ff353f5273c9

I finally got a chance to reboot the nodes, but after the reboot I still see the messages, correlated with "Error reading step jobid.xxx memory limits" on the nodes:

slurmctld.log
-- 8< ----------------------------------------------------------------------
[2015-04-13T11:00:08.565] error: gres/gpu: job 2065318 dealloc node gpu-9-3 type gtx gres count underflow
[2015-04-13T11:00:08.566] error: gres/gpu: job 2064902 dealloc node gpu-9-2 type tesla gres count underflow
[2015-04-13T11:00:08.566] error: gres/gpu: job 2064903 dealloc node gpu-9-2 type tesla gres count underflow
[2015-04-13T11:00:08.566] error: gres/gpu: job 2064906 dealloc node gpu-9-2 topo gres count underflow
[2015-04-13T11:00:08.566] error: gres/gpu: job 2064906 dealloc node gpu-9-2 type tesla gres count underflow
[2015-04-13T11:00:08.566] error: gres/gpu: job 2064907 dealloc node gpu-9-2 topo gres count underflow
[2015-04-13T11:00:08.566] error: gres/gpu: job 2064907 dealloc node gpu-9-2 type tesla gres count underflow
-- 8< ----------------------------------------------------------------------

slurmd.log on gpu-9-2
-- 8< ----------------------------------------------------------------------
[2015-04-13T11:00:49.764] error: Error reading step 2064907.0 memory limits
[2015-04-13T11:00:49.781] error: Error reading step 2064910.4294967294 memory limits
[2015-04-13T11:00:49.797] error: Error reading step 2064906.0 memory limits
[2015-04-13T11:00:49.815] error: Error reading step 2064906.4294967294 memory limits
[2015-04-13T11:00:49.826] error: Error reading step 2064903.0 memory limits
[2015-04-13T11:00:49.844] error: Error reading step 2064902.4294967294 memory limits
[2015-04-13T11:00:49.855] error: Error reading step 2064909.4294967294 memory limits
[2015-04-13T11:00:49.872] error: Error reading step 2064911.4294967294 memory limits
[2015-04-13T11:00:49.888] error: Error reading step 2064903.4294967294 memory limits
[2015-04-13T11:00:49.899] error: Error reading step 2064902.0 memory limits
[2015-04-13T11:00:49.916] error: Error reading step 2064911.0 memory limits
[2015-04-13T11:00:49.930] error: Error reading step 2064910.0 memory limits
[2015-04-13T11:00:49.941] error: Error reading step 2064907.4294967294 memory limits
[2015-04-13T11:00:49.955] error: Error reading step 2064908.4294967294 memory limits
[2015-04-13T11:00:49.966] error: Error reading step 2064909.0 memory limits
[2015-04-13T11:00:49.984] error: Error reading step 2064908.0 memory limits
-- 8< ----------------------------------------------------------------------

They're still on 14.11.5 without your patch, though; I thought that draining the nodes would be enough to make the messages disappear.
Comment 18 Moe Jette 2015-04-13 06:28:31 MDT
The "memory limits" errors will not go away without the patch.

Regarding the "gres underflow": could you grep for those node names in the slurmctld.log file? I'm looking for information about what happened on those nodes between the reboot and the first underflow error messages.
Comment 19 Kilian Cavalotti 2015-04-13 07:41:11 MDT
Here it is:

-- 8< ----------------------------------------------------------------------------
[root@sherlock-slurm01 ~]# grep gpu-9-3 /var/log/slurm/slurmctld.log   
[2015-04-13T08:48:40.489] update_node: node gpu-9-3 state set to IDLE
[2015-04-13T10:16:56.570] node gpu-9-3 returned to service
[2015-04-13T10:17:11.352] backfill: Started JobId=2064904 on gpu-9-3
[2015-04-13T10:17:11.416] backfill: Started JobId=2064905 on gpu-9-3
[2015-04-13T10:17:11.462] backfill: Started JobId=1854142 on gpu-9-3
[2015-04-13T10:17:11.710] backfill: Started JobId=2065318 on gpu-9-3
[2015-04-13T10:50:37.708] error: Node gpu-9-3 appears to have a different slurm.conf than the slurmctld.  This could cause issues with communication and functionality.  Please review both files and make sure they are the same.  If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.
[2015-04-13T10:51:03.558] error: gres/gpu: job 2065318 dealloc node gpu-9-3 type gtx gres count underflow
[2015-04-13T10:51:03.559] error: gres/gpu: job 2065318 dealloc node gpu-9-3 type gtx gres count underflow
[2015-04-13T10:51:03.560] error: gres/gpu: job 2065318 dealloc node gpu-9-3 type gtx gres count underflow
[2015-04-13T10:51:03.561] error: gres/gpu: job 2065318 dealloc node gpu-9-3 type gtx gres count underflow
[2015-04-13T10:51:03.562] error: gres/gpu: job 2065318 dealloc node gpu-9-3 type gtx gres count underflow
[... 2300 identical messages ...]
-- 8< ----------------------------------------------------------------------------

The message about having a different Slurm config is dubious, because the Slurm config files should really be the same on the nodes, but I'll double-check everything to make sure that it's indeed the case.
Comment 20 Moe Jette 2015-04-13 08:28:46 MDT
(In reply to Kilian Cavalotti from comment #19)
> The message about having a different Slurm config is dubious, because the
> Slurm config files should really be the same on the nodes, but I'll
> double-check everything to make sure that it's indeed the case.

There is a checksum on the configuration read from slurm.conf. Any files included in slurm.conf are also included in the checksum calculation. Note the checksum reported by slurmd on the compute node does not change until it re-reads the configuration (i.e. slurmd may be reporting the configuration checksum that it started with and not the files on the node currently). Does that help?
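To illustrate the stale-checksum behavior described above, here is a minimal shell sketch (not real slurmd code, and the file path is a throwaway temp file): the checksum computed when the file was first read can describe a configuration that no longer matches what is on disk, which is exactly why a restarted or reconfigured slurmd may report a hash the controller does not expect.

```shell
# Illustrative sketch only: slurmd computes the configuration checksum when
# it reads slurm.conf and keeps reporting that cached value until it
# re-reads the file, so the reported hash can describe a file that has
# since been edited on disk.
conf=/tmp/demo_slurm.conf
printf 'NodeName=gpu-9-1 Gres=gpu:tesla:8\n' > "$conf"
startup_hash=$(cksum < "$conf")            # hash cached at daemon startup
printf 'NodeName=gpu-9-1 Gres=gpu:gtx:8\n' > "$conf"   # file edited later
current_hash=$(cksum < "$conf")            # hash of the file now on disk
echo "cached at startup: $startup_hash"
echo "on disk now:       $current_hash"
```

The two printed hashes differ even though only one token changed, mirroring how the daemon's cached checksum diverges from the on-disk file until a restart, SIGHUP, or "scontrol reconfig" forces a re-read.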
Comment 21 Moe Jette 2015-04-13 08:31:23 MDT
(In reply to Kilian Cavalotti from comment #19)
> Here it is:
> ----------------------------------------------------------------------------
> [root@sherlock-slurm01 ~]# grep gpu-9-3 /var/log/slurm/slurmctld.log   
> [2015-04-13T08:48:40.489] update_node: node gpu-9-3 state set to IDLE
> [2015-04-13T10:16:56.570] node gpu-9-3 returned to service
> [2015-04-13T10:17:11.352] backfill: Started JobId=2064904 on gpu-9-3
> [2015-04-13T10:17:11.416] backfill: Started JobId=2064905 on gpu-9-3
> [2015-04-13T10:17:11.462] backfill: Started JobId=1854142 on gpu-9-3
> [2015-04-13T10:17:11.710] backfill: Started JobId=2065318 on gpu-9-3
> [2015-04-13T10:50:37.708] error: Node gpu-9-3 appears to have a different
> slurm.conf than the slurmctld.  This could cause issues with communication
> and functionality.  Please review both files and make sure they are the
> same.  If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your
> slurm.conf.
> [2015-04-13T10:51:03.558] error: gres/gpu: job 2065318 dealloc node gpu-9-3
> type gtx gres count underflow
> [2015-04-13T10:51:03.559] error: gres/gpu: job 2065318 dealloc node gpu-9-3
> type gtx gres count underflow
> [2015-04-13T10:51:03.560] error: gres/gpu: job 2065318 dealloc node gpu-9-3
> type gtx gres count underflow
> [2015-04-13T10:51:03.561] error: gres/gpu: job 2065318 dealloc node gpu-9-3
> type gtx gres count underflow
> [2015-04-13T10:51:03.562] error: gres/gpu: job 2065318 dealloc node gpu-9-3
> type gtx gres count underflow
> [... 2300 identical messages ...]

Can you provide the values of the GRES field on those 4 jobs, at least if they are still available:
scontrol show job 2064904 | grep Gres
scontrol show job 2064905 | grep Gres
scontrol show job 1854142 | grep Gres
scontrol show job 2065318 | grep Gres
Comment 22 Kilian Cavalotti 2015-04-13 08:33:34 MDT
(In reply to Moe Jette from comment #20)
> (In reply to Kilian Cavalotti from comment #19)
> > The message about having a different Slurm config is dubious, because the
> > Slurm config files should really be the same on the nodes, but I'll
> > double-check everything to make sure that it's indeed the case.
> 
> There is a checksum on the configuration read from slurm.conf. Any files
> included in slurm.conf are also included in the checksum calculation. Note
> the checksum reported by slurmd on the compute node does not change until it
> re-reads the configuration (i.e. slurmd may be reporting the configuration
> checksum that it started with and not the files on the node currently). Does
> that help?

Yes, I checked the checksums of everything under /etc/slurm, and they match now. So no more "configuration mismatch" errors on restarting the slurmds on the nodes. Yet I'm still getting "gres underflow" messages as soon as new jobs start on those machines. Maybe I should try to re-drain and reboot them again?
Comment 23 Kilian Cavalotti 2015-04-13 08:34:21 MDT
> Can you provide the values of the GRES field on those 4 jobs, at least if
> they are still available:
> scontrol show job 2064904 | grep Gres
> scontrol show job 2064905 | grep Gres
> scontrol show job 1854142 | grep Gres
> scontrol show job 2065318 | grep Gres

Sure:

# scontrol show job 2064904 | grep Gres
   Features=titanblack Gres=gpu:1 Reservation=(null)
# scontrol show job 2064905 | grep Gres
   Features=titanblack Gres=gpu:1 Reservation=(null)
# scontrol show job 1854142 | grep Gres
   Features=titanblack Gres=gpu:2 Reservation=(null)
# scontrol show job 2065318 | grep Gres
   Features=titanblack Gres=gpu:1 Reservation=(null)
Comment 24 Moe Jette 2015-04-13 08:52:47 MDT
(In reply to Kilian Cavalotti from comment #22)
> (In reply to Moe Jette from comment #20)
> > (In reply to Kilian Cavalotti from comment #19)
> > > The message about having a different Slurm config is dubious, because the
> > > Slurm config files should really be the same on the nodes, but I'll
> > > double-check everything to make sure that it's indeed the case.
> > 
> > There is a checksum on the configuration read from slurm.conf. Any files
> > included in slurm.conf are also included in the checksum calculation. Note
> > the checksum reported by slurmd on the compute node does not change until it
> > re-reads the configuration (i.e. slurmd may be reporting the configuration
> > checksum that it started with and not the files on the node currently). Does
> > that help?
> 
> Yes, I checked the checksums of everything under /etc/slurm, and they match
> now. So no more "configuration mismatch" errors on restarting the slurmds on
> the nodes. Yet I'm still getting "gres underflow" messages as soon as new
> jobs start on those machines. Maybe I should try to re-drain and reboot them
> again?

That could help, but at this point I'm not really sure what is happening. Have there been any changes in your configuration with respect to GRES? Perhaps the GRES types were added? Perhaps CPU IDs were added?
Comment 25 Kilian Cavalotti 2015-04-13 09:39:31 MDT
> That could help, but at this point I'm not really sure what is happening.
> Have there been any changes in your configuration with respect to GRES?
> Perhaps the GRES types were added? Perhaps CPU IDs were added?

Ok, I'll try to reboot once again.

CPU IDs have been refactored to keep the gres.conf file compact, and gres "types" were added when we moved to 14.11. Our slurm.conf now contains:
-- 8< --------------------------------------------------------------------------
NodeName=gpu-9-[1-2]                RealMemory=256000  Weight=1000 Gres=gpu:tesla:8
NodeName=gpu-9-[3-5]                RealMemory=256000  Weight=1000 Gres=gpu:gtx:8
NodeName=gpu-9-[6-9]                RealMemory=64000   Weight=1000 Gres=gpu:gtx:4
NodeName=gpu-9-10                   RealMemory=64000   Weight=1000 Gres=gpu:gtx:4
NodeName=gpu-14-[1-9],gpu-13-[1-2]  RealMemory=64000   Weight=1000 Gres=gpu:gtx:8
-- 8< --------------------------------------------------------------------------
"gtx" and "tesla" didn't exist before. But we added types on all GPU nodes, and apparently, only gpu-9-x exhibit that problem. Didn't see any "underflow" message for gpu-[13-14]-x.

gres.conf is:
-- 8< --------------------------------------------------------------------------
# 4 GPUs nodes
NodeName=gpu-9-[6-10] Name=gpu Type=gtx File=/dev/nvidia[0-1] CPUs=[0-7]
NodeName=gpu-9-[6-10] Name=gpu Type=gtx File=/dev/nvidia[2-3] CPUs=[8-15]
# 8 GPUs nodes
NodeName=gpu-9-[1-2]  Name=gpu Type=tesla File=/dev/nvidia[0-3] CPUs=[0-7]
NodeName=gpu-9-[1-2]  Name=gpu Type=tesla File=/dev/nvidia[4-7] CPUs=[8-15]
NodeName=gpu-9-[3-5],gpu-13-[1-2],gpu-14-[1-9] Name=gpu Type=gtx File=/dev/nvidia[0-3] CPUs=[0-7]
NodeName=gpu-9-[3-5],gpu-13-[1-2],gpu-14-[1-9] Name=gpu Type=gtx File=/dev/nvidia[4-7] CPUs=[8-15]
-- 8< --------------------------------------------------------------------------

It's still possible that the jobs which generate those messages started with a different slurm.conf, but we've restarted the Slurm daemons many times since that changed, and all of the jobs submitted under the old version are now gone.

I noticed that as soon as I drain the nodes, the messages disappear (while the jobs are still running). If I "resume" the nodes, messages reappear right away.
Comment 26 Moe Jette 2015-04-13 10:04:42 MDT
(In reply to Kilian Cavalotti from comment #25)
> It's still possible that the jobs which generates those messages started
> with a different slurm.conf, but we've been restarting Slurm daemon many
> times since that changed, and all of the jobs submitted under the old
> version are now gone. 

The slurm.conf in effect at job submit time shouldn't matter. The only issue would be the slurm.conf and gres.conf in effect when slurmctld and slurmd restart (or reconfigure, which re-reads the files). I'll try that sequence though, just in case...

> I noticed that as soon as I drain the nodes, the messages disappear (while
> the jobs are still running). If I "resume" the nodes, messages reappear
> right away.

That would be because Slurm isn't trying to schedule jobs on nodes in the DRAIN state, so it's not going through the logic that removes those jobs and re-allocates their resources (a simulation used to determine when and where pending jobs will start). So that shouldn't be relevant.
Comment 27 Kilian Cavalotti 2015-04-13 11:23:08 MDT
(In reply to Moe Jette from comment #26)
> The slurm.conf in effect at job submit time shouldn't matter. The only issue
> would be the slurm.conf and gres.conf in effect when slurmctld and slurmd
> restart (or reconfigure, which re-reads the files). I'll try that sequence
> though, just in case...

Ok. Is there a way to know which of the individual configuration files, or which part of them, doesn't match?

> That would be because Slurm isn't trying to scheduled jobs on the nodes in
> DRAIN state, so it's not going through the logic to attempt removing those
> jobs and re-allocating the job's resources (simulated at a later time to
> determine when and where pending jobs will start). So that shouldn't be
> relevan.t

Ok, that makes sense, thanks.
Comment 28 Moe Jette 2015-04-13 11:32:37 MDT
(In reply to Kilian Cavalotti from comment #27)
> (In reply to Moe Jette from comment #26)
> > The slurm.conf in effect at job submit time shouldn't matter. The only issue
> > would be the slurm.conf and gres.conf in effect when slurmctld and slurmd
> > restart (or reconfigure, which re-reads the files). I'll try that sequence
> > though, just in case...
> 
> Ok. Is there a way to know which or which part of the individual
> configuration files doesn't match? 

Slurm isn't performing a line-by-line comparison, but builds a checksum while scanning the configuration file. The slurmd daemon on the compute node sends that checksum (along with CPU count, memory size, etc.) to the slurmctld on the head node, and any checksum inconsistency is logged. The file contents might be functionally equivalent, but an extra space somewhere would throw off the checksum. I'm not sure how you are managing these files, but running something like cksum on the relevant file(s) should identify the inconsistencies.
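A quick way to see the sensitivity described above (file names and contents below are made up for the demo, not your actual configs): two copies that differ only by one extra space already produce different checksums, which is enough to trigger the "different slurm.conf" warning.

```shell
# Illustrative demo: a single extra space between two tokens is enough to
# change the checksum, even though Slurm would parse both files identically.
printf 'ControlMachine=ctl\nNodeName=gpu-9-3 Gres=gpu:gtx:8\n'  > /tmp/slurm.conf.ctld
printf 'ControlMachine=ctl\nNodeName=gpu-9-3  Gres=gpu:gtx:8\n' > /tmp/slurm.conf.node
cksum /tmp/slurm.conf.ctld /tmp/slurm.conf.node   # prints two different checksums
```

Running cksum (or md5sum) on the controller's copy and on each node's copy, and diffing the output, should pinpoint which node's file drifted.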
Comment 29 Kilian Cavalotti 2015-04-14 03:51:16 MDT
(In reply to Moe Jette from comment #28)
> (In reply to Kilian Cavalotti from comment #27)
> > (In reply to Moe Jette from comment #26)
> > > The slurm.conf in effect at job submit time shouldn't matter. The only issue
> > > would be the slurm.conf and gres.conf in effect when slurmctld and slurmd
> > > restart (or reconfigure, which re-reads the files). I'll try that sequence
> > > though, just in case...
> > 
> > Ok. Is there a way to know which or which part of the individual
> > configuration files doesn't match? 
> 
> Slurm isn't performing a line-by-line comparison, but builds a checksum
> while scanning the configuration file. The slurmd daemon on the compute node
> sends that checksum (along with CPU count, memory size, etc.) to the
> slurmctld on the head node, and any checksum inconsistency is logged. The
> file contents might be functionally equivalent, but an extra space somewhere
> would throw off the checksum. I'm not sure how you are managing these files,
> but running something like cksum on the relevant file(s) should identify the
> inconsistencies.

I see. So our config files are identical at the checksum level and we use Rocks' mechanisms to distribute the files (411). It may happen that extra files, not used by the slurm config, are present in /etc/slurm, so I was wondering if that could matter. But since you mentioned that checksums are built while the config is scanned, I don't think it would generate the CONF_HASH messages.

But anyway, I think I have it under control now. The messages seem to have disappeared after the last GPU node reboot. So I guess we're good for now.

Thanks a lot for your help!
Comment 30 Moe Jette 2015-04-14 04:08:36 MDT
(In reply to Kilian Cavalotti from comment #29)
> (In reply to Moe Jette from comment #28)
> > (In reply to Kilian Cavalotti from comment #27)
> > > (In reply to Moe Jette from comment #26)
> > > > The slurm.conf in effect at job submit time shouldn't matter. The only issue
> > > > would be the slurm.conf and gres.conf in effect when slurmctld and slurmd
> > > > restart (or reconfigure, which re-reads the files). I'll try that sequence
> > > > though, just in case...
> > > 
> > > Ok. Is there a way to know which or which part of the individual
> > > configuration files doesn't match? 
> > 
> > Slurm isn't performing a line-by-line comparison, but builds a checksum
> > while scanning the configuration file. The slurmd daemon on the compute node
> > sends that checksum (along with CPU count, memory size, etc.) to the
> > slurmctld on the head node, and any checksum inconsistency is logged. The
> > file contents might be functionally equivalent, but an extra space somewhere
> > would throw off the checksum. I'm not sure how you are managing these files,
> > but running something like cksum on the relevant file(s) should identify the
> > inconsistencies.
> 
> I see. So our config files are identical at the checksum level and we use
> Rocks' mechanisms to distribute the files (411). It may happen that extra
> files, not used by the slurm config, are present in /etc/slurm, so I was
> wondering if that could matter. But since you mentioned that checksums are
> built while the config is scanned, I don't think it would generate the
> CONF_HASH messages.

Slurm only generates a checksum on the slurm.conf file, plus the files it includes. It does not check gres.conf or other files in /etc/slurm. I suspect what happened is that the file changed, but the Slurm daemon was not restarted or notified to re-read the file (via "scontrol reconfig" or SIGHUP). Until the new configuration file is read, the daemons keep working off the old data and report that checksum. So I think the error reported in comment 19 was a good lead: "Node gpu-9-3 appears to have a different slurm.conf than the slurmctld."


> But anyway, I think I have it under control now. The messages seem to have
> disappeared now, after the last gpu nodes reboot. So I guess we're good for
> now.

I was able to reproduce the "underflow" error with very specific changes in configuration files, which is probably the root cause of the error. The changes that I made last week do fix that scenario.

> Thanks a lot for your help!

You are welcome. I'll close the ticket now.