Hello,

I see some errors that are flooding the slurmctld logs. Can you please explain what they indicate and what the impact on the system is?

The first one is:

error: gres/gpu: job 6457286 dealloc node dgpu501-14 type gtx1080ti gres count underflow (0 1)

It appears for different jobs and GPU nodes.

The second error message is:

error: select/cons_res: node cn603-15-r memory is under-allocated (61440-81920) for JobId=6426888

This also appears for different jobs and nodes.

Thanks,
Ahmed
Hi Ahmed,

Neither error should appear; they indicate an internal malfunction that we are working to fix.

> The first one is
> error: gres/gpu: job 6457286 dealloc node dgpu501-14 type gtx1080ti gres
> count underflow (0 1)
> It appears for different jobs and GPU nodes

This one is not fixed yet, but we are aware of it and are working on it in bug 7468.

> The second error message is
> error: select/cons_res: node cn603-15-r memory is under-allocated
> (61440-81920) for JobId=6426888
> This also appears for different jobs and nodes

This is already fixed in the slurm-19.05 branch and will be released as part of 19.05.3. See bug 6769 comment 41 for details.

Regards,
Albert
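
For anyone who wants to gauge how widespread the two messages are while waiting for the fixes, here is a minimal sketch that tallies them per node. It is not part of Slurm: the function name `summarize` and the regular expressions are assumptions derived only from the two sample lines quoted in this ticket, so they may need adjusting if your log lines differ.

```python
import re
from collections import Counter

# Patterns modelled on the two sample lines in this ticket (an assumption,
# not an official description of the slurmctld log format).
GRES_RE = re.compile(
    r"error: gres/gpu: job (?P<job>\d+) dealloc node (?P<node>\S+) "
    r"type (?P<type>\S+) gres count underflow \((?P<have>\d+) (?P<want>\d+)\)"
)
MEM_RE = re.compile(
    r"error: select/cons_res: node (?P<node>\S+) memory is under-allocated "
    r"\((?P<low>\d+)-(?P<high>\d+)\) for JobId=(?P<job>\d+)"
)


def summarize(lines):
    """Return (gres_underflow_per_node, mem_under_allocated_per_node)."""
    gres_nodes = Counter()
    mem_nodes = Counter()
    for line in lines:
        # search() rather than match() so a leading timestamp is tolerated.
        m = GRES_RE.search(line)
        if m:
            gres_nodes[m.group("node")] += 1
            continue
        m = MEM_RE.search(line)
        if m:
            mem_nodes[m.group("node")] += 1
    return gres_nodes, mem_nodes


# The two messages quoted in this ticket, used as sample input.
SAMPLE = [
    "error: gres/gpu: job 6457286 dealloc node dgpu501-14 type gtx1080ti "
    "gres count underflow (0 1)",
    "error: select/cons_res: node cn603-15-r memory is under-allocated "
    "(61440-81920) for JobId=6426888",
]

if __name__ == "__main__":
    gres, mem = summarize(SAMPLE)
    print("gres underflow per node:", dict(gres))
    print("memory under-allocated per node:", dict(mem))
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `summarize(open(path))`, where `path` is wherever your site writes the slurmctld log.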
Hi Ahmed,

If this is OK with you, I'm closing this bug as a duplicate of bug 7468.

Regards,
Albert

*** This ticket has been marked as a duplicate of ticket 7468 ***