| Summary: | Percentage of memory used during a job | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Steve Shortino <steve.shortino> |
| Component: | Accounting | Assignee: | Ben Roberts <ben> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | | |
| Version: | 17.11.8 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | FRB Kansas | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | | CLE Version: | |
| Version Fixed: | | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
Description
Steve Shortino
2019-05-01 14:07:12 MDT
Hi Steve,

The way you're doing the calculation, MaxVMSize/ReqMem, is the correct way to show users their used vs. requested RAM. One thing to note: if you're using jobacct_gather/linux, the usage information collected for a job is pretty good, but it can be a little off if the job gets interrupted. If you use jobacct_gather/cgroup, the collected data is much more accurate in cases where something goes wrong.

I looked into whether there is an existing tool that shows a percentage like that, but there isn't. The correct way to get that information is to pull the data from sacct and calculate it, as you're doing.

Let me know if you have additional questions about this.

Thanks,
Ben

Hello Ben,

Thank you for the info! I actually ran into an issue calculating this that has revealed some problems, maybe with how I am reading the memory usage. Can you help me figure out why I am seeing this?

```
JobID|ReqMem|MaxVMSize|ReqCPUS|State|JobName
34233|5Gn||14|FAILED|Model1_Random_Forest
34233.batch|5Gn|189816K|14|FAILED|batch
34233.0|5Gn|2048908K|14|CANCELLED by 3577|Rscript
34234|18Gn||14|FAILED|Model1_Random_Forest
34234.batch|18Gn|189816K|14|FAILED|batch
34234.0|18Gn|2067148K|14|CANCELLED by 3577|Rscript
34235|21Gn||14|FAILED|Model1_Random_Forest
34235.batch|21Gn|189816K|14|FAILED|batch
34235.0|21Gn|2073420K|14|CANCELLED by 3577|Rscript
34236|23Gn||14|FAILED|Model1_Random_Forest
34236.batch|23Gn|189816K|14|FAILED|batch
34236.0|23Gn|2073304K|14|CANCELLED by 3577|Rscript
34237|25Gn||14|COMPLETED|Model1_Random_Forest
34237.batch|25Gn|189816K|14|COMPLETED|batch
34237.0|25Gn|2079604K|14|COMPLETED|Rscript
```

Note that while the job only completed at 25G requested, it reports a MaxVMSize of only about 2G.
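For concreteness, the used-vs-requested calculation being discussed can be sketched as a small standalone helper (our own illustration, not a Slurm tool). It assumes the 17.11-era sacct size format shown in the table above: K/M/G/T unit suffixes, plus a trailing "n" (per node) or "c" (per CPU) on ReqMem; scaling per-CPU requests by CPU count is left to the caller.

```python
# Sketch: compute percent of requested memory used, from sacct size strings
# like '5Gn' (ReqMem) and '2048908K' (MaxVMSize/MaxRSS). Helper names are ours.

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

def to_bytes(field):
    """Convert a sacct size string to bytes; returns None for empty fields.
    A trailing per-node ('n') or per-CPU ('c') marker is stripped."""
    field = field.rstrip("nc")
    if not field:
        return None
    unit = field[-1].upper()
    if unit in UNITS:
        return float(field[:-1]) * UNITS[unit]
    return float(field)  # bare number, already bytes

def mem_percent(req_mem, used):
    """Percentage of requested memory actually used, or None if unknown."""
    req, use = to_bytes(req_mem), to_bytes(used)
    if not req or use is None:
        return None
    return 100.0 * use / req

# Example from the table above: step 34233.0 requested 5Gn and
# reported a MaxVMSize of 2048908K.
print(f"{mem_percent('5Gn', '2048908K'):.1f}%")
```

Feeding it `sacct -P` output line by line then reduces the whole report to one percentage column per step.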
I know that Slurm is seeing the rest of the memory usage, since my output logs have these errors in them:

```
Model1_Random_Forest.2019-05-09T125236.34235.qlog:slurmstepd: error: Step 34235.0 exceeded memory limit (22640340 > 22020096), being killed
Model1_Random_Forest.2019-05-09T125708.34236.qlog:slurmstepd: error: Step 34236.0 exceeded memory limit (25315208 > 24117248), being killed
```

Please let me know what you think, and if there is anything else I can provide to help clear this up. Right now the percentage I am calculating does not appear at all accurate.

Best regards,
Steve

Hi Steve,

I have to amend my previous statement that MaxVMSize/ReqMem is the proper way to calculate the percentage of memory used. MaxVMSize is the amount of swap the job used. You can find the amount of RAM the job used by looking at MaxRSS.

If you're using cgroups, you can control the percentage of RAM vs. swap that jobs are allowed to use with AllowedRAMSpace and AllowedSwapSpace in the cgroup.conf file. There's more information about these parameters, and others, in the cgroup.conf documentation: https://slurm.schedmd.com/cgroup.conf.html

With that in mind, you can decide whether you want to include swap in the calculation of memory used vs. requested. You could do either MaxRSS/ReqMem or (MaxRSS + MaxVMSize)/ReqMem.

My apologies that I didn't catch that the first time; I was more concerned with whether there was a tool that already did that calculation. Let me know if you have any additional questions.

Thanks,
Ben

Hello Ben,

I was out of the office on Friday. Thank you for the advice; using MaxRSS is producing numbers that look better. I am using cgroups for control and resource tracking. However, this does bring up another question (which may or may not be relevant): my compute nodes are all diskless with no swap configured, so what is MaxVMSize capturing?
I will do some more testing, but right now MaxRSS/ReqMem is providing the numbers I expect from the jobs I've run so far.

Thanks and regards,
Steve

Hi Steve,

I've been looking into where the MaxVMSize comes from. This comes back to the way Linux calculates the Virtual Size (VSZ), which isn't just the amount of swap space used.
As an example, on my system you can see that I have 0 swap in use currently:

```
$ swapon -s
Filename   Type   Size     Used  Priority
/swapfile  file   2097148  0     -2
```

But if I look at 'ps' for the processes with the most VSZ used, there are some processes with quite a bit:

```
$ ps aux --sort -vsz | head -n5
USER  PID   %CPU  %MEM  VSZ        RSS     TTY  STAT  START  TIME   COMMAND
ben   5863  0.0   0.2   268771560  41716   ?    Sl    09:14  0:03   /usr/lib/x86_64-linux-gnu/libexec/baloorunner
ben   2450  0.0   0.4   268706772  73644   ?    SNl   08:59  0:08   /usr/bin/baloo_file
ben   2454  0.3   3.8   6909856    619080  ?    SLl   08:59  1:55   /usr/bin/plasmashell
ben   2448  5.9   1.5   3170260    254136  ?    Sl    08:59  29:01  /usr/bin/kwin_x11 -session 1012012111e93000154688027200000015320004_1557786765_781859
```

There are some good descriptions of what's included in the reported VSZ:

> "VSZ is the Virtual Memory Size. It includes all memory that the process can access, including memory that is swapped out, memory that is allocated, but not used, and memory that is from shared libraries."
> https://stackoverflow.com/questions/7880784/what-is-rss-and-vsz-in-linux-memory-management

> "VSZ is virtual memory which a process can use while RSS is physical memory actually allocated at the moment."
> https://stackoverflow.com/questions/31867856/vsz-vs-rss-memory-and-swap-space

I assume that if you look at one of your diskless nodes you'll see similar behavior, with usage reported in the VSZ column of 'ps'. Let me know if that's not the case or if you have additional questions about this.

Thanks,
Ben

Hi Steve,

I wanted to follow up and make sure the information I sent about the Virtual Size made sense and that you were able to get the information you needed from the report. Let me know if you have any additional questions about this.

Thanks,
Ben

Hi Steve,

The information I sent should have helped clarify what the reported Virtual Size field meant, and I haven't heard a follow-up question.
I'll close this ticket as 'InfoGiven', but feel free to update the ticket if you do have additional questions about this.

Thanks,
Ben
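As a footnote to Ben's VSZ explanation, the distinction is easy to demonstrate directly (this sketch assumes a Linux /proc filesystem): merely reserving address space inflates the virtual size, while resident memory only grows when pages are actually touched, which is why a swapless, diskless node can still report a large MaxVMSize.

```python
import mmap

def vm_kb():
    """Read VmSize and VmRSS (in kB) for this process from /proc (Linux-only)."""
    out = {}
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                key, rest = line.split(":", 1)
                out[key] = int(rest.split()[0])
    return out

before = vm_kb()
# Reserve 512 MiB of anonymous memory but never write to it:
region = mmap.mmap(-1, 512 * 1024 * 1024)
after = vm_kb()

# Virtual size jumps by the full reservation; resident size barely moves.
print("VmSize grew by", (after["VmSize"] - before["VmSize"]) // 1024, "MiB")
print("VmRSS grew by", (after["VmRSS"] - before["VmRSS"]) // 1024, "MiB")
region.close()
```

The same demand-paging behavior applies to allocations made by R or shared libraries inside a job step, so MaxVMSize tracks promised address space rather than swap actually consumed.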