Ticket 8565 - Where does --gpus=X flag show up in job_descriptor?
Summary: Where does --gpus=X flag show up in job_descriptor?
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 18.08.9
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Marcin Stolarek
Reported: 2020-02-25 05:59 MST by Mikael Öhman
Modified: 2020-02-28 06:42 MST
Site: SNIC
SNIC sites: C3SE
Description Mikael Öhman 2020-02-25 05:59:07 MST
We have a few special rules we would like to enforce on our GPU nodes.
I have successfully done so in our job_submit.lua script for the GRES option "--gres=gpu:X" by parsing the job_desc.gres field.

But, I can't find any gpu-fields in the job_descriptor struct:
https://github.com/SchedMD/slurm/blob/master/slurm/slurm.h.in#L1383

Can I not detect this value from the job_submit.lua hook?
Or if I can, which field is it?

Best regards, Mikael
Comment 1 Marcin Stolarek 2020-02-25 06:15:33 MST
Mikael,

You're right. Although "gres" is still available in job_submit.lua for backward compatibility, it is now a synonym of tres_per_node. Depending on the option used, GPU devices should be visible under one of the following:
- tres_per_node
- tres_per_job
- tres_per_socket
- tres_per_task

cheers,
Marcin
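
A minimal sketch, in Lua, of how a job_submit.lua hook might look for a GPU request across the four fields listed above. The helper names gpu_count and find_gpu_requests are hypothetical, and the string format is an assumption ("gpu:2", "gpu:v100:2", optionally prefixed with "gres:" or "gres/"); check what your Slurm version actually passes to the script before relying on it.

```lua
-- Fields that may carry a GPU request, per the list above.
local TRES_FIELDS = {
    "tres_per_node", "tres_per_job", "tres_per_socket", "tres_per_task",
}

-- Return the GPU count found in one TRES string, or nil if there is none.
-- ASSUMPTION: entries are comma-separated and look like "gpu:2" or
-- "gpu:v100:2", possibly with a "gres:" or "gres/" prefix. A GPU entry
-- without a trailing count is treated as 1.
local function gpu_count(tres)
    if tres == nil then
        return nil
    end
    for entry in string.gmatch(tres, "[^,]+") do
        local spec = entry:gsub("^gres[:/]", "")
        if spec == "gpu" then
            return 1
        end
        if spec:match("^gpu:") then
            return tonumber(spec:match(":(%d+)$")) or 1
        end
    end
    return nil
end

-- Collect { field_name = gpu_count } for every field that requests GPUs.
local function find_gpu_requests(job_desc)
    local found = {}
    for _, field in ipairs(TRES_FIELDS) do
        local n = gpu_count(job_desc[field])
        if n then
            found[field] = n
        end
    end
    return found
end

-- Usage with a plain table standing in for job_desc:
local requests = find_gpu_requests({ tres_per_job = "gpu:2" })
print(requests.tres_per_job)  --> 2
```

Inside slurm_job_submit the same helper could be called on the real job_desc; unset fields read as nil there, which the sketch treats as "no request".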
Comment 2 Mikael Öhman 2020-02-25 06:26:50 MST
What happens if I need to modify these fields? Should I modify both versions?
Comment 3 Marcin Stolarek 2020-02-25 07:50:31 MST
Each of those is directly connected with the "job description" provided by the user, so setting a GPU GRES in tres_per_job has the meaning of --gpus, tres_per_socket corresponds to --gpus-per-socket, etc.

Different combinations trigger slightly different logic on the slurmctld side.

If your policies are complicated, you may consider using cli_filter instead (starting from 20.02 it will also be available with a Lua interface). The obvious limitation is that these filters are "soft": a user who tampers with the client side can submit a job that doesn't satisfy them. The upside is that they run entirely on the user side, which allows higher throughput on slurmctld.

Let me know if you have more questions.

cheers,
Marcin
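
The field-to-option mapping described above could be used in a hook roughly as follows. This is a sketch under stated assumptions, not a reference implementation: the ALLOWED policy is hypothetical, the tres_per_node and tres_per_task entries of the mapping come from my reading of the Slurm documentation rather than from this ticket, the substring test for "gpu" is deliberately crude, and the slurm table is stubbed so the snippet also runs outside slurmctld.

```lua
-- Stub for running outside slurmctld; inside Slurm the real "slurm"
-- table is provided by the job_submit/lua plugin and this line is a no-op.
slurm = slurm or {
    SUCCESS = 0,
    ERROR = -1,
    log_user = function(fmt, ...) print(string.format(fmt, ...)) end,
}

-- Which command-line option fills which job_desc field.
local OPTION_FOR_FIELD = {
    tres_per_job    = "--gpus",
    tres_per_node   = "--gres / --gpus-per-node",
    tres_per_socket = "--gpus-per-socket",
    tres_per_task   = "--gpus-per-task",
}

-- HYPOTHETICAL site policy: only these request styles are accepted.
local ALLOWED = { tres_per_node = true, tres_per_job = true }

function slurm_job_submit(job_desc, part_list, submit_uid)
    for field, option in pairs(OPTION_FOR_FIELD) do
        local tres = job_desc[field]
        -- Plain substring search: any entry containing "gpu" counts.
        if tres and tres:find("gpu", 1, true) and not ALLOWED[field] then
            slurm.log_user("GPU requests via %s are not accepted here", option)
            return slurm.ERROR
        end
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```

Because the check runs in slurmctld, it cannot be bypassed from the client side, unlike a cli_filter rule.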
Comment 4 Mikael Öhman 2020-02-26 12:29:52 MST
Sorry, I should have clarified: should I modify both the legacy "job_desc.gres" and "job_desc.tres_per_node"? Since they are supposed to be synonyms, I'll just put the same value in both when I get around to updating our hooks.

Thank you for the help, you can close this ticket now.
(I'm not sure what I should select for resolving this ticket.)

Best regards, Mikael
Comment 5 Marcin Stolarek 2020-02-27 00:28:33 MST
Mikael,

They are synonyms for job_submit/lua only; it comes down to these lines in job_submit_lua.c:
>321         } else if (!xstrcmp(name, "gres")) {
>322                 /* "gres" replaced by "tres_per_node" in v18.08 */
>323                 lua_pushstring (L, job_ptr->tres_per_node);

>813         } else if (!xstrcmp(name, "gres")) {                                     
>814                 /* "gres" replaced by "tres_per_node" in v18.08 */               
>815                 lua_pushstring (L, job_desc->tres_per_node);                     

>1083         } else if (!xstrcmp(name, "gres")) {                                     
>1084                 /* "gres" replaced by "tres_per_node" in v18.08 */               
>1085                 value_str = luaL_checkstring(L, 3);                              
>1086                 xfree(job_desc->tres_per_node);                                  
>1087                 if (strlen(value_str))                                           
>1088                         job_desc->tres_per_node = xstrdup(value_str);

As you can see, "gres" is exported to Lua from tres_per_node, and setting it writes to tres_per_node in the job description, so I'd suggest using only tres_per_node instead of gres.

If you were developing your own job_submit plugin in C, you'd have to use tres_per_node, since gres is no longer a member of job_desc_msg_t.

Let me know if this clarifies the situation.

cheers,
Marcin

PS. As for closing: you can just confirm it in a message and I will mark the ticket as closed. For this ticket, I'm going to call it "infogiven".
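
The aliasing in the quoted C code can be imitated in plain Lua to see why writing both names is redundant. This is only a mock built with a metatable, not the real job_desc object; new_mock_job_desc is a hypothetical name, and the empty-string handling mirrors the strlen() check in the quoted setter.

```lua
-- Mock job_desc in which "gres" is only another name for tres_per_node.
local function new_mock_job_desc()
    local fields = {}
    -- The proxy table stays empty, so every access goes through the metatable.
    return setmetatable({}, {
        __index = function(_, name)
            if name == "gres" then
                name = "tres_per_node"
            end
            return fields[name]
        end,
        __newindex = function(_, name, value)
            if name == "gres" then
                name = "tres_per_node"
                -- Like the quoted setter, an empty string leaves it unset.
                if value == "" then
                    value = nil
                end
            end
            fields[name] = value
        end,
    })
end

local job_desc = new_mock_job_desc()
job_desc.tres_per_node = "gpu:1"
print(job_desc.gres)           --> gpu:1
job_desc.gres = "gpu:2"
print(job_desc.tres_per_node)  --> gpu:2
```

Since both names reach the same storage, a hook that assigns tres_per_node alone already changes what job_desc.gres reads back.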
Comment 6 Marcin Stolarek 2020-02-28 06:42:14 MST
Mikael,

I'm closing this case now with "information given" status. 

Should you have any additional questions, please don't hesitate to reopen.

cheers,
Marcin