Ticket 7474

Summary: squeue --steps always calls slurm_load_federation
Product: Slurm Reporter: Thomas HAMEL <hmlth>
Component: User Commands Assignee: Jacob Jenson <jacob>
Status: RESOLVED INVALID QA Contact:
Severity: 6 - No support contract    
Priority: --- CC: dwightman, fabecassis, jblomqvist, lyeager
Version: 18.08.3   
Hardware: Linux   
OS: Linux   
Site: -Other- Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: Debian
Machine Name: CLE Version:
Version Fixed: Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Thomas HAMEL 2019-07-26 09:26:33 MDT
"squeue" and "squeue --steps" behave differently. "squeue --steps" always triggers a REQUEST_FED_INFO RPC, and I don't think that is the expected behavior without the "--federation" flag.

On a cluster with federation disabled:


```
$ squeue -v
-----------------------------
all         = false
array       = false
federation  = false
format      = (null)
iterate     = 0
job_flag    = 0
jobs        = (null)
licenses    = (null)
local       = false
names       = (null)
nodes       = 
partitions  = (null)
priority    = false
reservation = (null)
sibling      = false
sort        = (null)
start_flag  = 0
states      = (null)
step_flag   = 0
steps       = (null)
users       = (null)
verbose     = 1
-----------------------------


Fri Jul 26 17:24:30 2019
last_update_time=1564154670 records=0
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)


$ sdiag | grep -i fed; squeue --steps --local ; sdiag |grep -i fed
        REQUEST_FED_INFO                        ( 2049) count:31     ave_time:352    total_time:10942
         STEPID     NAME PARTITION     USER      TIME NODELIST
        REQUEST_FED_INFO                        ( 2049) count:31     ave_time:352    total_time:10942

$ sdiag | grep -i fed; squeue  ; sdiag |grep -i fed
        REQUEST_FED_INFO                        ( 2049) count:31     ave_time:352    total_time:10942
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
        REQUEST_FED_INFO                        ( 2049) count:31     ave_time:352    total_time:10942

$ sdiag | grep -i fed; squeue --federation ; sdiag |grep -i fed
        REQUEST_FED_INFO                        ( 2049) count:31     ave_time:352    total_time:10942
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
        REQUEST_FED_INFO                        ( 2049) count:32     ave_time:350    total_time:11220

$ sdiag | grep -i fed; squeue --steps ; sdiag |grep -i fed
        REQUEST_FED_INFO                        ( 2049) count:32     ave_time:350    total_time:11220
         STEPID     NAME PARTITION     USER      TIME NODELIST
        REQUEST_FED_INFO                        ( 2049) count:33     ave_time:348    total_time:11498
```

Some of our users make heavy use of the "--steps" flag, which ends up generating a very large number of useless RPCs.
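The before/after comparison in the transcript above can be automated. The following is only a minimal sketch in C: the helper names `fed_info_count` and `fed_info_delta` are hypothetical, and it assumes the `sdiag` line format shown above (a `REQUEST_FED_INFO` line containing a `count:` field). On a busy cluster other clients may issue the same RPC between the two samples, so the difference is an upper bound for a single command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the "count:" field from one sdiag RPC statistics line.
 * Returns -1 if the line is not a REQUEST_FED_INFO line or has no count.
 * (Hypothetical helper, not part of Slurm.)
 */
static long fed_info_count(const char *line)
{
	const char *field;
	long count;

	assert(line != NULL);
	if (!strstr(line, "REQUEST_FED_INFO"))
		return -1;
	field = strstr(line, "count:");
	if (!field || sscanf(field, "count:%ld", &count) != 1)
		return -1;
	return count;
}

/*
 * Number of REQUEST_FED_INFO RPCs recorded between two sdiag samples,
 * or -1 if either line cannot be parsed.
 */
static long fed_info_delta(const char *before, const char *after)
{
	long b = fed_info_count(before);
	long a = fed_info_count(after);

	return (b < 0 || a < 0) ? -1 : a - b;
}
```

Feeding it the two lines captured around "squeue --steps" gives a difference of 1, while the lines captured around "squeue --steps --local" give 0.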

From a quick look at the code, the test seems to be implemented correctly in node_info.c:

```
	if ((show_flags & SHOW_FEDERATION) && !(show_flags & SHOW_LOCAL) &&
	    (slurm_load_federation(&ptr) == SLURM_SUCCESS) &&
```

https://github.com/SchedMD/slurm/blob/a1dd5f46b9cd9130f8c4db668eba5260fb2788af/src/api/node_info.c#L697


But not in job_step_info.c:

```
	if ((show_flags & SHOW_LOCAL) == 0) {
		if (slurm_load_federation(&ptr) ||
		    !cluster_in_federation(ptr, cluster_name)) {
```

https://github.com/scibian/slurm-llnl/blob/c5d5116d75061a3dcf5b549c5c83e80e0cb114a0/src/api/job_step_info.c#L498
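To make the difference between the two guards concrete, here is a small stand-alone model in C. It is a sketch, not Slurm code: the flag values are made up for illustration (the real ones are defined in slurm.h), `fake_load_federation` is a hypothetical stand-in for `slurm_load_federation` that only counts calls, and the `cluster_in_federation` check is left out because it runs after the RPC has already been sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; the real ones live in slurm.h. */
#define SHOW_LOCAL      0x0001
#define SHOW_FEDERATION 0x0002

/* Counts simulated REQUEST_FED_INFO round trips. */
static int fed_rpc_count = 0;

/* Stand-in for slurm_load_federation(): records the call, returns 0 (success). */
static int fake_load_federation(void)
{
	fed_rpc_count++;
	return 0;
}

/* Guard shaped like the node_info.c snippet: federation must be requested. */
static bool node_style_guard(uint16_t show_flags)
{
	return (show_flags & SHOW_FEDERATION) && !(show_flags & SHOW_LOCAL) &&
	       (fake_load_federation() == 0);
}

/* Guard shaped like the job_step_info.c snippet: only SHOW_LOCAL skips the RPC. */
static bool step_style_guard(uint16_t show_flags)
{
	if ((show_flags & SHOW_LOCAL) == 0)
		return fake_load_federation() == 0;
	return false;
}

/* Run one guard with the given flags and report how many RPCs it issued. */
static int rpcs_issued(bool (*guard)(uint16_t), uint16_t show_flags)
{
	assert(guard != NULL);
	fed_rpc_count = 0;
	(void) guard(show_flags);
	return fed_rpc_count;
}
```

With no flags set, `rpcs_issued(node_style_guard, 0)` is 0 while `rpcs_issued(step_style_guard, 0)` is 1, which matches the sdiag counters above: plain "squeue" leaves the count unchanged and "squeue --steps" increments it. Short-circuit evaluation of `&&` is what keeps the first guard from calling the loader when SHOW_FEDERATION is not set.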
Comment 1 Jacob Jenson 2019-07-26 09:31:31 MDT
Thomas,

Our system couldn't match your email address with a support contract. Could you please let me know which site you work for so we can match this request with a support contract? Once we can match this request up with a support contract the SchedMD support engineers can help you resolve this issue. 

Thanks,
Jacob
Comment 2 Thomas HAMEL 2019-10-07 05:19:08 MDT
I'm not sure we still have a direct support contract with SchedMD, so I will open this with our provider. I still think it's a real issue, though. I also noticed it with custom output and the -u flag:

$ sdiag |grep -i fed_info
	REQUEST_FED_INFO                        ( 2049) count:1259   ave_time:428    total_time:538997
$ squeue -o "%i %P %t %M %N" -u $USER
JOBID PARTITION ST TIME NODELIST
$ sdiag |grep -i fed_info
	REQUEST_FED_INFO                        ( 2049) count:1260   ave_time:428    total_time:539355
$ squeue -o "%i %P %t %M %N" --local -u $USER
JOBID PARTITION ST TIME NODELIST
$ sdiag |grep -i fed_info
	REQUEST_FED_INFO                        ( 2049) count:1260   ave_time:428    total_time:539355
[...]
$ sdiag |grep -i fed_info
	REQUEST_FED_INFO                        ( 2049) count:1261   ave_time:428    total_time:539814
$ squeue -o "%i %P %t %M %N"
JOBID PARTITION ST TIME NODELIST
$ sdiag |grep -i fed_info
	REQUEST_FED_INFO                        ( 2049) count:1261   ave_time:428    total_time:539814
Comment 3 Luke Yeager 2021-10-19 12:27:26 MDT
Thanks for the detailed bug report, Thomas!

At our site, we recently improved our RPC monitoring and noticed this same issue. We put in a hacky workaround (setting "SQUEUE_LOCAL=1" in /etc/environment), but we would love to see this get fixed at some point. It's sad to see how many unnecessary REQUEST_FED_INFO RPCs our scheduler was processing.