Summary: | sacct -j <jobid> --expand-patterns not returning the correct value for stdout of array jobs | ||
---|---|---|---|
Product: | Slurm | Reporter: | James Owers-Bardsley <jamesowers-bardsley> |
Component: | Accounting | Assignee: | Jacob Jenson <jacob> |
Status: | OPEN --- | QA Contact: | |
Severity: | 6 - No support contract | ||
Priority: | --- | ||
Version: | 24.05.3 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | -Other- | ||
Linux Distro: | --- | Machine Name: | |
CLE Version: | | Version Fixed: | |
Target Release: | --- | DevPrio: | --- |
Description
James Owers-Bardsley
2024-12-13 09:18:12 MST
EDIT: the first (unexpanded) example should not include `--expand-patterns` in the call, i.e. it should read:

```shell
$ sacct -j 40275 --allocations --format 'JobID%32,JobIDRaw,StdOut%-128'
JobID                            JobIDRaw     StdOut
-------------------------------- ------------ --------------------------------------------------------------------------------------------------------------------------------
40275_1                          40276        /home/username/path/to/%a/zz_logs.eval.%A.log
40275_2                          40277        /home/username/path/to/%a/zz_logs.eval.%A.log
40275_3                          40278        /home/username/path/to/%a/zz_logs.eval.%A.log
40275_[4-12%3]                   40275        /home/username/path/to/%a/zz_logs.eval.%A.log
```

(Apologies: I had to anonymise the paths manually, but I ran these commands as shown.)

One potentially useful addition: the issue manifests like this if `--array` is used:

```
$ sacct -j 40275 --array --allocations --format 'JobID%32,JobIDRaw,StdOut%-128' --expand-patterns
JobID                            JobIDRaw     StdOut
-------------------------------- ------------ --------------------------------------------------------------------------------------------------------------------------------
40275_1                          40276        /home/username/path/to/1/zz_logs.eval.40276.log
40275_2                          40277        /home/username/path/to/2/zz_logs.eval.40277.log
40275_3                          40278        /home/username/path/to/3/zz_logs.eval.40278.log
40275_4                          40275        /home/username/path/to/4/zz_logs.eval.40275.log
40275_5                          40275        /home/username/path/to/4/zz_logs.eval.40275.log
40275_6                          40275        /home/username/path/to/4/zz_logs.eval.40275.log
40275_7                          40275        /home/username/path/to/4/zz_logs.eval.40275.log
40275_8                          40275        /home/username/path/to/4/zz_logs.eval.40275.log
40275_9                          40275        /home/username/path/to/4/zz_logs.eval.40275.log
40275_10                         40275        /home/username/path/to/4/zz_logs.eval.40275.log
40275_11                         40275        /home/username/path/to/4/zz_logs.eval.40275.log
40275_12                         40275        /home/username/path/to/4/zz_logs.eval.40275.log
```

That is, there are two issues:

- For running/completed jobs: `%a` expands correctly, but `%A` has expanded to JobIDRaw. It should be 40275 for all jobs.
- For pending jobs: `%A` has expanded correctly, but `%a` has expanded to 4 (the first pending task) for every pending task. The only correct path is the one for task 40275_4, /home/username/path/to/4/zz_logs.eval.40275.log (although that file doesn't actually exist yet). But as soon as that job starts running, the output will change to the incorrect /home/username/path/to/4/zz_logs.eval.40279.log.
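To make the expected behaviour concrete, here is a minimal sketch of the expansion I would expect for each array task: `%A` becomes the master array job ID and `%a` becomes the task index, both taken from the `JobID` column (e.g. `40275_4`). The helper names `expand_stdout` and `expand_records` are hypothetical and are not part of Slurm. The sketch only handles `%A` and `%a` and ignores every other filename pattern, including padded forms.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Slurm): expand only %A and %a in a
# StdOut pattern, using the array job ID and task index found in a
# JobID of the form <array_job_id>_<task_id>.

# expand_stdout JOBID PATTERN
expand_stdout() {
    jobid=$1
    pattern=$2
    array_job_id=${jobid%%_*}   # text before the first underscore
    task_id=${jobid#*_}         # text after the first underscore
    # sed is case-sensitive, so %A and %a are replaced independently.
    printf '%s\n' "$pattern" |
        sed -e "s/%A/${array_job_id}/g" -e "s/%a/${task_id}/g"
}

# Read "JobID|StdOut" records on stdin and print "JobID|expanded path".
expand_records() {
    while IFS='|' read -r jobid pattern; do
        printf '%s|%s\n' "$jobid" "$(expand_stdout "$jobid" "$pattern")"
    done
}

expand_stdout 40275_4 '/home/username/path/to/%a/zz_logs.eval.%A.log'
# prints /home/username/path/to/4/zz_logs.eval.40275.log
```

As a possible stopgap, `expand_records` could be fed the unexpanded records, for example from `sacct -j 40275 --array --allocations --noheader --parsable2 --format 'JobID,StdOut'`. I have not verified that pipeline against a live cluster, so treat it as an assumption.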