sacct returns jobs which end past the -E time. Moreover, they cannot be recovered when querying the next period.
calebh@h2repl:~$ echo $STATES
out_of_memory,resizing,timeout,cancelled,revoked,deadline,completed,requeued,node_fail,failed,preempted,boot_fail
calebh@h2repl:~$ sacct -P -S '2021-04-07T23:30:00' -E '2021-04-07T23:59:59' -s "$STATES" -a -o jobid,state,start,end | grep '04-08'
39125549.0|CANCELLED|2021-04-05T02:07:39|2021-04-08T00:00:22
39125551.0|CANCELLED|2021-04-05T22:06:00|2021-04-08T00:00:22
39125552.0|CANCELLED|2021-04-05T22:09:54|2021-04-08T00:00:23
39125557.0|CANCELLED|2021-04-05T01:59:34|2021-04-08T00:00:22
39125558.0|CANCELLED|2021-04-05T01:59:36|2021-04-08T00:00:23
39125559.0|CANCELLED|2021-04-05T01:59:36|2021-04-08T00:00:23
39125560.0|CANCELLED|2021-04-05T01:59:36|2021-04-08T00:00:22
39125561.0|CANCELLED|2021-04-05T01:59:36|2021-04-08T00:00:22
39125562.0|CANCELLED|2021-04-05T01:59:36|2021-04-08T00:00:22
39125563.0|CANCELLED|2021-04-05T01:59:36|2021-04-08T00:00:22
39228614.2|COMPLETED|2021-04-07T23:56:21|2021-04-08T00:00:12
calebh@h2repl:~$ sacct -P -S '2021-04-08T00:00:00' -E '2021-04-08T00:01:00' -s "$STATES" -a -o jobid,state,start,end | grep '39125549.0'
Kris, The first statement makes sense. sacct will select all jobs that were running during a certain period even if they start before or continue beyond the time period specified by the query. The fact that these jobs don't show up in the second query puzzles me. -Scott
Kris, Can you try this? I just want to check that grep and the -s option aren't breaking it.
sacct -P -S '2021-04-08T00:00:00' -E '2021-04-08T00:01:00' -a -o jobid,state,start,end -j 39125549,39125551,39125552,39125557,39228614
-Scott
Kris, To amend my first comment: "sacct will select all jobs that were running during a certain period even if they start before or continue beyond the time period specified by the query." When -s (--state) is used, the job must have been in one of the specified states during the time period. Most of the filters are applied only to jobs, not steps, and if a job doesn't pass the filter, none of its steps will be shown. I would guess that in the second query the job was not in any of the specified states (it was probably in "running") between 2021-04-08T00:00:00 and 2021-04-08T00:01:00. Could you run this query so I can see a whole job and all its steps?
sacct -a -o jobid,state,start,end -j 39125549
-Scott
Hi Scott, Thanks for the additional info - please find output below.
> Could you run this query so I can see a whole job and all its steps
> sacct -a -o jobid,state,start,end -j 39125549
>
> -Scott
sacct -a -o jobid,state,start,end -j 39125549
       JobID      State               Start                 End
------------ ---------- ------------------- -------------------
39125549     CANCELLED+ 2021-04-05T02:07:20 2021-04-07T23:59:38
39125549.ba+  CANCELLED 2021-04-05T02:07:20 2021-04-07T23:59:40
39125549.ex+  COMPLETED 2021-04-05T02:07:20 2021-04-07T23:59:38
39125549.0    CANCELLED 2021-04-05T02:07:39 2021-04-08T00:00:22
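Kris, It looks like 39125549.0 took a little while to fully shut down. Its parent job ended at 2021-04-07T23:59:38, 44 seconds before the step ended at 2021-04-08T00:00:22. Because the parent job ended before the start of the second query's time period (2021-04-08T00:00:00), it didn't match that query, so the step didn't show up. -Scott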
Kris, Does this answer your question? Do you have any follow-up questions? -Scott
Hi Scott, Adding Caleb to the case. -Kris
Thanks for the info, Scott. To confirm: the time range filtering applies only to jobs and not to job steps. If this is the case, then I have no further questions; feel free to close the ticket.
Caleb, The time filtering applies first to jobs, then to steps. For a step to appear, both the job and the step have to be in the time frame. -Scott
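The two-stage rule described here can be sketched in shell as a rough illustration. This is a hypothetical model, not sacct's actual implementation: the function name visible_rows is made up for the example, the -s state filter is ignored, and "in the time frame" is assumed to mean that the interval from Start to End overlaps the query window with both bounds inclusive. The sample rows are the job and step times reported earlier in this ticket.

```shell
#!/bin/sh
# Hypothetical model of "filter jobs first, then steps"; NOT sacct's own code.
# Input rows on stdin: jobid|start|end, jobs listed before their steps
# (the shape of: sacct -P -n -o jobid,start,end).
# Timestamps in the same ISO 8601 format compare correctly as plain strings.
# Assumption: a row is "in the time frame" if [start, end] overlaps [S, E].
visible_rows() {
    awk -F'|' -v S="$1" -v E="$2" '
        function overlaps(st, en) { return st <= E && en >= S }
        {
            p = index($1, ".")                     # 0 for a job row
            parent = p ? substr($1, 1, p - 1) : $1 # "39125549.0" -> "39125549"
            if (p == 0)
                job_ok[parent] = overlaps($2, $3)  # stage 1: the job itself
            # Stage 2: a row is printed only if its job passed and it overlaps.
            if (job_ok[parent] && overlaps($2, $3))
                print $1
        }'
}

# Job 39125549 and its step 0, with the times shown in this ticket.
rows='39125549|2021-04-05T02:07:20|2021-04-07T23:59:38
39125549.0|2021-04-05T02:07:39|2021-04-08T00:00:22'

# First query window: the job and the step both overlap, so both are listed.
printf '%s\n' "$rows" | visible_rows '2021-04-07T23:30:00' '2021-04-07T23:59:59'

# Second query window: the job ended at 23:59:38, before the window starts,
# so the step is dropped even though it ended inside the window.
printf '%s\n' "$rows" | visible_rows '2021-04-08T00:00:00' '2021-04-08T00:01:00'
```

Under these assumptions the first call prints 39125549 and 39125549.0, and the second prints nothing, which matches what the two sacct queries in this ticket returned for step 39125549.0.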
Closing ticket. If you have follow-up questions, feel free to reopen it. -Scott