| Summary: | Application NHC not run on srun without prior allocation | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | David Gloe <david.gloe> |
| Component: | Cray ALPS | Assignee: | Danny Auble <da> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | da |
| Version: | 14.03.x | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | CRAY | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
|
Description
David Gloe
2013-10-24 08:50:41 MDT
I specifically coded it this way. I didn't think it was needed. Look at select_p_step_finish() in src/plugins/select/cray/select_cray.c. If an extra NHC is really needed, just removing the "else if" will make it run. When I originally coded this it seemed like overkill to have 2 NHCs running on the same resources. Perhaps my understanding of exactly what NHC does is flawed. If it really is needed we can just remove the else if. Let me know.
I just talked with an NHC developer and he cleared a couple of things up.
First, we definitely do want to call both reservation and application cleanup in this case, since the user can specify different tests in each mode. So we'll have some tests which are never run if we don't run application NHC.
Second, there could be an issue if the reservation cleanup is started before the application has exited, but it should be OK if the reservation cleanup starts before application cleanup. Is the job_complete message only sent after the application has completed (I assume this is the case, but just making sure)?
In short, I think the else if you mentioned should be removed and we should always run NHC on job step completion in addition to job completion.
However, I'm slightly curious why, with slurmctld debug set to 3, I didn't see the debug message from that else if:
debug3("step completion %u.%u was received after job "
"allocation is already completing, no extra NHC needed.",
step_ptr->job_ptr->job_id, step_ptr->step_id);
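To make the change under discussion concrete, here is a minimal sketch of the decision. It is not the actual body of select_p_step_finish(): the struct and function names are hypothetical, and the condition that guards the skipped branch is only inferred from the debug3 message quoted above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the job state the plugin would inspect. */
struct sketch_job_state {
    /* True once the job allocation is already completing. */
    bool allocation_completing;
};

/* Behavior as described before the fix: when the step completion arrives
 * after the allocation is already completing, application NHC is skipped
 * (the "no extra NHC needed" path). */
bool sketch_app_nhc_runs_before_fix(const struct sketch_job_state *job)
{
    if (job->allocation_completing)
        return false;
    return true;
}

/* Behavior proposed in the thread: with the "else if" removed, application
 * NHC runs on every step completion, whatever the allocation state. */
bool sketch_app_nhc_runs_after_fix(const struct sketch_job_state *job)
{
    (void)job; /* allocation state no longer affects the decision */
    return true;
}
```

Under these assumptions, an srun launched without a prior allocation is the case where the two versions differ, which matches the summary of this bug.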
Removing the else if is easy to do, but as it is written today I don't think we can easily guarantee the application NHC will finish before the reservation one is started. We would have to have a counter and either sleep or pthread_cond_wait on something until all the application NHCs finish before starting the reservation NHC. Could you please verify whether this is an issue? The NHC is already slowing things down quite a bit; this would definitely slow things down much more and complicate the code even more. In either case, neither the application NHC nor the reservation NHC will start until after the application is completely done on the nodes it was running on.
Debug levels are as follows:
0 = quiet, 1 = fatal, 2 = error, 3 = info, 4 = verbose, 5 = debug, 6 = debug2, 7 = debug3, 8 = debug4, 9 = debug5
Or you can just put debug2 or debug3 or whatever instead of the old numeric way. This probably would have been one of the very rare times debug3 would have given you helpful information ;).
I didn't mean that the application NHC needs to be before the reservation NHC. What I meant is that you shouldn't call the reservation or application NHC before the application itself has exited. I assume that's the case already, but just wanted to make sure. As long as that's true the application and reservation NHC can run in parallel.
Perfect. I just removed the else if in commit 148d619f8b76a82ebd988fb00f8a21b73b7c3263. This should fix it. Let me know otherwise.
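The level table above also explains the missing debug3 message: level 3 is info, while debug3 is level 7. A minimal sketch of that filtering follows, assuming a message is emitted only when its level is less than or equal to the configured level. The enum and function names are illustrative and are not Slurm's own identifiers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Numeric levels as listed in this thread; the names are illustrative. */
enum sketch_log_level {
    SKETCH_QUIET = 0, SKETCH_FATAL, SKETCH_ERROR, SKETCH_INFO,
    SKETCH_VERBOSE, SKETCH_DEBUG, SKETCH_DEBUG2, SKETCH_DEBUG3,
    SKETCH_DEBUG4, SKETCH_DEBUG5
};

static const char *const sketch_level_names[] = {
    "quiet", "fatal", "error", "info", "verbose",
    "debug", "debug2", "debug3", "debug4", "debug5"
};

/* Map a numeric level to its name, or NULL if it is out of range. */
const char *sketch_level_name(int level)
{
    if (level < SKETCH_QUIET || level > SKETCH_DEBUG5)
        return NULL;
    return sketch_level_names[level];
}

/* Assumed filter rule: log a message only if its level does not exceed
 * the configured level. */
bool sketch_message_is_logged(int configured_level, int message_level)
{
    return message_level <= configured_level;
}
```

With the configured level at 3, `sketch_message_is_logged(3, SKETCH_DEBUG3)` is false, so the debug3 line would be suppressed; raising the level to 7 (or debug3 by name) would let it through.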