| Summary: | synchronize task launch when prolog run time is variable | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Ryan Day <day36> |
| Component: | Configuration | Assignee: | Broderick Gardner <broderick> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | CC: | sts |
| Version: | 17.11.12 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | LLNL | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | | CLE Version: | |
| Version Fixed: | | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |

Description
Ryan Day
2019-01-23 11:34:41 MST
There currently isn't a way to make a prolog wait for other scripts; you would have to implement the wait in the prolog itself, for example by watching for a file on a shared file system or for a network socket. The reason tasks start at the same time with PrologFlags=Alloc and sbatch is that the prolog is run on all nodes in the allocation before the batch step starts, so every node must finish its prolog before the batch step launches any job steps. With srun, the job step is sent to the nodes from the beginning, so each node launches its tasks as soon as its own prolog finishes. Does that answer your questions?

Another option is to create a SPANK plugin that hooks into BeeOND, though that could be a bit more involved than you are looking for. Here is a plugin along those lines that sets up a private temp directory: https://github.com/hpc2n/spank-private-tmp

Okay, that's about what I thought. We'll look into either adding something to the prolog script to make sure that the BeeOND file system is present before it finishes, or reworking it as a SPANK plugin.

Okay, closing this ticket then.
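As a sketch of the first option discussed above (waiting inside the prolog itself), a prolog fragment could poll until the BeeOND file system is actually mounted before exiting. The mount path, timeout, and poll interval below are illustrative assumptions, not values from this ticket:

```shell
#!/bin/bash
# Hypothetical prolog fragment (a sketch, not from this ticket): block until
# the BeeOND file system is mounted, or fail the prolog after a timeout so
# Slurm does not launch tasks on a node whose mount never appeared.
# MOUNTPOINT, TIMEOUT, and INTERVAL are illustrative placeholders.

MOUNTPOINT="${MOUNTPOINT:-/mnt/beeond}"
TIMEOUT="${TIMEOUT:-120}"     # seconds to wait before giving up
INTERVAL="${INTERVAL:-2}"     # seconds between checks

wait_for_mount() {
    local deadline=$(( $(date +%s) + TIMEOUT ))
    # mountpoint(1) from util-linux returns 0 once the path is a mount point
    while ! mountpoint -q "$MOUNTPOINT"; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            echo "prolog: $MOUNTPOINT not mounted after ${TIMEOUT}s" >&2
            return 1    # non-zero prolog exit keeps tasks from starting here
        fi
        sleep "$INTERVAL"
    done
    return 0
}

# Only block when actually running under Slurm as a prolog.
if [ -n "$SLURM_JOB_ID" ]; then
    wait_for_mount || exit 1
fi
```

With PrologFlags=Alloc set in slurm.conf, a wait like this runs on every allocated node before the batch step starts, which gives the synchronized launch behavior described above for sbatch jobs.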