| Summary: | sbatch with 2000 srun fails jobs | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Sohrab <sohrab1982> |
| Component: | slurmd | Assignee: | Jacob Jenson <jacob> |
| Status: | RESOLVED INVALID | QA Contact: | |
| Severity: | 6 - No support contract | ||
| Priority: | --- | ||
| Version: | - Unsupported Older Versions | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | -Other- | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
| Attachments: | error log | ||

Sohrab,

Could you please tell me which site you are from? Our system couldn't match your Gmail address with a Slurm support contract. Once we know which site you are from, we can either route this ticket to the SchedMD Slurm support team, if a current contract exists, or discuss Slurm support options.

Thanks,
Jacob

On Thu, Sep 27, 2018 at 5:16 AM <bugs@schedmd.com> wrote:
> Bug ID 5782 <https://bugs.schedmd.com/show_bug.cgi?id=5782>
> Status UNCONFIRMED
> Created attachment 7905 <https://bugs.schedmd.com/attachment.cgi?id=7905>
> error log

Created attachment 7905: error log

Hi,

I have a batch script that calls srun 2000 times. Each srun command simply runs a Python script that sleeps for 2 minutes and then prints a message. Some of the jobs finish and some don't (seemingly at random). I get the errors shown in the attached log.

Best regards,
Sohrab
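
For anyone trying to reproduce this, the per-step payload described in the report might look roughly like the sketch below. The actual script was not attached, so everything here is an assumption: the file name `task.py`, the function name `run_task`, the command-line arguments, and the wording of the printed message are all hypothetical.

```python
import sys
import time


def run_task(task_id, sleep_seconds=120.0):
    """Sleep for the given time, then return the line this task prints.

    The 120-second default mirrors the "2 minutes sleep" in the report;
    task_id is a hypothetical label used to tell the steps apart.
    """
    time.sleep(sleep_seconds)
    return f"task {task_id} done after {sleep_seconds:g} s"


if __name__ == "__main__":
    # Hypothetical CLI: task.py <task_id> [sleep_seconds]
    if len(sys.argv) < 2:
        print("usage: task.py <task_id> [sleep_seconds]", file=sys.stderr)
    else:
        seconds = float(sys.argv[2]) if len(sys.argv) > 2 else 120.0
        # flush=True so the line reaches the job's output file promptly
        print(run_task(int(sys.argv[1]), seconds), flush=True)
```

Passing a short sleep (for example `python task.py 7 1`) makes it quicker to check whether the failures depend on how long each step runs or only on how many steps are launched.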