| Summary: | running Reconfigure kills GPU jobs on multiple GPU node | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | darrellp |
| Component: | GPU | Assignee: | Dominik Bartkiewicz <bart> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | marcm |
| Version: | 19.05.0 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=7727 | ||
| Site: | Allen AI | Alineos Sites: | --- |
| Version Fixed: | 19.05.3 | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
| Attachments: | slurm.conf, gres.conf, slurmdbd.conf | ||
|
Description
darrellp
2019-09-11 10:53:52 MDT
Created attachment 11545: gres.conf
Created attachment 11547: slurmdbd.conf
This appears to be related, based on the description and what we are seeing: https://bugs.schedmd.com/show_bug.cgi?id=7727

Hi

I can reproduce this easily. It looks like the patch from bug 7727 is correct and fixes this issue. I will let you know when it is in the repo.

Dominik

When will this be released? We are debating whether to integrate our own patch or wait for yours. Thanks!

Hi

As you have probably already noticed, the fix is committed as: https://github.com/SchedMD/slurm/commit/2abd2a3d8d6bdc

It will be included in 19.05.3. We plan to release 19.05.3 before the end of the month, but we have no strict date yet. Let me know if we can close this ticket now.

Dominik

Hi

Did you apply this patch? Please let me know when we can close this ticket.

Dominik

Patch applied. Feel free to close.
marc
> On Sep 18, 2019, at 5:29 AM, bugs@schedmd.com wrote:
>
> Dominik Bartkiewicz changed bug 7729:
> Severity: 2 - High Impact (removed), 4 - Minor Issue (added)
|