| Summary: | AllocGRES id recorded as 7696487 | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Kilian Cavalotti <kilian> |
| Component: | Accounting | Assignee: | Director of Support <support> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | felip.moll, greg.wickham, valentin.plugaru |
| Version: | 18.08.4 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=4650, https://bugs.schedmd.com/show_bug.cgi?id=6348 | ||
| Site: | Stanford | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | Sherlock | CLE Version: | |
| Version Fixed: | 18.08.5 | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
|
Description
Kilian Cavalotti
2019-01-16 10:06:31 MST
I'd actually like to make the config and log files attachment private, but there's no Privacy setting checkbox when I try to upload an attachment. :(

(In reply to Kilian Cavalotti from comment #1)
> I'd actually like to make the config and log files attachment private, but
> there's no Privacy setting checkbox when I try to upload an attachment. :(

Hi Kilian, your bug is already private. It will be safe to attach it here.

Thanks

Hey Kilian,

We figured out where the problem is. We are just making sure that our fix looks good before committing.

Thanks,
-Michael

(In reply to Michael Hinton from comment #9)
> Hey Kilian,
>
> We figured out where the problem is. We are just making sure that our fix
> looks good before committing.

Good news! Thanks for the update.

Cheers,
-- Kilian

Kilian,

Here is the patch, slated for 18.08.5: https://github.com/SchedMD/slurm/commit/588aacf5b13da5ef. Let me know if that works for you!

Thanks,
-Michael

Here's a quick explanation:

`7696487` is the gpu plugin id, and is simply a special hash of the string `gpu`.

Before, Slurm would try to see if that id matched any gpu records parsed from (effectively) a random node's gres.conf. On heterogeneous systems (or homogeneous systems with gres.confs that are mismatched, like in bug 4650), this means that sometimes a gpu record wasn't found, so the string "gpu" wasn't found. The fallback was to use the plugin id instead.

The simplifying realization was that searching gpu records from a random node's gres.conf for a gres name string was not very smart. Instead, we can simply check the GresTypes strings configured in the controller's slurm.conf.

Hi Michael,

(In reply to Michael Hinton from comment #13)
> Here's a quick explanation:
>
> `7696487` is the gpu plugin id, and is simply a special hash of the string
> `gpu`.
>
> Before, Slurm would try to see if that id matched any gpu records parsed
> from (effectively) a random node's gres.conf. On heterogeneous systems (or
> homogeneous systems with gres.confs that are mismatched, like in bug 4650),
> this means that sometimes a gpu record wasn't found, so the string "gpu"
> wasn't found. The fallback was to use the plugin id instead.
>
> The simplifying realization was that searching gpu records from a random
> node's gres.conf for a gres name string was not very smart. Instead, we can
> simply check the GresTypes strings configured in the controller's slurm.conf.

Thanks for the explanation, and for the patch!

Cheers,
-- Kilian

Closing

*** Ticket 6599 has been marked as a duplicate of this ticket. *** |
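
For readers wondering where `7696487` comes from, the following is a minimal sketch and not Slurm's actual code. It assumes the "special hash" packs each character of the GRES name into a 32-bit integer, 8 bits at a time, which does reproduce the value seen in this ticket for `gpu`. The function names `build_plugin_id` and `gres_name_from_id` are hypothetical, and the lookup only illustrates the idea behind the fix (resolving the id against the GresTypes names from slurm.conf, with the numeric id as the fallback).

```python
def build_plugin_id(name: str) -> int:
    """Hypothetical re-creation of the hash: add each byte of the name,
    shifted left by 0, 8, 16, 24 bits in turn (wrapping after 32 bits)."""
    plugin_id = 0
    shift = 0
    for byte in name.encode("ascii"):
        plugin_id = (plugin_id + (byte << shift)) & 0xFFFFFFFF
        shift = (shift + 8) % 32
    return plugin_id


def gres_name_from_id(plugin_id: int, gres_types: list[str]) -> str:
    """Resolve an id against the configured GresTypes names (assumed to be
    the list parsed from slurm.conf). Fall back to the numeric id as a
    string when no name matches, which is the symptom reported here."""
    for name in gres_types:
        if build_plugin_id(name) == plugin_id:
            return name
    return str(plugin_id)


if __name__ == "__main__":
    gpu_id = build_plugin_id("gpu")
    print(gpu_id)                                     # 7696487
    print(gres_name_from_id(gpu_id, ["gpu", "mic"]))  # gpu
    print(gres_name_from_id(gpu_id, ["mic"]))         # 7696487
```

The arithmetic for `gpu` is 103 + (112 << 8) + (117 << 16) = 7696487, so an AllocGRES entry showing that number means the name lookup failed and the raw plugin id was recorded instead.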