| Summary: | Excessive dns queries | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | lhuang |
| Component: | slurmctld | Assignee: | Director of Support <support> |
| Status: | RESOLVED TIMEDOUT | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | ||
| Version: | 19.05.3 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | NY Genome | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | Version Fixed: | ||
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
| Attachments: | slurm.conf | ||
Description
lhuang
2020-07-07 09:26:25 MDT
Correction: both of our Slurm clusters generate over 30 million DNS queries per day. We are unsure whether this is normal.

Thanks for reaching out. I suspect this is a configuration problem. Can you attach your slurm.conf to this bug?
- Jeff

Created attachment 14941 [details]
slurm.conf

Here is the attachment.

In your compute node configuration, NodeHostname isn't configured. From the slurm.conf manual page:
NodeHostname:
Typically this would be the string that "/bin/hostname -s" returns. It may also be the fully qualified domain name as returned by "/bin/hostname -f" (e.g. "foo1.bar.com"), or any valid domain name associated with the host through the host database (/etc/hosts) or DNS, depending on the resolver settings. Note that if the short form of the hostname is not used, it may prevent use of hostlist expressions (the numeric portion in brackets must be at the end of the string). A node range expression can be used to specify a set of nodes. If an expression is used, the number of nodes identified by NodeHostname on a line in the configuration file must be identical to the number of nodes identified by NodeName. By default, the NodeHostname will be identical in value to NodeName.
You should set NodeHostname to the hostname of the machine that the nodes run on. For example, if I were running 10 nodes on a computer called "testcomputer", my config line should begin with:
> NodeName=node[0-9] NodeHostname=testcomputer ...
You also might need to give the fully qualified domain name since you mentioned they are on a subdomain.
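As a rough sketch of the fully qualified variant, the same example line might look like the one below. The subdomain `hpc.example.org` is a hypothetical placeholder, not taken from the attached slurm.conf, and the trailing `...` stands for the rest of the node definition as in the example above.

```
# Hypothetical domain: replace hpc.example.org with the site's real subdomain.
NodeName=node[0-9] NodeHostname=testcomputer.hpc.example.org ...
```

Note that, per the manual excerpt above, a fully qualified name may prevent the use of a bracketed range in NodeHostname, since the numeric portion in brackets must be at the end of the string.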
Let me know if that helps!
- Jeff
Hi again,
I just wanted to follow up with you about this, now that there have been a few days to test the impact of what I suggested. Did configuring NodeHostname resolve the issue?
- Jeff

I haven't heard back from you in a few days, so I'm going to go ahead and close this ticket, but feel free to reopen it if you need to. If you have any other questions or problems, don't hesitate to reach out.
- Jeff