| Summary: | Bug 14061 - Slurm torque wrapper not submitting the jobs to scheduler | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Shraddha Kiran <Shraddha_Kiran> |
| Component: | Scheduling | Assignee: | Marshall Garey <marshall> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | Manikanta_Eluri, Shraddha_Kiran |
| Version: | - Unsupported Older Versions | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | AMAT | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | Version Fixed: | ||
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
| Attachments: | Slurm logs | ||
|
Description
Shraddha Kiran
2022-05-17 14:51:48 MDT
This is a duplicate of bug#14061. In the interest of time, we started working on that issue while Mani's access was being sorted out. Please see the latest inquiry in bug#14061 comment#12, copied below for your convenience.

Mani -

> It’s the error pops up when submitting the jobs to Torque via tcad. I am not
> sure if this is specific to tcad or Torque itself, but with the error it sounded
> like its related to Torque which doesn’t support multi threading.
>
> Is multithreading supported by default in torque?

This error is not something that is part of the Slurm codebase or the wrapper scripts. Can you send us the following:

1. What switches are being called with qsub.
2. Please let us know what server tcad is configured against.
3. Please verify that tcad is calling the qsub wrapper and not a Linux binary file.
> $ file /path/to/torque/qsub.pl
> /path/to/torque/qsub.pl: Perl script text executable
4. Please also upload the slurmctld.log from the server that spans the time when these jobs are being submitted.

*** Ticket 14061 has been marked as a duplicate of this ticket. ***

Created attachment 25089: Slurm logs

Hi Jason,

Please see my answers inline:

1. What switches are being called with qsub.

Yes, it's called from the correct directory: /cm/shared/apps/slurm/19.05.7/bin/qsub

2. Please let us know what server tcad is configured against.

TCAD is installed on network storage that is accessible across the cluster nodes.

3. Please verify that tcad is calling the qsub wrapper and not a Linux binary file.
> $ file /path/to/torque/qsub.pl
> /path/to/torque/qsub.pl: Perl script text executable

[root@dcalph000 bin]# file /cm/shared/apps/slurm/19.05.7/bin/qsub
/cm/shared/apps/slurm/19.05.7/bin/qsub: Perl script, ASCII text executable
[root@dcalph000 bin]#

4. Please also upload the slurmctld.log from the server that spans the time when these jobs are being submitted.

Logs are attached.
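The check requested in item 3 can also be scripted, which may help when it has to be repeated on several nodes or inside the environment the application actually runs in. The following is only a minimal sketch: the function names `is_script_wrapper` and `check_qsub` are hypothetical, and it tests for a leading `#!` instead of calling `file`, on the assumption that the wrapper is an interpreted script (such as the Perl wrapper shown above) and a native qsub would be a compiled binary.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption for illustration):
# succeeds if the path is a regular file that starts with "#!",
# i.e. an interpreted script rather than a compiled binary.
# Returns 2 if the path is not a regular file.
is_script_wrapper() {
    path="$1"
    [ -f "$path" ] || return 2
    # Interpreted scripts begin with the two bytes "#!".
    magic=$(head -c 2 "$path" 2>/dev/null)
    [ "$magic" = '#!' ]
}

# Hypothetical helper: resolve qsub through PATH the way the calling
# shell would, then report which kind of file it is.
check_qsub() {
    qsub_path=$(command -v qsub 2>/dev/null) || {
        echo "qsub not found in PATH"
        return 1
    }
    if is_script_wrapper "$qsub_path"; then
        echo "$qsub_path: script wrapper"
    else
        echo "$qsub_path: not a script (possibly a native binary)"
    fi
}
```

Running `check_qsub` from the same shell environment (modules, PATH) that the application uses for submission shows which qsub would be resolved first; this is not necessarily the one found from an interactive root shell.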
The pre-processing is happening in your application "sptopo3d" before the job is submitted to the qsub wrapper. I suggest you contact synopsys/tcad support regarding this issue. There is probably an option in the application the researcher is using to deselect multithreaded support before the job is submitted, so that the pre-processing done by synopsys/tcad does not error out.

As Jason pointed out, this error does not come from Slurm. Have you been able to resolve this with synopsys/tcad support?

(In reply to Marshall Garey from comment #6)
> As Jason pointed out, this error does not come from Slurm. Have you been
> able to resolve this with synopsys/tcad support?

Hello,

Not yet, we are still trying to fix this. We will let you know of any updates.

Thank you,
Shraddha

I'm closing this as infogiven. If you have any Slurm-related questions about this issue, feel free to re-open this ticket. |