| Summary: | How to run a heterogeneous Slurm job on a system with a single node | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Shaheer KM <shaheer> |
| Component: | Heterogeneous Jobs | Assignee: | Chad Vizino <chad> |
| Status: | RESOLVED TIMEDOUT | QA Contact: | |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | | |
| Version: | 23.02.x | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | Cerebras | | |
Are you running these srun commands outside a job allocation? Het jobs are normally run with sbatch. However, looking over your requirement, why not just request an entire allocation with sbatch and then use srun to define job steps inside of that single job? Based on what you have in your example, this node has 356 CPUs? One way to approach this would be to request the entire node:

```
sbatch -N 1 -n <number of tasks/CPUs> --mem=0
```

> NOTE: A memory size specification of zero is treated as a special case and
> grants the job access to all of the memory on each node.

https://slurm.schedmd.com/sbatch.html#OPT_mem

Then, inside your batch script, you would call each srun with the requested task placement for your needs. Can you also let me know if this is an MPI job and whether those tasks need to be part of the same MPI comm world?

---

Thanks for the info. This is not an MPI job. I will try out the suggestion and get back to you if we need more info on this.

---

Hello, I was trying the following to get this working. I created a bash script (csrun.sh) that contains the srun command below:

```
srun --unbuffered --kill-on-bad-exit --ntasks=1 --cpus-per-task=28 --mem-per-cpu=32gb : \
     --ntasks=1 --cpus-per-task=28 --mem-per-cpu=32gb : \
     --distribution=cyclic --ntasks=45 python exec.py
```

Then called this script via sbatch:

```
sbatch -N 1 --nodelist sdf-2 -n 47 --mem=0 csrun.sh
```

This errors out:

```
srun: error: Allocation failure of 1 nodes: job size of 1, already allocated 1 nodes to previous components.
```

Our goal is to get 500Gb of memory for 2 out of the 47 tasks. Do you have a suggestion to make this happen on a single-node Slurm setup?

---

(In reply to Shaheer KM from comment #3)

Hi. A heterogeneous srun within a non-het job as you list above requires at least 2 nodes (see https://slurm.schedmd.com/heterogeneous_jobs.html#het_steps). The docs also note that het jobs typically require one node per component (see https://slurm.schedmd.com/heterogeneous_jobs.html#limitations).

As Jason suggested in comment 1, could you just request 1 node and then run parallel sruns using --overlap (using this option is important or the sruns will block and run serially) within your job script?

Can you share your slurm.conf file so we can see your configuration?

---

(In reply to Chad Vizino from comment #4)

Hi. Any update on this? Will plan to close in a couple of days unless you'd like to continue to pursue this.

Closing for now. If you have more questions, feel free to reopen.

---
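The approach suggested in this thread (one full-node allocation, then parallel overlapping steps) might look roughly like the job script below. This is an untested sketch: the task counts, memory flags, node script name, and `exec.py` invocation are taken from comment 3 or are illustrative assumptions to adjust for your site.

```sh
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 47
#SBATCH --mem=0    # zero is a special case: the job gets all memory on the node

# Two higher-memory steps. --overlap lets the steps share the allocation's
# resources; without it the sruns would block and run one after another.
srun --overlap --ntasks=1 --cpus-per-task=28 --mem-per-cpu=32gb python exec.py &
srun --overlap --ntasks=1 --cpus-per-task=28 --mem-per-cpu=32gb python exec.py &

# The remaining 45 tasks with default memory.
srun --overlap --distribution=cyclic --ntasks=45 python exec.py &

wait    # keep the batch script alive until all steps finish
```

With the resource requests inside the script, it could be submitted simply as `sbatch csrun.sh`.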
We have a special setup with a very beefy node with a lot of CPUs and a huge amount of memory. We need a way to run a Slurm job such that 2 of the tasks get 800GB of memory each and the rest of the tasks get the remaining memory allocated as usual.

We tried the following command, but it tries to spin up the job on 3 nodes and gets stuck waiting for resources:

```
srun --unbuffered --kill-on-bad-exit --ntasks=1 --cpus-per-task=64 --mem-per-cpu=800gb : \
     --ntasks=1 --cpus-per-task=64 --mem-per-cpu=800gb : \
     --distribution=cyclic --ntasks=14 --cpus-per-task=16
```

```
JOBID     PARTITION  NAME      USER  ST  TIME  NODES  NODELIST(REASON)
220018+2  sdf        singular  lab   PD  0:00  1      (Resources)
220018+1  sdf        singular  lab   PD  0:00  1      (Resources)
220018+0  sdf        singular  lab   PD  0:00  1      (Resources)
```

Any help in getting a working command here is highly appreciated.
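For context, each `:`-separated part of that srun is a separate het-job component (hence the `220018+0/+1/+2` job IDs), and het-job components are typically placed on distinct nodes, which is why all three sit pending with `(Resources)` on a one-node system. One way to double-check what the node actually offers is to query it directly; the node name `sdf-2` here is taken from elsewhere in this ticket:

```sh
# Show the node's CPU and memory totals as slurmctld knows them.
scontrol show node sdf-2 | grep -Eo 'CPUTot=[0-9]+|RealMemory=[0-9]+'
```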