| Summary: | 'scancel --wckey=test' segfaults | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Bruno Mundim <bmundim> |
| Component: | User Commands | Assignee: | Scott Hilton <scott> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 20.02.7 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | SciNet | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | | CLE Version: | |
| Version Fixed: | 20.11.8 21.08.0pre1 | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
Description
Bruno Mundim
2021-06-08 14:41:01 MDT
Running scancel with the --wckey=test option in gdb:

(gdb) run --wckey=test
Starting program: /opt/slurm/bin/scancel --wckey=test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
scancel: Linear node selection plugin loaded with argument 16
scancel: Cray/Aries node selection plugin loaded
scancel: select/cons_tres loaded with argument 16
scancel: select/cons_res loaded with argument 16

Program received signal SIGSEGV, Segmentation fault.
_filter_job_records () at scancel.c:390
390 in scancel.c
(gdb) bt
#0 _filter_job_records () at scancel.c:390
#1 _proc_cluster () at scancel.c:165
#2 0x00000000004047f8 in main (argc=2, argv=0x7fffffffcea8) at scancel.c:121

Thanks,
Bruno.

Bruno,
Thanks for pointing this out. I was able to quickly find the issue. It happens when a job doesn't have a wckey and scancel tries to read from a NULL string. I'll send the fix over to be reviewed.
-Scott

Bruno,
The patch is in GitHub with commit 4c953b8998 and should be in 20.11.8.
-Scott

Thanks!
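For readers unfamiliar with this class of bug, the cause Scott describes (filtering on a wckey when a job has none, so a NULL string gets read) can be illustrated with a minimal sketch. This is not the code from commit 4c953b8998 and not Slurm's real data structures: `struct job_rec`, `wckey_matches`, and `count_matching` are hypothetical names invented for the illustration, assuming only that a job without a wckey carries a NULL pointer in that field.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a job record; NOT Slurm's real job structure. */
struct job_rec {
    unsigned int job_id;
    const char *wckey;   /* assumed NULL when the job has no wckey */
};

/* Return true when the job passes the wckey filter.
 * A job without a wckey never matches a requested wckey,
 * and its NULL pointer is never handed to strcmp(). */
static bool wckey_matches(const struct job_rec *job, const char *filter)
{
    if (filter == NULL)
        return true;     /* no filter requested: keep every job */
    if (job->wckey == NULL)
        return false;    /* the guard: do not read from a NULL string */
    return strcmp(job->wckey, filter) == 0;
}

/* Count how many of the n jobs pass the filter. */
static size_t count_matching(const struct job_rec *jobs, size_t n,
                             const char *filter)
{
    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        if (wckey_matches(&jobs[i], filter))
            matches++;
    }
    return matches;
}
```

Without the `job->wckey == NULL` check, the first job lacking a wckey would pass NULL to `strcmp()`, which is undefined behavior in C and typically shows up as a SIGSEGV like the one in the backtrace above.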