Created attachment 17270
modify nhc 60 hard code

Dear all,

We rely on NHC to check the status of cluster nodes, and our NHC script contains many check items. Under normal circumstances, the script takes about 20 s to run. We found that if its execution time exceeds 60 s (for example 70-80 s) because the node is in an abnormal state, the node cannot go offline.

In our tests, replacing the hard-coded 60 in run_script_health_check (slurmd.c) improves the situation. We changed

    run_script("health_check", conf->health_check_program, 0, 60, env, 0);

to

    run_script("health_check", slurm_conf.health_check_program, 0, conf->health_check_interval, env, 0);
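One possible refinement, offered only as a sketch: if the interval is passed directly as the timeout, a site with a short interval (or with HealthCheckInterval left at its default of 0) would get a shorter limit than the current 60 s. The helper below is hypothetical and not part of Slurm; the name health_check_timeout, the macro HEALTH_CHECK_DEFAULT_TIMEOUT, and the assumption that the interval is an unsigned 16-bit number of seconds are ours.

```c
#include <stdint.h>

/* Hypothetical fallback: the value currently hard-coded in slurmd.c. */
#define HEALTH_CHECK_DEFAULT_TIMEOUT 60

/*
 * Hypothetical helper, not part of Slurm.
 * Returns the timeout (in seconds) to give the health check script:
 * the configured interval when it is longer than 60 s, otherwise 60 s.
 * An interval of 0 therefore also falls back to 60 s.
 */
static int health_check_timeout(uint16_t health_check_interval)
{
	if (health_check_interval > HEALTH_CHECK_DEFAULT_TIMEOUT)
		return (int) health_check_interval;
	return HEALTH_CHECK_DEFAULT_TIMEOUT;
}
```

With such a helper, the fourth argument of the run_script call would become health_check_timeout(...) applied to the configured interval, so an interval of 300 s would allow the script up to 300 s while shorter intervals keep today's 60 s limit.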