I've got a job submit plugin I've written to alter the QOS of a submitted job. It's based on the partition plugin provided with the Slurm source. The plugin works fine, but for some reason I don't get debug messages when running the controller with -v. Another plugin (written nearly identically) does display the messages as expected. "info" messages are being displayed properly, just not the debug messages. It's almost as if the debug level isn't being set properly, but just for this plugin. FWIW, the source is available here: https://github.com/atombaby/gizmo-plugins/blob/partition_list/default_qos/job_submit_default_qos.c Thanks much, Michael
Hi, -v stands for verbose; to see debug-level messages you have to specify -vv. David
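To make the -v versus -vv distinction concrete, here is a minimal standalone sketch of how a count of v flags could raise a log threshold one step at a time. It is not Slurm's implementation: the enum, the function names, and the assumption that the base level is "info" (with "verbose" and "debug" as the next two steps) are all illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical severity ladder, least to most chatty. The ordering
 * info < verbose < debug follows the thread; the names are illustrative. */
typedef enum {
    LVL_QUIET = 0,
    LVL_FATAL,
    LVL_ERROR,
    LVL_INFO,
    LVL_VERBOSE,
    LVL_DEBUG,
    LVL_DEBUG2,
    LVL_MAX = LVL_DEBUG2
} level_t;

/* Count 'v' characters in a short-option cluster such as "-v" or "-Dvv".
 * Long options ("--...") and non-option words contribute nothing. */
static int count_v(const char *arg)
{
    int n = 0;

    if (arg == NULL || arg[0] != '-' || arg[1] == '-')
        return 0;
    for (const char *p = arg + 1; *p != '\0'; p++) {
        if (*p == 'v')
            n++;
    }
    return n;
}

/* Effective threshold = base level plus one step per v flag, clamped. */
static level_t effective_level(level_t base, int argc,
                               const char *const argv[])
{
    int lvl = (int)base;

    assert(argv != NULL);
    for (int i = 1; i < argc; i++)
        lvl += count_v(argv[i]);
    if (lvl > (int)LVL_MAX)
        lvl = (int)LVL_MAX;
    return (level_t)lvl;
}

/* A message is emitted when its level does not exceed the threshold. */
static int would_print(level_t threshold, level_t msg)
{
    return msg <= threshold;
}
```

Under these assumptions, a base of LVL_INFO with a single -v stops at LVL_VERBOSE, so a LVL_DEBUG message is filtered out, while -vv (or -Dvv) reaches LVL_DEBUG and lets it through.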
(In reply to David Bigagli from comment #1)
> Hi,
> -v stands for verbose, to see debug level messages you have to specify
> -vv.
>
> David

Sorry, I should have been clearer. I have been starting slurmctld with double-v:

    /usr/sbin/slurmctld -D -vv

and I do see "debug" level messages from other plugins and components. FWIW, I just updated the plugin to include a message using verbose(), and that message does not appear either. Can't help but think I'm missing something obvious...

Thanks much
M
Hi, are you sure your plugin is loaded and configured right? Did you link the plugin with libslurmdb.o? The plugin needs to resolve the slurmdb_qos_rec_t symbol. Using the code you provided, I see the messages just fine running slurmctld -Dvv:

    [Jul 16 13:53:25.434307 13067 0x7f01b2cde700] debug: default_qos: starting plugin
    [Jul 16 13:53:25.434318 13067 0x7f01b2cde700] default_qos: starting plugin
    [Jul 16 13:53:25.434328 13067 0x7f01b2cde700] debug: default_qos: requested partition "markab"
    [Jul 16 13:53:25.474023 13067 0x7f01b2cde700] debug: default_qos: checking for qos matching partition "markab"
    [Jul 16 13:53:25.474047 13067 0x7f01b2cde700] debug: default_qos: comparing to qos "normal"
    [Jul 16 13:53:25.474062 13067 0x7f01b2cde700] debug: default_qos: comparing to qos "markab"
    [Jul 16 13:53:25.474073 13067 0x7f01b2cde700] debug: default_qos: found qos markab matching markab
    [Jul 16 13:53:25.474083 13067 0x7f01b2cde700] default_qos: set job qos to markab

This is what I did:

1) Took the code.

2) Put it inside the existing partition job submit plugin so I don't have to create a new directory and makefiles:

    /src/plugins/job_submit/partition>ls
    ./  ../  job_submit_partition.c  Makefile.am  Makefile.in

   The job_submit_partition.c is now the qos code.

3) Modified the Makefile.am to link with the libslurmdb library:

    + job_submit_partition_la_LDFLAGS = $(SO_LDFLAGS) $(top_builddir)/src/db_api/libslurmdb.o $(PLUGIN_FLAGS)

4) Generated the new Makefile with autogen.sh.

5) Ran configure with the appropriate options, then make.

6) Then of course configured the plugin in slurm.conf:

    jobsubmitplugins=job_submit/partition

   and restarted the controller and the slurmds.

Submit a job and voila le message! :-)

David
Hi Dave: Yeah, well, so long as it works for you, I know that I'm more-or-less getting the plugin right. Must be something in the way that I'm testing the plugin (I don't actually reinstall the whole thing, just drop a rebuilt plugin into the appropriate location and restart slurmctld). Anyway- not a critical issue, so let's move on to more pressing issues. Thanks for your help! Michael
Michael, I would point out that using slurmdb_qos_get() is probably not the optimal way of getting the QOS. You should look at using the assoc_mgr_qos_list already populated in the slurmctld. That way you don't have to query the database every time a job comes in. Sample code could look something like this...

    assoc_mgr_lock_t locks = { NO_LOCK, NO_LOCK, READ_LOCK,
                               NO_LOCK, NO_LOCK, NO_LOCK };

    assoc_mgr_lock(&locks);
    itr = list_iterator_create(assoc_mgr_qos_list);
    while ((qos_ptr = list_next(itr))) {
        debug("default_qos: comparing to qos \"%s\"", qos_ptr->name);
        if (strcmp(token, qos_ptr->name) == 0) {
            debug("default_qos: found qos %s matching %s",
                  qos_ptr->name, token);
            job_desc->qos = xstrdup(qos_ptr->name);
            matched = 1;
            info("default_qos: set job qos to %s", job_desc->qos);
            break;
        }
    }
    list_iterator_destroy(itr);
    assoc_mgr_unlock(&locks);

Hopefully this helps a bit.
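As a rough illustration of why the in-memory list is preferable, the following standalone sketch counts simulated database round trips for a per-job query versus a lookup in a list populated once. Nothing here is Slurm API: fetch_qos_from_db(), the fixed QOS names, and the cache variables are hypothetical stand-ins, and the locking shown in the sample above is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_QOS 8

/* Counts simulated round trips to the accounting database. */
static int db_queries = 0;

/* Hypothetical stand-in for a database query returning all QOS names. */
static size_t fetch_qos_from_db(const char **out, size_t max)
{
    static const char *const rows[] = { "normal", "markab" };
    size_t n = sizeof(rows) / sizeof(rows[0]);

    db_queries++;
    if (n > max)
        n = max;
    for (size_t i = 0; i < n; i++)
        out[i] = rows[i];
    return n;
}

/* Linear search for a QOS whose name equals the partition token. */
static const char *match_qos(const char *const *names, size_t n,
                             const char *token)
{
    assert(token != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(token, names[i]) == 0)
            return names[i];
    }
    return NULL;
}

/* Per-job query: one round trip for every submitted job. */
static const char *lookup_uncached(const char *token)
{
    const char *names[MAX_QOS];
    size_t n = fetch_qos_from_db(names, MAX_QOS);

    return match_qos(names, n, token);
}

/* List populated once (standing in for a list the daemon already holds). */
static const char *qos_cache[MAX_QOS];
static size_t qos_cache_len = 0;

static void cache_init(void)
{
    qos_cache_len = fetch_qos_from_db(qos_cache, MAX_QOS);
}

/* Cached lookup: no round trip per job. */
static const char *lookup_cached(const char *token)
{
    return match_qos(qos_cache, qos_cache_len, token);
}
```

In this sketch, calling cache_init() once and then lookup_cached() for any number of jobs leaves db_queries at 1, whereas each lookup_uncached() call adds one more. A real plugin would also need to hold the appropriate read lock while walking the shared list and copy the name before releasing it, as the sample above does with xstrdup().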
A thing I usually do is attach gdb to slurmctld and break in job_submit, then submit a job, and gdb will break in all loaded submission plugins. You can then examine the instructions to understand what's going on. Remember to build with CFLAGS="-ggdb -O0" in your environment so the symbol table will be available. David