| Summary: | debug messages aren't being displayed for job submit plugin | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Michael Gutteridge <mrg> |
| Component: | slurmctld | Assignee: | David Bigagli <david> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | da |
| Version: | 2.6.7 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | FHCRC - Fred Hutchinson Cancer Research Center | Alineos Sites: | --- |
|
Description
Michael Gutteridge
2014-07-16 07:02:42 MDT
Hi,
-v stands for verbose; to see debug-level messages you have to specify -vv.

David

(In reply to David Bigagli from comment #1)
> Hi,
> -v stands for verbose, to see debug level messages you have to specify
> -vv.
>
> David

Sorry, I should have been clearer. I have been starting slurmctld with double-v:

/usr/sbin/slurmctld -D -vv

and I do see "debug" level messages from other plugins and components. FWIW, I just updated the plugin to include a message using verbose(), and that message does not appear either.

Can't help but think I'm missing something obvious...

Thanks much

M

Hi,
are you sure your plugin is loaded and configured right? Did you link the plugin with libslurmdb.o? The plugin needs to resolve the slurmdb_qos_rec_t symbol.

Using the code you provided, I see the messages just fine when running slurmctld -Dvv:

[Jul 16 13:53:25.434307 13067 0x7f01b2cde700] debug: default_qos: starting plugin
[Jul 16 13:53:25.434318 13067 0x7f01b2cde700] default_qos: starting plugin
[Jul 16 13:53:25.434328 13067 0x7f01b2cde700] debug: default_qos: requested partition "markab"
[Jul 16 13:53:25.474023 13067 0x7f01b2cde700] debug: default_qos: checking for qos matching partition "markab"
[Jul 16 13:53:25.474047 13067 0x7f01b2cde700] debug: default_qos: comparing to qos "normal"
[Jul 16 13:53:25.474062 13067 0x7f01b2cde700] debug: default_qos: comparing to qos "markab"
[Jul 16 13:53:25.474073 13067 0x7f01b2cde700] debug: default_qos: found qos markab matching markab
[Jul 16 13:53:25.474083 13067 0x7f01b2cde700] default_qos: set job qos to markab

This is what I did:

1) Took the code.
2) Put it inside the existing partition job submit plugin, so I don't have to create a new directory and makefiles:

/src/plugins/job_submit/partition>ls
./ ../ job_submit_partition.c Makefile.am Makefile.in

The job_submit_partition.c is now the qos code.
3) I had to modify the Makefile.am to link with the libslurmdb library:

+ job_submit_partition_la_LDFLAGS = $(SO_LDFLAGS) $(top_builddir)/src/db_api/libslurmdb.o $(PLUGIN_FLAGS)

4) Generated the new Makefile with autogen.sh.
5) Ran configure, with the appropriate options, and make.
6) Then of course configured the plugin in slurm.conf:

jobsubmitplugins=job_submit/partition

and restarted the controller and the slurmds.

Submit a job and voila le message! :-)

David

Hi Dave:

Yeah, well, so long as it works for you, I know that I'm more-or-less getting the plugin right. It must be something in the way that I'm testing the plugin (I don't actually reinstall the whole thing, I just drop a rebuilt plugin into the appropriate location and restart slurmctld).

Anyway, this is not a critical issue, so let's move on to more pressing issues.

Thanks for your help!

Michael

Michael, I would point out that using slurmdb_qos_get() is probably not the optimal way of getting the QOS. You should look at using the assoc_mgr_qos_list already populated in the slurmctld. That way you don't have to query the database every time a job comes in.

Sample code could look something like this...
assoc_mgr_lock_t locks = { NO_LOCK, NO_LOCK,
                           READ_LOCK, NO_LOCK, NO_LOCK, NO_LOCK };
assoc_mgr_lock(&locks);
itr = list_iterator_create(assoc_mgr_qos_list);
while ((qos_ptr = list_next(itr))) {
    debug( "default_qos: comparing to qos \"%s\"", qos_ptr->name );
    if (strcmp( token, qos_ptr->name ) == 0)
    {
        debug( "default_qos: found qos %s matching %s",
               qos_ptr->name, token
        );
        job_desc->qos = xstrdup( qos_ptr->name );
        matched = 1;
        info( "default_qos: set job qos to %s", job_desc->qos );
        break;
    }
}
list_iterator_destroy(itr);
assoc_mgr_unlock(&locks);
Hopefully this helps a bit.
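
For anyone who wants to experiment with the lookup logic outside slurmctld, here is a minimal standalone sketch of the same idea: walk an in-memory list of QOS names and copy the one that matches the partition. It does not use any Slurm headers. The names qos_entry_t, qos_cache and find_qos_for_partition are hypothetical stand-ins invented for this illustration, and the two entries simply mirror the "normal" and "markab" QOS names from the log output earlier in this thread. Inside slurmctld the traversal would be bracketed by assoc_mgr_lock()/assoc_mgr_unlock() as in the sample above; the sketch leaves locking out.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cached QOS record (not slurmdb_qos_rec_t). */
typedef struct qos_entry {
    const char *name;
    struct qos_entry *next;
} qos_entry_t;

/* Hypothetical in-memory cache, playing the role of assoc_mgr_qos_list. */
static qos_entry_t qos_markab = { "markab", NULL };
static qos_entry_t qos_normal = { "normal", &qos_markab };
static qos_entry_t *qos_cache = &qos_normal;

/*
 * Return a heap-allocated copy of the cached QOS name that equals
 * `partition`, or NULL if there is no match (or on allocation failure).
 * The caller owns the returned string and must free() it.
 */
char *find_qos_for_partition(const char *partition)
{
    const qos_entry_t *q;

    if (partition == NULL)
        return NULL;

    /* In slurmctld, a read lock on the QOS list would be taken here. */
    for (q = qos_cache; q != NULL; q = q->next) {
        if (strcmp(partition, q->name) == 0) {
            size_t len = strlen(q->name) + 1;
            char *copy = malloc(len);
            if (copy != NULL)
                memcpy(copy, q->name, len);
            return copy;
        }
    }
    /* ...and released here. */
    return NULL;
}
```

Calling find_qos_for_partition("markab") returns a copy of "markab", while a partition with no same-named QOS returns NULL, which corresponds to leaving job_desc->qos untouched in the plugin.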
A thing I usually do is attach gdb to slurmctld and break in job_submit, then submit a job; gdb will break in all loaded submission plugins. You can then examine the instructions to understand what's going on. Remember to build with CFLAGS="-ggdb -O0" in your environment so the symbol table will be available.

David |