Ticket 11411 - Can't run alternate cli_filter.lua script for testing.
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 20.11.5
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: Marcin Stolarek
Reported: 2021-04-19 15:47 MDT by Geoff
Modified: 2021-05-11 20:24 MDT (History)
Site: Johns Hopkins University Applied Physics Laboratory
Version Fixed: 21.08pre1


Description Geoff 2021-04-19 15:47:49 MDT
2 questions.

The cli_filter lua plugin appears to set DEFAULT_SCRIPT_DIR to $(sysconfdir) at compile time.

I have a test slurm setup that uses the same compiled version of slurm as our production cluster so I can test upgrades and changes. It sets the SLURM_CONF environment variable to point to the alternate config location.

We just put a cli_filter.lua script into place but since the path seems to be defined at compile time by the makefile...

     -DDEFAULT_SCRIPT_DIR=\"$(sysconfdir)\" in src/plugins/cli_filter/Makefile.in

...my test cluster ends up seeing the same cli_filter.lua script as my production cluster, and it doesn't look in the directory that slurm.conf resides in. I'd like to be able to do further development and testing on the cli_filter.lua script in my test cluster without affecting my production cluster.

We compile slurm to an automounted location instead of using RPMs so there is one location for the binaries and configs.

Q1) Is there a way to redefine the cli_filter.lua script that I am missing?

The only definition I see for lua_script_path is in cli_filter_lua.c

    static const char lua_script_path[] = DEFAULT_SCRIPT_DIR "/cli_filter.lua";
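To illustrate why the path cannot be changed at run time: the compiler merges adjacent string literals, so the directory passed with -DDEFAULT_SCRIPT_DIR ends up inside a constant in the plugin binary. The following is a minimal sketch, not the plugin source. The fallback value "/opt/slurm/etc" and the accessor compiled_script_path() are hypothetical, added only so the snippet compiles on its own.

```c
#include <string.h>

/* Assumption: stands in for the -DDEFAULT_SCRIPT_DIR=\"$(sysconfdir)\" flag
 * from the Makefile. "/opt/slurm/etc" is an illustrative value only. */
#ifndef DEFAULT_SCRIPT_DIR
#define DEFAULT_SCRIPT_DIR "/opt/slurm/etc"
#endif

/* Adjacent string literals are concatenated at compile time, so the full
 * path is a constant baked into the object file. */
static const char lua_script_path[] = DEFAULT_SCRIPT_DIR "/cli_filter.lua";

/* Hypothetical accessor so the constant can be inspected from outside. */
const char *compiled_script_path(void)
{
    return lua_script_path;
}
```

Because nothing in this pattern reads the environment, two clusters sharing one build would resolve the same script path regardless of SLURM_CONF.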

Q2) Is there a way to define an alternate plugin directory if I compile my own modified cli_filter_lua.* plugin?

I could probably change the name and put it with the normal install, but I am a bit concerned about putting a modification like this where the production cluster might load it.

Thanks.
Comment 1 Marcin Stolarek 2021-04-22 02:25:47 MDT
>Q1) Is there a way to redefine the cli_filter.lua script that I am missing?

No. You're reading the code correctly: as of today there is no easy option to switch the location or file name at run time.

>Q2) Is there a way to define an alternate plugin directory if I compile my own modified cli_filter_lua.* plugin?
Let me discuss this with our senior members on the approach we can take before going that path.

I'll keep you posted.

cheers,
Marcin
Comment 3 Geoff 2021-04-22 15:47:45 MDT
I copied the src/plugins/cli_filter/lua directory to src/plugins/cli_filter/testlua and made changes to cli_filter_testlua.c and the Makefile, so I get cli_filter_testlua.a, cli_filter_testlua.la, and cli_filter_testlua.so files when I run make in that directory. I believe they should be a testlua cli_filter plugin and use DEFAULT_SCRIPT_DIR/cli_filter.testlua as the lua script. (I did not incorporate this change any further up the configure tree.)

I am wondering... if I put these three files into the slurm plugin directory will slurm look at them if they are not listed on the "CliFilterPlugins=lua" option in slurm.conf?

My thought is I could change my test cluster slurm.conf to load testlua instead of lua but I don't want to risk our production cluster loading my cli_filter_testlua plugin if I copy it into the plugin directory with the other cli_filter_* plugins without explicitly loading it.

Thanks.
Comment 6 Marcin Stolarek 2021-05-04 04:51:05 MDT
Geoff,

We had an internal discussion on how to best handle the case. I have a patch changing the behavior to look for lua scripts using the same approach as we do for auxiliary configuration files (like cgroup.conf).

Part of that process is to look for the file in the same directory as slurm.conf. Since the location of slurm.conf can be set with the SLURM_CONF environment variable, this effectively means that you can test a new version of cli_filter.lua by doing something like:
>SLURM_CONF=/etc/slurm-test/slurm.conf srun hostname
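
As a rough sketch of the lookup described above (not the code from the patch), the script path can be derived by replacing the last component of the slurm.conf path. The helper names conf_sibling_path() and cli_filter_script_path(), and the fallback "/etc/slurm/slurm.conf", are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: illustrative fallback when SLURM_CONF is unset. */
#define FALLBACK_SLURM_CONF "/etc/slurm/slurm.conf"

/*
 * Hypothetical helper: write into buf the path of file_name located in the
 * same directory as conf_path. Returns 0 on success, -1 if buf is too small.
 */
int conf_sibling_path(const char *conf_path, const char *file_name,
                      char *buf, size_t buf_len)
{
    const char *slash = strrchr(conf_path, '/');
    int n;

    if (slash) {
        /* Keep everything up to and including the last '/'. */
        int dir_len = (int)(slash - conf_path) + 1;
        n = snprintf(buf, buf_len, "%.*s%s", dir_len, conf_path, file_name);
    } else {
        /* No directory part: the file is relative to the current directory. */
        n = snprintf(buf, buf_len, "%s", file_name);
    }
    return (n < 0 || (size_t)n >= buf_len) ? -1 : 0;
}

/* Hypothetical wrapper: resolve cli_filter.lua next to the active slurm.conf. */
int cli_filter_script_path(char *buf, size_t buf_len)
{
    const char *conf = getenv("SLURM_CONF");

    if (!conf || !*conf)
        conf = FALLBACK_SLURM_CONF;
    return conf_sibling_path(conf, "cli_filter.lua", buf, buf_len);
}
```

With SLURM_CONF=/etc/slurm-test/slurm.conf as in the command above, this sketch would resolve the script to /etc/slurm-test/cli_filter.lua.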

I'm targeting the change at Slurm 21.08 since it's not really a bug, but it's easy to apply on 20.11 too.

cheers,
Marcin
Comment 9 Marcin Stolarek 2021-05-11 20:24:35 MDT
Geoff,

The behavior change I described in comment 6 has been merged to our master branch and will be part of the Slurm 21.08 release, commit 58521d19ef29[1].

Backporting it should be easy: the commit/patch should apply just fine on the 20.11 branch if you want to use it for your current production setup.

I'm closing the bug now. Should you have any questions, please reopen.

cheers,
Marcin

[1] https://github.com/SchedMD/slurm/commit/58521d19ef29347770ae5bf50ac6db539e4634e6