Created attachment 35820
Concatenated patches for Slingshot plugin for 23.11 and beyond

Attached are the latest fixes for the HPE Slingshot switch plugin (concatenated into one patch).
Created attachment 35865
Fix for switch config bug introduced by last patch
Jim -

Based on the commit descriptions, am I correct in assuming that the "fabric manager" aka "jackaloped" part of this plugin does not work in Slurm 23.11 currently?

Are these patches being provided out-of-band to any customers, or is this meant as a longer-term evolution of those interfaces in anticipation of customer demand?

thanks,
- Tim
(In reply to Tim Wickberg from comment #2)
> Jim -
>
> Based on the commit descriptions, am I correct in assuming that the "fabric
> manager" aka "jackaloped" part of this plugin does not work in Slurm 23.11
> currently?
>
> Are these patches being provided out-of-band to any customers, or is this
> meant as a longer-term evolution of those interfaces in anticipation of
> customer demand?
>
> thanks,
> - Tim

The slingshot fabric manager and jackaloped are two different REST interfaces, used for different functionality (jackaloped for "instant on", fabric manager for Slingshot accelerated collectives). Both work, although we're not recommending using the instant on functionality as it hasn't been tested for scalability.

The collectives feature will be used by customers in future Slingshot releases (hard to say exactly when that will be released, but the plugin code is needed for testing).
> The slingshot fabric manager and jackaloped are two different REST
> interfaces,
> used for different functionality (jackaloped for "instant on", fabric
> manager for Slingshot accelerated collectives).

Ah, my mistake conflating those.

> Both work, although we're
> not recommending using the instant on functionality as it hasn't been tested
> for scalability.

I would agree that this should not be recommended.

> The collectives feature will be used by customers in
> future Slingshot releases (hard to say exactly when that will be released,
> but the plugin code is needed for testing).

Given the "will be", can I assume this is used by zero customers today?

We need to have a higher-level discussion on how to manage these plugins going forward. I'll split that discussion to a direct email thread.