Ticket 22374 - Kafka JobComp Plugin erroneously reports multiple partitions
Summary: Kafka JobComp Plugin erroneously reports multiple partitions
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: 23.11.10
Hardware: Linux
Severity: 3 - Medium Impact
Assignee: Alejandro Sanchez
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2025-03-17 13:25 MDT by Thomas Langford
Modified: 2025-03-31 11:12 MDT

See Also:
Site: Yale
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA Site: ---
NoveTech Sites: ---
Nvidia HWinf-CS Sites: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Tzag Elita Sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---



Description Thomas Langford 2025-03-17 13:25:03 MDT
We are attempting to shift our accounting to use the jobcomp/kafka plugin. On the whole it's working great, but there is an issue with the partition column. 

Our researchers often submit jobs to multiple partitions using comma-separated lists like this:

--partition=day,scavenge

This lets them submit to the "standard" partitions and fall back to our scavenge partition (which runs in preempt mode) if there are no available resources in `day`. It's also used to submit to condo partitions with a fallback to the commons partitions if the condo nodes are full.

While sacct reports the correct partition that the job ran in, jobcomp/kafka includes the full comma-separated list with no indication of which partition the job actually ran in.
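
To illustrate the difference (the job ID and message shape below are just examples, not taken from our logs):

sacct -j 12345 --format=JobID,Partition%20,State
  -> the Partition column shows "day", i.e. the partition the job actually ran in

jobcomp/kafka message for the same job (illustrative shape):
  "partition": "day,scavenge"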

Since we treat usage in these partition types (condo, commons, scavenge) differently in our accounting, this is a show-stopper for switching over to jobcomp/kafka.

Let me know if I can provide any additional info to help y'all reproduce this issue.

Thanks!
Tom Langford
Comment 2 Thomas Langford 2025-03-25 11:38:24 MDT
Hi everyone, it's been a week since I submitted this ticket and I haven't heard anything back yet. Is there any additional information I could provide to help out? These are my JobComp config settings:

rdkafka.conf
--------------
bootstrap.servers=XXX.XXX.XXX.XXX:9092
debug=broker,topic,msg
linger.ms=400
log_level=7

slurm.conf
--------------
JobCompType=jobcomp/kafka
JobCompLoc=/opt/slurm/current/etc/rdkafka.conf
JobCompParams=flush_timeout=200,poll_interval=3,requeue_on_msg_timeout,topic=slurm_accounting

Happy to provide anything else that's useful.
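
In case it's useful, I've been inspecting what the plugin publishes by consuming the topic directly (this assumes kcat is available on a host that can reach the broker):

kcat -b XXX.XXX.XXX.XXX:9092 -t slurm_accounting -C -o beginning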
Comment 3 Alejandro Sanchez 2025-03-25 12:46:55 MDT
Hi,

Sorry I didn't get back to this earlier.

Historically, all jobcomp plugins (kafka and elasticsearch share a common serialization code path) have sent the partition field straight from the job_record_t->partition member as-is, which is defined as:

char *partition;                /* name of job partition(s) */

Making a change to suit your expectations would be a behavior change that could disrupt other sites' expectations.
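
For context, a rough sketch of the distinction between the two pieces of state involved. This is illustrative only, not the actual serialization code; it assumes the usual slurmctld structures, where part_ptr points at the partition record the job was assigned to once scheduled:

#include "src/slurmctld/slurmctld.h"	/* job_record_t, part_record_t (Slurm source tree) */

/* Illustrative only: one way a serializer could prefer the partition the
 * job actually ran in over the submit-time list. Not the current code. */
static const char *_partition_for_jobcomp(job_record_t *job_ptr)
{
	if (job_ptr->part_ptr && job_ptr->part_ptr->name)
		return job_ptr->part_ptr->name;	/* assigned partition */
	return job_ptr->partition;	/* submit-time list, e.g. "day,scavenge" */
}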

We can discuss this internally and come back to you.
Comment 6 Thomas Langford 2025-03-28 09:30:18 MDT
Thanks for the clarification. I'm surprised that there isn't a field for "partition where this job ran"; that's what I expected the "partition" value to be. Perhaps the "submitted partition list" could be separated from "partition"?

The corresponding field from sacct gets updated to be the "partition where the job ran", hence my expectation that this would be the same information. 

Do other sites not use the JobComp plugins for long-term accounting? I don't really see the value of the "submitted partition", since that list could contain all partitions. 

We really care about the breakdown of jobs that ran in privately owned partitions vs commons partitions, as that factors into how we report usage to our various research departments. 

Is there another field that I'm missing in the JobComp datastream that indicates which partition a job actually ran in? 

Thanks so much, 
-t
Comment 11 Alejandro Sanchez 2025-03-31 10:32:13 MDT
Hi Thomas,

Just to give you an update: after internal discussion, we'll work on changing the "partition" field in the jobcomp plugins for 25.05 so that it reflects the partition the job ran on, instead of the current comma-separated list from a multi-partition submission.

We'll make sure to communicate the change at release time for sites that are used to the previous behavior.

I'll come back to you when it's ready.

Thanks.
Comment 12 Thomas Langford 2025-03-31 11:12:41 MDT
Fantastic, thanks! Looking forward to the implementation. I've been really happy with the Kafka jobcomp plugin, thanks for all the hard work!