Ticket 227 - slurm does not handle config line larger than 4096 chars
Summary: slurm does not handle config line larger than 4096 chars
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmctld (show other tickets)
Version: 2.5.x
Hardware: Linux Linux
: 3 - Medium Impact
Assignee: David Bigagli
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2013-02-08 02:33 MST by Puenlap Lee
Modified: 2013-02-12 08:29 MST (History)
1 user (show)

See Also:
Site: CEA
Slinky Site: ---
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
Google sites: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
NoveTech Sites: ---
Nvidia HWinf-CS Sites: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Tzag Elita Sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments
propose fix (1.01 KB, application/octet-stream)
2013-02-08 02:33 MST, Puenlap Lee
Details
problem config file (6.98 KB, application/octet-stream)
2013-02-08 02:34 MST, Puenlap Lee
Details
testing slurm.conf file (88.93 KB, application/octet-stream)
2013-02-12 06:01 MST, David Bigagli
Details

Note You need to log in before you can comment on or make changes to this ticket.
Description Puenlap Lee 2013-02-08 02:33:24 MST
Created attachment 189 [details]
propose fix

[root@cluster] # scontrol reconfig
scontrol: error: Parse error in file /release/bull/automate/slurm_automate.conf line 5:
"07,6252-6263,6265-6266,6269,6288-6301,6303-6323,6482-6485,6526-6532,6534-6539]"
scontrol: error: "Include" failed in file /etc/slurm/slurm.conf line 168
scontrol: fatal: something wrong with opening/reading conf file

slurm_automate.conf file is used for declare the nodesets in all the partitions (production, testing, maintenance...). 
In a installation/stabilisation period the nodesets can be very larges caused by many holes, and cause the line contains more than 4096 chars.
Comment 1 Puenlap Lee 2013-02-08 02:34:46 MST
Created attachment 190 [details]
problem config file

Customer's config file.
Comment 2 David Bigagli 2013-02-08 04:36:41 MST
I will take core of this issue.

 David
Comment 3 David Bigagli 2013-02-12 05:53:56 MST
I did some performance measures and it looks like the malloc()/free() has indeed little impact. The time is probably spent in parsing the file.

 Dynamic memory.
 usec=30700
 usec=29435
 usec=29078
 usec=31375
 usec=31386 

average=30394.8

4MB static buffer
 usec=30159
 usec=30102
 usec=26012
 usec=30159
 usec=34238 

average=30134.0

Perhaps the patch could be a little simpler by avoiding one variable.
diff --git a/src/common/parse_config.c b/src/common/parse_config.c
index 27f756e..7cb386b 100644
--- a/src/common/parse_config.c
+++ b/src/common/parse_config.c
@@ -69,8 +69,6 @@ strong_alias(s_p_hashtbl_create,      slurm_s_p_hashtbl_create);
 strong_alias(s_p_hashtbl_destroy,      slurm_s_p_hashtbl_destroy);
 strong_alias(s_p_parse_file,           slurm_s_p_parse_file);
 
-#define BUFFER_SIZE 4096
-
 #define CONF_HASH_LEN 26
 
 static regex_t keyvalue_re;
@@ -870,8 +868,8 @@ int s_p_parse_file(s_p_hashtbl_t *hashtbl, uint32_t *hash_val, char *filename,
                   bool ignore_new)
 {
        FILE *f;
-       char line[BUFFER_SIZE];
        char *leftover = NULL;
+       char *line;
        int rc = SLURM_SUCCESS;
        int line_number;
        int merged_lines;
@@ -899,9 +897,10 @@ int s_p_parse_file(s_p_hashtbl_t *hashtbl, uint32_t *hash_val, char *filename,
                return SLURM_ERROR;
        }
 
+       line = xmalloc(sizeof(char) * stat_buf.st_size);
        line_number = 1;
        while ((merged_lines = _get_next_line(
-                       line, BUFFER_SIZE, hash_val, f)) > 0) {
+                       line, stat_buf.st_size, hash_val, f)) > 0) {
                /* skip empty lines */
                if (line[0] == '\0') {
                        line_number += merged_lines;
@@ -937,13 +936,14 @@ int s_p_parse_file(s_p_hashtbl_t *hashtbl, uint32_t *hash_val, char *filename,
                line_number += merged_lines;
        }
 
+       xfree(line);
        fclose(f);
        return rc;
 }

Thanks, 
       David
Comment 4 David Bigagli 2013-02-12 06:01:06 MST
Created attachment 191 [details]
testing slurm.conf file
Comment 5 Moe Jette 2013-02-12 07:49:54 MST
I just checked the logic to include other files and it recursively calls s_p_parse_file() (see the function _parse_include_directive()), so I think we should just include this patch in the next release of v2.5 and remove the description of the line size limitations from the man pages for all of the configuration files. Thank you for the patch!
Comment 6 Moe Jette 2013-02-12 08:29:51 MST
Your patch, plus the man page changes, are in the commit shown below. Thanks again.

https://github.com/SchedMD/slurm/commit/97a5fa2fe7509f3847e23bda48f5c8c40b0db695