| Summary: | slurmdbd out of memory, dumping core | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Chris <chris.harwell> |
| Component: | slurmdbd | Assignee: | Alejandro Sanchez <alex> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | alex |
| Version: | 15.08.12 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | D E Shaw Research | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
| Attachments: | journalctl -u slurmcl15dbd --since today > /var/tmp/slurmcl15dbd.log | ||
Description
Chris
2016-11-15 13:39:51 MST
(In reply to Chris from comment #0)
> Created attachment 3714 [details]
> journalctl -u slurmcl15dbd --since today > /var/tmp/slurmcl15dbd.log
>
> Twice in the last two days I've had an alert that slurmdbd was down. Upon
> investigation it looks like it ran out of memory and dumped core which,
> rather unfortunately, I didn't have enough space for in the file system.
> Any advice?

I see that your ArchiveDir is /var/spool/slurmcl15/ and you are archiving Events, Jobs and Resvs. I also see this error:

Nov 15 14:22:49 drdslurm0001.en.desres.deshaw.com slurmdbd[32218]: error: mysql_query failed: 1021 Disk full (/var/tmp/#sql_1256_0); waiting for someone to free some space...

I'm wondering if the archive files (or other logs) filled the /var filesystem and MySQL was thus unable to use the filesystem for temporary copies of tables. Another symptom of a full disk is that there wasn't enough space for the core file to be generated.

- Can you check the space on /var?
- Can you check the core file size limit with `ulimit -a` and change it to unlimited if it's limited?

> The box which only has slurmdbd and mariadb has 64GB of memory. Do I just
> need to figure out a way to get a box with more? Any intermediate steps I
> could take?

I believe 64GB of memory should suffice for slurmdbd to work properly.

> # what was going on about this time? there were a lot of short jobs. could
> that have caused an issue? Anything I should adjust or pay special attention
> to? Would commit delay help here or just expose more?

The documentation on CommitDelay says to consider setting it when running a very high throughput of jobs, which seems to be the case here since you mention there were a lot of short jobs.

Chris - a few more questions and notes:
- How many jobs / job steps are you running per day?
- Can you run:
mysql> select count(*) from <cluster>_job_table where time_start > unix_timestamp('2016-11-15');
- How large is the raw mysql database at this point?
- Are you running any sacct queries asking for records from long ago? Heavy queries like that can go over the response buffer (4GB):
error: packdouble: Buffer size limit exceeded (4294901818 > 4294901760)
- I believe increasing the Purge frequency might help mitigate the issue too.
- innodb_buffer_pool_size can have a huge impact - we'd recommend setting this as high as half the RAM available on the slurmdbd server. You can check the current setting in MySQL like so:
mysql> SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
- Setting innodb_lock_wait_timeout and innodb_log_file_size to larger values than the default is also recommended.
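For reference, the MySQL-side knobs above live in the database server's configuration rather than anywhere in Slurm. A sketch of a my.cnf fragment with illustrative values for a 64GB host (the path and numbers are assumptions, not taken from this ticket):

```ini
; /etc/my.cnf, [mysqld] section (example values; size to your own host)
[mysqld]
innodb_buffer_pool_size  = 32G   ; up to roughly half the RAM of a dedicated DB host
innodb_lock_wait_timeout = 900   ; seconds; well above the 50s default
innodb_log_file_size     = 512M  ; larger redo log than the stock default
```

CommitDelay, by contrast, is a slurmdbd.conf parameter, not a MySQL one. Note that changing innodb_log_file_size requires a clean mysqld shutdown and, on older versions, removing the old ib_logfile* files first.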
Thanks! I will get back to you with that info. By purge frequency increase do you mean shorter job and step retention times?

(In reply to Chris from comment #4)
> By purge frequency increase do you mean shorter job and step retention times?

Exactly.

MariaDB [slurmcl2db]> SELECT table_schema, table_name, engine, ROUND(data_length/1024/1024,2) total_size_mb, table_rows FROM information_schema.tables WHERE table_schema = "slurmcl2db" ORDER BY total_size_mb DESC LIMIT 10;
+--------------+--------------------------+--------+---------------+------------+
| table_schema | table_name | engine | total_size_mb | table_rows |
+--------------+--------------------------+--------+---------------+------------+
| slurmcl2db | cl_job_table | InnoDB | 5766.02 | 13298060 |
| slurmcl2db | cl_step_table | InnoDB | 5733.02 | 10528556 |
| slurmcl2db | qm_step_table | InnoDB | 153.02 | 319390 |
| slurmcl2db | swdev_job_table | InnoDB | 149.17 | 360027 |
| slurmcl2db | qm_job_table | InnoDB | 113.16 | 274370 |
| slurmcl2db | swdev_step_table | InnoDB | 89.13 | 235229 |
| slurmcl2db | cl_event_table | InnoDB | 62.23 | 136641 |
| slurmcl2db | a_assoc_usage_hour_table | InnoDB | 47.59 | 841910 |
| slurmcl2db | a_step_table | InnoDB | 21.52 | 12352 |
| slurmcl2db | a_job_table | InnoDB | 13.02 | 1572 |
+--------------+--------------------------+--------+---------------+------------+
10 rows in set (0.04 sec)
MariaDB [slurmcl2db]> select count(job_db_inx) from cl_job_table;
+-------------------+
| count(job_db_inx) |
+-------------------+
| 14292402 |
+-------------------+
1 row in set (1.73 sec)
MariaDB [slurmcl2db]> select count(job_db_inx) from cl_step_table;
+-------------------+
| count(job_db_inx) |
+-------------------+
| 10904677 |
+-------------------+
1 row in set (3.08 sec)
MariaDB [slurmcl2db]> select count(*) from cl_job_table where time_start > unix_timestamp('2016-11-15') and time_start < unix_timestamp('2016-11-16');
+----------+
| count(*) |
+----------+
| 445486 |
+----------+
1 row in set (2.06 sec)
MariaDB [slurmcl2db]> select count(*) from cl_job_table where time_start > unix_timestamp('2016-11-14') and time_start < unix_timestamp('2016-11-15');
+----------+
| count(*) |
+----------+
| 740696 |
+----------+
1 row in set (2.00 sec)
MariaDB [slurmcl2db]> select count(*) from cl_job_table where time_start > unix_timestamp('2016-11-13') and time_start < unix_timestamp('2016-11-14');
+----------+
| count(*) |
+----------+
| 649306 |
+----------+
1 row in set (2.00 sec)
MariaDB [slurmcl2db]> SHOW VARIABLES LIKE 'innodb%timeout';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| innodb_lock_wait_timeout | 900 |
| innodb_rollback_on_timeout | OFF |
+----------------------------+-------+
2 rows in set (0.00 sec)
MariaDB [slurmcl2db]> SHOW VARIABLES LIKE 'innodb%size';
+---------------------------------+-------------+
| Variable_name | Value |
+---------------------------------+-------------+
| innodb_additional_mem_pool_size | 8388608 |
| innodb_buffer_pool_size | 17179869184 |
| innodb_ibuf_max_size | 8589918208 |
| innodb_log_block_size | 512 |
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 536870912 |
| innodb_max_bitmap_file_size | 104857600 |
| innodb_merge_sort_block_size | 1048576 |
| innodb_page_size | 16384 |
| innodb_purge_batch_size | 20 |
+---------------------------------+-------------+
10 rows in set (0.00 sec)
Chris - can you please set innodb_buffer_pool_size to 50GB and innodb_buffer_pool_instances to 50 and see if this helps mitigate the issue? I think these values would fit your 64GB box better, going by the MariaDB docs[1][2]. Also, regarding the disk space:

- What's the total/free/used space in /var? Was /var full when this happened?
- What's the output of `ulimit -a` for the user running slurmdbd (I think slurmcl2)?

Around 500K jobs go to the database daily. As I said, maybe increasing the Purge frequency would help too. Please let me know how it goes.

[1] https://mariadb.com/kb/en/mariadb/xtradbinnodb-server-system-variables/#innodb_buffer_pool_size
[2] https://mariadb.com/kb/en/mariadb/xtradbinnodb-server-system-variables/#innodb_buffer_pool_instances

Could you explain a little more about the 4GB buffer size and what those "Buffer size limit exceeded" error messages mean? Is that a limit compiled into slurmdbd, and if the data returned from SQL exceeds it, do we start to leak memory?
[harwell@drdslurm0001 SPECS]$ sudo journalctl -u slurmcl15dbd --since today | grep -v DBD_INIT | grep -v Conn | grep error
Nov 17 10:27:28 drdslurm0001.en.desres.deshaw.com slurmdbd[27641]: error: pack32: Buffer size limit exceeded (4294901818 > 4294901760)
Nov 17 10:27:28 drdslurm0001.en.desres.deshaw.com slurmdbd[27641]: error: pack32: Buffer size limit exceeded (4294901818 > 4294901760)
Nov 17 11:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: We have more allocated time than is possible (6865200 > 288000) for cluster a(80) from 2016-11-17T10:00:00 - 2016-11-17T11:00:00 tres 1
Nov 17 11:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: We have more time than is possible (288000+288000+0)(576000) > 288000 for cluster a(80) from 2016-11-17T10:00:00 - 2016-11-17T11:00:00 tres 1
Nov 17 11:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: id_assoc 1071 doesn't have any tres
Nov 17 12:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: We have more allocated time than is possible (6865200 > 288000) for cluster a(80) from 2016-11-17T11:00:00 - 2016-11-17T12:00:00 tres 1
Nov 17 12:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: We have more time than is possible (288000+288000+0)(576000) > 288000 for cluster a(80) from 2016-11-17T11:00:00 - 2016-11-17T12:00:00 tres 1
Nov 17 12:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: id_assoc 1071 doesn't have any tres
Nov 17 13:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: We have more allocated time than is possible (6865200 > 288000) for cluster a(80) from 2016-11-17T12:00:00 - 2016-11-17T13:00:00 tres 1
Nov 17 13:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: We have more time than is possible (288000+288000+0)(576000) > 288000 for cluster a(80) from 2016-11-17T12:00:00 - 2016-11-17T13:00:00 tres 1
Nov 17 13:00:00 drdslurm0001.en.desres.deshaw.com slurmdbd[24513]: error: id_assoc 1071 doesn't have any tres
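A quick sanity check on the number in those pack errors (a shell aside, not from the ticket): 4294901760 is the hex constant 0xffff0000, a cap just under 32 bits, and the failing reply overshot it by only 58 bytes:

```shell
# 4294901760 in the "Buffer size limit exceeded" errors is 0xffff0000
printf '%u\n' 0xffff0000           # prints 4294901760
# and the reply that tripped the limit overshot by:
echo $((4294901818 - 4294901760))  # prints 58
```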
Sorry for the delayed reply. I've made some changes and haven't seen the errors since about 10:30am when I cancelled the reporting script.
I've updated some of the metrics we run to use squeue rather than sacct, and disabled the month-to-date reports which query with sreport and sacct. We've been monitoring slurmdbd memory usage and restarted it twice, when usage reached 60% and 67% of memory on the box.
The slurmcl2 user which runs slurmdbd has unlimited core:
sudo su --shell=/bin/bash - slurmcl2
Last login: Thu Nov 17 14:01:13 EST 2016 on pts/10
-bash-4.2$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 255917
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 16384
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I'm not 100% positive of the order of events, but yes, /var did fill on that day, and that is where the archives, logs, core files, and mysql tmp files all go. At least the mysql db is on a separate /var/lib/mysql partition.
I've since grown the /var partition and removed the partial core files:
[harwell@drdslurm0001 SPECS]$ df -hl /var/ /var/lib/mysql/
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/data-var 96G 7.8G 89G 9% /var
/dev/mapper/data-mysql 32G 18G 15G 56% /var/lib/mysql
I've decreased retention times to 30 days, stepping 1 day at a time and doing the purges by hand with something like this:
y=720hours ;
for i in {60..45} ; do
h=$(( 24*$i )) ;
x=${h}hours ;
echo /opt/slurmcl15/bin/sacctmgr -i archive dump Directory=/var/spool/slurmcl15/ Events Jobs Steps Suspend PurgeEventAfter=$x PurgeJobAfter=$x PurgeStepAfter=$y PurgeSuspendAfter=24hours ;
sudo -u slurmcl2 /opt/slurmcl15/bin/sacctmgr -i archive dump Directory=/var/spool/slurmcl15/ Events Jobs Steps Suspend PurgeEventAfter=$x PurgeJobAfter=$x PurgeStepAfter=$y PurgeSuspendAfter=24hours ;
done
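The loop above steps retention down by hand; once at the target, the equivalent steady-state policy can be expressed in slurmdbd.conf so the dbd archives and purges on its own. A sketch assuming a 30-day target and the same archive directory (values are illustrative, not this site's actual config):

```ini
; slurmdbd.conf fragment (illustrative values)
ArchiveDir=/var/spool/slurmcl15/
ArchiveEvents=yes
ArchiveJobs=yes
ArchiveSteps=yes
ArchiveSuspend=yes
PurgeEventAfter=30days
PurgeJobAfter=30days
PurgeStepAfter=30days
PurgeSuspendAfter=1day
```

Stepping the window down gradually, as the loop does, avoids one enormous purge transaction when the retention first shrinks.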
We did see an uptick in jobs per day in Nov.
Setting the innodb buffer to 50GB makes me a little nervous since slurmdbd has been ballooning up to 60%. I have doubled it to 32GB and increased the instances from this version's default of 1 to 32:
Current database: slurmcl2db
+---------------------------------------+-------------+
| Variable_name | Value |
+---------------------------------------+-------------+
| innodb_buffer_pool_instances | 32 |
| innodb_buffer_pool_populate | OFF |
| innodb_buffer_pool_restore_at_startup | 0 |
| innodb_buffer_pool_shm_checksum | ON |
| innodb_buffer_pool_shm_key | 0 |
| innodb_buffer_pool_size | 34359738368 |
+---------------------------------------+-------------+
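innodb_buffer_pool_size is reported in bytes; a quick conversion (just for reference, not from the ticket) confirms the pool is now 32 GiB:

```shell
# 34359738368 bytes from SHOW VARIABLES, converted to GiB
echo $((34359738368 / 1024 / 1024 / 1024))   # prints 32
```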
(In reply to Chris from comment #8)
> Could you explain a little more about the 4GB buffer size and what those
> error messages with Buffer size limit exceeded mean? Is that a slurmdbd
> compiled in limit and if the data returned from sql exceeds it we start to
> leak memory?

There's this max buffer definition in the code:

src/common/pack.h:#define MAX_BUF_SIZE ((uint32_t) 0xffff0000) /* avoid going over 32-bits */

When slurmdbd has to pack and transmit buffer information, the max buffer size is capped at 4294901760 bytes.

> The slurmcl2 user which runs slurmdbd has unlimited core:
> ...
> core file size (blocks, -c) 0

Can you please set the 'core file size' for slurmcl2 to unlimited? This way, if slurmdbd aborts again, we can backtrace the generated core and figure out what's going on.
> I've since grown the /var partition and removed the partial core files:
> ...
> Setting innodb buffer to 50GB makes me a little nervous since slurmdbd has
> been ballooning up to 60%. I have doubled it to 32GB and increased the
> instances from the default of 1 in this version to 32:

Ok, great. Are you experiencing more errors?
If slurmdbd aborts() again, could you please attach the output of:

$ gdb /path/to/core slurmdbd
(gdb) set pagination off
(gdb) set print pretty on
(gdb) bt
(gdb) thread apply all bt full

Thanks.

Sorry, I totally pasted the wrong thing. The limits when opening a login shell are different from those of the running slurmdbd process. For that process the core file size really is "unlimited". Since we've gotten two partial cores before, I have a fairly high degree of confidence in that :>

[harwell@drdslurm0001 SPECS]$ cat /proc/$(pgrep slurmdbd)/limits
Limit                     Soft Limit  Hard Limit  Units
Max cpu time              unlimited   unlimited   seconds
Max file size             unlimited   unlimited   bytes
Max data size             unlimited   unlimited   bytes
Max stack size            unlimited   unlimited   bytes
Max core file size        unlimited   unlimited   bytes
Max resident set          unlimited   unlimited   bytes
Max processes             255917      255917      processes
Max open files            8192        8192        files
Max locked memory         unlimited   unlimited   bytes
Max address space         unlimited   unlimited   bytes
Max file locks            unlimited   unlimited   locks
Max pending signals       255917      255917      signals
Max msgqueue size         819200      819200      bytes
Max nice priority         0           0
Max realtime priority     0           0
Max realtime timeout      unlimited   unlimited   us
[harwell@drdslurm0001 SPECS]$
So far so good in general, though; slurmdbd is at 12% mem, 7.394g resident, after 20 hrs. Hopefully the cleanup and the tuning helped.

Is there a scenario where, if mysqld isn't supplying the answer to a slurmdbd (sacct/sreport) query quickly enough, the slurmdbd memory footprint would be expected to get bigger?
(In reply to Chris from comment #11)
> So far so good in general though, slurmdbd is at 12%mem at 7.394g
> resident after 20hrs. Hopefully the cleanup and the tuning helped.

Great. Let's leave the bug open for a while and keep slurmdbd under observation for a few days, just in case any new issue appears.

> Is there a scenario where if mysqld isn't supplying the answer to the
> slurmdbd (sacct/sreport) query quickly enough that the slurmdbd memory
> footprint would be expected to get bigger?

If mysqld is slow to supply a result to slurmdbd, I would not expect slurmdbd's memory footprint to increase. I would if the query result is large, such as sacct/sreport queries with large date ranges.

Chris, any more memory footprints, issues?

No crashes yet :>

We've seen the memory go to ~50% and it is sitting at 40% now. Perhaps this is just the new norm with the number of jobs we have?

Last started Thu 2016-11-17 14:30:26 EST; 4 days ago

We have issued reconfigure a few times; would that be expected to clear memory?

top - 12:17:35 up 258 days, 18:44, 1 user, load average: 0.06, 0.09, 0.13
Tasks: 401 total, 2 running, 399 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.1 sy, 0.0 ni, 99.8 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 65536736 total, 3657260 free, 58970704 used, 2908772 buff/cache
KiB Swap: 67108860 total, 49677032 free, 17431828 used. 6177936 avail Mem

  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+   COMMAND
20307 mysql     20   0  36.387g 0.028t  4720 S  1.6 46.5 846:21.45 mysqld
20364 slurmcl2   2 -18  27.023g 0.025t  1112 S  0.5 40.2 140:34.32 slurmdbd

We do see warnings pretty much every hour about large processing times for hourly_rollups. Is any action required or suggested there?
Nov 21 00:02:11 drdslurm0001.en.desres.deshaw.com slurmdbd[20364]: Warning: Note very large processing time from hourly_rollup for cl: usec=130528211 began=00:00:00.724

(In reply to Chris from comment #14)
> No crashes yet :>

Good.

> We've seen the memory go to ~50% and it is sitting at 40% now. Perhaps this
> is just the new norm with the number of jobs we have?

Obviously, the more pressure querying slurmdbd, the more memory it's going to consume, but 32GB of memory seems like a lot.

> We have issued reconfigure a few times, would that be expected to clear
> memory?

Looking at what the reconfig code does, I don't think so. It just re-reads the configuration, sets missing uids, resets the logging... but it doesn't look like it clears memory at all.

> We do see warnings pretty much every hour about large processing times for
> hourly_rollups. Is any action required or suggested there?

I see in the logs (from just Nov 15th) several connections from these UIDs:

CONNS  UID
 1944  UID:0
  244  UID:11580
 4418  UID:12474
    2  UID:12858
  183  UID:13013
 1524  UID:13120
    1  UID:13664
 2724  UID:23532
 3683  UID:40043
 5114  UID:406

- Do you know if any of these uids are constantly querying slurmdbd?
- How often are sacct/sreport requesting info from slurmdbd? Having a script/cron that constantly calls sacct/sreport can also cause slurmdbd to slow down, which is why I ask.

Yes, we have several workflows and health checks which submit some jobs and then query sacct for completion and exit status.

Is there a mechanism for telling slurmdbd to drop or expire cache sooner without restarting?
(In reply to Chris from comment #16)
> Yes, we have several workflows and health checks which submit some
> jobs and then query sacct for completion and exit status.

Then probably all these workflows are putting a lot of pressure on
slurmdbd, depending on the frequency of the requests and the size of
the results.

> Is there a mechanism for telling slurmdbd to drop or expire its cache
> sooner without restarting?

Can you please elaborate on this a little bit more?

Oh, I just found myself wondering what slurmdbd is doing with all that
memory. I figure it can't all be uncommitted new information, but is
probably also some amount of old job information that it keeps to
answer repeated queries rather than relying on mysql.
_If_ that is the case, I thought maybe there was some way to tell it to
forget about those and go to the db next time.

So... are we doing it differently than other people? Should we not be
querying with sacct? Should we instead be using a job completion plugin
of some kind? What is the recommendation? (Or maybe we should just go
buy something with more mem and move on....)

(In reply to Chris from comment #18)
> Oh, I just found myself wondering what slurmdbd is doing with all that
> memory. I figure it can't all be uncommitted new information, but is
> probably also some amount of old job information that it keeps to
> answer repeated queries rather than relying on mysql. _if_ that is the
> case I thought maybe there was some way to tell it to forget about
> those and go to the db next time.
>
> So... are we doing it differently than other people? Should we not be
> querying with sacct? Should we instead be using a job completion
> plugin of some kind? What is the recommendation? (Or maybe we should
> just go buy something with more mem and move on....)
Well, slurmdbd retains the results of all these queries in memory until
it buffers them out to the clients or to slurmctld. So not all memory
is uncommitted new information, indeed. That's why, depending on the
sacct query frequency and the size of the query results, the slurmdbd
memory footprint can increase.

Can you please identify the list of exact sacct queries being executed
against slurmdbd and how often they are executed? For instance, sacct
querying at high frequency for job records starting a long time ago is
an example of this. I'm not saying that you should not be querying with
sacct, just that depending on frequency and size it may have a memory
impact; that's why it's important to precisely identify the kind of
queries being executed.

Using a job completion plugin to store/retrieve the information related
to finished jobs may help decouple slurmdbd a bit from your workflow of
querying the status of these finished jobs.

I will need to gather that info. I know in some cases it is just sacct
-j $jids -o JobId,State,Exit and in other cases just 'sacct' with no
arguments. In some cases it uses --brief and/or
--starttime=$(date +%H:%M --date='1 hour ago').

Does slurmdbd have a debug flag or level which will log the sacct
queries, or should I wrap sacct to log somewhere and report back?

(In reply to Chris from comment #20)
> I will need to gather that info. I know in some cases it is just sacct
> -j $jids -o JobId,State,Exit and in other cases just 'sacct' with no
> arguments. In some cases it uses --brief and/or
> --starttime=$(date +%H:%M --date='1 hour ago').

If you can identify precisely these exact commands, that may be helpful.
> Does slurmdbd have a debug flag or level which will log the sacct
> queries, or should I wrap sacct to log somewhere and report back?

There's 'sacctmgr show transaction', but that will only list additions,
modifications and removals, not queries.

You can use the slurmdbd.conf DebugFlags DB_JOB, DB_QUERY and DB_STEP,
which will result in more verbose logging of SQL statements/queries
when dealing with jobs, transactions and steps respectively. Anyhow,
I'd suggest looking at the history of commands of the UID's in comment
#15, or checking the scripts/crons that generate such queries. Then
post them here in a new comment; it may also be useful to manually
execute some of them and see whether slurmdbd memory consumption
changes while they are being executed.

Here is a five minute slice from some:

Nov 23 11:00:53 sacct:
Nov 23 11:00:53 sacct:
Nov 23 11:01:05 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:01:05 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:01:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:01:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:01:13 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:01:13 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:01:23 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:01:23 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:01:29 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:01:29 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:01:37 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:01:37 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:02:05 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:02:05 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:02:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:02:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:02:13 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:02:13 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:02:23 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:02:23 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:02:30 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:02:30 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:02:37 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:02:37 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:03:05 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:03:05 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:03:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:03:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:03:14 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:03:14 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:03:23 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:03:23 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:03:30 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:03:30 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:03:38 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:03:38 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:04:06 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:04:06 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:04:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:04:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:04:14 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:04:14 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:04:23 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:04:23 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:04:30 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:04:30 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:04:38 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:04:38 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:05:02 sacct: --brief --starttime=10:05
Nov 23 11:05:02 sacct: --brief --starttime=10:05
Nov 23 11:05:02 sacct: --brief --starttime=10:05
Nov 23 11:05:02 sacct: --brief --starttime=10:05
Nov 23 11:05:06 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:05:06 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26015188,26015189
Nov 23 11:05:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:05:11 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25811768
Nov 23 11:05:14 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:05:14 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25961228,26042983
Nov 23 11:05:24 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:05:24 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25784701
Nov 23 11:05:30 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:05:30 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 25969744,25992646
Nov 23 11:05:38 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823
Nov 23 11:05:38 sacct: --parsable2 --noheader --format=JobID,State,ExitCode --jobs 26035833,26009823

Chris - could you please recompile the code with the
--enable-memory-leak-debug option? We do create lists for each
transaction which sort of cache the memory for the list to be reused
later. This should be fairly small, but might be related. This option
will not keep the pointers around. You would only need to do this for
the slurmdbd node.

Another thing you can do to help us diagnose this issue is to set up a
cron job which runs every minute or so looking at the memory footprint,
to see at what time the memory grows. Our guess is there is no leak,
but perhaps something in the mysql API is holding onto memory.
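The once-a-minute footprint check suggested above can be a small shell script driven from cron. A minimal sketch (the script name, cron line, and log path are illustrative, not anything Slurm ships):

```shell
#!/bin/sh
# Append one "YYYY-MM-DD HH:MM:SS rss_kb vsz_kb" sample per invocation.
# Schedule from cron, e.g. (illustrative paths):
#   * * * * * /usr/local/sbin/dbd-memlog >> /var/log/slurmdbd-mem.log
log_mem() {
    # $1: pid to sample; ps reports rss/vsz in kilobytes on Linux
    ps -o rss=,vsz= -p "$1" | awk -v ts="$(date '+%F %T')" '{print ts, $1, $2}'
}

# Sample slurmdbd only if it is running
if pid=$(pidof slurmdbd); then
    log_mem "$pid"
fi
```

Plotting the two columns over time should show whether growth correlates with the hourly rollups or with the sacct polling bursts.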
[root@drdslurm0001 bin]# systemctl status -l slurmcl15dbd
● slurmcl15dbd.service - Slurm DBD accounting daemon
Loaded: loaded (/usr/lib/systemd/system/slurmcl15dbd.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/slurmcl15dbd.service.d
└─10-options.conf
Active: active (running) since Thu 2016-11-17 14:30:26 EST; 2 weeks 0 days ago
Main PID: 20364 (slurmdbd)
CGroup: /system.slice/slurmcl15dbd.service
└─20364 /opt/slurmcl15/sbin/slurmdbd -n -18
From top:

  PID USER      PR  NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
20364 slurmcl2   2 -18 41.410g 0.018t 1472 S  0.3 30.0 490:02.16 slurmdbd
Recompiled with --enable-memory-leak-debug and restarted with systemd restart ....
Would that print something in the logs normally, or does this need to be started with valgrind or something?
(In reply to Chris from comment #27)
> Recompiled with --enable-memory-leak-debug and restarted with systemd
> restart ....

Good, thanks.

> Would that print something in the logs normally, or does this need to
> be started with valgrind or something?

No, you won't notice anything special in the logs. We do create lists
for each transaction which sort of cache the memory for the list to be
reused later. It should be fairly small, but might be related to the
bug. When MEMORY_LEAK_DEBUG is set to 1, the cache is disabled: each
memory request will be satisfied with a separate xmalloc request, and
when the memory is no longer required it is immediately freed. This
means valgrind can identify exactly where any leak associated with the
use of the list functions originates.

Another thing would be to have a cron job run every minute or so
looking at the memory footprint, to see at what time the memory grows.
Our guess is there is no leak, but perhaps something in the mysql API
is holding onto memory. From what you sent, it appeared that mysqld had
a very similar footprint. With the above option we will at least rule
out the list cache.

Chris, have you managed to monitor the footprint with any cron job or
similar? Any updates on this?

Haven't seen feedback since three weeks ago; marking this as
resolved/timedout. Please reopen if this is still an active concern.

Interestingly, we stopped seeing the rapid memory growth once we
installed the version compiled with the suggested configure option.
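For completeness, with the leak-debug build in place the daemon can be run in the foreground under valgrind to attribute any remaining growth. This is only a sketch: the install prefix matches this site's layout, the log path is illustrative, and slurmdbd's -D flag (run in the foreground) is assumed to be available in this build.

```shell
# Run the --enable-memory-leak-debug build of slurmdbd in the
# foreground (-D) under valgrind; stop it with Ctrl-C and then
# inspect the leak report in the log file (path is illustrative).
valgrind --leak-check=full \
    --log-file=/var/tmp/slurmdbd-valgrind.log \
    /opt/slurmcl15/sbin/slurmdbd -D
```

Note that running under valgrind slows the daemon considerably, so this is better done on a test instance than on the production slurmdbd.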
Alejandro Sanchez changed bug 3260:
    Resolution: --- -> TIMEDOUT
    Status: UNCONFIRMED -> RESOLVED

Great, it seems disabling the cache worked well then. Marking as
resolved/infogiven.