314 – sview scalability issues

Summary: sview scalability issues

Status:	RESOLVED DUPLICATE of ticket 345

Alias:	None

Product:	Slurm
Classification:	Unclassified
Component:	Other
Version:	2.6.x
Hardware:	Linux Linux

Severity:	3 - Medium Impact
Assignee:	Moe Jette

Reported:	2013-06-04 04:47 MDT by Yiannis Georgiou
Modified:	2013-06-21 07:38 MDT
CC List:	1 user

See Also:
Site:	Universitat Dresden (Germany)


Description Yiannis Georgiou 2013-06-04 04:47:34 MDT

On the TU Dresden cluster we observed that sview hangs when many jobs have been submitted to, or are being managed by, SLURM. The cores on which sview is running go to 100% CPU load, and sview stops updating the information in the GUI.

Have you seen this before? Are there configuration parameters that we can change to improve the scalability of sview?

Thanks,
Yiannis

Comment 1 Danny Auble 2013-06-04 04:51:27 MDT

How many jobs are you talking about?  I am guessing thousands.  There is probably work that could be done in terms of scalability.  sview does a lot of work that it does not need to do, such as updating buttons that are not visible.

I would propose we look at some way to only change the status of what is currently displayed.  I am not sure how difficult that will be.
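
The idea of touching only what is currently displayed can be sketched as follows. This is a minimal illustration in Python, not sview code: the row dictionaries, the `job_id` and `state` keys, and the `refresh_visible` helper are all hypothetical names chosen for the example, and a real GTK tree view would obtain the visible range from the widget itself.

```python
def refresh_visible(rows, updates, first_visible, last_visible):
    """Apply pending state updates only to rows inside the visible window.

    rows          -- list of dicts with hypothetical "job_id" and "state" keys
    updates       -- dict mapping job_id to its new state
    first_visible -- index of the first row currently on screen
    last_visible  -- index of the last row currently on screen (inclusive)

    Returns the number of rows whose state actually changed.
    """
    redrawn = 0
    # Clamp the window so a stale or oversized range cannot index past the list.
    start = max(first_visible, 0)
    stop = min(last_visible, len(rows) - 1)
    for idx in range(start, stop + 1):
        row = rows[idx]
        new_state = updates.get(row["job_id"])
        # Skip rows with no pending update or an unchanged state.
        if new_state is not None and new_state != row["state"]:
            row["state"] = new_state
            redrawn += 1
    return redrawn
```

Rows outside the window keep their old state in this sketch, so the same function would have to be called again whenever the user scrolls.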

Comment 2 Moe Jette 2013-06-04 12:12:11 MDT

I did some scalability testing with sview a few years ago. My recollection is that sview was unable to manage more than a couple thousand elements (nodes, jobs, or whatever). Almost all of the time was consumed by the underlying GTK library. Some enhancements were made to improve performance, but I'm not sure how much more is possible except by reducing the number of elements displayed, say by showing only the highest priority jobs instead of all jobs.
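
As a rough illustration of limiting the display to the highest priority jobs, the selection step could look like the Python sketch below. It is not taken from sview: the job dictionaries, the `priority` key, and the `top_priority_jobs` helper are assumptions made for the example.

```python
import heapq


def top_priority_jobs(jobs, limit):
    """Return at most `limit` jobs, ordered from highest to lowest priority.

    jobs  -- iterable of dicts with a hypothetical numeric "priority" key
    limit -- maximum number of jobs to hand to the display layer
    """
    # heapq.nlargest avoids sorting the full job list when limit is small.
    return heapq.nlargest(limit, jobs, key=lambda job: job["priority"])
```

With a limit well below the couple thousand elements mentioned above, the GUI would only ever build widgets for the selected subset, whatever the total number of jobs in the queue.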

Comment 3 Danny Auble 2013-06-21 07:38:32 MDT

Another ticket with a possible patch was filed for this issue, so I am marking this one as a duplicate.

*** This ticket has been marked as a duplicate of ticket 345 ***