Ticket 852 - sstat command cannot be used
Summary: sstat command cannot be used
Status: RESOLVED DUPLICATE of ticket 853
Alias: None
Product: Slurm
Classification: Unclassified
Component: Other
Version: 2.6.2
Hardware: Linux Linux
Severity: 3 - Medium Impact
Assignee: David Bigagli
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2014-06-02 20:29 MDT by toru matsuoka
Modified: 2014-06-03 02:36 MDT

See Also:
Site: CRAY


Attachments
slurm.conf file (5.78 KB, text/plain)
2014-06-02 20:29 MDT, toru matsuoka
Description toru matsuoka 2014-06-02 20:29:26 MDT
Created attachment 893
slurm.conf file

Hello, Slurm Support team!

I'm Toru Matsuoka, an engineer at Cray Japan.

Please advise on the following.

Our customer wants to use the sstat command.

However, the following error occurred.

■sstat command

Note: the sstat command requires that the jobacct_gather plugin be installed and operational.

[root@mgmt2 slurm]# sstat --format=AveCPU,AvePages,AveRSS,AveVMSize,JobID -j 27005
    AveCPU   AvePages     AveRSS  AveVMSize        JobID
---------- ---------- ---------- ---------- ------------
sstat: error: Malformed RPC of type 5020 received
sstat: error: slurm_receive_msgs: Header lengths are longer than data received
sstat: error: Malformed RPC of type 5020 received
sstat: error: slurm_receive_msgs: Header lengths are longer than data received
sstat: error: Malformed RPC of type 5020 received
sstat: error: slurm_receive_msgs: Header lengths are longer than data received
sstat: error: slurm_job_step_stat: unknown return given from e035: 9001 rc = Communication connection failure
sstat: error: slurm_job_step_stat: unknown return given from e036: 9001 rc = Communication connection failure
sstat: error: slurm_job_step_stat: unknown return given from e034: 9001 rc = Communication connection failure
sstat: error: problem getting step_layout for 27005.0: Communication connection failure

■sacct command 

The sacct command appears to work.

[root@mgmt2 slurm]# sacct --j 27061
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
27061             prog4 ye016uta7+       root         40    RUNNING      0:0
27061.0       pmi_proxy                  root          2    RUNNING      0:0


In slurm.conf, ProctrackType is set as follows:

ProctrackType=proctrack/pgid

The JobAcctGatherType parameter is not set.

Is JobAcctGatherType required in slurm.conf, or is something else causing the problem?

Best regards,
Toru Matsuoka
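
Editorial note on the JobAcctGatherType question above: the sstat note quoted in the description says the jobacct_gather plugin must be operational, and when JobAcctGatherType is absent from slurm.conf Slurm defaults to jobacct_gather/none. The following is a minimal sketch, not a confirmed diagnosis of the RPC errors shown (the resolution is tracked in ticket 853). The function name check_jobacct_gather and the file path passed to it are hypothetical.

```shell
# Hypothetical helper: report whether a slurm.conf sets JobAcctGatherType
# to something other than the default (jobacct_gather/none).
# Usage: check_jobacct_gather /path/to/slurm.conf   (path is an assumption)
check_jobacct_gather() {
    conf_file="$1"
    # Strip comments, keep the last JobAcctGatherType=... line, extract the value.
    value=$(sed -e 's/#.*//' "$conf_file" \
        | grep -i -E '^[[:space:]]*JobAcctGatherType[[:space:]]*=' \
        | tail -n 1 | cut -d= -f2 | tr -d '[:space:]')
    if [ -z "$value" ] || [ "$value" = "jobacct_gather/none" ]; then
        echo "JobAcctGatherType is unset or none"
        return 1
    fi
    echo "JobAcctGatherType=$value"
}
```

If the check reports that the parameter is unset, one option to evaluate is adding a line such as JobAcctGatherType=jobacct_gather/linux to slurm.conf and restarting the Slurm daemons. On a running cluster, scontrol show config also lists the JobAcctGatherType value currently in effect.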
Comment 1 Moe Jette 2014-06-03 02:36:52 MDT
Duplicate of bug 853

*** This ticket has been marked as a duplicate of ticket 853 ***