Ticket 19561 - pam_slurm_adopt
Summary: pam_slurm_adopt
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Limits
Version: 23.11.4
Hardware: Linux Linux
: 6 - No support contract
Assignee: Jacob Jenson
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2024-04-09 21:48 MDT by lihang
Modified: 2024-11-01 16:13 MDT

See Also:
Site: -Other-
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---


Description lihang 2024-04-09 21:48:03 MDT
We can restrict logins to compute nodes with pam_slurm_adopt, but whether the resource limits are applied depends on the login method.
     If a user logs in to a compute node with a password, cgroup fails to limit resources. If the user logs in with a public key, cgroup successfully limits resources.

     It seems that
       for password login: the PID captured by pam_slurm_adopt disappears quickly, and a new sshd PID is created for the session.
       for public key login: the PID captured by pam_slurm_adopt does not change.
     As a result, cgroup does not limit the resources of a password login but does limit those of a public key login.
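
One way to check this on the node is to read /proc/<pid>/cgroup for the shell of each login and see whether it sits under the job's cgroup. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that Slurm's cgroup paths contain a component named job_<id> (for example .../job_64/...), which may differ depending on the cgroup plugin and version in use.

```python
import re
from pathlib import Path


def job_ids_in_cgroup_text(cgroup_text):
    """Return the set of job IDs found in the text of a /proc/<pid>/cgroup file.

    Assumption: Slurm cgroup paths contain a path component "job_<id>".
    """
    job_ids = set()
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        for match in re.finditer(r"(?:^|/)job_(\d+)(?=/|$)", parts[2]):
            job_ids.add(int(match.group(1)))
    return job_ids


def pid_is_in_job(pid, job_id):
    """Check whether a live process appears to be inside the given job's cgroup."""
    text = Path(f"/proc/{pid}/cgroup").read_text()
    return job_id in job_ids_in_cgroup_text(text)
```

Running pid_is_in_job(<shell pid>, 64) from a password session and from a public key session would show whether the login shell itself ended up in the job's cgroup in each case.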

    I wonder whether this is a Slurm bug or a problem with our settings.


 The /var/log/secure entries are as follows:
  for password login:
   
     Apr  9 22:00:13 compute03 sshd[42241]: pam_sss(sshd:auth): authentication success; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.28.0.41 user=lihang
     Apr  9 22:00:13 compute03 pam_slurm_adopt[42241]: Connection by user lihang: user has only one job 64
     Apr  9 22:00:13 compute03 pam_slurm_adopt[42241]: Process 42241 adopted into job 64
     Apr  9 22:00:13 compute03 sshd[42238]: Accepted keyboard-interactive/pam for lihang from 10.28.0.41 port 34992 ssh2
     Apr  9 22:00:13 compute03 sshd[42238]: pam_unix(sshd:session): session opened for user lihang by (uid=0)

   for public key login:
       Apr  9 22:04:05 compute03 pam_slurm_adopt[42375]: Connection by user lihang: user has only one job 64
       Apr  9 22:04:05 compute03 pam_slurm_adopt[42375]: Process 42375 adopted into job 64
       Apr  9 22:04:05 compute03 sshd[42375]: Accepted publickey for lihang from 10.28.0.41 port 35006 ssh2: RSA SHA256:XGLK9nU/wUtTU+Ip9HwcKhI2Zgf0EF8VAKrDGxNIP+0
       Apr  9 22:04:05 compute03 sshd[42375]: pam_unix(sshd:session): session opened for user lihang by (uid=0)
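
The difference between the two excerpts can be made explicit by comparing the PID that pam_slurm_adopt adopted with the PID of the sshd process that logged "Accepted ...". The following is a rough sketch that only parses the two message formats shown above; the function name and the returned keys are hypothetical, and a PID mismatch is a hint rather than proof of the cause.

```python
import re

# Patterns match only the message formats seen in the excerpts above.
ADOPTED = re.compile(r"pam_slurm_adopt\[\d+\]: Process (\d+) adopted into job (\d+)")
ACCEPTED = re.compile(r"sshd\[(\d+)\]: Accepted (\S+) for (\S+)")


def compare_login_pids(log_text):
    """Compare the adopted PID with the PID of the sshd that accepted the login."""
    result = {"adopted_pid": None, "job_id": None,
              "accepted_pid": None, "method": None}
    for line in log_text.splitlines():
        adopted = ADOPTED.search(line)
        if adopted:
            result["adopted_pid"] = int(adopted.group(1))
            result["job_id"] = int(adopted.group(2))
        accepted = ACCEPTED.search(line)
        if accepted:
            result["accepted_pid"] = int(accepted.group(1))
            result["method"] = accepted.group(2)
    # True only when both PIDs were found and are identical.
    result["same_pid"] = (result["adopted_pid"] is not None
                          and result["adopted_pid"] == result["accepted_pid"])
    return result
```

Applied to the excerpts above, the password login shows PID 42241 adopted while sshd[42238] accepted the connection and opened the session, whereas the public key login shows PID 42375 in both roles.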