Ticket 19561

Summary: pam_slurm_adopt
Product: Slurm
Reporter: lihang <lihang>
Component: Limits
Assignee: Jacob Jenson <jacob>
Status: OPEN ---
QA Contact:
Severity: 6 - No support contract
Priority: ---
CC: rundall
Version: 23.11.4
Hardware: Linux
OS: Linux
Site: -Other-

Description lihang 2024-04-09 21:48:03 MDT
We restrict logins to compute nodes with pam_slurm_adopt, but whether the job's resource limits are enforced depends on the authentication method.
     If a user logs in to a compute node with a password, cgroup fails to limit resources. If the user logs in with a public key, cgroup limits resources successfully.

     It seems that:
       for password login: the PID caught by pam_slurm_adopt disappears quickly, and a new sshd PID is created for the session.
       for public key login: the PID caught by pam_slurm_adopt does not change.
     As a result, cgroup does not limit the resources of a password login but does limit those of a public key login.

    I wonder whether this is a Slurm bug or a problem with our settings.
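
One way to check the observation above on the compute node is to read /proc/<pid>/cgroup for the login shell and see whether it sits under a Slurm job cgroup. The sketch below is only an illustration: it assumes the job cgroup path contains a "job_<id>" component, which should be verified against the cgroup layout on the node, and the helper names slurm_job_of and slurm_job_of_pid are hypothetical.

```python
import re
from pathlib import Path

# Assumption: a Slurm job cgroup path contains a "job_<id>" component.
JOB_RE = re.compile(r"/job_(\d+)(?:/|$)")


def slurm_job_of(cgroup_text):
    """Return the job id found in /proc/<pid>/cgroup text, or None."""
    for line in cgroup_text.splitlines():
        # Each line has the form hierarchy-ID:controller-list:cgroup-path.
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = JOB_RE.search(parts[2])
        if match:
            return int(match.group(1))
    return None


def slurm_job_of_pid(pid):
    """Read /proc/<pid>/cgroup; return None if the process is gone or unreadable."""
    try:
        return slurm_job_of(Path(f"/proc/{pid}/cgroup").read_text())
    except OSError:
        return None
```

Calling slurm_job_of_pid on the PID of the login shell (for example, the value of $$ in that shell) after a password login and after a public key login would show whether each session ended up inside the job's cgroup.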


 The relevant /var/log/secure entries are as follows:
   for password:
   
     Apr  9 22:00:13 compute03 sshd[42241]: pam_sss(sshd:auth): authentication success; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.28.0.41 user=lihang
     Apr  9 22:00:13 compute03 pam_slurm_adopt[42241]: Connection by user lihang: user has only one job 64
     Apr  9 22:00:13 compute03 pam_slurm_adopt[42241]: Process 42241 adopted into job 64
     Apr  9 22:00:13 compute03 sshd[42238]: Accepted keyboard-interactive/pam for lihang from 10.28.0.41 port 34992 ssh2
     Apr  9 22:00:13 compute03 sshd[42238]: pam_unix(sshd:session): session opened for user lihang by (uid=0)

   for public key:
       Apr  9 22:04:05 compute03 pam_slurm_adopt[42375]: Connection by user lihang: user has only one job 64
       Apr  9 22:04:05 compute03 pam_slurm_adopt[42375]: Process 42375 adopted into job 64
       Apr  9 22:04:05 compute03 sshd[42375]: Accepted publickey for lihang from 10.28.0.41 port 35006 ssh2: RSA SHA256:XGLK9nU/wUtTU+Ip9HwcKhI2Zgf0EF8VAKrDGxNIP+0
       Apr  9 22:04:05 compute03 sshd[42375]: pam_unix(sshd:session): session opened for user lihang by (uid=0)
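
The difference between the two excerpts can be made explicit by comparing the PID that pam_slurm_adopt reports adopting with the PID of the sshd process that logs "session opened". The following is a minimal sketch that assumes the syslog line format shown in this ticket; the function name adopted_vs_session is hypothetical.

```python
import re

# Matches the "program[pid]: message" part of a syslog line.
LINE_RE = re.compile(r"(?P<prog>sshd|pam_slurm_adopt)\[(?P<pid>\d+)\]: (?P<msg>.*)")
ADOPT_RE = re.compile(r"Process (\d+) adopted into job (\d+)")


def adopted_vs_session(log_text):
    """Return (adopted_pids, session_pids) found in a /var/log/secure excerpt."""
    adopted, session = set(), set()
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        if m["prog"] == "pam_slurm_adopt":
            a = ADOPT_RE.match(m["msg"])
            if a:
                adopted.add(int(a.group(1)))
        elif "session opened" in m["msg"]:
            # PID of the sshd process that opened the PAM session.
            session.add(int(m["pid"]))
    return adopted, session
```

Applied to the excerpts above, the password login yields adopted PID 42241 but session PID 42238, while the public key login yields 42375 for both. This matches the description: in the password case the adopted process is not the one that opens the session.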