Ticket 1154

Summary: setsockopt warning with srun
Product: Slurm Reporter: Nicolas Joly <njoly>
Component: OtherAssignee: David Bigagli <david>
Status: RESOLVED FIXED QA Contact:
Severity: 6 - No support contract    
Priority: --- CC: brian, da
Version: 14.11.x   
Hardware: Other   
OS: Other   
Site: -Other- Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: 14.11.0rc2 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---
Attachments: setsockopt portability fix

Description Nicolas Joly 2014-10-09 00:43:04 MDT
Created attachment 1303 [details]
setsockopt portability fix

Hi,

When running jobs/steps on my NetBSD/amd64 machine with srun, i do see a setsockopt(2) warning :

njoly@lanfeust [tmp/slurm]> srun hostname
srun: error: Unable to set low water socket option: Invalid argument
lanfeust.sis.pasteur.fr

The problem comes from net_set_low_water() function :

int net_set_low_water(int sock, size_t size)
{
        if (setsockopt(sock, SOL_SOCKET, SO_RCVLOWAT,
                       (const void *) &size, sizeof(size)) < 0) {
                error("Unable to set low water socket option: %m");
                return -1;
        }

        return 0;
}

Where setsockopt is given a value/length using a variable of type size_t; but according to the standard[1], for SO_RCVLOWAT, a type "int" (not integer) is expected.

And NetBSD enforce it by rejecting provided lengths that do not match sizeof(int). For 64bit archs, such as amd64, where sizeof(size_t) != sizeof(int) the syscall fail with EINVAL ...

The provided patch fix this small problem by using an "int" type for net_set_low_water() size argument.

Thanks.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/setsockopt.html
Comment 1 David Bigagli 2014-10-09 04:53:47 MDT
Hi,
   yes you are right it should be 4 bytes, but the type instead of int should
be socklen_t as per man page.

David
Comment 2 David Bigagli 2014-10-09 05:47:39 MDT
Fixed thanks. Commit fafe6f6bc324a.


David
Comment 3 Nicolas Joly 2014-10-09 18:44:22 MDT
(In reply to David Bigagli from comment #1)
> Hi,

Hi David,

>    yes you are right it should be 4 bytes, but the type instead of int should
> be socklen_t as per man page.

I disagree,

  setsockopt(sock, SOL_SOCKET, SO_RCVLOWAT, &size, sizeof(size))

The last argument "sizeof(size)" should indeed be of type socklen_t, but not the size variable itself which must remain of "int" type.

Thanks.
Comment 4 David Bigagli 2014-10-10 04:30:45 MDT
sizeof(int) = sizeof(socklen_t) on the plaforms of interest.

The code then should be:

extern int net_set_low_water(int sock, int size);
if (setsockopt(sock, SOL_SOCKET, SO_RCVLOWAT,
 		       (const void *) &size, sizeof(socklen_t)) < 0) {

to be 100% conformant, yet going to break if sizeof(int) != sizeof(socklen_t). :-)

David