I’m encountering an issue with OTP 27 on Arch Linux. I can’t replicate it on Debian/Ubuntu and I’m wondering if someone can point me in the right direction to solve it. I need to use a kernel socket option and for some reason it reports it is not supported:
I compiled & installed OTP using asdf. Using Docker it works as expected. I’m fine with digging deeper, but I’d need a little bit of advice on where to dig.
I’d look into what the OTP VM does for this call, and double-check that the needed support (header files, manifest constants) was present at OTP compile-time.
If OTP does the check at runtime, check if libc fails immediately or if it’s the kernel that signals failure. A combination of printf() and strace should give you that.
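A minimal C sketch of that check, assuming a Linux system with the usual libc headers. The function names recvtos_compiled_in() and try_set_recvtos() are made up for illustration and have nothing to do with the socket NIF's own code:

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Compile-time question: do the headers used for this build define
 * IP_RECVTOS at all? Returns 1 if yes, 0 if no. */
int recvtos_compiled_in(void)
{
#ifdef IP_RECVTOS
    return 1;
#else
    return 0;
#endif
}

/* Runtime question: does setting the option actually work?
 * Returns 0 on success, the errno value on failure, or -1 when the
 * option was not available at compile time. */
int try_set_recvtos(void)
{
#ifdef IP_RECVTOS
    int on = 1;
    int rc = 0;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return errno;
    if (setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof on) != 0) {
        rc = errno; /* save before close() can overwrite it */
        fprintf(stderr, "setsockopt(IP_RECVTOS): %s\n", strerror(rc));
    }
    close(fd);
    return rc;
#else
    return -1;
#endif
}
```

Calling both from a small main() and running the result under strace -e trace=setsockopt shows whether the call reaches the kernel and what the kernel answers.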
Also, if it’s OK in your case, try doing the update with debug enabled:
socket:debug(true), % Enable global (socket) debug
socket:setopt(Sock, {otp, debug}, true), % Enable debug for your socket
socket:setopt(Sock, {ip, recvtos}, Value), % Try to set the value
socket:setopt(Sock, {otp, debug}, false), % Disable debug for your socket
socket:debug(false). % Disable global (socket) debug
Finally, you could try the really dirty approach, just to test if it’s actually supported, by using the integer value for the option. On my Ubuntu it’s 13, but I do not know if it’s a different value on Arch.
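To find out which integer a given system uses, one can ask the C headers directly and then try the raw numbers. This is only a sketch; recvtos_number() and try_raw_option() are hypothetical helper names, not part of OTP:

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Numeric value of IP_RECVTOS in the headers used for this build,
 * or -1 if the headers do not define it. */
int recvtos_number(void)
{
#ifdef IP_RECVTOS
    return IP_RECVTOS;
#else
    return -1;
#endif
}

/* Calls setsockopt() with raw numbers instead of header names, so the
 * result only depends on what the kernel accepts.
 * Returns 0 on success, otherwise the errno value. */
int try_raw_option(int level, int option)
{
    int on = 1;
    int rc = 0;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return errno;
    if (setsockopt(fd, level, option, &on, sizeof on) != 0)
        rc = errno;
    close(fd);
    return rc;
}
```

try_raw_option(0, 13) would use the Ubuntu value mentioned above. If recvtos_number() prints something else on Arch, that is the number to pass on the Erlang side.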
So the socket NIF includes netinet/in.h, which includes bits/in.h; by compiler header magic that resolves to x86_64-linux-gnu/bits/in.h through the compile target.
Something seems to go wrong on your platform. The socket NIF includes netinet/in.h, and through that it should find IP_RECVTOS. Are target-specific header files missing?
I think they’re not missing, it’s just that the target doesn’t seem to be in the path on my system. Shouldn’t the compiler be able to figure that out anyway? Or is this a mismatch between Arch & Ubuntu/Debian?
I guess that the compiler has some include path that depends on the target. How that works and how it is set up (per system / installation) I don’t know. It usually “just works”.
As it turns out, when you use a “native opt” you have to encode the value yourself and pass it as a binary (socket does not know how to encode the value).
socket:setopt_native/3 takes a Value argument as either an integer, boolean, or binary. It assumes that Value is of the correct type and encodes it, for example: socket:setopt_native(Sock, {ip,13}, true).
I did some more work on this. I found that the compiler does see the option, and there is a different reason why it doesn’t work as expected. On Arch IPPROTO_IP is 0, same as on Debian. However, IPPROTO_HOPOPTS is also 0. I confirmed by testing it:
Well, these protocols are protocols transported on the IP protocol, right? So the IP protocol itself cannot have an IP protocol number. But you need a number to specify the IP protocol level for e.g. setsockopt().
You also need a number to specify the Socket level, which is a layer over the IP protocols; the libc “user” layer. Linux has it as 1, which collides with ICMP. FreeBSD has it as 0xffff.
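The number collisions are easy to see from C. A small sketch, assuming the usual netinet/in.h and sys/socket.h; both function names are invented for this example:

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Nonzero when the "ip" level and the IPv6 hop-by-hop options protocol
 * share the same number in this system's headers. */
int ip_and_hopopts_share_number(void)
{
    return (int)IPPROTO_IP == (int)IPPROTO_HOPOPTS;
}

/* Prints the level and protocol numbers discussed above. */
void print_level_numbers(void)
{
    printf("IPPROTO_IP      = %d\n", (int)IPPROTO_IP);
    printf("IPPROTO_HOPOPTS = %d\n", (int)IPPROTO_HOPOPTS);
    printf("IPPROTO_ICMP    = %d\n", (int)IPPROTO_ICMP);
    printf("SOL_SOCKET      = %d\n", SOL_SOCKET); /* platform dependent */
}
```

So a bare number like 0 or 1 is ambiguous on its own; what it means depends on whether it is read as a setsockopt() level or as an IP protocol number.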
That makes sense. Though one could argue that the default (0 or ip) should not rely on /etc/protocols but be translated to the correct value by OTP.
The socket NIF actually does not rely on /etc/protocols, or rather, it does not rely on getprotoent() enumerating the essential protocols. (getprotoent() probably just loops over /etc/protocols, but it could be implemented smarter.)
The protocols ip, ipv6, tcp, udp, sctp, rm and igmp have fallbacks, which one can see in that ip gets translated to 0 even when 0 gets translated to hopopt.
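To see what getprotoent() enumerates for a given number on a particular system, something like the following can be used. It is a sketch using the standard netdb.h calls; names_for_proto() and NAME_LEN are hypothetical names, and it is not how the socket NIF itself does it:

```c
#include <netdb.h>
#include <stdio.h>

#define NAME_LEN 32

/* Collects every name and alias that the protocol database lists for
 * `number`, up to `max` entries. Returns how many were stored. */
int names_for_proto(int number, char names[][NAME_LEN], int max)
{
    struct protoent *pe;
    int n = 0;

    setprotoent(1); /* rewind and keep the database open while looping */
    while ((pe = getprotoent()) != NULL) {
        char **alias;

        if (pe->p_proto != number)
            continue;
        if (n < max)
            snprintf(names[n++], NAME_LEN, "%s", pe->p_name);
        for (alias = pe->p_aliases; alias != NULL && *alias != NULL; alias++)
            if (n < max)
                snprintf(names[n++], NAME_LEN, "%s", *alias);
    }
    endprotoent();
    return n;
}
```

Calling names_for_proto(0, names, 8) on the Arch box and on Debian and printing the results should show whether ip is listed for 0 at all, or only hopopt.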
Therefore it seems that the socket NIF translates ip correctly into 0.
But I think the fallback implementation is a bit flawed. If getprotoent() enumerates a protocol it overrides the fallback, so if getprotoent() only enumerates hopopt for 0, the fact that ip is an alias for 0 is lost.
Does socket:setopt(S, {hopopt,recvtos}, true) work?
Therefore it seems that the socket NIF translates ip correctly into 0.
Yes, except for the way the ip options are stored as a persistent term. This is actually the only place where I think it’s a bit flawed. When checking whether recvtos is a valid ip option, the check is done on the atom ip and not on 0. And while the NIF returns recvtos as an option on 0, it is put in persistent term storage under hopopt.
Sorry, I had a vague memory of that earlier post but didn’t find it.
All valid combinations are cached in the persistent term so that socket:setopt/3 needs just a single lookup. So in this case we should have #{ {hopopt,recvtos} := {0,13}, 'HOPOPT' := {0,13} } and are missing the fallback {ip,recvtos} := {0,13}, because the fact that ip is also an alias for 0 was lost.
This is to avoid first looking up Level to NumLevel and then looking up {NumLevel,Opt} to NumOpt.
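A toy model of how such an alias can get lost, written in C only to make the mechanism concrete. It is not the socket module's real data structure; every name here is invented, the two tables are made-up sample data, and 13 is the Linux value of IP_RECVTOS mentioned earlier in the thread:

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: NOT the real cache used by the socket module. */
struct proto_name  { const char *name; int number; };
struct cache_entry { const char *level; const char *opt; int level_num; int opt_num; };

#define MAX_ENTRIES 16
#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Built-in fallback names, and what an enumeration might return on a
 * system whose protocol database only lists hopopt for number 0. */
static const struct proto_name fallback[]   = { {"ip", 0}, {"tcp", 6}, {"udp", 17} };
static const struct proto_name enumerated[] = { {"hopopt", 0}, {"tcp", 6}, {"udp", 17} };

int enumerated_has_number(int number)
{
    size_t i;
    for (i = 0; i < COUNT(enumerated); i++)
        if (enumerated[i].number == number)
            return 1;
    return 0;
}

/* Builds {level name, "recvtos"} -> {0, 13} entries for each name of level 0.
 * keep_aliases == 0: an enumerated name replaces the fallback name.
 * keep_aliases != 0: the fallback name is kept next to the enumerated one. */
int build_cache(struct cache_entry *cache, int keep_aliases)
{
    int n = 0;
    size_t i;

    for (i = 0; i < COUNT(fallback); i++) {
        int overridden = enumerated_has_number(fallback[i].number);
        if (fallback[i].number == 0 && (keep_aliases || !overridden))
            cache[n++] = (struct cache_entry){ fallback[i].name, "recvtos", 0, 13 };
    }
    for (i = 0; i < COUNT(enumerated); i++)
        if (enumerated[i].number == 0)
            cache[n++] = (struct cache_entry){ enumerated[i].name, "recvtos", 0, 13 };
    return n;
}

/* Single lookup on the {level, option} pair; returns 1 if found, else 0. */
int lookup(const struct cache_entry *cache, int n, const char *level, const char *opt)
{
    int i;
    for (i = 0; i < n; i++)
        if (strcmp(cache[i].level, level) == 0 && strcmp(cache[i].opt, opt) == 0)
            return 1;
    return 0;
}
```

With build_cache(cache, 0) the lookup for ip/recvtos fails while hopopt/recvtos succeeds, which matches the behaviour described above. With build_cache(cache, 1) both names resolve.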