When could networking code return eperm?

I work on an application that makes lots (thousands) of HTTP requests in a short time (on the order of seconds) using the wonderful gun library. Recently, during testing, a lot of errors showed up in the logs. I couldn’t catch it while this was happening, so I only have some logs. In the application I have a gun_event event handler module, and it looks like its connect_end/1 callback is called with an #{error => eperm} parameter. I looked into the code and it looks like this situation happens when gun_tcp:connect/2 returns {error, eperm}, which in turn happens when gen_tcp:connect/4 returns {error, eperm}. I’m not 100% sure about the latter, though.

Under what conditions can gen_tcp:connect/4 return {error, eperm}? The Erlang documentation doesn’t mention this explicitly, only that inet:posix() can be returned and eperm is included there. I’ve checked the Linux man page for connect(2) and it mentions that it returns EACCES or EPERM when

The user tried to connect to a broadcast address without
having the socket broadcast flag enabled or the connection
request failed because of a local firewall rule.

However, there are no broadcast addresses involved (and anyway, when I tried to connect to a broadcast address, I got enetunreach), and I also couldn’t reproduce this eperm error with a firewall (I managed to get econnrefused for the REJECT target and etimedout for DROP). Why do I get this eperm?

I should mention that the test environment is Linux (I think RHEL 8.10, but I’m not sure) and we’re using OTP 26 with gun 2.0.1.

I think RHEL 8.10, but I’m not sure

Is, by any chance, SELinux enabled? From my very limited experience with RHEL, broken SELinux policies can result in a lot of seemingly unexplainable permission errors in the applications. Disclaimer: this is pure speculation.
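A quick way to check this on the test machine, as a minimal sketch: it assumes the standard SELinux userland tool getenforce may or may not be installed, and the helper name selinux_mode is made up for illustration.

```shell
#!/bin/bash
# Report the SELinux mode (Enforcing, Permissive or Disabled),
# or "unknown" when the getenforce tool is not installed.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unknown"
    fi
}

echo "SELinux mode: $(selinux_mode)"

# If the mode is Enforcing, recent denials can usually be listed with
# the audit tooling (typically needs root):
#   ausearch -m avc -ts recent
```

If the mode is Enforcing and denials for the Erlang VM process show up around the time of the errors, that would support this guess; an empty result would not rule out other causes.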

Under what conditions can gen_tcp:connect/4 return {error, eperm}?

Might be solved already by now, but it’s likely a system socket limit (see the ulimit man page) you’re seeing. In Linux, sockets are treated as files. When you “run out of sockets,” you may be hitting the File Descriptor (FD) limit.

You can raise both the soft and the hard limit for the current shell with the command:

$ ulimit -SHn 10000

Make sure you execute the ‘test’ from this same shell, so that it runs under the new limit.

If you can’t do that (and sometimes you can’t easily), make it system wide.

  • Check current limit: ulimit -n

  • Increase permanently:

    In /etc/security/limits.conf

    * hard nofile 100000
    * soft nofile 100000
    
  • Note: You’ll need to log out and back in for this to take effect.

Run your test again.
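To see whether the FD limit is actually being approached, something like the following sketch could help. It assumes a Linux /proc filesystem; the function names are made up for illustration, and the PID of the Erlang VM has to be supplied by you.

```shell
#!/bin/bash
# Print the soft and hard open-file limits of the current shell.
show_nofile_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

# Count the open file descriptors of a process (Linux /proc only).
# $1 is a PID; it defaults to the current shell.
count_open_fds() {
    ls "/proc/${1:-$$}/fd" 2>/dev/null | wc -l
}

# Show the open-file limit that applies to an already running process,
# which may differ from the limit of the shell you are typing in.
proc_nofile_limit() {
    grep 'Max open files' "/proc/${1:-$$}/limits" 2>/dev/null
}

show_nofile_limits
echo "open fds in this shell: $(count_open_fds)"
```

For the running application, call count_open_fds and proc_nofile_limit with the PID of the beam.smp process while the test is under load, and compare the two numbers.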

The next limit you might be hitting is the “ip local port range” limit in sysctl.

  • Check range: cat /proc/sys/net/ipv4/ip_local_port_range

  • Increase range: echo "1024 65535" > /proc/sys/net/ipv4/ip_local_port_range

To make the changes persistent,

  • Create a new file: /etc/sysctl.d/99-network-tuning.conf

  • Add the following line to the file:

    net.ipv4.ip_local_port_range = 1024 65535
    
  • And then run:

    sudo sysctl --system

Then run your test again.
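To get a feel for how many local ports a given range provides, here is a small sketch. The helper name port_range_size is hypothetical; the file path is the one mentioned above.

```shell
#!/bin/bash
# Number of ports in a "low high" range string, in the format used by
# /proc/sys/net/ipv4/ip_local_port_range (whitespace separated).
port_range_size() {
    local low high
    read -r low high <<< "$1"
    echo $(( high - low + 1 ))
}

range_file=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$range_file" ]; then
    echo "local ports in range: $(port_range_size "$(cat "$range_file")")"
fi
```

For example, port_range_size "1024 65535" prints 64512, which is the size of the widened range suggested above.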

I’m doing this on company time (Red Hat) and can justify this by helping a customer ;) Subscription pays off!