`tcp_inet` driver crashes with largest allowed `fd`

I’m getting the following error reports, and I have no idea what to do about them:

=ERROR REPORT==== 19-May-2024::16:17:33.621905 ===
driver_select(0x0000000163f77178, 24581, ERL_DRV_WRITE ERL_DRV_USE, 1) by tcp_inet driver #Port<0.24538> failed: fd=24581 is larger than the largest allowed fd=24575

I’m running more than 25000 processes, each connecting to a remote server, on macOS 14.5 (M3 Pro) with OTP 26.2.5 and ulimit -n set to unlimited. The error is reported for almost all of the processes I’m spawning. Has anybody seen such a thing before? Any help is appreciated. :pleading_face:

Shell ran as follows:

ERL_FLAGS="+P 134217727 +Q 134217727" ERL_MAX_PORTS=134217727 rebar3 shell
Erlang/OTP 26 [erts-14.2.5] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [jit]
Eshell V14.2.5 (press Ctrl+G to abort, type help(). for help)
1> ...

You might be hitting the system limits rather than the session limits. Did you check what the system limits are? (launchctl limit maxfiles and sysctl -a | grep maxfile)

❯ launchctl limit maxfiles
maxfiles    256            unlimited      
❯ sysctl -a | grep maxfile
kern.maxfiles: 276480
kern.maxfilesperproc: 138240

Hmm… the soft limit is 256, which is on a different order of magnitude from the limit I’m hitting now, so I’m looking up what it actually does :thinking:
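For anyone comparing the system limits above with what the current shell session actually has, here is a minimal sketch. It assumes a shell whose ulimit builtin accepts -S, -H and -n (bash, zsh and dash do); the helper name is_unlimited is made up for illustration.

```shell
# Hypothetical helper: succeeds when a limit value is the literal string "unlimited".
is_unlimited() {
    [ "$1" = "unlimited" ]
}

# Soft and hard open-file limits of this shell session (not the system-wide ones).
soft="$(ulimit -S -n)"
hard="$(ulimit -H -n)"
printf 'soft=%s hard=%s\n' "$soft" "$hard"

if is_unlimited "$soft"; then
    printf 'soft limit is the literal string "unlimited", not a number\n'
fi
```

Processes started from this shell inherit these session limits, so this is the value worth checking alongside the launchctl and sysctl output.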

Huh, look at this: my system reports that it is using erts_poll, and the value 24576 seems to be hardcoded here…

1> erlang:system_info(check_io).

otp/erts/emulator/sys/common/erl_poll.c at a96e544d7269dcda6da012ffd766139534e2d116 in erlang/otp (added in the commit "Don't crash if the number of file descriptors is unlimited", erlang/otp@667e1b0)

How do I change that? It seems to me that it is the BEAM that is complaining, not my system :confused:

Right, got a solution. It turns out the BEAM does not really cope with ulimit being set to unlimited: it used to crash when using the atom as an integer down the line, which was fixed in the PR I linked above. With that fix in place it no longer crashes, but it seems to fall back to the hardcoded 24576 instead, which is the limit I was hitting. On Linux, if you set ulimit -n unlimited, it is automatically replaced with the largest value the kernel allows (which is a compilation flag; to increase it you need to recompile the kernel), but on macOS the kernel allows you to set it to the actual string unlimited.

Setting ulimit -n 1048576, or to any other integer, does actually fix the issue :smiling_face_with_tear:
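In case it helps someone, here is a rough sketch of how that could be scripted so that the limit is always a finite integer before the shell starts. The function name finite_fd_limit is hypothetical, and the default of 1048576 is just the value that worked for me, not a recommendation.

```shell
# Hypothetical helper: turn the value reported by `ulimit -n` into a finite integer.
# $1 is the current limit, $2 an optional fallback (defaults to 1048576, the value
# used in this thread).
finite_fd_limit() {
    current="$1"
    fallback="${2:-1048576}"
    case "$current" in
        # Empty, "unlimited", or anything else that is not purely digits.
        ''|*[!0-9]*) printf '%s\n' "$fallback" ;;
        # Already a plain integer: keep it.
        *)           printf '%s\n' "$current" ;;
    esac
}

# Usage, left commented out so that sourcing this file has no side effects:
# ulimit -n "$(finite_fd_limit "$(ulimit -n)")"
# ERL_FLAGS="+P 134217727 +Q 134217727" ERL_MAX_PORTS=134217727 rebar3 shell
```

An existing integer limit is left alone, so this only changes anything when the session limit is non-numeric.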