Erlang diameter / diameter_sctp transport; send() on SCTP socket gets EAGAIN/EWOULDBLOCK

cristconst · December 4, 2023, 3:14am

Hi,

As far as I have seen on a running system, and as far as I understand from here:
https://www.erlang.org/docs/23/man/socket
sockets are non-blocking when created in Erlang. I assume this applies to gen_sctp sockets too. Please correct me if I am wrong.

Now:

Based on traces captured on a Linux system running an Erlang diameter application (both strace and pcaps), and looking at the code in:

master, github.com/otp/lib/diameter/src/transport/diameter_sctp.erl

...
send(Sock, AssocId, StreamId, Bin) ->
    case gen_sctp:send(Sock, AssocId, StreamId, Bin) of
        ok ->
            ok;
        {error, Reason} ->
            x({send, Reason})
    end.
...

%% x/1

x(Reason) ->
    exit({shutdown, Reason}).

to me it looks like:

Regardless of which error the send() system call returns on the socket, the socket will be closed (gracefully).
Since the socket is asynchronous, every time the peer slows down due to some kind of application congestion, the local kernel socket send buffer will become full (normal SCTP behaviour in this case) and send() will get an EAGAIN/EWOULDBLOCK. For most of the telco diameter applications I have worked with (Diameter routers, HSS, IMS SIP proxies and so on), closing the persistent SCTP association in such a case is not always the optimal solution. An option is to stop sending data from the application on that socket and monitor it for “write ready” events.
I see that the generic “socket” has a mechanism for this kind of monitoring:
Erlang -- socket
Is this usable with gen_sctp:send() as well?
Or should one use the SCTP_SENDER_DRY_EVENT (is it implemented in Erlang)?
See here:
RFC 6458 - Sockets API Extensions for the Stream Control Transmission Protocol (SCTP)
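
For reference, here is a minimal sketch of what "write ready" monitoring looks like with the socket module itself. It assumes OTP 24 or later and a stream socket opened with socket:open/3. The module and function names (send_nowait_sketch, send_all/2) are hypothetical, and nothing here implies that a gen_sctp socket can be driven this way; that is exactly the open question.

```erlang
-module(send_nowait_sketch).
-export([send_all/2]).

%% Try to send Data without blocking. If the kernel send buffer is
%% full, wait for the select ("write ready") message and continue
%% with whatever has not been sent yet.
send_all(Sock, Data) ->
    case socket:send(Sock, Data, nowait) of
        ok ->
            ok;
        {select, {{select_info, _Tag, Handle}, Rest}} ->
            %% Part of Data was accepted; Rest is still pending.
            wait_writable(Sock, Handle, Rest);
        {select, {select_info, _Tag, Handle}} ->
            %% Nothing was accepted; all of Data is still pending.
            wait_writable(Sock, Handle, Data);
        Other ->
            %% Errors and any other result are returned unchanged.
            Other
    end.

wait_writable(Sock, Handle, Pending) ->
    receive
        {'$socket', Sock, select, Handle} ->
            send_all(Sock, Pending);
        {'$socket', Sock, abort, {Handle, Reason}} ->
            {error, Reason}
    end.
```

The calling process is suspended in receive rather than in the send call, so a real application could handle other messages there and postpone the pending data instead of closing the connection.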

Thank you

raimo · February 28, 2024, 9:39am

EDIT: The below is a lie; wishful thinking.

There is an old problem with gen_sctp:send that it leaks the EAGAIN/EWOULDBLOCK error codes, and does not use the VM poll/select framework.

This should be fixed…

Hi,

there are two generations of sockets in Erlang/OTP. gen_sctp uses the old one, ending up in inet_drv.c. The new socket module cannot yet be a backend for gen_sctp; that is one of the last pieces missing before we can proclaim socket to be complete…

gen_sctp exposes a somewhat non-blocking API. EAGAIN/EWOULDBLOCK is never returned to the user. Internally, inet_drv.c makes the VM insert the file descriptor into the VM's poll/select set (or the equivalent on Windows), buffers the data and returns immediately. The data is then sent in the background when poll/select triggers.

Since that strategy can run the VM out of memory, there is a buffer threshold at which the inet_drv.c port driver marks itself as busy, and from then on send operations block… In fact it is probably incorrect to even claim that the API is non-blocking. It was intended to be, but it turned out to be futile to expect end users to handle flow control responsibly enough to keep the VM from running out of memory, so the VM protects itself (and the end user from fatal node crashes) by enforcing flow control.

But EAGAIN/EWOULDBLOCK isn’t an error that will be returned by gen_sctp:send; under heavy load, however, the call could block.

I haven’t looked at SCTP_SENDER_DRY_EVENT. It exists in our source tree, but only in the incomplete socket NIF C code, so it is just something that someone found in the SCTP Sockets API Extensions and added a define for, for future use.

When the socket backend for gen_sctp gets completed it will have to behave like the inet_drv.c backend, that is: blocking send.
There is a socket option send_timeout that can be used here.
But we also probably will need to add asynchronous send operations to the gen_* modules…

cristconst · March 8, 2024, 12:59pm

Hi,

Causing a gen_sctp socket to send (diameter) traffic at high rates generates an ‘eagain’ error; I saw it during debugging sessions:

(spu@localhost)2> flush().                                      
Shell got {trace,<8895.497.0>,getting_unlinked,#Port<8895.16>}
Shell got {trace,<8895.497.0>,exit,{shutdown,{send,eagain}}}
ok
(spu@localhost)3>

Where does this ‘eagain’ come from?

Thanks a lot,
Cristian

cristconst · March 10, 2024, 8:59pm

Is the gen_sctp:send EAGAIN/EWOULDBLOCK leak described in more detail somewhere? Is there a GitHub PR?
Do you know a use case or scenario for reproducing the leak?

Thanks a lot,
Cristian

raimo · March 11, 2024, 8:03am

I don’t know of any PR for this issue.

The return value comes from the OS send call. Since SCTP is packet oriented, it ended up sharing code in inet_drv.c with UDP. For UDP this value is never returned, since UDP is allowed to lose packets: if the OS buffers are full, the packet is dropped and an OK value is returned. For SCTP I guess it happens in some overload situation when the OS is out of SCTP send buffers…

When this value is returned, inet_drv is supposed to insert the file descriptor into the VM select/poll subsystem and await a callback that it is time for a retry. But the UDP code in inet_drv is simply not structured for that, so when SCTP was added we took a shortcut and returned this value back to the Erlang level. Since it very seldom happens we postponed the problem for a rainy day.

The new socket module is better structured to handle this, but its SCTP support is very incomplete and untested. It might still be easier to complete the SCTP support for socket than to make the UDP code in inet_drv select/poll aware. But there may be other options. One bad option is to loop over a timeout in Erlang if this happens…
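
The "loop over a timeout" workaround could look roughly like the sketch below. It is only an illustration: the module name (sctp_send_retry), the retry count and the 10 ms delay are all assumptions, and the loop just polls blindly rather than waiting for the socket to become writable, which is why it counts as a bad option.

```erlang
-module(sctp_send_retry).
-export([send/5, retry/3]).

%% Hypothetical wrapper: retry gen_sctp:send/4 up to Retries times
%% when it returns {error, eagain}, sleeping 10 ms between attempts.
send(Sock, Assoc, Stream, Data, Retries) ->
    retry(fun() -> gen_sctp:send(Sock, Assoc, Stream, Data) end,
          Retries, 10).

%% Call SendFun until it returns something other than {error, eagain}
%% or the retries are used up. The last result is returned unchanged,
%% so the caller still sees {error, eagain} if the congestion persists.
retry(SendFun, Retries, DelayMs) ->
    case SendFun() of
        {error, eagain} when Retries > 0 ->
            timer:sleep(DelayMs),
            retry(SendFun, Retries - 1, DelayMs);
        Result ->
            Result
    end.
```

A caller such as a transport process would then exit only when the retries are exhausted or a different error is returned. Note that the calling process is blocked for up to Retries * DelayMs while it waits.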

cristconst · March 13, 2024, 1:03pm

Hi!

Thanks a lot @raimo for clarifying these issues.

Is it possible to add more details about the gen_sctp:send() return values to the Erlang/OTP documentation? Especially about the eagain return value?
https://www.erlang.org/doc/man/gen_sctp#send-3

Cristian

raimo · April 26, 2024, 8:52am

I have a fix in the pipeline for ‘maint-26’ OTP-26.2.5 (will be merged forward to ‘master’, OTP 27.0) that makes gen_sctp:send/3,4 block instead.

cristconst · April 26, 2024, 11:52am

Hi!

Thanks a lot @raimo for letting me know.
Is there a PR or a branch where I could already have a look at the code?

Cristian

raimo · April 26, 2024, 12:38pm

I just created a PR: Blocking `gen_sctp:send/3,4`: OTP-19061 by RaimoNiskanen · Pull Request #8428 · erlang/otp · GitHub