enif_select_read: invoke a C function when a file descriptor becomes readable

saleyn · November 24, 2023, 4:34am

The NIF API function enif_select_read lets you register a pid that will receive a one-shot message when an event object (such as a file descriptor) becomes readable. However, if the intent is to add a file descriptor to Erlang’s I/O event loop in order to execute a C callback, it looks like the only option is for an Erlang pid to receive a message from enif_select_read and then issue a NIF call to pass control of event handling back to C land. It would be more convenient if it were possible to call a C function directly when an event object (e.g. a file descriptor) becomes readable. Is that possible with the current version of the NIF API?

starbelly · November 25, 2023, 6:46pm

Your question got wide there at the end, so I’m answering this with an assumption that you wish to utilize enif_select*, or even stay in the bounds of the nif api and erl_check_io, and in that case the answer is definitely no.

However, it’s possible you may be able to do something very dirty and exploit the driver interface and use driver_select to get what you want, though that’s not part of the NIF API. I am not recommending even trying that FWIW, and it may not even be possible.

That leads to an interesting question for the OTP team though: it seems simple to bring that functionality over to the NIF side (i.e., support ready-in and ready-out callbacks on a resource), but what would the ramifications of that be scheduler-wise?

I suppose @saleyn, convenience aside, you may be interested in avoiding the overhead of erl_check_io queuing up a message, said message being selected by the receiving process, making a NIF call again (which in the case of a dirty scheduler has extra overhead), etc.?

saleyn · December 3, 2023, 2:25am

Indeed, it seems that extending the NIF API to cover that use case would be logical.

starbelly · December 4, 2023, 3:09pm

It does, but I’m not sure it’s so simple… Pinging @jhogberg for info.

jhogberg · December 4, 2023, 4:00pm

I think @sverker is the right person to ask, he knows this area much better than I do.

My gut feeling says that you’d just be trading one overhead for another, though, as you now need to get data (or just completion notifications) out of the callback somehow.

sverker · December 4, 2023, 6:12pm

My gut feeling says that you’d just be trading one overhead for another

Yes I agree. With such a callback, you would get called in a process-less context. So you will have to send a message anyway to a process to handle the read data.

The initial idea of the NIF interface was to not be callback driven at all. Then we reluctantly had to add some callbacks anyway when there were situations with no process to do the job.

saleyn · December 7, 2023, 2:09am

With such a callback, you would get called in a process-less context. So you will have to send a message anyway to a process to handle the read data.

Though, the advantage here would be that doing the notification and socket read on the C side could avoid creating and copying binaries for partial payloads. I.e. a process would be notified only by being sent a “complete” binary message, rather than reading a binary from the socket on the Erlang side, determining that it contains an incomplete message, and repeating the operation later while concatenating binaries.

zabrane · December 7, 2023, 9:00am

@saleyn +1. I like the idea to do the heavy lifting in C and when the data is ready/complete, send it to the Erlang side.

jhogberg · December 7, 2023, 9:20am

You’re free to do that as things are, e.g. returning incomplete until you have a full message and only then returning it in full.

saleyn · December 7, 2023, 1:27pm

In the existing API, it would be necessary to use enif_select_read to send a message to a PID that the socket is readable, then either to call gen_{tcp,udp,sctp}:recv or to call a NIF from the context of the PID to issue a socket read, which would only return a binary if there is a full message (assuming the NIF caches state for the socket somewhere, with a user-space buffer containing partial reads). This extra overhead of signaling a PID for partial reads, only to return control back to the C side, seems avoidable if the NIF API allowed defining a callback for reading data from a file descriptor when it becomes available.
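To make the "user-space buffer containing partial reads" idea concrete, here is a minimal sketch in plain POSIX C. It deliberately leaves out the NIF glue, and everything in it is an assumption for illustration: the frame_buf type, the function names, and the framing itself (a 4-byte big-endian length followed by the payload). In a real NIF the struct would presumably live in a resource object, and the completed payload would be turned into a binary term.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-socket state: bytes read so far that do not yet
 * form a complete frame. */
typedef struct {
    uint8_t *data;
    size_t   len;  /* bytes currently buffered */
    size_t   cap;  /* allocated capacity */
} frame_buf;

/* Assumed framing: 4-byte big-endian payload length, then the payload. */
enum { HDR_SIZE = 4, READ_CHUNK = 4096 };

void frame_buf_init(frame_buf *fb) { fb->data = NULL; fb->len = 0; fb->cap = 0; }
void frame_buf_free(frame_buf *fb) { free(fb->data); frame_buf_init(fb); }

/* Perform one read() from fd and append the bytes to the buffer.
 * Returns what read() returned (0 on EOF, -1 on error), or -1 if
 * the buffer could not be grown. */
ssize_t frame_buf_fill(frame_buf *fb, int fd)
{
    assert(fb != NULL);
    if (fb->cap - fb->len < READ_CHUNK) {
        size_t ncap = fb->len + READ_CHUNK;
        uint8_t *p = realloc(fb->data, ncap);
        if (p == NULL)
            return -1;
        fb->data = p;
        fb->cap = ncap;
    }
    ssize_t n = read(fd, fb->data + fb->len, fb->cap - fb->len);
    if (n > 0)
        fb->len += (size_t)n;
    return n;
}

/* If a complete frame is buffered, copy its payload into a freshly
 * malloc'd block (caller frees), drop the frame from the buffer and
 * return 1. Returns 0 while the frame is incomplete, -1 if malloc fails. */
int frame_buf_next(frame_buf *fb, uint8_t **payload, size_t *payload_len)
{
    if (fb->len < HDR_SIZE)
        return 0;
    uint32_t need = ((uint32_t)fb->data[0] << 24) | ((uint32_t)fb->data[1] << 16) |
                    ((uint32_t)fb->data[2] << 8)  |  (uint32_t)fb->data[3];
    if (fb->len - HDR_SIZE < need)
        return 0;                          /* still a partial payload */
    uint8_t *out = malloc(need ? need : 1);
    if (out == NULL)
        return -1;
    memcpy(out, fb->data + HDR_SIZE, need);
    size_t used = (size_t)HDR_SIZE + need;
    memmove(fb->data, fb->data + used, fb->len - used);
    fb->len -= used;
    *payload = out;
    *payload_len = need;
    return 1;
}
```

With something like this, the NIF invoked after each readiness message would call frame_buf_fill and then frame_buf_next, and hand a binary to Erlang only when the latter returns 1. The partial reads never become Erlang binaries, but the process is still woken once per readiness event, which is the overhead being discussed here.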

jhogberg · December 7, 2023, 2:17pm

The callback needs to be scheduled to run, too. Taking the process out of the picture wouldn’t save much (if anything), you’d just trade some overheads for different ones.

rlipscombe · December 12, 2023, 7:48pm

fwiw, when we faced this problem, we used a separate reactor pattern (implemented in C++), which ran in a background thread (lifetime managed by the NIF); when it had completed a task, it would then use enif_send with the result. Insert your definition of “complete”, “task” and “result” here – ours were (mostly) libcurl-related.
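For readers who want a picture of that shape, below is a rough sketch of such a reactor loop in plain POSIX C. It is not the C++ code described above. The reactor struct, the stop_fd shutdown pipe, and the on_result hook are all hypothetical names, and the hook stands in for the place where a NIF would call enif_send. The entry point has the `void *(*)(void *)` signature that pthread_create and enif_thread_create take.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical completion hook. In a NIF, this is where enif_send
 * would deliver the result to the owning pid. */
typedef void (*on_result_fn)(void *ctx, const char *data, size_t len);

typedef struct {
    int          work_fd;   /* descriptor the reactor watches for data */
    int          stop_fd;   /* becomes readable when the owner wants the loop to exit */
    on_result_fn on_result;
    void        *ctx;
} reactor;

/* Reactor loop: block in poll(), read whatever arrived on work_fd,
 * pass it to the hook, and exit once stop_fd is signalled or work_fd
 * is closed. */
void *reactor_run(void *arg)
{
    reactor *r = arg;
    assert(r != NULL && r->on_result != NULL);
    struct pollfd pfds[2] = {
        { .fd = r->work_fd, .events = POLLIN },
        { .fd = r->stop_fd, .events = POLLIN },
    };
    char buf[4096];
    for (;;) {
        if (poll(pfds, 2, -1) < 0) {
            if (errno == EINTR)
                continue;                  /* interrupted by a signal, retry */
            break;
        }
        if (pfds[0].revents & POLLIN) {
            ssize_t n = read(r->work_fd, buf, sizeof buf);
            if (n <= 0)
                break;                     /* EOF or read error */
            r->on_result(r->ctx, buf, (size_t)n);
        } else if (pfds[0].revents & (POLLHUP | POLLERR | POLLNVAL)) {
            break;
        }
        if (pfds[1].revents)
            break;                         /* shutdown requested */
    }
    return NULL;
}

/* Stand-in hook for trying the loop outside the VM: remembers the
 * last chunk it was given (truncated to 64 bytes). */
typedef struct { char last[64]; size_t last_len; int calls; } collector;

void collect_result(void *ctx, const char *data, size_t len)
{
    collector *c = ctx;
    if (len > sizeof c->last)
        len = sizeof c->last;
    memcpy(c->last, data, len);
    c->last_len = len;
    c->calls++;
}
```

The sketch treats every chunk read as a "result"; a real reactor would apply its own definition of "complete" before calling the hook. The stop pipe is one way to let the NIF's unload path ask the thread to exit before joining it.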

Aside: what I also wanted at the time was the ability to hook into Erlang’s timer wheel from a NIF.