Yielding NIFs in Rustler for bindings to async functions

The Functionality section in the erl_nif docs mentions three different strategies for long-running NIFs: Yielding NIFs, Threaded NIFs, and Dirty NIFs. The Rustler docs clearly describe how to implement Dirty NIFs. I also found a somewhat obscure function, rustler::thread::spawn, that handles setting up a non-Erlang thread to execute a closure and returning the result to the calling Erlang process, which addresses Threaded NIFs.
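For reference, a dirty NIF in Rustler is just an attribute argument. A minimal sketch (the function name and body here are illustrative, not from the library I'm binding to):

```rust
// Scheduled on a dirty CPU scheduler, so it may run well past the
// ~1 ms guideline without starving the normal BEAM schedulers.
#[rustler::nif(schedule = "DirtyCpu")]
fn expensive_computation(input: String) -> String {
    input.to_uppercase()
}
```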

However, I can’t find a Rust-like way to write Yielding NIFs, i.e. NIFs that perform work in chunks, yielding control back to the BEAM roughly every millisecond. I found Rustler’s bindings to the C NIF API, so I could just call enif_schedule_nif myself and be done with it. Unfortunately, I foresee two problems with this approach:

  • Calling the C API kind of defeats the purpose of writing in Rust
  • Calling the C API from Rust is probably far more painful than writing in C in the first place

TL;DR: How do you implement Yielding NIFs in Rustler?


To eliminate any chance of an XY problem, here’s some more context: I am writing a NIF for a heavily async Rust library, and I’ve chosen tokio as the async runtime. I figure that since I am binding to asynchronous functions anyway, I might as well pass the benefits on to the BEAM’s scheduler.

A yielding NIF seemed like a better fit than a threaded NIF to me, for two reasons:

  • Tokio provides robust scheduling and threading, and exposes this in a task-oriented API, making rustler::thread::spawn and enif_thread_* seem redundant
  • Yielding NIFs are recommended over Threaded and Dirty NIFs by the docs

Since Tokio threads are separate from the BEAM’s, I can have an initial NIF call create a task, and subsequent calls yield very quickly until a result is received.

Crossposted from ElixirForum.

I don’t think you really want a Yielding NIF for this: enif_schedule_nif sets up the next function call as ready to run, but you don’t really want to check for completion in a busy loop. Either way, Yielding and Threaded NIFs both end up in the same place: the calling process returns and (typically) waits for a message carrying the result of the computation. When the computation finishes, the result is sent back to the original process with enif_send.

I’ve only used Rust/Tokio separately from Erlang/OTP/BEAM. But I would think you’d want to do something like this:

In your NIF init code, spawn a Tokio Runtime. If you think you want multiple runtimes, you could spawn a runtime as a NIF, returning a resource to refer to it and pass that resource into computation NIFs.

In your Erlang code calling the NIF, do something like:

Ref = make_ref(),
ok = your_nif(Ref, Arguments),
receive
    {Ref, Result} -> Result
end

(maybe with a timeout, and if so, a process alias to ignore untimely responses)

Then NIF code would find your Tokio Runtime, use that to spawn a task to do the work and enif_send a response back to the process that made the request.

I’m not an expert, but my understanding was that only Threaded NIFs send the result as a message, while for Yielding NIFs the BEAM handles differentiating between the final result and the intermediate yielded results (see the Long Running NIFs subsection in the Functionality section linked above).

That’s a good point - I implicitly assumed that the BEAM would schedule another process to run after I scheduled another NIF. Wouldn’t calling enif_consume_timeslice with a percentage of 95 or 100 before calling enif_schedule_nif fix this? That way I’d still check for completion in a loop, but the loop wouldn’t be hot and busy.

Like I mentioned earlier, the library I’m binding to is (Rust) async from the ground up, so I can’t call any function from the lib without an async runtime. If not Tokio, it would have to be smol or monoio.

The text does say “The final call scheduled in this manner can then return the overall result.” But enif_schedule_nif returns immediately, and shortly after, under Threaded NIFs, it says “The thread can send the result back to the Erlang process using enif_send. Information about thread primitives is provided below.” You could arrange to wait for a response in the initial NIF call, so that it could return the result to the original caller, but that would block the scheduler, which is not good. AFAIK, the BEAM does not provide a way for a NIF to yield to the scheduler and retain its stack. enif_schedule_nif can be used to yield, but the stack is different, and AFAIK the return value of the scheduled NIF doesn’t go anywhere.

enif_consume_timeslice doesn’t necessarily help here; it just influences how likely the scheduler is to suspend the process at its next function call. But when a process is suspended because it used up its timeslice, it is immediately ready to run again and will be queued to run at the next opportunity (again skipping scheduler details).

Yes, assuming the usual scheduler arrangements and priorities, the scheduler will run any other ready processes before it runs a newly scheduled process. But you still don’t really want to run through the ready-process loop every time just to check something, when the Rust async runtime can send a message when it’s ready.

Whatever runtime is fine. Again, I haven’t used Rustler, but there must be an on_load for the NIF; I would start a multithreaded runtime there (tokio::runtime::Runtime::new()? :wink:) and globally store an Arc to it to use from the NIF calls. Depending on how your system ends up running, you might not want one BEAM scheduler per core as well as one Tokio scheduler per core, but it’s probably OK to start there and see where it ends up.

What you probably don’t want to do is enter a Tokio runtime from within the NIF, since you could block the BEAM scheduler. Your NIF should get the reference to the runtime and then do something like:

rt.spawn(async {
    // do some work with the library
    unsafe { rustler::sys::enif_send( … ) };
});

Dropping the JoinHandle from spawn should be OK; from my reading, it will detach the task, but the task will still run to completion, and it sends the result with enif_send, so you don’t need to await the JoinHandle. enif_send is documented as thread-safe, so you can call it from a Tokio runtime.

You don’t need to keep the stack - you can return the tokio JoinHandle as an Erlang Resource, and pass the resource to the scheduled NIF.

I think the result from the final scheduled NIF is what is returned to the calling Erlang process. The results from intermediate scheduled NIFs don’t go anywhere - they’re the result of enif_schedule_nif.

Wouldn’t waiting for a response block the calling Erlang process, but leave the scheduler free to run other processes? From the perspective of the calling process, this is very similar to a receive block, but without the make_ref() boilerplate.

I think you’re right. Now, sending a message seems a lot more elegant than polling to check if a task is completed. I was trying to avoid the boilerplate on the Erlang side, but I realized that doing so adds a lot of complexity on the Rust side.

Yeah, this is probably the way to go. Rustler already defines a helper for this, so you would do something along these lines instead (note that spawn takes a closure, and the term it returns is sent back to the calling process as a message):

rustler::thread::spawn::<rustler::thread::ThreadSpawner, _>(env, move |thread_env| {
    let result = rt.block_on(async {
        // work
    });

    result.encode(thread_env)
});

I went and found an example of a yielding NIF, and you’re right. enif_schedule_nif works differently than I thought. bitwise/c_src/bitwise_nif.c at master · vinoski/bitwise · GitHub There’s no enif_send in here.

I still don’t think it would be wise to enter a Tokio runtime from this context; unless you can ensure you’ll leave the runtime in a reasonable amount of time, you run the risk of blocking the BEAM scheduler thread you’re on (maybe it’s acceptable in a dirty scheduler). But I would just start a multithreaded Tokio runtime and submit tasks that send a message when they’re done.

If you really did want to do this, you’d want to do something like submitting a task to a current_thread runtime, then using enif_schedule_nif to poll it with a short timeout. But you’d need to make sure the task yields regularly: BEAM languages are effectively preemptible, but NIFs and Tokio tasks aren’t; you’ve got to add yield points.

Note that this results in spawning a new thread every time it’s called, whereas if you just add a task to the runtime (which is not async, so you can do it from outside the runtime), you don’t spawn a thread. Balancing the OS scheduler across BEAM threads and Tokio threads is going to be tricky, and adding an extra thread on every NIF call won’t help. Also, both BEAM and Tokio presume that OS threads are expensive, i.e. that having more CPU-bound OS threads than cores reduces throughput because of the cost of OS context switches.

Yielding is trivial: you can poll with a NIF that just calls is_finished() on the task’s JoinHandle, returning the value if the task is completed and yielding otherwise.
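A rough sketch of that polling NIF, assuming the JoinHandle is wrapped in a Rustler resource (`TaskResource` and `task_finished` are illustrative names; registering the resource type in the NIF init is elided):

```rust
use rustler::ResourceArc;
use std::sync::Mutex;
use tokio::task::JoinHandle;

// Resource holding the handle of the task spawned by the initial NIF call.
pub struct TaskResource(pub Mutex<Option<JoinHandle<String>>>);

// Cheap, non-blocking completion check: JoinHandle::is_finished does
// not wait for the task. A real yielding NIF would reschedule itself
// with enif_schedule_nif instead of making the Erlang caller loop.
#[rustler::nif]
fn task_finished(task: ResourceArc<TaskResource>) -> bool {
    task.0
        .lock()
        .unwrap()
        .as_ref()
        .map(|handle| handle.is_finished())
        .unwrap_or(true)
}
```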

Yeah, you’re right. I finally went with something like this, which reuses the Tokio runtime’s threads instead of spawning a new one:

use rustler::{Encoder, Env, OwnedEnv, Reference};

#[rustler::nif]
fn nif_fun<'a>(env: Env<'a>, param: String) -> Reference<'a> {
    let erl_ref = env.make_ref();
    let pid = env.pid();

    // Process-independent environment that keeps a copy of the ref
    // alive after this NIF call returns.
    let mut owned_env = OwnedEnv::new();
    let saved_ref = owned_env.save(erl_ref.encode(env));

    runtime().spawn(async move {
        // `do_work` stands in for the actual async library call.
        let ret = do_work(param).await;
        let _ = owned_env.send_and_clear(&pid, |env| (saved_ref.load(env), ret).encode(env));
    });

    erl_ref
}