:prim_file.get_cwd() hangs

dvic · September 28, 2023, 10:44am

Does anyone know under what circumstances :prim_file.get_cwd() hangs? Our Elixir app (a file-serving app) seems to choke at some point, and we noticed that we can still run functions on the node in question, but file system operations (like get_cwd) hang. Does this have to do with file descriptors?

jhogberg · September 28, 2023, 10:51am

get_cwd cannot hang in and of itself, but all file operations run on dirty IO schedulers and do not yield execution until the operations finish completely (POSIX does not support asynchronous file IO so we don’t have any other choice here).

Hence, if you have other outstanding operations that use up all dirty IO schedulers (for example if they’re stuck on stale NFS handles, or simply lots of processes reading huge files), get_cwd will hang until one is available.
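A rough way to observe this from a running node is to call get_cwd in a separate process and give up waiting after a timeout. This is only a sketch: the module name `DirtyIoProbe` and the one-second default are made up for illustration, and a timeout only suggests that the dirty IO schedulers are busy, it does not prove it.

```elixir
defmodule DirtyIoProbe do
  # Hypothetical helper: times :prim_file.get_cwd/0 in a separate task.
  # Returns {:ok, microseconds} if the call finished within timeout_ms,
  # {:error, reason} if it failed, and :timeout if it is still waiting.
  def check(timeout_ms \\ 1_000) do
    task = Task.async(fn -> :timer.tc(fn -> :prim_file.get_cwd() end) end)

    case Task.yield(task, timeout_ms) do
      {:ok, {micros, {:ok, _cwd}}} -> {:ok, micros}
      {:ok, {_micros, {:error, reason}}} -> {:error, reason}
      # The task is left running here; it finishes once a scheduler frees up.
      nil -> :timeout
    end
  end
end

IO.inspect(DirtyIoProbe.check(), label: "get_cwd probe")
```

On a healthy node this should return almost immediately; a `:timeout` while ordinary functions still run matches the symptom described above.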

dvic · September 28, 2023, 11:33am

Thanks for the answer! Is there a way to figure out what processes are taking up the dirty IO schedulers?

We are indeed invoking a NIF (akash-akya/vix, the Elixir extension for libvips, through elixir-image/image, image processing for Elixir) that has operations marked as IO bound. There also seems to be a correlation with how we configure the CDN: with HTTPS the problem occurs, but without HTTPS it does not, probably because with plain HTTP the schedulers don’t end up being taken (no idea why).

jhogberg · September 28, 2023, 1:07pm

Yes, generate a crash dump (erlang:halt("asdf")) and then look at erl_crash.dump to see what the dirty IO schedulers are doing. You can either use crashdump_viewer:start() and look under the schedulers tab, or use a text editor and search for =dirty_io_scheduler.
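If the dump is large, the same search can be scripted. The sketch below pulls out the sections whose header starts with `=dirty_io_scheduler`. It assumes that section headers are the only lines beginning with `=`; the module name `CrashDumpSections` is made up for illustration.

```elixir
defmodule CrashDumpSections do
  # Returns the sections whose header line starts with `prefix`.
  # Each section is a list of lines with the header first.
  # Assumption: only section headers begin with "=".
  def extract(lines, prefix \\ "=dirty_io_scheduler") do
    lines
    |> Stream.map(&String.trim_trailing/1)
    |> Stream.chunk_while([], &chunk/2, &flush/1)
    |> Enum.filter(fn [header | _] -> String.starts_with?(header, prefix) end)
  end

  # A header closes the section collected so far and starts a new one.
  defp chunk("=" <> _ = header, []), do: {:cont, [header]}
  defp chunk("=" <> _ = header, acc), do: {:cont, Enum.reverse(acc), [header]}
  # Lines before the first header are skipped.
  defp chunk(_line, []), do: {:cont, []}
  defp chunk(line, acc), do: {:cont, [line | acc]}

  # Emit the last open section, if any, when the input ends.
  defp flush([]), do: {:cont, []}
  defp flush(acc), do: {:cont, Enum.reverse(acc), []}
end
```

It could be used as `"erl_crash.dump" |> File.stream!() |> CrashDumpSections.extract()`, printing each returned section. Note that when triggering the dump from Elixir, the slogan has to be a charlist, e.g. `:erlang.halt(~c"asdf")`.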

dvic · September 28, 2023, 2:05pm

Thanks!

The dirty_io schedulers (13 through 22) are all occupied by modules / operations of the image library, so this confirms what I suspected. The dump also reports a run queue length of 18 for all of them. Are these all pending operations waiting to be run? Is it a coincidence that they are all 18?

jhogberg · September 28, 2023, 2:29pm

Yes, that’s the number of things waiting to run. They all share the same queue (unlike normal schedulers that have one queue per scheduler), so that’s normal.

nzok · September 30, 2023, 8:39am

POSIX does support asynchronous file I/O.
The <aio.h> interface was optional in Issue 6 of the Single Unix Specification
but it has been part of the Base since Issue 7.
That interface just supports reading and writing of already open files.

jhogberg · September 30, 2023, 10:11am

That’s only asynchronous on paper though. The implementations all end up blocking anyway at ill-defined points (even for simple reads and writes), so we can’t rely on it even if we could stomach its horrific interface.

rickard · September 30, 2023, 1:51pm

You might want to increase the number of dirty I/O schedulers using the command line argument +SDio <amount> when starting the runtime system. The default is 10, which was chosen so that thread creation does not fail at runtime system start on systems with limited resources. You probably want a lot more than that.
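For an Elixir app, one way to pass the flag and confirm it took effect might look like the sketch below. The value 64 and the file name `script.exs` are arbitrary illustrations, not a recommendation.

```elixir
# Start the VM with the flag, for example:
#   elixir --erl "+SDio 64" script.exs
# (64 is an arbitrary illustrative value.)

# Then check how many dirty IO scheduler threads the runtime actually has.
count = :erlang.system_info(:dirty_io_schedulers)
IO.puts("dirty IO schedulers: #{count}")
```

Without the flag this should print the default of 10.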

jhogberg · September 30, 2023, 2:55pm

While 10 is low in general, I’m not sure bumping the number of schedulers is the right solution in this case: the image processing library seems to be doing CPU-bound work in another process, merely using the dirty schedulers to block waiting for results over a pipe.

Adding more schedulers without also limiting the number of outstanding requests could make the problem worse (and is a DoS vector if nothing else).
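As a rough illustration of limiting outstanding requests, the sketch below caps how many blocking calls are in flight at once using `Task.async_stream` with `max_concurrency`. The module name `BoundedWork`, the choice to leave two schedulers free, and the `slow` stand-in function are all assumptions for the example. It bounds a single batch only; a limit shared across all requests in an app would need a pool or similar.

```elixir
defmodule BoundedWork do
  # Runs `fun` over `items` with at most `limit` calls in flight at once.
  # Results come back in input order as {:ok, value} tuples.
  def run(items, fun, limit) do
    items
    |> Task.async_stream(fun, max_concurrency: limit, timeout: 30_000)
    |> Enum.to_list()
  end
end

# Leave a couple of dirty IO schedulers free for ordinary file operations.
limit = max(:erlang.system_info(:dirty_io_schedulers) - 2, 1)

# Stand-in for a blocking call such as an image operation (hypothetical).
slow = fn n ->
  Process.sleep(10)
  n * 2
end

results = BoundedWork.run(1..5, slow, limit)
IO.inspect(results)
```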

rickard · September 30, 2023, 9:49pm

Assuming dirty I/O schedulers are used as intended, i.e., for potential I/O wait with only a minimal amount of CPU-bound work, I would say that drastically increasing the number of dirty I/O schedulers should not be a problem. If the number of dirty I/O jobs is unlimited and no I/O wait occurs, then yes, you’ll probably run into issues. I would, however, say that using too few dirty I/O schedulers is likelier to give you DoS issues, since just a few long-waiting dirty I/O jobs can block a huge number of other unrelated processes from making progress, as seen in the issue reported in this thread.

I have no idea whether the NIF in question is implemented as intended, but if it is potentially doing large amounts of CPU-bound work when executing on a dirty I/O scheduler, that should preferably be fixed.