Erlang Supervisor. How?

alexandr · November 28, 2023, 6:23am

Hello all!

I haven’t used Erlang for 2 years and have forgotten how to use an OTP supervisor properly in different situations. Could someone refresh my memory? The supervisor manages a few gen_servers, with the whole functionality divided among them. I have a few questions about it:

1. How do I kill this supervisor along with all its child processes, and do something when the supervisor is killed? Should I kill the children first and then have the supervisor kill itself? For now I’m using exit(Pid, Reason), but it kills with an exception.
2. How do I catch the event of the supervisor restarting a child?
3. How do I create pause functionality? For example, there could be a situation where I don’t need to kill the children, just keep them alive but doing nothing. Or keep just one of them alive but doing nothing.
4. If the functionality is divided among a few gen_servers, what is the correct way to call each piece of functionality as if it were a method of the supervisor? For example, a gen_server has a call “example_call”. How do I make this “example_call” available as a function in the supervisor module?
5. Is there an option to start a “worker” from this supervisor on a different node? For example, one supervisor handling a few gen_servers on different nodes.
6. If there are multiple supervisors started from the same module but with different parameters, what is the best way to store the Pids? Proplists in memory? ETS? Or something else?

jimdigriz · November 28, 2023, 10:17am

Okay, quite a bit to unpack here, but everything is preference and dependent on your problem space, so my suggestions may be useful or completely useless to your situation.

Brace yourself to get ~100 different opinions.

You have several options:

exit(Pid, Reason) as you are now
place your supervisor under a one_for_one supervisor and terminate_child
since OTP 24 there is a concept of ‘significant’ processes: if any/all of them shut down, the supervisor shuts down too; using this you could send a message to a significant process telling it to exit, which would bring everything else under that tree down with it
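
A minimal sketch of the second option, assuming hypothetical names (`stop_demo` for the outer one_for_one supervisor, `inner_sup` for the supervisor you want to stop) and a placeholder worker loop standing in for your gen_servers. `supervisor:terminate_child/2` shuts the inner supervisor down in an orderly way, which in turn shuts down its children:

```erlang
-module(stop_demo).
-behaviour(supervisor).

-export([start_link/0, stop_inner/0]).
-export([init/1, start_worker/0]).

%% Outer supervisor: its only job is to own the inner supervisor.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, outer).

%% Stop the inner supervisor and everything under it.
%% The child spec stays in place, so supervisor:restart_child/2
%% can bring the subtree back later.
stop_inner() ->
    supervisor:terminate_child(?MODULE, inner_sup).

init(outer) ->
    Inner = #{id => inner_sup,
              start => {supervisor, start_link,
                        [{local, inner_sup}, ?MODULE, inner]},
              type => supervisor},
    {ok, {#{strategy => one_for_one}, [Inner]}};
init(inner) ->
    Worker = #{id => worker, start => {?MODULE, start_worker, []}},
    {ok, {#{strategy => one_for_one}, [Worker]}}.

%% Placeholder worker (an assumption; a real tree would start gen_servers).
start_worker() ->
    {ok, proc_lib:spawn_link(fun() -> receive stop -> ok end end)}.
```

If a worker has to do something when it is shut down, a gen_server that traps exits gets its terminate/2 callback called when its supervisor stops it.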

Ignore the supervisor; its purpose is for you to describe the dependencies between your processes and the order in which they are spun up, so rest_for_one is particularly useful.

What you actually care about are the processes under the supervisor so have the last child started by the supervisor call back into your waiting process (or an event bus).

If there is no good place for you to do this, add a one-shot child process that returns ignore to the supervisor after sending the “started” message to whatever is interested; be aware that for strategies other than one_for_one you may get a message even though the supervisor did not restart.
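
A rough sketch of the one-shot child idea, assuming a hypothetical module `notify_sup`, a placeholder worker, and a made-up message shape `{children_started, SupPid}`. The notifier is listed last, so under rest_for_one its start function runs again whenever an earlier child is restarted:

```erlang
-module(notify_sup).
-behaviour(supervisor).

-export([start_link/1]).
-export([init/1, start_worker/0, notify/1]).

%% Listener is the pid that wants to hear about (re)starts.
start_link(Listener) ->
    supervisor:start_link(?MODULE, Listener).

init(Listener) ->
    SupFlags = #{strategy => rest_for_one},
    Worker = #{id => worker, start => {?MODULE, start_worker, []}},
    %% Last child: not a real process, just a hook that sends a message.
    Notifier = #{id => notifier, start => {?MODULE, notify, [Listener]}},
    {ok, {SupFlags, [Worker, Notifier]}}.

%% Placeholder worker (an assumption; a real tree would start gen_servers).
start_worker() ->
    {ok, proc_lib:spawn_link(fun() -> receive stop -> ok end end)}.

%% Runs inside the supervisor process, so self() is the supervisor pid.
%% Returning ignore keeps the child spec but starts no process.
notify(Listener) ->
    Listener ! {children_started, self()},
    ignore.
```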

Sounds like you want a pool? Plenty of applications/libraries that can handle this for you. There are a lot of race gotchas that make rolling your own a horrible experience of baptism by fire.

If you need to roll your own, I would suggest simple_one_for_one and spawn on demand as I suspect you do not want to carry state from one workload to the next? Processes are really cheap.

If you have an expensive initialisation step then look to use persistent_term (wrap the init in your gen with global:trans or global:set_lock, passing [node()] so the lock is taken on the local node only), or if you need a shared cache, look to ets.
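
A small sketch of the simple_one_for_one, spawn-on-demand approach, assuming a hypothetical module `job_sup` and a job expressed as a zero-arity fun. Nothing runs while there is no work, so "paused" simply means no jobs have been started:

```erlang
-module(job_sup).
-behaviour(supervisor).

-export([start_link/0, run/1]).
-export([init/1, start_job/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Start one short-lived process for one piece of work.
run(Fun) when is_function(Fun, 0) ->
    supervisor:start_child(?MODULE, [Fun]).

init([]) ->
    SupFlags = #{strategy => simple_one_for_one},
    %% temporary: a finished or crashed job is never restarted.
    Job = #{id => job,
            start => {?MODULE, start_job, []},
            restart => temporary},
    {ok, {SupFlags, [Job]}}.

%% The argument list passed to start_child/2 is appended to the
%% (empty) argument list in the child spec, giving start_job(Fun).
start_job(Fun) ->
    {ok, proc_lib:spawn_link(Fun)}.
```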

Don’t. Look at server_ref() for hints on how to steer your message to the correct gen; the return from start_link is a Pid, or if you register it locally/globally you can use the registered name (the bare atom for a local registration, {global, Name} for a global one).

Also worth noting: do not overlook {via, RegMod, ViaName} if you need some more interesting steering strategies. You effectively build your own process registry, which can help hugely.

If you are determined to dispatch via the supervisor, then I recommend adding a helper function that takes your reference and ties it back to something in which_children…but this is where nothing but pain lives.
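
For illustration, the usual alternative to dispatching through the supervisor is to put the API function in the gen_server module itself and address the process by its registered name. A minimal sketch, assuming a hypothetical module `example_server` registered locally under its module name:

```erlang
-module(example_server).
-behaviour(gen_server).

-export([start_link/0, example_call/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Register locally so callers never need to know the Pid.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Public API: hides the message format behind a plain function.
example_call() ->
    gen_server:call(?MODULE, example_call).

init([]) ->
    {ok, #{}}.

handle_call(example_call, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Callers then use `example_server:example_call()` directly, and the supervisor only needs `{example_server, start_link, []}` in its child spec.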

Sounds like you are attempting to roll your own mapreduce-esque service? I recommend you run the supervisor with local pids only, but register them locally and dispatch messages to them using {Name, Node} for your server_ref(), or look to using global_group to automate the service discovery.

If you really want to do this though, wrap the start MFA with erpc to spawn the process and call start_link({local, ...}, ...) on a remote node, and then make the server_ref() for anything calling it {Name, Node}.
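
A short sketch of the {Name, Node} addressing on its own, assuming a hypothetical `node_echo` server that is started and locally registered on every node of interest. The caller only picks the node; the local node is used here so the example runs without distribution:

```erlang
-module(node_echo).
-behaviour(gen_server).

-export([start_link/0, echo/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Started by a local supervisor on each node, under the same local name.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% {Name, Node} is a valid server_ref(): the call goes to the process
%% registered as Name on Node.
echo(Node, Term) ->
    gen_server:call({?MODULE, Node}, {echo, Term}).

init([]) ->
    {ok, nostate}.

%% Reply with the node that actually handled the request.
handle_call({echo, Term}, _From, State) ->
    {reply, {node(), Term}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```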

On a related note, one interesting thing you can do here is (ab)use supervisors to create global singletons too.

Depends on how you need to access them. ets is good as a generic “works for 99% of situations” so sure…use that.

What I like to do is create a rest_for_one supervisor, have my gen as the first child and then have that gen spawn the child supervisors alongside it. Your options are (ugly) read which_children or (strongly recommended) have your process store the list in its own state. proplists are not really a good fit; you will find the lists:key{find,...} functions much more appropriate.
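
As a rough illustration of keeping the list in the process state, here is a sketch of a key/Pid list handled with the lists:key* functions. The module name `sup_registry` and its function names are hypothetical; in practice the list would live inside your gen's state:

```erlang
-module(sup_registry).

-export([new/0, add/3, find/2, remove/2]).

%% The registry is a plain list of {Key, Pid} tuples.
new() ->
    [].

%% Insert or replace the entry for Key.
add(Key, Pid, Reg) ->
    lists:keystore(Key, 1, Reg, {Key, Pid}).

%% Look up the Pid stored under Key.
find(Key, Reg) ->
    case lists:keyfind(Key, 1, Reg) of
        {Key, Pid} -> {ok, Pid};
        false -> error
    end.

%% Drop the entry for Key, if present.
remove(Key, Reg) ->
    lists:keydelete(Key, 1, Reg).
```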

alexandr · November 28, 2023, 11:15am

Your reply is super useful. Some of your points confirm my own strategy, and some are very new to me. HUGE Thx!