EEP 76: Priority Messages

rickard · January 7, 2025, 10:37pm

Abstract

In some scenarios it is important to propagate certain information to a process quickly, without the receiving process having to search the whole message queue, which can become very inefficient if the message queue is long. This EEP introduces the concept of priority messages to the language, which aims to solve this issue.

For more information see pull request 73 of the EEP repository.

zabrane · January 8, 2025, 7:03am

+1 for this concept.

Maria-12648430 · January 8, 2025, 9:59am

Thanks for taking this on @rickard, I have been hoping for this for a while.

rvirding · January 8, 2025, 12:07pm

I am sure this is fantastic but I reckon that we are going to see a lot of high priority messages. I know that the messages I send are important and will have high priority.

ieQu1 · January 8, 2025, 12:22pm

The problem of long message queues does, indeed, bite quite often, but IMO the current message ordering guarantees are too nice to lose.

An alternative solution that I was thinking about (but not to the point of submitting an EEP) would be to extend the process alias mechanism to the receive side:

Add a new BIF to create a new process mailbox. This BIF returns a reference that acts similarly to a process alias. Semantics of mailboxes created explicitly are the same.
Messages sent to this reference end up in a separate mailbox.
Add a new language construct, from Alias receive ... (or similar).
This would allow writing code like:

receive
  %% Process high prio control messages from the supervisor first:
  {'EXIT', ...} -> terminate(...)
after 0 ->
  %% Process other stuff:
  from OtherMailbox receive 
    {'$gen_call', From, ...} -> handle_call(...)
  end
end

Note: both receives should be treated as one by the compiler, rather than as 2 separate receives.

This solution preserves message causality within each mailbox. Additionally, it can be utilized by BEAM languages with static typing, because they could use a convention that each mailbox contains messages of the same type, but this is another story.

Just throwing this idea here. I understand that it requires significant changes in ERTS, compiler and standard library (e.g. gen:call).

dischoen · January 8, 2025, 12:40pm

I just thought of something similar, but without the need for new language constructs:

create a dedicated priority mail receiver process in a 1:1 relationship with the worker
priority messages go to the priority process, which caches them
like with the “from OtherMailbox” construct, the worker regularly polls the priority receiver for important messages.

of course the pid or name of the priority receiver has to be known to the sender(s).
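A minimal sketch of this helper-process idea, using only primitives that exist today. The module name and the functions start/0, send_priority/2 and poll/1 are made up for illustration; the worker would call poll/1 between units of work.

```erlang
-module(prio_proxy_sketch).
-export([start/0, send_priority/2, poll/1, demo/0]).

%% Start the helper process that caches priority messages for one worker.
start() ->
    spawn_link(fun() -> loop([]) end).

%% Senders must know the helper's pid (or registered name).
send_priority(Proxy, Msg) ->
    Proxy ! {priority, Msg},
    ok.

%% Called by the worker. Returns the cached priority messages,
%% oldest first, and empties the cache.
poll(Proxy) ->
    Ref = make_ref(),
    Proxy ! {poll, self(), Ref},
    %% Matching on a reference created just above lets the runtime skip
    %% messages that were already in the worker's queue.
    receive
        {Ref, Msgs} -> Msgs
    end.

%% Helper loop: Cache holds the priority messages, newest first.
loop(Cache) ->
    receive
        {priority, Msg} ->
            loop([Msg | Cache]);
        {poll, From, Ref} ->
            From ! {Ref, lists:reverse(Cache)},
            loop([])
    end.

%% Example: two priority messages are cached, then fetched in order.
demo() ->
    Proxy = start(),
    ok = send_priority(Proxy, stop_now),
    ok = send_priority(Proxy, reconfigure),
    poll(Proxy).
```

The cost is one extra round trip per poll, and the worker only notices a priority message the next time it polls.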

dischoen · January 8, 2025, 12:46pm

I think the guys at the late twitter had to implement something similar for their boss.

Maria-12648430 · January 8, 2025, 4:56pm

Well, the EEP says that the receiver has to configure itself (via process_flag) to actually prioritize priority-marked messages from a specific sender. In other words, while you may consider your messages important for the receiver, the receiver also has to consider messages coming from you as important.

Maria-12648430 · January 8, 2025, 5:10pm

If you read the EEP (or rather, the PR for it, at the time of this writing) carefully, you will see that you won’t lose anything that is currently guaranteed. What can change, if you as a receiver opt in to it, and only for specific senders, is the order in which messages end up in your mailbox. That is, if and only if a process A sends a priority message to process B, and process B has opted in to prioritize priority messages from process A, will the message actually be prioritized in process B’s mailbox. Messages from other processes (priority or not), as well as non-priority messages even from A, will be put at the end of the mailbox as usual.

max-au · January 9, 2025, 4:49am

It’s finally happening! Thanks for formalising the design we discussed to death over email and elsewhere, including the signal → message conversion magic. As I now read through the code, the fun part is repurposing receive marker storage when the process flag is set.

MononcQc · January 9, 2025, 5:05am

This looks like really risky dynamics, but since it’s opt-in from the receiver that makes it reasonable.

I’m curious about the semantics of toggling priority reception on and off while messages might be in flight, actively enqueued, etc.

I also think there’s potential here to make appups and relups a lot more reliable and safe by having the sys messages triggering the code update turn on priority reception for the duration of the upgrade. Part of the risks of these operations were that if they took place in busy processes, the gradual increase of mailbox size across dozens or hundreds of workers could take the node on a death spiral.

High priority relup messages could lower the duration of code_change communications and drastically de-risk these operations.

rickard · January 9, 2025, 6:34pm

ieQu1:

receive
  %% Process high prio control messages from the supervisor first:
  {'EXIT', ...} -> terminate(...)
after 0 ->
  %% Process other stuff:
  from OtherMailbox receive 
    {'$gen_call', From, ...} -> handle_call(...)
  end
end

Note: both receives should be treated as one by the compiler, rather than as 2 separate receives.

It is more or less the alternate solution I considered in the EEP. I’m not fond of such a solution since it makes the receive expression very complicated and you wouldn’t be able to use this functionality in gen_servers, etc, without having to modify all behaviors as well.

Using the solution proposed in the EEP, the only thing you need to do is to add the following to the initialization of the worker process that should prioritize the supervisor 'EXIT' message:

    process_flag({priority_exit_message, SupervisorPid}, true),

and then you are done.

One thing that I think is nice about the approach in the EEP is that it solves these issues without introducing major changes.

rickard · January 9, 2025, 6:36pm

As @Maria-12648430 stated

rickard · January 9, 2025, 6:41pm

Having to poll for important events is quite annoying and often costly. Polling workarounds are what the proposed solution is trying to get away from.

rickard · January 9, 2025, 7:15pm

The state of the process flag at the time of the signal reception determines the action taken for the message. If it is true, the message overtakes ordinary messages in the message queue and is added there; otherwise, it is added to the end of the message queue. Once it has been added to the message queue, it won’t move.
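As a rough illustration of that rule, here is a toy model in plain Erlang. It is not the ERTS implementation; the module and function names are hypothetical, and it assumes that priority messages keep their arrival order among themselves.

```erlang
-module(prio_queue_model).
-export([new/0, deliver/3, to_list/1, demo/0]).

%% Toy model only: the queue is {PrioMsgs, OrdinaryMsgs}, both newest first.
new() -> {[], []}.

%% deliver(Msg, Prioritized, Queue)
%% Prioritized is the state of the receiver's flag for this sender at the
%% moment the signal is received. Later flag changes do not move the message.
deliver(Msg, true, {Prio, Ord}) -> {[Msg | Prio], Ord};
deliver(Msg, false, {Prio, Ord}) -> {Prio, [Msg | Ord]}.

%% The order in which a plain receive would see the messages.
to_list({Prio, Ord}) -> lists:reverse(Prio) ++ lists:reverse(Ord).

demo() ->
    Q0 = new(),
    Q1 = deliver(a, false, Q0),
    Q2 = deliver(exit_sig, false, Q1),  % flag off: goes to the end
    Q3 = deliver(b, false, Q2),
    Q4 = deliver(urgent, true, Q3),     % flag on: overtakes a, exit_sig, b
    to_list(Q4).
```

In this model demo/0 yields the order urgent, a, exit_sig, b: the signal that arrived while the flag was off stays where it was queued.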

Regarding prioritized code change: this is not something that I have been looking into, and it feels to me a bit like stretching what the feature was intended for. But I might be wrong; it can perhaps be solved.

Just an idea (which might be stupid, since I haven’t looked into the details here): in an initial phase (new or existing), before anything regarding the upgrade has been done, send a sys-message request to the involved processes to enable priority messaging. Once replies from all involved processes have been received, initiate the upgrade. Disable priority messaging in the final phase (new or existing).

max-au · January 10, 2025, 2:12am

On that note, I think it’d be useful to have a way to override the global default (via beam command line switch). There may also be some use of that for proc_lib:start/spawn function family, but I’d rather abstain from that.

rvirding · January 10, 2025, 7:26pm

I understood @rickard to mean that there isn’t a global default, but that you explicitly set it inside a process:

    process_flag({priority_exit_message, SupervisorPid}, true),

max-au · January 10, 2025, 8:22pm

Yes, I’m saying that I’d like to add the global default override to that PR, for completeness.

dszoboszlay · January 11, 2025, 12:21am

I agree with the motivation and see the relevance of the problematic scenarios listed in the EEP. But, to be honest, I’m not entirely sold on the proposed solution.

One problem I have is that the receiver needs to opt-in for priority messages per sender. The current practice is that only the sender needs to know who will be the receiver of a message, a process generally doesn’t care about who will send a message to it. Perhaps in these specific problematic scenarios the receiver would know who may send a priority message to it in the future, but there are many scenarios when the sender is not known. For example it may be useful to schedule a priority message with erlang:send_after/4, but you don’t know what process to expect that message from.
The EEP deals with regular messages from a process, exit signals and monitor signals. They all need different process_flag/2-s for opting in, which is already a lot of complexity. monitor/3 gets a new option to make this easier to use, but for a complete priority-message support you’d need similar extensions to a lot of other API-s that set up some message to be sent in the future: timers, erlang:send_after/4, erlang:monitor_node/3, net_kernel:monitor_nodes/2, trace messages, OS signal handling, active messages sent by gen_tcp, the {tcp_passive, Socket} message sent upon exhausting the receive window of an {active, N} socket and so on. A complete support of priority message (where any message you may expect from Erlang/OTP can be configured to be sent as priority) would be a huge change, with a lot of updates both in the client API-s and in the implementation of the services (so they know whether to send regular or priority messages).
A minor note is that I don’t see the EEP mentioning NIF-s. NIF-s would also need an API for sending priority messages.

If I may propose an alternative solution to consider, I think indexed message queues could solve this problem with less complexity (at least in the language and library level). The idea is that a process could specify one or more patterns (in the form of a match specification) used for indexing its message queue. This could be a process_flag/2, for example like this:

process_flag(
  message_queue_indexing,
  [ { {'EXIT', '_', '_'}, [], [] }, % index EXIT messages
    { immediate_abort, [], [] } % index some arbitrary message relevant for this process
  ])

This call would create 2 indices for the process.

When a message is placed in the message queue of a process with message_queue_indexing, the message is tested against the match spec, and gets inserted into all of the indices too (an index wouldn’t be a separate message queue, just something like a linked list of pointers to the actual messages in the one and only message queue).

Now, the trick is that when the process later does a selective receive, the receive patterns could be compared with the match spec and if all of the receive patterns are indexing patterns too, then the receive would only have to scan those indices, not the entire message queue.

This would help speeding up code written in the following (I believe quite common) style, without messing with message ordering guarantees in any way:

receive
  immediate_abort ->
    exit(shutdown);
  {'EXIT', Pid, Reason} when Pid =:= MyWorker ->
    restart_worker()
after
  0 ->
    receive
      Msg ->
        handle_msg(Msg)
    end
end

The downside is that it’s obviously a harder problem to implement these indexed receives in the compiler and ERTS (e.g. the compiler would have to turn the patterns to match specifications and then ERTS would have to figure out whether the index match specification fully covers the receive match specification; not to mention maintaining the indices as messages are inserted and removed from the message queue).
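To make the index idea a bit more concrete, here is a toy model in plain Erlang. It is only a sketch: all names are hypothetical, a predicate fun stands in for the match specification, and there is a single index instead of one per pattern.

```erlang
-module(indexed_queue_model).
-export([new/1, insert/2, take_indexed/1, demo/0]).

%% State is {Pred, NextSeq, Queue, Index}. Queue maps Seq => Msg (the one
%% and only message queue); Index lists, oldest first, the Seqs of the
%% messages that satisfy Pred.
new(Pred) -> {Pred, 0, #{}, []}.

%% Every message goes into the queue; matching ones are also indexed.
insert(Msg, {Pred, Seq, Queue, Index}) ->
    Index1 = case Pred(Msg) of
                 true -> Index ++ [Seq];   % O(n) append, fine for a toy
                 false -> Index
             end,
    {Pred, Seq + 1, Queue#{Seq => Msg}, Index1}.

%% Take the oldest indexed message without scanning unindexed ones.
take_indexed({_, _, _, []}) ->
    none;
take_indexed({Pred, Next, Queue, [Seq | Rest]}) ->
    {Msg, Queue1} = maps:take(Seq, Queue),
    {Msg, {Pred, Next, Queue1, Rest}}.

demo() ->
    Pred = fun(immediate_abort) -> true;
              ({'EXIT', _, _}) -> true;
              (_) -> false
           end,
    S0 = new(Pred),
    S1 = lists:foldl(fun insert/2, S0,
                     [work1, work2, immediate_abort, work3]),
    {Msg, _S2} = take_indexed(S1),
    Msg.
```

The relative order of the messages in the queue never changes here; the index only shortens the search for the patterns it covers.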

max-au · January 11, 2025, 2:53am

The way I understand your proposal, it’s an implementation of multiple message queues, with a somewhat convoluted “selective-receive-like” syntax to specify the order in which the receiving process wants to fetch these messages.

While generally I’m in favour of multi-receive-queue solutions, I think it will be quite a revolution to implement. That said, if someone can come up with a decent implementation of that, it may be a really cool feature.