Performance of socket option {packet, 4}


In rabbitmq-server PR #5054 I switched from socket option {packet, 0} to socket option {packet, 4} for the RabbitMQ stream protocol.

My expectation was to get a speedup, since packet parsing would be done by the Erlang runtime. However, message throughput dropped slightly.

I tried to come up with a minimal reproducible example that doesn't use RabbitMQ: ansd/packet-parser on GitHub, which compares the performance of Erlang's socket option {packet, 0} against {packet, 4}. Unless I'm doing something wrong (which might well be the case :)), the tests suggest that using {packet, 0} and parsing packets myself in the application layer is more than 5 times faster than using {packet, 4} at a packet size of 1500 bytes.
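For readers who have not seen the two approaches side by side, here is a minimal sketch of what "parsing packets in the application layer" can look like with {packet, 0}. It is not the code from the benchmark repository; the module name frame_sketch and the function parse/1 are made up for illustration. It assumes the same wire format that {packet, 4} uses, i.e. a 4-byte big-endian length followed by the payload.

```erlang
-module(frame_sketch).
-export([parse/1]).

%% Split a buffer into complete frames, each prefixed by a 4-byte
%% big-endian length. Returns {Frames, Rest}, where Rest is an
%% incomplete tail that must be kept until more data arrives.
parse(Buffer) ->
    parse(Buffer, []).

%% A complete frame is at the front of the buffer: take it and continue.
parse(<<Len:32, Payload:Len/binary, Rest/binary>>, Acc) ->
    parse(Rest, [Payload | Acc]);
%% Fewer bytes than a full frame remain: stop and hand back the tail.
parse(Rest, Acc) ->
    {lists:reverse(Acc), Rest}.
```

With {packet, 0} the receiver would prepend the leftover Rest to the next {tcp, Socket, Data} message before calling parse/1 again, so one socket read can yield many frames.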

I’m wondering whether this is expected?

Many thanks!


Since you are running {active,once}, the VM will only read ?MSG_BYTES per call when you use {packet,4}, whereas with {packet,0} it will read the entire TCP buffer in one go. I think you would get a fairer comparison if you used {active,40}, as then the VM can also read the entire buffer in one go and deliver all of it.
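A rough sketch of a receive loop using {active, N} follows, in case it helps. The module name active_n_sketch and the functions recv_loop/2 and demo/0 are hypothetical and are not taken from the benchmark; the loopback demo is only there to make the example self-contained. The key point is that after N messages the socket goes passive and sends {tcp_passive, Socket}, at which point the owner has to re-arm it.

```erlang
-module(active_n_sketch).
-export([recv_loop/2, demo/0]).

%% Count packets until the peer closes. Assumes Socket was set up with
%% [binary, {packet, 4}] and is currently in {active, 40} mode.
recv_loop(Socket, Count) ->
    receive
        {tcp, Socket, _Packet} ->
            recv_loop(Socket, Count + 1);
        {tcp_passive, Socket} ->
            %% The 40 deliveries are used up; grant 40 more.
            case inet:setopts(Socket, [{active, 40}]) of
                ok -> recv_loop(Socket, Count);
                {error, _} -> Count
            end;
        {tcp_closed, Socket} ->
            Count;
        {tcp_error, Socket, Reason} ->
            {error, Reason, Count}
    end.

%% Loopback demo: send 100 framed packets and count them on the
%% receiving side.
demo() ->
    {ok, L} = gen_tcp:listen(0, [binary, {packet, 4}, {active, false}]),
    {ok, Port} = inet:port(L),
    {ok, C} = gen_tcp:connect({127,0,0,1}, Port, [binary, {packet, 4}]),
    {ok, S} = gen_tcp:accept(L),
    ok = inet:setopts(S, [{active, 40}]),
    lists:foreach(fun(_) -> ok = gen_tcp:send(C, <<"hello">>) end,
                  lists:seq(1, 100)),
    ok = gen_tcp:close(C),
    Count = recv_loop(S, 0),
    gen_tcp:close(S),
    gen_tcp:close(L),
    Count.
```

Compared with {active, once}, this needs one inet:setopts/2 call per 40 messages instead of one per message, while still giving the receiver some flow control.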


Thank you, that’s a good tip. I wasn’t aware that {active, N} will speed things up.

I added some test results with [{packet, 4}, {active, 40}]. It does speed up receiving messages most of the time (depending on the packet size). However, with a packet size of 1500 bytes it's still more than 3 times slower than [{packet, 0}, {active, once}].

(To be fair, for small packets (~10 bytes) and large packets (>10 KB), {packet, 4} always performs better than {packet, 0}.)

Been tinkering a bit with David's program today and narrowed it down to the exact message size that causes the slowdown (at least on my machine).

With the program using {active, once}, I get ~13M messages over the 30s run when ?MSG_SIZE is 1456.

If I increase ?MSG_SIZE to 1457, I only get ~2.5M messages over the 30s run.

Doesn't feel quite right :)


That is the default buffer size:

> {ok, S} = gen_tcp:connect("",80,[]), inet:getopts(S,[buffer]).

So when you go above it, the TCP driver implementation has to do two reads instead of one, and probably some reallocations as well.
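If the default buffer is 1460 bytes, as I believe it is in the inet driver, the numbers above line up: 1456 bytes of payload plus the 4-byte length header is exactly 1460, and 1457 bytes of payload no longer fits. One thing that might be worth trying is raising the buffer option before running the test. Whether that removes the slowdown is not confirmed anywhere in this thread, so treat the following as an untested idea; the module name buffer_sketch and the function ensure_buffer/2 are made up for illustration.

```erlang
-module(buffer_sketch).
-export([ensure_buffer/2]).

%% Make sure the driver's user-level buffer is at least Min bytes.
%% Returns {ok, Size} with the size reported after any adjustment.
ensure_buffer(Socket, Min) ->
    {ok, [{buffer, Current}]} = inet:getopts(Socket, [buffer]),
    case Current >= Min of
        true ->
            %% Already large enough; leave the socket untouched.
            {ok, Current};
        false ->
            ok = inet:setopts(Socket, [{buffer, Min}]),
            {ok, [{buffer, New}]} = inet:getopts(Socket, [buffer]),
            {ok, New}
    end.
```

For the benchmark this would be called on the receiving socket with a Min comfortably above ?MSG_SIZE + 4, for example buffer_sketch:ensure_buffer(Socket, 65536).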