Erlang_quic - QUIC Distribution for Erlang Clusters

Following the Hackney 3.1.0 announcement (HTTP/3, now pure Erlang with zero C dependencies), I am pleased to announce erlang_quic 0.11.0 with a major new feature: QUIC-based Erlang distribution.

QUIC Distribution (quic_dist)

It is now possible to run Erlang clusters over QUIC instead of TCP:

  erl -name node1@host1 \
      -proto_dist quic \
      -epmd_module quic_epmd \
      -start_epmd false \
      -quic_dist_port 4433

Advantages over ssl_dist:

  • No head-of-line blocking - Multiple streams allow messages to flow independently
  • 0-RTT reconnection - Session resumption permits fast cluster recovery
  • Built-in TLS 1.3 - No separate SSL configuration required
  • QUIC-level liveness - Resolves net_tick_timeout issues under heavy load

Discovery backends are included: static configuration or DNS SRV records.
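As an illustration, static discovery might be wired up along these lines. Note that the option names (`discovery`, `static`, `dns_srv`) are assumptions for the sketch, not the library's documented configuration keys:

  %% sys.config - hypothetical example; the quic_dist option names
  %% (discovery, static, dns_srv) are illustrative assumptions.
  [{erlang_quic,
    [{quic_dist,
      [{port, 4433},
       %% static backend: list cluster nodes explicitly
       {discovery, {static, ['node1@host1', 'node2@host2']}}
       %% or resolve peers from DNS SRV records:
       %% {discovery, {dns_srv, "_erlang._udp.cluster.example.com"}}
      ]}]}].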

Congestion Control

The library now offers pluggable congestion control with three algorithms:

  • NewReno (RFC 9002) - Default, simple and robust
  • CUBIC (RFC 9438) - More suitable for high-bandwidth networks
  • BBR - Based on bottleneck bandwidth and RTT measurement

All algorithms support HyStart++ (RFC 9406) for improved slow start and packet pacing to avoid bursts.
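Selecting an algorithm would presumably look something like the following application environment fragment. These keys are assumptions for illustration only; none of them are confirmed by the announcement:

  %% Hypothetical configuration sketch - key names are assumed.
  {erlang_quic,
   [{congestion_control, cubic},   %% newreno | cubic | bbr
    {hystart, true},               %% HyStart++ (RFC 9406)
    {pacing, true}]}.              %% packet pacing to avoid bursts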

Connection Migration (RFC 9000 Section 9)

Full support for QUIC connection migration:

  • Server-side NAT rebinding detection
  • Active migration with path validation
  • CID rotation for privacy
  • Automatic congestion control reset on path change

Other Notable Changes

  • UDP packet batching (GSO/GRO)
  • QLOG tracing for debugging
  • Stream prioritization
  • RTT-based flow control auto-tuning
  • All 10 QUIC Interop Runner tests pass

Links

Pure Erlang, zero external dependencies, requires OTP 27+.


Throughput

Local benchmarks show TCP distribution achieving 40k+ msg/s for small messages, scaling to
1.4 GB/s for 64KB messages. QUIC distribution provides comparable throughput with the added
benefits above.

  β”Œβ”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚ Size β”‚ Throughput β”‚ Bandwidth β”‚ Latency β”‚
  β”œβ”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€€
  β”‚ 64B  β”‚ 40k/s      β”‚ 2.4 MB/s  β”‚ 25 us   β”‚
  β”‚ 1KB  β”‚ 45k/s      β”‚ 44 MB/s   β”‚ 22 us   β”‚
  β”‚ 16KB β”‚ 39k/s      β”‚ 600 MB/s  β”‚ 26 us   β”‚
  β”‚ 64KB β”‚ 22k/s      β”‚ 1.4 GB/s  β”‚ 45 us   β”‚
  β””β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

I’m curious, how is this possible without violating the signal ordering guarantee of the language? Hashing pid() for stream selection?

Right now the implementation opens multiple streams, but only uses the first one for distribution data and maintains explicit ordering. So the multi-stream benefit is limited to:

  1. Priority separation: control stream (stream 0, urgency 0) vs data streams (urgency 4-6)
  2. Ticks bypass congestion: the control stream uses a separate flow-control allowance

To actually use multiple streams without violating the signal ordering guarantee, I think we have two choices:

  1. Hash by sender/receiver pid pair: the same pair will always use the same stream
  2. Hash by sender pid: this only preserves ordering per sending process, which should be enough

(1) is more correct imo, but I am not sure there is a real benefit to multiple streams. The only benefit might be that a busy flow would not block the others, but connection-level flow control is still needed, and the receiver will be a single process anyway. The current implementation is simpler and effective.
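Either mapping can be sketched with erlang:phash2/2, which deterministically hashes any term into a given range. This is a sketch of the idea, not the library's actual code:

  -module(stream_select).
  -export([by_pair/3, by_sender/2]).

  %% Choice (1): the same sender/receiver pair always maps
  %% to the same stream index in 0..NStreams-1.
  by_pair(From, To, NStreams) ->
      erlang:phash2({From, To}, NStreams).

  %% Choice (2): all messages from one sender share a stream,
  %% preserving per-process signal ordering regardless of receiver.
  by_sender(From, NStreams) ->
      erlang:phash2(From, NStreams).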

In the coming 0.12 there will be an API to exploit the distribution connection for user data, allowing it to open some kind of β€œsidecar” streams.


Sends can happen to an alias or a registered name. So you cannot always know the receiver PID.
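For example, for sends like these the sending side only sees a registered name or an alias reference, never the resolved pid (plain Erlang, nothing library-specific), so choice (1) cannot hash on the receiver pid:

  %% The receiver pid is only resolved on the remote node:
  {my_server, 'node2@host2'} ! Msg,                 %% registered name
  gen_server:cast({my_server, 'node2@host2'}, Msg), %% also name-based
  Alias ! Reply.                                    %% alias ref, not a pid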


Here is the coming API:

While I think that (1) is more correct, (2) should provide similar performance while being much simpler to implement and easier to manage.

I guess that as long as requests and responses each always use the same stream for a given process, this should work?

I.e. if the caller uses stream A and the callee uses stream B, message order should be kept, since QUIC streams are ordered.

cc @garazdawi