I’ve released gen_cluster, a pluggable library for node discovery and reconnection. It has two behaviours, and you configure which implementation of each is used.
I’m hoping for feedback on the API, both how it is configured and how third-party implementations of the behaviours are written. I think I like it well enough myself.
Planned:
- Considering a separate timer with backoff for reconnecting to nodes that had previously disconnected, instead of immediately retrying them on the same refresh interval. Plus jitter for the refresh interval.
- Events for subscribing to, but I’m unsure that is worth it since there isn’t much to send events for beyond the down/up messages you can already get with `monitor_nodes`. But this could be a callback instead of a message and also include refresh events (shrug)
- Metrics (may be `telemetry` + `opentelemetry` or may be just `opentelemetry`, depending on how I decide to do the first bullet point)
- Postgres-based node discovery (based on a table of peers that nodes will also health-check)
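
To make the backoff and jitter idea from the list above concrete, here is a rough sketch of how the two delays might be computed. None of this is in gen_cluster today; the module name `backoff_sketch`, the function names, and the parameters are all made up for illustration.

```erlang
-module(backoff_sketch).
-export([delay/3, jittered_interval/2]).

%% Hypothetical helper, not part of gen_cluster.
%% Exponential backoff: Base * 2^Attempt, capped at Max (milliseconds).
-spec delay(non_neg_integer(), pos_integer(), pos_integer()) -> pos_integer().
delay(Attempt, Base, Max) when Attempt >= 0, Base > 0, Max >= Base ->
    %% Cap the exponent so the shift stays small for large attempt counts.
    Exp = min(Attempt, 20),
    min(Max, Base bsl Exp).

%% Hypothetical helper, not part of gen_cluster.
%% Refresh interval with up to +/- JitterPct percent of random jitter.
-spec jittered_interval(pos_integer(), 0..100) -> pos_integer().
jittered_interval(Interval, JitterPct)
  when Interval > 0, JitterPct >= 0, JitterPct =< 100 ->
    Spread = Interval * JitterPct div 100,
    %% rand:uniform(N) returns an integer in 1..N,
    %% so Offset lands in -Spread..Spread.
    Offset = rand:uniform(2 * Spread + 1) - 1 - Spread,
    max(1, Interval + Offset).
```

A reconnect timer for a single node could then be armed with something like `erlang:send_after(backoff_sketch:delay(Attempt, 1000, 30000), self(), {reconnect, Node})`, where the `{reconnect, Node}` message shape is again only an assumption.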
Related to this, I’m going to make a topic, and will update here, about the idea of getting something similar into OTP so that this isn’t needed. I thought about doing gen_cluster through `-epmd_module`, but it’s a little awkward as you still need the reconnection ability and something to say "connect to all returned by `names`". So you end up making the user configure both `-epmd_module` and `gen_cluster`. More in the follow-up topic.
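
For anyone wondering what "connect to all returned by names" could look like, here is a minimal sketch using `net_adm:names/1` and `net_kernel:connect_node/1`. The module `connect_names_sketch` is hypothetical and not part of gen_cluster, and it assumes the host string matches the host part of the node names in use.

```erlang
-module(connect_names_sketch).
-export([connect_all/1, node_name/2]).

%% Hypothetical helper, not part of gen_cluster.
%% Look up the names registered for Host and try to connect to each one.
%% Returns a list of {Node, Result} pairs, or {error, Reason} if the
%% lookup itself fails.
connect_all(Host) when is_list(Host) ->
    case net_adm:names(Host) of
        {ok, Names} ->
            [begin
                 Node = node_name(Name, Host),
                 {Node, net_kernel:connect_node(Node)}
             end || {Name, _Port} <- Names];
        {error, _Reason} = Error ->
            Error
    end.

%% Build 'name@host' from two strings. list_to_atom/1 creates new atoms,
%% so this should only be given trusted input.
node_name(Name, Host) ->
    list_to_atom(Name ++ "@" ++ Host).
```

This only covers the one-shot connect; the reconnection part still has to live somewhere, which is the awkward bit.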