I’m currently building my releases using rebar3 and deploying them to production using Docker, and the setup has been working well—all performance requirements are being met without issues.
However, I’d like to take the next step and implement release upgrades (hot code loading) to update my application without downtime. I understand the theory behind relup files and the release_handler module, but I’m uncertain about the practical implementation within a Docker environment.
Specifically, I’m wondering:
How can release upgrades be effectively achieved when running inside Docker containers? The typical approaches involve modifying the release directory structure, which seems to conflict with Docker’s immutable container model.
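To make the conflict concrete, here is a minimal sketch of the standard in-place flow, assuming a hypothetical release named myapp going to version “0.2.0”. Every step writes under the release root (releases/, the RELEASES file), which is exactly what an immutable image resists:

```erlang
%% On the live node, after copying the new tarball to
%% releases/myapp-0.2.0.tar.gz inside the container
%% (release name and version are hypothetical).
{ok, Vsn} = release_handler:unpack_release("myapp-0.2.0"),
%% Executes the relup instructions against the running system.
{ok, _FromVsn, _Descr} = release_handler:install_release(Vsn),
%% Records the new version in releases/RELEASES so a restart boots into it.
ok = release_handler:make_permanent(Vsn).
```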
What are the recommended alternatives for achieving zero-downtime deployments with Erlang releases in Docker? I’m aware of strategies like blue-green deployments or rolling updates at the orchestration layer, but I’m curious whether anyone has successfully implemented true hot upgrades within containerized environments.
For those running Erlang/OTP in production with Docker (or Podman): how are you handling application updates? Are you using hot upgrades, or have you opted for different deployment strategies?
I’d appreciate any practical advice, war stories, or references to production-tested approaches. I’m particularly interested in understanding the trade-offs between leveraging Erlang’s built-in hot upgrade capabilities versus relying on container orchestration for zero-downtime deployments.
I’m the author of Castle, a library that helps with hot-code upgrades for Elixir releases. Hot upgrades are undoubtedly a very cool feature of the platform, but there are a few things to bear in mind.
The whole point of HCUs is to avoid downtime. The corollary is that you want to be very sure your upgrade scripts won’t themselves leave the system in a bad state or, worse, crash while running. That is probably going to mean a fair amount of gnarly testing.
You can only really use them for your own apps. I’ve seen very few apps outside OTP that ever ship with appups. So they’re applicable for bug fixes, but not for general dependency upgrades.
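For anyone who hasn’t seen one: an appup is just an Erlang term file shipped in an application’s ebin directory. A minimal hypothetical example (module and version names made up):

```erlang
%% ebin/myapp.appup (hypothetical): how to move between 0.1.0 and 0.1.1.
{"0.1.1",
 %% Upgrade instructions, keyed by the version we can come from:
 [{"0.1.0", [{load_module, my_utils},
             {update, my_server, {advanced, []}}]}],
 %% Downgrade instructions, keyed by the version we can go back to:
 [{"0.1.0", [{load_module, my_utils},
             {update, my_server, {advanced, []}}]}]}.
```

The {advanced, []} update is what drives Module:code_change/3 in a running gen_server, and those state migrations are precisely the scripts worth testing hard, per the point above.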
It’s a great question; I look forward to the responses.
We implemented in-service upgrades in our Apt and Yum packages but have made no attempt in Helm. It might be an anti-pattern in a k8s environment.
You could design your containers as read-only clients. The idea is that the container is just the OS with an OTP bootstrap, and all your code is pulled in from outside via file-system volumes or the boot server.
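As a sketch of the boot-server variant (names and addresses hypothetical): OTP’s kernel ships erl_boot_server, and a node started with the inet loader fetches its boot script and code over the network instead of from disk:

```erlang
%% On the host serving the code: allow the container IPs
%% (hypothetical addresses) to boot from this node.
{ok, _Pid} = erl_boot_server:start([{10,0,0,11}, {10,0,0,12}]).

%% Each container then needs little more than ERTS in the image
%% and starts along the lines of:
%%   erl -loader inet -hosts 10.0.0.1 -id myapp -setcookie secret
%% so shipping new code never requires rebuilding the image.
```

The trade-off is that the boot server (or the shared volume) becomes part of your availability story.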
Some would argue that hot code loading is a relic of a monolithic past; however, what rolling updates of cloud-native functions do not support is keeping existing network connections in place. If your mobile device’s data connection is lost, it can reconnect and you probably won’t notice; when your phone call is disconnected, you will. We like in-service updates with hot code loading because there is no service disruption at all.
Regarding connection preservation: I believe nginx (and HAProxy) already solve this elegantly when fronting the application.
Nginx’s graceful reload (nginx -s reload) keeps old workers serving existing connections while new workers handle fresh requests to updated backends. Old workers only exit after completing all in-flight requests—zero dropped connections.
The deployment flow could look like this:

1. Start a new container with the updated code
2. Health-check the new container
3. docker exec nginx /usr/sbin/nginx -s reload (routes new traffic to the new container)
4. Remove the old container after it drains
5. Reload nginx again to clean up
This has been production-proven: the team at Tines implemented zero-downtime deploys using this technique with nothing more than a ~20-line bash script.
My question then: Given that nginx reload preserves connections effectively, what specific advantages do hot upgrades offer in containerized environments? Is it about preserving in-memory state beyond connections, or avoiding multi-container orchestration complexity?
Genuinely curious whether the read-only container + hot upgrade pattern provides benefits beyond what nginx graceful reloads already deliver.
@vances Ah, that’s the key distinction! My use case is much simpler—an HTTP-based service using Cowboy, so the nginx approach works well IMHO.
Out of curiosity: why use Docker for those telecom protocols in the first place? It seems like the container model conflicts with hot upgrades, while the need for hot upgrades suggests the workload might be better suited to traditional deployment.
Most of our customers are still running VMs; however, increasingly we are being asked to deliver onto k8s. When all the network functions are “cloud native”, there’s no argument for us not to.
The more I think about it, the more I like the idea of making our Pods generic diskless Erlang clients. They would only need to be replaced for a new OTP version, which could be a normal rolling upgrade, while our new release packages could be rolled out in-service as often as needed. I’ll have to think about how to integrate that into Helm.