Release upgrade specifications: day 1 or day 2?

We ship software in releases. When we ship a new release it may be installed as a fresh system or applied to one which is already in production. The sasl application provides tools for runtime release handling.

We can use release_handler:install_release/1 to upgrade a running system, which leverages the power of hot code loading. The release package must first have been unpacked with release_handler:unpack_release/1. A release package may be prepared for shipping using systools:make_tar/1,2.
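A rough sketch of that sequence follows. The release name "myrel-2.0" is a placeholder, and the sketch assumes myrel-2.0.rel is in the current directory on the build host, that the package has been copied into the releases directory of the target, and that sasl is running there.

```erlang
%% Build host: generate the boot script, then myrel-2.0.tar.gz.
%% "myrel-2.0" is a hypothetical release name.
ok = systools:make_script("myrel-2.0"),
ok = systools:make_tar("myrel-2.0").

%% Target node: unpack the package, then install it into the running system.
{ok, Vsn} = release_handler:unpack_release("myrel-2.0"),
{ok, _FromVsn, _Descr} = release_handler:install_release(Vsn),
%% Keep the new release across restarts once we are happy with it.
ok = release_handler:make_permanent(Vsn).
```

The install_release/1 step is the one that fails when the unpacked release has no relup.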

Now here’s where things get circular. Installing the release requires a release upgrade file (relup); however, no such file is included in the package. We can create a relup with systools:make_relup/3,4, but that requires an application upgrade file (appup) for each application included, and those are not included in the package either.
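To make the dependency concrete, here is a minimal appup sketch. The application name myapp, the module myapp_server and the version numbers are all made up for illustration.

```erlang
%% ebin/myapp.appup (hypothetical application and module names)
{"2.0",
 %% Instructions for upgrading from 1.0 to 2.0
 [{"1.0", [{load_module, myapp_server}]}],
 %% Instructions for downgrading from 2.0 to 1.0
 [{"1.0", [{load_module, myapp_server}]}]}.
```

With an appup like this in place for every application that changed, and with both .rel files available, something like systools:make_relup("myrel-2.0", ["myrel-1.0"], ["myrel-1.0"]) should be able to generate the relup.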

The appup file allows us to specify application versions with a regular expression; the relup file does not. We could just create the relup on day 1 and add it to the shipped package (erl_tar:add/3,4); however, to support upgrading from an arbitrary one of the many previous releases, we would have to include a relup clause for every single one. If we instead have the relup created on day 2, on the target system, we have to arrange for all the appup files to be created too, which is probably less trouble.
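For illustration, an appup can match a whole family of versions by giving the version as a regular expression in binary form. The names and versions below are hypothetical.

```erlang
%% ebin/myapp.appup: one clause covers every 1.x version of myapp
{"2.0",
 [{<<"1\\.[0-9]+">>, [{load_module, myapp_server}]}],
 [{<<"1\\.[0-9]+">>, [{load_module, myapp_server}]}]}.
```

On the relup side there is no such shortcut, so a day 1 relup would need every earlier release named explicitly, along the lines of systools:make_relup("myrel-2.0", ["myrel-1.0", "myrel-1.1", "myrel-1.2"], []), with the .rel file of each of those releases available at build time.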

So my question is, which approach were the OTP team expecting us to follow? How are other people dealing with it? I’ve never used rebar, is it opinionated on this? Would it be difficult to support regular expressions in relup? Should appup files be automatically included in packages?


I can’t comment on OTP releases since we don’t use them much, and I have no personal experience with them.

Our main Erlang system at Klarna exclusively uses live upgrades for the production system. It has a home-grown framework of “UP-files”: Erlang modules that can run before or after the new code has been loaded. Typically they do things like modify the supervisor trees, restart gen_* servers or entire applications, use sys: to modify internal state, or issue strategically ordered c:l calls to reload code in some specific order (to handle API changes). On top of this there is an “upgrade” framework which orchestrates the above, and knows about some key internal things that always receive special treatment during upgrades.
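For readers who have not seen this style of upgrade, a module of that kind might look roughly like the sketch below. This is not the actual framework: the module name, the post_load/0 callback and the my_api, my_client and my_server names are all invented for illustration.

```erlang
-module(up_example).   % hypothetical UP-file
-export([post_load/0]).

%% Imagined hook that an upgrade framework would call on each node.
post_load() ->
    %% Reload in a deliberate order: the changed API first, then its caller.
    {module, my_api} = c:l(my_api),
    {module, my_client} = c:l(my_client),
    %% Patch the state of a running gen_server in place. The identity fun
    %% is a placeholder for a real state transformation.
    _NewState = sys:replace_state(my_server, fun(State) -> State end),
    ok.
```

The point of ordering the c:l/1 calls by hand is that callers never run against an API module older than the one they were written for.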

This has served us well for >> 10 years, but since some major changes over the last 3-4 years we have essentially eliminated the need for the nodes in a cluster to always run the exact same code, so we’re contemplating migrating to rolling upgrades with node restarts.

Some of our auxiliary systems use OTP releases, but only as a packaging method. They don’t do live upgrades.


Above I suggested we could add files to a release package with erl_tar:add/3,4; however, that turns out to be untrue, as erl_tar:open/2 with the write option truncates an existing file.

The extra_files option to systools:make_tar/2 was introduced with sasl-4.0; however, it is currently documented incorrectly. I opened issue 8842 and sent a pull request (8843) to fix it.