Erlang Deployment Tools/Options

What is everyone using for Erlang deployment? What are the options? :smiley:

7 Likes

Not very pretty, requires a lot of caution but works very often:

# in the project checkout, from the system shell:
git pull
./rebar3 as prod compile

%% then, in the running node’s shell, hot-load every module whose
%% .beam on disk differs from the loaded version (l/1 purges and loads):
[l(M) || M <- code:modified_modules()].

I really wanted to switch to relx, I swear ;-), but at the time I tried it, it was completely broken on Windows, which up until now (we’re switching on Monday \o/) was what we had to put up with.

3 Likes

Any specifics you have on what wasn’t working on Windows?

2 Likes

Well, it was at least this one: Release does not boot on Windows with include_erts to false · Issue #237 · erlware/relx · GitHub

I contributed a patch for a part of it (related to the generated batch files) in Fix erts path discovery on Windows if the path contains spaces. by filmor · Pull Request #464 · erlware/relx · GitHub, and it took so long to get merged that I didn’t have time to try again (the project moved on).

2 Likes

I build releases using rebar3 (relx). Most of this application is not written in Erlang (yet); an existing build system just picks up the release and packages it with all the other deliverables into DEB and RPM files.

1 Like

osc ci

osc is a client with a command-line interface and network behavior in the style of Subversion. It serves as a client for the source code repository component of the Build Service and is used to edit metadata or query build results.

https://en.opensuse.org/openSUSE:OSC

1 Like

We write and sell software that is used on the client side: on bare metal on-premises and in public clouds.

I see two different cultures that are rather far from each other, even in the language they use. Let’s call them “admin” and “devops”.

The “admin” culture implies “if it works, don’t touch it”. It is very easy to start with: some of our clients learn Linux basics solely from our installation guide. A Debian package with systemd, plain-text /var/log files with preconfigured rotation, and a web UI that keeps running even if the config is broken.

We run rebar3 as prod compile (of course rebar3 with our own patches, because we need to reduce minutes in our CI pipeline). After this we add some trivial systemd files, along the lines of the sketch below, and use Systemd support · GitHub.
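
Roughly, such a unit file looks like this (just a sketch; the names, paths, user, and service type are illustrative, not our actual setup):

[Unit]
Description=myapp Erlang release
After=network.target

[Service]
Type=simple
User=myapp
# "foreground" keeps the release attached so systemd can supervise it
ExecStart=/opt/myapp/bin/myapp foreground
Restart=on-failure

[Install]
WantedBy=multi-user.target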

We see demand from such clients for “no bugs please, no updates please, only patches”. Nice, but too expensive =)

The “devops” culture is different and is more about Docker, Kubernetes, etc. It has a steep learning curve, but it is emerging and becomes cheaper at scale. From our point of view, it means another approach to reading configuration (for example, reading environment variables rather than a config file; see the sketch below), JSON logs to standard output, immutable local drives, etc. It is easier to develop for this approach, but harder to “just start for tests”.
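
To illustrate the configuration difference, a minimal sketch (the module and names are made up) of the env-var-first approach: prefer an environment variable, fall back to the application environment from the config file.

-module(envcfg).
-export([get/2]).

%% e.g. envcfg:get(port, 8080) checks the PORT environment variable
%% first, then falls back to {port, ...} in myapp's config file entry.
%% Note: values read from the environment arrive as strings.
get(Key, Default) ->
    case os:getenv(string:uppercase(atom_to_list(Key))) of
        false -> application:get_env(myapp, Key, Default);
        Value -> Value
    end.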

1 Like

rebar3 with our own patches, because we need to reduce minutes in our CI pipeline

Would love to know more about this.

1 Like

Same :slight_smile:

1 Like
  1. ASN1 compiler

  2. backported port compiler

  3. Our target for the CI pipeline is 10 minutes, but we cannot reach it for now. Usually it is about 13-17 minutes (17 if tests are failing).

Compilation takes about 70-80 seconds (on a Ryzen). We cannot afford several compilations, so we run rebar3 as prod compile only once. This adds some ambiguities with the priv and test dirs, because our prod and test compilation is the same.

In the production build, the priv, mibs, and include dirs are copied into _build, not symlinked.

Minor thing, but important.

There were some other patches related to our license protection system, but they are no longer relevant.

2 Likes

But what are the patches related to compilation speed doing?

Curious if you can also give some numbers on the number of modules you have in your project (not deps, since dependencies are only compiled once and are always compiled in prod profile).

1 Like

The main speedup patch makes a single compilation serve both production and tests. We cannot afford two compilations. There are some minor fixes that properly prepare the test and priv directories for this approach.

$ find apps lib  -name '*.erl' -exec cat {} \;| wc -l
  573153
$ find apps lib  -name '*.erl'   | wc -l
    1391
1 Like

Ah, gotcha.

The numbers seem way off for only 1391 modules, though. Unless there are a lot of parse transforms or something like that complicating the compilation?

It definitely isn’t recompiling dependencies on each run? Or are all deps vendored into the apps and lib dirs?

1 Like

We run our CI in Docker and recompile the apps with vendored libs on each run.

It may be possible to do it like in the JavaScript world: copy rebar.config and rebar.lock, build the deps (that layer will be cached), then copy the rest of the files and compile them. But there are not that many dependencies.
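
A hedged Dockerfile sketch of that layering (the image tag, paths, and profile are illustrative):

FROM erlang:26 AS build
WORKDIR /src

# Copy only the files that determine the dependency set, so this layer
# stays cached until rebar.config or rebar.lock changes.
COPY rebar.config rebar.lock ./
RUN rebar3 compile

# Only the layers below are rebuilt when source files change.
COPY . .
RUN rebar3 as prod compile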

We have only one global parse_transform: our internal event system, which is being replaced by the new logger infrastructure right now; it is almost a copy of lager_transform. Should I take a closer look at parse_transform speed? Should 1400 modules compile in under a minute?
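
For context on why a global parse_transform touches every build: a parse transform is just a module the compiler invokes with each file’s abstract forms, so it has to be compiled and loaded before anything that uses it. A no-op skeleton (a sketch, not our actual transform) looks like:

-module(noop_transform).
-export([parse_transform/2]).

%% The compiler calls this for every module compiled with
%% {parse_transform, noop_transform}; we return the forms unchanged.
parse_transform(Forms, _Options) ->
    Forms.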

1 Like

The Erlang compiler got a bit slower as it started doing more and more optimization passes, and the way we profiled things, the best way to optimize was to compile as rarely as possible. We oriented most of our target workflow around interactive development: the developer builds and runs things repeatedly in a dev environment rather than in CI runs.

There are a few useful things to know:

  • Dependencies are fetched and built once. A quick analysis phase runs after verifying dependencies to check whether they have complete .app files with the correct .beam modules. If they do, dependency compilation and analysis are bypassed.
  • The compiler respects this two-phase process (deps first, then all top-level apps) and has an optional switch (--deps_only) that tells it to build the deps and ignore top-level apps.
  • The apps are compiled after the deps with all the relevant paths loaded and put in place
  • Each compilation phase (either top-level apps or deps) contains a Directed Acyclic Graph (DAG) that analyses all module declarations for include files, behaviours, and parse transforms, in order to properly know which modules must be compiled first, and which ones may be compiled later and in parallel.
    The first build has to create that graph and analyze all modules.
  • Artifacts (beam files and other ones for alternative compilers) are added to the DAG.
  • On a follow-up run, a quick prune/analysis phase checks all source files’ timestamps against what is in the DAG. If any file changed, we propagate the change to all dependent files (e.g. if you change an include file used by 10% of modules, we know to rebuild only that 10%; see the toy sketch after this list).
    This does mean that changing a parse transform used by all modules forces a whole recompilation every time, but for most iterative development scenarios people touch 2-3 files at a time and then rebuild.
  • The build artifacts in the DAG are compared to their source files (the artifacts’ edges are labelled as artifact edges) to know which ones are outdated, and to force a re-build. The build artifacts also store the list of relevant options that were used when building them so we know a change in compiler options must re-trigger a build.
  • When using profiles, the build artifacts are reused and the deps that are shared are reused, but a fresh DAG is created because different deps and profiles can have different options and demand rebuilds.
  • The compiler version itself is also stored in the DAG metadata, so if you switch Erlang versions in your dev environment, we trigger a rebuild. Some other options, like ENV-defined compiler options (ERL_COMPILER_OPTIONS), will force a rebuild every time, since we can’t easily track and parse them.
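
As a toy illustration of the pruning idea (not rebar3’s actual code), Erlang’s digraph module can answer “what depends, directly or transitively, on this changed file”:

-module(dag_demo).
-export([dirty/2]).

%% Edges are {Dependency, Dependent} pairs, e.g. {"hdr.hrl", "a.erl"}.
%% Returns the changed file plus everything that must be rebuilt.
dirty(Edges, ChangedFile) ->
    G = digraph:new([acyclic]),
    [begin
         digraph:add_vertex(G, From),
         digraph:add_vertex(G, To),
         digraph:add_edge(G, From, To)
     end || {From, To} <- Edges],
    Dirty = digraph_utils:reachable([ChangedFile], G),
    digraph:delete(G),
    Dirty.

For example, dag_demo:dirty([{"app.hrl", "a.erl"}, {"app.hrl", "b.erl"}, {"c.erl", "c.beam"}], "app.hrl") returns the header plus a.erl and b.erl, leaving c.erl alone.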

This tends to mean that yeah, the CI runs are gonna be more expensive. However, in a containerized situation there is a benefit to first fetching all the deps, building them with rebar3 do compile --deps_only, and then putting that in a layer. Doing it first with both profiles you use (e.g. prod and test) will allow that layer to be reused by later phases.
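
Concretely, that prebuild step might look like this (a sketch; the profile names depend on your project):

rebar3 as prod do compile --deps_only
rebar3 as test do compile --deps_only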

If you only ever do a single build and the compiler version does not change, the DAG still has to be built, because we don’t really have a way to skip it if we want to properly order compiler phases; in that case some of its work is useless. That’s front-loading for later calls that will never come.

I’m not 100% sure 1400 modules should compile in under a minute. If you build a project with DEBUG=1 rebar3 compile and look at the output, you can see the different phases:

===> Running provider: app_discovery
    %% ^ this is scanning the project for its structure
===> Running provider: install_deps
===> Verifying dependencies...
    %% ^ this is where we fetch dependencies
...
===> Running provider: lock
    %% ^ lock file is generated or validated for any change in the previous fetch
===> Running provider: compile
    %% this is the section we care about
===> Compile (apps)
    %% This is the building of the dependencies
    ...
===> Compile (project_apps)
    %% And now we build your own top-level apps (or vendored deps and _checkouts)
===> Analyzing applications...
    %% ^  All top-level apps are analyzed in a single large phase that
    %%     is shared across all of them. This lets us set an app order
    %%     in case of cross-app dependencies.
===> Compiling <appname>
===> compile options: {erl_opts, [...]}.
===> files to analyze [...]
    %% ^ this big list lets you know what files have been found and will be 
    %%   part of the DAG for this app, and whose build artifacts will be checked.
===>      Compiled <module>
===>      Compiled <module>
    %% ^ each module that needed to be rebuilt is noted here. The first modules
    %%   are compiled sequentially (parse transforms), but everything else after
    %%   is built in parallel. 
Running hooks for ...
   %% ^ that means compilation is done, we're now generating app files and running
   %%   your custom hooks
...

This should let you figure out where a lot of the time is spent (though it’d be nicer if we output timestamps for each phase). DIAGNOSTIC=1 rebar3 compile gives an even more detailed view.

The thing that may come out of this is that actually, yeah, rebuilding 1400 modules may take quite a while. Here are the results for Rebar3 itself, all deps ignored:

→ rebar3 compile
...
→ ls _build/default/lib/rebar/src/*.erl | wc -l
89
→ rm -rf _build/default/lib/rebar/ebin/*.beam
→ time rebar3 compile
===> Verifying dependencies...
===> Analyzing applications...
...
===> Compiling rebar
rebar3 compile  10.88s user 1.43s system 169% cpu 7.254 total
→ time rebar3 compile
===> Verifying dependencies...
===> Analyzing applications...
...
===> Compiling rebar
rebar3 compile  2.86s user 0.58s system 152% cpu 2.257 total
→ rm _build/default/lib/rebar/ebin/rebar_prv_help.beam
→ time rebar3 compile
...
rebar3 compile  3.17s user 0.69s system 146% cpu 2.638 total

What this shows is that the first build (where everything but the one rebar3 app is pre-built) takes ~11s. The second build, which is a no-op, takes slightly under 3s, and rebuilding just 1 module takes slightly above 3 seconds.

The overhead of rebuilding a single module seems to be ~0.3 seconds. For comparison’s sake:

→ ERL_LIBS=_build/default/lib
→ time erlc
erlc  0.21s user 0.03s system 113% cpu 0.214 total
→ time erlc apps/rebar/src/rebar_prv_help.erl
erlc apps/rebar/src/rebar_prv_help.erl  0.51s user 0.08s system 113% cpu 0.519 total

This too seems to show that the overhead of compiling a single module is ~0.3 seconds, directly with erlc.

So 1400 small modules (<90 lines for rebar_prv_help) ought to take ~420 seconds if done sequentially, giving us ~7 minutes. Adding cores should divide this almost linearly since most modules don’t tend to represent a very deep chain of direct inclusion dependencies.

Concurrency can help with that, but this should at least help figure out how much of the time is spent in the analysis, and how much is spent in the actual compiler.

TL;DR: Rebar3 does make the decision of adding work on the first build to skip it later, because we optimize for the iterative development experience and want to compile the least amount possible. CI may be slower, but generally I wouldn’t be surprised if most of the time is actually spent in the compiler. Caching compiled phases for deps may yield decent improvements by saving most of that time across various builds.

3 Likes

OK, I will try to separate the compilation of our bundled libs (they have not changed in years) from that of our apps.

1 Like