What's the best way to call erlc?

lpil · December 17, 2021, 7:50pm

Hello!

I’m working on a build tool for Gleam and Erlang. One of the things that it does is call erlc on a bunch of .erl files in order to make .beam files.

Currently I’m just using erlc -server -o ebin src/*.erl. When I watch htop it seems to only use a single core, which surprised me.

What’s the recommended way to run erlc for the best performance? I’d really love to make full use of all the cores of my machine.

Thanks,
Louis

vkatsuba · December 17, 2021, 8:59pm

What if you used an escript to collect and compile the files (as rebar3 does)? I suppose that would be very flexible. If you would rather use a Makefile, take a look at erlang.mk, specifically erlang.mk#core/erlc.mk.

lpil · December 17, 2021, 9:05pm

I could write a multicore program that uses the compile module, but I am surprised that it would be required. It is such a normal thing for Erlang programs to be multicore that I didn’t consider for a second that erlc wouldn’t be, not even with a flag or such.

vkatsuba · December 17, 2021, 9:16pm

What about the JIT in Erlang (“The Road to the JIT - Erlang/OTP”)? Have you compared performance with OTP 24?

lpil · December 17, 2021, 9:47pm

The JIT is fab, but here I’m looking to discover if there’s a convenient API for compiling Erlang to BEAM files in parallel rather than sequentially.

starbelly · December 18, 2021, 12:28am

I just tried this, but observed core usage go up on all cores…

I made a dir with 10,000 Erlang source files (just module declarations).

lpil · December 18, 2021, 1:47am

Could you share what you did precisely? I just tested again with 10,000 modules with just a declaration and I got a single erlc process using less than 1 CPU core, spread across a few cores due to the OS scheduler bouncing it around a little. It didn’t manage to max out 1, let alone the other 7.

louis ~/Desktop/test $ for i in (seq 1 10000)
                             echo "-module(mod$i)." >> src/mod$i.erl
                       end
louis ~/Desktop/test $ erlc -o ebin -server src/*.erl &
louis ~/Desktop/test $ ps -A -o %cpu | awk '{s+=$1} END {print s "%"}'
185.3%

Here you can see erlc + all the other programs I’m currently running are using less than 2 cores fully. I can tell you from htop that erlc isn’t even the majority of that.

starbelly · December 18, 2021, 4:28am

Yeah, that’s precisely what I did, but I just eyeballed things tbh. As for the ps + awk command, I saw as much as 300%; before running it I was at about 100% for the processes already running. Maybe my eyeballing was deceived by what you’re talking about, with the OS scheduler bouncing the process around.

If I’m tracing the code correctly, I don’t think you’re going to get what you want right now. It looks like erlc with the -server option fires up the compile server and passes the arguments to it, and the compile server in turn simply calls erl_compile:compile/2. That seems to compile one file at a time. There is no attempt to batch files (num_files / num_cores), nor any spawning, at that point.

That’s if I did my tracing correctly (and I’m rushing right now tbh). Of course, if I am right, it looks like wedging in what you’re after wouldn’t be a big PR. However, I have to wonder why that hasn’t been done yet.

Edit: Of course as you know, dependencies between modules would have to be taken into account with such a PR.

bjorng · December 18, 2021, 4:53am

The compile server will not help when used like this. All files given on the command line to one invocation of erlc will be compiled sequentially.

erlc was written a long time ago, long before OTP R11 that introduced SMP support. Without SMP support, running compilation jobs in parallel would only be slower. Also, erlc is often used from within a Makefile, meaning that there would be one invocation of erlc for each file. Adding support for compiling all files given on the command line in parallel would not have improved the usual way erlc was used.
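Since each erlc invocation works through its files sequentially, one way to use more cores is to run several invocations side by side. Below is a minimal sketch of that idea, not an official recipe. It assumes an xargs that supports -0 and -P (GNU and BSD both do, but neither flag is POSIX), that the source directory contains at least one .erl file, and that the modules do not depend on each other at compile time. The function name compile_parallel, the ERLC override variable, and the batch size of 25 are made up for illustration.

```shell
#!/bin/sh
# compile_parallel SRC_DIR OUT_DIR [JOBS]
# Splits the .erl files under SRC_DIR into batches and runs up to JOBS
# erlc processes at the same time, each compiling one batch into OUT_DIR.
# ERLC is a hypothetical override for the compiler command (default: erlc).
compile_parallel() {
    src_dir=$1
    out_dir=$2
    jobs=${3:-4}
    mkdir -p "$out_dir"
    # -print0 / -0 keep file names with spaces intact.
    # -n 25 hands at most 25 files to each erlc invocation.
    # -P runs that many invocations concurrently.
    find "$src_dir" -name '*.erl' -print0 |
        xargs -0 -n 25 -P "$jobs" "${ERLC:-erlc}" -o "$out_dir"
}
```

Usage would look like `compile_parallel src ebin 8`. The -server flag is left out here because every invocation is its own OS process anyway.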

Yes, dependencies between modules would have to be taken into account. Consider:

erlc -pa . some_parse_transform.erl some_module_using_the_parse_transform.erl
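In that command the parse transform has to be compiled, and be on the code path, before the module that uses it. A parallel build could respect that by compiling such prerequisites first and only then fanning out. The following is a rough sketch under the same xargs assumptions (-0 and -P available); compile_in_phases, the "--" separator convention, and the ERLC override variable are hypothetical names, and the caller is assumed to know which files must go first.

```shell
#!/bin/sh
# compile_in_phases OUT_DIR JOBS PREREQ.erl... -- REST.erl...
# Phase 1 compiles the files before "--" one at a time, in the order given.
# Phase 2 compiles the files after "--" in parallel, with OUT_DIR added to
# the code path (-pa) so the freshly built parse transforms can be loaded.
# ERLC is a hypothetical override for the compiler command (default: erlc).
compile_in_phases() {
    out_dir=$1
    jobs=$2
    shift 2
    mkdir -p "$out_dir"

    # Phase 1: sequential, stop at the first failure.
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
        "${ERLC:-erlc}" -o "$out_dir" "$1" || return 1
        shift
    done

    # Drop the "--" separator if it is present.
    if [ "$#" -gt 0 ]; then
        shift
    fi

    # Nothing left to compile in parallel.
    if [ "$#" -eq 0 ]; then
        return 0
    fi

    # Phase 2: one erlc process per file, at most JOBS at a time.
    printf '%s\0' "$@" |
        xargs -0 -n 1 -P "$jobs" "${ERLC:-erlc}" -pa "$out_dir" -o "$out_dir"
}
```

For the example above this would be called as `compile_in_phases ebin 8 some_parse_transform.erl -- some_module_using_the_parse_transform.erl`.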

lpil · December 18, 2021, 11:47am

Wonderful, thanks everyone. Now that I know that I’m not missing anything I can figure something out here that will do the job.

starbelly · December 18, 2021, 3:53pm

As a side note: Elixir’s parallel compiler is an interesting read. There was also a discussion with folks such as @jhogberg and @MononcQc around a next-generation Erlang compiler. I can’t find that conversation right now.

As another side note, despite erlc compiling one file at a time (by default), man, it is fast.

lpil · December 18, 2021, 7:19pm

It really is! It took me some time to realise it was single threaded as it didn’t feel slow at all.

josevalim · December 19, 2021, 2:35pm

Maybe the best route here is to have a small escript that will compile the files in parallel for you?

lpil · December 19, 2021, 2:56pm

Aye, I think that is the way to go.

tsloughter · December 20, 2021, 11:46pm

Definitely talk to @jhogberg about what the future holds (the hopefully near future!).

jhogberg · December 21, 2021, 10:55am

We had a discussion in the Build Tools and Packaging WG earlier this year, but I haven’t been able to work much on it lately. I hope to revisit it for OTP 26.