How much knowledge of C is required to study the Erlang compiler and VM in depth?

Goal: I want to dissect and study Erlang’s compiler and BEAM VM in depth.

  • As far as I know, its low-level parts are written in C.
  • I am a beginner in both Erlang and C.

My question: How much C do I need to learn to get started on this goal?
Format: Please provide an index or syllabus for C from start to end.

Idk about the Erlang VM but you can never go wrong with beeg

Have you seen The BEAM Book: Understanding the Erlang Runtime System? It’s a good source on the internals, and it may help you refine your information gathering.

As I understand it, the Erlang compiler is mostly in Erlang, so you won’t need much C knowledge for that… but you’ll probably want to study the loader and/or beamjit at the same time, so you’ll need C for that. Most of the C code in OTP is pretty easy to follow. It’s in that place where most things look pretty simple and obvious… but it’s only possible because of a lot of hard work.

Meta: You might want to focus your goal a bit more, though. Understanding BEAM and the Erlang compiler is a great thing, but what will you do with that understanding? I grew to understand BEAM in order to better operate my service built upon BEAM… I needed to understand and avoid or fix bottlenecks, and sometimes to debug complex interactions between and within nodes. The understanding was critical, but it was the means to an end, not the end itself.

The BEAM book is somewhat dated, and depending on your actual goal (see XY problem - Wikipedia) it might not be the best way to learn what you want. It assumes that you already know a fair bit of low-level programming, which implies that you already know enough C or could learn to read it in an afternoon since none of the concepts would be new.

What’s your background?

Such a question sounds to me like “How much physics do I need to understand nuclear fission and nuclear power plants?”

I would say instead that you will learn C by reading Erlang’s C code.

Way back when, people did a lot of programming in assembly code, and you needed to
know the instruction-level architecture of your computer pretty well.
Then people said “you know, high level languages like Algol are a lot easier to
write than assembler, but we still need to get really close to the machine.”
And so were invented Machine-Oriented-High-Level-Languages, like PL-360, BLISS,
PL-11, and a bunch of others. (I wrote a fair chunk of code in BLISS-10.) And
then along came C. It got ported between several different byte-addressed
machines, and before too long, it became that paradox: a portable MOHLL.
Becoming a good C programmer was pretty easy if you understood programming
some machine at the low level. It could be the IBM z/Series. It could be
the x86-64 family. It could be ARM or RISC-V. It could even be SPARC.
Because while C has had more and more application-oriented features grafted
on, and while it hides a lot of the machine architecture (like the number and
character of the registers, if any, or how procedure calls work, or whether
the machine has conditional moves or fused multiply-add), it does reveal,
very clearly, an assembly-level view of memory. It’s not an architecture-
neutral view. It isn’t the way the B6700/A-Mode/E-mode series from Unisys
views memory. It isn’t the way that several other machines like the Prime
400/500 viewed memory. It’s certainly not the way that the 80286 running OS/2
viewed memory, although C was bent into service there with architecture-
specific extensions.

So the foundation of understanding C is understanding what memory model C
presupposes.
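
To make that concrete, here is a minimal sketch of the view of memory C exposes: every object is a run of addressable bytes, and a pointer is an address you can do arithmetic on. The function names (byte_at, is_little_endian, byte_offset_of_element) are made up for this illustration, and the sketch assumes the usual 8-bit byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the i-th byte of a 32-bit value as it sits in memory,
   counting from the lowest address. */
unsigned char byte_at(uint32_t value, size_t i) {
    unsigned char bytes[sizeof value];
    assert(i < sizeof value);
    memcpy(bytes, &value, sizeof value);  /* copy the raw object representation */
    return bytes[i];
}

/* 1 if the least significant byte is stored at the lowest address. */
int is_little_endian(void) {
    return byte_at(1u, 0) == 1;
}

/* Byte distance from element 0 to element i of an int32_t array,
   computed with plain address arithmetic on byte pointers. */
size_t byte_offset_of_element(size_t i) {
    int32_t a[8] = {0};
    assert(i < 8);
    return (size_t)((unsigned char *)&a[i] - (unsigned char *)&a[0]);
}
```

Which byte byte_at(0x12345678u, 0) returns depends on the machine's byte order, which is exactly the kind of detail C lets show through. The offset of element 3 is 3 * sizeof(int32_t) bytes, because array elements are laid out back to back.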

Oddly enough, that’s also the foundation of understanding the BEAM.
Erlang does not presuppose the memory model that C does. The BEAM’s
fundamental task is to represent Erlang data structures in memory in a
way that makes sense to C. The next task of the BEAM is to operate
on those data structures, mapping Erlang control structures to
imaginary instructions.
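
As a rough illustration of "representing Erlang data in a way that makes sense to C", here is a sketch of tagged machine words: a small integer lives directly in the word, while a list cell is a pointer with a tag in its low bits. This is a deliberately simplified scheme and not the BEAM's actual layout; term_t, cons_cell, the TAG_* values, NIL, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t term_t;            /* hypothetical: one machine word per term */

enum { TAG_BITS = 2 };               /* low two bits say what the word holds */
#define TAG_MASK      ((term_t)3)
#define TAG_SMALL_INT ((term_t)1)    /* the integer itself is in the upper bits */
#define TAG_CONS      ((term_t)2)    /* the upper bits are a pointer to a cell */
#define NIL           ((term_t)3)    /* the empty list, an immediate value */

typedef struct { term_t head; term_t tail; } cons_cell;

term_t make_small(intptr_t n) { return ((term_t)n << TAG_BITS) | TAG_SMALL_INT; }
int is_small(term_t t) { return (t & TAG_MASK) == TAG_SMALL_INT; }

intptr_t small_value(term_t t) {
    assert(is_small(t));
    /* Assumes the compiler shifts signed values arithmetically,
       as mainstream compilers do; the C standard leaves it open. */
    return (intptr_t)t >> TAG_BITS;
}

term_t make_cons(cons_cell *cell) {
    uintptr_t addr = (uintptr_t)cell;
    assert((addr & TAG_MASK) == 0);  /* alignment leaves the low bits free */
    return addr | TAG_CONS;
}
int is_cons(term_t t) { return (t & TAG_MASK) == TAG_CONS; }

cons_cell *cons_ptr(term_t t) {
    assert(is_cons(t));
    return (cons_cell *)(t - TAG_CONS);
}

/* Sum a list of small integers by following tail pointers until NIL. */
intptr_t list_sum(term_t list) {
    intptr_t sum = 0;
    while (is_cons(list)) {
        cons_cell *cell = cons_ptr(list);
        sum += small_value(cell->head);
        list = cell->tail;
    }
    return sum;
}

/* Build the list [1, 2, 3] from stack-allocated cells and sum it. */
intptr_t sum_demo_list(void) {
    cons_cell c3 = { make_small(3), NIL };
    cons_cell c2 = { make_small(2), make_cons(&c3) };
    cons_cell c1 = { make_small(1), make_cons(&c2) };
    return list_sum(make_cons(&c1));
}
```

The point is only that a dynamically typed value has to carry its type with it, and that C's view of memory (words, addresses, bit masks) is what makes this kind of encoding expressible.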

Understanding the BEAM is straightforward once you understand C,
Erlang, and VMs in general.

So I suggest starting with this tutorial about writing VMs:
Write your Own Virtual Machine
That builds a much much simpler VM than the BEAM, using C,
so it will help you with understanding C and VMs.
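
To give a flavour of what "a VM in C" means before you open that tutorial, here is a tiny stack machine sketch. It is not the tutorial's design and nothing like the BEAM's instruction set; the opcodes and the names run and run_demo are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction set: an opcode, optionally followed by an operand. */
enum opcode { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

enum { STACK_MAX = 64 };

/* Interpret `code` until OP_HALT and return the value on top of the stack. */
long run(const long *code) {
    long stack[STACK_MAX];
    size_t sp = 0;                   /* number of values currently on the stack */
    size_t pc = 0;                   /* index of the next instruction word */

    for (;;) {
        switch (code[pc++]) {        /* fetch and decode */
        case OP_PUSH:                /* operand follows the opcode */
            assert(sp < STACK_MAX);
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}

/* Evaluate (2 + 3) * 4 on the machine. */
long run_demo(void) {
    static const long program[] = {
        OP_PUSH, 2, OP_PUSH, 3, OP_ADD,
        OP_PUSH, 4, OP_MUL, OP_HALT
    };
    return run(program);
}
```

The fetch, decode, execute loop in run is the part that carries over to bigger VMs; everything else (registers versus a stack, how instructions are encoded, how dispatch is made fast) is where real machines differ.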

After you’ve worked through that,
“Implementing Functional Languages: a tutorial” (PDF), by Simon Peyton Jones and
David Lester, would be good to read next.

As someone else has already said, though, the big question is
“why do you want to do this”? I think learning about C, Erlang,
and abstract machines is really neat, but then I thought it would
be neat to learn how to play the viola da gamba, and how many people
do that? Do you want to understand how to compile something else
to the BEAM? More power to you, but compiling to Core Erlang is
probably easier. Do you want to rewrite the BEAM in Rust? (It has
been done and could probably be revived.) Do you just want to be a
better Erlang programmer? Understanding how Erlang uses memory is
probably enough for that. What is your real goal?
