Thanks for the thorough reply.
Short version: Your reply helped me get a better understanding than before, but I still have some unanswered questions. I think that I understand why a value list makes sense now. What is the purpose of an expression list?
Long version:
FYI: I wanted to put in links for everything, but since I am a new user I cannot add more than five links in a post. I have been forced to remove most of my references. Full version with links can be found here: Erlang Forum Response · GitHub
Intro
My initial post would have been very long if I had tried to explain everything I have done to understand Core Erlang, which is why I kept it brief.
Furthermore, I wanted to focus on Core Erlang rather than on what I did with it. But I realize now that more context is needed. Sadly, I am not sure how to keep it brief while explaining what I know and the source of my confusion about part of the Core Erlang syntax, so the long version is quite long.
My academic work is about gradual session types. We want to continue the work of Igarashi et al. by closing the gap between academic work and usable implementations. Our product will be a type system with an implemented type checker that supplements Dialyzer, as other tools already do today (eqWAlizer, Gradualizer, etc.).
My introduction to Erlang was Learn You Some Erlang for Great Good!, supplemented by Google and YouTube. As for Core Erlang: I read the Erlang Blog post you linked, Core Erlang by Example, including the follow-up, Core Erlang Wrap Up. Furthermore, several websites give some introduction to Core Erlang, e.g. 8thlight [dot] com and baha [dot] github [dot] io. I also read a bit on the Elixir forum to figure out where Core Erlang belongs in the compile process, since I did not fully trust the third-party websites' unreferenced descriptions of it. I have studied several research articles that use Core Erlang to learn more and, hopefully, to find references to more recent official documentation. The most official Core Erlang documentation I can find is consistently on the HIPE CERL research group website, with the latest version obviously outdated. Then I looked at Dialyzer, both the Dialyzer research group website and the Dialyzer source code, since I got hints that Dialyzer might use Core Erlang somehow. I have even explored a Haskell package that works with Core Erlang.
Why Core Erlang?
I did not know about Erlang before I started working on this research project last summer (~August 2023). I have found the stricter nature of Core Erlang to be an advantage when building an analysis tool, which is why I have chosen to move on with Core Erlang rather than Erlang or the Erlang Abstract Format. The Erlang Abstract Format “just” represents Erlang code as tagged tuples; it does not reduce the scope of the language. The sheer number of possible tagged tuples and their uses is overwhelming.
I have already seen EEP 52. There is also EEP 43. Core Erlang changing is an acceptable risk, especially as I think the stricter construction makes static analysis easier. I would have to spend significantly more time understanding the flow of data in Erlang, as its syntax is much less strict and has many ways to do similar things. Core Erlang is more verbose and less readable, but at the same time much simpler and more straightforward in general. I feel that I am close to fully understanding Core Erlang. A key part is to understand why expression lists are allowed almost everywhere in the syntax.
Expression lists
I did not call <…> value lists, as the same syntactical construct can be used to contain expressions (refer to Core Erlang 1.0.3 Appendix A.7, see exprs). This is probably what put me off the most with regard to understanding the syntax of Core Erlang.
Both the Core Erlang spec and the Core Erlang parser make it syntactically possible to write an expression list. Expression lists are syntactically valid almost everywhere in Core Erlang (since a normal expression can always become an expression list), but it seems to me that their use is very limited. I only remember seeing expression lists being used like patterns, i.e. variables and/or values mixed in a single list.
You could put a full expression in the head of the case ... of if you wanted:
module 'test' ['foo'/1]
attributes []
'foo'/1 =
fun (_0) -> case <let X = 5 in X,42> of <A,B> when 'true' -> B end
end
However, even though expression lists are allowed almost everywhere in the syntax, there are several cases where I am not sure how it would make sense to have an expression list. E.g. in apply the local function name can be an expression list according to the syntax. For example:
module 'test' ['foo'/1]
attributes []
'foo'/1 =
fun (_0) -> let <Q,S> = <fun() -> 42,fun() -> 43> in apply Q ()
end
But when I use <Q> instead of just Q in apply, I get an illegal expression in foo/1:
module 'test' ['foo'/1]
attributes []
'foo'/1 =
fun (_0) -> let <Q,S> = <fun() -> 42,fun() -> 43> in apply <Q> ()
end
Similarly, I do not understand why the syntax of the guard in a case ... of clause allows an expression list.
module 'test' ['foo'/1]
attributes []
'foo'/1 =
fun (_0) -> case <_0> of <X> when 'true' -> X end
end
This is accepted, but yet again, if I put 'true' in an expression list, it is no longer OK and fails with illegal guard expression in foo/1:
module 'test' ['foo'/1]
attributes []
'foo'/1 =
fun (_0) -> case <_0> of <X> when <'true'> -> X end
end
This is consistent with section 6.7 of Core Erlang 1.0.3:
If a clause guard evaluates to a value other than true or false, the behaviour is undefined
Why not limit the syntax to only support a single expression here? Why allow an expression list?
Yet another case where I do not fully understand the logic of where expression lists are allowed is the sequence do notation. It allows both elements of the sequence to be an expression list. The following code is accepted with a warning (since I do not use 42):
module 'test' ['foo'/1]
attributes []
'foo'/1 =
fun (_0) -> do 42 <_0>
end
However, if I wrap 42 in <> to make an expression list:
module 'test' ['foo'/1]
attributes []
'foo'/1 =
fun (_0) -> do <42> <_0>
end
I get an illegal expression error:
$ erlc test.core
test: illegal expression in foo/1
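For reference, the checks above can also be reproduced from the Erlang shell instead of through erlc. This is a minimal sketch, assuming the documented compile options from_core, binary and return; the module name core_try and the function check/1 are names of my own and not part of OTP:

```erlang
-module(core_try).
-export([check/1]).

%% Compile a Core Erlang file in memory (the binary option means no
%% .beam file is written) and return the compiler's result as a term
%% instead of printing it.
%% File is a path such as "test.core".
check(File) ->
    compile:file(File, [from_core, binary, return]).
```

Calling core_try:check("test.core") should give a tuple starting with ok for the accepted examples and one starting with error for the rejected ones. I have not pinned down the exact shape of the error terms, so I treat them as opaque here.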
Are expression lists allowed everywhere to keep the syntax simpler and defer the decision of where expression lists are used to a higher level in the compilation process? Or is there some other reason?
I’ve built my own Core Erlang parser, and I need to understand the meaning of <…> to ensure that my analysis of Core Erlang is correct. Until now the specifics were not important, but they have become important. I had hoped that I would understand why expression lists exist as I worked more with the other parts of Core Erlang. Since I still do not fully understand it, I am asking here in the hope of getting help from someone who knows the motivation behind the design of Core Erlang.
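To make concrete what my analysis has to decide, here is a minimal sketch using the cerl module from the compiler application. It treats <e1,...,en> as having degree n and every other expression as having degree 1. That second part is a simplification on my side (a let or case whose body is a value list would also yield several values), and the module name degree_sketch and its functions are hypothetical names of my own:

```erlang
-module(degree_sketch).
-export([degree/1, let_ok/1, demo/0]).

%% Degree of an expression in this sketch: a value list <e1,...,en>
%% has degree n; anything else is assumed to produce one value.
degree(Expr) ->
    case cerl:is_c_values(Expr) of
        true  -> cerl:values_arity(Expr);
        false -> 1
    end.

%% A let is considered well-formed here when the number of bound
%% variables equals the degree of its argument.
let_ok(Let) ->
    cerl:let_arity(Let) =:= degree(cerl:let_arg(Let)).

%% Builds the abstract form of: let <Q,S> = <42,43> in Q
%% and returns the degree of the argument and the check result.
demo() ->
    Arg = cerl:c_values([cerl:c_int(42), cerl:c_int(43)]),
    Let = cerl:c_let([cerl:c_var('Q'), cerl:c_var('S')], Arg, cerl:c_var('Q')),
    {degree(Arg), let_ok(Let)}.
```

What I cannot derive from the spec is which syntactic positions should accept a degree other than 1, which is the core of my question.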
Best internal format?
You say it is “better” to use the Erlang Abstract format. After I got the hint about to_core0 and dcore, I started to read the compile.erl file (as I found them here).
There are two aspects here. One is stability, where the comments in the source code suggest the two formats are equal: abstr and core. The other is documentation, where compile.erl clearly states that Core Erlang is not documented.
If I understand the original motivation for the Core Erlang format correctly, as documented in the Core Erlang specification version 1.0.3, Core Erlang is clearly meant to be usable externally, and even to be edited by hand:
During its evolution, the syntax of Erlang has become somewhat complicated, making it difficult to develop programs that operate on the source. Such programs might be new parsers, optimisers that transform source code, and various instrumentations on the source code, for example, profilers and debuggers
I started to work with Core Erlang for the very reasons that motivated the creation of Core Erlang in the first place. I want a simpler way of working with Erlang for analysis with my tool.
The comments in the source code contradict these goals (e.g. lack of documentation). When did Erlang leave these goals behind? Or maybe Erlang never adopted them in the first place?
Maybe it is for the same reason that the Core Erlang specification never got updated after EEP 43 and EEP 52. Furthermore, I wonder why receive wasn’t completely removed if the format is considered to be internal only. I got the impression from EEP 52 that receive was kept for backward compatibility reasons.
I feel that I am very close to fully understanding Core Erlang. Thank you for your time.