Is list comprehensions tail recursive?

Hi,
I found such description for the list comprehensions in system documentation.
List Handling — Erlang System Documentation v27.0.1

'lc^0'([E|Tail], Expr) ->
    [Expr(E)|'lc^0'(Tail, Expr)];
'lc^0'([], _Expr) -> [].

This code seems to NOT be tail recursive. This can by only simplified pseudo code for documentation purpose. But can you clarify, list comprehensions would resolve as tail recursive code or not?

Regards,
Pawel

2 Likes

The question is covered here in the Efficiency Guide:
https://www.erlang.org/doc/system/listhandling.html#list-comprehensions

BR,
Ulf

5 Likes

Additionally, that same documentation page has a section comparing body- and tail-recursive functions for constructing lists: Recursive List Functions.

2 Likes

Note that there’s no reason list comprehensions HAVE to be compiled using body recursion. It’s technically possible to have a pure functional programming languages in which code like the example in the manual is tail recursive, the lc^0 being passed a hole to fill in. All the holes necessarily get filled in before the data structure becomes visible to any other computation. It’s just that the BEAM wasn’t designed to support that. Will/should the BEAM ever be changed to allow bihind-the-scenes hole passing? Cost too high, benefit positive but too low. (I have a paper design for this. It’s possibly on a floppy for a classic Mac…)

1 Like

You mean tail-mod-cons? That would be awesome.

2 Likes

That’s an excellent paper with some really solid work behind it.
However, the idea is not original. Here are filter/2 and map/2
expressed using a functional-programming preprocessor for Prolog.

filter(, P) = [].
filter([H|T], P) = [H|filter(T, P)] :- call(P, H), !.
filter([
|T], P) = filter(T, P).

map(, F) = .
map([H|T], F) = [call(F, H) | map(T, F)].

This was translated, back in, oh, 1990? to

filter(, _P, R) :- R = .

filter([H|T], P, R) :- call(P, H), !, R = [H|T2], filter(T, P, T2).
filter([_|T], P, R) :- filter(T, P, R).

map(, F, R) :- R = .
map(H|T], F, R) :- call(F, H, H2), R = [H2|T2], map(T, F, T2).

where R is the “hole”.

To the best of my recollection – it’s years since I had either logix or Parlog –
the logix implementation of flat concurrent Prolog and Parlog did the same thing.
Logix at least had to, since in FCP all call are tail calls. Indeed,
“hole passing” was a major technique in FCP. My recollection of Parlog is now
hazy, but I recall the management of holes being part of how Parlog did
synchronisation.

Doesn’t surprise me. Every good idea now was a paper 25 years ago. I know Erlang and Prolog are as similar today as a magpie and a Tyrannosaurus Rex, but what to your knowledge would make this optimisation particularly hard to implement in the Erlang toolchain?

(Also, I hate to be the one to point this out, but the formatter really did a number on your code. Could you please enclose it in ```?)

2 Likes

Two things come to mind, tracing and the fact that there today are no write barriers. There was an attempts some 10+ years ago to implement it, but it was abandoned for reasons I cannot recall. Maybe if you dig in the old patch/questions mailinglist archives you can find some reference to it.

:frowning:

Pardon my ignorance, but how would a lack of write barriers preclude this? Isn’t each process single-threaded? And re. tracing, the tail-recursor would be almost 1:1 to the original source. I see how these are problems, and I believe if you say so that they’re showstoppers, but I don’t see how.

As consolation, is this post still accurate? That is, does the call stack of a tail-mod-cons list generator have about the same memory footprint as the list it generates?

1 Like

None of the problems are complete showstoppers, but they add complexity. There may also be more that I don’t recall. It was a long time ago that I looked at this last time.

Yes, it should still be accurate.

2 Likes

… in turn based on a paper published 25 years earlier. Hello fellow Lorite :wave::stuck_out_tongue_winking_eye:

(@nzok I am aware that you disagreed with the concept (which is merely a narrative figure in an alternative-reality fiction novel, Anathem (which I happen to especially like :smile_cat:)) when @juhlig mentioned it in another thread, but I would argue that it really depends on what constitutes novelty and the frame one puts a “new” idea in. Applying a specific hand-held device like an iPad to a specific health-related use case like measuring pulse, yes, that has very likely never been thought of before. Applying some hand-held device like a club to some health-related use case like hitting an opponent over the head (admittedly that concerns not your own health, and not in a good way), no, people came up with that idea quite a while ago.)

Yeah, the formatter usually does a messy job on posts done by e-mail :face_with_diagonal_mouth: I kinda got used to it, but it would be great if this could be somewhat improved (@AstonJ). Particularly, I see no point in turning [] into :sweat_smile:

1 Like

Pardon my ignorance, but how would a lack of write barriers preclude this?

Hole-passing boils down to after-the-fact mutation of the data. I.e., you create a CONS cell, leave its CDR uninitialized (or rather you put a non-value marker in it), pass the reference to that CONS cell around, and later on some code fills in the CDR (like LISP’s rplacd or Scheme’s set-cdr!).

One problem with mutable data structures is that the mutations create wrong-way pointers: older data refers to younger data, which causes problems for generational garbage collectors. The write-barriers are a way to catch and record those side-effects for the benefit of the GC.

None of this is impossible to implement, but the mainstream Erlang VMs deliberately never did.

1 Like

I see the problem. However, I will note, if it makes a difference, that tail-mod-cons is not general hole-passing. The incomplete data is created and fully assembled entirely inside the generated DPS function; no other function ever sees it, and it emerges from the tail-recursion immutable.

I see the problem. However, I will note, if it makes a difference, that tail-mod-cons is not general hole-passing. The incomplete data is created and fully assembled entirely inside the generated DPS function; no other function ever sees it, and it emerges from the tail-recursion immutable.

It does not make a difference, since the computation is non-atomic and can be interrupted by garbage collection at (almost) any time.

I have been complaining about the auto-mangling of text for some time.

As far as I know, the barriers to “hole passing” in Erlang are the BEAM, and the BEAM. The BEAM, because it’s just not set up to do that. And the BEAM, because the garbage collector exploits the fact that pointers in a process’s stack all point one way, from (objects allocated at) later times to (objects allocated at) earlier times. Also, normal processing doesn’t require pointers into the middle of objects.
(Write barriers in the context of GC have nothing to do with concurrency. They’re about detecting references from old generations to new ones, something one normally wants to avoid.)

O’CAML puts the function program’s stack in the system stack so there was a problem with body recursion running out of memory. BEAM and JAM don’t do that, so don’t have nearly as much need for hole passing.

Logix needed it because body recursion was technically impossible: all the calls in the body of a clause executed concurrently and making things happen in sequence required explicit and not always comfortable programming. From the perspective of the 1980s/1990s work on (Flat) Concurrent Prolog, Guarded Horn Clauses, and Parlog, Erlang is very much a “mostly sequential” language. Which may be why Erlang has flourished and the concurrent logic languages have not.

1 Like