Are there any plans for something like Elixir stream for Erlang?

I mean an official OTP offering.

I’m guessing that Elixir has the concept of enumerables has something to do with it not yet existing?

1 Like

What does elixir stream do?

Lazily evaluated operations on enumerables.

4 Likes

Everything in programming should be about semantics.
So while ‘lazy’ is a nice piece of work, I think that it is important to note that it doesn’t offer an implementation of lazy lists.

eager evaluation : call by value
lazy evaluation : call by need
‘lazy’ module : call by name

This means that a problem solved using the ‘lazy’ module can take vastly more time than the same problem solved the same way using actual lazy lists.

Suppose we add a very old list function to lazy.erl.


head(G) ->
{H, _} = G(),
H.

Now let G be one of these ‘unfold’-based sequences.

X = head(G),
Y = head(G),
Z = head(G).

How many times is the value computed?
Under lazy (call-by-need) evaluation, ONCE.
In the ‘lazy’ module, THRICE.

Let’s try dot product.


dot(G1, G2) ->
dot_loop(G1, G2, 0).

dot_loop(G1, G2, Sum0) ->
case {G1(), G2()}
of {{H1,T1}, {H2,H2}} -> dot_loop(T1, T2, Sum0 + H1*H2)
; _ -> Sum0
end.

If we call dot(G, G), each element of G will be evaluated TWICE,
whereas lazy evaluation would evaluate it just once.

The point of lazy lists (or, as Pop-2 called them, ‘dynamic lists’),
is to delay evaluation of the elements and the spine of a list until
you know that the result matters, then compute whether-there-is-a-next-element-and-if-so-what-it-is once and only once.

I don’t know any way to implement lazy lists without mutable data structures somewhere (e.g., my Smalltalk library has analogues of Scheme’s (DELAY expr) and (FORCE delay) but the implementation relies essentially on being able to assign to the fields of an object. If I was doing this in Erlang I’d probably use the process dictionary like this:

% delay(fun () -> <expr> end) models Scheme's (DELAY <expr>).

delay(Fun) ->
Ref = make_ref(),
put(Ref, {false,Fun}),
Ref.

% force(Ref) models Scheme's (FORCE <ref>)
% and acts sensibly on things that are not delayed.

force(Ref) when is_reference(Ref) ->
case get(Ref)
of {true, Val} -> Val
; {false,Fun} -> Val = Fun(), put(Ref, {true,Val}), Val
end;
force(Val) ->
Val.

I’d hate myself, and spend a couple of paragraphs of comments explaining why I was sinning against the Erlang Way, but I honestly don’t know a simpler way to do lazy evaluation in Erlang. Of course lazy lists implemented on top of delay+force are going to hammer the process dictionary, but that’s what garbage collectors are for. The details of one way to build lazy lists (“streams”) on top of delay and force can be found in section 3.5 of Abelson, Sussman, and Sussman “Structure and Interpretation of Computer Programs”, 2nd edition. One big snag is that lazy lists implemented this way cannot be sent to another process.

What’s a good name for lists that don’t compute an element until it is needed, but then recompute it repeatedly? “Slothful lists”?

1 Like

Thanks :slight_smile: Having done this just for fun, I’m surprised to see that it suddenly seems to gain a lot of attention :sweat_smile:

That’s certainly the expectation when looking for lazy evaluation in general. I’m not sure what the topics author is expecting out of streams, but in elixir they exist to deal with enumerables, which might not fit in memory in the first place. For that lazy lists as they’re described by you don’t help. It doesn’t matter how lazily you pull data into memory if it eventually exceeds system resources.

Streams as implemented in elixir are an API to help composing operations and evaluating them on the enumerable per element and keeping only accumulated results around vs eagerly applying operations on the whole enumerable for each intermediate operation. That way you can apply computations e.g. over large files without them ever landing completely in memory.

2 Likes

Lazy lists are most definitely used to deal with data too large for memory. One expects in Haskell that something like

import Data.Char
main = getContents >>= (putStrLn . unlines . map (map toLower) . lines)

will execute in bounded space. Where did you get the idea that unreferenced data would fill up memory from?

(This should not be taken as pretending that the entire Haskell community is satisfied with the IO monad. There has been a lot of work around ‘conduits’ and ‘streams’ and ‘iteratees’ trying to find the right abstractions.)

Tbh I’ve no idea how haskell works and I don’t want to pretend I do. But you argued that lazy lists are evaluated only once and the only way “only once” can happen is if you retain what was evaluated in memory. If you lazily iterate over the contents of a file you cannot go back to the start of the file without reading from the file again.

1 Like