EEP 73: Zip generators

Hi all! Here’s my EEP and the PR for zip generators. The idea and syntax of zip generators (comprehension multigenerators) were first brought up in EEP-19 (written by @nzok), without an accompanying implementation. The syntax and usage of zip generators proposed by this EEP are mostly the same as in EEP-19, but Erlang’s comprehension language has undergone many changes since then. This EEP defines the behavior of zip generators and spells out in more detail what the compiler does.

EEP: Zip generator for comprehensions
PR: PR #8926 at OTP

Zip generators have the syntax of generator1 && ... && generatorN, where each generator can be a list, binary, or map generator.

Zip generators are evaluated as if their generators were zipped into a list of tuples. They can be mixed freely with all existing generators and filters.

Two examples to show how zip generators are used and evaluated:

1> [{X,Y} || X <- [a,b,c] && Y <- [1,2,3]].
[{a,1},{b,2},{c,3}]
2> << <<(X+Y)/integer>> || X <- [1,2,3] && <<Y>> <= <<1,1,1>>, X < 3 >>.
<<2,3>>
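
For comparison, here is a minimal sketch of how the two examples above can be written today without zip generators, relying on the "as if zipped into a list of tuples" reading. The module and function names are made up for illustration. Unlike a zip generator, this version builds an intermediate list of tuples first.

```erlang
-module(zip_equiv).
-export([pairs/0, sums/0]).

%% Same result as [{X,Y} || X <- [a,b,c] && Y <- [1,2,3]],
%% but lists:zip/2 materializes the list of tuples first.
pairs() ->
    [{X, Y} || {X, Y} <- lists:zip([a, b, c], [1, 2, 3])].

%% Same result as the binary example: the binary is converted to a
%% list of bytes so that it can be zipped with the list generator.
sums() ->
    << <<(X + Y)/integer>>
       || {X, Y} <- lists:zip([1, 2, 3], binary_to_list(<<1, 1, 1>>)),
          X < 3 >>.
```

Calling zip_equiv:pairs() and zip_equiv:sums() should give the same values as the two shell examples.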

Very nice!
But I can’t help but wonder if ; was considered as a syntax alternative to && - and if so why the current one was chosen.

I think both of them are a bit awkward, honestly - but at least ; is the “opposite” of , (used for permutation-comprehension) of sorts. In many languages && is the “opposite” of ||… and I don’t think it is a helpful mental model to think (or potentially be led to believe) that this && has any close relation with the || delimiter in a comprehension.

I vaguely remember that ; was brought up within OTP this year, but we decided to stay with &&, which was proposed by EEP 19. Unfortunately, I forgot the reasons both for and against ;. If someone else in OTP remembers, feel free to fill us in.

Readability-wise, I prefer && over ;. && looks more different to , than ; does. People who read/write code can’t mix them up easily. [{X,Y,Z} || X <- L1; Y <- L2, Z <- L3] takes more squinting than [{X,Y,Z} || X <- L1 && Y <- L2, Z <- L3] to realize which generators are zipped, which ones are not.

I could see how ; is the “opposite” of ,. On the other hand, && conveys the meaning of “and” in most languages, which is close to the idea that generators are zipped or combined together in a way. And since || is already used in comprehensions, and it never means “or” in Erlang, I think it’s reasonable to use && in comprehensions too.


Instead of ; or &&, which make the syntax look more and more like old Egyptian, why not just use the word zipped? I.e. [{X,Y,Z} || X <- L1 zipped Y <- L2, Z <- L3]. The && and ; will totally confuse most users, especially when they come from C, Python or just any other regular programming language.
Complex constructions like this aren’t used that often, and personally I would prefer not adding such new constructs. Also, I think many users will find it hard to get an expression like this right without too much effort, and are probably better off building a function using lists:zip and friends.

Frans

Good to hear it was considered. Let’s hope some more reasoning will surface - or good arguments one way or another.

I completely agree && is more logical as an operator for zipping - at least if there was no legacy.
But seeing as && is used elsewhere and || is already used, in the same construct at that, it feels messy. I do like it to be an operator; but certainly no harm in having an equivalent keyword; and coding styles can pick a preferred way.

It is probably a bad idea to make zipping outside generators its own thing, but doing so would probably settle the operator to use. So maybe as a thought experiment it could be good to consider. What do other languages use?
Edit: Haskell looks to be using |, which they also have instead of our ||… so could that be a thing to consider?

Another good point by @schnef is to be very careful with growing the language. It is wonderfully small and much more approachable than classic functional languages.

To aid in comparing some possible options:

[{X, Y} || X <- [1,2,3] && Y <- [a,b,c]].
[{X, Y} || X <- [1,2,3]; Y <- [a,b,c]].
[{X, Y} || X <- [1,2,3] zipped Y <- [a,b,c]].
[{X, Y} || X <- [1,2,3] || Y <- [a,b,c]].

; is not good since in erlang ; can be read as or,
i.e. between clauses and guards, that is written in some literature somewhere.

To me || confuses my old erlang brain, but that might be because I’m not used to it.

But comprehension in comprehensions will be hard for me to parse. Bad example:

    [ {X, Y} || X <- [ Fun(Foo) || Foo <- [a,b,c]] || Y <- [a,b,c]]

I wonder if there are even good real world use cases for this feature. I mean, EEP 19 and thereby the idea of zip generators has been out since 2008, yet I have never heard anyone asking for it.

For lists (list generators), there are certainly many cases where it could be useful, but then again we already have the lists:zip family of functions for that, which lately even gained the ability to work with lists of different lengths. And while they can only zip up 2 or 3 lists, in more than 10 years I never had the need to zip up more, at least I don’t remember any such case.

For binaries (binary generators), there may be some rare cases, but I can’t think of any. Why would you want to zip up two or more binaries, or binary/list combinations? Even so, and although there are no binary:zip functions, I would consider this such a rare case that it doesn’t justify a dedicated language construct.

For maps (map generators), this feature makes no practical sense at all. The iteration order of map generators is undefined, so one would have to use iterators with a defined order there. The “simple” orderings ordered and reversed use map-key order, which is probably also not all too useful here, so one would have to use a tailored ordering function. This definitely outweighs any convenience gained by using the zip generator functionality.

That all said, it seems to me that this feature only introduces something nice to have, implemented just because it can be done. And once it is in the language, it has to be maintained even if nobody uses it.


I also object to &&, it positively jumps out at me (that is, more than it should) and hurts my eyes :face_with_peeking_eye: The proposed alternative of ; OTOH does not jump out at me enough.

Seriously though, it is highly confusing for me, and probably more so for people coming from other languages. In all other languages I know, && means “boolean and”, and || is its sibling meaning “boolean or”. And yes, in the zip generator context here, && means something and-ish, also. However, || which we also have and to which && is bound to appear in close proximity in the code, means something entirely different: it separates the expression part from the generators and filters part of a comprehension. And while the || as of now could also be confusing to people coming from languages where it means “boolean or”, it can be justified by framing it as being closely related to the | used for consing; the && however can not be framed this way, there is no & to which it may allude.


Agree with pretty much all this, the thing that made me want to comment was that, as @dgud says, ; means orelse in guards, and that’s going to get confusing. To be consistent, if this does go ahead then maybe it could use , instead, as that’s used in guards as andalso?

, is already used for combining generators

> [{P,Q} || P <- [a,b,c], Q <- [1,2]].
[{a,1},{a,2},{b,1},{b,2},{c,1},{c,2}]

I, at least, have wanted this for 25 years. The current way of combining generators is mostly useless;
99% of the time I have only used it in tests.

But I have used lists:zip/2 many times to create a temporary list that is passed directly as an argument to one of the lists functions. Or, in most cases where I needed that functionality, I had to write my own recursive function that takes two lists, to avoid wasting heap space on temporary data.
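
As an illustration of that kind of hand-written recursion, here is a minimal sketch that walks two lists in lockstep without building a temporary list of tuples. The module name and the dot-product task are invented for the example and are not taken from any real code base.

```erlang
-module(zip_loop).
-export([dot/2]).

%% Dot product of two lists of numbers, traversed in lockstep.
dot(Xs, Ys) ->
    dot(Xs, Ys, 0).

%% No {X, Y} tuples are allocated; the accumulator carries the result.
dot([X | Xs], [Y | Ys], Acc) ->
    dot(Xs, Ys, Acc + X * Y);
dot([], [], Acc) ->
    Acc.
%% If the lists differ in length, no clause matches and the call fails.
```

With a zip generator, this kind of lockstep traversal could be expressed inside a comprehension instead of a separate helper function.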

I also want to combine two binaries:

 ComputeData = << <<(A*B+C):32>> ||
                   <<A:32, B:32>> <= Bin1 <ZIP_OP> <<C:32>> <= Bin2>>,

or image data

 RGBAImage = << <<R:8, G:8, B:8, A:8>> || 
                 <<R:8, G:8, B:8>> <= RGB <ZIP_OP> <<A:8>> <= Alpha >>, 
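
Without a zip operator, the image case has to be written as explicit recursion over both binaries. The following is only a sketch under that assumption; the module and function names are hypothetical.

```erlang
-module(rgba).
-export([merge/2]).

%% Interleave an RGB binary with an alpha binary, one pixel at a time.
merge(RGB, Alpha) ->
    merge(RGB, Alpha, <<>>).

merge(<<R:8, G:8, B:8, RestRGB/binary>>, <<A:8, RestAlpha/binary>>, Acc) ->
    merge(RestRGB, RestAlpha, <<Acc/binary, R:8, G:8, B:8, A:8>>);
merge(<<>>, <<>>, Acc) ->
    Acc.
%% Binaries with a mismatched number of pixels match no clause and fail.
```

For example, rgba:merge(<<1,2,3,4,5,6>>, <<255,128>>) gives <<1,2,3,255,4,5,6,128>>.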

You’re one of the guys behind Wings3D if I am not mistaken? Ok, granted then :wink:


About the zip operator, a quick thought that just came to me: Why not use some enclosing construct (like (...) or {...} or something) around the zip, instead of an infix operator in between? Like:

> [ {X, Y} || ( X <- [a, b, c], Y <- [1, 2, 3] ) ].
[{a, 1}, {b, 2}, {c, 3}]

IMO, this reads much better. It also has the benefit that syntax highlighters could show the matching parentheses/braces and thereby where the zip begins and ends, something which is not possible with infix operators.

Parentheses/braces are currently not allowed to be used that way in comprehensions, so there is no risk of breaking anything in existence.


Hmm, I like { }; I will bring that to the decision board as well.

It might not stand out as much as && (or another operator), which can be both positive and negative.

One argument for && is that it has the implicit meaning of and though:

[ {X, Y} || X <- [a, b, c] && Y <- [1, 2, 3] ].

Take X from [a,b,c] and take Y from [1,2,3].


Here are real-world use cases found in Erlang/OTP:

There were also many uses of lists:zip/2 in Dialyzer, which I didn’t rewrite because the changes would probably clash with another Dialyzer branch being actively worked on.


To be clear (I probably wasn’t before), I didn’t really question that there are cases where zip generators can be used, for lists there are probably many. I was questioning whether they are useful enough to justify a new language construct, or if we couldn’t get away with lists:zip.

I think this has been settled, you and @dgud convinced me that there are use cases enough :+1:

Another question I now have is if it would be good to have lists:zip functions that can likewise zip up an arbitrary number of lists (given as a list of lists), maybe called zipn or something. I think it can be done, but is it worth it?
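
To make the question concrete, here is one possible sketch of such a function. It is not an existing lists function; the name zipn comes from the suggestion above, and the module name and the choice to fail on lists of unequal length are assumptions.

```erlang
-module(zipn_sketch).
-export([zipn/1]).

%% Zip a list of lists into a list of tuples, one tuple per position.
zipn([]) ->
    [];
zipn(Lists) ->
    case lists:all(fun(L) -> L =:= [] end, Lists) of
        true ->
            [];
        false ->
            %% hd/1 and tl/1 fail on [], so lists of unequal
            %% length make the whole call fail.
            Heads = [hd(L) || L <- Lists],
            Tails = [tl(L) || L <- Lists],
            [list_to_tuple(Heads) | zipn(Tails)]
    end.
```

For example, zipn_sketch:zipn([[1,2],[a,b],[x,y]]) gives [{1,a,x},{2,b,y}].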


No, I don’t think it’s worth it.

I think it can be useful for solving a few puzzles from Advent of Code, but I don’t see that as sufficient reason to implement it.


I’m not as easily convinced, I think sticking to lists:zip is preferred over introducing new syntax.

Also in the light of EEP 70: Non-filtering generators this proposal needs a complete overhaul - so perhaps better to postpone the discussion until that is done!? I look forward to seeing all cases covered such that we can make an informed decision, here. I also assume there will be a variant of the zip generators that doesn’t crash (the filtering version!) if the lists are not of equal length!?


I don’t think we would add strict/relaxed variations to the zip operator itself (&& or whatever we settle on eventually). Zip generators will support strict generators in the future. The skipping behavior of a zip generator depends on the individual generators within it. I’ll add a section to the EEP about strict/relaxed generators within a zip generator soon.
If lists are of different lengths, a zip generator will always produce an error. This is to be consistent with lists:zip’s default behavior.
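
For reference, a small sketch of how the lists:zip functions treat lists of different lengths today, assuming OTP 26 or later, where lists:zip/3 accepts a mode argument. The module and function names are made up, and the exact error reason raised by lists:zip/2 is deliberately not relied on.

```erlang
-module(zip_modes).
-export([strict/2, trimmed/2, padded/2]).

%% Default behavior: lists:zip/2 fails when the lengths differ.
%% Any error is reported as length_mismatch in this sketch.
strict(Xs, Ys) ->
    try lists:zip(Xs, Ys) of
        Zipped -> {ok, Zipped}
    catch
        error:_ -> length_mismatch
    end.

%% trim: stop at the end of the shorter list.
trimmed(Xs, Ys) ->
    lists:zip(Xs, Ys, trim).

%% pad: fill the shorter list with the given defaults.
padded(Xs, Ys) ->
    lists:zip(Xs, Ys, {pad, {undefined, undefined}}).
```

A zip generator as described here would behave like strict/2, not like the trim or pad modes.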


Ok. I will probably not convince you, but let me mention that this is not merely a convenient syntax; it will also avoid building a temporary list when combined with filtering. (When not filtering, lists:zipwith/3 can be used to build the final list in one go.)
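
A short sketch of the two situations using today's functions; the module and function names are invented for illustration. The first function needs no temporary list, while the second has to zip first and filter afterwards.

```erlang
-module(zip_filter).
-export([sums/2, small_sums/2]).

%% No filtering: lists:zipwith/3 builds the final list directly.
sums(Xs, Ys) ->
    lists:zipwith(fun(X, Y) -> X + Y end, Xs, Ys).

%% With filtering: lists:zip/2 first builds a temporary list of
%% tuples, which the comprehension then filters and maps.
small_sums(Xs, Ys) ->
    [X + Y || {X, Y} <- lists:zip(Xs, Ys), X < 3].
```

With a zip generator, the second function could be written as a single comprehension over both lists, without the intermediate list of tuples.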

No. We don’t want to bloat the language :wink: by introducing more operators to support that use case. Seriously, we don’t know how to best express that, and it is also a much less common use case.

Also, we don’t want to obsolete all the good work by @Maria-12648430 and @juhlig on PR-6347: enable zip functions to work on lists of different lengths.
