EEP 70: Non-filtering generators

jhogberg · October 23, 2024, 7:46am

That was in the Erlang compiler, where we designed operation arguments around “relaxed” matches for brevity. They are much rarer in other applications.

That said, option 1 is viable as ?= already has largely the same meaning in maybe expressions. I don’t like option 2 because of the ingrained assumption that filters need to return true and = returns the expression. I’m not a fan of “visually overloading” operators.
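For readers who have not used maybe expressions, here is a minimal sketch of how ?= behaves there. The module and function names are made up for illustration. The -feature directive is assumed to be needed on OTP 25 and 26, with the feature enabled by default from OTP 27.

```erlang
-module(maybe_demo).
%% Assumption: required on OTP 25/26, harmless on later releases.
-feature(maybe_expr, enable).
-export([sum_ok/2]).

%% Each ?= tries to match its pattern. On a mismatch nothing is raised:
%% the maybe block stops and returns the non-matching value as-is.
sum_ok(A, B) ->
    maybe
        {ok, X} ?= A,
        {ok, Y} ?= B,
        {ok, X + Y}
    end.
```

Calling maybe_demo:sum_ok({ok, 1}, {ok, 2}) returns {ok, 3}, while maybe_demo:sum_ok({ok, 1}, error) returns error instead of raising badmatch.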

By enabling warnings_as_errors you agree to follow the OTP team’s recommendations, so we don’t consider it a problem to add more warnings.

If you really don’t want it to break the build on things like this, just disable the respective warnings.

Odd, I thought you were against syntactical complexity? Option 2 introduces a new operator = (“strict match”) that is visually identical to the binding operator (=) but does not act like it.

Yes, and a very under-documented one. Filters are guard-like by default but if you do anything even slightly odd, you get an exception.

This is one of the reasons why I disagree with the notion that “Erlang is a simple language and therefore shouldn’t change”: it gets complex once you scratch the surface, especially in corners like these.

We can’t weed out all the weirdness, but I’m open to considering changes that do.

juhlig · October 23, 2024, 7:53am

You really should not be allowed anywhere near shiny things

Maria-12648430 · October 23, 2024, 7:55am

Call me Pandora

Maria-12648430 · October 23, 2024, 10:33am

Wow… then IMO, this is an even more serious issue than the filtering vs non-filtering generators one.

Silently skipping may not be the best of choices regarding how to handle non-matching elements, but at least it does so in a consistent manner. Like, it always does. And the EEP 70 strict generators can be used to fix that up so that it will always fail.

The filter weirdness, well, whether and how a filter fails depends not only on what it obviously evaluates to, but also on whether it is guard-like or not. And there is no good way (that I can see) to alleviate that and make it consistent. Consistent behavior can be achieved on the user side by always wrapping a filter in an identity function, which prevents it from being treated as a guard. But there is no way to make filters “obviously strict” by introducing a new operator or something.
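A minimal sketch of the identity-function workaround, with made-up module and helper names (strict/1 is hypothetical, not an existing library function). Wrapping the filter in a local call means it is no longer a legal guard expression, so a failing test raises instead of silently skipping the element.

```erlang
-module(strict_filter_demo).
-export([relaxed/1, wrapped/1]).

%% Guard-like filter: elements for which X + 1 fails are silently skipped.
relaxed(List) ->
    [X || X <- List, X + 1 > 1].

%% The same test wrapped in a local function call. A local call is not a
%% legal guard expression, so the filter is evaluated as an ordinary body
%% expression and a failing X + 1 raises badarith.
wrapped(List) ->
    [X || X <- List, strict(X + 1 > 1)].

%% Hypothetical identity helper; it exists only to defeat guard treatment.
strict(Value) -> Value.
```

With the input [1, a, 2], relaxed/1 returns [1, 2], while wrapped/1 raises a badarith error when it reaches the atom.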

Maybe we should take a step back and consider for a moment what an ideal comprehension would look like if we would implement it from the ground up, like, today, without the weirdnesses that became apparent over the years, and then think of ways how we could change the current ones to be like that, without breaking anything in existence in the wild?
(Note that this may also lead to the conclusion that (current) comprehensions are broken beyond repair, and by design, and that if we wanted a good implementation, we would have to create something entirely new, different, from the ground up.)

IMO, any other approach is just short-term fixing of leaks, or providing stops to fix leaks (like strict generators), while leaving open the leaks which we can’t fix (like the weird filter behavior). This is not entirely a bad thing. I mean, fewer leaks are better than more, and some stops are better than no stops. But the best possible solution would of course be no leaks and no stops needed.

kuna.prime · October 23, 2024, 10:53am

This is something that I will always support.

seal · October 23, 2024, 11:03am

What happens when the filter expression does not evaluate to a boolean value depends on the expression:

If the expression is a guard expression, failure to evaluate or evaluating to a non-boolean value is equivalent to evaluating to false.

If the expression is not a guard expression and evaluates to a non-Boolean value Val, an exception {bad_filter, Val} is triggered at runtime. If the evaluation of the expression raises an exception, it is not caught by the comprehension.

The Erlang reference manual describes the expected behavior of filters in comprehensions quite accurately, imo. If the expression is considered a guard, it doesn’t raise an exception, which is consistent with how guards are dealt with elsewhere.
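The two documented cases for non-boolean filter values can be sketched as follows. Module and function names are made up for illustration, and the results in the note below reflect my reading of the quoted manual text.

```erlang
-module(filter_cases_demo).
-export([guard_filter/1, body_filter/1]).

%% The filter is a plain variable, which is a legal guard expression.
%% Any value other than true counts as false, so those elements are dropped.
guard_filter(List) ->
    [X || X <- List, X].

%% The filter is a call to a local function, which is not a guard expression.
%% A non-boolean result raises {bad_filter, Val}.
body_filter(List) ->
    [X || X <- List, id(X)].

%% Hypothetical identity helper used to force body evaluation.
id(X) -> X.
```

Given [true, 1, false, true], guard_filter/1 keeps only the two true elements. Given [true, 1], body_filter/1 raises {bad_filter, 1}.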

Maria-12648430 · October 23, 2024, 11:25am

Ok, granted, so it is documented somewhere. That said, it is 3-4 pages down (depending on screen size), and not in the place where I would look first, which is about here (where nothing indicates that there is more to know, giving the impression that what is written there is all there is to know).

While that is of course true, it is not obvious when it will and won’t be turned into a guard. When I write a function with invalid guards, the compiler complains and won’t let me do it, but in a comprehension the same thing compiles and is silently turned into something else. And if I change my comprehension filter just ever so slightly, it may jump from being treated guard-like to non-guard-like and vice versa.
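As an illustration of how slight such a change can be, here is a sketch with made-up names. Both filters ask whether the first element is the atom a, but only the first one uses a BIF that is allowed in guards.

```erlang
-module(filter_flip_demo).
-export([with_hd/1, with_nth/1]).

%% hd/1 is allowed in guards, so the whole filter is a guard expression.
%% For elements where hd/1 fails (for example, an atom), the filter
%% counts as false and the element is skipped.
with_hd(Items) ->
    [I || I <- Items, hd(I) =:= a].

%% lists:nth/2 is not allowed in guards, so this filter is a body
%% expression. For the same elements, the exception from lists:nth/2
%% propagates out of the comprehension.
with_nth(Items) ->
    [I || I <- Items, lists:nth(1, I) =:= a].
```

With [[a, b], foo, [c]] as input, with_hd/1 returns [[a, b]], while with_nth/1 raises an error when it reaches foo.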

Quick poll: who knew about this behavior, and of those who did know, who has it in mind when writing a comprehension filter?

juhlig · October 23, 2024, 11:50am

Let me lend you a hand

I know about this behavior and usually/always have it in mind when writing filters
I know about this behavior but rarely/never have it in mind when writing filters
I didn’t know about this behavior

elbrujohalcon · October 23, 2024, 2:17pm

The can was already opened.
I wrote not one, but two articles about it

And, I have to say that this and the other current threads about list comprehensions are giving me high hopes for extending that series of articles in the future

bjorng · October 24, 2024, 5:18am

This is not the place where I would look. The Efficiency Guide is meant to describe the performance aspects of language features or library functions; it is not meant to fully describe each feature. The Reference Manual is where the full description of a feature is supposed to be found.

That said, having crucial or surprising information about language features in more than one place in the documentation would not hurt.

Yes, this is unfortunate.

You can blame it on me.

This behavior was present in the first implementation of list comprehensions in the compiler for JAM (Joe’s Abstract Machine). When @rvirding and I were working on the new compiler for BEAM, we noticed that code in Erlang/OTP depended on the guard behavior in list comprehensions. We didn’t know whether any customer code also depended on that behavior, so to avoid introducing an incompatibility, I argued for keeping the guard behavior in the new compiler.

Maria-12648430 · October 24, 2024, 7:12am

You may be biased from years of experience there. To clarify, when I want to know something about list comprehensions, I type “erlang list comprehensions” into Google instead of digging through the manual; this is simply faster. And the first result that gives me is the link I posted.

Definitely. At least, there should be a link to the full gritty details, like, “There is more to know!”

Will do (just kidding).

Yes, a very sensible decision IMO. This is also why I said earlier that we can’t really fix this.

josevalim · October 24, 2024, 7:19am

While I am sure it sounds controversial, I do agree that perhaps introducing a new syntax with the goal of fully replacing the old and all of its quirks is not necessarily a bad idea. If the premise is that comprehensions are confusing today, I am afraid adding more things will only add to the confusion, because it is more documentation, scenarios, and options to sort through.

Something that I also dislike about list comprehensions (in many functional languages) is that variables are defined after they are used:

[X * 2 || X <- List]

I believe this is the only place in Erlang where this happens. Yes, I understand it comes from mathematical set notation, but I don’t think it is relevant.

I was looking at the List Comprehensions page in Rosetta Code and I believe the F# syntax can be a good alternative entry point.

List comprehensions:

[for X <- List, X >= 10 of
  X * 2
end]

Map comprehensions:

#{for {K, V} <- List, V >= 10 of
  K => V * 2
end}

Binary comprehensions:

<<for X <- List, X >= 10 of
  X*10
end>>

I am not proposing this syntax in particular. But the goal was to:

Avoid breaking changes. Since ‘for’ can only appear after a #{, [, and <<, I hope it doesn’t break all existing for atoms in the language
Can be made multi-line without begin/end
Generators are strict by default, filters never behave like guards but must still return true/false (it raises bad_filter if anything else)
Pattern ?= Expr could be introduced for relaxed matches
Variables are introduced before they are used

The old syntax is kept around for compatibility but with huge warnings saying “don’t use this”.

Maria-12648430 · October 24, 2024, 8:23am

If you don’t mind, I’ll stand way over there while you talk to the guys with the torches and pitchforks

IMO, it would be ok if those “more things” don’t block the way for a new implementation, in the sense that they only exist to work around glitches in the old one. That is, they should be usable in a new implementation the same way, and don’t look and work any different.

This is probably highly subjective and depends on individual background, but that said, personally it never bothered me. I’m pretty comfortable with the mathematical notation.

Also subjective, but my first (and also second) reaction is: This even reads weird, the intuitive association with “case … of …” is all too ready at hand.

Acknowledged

josevalim · October 24, 2024, 8:27am

Agreed. And I don’t think “of” reads well there either. We could use ; or reuse ||. It doesn’t really matter much at this stage. I just wanted to get the ball rolling.

Maria-12648430 · October 24, 2024, 9:53am

About the results from the poll: I would exclude @jhogberg and @bjorng, because of course they know. For the others who said they knew: who of you knew this before getting burnt in some way? That is, who learned it in the “normal” course of learning Erlang? As for me, my path was mostly via LYSE and Programming Erlang, accompanied by some online resources, none of which made me even remotely aware of this.

I knew about this behavior from the beginning and so avoided being burnt
I know about this behavior because I was burnt by it, or got wise to the fact later by mere accident

hanssv · October 24, 2024, 10:57am

I might be a bit of an outlier, my onboarding into Erlang (during my PhD days) consisted of writing a model-checker for Erlang (in Haskell, since that was my weapon of choice at the time) - so, yes, I did know about the semantics of Erlang before I even wrote my first multi-module Erlang program

That said, I can’t say I always get list comprehensions right, but I have this habit of testing more exotic variants (in the shell) before relying on them.

elbrujohalcon · October 24, 2024, 11:10am

In my case, I learnt about this while writing articles for my blog. I first found this way of writing else-less ifs…

[do:something() || if_this:is_true()]
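
A runnable sketch of the same trick, with made-up module and function names. A comprehension with only a filter and no generator yields a one-element list when the filter is true and an empty list otherwise.

```erlang
-module(lc_if_demo).
-export([maybe_double/1]).

%% Returns [N * 2] when N is even and [] otherwise.
%% is_even/1 is a local function, so the filter is not a guard and
%% must return a boolean.
maybe_double(N) ->
    [N * 2 || is_even(N)].

is_even(N) -> N rem 2 =:= 0.
```

So lc_if_demo:maybe_double(4) evaluates to [8] and lc_if_demo:maybe_double(3) to [].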

And then I thought: What other weird things can happen with LCs?
Me being a huge fan of adventure style games (Look behind you! A three-headed monkey!) I just… well… tried everything with everything

elbrujohalcon · October 24, 2024, 11:17am

It also doesn’t bother me as a user, but when teaching this subject to students (And, come to think of it, also while reading the code) I always end up going back and forth in the code to write them.

In other words… this would be my step-by-step way of writing, teaching, and reading a list comprehension, in general:

[] %% It starts empty, but I know it'll be a list.

[ || X <- Set ] %% Then I add the generator.

[ modify(X) || X <- Set ] %% Then I add the transformations needed for the final expressions

[ modify(X) || X <- Set, filter(X) ] %% Finally the filters

The last 2 steps can be interchanged, but the idea is to read it as “Get me all the Xs from Set, modified in the proper way. Actually, I only want those that pass the filter.” …or… “Get me all the Xs from Set that pass the filter, modified in the proper way.” It’s never “Get me modified Xs that you would retrieve from Set, but only those that pass the filter”.

rlipscombe · October 24, 2024, 1:00pm

I’ve found a few instances of that in our code base. I tend to remove them…

bjorng · October 24, 2024, 2:29pm

We have now merged the pull request that implements the strict generator operators.
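
A minimal sketch of the difference between the existing and the strict list generator, assuming an OTP release that includes the merged operators. The <:- syntax is taken from EEP 70. The module and function names are made up, and the exception term mentioned in the comment is my understanding of the EEP rather than something verified here.

```erlang
-module(strict_gen_demo).
-export([relaxed/1, strict/1]).

%% Existing (filtering) generator: elements that do not match {ok, X}
%% are silently skipped.
relaxed(List) ->
    [X || {ok, X} <- List].

%% Strict generator from EEP 70: an element that does not match {ok, X}
%% raises an error (a badmatch, per the EEP) instead of being skipped.
strict(List) ->
    [X || {ok, X} <:- List].
```

With [{ok, 1}, error, {ok, 2}] as input, relaxed/1 returns [1, 2], while strict/1 fails on the second element.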

Thanks for all comments and suggestions.

The discussion gave us some ideas for issues that could be addressed in the future, for example that a small change to a filter can change it from being evaluated in guard context to body context. Perhaps it could somehow be made possible to indicate the intention that a filter must be a guard (or vice versa), and the compiler could emit an error message if the filter expression is not a legal guard expression.