Is it possible to add piping ala |> to Erlang?

@bmitc I think we started off on the wrong foot here, and that’s my fault. My sincere apologies :pray: I see that I misinterpreted some of what you said, but not on purpose, it was just my initial intuitive interpretation and it didn’t occur to me that there may be others. The subsequent “bending” as you call it, well, that might have been on purpose as I got carried away :sweat_smile: Anyway, let’s start over, shall we?

My concerns are generally centered on reading code, not on writing it. I know that I can go on writing Erlang as before, without ever touching the pipe operator. But since I’m mostly contributing to existing projects instead of rolling my own, at a wild guess I get to read ~10 lines of code for each line I end up writing.

On a side note: I agree that nested function calls are ugly, at least when done excessively. However, I don’t think intermediate bindings are bad or ugly, not per se. FWIW, it makes a nice way of “checkpointing”, but that is beside the point. The actual problem with intermediate bindings as I see it is that there is no sensible way to name them, hence the numbered bindings. I can only guess that this is also what Joe meant when he said “yucky code”, but I may be wrong.
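To make those “numbered bindings” concrete for anyone who hasn’t run into them, they look something like this (a made-up snippet, not from any real project):

clean(Line0) ->
    Line1 = string:trim(Line0),
    Line2 = string:lowercase(Line1),
    Line3 = string:split(Line2, ","),
    Line3.

There is no meaningful name for each intermediate result, so they just get numbered.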

And this leads to the question of how likely it is that the presence of the pipe operator will lead to unnatural code, i.e. code written so that it fits in with the pipe operator even in places where not piping would be the better choice. Since you brought up the term “pipeline driven development”, this is not unlikely: that term turns it into a programming style or paradigm, instead of something specific to do or not do when appropriate.

I have read one or two introductions to Clojure, and while yes, they have their pipe operators -> and ->>, they are not hyped (for lack of a better word), only mentioned in passing, late in the book. Like “Yeah, we have that, use it when it is useful”. But in the specific case of Erlang, where many people come to the language by way of Elixir, one of the first familiar things they will see is the pipe operator, something that they may cling to early on, and which in turn may lead to its over-employment.

If you can convince me that the above concerns are unjustified, I probably still won’t become a fan of it, but I wouldn’t mind it much either :wink: I’m not questioning the general usefulness of the pipe operator, I’m questioning whether the usefulness it provides justifies taking the “risk” I outlined above. Once it’s there, we have to live with it, this way or that.

12 Likes

No worries, and no need for any apologies! These discussions can always be a bit provocative, particularly in textual form. :wink: I come into this with my own biases, having come to Erlang from visual dataflow languages (piping on steroids) and languages like F# and Elixir that have |>. Thus, it’s a natural element in my personal programming experience. In many ways, I feel that |> could be useful in Erlang, but I can understand pushback, since it’s usually not great when a significant feature is added to a language long after it’s been designed and in use for some time. I am actually generally in favor of leaving languages alone once a design plateau has been reached (which is why I like the languages I do), so I possibly made an error when viewing |> as a “natural” addition to Erlang. At the moment, I’d say it’s a bit of a tossup for me personally on whether the hypothetical addition makes sense, primarily due to the issues of argument order and the potential complexity it would add.

4 Likes

Interesting discussions. Great to be pointed to Joe’s article on Elixir now that I’ve actually started using Elixir. My takeaways from reading it at the time were only the negatives; I had not registered how much of a fan he was of bringing pipes and sigils to Erlang.

I also share the dislike expressed in this thread of the more complex variants. For pipes to aid readability they really should not make code harder to read!

I took a decent sample of the stdlib and found that a large majority of modules would suggest piping into the last position, but it is not universal, and it feels like the wrong choice.

I would love to see Elixir-style pipes passing into the first position that we can use with our own functions. Large Erlang projects tend to be a lot more than a collection of lists:map and lists:fold calls.

Pipes would drive a clear convention on arg position across large code bases and provide a great tool to take advantage of that standardisation.

2 Likes

I thought about this more than once, and last time I was thinking about it, I thought that maybe something like this could be done:

This is the original function as it appears in the docs; I’ll use that for the example instead of looking up the actual implementation code.

sublist(List1, Len) -> ...

Would this work?

sublist(Len, List) when is_list(List) -> sublist(List, Len).
sublist(List1, Len) -> ...

This would not remove the original function, thus not breaking any code base, but would detect the argument order during a pipe and rearrange for the first/last position as desired.
(But not both, of course; I’m not talking about |> and |>>.)
Maybe those guards could end up generating some problems that I can’t think of right now, but it’s an idea. Maybe you could help point out potential problems, so I can learn and pay more attention when I implement guards myself.

Of course, implementing pipes would be a long-term effort, but I also think it’s an addition that would benefit Erlang.

About the argument ordering:
I personally think it could be the first argument because, let’s be honest :sweat_smile:, a lot of us use and like both languages, and some of us code in both daily. If the pipe passed the first argument in one and the last argument in the other, it would generate a bit of confusion, even though your brain would adapt and switch context automatically after some time.

I’m a big fan of Lisp, one of my favourite languages, and I don’t have problems reading Lisp code at all, so I don’t have problems with parens, but I think that pipes are way more readable than putting one function inside another.
C++ receiving pipes is the reason I’m planning to dive into C++ soon.
They are really helpful, that’s it.

The argument about pipe-driven development:
I see people abusing pipes all the time in Elixir, especially when they’re just starting; people want to pipe everything because it’s something new. It’s quite annoying, but honestly not as bad as it sounds, you can still read the code just fine.

About José’s take on readability and abuse, especially depending on the argument-order implementation:
I think it’s a problem we would face in any language with any feature; people will eventually write bad code no matter what.
Code reviews and the team’s own definitions of code style and “good code” are what will improve code readability.
I mean, if we consider something bad style or confusing, but the team is fine with it and accepts pull requests in that style… Maybe if it’s readable and the team likes it, there is no reason to force them to write it any other way. It might make sense to read the code in some specific way for the problem they’re trying to solve.

Sometimes what we consider unreadable and a source of confusion is good style for other teams. Look at APL.
Now, I love APL, and the way the code is written makes sense. But at first glance, before understanding APL and its design choices for the syntax, you can’t help but think it’s unreadable.

Those are just some thoughts and ideas from someone who doesn’t have professional experience writing Erlang, so maybe something I said reflects my ignorance from not having had to deal with Erlang code in production.

This is a very good example of where the pipe shines.

If not pipes, I would at least like to see the possibility of rebinding the variable name, so you could just write State instead of numbering it or having to come up with weird abbreviations and names.
We know from Lisp and Elixir that we can reuse variable names without mutation, so maybe someday, but that’s a different discussion.

4 Likes

For me, the interesting questions are about the consequences of adding the simplest possible pipe to Erlang.

The simplest possible pipe is the category of identity functions. It would roughly solve only the “builder” pattern. Anything more complex would require support for errors, promises, state, etc. Eventually, you’ll realise that its usage is very limited and you need better support. The category pattern from the datum library makes you define your own signature for the pipe operator and implement the bindings. Out of the box, the library provides support for the classical “pipes”: option, errors (either), io, etc.
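As a point of reference, that simplest possible pipe can already be written today as a plain fold over arity-1 funs; a throwaway sketch (not the datum API):

pipe(Init, Funs) ->
    lists:foldl(fun(F, Acc) -> F(Acc) end, Init, Funs).

%% pipe(<<"  hello  ">>, [fun string:trim/1, fun string:uppercase/1]).
%% => <<"HELLO">>

It has no error handling whatsoever, which is exactly the limitation described above.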

Do not hesitate to raise an issue, if you need more support on the subject.

2 Likes

I would say a problem with @seanhinde’s example is that you would have to change the functions set_beneficiary and set_sync_mode to fit into pipes, which could make them less useful outside of a pipe. Their returning an ok tuple might be the right thing for them to do. And not everything would be called in a pipe.

An alternative would be to make them, and init_miner_instances which comes afterwards, able to handle ok tuples as input, as well as able to handle when they don’t get an ok tuple. Again, not everything is done in a pipe; there is a world outside of them as well. :wink:
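For illustration, such a both-ways clause might look roughly like this (the map shape and key are invented, not the code from the example):

set_sync_mode({ok, State}) -> set_sync_mode(State);
set_sync_mode(State) when is_map(State) ->
    {ok, State#{sync_mode => full}}.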

7 Likes

Yes, that specific example would work. But it introduces some problems that I would rather avoid :wink: In a nutshell, I prefer a solid if rigid API over one that friendly-magically changes things because it thinks it knows what I meant to do better than I do.

For one, it works only if the arguments are of different types. It won’t work with append/2 for example, where both arguments are lists. So you would invariably end up with some functions that do magical argument flipping, and some that don’t because they can’t.

Second, it will make sublist (and friends) ambiguous, when seen outside the context of pipes: Argument order doesn’t matter any more, you can write sublist(Len, List) just the same as sublist(List, Len), pipes or not, and some people will do it this way and some will do it that way, whatever they prefer. If the arguments to sublist are in obscurely named variables where you don’t see which is which… confusion.

(Third, it will result in an endless loop that flips arguments back and forth if both arguments are lists. As it is now, this will result in an error, as it should.)
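To spell the parenthetical out with an append-like function (names invented for illustration): with the flipping clause in place, the guard is satisfied again after every flip, so the call never reaches the real clause:

my_append(A, B) when is_list(B) -> my_append(B, A);   %% keeps flipping when both arguments are lists
my_append(List1, List2)         -> List1 ++ List2.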

4 Likes

My assumption for the example I posted was that those functions that today return ok tuples would naturally have been written to fit the pipeline model.

On checking, one of them always returns a new {ok, StateN} so the ok wrapping was not needed. I fixed the code - nice review :slight_smile:. The other one can return {error, X}. It would be possible to make the following function accept {ok, Arg}, but that would indeed be ugly as you point out.

In a world with pipes I would inevitably end up with an unwrap function to put in the pipeline:

unwrap({ok, Val}) -> Val.

This would mess up the error reporting though - function clause in some unrelated function instead of a badmatch where the error happened.

2 Likes

You could work around that by inlining the unwrap function; things like that are done in OTP itself in many places for just that reason.
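For what it’s worth, the compiler can be asked to do that inlining (a sketch; whether the reported error then points at the caller is worth checking for your OTP version):

-compile({inline, [unwrap/1]}).

unwrap({ok, Val}) -> Val.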

1 Like

‘à la’ and not ‘ala’, please :pray:

2 Likes

I’m sure you already know, but in case you don’t, OTP 25 will ship with the new maybe construct: eep-0049

While this is of course not a 1:1, perhaps it fills gaps you were pondering on.
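For illustration, a rough sketch of how it reads, reusing the set_* functions from the example upthread inside a made-up wrapper (OTP 25 needs the maybe_expr feature enabled):

-feature(maybe_expr, enable).

build_state(State0) ->
    maybe
        {ok, State1} ?= set_beneficiary(State0),
        {ok, State2} ?= set_sync_mode(State1),
        {ok, init_miner_instances(State2)}
    end.

If either ?= match fails, the maybe expression returns the non-matching value (e.g. {error, Reason}) as its result.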

3 Likes

I’m currently going through PragDave’s Elixir course and he says this about functional programming:

Our goal when using a functional paradigm is to think out our program as one big function, transforming its inputs into outputs. We then break this down into progressively smaller functions, until we end up with a bunch of small functions; each function doing just one thing.
Our main tools are functional composition and pattern matching.

Note the bit in bold. “Composition means chaining together functions so that the output of one becomes the input of the next. In our dictionary code, we used pipelines to compose…”

With this in mind, I wonder: if Erlang had a pipe operator, by default it could pipe into the first parameter, but this could be overridden with an ampersand & (so if no ampersand is present, it pipes into the first parameter). This would allow compatibility with existing code while also nudging people towards aiming for the first parameter. (It would also keep it in line with Elixir, which is a benefit, as many people coding in Erlang now may already be familiar with Elixir.)

I think when you compare examples like the one @seanhinde posted, pipes are much nicer and more easily made sense of.

Not sure whether it would be possible but just thought it was worth mentioning.

2 Likes

One thing I haven’t seen mentioned here is that pipes in Elixir work differently than pipes in UNIX shells. The stuff that comes from the pipe is received in a different way (on standard input) than the rest of the arguments. There’s a clear separation between the two:

grep 'foo' bar | cut -f2 -d" " | sort -n

In the example above, the cut utility handles the input lines from grep totally differently than the command-line parameters. For me, one of the most confusing aspects of Elixir pipes was that in Elixir there’s no such distinction; the stuff that comes from the pipe and the arguments are handled the same way. As I’ve been writing shell scripts more than two decades longer than Elixir code, this was really confusing.

My other big problem (that numerous people already mentioned) is that Elixir pipes go directly against the Elixir “explicit is better than implicit” motto. If I look at the example you’ve provided above, I see a set_sync_mode/0 function - oh wait, there’s some “line noise” before that magically converts this function into a one-arity function! It doesn’t help if that pipe operator is on the previous line and a grep or diff output might not even show it!

The only place where pipes might help is in the Erlang shell, in throwaway code written incrementally. For example I type something like this:

application:info().

then I realize the output is long and I need only the list of the running applications, so I modify the line:

proplists:get_value(running, application:info()).

and in this case it’s annoying that I have to go to the front and the back of the line. But I wouldn’t want to see pipes in Erlang source files.

5 Likes

Well, doing it the Elixir way is not the only option. Back when this thread was started, I already suggested using the pipe operator only with arity-1 functions and combining its introduction with a shorthand for partial evaluation, such that func(a, b, &) would be equivalent to fun(X) -> func(a, b, X) end, so you’d see something like |> set_sync_mode(&) instead. IMO this is explicit and unambiguous.
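Written with today’s syntax, that shorthand would just be sugar for an explicit fun; e.g. for the shell example above:

F = fun(X) -> proplists:get_value(running, X) end,   %% i.e. proplists:get_value(running, &)
F(application:info()).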

2 Likes

My sort of approach has always been to turn these into data-driven formats. For example:

Might be defined as:

new(#{top => #{block_hash => TopBlockHash,
               key_block_hash => TopKeyBlockHash,
               height => TopHeight},
      consensus => Consensus,
      options => [opt(autostart,Options), opt(strictly_follow_top, Options)],
      ...})

The extra question I’d have on this one is whether calls such as set_beneficiary/1 and set_sync_mode/1 are disjoint things that share the same config but are independent (while modifying it), or whether they are functions within such a new-style function that simply try to make things clearer by breaking more complex initialization down into substeps.

If they are, then I could also imagine trying to disentangle dependencies. Can we actually write:

   State = #state{ top_block_hash     = TopBlockHash,
                   top_key_block_hash = TopKeyBlockHash,
                   top_height         = TopHeight,
                   consensus          = Consensus},
   State#state{
       options = [set_option(autostart, Options, State),
                  set_option(strictly_follow_top, Options, State)],
       beneficiary = set_beneficiary(State),
       sync_mode = set_sync_mode(State),
       miners = init_miner_instances(State),
       stratum_mode = set_stratum_mode(State)
   }.

Or is the dependency actually all sequential? For me, a lot of the pipe usages I’ve seen in Elixir that are actually warranted require:

  1. no conditionals where some partial failures or successes can happen (note that in the original post, the set_beneficiary and set_sync_mode returns have lost an assertion on a match that would instead trigger function_clause errors, and the calls need to be modified to return just a state or to handle the ok tuple);
  2. no clearer representation as a flat data structure that can be typed and whose ordering can be handled behind the scope of a private function, such that it is not possible to have partially-instantiated data returned, and it becomes ordering-independent for callers;
  3. no changing of the returned data structure halfway through (e.g. it returns a state map, and then it’s a pid, and then it’s an {ok, Pid}, and then it’s a boolean…);
  4. no forcing of a different API design for the sake of being piped (e.g. a lists:foreach/2 wrapper returning the input state just to weave side effects into the call chain, aside from debug functions intended for it).

This ends up being mostly cases of applying uninterrupted transformations to the same datastructure. The best examples I’ve seen were those where 3+ transformations were applied to a string (normalizing, trimming off ends, changing capitalization, replacing terms, etc.).
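For instance, the string case typically looks something like this in plain Erlang today (functions from the string module; the exact steps are invented for illustration):

normalize(Name) ->
    string:casefold(string:trim(unicode:characters_to_binary(Name))).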

A lot of the Elixir pipe usages I’ve seen, I’d try to write so as not to require a pipe, and I always found that the nested branching was worse (hence working on the maybe expression landing in OTP 25).

4 Likes

I pretty regularly use the pipeline[0] parse_transform.

It allows for doing basic stuff like the fold examples above:

MyAwesomeString = [
    binary:to_string,
    string:trim,
    string:to_upper
](SomeBinary).

But it also is flexible enough to handle the not-super-conducive-to-pipes return values and arguments of Erlang like this:

NewPG = [ 
    {ok, __} = poolgroup:pools(Pools, __), 
    {ok, __} = poolgroup:teamids(Teamids, __), 
    {ok, __} = poolgroup:minutes_per_match(MinutesPer, __), 
    {ok, __} = poolgroup:start_time(StartTime, __), 
    {ok, __} = poolgroup:matches_or_games(MatchesOrGames, __), 
    {ok, __} = poolgroup:subtype(Subtype, __), 
    {ok, __} = poolgroup:populated(true, __),
    poolgroup:finalize(__)
] (PG); 

More practically, though, I prefer not to return an {ok, _} tuple, so my “pipes” tend more to look like this:

NewPG = [ 
    poolgroup:pools(Pools, __), 
    poolgroup:teamids(Teamids, __), 
    poolgroup:minutes_per_match(MinutesPer, __), 
    poolgroup:start_time(StartTime, __), 
    poolgroup:matches_or_games(MatchesOrGames, __), 
    poolgroup:subtype(Subtype, __), 
    poolgroup:populated(true, __),
    poolgroup:finalize(__)
] (PG); 

I find it useful, though it’s not conducive to searching (the way you could just grep for |> to find instances of the pipe operator)

[0] Disclosure: I’ve been the “maintainer” of this for a few months, but it was originally built by Danil Zagoskin like 10 years ago. (Mostly that just means I updated it to support Erlang 24+.)

3 Likes

Let’s start by reflecting on where |> comes from and why it is a good fit there. The spelling |> comes from F#, but the idea is older. Haskell, for example, has a rich library of combinators:

import Data.Function
Prelude Data.Function> :type (&)
(&) :: a -> (a -> b) -> b
Prelude Data.Function> :type (.)
(.) :: (b -> c) -> (a -> b) -> a -> c
Prelude Data.Function> :type ($)
($) :: (a -> b) -> a -> b

Dot is function composition: (f . g)(x) = f(g(x)).
Dollar is function application: f $ x = (f)(x).
And ampersand is (flip ($)), x & f = (f)(x).
In F#, there is nothing whatsoever special about |> .

(|>);;
val it : ('a -> ('a -> 'b) -> 'b) = <fun:it@1>

It is a word with operator syntax, but it is an honest-to-goodness value that can be held in a variable, passed to a function, and returned from a function. Its arguments are also honest-to-goodness values with nothing special about their syntax. We can even define something similar in Scheme without any macros:
(define (pipe x . rest)
  (let loop ((x x) (rest rest))
    (if (pair? rest) (loop ((car rest) x) (cdr rest)) x)))

(pipe 4 sqrt number->string)

"2"

We can even do it in Smalltalk:

Object
  methods for: 'combinators'

unary
  ^(unary isMemberOf: NiladicSelector)   "isKindOf: Symbol"
     ifTrue: [unary send_to: self]       "self perform: unary"
     ifFalse: [unary value: self]

4 |> sqrt | printString
==> '2'

but it’s pretty silly in Smalltalk, where the normal syntax would be 4 sqrt printString anyway.

What is it about Haskell, F#, Elm, Scheme, and Smalltalk that makes &, |>, pipe, or |> comparatively pleasant?

IT IS NOTHING SPECIAL.

It’s just a normal operator (or in Smalltalk, ‘binary selector’; Smalltalk technically doesn’t have operators) with normal arguments and normal results that ANY programmer could have defined. This means that there is nothing special for someone using |> to understand. There is no hidden machinery. There are no rules that apply only to the pipe operator. Having it in the language does not add complexity to the language.

What makes the pipe operator useful in Haskell, F#, and Elm, and pointless in Scheme and Smalltalk? [Having implemented pipe in Scheme and Smalltalk five years ago, just for practice, I have never found any use for it since in those languages.]

In the languages where the pipe operator is useful, ALL functions have one input and one output, and “partial application” of a function to its leading arguments is an essential aspect of the language. So x |> f y automatically means f y x. There is, again, no special syntax (nor even any “support” operators) needed to make the pipe operator relevant to ANY function whatever. While confusing,

1 |> (<) 2;;
val it : bool = false

involves no special syntax and no special semantics. After doing

let flip f x y = f y x;;

we find that

1 |> flip (<) 2;;
val it : bool = true

as expected.

So the answer is that

  • The pipe operator is useful in its “native” languages
  • because it involves no special syntax or semantics
  • and can have ANY function as its right operand,
  • so there are few restrictions on its use and nothing added to the complexity of the language.

None of that is or would be true for Erlang.

In Erlang, |> would have to be special syntax. And there would have to be additional special syntax for partial application. Library functions can’t provide partial application because some of the things we’d like to partially apply, like (<), are not functions or values. So we’d find ourselves writing

1 |> (<| < 2)

or something like that, with the rule that e[| <| |] ==> fun (Z) -> e[| Z |] end, with Z a new variable.
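Hand-expanding that rule (and assuming X |> F simply means F(X)) turns the example into ordinary Erlang:

%% 1 |> (<| < 2) would become:
(fun(Z) -> Z < 2 end)(1).   %% => true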

I respectfully suggest that if you find yourself missing a pipe operator in Erlang, the function you are writing needs to be broken up into smaller pieces.

I’ve seen uses of |> in F# that would to my mind have been better as list comprehensions, which F# also has.

Think of it as a code kata:

  • Find some code in Erlang where you would have used |> in F# or Elixir.
  • Seek three different ways to restructure it as Erlang, considering the context.

5 Likes

gumm wrote about the pipeline parse transform.
“I find it useful, though it’s not conducive to searching (the way you could just grep for |> to find instances of the pipe operator)”

Whyever would you grep for instances of pipe? It’s just function application, and how often do you grep for instances of normal function application?

Oh, another little bit of history. Way back in the late 1960s, the British AI programming language Pop-2 allowed two forms of function application:

f(x)     f(x,y)     f(x,y,z)
x.f      (x,y).f    (x,y,z).f

Once again, there was nothing special going on, and the reverse function application operator was definable in the language. For decades nobody thought any of this was a big deal, and in a language with user-definable operators and built-in partial application, it isn’t.

1 Like

In any language that natively supports pipes, never. But with Erlang, it’s a non-standard syntax that relies on parse transforms, so if I wanted to see which modules might be using it, there’s no super easy way.

But it’s not a big deal, just one of those things that occurred to me.

1 Like
    State1 = #state{ top_block_hash     = TopBlockHash,
                     top_key_block_hash = TopKeyBlockHash,
                     top_height         = TopHeight,
                     consensus          = Consensus},
    State2 = set_option(autostart, Options, State1),
    State3 = set_option(strictly_follow_top, Options, State2),
    {ok, State4} = set_beneficiary(State3),
    {ok, State5} = set_sync_mode(State4),
    State6 = init_miner_instances(State5),
    State7 = set_stratum_mode(State6),

In bash you have the $? variable, meaning the return value (exit status) of the latest command. If we introduced something similar in Erlang, this example would turn into this:

    #state{ top_block_hash     = TopBlockHash,
            top_key_block_hash = TopKeyBlockHash,
            top_height         = TopHeight,
            consensus          = Consensus},
    set_option(autostart, Options, _?),
    set_option(strictly_follow_top, Options, _?),
    {ok, State4} = set_beneficiary(_?),
    {ok, State5} = set_sync_mode(State4),
    init_miner_instances(State5),
    State7 = set_stratum_mode(_?),

No |> operator would be needed. And it is very explicit, which would fit Erlang. I was thinking that a Unicode character would be a good fit:
↴ or ▼, to signify that the value flows down from the previous statement:

    #state{ top_block_hash     = TopBlockHash,
            top_key_block_hash = TopKeyBlockHash,
            top_height         = TopHeight,
            consensus          = Consensus},
    set_option(autostart, Options, ▼),
    set_option(strictly_follow_top, Options, ▼),
    set_beneficiary(▼),
    get_result(▼),
    set_sync_mode(▼),
    get_result(▼),
    init_miner_instances(▼),
    set_stratum_mode(▼),
    State = ▼
3 Likes