Is it possible to add piping à la |> to Erlang?

This is an interesting idea, but it would likely force some interesting order-of-evaluation decisions to be made and specified, and I don’t know whether they have been:

a(),
f(b(), ▼)

Does ▼ return the value of a() or b()? How about f(▼, b(), c(▼))? Do both ▼s refer to the same value as well? Would this notation prevent re-ordering by the compiler?

5 Likes

Not sure how it would work behind the scenes, but it would probably have to be a preprocessing step that translates it into the code below, so that the compiler is still free to re-order:

X1 = a(),
f(b(), X1)

And in your second example it would be

X1 = a(),
f(X1, b(), c(X1))

How the preprocessor/compiler would know that it should be the version above and not the one below, I don’t know, but that could probably be solved in some way.

X1 = a(),
f(X1, X2=b(), c(X2))
2 Likes
State1 = #state{ top_block_hash     = TopBlockHash,
                 top_key_block_hash = TopKeyBlockHash,
                 top_height         = TopHeight,
                 consensus          = Consensus},
State2 = set_option(autostart, Options, State1),
State3 = set_option(strictly_follow_top, Options, State2),
{ok, State4} = set_beneficiary(State3),
{ok, State5} = set_sync_mode(State4),
State6 = init_miner_instances(State5),
State7 = set_stratum_mode(State6),
This feels wrong, and piping or equivalents do not make the wrongness go away. It’s just imperative code wearing a wig and a wax nose.

Years ago working in Prolog I faced a similar problem, and decided that the best approach was to represent option set-up as a data structure.

State = option_merge(#state{
            top_block_hash = TopBlockHash,
            top_key_block_hash = TopKeyBlockHash,
            top_height = TopHeight,
            consensus = Consensus},
          [{autostart, Options},
           {strictly_follow_top, Options},
           set_beneficiary,
           set_sync_mode,
           init_miner_instances,
           set_stratum_mode])

Yes, this adds an interpretive overhead, but (a) it’s the kind of thing that should be inlined, and inlined to better code than the original, (b) this doesn’t look like the kind of code where performance is the main issue, (c) it lends itself to type checking, and (d) – my personal favourite – there is no possibility of inadvertently referring to an intermediate or incomplete setup. There are (e), (f), and (g) positives as well, but lagom.
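
A minimal option_merge along these lines can be a plain fold over the step list. This is only a sketch, assuming set_option/3 and the named setter functions from the snippet above exist, are exported from the same module, and that some of them return {ok, State}:

option_merge(State0, Steps) ->
    lists:foldl(fun apply_step/2, State0, Steps).

%% A {Key, Options} tuple means "merge this option into the state";
%% a bare atom names an exported setter, whose {ok, State} result is
%% unwrapped so the fold always carries a plain state.
apply_step({Key, Options}, State) ->
    set_option(Key, Options, State);
apply_step(Step, State) when is_atom(Step) ->
    case apply(?MODULE, Step, [State]) of
        {ok, NewState} -> NewState;
        NewState       -> NewState
    end.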

5 Likes

@AstonJ Perhaps we should have a “quote of the week” - this would surely have to be a contender :smiley:

5 Likes

They mostly have some kind of extra logic, setting different parts of the State.

Some depend on the outcome of a previous step.

Indeed. This code was clearly not written with some non-existent future pipe feature in mind. But in the end the code is just applying various operations to a threaded state, which is a common pattern in Erlang code. Pipes would remove the need to invent all the intermediate variable names.

There are many more places in the Aeternity code base that use fancy ways of running lists of funs, and various other ways of solving this issue. All of it written by folks way more computer-sciency than me, and all of it applying workarounds to avoid the intermediate state variables.
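
Just for illustration (hypothetical setter names, and the state kept in a map purely to make the snippet self-contained), the “list of funs” variant usually boils down to something like this:

-module(threaded_state).
-export([init/1]).

%% Thread the state through a fold over a list of funs instead of naming
%% State1..State7 by hand.
init(Options) ->
    Steps = [fun set_autostart/1,
             fun set_sync_mode/1,
             fun set_stratum_mode/1],
    lists:foldl(fun(Step, State) -> Step(State) end,
                #{options => Options},
                Steps).

set_autostart(State)    -> State#{autostart => true}.
set_sync_mode(State)    -> State#{sync_mode => normal}.
set_stratum_mode(State) -> State#{stratum => enabled}.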

2 Likes

Pipes would remove the need to invent all the intermediate variable names.

I think this is a valid concern; I remember seeing code in AXD with variables like HR14 (the half call record after 14 updates). This is probably very extreme, but it’s still a problem. I recently saw a course on Rust, which also has immutable variables like Erlang; however, as far as I understand, the language provides a way to rebind (or reuse) the variable name, something like this:

State = #state{ top_block_hash     = TopBlockHash,
                     top_key_block_hash = TopKeyBlockHash,
                     top_height         = TopHeight,
                     consensus          = Consensus},
 let State = set_option(autostart, Options, State),
 let State = set_option(strictly_follow_top, Options, State),
 let {ok, State} = set_beneficiary(State),
 let {ok, State} = set_sync_mode(State),
 let State = init_miner_instances(State),
 let State = set_stratum_mode(State)

I’m not sure how much better it is - I’m afraid this mechanism could be too easy to abuse.

2 Likes

This option_merge function works - as long as there’s no conditional state update somewhere (not in this particular example, but in other code). Something like this:

setup(Foo, Bar) ->
  State = #state{...},
  State1 = set_beneficiary(State),
  case Foo of
    foo ->
      State2 = set_sync_mode(State1),
      init_miner_instances(State2);
    baz ->
      set_stratum_mode(State1, Bar)
  end.

Of course, these can be broken up into different functions, but then the problem of finding meaningful variable names becomes the problem of finding meaningful function names, so instead of State, State1, State2, etc. we have something(), do_something(), maybe_do_something(), really_do_something(), …

2 Likes

I thought I had made peace with this topic, but… sorry, honestly and bluntly, I find this pretty horrible :cold_sweat:

For one, to reiterate: there is nothing wrong with intermediate bindings IMO. I find that they are a good way to “checkpoint” progressive changes and at the same time perform assertions (like {ok, State4} = set_beneficiary(State3)) along the way.
The problem with intermediate bindings is that we don’t have an (easy) way to name them nicely, hence the tendency towards numbered variables and the need to renumber things if you introduce another step.

Anyway, this approach of (silently) binding and rebinding the output of the last call to yet another special variable (▼? And how am I supposed to type that, even?) is cumbersome and confusing. As you can see, you need this get_result function to extract the state from the tagged tuple in order to artificially keep the ▼-flow going, and it will break and require strange gymnastics if you want to insert some call that returns something entirely different from what can be passed into the next function. Like,

...
{ok, StateX}=set_beneficiary(▼),
StateX=get_result(▼), % the state must be carried over the next call
log_new_beneficiary(▼), % returns just ok --> ▼=ok
set_sync_mode(StateX), % can't use ▼ here, need the carried-over state
get_result(▼),
...

If you ask me, that’s in no way better than intermediate bindings; it’s worse by an order of magnitude, at least in my perception. Even crude printf-debugging (temporarily printing an intermediate value by inserting io:format and friends) would become a serious drag.

What’s more, with this approach you always have to check the next call to find out whether, by not binding the return value of a function to a variable, the author just didn’t care about the return value (as it is now, no binding means it is not used further down, because it can’t be) or whether it could be used somewhere in the next call.

This may all sound very nice and cool while you are writing code, especially when you are writing it all in one go, but you will hate yourself if you (or somebody else) ever needs to read or change it. Well, at least I will :rage:

9 Likes

Not explicitly about piping, but the conversation has headed in the rebinding direction it seems.

I’m wondering if it’d be possible to introduce a rebinding-like behavior via a parse transform by just using a marker/sigil on the variable name.

E.g. (re-using a previous post’s example):

do_stuff(Options, #state{} = State) ->
    @State = set_option(autostart, Options, State),
    @State = set_option(strictly_follow_top, Options, @State),
    {ok, @State} = set_beneficiary(@State),
    {ok, @State} = set_sync_mode(@State),
    @State = init_miner_instances(@State),
    set_stratum_mode(@State).

where the parse transform/pre-processor could rewrite it to

do_stuff(Options, #state{} = State) ->
    _@State_1 = set_option(autostart, Options, State),
    _@State_2 = set_option(strictly_follow_top, Options, _@State_1),
    {ok, _@State_3} = set_beneficiary(_@State_2),
    {ok, _@State_4} = set_sync_mode(_@State_3),
    _@State_5 = init_miner_instances(_@State_4),
    set_stratum_mode(_@State_5).

I understand this is what Elixir is doing internally when a variable is not pinned. At least that’s what it looks like when using ex_to_erl.

I’m absolutely fine with having to explicitly enumerate variable names when necessary, though.

3 Likes

Totally agree. I think the only version of these pipes that is in any way sensible for Erlang is exactly the Elixir version: pipe into the first position, and no need for special variables.

5 Likes

In most functional languages, going back to ISWIM,
let State = … in
let State = …(State)… in …
let State = …(State)… in …
let State = …(State)… in …

is all you need. Each of these introduces a new
variable, which is only visible after ‘in’,
not between ‘=’ and ‘in’.
In LFE you would probably want to use let* rather than let.

Coming at this from a Haskell perspective,
I find the idea of “hiding the plumbing” attractive.
But I find the idea that there is a single combinator
that is appropriate for the majority of cases where there
is plumbing to be hidden absurd.
I remind readers that the “|>” operator in that spelling
comes from a functional language in which it is NOT
special syntax but a perfectly ordinary operator needing
no special machinery and in that language is one of several
such operators.

I’ve always been a great believer in trying to work with a
language the way it “wants” to go rather than fighting to
use it as if it were some other language. Very well,
Erlang functions are NOT curried and Erlang does NOT admit
user-defined operators. It does have parse transforms,
which are very heavyweight indeed, and not to be used lightly.
Erlang doesn’t really want to be a higher-order language at
all, and for the first few years it wasn’t one.

If you often write code that would be clearer in Elixir or
LFE, what the heck is wrong with using Elixir or LFE?

Not that code like this would be clearer in Elixir.
I find long chains of |> unreadable.

If you have compelling reasons for using Erlang specifically,
then I think it’s time to start treating “I wish I had |>” as
a code smell. I am happy with S0 S1 S2. By the time I get to
S3 I’m starting to get nervous. If I ever reach S10 it’s time
to burn what I’m writing to the ground and restructure it.

Patient: It hurts when I do this.
Doctor: Ah. I see the problem. Don’t do that.

Sometimes it helps to look at the rest of the code.
Perhaps there is a combination of updates that is needed
in more than one place that can be abstracted out?
If there is one update that must happen before another,
perhaps that combination should be separated out and the
constraint enforced by hiding the component updates?

I’m also coming at this from a Smalltalk point of view
(and I appreciate the reasoning behind “Joe hates OO”).
In a Smalltalk program I would regard a long block of
messages sent to the same object as
(a) a worrisome indication of coupling (the caller
“knows too much” about the object’s states)

(b) a sign that there’s a missing abstraction in the
interface of the object.
From an OO perspective, if you have an object in state
A and you want to get it into state B, you don’t force
it through an intricate sequence of state changes,
you give it some sort of description of state B and ask
it to change the state itself. The functional analogue
should I hope be clear.

We need examples with semantics and context.

9 Likes

What if there were a “pipe anywhere”?
|1>, |2>, |3>, etc.?

Or something like “pipe here”:
piping_from() |> piping_into(Arg1, *, Arg3)
where the value is piped in place of Arg2 (the *).

2 Likes

I might see things differently, but to me (with few exceptions) Erlang is relatively consistent about parameter position: it is last when dealing with a data structure, and first when dealing with a process/port/ets… or any other resource where the function call is a side effect and you are not expected to bind a new returned value.

lists:map(Fun, List)
lists:foldl(Fun, Acc, List)
lists:filter(Pred, List)
...

vs.

ets:insert(Tab, ObjectOrObjects)
ets:delete(Tab, Key)
ets:lookup(Tab, Key)
...

I remember reading somewhere that it was more efficient to put the accumulator last when creating recursive functions because it allowed for argument register reuse. (Is this still true with today’s compiler?)

With this in mind, wouldn’t it make sense, in the Erlang world, to have a pipe-last operator, given that the cases where we want to pipe are the ones where we want to apply a sequence of transforms to a data structure?
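
For example, with an assumed pipe-last syntax (and a throwaway module just to make the desugaring concrete):

-module(pipe_last_demo).
-export([demo/1]).

%% Hypothetical pipe-last syntax (not real Erlang):
%%   List |> lists:map(fun inc/1) |> lists:filter(fun even/1) |> lists:sort()
%% Piping into the final argument, it would desugar to:
demo(List) ->
    lists:sort(lists:filter(fun even/1, lists:map(fun inc/1, List))).

inc(X) -> X + 1.
even(X) -> X rem 2 =:= 0.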

2 Likes

Must admit I’ve never heard the phrase before, Phil - what’s a wax nose? :lol:

My sister in law is currently wearing a wig because of chemotherapy so I hope it’s not derogatory to wig-wearers in any way :innocent:

1 Like

A wax nose is quite literally a prosthetic nose made of wax.
“Traditionally facial prosthesis has been made by hand worked sculpted wax or clay pattern.”
Famously, Kierkegaard described the liberal Danish state
church of his day as “putting a wax nose on God”.
As for why prosthetic noses used to be a surprisingly common
thing, look up “nasal tertiary syphilis”.

My intention in the wig+wax nose metaphor was to conjure up
the image of a clown. Coulrophobia, anyone?

4 Likes

I used a parse transform 11 years ago that did this: GitHub - spawngrid/seqbind: Sequential Binding Parse Transformation for Erlang

SeqBind offers a different, yet simple, solution to this problem.

It introduces the concept of sequential bindings. What are those? Sequential bindings are bindings that carry the suffix @, for example L@ or Req@.

SeqBind is a parse transformation that auto-numbers all occurrences of these bindings following the suffix @ (creating L@0, L@1, Req@0, Req@1, and so on).

In order to use SeqBind, one should enable the seqbind parse transformation; this can be done either through compiler options or by adding the line below to your module:

-compile({parse_transform,seqbind}).

One of the important properties of SeqBind is that it does not introduce any overhead (unlike some other, relatively similar solutions). Namely, it doesn’t wrap anything into funs but simply auto-numbers bindings. Effectively, your compiled code is no different from the original code structurally.

Returning to the problem definition, this is how your code would look with SeqBind:

L@ = lists:map(fun (X) -> ... end, L@), 
L@ = lists:filter(fun (X) -> ... end, L@),
%% or
{Q,Req@} = cowboy_http_req:qs_val(<<"q">>,Req@),
{Id,Req@} = cowboy_http_req:qs_val(<<"id">>,Req@)

It’s something that was used in a few commercial projects, but was eventually removed from them. It sort of fell out of collective memory since then, but it has been possible to use forever. I assume the project no longer covers everything properly, however, since new language forms have been added that it may not handle.

Do note that to this day, I still prefer to work the program flow into a declarative data structure (as shown earlier) rather than to work around it with these variables. My general feeling has been that the resulting code is cleaner, even if it requires more work to rework it into a better shape.

4 Likes