Need help understanding some Erlang code

leeyis · July 6, 2022, 8:12am

Design the anonymous function F1, Make lists:map (F1 (3), [1, 2, 3, 4]) output for [1.0, 8.0, 27.0, 64.0]

F1 = fun(M) -> fun(N) -> math:pow(N, M) end end.
lists:map(F1(3), [1,2,3,4]).

I don’t understand why F1 (3) is matched to M and the list element is matched to N

leeyis · July 6, 2022, 8:26am

I understand:
When the 1 code is parsed to F1(3), it is bound to the M variable in the first layer;

lists:map/2 The first parameter is the return result of F1/1,
naturally the F1/1 result “F2/1” parameter matches each element of the second list parameter of lists:map /2

I don’t know if I understand this correctly？

raimo · July 6, 2022, 9:31am

Line 1 binds F1 to a fun() of arity 1.

Line 2 calls lists:map/2.
Argument 1 is a call to F1(3), which has to be evaluated before calling lists:map.
The call F1(3) returns a fun of arity 1 that has M in its environment.
Argument 2 to lists:map/2 is simply a list.

lists:map/2 will, for each element in the list call the fun of arity 1 that it got as its first argument; the inner fun that calls math:pow(N, 3) since M was bound to 3 in the fun’s environment.
N will be bound to 1, 2, 3, 4 for the calls to the inner fun.
The result from lists:map/2 will be [math:pow(1, 3), … math:pow(4, 3)]

raimo · July 6, 2022, 9:59am

It seems you wonder how and where N is matched.
The inner fun “F2/1” is simply a fun of arity 1.

lists:map/2 is implemented something like this:

map(F2, [N | Tail]) ->
    [F2(N) | map(Tail)];
map(_, []) ->
    [].

and in this code the list [1,2,3,4] is matched against the pattern [N | Tail] in the function head of map/2’s first clause, which succeeds and binds N to 1 and Tail to [2,3,4]. Then F2(N) is called where both F2 and N are now bound variables. The result is placed in a cons cell (list element container) and map(Tail) is called recursively to create the other half of the cons cell.

leeyis · July 6, 2022, 10:10am

Thank you for your detailed explanation, which made me understand this problem

SisMaker · July 6, 2022, 10:24am

I want to ask about the cost of creating anonymous functions. I’ve seen some code that makes heavy use of anonymous functions, and is it makes a big difference if you change to a defined function

leeyis · July 6, 2022, 10:24am

F1 = fun(M) -> fun(N) -> math:pow(N, M) end end.
lists:map(F1(3), [1,2,3,4]).

Before, I did not know the execution order of this code，if understand the execution order, the problem is clear.

“Argument 1 is a call to F1(3), which has to be evaluated before calling lists:map.” that’s the answer to my question, I’m going to explain this to a beginner. Is there any way to prove it?

To prove this sentence, what knowledge points need (I do not have, want to learn)

SisMaker · July 6, 2022, 10:27am

F1(3) return a new anonymous functions that fun(N) → math:pow(N, 3) end then to ues lists:map/2

leeyis · July 6, 2022, 10:32am

I think I understand that, and I think the key to understanding this is to be clear about the order of execution of the code

elbrujohalcon · July 6, 2022, 11:13am

Add some debugging messages and some variables to it…

6> F1 = fun(M) ->
6>        io:format("Calling F1 with M = ~p~n", [M]),
6>        F2 = fun(N) ->
6>               io:format("Calling F2 with M = ~p and N = ~p~n", [M, N]),
6>               math:pow(N, M)
6>             end,
6>        io:format("Returning F2: ~p~n", [F2]),
6>        F2
6>      end.
#Fun<erl_eval.44.40011524>
7> lists:map(F1(3), [1,2,3,4]).
Calling F1 with M = 3
Returning F2: #Fun<erl_eval.44.40011524>
Calling F2 with M = 3 and N = 1
Calling F2 with M = 3 and N = 2
Calling F2 with M = 3 and N = 3
Calling F2 with M = 3 and N = 4
[1.0,8.0,27.0,64.0]
8>

Maria-12648430 · July 6, 2022, 12:19pm

@leeyis the title of a topic should be a short summary, not the entire body repeated in one line.

raimo · July 6, 2022, 2:26pm

I can guess a little about the performance, although others in the team (that now are on vacation) should know better.

Creating an anonymous function should not be very expensive. The code is already compiled, so a term referring to the code and the environment variables has to be created, which is a bit larger than a tuple, and maybe a bit harder to create since fields such as module name and node are filled in.

Calling an anonymous function is more interesting since that is in general done more times than creating it. Here there are recent optimizations that I think are present in the OTP-25.0 release. Calling a fun() still should be a bit more expensive than a function, though. How expensive depends on how the fun() is created; fun foo(_) -> ok end, fun foo/1, fun mod:foo/1, fun mod:Foo/1, fun Mod:foo/1, fun Mod:Foo/1, that is if the module or the name are known in compile time.

IIRC, lists:map(F, L) is not much faster than M(F, L) if you do not count the time for M = fun lists:map/2 (both module and name are know in compile time).
Maybe a bad example since I also recall that the compiler unfolds such a construct entirely, if it can.

SisMaker · July 6, 2022, 3:16pm

thanks a lot

raimo · July 7, 2022, 8:27am

A not uncommon source of the idea that funs are slow is from accidentally measuring the speed of a shell fun.

The shell is interpreted, and a fun has to be compiled. This is solved in the shell (or rather in erl_eval) by having a number of compiled evaluation funs (of arity 0…20, I think). So when you define a fun in the shell you get the compiled erl_eval fun of the right arity, that has got the evaluation state in its environment. See @elbrujohalcon’s post above; #Fun<erl_eval.44.40011524>. E.g lists:map/2 gets this evaluation fun and calls it as usual, but for every invocation it is erl_eval that interprets the abstract format of your shell fun definition. And that is a lot slower than executing .BEAM code.

That funs have to be compiled (and thereby live in a module) is also a gotcha when sending a fun to a remote node. The compiled fun, meaning the same module of the same version, has to exist in the remote node or else it cannot be called.

The exception to this is a fun on the form fun lists:map/2 which is a reference to an export entry of the external function lists:map/2. So when calling such a fun the remote node will load the module if needed an call the function if it exists, ignoring the module version.

raimo · July 7, 2022, 8:46am

Note Erlang does not guarantee the evaluation order of arguments. If you call a(b(), c()) then, since we do not do postponed/lazy evaluation, b/0 and c/0 must be called before a/2, but b/0does not have to be called before c/0. Just like there is no guaranteed about in which order functions are called when you construct a compound term such as {tag, b(), c()}}, [b() | c()], etc.

The argument evaluation order has changed once, a long time ago. In Joe’s Abstract Machine (the JAM) the evaluation order was right to left, but in the BEAM it is left to right (mostly, there might be gotchas).

There has been talk (Coffee Corner Brainstorming, at least) of a hardening tool feature that would randomize the evaluation order. A question was whether to randomize at compile time, at load time or at call time (in the VM), and we got as far as concluding that it was an interesting question.

nzok · July 7, 2022, 9:44am

I’m probably going to confuse you,
but execution order is not the issue.
1>> F1 = fun (M) → fun (N) → math:pow(N, M) end end,

You bound F1 to a function of one argument, M.

2>> lists:map(F1(3), [1,2,3,4])

You called (the value of) F1 with one parameter, 3.
So the argument M in this call to F1 is bound to the value 3.
What ELSE could it be bound to?

The result of that call is now (equivalent to)

fun (N) → math:pow(N, 3).

IT DOES NOT MATTER WHEN THE CALL “F1(3)” IS EVALUATED.
What matters is that you are applying a function of one paramter
to one argument; whenever that happens, you’ll get something
equivalent to
fun (N) → math:pow(N, 3).

There are programming languages where a function’s arguments
must all be evaluated before the body of the function.
There are programming languages where they must NOT.
There are programming languages where it depends on how clever
the compiler is feeling that day.
In ALL of them you would get the same result.
You have a function that wants one argument.
You give it one argument.

So that is the argument it gets, and that is literally all there is to it.

Execution order has nothing to do with it.

leeyis · July 7, 2022, 10:18am

I seem to understand when you explain it that way; I don’t seem to understand (I feel like I don’t understand the underlying implementation of language), so I’ll take your point and digest it slowly.

max-au · July 8, 2022, 5:48am

It’s usually quite hard to make guesses on that, therefore I recommend measuring it.
There are tools like benchee and erlperf letting you do this with ease.

SisMaker · July 8, 2022, 6:25am

Oh, I didn’t notice that you replied to me again. Thanks for your advice. I think I may need to have a good look at Erlperf. For later i have write some code to test, make an anonymous function was usually several times or even 10 times slower.

raimo · July 8, 2022, 8:26am

10 times slower does not sound reasonable.
Please come back and show measurements…