Need help understanding some Erlang code

Maybe I wrote the wrong test code; I just wanted to see the time consumption of creating a fun (make_fun).

ams1(A) -> A.
ams2(A) -> fun() -> A end.

amf1(0, A, B, C, _) -> ok;
amf1(N, A, B, C, _) ->
    amf1(N - 1, A, B, C, 1).

amf2(0, A, B, C, _) -> ok;
amf2(N, A, B, C, _) ->
    Fun = fun(One1) -> X = (One1 * A * B * C), Y = B - C, Z = X / Y, Z end,
    amf2(N - 1, A, B, C, Fun).

The test results:

24> utTc:ts(1000000, funTest, ams1, [1]).
=====================
execute Args:[1]
execute Fun :ams1
execute Mod :funTest
execute LoopTime:1000000
MaxTime: 112800(ns) 0.000113(s)
MinTime: 200(ns) 0.0(s)
SumTime: 314857400(ns) 0.314857(s)
AvgTime: 314.8574(ns) 0.0(s)
Grar : 116553(cn) 0.12(%)
Less : 883447(cn) 0.88(%)
=====================
ok
25> utTc:ts(1000000, funTest, ams2, [1]).
=====================
execute Args:[1]
execute Fun :ams2
execute Mod :funTest
execute LoopTime:1000000
MaxTime: 11801000(ns) 0.011801(s)
MinTime: 200(ns) 0.0(s)
SumTime: 404538500(ns) 0.404539(s)
AvgTime: 404.5385(ns) 0.0(s)
Grar : 118765(cn) 0.12(%)
Less : 881235(cn) 0.88(%)
=====================
ok
26> utTc:ts(1000000, funTest, ams2, [1]).
=====================
execute Args:[1]
execute Fun :ams2
execute Mod :funTest
execute LoopTime:1000000
MaxTime: 4434400(ns) 0.004434(s)
MinTime: 200(ns) 0.0(s)
SumTime: 390178500(ns) 0.390179(s)
AvgTime: 390.1785(ns) 0.0(s)
Grar : 420574(cn) 0.42(%)
Less : 579426(cn) 0.58(%)
=====================
ok
27> utTc:ts(1, funTest, amf1, [100000000, 1, 2, 3, 4]).
=====================
execute Args:[100000000,1,2,3,4]
execute Fun :amf1
execute Mod :funTest
execute LoopTime:1
MaxTime: 129498900(ns) 0.129499(s)
MinTime: 129498900(ns) 0.129499(s)
SumTime: 129498900(ns) 0.129499(s)
AvgTime: 129498900.(ns) 0.129499(s)
Grar : 0(cn) 0.00(%)
Less : 1(cn) 1.00(%)
=====================
ok
28> utTc:ts(1, funTest, amf2, [100000000, 1, 2, 3, 4]).
=====================
execute Args:[100000000,1,2,3,4]
execute Fun :amf2
execute Mod :funTest
execute LoopTime:1
MaxTime: 1571027600(ns) 1.571028(s)
MinTime: 1571027600(ns) 1.571028(s)
SumTime: 1571027600(ns) 1.571028(s)
AvgTime: 1571027600(ns) 1.571028(s)
Grar : 0(cn) 0.00(%)
Less : 1(cn) 1.00(%)
=====================


Are ams1/1, ams2/1, amf1/5 and amf2/5 compiled in the module funTest?

What is utTc:ts/4?

Assuming the above is what I suppose it is:

Comparing ams1/1 vs. ams2/1: calling a function that does nothing but return its argument vs. calling a function that returns a fun of arity 0 with 1 environment variable. 315 vs. 405|390 ns. It seems to take roughly 90|75 ns to construct such a fun.

Comparing amf1/5 vs. amf2/5: a loop over an arity-5 function, one that does nothing but loop vs. one that constructs a fun of arity 1 with 3 environment variables (A, B, C) and then loops. 1.3 ns per loop iteration vs. 15.7 ns. It seems to take about 14.4 ns to construct such a fun.

Attempting a conclusion

I think any result of the first comparison is hidden in benchmark overhead that seems to add about 300 to 400 ns when measuring something, so no conclusive result on the fun creation here.

The second comparison has probably better numbers since the measurement time is much larger than the benchmark overhead thanks to the 10^8 iterations. But comparing bare loop time vs. constructing a term does not say much. Comparing constructing a 4-tuple vs. constructing a fun/1 with 3 environment variables would be more interesting.
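A minimal sketch of that comparison, in the same style as amf1/5 and amf2/5 above (amt/5 and amg/5 are made-up names for illustration, not functions from the module under discussion): one loop allocates a 4-tuple per iteration, the other a fun/1 capturing A, B and C.

amt(0, _A, _B, _C, _) -> ok;
amt(N, A, B, C, _) ->
    Tuple = {A, B, C, N},                      % builds a 4-tuple each iteration
    amt(N - 1, A, B, C, Tuple).

amg(0, _A, _B, _C, _) -> ok;
amg(N, A, B, C, _) ->
    Fun = fun(One) -> One * A * B * C end,     % builds a fun/1 with 3 free variables
    amg(N - 1, A, B, C, Fun).

If added to the same funTest module, they could be run the same way as above, e.g. utTc:ts(1, funTest, amt, [100000000, 1, 2, 3, ok]).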


How about comparing these two (iterate/1 vs. iterate_fun/1) with 10^8 iterations:

iterate(N) ->
    iterate_loop({N, 1}).

iterate_loop({0, _}) ->
    ok;
iterate_loop({N, M}) ->
    iterate_loop(subtract(N, M)).

subtract(N, M) ->
    {N - M, M}.


iterate_fun(N) ->
    iterate_fun_loop(subtract_fun(N), 1).

iterate_fun_loop(Fun, M) ->
    case Fun(M) of
        0 -> ok;
        N -> iterate_fun_loop(subtract_fun(N), M)
    end.

subtract_fun(N) ->
    fun (M) -> N - M end.

That should compare a loop over calling subtract/2 vs. a loop over constructing an arity 1 fun with 1 environment variable and also calling it. Each fun is called only once after construction. Both loops create a comparable amount of garbage (a 2-tuple vs. a fun with 1+1 variables) per iteration.
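As a rough sanity check on the "comparable amount of garbage" point, erts_debug:size/1 reports the heap size of a term in words (a debug BIF; the exact size of a fun varies between OTP releases, so treat the numbers as indicative only). A hypothetical helper, just for illustration:

%% Heap footprint of what subtract/2 allocates (a 2-tuple) vs. what
%% subtract_fun/1 allocates (a fun capturing N).
garbage_sizes(N, M) ->
    Tuple = {N - M, M},
    Fun = fun (X) -> N - X end,
    {erts_debug:size(Tuple), erts_debug:size(Fun)}.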


Are ams1/1, ams2/1, amf1/5 and amf2/5 compiled in the module funTest?
Yes, they are.
What is utTc:ts/4?
It is like timer:tc/3; you can see it here: eUtils/utTc.erl at master · ErlGameWorld/eUtils · GitHub

My goal was simply to measure the time overhead of creating a fun (make_fun); the only difference between the two functions is that one creates an additional anonymous function.
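For reference, the same kind of single-shot measurement can be done with the stdlib's timer:tc/3, which returns {MicroSeconds, Result} without the min/max/avg statistics, e.g.:

%% Wall-clock time in microseconds for the whole 10^8-iteration loop.
{MicroSecs, ok} = timer:tc(funTest, amf1, [100000000, 1, 2, 3, 4]).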


The results of these two functions:

Eshell V13.0.2 (abort with ^G)
1> utTc:ts(5, funTest, iterate, [100000000]).
=====================
execute Fun :iterate
execute Mod :funTest
execute LoopTime:5
MaxTime: 1765140131(ns) 1.76514(s)
MinTime: 1734687166(ns) 1.734687(s)
SumTime: 8786313785(ns) 8.786314(s)
AvgTime: 1757262757(ns) 1.757263(s)
Grar : 4(cn) 0.80(%)
Less : 1(cn) 0.20(%)
=====================
ok
2> utTc:ts(5, funTest, iterate_fun, [100000000]).
=====================
execute Fun :iterate_fun
execute Mod :funTest
execute LoopTime:5
MaxTime: 3445080471(ns) 3.44508(s)
MinTime: 3258557401(ns) 3.258557(s)
SumTime: 1683358147(ns) 16.833581(s)
AvgTime: 3366716294(ns) 3.366716(s)
Grar : 3(cn) 0.60(%)
Less : 2(cn) 0.40(%)
=====================
ok

From the results, there is only about a 2x difference, which seems like the expected result.


Something unrelated to your question. How come you’ve implemented utTc:cvrTimeUnit/3 instead of using erlang:convert_time_unit/3?

Note that utTc:cvrTimeUnit/3 may stop working at any time since it is using ERTS internal functions that may be changed or removed at any time (even in an emergency patch) without any notice.
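For completeness, a minimal sketch of doing the common conversion with documented BIFs only (measuring in native units and converting to nanoseconds):

%% Measure a 0-arity fun and return the elapsed time in nanoseconds,
%% using only documented BIFs.
measure_ns(Fun) ->
    T0 = erlang:monotonic_time(),
    _ = Fun(),
    T1 = erlang:monotonic_time(),
    erlang:convert_time_unit(T1 - T0, native, nanosecond).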


Thanks for the reminder. utTc:cvrTimeUnit/3 is simply a rewrite of erlang:convert_time_unit/3, replacing 1000 * 1000 * 1000 with 1000000000 to avoid the multiplication. If one day it stops working, I'll fix it.
A lot of the time I like to rewrite code like this.


The 1000 * 1000 * 1000 multiplication will be performed by the compiler at compile time, producing the constant 1000000000 in the BEAM file, so these multiplications add no runtime overhead at all.


Oh, I didn't know that the compiler did this optimization.
I wrote this test code:

t() ->
    A = 1000 * 1000 - 1,
    B = 4342,
    A + B.

The generated assembly code:

{function, t, 0, 2}.
{label,1}.
{line,[{location,"t.erl",5}]}.
{func_info,{atom,t},{atom,t},0}.
{label,2}.
{move,{integer,1004341},{x,0}}.
return.

But I don't think I've ever seen this optimization mentioned anywhere :sweat_smile:

This piqued my interest. Just out of curiosity, do you mean this as a way to test one's code to see if the program fails when an (ill-)assumed evaluation order gets changed, or could there be actual benefits from having a randomized execution order in such cases?


A tiny nitpick regarding benchmarking: it is often more useful to report the median time than the average time, since the median removes larger outliers more effectively (although you probably want to show both).

It might also be useful to have a warmup run before the actual benchmark to avoid code-loading/caching/JIT/etc. issues.
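A minimal sketch of both suggestions combined (a hypothetical helper, not part of utTc): one warmup call, then N timed runs, reporting both the median and the average in nanoseconds.

bench(N, Mod, Fun, Args) when N > 0 ->
    _ = apply(Mod, Fun, Args),                      % warmup: code load, caches, JIT
    Times = [begin
                 T0 = erlang:monotonic_time(),
                 _ = apply(Mod, Fun, Args),
                 T1 = erlang:monotonic_time(),
                 erlang:convert_time_unit(T1 - T0, native, nanosecond)
             end || _ <- lists:seq(1, N)],
    Sorted = lists:sort(Times),
    Median = lists:nth((N + 1) div 2, Sorted),      % lower median for even N
    Avg = lists:sum(Times) div N,
    #{median_ns => Median, avg_ns => Avg}.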


The intent, I think, was just to not always take the same path, which can hide faulty assumptions forever. Randomizing can expose such assumptions, e.g. through test cases.
