Runtime module compilation: is this the way?

Hi!

My main language is C++, but Erlang is heavily used in our company.

So I have a question about runtime compilation. I want to create a lib where the user can define some “paths” to a field in a map and change it. Like

matcher:add_path([a, b, c, d], fun(Value) -> change(Value) end).
matcher:add_path([a, b, e, f], fun(Value) -> change(Value) end).

I can do that through a recursive match on the map like

recursive_matcher([Key | Rest], Map) when is_map(Map) ->
    case maps:find(Key, Map) of
        {ok, Val} -> recursive_matcher(Rest, Val);
        error -> nomatch
    end;
recursive_matcher([], Val) -> {ok, Val};
recursive_matcher(_, _) -> error.

And I thought about dynamic compilation/reload and made a POC like

% exact_matcher.erl
-module(exact_matcher).
-export([add_path/1, compile/0]).
-define(PT_ADDED, {?MODULE, added}).

% Remember a path until the next compile/0 call; duplicates are ignored.
add_path(Path) when is_list(Path), Path =/= [] ->
    Pending = persistent_term:get(?PT_ADDED, []),
    case lists:member(Path, Pending) of
        true  -> ok;
        false -> persistent_term:put(?PT_ADDED, [Path | Pending])
    end,
    ok.

% Render one clause as source text, e.g. [a, b] becomes
% "do(#{a:=#{b:=V}}) -> {oki,V};\n".
make_function_clause(Clause) when is_list(Clause) ->
    Acc = "do(",
    Acc2 = lists:foldl(fun(C, A) -> string:concat(A, io_lib:format("#{~s:=", [atom_to_list(C)])) end, Acc, Clause),
    Val = string:concat(Acc2, "V"),
    Finishing = string:concat(Val, lists:duplicate(length(Clause), $})),
    list_to_binary(string:concat(Finishing, ") -> {oki,V};\n")).

% Write the generated source to exact_matcher.erl. Note that this only
% writes the file; compiling and loading it is a separate step.
compile() ->
    Pending = persistent_term:get(?PT_ADDED, []),
    {ok, F} = file:open("exact_matcher.erl", [write]),
    file:write(F, <<"-module(exact_matcher).\n">>),
    file:write(F, <<"-export([do/1]).\n">>),
    [ file:write(F, make_function_clause(Clause)) || Clause <- Pending ],
    file:write(F, <<"do(_) -> nomatch.\n">>),
    file:close(F),
    persistent_term:erase(?PT_ADDED),
    {ok, length(Pending)}.

It will generate function clauses in the exact_matcher.erl module:

% add_path([a, b, c])
% compile()

% exact_matcher.erl (skipped header)
do(#{a := #{b := #{c := V}}}) -> {oki, V};
do(_) -> nomatch.

In my tests

test_req(Count) ->
    timer:tc(fun() ->
        ntimes(fun() -> recursive_matcher([a, b, c], #{a => #{b => #{c => 1}}}) end, Count)
    end).

test_compiled(Count) ->
    timer:tc(fun() ->
        ntimes(fun() -> exact_matcher:do(#{a => #{b => #{c => 1}}}) end, Count)
    end).

The compiled version is about 30+% faster than the recursive matcher, and for me that is worth it.
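The ntimes helper used in these tests is not shown. A minimal sketch of what it might look like, assuming it simply calls the fun Count times and discards the results (the module name bench_sketch and the bench/2 wrapper are made up for illustration):

```erlang
-module(bench_sketch).
-export([ntimes/2, bench/2]).

%% Call Fun N times, ignoring its result.
ntimes(_Fun, 0) ->
    ok;
ntimes(Fun, N) when is_integer(N), N > 0 ->
    _ = Fun(),
    ntimes(Fun, N - 1).

%% Average wall-clock time per call, in microseconds (as a float).
bench(Fun, Count) when is_integer(Count), Count > 0 ->
    {Micros, ok} = timer:tc(fun() -> ntimes(Fun, Count) end),
    Micros / Count.
```

Numbers from a loop like this depend a lot on the input data and on how many iterations are run, so they are only a rough guide.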

But maybe this is a bad idea and the Erlang runtime is not suited to this kind of use.

My second thought is to make something like a rebar plugin: the user defines these paths in some file, describes mutators, etc., and rebar generates exact_matcher.erl in a precompile phase and compiles it with the application. That way there is no runtime compilation, and it would also work for me, since the user knows all the paths at compile time and there is no need for “full” runtime.
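Short of writing a full plugin, one way to sketch the precompile idea is a rebar3 pre_hooks entry that runs a generator script before compilation. The script path scripts/gen_matcher.escript is a hypothetical name; the script itself would have to read the path definitions and write src/exact_matcher.erl.

```erlang
%% rebar.config (fragment)
%% Run a generator before every compile so that the generated
%% src/exact_matcher.erl is built like any other module.
%% scripts/gen_matcher.escript is a hypothetical script name.
{pre_hooks, [
    {compile, "escript scripts/gen_matcher.escript"}
]}.
```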

What is the best option here?

Thanks

Personally, I would recommend using `parse_trans_codegen`, which allows you to provide actual Erlang code in the form of funs, which are transformed to code.

You can see examples here:

and here:

The function calls `codegen:gen_module/3` etc. are pseudo-functions that trigger the actual code generation. If you need source code too, I suggest you use a pretty-printer. The function `parse_trans_pp:pp_src/2` outputs pretty-printed source from forms, and `parse_trans_pp:pp_beam/[1,2]` generates pretty-printed source from .beam files that have been compiled with the `debug_info` option.

BR,
Ulf W

That sounds like an incredibly bad idea, and, judging by the numerous beginner mistakes in the metaprogramming code, the original recursive version of the code wasn’t so well thought-out either. Meta-code:

  1. If your intention is to write data to a file, there’s no reason to use string:concat; file:write accepts iolists.
  2. In io_lib:format("#{~s:=", [atom_to_list(C)]) the conversion to a list doesn’t make sense; "~p" can interpolate atoms.
  3. Typically one uses parse transforms and AST manipulation for metaprogramming in Erlang.
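Points 1 and 2 could look roughly like this. It is a sketch only: the module name clause_text is made up, and the output format mirrors the original make_function_clause.

```erlang
-module(clause_text).
-export([clause/1]).

%% Build the source text of one generated clause as an iolist and
%% flatten it to a binary once at the end. "~p" prints atoms in
%% source syntax, so atoms that need quoting stay valid code.
clause(Path) when is_list(Path), Path =/= [] ->
    Open = [io_lib:format("#{~p:=", [Key]) || Key <- Path],
    Close = lists:duplicate(length(Path), $}),
    iolist_to_binary(["do(", Open, "V", Close, ") -> {oki,V};\n"]).
```

For example, clause_text:clause([a, b]) produces the text do(#{a:=#{b:=V}}) -> {oki,V}; followed by a newline. When the target is a file, the iolist can be passed to file:write directly, without the final iolist_to_binary.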

Recursive code: use pattern-matching on maps instead of maps:find. That’ll remove an external function call and ok tuple construction from each iteration. With this optimization you’ll essentially get your generated version of the code + tail-recursive call:

match([Key | Rest], M) ->
  case M of
    #{Key := Val} ->
      match(Rest, Val);
    #{} ->
      nomatch;
    _ ->
      error
  end;
match([], V) ->
  {ok, V}.

Unscientific benchmarks show that this version can be twice as fast as your recursive_matcher, depending on the input data. Note: there’s almost never any reason to use the maps:find function.
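The same pattern-matching style can also cover the “change it” half of the original question. A minimal sketch, assuming the mutator is a one-argument fun and that the return values should mirror match/2 (the module name path_update and the function update/3 are made up for illustration):

```erlang
-module(path_update).
-export([update/3]).

%% Apply Fun to the value found at Path and rebuild the enclosing maps.
%% Returns {ok, NewMap}, nomatch if a key is missing, or error if a
%% non-map value is reached before the path is exhausted.
update([Key | Rest], Fun, M) ->
    case M of
        #{Key := Val} ->
            case update(Rest, Fun, Val) of
                {ok, NewVal} -> {ok, M#{Key := NewVal}};
                Other -> Other
            end;
        #{} ->
            nomatch;
        _ ->
            error
    end;
update([], Fun, V) ->
    {ok, Fun(V)}.
```

Unlike match/2, this one is not tail-recursive, because each level has to rebuild its map after the inner update returns.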


I agree that dynamic code generation is seldom the right answer, but if you do want to explore it and find out for yourself if it’s worth it, I still argue that using `parse_trans_codegen` is a cleaner approach. :wink:

BR,
Ulf W

Thanks for the answer. I will try your solution in a test.

But it looks like “the user has to define a bunch of functions (clauses) to match the structure they want to parse”. I want to simplify usage of this feature: the user defines only paths (mutators will come later, ok) and the lib just matches (as fast as possible). Remove the boilerplate about clauses and generate it on demand.

Thanks for the answer.

Yeah, 1-2 are not optimal, but that is not the performance issue here. But you answered it in your first words. Ok. It is a bad idea to dynamically put an updated module into the application.

Your function is really the fastest.

Thanks.

But if there are “undefined” maps to check, this solution becomes N times slower, right? If I want to check N paths, I have to check all of them one by one, not just match once as a function clause. Am I right?

Or is there no performance advantage in using N clauses instead of an N-iteration loop for matching?

It is a bad idea to dynamically put an updated module into the application.

I would not entirely rule out code generation, but I’d consider this approach only when everything else fails:

  1. Metaprogramming leads to meta-bugs
  2. Managing modules in the BEAM VM is not a trivial task: there’s no garbage collection for modules, and the VM keeps at most 2 versions of the code, so replacing modules has side effects. If you generate modules dynamically, you have to manage their lifetime strategically. For normal modules that are part of an OTP application, the application controller does all the dirty work, but for dynamically generated code you’re on your own.
  3. Alternatively, you can generate all modules at build time. But that complicates the build process; you’ll have to deal with custom build steps.
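To make point 2 concrete, here is a rough sketch of what “on your own” can involve: building the forms, compiling them in memory, and handling the old/current code versions by hand. Everything named here (the module matcher_loader, load/2, unload/1) is made up for illustration, and the sketch assumes atom keys and that the syntax_tools application (which provides erl_syntax) is available.

```erlang
-module(matcher_loader).
-export([load/2, unload/1]).

%% Nested map pattern for a path: [a, b] -> #{a := #{b := V}}.
pattern([], Var) ->
    Var;
pattern([Key | Rest], Var) when is_atom(Key) ->
    Field = erl_syntax:map_field_exact(erl_syntax:atom(Key),
                                       pattern(Rest, Var)),
    erl_syntax:map_expr([Field]).

%% Abstract forms for a module Mod exporting do/1, one clause per path.
forms(Mod, Paths) ->
    V = erl_syntax:variable('V'),
    Ok = erl_syntax:tuple([erl_syntax:atom(ok), V]),
    Clauses = [erl_syntax:clause([pattern(P, V)], none, [Ok]) || P <- Paths],
    Default = erl_syntax:clause([erl_syntax:underscore()], none,
                                [erl_syntax:atom(nomatch)]),
    Fun = erl_syntax:function(erl_syntax:atom(do), Clauses ++ [Default]),
    ModAttr = erl_syntax:attribute(erl_syntax:atom(module),
                                   [erl_syntax:atom(Mod)]),
    Arity = erl_syntax:arity_qualifier(erl_syntax:atom(do),
                                       erl_syntax:integer(1)),
    Export = erl_syntax:attribute(erl_syntax:atom(export),
                                  [erl_syntax:list([Arity])]),
    [erl_syntax:revert(F) || F <- [ModAttr, Export, Fun]].

%% Compile in memory and load. The previous "old" version must be gone
%% before a new one can be loaded; soft_purge refuses to remove it while
%% some process is still running it.
load(Mod, Paths) ->
    {ok, Mod, Bin} = compile:forms(forms(Mod, Paths), [return_errors]),
    case code:soft_purge(Mod) of
        true ->
            {module, Mod} = code:load_binary(Mod, atom_to_list(Mod) ++ ".erl", Bin),
            ok;
        false ->
            {error, old_code_in_use}
    end.

%% Remove the module completely. code:purge/1 kills any process that is
%% still executing old code, which is one of the side effects to plan for.
unload(Mod) ->
    _ = code:purge(Mod),   % drop a leftover old version, if any
    _ = code:delete(Mod),  % current version becomes old
    _ = code:purge(Mod),   % drop it as well
    ok.
```

Calling matcher_loader:load(my_matcher, [[a, b, c]]) would then make my_matcher:do/1 available, where my_matcher is whatever module name the caller picks.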

Or is there no performance advantage in using N clauses instead of an N-iteration loop for matching?

I am not sure I fully understand the question, but when you write a pattern-match expression

#{foo := #{bar := ...} ...}

with N nested lookups, the VM will have to perform N map lookups. There’s no compiler magic that can fix that, as far as I know.
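For reference, the “loop over N paths” variant from the question could be sketched as follows. The module name multi_path and the function first_match/2 are made up; match/2 is the function from the earlier reply, repeated so that the example is self-contained.

```erlang
-module(multi_path).
-export([match/2, first_match/2]).

%% Walk one path through nested maps.
match([Key | Rest], M) ->
    case M of
        #{Key := Val} -> match(Rest, Val);
        #{} -> nomatch;
        _ -> error
    end;
match([], V) ->
    {ok, V}.

%% Try each path in order and return the first one that matches.
first_match([Path | Rest], Map) ->
    case match(Path, Map) of
        {ok, V} -> {ok, Path, V};
        _ -> first_match(Rest, Map)
    end;
first_match([], _Map) ->
    nomatch.
```

Each attempted path costs at most as many lookups as it has keys, and paths that share a prefix repeat the shared lookups. Whether generated clauses would do better than this is something to measure rather than assume.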
