A few questions about the 'select_val' instruct

Hi all
I wrote a library of Aho Corasick’s algorithm in Erlang here. i tried every means to optimize its performance but did not get satisfactory results. can someone give me some advice please.

Add {d, 'DEBUG'} to rebar.config and compile.

Call the following function.

aho_corasick:gen_acs_by_filename(" keywords",asc).
{ok, acs}.

an acs.erl file with many function clauses is generated like this:


success(342) → {20250,343};
success(174) → {21806,175};
success(257) → {113,258};
success(142) → #{22920 => 143,29240 => 145};
success(182) → {35265,183};
success(159) → {30005,160};

success(228) → {36733,229};
success(328) → #{21608 => 329,30701 => 339};
success(313) → {21806,314};
success(105) → {25506,106};
success(263) → {113,264};
success() → false.

failure(234) → {239, undefined};
failure(318) → {288, undefined};
failure(174) → {313, undefined};
failure(142) → {260, undefined};

failure(265) → {0, 3};
failure(11) → {0, 3};
failure(205) → {0, 3};
failure(62) → {0, 2};
failure(343) → {0, 3};
failure(
) → {0, undefined}.

last

aho_corasick:matches(acs, SoneSensitiveWords).
[{4,2},{4,1},{3,1},{2,1},…]

I looked at how the ‘select_val’ instruct is implemented so i have a few questions

  1. Will a large number of function clauses match faster than Maps? (>100000)
  2. Will a large number of function clauses can jump by looking up the table?
  3. Do you have any other optimization scheme?

respect !

1 Like

I don’t know. It probably depends on what kind of keys are used. To find out, you will have to run benchmarks.

Yes, a jump table will be used if the keys are suitable. If not, a binary search will be used (unless there are only a few values, in which case a linear search will be used).

The JIT does some additional optimizations. See: jit: Optimized code generation for select_val

The match and matches functions in the binary module use the Aho-Corasick algorithm if more than one pattern is given (implemented in C in the runtime system).

1 Like

:grinning:
I saw the code for the jump table section in ‘select_instrs.tab’. It is verified in the assembly code generated by calling ‘erts_debug:df(acs)’.

1 Like