Matching empty map? Performance of map_size?

rlipscombe · May 1, 2024, 8:41am

I’m aware that matching an empty map with #{} = Map doesn’t work – it matches any map. What bothers me is that the alternative (in a guard) is something like is_map(Map), map_size(Map) == 0.

We’ve been taught that using length(List) is O(n), and we should explicitly match [], which is O(1).

Or consider matching a non-empty list; we should use [_|_] = List in preference to length(List) /= 0.

My question, I guess: what is the performance of map_size/1? Why is there no is_map_empty/1 guard BIF?

TD5 · May 1, 2024, 9:07am

Looking at the implementation (otp/erts/emulator/beam/erl_map.c at master · erlang/otp · GitHub), it should be super fast since the size is explictly tracked in the value itself. Empirically, I’ve found map_size is much faster than any mechanism to attempt read a single value, in order to determine whether it is empty or not.

bjorng · May 1, 2024, 9:09am

The is_map/1 call is not necessary. map_size/1 will fail the guard if Map is not a map.

A map includes its size in the header word, so map_size/1 is O(1). Furthermore, the JIT will inline calls to map_size/1.

rlipscombe · May 1, 2024, 2:40pm

Yeah; I know. Some of the team was surprised by that, and preferred it to be explicit.

Cool, so nothing to worry about. Thanks.

It’s not in the documentation, though. Should it be?

wojtekmach · May 1, 2024, 3:04pm

Since for this use case we can’t do a pattern match and need to do a guard anyway, I’d consider this one:

when Map == #{}

bjorng · May 2, 2024, 8:12am

Perhaps it should be. We have discussed adding notes about the complexity for common functions.

A rule of thumb for naming in Erlang/OTP is that functions having size in their name are O(1), while functions having length in their name are O(N).

Yes, that is more direct way to test whether a map is empty. In Erlang/OTP 27, that happens to result in slightly smaller native code than map_size(Map) == 0. That is one of the miscellaneous optimizations that I happened to mention in my blog post. In Erlang/OTP 26 and earlier, Map == #{} would not be inlined by the JIT and thus be somewhat slower than map_size(Map) == 0.

Digging deeper, let’s look at the native code.

For map_size(Map) == 0 in a guard, the JIT in both Erlang/OTP 26 and Erlang/OTP 27 generates the following native code:

# bif_map_size_jsd
    mov rax, qword ptr [rbx]
    test al, 1
    jne label_3
    mov ecx, dword ptr [rax-2]
    and cl, 63
    cmp cl, 44
    jne label_3
L17:
L18:
    mov rax, qword ptr [rax+6]
    shl rax, 4
    or al, 15
    mov qword ptr [rbx], rax
# is_eq_exact_fss
# simplified check since one argument is an immediate
# simplified compare of BEAM register
    cmp rax, 15
    jne label_3

That is, the JIT inlines both the call to map_size/1 and the call to =:= (the Erlang compiler has replaced == with =:= because the latter is more efficient while giving the same result in this case).

For Map == #{}, the JIT in Erlang/OTP 26 generates the following code:

# is_eq_exact_fss
L22:
    long mov rsi, 9223372036854775807
    mov rdi, qword ptr [rbx]
    cmp rdi, rsi
    short je L21
    rex test dil, 1
    jne label_6
    mov rbp, rsp
    lea rsp, qword ptr [rbx-128]
    vzeroupper
    call 94546710368096
    mov rsp, rbp
    test eax, eax
    je label_6
L21:

Here the JIT generates code to call the general function for testing two Erlang terms for exact equality.

In Erlang/OTP 27, the JIT generates the following code:

# is_eq_exact_fss
# optimized equality test with empty map
    mov rdi, qword ptr [rbx]
    rex test dil, 1
    short jne label_3
    cmp dword ptr [rdi-2], 300
    short jne label_3
    cmp dword ptr [rdi+6], 0
    short jne label_3
# i_move_sd

That is, the comparison has been inlined.

michalmuskala · May 14, 2024, 4:19pm

This could potentially be optimised even further by making empty maps global literals similar to how an empty tuple is one (here), though this could complicate all maps operations to be careful to return it, rather than allocating a new one.

Matching *empty* map? Performance of map_size?

Matching empty map? Performance of map_size?