Benchmarking the compressed option for ETS and results are unexpected - any thoughts?

Hi!

I’ve been benchmarking the compressed option for ETS and results are unexpected.

I’m caching some HTTP data looking like:

{{<<19,192,227,106,66,196,34,54,198,194,138,214,145,246,99,219,83,180,136,48,7,
    36,164,223,155,25,229,200>>,
  <<218,109,20,124,161,64,36,178,117,238,234,77,188,216,216,109,161,134,64,
    124,101,163,83,236,131,237,198,158>>},
 #{<<"accept-encoding">> => <<"gzip">>},
 <<202,202,12,177,123,34,165,124,70,183,226,42,18,167,173,90,190,70,238,227,
   172,98,102,174,140,249,173,211,248,224,237,219>>,
 {200,
  [{<<"content-type">>,<<"text/html; charset=utf-8">>},
   {<<"cache-control">>,<<"s-maxage=1200, public">>},
   {<<"x-request-id">>,<<"Fy4_SNMQ8ZQSPmoAAAIk">>},
   {<<"x-frame-options">>,<<"SAMEORIGIN">>},
   {<<"x-xss-protection">>,<<"1; mode=block">>},
   {<<"x-content-type-options">>,<<"nosniff">>},
   {<<"x-download-options">>,<<"noopen">>},
   {<<"x-permitted-cross-domain-policies">>,<<"none">>},
   {<<"cross-origin-window-policy">>,<<"deny">>},
   {<<"vary">>,<<"accept-encoding">>},
   {<<"content-encoding">>,<<"gzip">>},
   {<<"content-length">>,<<"1045">>}]},
 #{alternate_keys => [2,3,5,7,11,20],
   created => 1670342093,expires => 1670343293,grace => 1670343413,
   parsed_headers =>
       #{<<"cache-control">> =>
             #{<<"public">> => undefined,<<"s-maxage">> => 1200},
         <<"content-encoding">> => [<<"gzip">>],
         <<"content-type">> =>
             {<<"text">>,<<"html">>,[{<<"charset">>,<<"utf-8">>}]}},
   ttl_set_by => header},
 1670342093}

and using the compressed option when creating the ETS table brings memory usage down by a factor of roughly 2.5 compared with an uncompressed table.
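
For reference, the memory comparison can be reproduced with something along these lines (a minimal sketch with made-up entries, not my actual cached responses; ets:info/2 reports memory in words):

%% Build the same N entries in an uncompressed and a compressed set
%% and compare their memory footprints (in words).
compare_memory(N) ->
    Plain      = ets:new(plain_cache, [set]),
    Compressed = ets:new(compressed_cache, [set, compressed]),
    Entries = [{Key, <<Key:224>>, #{created => erlang:system_time(second)}}
               || Key <- lists:seq(1, N)],
    true = ets:insert(Plain, Entries),
    true = ets:insert(Compressed, Entries),
    Result = {ets:info(Plain, memory), ets:info(Compressed, memory)},
    ets:delete(Plain),
    ets:delete(Compressed),
    Result.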

Regarding performance, I’ve tested ets:lookup/2 and ets:update_element/3 (to update an integer) with set and ordered_set tables of 1 million elements. The only case where the uncompressed table is faster is lookup in a set table (~3% faster). In all other cases, operations on the compressed table are faster by ~7%.
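
The timing part was along these lines (a simplified sketch, not my exact harness; it assumes objects of the form {Key, Payload, Counter} with integer keys):

bench(Tab, N) ->
    Keys = [rand:uniform(N) || _ <- lists:seq(1, N)],
    %% Time N lookups of whole objects.
    {LookupUs, _} = timer:tc(fun() ->
        [ets:lookup(Tab, K) || K <- Keys]
    end),
    %% Time N in-place updates of the integer in position 3.
    {UpdateUs, _} = timer:tc(fun() ->
        [ets:update_element(Tab, K, {3, K}) || K <- Keys]
    end),
    {LookupUs, UpdateUs}.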

Documentation states:

If this option is present, the table data is stored in a more compact format to consume less memory. However, it will make table operations slower. Especially operations that need to inspect entire objects, such as match and select, get much slower. The key element is not compressed.

I’ve taken a look at the code here and there, and as far as I understand the data is not compressed (with gzip, etc.) but stored in a more compact form in memory. Is that correct?

How can these benchmark results be explained? Better use of hardware caches thanks to better data locality when data is stored in compressed form?

Apart from the functions cited in the documentation (match and select), are there other trade-offs to be considered?

I never considered using this option before and now I feel like I’ve found a superpower, but I’d like to better understand it :blush:

Cheers!

Correct. It’s basically the external format produced by term_to_binary, but with some internal tweaks. Atoms are encoded with their internal atom table index, and big binaries are encoded as a pointer to the binary data (no data copy). It turned out to be surprisingly good both for speed and compactness.
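
(Not the actual ETS encoding, but for a rough intuition about why an external-format-like representation is compact, you can compare a term’s flat heap size with its term_to_binary size:)

%% Rough intuition only: the real compressed format differs in the
%% details (atom indexes, binary pointers), but the comparison gives a
%% feel for the size difference. erts_debug:flat_size/1 is an internal
%% helper that returns a term's heap size in words.
compact_intuition() ->
    Term = {<<"content-type">>, <<"text/html; charset=utf-8">>},
    HeapBytes = erts_debug:flat_size(Term) * erlang:system_info(wordsize),
    ExtBytes  = byte_size(term_to_binary(Term)),
    {HeapBytes, ExtBytes}.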

Yes, I suppose so. CPU cycles are cheap, memory bandwidth is expensive.

Insertions and lookups of entire tuples are probably not relatively more expensive, as the tuple must be copied to/from the process heap in the uncompressed case as well. For a compressed table, it’s just a slightly more complex copy operation.

The key element and “immediate” elements (small integers, atoms, local pids and port identifiers) are not compressed. Operations that may be relatively costly are those where a temporary tuple needs to be created in order to access non-key, non-immediate elements. That’s all operations with a match spec, ets:delete_object, and insertion into a bag when a duplicate key is found.
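
For example (hypothetical table holding {Key, Blob, Counter} objects), a plain lookup just copies the object out, while a select with a guard on a non-key element has to rebuild every stored tuple before the match spec can be evaluated:

lookup_vs_select(Tab, Key) ->
    %% Cheap: the key is stored uncompressed, and the object is decoded
    %% once when it is copied to the process heap.
    _ = ets:lookup(Tab, Key),
    %% Relatively costly on a compressed table: each object must be
    %% reconstructed so the guard on element 3 can be checked.
    ets:select(Tab, [{{'_', '_', '$1'}, [{'>', '$1', 100}], ['$_']}]).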

All those small binaries will compress well, as the per-binary overhead is much smaller than in the uncompressed representation. Strings (lists of bytes) will also compress very well: one byte per element instead of 16.
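
A quick way to see the string case (a sketch; exact numbers will vary with word size and table type):

%% Store the same 10 kB string (list of bytes) in an uncompressed and a
%% compressed table and compare the word counts from ets:info/2.
string_memory() ->
    Str = lists:duplicate(10000, $a),
    Plain      = ets:new(plain, [set]),
    Compressed = ets:new(comp, [set, compressed]),
    true = ets:insert(Plain, {str, Str}),
    true = ets:insert(Compressed, {str, Str}),
    Result = {ets:info(Plain, memory), ets:info(Compressed, memory)},
    ets:delete(Plain),
    ets:delete(Compressed),
    Result.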
