EEP 79: Deep Map Access

eproxus · February 21, 2025, 3:36pm

I’m proposing a new EEP: Deep Map Access.

Abstract

Maps in Erlang are frequently used to represent nested structures, such as JSON
data or configuration settings. This EEP proposes new library functions in the
maps module for accessing nested maps, which will make it easier to work with
such data.

The proposal suggests backwards-compatible additions to the standard library that have been proven through a reference implementation in the mapz library, which has been in production use for several years.
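To illustrate the kind of boilerplate the abstract refers to, here is a minimal sketch comparing nested access with today's `maps:get/2` against a path-based helper. The module name `deep_sketch` and the function `deep_get/2` (including its argument order) are assumptions made for this example only; they are not the API proposed in the EEP.

```erlang
-module(deep_sketch).
-export([nested_get/1, deep_get/2]).

%% Today: one maps:get/2 call per nesting level.
%% Raises {badkey, Key} if any level is missing.
nested_get(Config) ->
    maps:get(port, maps:get(http, maps:get(server, Config))).

%% Hypothetical helper (name and argument order are assumptions):
%% walk a list of keys, descending one map level per key.
%% lists:foldl/3 calls maps:get(Key, Acc) for each key in Path.
deep_get(Path, Map) ->
    lists:foldl(fun maps:get/2, Map, Path).
```

With this sketch, `deep_sketch:deep_get([server, http, port], Config)` returns the same value as `deep_sketch:nested_get(Config)`, and both raise if a key along the path is missing.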

I’m looking for general feedback, but before any detailed discussion of the concrete proposal takes place, I feel that the Rationale section should be discussed, with a focus on the alternatives considered. I’m particularly interested in hearing from anyone who has implemented similar solutions or has experience with the alternatives mentioned. This will prevent further unnecessary effort going into the proposal if it is not heading in the right direction.

The draft also lists some open questions at the end, and it would be helpful if the community weighed in on them.

Looking forward to your opinions and criticism!

rlipscombe · February 21, 2025, 4:54pm

fwiw, there’s a deep map implementation in our Kafka client, here: kafine/src/kafine_maps.erl at 0.7.0 · happening-oss/kafine · GitHub (unit tests are in the usual place).

At the time, I was aware of mapz, but I wanted to keep kafine fairly light on dependencies.

eproxus · February 21, 2025, 6:51pm

Great to see more prior art! I added a reference to it.

josevalim · February 22, 2025, 9:31am

Thanks for sharing the proposal! In my opinion, dealing only with maps is too limiting. JSON is used as an example, but traversing lists nested inside keys is very common in JSON and requires even more boilerplate than nested keys do, yet this whole subset of problems was declared out of scope.

It is also worth saying that the ambiguity in lenses/access is not an intrinsic property of those solutions but an implementation choice. For example, in Elixir, map[:a][:b] is for key-value pairs only, and trying to replace the key a in [{a, 1, 2}] will raise.

To be more concrete, a general solution could be to specify the operation you want to perform alongside each key. For example, if you want to support maps and tuples, you could do:

1> access:get([{key, a}, {element, 2}], #{a => {1, 2, 3}}).
2

In this case, there is zero ambiguity: key works with maps, element works with tuples, and anything else would raise. This would allow you to add as many operations as you want, that can be arbitrarily nested, such as all that could traverse lists:

1> access:get([{key, a}, all, {element, 2}], #{a => [{1, 2, 3}, {4, 5, 6}]}).
[2, 5]
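A minimal sketch of how such tagged operations could be dispatched, assuming only the three operations used in the examples above (`{key, K}`, `{element, N}` and `all`). The module name `tagged_access` is hypothetical, and this is not an existing library.

```erlang
-module(tagged_access).
-export([get/2]).

%% Each path step names its operation explicitly, so a step is only
%% ever applied to the data type it was written for. Any mismatch
%% (for example {key, a} on a tuple) fails with function_clause.
get([], Data) ->
    Data;
get([{key, Key} | Rest], Data) when is_map(Data) ->
    get(Rest, maps:get(Key, Data));
get([{element, N} | Rest], Data) when is_tuple(Data) ->
    get(Rest, element(N, Data));
get([all | Rest], Data) when is_list(Data) ->
    %% Apply the remaining path to every list item.
    [get(Rest, Item) || Item <- Data].
```

Supporting a new operation means adding a clause to `get/2`, which is why all operations have to be defined upfront in this design.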

This starts to resemble XPath and similar, where you have a richer API to traverse and update data structures. The downsides of this approach are two:

- If you only want to access map keys, wrapping everything in a {key, ...} can become verbose.
- All of the operations must be defined upfront. Users don’t have the ability to add their own traversals.

What lenses/access tell you is that there is a generic API we can define for anyone to apply any operation they want on the data type of their choice. Protocols are not a requirement and can be skipped altogether. So here is another possible solution to the problem:

- If you pass any value, it is assumed to be a map key, so you can access a nested map as: access:get([foo, bar], Map). This provides convenience for the most common use cases.
- However, if the value is a 3-arity function, then said function must implement a contract so you can traverse any data structure.

Here is an implementation of the proposal above:

-module(access).
-export([get/2, key/1, all/0]).

% Main get function that handles map traversal with a list of keys
% as well as custom selectors
get([Key | Keys], Data) when is_function(Key, 3) ->
  Key(get, Data, fun(Value) -> get(Keys, Value) end);
get([Key | Keys], Data) when is_map(Data) ->
  get(Keys, maps:get(Key, Data));
get([], Data) ->
  Data.

% Creates a selector that gets a specific key
key(Key) ->
  fun(get, Map, Next) when is_map(Map) ->
    Next(maps:get(Key, Map))
  end.

% Creates a selector that traverses a list
all() ->
  fun(get, List, Next) when is_list(List) ->
    lists:map(Next, List)
  end.

And now I can use it as:

6> c(access).
{ok,access}
7> access:get(
     [languages, access:all(), name],
     #{languages => [#{name => erlang}, #{name => elixir}]}
   ).
[erlang,elixir]

Notice how we can traverse maps, using bare keys, and then use selectors to traverse any data structure. In this case we used access:all/0 for lists, but you could add one for tuples, another for proplists, etc. You can create any selector you want, strict or relaxed, and all you have to do is to define a function that expects certain arguments (in Elixir, the selector has to support two operations, get and get_and_update).
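As a sketch of what additional selectors could look like under the same 3-arity contract (`get`, the data, and a continuation), here are two strict ones for tuples and for key-value lists. The module name `access_selectors` and the functions `elem/1` and `prop/1` are hypothetical names chosen for this example.

```erlang
-module(access_selectors).
-export([elem/1, prop/1]).

%% Selector for the N-th element of a tuple.
%% Anything that is not a tuple fails with function_clause.
elem(N) ->
    fun(get, Tuple, Next) when is_tuple(Tuple) ->
        Next(erlang:element(N, Tuple))
    end.

%% Strict selector for a {Key, Value} pair in a list.
%% A missing key fails with badmatch, because lists:keyfind/3
%% returns false. Bare atoms in a proplist are not handled here.
prop(Key) ->
    fun(get, List, Next) when is_list(List) ->
        {Key, Value} = lists:keyfind(Key, 1, List),
        Next(Value)
    end.
```

Assuming the access module above is compiled, `access:get([point, access_selectors:elem(2)], #{point => {1, 2, 3}})` returns 2. A relaxed variant of `prop/1` could instead pass a default value to the continuation when the key is absent.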

The only ambiguity in the above is that, if you are storing a 3-arity function as a map key, you need to wrap said key in the access:key/1 selector, but this is extremely rare in practice, and a price more than worth paying in my opinion.

In any case, I am not advocating for this solution in particular. I just want to point out that the ambiguity is not intrinsic and that there are many decisions that could be made to have a rich API that goes beyond maps (and in very few lines of code).

eproxus · February 22, 2025, 9:52am

This is exactly the type of feedback and discussion I was looking for. I had similar thoughts while finishing this draft.

The intention with marking it as out of scope for this particular EEP draft was to get something out to start the discussion (with the thinking that “this is a real need and this solution has been ‘working’ for a couple of years for many, so it’s better than nothing”).

I’m totally fine with exploring your idea and ditching this map-specific direction for the EEP in favor of another one. The main priority for me is that Erlang gets some way to access and update common nested structures.

I’ll take a look at your solution in depth, to get a feel for whether it holds up for both access and different types of updates (updating, deleting, merging etc.). I’ll see if I can write a proof-of-concept implementation and possibly an EEP draft for it, if you don’t mind. Let me know if you’re interested in collaborating on that.

josevalim · February 22, 2025, 10:58am

Elixir can do get, update and delete through two operations, get and get_and_update (which may return pop). We don’t do merge yet, so I am curious if it needs new operations in the selector or not. Interesting…
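A rough Erlang sketch of how update and delete could both be derived from a single get_and_update operation, loosely modeled on the description above. This is not Elixir's actual implementation. The module name `access_update_sketch` and all function names are assumptions, and merge is deliberately left out.

```erlang
-module(access_update_sketch).
-export([get_and_update/3, update/3, delete/2, all/0]).

%% Fun receives the value at the end of the path and returns either
%% {Get, NewValue} or the atom pop. The result is {Get, NewData}.
get_and_update([Key], Data, Fun) ->
    step(Key, Data, Fun);
get_and_update([Key | Keys], Data, Fun) ->
    step(Key, Data, fun(Value) -> get_and_update(Keys, Value, Fun) end).

%% A 3-arity function is a custom selector; anything else is a map key.
step(Selector, Data, Next) when is_function(Selector, 3) ->
    Selector(get_and_update, Data, Next);
step(Key, Map, Next) when is_map(Map) ->
    Value = maps:get(Key, Map),
    case Next(Value) of
        {Get, New} -> {Get, Map#{Key := New}};
        pop -> {Value, maps:remove(Key, Map)}
    end.

%% List selector: updates every element, dropping popped ones.
all() ->
    fun(get_and_update, List, Next) when is_list(List) ->
        lists:foldr(
          fun(Elem, {Gets, News}) ->
                  case Next(Elem) of
                      {Get, New} -> {[Get | Gets], [New | News]};
                      pop -> {[Elem | Gets], News}
                  end
          end, {[], []}, List)
    end.

%% update and delete expressed through get_and_update.
update(Path, Data, Fun) ->
    {_, New} = get_and_update(Path, Data, fun(V) -> {V, Fun(V)} end),
    New.

delete(Path, Data) ->
    {_, New} = get_and_update(Path, Data, fun(_) -> pop end),
    New.
```

For example, `access_update_sketch:delete([a, b], #{a => #{b => 1, c => 2}})` returns `#{a => #{c => 2}}`. An empty path is not handled in this sketch.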

I will be glad to review and discuss but I probably cannot engage as a co-author.