Do we really need nil values in maps?

Oliver · October 29, 2021, 1:07pm

I’m struggling in elixir with the behavior of keys and nil values. But I see that the origin of the problem is partially in how Erlang defines maps:

1> Val = #{}.
#{}
2> Input@1 = case Val of
2> #{'daps-PowerCoordinationInfo-r16' := Input@1_0} -> Input@1_0;
2> _ -> asn1__MISSING_IN_MAP
2> end.
asn1__MISSING_IN_MAP

This code checks for the presence of a key in the map through matching. (Taken from generated asn1 codec.) You can see how it evaluates to key missing.

3> Val2 = #{'daps-PowerCoordinationInfo-r16' => nil}.
#{'daps-PowerCoordinationInfo-r16' => nil}
4> Input@2 = case Val2 of
4> #{'daps-PowerCoordinationInfo-r16' := Input@1_0} -> Input@1_0;
4> _ -> asn1__MISSING_IN_MAP
4> end.
nil

If we provide a similar map with the key present but set to nil, the same code will evaluate to nil.

So “key presence” is different from “value of key is nil.” Which, if looked at by itself, is not a problem. It becomes however a problem in my view when looking at what distinctions you have to make when filling and evaluating maps.

Erlang is more consistent here than what elixir has built on top of it. In the maps module, maps:get/2 raises a badkey exception if the key is not present, while maps:get/3 returns the default. For comparison, elixir’s Map.get/3 behaves like maps:get/3, but Map.get/2 returns nil for a missing key. This is more of an interop pitfall.

In elixir you have to be aware:

x = %{} # empty map
x.key # raises KeyError if the atom :key is not a key in x
x[:key] # returns nil
Map.get(x, :key) # returns nil
Map.get(x, :key, "default!") # returns "default!"
Map.has_key?(x, :key) # returns false

If you have a key present but it is nil:

x = %{key: nil}
x.key # returns nil
x[:key] # returns nil
Map.get(x, :key) # returns nil
Map.get(x, :key, "default!") # returns nil
Map.has_key?(x, :key) # returns true

This is still consistent, but becomes very unwieldy. I’m not trying to assert that elixir’s somewhat confusing semantics here are all rooted in Erlang, but I want to show, returning to the initial example, how they cause code to become hard to understand and convoluted:

We had code as follows:

%{"daps-PowerCoordinationInfo-r16": is_nil(value) && :asn1__MISSING_IN_MAP || value}

Actually we were trying to update large maps (for ASN.1 encoding) in the declarative syntax. But there is no native way to say “This value is present only if.” So what was done here instead was to insert the value that the ASN.1 codec’s encode function would use internally for a missing key, so that we could still use the declarative syntax.

The ASN.1 codec does what I wrote above: it matches for the presence of the key. Matching for the presence of a key in the map, calling elixir’s Map.has_key?/2, and calling Erlang’s maps:is_key/2 all succeed if a key is present but its value is nil.
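A minimal Elixir sketch of that point, using a hypothetical map `msg` with a hypothetical key `:info`. It is only an illustration, not code from the ASN.1 codec:

```elixir
# Hypothetical map: the key is present, but its value is nil.
msg = %{info: nil}

# Pattern matching on the key succeeds and binds the nil value.
matched =
  case msg do
    %{info: value} -> {:present, value}
    _ -> :missing
  end

# Both the Elixir and the Erlang presence checks report the key as present.
elixir_check = Map.has_key?(msg, :info)
erlang_check = :maps.is_key(:info, msg)

IO.inspect({matched, elixir_check, erlang_check})
# prints {{:present, nil}, true, true}
```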

So, in most situations you have to make extremely fine distinctions whether the key is present or if it’s more important that its value is nil. elixir makes this a bit more confusing by some operator/function semantics that default to nil for the caller’s convenience. But the declarative syntax becomes impossible to use if there’s a scenario where the value might not be present, leading to a series of individual updates instead.

What we are currently contemplating in our codebase is either to create a function that puts the value only if it is not nil and do the updates individually, or to write a function that removes all keys whose value is nil from a map.

The question is: Is the general ability to have nil values in a map so important to actually convolute code built on it? Or would it also be possible to switch to a behavior where keys with value nil are never inserted in a map?

I’m sure scenarios exist where people will appreciate the presence of a key with value nil. But in the case of our code base, the inability to use declarative syntax to insert an element only if it is not nil has been a problem. (The same goes for using declarative syntax for lists.) We then write code for modifying maps or gluing together different lists, and it never gets easy to understand.

OvermindDL1 · October 29, 2021, 7:39pm

Do you mean nil as in the atom nil (which is just a word/atom, not anything special), or the elixir nil value (which is also just the atom nil) or the internal erlang name of the empty list [] (which is nil)? Regardless, all of those are valid values, none of them represent anything like ‘missing’ or so unless you specifically define it as such, so a map missing a key is definitely distinct from a map holding one of them: just as the atom blah (or in elixir’ese :blah) is something, nil (the atom or the empty list) is something, not nothing.

Personally, when using elixir I more often use the erlang calls rather than elixir calls as I don’t like “fail silent” code, I want things to crash then and there when I do something wrong and elixir’s design of passing back the nil atom often hides that. ^.^;

NobbZ · October 29, 2021, 8:13pm

I have a bit of a problem following your examples, as they are not fully valid elixir. Have you checked Map.fetch/2, though? It clearly distinguishes between a key that exists and has a value and a key that does not exist.

Your Erlang examples still have a problem… What if my value is asn1__MISSING_IN_MAP?

tsloughter · November 9, 2021, 9:05pm

I was going to say similar. I have trouble following the post. Erlang has no nil (ok, it does, but it isn’t “special” like it is in Elixir and undefined is used in Erlang instead – tangent: one of the more or most annoying things when wanting libraries to be usable idiomatically between the two languages).

Maybe rewrite some of the questions and code in Erlang without use of nil so it is clearer if you are talking about the atom always or that something doesn’t exist.

OvermindDL1 · November 9, 2021, 9:09pm

I really wished that elixir mapped its nil special type to the :undefined atom instead of to the :nil atom (all in elixir syntax for these code examples). Would have worked so much better with erlang (and most other beam languages) integration.

Gwaeron · November 10, 2021, 4:27am

Erlang doesn’t have the problem, as you described maps:get/2 will generate an exception. I’m assuming that “the problem” is that you don’t know if the key didn’t exist, or if it existed with the value nil.

You mentioned Erlang being more consistent with what you need, so I think it’s perfectly valid to just use the maps module from Erlang instead of Map. The maps:find/2 function either returns {ok, Value}, error, or crashes with bad map. No ambiguity.

NobbZ · November 10, 2021, 6:50am

As I said, elixir has Map.fetch/2 as well, which returns either {:ok, value} or :error.
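A short sketch of how `Map.fetch/2` separates the two cases. The map contents and the `describe` function are made up for illustration:

```elixir
# Key present with value nil versus key missing.
present_nil = Map.fetch(%{key: nil}, :key)
missing = Map.fetch(%{}, :key)

IO.inspect(present_nil) # prints {:ok, nil}
IO.inspect(missing)     # prints :error

# Hypothetical helper that branches on all three situations.
describe = fn map ->
  case Map.fetch(map, :key) do
    {:ok, nil} -> "present, value is nil"
    {:ok, value} -> "present: #{inspect(value)}"
    :error -> "missing"
  end
end

IO.puts(describe.(%{key: 1}))
# prints present: 1
```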

I do not understand the ambiguity.

We have bracket access for situations where we do not care as syntactic sugar for Map.get/3 with a default value of nil.

When you care about whether a key actually existed, you use the appropriate function.

Erlang actually has maps:get/3 as well.

Gwaeron · November 10, 2021, 6:59am

Yep, agreed, if anything it seems like it’s a misunderstanding of a feature, not an oversight. Without knowing that much about Elixir’s libraries, I just went with the OP’s claim that he thought Erlang’s maps behaved more like he expected.

maps:get/3 would be ambiguous in the sense that you can’t tell if there was an actual key with a value that happened to be the same as the default, but the obvious solution in any scenario where you would need to tell the difference, is to not use maps:get/3.

Oliver · November 10, 2021, 7:26am

Hello.

No, I don’t have a problem with checking whether a value exists in the map, nor with distinguishing it from the value nil. My problem is that using declarative syntax in elixir is often not possible except for trivial scenarios, or you have to filter the map afterward in some form. We often first write maps in declarative syntax:

%{ hello: :world, a_thousand_other_parameters: …}

Then the scenarios (as code evolves) get more complex, and the filling of a parameter or its presence often depend on a condition. We now have the choice to:

Separate out all conditional parameters into map update function calls of some form.
Filter the map afterwards for unwanted values.

Both are clunky in my opinion and diminish the value of declarative syntax.

I originally thought, when opening this thread, that this problem might be rooted in how Erlang’s map implementation and its maps module operate. The feedback received here makes me think that the problem was introduced by elixir’s operators and Map module, which for several calls assign the value nil the special meaning “not present.” Or at least part of the problem…

Thanks for the feedback that nil is definitely not a special value here. The difference between maps:get/2 and Map.get/2 is instructive: the first raises an exception and the second returns nil for a missing key. So if you want non-crashing code, you have to define your own special value when using maps:get/3, and I guess it’s up to the creator of a map to define a value that carries any special semantic meaning, if any.

Even in elixir these ambiguities can be resolved to satisfaction when knowing the Map module and semantics well. It’s just clunky (not the API, but the code you have to write to use it robustly), but it is not wrong as such (which I didn’t claim). It could be improved upon if there was a value or construct that could say “value not present” when defining a map in declarative syntax.

The problem basically exists also for Erlang declarative map syntax as well as code gets more complex. It just has no connection to the value nil at all.

NobbZ · November 10, 2021, 7:40am

I understand this post even less. If you want to have a map without a certain key, then just do not specify it, or drop it if you modify a pre-existing value.

Both Erlang and Elixir provide functions for that IIRC.

Oliver · November 10, 2021, 8:08am

Hello, NobbZ.

Let me explain it again, differently:

We use maps (and structs) as our basic message structure. So does, btw, the code you can generate from the asn1ct module when option maps is present.

So, there exist some rather large messages that need to be filled.

Declarative syntax is generally more readable for these scenarios than using any kind of function update. But, again, it cannot be used for cases where the presence of a value depends on a condition, diminishing its value as the message content gets more complex.

This is for example a problem when filling the large complex structures of Radio Resource Control protocol defined in ASN.1. (My original example.)

As I already said, you could filter it if insisting on using declarative syntax. This can get clunky if you do it at every level separately, and needs a well-chosen “undefined” value if you do it recursively. However, the value for code clarity in using declarative syntax is so high, I’m looking for a good solution.

“Not specifying it” is not a good solution with declarative syntax, because you can’t cover all conditional scenarios without it getting messy.

Finally, you could choose not to specify any conditional values in declarative syntax, but add them by conditional updates later. We currently do that. It’s not the most readable solution, but it clearly works.

When you have a map representing a part of a message with easily a dozen parameters you might or might not want to fill, depending on system parameters, while also keeping code compact and readable, this problem becomes rather glaring after a while.

The problem “got so bad” that some of our coders, specifically for maps meant to be used in the asn1ct-generated codec, started filling asn1__MISSING_IN_MAP because this makes the codec ignore it due to the way it’s implemented. But this is an implementation detail of the generated code that can change at any time so I’m looking for a better solution.

I might actually declare a similar value for our own use (like not_present_in_map) and consider removing all keys with such a value. Then at least I can do it once “on my side” before calling the codec with the data for encoding.
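A rough sketch of what such a cleanup pass could look like. The module name `SentinelSketch` and the decision to leave structs untouched are assumptions made for illustration, not part of the code base discussed here:

```elixir
defmodule SentinelSketch do
  # Hypothetical sentinel; any atom that can never be a real value would do.
  @sentinel :not_present_in_map

  # Assumption: structs are returned unchanged, since dropping a field
  # from a struct would leave it malformed.
  def strip(%_{} = struct), do: struct

  # Plain maps: drop keys holding the sentinel, recurse into the rest.
  def strip(map) when is_map(map) do
    for {key, value} <- map, value != @sentinel, into: %{} do
      {key, strip(value)}
    end
  end

  # Lists: recurse into each element (sentinel elements are not removed).
  def strip(list) when is_list(list), do: Enum.map(list, &strip/1)

  # Everything else is a leaf value.
  def strip(other), do: other
end

message = %{
  a: 1,
  b: :not_present_in_map,
  nested: %{c: 3, d: :not_present_in_map},
  items: [%{e: :not_present_in_map, f: 6}]
}

IO.inspect(SentinelSketch.strip(message))
# prints %{a: 1, items: [%{f: 6}], nested: %{c: 3}}
```

Calling `strip/1` once, right before handing the data to the encoder, keeps the sentinel an internal convention rather than a dependency on the generated codec.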

cmo · November 10, 2021, 8:57am

It is a little difficult to follow what you’re trying to achieve. Can you show a better example?

Have you considered piping the map through a series of maybe_update_some_key(message) functions?

message
|> maybe_update_x() 

... 

def maybe_update_x(%{x: x} = message), do: update_in(message, [:x], ...) 
def maybe_update_x(message), do: message

Oliver · November 10, 2021, 9:24am

Hello, cmo.

Yes, my current code uses something like your maybe_update_x function already.

So, to reiterate: It’s about the filling of big messages implemented as maps.

Originally the elixir code might have looked like this:

def build_message(...) do
%{ a: 1,
   b: 2,
   c: 3,
   d: 4
}
end

Then, over time some of these fields might be optionally present dependent on conditions.

There are several approaches to implement that, let me present two I’ve tried:

def build_message(...) do
%{ a: 1,
   b: some_precondition && 2,
   c: 3,
   d: another_precondition && 4 || 2
}
|> filter_out_nil_values()
end

So here d is not a problem, it will always resolve to a “valid value.” But b is intended as optional, we want it filtered out under some conditions.

The filter_out_nil_values/1 does exactly that, but you have to remember to call it.
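One possible shape for that helper, as a sketch only (the real implementation is not shown in this thread). It also shows a pitfall of the `&&` form: when the precondition is exactly `false`, `some_precondition && 2` evaluates to `false` rather than nil, so a nil filter keeps the key. The `if` form yields nil and is filtered as intended.

```elixir
defmodule FilterSketch do
  # Assumed implementation: drop every top-level key whose value is nil.
  def filter_out_nil_values(map) do
    map
    |> Enum.reject(fn {_key, value} -> is_nil(value) end)
    |> Map.new()
  end
end

some_precondition = false

# `false && 2` is false, not nil, so :b survives the filter.
with_and = FilterSketch.filter_out_nil_values(%{a: 1, b: some_precondition && 2})

# `if` without an else branch returns nil, so :b is removed.
with_if = FilterSketch.filter_out_nil_values(%{a: 1, b: if(some_precondition, do: 2)})

IO.inspect(with_and) # prints %{a: 1, b: false}
IO.inspect(with_if)  # prints %{a: 1}
```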

The problem here, as I see it, is that expressions can only return values and there’s no value to return that would naturally make us omit the value in the map when using declarative syntax.

Alternatively we can handle any truly conditional value outside of declarative syntax.

def build_message(...) do
%{ a: 1,
   c: 3,
   d: another_precondition && 4 || 2
}
|> map_put_if(:b, 2, some_precondition)
end
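A sketch of `map_put_if`, with the argument order inferred from the call above (map, key, value, condition). The module name is hypothetical:

```elixir
defmodule PutIfSketch do
  # Put the key only when the condition is truthy; otherwise return the map as is.
  # Note that the value is evaluated by the caller even when it ends up unused.
  def map_put_if(map, key, value, condition) do
    if condition, do: Map.put(map, key, value), else: map
  end
end

included = PutIfSketch.map_put_if(%{a: 1, c: 3}, :b, 2, true)
omitted = PutIfSketch.map_put_if(%{a: 1, c: 3}, :b, 2, false)

IO.inspect(included) # prints %{a: 1, b: 2, c: 3}
IO.inspect(omitted)  # prints %{a: 1, c: 3}
```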

By the way, I’m not trying to advocate the use of operators here, the code could easily be written like this:

def build_message(...) do
%{ a: 1,
   b: if some_precondition do 2 end,
   c: 3,
   d: if another_precondition do 4 else 2 end
}
|> filter_out_nil_values()
end

Or it could call functions returning values.

What I’m trying to say is that when you start to fill dozens of values into a big map, as is my use case all over my code base, you start to run into limitations in the declarative syntax. But in my experience, for humans parsing the code with their eyes declarative syntax is by far the most intuitive. And that counts for a lot in my codebase which is meant to last several more years after evolving for 3 already. I also need to reread code a lot that I wrote years ago, so readability is key to maintainability for me.

Gwaeron · November 10, 2021, 10:10am

How many interfaces do you have for encoding? Wouldn’t it just be a matter of calling it once in that one place in the encoder?

It sounds more like you want a proplist than a map. The absence of a key-value pair would simply be the absence of the {:key, value} pair in the list. If a map is necessary to do the encoding, it should be trivial to convert a proplist to a map just before encoding.

If you have to use maps you could always define optional values in separate maps (or the lack of maps for “nil values”) and then merge them. Wouldn’t completely eliminate the use of functions, but it would be easy to abstract (fold a list of maps into a single map).
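In Elixir, that merge idea might look roughly like this. The helper name `merge_all` and the precondition variables are placeholders for illustration:

```elixir
defmodule MergeSketch do
  # Fold a list of maps into one; later maps win on key conflicts.
  def merge_all(maps), do: Enum.reduce(maps, %{}, &Map.merge(&2, &1))
end

some_precondition = true
another_precondition = false

message =
  MergeSketch.merge_all([
    %{a: 1, c: 3},
    # An optional field contributes either a one-key map or an empty map.
    if(some_precondition, do: %{b: 2}, else: %{}),
    if(another_precondition, do: %{d: 4}, else: %{d: 2})
  ])

IO.inspect(message)
# prints %{a: 1, b: 2, c: 3, d: 2}
```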

Oliver · November 10, 2021, 11:32am

Unless several optional values depend on the same condition, there’s no advantage in using a merged map over a Map.put/3, else you would just have to merge maps with one key-value pair.

Regarding calling it once: yes, this could be done. It would need to be redesigned to recursively step through all the possible constructs, of which there are quite a few, but definitely a doable number.

I don’t understand what advantage proplists would have here. You can’t omit a value, either, when declaratively building a list (I think), so I would have to construct the list in some way and then convert it with, for example, Map.new/1. But to my mind the construction of the list doesn’t look more readable, nor does it prevent having to filter it.

In elixir, this would look like:

def build_message(...) do
[ a: 1,
  b: if some_precondition do 2 end,
  c: 3,
  d: if another_precondition do 4 else 2 end
]
|> Map.new()
|> filter_out_nil_values()
end

or

def build_message(...) do
[ {:a, 1},
  {:b, if some_precondition do 2 end},
  {:c, 3},
  {:d, if another_precondition do 4 else 2 end}
]
|> Map.new()
|> filter_out_nil_values()
end

because of:

iex(1)> Map.new([a: 1, b: nil])
%{a: 1, b: nil}

Did you have something else in mind?

Gwaeron · November 10, 2021, 12:15pm

No, I think you’re right, I don’t think there’s any advantage to any of this. Avoiding functions to modify or build maps seems like a constraint that is setting you up for awkward solutions. I think the best way is using the dedicated Map API, unless of course some “optional key” binding gets implemented into the language.

What I meant is that with a proplist you could define and filter it in the same expression, so it looks consistent without several broken-up smaller maps that must be merged, or conditionals in the map definition: just one coherent list of the interesting keys and values. I don’t think it’ll look nicer than just filtering it once during encoding, and I don’t think you should actually do it, and it won’t solve all of your problems anyway. Depending on how creative you want to be…

build_message(...) ->
    [X || X <- [{key1, Value1},
                {key2, Value2},
                {key3, Value3}],
          should_include(X)].

should_include({b, _}) -> some_precondition();
should_include({d, _}) -> another_precondition();
should_include(_) -> true.

or use N-tuple lists to make it even fancier,

build_message(...) ->
    Msg = [{a, 1},
           {b, 2, fun some_precondition/0},
           {c, 3},
           {d, 2, 4, fun another_precondition/0}],
    generate(Msg).

generate([]) ->
    [];
generate([{K, V, Condition} | Rest]) ->
    case Condition() of
        true -> [{K, V} | generate(Rest)];
        false -> generate(Rest)
    end;
generate([{K, Option1, Option2, Selector} | Rest]) ->
    V = case Selector() of
            left -> Option1;
            right -> Option2
        end,
    [{K, V} | generate(Rest)];
generate([X | Rest]) ->
    [X | generate(Rest)].

Oliver · November 10, 2021, 2:20pm

You have a very good point here, thank you. List comprehension and other comprehensions (forgot if Erlang has more) could be used to create already filtered lists.

So, for example the presence for each field could be either implemented in a function or tracked in a map. For all fields where presence can be conditional, write a true or a false into the map when evaluating presence. Then use list comprehension and query the map of bools whether the field is present. For all mandatory fields assume true, easily done by calling maps:get(MyKey, MyMap, true). Then you could do the list comprehension predicate based on the map.

But I guess that’s not necessarily better than simply using a comprehension where you simply filter out all key-value pairs where the value is nil - that way you can actually compute and use the value in one go. It would be like doing the filter just directly visible instead of calling a helper. It would be more explicit, though. Both variations avoid polluting the key space unnecessarily.
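In Elixir, the "filter directly visible" variant could be sketched with a `for` comprehension that collects into a map. The field names and preconditions are placeholders:

```elixir
some_precondition = false
another_precondition = true

message =
  for {key, value} <- [
        a: 1,
        # `if` without else yields nil when the condition does not hold.
        b: if(some_precondition, do: 2),
        c: 3,
        d: if(another_precondition, do: 4, else: 2)
      ],
      # The filter sits next to the declaration instead of in a helper.
      not is_nil(value),
      into: %{} do
    {key, value}
  end

IO.inspect(message)
# prints %{a: 1, c: 3, d: 4}
```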

tsloughter · November 10, 2021, 6:39pm

Unrelated but I just realized, does Elixir not have a way to add keys to a map besides Map.put? I found there is syntax for update, %{m | b: 2} but that only works if b is in the map.

Is there no equivalent to Erlang’s:

M = #{}.
M#{a => 1}.

Which then also supports using := to fail if the key doesn’t exist:

> M#{b := 2}.   
** exception error: bad key: b
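For comparison, a small Elixir sketch: adding keys goes through functions such as `Map.put/3` or `Map.merge/2`, while the update syntax behaves like Erlang's `:=` and raises when the key is missing. The anonymous function `update_b` is only there for illustration:

```elixir
m = %{}

# Adding keys requires a function call.
m1 = Map.put(m, :a, 1)
m2 = Map.merge(m1, %{b: 2, c: 3})

# The update syntax only works for keys that already exist.
update_b = fn map -> %{map | b: 20} end
m3 = update_b.(m2)

# On a map without :b it raises KeyError.
result =
  try do
    update_b.(m)
  rescue
    e in KeyError -> {:key_error, e.key}
  end

IO.inspect(m3)     # prints %{a: 1, b: 20, c: 3}
IO.inspect(result) # prints {:key_error, :b}
```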

Oliver · November 10, 2021, 7:30pm

We always use Map.put/3 or Kernel.update_in/2 or Kernel.update_in/3. Map update syntax not so much, it in fact works better in elixir for structs than maps.

cmo · November 10, 2021, 9:55pm

I definitely tried stuff like that when I started using elixir but unfortunately no

You could use flat_map: some_precondition && [b: 2] || [], then reduce over that, which is probably cleaner than consulting on nil, as nil is a normal term. But maybe there is no magic way to conditionally execute lines of code in the middle of an expression (which is a good thing).
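One way to read that suggestion, as a sketch: every entry is a list of zero or more pairs, the lists are flattened with `Enum.flat_map/2`, and the result is collected with `Map.new/1` (used here in place of an explicit reduce). The variable names are placeholders:

```elixir
some_precondition = false

message =
  [
    [a: 1],
    # An optional entry becomes an empty list when the condition fails.
    some_precondition && [b: 2] || [],
    [c: 3]
  ]
  |> Enum.flat_map(& &1)
  |> Map.new()

IO.inspect(message)
# prints %{a: 1, c: 3}
```

Because an omitted entry is an empty list rather than a special value, nil and false stay usable as ordinary map values.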