OTB decision on four proposed extensions to the lists module

bjorng · March 1, 2022, 3:57pm

In today’s meeting of the OTP Technical Board we discussed and decided on four proposed extensions to the lists module. Two were rejected and two were approved (with suggested modifications).

lists:transpose/1 – rejected

github.com/erlang/otp

Add lists:transpose/1, useful for list of lists transposition

erlang:master ← manuel-rubio:lists-zip-1

opened 05:34AM - 06 Dec 21 UTC

manuel-rubio

+77 -1

When I was developing the Advent of Code 2021, using Elixir and Erlang, I just realized the rotating of a matrix like this:

```erlang
L = [ [1, 2, 3], [4, 5, 6] ],
```

Could be rotated with `lists:zip/2` if we pass the lists as:

```erlang
L1 = [1, 2, 3],
L2 = [4, 5, 6],
lists:zip(L1, L2). % [{1, 4}, {2, 5}, {3, 6}]
```

But, what if we have the need to rotate a matrix of arbitrary size? Then I've implemented `lists:zip/1` which is accepting the `L` parameter as a list of lists (matrix) and it's returning the rotated matrix:

```erlang
L = [ [1, 2, 3], [4, 5, 6] ],
lists:zip(L). % [[1, 4], [2, 5], [3, 6]]
```

And of course, we could use not only a matrix of 3x2 but also 9x7 or other completely different sizes.

This pull request was discussed in a previous OTB meeting, but we didn’t reach a final decision then. After some more discussion, and after considering the feedback in the use case thread, which reached the conclusion that it would not be used frequently enough to warrant inclusion in the lists module, we decided to reject it.
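For anyone who still needs the operation in their own code, here is a minimal sketch of a list-of-lists transpose. The module name `transpose_sketch` is hypothetical, this is not the implementation from the pull request, and it assumes that every row has the same length.

```erlang
-module(transpose_sketch).
-export([transpose/1]).

%% Transpose a rectangular list of lists.
%% Assumption: all rows have the same length; ragged input is not validated.
transpose([]) -> [];
transpose([[] | _]) -> [];
transpose(Rows) ->
    Heads = [H || [H | _] <- Rows],   % first element of every row
    Tails = [T || [_ | T] <- Rows],   % the remainder of every row
    [Heads | transpose(Tails)].
```

With the matrix from the pull request, `transpose_sketch:transpose([[1, 2, 3], [4, 5, 6]])` gives `[[1, 4], [2, 5], [3, 6]]`.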

lists:foldwhile/3 – rejected

We rejected it for the following reasons:

Using foldwhile does not save much code compared to writing your own tail recursive function and the resulting code could become less clear.
The convention of returning a tuple with cont or halt in the first element is a new convention for the lists module.
We understand why Elixir has a similar function (Enum.reduce_while/3) because it is convenient to use in a pipeline. Since Erlang does not have a pipe operator, that usefulness does not transfer to Erlang.
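To make the first reason concrete, the sketch below puts a hand-written tail-recursive function next to the same logic expressed through a `foldwhile/3`. The module name, the example task (summing leading elements up to a limit) and the exact shape of `foldwhile/3` are assumptions made for illustration; only the `cont`/`halt` tuple convention is taken from the discussion above.

```erlang
-module(foldwhile_sketch).
-export([sum_until/2, foldwhile/3, sum_until_fw/2]).

%% Hand-written version: sum the leading elements that are =< Limit
%% and stop at the first element that is larger.
sum_until(List, Limit) -> sum_until(List, Limit, 0).

sum_until([H | T], Limit, Acc) when H =< Limit ->
    sum_until(T, Limit, Acc + H);
sum_until(_Rest, _Limit, Acc) ->
    Acc.

%% Hypothetical foldwhile/3 using the {cont, Acc} / {halt, Acc} convention.
foldwhile(Fun, Acc, [H | T]) ->
    case Fun(H, Acc) of
        {cont, Acc1} -> foldwhile(Fun, Acc1, T);
        {halt, Acc1} -> Acc1
    end;
foldwhile(_Fun, Acc, []) ->
    Acc.

%% The same task written with the hypothetical foldwhile/3.
sum_until_fw(List, Limit) ->
    foldwhile(
        fun(X, Acc) when X =< Limit -> {cont, Acc + X};
           (_X, Acc) -> {halt, Acc}
        end,
        0,
        List
    ).
```

Both `sum_until([1, 2, 10, 3], 5)` and `sum_until_fw([1, 2, 10, 3], 5)` return `3`. The two call sites are about the same size.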

lists:uniq/1 – approved with modifications

github.com/erlang/otp

lists:uniq

opened 09:46AM - 14 Jan 22 UTC

meox

team:VM enhancement

Seems that an important function is missing in the `lists` module: `unique`

The `uniq` function should take a list and return the unique elements from that list, preserving the order.

Possible implementation:

```erlang
uniq([]) -> [];
uniq(List) when is_list(List) -> uniq(List, []).

uniq([], Acc) -> lists:reverse(Acc);
uniq([H | Tail], Acc) ->
    case lists:member(H, Acc) of
        true -> uniq(Tail, Acc);
        false -> uniq(Tail, [H | Acc])
    end.
```

Approved but with the following changes:

The implementation should be map-based for efficiency. See the uniq_one_by_one/1 function in the benchmarking thread for lists:uniq/1.
Another function uniq(Fun, List) should be added.
It should be documented that both uniq/1 and uniq/2 will keep the first occurrence of an element when there are duplicates.
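The three requested changes could look roughly like the sketch below. It is not the `uniq_one_by_one/1` function from the benchmarking thread and not necessarily what will be merged; the module name `uniq_sketch` and the helper `uniq_1/3` are hypothetical. A map of already-seen keys replaces the `lists:member/2` scan, and the first occurrence of each element is kept.

```erlang
-module(uniq_sketch).
-export([uniq/1, uniq/2]).

%% uniq/1: drop duplicates, keeping the first occurrence of each element.
uniq(List) when is_list(List) ->
    uniq(fun(X) -> X end, List).

%% uniq/2: two elements count as duplicates when Fun returns the same key.
uniq(Fun, List) when is_function(Fun, 1), is_list(List) ->
    uniq_1(List, Fun, #{}).

%% Seen is a map used as a set of keys encountered so far.
uniq_1([X | Xs], Fun, Seen) ->
    Key = Fun(X),
    case is_map_key(Key, Seen) of
        true -> uniq_1(Xs, Fun, Seen);
        false -> [X | uniq_1(Xs, Fun, Seen#{Key => true})]
    end;
uniq_1([], _Fun, _Seen) ->
    [].
```

For example, `uniq_sketch:uniq([1, 2, 1, 3, 2])` returns `[1, 2, 3]`, and `uniq_sketch:uniq(fun({K, _}) -> K end, [{a, 1}, {b, 2}, {a, 3}])` returns `[{a, 1}, {b, 2}]`.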

We also discussed the name (uniq vs unique) and decided that it would be better to have the same name as in Elixir (uniq). There is also a Unix command called uniq (although they don’t do exactly the same thing).

lists:groupby – accepted with modifications

We discussed whether the best place for the function would be the lists module or the maps module. We agreed that it would fit into either module, but in the end we decided that putting it in maps had some advantages:

While the return values for the lists module are not entirely consistent, they tend to return either one or more lists (wrapped in a tuple), or an element from the list. Therefore, adding a function that returns a map introduces a slight inconsistency.
By placing it in the maps module, it would be possible to place similar functions in modules such as gb_trees or array. Each module would return its own data type.
The lists module is already quite large with many functions, while the maps module has fewer functions.

The sole disadvantage that we could find for not putting the function in the lists module is that most people would expect to find it in the lists module because it takes a list as input.

If the name were maps:groupby, one could reasonably think that it would take a map argument. Therefore, we think that maps:groups_from_list is a better name because it makes it clearer what the function does.
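As an illustration of what such a function does, here is a minimal sketch that groups the elements of a list into a map. The module name `groups_sketch` and the two-argument form (a key function plus a list) are assumptions; the post above does not specify the final signature.

```erlang
-module(groups_sketch).
-export([groups_from_list/2]).

%% Group the elements of List by the key that KeyFun returns for each element.
%% Folding from the right and prepending keeps each group in input order.
groups_from_list(KeyFun, List) when is_function(KeyFun, 1), is_list(List) ->
    lists:foldr(
        fun(X, Acc) ->
            maps:update_with(KeyFun(X), fun(Xs) -> [X | Xs] end, [X], Acc)
        end,
        #{},
        List
    ).
```

For example, `groups_sketch:groups_from_list(fun(X) -> X rem 2 end, [1, 2, 3, 4, 5])` returns `#{0 => [2, 4], 1 => [1, 3, 5]}`.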

phild · March 1, 2022, 10:19pm

Thank you for taking the time to explain the OTB’s rationale as well as the decision.

max-au · March 2, 2022, 4:17am

Thank you for clear and concise answers!
I wonder if there was any discussion of a maps:inverse function that swaps keys with values (resolving duplicates with a callback). Something like this one:

inverse(Map, Fun) when is_map(Map), is_function(Fun, 1) ->
    maps:fold(
        fun(K, V, Acc) ->
            maps:update_with(V, Fun, K, Acc)
        end, #{}, Map).

bjorng · March 2, 2022, 4:30am

No, we didn’t discuss maps:inverse.

Maria-12648430 · March 2, 2022, 8:30am

@max-au @bjorng is/was there a PR for such an addition (maps:inverse)? I couldn’t find any

bjorng · March 2, 2022, 9:05am

Not as far as I know. I assumed that @max-au referred to some suggestion posted in the discussion thread for some other pull request or possibly in one of the threads on this forum.

eproxus · March 2, 2022, 9:15am

@max-au I have an implementation of inverse/1 that is used in production here: mapz/mapz.erl at master · eproxus/mapz · GitHub

But I like your implementation more, it’s more flexible!

eproxus · March 2, 2022, 9:30am

In fact, a better implementation might be something more like:

inverse(Map) -> inverse(Map, fun(Old, _New) -> Old end).

inverse(Map, Fun) when is_map(Map), is_function(Fun, 2) ->
    maps:fold(
        fun(K1, V, Acc) ->
            maps:update_with(V, fun(K0) -> Fun(K0, K1) end, K1, Acc)
        end,
        #{},
        Map
    ).

This way one can decide to keep either the old or new key.

Maria-12648430 · March 2, 2022, 9:51am

I see

But that gives rise to an interesting question regarding naming and naming consistency again, maybe worth discussing at OTP/OTB in view of future additions: Should the name be inverse or invert here, i.e. should a function name align with what the function delivers (the inverse map of the given map), or describe what it does (invert the given map)?

In OTP, we have both, eg maps:keys or maps:values aligning with what the functions deliver, as well as maps:fold or maps:filter which describe what the functions do.

It is probably not feasible to adhere strictly to either one form or the other, more so since we have a mix of both all over the place already. But it may be good to formulate a preference if there is a choice with both types of names equally sensible.

phild · March 2, 2022, 9:54am

@AstonJ Do you think it would be useful to have an area of the forums that collates reference implementations of common, concise and elegant example solutions like the above? I’m thinking of something like a curated list of gists.

The purpose would be to both provide a learning resource and a de-facto way of implementing a common pattern in user code, without the tighter constraints required for it to be in the standard library.

phild · March 2, 2022, 10:00am

Naming is hard!

Maria-12648430 · March 2, 2022, 10:08am

Yeah. Good thing I have no children; they would probably be called “You Over There” and “The Other One”.

Maria-12648430 · March 2, 2022, 10:14am

That would be a nice thing to have, but it would require a number of dedicated curators (depending on how much engagement the area receives), i.e. people to decide what is good and bad (on what grounds exactly?), and to see to good naming (again) for the threads so people can find what they’re looking for and at the same time prevent double posting. Otherwise, it might quickly turn into something like this: BOFH: We want you to know you have our full support • The Register (Yeah, I’m exaggerating)

phild · March 2, 2022, 1:21pm

Oh, so you’ve met my 2 children, and my dog? 3 names seemed excessive when time division multiplexing exists.

max-au · March 2, 2022, 4:28pm

I often have a need to keep the list of keys when they have duplicate values. E.g. when I am inverting a dependency tree stored as a map #{two => one, three => one} into #{one => [two, three]}.

For that case my conflict resolution Fun is similar to store(New, Old) -> [New | Old].

Somehow I am not able to find the previous discussion on the topic, but I clearly remember it happening somewhere. @bjorng should I come up with a PR for OTB to start discussions, or can we have the OTB decision first to avoid unnecessary coding if this is not going to be accepted?
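The dependency-tree case described in this post can be sketched without a general conflict-resolution callback by always collecting the keys into a list. The module and function names below are hypothetical. Note that the order of keys inside each list follows the iteration order of `maps:fold/3`, which is not specified.

```erlang
-module(inverse_sketch).
-export([inverse_to_lists/1]).

%% Invert a map, collecting every key that shares a value into a list.
%% Assumption: the order within each list is irrelevant to the caller.
inverse_to_lists(Map) when is_map(Map) ->
    maps:fold(
        fun(K, V, Acc) ->
            maps:update_with(V, fun(Ks) -> [K | Ks] end, [K], Acc)
        end,
        #{},
        Map
    ).
```

Calling `inverse_sketch:inverse_to_lists(#{two => one, three => one})` yields a map with the single key `one`, whose value is a list containing `two` and `three` in some order.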

AstonJ · March 2, 2022, 6:05pm

Just created a new category, Phil.

This can house things like how-to guides or any other kind of tips or snippets people want to share. It can be used alongside our ‘glossary’ type section:

Where rather than posting those tips yourself, you can invite others to submit comments, eg: How would you achieve ____ in Erlang?

bjorng · March 3, 2022, 5:25am

A complete implementation is not necessary for OTB to discuss an addition to the standard libraries. What is more important is a good motivation for why the new function is useful, examples of use cases, and perhaps a motivation for the name of the function if several reasonable names are possible. A thread in this forum could be a good place to discuss and suggest new functions and gather examples of use cases.

(Regarding maps:inverse, there is a similar function in sofs called converse/1. Either the same name should be used for consistency or there should be a motivation for why the names should be different.)

max-au · March 3, 2022, 6:05am

I was not aware of sofs:converse. The reason to name it inverse is that other languages call it that (e.g. Java’s Map). Also, it feels in line with inverting a 2D matrix (swapping rows with columns).

There are plenty of cases where inverse would be helpful. A classic example is turning a dependency tree from a map of “B depends on A, C depends on A” into a map of “when A is done, we can start B and C”. That is, #{b => a, c => a} turned into #{a => [b, c]}.

I also remember that a recent Dialyzer PR #5498 uses sofs:converse to achieve the same result, and I actually think that maps would yield higher performance compared to sofs.
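For comparison, the same dependency inversion can be sketched with the sofs module. This is only an illustration of the `sofs:converse/1` route and makes no claim about what the Dialyzer PR actually does or about relative performance; the module and function names are hypothetical, and the sketch assumes a non-empty map with simple keys and values such as atoms.

```erlang
-module(sofs_sketch).
-export([converse_map/1]).

%% Invert a map by going through a sofs binary relation.
%% Assumption: keys and values are simple terms such as atoms.
converse_map(Map) when is_map(Map) ->
    Rel = sofs:relation(maps:to_list(Map)),             % pairs {Key, Value}
    Fam = sofs:relation_to_family(sofs:converse(Rel)),  % Value -> set of Keys
    maps:from_list(sofs:to_external(Fam)).
```

Here `sofs_sketch:converse_map(#{b => a, c => a})` returns `#{a => [b, c]}`; sofs keeps its sets sorted, so the keys in each list come out in term order.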

phild · March 3, 2022, 8:32am

inverse is (IMO) more intuitive than converse, which has other connotations in natural language.

The standard library functions are also quite inconsistent regarding whether they are verbs or nouns. On a quick scan of the maps module, the majority seem to be verbs and, arguably, some of the nouns are simply contractions of get_xxx (e.g. maps:values). This actually bothers me less than I feel it ought; perhaps as @Maria-12648430 mentioned, consideration has historically been given to what the function delivers vs what it does. Or, like lists:reverse, it works both ways.