**UPDATE**: This post has been edited because the `uniq_all_at_once/1`

is broken. See below.

I wanted to share the results of some benchmarking I’ve done as part of evaluating the proposed lists:uniq/1 function.

First I compared two functions that create a map in two different ways. This one creates a map by adding one element at a time to an initially empty map:

```
one_by_one(Data) ->
one_by_one(Data, #{}).
one_by_one([Key|T], Map) ->
one_by_one(T, Map#{Key => {Key, Key+1}});
one_by_one([], Map) ->
Map.
```

The other one creates a list of `{Key, Value}`

pairs and then turns it into a map using `maps:from_list/1`

:

```
all_at_once(Data) ->
all_at_once(Data, []).
all_at_once([Key|T], Acc) ->
all_at_once(T, [{Key, Key+1}|Acc]);
all_at_once([], Acc) ->
maps:from_list(Acc).
```

So which one is fastest? Here is the result from the benchmark run (using Erlang/OTP 24.2):

```
1> mbench:run().
generated 2000000 data items
one_by_one/1 : 1.825983
all_at_once/1 : 0.504215
.
.
.
ok
```

Before all of you start rewriting of all map updating code , note that the map in the benchmark had two million elements. Unless your code regularly have to handle that many elements or if part of a frequently used library or the Erlang compiler, you probably don’t have to worry about the performance difference.

I then turned to benchmarking two possible implementations of `lists:uniq/1`

. Here is the first, similar to the one proposed by @cons in the pull request:

```
uniq_one_by_one(L) ->
uniq_one_by_one(L, #{}).
uniq_one_by_one([X|Xs], M) ->
case is_map_key(X, M) of
true ->
uniq_one_by_one(Xs, M);
false ->
[X|uniq_one_by_one(Xs, M#{X => true})]
end;
uniq_one_by_one([], _) ->
[].
```

The other one constructs the map by using `maps:from_keys/1`

in one go (**UPDATED**: but it is wrong):

```
uniq_all_at_once(L) ->
M = maps:from_keys(L, true),
uniq_all_at_once(L, M).
uniq_all_at_once([X|Xs], M) ->
case is_map_key(X, M) of
true ->
uniq_all_at_once(Xs, M);
false ->
[X|uniq_all_at_once(Xs, M)]
end;
uniq_all_at_once([], _) ->
[].
```

(**UPDATED**: `is_map_key/1`

will always return `true`

, so the result will always be `[]`

.)

Based on the previous benchmarking of map creation, it is no surprise that `uniq_all_at_once/1`

is almost three times as fast (**UPDATE**: *an incorrect program can run arbitrarily fast*):

```
uniq_one_by_one/1 : 1.942674
uniq_all_at_once/1 : 0.736359
```

However, those results were produced with an input list where all items are unique. If I change the input list to two million identical elements, `uniq_one_by_one/1`

is the winner:

```
uniq_one_by_one/1 : 0.007935
uniq_all_at_once/1 : 0.085007
```

Therefore, in my opinion, if `lists:uniq/1`

is to be included in the `lists`

module, the implementation should be based on `uniq_one_by_one/1`

.

Finally, I wanted to know how those implementations compared to `lists:usort/1`

, which can be used if you don’t care about the order of the elements. So I added this function to the benchmark:

```
uniq_by_usort(L) ->
lists:usort(L).
```

Here are the results:

```
uniq_one_by_one/1 : 1.942674
uniq_by_usort/1 : 0.030334
```

Seeing that sorting can make the unique operation an order of magnitude faster, I then tried to implement `uniq/1`

by sorting while still keeping the order:

```
uniq_by_sofs(L0) ->
L = number(L0, 0),
S0 = sofs:relation(L),
S1 = sofs:relation_to_family(S0),
S2 = sofs:converse(S1),
S3 = sofs:to_external(S2),
[E || {_,E} <- S3].
number([H|T], N) ->
[{H,N}|number(T, N+1)];
number([], _) ->
[].
```

When all elements in the input list are unique, `uniq_by_sofs/1`

turned out to be the fastest function:

```
uniq_one_by_one/1 : 1.934827
uniq_by_sofs/1 : 0.534989
```

When all elements are identical, `uniq_one_by_one/1`

is again the winner:

```
uniq_one_by_one/1 : 0.007935
uniq_by_sofs/1 : 0.349756
```

Here is my benchmark module: