Questions
- Should we have a total term order?
- Should we complete the set of “exact” comparison operators?
Background
Today, Erlang has got these comparison operators: <
, =<
, >
, >=
, ==
, and /=
. These form a complete set that adheres to a term order where number() < atom() < reference() < fun() < port() < pid() < tuple() < map() < [] < [_|_] < bitstring()
.
We also have the “exact” operators =:=
and =/=
, a.k.a the match tests.
Terms of the same type has got a defined order, for some types documented, and for others just defined, so e.g all references has a total order between themselves.
Integer+float conflation
What is peculiar about the current term order is that it conflates integer() | float()
into number()
, so integers and floats of equal value are considered equal: e.g 3 == 3.0
and 2.0 < 3
are both true
. This may be practical in many cases, but does not constitute a total term order since terms that are different (of different type, for starters) compare equal.
The “exact” operators differ between integers and floats, so 3 =:= 3.0
is false
.
Ordered data structures
The lack of a total term order complicates order dependent features such as lists:sort/1
that has to treat 3
and 3.0
as the same value. It does the best of the situation and promises a stable sort, and lists:usort/1
promises to keep the first instance of “equal” elements. The behavior varies between different data structures such as ordsets
, gb_trees
, ETS, and so on.
In fact, it is hard to implement a data structure, such as a tree, that can handle any term types. You either have to ignore the integer+float conflation, which most have done, or you could work hard to implement a total term order of your own. The problem with the latter is that you would have to traverse down to the bottom of all compound types to implement a total term order function, so I have not seen that been done…
Maps key order
For maps we have a special situation, because to make them behave nicely in order dependent data structures we had to give them a defined order, so with respect to map keys they have a total order where integer() < float() < other types
. Therefore #{1 => a} /= #{1.0 => a}
and (surprise!) #{2 => a} < #{1.0 => a}
are both true
. This does not apply for map values so #{1 => 2} == #{1 => 2.0}
is true
while #{1 => 2} =:= #{1 => 2.0}
is false
. This oddity is not much of a problem since it is weird in itself to say that some map is less than some other map.
Glitch at zero
The IEEE Floating-point standard has got both a zero and a negative zero, and dictates that they are equal, except when calculating with them then the sign should be preserved. This is more useful when you also have got positive and negative infinity, which Erlang does not allow. Nevertheless, Erlang has got both 0.0
and -0.0
. For historical reasons they not only compare equal, but also match equal, so 0.0 =:= -0.0
is true
. One could argue that ==
should have them as equal but =:=
should not, but when we fixed how -0.0
was handled it was already common with code that used a case
match for 0.0
to avoid division by zero, so we sadly had to make 0.0
exactly equal to -0.0
…
Proposals
We could:
- Introduce a total term order.
- Introduce operators
:<
,=:<
,>:
and>:=
to, together with=:=
and=/=
get a complete set of operators for such a term order. - Eventually introduce options or functions to existing data structures such as ETS,
gb_trees
, andlists:sort/1
that adheres to the total term order to facilitate handling all types of terms nicely.
Total term order
A total term order could be the existing maps key order.
Another close-at-hand possibility would be to do a minimal extension of the current term order that conflates integers+floats so that if an integer and a float compares equal, then the integer is considered closer to zero. We can here name this the exact term order since we are aiming at completing the “exact comparison” operators.
There might be other possible total term orders, but I cannot think of an obviously better in any regard.
Pros and cons
The maps key order exists already, and has good performance since converting between floats and integers is avoided.
The exact term order is a minimal extension to the existing term order, and it is arguably intuitive.
The maps key order would have 3 :< 2.0
as true
. This can be either utterly bewildering or just a small price to pay for performance depending on if you want just any term order or not just any term order.
The exact term order would have 1 :< 1.0
, -1.0 < -1
, 0 :< 0.0
and (surprise!) 0 :< -0.0
as true
since -0.0
has to match 0.0
. This can either be the only small glitch with this order, or just too ugly to be implemented, depending or your point of view.
The exact term order must be much slower than the maps key order for mixed floats and integers (on some position in a compound term), but only slightly slower than the existing term order.
Follow-up questions
Should the exact term order still have the maps key order for map keys to have the sort order the same both for <
as for :<
, or should the exact term order be used here too. E.g should #{1.0 => a} :< #{2 => a}
be true
when <
has it as false
? Should we have performance and consistency between operators, or consistency within the new operator?
Should we implement new operators? We could just implement either a function erlang:exact_compare/2
or erlang:maps_key_compare/2
and use that. But new operators optimize better, especially in the JIT, and are easier to use. Completing =:=
and =/=
would be fulfilling in itself.
Comments, anyone?
Which combination should an EEP aim for?