However, when they are matched against each other or compared using the =:= operator, they are considered to be equal. Thus, 0.0 =:= -0.0 currently returns true.
@nzok considers this behaviour to be a bug. We also consider it a bug, but until recently we were not sure that fixing and introducing an incompatibility would be be worth it.
A recent bug found by erlfuzz made us reconsider. An optimization in the compiler to share identical code would consider the code for the two clauses in this function to be identical:
f(_V0, _V0) ->
-0.0;
f(_, _) ->
0.0.
and essentially rewrite it to:
f(_, _) ->
0.0.
To fix this optimization when =:= considers 0.0 and -0.0 to be equal would be cumbersome and could make the compiler slower. It also likely that other optimizations in the compiler could be affected and would have to be fixed in similarly cumbersome ways.
Therefore, the OTP Technical Board decided that in Erlang/OTP 27, we will change +0.0 =:= -0.0 so that it will return false, and matching positive and negative 0.0 against each other will also fail. When used as map keys, 0.0 and -0.0 will be considered to be distinct.
The == operator will continue to return true for 0.0 == -0.0.
To help to find code that might need to be revised, in OTP 27 there will be a new compiler warning when matching against 0.0 or comparing to that value using the =:= operator. The warning can be suppressed by matching against +0.0 instead of 0.0.
We also plan to introduce the same warning in OTP 26.1, but by default it will be disabled. Anyone that suspect they have code that might be affected can turn on that warning in OTP 26.1.
Interesting ⦠for people not in topic could you also please describe (or link to some article) why <<-0.0/float>> is represented as <<128,0,0,0,0,0,0,0>>? I think that many people and especially newbies would be confused and assume that -0.0 and +0.0 is always seen the same i.e. 0.0.
If I understand correctly regardless of value (0.0, 12.34, -98.76, ā¦) first bit of every float is 0 (positive) or 128 (negative) and Erlang previously saw that like:
Yes, the most significant bit is the sign bit. Your example have the wrong sizes for the segments, but it does seem that you have correctly understood both the current behavior and the new behavior. Here is a corrected Erlang version of your same?/2 function:
My example have:
a) 8 bits of 0 + 56 bits of 0 (<<0, 0:56>> or <<0, 0::56>> in Elixir)
b) 8 bits of 128 + 56 bits of 0 (<<128, 0:56>> or <<128, 0::56>> in Elixir)
Your example have:
a) 1 bit of 0 and 63 bits of 0 (<<0:1, 0:63>> or <<0::1, 0::63>> in Elixir)
b) 1 bit of 1 and 63 bits of 0 (<<1:1, 0:63>> or <<1::1, 0::63>> in Elixir)
No, they work for 0.0 but to be picky they drag in another 7 bits that will either be significant (positive? and negative?) or insignificant (same?) which will give funny results for other numbers. For example, -2.0 would be considered non-negative by negative?, and equal to 2.0 according to same?.
I was thinking in good way, but I assumed that all first 8 bits are used to store a sign. Itās why I asked if it would be always 0 or 128. Thatās obvious waste of memory, the only relevant is the first bit i.e. <<relevant::1, rest::63>>.
I suspect the idea is that == will still show them equal, and that satisfies the constraint that " negative zero and positive zero should compare as equal with the usual (numerical) comparison operators, like the == operators of C and Java," as the Wikipedia page says. Though Erlangās == also equates integer 0 and floating point 0.
Erlangās exact equality operator =:= will distinguish them, but maybe that is regarded as outside the standardās requirement?
No, arithmetic comparison (==, >, =<, et al) will still consider them equal. They will only be considered unequal in exact comparisons (=:=, =/=) which are an entirely different beast.
IEEE754 distinguishes between 0.0 and -0.0 on purpose because itās actually useful in many contexts, and people have complained about our behavior of randomly losing (or gaining!) the sign of zero many times. The bug referenced in the first post was just the straw that broke the camelās back.
Why? Consider the following:
foo(X, Y) when X =:= Y ->
Y;
foo(... snip ...) ->
... snip ...
When X and Y are the exact same it shouldnāt matter whether we return X or Y from the first clause: if the compiler deduces that itās cheaper to return X it should be free to do so, but 0.0 =:= -0.0 makes that impossible because we could return 0.0 when the user wanted -0.0 or vice versa.
This affects all code in the compiler that has to reason about exact equality. We cannot blindly assume that A and B represent the same term unless we know for certain that theyāre not a zero float, so the compiler skips these optimizations when this isnāt the case. This is not much fun since the highly dynamic nature of Erlang means that we rarely have this information, but we can live with it.
So far so good, right? We know about this problem and have taken measures to combat it. This is where the linked bug comes in. In our SSA form the code looks like this:
function `t`:`f`(_0, _1) {
0: %% Basic block 0
@ssa_bool = bif:'=:=' _1, _0
br @ssa_bool, ^4, ^3
4: %% Basic block 4
ret `-0.0`
3:
ret `0.0`
}
We have an optimization for sharing basic blocks that more or less just checks whether two blocks contain the same instructions and de-duplicates them if they do. Because 0.0 =:= -0.0 the return instruction in blocks 3 and 4 are considered equal and are thus shared, returning the wrong result.
To make this respect the sign bit of 0.0 we would have to implement our own comparisons manually to recursively dig into arbitrarily complex terms and treat 0.0 and -0.0 differently. This is not very difficult per se, but the rub is that we did not expect this to be a problem here. To solve it in this manner we would have to identify all the places in the compiler where this is necessary, which would quickly become a game of whack-a-mole that weāre sure would cost a lot of precious time and risks introducing bugs on its own.
As Bjƶrn mentioned weāve considered -0.0 =:= 0.0 to be a bug for quite some time but just havenāt found it big enough of a deal to risk changing until now. People are using floats far more often now than they used to on our platform (the recently launched Elixir Nx being one big user) so we canāt just stick our heads in the sand and pretend itās not an issue anymore. Weāre faced with the options of:
Giving up all attempts to distinguish 0.0 and -0.0, throwing everyone who needs that distinction under the bus.
Having the compiler team bend over backwards trying to make the compiler 0.0-aware everywhere. We are just two people who split our time on many other things as well so weād rather not have to do this.
Changing exact comparisons (=:=) to consider them different, keeping arithmetic comparison the same.
As we consider the behavior itself a bug and itās not only the compiler thatās affected, weāre attempting option 3 in the hopes that it wonāt cause too many problems elsewhere. If it does weāll have to backtrack before the OTP 27 release (or perhaps punt it to OTP 28 or even later).
Yes, == and friends will work just as they have before. Itās only exact comparisons that will change.
This is a rather annoying wrinkle: people who have used X =:= Y as both an arithmetic equality and implicit type test cannot blindly rewrite their code to X == Y, and will have to figure out whether the type test is unnecessary or rewrite it as X == Y andalso (is_float(X) == is_float(Y)) or similar.
Judging by a survey of the float-heaviest code bases we know of we donāt expect this to become a problem, but you never know, and weāre well aware that we may have to change our course after the release candidates for OTP 27. Weāre making noise now in the hopes that weāll catch these issues early.
I am not heavily impacted personally, but I think the long lead time is a good idea.
Iāve probably used =:= or =/= as an implicit type test as you mention. I also give an introductory homework assignment that asks students to solve the quadratic equation. Normal solutions, including mine, pattern match on 0.0 for the coefficient of the x^2 term.
I assume this kind of code will have to go from
f(0.0) -> %...
to
f(X) when X == 0.0 -> %...
(or separate positive and negative 0.0 or do explicit type checks if that matters).
This will also arise in unit tests, because ?assertEqual(0.0, StudentAnswer) currently succeeds for both positive and negative 0.0 and will fail in the future. My unit tests for the above mentioned homework will require some changes (to use, e. g., ?assert(StudentAnswer == 0.0))
For me, itās probably just a handful of relatively short files to look through, but I can imagine it will take others a fair amount of time and testing.
This seems to have blown up elsewhere on the internet (hi Hacker News!) and a lot of people misunderstand what this change means. Since I canāt reply everywhere Iāll try to explain it here in a manner that hopefully makes more sense for people who are unfamiliar with Erlang:
Like Prolog before us we only have one data type, terms. This means that the language is practically untyped. There are only terms and while you can certainly categorize them if you wish, there are no types in the sense most people use that term.
Functions have a domain over which terms they operate, and going outside their domain results in an exception thatās often analogous to a type error (for example trying to add a list to an integer) which can be mistaken for having traditional types. To make things more confusing, we have functions that can tell you which pre-defined category a term belongs to, like is_atom returning true for all terms that are in the atom space and false for all terms outside of it.
This mixup is so prevalent that even our documentation refers to these pre-defined categories as ātypesā despite them being nothing more than value spaces, but itās important to remember that at the end of the day we only have one data type, and that many functions are defined for all terms.
The arithmetic equality operator (==) returns whether two terms are considered arithmetically equal (for clarity itās defined for all combinations of terms). Primitive non-numeric terms are simply checked for identity, compound terms are compared recursively, and any numeric terms (floats and integers) are compared according to the rules listed in the documentation. This operator will remain unchanged in the future and 0.0 == -0.0 will continue to hold.
This operator covers just about all common uses, but is not enough for code that needs to reason about terms in the general sense, for example sets, memoization, or other things you would use generics for in another language.
For that we have the exact equality operator (=:=) which returns whether two terms are indistinguishable. That is for any terms X and Y, X =:= Y returns whether f(X) =:= f(Y) for all pure functions f.
Since there exists several f that distinguish f(0.0) from f(-0.0), we either have to conclude that those functions are broken (where does that leave copysign?) or say that 0.0 =:= -0.0 should not return true.
We could make it consistent by removing all the things that let us observe the difference, but that removes functionality that people rely on so itās not much of an option. So what weāve done so far is to try to sweep these differences deep under the rug and hoping no one notices, one small patch at a time.
Since weāve never exposed copysign or the other IEEE functions that allow you to observe the sign of zero, it has kind-of-sort-of worked as long as people stayed entirely within Erlang-land. Unfortunately with the rising popularity of GPGPU in general and the Nx library in particular, this bug has been rearing its ugly head with increasing regularity.
Not only because the compiler flubs the constants 0.0 and -0.0 like in the examples earlier in the thread, but because application code does the same. If you want to memoize the result of the aforementioned f(0.0) it will be confused with f(-0.0) unless you invoke some arcane nonsense to keep them apart.
The compiler is just where this bug is most visible. While we can certainly try to make the compiler distinguish between these values at all times thereās nothing we can do for application code, hence our attempt at breaking backwards compatibility. If it fails weāre most likely going to have to throw in the towel on this bug.
Yes.
The compiler will raise a warning whenever 0.0 is used in this manner, so our hope is that it will not be too difficult to find where to change the code.