Strange issue with OTP 25 RC1 on OSX (with JIT enabled)

I have a strange problem testing OTP 25 RC1 on OSX (with JIT enabled).

I tried to test the leveled database (martinsumner/leveled on GitHub: a pure Erlang Key/Value store based on an LSM-tree, optimised for HEAD requests) today on OTP 25 RC1. It compiles OK and dialyzer is OK, but the eunit tests fail.

The interesting failure is in the leveled_ebloom module. The purpose of the module is to create a bloom-filter-like structure for checking the existence of keys. It works by taking a hash of the key and splitting it into a slot and two hashes. The bloom is split into slots, and the two hashes are added to the chosen slot using bor. The number of slots varies depending on the number of hashes the bloom has to hold.
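
To make the structure concrete, here is a minimal sketch of the idea, not the actual leveled_ebloom code. The module name, the 4096-bit slot width, the use of erlang:phash2/2 and the way the hash is split are all assumptions made for illustration.

```erlang
-module(bloom_sketch).
-export([new/1, add/2, check/2, demo/0]).

%% Assumption: each slot is one integer used as a 4096-bit bitmap, so each
%% of the two per-key hashes is a bit position in the range 0..4095.

%% A bloom is a tuple of SlotCount integers, all initially 0.
new(SlotCount) ->
    erlang:make_tuple(SlotCount, 0).

%% Split a single 32-bit hash of the key into a slot and two bit positions.
split(Key, SlotCount) ->
    Hash = erlang:phash2(Key, 1 bsl 32),
    Slot = (Hash bsr 24) rem SlotCount,
    H0 = Hash band 4095,
    H1 = (Hash bsr 12) band 4095,
    {Slot, H0, H1}.

%% Build a mask with one bit per hash and bor it into the slot.
add(Key, Bloom) ->
    {Slot, H0, H1} = split(Key, tuple_size(Bloom)),
    Mask = (1 bsl H0) bor (1 bsl H1),
    S = element(Slot + 1, Bloom),
    setelement(Slot + 1, Bloom, S bor Mask).

%% A key may be present only if both of its bits are set in its slot.
check(Key, Bloom) ->
    {Slot, H0, H1} = split(Key, tuple_size(Bloom)),
    Mask = (1 bsl H0) bor (1 bsl H1),
    (element(Slot + 1, Bloom) band Mask) =:= Mask.

%% Add 1000 keys to a 16-slot bloom and confirm that every one is found.
demo() ->
    Keys = [{key, N} || N <- lists:seq(1, 1000)],
    Bloom = lists:foldl(fun add/2, new(16), Keys),
    lists:all(fun(K) -> check(K, Bloom) end, Keys).
```

Because a bloom must never give a false negative, bloom_sketch:demo() should return true on a correctly behaving runtime.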

The failing test creates lots of random keys (hence hashes), and then builds a bloom, then checks all the keys/hashes are confirmed as in the bloom.

It seems odd that such a simple thing would be broken by an OTP change, but it is (on my machine).

If we run the test on OTP 25 it consistently fails (and it consistently passes on all other OTP versions). The bloom is created, but some of the hashes added to the bloom are not found. Mysteriously, it only fails on hashes where the Slot is 2 mod 16. All other keys (those that fall into a Slot which is not 2 mod 16) are found OK.

Putting some print statements in as the hash is built, I have discovered that the calculation S2 bor Mask, which appears in several places in the module (leveled/leveled_ebloom.erl at develop-3.1 in martinsumner/leveled on GitHub), is being truncated to just a 32-bit output. This happens only for these Slots, and not for the other lines above and below, which are associated with Slots that are not 2 mod 16.

I’ve tested this on an Ubuntu machine with OTP 25 but without JIT, and the test passes fine. So this is either OSX or JIT specific. I’m going to set up an Ubuntu machine with JIT now to see whether it is JIT or OS specific.

Once I have done that I will update, and then try and simplify the test case. However, I thought I would share now, as this is driving me a bit crazy.

I wonder if it’s OSX on M1/Arm or on Intel/x86?

It is OSX Mojave on Intel

This is the output from adding print statements to show the Hashes being added, the Mask created to add them, the current S2 before adding the Mask, and then the result of S2 bor Mask:

Add Hashes [3739,3638]
As Mask 3557572731526590674030709770534302190003655244542544923204105470076268198705533519803866265255031901329837788586526551337304045787628428098940916466498870739226079017293967539479428987563966568962987869277690962999434281262595696698419192062534793294646526561118880823034587386082952761438569799620871137172561273083401308870027533917990799433007615607122993334242212334433576140829688416793357620141324025202803627214981237531620856295624064503609369048424650867265141002508417757675409090556860953624975280762781766570226649501628236616334161674359480780978147270987752805021255694805534160516684302956139788378057204612532878659737233012955872929407806708978729209272539782742660119316279405494770634367650627487047773910402706697136840067073975433570307194136222558110146657955411590041986807505124905886018416774700520998063064333021528526801723472478524352910366634140537724300561670403041252246921177064834812678352966032875963771695704686976165832351303759567837248396713556035972117416139365303339641470986226894270863836670569779096061065762753997700823359849612169589694008924192077943991917535652977814459050360832
This S2 301989887
Next S2 301989887
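
For comparison, here is a sketch of what the bor would be expected to produce. It assumes the Mask is simply one bit per hash (bit positions 3739 and 3638, taken from the output above). The variable Next is an illustrative name, and the expressions can be pasted into an erl shell.

```erlang
%% Assumption: the mask sets exactly one bit per hash.
Mask = (1 bsl 3739) bor (1 bsl 3638),
S2 = 301989887,                      % 16#11FFFFFF, fits in 32 bits
Next = S2 bor Mask,
%% A correct result keeps both high bits and is larger than S2 ...
true = (Next band Mask) =:= Mask,
true = Next > S2,
%% ... while keeping only the low 32 bits gives back the printed value.
S2 = Next band 16#FFFFFFFF.
```

Under that assumption, a "Next S2" equal to the original S2 is what a result cut down to 32 bits would look like.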

Thanks for reporting this! I’ve found a bug that would explain it, can you try this branch?

Thanks for your quick response, I’m building this now.

I can confirm that your fix resolves the issue. Thank you for solving this so promptly.

Thanks, I’ve opened Pull Request #5727 on erlang/otp ("jit: Fix integer ranges", by jhogberg) and will merge it in a few days if all looks well. :)
