Hello everyone! I’ve converted @michalmuskala’s wonderful Elixir Jason library to Erlang.
It has a smaller feature set than Jason; see the README for details.
Why convert Jason to Erlang?
I wanted to use Jason from within Gleam and Erlang projects without pulling in the Elixir compiler and standard library.
I wanted a fast JSON library to which I could add an API that does not work by traversing maps and lists created from Erlang, as I think this would be more suited to Gleam.
How does it perform?
In my benchmarking I found Thoas to use the same amount of memory as Jason while being a few percent faster or slower, which I expect is due to my development machine being a bit noisy.
Here it is compared to jsone and jsx.
##### With input Pokedex #####
Name        ips     average  deviation   median   99th %
thoas    762.02     1.31 ms    ±21.99%  1.28 ms  1.61 ms
jsone    671.54     1.49 ms    ±17.81%  1.43 ms  1.93 ms
JSX      398.03     2.51 ms     ±9.54%  2.44 ms  2.91 ms

Comparison:
thoas    762.02
jsone    671.54 - 1.13x slower +0.177 ms
JSX      398.03 - 1.91x slower +1.20 ms

Memory usage statistics:

Name    Memory usage
thoas        0.38 MB
jsone        1.23 MB - 3.23x memory usage +0.85 MB
JSX          2.59 MB - 6.80x memory usage +2.21 MB
Thanks
All credit for this library goes to Michał and the Jason contributors; they did all the work.
It looks very impressive!
Have you compared its speed against other libraries (jiffy, jsone, poison, jazz, jsx) for maps, lists, strings, string escaping, large values, and pretty printing?
It’s roughly the same as Jason, so it’s generally slower than Jiffy but faster than the others.
There are benchmarks in the repo that you can run, but they take a couple of hours.
edit: I ran a portion of the benchmarks and added them above.
On my MacBook (M1), for encoding I consistently see Jason on top, and sometimes even Poison. Poison especially surprises me. I wonder what’s up.
Did you do the translation completely by hand? If so, it might be worth doing a mix decompile to spot differences. I would expect the results to be, at worst, the same as Jason’s.
I find that on my M1 MacBook the ordering is random, with Thoas and Jason always within a few percent of each other. Poison sometimes seems to beat Jason on OTP 24; perhaps the JIT changed some things.
I did this:
1. Fork Jason
2. Remove all usage of the Elixir stdlib
3. Remove Elixir-specific features
4. Compile to .beam
5. Decompile to Erlang
6. Fix compile errors
7. Neaten up formatting
8. Remove 10k lines of repeated generated code
I expected there to be some performance impact from step 8 (removing the generated code), but when I benchmarked before and after I saw no change.
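For anyone curious, the compile-to-.beam and decompile-to-Erlang steps can be approximated with OTP's own tools. This is a minimal sketch and not necessarily the tooling used here. It assumes the .beam file was built with debug info, and that Elixir is on the code path when the .beam came from Elixir (beam_lib has to call back into Elixir to translate its debug info). The module name decompile_sketch and the function to_erlang/1 are made up for illustration.

```erlang
-module(decompile_sketch).
-export([to_erlang/1]).

%% Recover Erlang source text from a .beam file compiled with debug info.
%% BeamPath is a hypothetical path, for example "Elixir.Jason.beam".
to_erlang(BeamPath) ->
    case beam_lib:chunks(BeamPath, [abstract_code]) of
        {ok, {_Module, [{abstract_code, {raw_abstract_v1, Forms}}]}} ->
            %% Pretty-print the abstract forms back into source text.
            Source = erl_prettypr:format(erl_syntax:form_list(Forms)),
            {ok, unicode:characters_to_binary(Source)};
        {ok, {_Module, [{abstract_code, no_abstract_code}]}} ->
            %% The module was compiled without debug info.
            {error, no_debug_info};
        {error, beam_lib, Reason} ->
            {error, Reason}
    end.
```

The output is valid Erlang but machine-shaped, which is why the later "fix compile errors" and "neaten up formatting" steps are still needed by hand.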
I ran this one on battery power so it’s a bit slower than the above.
##### With input Pokedex #####
Name        ips     average  deviation   median   99th %
thoas    658.88     1.52 ms    ±31.23%  1.44 ms  3.03 ms
json     388.80     2.57 ms    ±19.98%  2.46 ms  3.83 ms

Comparison:
thoas    658.88
json     388.80 - 1.69x slower +1.05 ms

Memory usage statistics:

Name    Memory usage
thoas        0.79 MB
json         1.74 MB - 2.20x memory usage +0.95 MB
I wasn’t able to get the decode benchmarks to run as the json:decode/1 function would crash. I’m likely doing something wrong.
It looks like you need to pass the option {maps, true} for map support (i.e., json:encode(Value, [{maps, true}])).
I ran the encode tests:
Operating System: macOS
CPU Information: Apple M1
Number of Available Cores: 8
Available memory: 16 GB
Elixir 1.13.1
Erlang 24.2
Benchmark suite executing with the following configuration:
warmup: 5 s
time: 30 s
memory time: 1 s
parallel: 1
inputs: Pokedex
Estimated total run time: 1.20 min
Benchmarking jhn_stdlib-json with input Pokedex...
Benchmarking thoas with input Pokedex...
Generated /Users/starbelly/devel/erlang/thoas/bench/output/encode.html
Generated /Users/starbelly/devel/erlang/thoas/bench/output/encode_pokedex_comparison.html
Generated /Users/starbelly/devel/erlang/thoas/bench/output/encode_pokedex_jhn_stdlib_json.html
Generated /Users/starbelly/devel/erlang/thoas/bench/output/encode_pokedex_thoas.html
Opened report using open
##### With input Pokedex #####
Name                  ips     average  deviation   median   99th %
thoas              662.42     1.51 ms    ±25.45%  1.44 ms  2.95 ms
jhn_stdlib-json    408.06     2.45 ms     ±9.51%  2.46 ms  2.98 ms

Comparison:
thoas              662.42
jhn_stdlib-json    408.06 - 1.62x slower +0.94 ms

Memory usage statistics:

Name              Memory usage
thoas                  0.79 MB
jhn_stdlib-json        1.74 MB - 2.20x memory usage +0.95 MB
Edit: I might have derped; you said you couldn’t run the decode tests, not the encode tests.
Either way, here are the decode tests, where there’s a much bigger difference:
##### With input Pokedex #####
Name                  ips     average  deviation   median   99th %
thoas              775.07     1.29 ms     ±5.11%  1.27 ms  1.50 ms
jhn_stdlib-json    185.93     5.38 ms    ±17.89%  5.13 ms  8.04 ms

Comparison:
thoas              775.07
jhn_stdlib-json    185.93 - 4.17x slower +4.09 ms

Memory usage statistics:

Name              Memory usage
thoas                  0.38 MB
jhn_stdlib-json        8.03 MB - 21.08x memory usage +7.65 MB
Worth pointing out that Pokedex, to a certain extent, and Canada, more visibly, have massive memory usage bloat due to floats. So far, the “to string” function for floats has returned charlists instead of iodata. This cannot be fixed easily until OTP 25.
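To make the float point concrete, here is a small sketch. It assumes the function in question is io_lib_format:fwrite_g/1 (the post doesn't name it), and that the OTP 25 fix refers to the short option of float_to_binary/2, which was added in that release. The module and function names are hypothetical.

```erlang
-module(float_text_sketch).
-export([as_charlist/1, as_binary/1]).

%% Shortest round-trip text as a charlist: one list cell (two machine
%% words) per character, which is where the memory bloat comes from.
as_charlist(F) when is_float(F) ->
    io_lib_format:fwrite_g(F).

%% Same text as a binary, one byte per character. Requires OTP 25 or
%% later, where the `short` option was introduced.
as_binary(F) when is_float(F) ->
    float_to_binary(F, [short]).
```

For example, float_text_sketch:as_charlist(1.5) gives "1.5" while float_text_sketch:as_binary(1.5) gives <<"1.5">>. A float-heavy input like Canada pays the charlist cost once per number.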