Glazer - the Swiss Army knife of parsing JSON/YAML/CSV

glazer started as a blazing-fast NIF JSON codec (single-pass, straight to/from native terms - no AST). In 0.3.0, the same engine now parses and encodes YAML and CSV too.

The numbers:

  • JSON: faster encoding than everything I tried (jason, thoas, euneus, torque, jiffy, simdjsone, OTP json), and neck-and-neck with torque (Rust sonic-rs) on decoding

  • YAML: 10x faster than yaml_rustler/fast_yaml, up to 100x faster than yamerl/ymlr

  • CSV: 2-20x faster than nimble_csv; csv and erl_csv time out on large files that glazer finishes

Plus: streaming decode for JSON/CSV, bignum support, configurable null, zero external C++ deps.

{deps, [{glazer, "~> 0.3"}]}.

Charts and full benchmarks: https://github.com/saleyn/glazer#performance
Hex: https://hex.pm/packages/glazer

Break it, measure it, tell me what’s slow.

Best,
Serge

What version of Jiffy have you compared against? The latest version has a lot of performance improvements.

It was measured against the latest (2.0.0).

Hi @saleyn

Really nice work on glazer. The CSV throughput looks impressive.

Quick question so I’m reading the API right before I run a few numbers of my own (against our proprietary CSV lib): when decoding CSV, does glazer always hand back every field as a binary?
i.e. csv_decode/1,2 returns [[binary()]], and any actual type interpretation (integers, floats, booleans, dates, ...) is left to the caller?

Just want to make sure I’m comparing like-for-like.

Thanks!

Currently yes. A good addition would be to do the type conversion on the C++ side when a {col_types, [integer|float|boolean|date|{date, Format::binary()}|binary|charlist]} option is provided. You can submit an issue, and I’ll try to get to it soon.
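
For caller-side conversion in the meantime, here is a minimal sketch. It assumes rows arrive in the [[binary()]] shape discussed above. The module name csv_cast and the function cast_rows/2 are hypothetical helpers, not part of glazer, and only OTP built-ins are used:

```erlang
-module(csv_cast).
-export([cast_rows/2]).

%% Hypothetical helper, NOT part of glazer: converts rows of binary
%% fields using a per-column type list supplied by the caller.
-type col_type() :: integer | float | boolean | binary.

-spec cast_rows([[binary()]], [col_type()]) -> [[term()]].
cast_rows(Rows, Types) ->
    [cast_row(Row, Types) || Row <- Rows].

cast_row(Row, Types) ->
    %% lists:zipwith/3 raises if the row and the type list differ in
    %% length, which surfaces ragged rows early.
    lists:zipwith(fun cast/2, Types, Row).

cast(integer, Bin) ->
    binary_to_integer(Bin);
cast(float, Bin) ->
    %% binary_to_float/1 rejects <<"3">>, so fall back to integer parsing.
    try binary_to_float(Bin)
    catch error:badarg -> float(binary_to_integer(Bin))
    end;
cast(boolean, <<"true">>) -> true;
cast(boolean, <<"false">>) -> false;
cast(binary, Bin) -> Bin.
```

For example, csv_cast:cast_rows([[<<"1">>, <<"2.5">>, <<"true">>, <<"x">>]], [integer, float, boolean, binary]) returns [[1, 2.5, true, <<"x">>]]. Doing this in Erlang after decoding adds a second pass over the data, which is why conversion inside the NIF would be the faster option.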

I implemented many options to control CSV parsing.

Also, the latest version has further performance improvements and the ability to run jq-like searches over JSON and YAML.

@saleyn when will the API be considered stable (json_decode/1 => json_try_decode/1, etc.)?

This weekend I made a few backward-incompatible changes needed to normalize the API across the JSON/YAML/CSV codecs, but as of now it should be pretty stable.
