Binary packet definition languages

Hi There,

While working on Erlang-Red, specifically an MQTT emulation flow, I noticed how simple yet complicated binary definitions in Erlang are:

<<_:8/bits, 1:1, Len:7/bits, 1:1, Len2:7/bits, 1:1, Len3:7/bits, 0:1, Len4:7/bits, Rest/bytes>>

which is great but which isn’t low-code, IMHO :)
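For readers who don't know MQTT: the pattern above matches one fixed-header byte followed by a four-byte "remaining length", where the top bit of each byte is a continuation flag and the lower seven bits carry the value, least-significant group first. A minimal sketch of the same decoding in Python, handling one to four length bytes (the function name is made up for illustration, it is not part of Erlang-Red or Packet):

```python
def decode_remaining_length(data):
    """Decode the MQTT-style variable-length integer that follows data[0].

    Returns (length, rest), where rest is everything after the length bytes.
    """
    value = 0
    multiplier = 1
    for index in range(1, 5):  # the encoding uses at most four bytes
        if index >= len(data):
            raise ValueError("truncated packet")
        byte = data[index]
        value += (byte & 0x7F) * multiplier  # lower 7 bits carry the value
        if (byte & 0x80) == 0:  # continuation bit clear: last length byte
            return value, data[index + 1:]
        multiplier *= 128
    raise ValueError("malformed remaining length")
```

With the four-byte case from the Erlang pattern, the result is Len + Len2 * 128 + Len3 * 128^2 + Len4 * 128^3.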

So I’ve been looking for a more low-code representation and I came across Packet, which is a NodeJS implementation of such a representation. In Packet, the above would become:

x8,
x1 => 1,
b7 => len,
x1 => 1,
b7 => len2,
x1 => 1,
b7 => len3,
x1 => 0,
b7 => len4

which would generate a map with the keys:

#{ <<"len">> => ..., <<"len2">> => ..., <<"len3">> => ..., <<"len4">> => ... }
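To make the idea concrete outside Erlang, here is a rough Python sketch of an interpreter for this kind of definition. It only covers the subset used in the example, and the field semantics are an assumption read off that example rather than taken from the Packet documentation: `xN` skips N bits, `xN => V` requires the N bits to equal the literal V, and `bN => name` reads an N-bit big-endian unsigned integer. The function names are hypothetical.

```python
import re

# Assumed field syntax, inferred from the example above:
#   xN          skip N bits
#   xN => V     N bits that must equal the literal V
#   bN => name  N-bit big-endian unsigned integer stored under `name`
FIELD = re.compile(r"^([xb])(\d+)(?:\s*=>\s*(\w+))?$")


def parse_definition(definition):
    """Turn the comma-separated definition text into (kind, bits, arg) tuples."""
    fields = []
    for part in definition.split(","):
        part = part.strip()
        if not part:
            continue
        match = FIELD.match(part)
        if match is None:
            raise ValueError(f"cannot parse field: {part!r}")
        kind, bits, arg = match.groups()
        if kind == "b" and arg is None:
            raise ValueError(f"field needs a name: {part!r}")
        fields.append((kind, int(bits), arg))
    return fields


def apply_definition(fields, data):
    """Match `data` (bytes) against the fields.

    Returns (values, rest) on success, or None if the data is too short
    or a literal does not match.
    """
    total_bits = len(data) * 8
    as_int = int.from_bytes(data, "big")
    offset = 0
    values = {}
    for kind, bits, arg in fields:
        if offset + bits > total_bits:
            return None  # not enough input
        shift = total_bits - offset - bits
        value = (as_int >> shift) & ((1 << bits) - 1)
        offset += bits
        if kind == "x":
            if arg is not None and value != int(arg, 0):
                return None  # literal did not match
        else:
            values[arg] = value
    if offset % 8:
        raise ValueError("definition does not end on a byte boundary")
    return values, data[offset // 8:]
```

Feeding it the definition above and the bytes `30 81 82 83 04 AA` gives `{"len": 1, "len2": 2, "len3": 3, "len4": 4}` plus the remaining byte `AA`, which mirrors the map and the `Rest` binding of the Erlang version.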

There is a corresponding Node-RED node, binary, that implements the Packet specification as a node. As a comparison, there is another Node-RED node that does more and is more popular, but that I find too visual and too complicated: node-red-contrib-buffer-parser.

Is there an alternative low-code representation? What I’m looking for is a representation that can be used by Node-RED and Erlang-Red, hence I don’t want to be using the Erlang binary matching since that’s too specific (sorry to say) and too hard to understand for non-Erlang folks.

I’ve created an initial implementation but if there is a better, existing solution, then I’m happy to switch to that.

Cheers & Many Thanks!

I don’t see how your “Packet” example is more “low-code” than the Erlang version.
If I understand what “low-code” means, which I probably don’t, shouldn’t record structures
be drawn, not coded with arrows?

I actually put together an example over here: I’ve completed a first version of the parser that can chunk out PNG images:

collect_png_chunks(<<>>, Acc, _ChunkFunc) ->
    lists:reverse(Acc);
collect_png_chunks(Data, Acc, ChunkFunc) ->
    {Chunk, Rest} = ChunkFunc(Data),
    collect_png_chunks(Rest, [Chunk | Acc], ChunkFunc).

check_parsing_of_png_image_test() ->
    HeaderDef = "
       x8 => 0x89,
       x8 => 0x50,
       x8 => 0x4E,
       x8 => 0x47,
       x8 => 0x0D,
       x8 => 0x0A,
       x8 => 0x1A,
       x8 => 0x0A
    ",

    ChunkDef = "
       b32         => length,
       b8[4]       => type,
       b8[$length] => data,
       b32         => crc
    ",

    {ok, HeaderFunc} = erl_packetparser:erlang_func_for_packetdef(HeaderDef),
    {ok, ChunkFunc} = erl_packetparser:erlang_func_for_packetdef(ChunkDef),

    {ok, PngData} =
        file:read_file(code:priv_dir(erlang_red_parsers) ++ "/test.png"),

    % skim off the header.
    {#{}, RestData} = HeaderFunc(PngData),

    % retrieve all chunks defined in the png
    Chunks = collect_png_chunks(RestData, [], ChunkFunc),
    Types = [T || #{ <<"type">> := T } <- Chunks ],

    ?assertEqual(["IHDR","iCCP","eXIf","iTXt","IDAT","IEND"], Types).

As a comparison, to see what these “packet definitions” come to in terms of Erlang’s binary matching, see the unit tests for the parser.
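For a comparison from outside Erlang, the same chunk layout (a 32-bit big-endian length, a 4-byte type, `length` bytes of data, a 32-bit CRC) can be written by hand in Python with the standard `struct` module. This is only a sketch to show what the length-dependent `b8[$length]` field amounts to; the function name is made up and the CRC is read but not verified.

```python
import struct

PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def split_png_chunks(data):
    """Return the chunks of a PNG file as a list of dicts."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    chunks = []
    while offset < len(data):
        # b32 => length, b8[4] => type
        length, chunk_type = struct.unpack_from(">I4s", data, offset)
        body_start = offset + 8
        # b8[$length] => data: the size depends on the field read just before
        body = data[body_start:body_start + length]
        # b32 => crc
        (crc,) = struct.unpack_from(">I", data, body_start + length)
        chunks.append({
            "length": length,
            "type": chunk_type.decode("ascii"),
            "data": body,
            "crc": crc,
        })
        offset = body_start + length + 4
    return chunks
```

Collecting `chunk["type"]` for every chunk corresponds to the `Types` list in the Erlang test.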

Having put together the parser, ironically, I now better understand how the binary matching works. But I can’t expect users of Erlang-Red to know that. Hence this simplification. What this code looks like in Erlang-Red is shown by this flow.

Whether this format is better or not doesn’t really matter; it works for me. And I’d rather define some binary format using this format than Erlang’s binary matchers.

No - low-code v. no-code.

It’s the difference between something like Node-RED, n8n and IFTTT: they all have a different level of abstraction. n8n and IFTTT probably come closer to no-code but are limited by that approach to very specific use cases.

I don’t really get your implementation, but if it works for you, great. I made a very simple binary parser demo a few years ago, that you can look at for potential inspiration: GitHub - fylke/binlog-demo: Binary log parser demo in Erlang

I also have a cheat sheet for binary parsing here: GitHub - chiroptical/erlang-binary-cheatsheet: Converting https://cheatography.com/fylke/cheat-sheets/erlang-binaries to typst with PDF release

Interesting, since when I go to your solution, I’m confronted with the most important thing in open source development: a license text. No readme, no explanation, no description but a license text. A very nice and concise text but it doesn’t do it for me.

I don’t quite understand how your solution relates to what I’m trying to do but I’m not known for my bookwormness either.

I’m not trying to provide you with a solution, I just wanted to show you idiomatic binary parsing, since you expressed unfamiliarity with Erlang binary notation. Do with it what you will.

Understood & thanks, but that’s not what I need. Basically I want to be able to code Erlang using crayons. Instead of expensive Japanese pencils, I want to use crayons and draw pretty pictures.

Why?

Because crayons are compatible with Node-RED, which I’m emulating with Erlang-Red. Expensive Japanese pencils don’t fly with a visual paradigm.

Erlang’s binary matching isn’t portable. Something like Packet comes from Node-RED, so anything I do in Erlang-Red with it will work with Node-RED.

Why?

Don’t ask that question, or think of it as something akin to Gleam, Elixir and Erlang and their relationship to the BEAM, except that the visual approach goes in the other direction: from high-level visual abstraction towards coding.

This project comes to mind: https://kaitai.io/ I never used it though, because Erlang exists.

Exactly something like that. One definition, many languages:

This .ksy file can be compiled into gif.cpp / Gif.cs / gif.go / Gif.java / Gif.js / gif.lua / gif.nim / Gif.pm / Gif.php / gif.py / gif.rb and then one can instantly load a .gif file and access, for example, its width and height.

Notice: no gif.erl. Perhaps something worth pursuing.

Btw, how does one create a parser for a language (such as YAML) that is based on indentation? There needs to be state passed around, or how else would a yecc parser be constructed?
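One common approach, used for example by Python's own tokenizer, is to keep that state in the lexer rather than in the grammar: the lexer tracks a stack of indentation widths and emits explicit INDENT and DEDENT tokens, so the parser generator only ever sees an ordinary bracketed block structure. A minimal sketch in Python, with illustrative token names and with tabs, comments and continuation lines ignored:

```python
def tokenize_indentation(text):
    """Yield ("INDENT",), ("DEDENT",) and ("LINE", content) tokens."""
    stack = [0]  # indentation widths of the currently open blocks
    for raw in text.splitlines():
        if not raw.strip():
            continue  # blank lines do not affect block structure
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            # deeper than the current block: open a new one
            stack.append(width)
            yield ("INDENT",)
        while width < stack[-1]:
            # shallower: close blocks until the widths line up again
            stack.pop()
            yield ("DEDENT",)
        if width != stack[-1]:
            raise ValueError(f"inconsistent dedent: {raw!r}")
        yield ("LINE", raw.strip())
    while len(stack) > 1:  # close blocks still open at end of input
        stack.pop()
        yield ("DEDENT",)
```

A grammar can then treat INDENT and DEDENT like opening and closing braces, so no state has to be threaded through the parser itself.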

Hello. I’m probably being dense but I really don’t understand the question or issue here (if there even still is one!). From what I can tell, you’re building a “low-code” visual programming environment where people can code with crayons to get the benefits of a programming language/toolkit/environment that in your view provides a lot of value but requires people to code with expensive pencils. As part of the value that visual programming provides, you’d like users of the environment you’re developing to be able to represent binary patterns in a particular way, which doesn’t match the way the underlying language/system you’re using to develop it with represents binary patterns. Isn’t that simplification precisely the value that your visual programming environment offers? To wrap the perceived complexity in a more simple package? And isn’t it therefore purely an implementation question / decision / detail for your software, which can be done exactly however you want it to be done? What am I missing?

Yes and no, I’m still trying to be compatible with Node-RED so, to a certain extent, I would like to do what they have done as much as possible.

Every single programming language is doing the very same. Or when was the last time we coded in 1s and 0s? So even Erlang is an abstraction from the underlying assembly language which is an abstraction of the +3.3v/+5v/0v impulses of the CPU.

With each “simple package” assumptions are made: every simplification removes complexity but also functionality, so the trick is to simplify in a way that is most useful for most users. I.e., remove functionality that might only be used by 10% of end-users or that isn’t required to provide solutions for the envisioned problem space.

For me, the level of simplification in <<binary:matcher/syntax>> is still too complex for the problems I’m envisioning in a low-code visual programming environment. I think this can be simplified even further; the question is what the best level of simplification is for the envisioned use case.

And portability.

A visual representation can also be seen as a kind of Esperanto of programming languages, one that encapsulates programming concepts from various programming languages and promises a global lingua franca which can be used to share solutions. Of course, most folks won’t understand this since it seems far too hard to do; however, it seems that RISC is becoming just that lingua franca in CPU development. In fact, how many different types of CPUs are there compared to 30 years ago? Unification and simplification are natural tendencies of the human species.

Another example: AIs attempt to bring simplification to programming by using text inputs to produce code. So to get Erlang code to solve one problem, one only has to tell AI to produce the solution in Erlang, or Ruby or PHP or whatever language one wants.

Using a visual programming environment is just another attempt at rebuilding the Tower of Babel.

How many different CPU types are there compared with 30 years ago?
You might be surprised.
Architectures die (although there are oodles of simulators).
New architectures like RISC-V also arise.
I have several 32-bit microcontroller boards, RPi Pico, RPi Pico 2, ESP32, and even a
couple of actual RISC-Vs.
I’ve lost count of the 8-bit and 16-bit architectures out there.