Structs in Erlang

An interesting conversation was struck up on the Erlanger Slack about accessing elements in a record via pattern matching without knowing the record type. The TL;DR of that part is that this is map functionality and not really possible with records, AFAIK.

This led to the thought of having something similar to Elixir's structs in Erlang, but faster :smile:

Basically, a would-be struct could be built on top of a record or similar to a record for performance reasons.

A nicety that structs in Elixir allow for is optionally matching on the struct name or just treating the value like a plain ole map, since a struct is a map under the hood. What's more, they don't require importing the definition of a struct; the compiler figures this out from a struct attribute which points back to the defining module.

Thus, I think it would be possible to have a similar structure and behaviour to go along with it, but with record behavior underneath.

If I think through the compiler steps, it should be possible, although there would be a cost associated with it at compile time. That is, instead of simply checking the struct definition to see whether a key is part of the definition, in Erlang and with something like a record underneath, the pattern match or access method could expand to grabbing the desired element out of the struct (tuple) at its precise position, found by looking up the definition (fetched from the module that defined it).
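A rough sketch of that lookup, done at runtime here rather than at compile time. The module name, the `packet` record, and the `fields/0`, `position/1`, and `fetch/2` functions are all hypothetical names chosen for illustration; a real implementation would resolve the position in the compiler instead of walking a list on every access.

```erlang
-module(struct_lookup_demo).
-export([fields/0, new/2, position/1, fetch/2]).

%% Hypothetical "struct" definition, stored as an ordinary record (a tuple).
-record(packet, {id, payload}).

%% Field names in declaration order, as the defining module would publish them.
fields() -> record_info(fields, packet).

%% Builds the underlying tuple {packet, Id, Payload}.
new(Id, Payload) -> #packet{id = Id, payload = Payload}.

%% Tuple position of a field. Counting starts at 2 because
%% element 1 of the tuple holds the record tag.
%% Crashes with function_clause for a field that is not in the definition.
position(Field) -> position(Field, fields(), 2).

position(Field, [Field | _], N) -> N;
position(Field, [_ | Rest], N) -> position(Field, Rest, N + 1).

%% Access by field name without the caller including the record definition.
fetch(Field, Struct) -> element(position(Field), Struct).
```

A caller in another module could then write `struct_lookup_demo:fetch(id, P)` without including any record definition, which is the convenience being asked for, at the price of the lookup.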

Aside from having to restrict structs to being defined on top of a module, the other con would be that spitting out a tuple isn't the most friendly format if you're debugging, but perhaps it could be pretty printed in a friendlier way.

Quite interested in what people think of this.

Edit:

I think the friendly formatting could be solved with a bit of abstraction. That is, one could define the structure using a map, but under the hood it ends up being a tuple, and when pretty printing, you can just refer back to the map and spit it out on the screen as a map, perhaps with a module# prefix or something similar.
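As a minimal sketch of that idea, the tuple can be turned back into a map for display by zipping the field names with the tuple's values. The module and the `packet` record are hypothetical, and the module# prefix is left out since its exact form is undecided.

```erlang
-module(struct_pretty_demo).
-export([to_map/1]).

%% Hypothetical definition used only for illustration.
-record(packet, {id, payload}).

%% Rebuilds a map view of the tuple for printing: drop the record tag,
%% then pair each remaining value with its field name.
to_map(#packet{} = P) ->
    [_Tag | Values] = tuple_to_list(P),
    maps:from_list(lists:zip(record_info(fields, packet), Values)).
```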

2 Likes

I must admit that I fail to see what practical purpose this would serve :thinking: When would one not know the record type to match on, but still know that a specifically named field is contained in it?

3 Likes

Sounds a bit like the frames proposal by @nzok

2 Likes

Consider some different records that share the same field, for example a Packet Identifier, and you want to get its value.

3 Likes

Indeed, definitely in line with frames, though frames sound quite complex implementation-wise, not that I'm opposed to the idea of frames :slight_smile: Though the syntax I think could be simplified. I'll have to re-read this, it's been a while.

Spawnfest idea: implement frames :smile:

1 Like

I can also think of other patterns where you have a record or a struct and only a single entry point needs to validate that it is said record or struct. As an example, you have a module with one or more public functions, but many private functions. It's not necessary for every private function to match on the name of the structure; they can just treat it as some general structure.

That said, my interest in this is a record-like data structure that has some compile-time guarantees but is flexible like a map. Pure and simple.

Edit:

And the speed of a record :smile:

2 Likes

I personally feel that if we add something to Erlang called structs then they should be the same as Elixir structs, otherwise it will create enormous confusion. If we do add something else then give it another name. Then we can start arguing about what we want and don't want.

While I am not a big fan of Elixir structs I feel they should be added to Erlang and they should be compatible. This would greatly help the interaction between the languages. I did it for LFE for this reason.

9 Likes

So, you're thinking maps then, as opposed to building on top of records?

Edit:

I'd be in support of it. I kind of wanted to have my cake and eat it too (i.e., speed from records, at least in tight loop scenarios).

I know you're not a fan of binding the struct to a module, and really, per Jose, that's a means to an end to support protocols in Elixir. That said, it does bring a convenience: a struct defined on any module is available globally (i.e., you don't have to import a record definition). That I very much like, but maybe there's an alternative way of doing that so that structs could be defined anywhere (including in an hrl), but are globally available. Thoughts?

1 Like

I'm still not convinced, but well :sweat_smile:

Records are just syntactic sugar over tuples, the speed they have is simply the speed you get from working with tuples. So if you want to build on records, in reality you have to find a way to build on tuples. That is, forget about records and record field names (that information is lost when the compiler substitutes them for what they really are), but think in tuple elements and positions.
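To make that concrete, here is a small sketch with a hypothetical `point` record showing that the record syntax and plain tuple operations are interchangeable at runtime.

```erlang
-module(record_tuple_demo).
-export([new/2, x_by_name/1, x_by_position/1]).

%% Hypothetical record used only for illustration.
-record(point, {x, y}).

%% Record construction; the resulting term is the tuple {point, X, Y}.
new(X, Y) -> #point{x = X, y = Y}.

%% Field access by name, which the compiler turns into a positional lookup.
x_by_name(P) -> P#point.x.

%% The same access spelled out: #point.x is just the integer 2,
%% the position of x in the tuple (position 1 holds the tag).
x_by_position(P) -> element(#point.x, P).
```

Both accessors accept a hand-written `{point, 1, 2}` tuple just as happily as a value built with `new/2`, since nothing about the field names survives compilation.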

4 Likes

One of the things that should have been obvious about the Frames proposal was
that a small tweak to the design would have allowed a single tuple to be
viewed as a record and as a frame simultaneously, but even then, you'd have
to explicitly add the wrapper.

Erlang records provide economy of storage and efficiency of access.
The price is having to know at compile time what record type is involved.
Frames would have provided the same economy of storage with lowered
efficiency of access.
Maps (and structs) do not provide that economy of storage.

Programming involves tradeoffs. I think this may be a lollipop moment.

2 Likes

What does "share the same field" mean? Does it mean to have a field with the same
name? With the same name and type? With the same name and type and location?
What if two records each have a field with the same semantics but different names?
What if two record types have a field with the same name, type, location, and
semantics when the client is written, but at a later time one of them is revised
during maintenance to have a different name or a different type or a different
location or a different meaning, or is even deleted?

Oddly enough, there is a dynamically typed programming language with records
declared as
record foo(x, y)
record bar(y, x)
which allows thing.x and thing.y := 0 regardless of whether thing is a foo
or a bar. The programming language is Icon. I thought this might be a benefit,
but in practice I did not find it so.

I suspect that what's really wanted here is 'view patterns'.

2 Likes

It means having the same field name in different positions, alongside a different number of other fields.

The thing is that field names are a VERY weak property of data.
I started flicking through the Pathom documentation in response to a recent
message about Boto, and it occurred to me that this may be another lollipop
moment. Consider the question "what is the temperature in Recife?" which is
studied in the Pathom introduction. This is addressed via
{:city Recife
:temperature ?}
OK. But what does "temperature" mean? For example, temperature-at-mown-grass-height
and temperature-at-shoulder-height generally differ by several degrees; during the day
the land is hotter than the air, and during the night the land is colder than the air.
For example, temperature varies with height. The rate at which it does so is known as
the lapse rate. The result is that where I live is about a degree colder than in the
centre of the city, but since the centre of the city is pretty nearly at sea level it
is that temperature which is least typical of the city as a whole. For example, the
city extends onto a ridgy peninsula, with the obvious temperature difference. For
example, the temperature for a city is often measured at the nearest airport, which
tends for obvious reasons to be hotter than the city. So different semantics for
"temperature" could lead to differences of as much as ten degrees C. For that
matter, the query as shown has no time information. There are places in this country
(which is classified as having a temperate climate) where the annual maximum
exceeds the annual minimum by 60 degrees C.

My point is that two fields in two records having the same name is a sheer accident.
The only way to be confident that it makes any kind of sense to regard them as
"the same" is for that to be a consequence of deliberate design explicitly
documented. And in that case it costs the original designer very little to
provide an access function.
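A minimal sketch of such an access function in today's Erlang follows. The two record layouts are invented for illustration; the only assumption that matters is that their designer has deliberately decided that both `temperature` fields mean the same thing.

```erlang
-module(temperature_access).
-export([temperature/1, example/0]).

%% Hypothetical records: the shared field sits at different positions
%% and the records have different numbers of other fields.
-record(frobnitz, {id, temperature}).
-record(snarkle, {temperature, label, owner}).

%% One documented entry point that hides where the field lives.
%% Any other term fails with function_clause instead of guessing.
temperature(#frobnitz{temperature = T}) -> T;
temperature(#snarkle{temperature = T}) -> T.

%% Callers use the access function and never touch the layouts.
example() ->
    [temperature(#frobnitz{id = 1, temperature = 21.5}),
     temperature(#snarkle{temperature = 18.0, label = airport, owner = none})].
```

Unlike an abstract pattern, this can only be called in a function body or from a `case`, not used directly inside a pattern match.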

Years and years ago I proposed 'abstract patterns' for Erlang, which would
have let you write something like
#temperature(#frobnitz{temperature = X}) -> X;
#temperature(#snarkle{temperature = X}) -> X.

and use #temperature in a pattern match without losing any of the safety
properties of Erlang pattern matching. This was before I ever heard of view
patterns, but basically the same issues plus Erlang's 'patterns can't do '
guarantees.

What I think Elixir has done, and what I think Pathom and Boto are doing, is to
make it easier to do something dangerous. I'm reminded of a book I once read
about using Perl to build software engineering tools, where every single example
showed how easy it was to use Perl … to do the wrong thing, something fragile
and misleading. Relying on the same key in two unrelated maps to imply some kind
of useful relationship is also dangerous in the same way. (And to be honest,
frames would also have facilitated the same dangerous practice.)

Deliberate design, explicitly explained. That's what it takes.
And then, "trust, but verify".

3 Likes

That seems unnecessary. :smile: Most of the functionality of frames was implemented in Erlang/OTP 17.

However, we call them maps and not frames. Small maps with up to 32 elements are implemented in the way that @nzok suggested in his frame proposal.

4 Likes

haha yeah, someone pointed that out to me yesterday :stuck_out_tongue:

Though, my eyes went straight to the record-like definitions, etc.

1 Like

Quite curious what you consider dangerous here with regard to structs in Elixir. I can infer it from what you said, but I'd rather not.

1 Like

I think my original idea was to build an abstraction on top of an abstraction, but that sounds like not a great idea now :smile:

I do rather like @rvirding's idea once it was thrown out… sacrifice speed for compatibility. The other alternative is to give records a better syntax, but in my initial thinking, I figured I'd leave records alone.

2 Likes

I see structs in Elixir as a dangerous solution because they're error prone, in my opinion at least.
They give you a sense of security where there is none: pattern matching on a struct name doesn't ensure the presence of all keys, and clustered nodes with different definitions of the same struct can break things with bad data.

It's difficult to talk about Pathom because it's a Clojure library; some premises and design choices are just different. What I'm trying to achieve with boto is something that is easy but in keeping with BEAM premises. My biggest concerns right now are dealing with failure states and how strictly I handle input attributes.

1 Like

I inferred that's what ROK meant too, but by that logic everything is dangerous: maps, lists, tuples, processes, etc. I think structs are a tool and anything can be abused. Structs give me some compile-time guarantees and I'd rather have that than nothing at all. Polymorphism aside, they provide me with named currencies such that a consumer can unequivocally say "This is that and it is guaranteed to have this shape", but that's it. It has never given me the illusion that field values are of a particular type; we have to lean on Dialyzer for that, or on runtime validation. Either way, it's just a tool, and as sharp as any other IMHO.

EDIT:

As far as the shape changing, that's a human problem IMO. Humans can change anything :slight_smile:

1 Like

That's my point, there is no such guarantee… you only have it if the code "casting" the struct is the same code that is using it.
So, let's say you have a clustered app where data passes between instances as plain Erlang terms, with no encoding (JSON, XML, etc.).
When you deploy new nodes with an updated struct definition, a struct cast on an instance with the old definition only has the shape of the old definition, and vice versa. But pattern matching on the name of the struct is going to work on both, regardless of whether the shape is consistent with your local definition of the struct.
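Seen from the Erlang side, where an Elixir struct is a map carrying a `'__struct__'` key, the scenario can be sketched like this. The `'Elixir.User'` name and the `name` and `email` keys are hypothetical, standing in for an old definition without `email` and a new one with it.

```erlang
-module(struct_shape_demo).
-export([old_node_term/0, is_user/1, has_current_shape/1]).

%% A term as a node with the old definition would send it:
%% tagged as a User struct, but lacking the newer email key.
old_node_term() ->
    #{'__struct__' => 'Elixir.User', name => <<"ann">>}.

%% Matching on the struct name alone only inspects the '__struct__' key,
%% so the outdated term is accepted.
is_user(#{'__struct__' := 'Elixir.User'}) -> true;
is_user(_) -> false.

%% A stricter match that also requires every key of the (assumed)
%% current definition rejects the outdated term.
has_current_shape(#{'__struct__' := 'Elixir.User', name := _, email := _}) -> true;
has_current_shape(_) -> false.
```

Here `is_user/1` returns `true` for the old term while `has_current_shape/1` returns `false`, which is the gap between matching on the name and relying on the shape.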

1 Like