The need for "protocols" in Erlang

Yes, but that was ugly and implied naming files too.
Introducing namespaces would be backward compatible: old code would bind to the empty namespace once compiled with a release that supports it.

Yes, as in XML, declaring a namespace in a module does not guarantee that the namespace is not used elsewhere, because it is only a declared FQDN; in general, though, it uses a domain name owned by the owner of the code.

But under the hood, the Erlang VM could simply prefix module basenames with the namespace as an atom (or a hash of the namespace as an atom), and do the same for calls to modules.
This would not impact the VM itself, only the names of the compiled objects.
The same module name in two different namespaces would then in fact be two different modules.
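
A minimal sketch of the prefixing idea, not an actual compiler feature: the module name `ns_sketch`, the function `qualify/2` and the `.` separator are all assumptions made for illustration.

```erlang
-module(ns_sketch).
-export([qualify/2]).

%% Hypothetical helper: builds the atom a namespace-aware compiler
%% might emit for Module when it is declared inside Namespace.
%% The empty namespace leaves the module name untouched, so old
%% code keeps its current name.
-spec qualify(atom(), atom()) -> atom().
qualify('', Module) when is_atom(Module) ->
    Module;
qualify(Namespace, Module) when is_atom(Namespace), is_atom(Module) ->
    list_to_atom(atom_to_list(Namespace) ++ "." ++ atom_to_list(Module)).
```

With this scheme, `foo` declared in two different namespaces yields two different atoms, and so two different modules as far as the code server is concerned.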

Personally I'm not a fan of complexity in the core language, I like how simple it is and I don't see these cases as sufficient cause to complicate it. I realise this might make me some weird kind of ascetic/masochist, but I tend to think it's an application, framework or environment's job to staunch data leaks, rather than the core language.

I can certainly see a value in namespaces, but then I immediately feel like I'd really want them to provide some simple source directory hierarchy mapping, and I don't think I'd want to have com/domain/subdomain/package/module stuff like in some other languages, and I wouldn't want it to interfere with how search paths work in e.g. c() or to make things any more complex - which on balance would probably make me think that overall, eh, it ain't broke, maybe let's not fix it ¯\_(ツ)_/¯

I understand, but we have to keep in mind interoperability between BEAM languages.
I personally gave an Erlang application the same name as an Elixir one, simply because I do not use Elixir and wasn't aware of it.
Namespaces would allow many more bridges between BEAM languages, IMHO.

This proposal is not suggesting that foo:bar(Arg) becomes spread out across multiple modules. Since this was brought up more than once, I probably expressed myself poorly.

foo:bar(Arg) would still have some explicit source code as part of its implementation that does a dynamic dispatch. What I am proposing would not be different than doing this:

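-module(foo).
-export([bar/1]).

%% Look up the implementation registered for this argument's key in an
%% ETS table named after the module, then call it.
%% key_for/1 is left undefined here, as in the original sketch.
bar(Arg) ->
  try ets:lookup_element(?MODULE, key_for(Arg), 2) of
    Fun -> Fun(Arg)
  catch
    %% Missing table or missing key: no implementation for Arg.
    _:_ -> error({badimpl, Arg})
  end.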

The above is one possible implementation of dynamic dispatch that relies on ETS. A separate implementation could use persistent_term or even processes. In other words, there are other ways of doing dynamic dispatch in Erlang, but they are more verbose and less efficient than a solution where the runtime/VM knows exactly when new dispatch options are added (based on module loading) and removed (based on module removal).
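
For comparison, here is a rough sketch of the persistent_term variant mentioned above. The module name `pt_dispatch`, the registration functions and the very coarse `key_for/1` are assumptions for illustration only; they are not part of the proposal.

```erlang
-module(pt_dispatch).
-export([register_impl/2, unregister_impl/1, bar/1]).

%% Hypothetical dispatch key: the "type" of a term, reduced here to a
%% few built-in guards purely for illustration.
key_for(Arg) when is_list(Arg)    -> list;
key_for(Arg) when is_map(Arg)     -> map;
key_for(Arg) when is_integer(Arg) -> integer;
key_for(_)                        -> unknown.

%% Store a one-argument fun as the implementation for Key.
register_impl(Key, Fun) when is_function(Fun, 1) ->
    persistent_term:put({?MODULE, Key}, Fun).

%% Remove the implementation for Key; returns true if one existed.
unregister_impl(Key) ->
    persistent_term:erase({?MODULE, Key}).

%% Dispatch: find the implementation for Arg's key and call it.
bar(Arg) ->
    case persistent_term:get({?MODULE, key_for(Arg)}, undefined) of
        undefined -> error({badimpl, Arg});
        Fun -> Fun(Arg)
    end.
```

Lookups are cheap, but every registration and removal has to be done by hand, and updating or erasing a persistent term is comparatively expensive. That bookkeeping is the part a VM-level mechanism tied to module loading would take over.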

Not really. Remember you want to be able to write new implementations for existing data types and implementations of new data types for existing protocols.

Think about it as a table. New rows means adding new functionality for existing data types. New columns means adding new data types to existing functionality.

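|                 | Lists | Maps | Integers | … |
|-----------------|-------|------|----------|---|
| Pretty printing |       |      |          |   |
| JSON encoding   |       |      |          |   |
| Interpolation   |       |      |          |   |
| …               |       |      |          |   |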

In your suggestion, once a record namespace is defined, I cannot augment it with new functionality (rows) because that requires changing the module. And the way function clauses work today does not allow you to add new data types (columns), because that also requires changing the module. The trick is to have a solution that allows both. You have to decouple on both axes.

Here is an example from Rust: The Expression Problem in Rust | Chris Swierczewski's Website - notice how, as you add new traits and new data types, you don't have to change any of the previous trait definitions or the existing data types. Coupling a record to a single module would not provide this property.

Hiya!

I understand, but we have to keep in mind interoperability between BEAM languages.
I personally gave an Erlang application the same name as an Elixir one, simply because I do not use Elixir and wasn't aware of it.

Yes, absolutely, for multi-language apps & systems it could be really problematic if there are modules with the same name. Personally I've not really struggled with this, even though there are plenty of e.g. JSON libraries that all manage to find different names, the only time was a couple of UUID projects and it worked fine with rebar.config aliases. I guess that's because I end up not using Elixir code from Erlang because of the work involved in the interop. But the fact that people name their modules the same as modules from the other language without realising there's a clash does seem to suggest that there's not that much in-project interoperability happening anyway, if people are mostly sticking to one language and not checking the other for clashes.

Even so,

Namespaces would allow many more bridges between BEAM languages, IMHO.

Right on. If namespaces would allow the kind of interoperability whereby Erlang could use Elixir code as easily as vice versa, then I'd be all for it. That'd be fantastic - there are tons of cool Elixir modules I'd like to be able to use reliably, and without having to do all the work currently necessary. I'd just want to be really sure that it doesn't have any negative impact on the Erlang-only user-dev experience, especially if it doesn't actually provide that level of benefit to Erlang-only projects - and that it's not solely a case of making big changes to that experience in order to avoid name clashes with modules in another language that, right now, involves a fair amount of work to use from Erlang.

My 2 cents:

  1. Please do not pollute (abuse) the global Erlang module namespace with protocol or type lookups. It is already crowded enough as it is, and we only have one module namespace. It is not the right place to put type-to-encoding mappings in.

  2. I think the premise that there is a strong one-to-one mapping between type and some external representation is more often false than not.

    Take for example the calendar:datetime() type in Erlang. Let's consider a JSON protocol implementation for it. We could choose e.g. an ISO 8601 string or a Unix seconds integer. Which one is "correct"? I'm guessing we can find many different protocols (here I'm referring to the network kind, a sign that the name is already overloaded) that require one or the other. Therefore, the implementation of a JSON protocol (the type mapping kind) for datetime() cannot and should not choose one or the other. This is up to the implementation of a specific system.

    In this light, any protocol-like implementation must be flexible enough that it can be fully parameterized at runtime (if, for example, building a system that needs to use both formats at different ends in different network protocols).
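
To make the point concrete, here is a small sketch in which the caller picks the representation at runtime. The module name `datetime_format`, the function `encode/2` and the format atoms are assumptions for illustration, and the datetime is assumed to be in UTC.

```erlang
-module(datetime_format).
-export([encode/2]).

%% Gregorian seconds at 1970-01-01T00:00:00, as used by the calendar module.
-define(UNIX_EPOCH, 62167219200).

%% The caller, not the data type, decides which representation is used.
-spec encode(iso8601 | unix_seconds, calendar:datetime()) -> binary() | integer().
encode(iso8601, {{Y, Mo, D}, {H, Mi, S}}) ->
    iolist_to_binary(
      io_lib:format("~4..0B-~2..0B-~2..0BT~2..0B:~2..0B:~2..0BZ",
                    [Y, Mo, D, H, Mi, S]));
encode(unix_seconds, DateTime) ->
    calendar:datetime_to_gregorian_seconds(DateTime) - ?UNIX_EPOCH.
```

A system that talks to two peers with different conventions can call `encode(iso8601, DT)` on one side and `encode(unix_seconds, DT)` on the other, which a single per-type mapping could not express.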

I personally don't care whether anything in my Erlang code has the same name as something in Elixir, any more than I care whether it has the same name as something in Ada or Ruby (or Mercury, when Mercury compiled to BEAM). Just because something uses the same VM doesn't mean I want it in my address space. We have a namespace concept in Erlang: a namespace is a node. That's what Lawrie Brown based Safe(r) Erlang on all those years ago.

What is the difference between a namespace and a node?
ā€œNamespaceā€ means many things to many people, so when we talk about ā€œnamespacesā€ none of us knows what the others think we mean. ā€œNodeā€ has a precise meaning in Erlang. Currently, nodes are realised as operating system processes, but Lawrie Brown showed us they donā€™t have to be.

What do virtual nodes (multiple software-isolated nodes that may but need not reside in the same operating system process) buy us?

  • Untrusted/third-party code may be isolated in a separate node
  • Different applications may use different versions of modules
  • When multiple applications happen to use the same version of a module, only one copy needs to exist
  • A new performance level for inter-node communication is added: to WAN-remote, LAN-remote, same-cluster, and same-board we add same-OS-process.
  • Java has already gone this route with the idea that running a new Java program can just drop into an existing JVM instance so that there's no need for multiple copies of classes in memory, no need to keep on re-JITting the same methods, &c, reducing memory and startup time.

Let's not introduce a new mechanism into the language.
Let's enrich the implementation of what we have.

"Think about it as a table."
This is one of the classic arguments for OOP.
And it's wrong.
DON'T think of it as a (two-dimensional) table.
Think of it as AT LEAST a three-dimensional space of

  • what abstract data type
  • what operation
  • what consumer

Pretty-printing: the combination of what data type and operation = pretty-print is not all we need to know. You might want different "pretty-printing" methods for

  • Braille devices
  • dumb terminals/printers
  • small LCD screens
  • large LCD screens
  • projectors
  • three-dimensional displays

and for any one of these, there might be many definitions of what counts as "pretty". Think about whether the interface allows "folding", for example.

JSON: we've already covered that and the idea that there is one right way to convert any Erlang data type to JSON is pretty much as dead as the phlogiston theory by now. For one thing, Erlang data types are already at the wrong level to make that kind of decision. Many different Erlang data structures might represent the same abstract data. Think of a Sudoku board: I could represent one as a string, as a tuple of strings, as a single 81-digit integer with 0 for blank, as a tuple of 9 9-digit integers, as a tuple of tuples of 1-digit integers, or as a list of lists of 1-digit integers. And I must be able to change my choice of representation inside my Erlang program without changing the JSON. And of course one Erlang data structure might represent different abstract data, needing different JSON encodings.

Interpolation was never clear to me as an example. Not least because tutorials I've seen on string interpolation in other languages are littered with things like "this is what you do when the default conversion for the data type you want to paste in is WRONG for your application". It's a combination of "what ABSTRACT data do I have here" (not "what CONCRETE data do I have") and "What does the CONSUMER of this string want".

The wrongness of the 2-D model is one of the classic arguments for multiple dispatch. The other classic argument is of course arithmetic. Consider Q + D where Q is a quaternion and D is a dual number. (An actual example in my Smalltalk, which being a single inheritance language does NOT do this easily.) Which argument should decide how it is done? If you say "the first argument", then what about D + Q? If you say "the second argument", then again, what about D + Q? The answer is that the implementation depends on BOTH types. Given that (a) Smalltalk is a single-inheritance language, rather like what 'protocols' would give us, and (b) my Smalltalk library includes 13 or 14 different kinds of "number", making sure this all works is NOT easy. (2 * aMoney is OK, 2 + aMoney is NOT, aDate + 2 is OK, aDate * 2 is NOT, …) Oh, I didn't count Money in the 13 or 14…
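
As a rough illustration of "the implementation depends on BOTH types", ordinary Erlang function clauses can already select on both arguments at once. The tagged tuples `{money, Amount, Currency}` and `{date, Days}` and the module name `both_args` are assumptions made for this sketch, not anything from the Smalltalk library mentioned above.

```erlang
-module(both_args).
-export([add/2, mul/2]).

%% Each clause is chosen by the shapes of BOTH arguments.
%% Money adds only to money of the same currency; a date accepts an
%% integer offset; anything else is rejected.
add({money, A, Cur}, {money, B, Cur}) -> {money, A + B, Cur};
add({date, Days}, N) when is_integer(N) -> {date, Days + N};
add(A, B) when is_number(A), is_number(B) -> A + B;
add(A, B) -> error({badarith, add, A, B}).

%% Scaling money by a plain number works in either argument order;
%% multiplying a date is not defined.
mul(N, {money, A, Cur}) when is_number(N) -> {money, N * A, Cur};
mul({money, A, Cur}, N) when is_number(N) -> {money, N * A, Cur};
mul(A, B) when is_number(A), is_number(B) -> A * B;
mul(A, B) -> error({badarith, mul, A, B}).
```

This handles the OK/NOT OK cases listed above, but only because every combination lives in one module. It says nothing about how to add a new kind of number from outside, which is the hard part.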

'Protocols' as proposed just cannot deal with the problems they are supposed to solve. Adding a lot of complexity to NOT solve a problem doesn't sound like a good trade-off.

This is a good point, yet I don't think there is an argument for a default JSON encoder / decoder in the language. Those details should be handled by a library. In this case a library could offer an option to encode or decode a specific key/val in a particular way. Or one could opt to do the transformation prior to encoding.

I only see this as being extremely problematic if there were a default implementation of a JSON protocol in erlang/otp, still it is a very good point and gives me pause :slight_smile:

Edit:

Disregard. This thread has become long and unwieldy such that I forgot the piece that you've alluded to which is related to a discussion of a default implementation of a json encoder/decoder.

Hi,

you mentioned the virtual nodes several times.
Is this only a concept or is there also an implementation for it?
So far I only found a paper [Brown99] by L. Brown, which focuses on safe erlang, but nothing more current.

Thanks for any pointers.

--
[Brown99] EXTENDING ERLANG FOR SAFE MOBILE CODE EXECUTION

This has nothing to do with Elixir. I was referring to using a mapping scheme with module names like json_datetime which by definition would live in the Erlang VM global module namespace. The fact that different programs might want different implementations of such a mapping makes it impossible to use the module namespace in a node for that purpose.

At the time that Lawrie Brown was presenting papers like
https://link.springer.com/chapter/10.1007/978-3-540-47942-0_5
https://link.springer.com/chapter/10.1007/10718964_3
http://lpb.canb.auug.org.au/adfa/seminars/adfa303/adfa303.html
http://lpb.canb.auug.org.au/adfa/papers/tr9704.html
there was a small group of Erlang people saying "YES! This
is great! This is the next step in scaling Erlang!" and
there was a much larger group of people saying "meh."

I remember there being great excitement at RMIT about the Magnus project. Then silence. The Erlang Compiler EC that was being developed at the time (this was before bit syntax, let alone maps) got to the point of being useful, and then disappeared. I haven't heard anything from Maurice Castro or Dan Sahlin or Lawrie Brown since moving back to New Zealand.

I just found Lawrie Brown's e-mail address and sent him a query.
He's the Brown of Stallings & Brown on Computer Security, so he was pretty clued-up about the security aspects of SSErl.

"This has nothing to do with Elixir. I was referring to using a mapping scheme with module names like json_datetime which by definition would live in the Erlang VM global module namespace. The fact that different programs might want different implementations of such a mapping makes it impossible to use the module namespace in a node for that purpose."

Why are two different programs running in the same node?

That's not the problem. The problem is that ONE program will commonly need multiple mappings between its own data types and JSON, so that json_datetime isn't a problem (to be solved by some limited namespace hack), it's a mistake (to be solved by better design). What the program needs is a talking_to_partner_foo module with functions mapping between one set of data types and JSON in one way and another talking_to_partner_bar module with functions mapping between a different but probably overlapping set of data types and JSON in a different way.
If I call
talking_to_partner_foo:request(loan, Loan_Info)
then the caller module SHOULD NOT KNOW that JSON is involved (or that it is not). The caller is responsible for WHAT is said to partner foo. talking_to_partner_foo is responsible for HOW it is said.

If I call
talking_to_partner_bar:request(invoice, Invoice_Info)
then the caller module SHOULD NOT KNOW that JSON is involved (or that it is not). The caller is responsible for WHAT is said to partner bar. talking_to_partner_bar is responsible for HOW it is said.

I repeat, json_datetime is not a problem, it is a MISTAKE.
We do not, no, we should not, go out of our way to twist the language to support design mistakes.
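
A minimal sketch of what such a module might look like. Everything beyond the name `talking_to_partner_foo` and the `request(loan, Loan_Info)` call shape is an assumption: the fields of the loan map, the date format this partner wants, and the placeholder `send/1`. The private encoder uses `json:encode/1`, which requires OTP 27 or later.

```erlang
-module(talking_to_partner_foo).
-export([request/2]).

%% Public API: callers say WHAT they want; nothing here mentions JSON.
request(loan, #{amount := Amount, due := DueDate}) ->
    send(encode(#{<<"type">> => <<"loan">>,
                  <<"amount">> => Amount,
                  <<"due">> => due_date(DueDate)})).

%% Partner foo (by assumption) wants dates as "YYYY-MM-DD" strings.
due_date({Y, M, D}) ->
    iolist_to_binary(io_lib:format("~4..0B-~2..0B-~2..0B", [Y, M, D])).

%% The wire format is a private detail of this module (OTP 27+).
encode(Term) ->
    iolist_to_binary(json:encode(Term)).

%% Placeholder transport: a real module would talk to the partner here.
send(Payload) ->
    {ok, Payload}.
```

If partner foo later switches to another wire format or another date convention, only this module changes. A talking_to_partner_bar module would be free to encode the same internal data differently.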

I feel like crying. I left the IT industry to teach programming and software engineering for the next couple of decades, and I'm seeing lessons that the industry had learned more than 30 years ago still not heeded. It's right there in "Criteria for Decomposing Systems into Modules." Consider changeability. Practice information hiding. In particular, hide things like external representations in modules.

Let's look at an actual JSON file I requested recently.
Or at least just enough to make my point.

{
  "disclaimer": "Usage subject to terms: https://openexchangerates.org/terms",
  "license": "https://openexchangerates.org/license",
  "timestamp": …,
  "base": "USD",
  "rates": {
    …
    "USD": 1,
    …
  }
}

We start off with some fixed fields, and then have a table of currency -> ratio maplets.

Loading this into my program,

  • the licence and disclaimer are strings, discard them
  • the timestamp is a number, read it as an Integer then convert it to a DateAndTime.
  • the base currency is a string, convert it to a Currency.

In the table,

  • the currencies are strings, convert them to Currency objects
  • the ratios are numbers, which are first read as exact ScaledDecimal numbers (not FloatD) and then converted, by attaching the base currency, to Money objects.

Two different treatments for strings (and in neither case is a String the result), two different treatments for numbers (and in neither case is a Number the result).
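
The loading steps above were described for a Smalltalk program; an Erlang analogue might look roughly like the sketch below. The module name `oxr_rates`, the tagged tuples `{currency, Code}` and `{money, Ratio, Base}`, and the assumption that the document has already been parsed into a map with binary keys are all illustrative. Unlike the ScaledDecimal treatment described above, this sketch keeps whatever number the decoder produced.

```erlang
-module(oxr_rates).
-export([from_decoded/1]).

%% Gregorian seconds at 1970-01-01T00:00:00, as used by the calendar module.
-define(UNIX_EPOCH, 62167219200).

%% Input: the already-decoded document, a map with binary keys.
%% The licence and disclaimer are simply ignored by the pattern.
from_decoded(#{<<"timestamp">> := TS, <<"base">> := Base, <<"rates">> := Rates})
  when is_integer(TS), is_binary(Base), is_map(Rates) ->
    BaseCur = currency(Base),
    #{at => calendar:gregorian_seconds_to_datetime(TS + ?UNIX_EPOCH),
      base => BaseCur,
      rates => maps:fold(
                 fun(Code, Ratio, Acc) when is_number(Ratio) ->
                         %% A ratio becomes money by attaching the base currency.
                         Acc#{currency(Code) => {money, Ratio, BaseCur}}
                 end, #{}, Rates)}.

%% Strings naming currencies become currency terms, not strings.
currency(Code) when is_binary(Code), byte_size(Code) =:= 3 ->
    {currency, Code}.
```

None of these conversions could be picked by looking at the JSON types alone; they follow from knowing that this is an exchange rate table in this particular format.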

I could generate this JSON file easily enough.
However, the conversion of a DateAndTime to JSON form is not the usual Javascripty representation of a timestamp. If there were a json_datetime module (analogue) it would be WRONG to use it here. What happens to a Money object isn't the usual representation of Money in JSON either. If there were a json_money module (analogue) it would be WRONG to use it here.
The question is not "How to represent an instance of the Money abstract data type in JSON" but "How to represent an instance of the Money abstract data type in JSON when generating an exchange rate table in OpenExchange format".

If you look at the GRASP patterns,

  • talking_to_partner_foo is the Information Expert
  • it's arguably an Indirection in that it mediates between clashing representations of information
  • it achieves Low Coupling
  • and High Cohesion (its responsibility is there in its name: talking to partner foo; what is said to that partner and what is done with the partner's replies are decided elsewhere)
  • it's all about Protected Variation (decoupling its callers from changes to the external representation and the partner from changes to the internal representation)

This is not different from what I had in mind.

-module(foo).
-namespace(talking_to_partner).

request(…) -> … .

This is only a way to virtually prefix your own modules. That's all. No other goal.

And if local code wants to access another implementation of a module with the same basename, declare it.

-module(foo).
-namespace(talking_to_partner).

-namespaces([{a, talking_to_other}]).

request(xxx)-> … ;
request(yyy)-> a::foo:request(yyy).

Yes, protocols do not solve multiple dispatch, it is two dimensional. Yet, we are still stuck on a single-dimensional space. If there are good solutions for efficiently solving the N-dimensional aspect in a dynamic language (perhaps Julia would be a good example), then I am all ears. :slight_smile: Meanwhile I will keep thinking about the problem. If we assume we will change the VM, it at least opens up the solution space.

The proposal assumes the ability of defining/wrapping new types somehow.

Even then, let's assume we do pick a solution with multiple-dispatch. I don't see how to avoid both performance and ordering issues without making the solution type based.

If you have a SUDOKU string, then you would need to check on every string whether that's a SUDOKU string and then fall back. If others define FOO, BAR, BAZ strings, then we are now traversing 4 different types of strings. Extend this to every datatype and you end up with a very slow dispatching mechanism. Plus, if both FOO and BAR have similar shapes, then the order will matter. This sounds like weak typing and it is easy to make a mess. If you want to have a special representation for a string, asking to wrap it in a new type is not much (and arguably a best practice even if you are not looking for a custom representation).
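
The wrapping suggested here can be sketched with a plain tagged tuple. The module name `wrapped`, the `{sudoku, Cells}` tag and the 81-character string representation are assumptions for illustration.

```erlang
-module(wrapped).
-export([sudoku/1, to_text/1]).

%% Hypothetical wrapper: a Sudoku board is an 81-character string,
%% validated and tagged once, at construction time.
sudoku(Cells) when is_list(Cells), length(Cells) =:= 81 ->
    {sudoku, Cells}.

%% Dispatch looks only at the tag; plain strings are never inspected
%% to guess whether they "look like" a board.
to_text({sudoku, Cells}) -> lists:join($\n, rows(Cells));
to_text(Str) when is_list(Str) -> Str.

%% Split the 81 cells into nine rows of nine.
rows([]) -> [];
rows(Cells) ->
    {Row, Rest} = lists:split(9, Cells),
    [Row | rows(Rest)].
```

The dispatch cost is one tag comparison regardless of how many other string-like types exist, and clause order no longer matters between FOO-shaped and BAR-shaped strings because each carries its own tag.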

At best, we could ditch the type mechanism for a subset of pattern matching + guards where we can detect overlaps, but even then I fail to see how it would handle generic tuples or strings efficiently.

PS: In any case, I believe we had more than enough feedback on this one, so thanks everyone for the convo. I am glad to withdraw the idea for now. :slight_smile: