I am considering deployment options for my Erlang back-end software and need to decide between Docker and a VPS.
The initial deployment is two public nodes, each with a Cowboy entry point for clients (WebSockets), some logic, and a Mnesia db with few writes and many reads.
The plan is to replicate Mnesia across all nodes (the same tables on every node), stored both in RAM and on disc on each node.
One option is deployment in Docker. My uncertainty lies in Mnesia storage. What is an effective way to deal with it? Should it be located inside the container or mapped to an external folder?
Or would you rather have a separate back-end server where the disc copies reside, and only run RAM copies in the Docker instances?
I expect to be able to update the Docker instances along the way, i.e. deploy a new container with updated Erlang and software that has passed V&V.
You can use volumes for persistent storage. In our example we first start and initialize all the nodes manually, where each one has its own volume. That is done once, and afterwards you may start and stop containers.
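A minimal sketch of what that one-time initialization could look like on a single node, assuming the volume is mounted at a directory passed in as `Dir`. The module name `mnesia_init`, the function `init_node/2`, and the table attributes are made up for illustration and are not taken from the example mentioned above.

```erlang
-module(mnesia_init).
-export([init_node/2]).

%% One-time initialization of a node whose Mnesia directory lives on a
%% mounted volume. Dir (the mount point) and Tab are hypothetical inputs.
init_node(Dir, Tab) ->
    %% Load the application first so the dir setting is not overwritten later.
    case application:load(mnesia) of
        ok -> ok;
        {error, {already_loaded, mnesia}} -> ok
    end,
    ok = application:set_env(mnesia, dir, Dir),
    %% Create the on-disc schema; tolerate a schema left by an earlier run.
    case mnesia:create_schema([node()]) of
        ok -> ok;
        {error, {_Node, {already_exists, _}}} -> ok
    end,
    ok = mnesia:start(),
    %% disc_copies keeps the table in RAM and on disc.
    case mnesia:create_table(Tab, [{disc_copies, [node()]},
                                   {attributes, [key, value]}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, Tab}} -> ok
    end,
    mnesia:wait_for_tables([Tab], 30000).
```

Since the schema and table files end up under `Dir`, a container that is replaced by a newer image and started with the same volume should find its data again.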
In Kubernetes we use an init container to automate the initialization. Scaling a StatefulSet is done by automating the joining of new nodes to the existing cluster.
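As a rough sketch of what joining an existing cluster can involve on the Mnesia side, the function below connects a freshly started node to one running cluster node and then asks for its own disc copies. `mnesia_join`, `join/2` and the error atom `no_reachable_node` are hypothetical names; how the cluster node is discovered (for example from the StatefulSet's DNS names) is left out and would be an assumption.

```erlang
-module(mnesia_join).
-export([join/2]).

%% Join this node to a running Mnesia cluster and replicate Tab locally.
%% ClusterNode is any node that already runs Mnesia with the table.
join(ClusterNode, Tab) ->
    ok = mnesia:start(),
    case mnesia:change_config(extra_db_nodes, [ClusterNode]) of
        {ok, [_ | _]} ->
            %% Keep the schema on disc so the node remembers the cluster
            %% across restarts (needs a writable Mnesia directory).
            case mnesia:change_table_copy_type(schema, node(), disc_copies) of
                {atomic, ok} -> ok;
                {aborted, {already_exists, schema, _, disc_copies}} -> ok
            end,
            %% Ask for a local RAM + disc replica of the table.
            case mnesia:add_table_copy(Tab, node(), disc_copies) of
                {atomic, ok} -> ok;
                {aborted, {already_exists, Tab, _}} -> ok
            end,
            mnesia:wait_for_tables([Tab], 30000);
        {ok, []} ->
            %% None of the given nodes could be reached.
            {error, no_reachable_node};
        {error, Reason} ->
            {error, Reason}
    end.
```

Removing a node is the reverse direction: `mnesia:del_table_copy/2` drops a replica, and removing the node from the schema itself has to be done from another node while Mnesia is stopped on the node being removed.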
@vances, re-reading your answer now, a month later, when I am actually trying to get the Docker containers to work with my application, I see a depth I did not appreciate at the time.
A lot of insight in a few lines.
So what you do with the StatefulSet and joining is to add and remove nodes from the cluster on demand (horizontal scaling) and make sure the other nodes are aware?
My system is characterized by a huge logic network with many reads but only a few writes, and those writes can take whatever time is needed for the transactions to apply on all nodes. Each normal session has a logic network of 110 interconnected processes.
I have planned my production stack with three tiers:
Front end, where Cowboy deals with WebSockets and forwards requests to the next tier. The front-end tier has no access to any application data (perhaps some runtime system data, though).
This tier cannot easily scale horizontally because of its connection to the public domain name.
Workers, nodes that perform the heavy lifting. These need to scale as demand changes, and I hope I can manage to scale them horizontally by some automagic later on.
They hold application data in RAM copies.
Database nodes, whose sole purpose is to hold disc copies and keep data safe over time and between restarts. This tier does not scale horizontally easily, but might get additional resources if needed.
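A minimal sketch of how the worker and database tiers above could map onto Mnesia storage types: disc copies on the database nodes and RAM copies on the workers. The module `tier_tables`, its functions, and the attribute list are hypothetical, and `create/3` assumes every listed node is already running Mnesia and is part of the same schema.

```erlang
-module(tier_tables).
-export([table_opts/2, create/3]).

%% Build table options for the three-tier layout: database nodes hold
%% disc_copies (RAM + disc), worker nodes hold ram_copies only.
table_opts(DbNodes, WorkerNodes) ->
    [{disc_copies, DbNodes},
     {ram_copies, WorkerNodes},
     {attributes, [key, value]}].

%% Create the table across both tiers in one go.
create(Tab, DbNodes, WorkerNodes) ->
    mnesia:create_table(Tab, table_opts(DbNodes, WorkerNodes)).
```

For example, `tier_tables:table_opts(['db1@host'], ['w1@host', 'w2@host'])` yields the option list for one database node and two workers. Front-end nodes are simply left out of both lists, which matches the idea that they hold no application data.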
I am still debating whether I need the database tier or not, but I have a gut feeling it is a good thing. If nothing else, I'll have a tier of disc copies that is quite stable and does just that. Fewer reasons for those to go down.
I have an idea of scaling the worker tier based on the number of clients each node has, trying to distribute the connected clients evenly.
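One simple way to sketch the even-distribution idea is to send each new client to the worker that currently has the fewest connections. The module `balance` and the `{Node, ClientCount}` input format are assumptions for illustration; how the counts are collected from the workers is not covered here.

```erlang
-module(balance).
-export([pick_worker/1]).

%% Counts is a list of {Node, ConnectedClients} tuples.
%% Returns the node with the fewest connected clients.
pick_worker([]) ->
    {error, no_workers};
pick_worker(Counts) ->
    %% keysort is stable, so on a tie the node listed first wins.
    [{Node, _} | _] = lists:keysort(2, Counts),
    {ok, Node}.
```

The same counts could also drive scaling decisions, for instance starting another worker once the lowest count passes some threshold.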
To be honest, I think this is overkill, but one has to learn on every project, right?