VM tuning guide?

Is anyone aware of a guide that describes the optimal, or even just recommended, VM configuration options for specific deployments?

Scheduler, inet, memory, etc., for cloud, containerized, bare metal, embedded, and so on.

As an example, when should you use +sub true, and when should you not?

This is not an easy thing to do. The tunings are so application- and architecture-specific that it's very hard to find a one-size-fits-all configuration (really, providing good defaults is what OTP already does a really good job of).

What's more, it's often not a matter of turning some dials and calling it a day; usually it's turning some knobs and then closely monitoring for a long time. To make matters more complicated, the tunings will change over time as your application and architecture evolve.

That said, I do think a general-purpose VM tuning guide would be great. Whether it lives in OTP or not is another matter. I would think we would want it to live in OTP, but we have to consider that the OTP team would then have to maintain it and deal with the issues people open as a result of it.

Interestingly, I recently had the idea of a “profile” concept. This might look like a kernel argument where you pass in a profile name. As with other things that offer similar functionality, it would be a starting point; no doubt you'd start with a profile, then adjust with overrides to get to your sweet spot. Such a feature could live inside OTP, or it could be a library, or simply a set of markdown files describing which settings to start from for your base configuration.
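
For illustration, something close to this can be approximated today with erl's -args_file flag: keep one flag file per deployment profile and point the node at the one you want. A minimal sketch (the file name and its contents are hypothetical, a starting point rather than a recommendation):

    # containerized.args -- hypothetical "containerized" profile:
    # disable busy waiting on all scheduler types for CPU-limited cgroups
    +sbwt none
    +sbwtdcpu none
    +sbwtdio none

    # select the profile at boot
    erl -args_file containerized.args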

I know none of these are the answers you were looking for, but this has been on my brain quite a bit, and I think it could turn into a very interesting thread.

Finally, you can refer to the VM tuning guides others have put out there to give you some ideas. The first that come to mind are the Riak, RabbitMQ, and VerneMQ tuning guides; they have helped me some in the past. Of course, those are recommendations based on how their applications work and how you might set up an architecture around them. As an example, some of these guides recommend a very large distribution buffer busy limit (+zdbbl), while I know that in some other places keeping it as small as possible is key.
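
For concreteness, the flag in question is +zdbbl, which sets the distribution buffer busy limit in kilobytes; the number below is purely illustrative, not a recommendation:

    # illustrative only: raise the distribution buffer busy limit to 128 MB
    erl +zdbbl 131072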

Useless banter:

I've said in the past, and perhaps recently, that tuning the VM is like dialing in a good tone on a Mesa Boogie Dual Rectifier. It's not a walk in the park, but in the end you arrive at balance (a lagom of settings), often not where you expected to arrive (counterintuitiveness often wins), and you walk away with a good tone and an arsenal of knowledge about how it all works :)

Thank you for your reply. I guess it makes sense that there is no one size fits all. That said, I like your profiles idea. In my case, for example, the profile could be a Kubernetes/Docker deployment on AWS. Mind you, you could then have 20 different sub-profiles for things like core count, hard limits vs. soft limits, and so on.

I quite often get asked this question, and my answer remains the same: the default settings are good for most applications. There is nothing you can tune that would make a general application perform better; it all depends on what that application is actually doing.

There is one exception to this rule, and that is the scheduler busy wait time, a.k.a. +sbwt. On any OS that uses CFS (the Completely Fair Scheduler), when you set CPU limits on the Erlang VM, you want to set all schedulers' busy wait time to none, that is: +sbwt none +sbwtdcpu none +sbwtdio none.

An example of such a system is when you run: docker run -it --cpus 1 erlang
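
If you are using the official erlang image, one way to pass those flags is the ERL_FLAGS environment variable, which erl reads at start-up (a sketch; it assumes what the container ultimately runs is a plain erl invocation):

    # CPU-limited container with busy waiting disabled on all scheduler types
    docker run -it --cpus 1 \
        -e ERL_FLAGS="+sbwt none +sbwtdcpu none +sbwtdio none" \
        erlang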

From this it would seem that what you would like is a guide describing when to use the various options, and I think such a guide would be both more useful and easier to write. For example:

Use +sub true when you have given your customers access to per-core CPU graphs and they are complaining that the load is not evenly spread across all cores. In other words, you should probably never use this option, as load compaction is a good thing: it saves energy and allows for better cache locality.
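
To see the compaction for yourself, you can sample per-scheduler utilisation on a loaded node with the scheduler module from runtime_tools (a sketch; compare the per-scheduler rows with and without +sub true):

    %% requires runtime_tools; returns one utilisation entry per scheduler,
    %% so you can see whether load is compacted onto a few schedulers or
    %% spread evenly across all of them
    1> scheduler:utilization(5).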

There are so many options in Erlang that it is quite a daunting task to write such a guide, but if someone from the community would like to help out, we will do our best to answer any questions that come up.

A good starting point would probably be what @starbelly suggests: look at what other large open-source applications do and ask why they have done that. Maybe you can even get some non-open-source applications to reveal what they use and why.

Many thanks. Our deployment is on AWS using Docker and Kubernetes. These are the emulator flags we use:

   +c true 
   +C multi_time_warp
   +sub true
   +swt very_low
   +swtdio very_low
   +sbwt none
   +sbwtdcpu none
   +sbwtdio none
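
As a sanity check, you can ask a running node which of these actually took effect (a sketch; the system_info keys below are standard):

    %% with the flags above, a shell on the node should report:
    1> erlang:system_info(time_warp_mode).     %% multi_time_warp
    2> erlang:system_info(time_correction).    %% true  (+c true)
    3> erlang:system_info(schedulers_online).  %% 16 on a 16-vCPU box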

I was curious about +sub true because it's just our app running on the box and we have 8 physical cores (16 virtual). It seemed a shame not to use them all. Plus, a lot of the app is written in Elixir and the Task module is used a lot :)

Thanks again…

Before diving into specific optimisations, I usually recommend coming up with a benchmark that reflects production usage.
This benchmark should run on a continuous basis. It often happens that, over time, traffic patterns change, the VM changes, and application code changes, and in a few years the ERTS options that were once optimal have turned into problematic settings (and of course no one remembers why these optimisations were made in the first place).
It has happened to me multiple times; even my own two-year-old optimisations have turned out to be wrong.

When such a benchmark exists, you can easily see where the bottleneck is (CPU, RAM, I/O, network, …). Depending on where the bottleneck is, you'll be able to choose more specific settings.
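
For the CPU side specifically, one lightweight building block for such continuous measurement is the scheduler wall time statistic (a sketch along the lines of the example in the erlang:statistics/1 docs; two samples are diffed to get per-scheduler utilisation over the interval):

    %% enable the statistic, take two samples five seconds apart, and
    %% compute utilisation per scheduler as delta(active) / delta(total)
    erlang:system_flag(scheduler_wall_time, true),
    Ts0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    timer:sleep(5000),
    Ts1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    [{Id, (A1 - A0) / (T1 - T0)}
     || {{Id, A0, T0}, {Id, A1, T1}} <- lists:zip(Ts0, Ts1)].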
