Apologies in advance for the fairly long setup to my question; it is there to give some context.
For some time I have been struggling with memory issues in a large (in terms of number of keys and data) Riak cluster running OTP 22.3. Nodes in the cluster can suffer from wild fluctuations and unexpected growth in their memory footprint - up from 50GB to 200GB in some cases.
The situation has been improved through a number of changes, mainly focused on trying to reduce the size of the loop state of processes (where there are thousands of such processes), hibernating processes where possible, and being careful with binary handling to avoid holding onto references unnecessarily.
One thing that has been observed was that at times fragmentation of the eheap_alloc multiblock carriers was high (recon_alloc reporting this at around 30%). In Riak there are a lot of processes with a memory footprint of around 150-200KB, so a change that has been tried is to reduce the sbct threshold from 512KB to 128KB - so that these processes would now use single block carriers. This change proved to be consistently successful in reducing memory footprint in test, and so now that setting is in some production systems.
However, in the large and previously problematic production system, a fragmentation issue has now also occurred, even with the threshold change. In this case erlang:memory/0 was reporting 30GB of memory in use, but the OS was reporting the beam using more than 150GB.
Summing {mbcs_block_size, mbcs_carriers_size, sbcs_block_size, sbcs_carriers_size} across all eheap allocators (1 to 48), as reported by recon_alloc:fragmentation(current), returns the following result (in bytes):
{2356877512,4138991616,26392513472,158942248960}
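A sum of this kind can be sketched roughly as below. This is only an illustration, not necessarily how the figures above were produced: the module name eheap_frag and its function names are hypothetical, and it assumes recon_alloc:fragmentation/1 returns a list of {{Allocator, Instance}, Proplist} entries using the keys mbcs_block_size, mbcs_carriers_size, sbcs_block_size and sbcs_carriers_size.

```erlang
-module(eheap_frag).
-export([sum_eheap/1, current/0]).

%% Sum block and carrier sizes over every eheap_alloc instance found in a
%% recon_alloc:fragmentation/1 result (assumed shape: a list of
%% {{Allocator, Instance}, Proplist} tuples).
%% Returns {MbcsBlock, MbcsCarriers, SbcsBlock, SbcsCarriers} in bytes.
sum_eheap(Frag) ->
    lists:foldl(
      fun({{eheap_alloc, _Instance}, Props}, {MB, MC, SB, SC}) ->
              {MB + proplists:get_value(mbcs_block_size, Props, 0),
               MC + proplists:get_value(mbcs_carriers_size, Props, 0),
               SB + proplists:get_value(sbcs_block_size, Props, 0),
               SC + proplists:get_value(sbcs_carriers_size, Props, 0)};
         (_OtherAllocator, Acc) ->
              %% Ignore binary_alloc, ets_alloc and so on.
              Acc
      end,
      {0, 0, 0, 0},
      Frag).

%% Convenience wrapper for a live node; requires recon on the code path.
current() ->
    sum_eheap(recon_alloc:fragmentation(current)).
```

On a node with recon loaded, eheap_frag:current() should give a 4-tuple in the same order as the one quoted above.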
So as would be expected with the reduced sbct threshold, the majority of the blocks allocated by the eheap_alloc are in single_block carriers not multi_block carriers (26GB vs 2.4GB). However, what is unexpected is the discrepancy between the single block carrier block size and single block carrier carrier size (26GB vs 159GB). So it looks like a lot of blocks in single block carriers have been shrunk - so much so that the average block size is only 17% of the average carrier size.
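To make the arithmetic behind that percentage explicit, the following can be pasted into an Erlang shell. The two figures are copied from the summed tuple; the variable names are just illustrative.

```erlang
%% Single block carrier totals for eheap_alloc, in bytes.
SbcsBlock = 26392513472.
SbcsCarriers = 158942248960.

%% Fraction of single block carrier space actually occupied by blocks.
Usage = SbcsBlock / SbcsCarriers.

%% Bytes held in single block carriers but not covered by a block.
Unused = SbcsCarriers - SbcsBlock.
```

Usage comes out at roughly 0.166, and Unused at a little over 132GB, which accounts for most of the gap between what erlang:memory/0 and the OS report.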
So the primary question is how does this occur, especially given that the default rsbcst value for the eheap_alloc is 50%. My understanding is that for any single-block carrier from eheap_alloc where the block has shrunk to less than half the size of the carrier, the carrier itself should be shrunk to free the memory up.
Is there another process that needs to be initiated to make this happen? Or some known situation where shrinking the carrier will be blocked? Or have I misunderstood the meaning of these thresholds?
Thanks