Performance: ets vs. dets/mnesia for infrequent persistence to disk

Hi folks,

I currently have a few large ETS tables (~50 GB in total). I’ve optimised operations on them a lot, and the performance I get with them is decent. One bottleneck I have, however, is persisting to disk. My workflow involves writing my ETS tables out to disk on shutdown (with ets:tab2list and similar) and reading them back on startup.

One idea I have is to replace ets with dets/mnesia, to make the reading/writing from disk simpler and more performant. I am open to even having dets/mnesia backed by a RAM disk, and actually persisting it to storage using a native OS command after my program running on the BEAM has terminated, for maximum efficiency.

Does anyone have any experience they can share in moving from ets to dets/mnesia, or otherwise optimising writing out ets tables or other large Erlang term format data to disk?


I don’t have the requested experience, but I do have extensive experience with mnesia. What you’re looking for is automated by disc_copies tables in mnesia. Performance depends heavily on the database design. It will of course be lower than plain ETS, but with a good design and careful choice of functions it can handle much more data than this efficiently.
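For illustration, a minimal sketch of what a disc_copies table could look like. The module name, the kv record and the 60-second timeout are all made up for the example, and dirty operations are used only to keep it short; whether they are appropriate depends on your consistency needs.

```erlang
-module(disc_copies_sketch).
-export([setup/0, store/2, lookup/1]).

%% Hypothetical record; table name and fields are placeholders.
-record(kv, {key, value}).

%% Create an on-disk schema and a disc_copies table on the local node.
%% Safe to call again on later starts: existing schema/table are accepted.
setup() ->
    %% The schema must be created while mnesia is stopped.
    case mnesia:create_schema([node()]) of
        ok -> ok;
        {error, {_, {already_exists, _}}} -> ok
    end,
    ok = mnesia:start(),
    case mnesia:create_table(kv, [{attributes, record_info(fields, kv)},
                                  {disc_copies, [node()]}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, kv}} -> ok
    end,
    %% Block until the table has been loaded from disk into RAM.
    ok = mnesia:wait_for_tables([kv], 60000).

%% Dirty write: updates the RAM copy and is logged to disk by mnesia.
store(Key, Value) ->
    mnesia:dirty_write(#kv{key = Key, value = Value}).

%% Dirty read: served from the RAM copy.
lookup(Key) ->
    mnesia:dirty_read(kv, Key).
```

With disc_copies the table lives in RAM and mnesia handles writing it to disk and loading it back at start, so there is no explicit dump/restore step on shutdown and startup.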

  1. Moving from ETS to Mnesia/dets (disc_only_copies) is not going to preserve your current performance. DETS is a very different animal from ETS.
  2. DETS is (trying to be polite) “not good” as a storage solution. The issues with DETS, and the need for fragmented tables to compensate for its size limitations, are among the reasons why Klarna went with the Mnesia plugin support and LevelDB (also not good, but that’s a different story).
  3. If you want to persist sequences of Erlang terms look at disk_log.
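
To make the disk_log suggestion concrete, here is a rough sketch of dumping an ETS table to a disk_log file in chunks and loading it back. The module name, function names, log names and the chunk size of 1000 are assumptions for the example. It also assumes the log file was closed cleanly, so disk_log:open/1 returns {ok, Log} rather than a repaired result.

```erlang
-module(disk_log_sketch).
-export([dump/2, load_into/2]).

%% Write every object of an ETS table to a disk_log file, 1000 at a time,
%% so the whole table is never materialised as one list.
dump(Tab, File) ->
    {ok, Log} = disk_log:open([{name, {dump, Tab}}, {file, File}]),
    %% Start from an empty log in case the file already existed.
    ok = disk_log:truncate(Log),
    ets:safe_fixtable(Tab, true),
    try
        dump_chunks(Log, ets:select(Tab, [{'$1', [], ['$1']}], 1000))
    after
        ets:safe_fixtable(Tab, false),
        disk_log:close(Log)
    end.

dump_chunks(_Log, '$end_of_table') ->
    ok;
dump_chunks(Log, {Objects, Continuation}) ->
    ok = disk_log:log_terms(Log, Objects),
    dump_chunks(Log, ets:select(Continuation)).

%% Read the log back chunk by chunk and insert into an existing ETS table.
load_into(Tab, File) ->
    {ok, Log} = disk_log:open([{name, {load, Tab}}, {file, File},
                               {mode, read_only}]),
    try
        load_chunks(Tab, Log, disk_log:chunk(Log, start))
    after
        disk_log:close(Log)
    end.

load_chunks(_Tab, _Log, eof) ->
    ok;
load_chunks(Tab, Log, {Continuation, Terms}) ->
    true = ets:insert(Tab, Terms),
    load_chunks(Tab, Log, disk_log:chunk(Log, Continuation));
load_chunks(Tab, Log, {Continuation, Terms, _BadBytes}) ->
    %% read_only logs may report unreadable bytes; this sketch ignores them.
    true = ets:insert(Tab, Terms),
    load_chunks(Tab, Log, disk_log:chunk(Log, Continuation)).
```

Usage would be something like disk_log_sketch:dump(Tab, "tab.log") on shutdown and disk_log_sketch:load_into(NewTab, "tab.log") on startup. How this compares with the current tab2list approach on 50 GB of data would need measuring.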