Khepri - a tree-like replicated on-disk database library for Erlang and Elixir (introduction & feedbacks)

eproxus · November 12, 2021, 11:58am

@dumbbell The docs are beautiful! Great work! Is it possible to use the styles for EDoc outside of Khepri?

dumbbell · November 15, 2021, 11:17am

Thank you

It should be possible, here are the things I did for Khepri:

1. Download a CSS file reproducing the GitHub Markdown style.
2. Download a JavaScript+CSS pair of files from Prism for syntax highlighting.
3. Add the following two modules to the source code to override EDoc:
   - khepri/khepri_edoc_export.erl at main · rabbitmq/khepri · GitHub
   - khepri/khepri_edoc_wrapper.erl at main · rabbitmq/khepri · GitHub
4. Configure EDoc to use the GitHub Markdown CSS and those two modules (in a rebar.config in this example):

{edoc_opts, [...,
             %% The CSS comes from the following Git repository:
             %% https://github.com/sindresorhus/github-markdown-css
             {stylesheet, "github-markdown.css"},
             {layout, khepri_edoc_wrapper},
             {doclet, khepri_edoc_wrapper},
             {xml_export, khepri_edoc_export}]}.

EDoc allows overriding a few things programmatically, but not everything. khepri_edoc_export and khepri_edoc_wrapper handle both cases: they use EDoc's hooks where they exist, and when there is no hook in EDoc, they patch the generated files directly on disk. Therefore they are fragile and depend on the content of the generated files. But without that, syntax highlighting or even using the GitHub Markdown CSS unmodified would not be possible.

I think that’s an acceptable compromise because the documentation can be generated using a specific version of Erlang in CI for instance. And if it doesn’t work for another version a contributor is using, that’s no big deal: the documentation will still be generated and can be read and reviewed even though the style is missing.

The biggest downside to me is that those two modules are kind of part of Khepri’s API because they are in src to be compiled and ready for the doc generation. It would probably be possible to extract everything as a Rebar plugin for instance, but I haven’t done that for now. I also didn’t contribute any patch to EDoc itself to allow overriding everything I wanted.

Update: The generated modules list sidebar is also patched to add a table of contents pointing to sections in the overview page.
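
To give an idea of what patching a generated file on disk can look like, here is a minimal sketch. It is not the code from khepri_edoc_wrapper: the module name, the function name and the regular expression are assumptions made for this illustration. It rewrites the opening body tag so that it carries the markdown-body class that github-markdown-css styles.

```erlang
-module(edoc_patch_sketch).
-export([add_body_class/1]).

%% Hypothetical helper (not part of Khepri or EDoc): rewrite the opening
%% <body> tag of a generated HTML file so that it carries the
%% `markdown-body' class used by github-markdown-css.
add_body_class(File) ->
    {ok, Html0} = file:read_file(File),
    %% Without the `global' option, only the first match is replaced.
    Html1 = re:replace(Html0,
                       "<body[^>]*>",
                       "<body class=\"markdown-body\">",
                       [{return, binary}]),
    ok = file:write_file(File, Html1).
```

A patch like this silently does nothing if the generated HTML changes shape, which is the fragility mentioned above.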

vkatsuba · November 15, 2021, 5:07pm

After some discussion, @dumbbell and I created rebar3_edoc_extensions so that the same documentation style can be used in any project. It is also available on Hex.pm as rebar3_edoc_extensions.
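
For readers who want to try it, enabling the plugin might look like the rebar.config fragment below. This is a sketch that assumes the plugin is fetched from Hex under the name given above and declared as a project plugin; the plugin's README is the reference for the exact setup and its options.

```erlang
%% rebar.config (fragment)
%% Assumption: the plugin is resolved from Hex by its package name.
{project_plugins, [rebar3_edoc_extensions]}.
```

The documentation is then generated as usual with `rebar3 edoc`.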

dumbbell · November 17, 2021, 1:28pm

Khepri 0.1.1 was published to GitHub and Hex.pm:

Release notes: Release Khepri 0.1.1 · rabbitmq/khepri · GitHub
Hex.pm package: khepri | Hex

It includes a breaking change in the arguments of the following functions:

khepri:transaction/2
khepri:transaction/3
khepri_machine:transaction/2
khepri_machine:transaction/3

The release notes explain what changed and how to update your code.

Taure · November 19, 2021, 10:01am

I haven’t had time to test Khepri yet. Does the data you store need to be in a particular format, or can you store Erlang maps?

I have a project I think Khepri would work nicely in.

Something like khepri:insert("path", Map)

dumbbell · November 19, 2021, 10:03am

You can store any Erlang term. Khepri doesn’t perform any check and Ra underneath basically does a term_to_binary() when it has to store something on disk.

OvermindDL1 · November 19, 2021, 4:36pm

I.e. not any Erlang term, but serializable Erlang terms, which is indeed most of them (just being clear).

dumbbell · November 19, 2021, 4:37pm

Yes, you’re right! I should probably refine the spec by the way and perhaps reject things like ports explicitly if they don’t make any sense on another Erlang node.
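
To illustrate the point about maps, here is a small sketch of the round trip through term_to_binary() and binary_to_term(). It only shows the serialization step and does not involve Khepri or Ra; the module and function names are made up for the example.

```erlang
-module(term_roundtrip_sketch).
-export([roundtrip/1]).

%% Encode a term to the Erlang external term format and decode it again.
%% This mimics, very roughly, a term being written to disk and read back.
roundtrip(Term) ->
    Bin = term_to_binary(Term),
    true = is_binary(Bin),
    binary_to_term(Bin).
```

In an Erlang shell, `Map = #{name => <<"oak">>, quantity => 100}, Map = term_roundtrip_sketch:roundtrip(Map).` matches, so a map comes back unchanged.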

dumbbell · February 18, 2022, 9:28am

I’m pleased to announce the release of Khepri 0.2.0, followed this morning by 0.2.1 with a couple of bug fixes. Highlights are described in detail in the release notes (including breaking changes and how to update your code), but let me summarize the most important addition here.

Indeed, Khepri 0.2.0 introduces stored procedures and triggers!

Triggers are a mechanism to execute an anonymous function automatically following some events.

Currently supported events are changes made to the tree: nodes were created, updated or deleted. In the future, it could support Erlang process monitoring, node monitoring, and so on. Triggers are registered using an event filter which, as its name suggests, takes care of filtering the events that should execute the associated function.

The anonymous function behind a trigger is stored in the database as the payload of a tree node. This function is called a stored procedure. Before it is stored, the function is extracted like transaction functions are. However, there are no restrictions on what it can do, unlike transaction functions.

Here is an example taken from the release notes:

Store an anonymous function in the tree:

StoredProcPath = [path, to, stored_procedure],

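Fun = fun(Props) ->
          %% Extract the path of the affected tree node and the action
          %% that fired the trigger from the properties map.
          #{path := Path,
            on_action := Action} = Props
      end,

khepri_machine:put(
  StoreId,
  StoredProcPath,
  #kpayload_sproc{sproc = Fun}).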

Register a trigger using an event filter:

EventFilter = #kevf_tree{path = [stock, wood, <<"oak">>]},

ok = khepri_machine:register_trigger(
       StoreId,
       TriggerId,
       EventFilter,
       StoredProcPath).

In the example above, as soon as the [stock, wood, <<"oak">>] node is created, updated or deleted, the anonymous function will be executed.

Stored procedures can be used (i.e. stored, replicated and executed) independently of triggers.

Under the hood, significant improvements were made to the khepri_fun module which is responsible for extracting the code of these anonymous functions. It is also getting large and probably deserves to be an independent library at this point.

If you have any comments and feedback, or if you started to play with the library in one of your projects, please share! I’m looking forward to hearing about your experience!

LeonardB · February 18, 2022, 3:54pm

Congrats!

Love the idea of triggers.

Have you considered maybe supporting callbacks and/or notifications as well as Funs? E.g.:
{callback, Mod :: atom(), Fun :: atom()}
{notify, Method :: call | cast | info, Pid :: pid()}

I implemented something similar for my foundationdb layer using their watcher facilities.
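
To make the suggestion concrete, here is a sketch of how the two tuples above could be dispatched when an event fires. It is purely hypothetical: the module name, the function name and the shape of the Event argument are assumptions, and nothing here is part of Khepri.

```erlang
-module(trigger_action_sketch).
-export([dispatch/2]).

%% Hypothetical dispatcher for the two action shapes suggested above.
%% `Event' is whatever term describes the change that happened.
dispatch({callback, Mod, Fun}, Event) when is_atom(Mod), is_atom(Fun) ->
    %% Plain function call: Mod:Fun/1 receives the event.
    Mod:Fun(Event);
dispatch({notify, call, Pid}, Event) when is_pid(Pid) ->
    %% Synchronous notification; the process must be a gen_server.
    gen_server:call(Pid, Event);
dispatch({notify, cast, Pid}, Event) when is_pid(Pid) ->
    %% Asynchronous notification to a gen_server.
    gen_server:cast(Pid, Event);
dispatch({notify, info, Pid}, Event) when is_pid(Pid) ->
    %% Raw message, delivered to handle_info/2 or a plain receive.
    Pid ! Event,
    ok.
```

For example, `trigger_action_sketch:dispatch({notify, info, self()}, Event)` puts the event in the caller's own mailbox.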

dumbbell · February 18, 2022, 4:07pm

Not yet, but this would be a good addition. Thank you!

Update: I filed an issue to remember:

github.com/rabbitmq/khepri

Triggers: Support simple MFA calls and messages in addition to stored procedures

Opened by dumbbell at 04:16PM, 18 Feb 22 UTC. Label: enhancement.

This idea comes from some [feedback on the Erlang forum](https://erlangforums.co…m/t/khepri-a-tree-like-replicated-on-disk-database-library-for-erlang-and-elixir-introduction-feedbacks/438/30?u=dumbbell):

> Have you considered maybe supporting callbacks and/or notifications as well as Funs? e.g.
> ```erlang
> {callback, Mod :: atom(), Fun :: atom()}
> {notify, Method :: call | cast | info, Pid :: pid()}
> ```

Indeed, it would be nice to have a simple way to specify a function call or a message+PID.

dumbbell · April 25, 2022, 2:04pm

Here is Khepri 0.3.0!

For the past two months, there have been several bug fixes and improvements. But most of the focus was put on the public API. Let me shamelessly copy-paste the release notes section about that:

- The high- vs low-level API distinction is now gone. The public API is now exposed by khepri only. khepri_machine becomes an internal private module. As part of that, khepri grew several new functions for common use cases and we will certainly add more in the future, based on feedback.
- Unix-like paths are first-class citizens: all functions taking a native path ([stock, wood, <<"oak">>]) now accept Unix-like paths ("/:stock/:wood/oak"). In the process, the syntax of Unix-like paths evolved: atoms are represented as :atom and binaries are represented as-is, binary. The main reason is that using <<binary>> for binaries was difficult to read and type.
- Payload and event filter records are now private. Payload types are automatically detected now, likewise for event filters. That said, it is still possible to use functions to construct the internal structures, but it should rarely be necessary.
- khepri_tx, the module to perform Khepri calls inside transactions, will now expose the same API as khepri, except when functions don’t make sense in a transaction.
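
To make the new Unix-like syntax concrete, here is a toy converter from a Unix-like path to a native path. It is only a sketch of the :atom / binary convention described above, with a made-up module name; it is not Khepri's parser (khepri_path does the real work) and it ignores wildcards, conditions and escaping.

```erlang
-module(unix_path_sketch).
-export([to_native/1]).

%% Convert a Unix-like path given as an Erlang string into a native path.
%% Components starting with ":" become atoms, all others become binaries.
to_native(Path) when is_list(Path) ->
    [component(C) || C <- string:lexemes(Path, "/")].

component([$: | Name]) -> list_to_atom(Name);
component(Name)        -> list_to_binary(Name).
```

With this sketch, `unix_path_sketch:to_native("/:stock/:wood/oak")` returns `[stock, wood, <<"oak">>]`, the native path used throughout this thread.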

Here is an example of old code and its newer version:

Up to Khepri 0.2.1:

%% `khepri_machine' had to be used for "advanced" use cases, though
%% `khepri' would have been fine in this example.
Path = [stock, wood, <<"oak">>],
case khepri_machine:get(StoreId, Path) of
    %% Accessing the data in the node payload was a bit complicated,
    %% requiring to pattern-match on the node properties map inside the
    %% result map. `Path' must be bound to be used as a map key here.
    {ok, #{Path := #{data := Quantity}}} when Quantity < 100 ->
        %% We would also have to construct a payload record.
        Payload = #kpayload_data{data = 500},
        {ok, _} = khepri_machine:put(StoreId, [orders, wood, <<"oak">>], Payload),
        ok;
    _ ->
        ok
end.

Starting from Khepri 0.3.0:

%% Now we have helpers for common use cases like simply accessing the data
%% of a single tree node. The piece of data is returned directly, returning
%% a default value if there is no data and bypassing error handling if we
%% don't care.
%%
%% Unix-like paths are used for the demonstration. Native paths, like in
%% the previous example, would work as well.
Quantity = khepri:get_data_or(StoreId, "/:stock/:wood/oak", 0),
if
    Quantity < 100 ->
        %% The payload record is automatically constructed internally. No
        %% need to mess with records.
        {ok, _} = khepri:put(StoreId, "/:orders/:wood/oak", 500),
        ok;
    true ->
        ok
end.

The documentation should be up-to-date with all these changes.

We also continued to improve khepri_fun, one of the key components used to implement transactions, mostly thanks to @the-mikedavis!

As always, I would loooove to hear from anyone who glanced at the code, the documentation or even started to play with Khepri.

lpil · April 25, 2022, 10:02pm

The new API looks great, a nice improvement. Looking forward to trying this in future.

dumbbell · May 2, 2022, 2:19pm

The next version of Khepri will also accept Unix-like paths as Erlang binaries in addition to Erlang strings. The goal is to improve compatibility with BEAM languages which implement strings as Erlang binaries, such as Elixir or Gleam.

This leads me to a question: do Elixir developers usually prefer a binding on top of an Erlang library to perhaps provide a more Elixir-y feeling to the API, or do you prefer to use the Erlang library directly?

Some time ago, I read somewhere that the latter was perhaps preferred but I don’t remember where. Also, perhaps this has changed over time as Elixir grew new features.

Note that I have never written any Elixir code, hence this question.

the-mikedavis · May 2, 2022, 9:10pm

Elixir wrappers over Erlang libraries are very common in my experience. For example: the AMQP Elixir wrapper over Rabbit’s amqp_client reorganizes the API a bit. Or SweetXml that wraps xmerl.

There are a few common reasons for wrappers:

- to improve documentation (although EEP48 helps a lot with this)
- to make the interface more Elixir-y
  - for example, fetch/2 and get/2/get/3 have colloquial specs in Elixir:

    # get/2 uses `nil` as the default
    @spec get(SomeDataStructure.t(), key :: term(), default :: term()) :: value :: term()
    @spec fetch(SomeDataStructure.t(), key :: term()) :: {:ok, value :: term()} | :error

  - reorganizing the API to allow chaining pipes is very common, for example in Ets or Elixir’s own Map module
- to do work at compile-time using macros, either for efficiency or convenience

I think Khepri’s API is very usable without Elixir bindings but there are some possible conveniences to add like using a sigil for khepri_path:path/0s:

defmodule Khepri.Path do
  @moduledoc "Elixir wrapper around `khepri_path`"

  @doc "Creates a native path or pattern at compile-time"
  defmacro sigil_P({:<<>>, _meta, [path]}, _opts) do
    path
    |> :khepri_path.from_string()
    |> Macro.escape()
  end
end

iex> import Khepri.Path
iex> ~P"/stock/wood/:oak"
["stock", "wood", :oak]
iex> ~P"/stock/*/:oak"   
["stock", {:if_name_matches, :any, :undefined}, :oak]
iex> ~P"/stock/**/:oak"
["stock", {:if_path_matches, :any, :undefined}, :oak]

This would just be a convenience though, I already like the API as-is

dumbbell · May 3, 2022, 1:43pm

Thank you for the detailed answer!

For the documentation, I can’t set EDoc options to enable EEP48 documentation chunks because I already set doclet and layout modules to format the HTML documentation, unfortunately…
I will look at the libraries you mentioned, in particular to enable call pipelining. Just one question: what do you mean by "fetch/2 and get/2/get/3 have colloquial specs in Elixir"?
This sigil feature looks interesting. Do you know if it’s possible to write that as an Erlang module? This way, this could be directly in Khepri instead of a binding on top of it.

dch · May 3, 2022, 2:21pm

@dumbbell I think having the Elixir “API” directly in Khepri would be awesome. Having an additional wrapper library is silly, when mostly it’s just a matter of having an additional module in the existing Erlang source:

-module('Elixir.Khepri').
-export([thing/1]).

thing(Bin) -> khepri:thing(Bin).

wrt sigils, I haven’t tried it, but “How to debug Elixir/Erlang compiler performance” (Dashbit Blog) and michalmuskala/decompile (GitHub) might help.

dumbbell · May 3, 2022, 2:47pm

Thank you Dave for the feedback!

So having modules under the Elixir “namespace” is enough, which is great! I was wondering if it was possible, thank you for confirming.

You’re right, the article you shared shows how to “decompile” Elixir to Erlang. That should give me some clues about how to implement sigils in plain Erlang.

I will probably take some inspiration from the Ets binding Mike pointed me at. According to the README, that’s the sort of thing I was looking for in terms of what Khepri should provide to have a good integration with Elixir.

LostKobrakai · May 3, 2022, 2:54pm

Sigils in Elixir are backed by either a function or a macro named sigil_*/2, where * is a single character. You can probably create the function one just fine in Erlang, but not a macro one. There’s more detailed info in the “Syntax reference” page of the Elixir v1.16.0 documentation.

dumbbell · May 3, 2022, 2:55pm

Thank you!