Kaos - A combinator library for generating advanced random data structures


I recently decided to learn Erlang. As my first endeavor I created a combinator library for generating dynamic random values and data structures. The documentation is quite extensive and gives many usage examples.

The one thing missing is pre-defined character sets. This is intentional: I plan to add a separate module that offers a more robust and comprehensive system for composing character sets from Unicode code points.

With that said, it is quite easy to create your own character set generators as can be seen from the examples included in this post.
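As a rough illustration of composing character sets from code points, here is a minimal sketch that uses only the standard library. The module name `charsets` and its functions are hypothetical and are not part of kaos.

```erlang
-module(charsets).
-export([range/2, union/1, lower_alnum/0, greek_lower/0]).

%% All code points from First to Last, inclusive.
range(First, Last) when First =< Last ->
    lists:seq(First, Last).

%% Merge several character lists into one sorted list without duplicates.
union(Sets) ->
    lists:usort(lists:append(Sets)).

%% Lowercase ASCII letters plus digits (36 characters).
lower_alnum() ->
    union([range($a, $z), range($0, $9)]).

%% Greek lowercase letters, U+03B1 (alpha) through U+03C9 (omega).
greek_lower() ->
    range(16#03B1, 16#03C9).
```

A list built this way could presumably be passed to kaos:weighted_from_list/1 in the same way as the literal character lists in the password example in this post.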

I would love to hear feedback on what features or enhancements might be useful, or any comments on the API design. I am happy to answer any questions or how-tos.

Features

Basic Types

Boolean
Bit
Byte
Bitstring
Binary
Integer
Float
String

Data Structures

Array
List
Map
Set
Tuple
(And the main data structures of the standard library such as dict, orddict, gb_set, etc.)

Operations

All
Choose
Const
Cycle
Iterate
Recurse
Weighted

Modifiers

Filter
Flatmap
Map
Shuffle

Examples

Random Password Generator

The password contains 2 special characters, 1 uppercase letter, 1 lowercase letter, and 1 digit. The remaining characters are drawn from those sets or from additional symbols. The chosen characters are then shuffled.

1> GenSpecialW = {_, GenSpecialC} = kaos:weighted_from_list([$!, $@, $#, $$, $%, $^, $&, $*, $(, $), $_, $+, $-, $=]).
2> GenUpperW = {_, GenUpperC} = kaos:weighted_from_range($A, $Z).
3> GenLowerW = {_, GenLowerC} = kaos:weighted_from_range($a, $z).
4> GenNumberW = {_, GenNumberC} = kaos:weighted_from_range($0, $9).
5> GenSymbolW = kaos:weighted_from_list([$(, $), $_, $+, $-, $=, ${, $}, $[, $], $:, $;, $<, $>, $,, $., $?, $/]).
6> GenFiller = kaos:weighted([GenSpecialW, GenUpperW, GenLowerW, GenNumberW, GenSymbolW]).
7> GenAll = kaos:all([
7>   kaos:list_of(kaos:const(2), GenSpecialC),
7>   kaos:list_of(kaos:const(1), GenUpperC),
7>   kaos:list_of(kaos:const(1), GenLowerC),
7>   kaos:list_of(kaos:const(1), GenNumberC),
7>   kaos:list_of(kaos:const(12 - 5), GenFiller)
7> ]).
8> GenPass = kaos:flatmap(fun (Gs) -> kaos:shuffle(lists:flatten(Gs)) end, GenAll).
9> {ok, [Pass]} = kaos:generate(GenPass, 909, 1).
10> Pass.
"6+LP!J}vaj1sg(+K:("

Random JSON Document Generator

1> GenNestSize = kaos:integer(1, 4).
2> GenFloat = kaos:float(-12.0, 12.0).
3> GenInteger = kaos:integer(-12, 12).
4> GenString = kaos:string_of(kaos:integer(1, 12), kaos:integer($a, $z)).
5> GenPrimitive = kaos:choose([kaos:boolean(), GenFloat, GenInteger, GenString]).
6> GenJson =
6>   kaos:recurse(fun (Depth) ->
6>     case Depth < 3 of
6>       true -> kaos:weighted([
6>                 {3, GenPrimitive},
6>                 {2, kaos:choose([
6>                       kaos:list_of(GenNestSize, kaos:recurse(fun(_) -> GenPrimitive end)),
6>                       kaos:map_of(GenNestSize, GenString, kaos:recurse(fun(_) -> GenPrimitive end))
6>                 ])}
6>               ]);
6>       false -> GenPrimitive
6>     end
6>   end).
7> {ok, [Json]} = kaos:generate(GenJson, 909, 1).
8> io:format("~ts~n", [json:format(Json)]).
[
  {
    "aavquho": { "sitygndpvol": 2 },
    "butquytalbd": [
      [-8,{
          "mq": false,
          "zkcrikbkpb": 5.512894632507528
        }],
      "oeggzc",
      true
    ],
    "q": 6.649188725917739,
    "qb": [
      [false,false,"suzsscusndhq"],
      "ysvoe",
      "yjidecscisc"
    ]
  },
  false,
  false
]
ok

Hex Docs

Hex Package

GitHub

@EarthCitizen Thanks for sharing. Is there a GitHub link?

@zabrane I updated the post to have the additional links.

How does it differ from PropER?

Honestly, I had never heard of PropER, but looking at its documentation, PropER seems to be a framework for property testing, which explains why it has shrinking and a global size and resize.

I see overlap in the ability to generate values. One thing I do not see in PropER is flatmap, unless it goes by another name and I am missing it.

Kaos is not intended to be a property testing framework (though someone could certainly use it as the basis for one). Kaos is simply a general purpose workhorse for generating random values and structures.

For example, in the documentation (and in the initial post), you will find an example of randomly generating a password with specific character content requirements (https://hexdocs.pm/kaos/0.1.0/examples.html#strong-password-generator). This could be used in a production system outside of testing, if there were a need to pre-generate passwords.

Additionally, with Kaos, you can actually stream the generated values to somewhere else as they are created:

1> %% On the sink node (register a simple receiver)
1> register(
1>   sink,
1>   spawn(fun Loop() ->
1>     receive
1>       {sample, V} ->
1>         io:format("~p~n", [V]),
1>         Loop();
1>       stop ->
1>         ok
1>     end
1>   end)
1> ).
2> %% On the source node (replace HOST with the sink host)
2> Merge = fun(V, N) -> {sink, 'sink@HOST'} ! {sample, V}, N + 1 end.
3> {ok, Count} = kaos:generate_into(kaos:integer(1,3), 909, 1000, Merge, 0).

I was curious, because outside of property-based testing, generation of complex and/or nested random terms is seldom used.

A brief look at the code suggests that it uses the rand module, which is not cryptographically strong. To create a secure random password, one should use routines from the crypto module. Also, one normally wants to create a random password with a given entropy. So perhaps this example needs some sort of disclaimer.
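To make the suggestion concrete, here is a minimal sketch of drawing password characters from the crypto-backed generator and sizing the password for a target entropy. It does not use kaos; the module name `secure_password`, the 70-character alphabet, and the function names are all assumptions chosen for illustration.

```erlang
-module(secure_password).
-export([generate/1, length_for_entropy/2, alphabet/0]).

%% Illustrative alphabet of 70 characters; not something kaos defines.
alphabet() ->
    lists:seq($a, $z) ++ lists:seq($A, $Z) ++ lists:seq($0, $9) ++ "!@#$%^&*".

%% Number of characters needed to reach at least Bits of entropy
%% when each character is drawn uniformly from AlphabetSize symbols.
length_for_entropy(Bits, AlphabetSize) when Bits > 0, AlphabetSize > 1 ->
    ceil(Bits / math:log2(AlphabetSize)).

%% Draw Length characters uniformly from the alphabet.
generate(Length) when is_integer(Length), Length > 0 ->
    Chars = list_to_tuple(alphabet()),
    N = tuple_size(Chars),
    %% crypto:rand_seed_s/0 returns a rand state backed by
    %% crypto:strong_rand_bytes/1 instead of a deterministic PRNG.
    State0 = crypto:rand_seed_s(),
    {Password, _State} =
        lists:mapfoldl(
          fun(_, S0) ->
                  {I, S1} = rand:uniform_s(N, S0),
                  {element(I, Chars), S1}
          end,
          State0,
          lists:seq(1, Length)),
    Password.
```

With this alphabet, secure_password:length_for_entropy(128, 70) gives 21, so secure_password:generate(21) would yield a password with at least 128 bits of entropy. Unlike the kaos example, this sketch draws every character uniformly and does not enforce per-class minimums.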

“I was curious, because outside of property-based testing, generation of complex and/or nested random terms is seldom used.”

Then this library is not for you. I would advise against using it, as you would not find it useful. Stick to PropER and to libraries that suit your needs. This library certainly does not intend to replace your need for property testing.

“Brief look at the code suggests that it uses rand module, which is not cryptographically strong”

I am new to Erlang, and this is exactly the kind of feedback I am looking for. This library is not a “stable” release yet, so all is good. Feel free to open an issue for any bugs or major issues you see, such as that one.

@EarthCitizen Welcome to the Erlang community! It’s great to see new developers contributing libraries like kaos. One big advantage of your library is that it appears to be standalone, which can be really useful for developers who want a lightweight solution without heavy dependencies.

@ieQu1's feedback about switching from the rand module to crypto for password generation is definitely worth addressing. It's an important security consideration that will make your library much more robust.

Hope you’ll continue working on kaos and developing it further. The Erlang ecosystem benefits from having more libraries and contributors like you!

Cryptographically strong random number generators are generally quite slow compared with pseudo-random generators. They should be used if you need cryptographic quality, as in password generation. They may also be used to generate seeds for PRNGs. If you need to generate tens of millions of random numbers, then you should not be using cryptographically strong RNGs (except for seeding).
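The seeding approach can be sketched with the standard library alone. This is a minimal illustration, assuming OTP 22 or later for the exsss algorithm; the module name `seeded_prng` and its functions are hypothetical.

```erlang
-module(seeded_prng).
-export([new_state/0, uniform_list/3]).

%% Build a fast PRNG state whose seed comes from the crypto module.
%% Only these 24 bytes are cryptographically strong; everything drawn
%% afterwards is ordinary pseudo-random output.
new_state() ->
    <<A:64, B:64, C:64>> = crypto:strong_rand_bytes(24),
    rand:seed_s(exsss, {A, B, C}).

%% Draw Count integers in 1..Max, threading the state explicitly.
%% Returns {Numbers, FinalState}.
uniform_list(Count, Max, State0) ->
    lists:mapfoldl(
      fun(_, S0) -> rand:uniform_s(Max, S0) end,
      State0,
      lists:seq(1, Count)).
```

For example, seeded_prng:uniform_list(1000, 6, seeded_prng:new_state()) returns a list of 1000 integers between 1 and 6 together with the final state, paying the cost of the strong generator only once.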

Click on the documentation link. The </> links will take you to GitHub.
That’s how I got there.

Thank you for the welcome!

Yes, that is correct. kaos is standalone. It only uses the standard library.

If you have any feature requests or comments, please feel free to leave a comment here or raise an issue on the GitHub repo. If you find a use case in anything that you do, I would love to know about it!

Thank you for this additional information. If using a strong RNG to generate the seed is good enough, it sounds like that would be the way to go.

It depends on the purpose of the library. If you designed it specifically for password generation, then seeding a PRNG with a strong RNG isn’t the correct approach. If you do so, the entropy of your password will always be =< the entropy of the seed, since a PRNG is a deterministic algorithm.
If you designed it for something like Monte-Carlo simulation, then it doesn’t matter.
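The entropy bound can be written down as a small calculation. This is only a sketch: the module name `entropy_bound` and its functions are hypothetical, and it assumes each character is drawn uniformly from the alphabet.

```erlang
-module(entropy_bound).
-export([password_bits/2, effective_bits/3]).

%% Nominal entropy of Length characters drawn uniformly
%% from an alphabet of AlphabetSize symbols.
password_bits(Length, AlphabetSize) ->
    Length * math:log2(AlphabetSize).

%% A deterministic PRNG cannot add entropy, so the password
%% carries at most as many bits as the seed did.
effective_bits(SeedBits, Length, AlphabetSize) ->
    min(SeedBits, password_bits(Length, AlphabetSize)).
```

For instance, a 32-character password over a 70-symbol alphabet has a nominal entropy above 196 bits, but entropy_bound:effective_bits(64, 32, 70) returns 64 when the PRNG was seeded with only 64 random bits.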

I did not design it for a specific use case. The goal is to fit any need of random data generation.