Handling keyboard events

Hello friends.

Since Gleam v1.0.0 was released we have seen a huge surge of new users. Largely coming from other ecosystems, they are unfamiliar with the BEAM and have questions about how things work, how to do things, etc. Most of them I expected, but one common question has been a surprise, and I don’t have an answer for it: how to handle keyboard events.

We’ve done quite a bit of digging and it seems that we don’t have a way for an Erlang program to detect keyboard events; instead we have to receive a whole line of input. That is, an Erlang CLI program could not, for example, have a prompt that is answered by pressing y, or a UI that is navigated using j and k or the arrow keys.

Here’s an example made using the Golang Charm libraries, which some people bring up, even saying it’s the reason they stick with Golang.

Charm example

We’ve tried all the various NIF packages but we’ve had issues with latency. NIFs are also largely unsuited to CLI programs, which people want to use and share as escripts, and escripts do not bundle up NIFs.

It seems that prim_tty’s private read_nif has the capability to identify these keyboard events, but the module does not expose this functionality. I have little understanding of this system so I couldn’t say what would be an appropriate way of doing so.

I think it would be fantastic to be able to detect these keyboard events on the BEAM. With that the community could build Charm-like libraries, bringing us strength in an area in which we are currently quite lacklustre. Would this be something we could add?

Thanks,
Louis

Terminals are quite slow, slow enough that tools like ncurses are based on a similar idea to JavaScript frameworks like React that diff a virtual DOM in order to optimize UI updates. So, some of the latency you’ve seen might be inherent to the terminal rather than the code. A minimal getch() wrapper, on Linux, seems responsive enough:

#include <erl_nif.h>
#include <stdio.h>   /* getchar, EOF */
#include <termios.h>
#include <unistd.h>

static ERL_NIF_TERM getch(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[]) {
    struct termios oldattr, newattr;
    tcgetattr(0, &oldattr);
    newattr = oldattr;
    newattr.c_lflag &= ~(ICANON | ECHO);
    tcsetattr(0, TCSANOW, &newattr);
    int ch = getchar();
    tcsetattr(0, TCSANOW, &oldattr);
    if (ch == EOF) {
        return enif_make_atom(env, "eof");
    } else {
        return enif_make_tuple2(env, enif_make_atom(env, "ok"), enif_make_int(env, ch));
    }
}

static ErlNifFunc nif_funcs[] = {
    {"getch", 0, getch},
};

ERL_NIF_INIT(getch, nif_funcs, NULL, NULL, NULL, NULL)

With getch.erl:

-module(getch).
-export([getch/0, demo/0]).
-nifs([getch/0]).
-on_load(init/0).

init() ->
    ok = erlang:load_nif("./getch_nif", 0).

getch() ->
    erlang:nif_error(nif_library_not_loaded).

demo() ->
    io:fwrite("getch() demo: press a key to see its value!\n"),
    loop().

loop() ->
    case getch() of
        eof ->
            init:stop();
        {ok, Byte} ->
            case Byte of
                N when (N =:= $\n) or (N =:= $\r) ->
                    io:fwrite("You pressed enter.\n");
                N when N < $\s ->
                    io:fwrite("You pressed a control key.\n");
                $\s ->
                    io:fwrite("You pressed space.\n");
                N when N < 16#7F ->
                    io:fwrite("You pressed ~c\n", [N]);
                _ ->
                    % multi-byte input isn't treated kindly
                    ok
            end,
            loop()
    end.

As used:

$ gcc -I/usr/local/lib/erlang/usr/include/ -o getch_nif.so -fpic -shared getch_nif.c # or similar path
$ erlc getch.erl
$ erl -noinput -s getch demo
getch() demo: press a key to see its value!
You pressed t
You pressed e
You pressed s
You pressed t

That’s usable also in Eshell:

1> c(getch).
{ok,getch}
2> getch:getch().
{ok,116}
3> getch:getch().
{ok,101}
4> getch:getch().

So just getting keypresses is easy and should be reasonably fast. This covers only some keypresses, as

  1. terminals simply can’t see all the events that GUIs can.
  2. the terminal handles some keys specially, like Ctrl-C, which was turned into a signal that the BEAM intercepted to terminate both of the interactions above.
  3. the terminal also sends some keys (arrow keys, alt keys) as multi-byte escape codes, which are trickier to handle as timing starts to matter if you want to distinguish them from a lone escape key.
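
To make the third point concrete, here is a minimal C sketch of decoding bytes already read from the terminal. It assumes the arrow-key sequences ESC [ A through ESC [ D (the same ones matched in the Sokoban code later in the thread). The `enum key` names and `decode_key` are hypothetical, invented for this illustration, and `timed_out` is a flag the caller would set once it has waited long enough without further input.

```c
#include <stddef.h>

/* Hypothetical key codes for this sketch; not part of OTP or libc. */
enum key {
    K_NONE,        /* empty buffer */
    K_INCOMPLETE,  /* saw ESC or ESC [ only: wait a little for more bytes */
    K_ESCAPE,      /* treated as a lone Escape key */
    K_UP, K_DOWN, K_RIGHT, K_LEFT,
    K_CHAR         /* any other single byte, returned through *ch */
};

/* Decode one key from buf[0..len). *used is set to the number of bytes
 * consumed (0 if the caller should wait for more input). timed_out is
 * non-zero when the caller has already waited and nothing more arrived,
 * which is how a lone ESC is told apart from the start of a sequence.
 * Simplification: ESC followed by anything other than a known arrow
 * sequence (for example an alt-key) is reported as a lone ESC. */
enum key decode_key(const unsigned char *buf, size_t len, int timed_out,
                    size_t *used, unsigned char *ch)
{
    *used = 0;
    *ch = 0;
    if (len == 0)
        return K_NONE;
    if (buf[0] != 0x1b) {               /* ordinary single byte */
        *used = 1;
        *ch = buf[0];
        return K_CHAR;
    }
    /* buf[0] is ESC: it may be a key on its own or the start of a sequence */
    if (len == 1 || (len == 2 && buf[1] == '[')) {
        if (!timed_out)
            return K_INCOMPLETE;
        *used = 1;
        return K_ESCAPE;
    }
    if (buf[1] == '[') {
        *used = 3;
        switch (buf[2]) {
        case 'A': return K_UP;
        case 'B': return K_DOWN;
        case 'C': return K_RIGHT;
        case 'D': return K_LEFT;
        default:  break;                /* unknown sequence, fall through */
        }
    }
    *used = 1;                          /* consume only the ESC byte */
    return K_ESCAPE;
}
```

A caller would append newly read bytes to a buffer, call `decode_key`, drop `*used` bytes, and on `K_INCOMPLETE` wait a short while before calling again with `timed_out` set.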

But that and some tactical terminal control codes can get you a lot of the way to a nice terminal interface for something simple - and not very robust, or portable. Since OTP’s already doing some of the portability work, for line-editing across platforms, it’d be nice to reuse that work.
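
Part of the portability problem is that the getch NIF above leans on POSIX termios. As a sketch of how the timing question could be handled on that side, the non-canonical-mode settings VMIN and VTIME let a `read()` give up after a number of tenths of a second. The helper name `key_mode` is hypothetical; the termios fields and flags are the standard POSIX ones.

```c
#include <termios.h>

/* Return a copy of `saved` configured for key-at-a-time input with a read
 * timeout. `saved` is untouched so it can be restored later with tcsetattr().
 * With VMIN = 0 and VTIME = n, read() returns as soon as a byte is available,
 * or returns 0 after n tenths of a second if nothing was typed. */
struct termios key_mode(struct termios saved, unsigned char deciseconds)
{
    struct termios t = saved;
    t.c_lflag &= ~(tcflag_t)(ICANON | ECHO); /* no line buffering, no echo */
    t.c_cc[VMIN] = 0;                        /* do not insist on any bytes */
    t.c_cc[VTIME] = deciseconds;             /* timeout in units of 0.1 s */
    return t;
}
```

A NIF like the one above could apply the result with `tcsetattr(0, TCSANOW, &t)` and then call `read(0, &byte, 1)`, treating a return value of 0 as "no key yet". `getchar()` is a poor fit for this mode because stdio reports a zero-byte read as EOF.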

A heavier example. This is an mp4, rendered from asciinema. In the bottom frame: an Erlang node, started with erl -noinput -sname keycodes -s tui init, which loads a NIF that starts controlling the terminal in a separate thread, and then polls to see if it should exit. In the top frame: an Eshell session is spawned to connect to that node and control the TUI from the ‘backend’. I suspect this isn’t a very good way to do things vs. a C node or a port, but I liked it as a proof-of-concept.

@jrfondren can’t play the asciinema video

Try the gif.

Being able to do this is something that we would like to introduce, but it has not been prioritized yet. There is a workaround using internal APIs here: Getch for OTP26 · Issue #8037 · erlang/otp · GitHub

We’ve been seeing latency with the smallest possible update rather than when updating a whole TUI, so that’s not the problem unfortunately. We also looked at the TUI implementations in other languages and they didn’t do any rendering optimisation, so it seems it’s not much of an issue.

Either way a NIF is not viable for CLI programs as we can’t share them as part of an escript.

Good to hear, thank you.

Pure Erlang implementation of Sokoban in a terminal, based on that workaround.

EDIT: This works on Windows as-is, but with a very noticeable flicker as the screen is cleared. Performance will vary across terminals and computers, and vary especially if it’s run over an ssh connection. This is where ncurses-style tricks are vital.
sokoban
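
As a rough illustration of the kind of ncurses-style trick meant here, this C sketch keeps a copy of what was last drawn and rewrites only the rows that changed, rather than clearing the whole screen. It is not how ncurses itself works internally (ncurses optimises far more finely). `redraw_changed`, `ROWS`, and `COLS` are hypothetical names for this sketch. The escape sequences used are the standard cursor-position (ESC [ row ; col H) and erase-line (ESC [ 2 K) codes.

```c
#include <stdio.h>
#include <string.h>

#define ROWS 4    /* hypothetical fixed grid size for this sketch */
#define COLS 16

/* Compare the previously drawn grid with the next one. For each row that
 * differs, append "move to that row, erase it, print the new text" to `out`
 * and update `prev`. `top` is the 1-based screen row of the first grid row.
 * Returns the number of rows rewritten; `out` is empty if nothing changed. */
int redraw_changed(char prev[ROWS][COLS], char next[ROWS][COLS],
                   int top, char *out, size_t outsz)
{
    int changed = 0;
    size_t off = 0;

    if (outsz == 0)
        return 0;
    out[0] = '\0';
    for (int r = 0; r < ROWS; r++) {
        if (strcmp(prev[r], next[r]) == 0)
            continue;                       /* row is already on screen */
        int n = snprintf(out + off, outsz - off, "\x1b[%d;1H\x1b[2K%s",
                         top + r, next[r]);
        if (n < 0 || (size_t)n >= outsz - off) {
            out[off] = '\0';                /* no room: drop partial output */
            break;
        }
        off += (size_t)n;
        strcpy(prev[r], next[r]);
        changed++;
    }
    return changed;
}
```

For a single Sokoban move this writes one or two short rows instead of the whole screen, which is where the flicker and the cost over a slow connection come from.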

-module(sokoban).
-compile(export_all).
-record(game, {grid, moves, bound}).

sokoban(0) ->
    [" #####",
     " #.. #",
     "###  #",
     "# $  #",
     "# $ ##",
     "#@  #",
     "#####"];
sokoban(1) ->
    ["    #####",
     "    #   #",
     "    #$  #",
     "  ###  $##",
     "  #  $ $ #",
     "### # ## #   ######",
     "#   # ## #####  ..#",
     "# $  $          ..#",
     "##### ### #@##  ..#",
     "    #     #########",
     "    #######"].

new_game(N) ->
    G = grid_from(sokoban(N)),
    B = lists:foldl(fun ({X, Y}, {Bx, By}) ->
                            {max(X, Bx), max(Y, By)}
                    end, {0, 0}, maps:keys(G)),
    #game{moves=0, grid=G, bound=B}.

move("@ "++S)  -> " @"++S;
move("@."++S)  -> " &"++S;
move("& "++S)  -> ".@"++S;
move("&."++S)  -> ".&"++S;
move("@$ "++S) -> " @$"++S;
move("@$."++S) -> " @*"++S;
move("&$ "++S) -> ".@$"++S;
move("&$."++S) -> ".@*"++S;
move("@* "++S) -> " &$"++S;
move("@*."++S) -> " &*"++S;
move("&* "++S) -> ".&$"++S;
move("&*."++S) -> ".&*"++S;
move(S) -> S.

look(Grid, Loc={X, Y}, Delta={Dx, Dy}) ->
    case maps:get(Loc, Grid, false) of
        false -> [];
        Char -> [Char|look(Grid, {X+Dx, Y+Dy}, Delta)]
    end.
place(G, [], _, _) -> G;
place(G0, [Char|Look], Loc={X, Y}, Delta={Dx, Dy}) ->
    G1 = maps:put(Loc, Char, G0),
    place(G1, Look, {X+Dx, Y+Dy}, Delta).

move(Game, Delta) ->
    Soko = soko(Game#game.grid),
    L0 = look(Game#game.grid, Soko, Delta),
    L1 = move(L0),
    if
        L0 == L1 -> Game;
        true ->
            G1 = place(Game#game.grid, L1, Soko, Delta),
            Game#game{grid=G1, moves=Game#game.moves+1}
    end.

soko(G) ->
    case find($@, G) of
        false -> find($&, G);
        Loc -> Loc
    end.

find(Char, G) -> find(Char, G, maps:keys(G)).
find(Char, G, [Loc|T]) ->
    case maps:get(Loc, G) of
        Char -> Loc;
        _ -> find(Char, G, T)
    end;
find(_, _, []) -> false.

grid_from(L) ->
    {_, Grid} = lists:foldl(fun (Row, {Y, G0}) ->
                                    {_, G1} = lists:foldl(
                                                   fun (Char, {X, G2}) ->
                                                           {X+1, maps:put({X, Y}, Char, G2)}
                                                   end, {1, G0}, Row),
                                    {Y+1, G1}
                            end, {1, #{}}, L),
    Grid.

redraw(Term, #game{grid=G, moves=Moves, bound={_, By}}) ->
    write(Term, "\e[1;1H\e[2JE/H for easier/harder level, arrow keys to move\r\nMoves: ~p\r\n", [Moves]),
    draw(Term, G, 1, By).

draw(_, _, Row, Bound) when Row > Bound -> ok;
draw(Term, G, Row, Bound) ->
    write(Term, "~s\r\n", [look(G, {1, Row}, {1, 0})]),
    draw(Term, G, Row+1, Bound).

write(Term, Fmt, Args) ->
    ok = prim_tty:write(Term, unicode:characters_to_list(io_lib:format(Fmt, Args))).

start() -> play(prim_tty:init(#{}), new_game(0)).

wait(Term) ->
    receive
        {_, {data, <<"E">>}} ->
            play(Term, new_game(0));
        {_, {data, <<"H">>}} ->
            play(Term, new_game(1));
        _ ->
            wait(Term)
    end.

play(Term, Game) ->
    redraw(Term, Game),
    case find($$, Game#game.grid) of
        false ->
            write(Term, "*** You won in ~p moves ***\r\n", [Game#game.moves]),
            wait(Term);
        _ ->
            receive
                {_, {data, <<"\e[A">>}} -> play(Term, move(Game, {0, -1}));
                {_, {data, <<"\e[B">>}} -> play(Term, move(Game, {0, 1}));
                {_, {data, <<"\e[C">>}} -> play(Term, move(Game, {1, 0}));
                {_, {data, <<"\e[D">>}} -> play(Term, move(Game, {-1, 0}));
                {_, {data, <<"E">>}} -> play(Term, new_game(0));
                {_, {data, <<"H">>}} -> play(Term, new_game(1));
                _ -> play(Term, Game)
            end
    end.

Wow, what a fun demo, and good evidence of the usefulness of this API. Thank you.

An unrelated question: what is that -user flag? I couldn’t find a reference to it in the documentation.

-user is an (as you noticed) undocumented flag that can be used to replace the user_drv process in an Erlang system. Before Erlang/OTP 26, it was the only way to replace the Erlang shell, so Elixir, LFE and others all used it. Since Erlang/OTP 26 the same thing can be achieved by passing the -noinput flag and then calling shell:start_interactive/1.

I haven’t tested, but most likely the above example could also be started like this:

erl -noinput -s sokoban

This erl -noinput -s sokoban invocation fails, as prim_tty:init/1 tries to re-register user_drv_writer.

I struggled for a bit to get the Sokoban example running in an escript, and the trick is actually -user escript.

With a rebar3 escript project, that just means modifying rebar.config:

{escript_emu_args, "%%! +sbtu +A1 -user escript\n"}.