It all started when I saw this message upon login on my Ubuntu server:
There are 2 zombie processes.
I tried to locate these zombies:
# ps axo stat,ppid,pid,comm | grep -w defunct
Zs 58904 59064 epmd <defunct>
Z 58904 59065 epmd <defunct>
ppid=58904 points to my Erlang release running inside the Docker container:
# ps auxwww | grep 58904
ubuntu 58904 8.8 0.2 2362552 182348 ? Ssl 16:03 0:28 /opt/taurus/bin/taurus -Bd -Bi -- -root /opt/taurus -bindir /opt/taurus/erts-14.1.1/bin -progname opt/taurus/bin/taurus -- -home /home/ubuntu -- -noshell -noinput -boot /opt/taurus/releases/latest/start -mode embedded -boot_var SYSTEM_LIB_DIR /opt/taurus/lib -config /opt/taurus/config/docker/app.config -sname taurus -setcookie awesome_cookie -- -- foreground --
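The two lookups above (find the defunct processes, then find their parent) can be combined into one pass over the ps output. A minimal sketch, assuming the same column order as the ps axo stat,ppid,pid,comm command used above; list_zombies is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: reads "STAT PPID PID COMMAND" lines on stdin
# (the column order produced by `ps axo stat,ppid,pid,comm`) and prints
# one line per zombie, i.e. every row whose state starts with "Z".
list_zombies() {
  awk '$1 ~ /^Z/ { print "zombie pid=" $3 " ppid=" $2 " comm=" $4 }'
}
```

It would be used as ps axo stat,ppid,pid,comm | list_zombies. The ppid value it prints is the process that has not yet reaped its children, which is the one worth inspecting next.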
I’m using nothing special in my vm.args:
I narrowed the issue down: it happens early, less than 5 seconds after I start my container.
Inside the container, I see this:
$ ps auxwww | grep epmd | grep -v grep
ubuntu 78 0.0 0.0 3740 100 ? S 16:03 0:00 /opt/taurus/erts-14.1.1/bin/epmd -daemon
ubuntu 132 0.0 0.0 0 0 ? Zs 16:03 0:00 [epmd] <defunct>
ubuntu 133 0.0 0.0 0 0 ? Z 16:03 0:00 [epmd] <defunct>
In addition to the 2 defunct epmd, a new one is happily running.
This might be related to this RabbitMQ issue. But honestly, I’ve no clue.
Ubuntu 22.04 LTS
Help appreciated as I’m new to using Erlang within Docker.
I also tried the suggestion here, but I still get these
-kernel inet_dist_use_interface 127.0.0.1
@tsloughter Thanks for the pointer. What about the following settings in my
-env ERL_DIST_PORT 4369
Your solution worked perfectly.
My container is totally isolated and, besides manually connecting to it using remote_console (relx), it doesn’t connect to any other node.
Would you recommend the above in
@tsloughter When trying to connect to my node inside the container, I get this error:
$ echo $ERL_DIST_PORT
$ bash -x /opt/taurus/bin/taurus remote_console
+ erl_rpc erlang is_alive
+ echo 'Node is not running!'
What did I do wrong?
I’ve built my release using this
$ rebar3 --version
rebar 3.22.0 on Erlang/OTP 26 Erts 14.1.1
Looks right. I’d simplify the vm.args just in case though, you don’t need any of that, the taurus script generated by rebar3/relx will handle everything if you set
But note you can’t do this with -env ERL_DIST_PORT 4369, since that is an argument to the Erlang VM and thus won’t be picked up by the taurus shell script. Since you are using Docker, you’ll want to set it in the Dockerfile, with -e, or under environment if using docker compose.
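Since the variable has to be visible to the shell script rather than to the VM, it can help to confirm that it actually reaches the container's environment before running any release command. A minimal sketch; check_dist_port is a hypothetical helper, and the ways of setting the variable named in the comments are the Docker mechanisms mentioned above, shown with the port value from this thread:

```shell
# The variable can come from any of the places mentioned above, e.g.
#   Dockerfile:      ENV ERL_DIST_PORT=4369
#   docker run:      -e ERL_DIST_PORT=4369
#   docker compose:  an "environment" entry for the service
#
# Hypothetical helper (not part of rebar3/relx): report whether
# ERL_DIST_PORT is visible in the current shell environment.
check_dist_port() {
  if [ -z "${ERL_DIST_PORT:-}" ]; then
    echo "ERL_DIST_PORT is not set"
    return 1
  fi
  echo "ERL_DIST_PORT=${ERL_DIST_PORT}"
}
```

Running this in a shell inside the container (for example one opened with docker exec) shows what the release script would see. A non-zero exit status means the variable never made it into that environment.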
@tsloughter That’s correct. I’ve set ERL_DIST_PORT in my Dockerfile to make things as described in your post. But I’m still unable to connect to my node using
Hm, weird. It is a tough one to debug without being able to play with it. Is it public by chance or can you reproduce in a public repo?
@tsloughter Not publicly accessible, unfortunately. Just for me to understand: in your case, you disabled epmd and were able to remote_console into the node inside
If yes, I must be doing something wrong.
Yea. And setting ERL_DIST_PORT will automatically add -start_epmd false to the args, so there is no need to disable it manually.
@tsloughter it seems that erl_call can’t connect to my node.
$ echo $ERL_DIST_PORT
$ /bin/sh -x /opt/taurus/bin/taurus remote_console
/opt/taurus/erts-14.1.1/bin/erl_call -R -c awesome_cookie -address 4369 -timeout 60 -a erlang is_alive
erl_call: failed to connect to node with address ":4369"
+ [ 1 -eq 0 ]
+ return 1
+ echo Node is not running!
Node is not running!
$ ss -antp | grep 4369
This is strange. If I deploy my app outside the container, the above steps work as expected and I can remote_console. But from within the Docker container, my release isn’t listening on port
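A grep for the port number also matches longer ports and peer addresses, so a stricter check on the listening sockets can make the "is anything bound there?" question less ambiguous. A minimal sketch, assuming the column layout of ss -ltn from iproute2 (state in the first field, local address:port in the fourth); listening_on is a hypothetical helper name:

```shell
# Hypothetical helper: filter `ss -ltn`-style lines on stdin, keeping only
# LISTEN sockets whose local address (4th field) ends in ":<port>".
listening_on() {
  awk -v suffix=":$1" '
    $1 == "LISTEN" {
      n = length($4); m = length(suffix)
      # Compare the tail of the local address against ":<port>" exactly,
      # so port 4369 does not match 43690.
      if (n >= m && substr($4, n - m + 1) == suffix) print
    }'
}
```

It would be used as ss -ltn | listening_on 4369 inside the container. Empty output means no socket is listening on that port in that network namespace.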
@tsloughter found it
The only thing that needs to be set is the ERL_DIST_PORT environment variable.
Thanks a lot @tsloughter. You made my day.