Segmentation Fault when using init:restart() in docker

Hi all,

I am having a problem with a call to init:restart() in my application. Sometimes it works fine, but sometimes it crashes the BEAM with a segmentation fault (usually the first restart works and the second one crashes). This only happens when running in a Docker container; on the host it doesn’t occur. I don’t see any relevant logs, it just crashes suddenly.

The app is written in Elixir, it’s an umbrella application dedicated to data processing. One of the apps is a Phoenix application providing a UI. I can provide more information about the app, but sadly cannot share it.

Any ideas how to debug such a problem?

Any help is greatly appreciated. Thanks!

You could try to bisect the problem: you have several applications running in your system.
Try disabling some of them and see whether the problem still appears.
If just one application is responsible for the crash, that could lead you to the cause…
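
The bisection idea can be sketched roughly like this. It is only an illustration in Python, not tied to any Elixir tooling: `find_culprit` and the `crashes` callback are hypothetical names, and the sketch assumes that exactly one application is responsible and that the crash can be reproduced reliably for a given set of enabled applications.

```python
from typing import Callable, List, Sequence


def find_culprit(apps: Sequence[str], crashes: Callable[[List[str]], bool]) -> str:
    """Narrow a crash down to one app by repeatedly halving the enabled set.

    Assumes exactly one app is responsible and that `crashes(enabled)`
    reports reliably whether the crash occurs with only `enabled` running.
    """
    candidates = list(apps)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        # Keep whichever half still reproduces the crash.
        if crashes(first_half):
            candidates = first_half
        else:
            candidates = candidates[middle:]
    return candidates[0]
```

Since the crash described here is intermittent, a real `crashes` check would probably need to attempt the restart several times before concluding that a given set of applications is clean.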

Yeah, that’s a good idea. Will try that.

Is it possible to get a core dump of the segfaulting BEAM? If yes, then you can analyze the BEAM core dump.

It’s even better if you can run it under debugger.
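
When no core dump shows up, two settings are worth checking from inside the container: the core file size limit of the process, and the kernel's core pattern. Below is a minimal sketch in Python, assuming a Linux system with /proc mounted. The function name `core_dump_settings` is made up for illustration.

```python
import resource
from pathlib import Path


def core_dump_settings() -> dict:
    """Report two settings that commonly suppress core dumps on Linux."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    try:
        pattern = Path("/proc/sys/kernel/core_pattern").read_text().strip()
    except OSError:
        pattern = None  # not Linux, or /proc is not available
    return {
        "core_soft_limit": soft,  # 0 means no core file is written for this process
        "core_hard_limit": hard,
        "core_pattern": pattern,  # a leading "|" pipes the dump to a helper program
        "dumps_enabled": soft != 0,
    }


if __name__ == "__main__":
    for key, value in core_dump_settings().items():
        print(f"{key}: {value}")
```

If the soft limit turns out to be 0, starting the container with `docker run --ulimit core=-1 ...` should raise it. The core pattern is a kernel-wide setting shared with the host, so a pattern that pipes dumps to a helper on the host may explain why nothing appears inside the container.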

I was unable to get the core dump for some reason. Whatever I tried it didn’t make the dump.

However, I was able to identify that it only happens on Alpine Linux, which explains why it only happened in Docker. My guess is that some NIF I use doesn’t play nice with the libraries in Alpine (which often uses different versions than other Linux distros, like libc vs glibc, busybox etc.). This is supported by a test in which I removed most dependencies and the restart worked, so it’s probably a NIF and not the BEAM itself. I also later found some other issues for segfaults in Alpine (probably unrelated, like this).

I will solve this by moving to the standard Elixir images based on Debian, which work fine. Interestingly, the resulting image of our app on Debian is smaller than on Alpine, so there’s really no reason for us to use Alpine anymore.

Thanks for the help.

That is not correct. libc is not one specific library; it is the well-known name for the C standard library, which has several different implementations.

Alpine uses musl, and frankly speaking, Alpine is not as useful in the real world as people like to say.

Once you install everything that you may need in the container (like vim-tiny, curl, etc.), you end up with about the same size as Ubuntu.

When you use any alternative to glibc (like musl or dietlibc), you must compile against that exact version of the libc implementation. glibc will give you compatibility. Other libraries do not promise it.

So:

  1. compile Erlang under the same version of Alpine, or
  2. switch to Ubuntu, and I think the crashes will go away
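
To see which libc a given container actually ships, something like the following best-effort check can help. It is a heuristic sketch in Python, not an authoritative test: `detect_libc` is a made-up name, and the musl branch assumes the usual location of musl's dynamic loader under /lib.

```python
import glob
import platform


def detect_libc() -> str:
    """Best-effort guess at the C library of the running system."""
    # platform.libc_ver() inspects the Python executable; on glibc systems
    # it typically returns a pair such as ("glibc", "2.31").
    name, _version = platform.libc_ver()
    if name:
        return name
    # Heuristic assumption: musl installs its loader as /lib/ld-musl-<arch>.so.1
    if glob.glob("/lib/ld-musl-*.so.1"):
        return "musl"
    return "unknown"


if __name__ == "__main__":
    print(detect_libc())
```

Running this in the build image and in the runtime image gives a quick hint on whether native code such as a NIF was compiled against one libc and is being loaded under another.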

Right, that was an example I took from somebody, and I see now that it was an incorrect one, sorry :)

In our case Erlang was compiled under the same version of Alpine, or at least I’d think so when using the official Elixir image (which is based on the official Erlang image). When debugging I found joken to be one of the libraries causing the issues, so all signs point to some NIF having a problem with something in Alpine. I didn’t dig deeper, because it would most likely be too complicated, and even if I found the issue it would be very hard to resolve, for me at least. So yeah, switching to Debian is the way to go.
