I’m trying to compile an erlang.mk-using application in an amd64 container on an arm64 host. When it gets to the part where it bootstraps rebar3, it fails with a “Segmentation Fault”.
Latest erlang.mk, rebar3 3.24.0, docker.io/erlang:27.3.3-alpine base image.
This happens with both docker and podman.
I’m pretty good with computers (this is British understatement, incidentally), but where do I even start to debug this?
Assuming that QEMU is used here, this is a known bug in QEMU user-mode emulation.
The gist of it is that we map all code twice, with one executable region for running code, and one writable region for updating it. As they map to the same physical memory any update in the writable region magically “appears” in the executable one.
This works fine on actual hardware, and when QEMU runs in full-system virtualization mode it has a full view of the page tables, but user-mode emulation is a very thin layer that does nothing beyond translating instructions. When we mmap the same memory twice, QEMU has no idea that the two mappings are connected, and does not understand that previously translated instructions in the executable region must be updated when the writable region is modified.
For arm64 on amd64, QEMU accepted a patch that fixed the issue by piggy-backing on the instruction cache invalidation arm64 requires, but unfortunately there are no corresponding instructions on amd64 we can use. Technically, invalidating all translations everywhere on cpuid would work, but I don’t expect that to be accepted.
You can work around it by passing the +JMsingle true emulator option, which disables the dual-mapping.
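One way to pass the option without editing any build files is the ERL_FLAGS environment variable, which erl reads at startup. This is a minimal sketch, and it assumes the rebar3 bootstrap started by erlang.mk inherits the environment of the make invocation; that has not been verified here.

```shell
# Append to any flags that are already set instead of overwriting them.
export ERL_FLAGS="${ERL_FLAGS:+$ERL_FLAGS }+JMsingle true"

# Every erl/escript process started from this shell now receives the option.
# The build would then be started as usual, for example:
#   make
echo "ERL_FLAGS=$ERL_FLAGS"
```

In a container build, setting the same variable in the image (for example with an ENV line in the Dockerfile) should have the same effect, again assuming nothing in the build resets it.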
As far as I can tell, both podman and docker are using Rosetta (this is an M1 Mac), rather than QEMU. This is according to the documentation; I’m not sure how to confirm that, though.
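One possible way to check is sketched below. It assumes the emulator is registered through the kernel's binfmt_misc mechanism, and that you can get a shell in the Linux VM that runs the containers, since the interface is often not mounted inside a container. The function name list_binfmt_handlers is made up for illustration.

```shell
# List registered binfmt_misc handlers and the interpreter each one uses.
# The directory argument exists only so the function can be pointed at a
# test fixture; it defaults to the real kernel interface.
list_binfmt_handlers() {
    dir="${1:-/proc/sys/fs/binfmt_misc}"
    for entry in "$dir"/*; do
        [ -f "$entry" ] || continue
        # Handler files contain a line of the form "interpreter /path/to/binary".
        # Control files such as "register" may not be readable; skip those.
        interp=$(sed -n 's/^interpreter //p' "$entry" 2>/dev/null) || continue
        if [ -n "$interp" ]; then
            printf '%s -> %s\n' "$(basename "$entry")" "$interp"
        fi
    done
}

list_binfmt_handlers
```

An interpreter path that mentions qemu would suggest QEMU user-mode emulation, while one that mentions rosetta would suggest Rosetta. The exact handler names depend on how the VM was set up, so treat the output as a hint rather than proof.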
It also doesn’t happen when building a rebar3 project, rather than an erlang.mk project, which is weird.
I’ll try this and see if it fixes it.
It also occurs to me that I should attempt to build/bootstrap rebar3 myself, just to rule out erlang.mk causing any problems (though I have no idea how that could be).
Yeah, this seems to fix it.
I might dig around some more (probably using Lima) to check whether this is a QEMU-only bug, or whether it occurs on Rosetta as well (Lima gives you the option of replacing the podman VM with your own, apparently, which means I can actually choose which emulation is used).
I did some digging around. On my MacBook Pro M1 (aarch64), building an x86_64 container:
- qemu-system: No segfault.
- qemu-user: segfault.
- rosetta: segfault.
Adding +JMsingle true fixes the segfault on both qemu-user and rosetta.
Commands used:
limactl create --name qemu-system --arch x86_64 --vm-type qemu template://podman
limactl create --name qemu-user template://podman
limactl create --name rosetta --vm-type=vz --rosetta template://podman
Both Docker Desktop and Podman are using Rosetta on this machine.
Rosetta is significantly quicker than qemu (in either mode) when building amd64 on arm64.
Building my chosen application took ~1380 seconds with qemu-system, ~800 seconds with qemu-user and ~290 seconds with Rosetta (Lima, Docker Desktop and Podman were within a few seconds of each other).
That’s interesting. Last we checked, the Intel JIT works just fine in Rosetta despite the dual-mapping when running directly on top of macOS. Perhaps there’s something funny going on when invoked inside a VM. I would’ve expected the cpuid instruction we issue out of an abundance of caution to rectify this (as documented in the Intel manual), but perhaps they haven’t emulated this.
I’ll look into it as time allows. 