Core dump when using vector instructions inside a lima VM on a Mac M4 #3417

Open
@MishaVeldhoen

Description

Context

  • Lima version: 1.0.6
  • Host OS: macOS Sequoia 15.4
  • Host system: Apple M4 Pro, 24 GB
  • Default settings

Description

Inside a Lima VM, I tried to use the Python JAX library, whose JIT compiler emits whatever vector instructions it detects as available. On the host system and inside an OrbStack VM this works without issue, but inside the Lima VM the Python process crashes with a core dump.

Reproduction steps

limactl start
lima

# Setup
sudo apt update
sudo apt install pipx gdb
pipx install uv
pipx ensurepath
source ~/.bashrc
mkdir ~/test && cd ~/test
uv init
uv add "jax[cpu]"

# Causing the crash
ulimit -c unlimited
source .venv/bin/activate
python3
>>> import jax.numpy as jnp
>>> jnp.arange(2)

# Checking the core dump
gdb python3 core
(gdb) x/10i $pc
=> 0xf2606c4d7008:      index   z0.s, #0, #1
   0xf2606c4d700c:      ldr     x8, [x0, #24]
   0xf2606c4d7010:      mov     x0, xzr
   0xf2606c4d7014:      ldr     x8, [x8]
   0xf2606c4d7018:      str     d0, [x8]
   0xf2606c4d701c:      ldp     x29, x30, [sp], #16
   0xf2606c4d7020:      ret
   0xf2606c4d7024:      udf     #0
   0xf2606c4d7028:      udf     #0
   0xf2606c4d702c:      udf     #0
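
The faulting instruction in the dump, `index z0.s, #0, #1`, writes to a `z` register, which belongs to the SVE register file. To check from inside the guest which scalable-vector features the kernel advertises, something like the following sketch may help. It assumes a Linux arm64 guest whose `/proc/cpuinfo` contains a `Features` line; the function names `parse_cpu_features` and `scalable_vector_flags` are hypothetical helpers, not part of Lima or JAX.

```python
def parse_cpu_features(cpuinfo_text):
    """Return the flags from the first 'Features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()  # no Features line found (e.g. not an arm64 Linux guest)


def scalable_vector_flags(features):
    """Return the sorted subset of flags whose names start with 'sve' or 'sme'."""
    return sorted(f for f in features if f.startswith(("sve", "sme")))


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        text = ""  # file is absent on non-Linux systems
    print(scalable_vector_flags(parse_cpu_features(text)))
```

An empty list would mean the guest advertises no SVE or SME features, which appears to be the case in the OrbStack VM.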

More information

I tried the same steps inside an OrbStack VM and did not get a core dump.

I tried the same steps inside a Lima VM on an M3 machine and did not get a core dump.

On the M4 system, comparing the lscpu output of Lima and OrbStack suggests that the Lima VM advertises the host's scalable vector extensions (the sve2, svei8mm, svebf16, and sme* flags), which cause a core dump when used, while the OrbStack VM does not advertise them.

Lima:

Architecture:             aarch64
  CPU op-mode(s):         64-bit
  Byte Order:             Little Endian
CPU(s):                   4
  On-line CPU(s) list:    0-3
Vendor ID:                Apple
  Model name:             -
    Model:                0
    Thread(s) per core:   1
    Core(s) per cluster:  4
    Socket(s):            -
    Cluster(s):           1
    Stepping:             0x0
    BogoMIPS:             48.00
    Flags:                fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 flagm2 frint svei8mm svebf16 bf16 afp sme smei16i64 smef64f64 smei8i32 smef16f32 smeb16f32 smef32f32 sme2 smei16i32 smebi32i32
NUMA:
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-3

Orbstack:

$ orbctl run lscpu
Architecture:             aarch64
  CPU op-mode(s):         64-bit
  Byte Order:             Little Endian
CPU(s):                   12
  On-line CPU(s) list:    0-11
Vendor ID:                Apple
  Model name:             -
    Model:                0
    Thread(s) per core:   1
    Core(s) per cluster:  12
    Socket(s):            -
    Cluster(s):           1
    Stepping:             0x0
    CPU(s) scaling MHz:   100%
    CPU max MHz:          2000.0000
    CPU min MHz:          2000.0000
    BogoMIPS:             48.00
    Flags:                fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb dcpodp flagm2 frint bf16 afp
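
To make the difference between the two flag lists explicit, here is a small sketch that computes the set difference. The flag strings are copied from the two lscpu outputs above, assuming the wrapped `smei8` / `i32` fragment in the Lima output is the single flag `smei8i32`. The `flag_diff` helper is a hypothetical name used only for illustration.

```python
# Flags copied from the Lima lscpu output above.
LIMA_FLAGS = (
    "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid "
    "asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat "
    "ilrcpc flagm sb paca pacg dcpodp sve2 flagm2 frint svei8mm svebf16 bf16 "
    "afp sme smei16i64 smef64f64 smei8i32 smef16f32 smeb16f32 smef32f32 sme2 "
    "smei16i32 smebi32i32"
)

# Flags copied from the OrbStack lscpu output above.
ORBSTACK_FLAGS = (
    "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid "
    "asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat "
    "ilrcpc flagm sb dcpodp flagm2 frint bf16 afp"
)


def flag_diff(a, b):
    """Return (flags only in a, flags only in b), each as a sorted list."""
    a_set, b_set = set(a.split()), set(b.split())
    return sorted(a_set - b_set), sorted(b_set - a_set)


only_lima, only_orbstack = flag_diff(LIMA_FLAGS, ORBSTACK_FLAGS)
print("Only in Lima:", " ".join(only_lima))
print("Only in OrbStack:", " ".join(only_orbstack) or "(none)")
```

Under that assumption, the flags present only in Lima are the pointer-authentication flags (paca, pacg) plus the SVE and SME families, and OrbStack exposes no flag that Lima lacks.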
