Description
Provide a reproducer for a bug that is discussed at:
Bug summary:
When -split-functions -lite=1 is used on AArch64 with the MongoDB setup below, the BOLT-optimized binary crashes at runtime. Doing a tentative code-layout size estimation for cold blocks (the now-merged patch of #96609) does not trigger this bug. However, the bug should not have occurred regardless. Point 2 below has more details (see here).
Reproducer:
Summary of reproducing options:
Reproducing option | Follow from step |
---|---|
Reproduce the crash directly. Uses pre-generated: • input binary (mongod.tar.gz) • bolted profile (perf.boltdata) | Step (3) |
Generate sampled profile and then reproduce the crash. Uses pre-generated: • input binary (mongod.tar.gz) | Step (2) |
Generate all artifacts from sources. Compiles stage-0 MongoDB and follows all steps. | Step (1) |
NOTE: Find all attachments at the end of this post. The mongod binary (in mongod.tar.gz) has stripped debug data due to upload size requirements. If those are needed please follow all steps to get a binary with debug info.
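As a convenience, the table above can be turned into a small check that suggests a starting step from the files already present. This is only a sketch: pick_start_step is a hypothetical helper name, and it assumes the artifacts are extracted into one directory under the names mongod and perf.boltdata.

```shell
# Sketch: suggest which step to start from, based on the artifacts present.
# pick_start_step is a hypothetical helper; file names follow the table above.
pick_start_step() {
    dir="${1:-.}"
    if [ -f "$dir/mongod" ] && [ -f "$dir/perf.boltdata" ]; then
        echo 3   # binary and converted profile available: go straight to BOLT
    elif [ -f "$dir/mongod" ]; then
        echo 2   # binary only: record a profile first
    else
        echo 1   # nothing pre-generated: build MongoDB from sources
    fi
}
# Usage (not run here):
#   echo "Start from step ($(pick_start_step .))"
```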
1. Generate MongoDB stage-0 binary from sources:
1.A. Instructions with docker:
Get this Dockerfile for compiling mongodb on AArch64 using:
# Compile mongodb v7.0.5 with clang 18.1.8 (stage-0):
docker build --progress=plain --tag 'tmp-mdb-stage-0' .
# Then, 'pull' the binary from container to the host, given host is Ubuntu 22.04/24.04 (preferred)
docker run --rm --entrypoint cat tmp-mdb-stage-0 mongo/build/install/bin/mongod > mongod
chmod +x mongod
# once the binary is retrieved, the image can be deleted
docker rmi tmp-mdb-stage-0
Dockerfile tested on AArch64 Ubuntu LTS hosts:
- Noble: 24.04.1, 6.8.0-1016-aws, docker 27.3.1
- Jammy: 22.04.4 / 22.04.5, on 6.8.0-1016-aws / 6.8.0-1017-aws (metal or not), docker 24.0.5 / 25.0.3
Note: it may run on other AArch64 hosts (e.g., AL2023) but this is not recommended. Prefer building on Ubuntu 22/24. After building, extract the binary from Docker and perform steps 2 and 3 directly on the host. Otherwise, additional setup may be needed to generate a perf profile and continue with the steps inside the Docker container.
1.B. Sample Instructions without docker (alternative):
**Note: If compiling from sources you may need to consult MongoDB documentation and/or adjust commands/patch below slightly, depending on your setup.**
Install required software and compile mongodb from sources.
The patch mongod.patch needs to be applied. It deals with a compilation error on a library dependency.
Configuration this was tested on:
Software | Version |
---|---|
OS | Ubuntu 22.04.3 LTS |
Kernel | 6.8.0-1016-aws |
clang | 18.1.8 (++20240615103753+3b5b5c1ec4a3-1 |
Python | 3.10.12 |
# 1. pre-requisites
sudo apt-get install build-essential
# if you run into errors, you may also need:
sudo apt install \
libcurl4-openssl-dev \
liblzma-dev \
libssl-dev \
python3-pip \
python3-venv \
openjdk-21-jdk
# 2. get mongo sources
git clone --branch r7.0.5 https://github.com/mongodb/mongo.git
# 3. setup python environment
python3 -m venv mongo-db-venv --prompt mongo
source mongo-db-venv/bin/activate
# 4. install mongod dependencies and apply needed patch
cd mongo
pip install --upgrade pip
pip install -r etc/pip/compile-requirements.txt
git am mongod.patch # apply patch (adjust the path to where you saved it)
# 5. compile stage-0 binary
python3 buildscripts/scons.py install-mongod \
CC=clang-18 CXX=clang++-18 \
CCFLAGS="-fno-omit-frame-pointer -Wno-deprecated-non-prototype -Wno-enum-constexpr-conversion" \
LINKFLAGS="-Wl,-q"
cd .. # back to parent dir
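The LINKFLAGS="-Wl,-q" option above asks the linker to keep relocations in the output binary. A quick way to confirm they are there is to look for a .rela.text section. The sketch below assumes readelf from binutils is installed; has_text_relocs is a hypothetical helper name that only inspects section-header output given on stdin.

```shell
# Sketch: check `readelf -S` output (read from stdin) for a .rela.text section.
# has_text_relocs is a hypothetical helper, not part of BOLT or binutils.
has_text_relocs() {
    grep -q '\.rela\.text'
}
# Usage (not run here; adjust the path to your mongod binary):
#   readelf -S ./mongod | has_text_relocs && echo "relocations present"
```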
2. Generate profile & BOLT optimize
2.a. Get YCSB harness
This is needed for initializing the database with some data, and then running a simple workload to trigger the crash.
# 1. Get maven
wget https://dlcdn.apache.org/maven/maven-3/3.9.6/binaries/apache-maven-3.9.6-bin.tar.gz
mkdir -p maven
tar -xf apache-maven-*-bin.tar.gz -C maven
# 2. Get and build YCSB from sources:
git clone --branch production https://github.com/mongodb-labs/YCSB.git
orig_dir=$(pwd)
cd YCSB/ycsb-mongodb
$orig_dir/maven/apache-maven-3.9.6/bin/mvn clean package
2.b. Build a dataset
# 1. Start the db service (in one tab/pane/window, as this is blocking)
mkdir -p mongo/dataset # empty dataset
./mongod --dbpath=./mongo/dataset
# 2. Create a dataset (in a second tab)
cd YCSB/ycsb-mongodb
bin/ycsb \
load mongodb -P workloads/workloada \
-s \
-p recordcount=20000 \
-p operationcount=10000
# 3. Stop the db service (Ctrl-C in its tab)
2.c. Record a perf profile
# 1. Start db service (in one tab, as this is blocking)
./mongod --dbpath=./mongo/dataset
# 2. Find its process id and verify manually that you got the relevant pid:
# (sample command)
PID=$(ps uax | grep mongod | grep -v grep | awk '{print $2 }')
# 3. Record samples (in another tab)
perf record -e cycles:uP -p $PID
# 4. Start benchmarking (in another tab)
cd YCSB/ycsb-mongodb
bin/ycsb run mongodb -P workloads/workloada \
-s \
-p recordcount=20000 \
-p operationcount=10000
# 5. Stop perf record once YCSB is done. Also stop the db service (Ctrl-C in its tab).
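The sample PID lookup in step 2 above can return several PIDs (or none) if more than one process matches "mongod", which is why it asks for a manual check. A minimal sketch of making that check explicit follows; require_single_pid is a hypothetical helper name, and the usage line assumes pgrep (procps) is available.

```shell
# Sketch: succeed only when exactly one PID is passed in.
# require_single_pid is a hypothetical helper name.
require_single_pid() {
    if [ "$#" -ne 1 ]; then
        echo "expected exactly one mongod PID, got $#: $*" >&2
        return 1
    fi
    echo "$1"
}
# Usage (not run here); pgrep -x matches the process name exactly:
#   PID=$(require_single_pid $(pgrep -x mongod)) && perf record -e cycles:uP -p "$PID"
```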
3. Generate BOLT'ed binary
3.a. Compile LLVM sources:
Check out the latest main without the patch of #96609:
# 0. clone llvm/llvm-project and set the below var:
llvm=/path/to/llvm/dir
# 1. check out the commit just before the PR improvement:
git checkout cb9bacf57d5c~1
# 2. compile BOLT from sources (adjust commands):
cmake -G Ninja .. \
-DLLVM_TARGETS_TO_BUILD=AArch64 \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DLLVM_ENABLE_PROJECTS='clang;bolt;lld' \
-DLLVM_USE_LINKER=lld
ninja ...
3.b. Generate BOLT'ed binary
# 1. convert profile:
$llvm/build/bin/perf2bolt -p perf.data -o perf.boltdata --nl mongod
# 2. compile bolted binary:
$llvm/build/bin/llvm-bolt mongod -o mongod.bolt --data perf.boltdata -lite=1 -split-functions
3.c. Observe the runtime crash
# 1. start the database service (in one tab, as the below is blocking)
./mongod.bolt --dbpath=./mongo/dataset
# 2. start benchmarking (in another tab)
cd YCSB/ycsb-mongodb
./bin/ycsb run mongodb -P workloads/workloada \
-s -threads 1 \
-p recordcount=20000 \
-p operationcount=10000
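To tell a crash apart from a normal shutdown, it can help to look at the exit status of mongod.bolt in the first tab. The sketch below relies on the common shell convention that a status above 128 means the process was terminated by signal (status - 128). Which signal the crash produces is not stated here, so this is an assumption; describe_exit is a hypothetical helper name.

```shell
# Sketch: interpret a shell exit status.
# describe_exit is a hypothetical helper name.
describe_exit() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        # Shell convention: 128 + N means "terminated by signal N"
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}
# Usage (not run here):
#   ./mongod.bolt --dbpath=./mongo/dataset; describe_exit $?
```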
Binaries and aux files:
All files are hosted on this Github Gist: 9eb878f73e18fb9d3f996ae7c59d4792.
Binary | Description | Details |
---|---|---|
mongod.tar.gz | Stage-0 binary of mongod service. | Extract with tar -xvf mongod.tar.gz |
perf.boltdata | Converted fdata BOLT profile. | Needed for optimizing mongod binary with BOLT. |
mongod.patch | Patch for mongodb sources. | Needed for compiling mongodb with clang-18. |
llvm-bolt.txt | Output of llvm-bolt. | Not needed for compilation/optimization. |
perf2bolt.txt | Output of perf2bolt profile conversion. | Not needed for compilation/optimization. |
Binary | Checksum | Details |
---|---|---|
mongod | f379208ed066b85dfb30d317e1758e9e | Generate using md5sum mongod |
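After extracting mongod.tar.gz, the checksum above can be compared automatically instead of by eye. A minimal sketch, assuming md5sum from coreutils; verify_md5 is a hypothetical helper name.

```shell
# Sketch: compare a file's MD5 digest against an expected value.
# verify_md5 is a hypothetical helper name; returns 0 on match, non-zero otherwise.
verify_md5() {
    file="$1"
    expected="$2"
    actual=$(md5sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}
# Usage (not run here):
#   verify_md5 mongod f379208ed066b85dfb30d317e1758e9e && echo "checksum OK"
```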