Segfault running SimNetwork under verilator #1885

Open
3 tasks done
sethp opened this issue May 24, 2024 · 11 comments

sethp commented May 24, 2024

Background Work

Chipyard Version and Hash

Release: N/A
Hash: ef71dfd

OS Setup

+ uname -a
Linux cerf 6.6.28-1-MANJARO #1 SMP PREEMPT_DYNAMIC Wed Apr 17 13:19:22 UTC 2024 x86_64 GNU/Linux
+ lsb_release -a
LSB Version:	n/a
Distributor ID:	ManjaroLinux
Description:	Manjaro Linux
Release:	23.1.4
Codename:	Vulcan
(partial) `printenv`
CONDA_EXE=/home/seth/miniforge3/bin/conda
_CE_M=
_CE_CONDA=
CONDA_PYTHON_EXE=/home/seth/miniforge3/bin/python
CONDA_SHLVL=2
CONDA_BACKUP_PATH=/home/seth/Code/src/github.com/ucb-bar/chipyard/software/firemarshal:/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/bin:/home/seth/Code/src/github.com/ucb-bar/chipyard/software/firemarshal:/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/riscv-tools/bin:/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/riscv-tools/bin:/home/seth/Code/src/github.com/ucb-bar/chipyard/software/firemarshal:/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/riscv-tools/bin:/home/seth/Code/src/github.com/ucb-bar/chipyard/software/firemarshal:/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/bin:/home/seth/miniforge3/condabin:/usr/bin:/home/seth/.bun/bin:/home/seth/perl5/bin:/home/seth/Code/bin:/home/seth/.cargo/bin:/home/seth/.krew/bin:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/seth/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/home/seth/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/android-sdk/cmdline-tools/latest/bin:/opt/android-sdk/platform-tools:/opt/android-sdk/tools:/opt/android-sdk/tools/bin:/usr/lib/emscripten:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/usr/lib/rustup/bin:/var/lib/snapd/snap/bin:/home/seth/Code/bin:/usr/local/kubebuilder/bin
JAVA_HOME=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/lib/jvm
JAVA_LD_LIBRARY_PATH=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/lib/jvm/lib/server
CONDA_PREFIX=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env
CONDA_DEFAULT_ENV=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env
CONDA_PROMPT_MODIFIER=(/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env) 
CONDA_PREFIX_1=/home/seth/miniforge3
RISCV=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/riscv-tools
LD_LIBRARY_PATH=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/riscv-tools/lib
GSETTINGS_SCHEMA_DIR_CONDA_BACKUP=
GSETTINGS_SCHEMA_DIR=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/share/glib-2.0/schemas
XML_CATALOG_FILES=file:///home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/etc/xml/catalog file:///etc/xml/catalog
JAVA_HOME_CONDA_BACKUP=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/lib/jvm
JAVA_LD_LIBRARY_PATH_BACKUP=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/lib/jvm/lib/server
_=/home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env/bin/printenv
`conda list`
# packages in environment at /home/seth/Code/src/github.com/ucb-bar/chipyard/.conda-env:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
_sysroot_linux-64_curr_repodata_hack 3                   h69a702a_14    conda-forge
aiohttp                   3.9.3           py310h2372a71_0    conda-forge
aiosignal                 1.3.1              pyhd8ed1ab_0    conda-forge
alabaster                 0.7.16             pyhd8ed1ab_0    conda-forge
alsa-lib                  1.2.11               hd590300_1    conda-forge
annotated-types           0.6.0              pyhd8ed1ab_0    conda-forge
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
archspec                  0.2.3              pyhd8ed1ab_0    conda-forge
argcomplete               3.2.3              pyhd8ed1ab_0    conda-forge
asttokens                 2.4.1                    pypi_0    pypi
async-timeout             4.0.3              pyhd8ed1ab_0    conda-forge
atk-1.0                   2.38.0               hd4edc92_1    conda-forge
attrs                     23.2.0             pyh71513ae_0    conda-forge
autoconf                  2.71            pl5321h2b4cb7a_1    conda-forge
aws-c-auth                0.7.8                h538f98c_2    conda-forge
aws-c-cal                 0.6.9                h5d48c4d_2    conda-forge
aws-c-common              0.9.10               hd590300_0    conda-forge
aws-c-compression         0.2.17               h7f92143_7    conda-forge
aws-c-event-stream        0.3.2                h0bcb0bb_8    conda-forge
aws-c-http                0.7.14               hd268abd_3    conda-forge
aws-c-io                  0.13.36              he0cd244_2    conda-forge
aws-c-mqtt                0.9.10               h35285c7_2    conda-forge
aws-c-s3                  0.4.4                h0448019_0    conda-forge
aws-c-sdkutils            0.1.13               h7f92143_0    conda-forge
aws-checksums             0.1.17               h7f92143_6    conda-forge
aws-sam-translator        1.86.0             pyhd8ed1ab_0    conda-forge
aws-xray-sdk              2.13.0             pyhd8ed1ab_0    conda-forge
awscli                    2.15.28         py310hff52083_0    conda-forge
awscrt                    0.19.19         py310h43b4219_2    conda-forge
azure-core                1.30.1             pyhd8ed1ab_0    conda-forge
azure-identity            1.15.0             pyhd8ed1ab_0    conda-forge
babel                     2.14.0             pyhd8ed1ab_0    conda-forge
bash                      5.2.21               h7f99829_0    conda-forge
bash-completion           2.11                 ha770c72_1    conda-forge
bc                        1.07.1               h7f98852_0    conda-forge
bcrypt                    4.1.2           py310hcb5633a_0    conda-forge
binutils                  2.40                 hdd6e379_0    conda-forge
binutils_impl_linux-64    2.40                 hf600244_0    conda-forge
bison                     3.8.2                h59595ed_0    conda-forge
blinker                   1.7.0              pyhd8ed1ab_0    conda-forge
boltons                   23.1.1             pyhd8ed1ab_0    conda-forge
boto3                     1.34.61            pyhd8ed1ab_1    conda-forge
boto3-stubs               1.34.61            pyhd8ed1ab_0    conda-forge
botocore                  1.34.61         pyge310_1234567_0    conda-forge
botocore-stubs            1.34.61            pyhd8ed1ab_0    conda-forge
brotli                    1.1.0                hd590300_1    conda-forge
brotli-bin                1.1.0                hd590300_1    conda-forge
brotli-python             1.1.0           py310hc6cd4ac_1    conda-forge
bzip2                     1.0.8                hd590300_5    conda-forge
c-ares                    1.27.0               hd590300_0    conda-forge
ca-certificates           2024.2.2             hbcca054_0    conda-forge
cachecontrol              0.14.0             pyhd8ed1ab_0    conda-forge
cachecontrol-with-filecache 0.14.0             pyhd8ed1ab_0    conda-forge
cachy                     0.3.0              pyhd8ed1ab_1    conda-forge
cairo                     1.18.0               h3faef2a_0    conda-forge
certifi                   2024.2.2           pyhd8ed1ab_0    conda-forge
cffi                      1.16.0          py310h2fee648_0    conda-forge
cfgv                      3.3.1              pyhd8ed1ab_0    conda-forge
cfn-lint                  0.86.0             pyhd8ed1ab_0    conda-forge
charset-normalizer        3.3.2              pyhd8ed1ab_0    conda-forge
clang-format              17.0.6          default_hb11cfb5_3    conda-forge
clang-format-17           17.0.6          default_hb11cfb5_3    conda-forge
clang-tools               17.0.6          default_hb11cfb5_3    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
click-default-group       1.2.4              pyhd8ed1ab_0    conda-forge
clikit                    0.6.2              pyhd8ed1ab_2    conda-forge
cloudpickle               3.0.0              pyhd8ed1ab_0    conda-forge
cmake                     3.26.3               h077f3f9_0    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
conda                     23.9.0          py310hff52083_2    conda-forge
conda-gcc-specs           13.2.0               h6a59387_5    conda-forge
conda-lock                1.4.0              pyhd8ed1ab_2    conda-forge
conda-package-handling    2.2.0              pyh38be061_0    conda-forge
conda-package-streaming   0.9.0              pyhd8ed1ab_0    conda-forge
conda-standalone          24.1.2               ha770c72_0    conda-forge
conda-tree                1.1.0              pyhd8ed1ab_2    conda-forge
constructor               3.7.0              pyh55f8243_0    conda-forge
contourpy                 1.2.0           py310hd41b1e2_0    conda-forge
coreutils                 9.4                  hd590300_0    conda-forge
crashtest                 0.4.1              pyhd8ed1ab_0    conda-forge
cryptography              40.0.2          py310h34c0648_0    conda-forge
ctags                     5.8               h14c3975_1000    conda-forge
curl                      7.88.1               hdc1c0ab_1    conda-forge
cycler                    0.12.1             pyhd8ed1ab_0    conda-forge
dbus                      1.13.6               h5008d03_3    conda-forge
diffutils                 3.10                 hf18258e_0    conda-forge
distlib                   0.3.8              pyhd8ed1ab_0    conda-forge
distro                    1.8.0              pyhd8ed1ab_0    conda-forge
docker-py                 7.0.0              pyhd8ed1ab_0    conda-forge
docutils                  0.19            py310hff52083_1    conda-forge
doit                      0.36.0             pyhd8ed1ab_0    conda-forge
dtc                       1.6.1                h166bdaf_2    conda-forge
ecdsa                     0.18.0             pyhd8ed1ab_1    conda-forge
elfutils                  0.187                h989201e_0    conda-forge
ensureconda               1.4.4              pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.0              pyhd8ed1ab_2    conda-forge
expat                     2.6.1                h59595ed_0    conda-forge
expect                    5.45.4               h555a92e_0    conda-forge
fab-classic               1.19.2                   pypi_0    pypi
file                      5.39                 h753d276_1    conda-forge
filelock                  3.13.1             pyhd8ed1ab_0    conda-forge
findutils                 4.6.0             h166bdaf_1001    conda-forge
flask                     3.0.2              pyhd8ed1ab_0    conda-forge
flask_cors                3.0.10             pyhd3deb0d_0    conda-forge
flex                      2.6.4             h58526e2_1004    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_1    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.49.0          py310h2372a71_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
frozenlist                1.4.1           py310h2372a71_0    conda-forge
fsspec                    2024.2.0           pyhca7485f_0    conda-forge
gcc                       13.2.0               hd6cf55c_3    conda-forge
gcc_impl_linux-64         13.2.0               h338b0a0_5    conda-forge
gdk-pixbuf                2.42.10              h829c605_5    conda-forge
gdspy                     1.4                      pypi_0    pypi
gengetopt                 2.23                 h9c3ff4c_0    conda-forge
gettext                   0.21.1               h27087fc_0    conda-forge
giflib                    5.2.1                h0b41bf4_3    conda-forge
git                       2.44.0          pl5321h709897a_0    conda-forge
gitdb                     4.0.11             pyhd8ed1ab_0    conda-forge
gitpython                 3.1.42             pyhd8ed1ab_0    conda-forge
gmp                       6.3.0                h59595ed_1    conda-forge
gmpy2                     2.1.2           py310h3ec546c_1    conda-forge
gnutls                    3.7.9                hb077bed_0    conda-forge
graphite2                 1.3.13            h58526e2_1001    conda-forge
graphql-core              3.2.3              pyhd8ed1ab_0    conda-forge
graphviz                  9.0.0                h78e8752_1    conda-forge
gtk2                      2.24.33              h280cfa0_4    conda-forge
gts                       0.7.6                h977cf35_4    conda-forge
gxx                       13.2.0               hd6cf55c_3    conda-forge
gxx_impl_linux-64         13.2.0               h338b0a0_5    conda-forge
gzip                      1.13                 hd590300_0    conda-forge
hammer-vlsi               1.2.0                    pypi_0    pypi
harfbuzz                  8.3.0                h3d44ed6_0    conda-forge
html5lib                  1.1                pyh9f0ad1d_0    conda-forge
humanfriendly             10.0               pyhd8ed1ab_6    conda-forge
icontract                 2.6.6                    pypi_0    pypi
icu                       73.2                 h59595ed_0    conda-forge
identify                  2.5.35             pyhd8ed1ab_0    conda-forge
idna                      3.6                pyhd8ed1ab_0    conda-forge
imagesize                 1.4.1              pyhd8ed1ab_0    conda-forge
importlib-metadata        7.0.2              pyha770c72_0    conda-forge
importlib_metadata        7.0.2                hd8ed1ab_0    conda-forge
importlib_resources       6.3.0              pyhd8ed1ab_0    conda-forge
iniconfig                 2.0.0              pyhd8ed1ab_0    conda-forge
itsdangerous              2.1.2              pyhd8ed1ab_0    conda-forge
jaraco.classes            3.3.1              pyhd8ed1ab_0    conda-forge
jeepney                   0.8.0              pyhd8ed1ab_0    conda-forge
jinja2                    3.1.3              pyhd8ed1ab_0    conda-forge
jmespath                  1.0.1              pyhd8ed1ab_0    conda-forge
joserfc                   0.9.0              pyhd8ed1ab_0    conda-forge
jq                        1.7.1                hd590300_0    conda-forge
jschema-to-python         1.2.3              pyhd8ed1ab_0    conda-forge
jsondiff                  2.0.0              pyhd8ed1ab_0    conda-forge
jsonpatch                 1.33               pyhd8ed1ab_0    conda-forge
jsonpickle                3.0.2              pyhd8ed1ab_1    conda-forge
jsonpointer               2.4             py310hff52083_3    conda-forge
jsonschema                4.21.1             pyhd8ed1ab_0    conda-forge
jsonschema-path           0.3.2              pyhd8ed1ab_0    conda-forge
jsonschema-specifications 2023.7.1           pyhd8ed1ab_0    conda-forge
junit-xml                 1.9                pyh9f0ad1d_0    conda-forge
kernel-headers_linux-64   3.10.0              h4a8ded7_14    conda-forge
keyring                   24.3.1          py310hff52083_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5           py310hd41b1e2_1    conda-forge
krb5                      1.20.1               h81ceb04_0    conda-forge
lazy-object-proxy         1.10.0          py310h2372a71_0    conda-forge
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libabseil                 20240116.1      cxx17_h59595ed_2    conda-forge
libarchive                3.5.2                hada088e_3    conda-forge
libblas                   3.9.0           21_linux64_openblas    conda-forge
libbrotlicommon           1.1.0                hd590300_1    conda-forge
libbrotlidec              1.1.0                hd590300_1    conda-forge
libbrotlienc              1.1.0                hd590300_1    conda-forge
libcblas                  3.9.0           21_linux64_openblas    conda-forge
libclang-cpp17            17.0.6          default_hb11cfb5_3    conda-forge
libclang13                17.0.6          default_ha2b6cf4_3    conda-forge
libcups                   2.3.3                h36d4200_3    conda-forge
libcurl                   7.88.1               hdc1c0ab_1    conda-forge
libdeflate                1.19                 hd590300_0    conda-forge
libdwarf                  0.0.0.20190110_28_ga81397fc4      h753d276_0    ucb-bar
libdwarf-dev              0.0.0.20190110_28_ga81397fc4      h753d276_0    ucb-bar
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libexpat                  2.6.1                h59595ed_0    conda-forge
libfdt                    1.6.1                h166bdaf_2    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-devel_linux-64     13.2.0             ha9c7c90_105    conda-forge
libgcc-ng                 13.2.0               h807b86a_5    conda-forge
libgcrypt                 1.10.3               hd590300_0    conda-forge
libgd                     2.3.3                h119a65a_9    conda-forge
libgfortran-ng            13.2.0               h69a702a_5    conda-forge
libgfortran5              13.2.0               ha4646dd_5    conda-forge
libgirepository           1.78.1               h003a4f0_1    conda-forge
libglib                   2.80.0               hf2295e7_0    conda-forge
libgomp                   13.2.0               h807b86a_5    conda-forge
libgpg-error              1.48                 h71f35ed_0    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libidn2                   2.3.7                hd590300_0    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
liblapack                 3.9.0           21_linux64_openblas    conda-forge
libllvm17                 17.0.6               hb3ce162_1    conda-forge
libmagic                  5.39                 h753d276_1    conda-forge
libmicrohttpd             0.9.77               h97afed2_0    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libopenblas               0.3.26          pthreads_h413a1c8_0    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libprotobuf               4.25.3               h08a7969_0    conda-forge
librsvg                   2.56.3               he3f83f7_1    conda-forge
libsanitizer              13.2.0               h7e041cc_5    conda-forge
libsecret                 0.18.8               h329b89f_2    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libsqlite                 3.45.2               h2797004_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-devel_linux-64  13.2.0             ha9c7c90_105    conda-forge
libstdcxx-ng              13.2.0               h7e041cc_5    conda-forge
libtasn1                  4.19.0               h166bdaf_0    conda-forge
libtiff                   4.6.0                ha9c0a0a_2    conda-forge
libunistring              0.9.10               h7f98852_0    conda-forge
libusb1                   2.0.1              pyhd8ed1ab_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libuv                     1.48.0               hd590300_0    conda-forge
libwebp                   1.3.2                h658648e_1    conda-forge
libwebp-base              1.3.2                hd590300_0    conda-forge
libxcb                    1.15                 h0b41bf4_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxml2                   2.12.5               h232c23b_0    conda-forge
libzlib                   1.2.13               hd590300_5    conda-forge
livereload                2.6.3              pyh9f0ad1d_0    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
lzo                       2.10              h516909a_1000    conda-forge
lzop                      1.04                 h3753786_2    conda-forge
m4                        1.4.18            h516909a_1001    conda-forge
make                      4.3                  hd18ef5c_1    conda-forge
markupsafe                2.1.5           py310h2372a71_0    conda-forge
matplotlib-base           3.8.3           py310h62c0568_0    conda-forge
mock                      5.1.0                    pypi_0    pypi
more-itertools            10.2.0             pyhd8ed1ab_0    conda-forge
mosh                      1.4.0           pl5321h7cc048c_8    conda-forge
moto                      5.0.3              pyhd8ed1ab_0    conda-forge
mpc                       1.3.1                hfe3b2da_0    conda-forge
mpfr                      4.2.1                h9458935_0    conda-forge
mpmath                    1.3.0              pyhd8ed1ab_0    conda-forge
msal                      1.27.0             pyhd8ed1ab_0    conda-forge
msal_extensions           1.1.0           py310hff52083_1    conda-forge
msgpack-python            1.0.7           py310hd41b1e2_0    conda-forge
multidict                 6.0.5           py310h2372a71_0    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
mypy                      1.9.0           py310h2372a71_0    conda-forge
mypy-boto3-s3             1.34.14            pyhd8ed1ab_0    conda-forge
mypy_boto3_ec2            1.34.61            pyhd8ed1ab_0    conda-forge
mypy_extensions           1.0.0              pyha770c72_0    conda-forge
ncurses                   6.4                  h59595ed_2    conda-forge
nettle                    3.9.1                h7ab15ed_0    conda-forge
networkx                  3.2.1              pyhd8ed1ab_0    conda-forge
nodeenv                   1.8.0              pyhd8ed1ab_0    conda-forge
numpy                     1.26.4          py310hb13e2d6_0    conda-forge
oniguruma                 6.9.9                hd590300_0    conda-forge
open_pdks.sky130a         1.0.471_0_g97d0844 20240223_100318    litex-hub
openapi-schema-validator  0.6.2              pyhd8ed1ab_0    conda-forge
openapi-spec-validator    0.7.1              pyhd8ed1ab_0    conda-forge
openjdk                   20.0.2               haa376d0_2    conda-forge
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openssl                   3.2.1                hd590300_0    conda-forge
p11-kit                   0.24.1               hc5aa10d_0    conda-forge
packaging                 24.0               pyhd8ed1ab_0    conda-forge
pandas                    2.2.1           py310hcc13569_0    conda-forge
pango                     1.52.1               ha41ecd1_0    conda-forge
paramiko                  3.4.0              pyhd8ed1ab_0    conda-forge
paramiko-ng               2.8.10                   pypi_0    pypi
pastel                    0.2.1              pyhd8ed1ab_0    conda-forge
patch                     2.7.6             h7f98852_1002    conda-forge
pathable                  0.4.3              pyhd8ed1ab_0    conda-forge
pbr                       6.0.0              pyhd8ed1ab_0    conda-forge
pcre2                     10.43                hcad00b1_0    conda-forge
perl                      5.32.1          7_hd590300_perl5    conda-forge
pillow                    10.2.0          py310h01dd4db_0    conda-forge
pip                       24.0               pyhd8ed1ab_0    conda-forge
pixman                    0.43.2               h59595ed_0    conda-forge
pkginfo                   1.10.0             pyhd8ed1ab_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
platformdirs              4.2.0              pyhd8ed1ab_0    conda-forge
pluggy                    1.4.0              pyhd8ed1ab_0    conda-forge
popt                      1.16              h0b475e3_2002    conda-forge
portalocker               2.8.2           py310hff52083_1    conda-forge
pre-commit                3.6.2              pyha770c72_0    conda-forge
prompt-toolkit            3.0.38             pyha770c72_0    conda-forge
prompt_toolkit            3.0.38               hd8ed1ab_0    conda-forge
psutil                    5.9.8           py310h2372a71_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pyasn1                    0.5.1              pyhd8ed1ab_0    conda-forge
pycairo                   1.26.0          py310hda9f760_0    conda-forge
pycosat                   0.6.6           py310h2372a71_0    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pydantic                  1.10.14                  pypi_0    pypi
pydantic-core             2.16.3          py310hcb5633a_0    conda-forge
pygments                  2.17.2             pyhd8ed1ab_0    conda-forge
pygobject                 3.48.1          py310h30b043a_0    conda-forge
pyjwt                     2.8.0              pyhd8ed1ab_1    conda-forge
pylddwrap                 1.2.2                    pypi_0    pypi
pylev                     1.4.0              pyhd8ed1ab_0    conda-forge
pynacl                    1.5.0           py310h2372a71_3    conda-forge
pyopenssl                 23.1.1             pyhd8ed1ab_0    conda-forge
pyparsing                 3.1.2              pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
pytest                    8.1.1              pyhd8ed1ab_0    conda-forge
pytest-dependency         0.5.1              pyh9f0ad1d_0    conda-forge
pytest-mock               3.12.0             pyhd8ed1ab_0    conda-forge
python                    3.10.13         hd12c33a_1_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-graphviz           0.20.1             pyh22cad53_0    conda-forge
python-jose               3.3.0              pyh6c4a22f_1    conda-forge
python-tzdata             2024.1             pyhd8ed1ab_0    conda-forge
python_abi                3.10                    4_cp310    conda-forge
pytz                      2024.1             pyhd8ed1ab_0    conda-forge
pywin32-on-windows        0.1.0              pyh1179c8e_3    conda-forge
pyyaml                    6.0.1           py310h2372a71_1    conda-forge
qemu                      5.0.0                hb15d774_0    ucb-bar
readline                  8.2                  h8228510_1    conda-forge
referencing               0.30.2             pyhd8ed1ab_0    conda-forge
regex                     2023.12.25      py310h2372a71_0    conda-forge
requests                  2.31.0             pyhd8ed1ab_0    conda-forge
responses                 0.25.0             pyhd8ed1ab_0    conda-forge
rfc3339-validator         0.1.4              pyhd8ed1ab_0    conda-forge
rhash                     1.4.3                hd590300_2    conda-forge
riscv-tools               1.0.6           0_h1234567_g56c29e0    ucb-bar
rpds-py                   0.18.0          py310hcb5633a_0    conda-forge
rsa                       4.9                pyhd8ed1ab_0    conda-forge
rsync                     3.2.7                h70740c4_0    conda-forge
ruamel-yaml               0.17.40                  pypi_0    pypi
ruamel.yaml.clib          0.2.7           py310h2372a71_2    conda-forge
s2n                       1.4.0                h06160fa_0    conda-forge
s3fs                      0.4.2                      py_0    conda-forge
s3transfer                0.10.0             pyhd8ed1ab_0    conda-forge
sarif-om                  1.0.4              pyhd8ed1ab_0    conda-forge
sbt                       1.9.7                hd8ed1ab_0    conda-forge
screen                    4.8.0                he28a2e2_0    conda-forge
secretstorage             3.3.3           py310hff52083_2    conda-forge
sed                       4.8                  he412f7d_0    conda-forge
setuptools                69.2.0             pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
smmap                     5.0.0              pyhd8ed1ab_0    conda-forge
snowballstemmer           2.2.0              pyhd8ed1ab_0    conda-forge
sphinx                    7.2.6              pyhd8ed1ab_0    conda-forge
sphinx-autobuild          2024.2.4           pyhd8ed1ab_0    conda-forge
sphinx_rtd_theme          2.0.0              pyha770c72_0    conda-forge
sphinxcontrib-applehelp   1.0.8              pyhd8ed1ab_0    conda-forge
sphinxcontrib-devhelp     1.0.6              pyhd8ed1ab_0    conda-forge
sphinxcontrib-htmlhelp    2.0.5              pyhd8ed1ab_0    conda-forge
sphinxcontrib-jquery      4.1                pyhd8ed1ab_0    conda-forge
sphinxcontrib-jsmath      1.0.1              pyhd8ed1ab_0    conda-forge
sphinxcontrib-qthelp      1.0.7              pyhd8ed1ab_0    conda-forge
sphinxcontrib-serializinghtml 1.1.10             pyhd8ed1ab_0    conda-forge
sqlite                    3.45.2               h2c6b66d_0    conda-forge
sshpubkeys                3.3.1              pyhd8ed1ab_0    conda-forge
sty                       1.0.0              pyhd8ed1ab_0    conda-forge
sure                      2.0.1                    pypi_0    pypi
sympy                     1.12            pypyh9d50eac_103    conda-forge
sysroot_linux-64          2.17                h4a8ded7_14    conda-forge
tar                       1.34                 hb2e2bae_1    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
tomlkit                   0.12.4             pyha770c72_0    conda-forge
toolz                     0.12.1             pyhd8ed1ab_0    conda-forge
tornado                   6.4             py310h2372a71_0    conda-forge
tqdm                      4.66.2             pyhd8ed1ab_0    conda-forge
truststore                0.8.0              pyhd8ed1ab_0    conda-forge
types-awscrt              0.20.5             pyhd8ed1ab_0    conda-forge
types-pytz                2024.1.0.20240203    pyhd8ed1ab_0    conda-forge
types-pyyaml              6.0.12.20240311    pyhd8ed1ab_0    conda-forge
types-requests            2.31.0.6           pyhd8ed1ab_0    conda-forge
types-s3transfer          0.10.0                   pypi_0    pypi
types-urllib3             1.26.25.14         pyhd8ed1ab_0    conda-forge
typing-extensions         4.10.0               hd8ed1ab_0    conda-forge
typing_extensions         4.10.0             pyha770c72_0    conda-forge
tzdata                    2024a                h0c530f3_0    conda-forge
ukkonen                   1.0.1           py310hd41b1e2_4    conda-forge
unicodedata2              15.1.0          py310h2372a71_0    conda-forge
unzip                     6.0                  h7f98852_3    conda-forge
urllib3                   1.26.18            pyhd8ed1ab_0    conda-forge
verilator                 5.022                h7cd9344_0    conda-forge
vim                       9.1.0041        py310pl5321he660f0e_0    conda-forge
virtualenv                20.25.1            pyhd8ed1ab_0    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
webencodings              0.5.1              pyhd8ed1ab_2    conda-forge
websocket-client          1.7.0              pyhd8ed1ab_0    conda-forge
werkzeug                  3.0.1              pyhd8ed1ab_0    conda-forge
wget                      1.20.3               ha35d2d1_1    conda-forge
wheel                     0.42.0             pyhd8ed1ab_0    conda-forge
which                     2.21                 h0b41bf4_1    conda-forge
wrapt                     1.16.0          py310h2372a71_0    conda-forge
xmltodict                 0.13.0             pyhd8ed1ab_0    conda-forge
xorg-fixesproto           5.0               h7f98852_1002    conda-forge
xorg-inputproto           2.3.2             h7f98852_1002    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.1.1                hd590300_0    conda-forge
xorg-libsm                1.2.4                h7391055_0    conda-forge
xorg-libx11               1.8.7                h8ee46fc_0    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxfixes            5.0.3             h7f98852_1004    conda-forge
xorg-libxi                1.7.10               h7f98852_0    conda-forge
xorg-libxrender           0.9.11               hd590300_0    conda-forge
xorg-libxt                1.3.0                hd590300_1    conda-forge
xorg-libxtst              1.2.3             h7f98852_1002    conda-forge
xorg-recordproto          1.14.2            h7f98852_1002    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xxhash                    0.8.0                h7f98852_3    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
yarl                      1.9.4           py310h2372a71_0    conda-forge
zipp                      3.17.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               hd590300_5    conda-forge
zstandard                 0.22.0          py310h1275a96_0    conda-forge
zstd                      1.5.5                hfc55251_0    conda-forge

Other Setup

Followed the "setting up the repository" guide, and added this to PeripheralDeviceConfigs.scala:

class TapNICRocketConfig extends Config(
  new chipyard.harness.WithSimNetwork ++
  new icenet.WithIceNIC ++
  new freechips.rocketchip.subsystem.WithNBigCores(1) ++
  new chipyard.config.AbstractConfig)

Current Behavior

When I run make -C sims/verilator CONFIG=TapNICRocketConfig VERILATOR_THREADS=N with any N larger than 1, I end up with a simulator program that dies with a SIGSEGV almost as soon as I launch it.

As a consequence (and, to address what I'm really after), running the pingd.c test results in a ~2s ping on my system. That's longer than the default interval of a second, which means that running ping with no arguments against the single-threaded RTL simulation is an effective DOS strategy as it sends ICMP echo requests at about 2x the throughput the simulator can maintain.

Expected Behavior

Simply, I expected to be able to "throw more threads at it," as most of the time seems to be going to front-end stalls due to icache misses, something that multiple threads address nicely by way of expanding the effective available icache space.

More broadly, I suppose I expected there to be a way to get to workable performance of the RTL model for functional simulation of simple network nodes without custom hardware, proprietary software, or an FPGA-equipped cloud instance. Perhaps my expectation that such a path exists through verilator is worth discussing here, too?

Other Information

coredumpctl debug suggests this is because the thread-local context_t isn't fully initialized in all threads:

Core was generated by `/home/seth/Code/src/github.com/ucb-bar/chipyard/sims/verilator/simulator-chipya'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000056067f92ac60 in context_t::switch_to (this=0x56068194f370) at ../fesvr/context.cc:86
86        cur = this;
[Current thread is 1 (Thread 0x7fece98bb6c0 (LWP 4017918))]
(gdb) bt
#0  0x000056067f92ac60 in context_t::switch_to (this=0x56068194f370) at ../fesvr/context.cc:86
#1  0x000056067f3dda89 in network_tick ()
#2  0x000056067f555b96 in VTestDriver___024unit____Vdpiimwrap_network_tick_TOP____024unit(unsigned char, unsigned long, unsigned char, unsigned char&, unsigned long&, unsigned char&, unsigned long&) ()
#3  0x000056067f739e90 in VTestDriver___024root___nba_sequent__TOP__1899(VTestDriver___024root*) ()
#4  0x000056067f44d8b6 in VTestDriver___024root____Vthread__nba__2(void*, bool) ()
#5  0x000056067f415cb1 in VlWorkerThread::workerLoop() ()
#6  0x00007feceb8f0e95 in std::execute_native_thread_routine (__p=<optimized out>)
    at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#7  0x00007feceb5da55a in ?? () from /usr/lib/libc.so.6
#8  0x00007feceb657a5c in ?? () from /usr/lib/libc.so.6
(gdb) p this
$1 = (context_t * const) 0x56068194f370
(gdb) p cur
$2 = (context_t *) 0x0

I'm not familiar with the usercontext patterns/types used by the fesvr to implement what appears to be "green threads" (really, co-routines), but I do see in the verilator docs that in multithreaded verilator it's the verilated model which creates and manages all N-1 threads except for whatever called eval:

With --threads {N}, where N is at least 2, the generated model will be designed to run in parallel on N threads. The thread calling eval() provides one of those threads, and the generated model will create and manage the other N-1 threads.
...
When making frequent use of DPI imported functions in a multithreaded model, it may be beneficial to performance to adjust the --instr-count-dpi option based on some experimentation. This influences the partitioning of the model by adjusting the assumed execution time of DPI imports.

And the latter bit suggests to me that while under some conditions the DPI functions may be called from the same threads, I see no guarantees that DPI from the always blocks in a module will be called using the same thread as the initial blocks (which the current implementation implicitly assumes). I suspect that one more degree of refinement around "the model's threads ought to be more compatible with the fesvr's coroutine implementation backing the DPI SimNetwork implementation" would help here, not least in identifying whether this is even a crash that chipyard itself has any leverage over.

I'm opening this here because I believe it's the right home for this issue, since it seems that even if verilator or fesvr exposed a callback or config option that would affect the outcome there'd still need to be a change here in chipyard to take advantage of it, but I'm very new to this space and would welcome your guidance.

@sethp sethp added the bug label May 24, 2024
@sethp
Author

sethp commented May 24, 2024

Oh, I also meant to mention: I found someone on the mailing list with a similar-looking symptom ( https://groups.google.com/g/chipyard/c/i0pNR4t8HFA/m/NBMP4fcsAQAJ ), but given that they reported 1) the crash occurred in what appears to be an initial block rather than the nba driving the network_tick, and 2) solving the problem by changing the name of the connected bus I suspect that is a different issue, though perhaps still somewhere in SimNetwork.cc & friends.

@jerryz123
Contributor

I wonder if adding a lock to the network_tick and network_init functions would be sufficient.

@sethp
Author

sethp commented May 24, 2024

I think it depends on how you mean: I noticed the cur that's 0x0 in gdb there is a static __thread context_t* cur;; since it's a thread-local, I read that as saying the faulting thread's copy of that storage is uninitialized. No other thread can (well, should) access the faulting thread's local storage, so locking to wait for it to be constructed would probably just hang.
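
The thread-local behavior described above can be reproduced in isolation. This is a minimal sketch, assuming a hypothetical `fake_context` type and helper names that are not part of fesvr; it only shows that a pointer assigned to a thread-local in one thread is still null in a thread that never assigned it.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for fesvr's context_t; only the thread-local matters here.
struct fake_context { int id; };

// Same shape as `static __thread context_t* cur;`: one copy per OS thread,
// zero-initialized in every thread that has not assigned it.
static thread_local fake_context* cur = nullptr;

// Assign the calling thread's copy only.
void warm_current_thread(fake_context* c) { cur = c; }

// Start a fresh thread and report whether its copy of `cur` is still null.
bool other_thread_sees_null() {
  bool saw_null = false;
  std::thread t([&] { saw_null = (cur == nullptr); });
  t.join();
  return saw_null;
}
```

Calling `warm_current_thread` and then `other_thread_sees_null()` returns true: no amount of waiting in the second thread makes the first thread's assignment visible there, which is why a lock alone would not help.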

If you mean "instead of using a ucontext/coroutine thing, set up a cond-with-lock in network_init that parks a pthread until signaled by network_tick", then I think there's a potentially fruitful path there: there's even a bit of an example implementation in the fesvr's context_t as an alternative to using a ucontext, albeit one that doesn't look directly usable.
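
A rough sketch of that "park a thread on a condition variable" alternative follows. Everything here (the `ParkedWorker` class, its members, and the step counter standing in for device work) is hypothetical and is not the fesvr implementation; it only illustrates that the handoff works no matter which OS thread calls `tick`.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical device: a dedicated OS thread stands in for the ucontext coroutine.
class ParkedWorker {
 public:
  ParkedWorker() : worker_([this] { run(); }) {}
  ~ParkedWorker() {
    {
      std::lock_guard<std::mutex> lk(m_);
      stop_ = true;
    }
    cv_.notify_all();
    worker_.join();
  }

  // Callable from any thread, one caller at a time: wake the worker, wait for one step.
  int tick() {
    std::unique_lock<std::mutex> lk(m_);
    pending_ = true;
    cv_.notify_all();
    cv_.wait(lk, [this] { return !pending_; });
    return steps_;
  }

 private:
  // Worker loop: stay parked until tick() posts work or the destructor asks to stop.
  void run() {
    std::unique_lock<std::mutex> lk(m_);
    for (;;) {
      cv_.wait(lk, [this] { return pending_ || stop_; });
      if (stop_) return;
      ++steps_;  // one unit of device work
      pending_ = false;
      cv_.notify_all();
    }
  }

  std::mutex m_;
  std::condition_variable cv_;
  bool pending_ = false;
  bool stop_ = false;
  int steps_ = 0;
  std::thread worker_;  // declared last so the other members exist before run() starts
};
```

The cost of this design is a real OS thread and two wakeups per tick, in exchange for having no thread-local state to go stale.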

Between repair and replace, I'm personally leaning towards trying to figure out what the fesvr's context_t thing wants, since there's a handful of other uses of it in chipyard already (e.g. the spike tile) that would either suffer from a similar issue or possibly provide a solution. I'm hoping to continue posting my notes here as I learn my way through the fesvr and verilator threading models, unless you'd rather I didn't, of course!

@jerryz123
Contributor

unless you'd rather I didn't, of course!

I suspect the solution to the problem does not require messing around in context_t. I believe several other devices use FESVR's context_t and behave correctly in multithreaded sims.

@sethp
Author

sethp commented May 28, 2024

Perhaps! I did see that other devices made use of context_t, which is what leads me to want to understand the problem a little better. I found certain --threads counts do in fact produce a simulation that works for a given model using SimNetwork; not just 1 (which always works), but sometimes the model will work indefinitely with --threads 2 and crash immediately with --threads 3.

Two especially relevant details I've noticed so far:

  1. The verilator multithreading model appears to schedule micro-tasks statically; i.e. the same thread always resolves the same DPI-C call for a given model
  2. The crash occurs inside context_t when the thread-local cur variable is NULL (0x0). cur usually looks like it gets initialized as a side-effect of calling init in that thread (for any context_t instance, I think?)

I think it's the combination of these two that's causing the behavior I'm seeing: when the scheduler happens to place the network_init and network_tick DPI calls into the same thread's work queue (P=100% with one thread, ~50% with two, ~33% with three, etc...), then there's no crash: network_init populates the thread-local, and network_tick uses it.

If I'm further right in saying that any call to static context_t::current() in a thread "pre-warms" it, then adding more independently-scheduled instances (like, say, 2x ice nics + a block device + 8 spike tiles), we might end up rapidly (but asymptotically) approaching a 100% chance that some initial block populates each thread's cur storage for a given number of threads. Which could account for (directly, or indirectly) your experience that multi-threaded sims with context_t work fine?
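
As a rough model of the speculation above, assume each DPI call site lands on one of N threads uniformly and independently. That is an assumption made here for illustration, not something the verilator partitioner promises. Under it, the chance that init and tick share a thread, and the chance that k independent init calls between them touch all N threads, would be:

```latex
P(\text{init and tick share a thread}) = \frac{1}{N},
\qquad
P(\text{all } N \text{ threads warmed by } k \text{ inits})
  = \sum_{j=0}^{N} (-1)^{j} \binom{N}{j} \left(1 - \frac{j}{N}\right)^{k}
```

For fixed N the second expression tends to 1 as k grows, which is the "rapidly but asymptotically" behavior described; for N = 2 it reduces to 1 - 2^{1-k}.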

@sethp
Author

sethp commented May 29, 2024

Hmm, well, yes and no to that last question. Adding a spike tile [1] did perturb the scheduler enough that the simulation worked at least once [2] with VERILATOR_THREADS=3:

[UART] UART0 is here (stdin/stdout).
network init (tid=198488)
No tap interface provided
Constructing spike processor_t (tid=198490)
Done constructing spike processor
network tick (tid=198488)
- /home/seth/Code/src/github.com/ucb-bar/chipyard/sims/verilator/generated-src/chipyard.harness.TestHarness.TapNICRocketConfig/gen-collateral/TestDriver.v:158: Verilog $finish

and failed with VERILATOR_THREADS=4:

network init (tid=205548)
No tap interface provided
Constructing spike processor_t (tid=205550)
Done constructing spike processor
network tick (tid=205551)
zsh: segmentation fault (core dumped)  ./sims/verilator/simulator-chipyard.harness-TapNICRocketConfig 

But, adding/removing cores doesn't change which threads do the initialization:

Constructing spike processor_t (tid=219082)
Done constructing spike processor
Constructing spike processor_t (tid=219082)
Done constructing spike processor
Constructing spike processor_t (tid=219082)
Done constructing spike processor
Constructing spike processor_t (tid=219082)
Done constructing spike processor

And, experimenting also provided a counterexample to my speculation that any context_t::init caller in a thread would suffice:

[UART] UART0 is here (stdin/stdout).
network init (tid=192456)
No tap interface provided
Constructing spike processor_t (tid=192457)
Done constructing spike processor
network tick (tid=192457)
zsh: segmentation fault (core dumped)  ./sims/verilator/simulator-chipyard.harness-TapNICRocketConfig 

Also, it seems that bdev is vulnerable to the same crash (as long as the sim is run with +blkdev=somefile, otherwise the blkdev never inits or ticks):

bdev init (tid=312407)
[UART] UART0 is here (stdin/stdout).
...
bdev tick (tid=312410)
zsh: segmentation fault (core dumped)  ./sims/verilator/simulator-chipyard.harness-TapNICRocketConfig +permissive   
$ coredumpctl debug
...
Core was generated by `./sims/verilator/simulator-chipyard.harness-TapNICRocketConfig +permissive +blk'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000556201ef4530 in context_t::switch_to (this=0x55620306f9e0) at ../fesvr/context.cc:86
86        cur = this;
[Current thread is 1 (Thread 0x7fcebc80c6c0 (LWP 312410))]
(gdb) bt
#0  0x0000556201ef4530 in context_t::switch_to (this=0x55620306f9e0) at ../fesvr/context.cc:86
#1  0x00005562015cecbe in block_device_tick ()
#2  0x00005562017a7206 in VTestDriver___024unit____Vdpiimwrap_block_device_tick_TOP____024unit(unsigned char, unsigned char&, unsigned char, unsigned int, unsigned int, unsigned int, unsigned char, unsigned char&, unsigned long, unsigned int, unsigned char&, unsigned char, unsigned long&, unsigned int&) ()
#3  0x00005562019a38b0 in VTestDriver___024root___nba_sequent__TOP__1888(VTestDriver___024root*) ()
#4  0x0000556201646d0f in VTestDriver___024root____Vthread__nba__2(void*, bool) ()
#5  0x000055620160ce51 in VlWorkerThread::workerLoop() ()
#6  0x00007fcebe86fe95 in std::execute_native_thread_routine (__p=<optimized out>)
    at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#7  0x00007fcebe53e55a in ?? () from /usr/lib/libc.so.6
#8  0x00007fcebe5bba5c in ?? () from /usr/lib/libc.so.6
(gdb) p cur
$1 = (context_t *) 0x0

It seems like the spike tile is in fact the odd one out; since it's driven from a single DPI-C entrypoint that inits itself on demand, it's not possible(?) for it to be scheduled on to two different threads by the Verilator microtask scheduler.

Footnotes

  1. so the new config is

    class TapNICRocketConfig extends Config(
    new chipyard.WithNSpikeCores(4) ++
    
    new chipyard.harness.WithSimNetwork ++
    new icenet.WithIceNIC ++
    new freechips.rocketchip.subsystem.WithNBigCores(1) ++
    new chipyard.config.AbstractConfig)
    
  2. I'm testing with:

    touch sims/verilator/generated-src/chipyard.harness.TestHarness.TapNICRocketConfig/gen-collateral/filelist.f && MAKEFLAGS=-j`nproc` make -C sims/verilator VERILATOR_THREADS=N CONFIG=TapNICRocketConfig && ./sims/verilator/simulator-chipyard.harness-TapNICRocketConfig toolchains/riscv-tools/riscv-tests/build/isa/rv64ui-p-simple
    

    for various values of N. I'm also using rv64ui-p-simple because it doesn't seem important which test program to use: whether it segfaults or not happens on the first network_tick call and is apparently independent of whatever the simulated binary does.

@sethp
Author

sethp commented May 29, 2024

Ah, one speculative code change later and, oh, hello:

[UART] UART0 is here (stdin/stdout).
network init (tid=536361)
No tap interface provided
network tick (tid=536364)
- /home/seth/Code/src/github.com/ucb-bar/chipyard/sims/verilator/generated-src/chipyard.harness.TestHarness.TapNICRocketConfig/gen-collateral/TestDriver.v:158: Verilog $finish

init and tick ran in different OS threads, and yet no crash! That result came via noticing that it was a very different crash in the case that someone had previously populated the thread-local.

In reading the code to try and identify how the control flow ought to return, I noticed that both the device and the switch had a pointer to another thread's local storage (they were both storing cur in a field). So, overriding some access modifiers so I could update them from network_tick:

  if (!netdev || !netsw) {
    fprintf(stderr, "You forgot to call network_init!");
    exit(1);
  }

+  netdev->target = context_t::current();
+  netsw->main = context_t::current();

  netdev->tick(out_valid, out_data, out_last);
  netdev->switch_to_host();

  netsw->distribute();
  netsw->switch_to_worker();

and, since context_t::current exists to populate the thread-local, it seems neither crash occurred.

I'm still not quite sure what all this means yet, nor what course of action it suggests, but I did find that result very interesting.

@sethp
Author

sethp commented May 31, 2024

Ok, so I've experimented a bit more, and I've come up with three potentially useful perspectives here:

  • SimNetwork (& SimBlockDevice) are mis-using context_t by stashing the pointer from context_t::current in a field during construction. That's always a pointer to a thread-local, and if the constructor is called in a different thread (as here), that gets very difficult to reason about. A change suggested by this perspective is an iteration on (but functionally the same as) "update in network tick" from above, maybe something like this in device.h:
    -    void switch_to_host(void) { host.switch_to(); }
    +    void switch_to_host(void) { target = context_t::current(); host.switch_to(); target = NULL; }
  • context_t is challenging to use correctly, especially in a threaded program. An idea for how to improve the surface a bit would be to promote prev to a thread-local itself and add something like:
    void context_t::yield() {
    #ifdef USE_UCONTEXT 
      if (swapcontext(cur->context.get(), prev->context.get()) != 0)
        abort();
    #else
      assert(false && "todo!");
      abort();
    #endif
    }
    which would allow NetworkDevice & NetworkSwitch a way to pass control flow back to the simulation without needing to stash their own context pointers. The full fix would also require changing context_t::switch_to so that it always initializes the thread-local if it's NULL (or, possibly better yet, just make cur ~ static __thread context_t cur so it's always valid memory)
  • verilator's threading model is already providing (effectively) stackless (micro-)tasks that can be efficiently distributed to threads. context_t does provide stackful coroutines, but the gain from using context_t might be relatively small for its cost in complexity (and the implicit syscall to set the signal mask every swap). The direction suggested here would be to refactor the tick functions to return after a single step (i.e. on the current loop back-edge); to my eye that looks fairly achievable for the NetworkDevice & NetworkSwitch after shuffling a few stack-locals to become class members or shim'd-in arguments. BlockDevice would be even simpler, and I don't see any other uses of fesvr/context.h in my clone of the chipyard repo at this time (although I only have the default list of submodules from ./build-setup.sh).
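
The third direction, a tick that returns after a single step, can be sketched on a toy device. `EchoDevice` and its states are hypothetical and far simpler than the real NetworkDevice; the point is only the mechanical move of coroutine locals and the implicit program counter into class members.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical device, not the real NetworkDevice: echoes each word it receives.
class EchoDevice {
 public:
  // One call per simulated cycle; returns true when `out` carries a valid word.
  bool tick(bool in_valid, uint64_t in_data, uint64_t& out) {
    if (in_valid) pending_.push_back(in_data);
    switch (state_) {
      case State::Idle:
        if (!pending_.empty()) state_ = State::Respond;
        return false;
      case State::Respond:
        out = pending_.front();
        pending_.pop_front();
        if (pending_.empty()) state_ = State::Idle;
        return true;
    }
    return false;
  }

 private:
  enum class State { Idle, Respond };
  State state_ = State::Idle;     // replaces the coroutine's saved program counter
  std::deque<uint64_t> pending_;  // replaces a local that lived on the coroutine's stack
};
```

Because every call returns to its caller on the caller's own stack, nothing in this shape depends on which OS thread runs a given tick.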

I'd be lying if I said I didn't see the last as the most pragmatic option. I also feel like it's a bit of a loss: I've really enjoyed learning about the ucontext_t stuff, and I appreciate the utility of being able to multiplex lightweight user tasks over the same thread. That said, I spend a lot of time learning about weird corners of computing, and it was still very strange to me on first encounter. That to me is an important signal about code accessibility, and for the same reason non-local control flow is... well, best used sparingly.

I believe I can fill out any of those three directions into a full-fledged PR to the upstream(s) in question here (IceNet & testchipip, or riscv-isa-sim). Do any of them especially call to you, @jerryz123 ? I see that you've done a lot of this work, so I suspect you'd have a better sense for the overall context (pun intended).

@jerryz123
Contributor

SimNetwork (& SimBlockDevice) are mis-using context_t by stashing the pointer from context_t::current in a field during construction. That's always a pointer to a thread-local, and if the constructor is called in a different thread (as here), that gets very difficult to reason about.

While I agree with your reasoning here, I don't think it's reasonable to expect/require SimDevice implementations to be thread-safe, where the constructor/tick functions can be called from distinct threads. As far as I can tell, this quirk only appears with Verilator multi-threading. The other simulators don't do this, even with multithreading enabled.

context_t is challenging to use correctly, especially in a threaded program. An idea for how to improve the surface a bit would be to promote prev to a thread-local itself and add something like:

Does this generalize to systems with multiple contexts? IMO it's better to require the programmer to explicitly specify the next context to execute. There are use-cases of context_t which have multiple contexts (not just target/host).

The direction suggested here would be to refactor the tick functions to return after a single step

The htif/tsi mechanism uses context_t, but I believe the implementation is buried within the static FESVR library, which is compiled as part of spike (Spike uses htif and context_t as well in its own simulation loop).
The FireSim FPGA emulation driver also heavily uses context_t.

Another example is the tick function for SpikeTile, which allows the Spike C++ core model to interact with the Chipyard RTL simulation.


I couldn't think of a way to make that system work without context_t.

My belief here is that the init/tick functions being called from separate threads is a Verilator-specific quirk that we should work around with minimal disruption to existing other code/interfaces. Perhaps the simplest thing is to merge network_tick and network_init, and make the tick function initialize the devices on-demand?
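
A minimal sketch of the "initialize on demand inside tick" idea, assuming a hypothetical `LazyDevice` and a single global like the existing netdev pointer. It also assumes calls to the entrypoint never overlap in time. The names are illustrative and do not match the real SimNetwork code.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical device; records which OS thread constructed it.
struct LazyDevice {
  std::thread::id init_thread = std::this_thread::get_id();
  int ticks = 0;
};

static std::unique_ptr<LazyDevice> g_dev;  // hypothetical global, analogous to netdev

// Single entrypoint: construct on first use, in whatever thread calls tick.
// Assumes calls never overlap, so no lock is taken here.
int device_tick() {
  if (!g_dev) g_dev = std::make_unique<LazyDevice>();
  return ++g_dev->ticks;
}

// True when the device was constructed by the thread asking the question.
bool initialized_on_calling_thread() {
  return g_dev && g_dev->init_thread == std::this_thread::get_id();
}
```

This keeps construction and the first tick on one thread by design, but it only stays correct for as long as the simulator keeps that one call site on the same thread for every later tick.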

Thank you for digging into this, I've learned quite a bit about the subtleties of the context_t behavior in a multi-threaded system from your analysis.

@sethp
Author

sethp commented Jun 3, 2024

While I agree with your reasoning here, I don't think it's reasonable to expect/require SimDevice implementations to be thread-safe, where the constructor/tick functions can be called from distinct threads.

Yeah, I hear you about not wanting the {runtime,complexity} overhead of generalized thread-safety in the simulated devices. I want to note that the situation here calls for a much narrower "kind" of thread safety: it's the difference between what Rust calls Sync (you might be called from multiple threads at the same time) and the much simpler Send (it's safe to move the resource between threads) [1][2]. There is a strict happens-before relationship between the verilog initial blocks that call init and the always blocks that call tick, which it appears that verilator correctly implements. So to achieve correctness here it's not that the sim needs to handle multiple concurrent callers, but more or less just needs to avoid stashing a reference to another thread's local storage.

As far as I can tell, this quirk only appears with Verilator multi-threading. The other simulators don't do this, even with multithreading enabled.

To be fully transparent, I have only a few dozen hours' worth of experience with any of the commercial verilog implementations, and none to the depth I've gotten here with verilator. I do wonder if it's on accident or by design that the other simulators don't encounter this behavior: do you know if there's some verilog standard (implicit or explicit) that verilator is violating here by evaluating the initial block in a different thread than the always block? If it should be treating the module, say, as the "unit" of work (perhaps iff the module contains DPI?), that's something it might be worth raising with them upstream, too.

Does this generalize to systems with multiple contexts? IMO it's better to require the programmer to explicitly specify the next context to execute. There are use-cases of context_t which have multiple contexts (not just target/host).

The yield semantics I implemented do generalize in the sense that every context_t has a most-recently-swapped antecedent, but as written would probably produce surprising behavior when trying to "nest" contexts (M switch_to A switch_to B followed by yield would "return" to B, but a second yield from A would pass control flow back to B). We could imagine a context "stack", with switch_to pushing a task, and a definition of yield that acts as a "pop." I believe that would work to implement arbitrarily nested contexts just fine, although there's other simple solutions too when the number of tasks is small and statically fixed, as I think is the case here [3].
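
The "context stack" idea can be sketched directly on top of the POSIX ucontext calls. This is an illustration under stated assumptions, not fesvr's context_t: the `Task` struct, the `running` stack, and the free functions `switch_to` and `yield` are all hypothetical names, and error handling is omitted.

```cpp
#include <cassert>
#include <string>
#include <ucontext.h>
#include <vector>

// Hypothetical sketch, not fesvr's context_t: a per-thread stack of running contexts.
struct Task {
  ucontext_t uc;
  std::vector<char> stack;
};

static thread_local std::vector<Task*> running;  // back() is the task executing now
static std::string trace;                        // records the order of execution

// Push `next` and transfer control to it, saving the caller in its own slot.
void switch_to(Task* next) {
  Task* self = running.back();
  running.push_back(next);
  swapcontext(&self->uc, &next->uc);
}

// Pop the current task and resume whoever switched to it.
void yield() {
  Task* self = running.back();
  running.pop_back();
  swapcontext(&self->uc, &running.back()->uc);
}

static Task task_m, task_a, task_b;

void body_b() { for (;;) { trace += 'B'; yield(); } }
void body_a() { for (;;) { trace += 'a'; switch_to(&task_b); trace += 'A'; yield(); } }

// Give a task its own stack and entry function.
void make_task(Task* t, void (*fn)()) {
  t->stack.resize(64 * 1024);
  getcontext(&t->uc);
  t->uc.uc_stack.ss_sp = t->stack.data();
  t->uc.uc_stack.ss_size = t->stack.size();
  t->uc.uc_link = nullptr;
  makecontext(&t->uc, fn, 0);
}

// M switches to A, A switches to B, then each yield pops one level.
std::string run_nested() {
  trace.clear();
  running.assign(1, &task_m);  // M is the calling thread's own context
  make_task(&task_a, body_a);
  make_task(&task_b, body_b);
  switch_to(&task_a);
  trace += 'M';
  return trace;
}
```

With a stack, the yield from B returns to A and the yield from A returns to M, so `run_nested()` produces the trace `aBAM`. Because the stack is itself thread-local, this sketch only covers nesting within a single thread.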

My belief here is that the init/tick functions being called from separate threads is a Verilator-specific quirk that we should work around with minimal disruption to existing other code/interfaces.

I appreciate the examples! I'm glad to have the benefit of your experience here. I'd agree at this point that "avoid use of context_t entirely" is a path not worth further exploration.

Unfortunately, I'm not sure there's a general resolution that doesn't at least involve at least looking at the other use sites: neither threading nor non-local control flow are famous for composing well. I haven't identified any answers that reside entirely within context_t or the verilated main or some other high-leverage point that would span all devices (at least, not yet).

Perhaps the simplest thing is to merge network_tick and network_init, and make the tick function initialize the devices on-demand?

It's a good idea, that's how it seems the spike tile (and, perhaps, htif?) gets away with using context_t when verilated as a multi-threaded model. I considered it, but I didn't bring it up, because it has some immediate consequences that I presumed would be disqualifying (all the _tick interfaces would have to take all the _init parameters, for example).

I'm also not entirely sure how durable it would be, as a solution: the verilator documentation on task scheduling suggests that they tried both static and dynamic scheduling and went for the static for performance (rather than correctness) reasons. I suspect a dynamically scheduled runtime (e.g. one based on work-stealing) would probably cause even spiketile.cc to spontaneously fail, as it got moved around between threads.

My guess is that's a decision that's unlikely to be reversed any time soon ("efficient dynamic scheduling" notwithstanding), and you're in the much better position than I to know if "all DPI-C that uses context_t has a single verilog-facing entrypoint" is an invariant you feel is more maintainable.

I think my plan at this point is to continue experimenting with ucontexts to get a better understanding of what it means to nest them (& how else they're used by htif & friends), and whether there's somewhere besides a thread-local to pass task-local sideband data to try and break the coupling there.

Pursuing a definition of a task that didn't care what thread it was scheduled on (as long as it wasn't scheduled more than once) seems like it offers a resolution that's relatively low-impact and high-durability to me.

Thank you for digging into this, I've learned quite a bit about the subtleties of the context_t behavior in a multi-threaded system from your analysis.

Thank you for reading, and for the feedback! I'm glad you've found it helpful, I've very much enjoyed learning about all these fine details as well 😄

Footnotes

  1. It's fairly Rust-jargon-rich, but the rust user forums have a good discussion about what being !Send + Sync means.

  2. The only other non-experiential citation I have for this is the Rustonomicon, which unhelpfully states "A type is Send if it is safe to send it to another thread," and seems to mistake its own premise further down (filed as https://github.com/rust-lang/nomicon/issues/453 , for anyone reading that desires homework from a footnote).

  3. If you're curious, I have an example that I'm playing with: https://github.com/sethp/ucontext-coroutine . I haven't gotten to threading just yet, and I suspect my implementation is broken even for tail-recursive / single-recursion cases, but it's been enlightening.

@sethp
Author

sethp commented Jun 24, 2024

I think I finally understand the problem here well enough that I feel confident about what's happening. Much of the clarity came when I got curious about the question "why is assigning target = context_t::current() not sending us back to the initial block when we target->switch_to()?". I wrote a small little sample program to investigate [1], but the short version is that the referent of current() is not stable, it's (sometimes) internally mutated as a side effect of calling ::switch_to().

I found that even in single threaded mode, target->switch_to() took me back to a surprising point—it only "resumes" the target simulation when the implicit second parameter to swapcontext (via the cur thread local) points to the same referent as we captured during initialization.
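
The mechanism can be seen with raw ucontext calls: `swapcontext(from, to)` writes the caller's resume point into `from`, so a stored "return here" pointer is only correct if it names the slot that was just written. The sketch below uses hypothetical names (`worker_uc`, `target`, `enter_worker`) and two distinct save slots to stand in for two different thread-locals; it shows the repaired pattern, where the pointer is refreshed immediately before each switch.

```cpp
#include <cassert>
#include <string>
#include <ucontext.h>
#include <vector>

// Hypothetical sketch with raw ucontext calls, not the fesvr wrapper.
static ucontext_t worker_uc;
static ucontext_t* target = nullptr;  // where the worker returns to; refreshed by the caller
static std::string events;

void worker_body() {
  for (;;) {
    events += 'w';
    // Resume whatever save slot the caller most recently published.
    swapcontext(&worker_uc, target);
  }
}

// Enter the worker: publish the slot first, then save the caller's state into it.
void enter_worker(ucontext_t* slot) {
  target = slot;  // the repair: point at the slot that is about to be written
  swapcontext(slot, &worker_uc);
}

std::string run_demo() {
  static std::vector<char> stack(64 * 1024);
  ucontext_t slot_a, slot_b;  // two distinct save slots
  events.clear();
  getcontext(&worker_uc);
  worker_uc.uc_stack.ss_sp = stack.data();
  worker_uc.uc_stack.ss_size = stack.size();
  worker_uc.uc_link = nullptr;
  makecontext(&worker_uc, worker_body, 0);

  enter_worker(&slot_a);
  events += '1';
  enter_worker(&slot_b);  // saved into a different slot than the first time
  events += '2';
  return events;
}
```

Here `run_demo()` returns `w1w2`. If `target` were left pointing at `slot_a` during the second call, the worker would resume the state saved by the first call rather than the second, which matches the "surprising point" described above.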

I understand the desire to move responsibility for that to the verilog implementation, but I don't see how to do so effectively. In this case there's three non-lexical scopes that all need to line up (the thread local, the init, and the tick), but since context_t::current() could point to any user-allocated structure (not just the anonymous thread-local one), and be captured behind any DPI-C call, any boundary we draw here feels somewhat arbitrary to me. And, "do all of the simulators agree about what is the atomic unit of thread-binding, and does that cover every deferred reference to context_t::current()?" is a much harder property to pattern match on than "does this switch_to call immediately update something with context_t::current() just prior?"

So, I'd like to pitch a three step plan:

  1. Repair the network device by updating target = context_t::current() just before calling host.switch_to() in netdev (& similar for the netsw), as mentioned above. This ensures the invariant that target always points to the task we're about to park, and therefore works across time & threads both.
  2. Identify some other usages of context_t and evaluate a similar repair. A quick grep suggests there's on the order of a half dozen or so usages of context_t in chipyard, so repair should only take a few days of effort and can be incremental. IMO it's ok for this step to be best-effort, because one way identification works is "someone reports an issue about a segfault".
  3. Potentially, revisit the idea to invent a new semantic to be more explicit about the referent we want to update as a side effect of the switch_to—how often this pattern appears suggests to me that it is indeed something worth looking into, and context_t::current() may even be worth deprecating, since capturing its result is misuse-prone.

What do you think? Would you be willing to accept a change like 1 and piecemeal updates for 2?

Footnotes

  1. https://gist.github.com/sethp/3f158929160935d83a4f976f3b752d69
