segfault when using prototype components. #981

Open
mbstrange2 opened this issue Feb 1, 2021 · 44 comments

mbstrange2 commented Feb 1, 2021

I am experiencing an issue where any attempt to generate Verilog for my test bench segfaults.

%> coreir --version
v0.1.51

This error can be reproduced by checking out lake:sparse_strawman and garnet:spVspV and running python tests/test_memory_core/test_memory_core.py in garnet.

WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X02_Y01.clk, Old Output: Interconnect.Tile_X02_Y00.clk_out, New Output: Interconnect.Tile_X01_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X04_Y01.clk, Old Output: Interconnect.Tile_X04_Y00.clk_out, New Output: Interconnect.Tile_X03_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X06_Y01.clk, Old Output: Interconnect.Tile_X06_Y00.clk_out, New Output: Interconnect.Tile_X05_Y01.clk_pass_through_out_right
Segmentation fault (core dumped)

This is the output I see when trying to generate the verilog in this context.

rdaly525 self-assigned this Feb 1, 2021

rdaly525 commented Feb 1, 2021

@mbstrange2, I'll take a look.


leonardt commented Feb 1, 2021

Can you try running pytest with the "-s" flag to see if there's a CoreIR error message being dumped?

@mbstrange2

@leonardt I haven't been using pytest, just plain python, so I believe whatever output there is should already be shown, right?


leonardt commented Feb 2, 2021

This is the output I get when running the test:

~/repos/garnet spVspV*
garnet-venv ❯ PYTHONPATH=. python tests/test_memory_core/test_memory_core.py
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/mapper.py:229: SyntaxWarning: "is" with a literal. Did you mean "=="?
  assert arch_binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/mapper.py:236: SyntaxWarning: "is" with a literal. Did you mean "=="?
  assert ir_binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:198: SyntaxWarning: "is" with a literal. Did you mean "=="?
  assert binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:199: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(binding)==1 and binding[0][1] is ():
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:246: SyntaxWarning: "is" with a literal. Did you mean "=="?
  assert arch_path is ()
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/passes/passes.py:29: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if port_name is "mode":
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/utils/util.py:131: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if pdir is "input":
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/utils/util.py:240: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if pdir is "input":
Getting length on class SparseSequenceConstraints.ZERO
Getting length on class SparseSequenceConstraints.ZERO






NEW TEST
len1=0
len2=0
num_match=0
SEQA: []
SEQB: []
DATA0: []
DATAD0: []
DATA1: []
DATAD1: []
common coords: []
result data: []
ALIGNED LENGTH 0: 0
ALIGNED LENGTH 1: 0
ADATA0: []
ADATAD0: []
ADATA1: []
ADATAD1: []
Variable: back_empty has no sink
Variable: back_full has no sink
Variable: front_empty has no sink
Variable: front_full has no sink
Variable: rd_valid has no sink
--------------------------------------------------------------------------------
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/modules/strg_RAM.py:104
         self._rd_bank = self.var("rd_bank", max(1, clog2(self.banks)))
         self.set_read_bank()
>        self._rd_valid = self.var("rd_valid", 1)
         self.set_read_valid()
         if self.fw_int == 1:
--------------------------------------------------------------------------------
Use anneal_param_factor 120
HPWL: 12.668244
HPWL: 10.684666
Using HPWL: 10.684666
Before annealing energy: 359.644200
After annealing energy: 4.487500 improvement: 0.98752293/3293 | 328.9 kHz | 0s<0s]
terminate called after throwing an instance of 'std::runtime_error'
  what():  error in assign clb cells got cell type j
Traceback (most recent call last):
  File "tests/test_memory_core/test_memory_core.py", line 1162, in <module>
    spVspV_regress(dump_dir="mek_dump",
  File "tests/test_memory_core/test_memory_core.py", line 1133, in spVspV_regress
    success = run_test(len1, len2, num_match, value_limit, dump_dir=dump_dir, log_name=log_name, trace=trace)
  File "tests/test_memory_core/test_memory_core.py", line 1069, in run_test
    out_coord, out_data = spVspV_test(trace=trace,
  File "tests/test_memory_core/test_memory_core.py", line 926, in spVspV_test
    placement, routing = pnr(interconnect, (netlist, bus), cwd=cwd)
  File "/home/lenny/repos/garnet/garnet-venv/src/archipelago/archipelago/pnr_.py", line 82, in pnr
    place(packed_file, layout_filename, placement_filename, has_fixed)
  File "/home/lenny/repos/garnet/garnet-venv/src/archipelago/archipelago/place.py", line 16, in place
    subprocess.check_call([placer_binary, layout_filename,
  File "/home/lenny/miniconda3/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/lenny/repos/garnet/garnet-venv/lib/python3.8/site-packages/placer', '/home/lenny/repos/garnet/mek_dump/design.layout', 'mek_dump/design.packed', 'mek_dump/design.place']' died with <Signals.SIGABRT: 6>.

Looks like some issue related to placement?
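As an aside, the SyntaxWarning lines near the top of that log point at a real (if unrelated) issue in peak/lake: `is` compares object identity, not value, so comparing against a literal like () is fragile. A minimal sketch of the fix (the binding variable here is just a stand-in for the structures those files check):

# `is` asks "same object?", `==` asks "equal value?"; whether two equal tuples
# are the same object is an implementation detail, hence the SyntaxWarning.
binding = [("path", ())]  # stand-in example

# Fragile (what the warnings flag):
# assert binding[0][1] is ()

# Robust equivalents:
assert binding[0][1] == ()
assert len(binding[0][1]) == 0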


leonardt commented Feb 2, 2021

Running python garnet.py -v works without error for me, so I suspect there are some differences in our setups. Are there any local changes to garnet/lake or other dependencies that might not have been pushed yet?

@mbstrange2

@leonardt Sorry about that, you need updated cyclone, thunder, and canal, and then:

export DISABLE_GP=1


leonardt commented Feb 2, 2021

Ok, I had to manually install the latest master branch from the cgra_pnr repo. I'm able to run the test and get Verilog generated:

garnet-venv ❯ python tests/test_memory_core/test_memory_core.py
Getting length on class SparseSequenceConstraints.ZERO
Getting length on class SparseSequenceConstraints.ZERO






NEW TEST
len1=0
len2=0
num_match=0
SEQA: []
SEQB: []
DATA0: []
DATAD0: []
DATA1: []
DATAD1: []
common coords: []
result data: []
ALIGNED LENGTH 0: 0
ALIGNED LENGTH 1: 0
ADATA0: []
ADATAD0: []
ADATA1: []
ADATAD1: []
Variable: back_empty has no sink
Variable: back_full has no sink
Variable: rd_valid has no sink
--------------------------------------------------------------------------------
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/modules/strg_RAM.py:104
         self._rd_bank = self.var("rd_bank", max(1, clog2(self.banks)))
         self.set_read_bank()
>        self._rd_valid = self.var("rd_valid", 1)
         self.set_read_valid()
         if self.fw_int == 1:
--------------------------------------------------------------------------------
Variable: front_empty has no sink
Variable: front_full has no sink
 90.000000 -> 81.000000 improvement: 0.100000 total: 0.000000 | 675.9 kHz | 0s<0s]
 81.000000 -> 81.000000 improvement: 0.000000 total: 0.100000 | 442.3 kHz | 0s<0s]
using bit_width 1
Routing iteration:   0 duration: 20 ms
using bit_width 16
Routing iteration:   0 duration: 6 ms
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(3, -16), (2, 0), (2, -2), (1, 0), (0, 65536), (1, 65536), (0, 0), (3, 1048576)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(3, -16), (2, 0), (2, -2), (1, 0), (0, 65536), (1, 65536), (0, 0), (3, 1048576)]
Config isect core.....!
[(0, 256)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(0, 64), (4, 1), (83, 16777216), (83, 134217728), (4, 16)]
[(0, 64), (4, 1), (83, 16777216), (83, 134217728), (4, 16)]
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X02_Y01.clk, Old Output: Interconnect.Tile_X02_Y00.clk_out, New Output: Interconnect.Tile_X01_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X04_Y01.clk, Old Output: Interconnect.Tile_X04_Y00.clk_out, New Output: Interconnect.Tile_X03_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X06_Y01.clk, Old Output: Interconnect.Tile_X06_Y00.clk_out, New Output: Interconnect.Tile_X05_Y01.clk_pass_through_out_right
mek_dump/Interconnect.json
Running command: verilator -Wall -Wno-INCABSPATH -Wno-DECLFILENAME -Wno-fatal --cc Interconnect.v -v cfg_and_dbg_unq1.sv -v tap_unq1.sv -v jtag.sv -v glc_axi_ctrl.sv -v flop_unq1.sv -v flop_unq3.sv -v flop_unq2.sv -v glc_jtag_ctrl.sv -v global_controller.sv -v glc_axi_addrmap.sv -v CW_fp_add.v -v CW_fp_mult.v -v AN2D0BWP16P90.sv -v AO22D0BWP16P90.sv --exe Interconnect_driver.cpp --top-module Interconnect

Perhaps there's some difference in our setup still.

Can you show the pycoreir version and check whether there are multiple versions of coreir in your path with

pip show pycoreir

and

which -a coreir

Here's what I have:

~/repos/garnet spVspV*
garnet-venv ❯ pip show coreir
Name: coreir
Version: 2.0.128
Summary: Python bindings for CoreIR
Home-page: https://github.com/leonardt/pycoreir
Author: Leonard Truong
Author-email: lenny@cs.stanford.edu
License: BSD License
Location: /home/lenny/repos/garnet/garnet-venv/lib/python3.8/site-packages
Requires: hwtypes
Required-by: CoSA, magma-lang, fault, peak, metamapper

~/repos/garnet spVspV*
garnet-venv ❯ which -a coreir
/home/lenny/repos/garnet/garnet-venv/bin/coreir
/home/lenny/miniconda3/bin/coreir

@mbstrange2

Can you make sure to target xcelium? I'm not sure if there's any difference if you choose a different simulator target.

(aha) root@615a6684288f:/aha/garnet# pip show coreir
Name: coreir
Version: 2.0.128
Summary: Python bindings for CoreIR
Home-page: https://github.com/leonardt/pycoreir
Author: Leonard Truong
Author-email: lenny@cs.stanford.edu
License: BSD License
Location: /aha/pycoreir
Requires: hwtypes
Required-by: CoSA, magma-lang, peak, fault
(aha) root@615a6684288f:/aha/garnet# which -a coreir
/usr/local/bin/coreir


leonardt commented Feb 2, 2021

The verilator compilation failed with a huge number of errors; here's a snippet:

      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8143:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8143 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8166:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8166 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8169:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8169 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8192:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8192 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8195:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8195 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8218:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8218 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8221:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8221 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8244:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8244 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8247:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8247 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8270:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8270 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8273:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8273 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8296:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8296 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8299:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8299 |   if (io2glb_1_X06_Y00) {

I wonder if the large number of errors is causing a segfault in the downstream tool?

I'll try using xcelium to see if there's any difference.


mbstrange2 commented Feb 2, 2021

Can you check conftest.py in garnet and make sure skip_compile=False is set? I might have pushed the code with it set to True, in which case no Verilog is being produced.


leonardt commented Feb 2, 2021

skip_compile is False in conftest

I was looking at the test code and noticed:

1029         tester_if = tester._if(circuit.interface[cvalid])

I think it should be

1029         tester_if = tester._if(tester.peek(circuit.interface[cvalid]))

You need to use the tester.peek function when referring to a circuit port (when not using the tester.circuit interface).
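For reference, a minimal sketch of the two variants (tester, circuit, and cvalid are the objects from the test above and are assumed to already exist; this is not a standalone program):

# Fragile: passes the port object itself as the condition.
# tester_if = tester._if(circuit.interface[cvalid])

# Correct: peek the port so the condition refers to its simulated value.
tester_if = tester._if(tester.peek(circuit.interface[cvalid]))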

@mbstrange2

There must be some other mismatch in our envs. This worked for me when using previously generated Verilog and ran fine in xcelium.


leonardt commented Feb 2, 2021

Also, I don't think changing the simulator target (to xcelium) would affect the Verilog code generation. If you can't generate code with python garnet.py -v (without using the test), then this suggests that there's still some difference in our setups, since I can generate the Verilog fine.


mbstrange2 commented Feb 2, 2021

I can generate the Verilog with python garnet.py -v; I'm just trying to figure out why it fails for me and Keyi when we use the test.


leonardt commented Feb 2, 2021

Ah, I see, I misread the original post. Let me investigate with the xcelium target then.


leonardt commented Feb 2, 2021

Changing the target doesn't seem to affect verilog code generation for me (I get a file in mek_dump, Interconnect.V), so I think there's still some difference in our environments


mbstrange2 commented Feb 2, 2021

Okay this is somewhat great news then. The test ran and passed?

Here's my pip list:

(aha) root@615a6684288f:/aha/garnet# pip list
Package             Version   Location
------------------- --------- ---------------------
aha                 0.0.0     /aha
archipelago         0.0.8     /aha/archipelago
ast-tools           0.0.30    /aha/ast_tools
astor               0.8.1
attrs               20.3.0
buffer-mapping      0.0.5     /aha/BufferMapping
canal               0.0.0     /aha/canal
certifi             2020.12.5
chardet             4.0.0
colorlog            4.7.2
coreir              2.0.128   /aha/pycoreir
CoSA                0.4       /aha/cosa
dataclasses         0.6
DeCiDa              1.1.5
decorator           4.4.2
docker              4.4.1
fault               3.0.47    /aha/fault
gemstone            0.0.0     /aha/gemstone
genesis2            0.0.5
gitdb               4.0.5
GitPython           3.1.12
gmpy2               2.0.8
graphviz            0.16
hwtypes             1.4.4     /aha/hwtypes
idna                2.10
importlib-metadata  3.4.0
iniconfig           1.1.1
Jinja2              2.11.2
jmapper             0.2.0
kratos              0.0.32.3  /aha/kratos
lake-aha            0.0.4     /aha/lake
lassen              0.0.1     /aha/lassen
libcst              0.3.16
magma-lang          2.1.27    /aha/magma
Mako                1.1.4
mantle              2.0.16    /aha/mantle
MarkupSafe          1.1.1
mflowgen            0.3.0     /aha/mflowgen
mypy-extensions     0.4.3
networkx            2.5
numpy               1.19.5
ordered-set         4.0.2
packaging           20.9
peak                0.0.1     /aha/peak
pip                 20.1.1
pluggy              0.13.1
ply                 3.11
py                  1.10.0
pycyclone           0.3.26    /aha/cgra_pnr/cyclone
pydot               1.4.1
pyparsing           2.4.7
PySMT               0.9.0
pysv                0.1.2
pytest              6.2.2
pythunder           0.3.26    /aha/cgra_pnr/thunder
pyverilog           1.3.0
PyYAML              5.4.1
requests            2.25.1
requirements-parser 0.2.0
scipy               1.6.0
setuptools          47.1.0
six                 1.15.0
smmap               3.0.5
staticfg            0.9.5
tabulate            0.8.7
toml                0.10.2
typing-extensions   3.7.4.3
typing-inspect      0.6.0
urllib3             1.26.3
websocket-client    0.57.0
wheel               0.36.2
z3-solver           4.8.10.0
zipp                3.4.0


leonardt commented Feb 2, 2021

I'm able to generate Verilog, and the test runs xrun but then fails with some errors. Here are the relevant *E snippets from xrun.log:

   660 xmvlog: *E,DUPIDN (Interconnect.v,5493|18): identifier 'exp_bits' previously declared [12.5(IEEE)].
   661 localparam frac_bits = 7;
   662                    |
   663 xmvlog: *E,DUPIDN (Interconnect.v,5494|19): identifier 'frac_bits' previously declared [12.5(IEEE)].
   664     module worklib.mul:v
   665         errors: 2, warnings: 0

  1765 xmvlog: *E,DUPIDN (global_buffer_int.sv,129|45): identifier 'glb_config_rd_data' previously declared [12.5(IEEE)].
  1766     module worklib.global_buffer_int:sv
  1767         errors: 1, warnings: 0


leonardt commented Feb 2, 2021

But it does not segfault at any point

@mbstrange2

You're having it use Cadence ware (CW)? Those errors are in the PE, so I'm even more confused.


leonardt commented Feb 2, 2021

I haven't changed anything. Looking at the generated code, though, it looks out of date, so possibly a different coreir version is being used.


leonardt commented Feb 2, 2021

Ah yes, my version of Python on kiwi is old (3.7), so it's installing an older version of coreir. Going to upgrade it to 3.8.


leonardt commented Feb 2, 2021

Hmm, that wasn't the problem. It actually seemed to be the right version of coreir, and I'm still getting the same output.


rdaly525 commented Feb 2, 2021

Thanks for looking into this, I can help later today if this is still not resolved.

@mbstrange2

Hmmm not sure what to do then.


leonardt commented Feb 2, 2021

When looking at the generated mek_dump/Interconnect.v on my local machine, I'm getting different output (without the localparam error), so it seems that something on kiwi is causing me to generate different Verilog.


leonardt commented Feb 2, 2021

Ok, figured it out. There was a leftover old version of coreir in my LD_LIBRARY_PATH; you may want to check that out (maybe there's an old version of the library being used). This was causing the old float code library to be loaded and affecting the Verilog output. Now I just get this error from the global buffer:

  1118 xmvlog: *E,DUPIDN (global_buffer_int.sv,129|45): identifier 'glb_config_rd_data' previously declared [12.5(IEEE)].
  1119     module worklib.global_buffer_int:sv

I'm going to try patching it locally to see if the test will run


leonardt commented Feb 2, 2021

Okay, I resolved the global_buffer_int problem. It looks like the test bench was copying the entire contents of the genesis_verif directory, and that directory had some old genesis files from an older version of garnet that were being copied in and causing the error. Purging the directory resolved that issue (now I'm hitting an xcelium license issue, so I'm trying again with the older version that works).


leonardt commented Feb 2, 2021

Ok so the simulation completes but then fails during the results parsing with:

xcelium> run 10000ns
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
Simulation complete via $finish(1) at time 3541 NS + 0
./Interconnect_tb.sv:3668         #20 $finish;
xcelium> assertion -summary -final
  Summary report deferred until the end of simulation.
xcelium> quit
  No assertions found.
xmsim: *N,PRASRT: Protected assertions are not shown.
TOOL:	xrun(64)	19.03-s003: Exiting on Feb 02, 2021 at 13:33:17 PST  (total: 00:00:24)
</STDOUT>
Traceback (most recent call last):
  File "tests/test_memory_core/test_memory_core.py", line 1162, in <module>
    spVspV_regress(dump_dir="mek_dump",
  File "tests/test_memory_core/test_memory_core.py", line 1133, in spVspV_regress
    success = run_test(len1, len2, num_match, value_limit, dump_dir=dump_dir, log_name=log_name, trace=trace)
  File "tests/test_memory_core/test_memory_core.py", line 1089, in run_test
    data_sim = [int(x[3]) for x in split_lines]
  File "tests/test_memory_core/test_memory_core.py", line 1089, in <listcomp>
    data_sim = [int(x[3]) for x in split_lines]
ValueError: invalid literal for int() with base 10: 'x'

But I think I'm much further than necessary. It looks like I'm able to generate the Verilog and run the test totally fine without a segfault, so let's see what's different about your environment. Can you post the output of your $PATH and $LD_LIBRARY_PATH? Let's make sure there are no old versions of coreir lying around there. Also, is your coreir version installed via pip, or do you have a local installation from a checkout of the pycoreir repo?
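For example, here's a quick sanity check (just a sketch, nothing garnet-specific assumed) that prints where the Python coreir bindings and the coreir binary are being picked up from:

import os
import shutil

import coreir  # the pycoreir bindings

# Where the Python package lives (pip wheel under site-packages vs. a local checkout).
print("coreir package:", coreir.__file__)

# Search paths that could pick up a stale libcoreir or coreir binary.
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
print("PATH:", os.environ.get("PATH", "<unset>"))
print("coreir binary:", shutil.which("coreir"))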


leonardt commented Feb 2, 2021

Ah, I see that you have coreir installed from a local location: coreir 2.0.128 /aha/pycoreir

Can we double check this setup by either recompiling it to ensure it's up to date or uninstalling this version and using the pip distribution?


mbstrange2 commented Feb 2, 2021

This is in the aha docker. If you want to attach to it (mstrange-gracious_visvesvaraya) and check it out, that might be easier? Or start up another docker?


leonardt commented Feb 2, 2021

docker attach mstrange-gracious_visvesvaraya hangs for me. I wonder if only one person can be attached at a time, or if there's a user permissions issue.


leonardt commented Feb 2, 2021

Hmm wait, never mind, hitting ctrl-c dropped me into the shell; maybe it was just waiting for a command.

@mbstrange2

You just need to hit enter - it doesn't automatically show the prompt for some reason lol


leonardt commented Feb 2, 2021

Hm, the tests seem to be running for me. It seems to be running more than one, though, so I haven't finished all of them.


leonardt commented Feb 2, 2021

Have you tried simply reattaching to the container? Perhaps there's some leftover config in your env causing the problem? How many tests is this supposed to run? I'm still waiting for it to finish but it seems to be running xcelium multiple times so it doesn't seem to be having any problems generating the verilog.

@mbstrange2

Oh I'm sorry one second
I have skip_compile=True in there

@mbstrange2

Okay if you run it again in the docker it will segfault


leonardt commented Feb 2, 2021

I seem to have "worked around" the issue by uninstalling coreir and installing the PyPI distribution, so something about the local docker setup is likely at fault:

cd /aha/coreir/build
make uninstall
pip uninstall coreir
pip install coreir


leonardt commented Feb 2, 2021

I reinstalled the locally built coreir and the segfault came back, so something about the local build is causing the problem.


leonardt commented Feb 2, 2021

Hmm, I tried reverting coreir to an older commit to match up with the pycoreir release (which is a few commits behind coreir master), but I still get the same problem, which suggests it's not an issue with any of the recent changes (also, reviewing the commits shows nothing that would suggest a segfault; they are minor).


leonardt commented Feb 2, 2021

@mbstrange2, does that workaround unblock you for now? We'll need to investigate the docker environment more closely to see what is causing this issue with the local build versus the pip wheel distribution.


rdaly525 commented Feb 3, 2021

Where is the docker environment specified?

@mbstrange2

@leonardt This workaround is good for me at present

@rdaly525 It's this docker: https://hub.docker.com/r/stanfordaha/garnet - it should be created from https://github.com/StanfordAHA/aha
