segfault when using prototype components. #981

Open
mbstrange2 opened this issue Feb 1, 2021 · 44 comments

mbstrange2 commented Feb 1, 2021

I am experiencing an issue where any attempt to generate Verilog for my test bench segfaults.

%> coreir --version
v0.1.51

This error can be reproduced by checking out lake:sparse_strawman and garnet:spVspV and running python tests/test_memory_core/test_memory_core.py in garnet.

WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X02_Y01.clk, Old Output: Interconnect.Tile_X02_Y00.clk_out, New Output: Interconnect.Tile_X01_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X04_Y01.clk, Old Output: Interconnect.Tile_X04_Y00.clk_out, New Output: Interconnect.Tile_X03_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X06_Y01.clk, Old Output: Interconnect.Tile_X06_Y00.clk_out, New Output: Interconnect.Tile_X05_Y01.clk_pass_through_out_right
Segmentation fault (core dumped)

This is the output I see when trying to generate the verilog in this context.

rdaly525 self-assigned this Feb 1, 2021

rdaly525 commented Feb 1, 2021

@mbstrange2, I'll take a look.


leonardt commented Feb 1, 2021

Can you try running pytest with the "-s" flag to see if there's a CoreIR error message being dumped?

@mbstrange2

@leonardt I haven't been using pytest, just plain python, so I believe whatever output there is should already be shown, right?


leonardt commented Feb 2, 2021

This is the output I get when running the test:

~/repos/garnet spVspV*
garnet-venv ❯ PYTHONPATH=. python tests/test_memory_core/test_memory_core.py
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/mapper.py:229: SyntaxWarning: "is" with a literal. Did you mean "=="?
  assert arch_binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/mapper.py:236: SyntaxWarning: "is" with a literal. Did you mean "=="?
  assert ir_binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:198: SyntaxWarning: "is" with a literal. Did you mean "=="?
  assert binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:199: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(binding)==1 and binding[0][1] is ():
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:246: SyntaxWarning: "is" with a literal. Did you mean "=="?
  assert arch_path is ()
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/passes/passes.py:29: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if port_name is "mode":
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/utils/util.py:131: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if pdir is "input":
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/utils/util.py:240: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if pdir is "input":
Getting length on class SparseSequenceConstraints.ZERO
Getting length on class SparseSequenceConstraints.ZERO






NEW TEST
len1=0
len2=0
num_match=0
SEQA: []
SEQB: []
DATA0: []
DATAD0: []
DATA1: []
DATAD1: []
common coords: []
result data: []
ALIGNED LENGTH 0: 0
ALIGNED LENGTH 1: 0
ADATA0: []
ADATAD0: []
ADATA1: []
ADATAD1: []
Variable: back_empty has no sink
Variable: back_full has no sink
Variable: front_empty has no sink
Variable: front_full has no sink
Variable: rd_valid has no sink
--------------------------------------------------------------------------------
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/modules/strg_RAM.py:104
         self._rd_bank = self.var("rd_bank", max(1, clog2(self.banks)))
         self.set_read_bank()
>        self._rd_valid = self.var("rd_valid", 1)
         self.set_read_valid()
         if self.fw_int == 1:
--------------------------------------------------------------------------------
Use anneal_param_factor 120
HPWL: 12.668244
HPWL: 10.684666
Using HPWL: 10.684666
Before annealing energy: 359.644200
After annealing energy: 4.487500 improvement: 0.98752293/3293 | 328.9 kHz | 0s<0s]
terminate called after throwing an instance of 'std::runtime_error'
  what():  error in assign clb cells got cell type j
Traceback (most recent call last):
  File "tests/test_memory_core/test_memory_core.py", line 1162, in <module>
    spVspV_regress(dump_dir="mek_dump",
  File "tests/test_memory_core/test_memory_core.py", line 1133, in spVspV_regress
    success = run_test(len1, len2, num_match, value_limit, dump_dir=dump_dir, log_name=log_name, trace=trace)
  File "tests/test_memory_core/test_memory_core.py", line 1069, in run_test
    out_coord, out_data = spVspV_test(trace=trace,
  File "tests/test_memory_core/test_memory_core.py", line 926, in spVspV_test
    placement, routing = pnr(interconnect, (netlist, bus), cwd=cwd)
  File "/home/lenny/repos/garnet/garnet-venv/src/archipelago/archipelago/pnr_.py", line 82, in pnr
    place(packed_file, layout_filename, placement_filename, has_fixed)
  File "/home/lenny/repos/garnet/garnet-venv/src/archipelago/archipelago/place.py", line 16, in place
    subprocess.check_call([placer_binary, layout_filename,
  File "/home/lenny/miniconda3/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/lenny/repos/garnet/garnet-venv/lib/python3.8/site-packages/placer', '/home/lenny/repos/garnet/mek_dump/design.layout', 'mek_dump/design.packed', 'mek_dump/design.place']' died with <Signals.SIGABRT: 6>.

Looks like some issue related to placement?
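As an aside, the SyntaxWarning lines near the top of that log point at a real (if unrelated) issue in peak/lake: `is` compares object identity, not value, so comparing against a literal like () is fragile. A minimal sketch of the fix (the binding variable here is just a stand-in for the structures those files check):

# `is` asks "same object?", `==` asks "equal value?"; whether two equal tuples
# are the same object is an implementation detail, hence the SyntaxWarning.
binding = [("path", ())]  # stand-in example

# Fragile (what the warnings flag):
# assert binding[0][1] is ()

# Robust equivalents:
assert binding[0][1] == ()
assert len(binding[0][1]) == 0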


leonardt commented Feb 2, 2021

Running python garnet.py -v works without error for me, so I suspect there are some differences in our setups. Are there any local changes to garnet/lake or other dependencies that might not have been pushed yet?

@mbstrange2

@leonardt Sorry about that, you need updated cyclone, thunder, and canal, and then:

export DISABLE_GP=1


leonardt commented Feb 2, 2021

Ok, I had to manually install the latest master branch from the cgra_pnr repo. I'm able to run the test and get Verilog generated:

garnet-venv ❯ python tests/test_memory_core/test_memory_core.py
Getting length on class SparseSequenceConstraints.ZERO
Getting length on class SparseSequenceConstraints.ZERO






NEW TEST
len1=0
len2=0
num_match=0
SEQA: []
SEQB: []
DATA0: []
DATAD0: []
DATA1: []
DATAD1: []
common coords: []
result data: []
ALIGNED LENGTH 0: 0
ALIGNED LENGTH 1: 0
ADATA0: []
ADATAD0: []
ADATA1: []
ADATAD1: []
Variable: back_empty has no sink
Variable: back_full has no sink
Variable: rd_valid has no sink
--------------------------------------------------------------------------------
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/modules/strg_RAM.py:104
         self._rd_bank = self.var("rd_bank", max(1, clog2(self.banks)))
         self.set_read_bank()
>        self._rd_valid = self.var("rd_valid", 1)
         self.set_read_valid()
         if self.fw_int == 1:
--------------------------------------------------------------------------------
Variable: front_empty has no sink
Variable: front_full has no sink
 90.000000 -> 81.000000 improvement: 0.100000 total: 0.000000 | 675.9 kHz | 0s<0s]
 81.000000 -> 81.000000 improvement: 0.000000 total: 0.100000 | 442.3 kHz | 0s<0s]
using bit_width 1
Routing iteration:   0 duration: 20 ms
using bit_width 16
Routing iteration:   0 duration: 6 ms
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(3, -16), (2, 0), (2, -2), (1, 0), (0, 65536), (1, 65536), (0, 0), (3, 1048576)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(3, -16), (2, 0), (2, -2), (1, 0), (0, 65536), (1, 65536), (0, 0), (3, 1048576)]
Config isect core.....!
[(0, 256)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(0, 64), (4, 1), (83, 16777216), (83, 134217728), (4, 16)]
[(0, 64), (4, 1), (83, 16777216), (83, 134217728), (4, 16)]
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X02_Y01.clk, Old Output: Interconnect.Tile_X02_Y00.clk_out, New Output: Interconnect.Tile_X01_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X04_Y01.clk, Old Output: Interconnect.Tile_X04_Y00.clk_out, New Output: Interconnect.Tile_X03_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X06_Y01.clk, Old Output: Interconnect.Tile_X06_Y00.clk_out, New Output: Interconnect.Tile_X05_Y01.clk_pass_through_out_right
mek_dump/Interconnect.json
Running command: verilator -Wall -Wno-INCABSPATH -Wno-DECLFILENAME -Wno-fatal --cc Interconnect.v -v cfg_and_dbg_unq1.sv -v tap_unq1.sv -v jtag.sv -v glc_axi_ctrl.sv -v flop_unq1.sv -v flop_unq3.sv -v flop_unq2.sv -v glc_jtag_ctrl.sv -v global_controller.sv -v glc_axi_addrmap.sv -v CW_fp_add.v -v CW_fp_mult.v -v AN2D0BWP16P90.sv -v AO22D0BWP16P90.sv --exe Interconnect_driver.cpp --top-module Interconnect

Perhaps there's some difference in our setup still.

Can you show the pycoreir version and check whether there are multiple versions of coreir in your path with

pip show pycoreir

and

which -a coreir

Here's what I have:

~/repos/garnet spVspV*
garnet-venv ❯ pip show coreir
Name: coreir
Version: 2.0.128
Summary: Python bindings for CoreIR
Home-page: https://github.com/leonardt/pycoreir
Author: Leonard Truong
Author-email: lenny@cs.stanford.edu
License: BSD License
Location: /home/lenny/repos/garnet/garnet-venv/lib/python3.8/site-packages
Requires: hwtypes
Required-by: CoSA, magma-lang, fault, peak, metamapper

~/repos/garnet spVspV*
garnet-venv ❯ which -a coreir
/home/lenny/repos/garnet/garnet-venv/bin/coreir
/home/lenny/miniconda3/bin/coreir

@mbstrange2

Can you make sure to target xcelium? I'm not sure if there's any difference if you choose a different simulator target.

(aha) root@615a6684288f:/aha/garnet# pip show coreir
Name: coreir
Version: 2.0.128
Summary: Python bindings for CoreIR
Home-page: https://github.com/leonardt/pycoreir
Author: Leonard Truong
Author-email: lenny@cs.stanford.edu
License: BSD License
Location: /aha/pycoreir
Requires: hwtypes
Required-by: CoSA, magma-lang, peak, fault
(aha) root@615a6684288f:/aha/garnet# which -a coreir
/usr/local/bin/coreir


leonardt commented Feb 2, 2021

The verilator compilation failed with a huge number of errors; here's a snippet:

      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8143:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8143 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8166:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8166 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8169:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8169 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8192:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8192 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8195:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8195 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8218:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8218 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8221:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8221 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8244:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8244 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8247:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8247 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8270:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8270 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8273:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8273 |   if (io2glb_1_X06_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8296:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
 8296 |   if (io2glb_1_X01_Y00) {
      |       ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8299:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
 8299 |   if (io2glb_1_X06_Y00) {

I wonder if the large number of errors is causing a segfault in the downstream tool?

I'll try using xcelium to see if there's any difference.


mbstrange2 commented Feb 2, 2021

Can you check conftest.py in garnet and make sure skip_compile=False is set? I might have pushed the code with it set to True, in which case no Verilog is being produced.


leonardt commented Feb 2, 2021

skip_compile is False in conftest

I was looking at the test code and noticed:

1029         tester_if = tester._if(circuit.interface[cvalid])

I think it should be

1029         tester_if = tester._if(tester.peek(circuit.interface[cvalid]))

You need to use the tester.peek function when referring to a circuit port (when not using the tester.circuit interface).
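For reference, a minimal sketch of the two variants (tester, circuit, and cvalid are the objects from the test above and are assumed to already exist; this is not a standalone program):

# Fragile: passes the port object itself as the condition.
# tester_if = tester._if(circuit.interface[cvalid])

# Correct: peek the port so the condition refers to its simulated value.
tester_if = tester._if(tester.peek(circuit.interface[cvalid]))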

@mbstrange2

There must be some other mismatch in our envs. This worked for me when using previously generated Verilog and ran fine in xcelium.


leonardt commented Feb 2, 2021

Also, I don't think changing the simulator target (to xcelium) would affect the Verilog code generation. If you can't generate code with python garnet.py -v (without using the test), then this suggests that there's still some difference in our setups, since I can generate the Verilog fine.


mbstrange2 commented Feb 2, 2021

I can generate the Verilog with python garnet.py -v; I'm just trying to figure out why it fails for me and Keyi when we use the test.


leonardt commented Feb 2, 2021

Ah, I see, I misread the original post. Let me investigate with the xcelium target then.


leonardt commented Feb 2, 2021

Changing the target doesn't seem to affect verilog code generation for me (I get a file in mek_dump, Interconnect.V), so I think there's still some difference in our environments


mbstrange2 commented Feb 2, 2021

Okay this is somewhat great news then. The test ran and passed?

Here's my pip list:

(aha) root@615a6684288f:/aha/garnet# pip list
Package             Version   Location
------------------- --------- ---------------------
aha                 0.0.0     /aha
archipelago         0.0.8     /aha/archipelago
ast-tools           0.0.30    /aha/ast_tools
astor               0.8.1
attrs               20.3.0
buffer-mapping      0.0.5     /aha/BufferMapping
canal               0.0.0     /aha/canal
certifi             2020.12.5
chardet             4.0.0
colorlog            4.7.2
coreir              2.0.128   /aha/pycoreir
CoSA                0.4       /aha/cosa
dataclasses         0.6
DeCiDa              1.1.5
decorator           4.4.2
docker              4.4.1
fault               3.0.47    /aha/fault
gemstone            0.0.0     /aha/gemstone
genesis2            0.0.5
gitdb               4.0.5
GitPython           3.1.12
gmpy2               2.0.8
graphviz            0.16
hwtypes             1.4.4     /aha/hwtypes
idna                2.10
importlib-metadata  3.4.0
iniconfig           1.1.1
Jinja2              2.11.2
jmapper             0.2.0
kratos              0.0.32.3  /aha/kratos
lake-aha            0.0.4     /aha/lake
lassen              0.0.1     /aha/lassen
libcst              0.3.16
magma-lang          2.1.27    /aha/magma
Mako                1.1.4
mantle              2.0.16    /aha/mantle
MarkupSafe          1.1.1
mflowgen            0.3.0     /aha/mflowgen
mypy-extensions     0.4.3
networkx            2.5
numpy               1.19.5
ordered-set         4.0.2
packaging           20.9
peak                0.0.1     /aha/peak
pip                 20.1.1
pluggy              0.13.1
ply                 3.11
py                  1.10.0
pycyclone           0.3.26    /aha/cgra_pnr/cyclone
pydot               1.4.1
pyparsing           2.4.7
PySMT               0.9.0
pysv                0.1.2
pytest              6.2.2
pythunder           0.3.26    /aha/cgra_pnr/thunder
pyverilog           1.3.0
PyYAML              5.4.1
requests            2.25.1
requirements-parser 0.2.0
scipy               1.6.0
setuptools          47.1.0
six                 1.15.0
smmap               3.0.5
staticfg            0.9.5
tabulate            0.8.7
toml                0.10.2
typing-extensions   3.7.4.3
typing-inspect      0.6.0
urllib3             1.26.3
websocket-client    0.57.0
wheel               0.36.2
z3-solver           4.8.10.0
zipp                3.4.0


leonardt commented Feb 2, 2021

I'm able to generate Verilog, and the test runs xrun but then fails with some errors. Here are the relevant *E snippets from xrun.log:

   660 xmvlog: *E,DUPIDN (Interconnect.v,5493|18): identifier 'exp_bits' previously declared [12.5(IEEE)].
   661 localparam frac_bits = 7;
   662                    |
   663 xmvlog: *E,DUPIDN (Interconnect.v,5494|19): identifier 'frac_bits' previously declared [12.5(IEEE)].
   664     module worklib.mul:v
   665         errors: 2, warnings: 0

  1765 xmvlog: *E,DUPIDN (global_buffer_int.sv,129|45): identifier 'glb_config_rd_data' previously declared [12.5(IEEE)].
  1766     module worklib.global_buffer_int:sv
  1767         errors: 1, warnings: 0


leonardt commented Feb 2, 2021

But it does not segfault at any point

@mbstrange2

You're having it use Cadence ware (CW)? Those errors are in the PE, so I'm even more confused.


leonardt commented Feb 2, 2021

I haven't changed anything. Looking at the generated code, though, it looks out of date, so possibly a different coreir version is being used.


leonardt commented Feb 2, 2021

Ah yes, my version of Python on kiwi is old (3.7), so it's installing an older version of coreir. Going to upgrade it to 3.8.


leonardt commented Feb 2, 2021

Hmm, that wasn't the problem. It actually seemed to be the right version of coreir, and I'm still getting the same output.


rdaly525 commented Feb 2, 2021

Thanks for looking into this, I can help later today if this is still not resolved.

@mbstrange2

Hmmm not sure what to do then.


leonardt commented Feb 2, 2021

When looking at the generated mek_dump/Interconnect.v on my local machine, I'm getting different output (without the localparam error), so it seems that something on kiwi is causing me to generate different Verilog.


leonardt commented Feb 2, 2021

Ok, figured it out. There was a leftover old version of coreir in my LD_LIBRARY_PATH; you may want to check that out (maybe there's an old version of the library being used). This was causing the old float code library to be loaded and affecting the Verilog output. Now I just get this error from the global buffer:

  1118 xmvlog: *E,DUPIDN (global_buffer_int.sv,129|45): identifier 'glb_config_rd_data' previously declared [12.5(IEEE)].
  1119     module worklib.global_buffer_int:sv

I'm going to try patching it locally to see if the test will run


leonardt commented Feb 2, 2021

Okay, I resolved the global_buffer_int problem. It looks like the test bench was copying the entire contents of the genesis_verif directory, and that directory had some old genesis files from an older version of garnet that were being copied in and causing the error. Purging the directory resolved that issue (now I'm hitting an xcelium license issue, so I'm trying again with the older version that works).


leonardt commented Feb 2, 2021

Ok so the simulation completes but then fails during the results parsing with:

xcelium> run 10000ns
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
COORD:     0, VAL:     x
Simulation complete via $finish(1) at time 3541 NS + 0
./Interconnect_tb.sv:3668         #20 $finish;
xcelium> assertion -summary -final
  Summary report deferred until the end of simulation.
xcelium> quit
  No assertions found.
xmsim: *N,PRASRT: Protected assertions are not shown.
TOOL:	xrun(64)	19.03-s003: Exiting on Feb 02, 2021 at 13:33:17 PST  (total: 00:00:24)
</STDOUT>
Traceback (most recent call last):
  File "tests/test_memory_core/test_memory_core.py", line 1162, in <module>
    spVspV_regress(dump_dir="mek_dump",
  File "tests/test_memory_core/test_memory_core.py", line 1133, in spVspV_regress
    success = run_test(len1, len2, num_match, value_limit, dump_dir=dump_dir, log_name=log_name, trace=trace)
  File "tests/test_memory_core/test_memory_core.py", line 1089, in run_test
    data_sim = [int(x[3]) for x in split_lines]
  File "tests/test_memory_core/test_memory_core.py", line 1089, in <listcomp>
    data_sim = [int(x[3]) for x in split_lines]
ValueError: invalid literal for int() with base 10: 'x'

But I think I'm much further than necessary. It looks like I'm able to generate the Verilog and run the test totally fine without a segfault, so let's see what's different about your environment. Can you post the output of your $PATH and $LD_LIBRARY_PATH? Let's make sure there are no old versions of coreir lying around there. Also, is your coreir version installed via pip, or do you have a local installation from a checkout of the pycoreir repo?
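For example, here's a quick sanity check (just a sketch, nothing garnet-specific assumed) that prints where the Python coreir bindings and the coreir binary are being picked up from:

import os
import shutil

import coreir  # the pycoreir bindings

# Where the Python package lives (pip wheel under site-packages vs. a local checkout).
print("coreir package:", coreir.__file__)

# Search paths that could pick up a stale libcoreir or coreir binary.
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
print("PATH:", os.environ.get("PATH", "<unset>"))
print("coreir binary:", shutil.which("coreir"))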


leonardt commented Feb 2, 2021

Ah, I see that you have coreir installed from a local location: coreir 2.0.128 /aha/pycoreir

Can we double check this setup by either recompiling it to ensure it's up to date or uninstalling this version and using the pip distribution?


mbstrange2 commented Feb 2, 2021

This is in the aha docker. If you want to attach to it (mstrange-gracious_visvesvaraya) and check it out, that might be easier? Or start up another docker?


leonardt commented Feb 2, 2021

docker attach mstrange-gracious_visvesvaraya hangs for me. I wonder if only one person can be attached at a time, or if there's a user permissions issue.


leonardt commented Feb 2, 2021

Hmm wait, never mind, hitting ctrl-c dropped me into the shell; maybe it was just waiting for a command.

@mbstrange2

You just need to hit enter - it doesn't automatically show the prompt for some reason lol


leonardt commented Feb 2, 2021

Hm, the tests seem to be running for me. It seems to be running more than one, though, so I haven't finished all of them.


leonardt commented Feb 2, 2021

Have you tried simply reattaching to the container? Perhaps there's some leftover config in your env causing the problem? How many tests is this supposed to run? I'm still waiting for it to finish but it seems to be running xcelium multiple times so it doesn't seem to be having any problems generating the verilog.

@mbstrange2

Oh I'm sorry one second
I have skip_compile=True in there

@mbstrange2

Okay if you run it again in the docker it will segfault


leonardt commented Feb 2, 2021

I seem to have "worked around" the issue by uninstalling coreir and installing the PyPI distribution, so something about the local docker setup is likely at fault:

cd /aha/coreir/build
make uninstall
pip uninstall coreir
pip install coreir


leonardt commented Feb 2, 2021

I reinstalled the locally built coreir and the segfault came back, so something about the local build is causing the problem.


leonardt commented Feb 2, 2021

Hmm, I tried reverting coreir to an older commit to match up with the pycoreir release (which is a few commits behind coreir master), but I still get the same problem, which suggests it's not an issue with any of the recent changes (also, reviewing the commits shows nothing that would suggest a segfault; they are minor).


leonardt commented Feb 2, 2021

@mbstrange2, does that workaround unblock you for now? We'll need to investigate the docker environment more closely to see what is causing this issue with the local build versus the pip wheel distribution.


rdaly525 commented Feb 3, 2021

Where is the docker environment specified?

@mbstrange2

@leonardt This workaround is good for me at present

@rdaly525 It's this docker: https://hub.docker.com/r/stanfordaha/garnet - it should be created from https://github.com/StanfordAHA/aha
