Questions about the test suite #1579

TommyMurphyTM1234 · 2024-10-10T22:07:40Z

I still have some confusion about the test suite...

In the case of a change/PR that needs to be tested what are the recommendations for running the test suite? For example - what simulator (Spike or QEMU - or maybe it depends on the nature of the changes requiring testing?), what toolchain(s) (e.g. bare-metal versus Linux/glibc, multilib or not, etc.), what arch/abi(s) (e.g. default rv64gc/lp64d versus rv32gc/ilp32d versus, say, rv32imac/ilp32 etc.)?
On the modest hardware that I have access to (e.g. up to i5 gen 8) the test suite takes a very, very long time to run - so long, in fact, that it's not really practical to run it much if at all. Is there any way to accelerate the running of the test suite or are there any other options such as cloud-based testing, GitHub Actions, etc.?
Is there any friendly guide to understanding the basics of what the test suite does, how it works and how to deal with things like test results/reports and exclusion config files? Or is it simply a case of reading upstream info about the GCC test suite, DejaGnu etc.?
(This may merit a separate issue?) When I cloned the latest riscv-gnu-tools repo master branch and tried to run the test suite I seem to have got incorrect results. Does this indicate a problem with how I am running it or are these failures "real"? See below for details.

git clone https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain
./configure --prefix=`pwd`/installed-tools --with-sim=spike
make
make build-sim
make report-newlib 2>&1 | tee report-newlib.log

...

                === gcc Summary ===

# of expected passes            207055
# of unexpected failures        45
# of unexpected successes       2
# of expected failures          1438
# of unresolved testcases       4
# of unsupported tests          13175

(The test suite is still running after many, many hours so I cannot post the C++ test results summary or the full test log yet).


TommyMurphyTM1234 · 2024-10-11T06:57:48Z

Further results:

                === g++ Summary ===

# of expected passes            215739
# of unexpected failures        15
# of expected failures          1705
# of unresolved testcases       1
# of unsupported tests          11968
/home/user/spike-pk/riscv-gnu-toolchain/build-gcc-newlib-stage2/gcc/xg++  version 14.2.0 (g04696df0963)

make[3]: Leaving directory '/home/user/spike-pk/riscv-gnu-toolchain/build-gcc-newlib-stage2/gcc'
make[2]: Leaving directory '/home/user/spike-pk/riscv-gnu-toolchain/build-gcc-newlib-stage2/gcc'
make[1]: Leaving directory '/home/user/spike-pk/riscv-gnu-toolchain/build-gcc-newlib-stage2'
mkdir -p stamps/
date > stamps/check-gcc-newlib
/home/user/spike-pk/riscv-gnu-toolchain/scripts/testsuite-filter gcc newlib /home/user/spike-pk/riscv-gnu-toolchain/test/allowlist `find build-gcc-newlib-stage2/gcc/testsuite/ -name *.sum |paste -sd "," -`
                === g++: Unexpected fails for rv64imafdc lp64d medlow  ===
FAIL: c-c++-common/torture/builtin-clear-padding-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
UNRESOLVED: c-c++-common/torture/builtin-clear-padding-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable

               ========= Summary of gcc testsuite =========
                            | # of unexpected case / # of unique unexpected case
                            |          gcc |          g++ |     gfortran |
 rv64imafdc/  lp64d/ medlow |    0 /     0 |    2 /     1 |      - |
make: *** [Makefile:1314: report-gcc-newlib] Error 1
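
The `2 /     1` in the g++ column can be reproduced by hand from the two result lines shown above: two unexpected result lines, one distinct test file. The following is a minimal sketch under two assumptions: that "unique" means a distinct test file name (the second field of a DejaGnu result line), and that FAIL, UNRESOLVED and XPASS are the statuses that count as unexpected. `count_unexpected` is a hypothetical helper, not part of `scripts/testsuite-filter`, and it does not apply the allowlist.

```shell
#!/bin/sh
# count_unexpected FILE...
# Print "<total> / <unique>" for the unexpected result lines in the given
# .sum files. "Unique" is counted per test file name (field 2), which is an
# assumption about what the summary table means.
count_unexpected() {
    awk '
        /^(FAIL|UNRESOLVED|XPASS): / {
            total++
            if (!($2 in seen)) { seen[$2] = 1; unique++ }
        }
        END { printf "%d / %d\n", total, unique }
    ' "$@"
}
```

Run against a file containing only the FAIL and UNRESOLVED lines for builtin-clear-padding-3.c, this prints `2 / 1`.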

report-newlib.log: report-newlib.log

TommyMurphyTM1234 · 2024-10-11T23:00:33Z

FWIW - another test run...

time make report-newlib 2>&1 | tee report-newlib.log

...

                === gcc Summary ===

# of expected passes            207085
# of unexpected failures        45
# of unexpected successes       2
# of expected failures          1438
# of unresolved testcases       4
# of unsupported tests          13199

...

                === g++ Summary ===

# of expected passes            215771
# of unexpected failures        14
# of expected failures          1705
# of unsupported tests          11978

...

               ========= Summary of gcc testsuite =========
                            | # of unexpected case / # of unique unexpected case
                            |          gcc |          g++ |     gfortran |
 rv64imafdc/  lp64d/ medlow |    0 /     0 |    0 /     0 |      - |

real    354m48.053s
user    282m49.505s
sys     39m5.811s

report-newlib.log: report-newlib.log

TommyMurphyTM1234 · 2024-10-12T23:13:41Z

And a run with the Linux/glibc toolchain:

time make report-linux 2>&1 | tee report-linux.log

...

                === gcc Summary ===

# of expected passes            194597
# of unexpected failures        21700
# of unexpected successes       2
# of expected failures          1675
# of unresolved testcases       12
# of unsupported tests          13276

...

                === g++ Summary ===

# of expected passes            240348
# of unexpected failures        10439
# of expected failures          2625
# of unresolved testcases       28
# of unsupported tests          11858

...

               ========= Summary of gcc testsuite =========
                            | # of unexpected case / # of unique unexpected case
                            |          gcc |          g++ |     gfortran |
 rv64imafdc/  lp64d/ medlow |21684 /  4163 |10455 /  2646 |18931 /  3190 |
make: *** [Makefile:1321: report-gcc-linux] Error 1

real    539m11.430s
user    424m49.504s
sys     59m47.288s

report-linux.log.zip: report-linux.log.zip

TommyMurphyTM1234 · 2024-10-14T22:34:34Z

Does anybody (@cmuellner or @kito-cheng perhaps?) know why I'm getting the results above with the latest of everything from this repo?
The results do not seem to indicate "success" as far as I can tell.
I built and ran everything again from scratch but got the same results as above so I can't see that I'm doing anything wrong at my end...

lazyparser · 2024-10-15T14:50:54Z

@pz9115 would you like to check this issue and provide some inputs?

TommyMurphyTM1234 · 2024-10-15T15:22:21Z

If you need me to provide any clarification or do any additional tests please let me know.

pz9115 · 2024-10-31T13:01:33Z

I still have some confusion about the test suite...

In the case of a change/PR that needs to be tested what are the recommendations for running the test suite? For example - what simulator (Spike or QEMU - or maybe it depends on the nature of the changes requiring testing?), what toolchain(s) (e.g. bare-metal versus Linux/glibc, multilib or not, etc.), what arch/abi(s) (e.g. default rv64gc/lp64d versus rv32gc/ilp32d versus, say, rv32imac/ilp32 etc.)?

On the modest hardware that I have access to (e.g. up to i5 gen 8) the test suite takes a very, very long time to run - so long, in fact, that it's not really practical to run it much if at all. Is there any way to accelerate the running of the test suite or are there any other options such as cloud-based testing, GitHub Actions, etc.?

Is there any friendly guide to understanding the basics of what the test suite does, how it works and how to deal with things like test results/reports and exclusion config files? Or is it simply a case of reading upstream info about the GCC test suite, DejaGnu etc.?

(This may merit a separate issue?) When I cloned the latest riscv-gnu-tools repo master branch and tried to run the test suite I seem to have got incorrect results. Does this indicate a problem with how I am running it or are these failures "real"? See below for details.
git clone https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain
./configure --prefix=`pwd`/installed-tools --with-sim=spike
make
make build-sim
make report-newlib 2>&1 | tee report-newlib.log

...

                === gcc Summary ===

# of expected passes            207055
# of unexpected failures        45
# of unexpected successes       2
# of expected failures          1438
# of unresolved testcases       4
# of unsupported tests          13175
(The test suite is still running after many, many hours so I cannot post the C++ test results summary or the full test log yet).

Hi @TommyMurphyTM1234
The test suite you ran is a regression test suite, which is used to detect whether changes to GCC and other components have introduced new errors. Modifications to riscv-gnu-toolchain itself usually have no additional negative effects, since you are not changing any submodule source code directly (unless you update the gcc submodule).

If you do modify GCC, then regression testing is necessary. GCC's regression tests take much longer to run than those of other components (such as Binutils). The more failures a regression run has, the longer it takes (I guess this is because some of the execution tests do not respond in the simulator).

So it is a good choice to run only the RISC-V related test cases in the GCC part. You can use RUNTESTFLAGS="riscv.exp" make report -j$(nproc) to shorten the test time. (Sometimes I set RUNTESTFLAGS="rvv.exp" when changing the RVV part.)
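
The reduced run described above can be wrapped in a small helper that only prints the command line, so it can be inspected before anything is started. This is a sketch, not something shipped with riscv-gnu-toolchain: `riscv_report_cmd` is a hypothetical name, the `report-newlib` and `report-linux` targets are taken from earlier in this thread, and whether `RUNTESTFLAGS` is honoured by every `report-*` target is an assumption.

```shell
#!/bin/sh
# riscv_report_cmd [EXP_FILE] [MAKE_TARGET] [JOBS]
# Print (do not run) a "make report" command restricted to one DejaGnu
# .exp driver. Defaults: riscv.exp, target "report", one job per CPU.
riscv_report_cmd() {
    exp_file="${1:-riscv.exp}"   # e.g. riscv.exp or rvv.exp
    target="${2:-report}"        # e.g. report, report-newlib, report-linux
    jobs="${3:-$(nproc)}"        # nproc is only called when JOBS is omitted
    printf 'RUNTESTFLAGS="%s" make %s -j%s\n' "$exp_file" "$target" "$jobs"
}
```

For example, `riscv_report_cmd rvv.exp report-newlib 4` prints `RUNTESTFLAGS="rvv.exp" make report-newlib -j4`; from the configured build directory the printed line can then be run with `eval "$(riscv_report_cmd rvv.exp report-newlib 4)"`.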

I usually run two types of regression tests, one using glibc and one using newlib, both with --with-arch=rv64gc as the base argument. That is fine for most changes that do not depend on a new sub-extension.

Hope this helps :)

TommyMurphyTM1234 · 2024-11-02T12:53:14Z

Thanks @pz9115. I understand the general rationale for the test suite and acknowledge that my runs here are "artificial" and strictly unnecessary because I'm not actually testing before and after any changes to stuff like GCC, Binutils, C library etc.

However, my other questions still stand and have not been addressed. In particular, why does the test suite seem to fail and generate errors with the latest repo contents? Isn't that indicative of something anomalous? As far as I can see there isn't a baseline "successful" test suite run right now against which tests on a modified version of the toolchain can be compared.
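
Without a published baseline, one common workaround is to produce the baseline locally: run the suite once on the unmodified tree, once on the modified tree, and compare the two `.sum` files. The sketch below illustrates that idea only; `new_failures` is a hypothetical helper, and treating FAIL, UNRESOLVED and XPASS as the interesting statuses is an assumption. GCC's own `contrib/compare_tests` script does a more thorough comparison.

```shell
#!/bin/sh
# new_failures BASELINE_SUM MODIFIED_SUM
# Print unexpected results present in MODIFIED_SUM but absent from
# BASELINE_SUM, i.e. candidate regressions introduced by the change.
new_failures() {
    base_tmp=$(mktemp) || return 1
    mod_tmp=$(mktemp) || { rm -f "$base_tmp"; return 1; }
    grep -E '^(FAIL|UNRESOLVED|XPASS): ' "$1" | sort -u > "$base_tmp"
    grep -E '^(FAIL|UNRESOLVED|XPASS): ' "$2" | sort -u > "$mod_tmp"
    # comm -13 suppresses lines only in the baseline and lines in both,
    # leaving lines that occur only in the modified run.
    comm -13 "$base_tmp" "$mod_tmp"
    rm -f "$base_tmp" "$mod_tmp"
}
```

With this approach the 45 gcc and 14 or 15 g++ unexpected failures seen on an unmodified checkout become the reference point, and only results that are new relative to them need investigating.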

pz9115 · 2024-11-04T08:17:21Z

Thanks @pz9115. I understand the general rationale for the test suite and acknowledge that my runs here are "artificial" and strictly unnecessary because I'm not actually testing before and after any changes to stuff like GCC, Binutils, C library etc.

However, my other questions still stand and have not been addressed. In particular, why does the test suite seem to fail and generate errors with the latest repo contents? Isn't that indicative of something anomalous? As far as I can see there isn't a baseline "successful" test suite run right now against which tests on a modified version of the toolchain can be compared.

For the failed cases, you can check the details in the test log, which is at build-gcc-newlib-stage2/gcc/testsuite/gcc/gcc.log. I believe most of them are caused by execution faults, such as ABI incompatibilities. Once the QEMU arguments are set consistently with the toolchain, they will pass correctly.
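
Since gcc.log is very large, it can help to first pull the unexpected result lines out of the `.sum` files that sit under the same testsuite directory (the ones the `testsuite-filter` invocation earlier in this thread collects with `find`), and then search gcc.log for an individual test name. A minimal sketch; `list_unexpected` is a hypothetical helper, and the choice of FAIL, UNRESOLVED and XPASS as the statuses to show is an assumption.

```shell
#!/bin/sh
# list_unexpected TESTSUITE_DIR
# Print the de-duplicated unexpected result lines from every .sum file
# found below TESTSUITE_DIR.
list_unexpected() {
    find "$1" -name '*.sum' \
        -exec grep -hE '^(FAIL|UNRESOLVED|XPASS): ' {} + | sort -u
}
```

For the newlib build in this thread the call would be `list_unexpected build-gcc-newlib-stage2/gcc/testsuite/`.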

TommyMurphyTM1234 · 2024-11-04T20:56:33Z

For the failed cases, you can check the details in the test log, which is at build-gcc-newlib-stage2/gcc/testsuite/gcc/gcc.log.

OK - but what about stuff like this where make errors out? Surely that shouldn't happen?

/home/user/spike-pk/riscv-gnu-toolchain/scripts/testsuite-filter gcc newlib /home/user/spike-pk/riscv-gnu-toolchain/test/allowlist `find build-gcc-newlib-stage2/gcc/testsuite/ -name *.sum |paste -sd "," -`
                === g++: Unexpected fails for rv64imafdc lp64d medlow  ===
FAIL: c-c++-common/torture/builtin-clear-padding-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
UNRESOLVED: c-c++-common/torture/builtin-clear-padding-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable

               ========= Summary of gcc testsuite =========
                            | # of unexpected case / # of unique unexpected case
                            |          gcc |          g++ |     gfortran |
 rv64imafdc/  lp64d/ medlow |    0 /     0 |    2 /     1 |      - |
make: *** [Makefile:1314: report-gcc-newlib] Error 1

...

               ========= Summary of gcc testsuite =========
                            | # of unexpected case / # of unique unexpected case
                            |          gcc |          g++ |     gfortran |
 rv64imafdc/  lp64d/ medlow |21684 /  4163 |10455 /  2646 |18931 /  3190 |
make: *** [Makefile:1321: report-gcc-linux] Error 1

I believe most of them are caused by execution faults, such as ABI incompatibilities. Once the QEMU arguments are set consistently with the toolchain, they will pass correctly.

So the recommended target for testing is QEMU and not Spike? I thought that several changes/PRs were checked by running the test suite against Spike?

It seems to me that there is a severe lack of instructions/documentation on the specifics of how best/correctly to run the test suites against the riscv-gnu-toolchain and, unfortunately, by extensive experimentation I personally can't seem to clarify matters.

TommyMurphyTM1234 mentioned this issue Oct 26, 2024

Bump Spike and pk to latest commits #1596

Merged
