Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Publish C-level coverage for the test suite #426

Closed
mdboom opened this issue Jul 7, 2022 · 20 comments
Closed

Publish C-level coverage for the test suite #426

mdboom opened this issue Jul 7, 2022 · 20 comments
Assignees

Comments

@mdboom
Copy link
Contributor

mdboom commented Jul 7, 2022

One thing we've learned from the frame.setlineno set of issues is that the C-level coverage of the test suite isn't perfect.

If we started publishing the C level coverage of the test suite somewhere that would help us to file bugs to proactively improve coverage of the test suite.

I've seen projects that do this on pull requests -- there is usually so much noise in the results that IME it's not terribly useful. I think just a nightly (or even weekly) publishing of coverage results on main would go a long way, though.

It should be possible to do this with Github Actions.

@mdboom mdboom self-assigned this Jul 7, 2022
@brettcannon
Copy link

Here's the old coverage GH Action for CPython that still runs on older branches: https://github.com/python/cpython/actions/workflows/coverage.yml .

@mdboom
Copy link
Contributor Author

mdboom commented Jul 7, 2022

Looking at this, is my understanding correct that the upload token for Codecov.io is broken and therefore we probably haven't been publishing anything from this for a while?

I'm not convinced that Codecov.io provides a ton of value anyway -- I was planning on just publishing whatlcov produces to Github Pages. Obviously that doesn't support diffing of coverage reports or anything like that. I can certainly get feedback on what people want out of this before committing to any particular approach.

@brettcannon
Copy link

is my understanding correct that the upload token for Codecov.io is broken and therefore we probably haven't been publishing anything from this for a while?

Probably.

I'm not convinced that Codecov.io provides a ton of value anyway

It was honestly just the easiest solution at the time.

@mdboom
Copy link
Contributor Author

mdboom commented Jul 7, 2022

This PR (on my own fork) produced this output.

@erlend-aasland
Copy link

An alternative (or supplement) to lcov, is using LLVM's source-based code coverage. The big advantage over lcov is branch coverage. I've found that very useful when increasing the sqlite3 extension module coverage.

@iritkatriel
Copy link
Collaborator

@erlend-aasland Can you share your commands/workflow?

@erlend-aasland
Copy link

erlend-aasland commented Jul 7, 2022

@erlend-aasland Can you share your commands/workflow?

Sure! For the record, I'm on macOS, using the XCode supplied toolchain.

  1. Set both CFLAGS and LDFLAGS to -fprofile-instr-generate -fcoverage-mapping
  2. configure and make
  3. run the test suite (or most often, just the sqlite3 tests)
  4. generate coverage results (see details)
py-cov.sh script
#!/bin/sh
set -ve

MACOS_VER="12.2"
PY_MAJOR="3"
PY_MINOR="12"
PREFIX="build/lib.macosx-${MACOS_VER}-x86_64-${PY_MAJOR}.${PY_MINOR}-pydebug/"
SUFFIX="cpython-${PY_MAJOR}${PY_MINOR}d-darwin.so"

SQLITE_LIB="${PREFIX}/_sqlite3.${SUFFIX}"
BIN=$SQLITE_LIB  # python.exe, $SQLITE_LIB, etc.
PROFRAW=default.profraw
PROFDATA=default.profdata
SRC=Modules/_sqlite
REPORT=llvm-report
IGNORE="(clinic|Include)"

llvm-profdata merge \
  -sparse $PROFRAW  \
  -o $PROFDATA

llvm-cov report                    \
  -ignore-filename-regex="$IGNORE" \
  -instr-profile=$PROFDATA         \
  $BIN                             \
  $SRC

llvm-cov show                      \
  -format=html                     \
  -output-dir=$REPORT              \
  -ignore-filename-regex="$IGNORE" \
  -instr-profile=$PROFDATA         \
  $BIN                             \
  $SRC

rm -f $PROFDATA
llvm-cov report, sample output
Filename                                               Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
src/cpython.git/Modules/_sqlite/blob.c                     318                 6    98.11%          27                 0   100.00%         360                14    96.11%         146                14    90.41%
src/cpython.git/Modules/_sqlite/connection.c              1317               115    91.27%          60                 0   100.00%        1301               183    85.93%         530               127    76.04%
src/cpython.git/Modules/_sqlite/cursor.c                  1183               129    89.10%          30                 0   100.00%         921               115    87.51%         456               100    78.07%
src/cpython.git/Modules/_sqlite/microprotocols.c            81                 8    90.12%           3                 0   100.00%          76                17    77.63%          32                 9    71.88%
src/cpython.git/Modules/_sqlite/module.c                  1114               125    88.78%          17                 0   100.00%         314                51    83.76%         368               151    58.97%
src/cpython.git/Modules/_sqlite/module.h                     2                 0   100.00%           2                 0   100.00%          10                 2    80.00%           0                 0         -
src/cpython.git/Modules/_sqlite/prepare_protocol.c          23                20    13.04%           4                 3    25.00%          22                15    31.82%           6                 5    16.67%
src/cpython.git/Modules/_sqlite/row.c                      231                42    81.82%          13                 1    92.31%         144                22    84.72%          72                24    66.67%
src/cpython.git/Modules/_sqlite/statement.c                 99                 3    96.97%           5                 0   100.00%         121                 5    95.87%          66                 7    89.39%
src/cpython.git/Modules/_sqlite/util.c                     102                22    78.43%           4                 0   100.00%         109                31    71.56%          72                23    68.06%

Files which contain no functions:
install/sqlite-3.39.0/include/sqlite3.h                      0                 0         -           0                 0         -           0                 0         -           0                 0         -
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                     4470               470    89.49%         165                 4    97.58%        3378               455    86.53%        1748               460    73.68%

See llvm-report.zip for sample HTML output.

It's pretty crude, but it works for me :)

@mdboom
Copy link
Contributor Author

mdboom commented Jul 8, 2022

@erlend-aasland: This is a fantastic suggestion.

The branch results are pretty helpful, and I think they may provide some performance hints to the Faster CPython project. (It's a shame it doesn't collect branch stats for switch statements, but the -show-regions option is also really nice, where you can see coverage of individual components of an expression.

I also experimented with -show-expansions (which expands macros and provides coverage information within the macro). It's super-helpful in many cases, but ultimately on the Python codebase, it's pretty noisy to see Py_XDECREF expanded everywhere. If there were a way to ignore certain expansions, maybe that might help, but I couldn't see that as an option.

In short, this seems superior to what gcov/lcov provides, and it definitely seems like we should use this instead if possible. I've put the results from each system up here:

Notes about how these were generated I did this on Debian bookworm and needed to apt install `clang-13` and `llvm-13`.
export CC="clang-13 -fprofile-instr-generate -fcoverage-mapping"
./configure
make -j4

llvm-cov breaks when multiple shared objects try to write to the same object file, so I needed to use the trick in this comment to have an output file per shared object.

LLVM_PROFILE_FILE=python%m.profraw ./python -m test
llvm-profdata-13 merge -sparse *.profraw -o python.profdata
llvm-cov-13 show -format=html -output-dir=llvm-coverage -instr-profile=python.profdata -show-branches=count -show-regions python .

@erlend-aasland
Copy link

I've got a branch lying around where I've added llvm-cov support to configure/make. I'll see if I can restore it and rebase against main.

@mdboom
Copy link
Contributor Author

mdboom commented Jul 8, 2022

I've got a branch lying around where I've added llvm-cov support to configure/make. I'll see if I can restore it and rebase against main.

Sure, that would be very helpful.

@mdboom
Copy link
Contributor Author

mdboom commented Jul 8, 2022

@mdboom wrote:

(It's a shame it doesn't collect branch stats for switch statements,

Actually, it does do this (example). It just doesn't do this for the big switch in ceval.c, which, in fairness, is pretty obscured by macros etc.

@tiran
Copy link

tiran commented Jul 12, 2022

FYI lcov can do branch coverage. It is just disabled by default. I have working code that generated branch coverage with GCC and LCOV.

I would like to see llvm-cov support in Makefile, too.

@tiran
Copy link

tiran commented Jul 12, 2022

There is also https://github.com/gcovr/gcovr . It's a replacement for gcov/lcov/genhtml and written in our favorite language.

@mdboom
Copy link
Contributor Author

mdboom commented Jul 12, 2022

Thanks for pointing out lcov with branch coverage. I'll generate that and compare that with llvm-cov.

Yeah, I agree on llvm-cov in the Makefile. It's a bit tricky because everything else in the Makefile assumes you're using gcc on Linux.

As for gcovr, I had looked at it, but it doesn't seem like anything it offers is really relevant and I thought it better to not add another moving part. But if there's something specific it does that lcov doesn't let me know.

@tiran
Copy link

tiran commented Jul 12, 2022

@mdboom
Copy link
Contributor Author

mdboom commented Jul 12, 2022

Thanks for sharing those. Based on this, I think my entirely subjective thoughts are:

  • The GCOVR view of branches is slightly more obvious than the LCOV one. (The same information is there).
  • LLVM-COV still has one advantage which is produces better highlighting of the regions of subexpressions. Look at GCOVR vs. llvm-cov for example.

@tiran
Copy link

tiran commented Jul 13, 2022

I agree with you. The llvm-cov output looks more modern and presents more information.

@lazka
Copy link

lazka commented Jul 18, 2022

I'm not sure if that is relevant/useful here, but coverage.py recently gained an option to export to lcov format, so you can merge C and Python coverage into one report. Example for a Python/C extension: https://gnome.pages.gitlab.gnome.org/pygobject/gi/index.html coverage.py is prettier of course, but having everything in one place is nice too.

@mdboom
Copy link
Contributor Author

mdboom commented Jul 18, 2022

@lazka: That's cool to know about. I was planning on doing Python coverage once this landed (C coverage is a bit more urgent to my team at the moment), and it might make sense to do something like this.

@mdboom mdboom closed this as completed Feb 28, 2023
@gvanrossum
Copy link
Collaborator

Should be closed as "not planned" instead?

@mdboom mdboom closed this as not planned Won't fix, can't repro, duplicate, stale Mar 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

7 participants