Differences between `coverage lcov` and `py2lcov` for consideration

I have been looking into using lcov's genhtml to generate a custom coverage dashboard for <https://github.com/MillionConcepts/pdr>.  This project is written in Python, so I'm experimenting with several different ways of converting coverage.py data to lcov format; so far nothing has come out quite the way I want it.  I think most of the problems I am having are bugs and lacunae in coverage.py, not lcov (see <https://github.com/nedbat/coveragepy/issues/1846>), but some of the analysis I've produced may be useful to the lcov project anyway.

To reproduce the raw coverage.py database whose SQL dump I have attached, do the following:

```sh
git clone https://github.com/MillionConcepts/pdr
cd pdr
# a bare commit hash is not a branch, so switch needs --detach
git switch --detach 092675e442438ddc39d4bb9e9d0d5f4c1e317f29
python3 -m venv .venv
. .venv/bin/activate
pip install pytest-cov
pip install -e '.[browsify,fits,pvl,tiff]'
sed -i -e '/formats/d; /pvl_utils/d' .coveragerc
pytest --cov --cov-branch --cov-report= --import-mode=importlib
(echo max_lineno,path
 find pdr -name '*.py' -exec wc -l '{}' + | sed '/total/d; s/^ *//; s/ /,/g'
) > max-lineno.csv
sqlite3 .coverage \
'.import --csv --schema temp max-lineno.csv max_lineno
alter table file add column max_lineno integer;
update file set path = replace(path, '\'"$PWD"/\'', '\'\'');
update file set max_lineno = ml.max_lineno from temp.max_lineno as ml
  where ml.path = file.path;'
```

Having done the above, I then generated an lcov-format coverage report two different ways:

* `coverage lcov` (A.lcov in the attached diff)
* `py2lcov` (equivalent to `coverage xml` followed by `xml2lcov` AFAICT) (B.lcov in the attached diff)

and normalized both for comparison purposes as follows:

* all TN: lines were stripped
* checksums were removed from all DA: lines
* records were sorted by SF: pathname
* within each record, the sequence of sub-record types was canonicalized to DA, LF, LH, BRDA, BRF, BRH
* DA and BRDA lines were sorted numerically by line number
* vacuous BRF:0 BRH:0 and LF:0 LH:0 pairs were removed
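
For anyone who wants to repeat the comparison, here is a minimal Python sketch of the normalization steps listed above. It is not the exact script used to produce the attached diff. The function name `normalize_lcov` is made up for illustration, and it assumes that any sub-record types outside the six named above (for example function records) can simply be emitted after them in first-seen order.

```python
ORDER = ["DA", "LF", "LH", "BRDA", "BRF", "BRH"]


def _lineno(entry):
    """Line number of a DA: or BRDA: entry (the first comma-separated field)."""
    return int(entry.split(":", 1)[1].split(",", 1)[0])


def normalize_lcov(text):
    """Normalize lcov tracefile text so two reports can be diffed (a sketch)."""
    records = {}    # SF: path -> list of sub-record lines
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("TN:"):
            continue                      # strip all TN: lines
        if line.startswith("SF:"):
            current = records.setdefault(line[3:], [])
        elif line == "end_of_record":
            current = None
        elif current is not None:
            current.append(line)

    out = []
    for path in sorted(records):          # sort records by SF: pathname
        by_tag = {}
        for entry in records[path]:
            tag, _, value = entry.partition(":")
            if tag == "DA":
                # keep only line number and hit count, dropping any checksum
                value = ",".join(value.split(",")[:2])
            by_tag.setdefault(tag, []).append(f"{tag}:{value}")
        for tag in ("DA", "BRDA"):
            by_tag.get(tag, []).sort(key=_lineno)   # numeric sort by line
        # remove vacuous LF:0 LH:0 and BRF:0 BRH:0 pairs
        for found, hit in (("LF", "LH"), ("BRF", "BRH")):
            if by_tag.get(found) == [f"{found}:0"] and by_tag.get(hit) == [f"{hit}:0"]:
                del by_tag[found], by_tag[hit]
        out.append(f"SF:{path}")
        # canonical order first; other tags afterwards (an assumption, see above)
        for tag in ORDER + [t for t in by_tag if t not in ORDER]:
            out.extend(by_tag.get(tag, []))
        out.append("end_of_record")
    return "\n".join(out) + "\n"
```

Running both A.lcov and B.lcov through a function like this and then diffing the results should leave only differences in the data itself, not in ordering or decoration.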

I believe that all remaining differences in the output indicate a bug in _something_.  I'm pretty sure the radical differences in BRDA records are the aforementioned https://github.com/nedbat/coveragepy/issues/1846 and related, but I'm not sure what's up with the DA record differences.
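
To narrow down the DA differences, it may help to list, per file, which line numbers have a DA record in only one of the two reports. The following is a rough Python sketch of that idea; the helper names `da_lines` and `da_only_in_one` are hypothetical, and it only looks at which lines are reported, not at their hit counts.

```python
from collections import defaultdict


def da_lines(text):
    """Map each SF: path to the set of line numbers that have a DA: record."""
    lines_by_file = defaultdict(set)
    path = None
    for line in text.splitlines():
        if line.startswith("SF:"):
            path = line[3:]
        elif line.startswith("DA:") and path is not None:
            lines_by_file[path].add(int(line[3:].split(",", 1)[0]))
    return lines_by_file


def da_only_in_one(a_text, b_text):
    """Return {path: (lines only in A, lines only in B)} for files that differ."""
    a, b = da_lines(a_text), da_lines(b_text)
    report = {}
    for path in sorted(set(a) | set(b)):
        only_a = sorted(a.get(path, set()) - b.get(path, set()))
        only_b = sorted(b.get(path, set()) - a.get(path, set()))
        if only_a or only_b:
            report[path] = (only_a, only_b)
    return report
```

Calling `da_only_in_one(open("A.lcov").read(), open("B.lcov").read())` would then give a compact per-file summary to check against the source.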

Please note the absence of function coverage records in B.lcov, contra <https://github.com/linux-test-project/lcov/issues/317#issuecomment-2334203032>.

[coverage+linemax.sql.gz](https://github.com/user-attachments/files/16909031/coverage%2Blinemax.sql.gz)
[lcov-gen-comparison.diff.gz](https://github.com/user-attachments/files/16909561/lcov-gen-comparison.diff.gz)
