The illusion of 100% code coverage #222
Replies: 3 comments
-
Thanks for initiating this issue, @jamesmbaazam 🙏 Tests are fantastic as a means of observing known functionality. These can be expectations around outputs, or checks that known (previous) issues stay resolved (regression tests). Tests help build confidence around known issues in that sense.

Whether test coverage is at 90% or 100% does not matter all that much to me personally, as it is a heuristic: it says how much code is covered, not how well the code is covered. I can, for example, test a simple function for one of many scenarios, counting it towards coverage, but forget to test other, more critical scenarios. The coverage is only as good as the tests themselves. Even with 100% coverage, the tests do not cover 100% of the scenarios we may want to test. This raises the problem of observability: are the tests observing the behaviours we want to be observing?

As mentioned before, tests are great for known issues; they do not work for unknown issues. Among unknown issues, there is also the split between known unknowns and unknown unknowns. The risk of known unknowns can be assessed, and it may be acceptable not to observe them. The real risk for any software lies in the unknown unknowns: identifying what we don't even realize we're not yet testing properly. This requires a critical look at existing tests to find gaps that need to be filled, even if there is 100% code coverage to begin with.
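To make the coverage-versus-scenarios gap concrete, here is a minimal R sketch using testthat; the `attack_rate` function and the tests are hypothetical, not from any epiverse-trace package. A single happy-path test reaches 100% line coverage while the scenarios that matter go unobserved:

```r
library(testthat)

# Hypothetical one-line function: 100% line coverage is trivially reached.
attack_rate <- function(cases, population) {
  cases / population
}

# This single happy-path test executes every line, so a tool like covr
# would report 100% coverage.
test_that("attack_rate returns a proportion", {
  expect_equal(attack_rate(10, 100), 0.1)
})

# Yet the critical scenarios remain unobserved: with no input validation,
# a zero population silently yields Inf and negative case counts produce
# a nonsense negative rate. These tests pass, documenting behaviour we
# would probably rather forbid with explicit checks.
test_that("attack_rate misbehaves on unvalidated input", {
  expect_identical(attack_rate(10, 0), Inf)
  expect_lt(attack_rate(-5, 100), 0)
})
```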
-
I like this idea – it's a useful launch point for a 'what risks are we trying to mitigate?' discussion. A few related observations that may (or may not) be useful:
-
Thanks for your input @chartgerink and @adamkucharski. I just shared a resource here https://github.com/orgs/epiverse-trace/discussions/282 that covers what I was thinking about (different types of tests to ensure your code coverage has adequate quality). I'm not sure of the added value of this suggested blog post, so I would vote to close it.
-
I've been thinking a lot about how we often over-rely on 100% code coverage as a measure of code quality. It's important to consider code coverage in the context of Goodhart's law, which says, "When a measure becomes a target, it ceases to be a good measure".
This post will discuss how 100% coverage can be an illusion that leaves issues and bugs undetected.
It will cover categories of expectations to test for, drawing on the infrastructure and workflows for writing unit tests in R. One example of such a category is statistical correctness, which has been discussed in the statistical correctness post. This new article will be a follow-up to existing work in the wild.
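As a sketch of what a statistical-correctness expectation could look like in testthat, the example below checks distributional properties of simulated output rather than merely that the code runs. Here `rnorm()` stands in for a package's own simulator, and the seed and tolerances are illustrative choices, not a prescribed recipe:

```r
library(testthat)

test_that("simulated draws recover the theoretical mean and sd", {
  set.seed(42)  # fix the RNG so the test is reproducible
  draws <- rnorm(1e5, mean = 2, sd = 0.5)

  # Tolerances are deliberately loose relative to the Monte Carlo error
  # (about 0.5 / sqrt(1e5), roughly 0.0016, for the mean).
  expect_equal(mean(draws), 2, tolerance = 0.01)
  expect_equal(sd(draws), 0.5, tolerance = 0.01)
})
```

Every line of a simulator can be executed without any such check, which is exactly how 100% coverage can hide statistical bugs.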
I warmly welcome all contributions and co-authorship in the form of ideas, code examples, anecdotes, and experiences.
Related resources