[FEA] Consider fuzz testing with hypothesis

**Is your feature request related to a problem? Please describe.**
Currently, the tests for libcudf, pylibcudf, and cuDF Python are a large set of manually written cases. While we endeavor to achieve high coverage of the APIs, we inevitably miss data-dependent edge cases, particularly around inputs such as empty data sets.

**Describe the solution you'd like**
We should consider using [`hypothesis`](https://hypothesis.readthedocs.io/en/latest/) or another fuzz testing library to verify a wider range of inputs more systematically. I recommend doing this at the Python layer for two reasons: the tooling there is better and simpler, and pylibcudf testing can be treated as a superset of libcudf testing in this respect, so fuzzing pylibcudf should also give good coverage of the C++.
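To make the idea concrete, here is a minimal sketch of what a `hypothesis` test could look like. It is illustrative only: `groupby_sum` is a hypothetical pure-Python stand-in for the operation under test (in a real test this would be a pylibcudf or cuDF call), and `reference_groupby_sum` is a hypothetical oracle (in practice probably pandas or pyarrow). The value ranges and example counts are arbitrary assumptions.

```python
from collections import defaultdict

from hypothesis import given, settings, strategies as st


def groupby_sum(keys, values):
    """Hypothetical stand-in for the operation under test (e.g. a GPU groupby)."""
    totals = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)


def reference_groupby_sum(keys, values):
    """Deliberately naive oracle; a real test would likely compare against pandas."""
    return {
        key: sum(v for k, v in zip(keys, values) if k == key)
        for key in set(keys)
    }


# Generate rows as (key, value) pairs so both columns always have equal length.
# min_size=0 lets hypothesis produce the empty input that hand-written tests tend to miss.
rows_strategy = st.lists(
    st.tuples(st.integers(0, 5), st.integers(-1000, 1000)),
    min_size=0,
    max_size=50,
)


# database=None avoids writing an on-disk example cache; deadline=None disables
# per-example time limits, which is an assumption about what GPU-backed tests would need.
@settings(max_examples=200, deadline=None, database=None)
@given(rows_strategy)
def test_groupby_sum_matches_reference(rows):
    keys = [k for k, _ in rows]
    values = [v for _, v in rows]
    assert groupby_sum(keys, values) == reference_groupby_sum(keys, values)
```

When a property fails, `hypothesis` shrinks the failing input to a minimal reproducer, which is what makes this approach attractive for data-dependent edge cases.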

**Describe alternatives you've considered**
We could also implement fuzz testing directly in C++ using, e.g., [Google's fuzztest](https://github.com/google/fuzztest), but that would be more cumbersome.

[FEA] Consider fuzz testing with hypothesis #16129
