
[FEA] Add utility for Hypothesis-based comparisons of CPU and GPU algorithm implementations #4943

Closed
@wphicks

Description

With the ongoing work to provide togglable CPU/GPU execution, we need to provide strong guarantees that results are equivalent on each device. Toward that end, we should begin using Hypothesis to generate difficult inputs and compare the output of CPU and GPU execution. A utility that makes this easy for both classifiers and regressors would be extremely beneficial.
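A minimal sketch of such a utility might look like the following. Since this issue does not pin down how device selection will be exposed, the helper simply takes one already-configured estimator per device; the function name and tolerance defaults are illustrative assumptions, not an existing API:

```python
import numpy as np

def assert_equivalent_predictions(cpu_model, gpu_model, X_train, y_train,
                                  X_test, rtol=1e-4, atol=1e-4):
    """Fit the same workload on both devices and compare predictions.

    `cpu_model` and `gpu_model` are assumed to be two instances of the
    same estimator, already configured for CPU and GPU execution
    respectively (the exact device-toggle mechanism is left to the caller).
    """
    cpu_model.fit(X_train, y_train)
    gpu_model.fit(X_train, y_train)
    cpu_out = np.asarray(cpu_model.predict(X_test))
    gpu_out = np.asarray(gpu_model.predict(X_test))
    if np.issubdtype(cpu_out.dtype, np.floating):
        # Regressors: allow small numerical differences between devices.
        np.testing.assert_allclose(cpu_out, gpu_out, rtol=rtol, atol=atol)
    else:
        # Classifiers: predicted labels should match exactly.
        np.testing.assert_array_equal(cpu_out, gpu_out)
```

Branching on dtype lets one helper cover both regressors (tolerance-based comparison) and classifiers (exact label comparison).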

The Hypothesis testing we do on the FIL backend offers some things to think about here. Most importantly, for generating large datasets, we probably do not want Hypothesis to generate a single array all at once but instead to generate several smaller arrays and concatenate them. That tends to let us explore more diverse inputs more quickly, at the cost of a more difficult shrinking process for Hypothesis. For small datasets, generating each array in a single draw is probably more beneficial.
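The chunked approach can be sketched as a composite strategy; the function name, chunk bounds, and element ranges below are illustrative choices, not values taken from the FIL tests:

```python
import numpy as np
from hypothesis import strategies as st
from hypothesis.extra.numpy import arrays

def chunked_matrices(n_cols, max_chunks=8, max_chunk_rows=64):
    """Strategy producing a 2-D float32 array built from several
    independently generated chunks concatenated along axis 0.

    Generating many small chunks lets Hypothesis vary different regions
    of the array independently, at the cost of harder shrinking.
    """
    chunk = st.integers(1, max_chunk_rows).flatmap(
        lambda rows: arrays(
            np.float32, (rows, n_cols),
            elements=st.floats(-1e6, 1e6, width=32),
        )
    )
    return st.lists(chunk, min_size=1, max_size=max_chunks).map(
        lambda chunks: np.concatenate(chunks, axis=0)
    )
```

In a test this would be consumed as usual, e.g. `@given(X=chunked_matrices(4))`.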

It's an open question how much we want to use Hypothesis for training data as opposed to inference data. While testing on oddly-constructed training data may expose some corner cases, it may also lead to relatively uninteresting or trivial models. Having Hypothesis use ordinary make_blobs-style dataset constructors for some large fraction of the training data may be a good idea.
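One way to bias generation toward well-formed training data is to let Hypothesis drive the parameters of `make_blobs` rather than generate raw arrays; a sketch, with parameter ranges chosen arbitrarily for illustration:

```python
from hypothesis import strategies as st
from sklearn.datasets import make_blobs

def blobs_training_data(max_samples=500):
    """Strategy producing (X, y) training sets from make_blobs, with
    Hypothesis choosing the size, cluster count, spread, and seed."""
    return st.builds(
        lambda n, centers, cluster_std, seed: make_blobs(
            n_samples=n, centers=centers,
            cluster_std=cluster_std, random_state=seed,
        ),
        n=st.integers(20, max_samples),
        centers=st.integers(2, 8),
        cluster_std=st.floats(0.1, 10.0),
        seed=st.integers(0, 2**32 - 1),
    )
```

A raw-array strategy could then be mixed in for the remaining fraction of cases, e.g. via `st.one_of`, so most generated models stay non-trivial while adversarial training data is still occasionally exercised.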
