Unexpected behavior for set generator #25

@mak-dunkelziffer

The set generator behaves very naively and therefore produces an unexpected distribution of test cases.

Example 1:

items = [1, 2, 3, 4, 5]
G = PropCheck::Generators
G.set(G.one_of(*items.map { G.constant(_1) })).map(&:to_a)

With n = 100 runs, this example produces very few small sets, misses many of the possible sets entirely, and very often produces a Set containing all 5 items.
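To see why long arrays almost always collapse to the full set, here is a rough back-of-the-envelope sketch. It assumes the set generator is equivalent to drawing an array and converting it to a set, that each element is drawn uniformly from the 5 items, and that the array has a fixed length `len`. The real generator varies the length, so treat the numbers as illustrative only. The helper names are made up for this sketch and are not part of PropCheck.

```ruby
# Binomial coefficient C(n, k), computed with exact integer arithmetic.
def binomial(n, k)
  (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
end

# Number of arrays of length `len` over `n` distinct items that contain
# exactly `k` distinct values (inclusion-exclusion over the unused values).
def arrays_with_distinct(n, k, len)
  surjections = (0..k).sum { |j| (-1)**j * binomial(k, j) * (k - j)**len }
  binomial(n, k) * surjections
end

# Assumption: arrays are drawn uniformly. Maps each resulting set size to the
# fraction of arrays of length `len` that collapse to a set of that size.
def set_size_distribution(n, len)
  total = n**len
  (0..n).to_h { |k| [k, Rational(arrays_with_distinct(n, k, len), total)] }
end

[5, 50].each do |len|
  puts "array length #{len}:"
  set_size_distribution(5, len).each do |size, prob|
    puts format("  set size %d: %.4f", size, prob.to_f)
  end
end
```

Under these assumptions, only 120 of the 3125 possible arrays of length 5 (about 4%) contain all 5 items, but for arrays of length 50 it is more than 99.9%. That would match the observation that an unbounded array length nearly always yields the full set.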

Example 2:

items = [1, 2, 3, 4, 5]
G = PropCheck::Generators
G.set(G.one_of(*items.map { G.constant(_1) }), max: items.size).map(&:to_a)

Adding a size limit makes the generator a bit smarter. The distribution is now better, but still skewed towards bigger sets. I think this skew is reasonable for larger input lists, so I'm not sure I'd call it a bug. I had cases where some sets were not created at all. I'm not sure whether that is expected behaviour when the number of possible sets (2^5 = 32) is smaller than the number of runs (n = 100). But this feels acceptable and gets compensated for if you run the test multiple times.
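On the question of whether missing sets are expected, here is a quick estimate. It assumes an idealised generator that picks each of the 32 possible sets with equal probability, which the current generator does not do. `expected_missed` is a name made up for this sketch.

```ruby
# Expected number of outcomes that never show up in `runs` independent draws,
# assuming each of the `outcomes` possibilities is equally likely per draw.
def expected_missed(outcomes, runs)
  outcomes * (1.0 - 1.0 / outcomes)**runs
end

puts expected_missed(2**5, 100).round(2)
```

Under that assumption, about 1.3 of the 32 sets would still be missing after 100 runs on average, so a few gaps are expected even without any skew.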

Both cases seem to stem from missing deduplication. The set generator very naively uses the array generator. For arrays, order matters; for sets, it doesn't. IMO the set generator would need to deduplicate here to avoid creating equivalent sets all the time.
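For a fixed list of candidate items like the one in the examples, one possible way to sidestep the duplicates is to decide independently for each item whether it is included. This is only a plain-Ruby sketch of that idea, not PropCheck's API and not a proposed patch: `random_subset` is a hypothetical helper, and it does not address shrinking or arbitrary element generators.

```ruby
require "set"

# Hypothetical helper (not part of PropCheck): include each item with
# probability 1/2, so every one of the 2**items.size subsets is equally likely.
def random_subset(items, rng: Random.new)
  items.select { rng.rand(2) == 1 }.to_set
end

p random_subset([1, 2, 3, 4, 5])
```

Because no element can be drawn twice, order and duplicates no longer influence which set comes out.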
