[ML] bucket_count is inaccurate when there are gaps in the data #30080

@elasticmachine

Original comment by @davidkyle:

Open a job, send some data, and close the job. Then reopen the job and send some data timestamped a week later than the previous batch. Autodetect will create empty bucket results for the intervening period, but DataCounts::bucket_count will not reflect that.

The test MlBasicMultiNodeIT::testMiniFarequoteReopen does exactly this, but it was asserting that bucket_count == 2 rather than bucket_count == the number of buckets in 7 days. bucket_count should equal the number of buckets written by autodetect, with the caveat that old results are sometimes pruned.
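
To make the discrepancy concrete, here is a minimal Python sketch of the two ways of counting. It is not the actual DataCounts or autodetect code: the function names, the 1-hour bucket span, and the sample timestamps are all assumptions chosen for illustration, and it ignores result pruning and exactly when autodetect finalizes the last bucket.

```python
def non_empty_bucket_count(timestamps, bucket_span_seconds):
    """Count only buckets that received at least one record.

    Hypothetical helper: this mirrors a count that ignores gaps.
    """
    return len({t // bucket_span_seconds for t in timestamps})


def spanned_bucket_count(timestamps, bucket_span_seconds):
    """Count every bucket from the first record's bucket to the last
    record's bucket, including empty buckets in between.

    Hypothetical helper: this mirrors a count that includes the empty
    bucket results written for a gap in the data.
    """
    if not timestamps:
        return 0
    first_bucket = min(timestamps) // bucket_span_seconds
    last_bucket = max(timestamps) // bucket_span_seconds
    return last_bucket - first_bucket + 1


# Assumed example data: a 1-hour bucket span, two records in the first
# batch, then two records timestamped one week after the start.
BUCKET_SPAN = 3600
ONE_WEEK = 7 * 24 * 3600
first_batch = [0, 3600]
second_batch = [ONE_WEEK, ONE_WEEK + 3600]
all_records = first_batch + second_batch

gap_blind = non_empty_bucket_count(all_records, BUCKET_SPAN)
gap_aware = spanned_bucket_count(all_records, BUCKET_SPAN)
print(gap_blind, gap_aware)
```

With these assumed inputs the gap-blind count sees only the four buckets that contain records, while the gap-aware count also covers the empty buckets across the intervening week, which is the number the issue argues bucket_count should report.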

Labels: :ml (Machine learning), >bug
