[ML] Bad values for the variance scale #24

Closed
@tveasey

A user data set has exposed two issues with the variance scale calculation in version 6.2.2 of the analytics:

  1. it is sometimes negative(!),
  2. it is sometimes infinite.

In particular, we are seeing the following error messages logged:
Error calculating joint distribution: Bad variance scale -5.75
Error calculating joint distribution: Bad variance scale inf

There is no prospect of getting hold of the data set; however, the data characteristics sound benign. There were two detectors:

  • detector high_mean(x) over y, influencers y, z
  • detector high_median(x) over y, influencers y, z

For x we have min: 0, max: 4.34571, avg: 2.0736, and the cardinality of y is 430.

This issue is to investigate routes by which this problem could occur. The initial areas to investigate are CTimeSeriesDecomposition::scale and the calculation of the count variance scale, particularly for influencers.
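
As a starting point for that investigation, here is a minimal sketch of how a scale computed as a ratio of two variance estimates could end up infinite or negative, together with one possible guard. None of this is taken from the ml-cpp code base: the names `isValidVarianceScale`, `naiveVarianceScale` and `guardedVarianceScale` are hypothetical, the ratio form is an assumption about how the scale might be derived, and falling back to a scale of 1 is just one option. It also assumes IEEE 754 doubles, where dividing a positive value by zero yields inf.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of ml-cpp: a variance scale is only usable
// if it is finite and strictly positive.
bool isValidVarianceScale(double scale) {
    return std::isfinite(scale) && scale > 0.0;
}

// Assumed form of the calculation: a ratio of two variance estimates.
// A zero denominator gives inf (IEEE 754), and a negative estimate on
// either side gives a negative scale.
double naiveVarianceScale(double predictedVariance, double meanVariance) {
    return predictedVariance / meanVariance;
}

// One possible guard (an assumption, not the project's fix): fall back to a
// scale of 1, i.e. no scaling, whenever an input or the result is unusable.
double guardedVarianceScale(double predictedVariance, double meanVariance) {
    double scale = 1.0;
    if (predictedVariance > 0.0 && meanVariance > 0.0) {
        double candidate = predictedVariance / meanVariance;
        if (isValidVarianceScale(candidate)) {
            scale = candidate;
        }
    }
    assert(isValidVarianceScale(scale)); // postcondition: always usable
    return scale;
}
```

A check like `isValidVarianceScale` placed at each point where a scale is produced would at least narrow down which of the two candidate code paths emits the bad value.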

cc @LucaWintergerst.
