
[ENH] Enable custom cost/score aggregators in a unified way across detectors #33

Open
@Tveten

The output from BaseIntervalEvaluator is 2d so that it can hold univariate costs or scores per input data column. Such multivariate outputs are currently summed over columns, and the summation happens separately inside each detector. However, many multivariate changepoint and anomaly detection methods differ in how they aggregate information across the univariate components. This aggregation should be handled in a unified way, so that aggregators can be reused and customised easily.
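To make the motivation concrete, here is a minimal sketch of how the choice of aggregator changes which interval looks most interesting. The `scores` array is a hypothetical stand-in for the 2d evaluator output (one row per evaluated interval, one column per data column); it is not produced by skchange here.

```python
import numpy as np

# Hypothetical stand-in for a 2d evaluator output:
# one row per evaluated interval, one column per input data column.
scores = np.array(
    [
        [1.0, 1.0, 1.0, 1.0],  # small change in every column (dense)
        [0.0, 0.0, 0.0, 3.5],  # large change in a single column (sparse)
    ]
)

# Behaviour described in this issue: sum over columns.
summed = scores.sum(axis=1)

# One possible alternative: take the maximum over columns,
# which favours changes concentrated in few columns.
maxed = scores.max(axis=1)

# The two aggregators disagree on which row scores highest.
best_by_sum = int(np.argmax(summed))
best_by_max = int(np.argmax(maxed))
```

With the sum, the dense row wins; with the maximum, the sparse row wins. That difference is why the aggregator should be swappable rather than fixed inside each detector.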

Need to decide:

  • Where should aggregation occur? Is it a component of interval evaluators or detectors?
  • Aggregation design: Is it a class? Is it a function? Does the function take one row of costs, or a matrix of several cost evaluations?

Requirements:

  • Ease of customisation/extension/flexibility.
  • Performance. The aggregation operation can easily become a bottleneck in computations for high-dimensional data.

Option 1

Use np.apply_along_axis: the user passes any function that maps one row of costs or scores to a single value, and the detector forwards it to np.apply_along_axis.

Pros:

  • Simple and flexible.

Cons:

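  • Slow. np.apply_along_axis calls the function once per row from Python, and every aggregator is forced through it, even those that could be vectorised or compiled.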
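A minimal sketch of what Option 1 could look like. The names `aggregate_rowwise` and `sum_of_top_two` are hypothetical and only illustrate the idea; they are not part of skchange.

```python
import numpy as np


def aggregate_rowwise(scores, func):
    """Apply a user-supplied row function (1d array -> scalar) to each row."""
    # axis=1 means func receives one row of `scores` at a time.
    return np.apply_along_axis(func, 1, scores)


def sum_of_top_two(row):
    """Example user function: sum of the two largest entries in a row."""
    return np.sort(row)[-2:].sum()


scores = np.array([[1.0, 4.0, 2.0], [0.5, 0.5, 3.0]])
aggregated = aggregate_rowwise(scores, sum_of_top_two)
```

The user only ever writes a function of a single row, which keeps the interface simple, but the row loop runs in Python no matter how the function itself is written.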

Option 2

Allow custom aggregation functions: any function that takes a 2d array and returns a 1d array with one entry per row of the input.

Pros:

  • Flexible.
  • Speed: Allows aggregation functions for entire cost/score matrices to be written in numba.
  • Doesn't need to implement aggregation functions in skchange.

Cons:

  • Maybe too flexible? How to validate the input function?
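One possible answer to the validation question, sketched under assumptions: call the function on a small probe matrix and check the shape of what comes back. The helper `check_aggregator` is hypothetical, and plain NumPy functions are used here to keep the example self-contained; a numba-compiled function with the same 2d-in, 1d-out signature would presumably be passed the same way.

```python
import numpy as np


def sum_aggregator(scores):
    """Aggregate a 2d score matrix by summing over columns."""
    return scores.sum(axis=1)


def check_aggregator(func, n_rows=3, n_cols=4):
    """Hypothetical check: func must map (n_rows, n_cols) to (n_rows,)."""
    probe = np.ones((n_rows, n_cols))
    out = np.asarray(func(probe))
    if out.shape != (n_rows,):
        raise ValueError(
            f"Aggregator must return shape ({n_rows},), got {out.shape}."
        )
    return func


# A valid aggregator passes through unchanged.
validated = check_aggregator(sum_aggregator)
result = validated(np.array([[1.0, 2.0], [3.0, 4.0]]))
```

A probe call only checks the output shape on one input size; it says nothing about whether the aggregation is statistically sensible, so this covers only part of the "too flexible" concern.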

Option 3

Option 2, but introduce an aggregation class that handles aggregator validation.

Pros:

  • Same as Option 2.
  • Simpler to handle input validation.

Cons:

  • Yet another class for the user to learn.
  • Need to implement a range of common aggregators as classes in skchange.
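A rough sketch of how Option 3 might be structured, assuming a base class that validates input and output around a subclass hook. All class names below (`BaseAggregator`, `SumAggregator`, `FunctionAggregator`) are hypothetical and not an existing skchange API.

```python
from abc import ABC, abstractmethod

import numpy as np


class BaseAggregator(ABC):
    """Hypothetical base class: validates shapes around `_aggregate`."""

    def __call__(self, scores):
        scores = np.asarray(scores, dtype=float)
        if scores.ndim != 2:
            raise ValueError("scores must be a 2d array.")
        out = np.asarray(self._aggregate(scores))
        if out.shape != (scores.shape[0],):
            raise ValueError("Aggregator must return one value per row.")
        return out

    @abstractmethod
    def _aggregate(self, scores):
        """Map a 2d array to a 1d array with one entry per row."""


class SumAggregator(BaseAggregator):
    """Built-in example: sum over columns."""

    def _aggregate(self, scores):
        return scores.sum(axis=1)


class FunctionAggregator(BaseAggregator):
    """Wrap a plain 2d -> 1d function, as in Option 2."""

    def __init__(self, func):
        self.func = func

    def _aggregate(self, scores):
        return self.func(scores)


scores = np.array([[1.0, 2.0], [3.0, 4.0]])
by_sum = SumAggregator()(scores)
by_max = FunctionAggregator(lambda s: s.max(axis=1))(scores)
```

A wrapper such as `FunctionAggregator` would let users keep passing plain functions, which might soften the "yet another class" drawback while keeping validation in one place.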

Labels: enhancement (New feature or request)
