Add overlay_dx() metric to darts.metrics: tolerance-sweep “visual alignment” score (normalized AUC) #2970

@ngohlong

Is your feature request related to a current problem? Please describe.
When evaluating forecasts, common error metrics (MAE/RMSE/MAPE/…) don’t reflect visual/operational alignment across tolerance thresholds. In practice, stakeholders often care about “how often are we within X%?” across multiple X values.

Describe proposed solution
We propose Overlay-dx: compute the fraction of points within tolerance bands across a range of tolerances (defined as % of target range), then compute the normalized AUC of that curve to get a [0, 1] score (higher is better).
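To make the proposal concrete, here is a minimal NumPy sketch of the computation on plain 1-D arrays. It is not the proposed Darts implementation. The name `overlay_dx_sketch` is hypothetical, and several details are assumptions: "target range" is read as max minus min of the actual values, tolerances are inclusive, the AUC uses the trapezoidal rule, and a constant target scores 1.0 only on an exact match. It assumes inputs without NaNs.

```python
import numpy as np


def overlay_dx_sketch(actual, pred, min_percentage=0.1, max_percentage=100.0, step=0.1):
    """Normalized AUC of the tolerance-sweep coverage curve (illustrative only)."""
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)

    # Assumption: the "target range" is max - min of the actual values.
    value_range = actual.max() - actual.min()
    if value_range == 0:
        # Assumption: for a constant target, only an exact match counts.
        return float(np.all(actual == pred))

    # Tolerance grid, expressed as percentages of the target range.
    percentages = np.arange(min_percentage, max_percentage + step / 2, step)
    tolerances = percentages / 100.0 * value_range

    # Coverage curve: fraction of points whose absolute error is within each tolerance.
    abs_err = np.abs(actual - pred)
    coverage = (abs_err[None, :] <= tolerances[:, None]).mean(axis=1)

    # Trapezoidal AUC, normalized by the width of the percentage axis so the
    # result lies in [0, 1].
    auc = np.sum((coverage[1:] + coverage[:-1]) / 2.0 * np.diff(percentages))
    return float(auc / (percentages[-1] - percentages[0]))
```

Under these assumptions a perfect forecast scores 1.0, and a forecast that is off everywhere by half the target range scores roughly 0.5.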

Describe potential alternatives
Darts metrics are exposed in darts.metrics.metrics with a standard signature: actual_series, pred_series, intersect, an optional quantile selection q, component and series reduction functions (component_reduction, series_reduction), and parallelism options (n_jobs, verbose).

We’d like Overlay-dx to be available like other “aggregated over time” metrics, so users can call it:

from darts.metrics import overlay_dx

score = overlay_dx(actual_series, pred_series)

Implementation:

@multi_ts_support
@multivariate_support
def overlay_dx(
    actual_series,
    pred_series,
    intersect: bool = True,
    *,
    q=None,
    max_percentage: float = 100.0,
    min_percentage: float = 0.1,
    step: float = 0.1,
    component_reduction=np.nanmean,
    series_reduction=None,
    n_jobs: int = 1,
    verbose: bool = False,
):
    """Overlay-DX (normalized AUC of tolerance-sweep coverage curve)."""
