Description
Is your feature request related to a current problem? Please describe.
When evaluating forecasts, common error metrics (MAE/RMSE/MAPE/…) summarize the average size of the error, but they do not show how closely a forecast tracks the target visually or operationally at different tolerance thresholds. In practice, stakeholders often ask "how often are we within X%?" for several values of X.
Describe proposed solution
We propose Overlay-dx: compute the fraction of points within tolerance bands across a range of tolerances (defined as % of target range), then compute the normalized AUC of that curve to get a [0, 1] score (higher is better).
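To make the proposal concrete, here is a minimal sketch of the computation on plain NumPy arrays. It is not the Darts implementation. It assumes that "target range" means the max minus the min of the actual values, that the tolerance grid is evenly spaced from min_percentage to max_percentage, and that the area is taken with the trapezoidal rule. The name overlay_dx_sketch and the handling of NaNs and constant targets are illustrative choices only.

```python
import numpy as np


def overlay_dx_sketch(actual, pred, max_percentage=100.0, min_percentage=0.1, step=0.1):
    """Illustrative Overlay-dx: normalized AUC of the tolerance-sweep coverage curve."""
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)

    # Assumption: ignore time steps where either value is NaN.
    valid = ~(np.isnan(actual) | np.isnan(pred))
    actual, pred = actual[valid], pred[valid]

    # Assumption: "target range" is max(actual) - min(actual).
    value_range = actual.max() - actual.min()
    abs_err = np.abs(actual - pred)
    if value_range == 0:
        # Degenerate constant target: full score only for an exact match.
        return float(np.all(abs_err == 0))

    # Evenly spaced tolerance grid, in percent of the target range.
    n_steps = int(round((max_percentage - min_percentage) / step)) + 1
    percentages = np.linspace(min_percentage, max_percentage, n_steps)
    tolerances = percentages / 100.0 * value_range

    # coverage[i] = fraction of points whose error is within tolerances[i].
    coverage = (abs_err[None, :] <= tolerances[:, None]).mean(axis=1)
    if n_steps < 2:
        return float(coverage[0])

    # Trapezoidal area under the coverage curve, normalized to [0, 1].
    area = np.sum((coverage[1:] + coverage[:-1]) / 2.0 * np.diff(percentages))
    return float(area / (percentages[-1] - percentages[0]))


score = overlay_dx_sketch([0.0, 5.0, 10.0], [0.5, 5.0, 9.0])
```

A perfect forecast scores close to 1, and a forecast whose error exceeds the whole target range at every point scores 0. Errors that are small relative to the range only cost the narrow tolerance bands, so the score degrades gradually.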
Describe potential alternatives
Darts metrics are exposed in darts.metrics.metrics with a standard signature: actual_series, pred_series, intersect, an optional quantile selection q, reduction functions (component_reduction, series_reduction), and parallelism options (n_jobs, verbose).
We’d like Overlay-dx to be available like other “aggregated over time” metrics, so users can call it:
from darts.metrics import overlay_dx
score = overlay_dx(actual_series, pred_series)
Implementation:
@multi_ts_support
@multivariate_support
def overlay_dx(
    actual_series,
    pred_series,
    intersect: bool = True,
    *,
    q=None,
    max_percentage: float = 100.0,
    min_percentage: float = 0.1,
    step: float = 0.1,
    component_reduction=np.nanmean,
    series_reduction=None,
    n_jobs: int = 1,
    verbose: bool = False,
):
    """Overlay-DX (normalized AUC of tolerance-sweep coverage curve)."""
Additional context