Track Retrieval Test Coverage per Deal #109

@SgtPooki

Summary

Retrieval tests currently draw candidates from the newest deals only (createdAt DESC), so the same recent deals are tested repeatedly while older or never-tested deals are skipped. Add per-deal tracking fields and adjust selection to favor never-tested and least-recently-tested deals while keeping provider balancing.

Current Behavior

  • RetrievalService.selectRandomDealsForRetrieval() loads deals ordered by createdAt DESC, takes Math.max(count * 2, 100), groups by spAddress, shuffles within each provider, then selects a balanced set.
  • No tracking exists for “when was this deal last retrieval-tested” or “how many times has it been tested”.
  • Because deals are shuffled within each provider, selection is effectively random within a provider (biased only by the initial “newest deals” cutoff), which makes it hard to improve coverage without changing the approach.

Code: apps/backend/src/retrieval/retrieval.service.ts

Proposed Changes

1) Add retrieval tracking fields to deals

  • deals.last_retrieved_at TIMESTAMPTZ NULL
  • deals.retrieval_count INTEGER NOT NULL DEFAULT 0
  • Index for sorting/filtering: IDX_deals_last_retrieved_at on last_retrieved_at (optionally partial: WHERE last_retrieved_at IS NOT NULL)
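A rough sketch of what the migration could look like, assuming PostgreSQL and a table named deals. The SqlRunner interface and the class name AddRetrievalTrackingToDeals are hypothetical stand-ins for whatever migration runner the backend actually uses; only the column and index names come from this proposal.

```typescript
// Hypothetical stand-in for the migration runner's query interface.
// The real project would likely use its ORM's own query runner type.
interface SqlRunner {
  query(sql: string): Promise<unknown>;
}

// Forward statements: the two tracking columns plus the proposed index.
const UP_STATEMENTS: readonly string[] = [
  `ALTER TABLE deals ADD COLUMN last_retrieved_at TIMESTAMPTZ NULL`,
  `ALTER TABLE deals ADD COLUMN retrieval_count INTEGER NOT NULL DEFAULT 0`,
  `CREATE INDEX "IDX_deals_last_retrieved_at" ON deals (last_retrieved_at)`,
];

// Rollback statements: drop the index first, then the columns.
const DOWN_STATEMENTS: readonly string[] = [
  `DROP INDEX "IDX_deals_last_retrieved_at"`,
  `ALTER TABLE deals DROP COLUMN retrieval_count`,
  `ALTER TABLE deals DROP COLUMN last_retrieved_at`,
];

class AddRetrievalTrackingToDeals {
  async up(runner: SqlRunner): Promise<void> {
    for (const sql of UP_STATEMENTS) {
      await runner.query(sql);
    }
  }

  async down(runner: SqlRunner): Promise<void> {
    for (const sql of DOWN_STATEMENTS) {
      await runner.query(sql);
    }
  }
}
```

Existing rows pick up retrieval_count = 0 from the column default and last_retrieved_at = NULL, so every current deal starts out as never tested.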

2) Update selection to prioritize coverage (while keeping provider balancing)

When selecting candidate deals, order by:

  1. lastRetrievedAt ASC NULLS FIRST (never tested first)
  2. retrievalCount ASC (lower count first)
  3. createdAt DESC (tie-breaker)
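To make the ordering concrete, here is an in-memory comparator that mirrors the three sort keys above. It is only an illustration: in practice the ordering would most likely be expressed in the database query, and the DealCandidate shape is an assumption based on the field names in this issue.

```typescript
// Assumed shape of a candidate deal; only these fields matter for ordering.
interface DealCandidate {
  id: number;
  createdAt: Date;
  lastRetrievedAt: Date | null;
  retrievalCount: number;
}

function compareByCoverage(a: DealCandidate, b: DealCandidate): number {
  // 1. lastRetrievedAt ASC NULLS FIRST: never-tested deals sort first.
  if (a.lastRetrievedAt === null && b.lastRetrievedAt !== null) return -1;
  if (a.lastRetrievedAt !== null && b.lastRetrievedAt === null) return 1;
  if (a.lastRetrievedAt !== null && b.lastRetrievedAt !== null) {
    const diff = a.lastRetrievedAt.getTime() - b.lastRetrievedAt.getTime();
    if (diff !== 0) return diff;
  }
  // 2. retrievalCount ASC: fewer attempts sort first.
  if (a.retrievalCount !== b.retrievalCount) {
    return a.retrievalCount - b.retrievalCount;
  }
  // 3. createdAt DESC: newer deals win remaining ties.
  return b.createdAt.getTime() - a.createdAt.getTime();
}

// Returns a sorted copy and leaves the input untouched.
function rankByCoverage(deals: DealCandidate[]): DealCandidate[] {
  return deals.slice().sort(compareByCoverage);
}
```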

Selection approach (keeps provider balancing and preserves priority):

  • Query a prioritized candidate set using the ORDER BY above.
  • Group candidates by spAddress without reordering (preserve the query order within each provider).
  • Select a balanced batch by taking from the front of each provider’s list (round-robin or “deals per provider”).
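The grouping and round-robin steps above could be sketched roughly as follows. The function name selectBalanced and the RankedDeal shape are hypothetical; the input is assumed to be already sorted by the coverage ordering, and the sketch relies on Map preserving insertion order so providers are visited in the order they first appear.

```typescript
// Assumed minimal shape; the input array is already in priority order.
interface RankedDeal {
  id: number;
  spAddress: string;
}

function selectBalanced(ranked: RankedDeal[], count: number): RankedDeal[] {
  // Group by provider without reordering, so each list keeps query order.
  const byProvider = new Map<string, RankedDeal[]>();
  for (const deal of ranked) {
    const list = byProvider.get(deal.spAddress);
    if (list) {
      list.push(deal);
    } else {
      byProvider.set(deal.spAddress, [deal]);
    }
  }

  // Round-robin: in round N, take the Nth deal from each provider's list.
  const queues = Array.from(byProvider.values());
  const selected: RankedDeal[] = [];
  for (let round = 0; selected.length < count; round++) {
    let tookAny = false;
    for (const queue of queues) {
      const next = queue[round];
      if (next !== undefined && selected.length < count) {
        selected.push(next);
        tookAny = true;
      }
    }
    if (!tookAny) break; // every provider's list is exhausted
  }
  return selected;
}
```

Round-robin means a provider with few candidates simply drops out once its list is empty, and the remaining slots go to providers that still have deals.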

Optional randomness (without losing the coverage bias):

  • Only randomize within “equivalent priority” ties inside a provider (e.g., same retrievalCount and same/NULL lastRetrievedAt), then still take from the front.
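One possible way to add that tie-only randomness is to shuffle each run of consecutive deals that share the same priority key, leaving the order between runs untouched. This is a sketch under assumptions: the helper names are hypothetical, the input is one provider's list already in priority order, and the random source is injectable so the behavior can be tested deterministically.

```typescript
// Assumed minimal shape for one provider's already-ranked deals.
interface TieDeal {
  id: number;
  lastRetrievedAt: Date | null;
  retrievalCount: number;
}

// Deals with the same key are treated as equivalent priority.
function priorityKey(deal: TieDeal): string {
  const last =
    deal.lastRetrievedAt === null ? "never" : String(deal.lastRetrievedAt.getTime());
  return last + "|" + deal.retrievalCount;
}

// Uniform shuffle by repeatedly removing a random element from a copy.
function shuffled(items: TieDeal[], random: () => number): TieDeal[] {
  const pool = items.slice();
  const out: TieDeal[] = [];
  while (pool.length > 0) {
    const index = Math.floor(random() * pool.length);
    const picked = pool.splice(index, 1)[0];
    if (picked !== undefined) out.push(picked);
  }
  return out;
}

// Shuffles only inside runs of equal priority; runs stay in their original order.
function shuffleWithinTies(
  ranked: TieDeal[],
  random: () => number = Math.random,
): TieDeal[] {
  const result: TieDeal[] = [];
  let run: TieDeal[] = [];
  let runKey: string | null = null;
  for (const deal of ranked) {
    const key = priorityKey(deal);
    if (runKey !== null && key !== runKey) {
      result.push(...shuffled(run, random));
      run = [];
    }
    run.push(deal);
    runKey = key;
  }
  result.push(...shuffled(run, random));
  return result;
}
```

Since exact timestamp matches are unlikely, in practice this mostly randomizes among never-tested deals, which is where the old newest-first bias would otherwise reappear through the createdAt tie-breaker.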

3) Update tracking once per deal attempt

After a deal’s retrieval tests finish (success or failure), update:

  • lastRetrievedAt = NOW()
  • retrievalCount = retrievalCount + 1

Notes:

  • Update should run even if all retrieval methods fail (track “attempts”, not just successes).
  • Prefer an atomic DB update (UPDATE ... SET retrieval_count = retrieval_count + 1) to avoid lost updates if concurrent workers can test the same deal.
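A minimal sketch of the once-per-attempt update, assuming PostgreSQL-style parameters and a primary key column named id (both assumptions, as are the SqlExecutor interface and the function name). The try/finally ensures the tracking update runs whether the retrieval methods succeed or throw, and the increment happens inside the UPDATE itself rather than as a read-then-write in application code.

```typescript
// Hypothetical stand-in for whatever DB client or query runner the service uses.
interface SqlExecutor {
  query(sql: string, params: unknown[]): Promise<unknown>;
}

// Single-statement update: the increment is evaluated by the database,
// so concurrent workers cannot overwrite each other's counts.
const RECORD_ATTEMPT_SQL =
  "UPDATE deals SET last_retrieved_at = NOW(), " +
  "retrieval_count = retrieval_count + 1 WHERE id = $1";

// Runs all retrieval methods for one deal, then records exactly one attempt.
async function testDealAndRecordAttempt(
  db: SqlExecutor,
  dealId: string,
  runAllRetrievalMethods: () => Promise<void>,
): Promise<void> {
  try {
    await runAllRetrievalMethods();
  } finally {
    // Runs on success and on failure: this tracks attempts, not successes.
    await db.query(RECORD_ATTEMPT_SQL, [dealId]);
  }
}
```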

Implementation Checklist

  • Migration: add last_retrieved_at, retrieval_count, and IDX_deals_last_retrieved_at
  • Entity: add lastRetrievedAt and retrievalCount columns in apps/backend/src/database/entities/deal.entity.ts
  • Retrieval selection: replace createdAt DESC ordering with coverage-based ordering, preserve provider balancing
  • Retrieval tracking update: increment once per deal attempt (not per method), including failure paths
  • Rollback: migration down() cleanly removes index + columns
