TH component uses lifetime settlement rate — misses recent degradation

## Problem

The Transaction History (TH) component in §01 computes:

```
TH = settled_count / total_count  # all-time lifetime ratio
```

An agent that settled 500 consecutive transactions and then failed the next 20 still scores TH = 500/520 ≈ 0.96. The ALPHA composite score barely moves, yet the agent is demonstrably degraded. The lifetime ratio is accurate as a historical statement but misleading as a current trust signal.
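To make the arithmetic concrete, the sketch below compares the lifetime ratio with a ratio over only the most recent outcomes, reading the example as 500 settled transactions followed by 20 failures. The 20-transaction window is an illustrative assumption, not a value from the spec.

```python
# 500 settled transactions followed by 20 failures (True = settled).
outcomes = [True] * 500 + [False] * 20

# Lifetime ratio, as §01 currently defines TH.
lifetime_th = sum(outcomes) / len(outcomes)

# Ratio over the most recent 20 outcomes only (window size is illustrative).
recent = outcomes[-20:]
recent_th = sum(recent) / len(recent)

print(f"lifetime TH = {lifetime_th:.3f}")
print(f"recent TH   = {recent_th:.3f}")
```

The lifetime figure stays close to 1 while the recent figure collapses, which is the gap this issue is about.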

## Real-World Evidence

We ran a 20-agent behavioral monitoring study (published as [PDR v1.0 on Zenodo](https://doi.org/10.5281/zenodo.19028012)) tracking agent reliability over 30 days. Key finding:

> Window-based scoring (last 14 days) detected a **57% reliability drop** that the lifetime aggregate missed entirely. The same agents, using lifetime settlement rate, appeared "stable" for 72 hours after degradation began.

This is structurally identical to TH's current design: lifetime statistics absorb recent failures.

## Proposed Fix

Add a **windowed settlement rate** as an optional TH mode:

1. **AgentBaseline** stores a ring buffer of recent transaction outcomes (configurable window, e.g., last 90 days or last N transactions)
2. TH is computed from the window when sufficient data exists, falling back to the lifetime rate for new agents
3. A new `th_window_days` parameter (default: 90; `None` = lifetime, for backward compatibility)
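One way the write path for these steps might look is sketched below. It is a minimal illustration under stated assumptions: `TxnOutcome`, the `record` method, and the field layout are hypothetical names introduced here, and the real `AgentBaseline` in `truce-py` may be shaped differently. It also assumes outcomes are recorded in timestamp order.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Deque, Optional


@dataclass
class TxnOutcome:
    # Hypothetical record type; field names follow the sketch in this issue.
    timestamp: datetime
    settled: bool
    cancelled: bool = False


@dataclass
class AgentBaseline:
    # Only the fields relevant to TH are shown (assumed layout).
    settled_count: int = 0
    total_count: int = 0
    recent_window: Deque[TxnOutcome] = field(default_factory=deque)

    def record(self, txn: TxnOutcome, window_days: Optional[int] = 90) -> None:
        """Update lifetime counters and the time-bounded recent window."""
        self.total_count += 1
        # Assumption: a cancelled transaction does not count as settled.
        if txn.settled and not txn.cancelled:
            self.settled_count += 1
        if window_days is None:
            return  # lifetime-only mode: no window is maintained
        self.recent_window.append(txn)
        cutoff = txn.timestamp - timedelta(days=window_days)
        # Outcomes arrive in time order, so expired ones sit at the left end.
        while self.recent_window and self.recent_window[0].timestamp < cutoff:
            self.recent_window.popleft()
```

Pruning on write keeps the read path cheap: the TH computation only needs to iterate over whatever is currently in `recent_window`.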

Spec change (§01):

> TH SHOULD use a windowed settlement rate over a configurable lookback period (default: 90 days). When fewer than 5 transactions are available in the window, implementations MUST fall back to the lifetime rate.

Python implementation sketch:

```python
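from typing import Optional

def _compute_th(self, b: AgentBaseline, window_days: Optional[int] = 90) -> float:
    # window_days=None selects the lifetime rate (current v0.1 behavior).
    # b.recent_window is assumed to be pruned to the last window_days days on write.
    if window_days is not None and b.recent_window and len(b.recent_window) >= 5:
        settled = sum(1 for txn in b.recent_window if txn.settled and not txn.cancelled)
        return settled / len(b.recent_window)
    # Fall back to the lifetime rate (sparse window or lifetime mode)
    if b.total_count == 0:
        return 0.5  # no history yet
    return b.settled_count / b.total_count
```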
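The fallback rule can be exercised with a few stand-in objects. The sketch below copies the windowed logic into a standalone function (`compute_th`, a hypothetical name) and uses `SimpleNamespace` in place of the real `AgentBaseline` and transaction types, so the counts and windows shown are made-up test fixtures rather than real data.

```python
from types import SimpleNamespace

MIN_WINDOW_TXNS = 5  # threshold from the proposed spec text


def compute_th(b) -> float:
    """Standalone copy of the windowed-TH logic, for illustration only."""
    if b.recent_window and len(b.recent_window) >= MIN_WINDOW_TXNS:
        settled = sum(1 for t in b.recent_window if t.settled and not t.cancelled)
        return settled / len(b.recent_window)
    if b.total_count == 0:
        return 0.5
    return b.settled_count / b.total_count


def txn(settled: bool, cancelled: bool = False) -> SimpleNamespace:
    # Stand-in for a transaction record with the two fields the logic reads.
    return SimpleNamespace(settled=settled, cancelled=cancelled)


# New agent with no history at all.
new_agent = SimpleNamespace(recent_window=[], settled_count=0, total_count=0)

# Sparse window (4 < 5): the lifetime rate is used even though the window is all failures.
sparse = SimpleNamespace(
    recent_window=[txn(False)] * 4, settled_count=96, total_count=100
)

# Enough window data: the recent failures dominate the result.
degraded = SimpleNamespace(
    recent_window=[txn(True)] * 2 + [txn(False)] * 8,
    settled_count=492,
    total_count=500,
)

for name, agent in [("new", new_agent), ("sparse", sparse), ("degraded", degraded)]:
    print(name, compute_th(agent))
```

The sparse case shows the trade-off in the 5-transaction floor: a low-volume agent with a handful of recent failures is still scored on its lifetime rate.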

## Why This Matters

TH carries 0.25 weight in ALPHA. With lifetime averaging, a degraded agent's ALPHA score decays slowly: after n settled transactions, k consecutive failures lower TH only to n/(n+k), so the longer the history, the smaller the drop. That is too slow for real-time trust routing. A 90-day window makes TH responsive to the agent's current state, which is what counterparties need when deciding whether to transact.
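A rough calculation shows how little the composite moves. The sketch assumes ALPHA is a weighted sum in which TH contributes `0.25 * TH` and all other components stay unchanged; that structure is an assumption here, and `failures_to_reach` is a hypothetical helper written for this example.

```python
TH_WEIGHT = 0.25  # TH's weight in ALPHA, as stated in this issue


def failures_to_reach(threshold: float, prior_settled: int) -> int:
    """Smallest number of consecutive failures k that pushes the lifetime
    ratio prior_settled / (prior_settled + k) strictly below threshold."""
    if not (0 < threshold <= 1) or prior_settled <= 0:
        raise ValueError("need 0 < threshold <= 1 and prior_settled > 0")
    k = 0
    while prior_settled / (prior_settled + k) >= threshold:
        k += 1
    return k


# Lifetime TH after 500 settled transactions and 20 failures.
th_after = 500 / (500 + 20)

# Change in ALPHA from TH alone (weighted-sum assumption, other terms fixed).
alpha_drop = TH_WEIGHT * (1.0 - th_after)

print(f"TH after 20 failures: {th_after:.3f}")
print(f"ALPHA drop from TH:   {alpha_drop:.4f}")
print(f"failures needed for lifetime TH < 0.90: {failures_to_reach(0.90, 500)}")
```

Under these assumptions, 20 straight failures shift ALPHA by about one hundredth, and the lifetime ratio needs dozens of failures before it crosses 0.90.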

This is the same design principle behind credit score "recent inquiry" weighting vs. total history — the recent signal is higher-information for predicting near-future behavior.

## Implementation Notes

- **Backward compat:** When `th_window_days=None`, behavior is identical to current v0.1 spec. No breaking changes.
- **Storage cost:** A 90-day ring buffer for an active agent is O(N) where N is max daily transactions × 90. Trivial for most deployments.
- **CI:** The existing test suite structure in `tests/test_scorer.py` makes adding windowed TH tests straightforward.
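For a sense of scale on the storage point, the estimate below multiplies daily volume by the window length. The 16 bytes per entry and the example volumes are assumptions chosen for illustration, not measurements from any deployment.

```python
BYTES_PER_ENTRY = 16  # assumed: 8-byte timestamp plus outcome flags, padded


def ring_buffer_entries(max_daily_txns: int, window_days: int = 90) -> int:
    """Upper bound on entries held for one agent: daily volume x window length."""
    return max_daily_txns * window_days


for daily in (10, 1_000, 100_000):  # illustrative volumes
    n = ring_buffer_entries(daily)
    megabytes = n * BYTES_PER_ENTRY / 1e6
    print(f"{daily:>7} txns/day -> {n:>9} entries, ~{megabytes:.2f} MB")
```

If per-agent volume is very high, a count-based cap (the "last N transactions" option) bounds memory regardless of traffic.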

Happy to submit an RFC (`spec/rfcs/RFC-0001-windowed-th.md`) and a PR to `truce-py` if this direction resonates.

---

*This connects to the broader question in Issue #1 about Layer 3 Community Signals — recency weighting in community reputation data (not just settlement history) has the same structural problem. Windowed TH could be the template pattern.*
