Formalize and protect the treatment-vs-policy vocabulary with a lint/test

## Summary

Add a lightweight guard (a test, a custom ruff/grep check, or both) that protects the
hard treatment-vs-policy vocabulary separation documented as an invariant. The guard
should flag variable or parameter names in the estimator core that risk conflating
the two concepts and thereby inverting results.

## Why this matters

The invariants doc states that confusing "treatment" (logged action) with "policy"
(model recommendation) "will invert or completely invalidate evaluation results".
This is a correctness landmine for any contributor or AI agent. A small automated
guard makes the most dangerous semantic invariant visible at edit time, protecting
a deliberately-chosen design.

## Current evidence

- `docs/agent-context/invariants.md` "Domain Vocabulary (Hard Separation)" and
  `AGENTS.md` §3 both document the treatment/policy separation.
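- The estimator core (`core.py`) relies on this distinction through the importance
  weight `w = π/e`, with the policy probability `π` in the numerator and the logged
  propensity `e` in the denominator (`_dr_weight_components`, `dr_value_with_clip`).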
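
For reference, the directionality can be written out explicitly. This is a sketch in standard off-policy notation; the symbols $x_i$ (context) and $a_i$ (logged treatment) are assumptions introduced here, not names taken from `core.py`.

$$
w_i = \frac{\pi(a_i \mid x_i)}{e(a_i \mid x_i)}
$$

Here $\pi$ is the probability that the policy recommends the logged treatment and $e$ is the logged propensity of that treatment. Swapping the two yields $1 / w_i$, which up-weights exactly the samples that should be down-weighted.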

## External context

Not required for this issue.

## Proposed implementation

1. Encode a check (test or pre-commit hook) that flags suspicious renames/usages
   conflating the two concepts in the estimator core (heuristic, advisory).
2. Add a focused test documenting the expected directionality of the importance
   weight (policy in numerator, logged propensity in denominator).
3. Cross-link the guard from the invariants doc.
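
Step 1 could be sketched roughly as below. This is a minimal illustration, not the project's actual check: the function names `find_vocabulary_warnings` and `report` and both heuristics are hypothetical, and the real rule set would need tuning against `core.py` to avoid false positives.

```python
import ast

# Hypothetical vocabulary: the two terms that must never be conflated.
TERMS = ("treatment", "policy")


def _mentions(name: str, term: str) -> bool:
    return term in name.lower()


def find_vocabulary_warnings(source: str) -> list[tuple[int, str]]:
    """Return (line, message) pairs for names that appear to mix the two concepts."""
    warnings: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        # Heuristic 1: one identifier mentions both terms, e.g. `treatment_policy`.
        name = None
        if isinstance(node, ast.Name):
            name = node.id
        elif isinstance(node, ast.arg):
            name = node.arg
        if name and all(_mentions(name, t) for t in TERMS):
            warnings.append((node.lineno, f"name '{name}' mixes treatment and policy"))

        # Heuristic 2: a plain `a = b` assignment moves one concept into the other.
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
            and isinstance(node.value, ast.Name)
        ):
            target, value = node.targets[0].id, node.value.id
            for a, b in (TERMS, TERMS[::-1]):
                if (
                    _mentions(target, a)
                    and _mentions(value, b)
                    and not _mentions(target, b)
                    and not _mentions(value, a)
                ):
                    warnings.append(
                        (node.lineno, f"'{target} = {value}' assigns a {b} to a {a}")
                    )
    return sorted(warnings)


def report(paths: list[str]) -> int:
    """Print warnings for each file; always return 0 so the check stays advisory."""
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line, message in find_vocabulary_warnings(handle.read()):
                print(f"{path}:{line}: advisory: {message}")
    return 0
```

Because `report` always returns 0, wiring it into a pre-commit hook would surface warnings without blocking commits, which matches the advisory intent of step 1.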

## AI-agent execution notes

- Inspect first: `core.py` (`_dr_weight_components`, `dr_value_with_clip`),
  `docs/agent-context/invariants.md`, `AGENTS.md` §3, `.pre-commit-config.yaml`.
- Keep any naming heuristic advisory (avoid false-positive churn); the
  directionality test is the firm guarantee.
- Do not change estimator behaviour.

## Acceptance criteria

- A test asserts the importance-weight directionality (policy/propensity).
- An advisory guard (or documented convention check) for the vocabulary exists.
- The invariants doc links to the guard.

## Test plan

- Run the directionality test on a controlled example where swapping numerator and
  denominator is detectable, then run `make validate`.
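
A controlled example of this kind might look like the sketch below. `importance_weights` is a hypothetical stand-in: the signatures of `_dr_weight_components` and `dr_value_with_clip` are not shown in this issue, so the real test would call those functions instead. The inputs are chosen so that the correct and swapped ratios differ in both value and ordering.

```python
import math


def importance_weights(policy_prob, propensity):
    """Hypothetical stand-in for the core computation: w = pi / e, element-wise."""
    return [p / e for p, e in zip(policy_prob, propensity)]


def test_importance_weight_directionality():
    # Sample 0: the policy favours the logged treatment, which was rarely logged.
    # Sample 1: the policy disfavours the logged treatment, which was often logged.
    policy_prob = [0.5, 0.25]
    propensity = [0.25, 0.5]

    w = importance_weights(policy_prob, propensity)

    # Policy in the numerator, logged propensity in the denominator.
    assert math.isclose(w[0], 2.0)
    assert math.isclose(w[1], 0.5)
    assert w[0] > w[1]

    # Swapping the arguments reverses the ordering, so an inversion is detectable.
    swapped = importance_weights(propensity, policy_prob)
    assert swapped[0] < swapped[1]
```

Asserting both the exact values and the ordering means a swap fails the test even if a later change rescales the weights.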

## Documentation plan

- Link to the guard from the invariants doc; add an entry under CHANGELOG `### Testing`.

## Migration and compatibility notes

Not expected to require migration.

## Risks and tradeoffs

A naming heuristic can be noisy; keep it advisory and rely on the directionality
test for the hard guarantee.

## Suggested labels

testing, architecture, reliability
