Problem
Tier 1 currently contains a small set of canonical cases in which AI-generated legal citations failed.
The benchmark should expand carefully using only:
- court-documented incidents
- authorities retrievable from a canonical source
- reproducible source URLs
- stable citation metadata
Requirements for scoring-eligible additions
Each record should include:
- authoritative source URL
- retrieval timestamp
- canonical citation metadata
- reproducible verification path
- documented judicial or regulatory context
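One way to make these requirements checkable is to validate each candidate record before it becomes scoring-eligible. The following is a minimal sketch, not an existing schema: the `Tier1Record` class, its field names, and the `eligibility_problems` helper are all hypothetical, and it assumes the retrieval timestamp is stored as an ISO 8601 string.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from urllib.parse import urlparse


@dataclass(frozen=True)
class Tier1Record:
    """Hypothetical record shape; field names are illustrative assumptions."""
    source_url: str         # authoritative source URL
    retrieved_at: str       # retrieval timestamp (assumed ISO 8601)
    citation: dict          # canonical citation metadata (keys not specified here)
    verification_path: str  # how to reproduce the verification
    context: str            # documented judicial or regulatory context


def eligibility_problems(record: Tier1Record) -> list[str]:
    """Return a list of reasons the record is not scoring-eligible (empty if none)."""
    problems = []

    # The source URL must be absolute so it can be retrieved again later.
    parsed = urlparse(record.source_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("source_url is not an absolute http(s) URL")

    # The timestamp must parse; fromisoformat raises ValueError otherwise.
    try:
        datetime.fromisoformat(record.retrieved_at)
    except ValueError:
        problems.append("retrieved_at is not an ISO 8601 timestamp")

    # The remaining fields only need to be present and non-empty in this sketch.
    if not record.citation:
        problems.append("citation metadata is empty")
    if not record.verification_path.strip():
        problems.append("verification_path is empty")
    if not record.context.strip():
        problems.append("context is empty")

    return problems
```

This sketch only checks that each required field is present and well-formed. It does not confirm that the URL is actually authoritative or that the cited authority exists, which would still need a manual or separately scripted verification step.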
Non-goals
Do NOT add:
- social-media anecdotes
- unverifiable claims
- media summaries without underlying authority
- speculative incidents
Potential future additions
- sanctions orders
- judicial findings involving fabricated citations
- disciplinary proceedings
- verified filing incidents involving non-existent authorities