@lunelson lunelson commented Dec 5, 2025

🌟 What is the purpose of this PR?

This PR adds an initial implementation of a Mastra-based Named Entity
Recognition (NER) system. It establishes the foundation for migrating AI
inference capabilities from hash-ai-worker-ts to a structured Mastra
architecture.

🔗 Related links

  • Detailed architecture mapping:
    apps/hash-ai-agent/docs/mastra-migration-plan.md
  • Phase 2 roadmap: apps/hash-ai-agent/docs/NEXT_STEPS.md

🔍 What does this change?

  • Created entity type schemas with Zod validation
    (src/mastra/types/entities.ts)
  • Ported entity type dereferencing utility from hash-ai-worker-ts
    (src/mastra/shared/)
  • Implemented baseline NER agent using Google Gemini 2.0 Flash via
    OpenRouter
  • Created entity recall scorer with weighted penalty evaluation
  • Ported 4 stable test cases with ground truth data
  • Registered the new agent and scorer in the Mastra instance
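
To make "entity recall scorer with weighted penalty evaluation" more concrete, here is a minimal sketch of one way such a scorer could work. It is not the scorer from this PR: the type names, the `scoreEntityRecall` function, the per-entity weights, and the flat penalty for spurious entities are all assumptions chosen for illustration, and it does not use the Mastra scorer API.

```typescript
// Hypothetical sketch only: names, weights, and the formula are assumptions,
// not the scorer implemented in this PR.

interface GoldEntity {
  name: string;
  // Assumed relative importance of recalling this entity (higher = more important).
  weight: number;
}

interface RecallResult {
  score: number; // clamped to [0, 1]
  matched: string[];
  missed: string[];
  spurious: string[];
}

// Case- and whitespace-insensitive comparison key.
const normalize = (name: string): string => name.trim().toLowerCase();

function scoreEntityRecall(
  gold: GoldEntity[],
  extracted: string[],
  spuriousPenalty = 0.1, // assumed flat penalty per extracted entity not in the gold set
): RecallResult {
  const extractedSet = new Set(extracted.map(normalize));
  const goldSet = new Set(gold.map((g) => normalize(g.name)));

  const matched: string[] = [];
  const missed: string[] = [];
  let totalWeight = 0;
  let matchedWeight = 0;

  for (const entity of gold) {
    totalWeight += entity.weight;
    if (extractedSet.has(normalize(entity.name))) {
      matchedWeight += entity.weight;
      matched.push(entity.name);
    } else {
      missed.push(entity.name);
    }
  }

  // Extracted names (normalized) that do not appear in the ground truth.
  const spurious = Array.from(extractedSet).filter((n) => !goldSet.has(n));

  // Weighted recall, then subtract the penalty and clamp to [0, 1].
  const recall = totalWeight > 0 ? matchedWeight / totalWeight : 1;
  const score = Math.min(1, Math.max(0, recall - spuriousPenalty * spurious.length));

  return { score, matched, missed, spurious };
}
```

With gold weights of 2, 1, and 1, an extraction that recalls the first and third entities and adds one unknown entity would get a weighted recall of 3/4 and a score of roughly 0.65 under these assumed settings.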

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

  • does not modify any publishable blocks or libraries, or modifications do
    not need publishing

📜 Does this require a change to the docs?

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

  • do not affect the execution graph

⚠️ Known issues

None. This PR is the Phase 1 foundation only; it does not yet include claims
extraction or the full entity proposal pipeline.

🐾 Next steps

Phase 2: Implement claims-based extraction (claim extraction agent →
entity proposal agent → three-step workflow). See docs/NEXT_STEPS.md.

🛡 What tests cover this?

  • Smoke test: src/mastra/evals/test-entity-extraction.ts
  • Full evaluation: src/mastra/evals/run-entity-extraction-eval.ts
  • Entity recall scorer with 4 test cases covering edge cases

❓ How to test this?

  1. Check out the branch
  2. Ensure OPENROUTER_API_KEY is set
  3. Run: cd apps/hash-ai-agent && pnpm tsx src/mastra/evals/test-entity-extraction.ts
  4. Verify that the agent extracts entities from the test case

@github-actions github-actions bot added area/deps Relates to third-party dependencies (area) area/apps > hash* Affects HASH (a `hash-*` app) area/infra Relates to version control, CI, CD or IaC (area) area/apps labels Dec 5, 2025