🧭 Epic: Skills Router (Cross-Repo)

## Overview

Implement an intelligent Skills Router system that efficiently matches user queries to relevant skills using a multi-stage approach combining fast heuristics with optional LLM reranking.

> [!NOTE]
> This is a **cross-repository Epic**. Core routing logic lives in Neona, cloud services live in Neona-Cloud.

## Problem Statement

Currently, Neona's skill system (Phase 4) has no efficient way to match user queries to relevant skills. A naive LLM-for-every-query approach would be slow and expensive. We need a hybrid approach that balances **speed**, **accuracy**, and **cost**.

## Goals

1. 🚀 **Fast heuristic-based matching** for most queries (no LLM needed)
2. 🧠 **LLM reranking only when needed** (ambiguous or low-confidence results)
3. 💾 **Intelligent caching** (L1/L2 local, L3 distributed)
4. 🛡️ **Guardrails** for safety, rate limiting, and policy enforcement
5. ⚡ **Priority-based retrieval** for efficient skill discovery
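The tiered caching in goal 3 could look roughly like the sketch below. It is illustrative only: `TieredCache` is a hypothetical name, and plain dictionaries stand in for whatever stores actually back L1, L2, and L3. The sketch assumes a read-through lookup that promotes a hit into the faster tiers, which the Epic does not specify.

```python
from typing import Dict, List, Optional


class TieredCache:
    """Read-through lookup over ordered tiers (hypothetical sketch)."""

    def __init__(self, tiers: List[Dict[str, list]]):
        # tiers[0] is the fastest tier (L1); the last is the slowest (L3).
        self.tiers = tiers

    def get(self, key: str) -> Optional[list]:
        for depth, tier in enumerate(self.tiers):
            if key in tier:
                value = tier[key]
                # Assumption: promote the hit into every faster tier.
                for faster in self.tiers[:depth]:
                    faster[key] = value
                return value
        return None  # miss in every tier

    def put(self, key: str, value: list) -> None:
        # Write-through: store the result in all tiers.
        for tier in self.tiers:
            tier[key] = value
```

A lookup that hits only in L3 would then populate L1 and L2, so a repeat of the same query is served locally.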

## Architecture

```
User Query
    │
    ▼
[Hybrid Pre-Router] → Exact match, keywords, capabilities, heuristics
    │
    ▼
[Priority Retrieval] → Cache check (L1→L2→L3), filtering, confidence check
    │
    ├─ High confidence ──→ [Return Results]
    │
    └─ Low confidence ──→ [LLM Reranking (Cloud)] → [Guardrails] → [Cache] → [Return Results]
```
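The flow in the diagram can be sketched as a heuristic pass followed by a confidence gate. Everything here is an assumption made for illustration: the `Skill` and `Match` types, the keyword-overlap score, the `0.6` threshold, and the `rerank` callback that stands in for the cloud LLM reranking service. Cache lookups and guardrails are left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Skill:
    name: str
    keywords: Set[str]  # assumed to be lowercase


@dataclass
class Match:
    skill: str
    score: float  # 0.0 to 1.0, higher is more confident


def heuristic_match(query: str, skills: List[Skill]) -> List[Match]:
    """Score skills by exact name match, else by keyword overlap."""
    q = query.strip().lower()
    tokens = set(q.split())
    matches = []
    for skill in skills:
        if q == skill.name.lower():
            score = 1.0
        elif skill.keywords:
            score = len(tokens & skill.keywords) / len(skill.keywords)
        else:
            score = 0.0
        if score > 0:
            matches.append(Match(skill.name, score))
    return sorted(matches, key=lambda m: m.score, reverse=True)


def route(
    query: str,
    skills: List[Skill],
    rerank: Optional[Callable[[str, List[Match]], List[Match]]] = None,
    threshold: float = 0.6,
) -> List[Match]:
    candidates = heuristic_match(query, skills)
    # High confidence: return heuristic results without calling the LLM.
    if candidates and candidates[0].score >= threshold:
        return candidates
    # Low confidence: hand off to the reranker if one is configured.
    if rerank is not None:
        return rerank(query, candidates)
    return candidates
```

With this shape, the reranker is only invoked when the top heuristic score falls below the threshold, which is what keeps the LLM rerank rate bounded.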

## Child Issues

### Neona (Core)
- [ ] #83 - Hybrid Pre-Router Implementation
- [ ] #84 - Priority Retrieval System
- [ ] #86 - Local Cache + Guardrails

### Neona-Cloud (Backend)
- [ ] Neona-AI/Neona-Cloud#17 - Epic: Skills Router Cloud Services
- [ ] Neona-AI/Neona-Cloud#18 - Cloud Skills API
- [ ] Neona-AI/Neona-Cloud#19 - LLM Reranking Service
- [ ] Neona-AI/Neona-Cloud#20 - Distributed Cache (L3)
- [ ] Neona-AI/Neona-Cloud#21 - Cloud Guardrails

## Performance Targets

| Metric | Target |
|--------|--------|
| Cache hit latency (L1) | < 5ms |
| Heuristic-only latency | 10-50ms |
| Heuristic + LLM rerank | 200-1000ms |
| Cache hit rate | > 70% |
| LLM rerank rate | < 30% of queries |
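One way to check the two rate targets is to keep simple counters per query. This is a minimal sketch, not part of the planned design: `RouterStats` and its fields are hypothetical names, and only the thresholds (> 70% cache hit rate, < 30% rerank rate) come from the table.

```python
from dataclasses import dataclass


@dataclass
class RouterStats:
    total_queries: int = 0
    cache_hits: int = 0
    llm_reranks: int = 0

    def record(self, cache_hit: bool, reranked: bool) -> None:
        """Count one routed query."""
        self.total_queries += 1
        self.cache_hits += int(cache_hit)
        self.llm_reranks += int(reranked)

    def cache_hit_rate(self) -> float:
        return self.cache_hits / self.total_queries if self.total_queries else 0.0

    def rerank_rate(self) -> float:
        return self.llm_reranks / self.total_queries if self.total_queries else 0.0

    def meets_rate_targets(self) -> bool:
        # Thresholds taken from the Performance Targets table.
        return self.cache_hit_rate() > 0.70 and self.rerank_rate() < 0.30
```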

## Dependencies

- Phase 4.1: Skill Definition System
- Phase 4.2: Skill Registry

## Timeline

**Target Completion:** 5 weeks

## Success Criteria

- [ ] Router matches skills using heuristics (no LLM)
- [ ] LLM reranking triggers only when needed (via Cloud)
- [ ] Multi-level caching reduces redundant computations
- [ ] Guardrails prevent inappropriate skill usage
- [ ] Performance meets the targets listed under Performance Targets
- [ ] Test coverage > 80%

