@ricky-chaoju
Contributor

Summary

  • Integrate Semantic Router for automatic model routing
  • Add MoM model exposure and usage tracking
  • Improve worker reconnection and offline/online handling
  • Sync deployment and app status on startup, shutdown, and reconnection
  • Add monitoring support for Semantic Router (dashboard/metrics)

- Add SEMANTIC_ROUTER app type for deploying vllm-sr container
- Create semantic router config generator service
- Add config hot-reload support when deployments change
- Add API Gateway support for model='MoM'/'auto' semantic routing
- Add /api/semantic-router endpoints for status and config management
- Add worker API for writing files to Docker volumes
- Override entrypoint to create symlink for config file
- Add entrypoint support in app deployment
- Fix VRAM display showing too many decimal places
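
The gateway behavior described above (requests with model='MoM' or 'auto' go to the semantic router, everything else goes to a named deployment) could be sketched roughly as follows. This is an illustration only, not the code from this PR: `choose_route`, `RouteDecision`, and the target labels are hypothetical names, and the case-insensitive alias match is an assumption.

```python
from dataclasses import dataclass

# Assumption: the aliases are matched case-insensitively. The PR only
# states that model='MoM'/'auto' triggers semantic routing.
SEMANTIC_ROUTER_ALIASES = {"mom", "auto"}


@dataclass(frozen=True)
class RouteDecision:
    target: str  # hypothetical labels: "semantic-router" or "deployment"
    model: str   # model name forwarded upstream


def choose_route(requested_model: str, deployed_models: set[str]) -> RouteDecision:
    """Pick an upstream for a request based on its `model` field."""
    name = requested_model.strip()
    if name.lower() in SEMANTIC_ROUTER_ALIASES:
        # Let the semantic router pick the concrete model.
        return RouteDecision("semantic-router", name)
    if name in deployed_models:
        # Explicit model names bypass the router.
        return RouteDecision("deployment", name)
    raise ValueError(f"unknown model: {requested_model!r}")
```

Keeping the decision in a pure function like this makes it easy to unit test separately from the HTTP layer.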
@ricky-chaoju merged commit e66dc45 into main on Jan 19, 2026
3 checks passed

2 participants