FastAPI workbench for text embedding (Gemma-300m with Matryoshka) and summarization (Gemma/Gemini). Features hardware acceleration, caching, and secure endpoints for local LLM integration.
secure-access production-ai code-summarization code-intelligence fastapi text-embeddings local-ai gemma3 rate-limiting-caching hybrid-reasoning multimodel-llm matryoshka-embeddings persona-engineering
Updated Oct 17, 2025 - Python
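
The description mentions Matryoshka embeddings. As a rough illustration of the general idea (not the repository's actual code, which is not shown here), a Matryoshka-style vector is usually shortened by keeping a leading prefix of its components and rescaling to unit length, so that dot products still behave like cosine similarity. The function name `truncate_embedding` and the sample vector below are hypothetical.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components of `vec` and rescale to unit L2 norm.

    Hypothetical helper for illustration only; it assumes the embedding
    model was trained so that prefixes of the vector remain meaningful.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


# Made-up 3-dimensional vector standing in for a real model output.
full = [3.0, 4.0, 12.0]
short = truncate_embedding(full, 2)
```

Shorter prefixes trade some retrieval quality for lower storage and faster similarity search; which sizes work well depends on the model.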