Summary
The proxy fails with a 500 error when generating embeddings via gemini_cli (e.g., gemini-embedding-001). Requests are incorrectly routed to the chat completion endpoint instead of an embedding endpoint.
Technical Details
- Routing Bug: In rotator_library/client.py, the _execute_with_retry method hardcodes provider_plugin.acompletion for custom providers, ignoring whether the original call was for embeddings.
- Missing Implementation: GeminiCliProvider in gemini_cli_provider.py does not implement aembedding.
- Endpoint Mismatch: Requests are sent to :streamGenerateContent, which returns 404/400 for embedding models, resulting in a 500 error for the client.
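
To make the routing problem concrete, here is a minimal sketch, not the actual client.py code. StubProvider, execute_hardcoded, execute_dispatched, and the module-level acompletion/aembedding stand-ins are all hypothetical names, and it assumes the original call is passed around as a callable whose name matches the provider method.

```python
import asyncio
from typing import Any, Callable


async def acompletion(**kwargs: Any) -> str:
    """Hypothetical stand-in for the library-level chat call."""
    return "unused"


async def aembedding(**kwargs: Any) -> str:
    """Hypothetical stand-in for the library-level embedding call."""
    return "unused"


class StubProvider:
    """Hypothetical custom provider that implements both methods."""

    async def acompletion(self, **kwargs: Any) -> str:
        return "chat"

    async def aembedding(self, **kwargs: Any) -> str:
        return "embedding"


async def execute_hardcoded(provider: Any, api_call: Callable, **kwargs: Any) -> str:
    # Mirrors the reported behaviour: api_call is ignored and every
    # request is delegated to the chat completion method.
    return await provider.acompletion(**kwargs)


async def execute_dispatched(provider: Any, api_call: Callable, **kwargs: Any) -> str:
    # Assumption: the provider method shares its name with the original call.
    method = getattr(provider, api_call.__name__, None)
    if method is None:
        # Fail loudly instead of silently falling back to chat completion.
        raise NotImplementedError(
            f"{type(provider).__name__} does not implement {api_call.__name__}"
        )
    return await method(**kwargs)


if __name__ == "__main__":
    provider = StubProvider()
    print(asyncio.run(execute_hardcoded(provider, aembedding, input="test")))
    print(asyncio.run(execute_dispatched(provider, aembedding, input="test")))
```

With the hardcoded variant, an embedding request ends up in the chat path. The dispatched variant either reaches aembedding or raises an explicit error when the provider lacks it.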
Steps to Reproduce
curl http://localhost:8000/v1/embeddings \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"input": "test", "model": "gemini_cli/gemini-embedding-001"}'
Suggested Fix
- Update client.py to check the api_call type before delegation.
- Implement aembedding in GeminiCliProvider using the Google :embedContent endpoint.
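
As a starting point for the second item, the sketch below builds an :embedContent request and maps the result to an OpenAI-style embeddings response. It assumes the request and response shapes of the public Gemini API (content.parts[].text in, embedding.values out). The base URL is also an assumption, since the host that gemini_cli actually talks to may differ. build_embed_request and to_openai_embedding are hypothetical helper names, and no network call is made.

```python
from typing import Any, Dict, Tuple

# Assumption: public Gemini API host; gemini_cli may use a different one.
ASSUMED_BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_embed_request(
    model: str, text: str, base_url: str = ASSUMED_BASE_URL
) -> Tuple[str, Dict[str, Any]]:
    """Return the URL and JSON payload for a single :embedContent call."""
    # Drop the proxy's provider prefix, e.g. "gemini_cli/gemini-embedding-001".
    bare_model = model.split("/", 1)[-1]
    url = f"{base_url}/models/{bare_model}:embedContent"
    payload = {
        "model": f"models/{bare_model}",
        "content": {"parts": [{"text": text}]},
    }
    return url, payload


def to_openai_embedding(model: str, google_response: Dict[str, Any]) -> Dict[str, Any]:
    """Convert an embedContent response to the OpenAI embeddings shape."""
    values = google_response["embedding"]["values"]
    return {
        "object": "list",
        "model": model,
        "data": [{"object": "embedding", "index": 0, "embedding": values}],
    }


if __name__ == "__main__":
    url, payload = build_embed_request("gemini_cli/gemini-embedding-001", "test")
    print(url)
    print(payload)
```

Token usage is left out of the converted response because the sketch does not assume how usage would be reported. A real aembedding would also need to handle list inputs and authentication.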