[Bug] Embedding requests for gemini_cli incorrectly routed to acompletion #102

@kevincojean

Summary

The proxy fails with a 500 error when generating embeddings via gemini_cli (e.g., gemini-embedding-001). Requests are incorrectly routed to the chat completion endpoint instead of an embedding endpoint.

Technical Details

  1. Routing Bug: In rotator_library/client.py, the _execute_with_retry method hardcodes provider_plugin.acompletion for custom providers, ignoring whether the original call was for embeddings.
  2. Missing Implementation: GeminiCliProvider in gemini_cli_provider.py does not implement aembedding.
  3. Endpoint Mismatch: Requests are sent to :streamGenerateContent, which returns 404/400 for embedding models, resulting in a 500 error for the client.
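The routing problem in item 1 can be illustrated with a minimal sketch. It is not the actual `_execute_with_retry` code: `call_provider`, `FakeProvider`, and the `api_call_name` parameter are hypothetical names, and the sketch assumes the client knows which coroutine (`acompletion` or `aembedding`) the caller originally requested.

```python
import asyncio


class FakeProvider:
    """Hypothetical stand-in for a custom provider plugin."""

    async def acompletion(self, **kwargs):
        return {"kind": "completion", "model": kwargs["model"]}

    async def aembedding(self, **kwargs):
        return {"kind": "embedding", "model": kwargs["model"]}


async def call_provider(provider_plugin, api_call_name, **kwargs):
    """Delegate to the provider method matching the original call type.

    `api_call_name` is assumed to be "acompletion" or "aembedding".
    """
    handler = getattr(provider_plugin, api_call_name, None)
    if handler is None:
        # Fail loudly instead of silently falling back to acompletion.
        raise NotImplementedError(
            f"{type(provider_plugin).__name__} does not implement {api_call_name}"
        )
    return await handler(**kwargs)


result = asyncio.run(
    call_provider(
        FakeProvider(),
        "aembedding",
        model="gemini_cli/gemini-embedding-001",
        input="test",
    )
)
print(result["kind"])
```

Raising `NotImplementedError` when the provider lacks the method would let the proxy return a clear error to the client rather than a 500 caused by a failed chat completion call.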

Steps to Reproduce

curl http://localhost:8000/v1/embeddings \
  -H "Authorization: Bearer <token>" \
  -d '{"input": "test", "model": "gemini_cli/gemini-embedding-001"}'

Suggested Fix

- Update client.py to check the api_call type before delegation.
- Implement aembedding in GeminiCliProvider using the Google :embedContent endpoint.
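As a rough sketch of the second suggestion, the helpers below build an `:embedContent` request and convert the response into an OpenAI-style embeddings payload. The request and response shapes follow the public Gemini API (`content.parts[].text` in, `embedding.values` out); whether the backend used by gemini_cli accepts the same path and body is an assumption, and `build_embed_request` and `to_openai_embedding` are hypothetical names. No network call is made here, and the `usage` field is omitted.

```python
def build_embed_request(model: str, text: str):
    """Return (relative path, JSON body) for an :embedContent call.

    Assumption: the provider prefix ("gemini_cli/") must be stripped
    before the model name is sent upstream.
    """
    bare_model = model.split("/", 1)[-1]
    path = f"models/{bare_model}:embedContent"
    body = {
        "model": f"models/{bare_model}",
        "content": {"parts": [{"text": text}]},
    }
    return path, body


def to_openai_embedding(model: str, response: dict) -> dict:
    """Map an embedContent response to an OpenAI-style embeddings response."""
    values = response["embedding"]["values"]
    return {
        "object": "list",
        "data": [{"object": "embedding", "index": 0, "embedding": values}],
        "model": model,
    }


path, body = build_embed_request("gemini_cli/gemini-embedding-001", "test")
# Simulated upstream response, for illustration only.
converted = to_openai_embedding(
    "gemini_cli/gemini-embedding-001",
    {"embedding": {"values": [0.1, 0.2, 0.3]}},
)
print(path)
```

An `aembedding` method on `GeminiCliProvider` could wrap these two steps around its existing authenticated HTTP client, reusing whatever credential handling `acompletion` already relies on.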
