Enhancement Description
Each model integration is composed of two parts: an `*Api` class that calls the model provider over HTTP, and a `*Client` class that encapsulates the LLM-specific aspects.
Each `*Client` class is highly customizable through nice interfaces, which makes it possible to override many different options. It would be nice to provide similar flexibility for each `*Api` class as well. In particular, it would be useful to be able to configure options related to the HTTP client.
Examples of aspects that would need to be configured:
- enable logging of requests/responses, very useful for general troubleshooting but also for refining prompts during development and testing;
- define connection and read timeout settings;
- configure an `SslBundle` to connect with on-prem model providers using custom CA certificates;
- configure connections through a corporate proxy, very common in production deployments.
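To make the timeout and proxy items concrete, here is a minimal sketch using only the JDK `java.net.http.HttpClient`, not any existing Spring AI API. The class name `HttpSettingsSketch`, its method names, and the proxy host used in the usage note are made up for illustration.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper, for illustration only.
final class HttpSettingsSketch {

    // Client-level settings: connection timeout and an optional corporate proxy.
    // A custom CA would be wired in at the same place via builder.sslContext(...).
    static HttpClient buildClient(Duration connectTimeout, String proxyHost, int proxyPort) {
        HttpClient.Builder builder = HttpClient.newBuilder().connectTimeout(connectTimeout);
        if (proxyHost != null) {
            builder.proxy(ProxySelector.of(InetSocketAddress.createUnresolved(proxyHost, proxyPort)));
        }
        return builder.build();
    }

    // The JDK client has no client-wide read timeout; the closest setting is
    // the per-request timeout, which bounds the wait for the response.
    static HttpRequest buildRequest(URI uri, Duration responseTimeout) {
        return HttpRequest.newBuilder(uri).timeout(responseTimeout).GET().build();
    }
}
```

For example, `HttpSettingsSketch.buildClient(Duration.ofSeconds(5), "proxy.example.internal", 8080)` yields a client that routes through the given proxy. The point of the request is that an `*Api` class could accept settings like these without the application having to build the client itself.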
Furthermore, there might be additional needs for configuring resilience patterns:
- configure retry strategy in case of failures;
- define a fallback logic in case of failures.
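As a rough illustration of what "retry, then fall back" means here, the following is a plain-Java sketch. It is not Resilience4j or Spring code, and `ResilienceSketch` and `callWithRetry` are hypothetical names. A real strategy would also add backoff and only retry on transient errors.

```java
import java.util.function.Supplier;

// Hypothetical helper, for illustration only.
final class ResilienceSketch {

    // Calls the primary supplier up to maxAttempts times. If every attempt
    // throws a RuntimeException, the fallback supplier provides the result.
    static <T> T callWithRetry(Supplier<T> primary, int maxAttempts, Supplier<T> fallback) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return primary.get();
            } catch (RuntimeException ex) {
                // Swallowed on purpose in this sketch; a real implementation
                // would inspect the exception and wait before the next attempt.
            }
        }
        return fallback.get();
    }
}
```

A model call could then be wrapped as `callWithRetry(() -> api.call(request), 3, () -> cachedAnswer)`, where `api`, `request`, and `cachedAnswer` are placeholders for whatever the application uses.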
Other settings that are currently part of the model connection configuration (and that still relate to the HTTP interaction) would also need to be customizable in enterprise production use cases (e.g. multi-user or even multi-tenant applications). For example, when using OpenAI, the following might need to change per request or session:
- API Key
- Organization
- User
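To show where these three values end up in the HTTP interaction, here is a sketch built on the JDK `HttpRequest` API. It assumes the OpenAI conventions as I understand them: the API key goes in the `Authorization` header, the organization in the `OpenAI-Organization` header, and the user in the `user` field of the JSON body. The `Credentials` record and `chatRequest` method are hypothetical, and the JSON is assembled naively without escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, for illustration only.
final class PerRequestCredentialsSketch {

    // Values that may differ per request, session, or tenant.
    record Credentials(String apiKey, String organization, String user) {}

    static HttpRequest chatRequest(URI endpoint, Credentials credentials, String model, String prompt) {
        // No JSON escaping here; a real client would use a JSON library.
        String body = String.format(
                "{\"model\": \"%s\", \"user\": \"%s\", \"messages\": [{\"role\": \"user\", \"content\": \"%s\"}]}",
                model, credentials.user(), prompt);
        return HttpRequest.newBuilder(endpoint)
                .header("Authorization", "Bearer " + credentials.apiKey())
                .header("OpenAI-Organization", credentials.organization())
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

Because all three values are resolved when the request is built rather than when the client is created, a multi-tenant application could pass a different `Credentials` instance for each call.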
All the above is focused on the HTTP interactions with model providers, but the same would be useful for vector stores.
Possible Solutions
Drawing from the nice abstractions designed to customize the model integrations and ultimately implementing the `ModelOptions` interface, it could be an idea to define a dedicated abstraction to pass HTTP client customizations to an `*Api` class (something like `HttpClientConfig`), which might also be exposed via configuration properties (under `spring.ai.<model>.client.*`).
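As a rough idea of what such an abstraction could look like, here is a client-agnostic sketch in plain Java. Everything in it is an assumption: `HttpClientConfig` does not exist in Spring AI today, and the property keys (`connect-timeout`, `read-timeout`, `log-requests`) and default values are made up. In a real implementation the binding would presumably be done by Spring Boot configuration properties rather than by hand.

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical sketch, for illustration only.
final class HttpClientConfigSketch {

    // Client-agnostic settings holder; not an existing Spring AI type.
    record HttpClientConfig(Duration connectTimeout, Duration readTimeout, boolean logRequests) {

        // Reads keys under a prefix such as "spring.ai.openai.client",
        // falling back to (made-up) defaults when a key is missing.
        // Durations use the ISO-8601 format accepted by Duration.parse.
        static HttpClientConfig fromProperties(Map<String, String> properties, String prefix) {
            Duration connect = Duration.parse(properties.getOrDefault(prefix + ".connect-timeout", "PT10S"));
            Duration read = Duration.parse(properties.getOrDefault(prefix + ".read-timeout", "PT60S"));
            boolean log = Boolean.parseBoolean(properties.getOrDefault(prefix + ".log-requests", "false"));
            return new HttpClientConfig(connect, read, log);
        }
    }
}
```

Keeping the record free of any reference to a specific HTTP client is deliberate: the same values could then be applied to both the blocking and the streaming client.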
For the more specific resilience configurations (like retries and fallbacks), an annotation-driven approach might be more suitable. Resilience4j might provide a way to achieve this, since I don't think Spring supports the MicroProfile Fault Tolerance spec.
A partial alternative would be for developers to define a custom `RestClient.Builder` or `WebClient.Builder` and pass that to each `*Api` class, but that would require a lot of extra configuration and reduce the convenience of the autoconfiguration. It would also tie a generic setting like "enable logs" or "use a custom CA" to the specific client used, resulting in duplication when both blocking and streaming interactions are used in the same application.
I'm available to contribute and help solve this issue.