HTTP Client configuration for models and vector stores #512

### Enhancement Description

Each model integration is composed of two parts: an `*Api` class that calls the model provider over HTTP, and a `*Client` class that encapsulates the LLM-specific aspects.

Each `*Client` class is highly customizable through well-designed interfaces, making it possible to override many different options. It would be nice to provide similar flexibility for each `*Api` class as well. In particular, it would be useful to be able to configure options related to the HTTP client.

Examples of aspects that would need to be configured:

* enable logging of requests/responses, very useful for general troubleshooting but also for refining prompts during development and testing;
* define connection and read timeout settings;
* configure an `SslBundle` to connect with on-prem model providers using custom CA certificates;
* configure connections through a corporate proxy, very common in production deployments.
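To make the list above concrete, here is a minimal sketch of what such settings look like on the JDK's built-in `java.net.http.HttpClient`. It is only an illustration of the kind of knobs involved, not how the `*Api` classes are implemented; the class name `HttpClientSettingsSketch` and its method names are hypothetical.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;
import javax.net.ssl.SSLContext;

// Hypothetical helper, for illustration only.
final class HttpClientSettingsSketch {

    // Builds a client with a connect timeout plus an optional proxy and
    // an optional SSLContext (e.g. one trusting a custom CA).
    static HttpClient buildClient(Duration connectTimeout,
                                  InetSocketAddress proxy,
                                  SSLContext sslContext) {
        HttpClient.Builder builder = HttpClient.newBuilder()
                .connectTimeout(connectTimeout);
        if (proxy != null) {
            builder.proxy(ProxySelector.of(proxy));
        }
        if (sslContext != null) {
            builder.sslContext(sslContext);
        }
        return builder.build();
    }

    // The JDK client has no separate read timeout; the per-request
    // timeout (time allowed for the response to arrive) is the nearest equivalent.
    static HttpRequest buildRequest(URI uri, Duration responseTimeout) {
        return HttpRequest.newBuilder(uri)
                .timeout(responseTimeout)
                .GET()
                .build();
    }
}
```

Each of these settings is currently something a developer would have to wire by hand for every client, which is the inconvenience this issue is about.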

Furthermore, there might be additional needs for configuring resilience patterns:

* configure retry strategy in case of failures;
* define a fallback logic in case of failures.
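As a rough sketch of the two resilience patterns just listed, the helper below retries a call a fixed number of times and then falls back to an alternative. `RetrySketch` and `callWithRetry` are hypothetical names, and a real strategy would likely add a backoff delay and only retry on selected failures.

```java
import java.util.function.Supplier;

// Hypothetical helper, for illustration only.
final class RetrySketch {

    // Invokes 'call' up to maxAttempts times; if every attempt throws,
    // the result of 'fallback' is returned instead.
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, Supplier<T> fallback) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException ex) {
                // Swallow the failure and try again; after the last
                // attempt the loop ends and the fallback is used.
            }
        }
        return fallback.get();
    }
}
```

A model call would be passed as the first supplier, and the fallback could return a cached or default response.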

Other settings that are currently part of the model connection configuration (and that still relate to the HTTP interaction) would also need to be customizable in enterprise production use cases, such as multi-user or multi-tenant applications. For example, when using OpenAI, the following might need to change per request or per session:

* API Key
* Organization
* User

All the above is focused on the HTTP interactions with model providers, but the same would be useful for vector stores.

### Possible Solutions

Drawing from the abstractions already designed to customize the model integrations, which ultimately implement the `ModelOptions` interface, one idea is to define a dedicated abstraction for passing HTTP client customizations to an `*Api` class (something like `HttpClientConfig`). It might also be exposed via configuration properties (under `spring.ai.<model>.client.*`).

For the more specific resilience configurations (like retries and fallbacks), an annotation-driven approach might be more suitable. Resilience4j might provide a way to achieve this, since I don't think Spring supports the MicroProfile Fault Tolerance spec.

A partial alternative would be for developers to define a custom `RestClient.Builder` or `WebClient.Builder` and pass it to each `*Api` class, but that would result in a lot of extra configuration and reduce the convenience of the autoconfiguration. It would also tie a generic setting like "enable logs" or "use a custom CA" to the specific client used, resulting in duplication when both blocking and streaming interactions are used in the same application.
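To give a feel for the idea, here is one possible shape for such an abstraction. Everything in it is an assumption made for illustration: the fields, the default values, and the `mergedWith` method (loosely modelled on how default and per-call options can be combined) are not an existing or agreed API.

```java
import java.time.Duration;

// Hypothetical abstraction; the name comes from the proposal above,
// while the fields and defaults are illustrative only.
record HttpClientConfig(Duration connectTimeout, Duration readTimeout, Boolean logRequests) {

    // Arbitrary defaults chosen for this sketch.
    static HttpClientConfig defaults() {
        return new HttpClientConfig(Duration.ofSeconds(10), Duration.ofSeconds(60), false);
    }

    // Returns a copy in which every non-null field of 'overrides' wins
    // over the corresponding field of this instance.
    HttpClientConfig mergedWith(HttpClientConfig overrides) {
        return new HttpClientConfig(
                overrides.connectTimeout() != null ? overrides.connectTimeout() : connectTimeout,
                overrides.readTimeout() != null ? overrides.readTimeout() : readTimeout,
                overrides.logRequests() != null ? overrides.logRequests() : logRequests);
    }
}
```

With this shape, an `*Api` class could start from `HttpClientConfig.defaults()` and apply whatever the developer or the configuration properties supply, independently of the underlying HTTP client.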

I'm available to contribute and help solve this issue.

### Related Issues

* https://github.com/spring-projects/spring-ai/issues/123
* https://github.com/spring-projects/spring-ai/issues/354
* https://github.com/spring-projects/spring-ai/issues/441
* https://github.com/spring-projects/spring-ai/issues/477
