Bug description
When using MultiQueryExpander, if the model's response contains empty newlines, the query expansion fails and returns the original query unchanged. The current implementation doesn't properly handle or filter out empty lines from the model's response.
Environment
- Spring AI version: 1.0.0
- Java version: 17
- Model/API used: deepseek-reasoner
- Spring Boot version: 3.4.6
Steps to reproduce
- Configure a MultiQueryExpander with numberOfQueries(3)
- Call expand() with any query
- Have the AI model return a response that includes empty newlines between valid queries
- Observe that the original query is returned instead of the expanded queries
Expected behavior
The expander should:
- Filter out empty lines from the model's response
- Return the valid expanded queries as long as there are enough non-empty variants
- Only fall back to the original query if there aren't enough valid expanded queries
Actual behavior
The expander fails when encountering empty newlines in the response, even when there are enough valid query variants present.
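The effect of blank lines on a plain split("\n") can be seen with a small JDK-only sketch. The class name and the sample response below are made up for illustration, and the exact size check inside MultiQueryExpander is not reproduced here; the point is only that three valid variants separated by blank lines produce five entries.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of Spring AI.
class BlankLineSplitDemo {

    // Splits on "\n" with no filtering, so interior blank lines are kept.
    static List<String> rawSplit(String response) {
        return Arrays.asList(response.split("\n"));
    }

    public static void main(String[] args) {
        // Made-up model output: three variants separated by blank lines.
        String response = "How do I start a Spring Boot app?\n\n"
                + "What command runs a Spring Boot app?\n\n"
                + "How to launch Spring Boot from the CLI?";
        List<String> parts = rawSplit(response);
        System.out.println(parts.size()); // prints 5
    }
}
```

With numberOfQueries(3), a count of five entries no longer matches the requested number even though three usable variants are present.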
Minimal Complete Reproducible example
MultiQueryExpander queryExpander = MultiQueryExpander.builder()
    .chatClientBuilder(this.chatClient.mutate())
    .includeOriginal(false)
    .numberOfQueries(3)
    .build();
return queryExpander.expand(new Query("How to run a Spring Boot app?"));
Proposed solution
The split("\n") operation should be followed by filtering out empty strings. Here's the suggested fix:
// Split the model response into lines and drop blank ones before counting
var queryVariants = Arrays.stream(response.split("\n"))
    .filter(StringUtils::hasText)
    .toList();
// Fall back only when fewer non-empty variants than requested were returned
if (CollectionUtils.isEmpty(queryVariants) || this.numberOfQueries > queryVariants.size()) {
    logger.warn(
        "Query expansion result does not contain the requested {} variants. Returning the input query unchanged.",
        this.numberOfQueries);
    return List.of(query);
}
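The same idea can be tried outside Spring with a stand-alone sketch. Everything here is an assumption for illustration: the class and method names are hypothetical, queries are plain strings rather than Query objects, and String::isBlank stands in for StringUtils::hasText so that only the JDK is needed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone helper; names are illustrative, not Spring AI API.
class QueryVariantFilter {

    // Returns the non-blank lines of the response, or only the original
    // query when fewer than numberOfQueries non-blank lines remain.
    static List<String> variantsOrOriginal(String response, String originalQuery, int numberOfQueries) {
        List<String> variants = Arrays.stream(response.split("\n"))
                .filter(line -> !line.isBlank()) // drop empty and whitespace-only lines
                .toList();
        if (variants.size() < numberOfQueries) {
            return List.of(originalQuery);
        }
        return variants;
    }

    public static void main(String[] args) {
        // Made-up response with blank lines between three variants.
        String response = "variant one\n\nvariant two\n\nvariant three";
        System.out.println(variantsOrOriginal(response, "original", 3).size()); // prints 3
    }
}
```

Like the suggested fix, this sketch returns every non-blank line when the model produces more than the requested number; whether extra variants should be truncated is left open here.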
Additional context
This is particularly important because:
- LLMs often include formatting newlines in their responses
- The current behavior causes valid expansions to be discarded unnecessarily
- The fix would make the expander more robust to normal model output variations