Add AI-powered search features #827

Merged Jun 5, 2025 (45 commits)

Conversation

@Strift (Contributor) commented Mar 12, 2025

Pull Request

Related issue

Fixes #817

What does this PR do?

Update settings to handle embedders

Docs: https://www.meilisearch.com/docs/reference/api/settings#embedders

Update the methods getEmbedders, updateEmbedders, resetEmbedders. Also, the method updateSettings should be able to accept the new embedders parameter.

Here is the list of fields in the embedders object:

  • source sub field is available and accepts: ollama, rest, openAi, huggingFace and userProvided
  • apiKey sub field is available (string) - optional because not compatible with all sources. Only for openAi, ollama, rest.
  • model sub field is available (string) - optional because not compatible with all sources. Only for ollama, openAi, huggingFace
  • documentTemplate sub field is available (string) - optional
  • dimensions - optional because not compatible with all sources. Only for openAi, huggingFace, ollama, and rest
  • distribution - optional
  • request - mandatory only if using rest embedder
  • response - mandatory only if using rest embedder
  • documentTemplateMaxBytes - optional
  • revision - optional, only for huggingFace
  • headers - optional, only for rest
  • binaryQuantized - optional
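
As a rough illustration of the field list above, the sketch below assembles the JSON body for a single embedder entry and leaves out optional fields that were not set. The class `EmbedderPayloadSketch` and its helpers are hypothetical and are not the SDK's `Embedder` model; the field names come from the list above, and the model name is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of meilisearch-java.
final class EmbedderPayloadSketch {

    // Adds the entry only when the value is non-null, so unset optional fields are omitted.
    static void putIfPresent(Map<String, Object> target, String key, Object value) {
        if (value != null) {
            target.put(key, value);
        }
    }

    // Builds the settings for one embedder; any argument may be null when it does not apply.
    static Map<String, Object> embedder(String source, String apiKey, String model, Integer dimensions) {
        Map<String, Object> fields = new LinkedHashMap<>();
        putIfPresent(fields, "source", source);
        putIfPresent(fields, "apiKey", apiKey);
        putIfPresent(fields, "model", model);
        putIfPresent(fields, "dimensions", dimensions);
        return fields;
    }

    // Minimal JSON writer covering nested maps, strings, numbers and booleans.
    static String toJson(Object value) {
        if (value instanceof Map) {
            StringJoiner joiner = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                joiner.add("\"" + entry.getKey() + "\":" + toJson(entry.getValue()));
            }
            return joiner.toString();
        }
        if (value instanceof String) {
            String escaped = ((String) value).replace("\\", "\\\\").replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        // The embedders setting maps an embedder name to its configuration.
        Map<String, Object> embedders = new LinkedHashMap<>();
        embedders.put("default", embedder("huggingFace", null, "placeholder-model", null));
        System.out.println(toJson(embedders));
    }
}
```

Here `apiKey` and `dimensions` are passed as null and therefore do not appear in the output, which mirrors the "optional because not compatible with all sources" notes in the list.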

Update search to handle vector search and hybrid search

Docs: https://www.meilisearch.com/docs/reference/api/search

Update the search method:

  • hybrid search parameter, with sub fields semanticRatio and embedder. embedder is mandatory if hybrid is set.
  • vector parameter is available
  • retrieveVectors parameter available
  • semanticHitCount in search response
  • Accept _semanticScore in the search response (optional)
  • vector should be returned in the search response, but optional (because depends on search parameters)
  • _vectors present in the search response, but optional
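
To make the request-side parameters above concrete, here is a minimal sketch of how a search body with `hybrid`, `vector` and `retrieveVectors` might be assembled, including the rule that `embedder` is mandatory once `hybrid` is set. `HybridSearchBodySketch` is a hypothetical class and not the SDK's `SearchRequest` or `Hybrid` model; for brevity it assumes the query and embedder name contain no characters that need JSON escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch for illustration only; not part of meilisearch-java.
final class HybridSearchBodySketch {

    // Serializes the hybrid object; semanticRatio is optional, embedder is required.
    static String hybridJson(String embedder, Double semanticRatio) {
        if (embedder == null || embedder.isEmpty()) {
            throw new IllegalArgumentException("embedder is mandatory when hybrid is set");
        }
        StringBuilder json = new StringBuilder("{");
        if (semanticRatio != null) {
            json.append("\"semanticRatio\":").append(semanticRatio).append(",");
        }
        json.append("\"embedder\":\"").append(embedder).append("\"}");
        return json.toString();
    }

    // Builds the search body, skipping every parameter that is null.
    static String searchBody(String q, String hybrid, double[] vector, Boolean retrieveVectors) {
        List<String> parts = new ArrayList<>();
        if (q != null) parts.add("\"q\":\"" + q + "\"");
        if (hybrid != null) parts.add("\"hybrid\":" + hybrid);
        if (vector != null) parts.add("\"vector\":" + Arrays.toString(vector));
        if (retrieveVectors != null) parts.add("\"retrieveVectors\":" + retrieveVectors);
        return "{" + String.join(",", parts) + "}";
    }

    public static void main(String[] args) {
        String hybrid = hybridJson("default", 0.5);
        System.out.println(searchBody("kitchen utensils", hybrid, null, true));
    }
}
```

Rejecting a missing embedder at construction time is one possible design; the check could equally be left to the server.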

Add similar documents endpoint

Docs: https://www.meilisearch.com/docs/reference/api/similar

  • Implement searchSimilarDocuments associated with the POST /indexes/:uid/similar. Do NOT implement with GET.
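
For orientation, the sketch below builds (but does not send) a POST request to the similar documents route using the JDK's `java.net.http` API. `SimilarRequestSketch` is a hypothetical class, not the SDK's `searchSimilarDocuments` implementation; the `id` and `embedder` body fields are assumptions taken from the linked API reference, the host and index name are placeholders, and the authorization header is left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch for illustration only; not part of meilisearch-java.
final class SimilarRequestSketch {

    // Builds a POST /indexes/:uid/similar request; the GET variant is intentionally not used.
    static HttpRequest similarRequest(String host, String indexUid, String documentId, String embedder) {
        // Assumes the arguments contain no characters that need JSON or URL escaping.
        String body = "{\"id\":\"" + documentId + "\",\"embedder\":\"" + embedder + "\"}";
        return HttpRequest.newBuilder()
                .uri(URI.create(host + "/indexes/" + indexUid + "/similar"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = similarRequest("http://localhost:7700", "movies", "143", "default");
        System.out.println(request.method() + " " + request.uri());
    }
}
```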

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

Summary by CodeRabbit

  • New Features

    • Added support for managing embedders settings, including retrieving, updating, and resetting embedders on indexes.
    • Introduced semantic similarity search for documents.
    • Added hybrid search configuration and vector-based search with options to retrieve vector data in search results.
    • Introduced new models for embedder configuration, embedder distribution, and hybrid search.
  • Bug Fixes

    • Improved JSON serialization to omit optional fields with null values in search requests and similar document requests.
  • Tests

    • Added comprehensive tests for embedders settings, semantic similarity search, hybrid search, vector search, and serialization of new features.
    • Enhanced tests for task queries with pagination parameters.
  • Chores

    • Enhanced test workflow logging for better visibility during CI runs.

@Strift marked this pull request as draft on March 12, 2025 06:48
@Strift force-pushed the feat/add-ai-powered-search branch 2 times, most recently from cb480aa to c8647ec, on March 18, 2025 02:07
@Strift force-pushed the feat/add-ai-powered-search branch from 5467c2f to 4306308 on April 8, 2025 08:27
@Strift requested a review from Copilot on May 15, 2025 05:57

@Copilot (Copilot AI, Contributor) left a comment

Pull Request Overview

This PR adds AI-powered search features by incorporating new embedders settings management, hybrid search, and vector search capabilities.

  • Added endpoints and models for embedders settings (get, update, and reset).
  • Enhanced search functionality with hybrid search parameters, vector search, and similar document search.
  • Updated tests and workflows to cover the new functionality.

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/test/java/com/meilisearch/sdk/SimilarDocumentRequestTest.java | Added tests to verify JSON serialization of SimilarDocumentRequest. |
| src/test/java/com/meilisearch/sdk/SearchRequestTest.java | Introduced tests for hybrid search and vector retrieval parameters. |
| src/test/java/com/meilisearch/integration/TasksTest.java | Updated task tests to assert correct number of returned results. |
| src/test/java/com/meilisearch/integration/SettingsTest.java | Added tests for getting, updating, and resetting embedders settings. |
| src/test/java/com/meilisearch/integration/SearchTest.java | Added tests for vector search functionality and retrieveVectors behavior. |
| src/main/java/com/meilisearch/sdk/model/Settings.java | Updated the embedders field type to align with new Embedder model. |
| src/main/java/com/meilisearch/sdk/model/SearchResult.java | Added _vectors field to support vector search responses. |
| src/main/java/com/meilisearch/sdk/model/Hybrid.java | Introduced a new Hybrid model for hybrid search configuration. |
| src/main/java/com/meilisearch/sdk/model/Embedder*.java | New Embedder, EmbedderDistribution models replacing the removed Embedders model. |
| src/main/java/com/meilisearch/sdk/SimilarDocumentRequest.java | Updated JSON serialization to use putOpt for optional fields. |
| src/main/java/com/meilisearch/sdk/SettingsHandler.java | Added methods to get, update, and reset embedders settings. |
| src/main/java/com/meilisearch/sdk/SearchRequest.java | Extended JSON serialization with hybrid, vector, and retrieveVectors fields. |
| src/main/java/com/meilisearch/sdk/IndexSearchRequest.java | Added support for the retrieveVectors parameter in search requests. |
| src/main/java/com/meilisearch/sdk/Index.java | Added similar document search and embedders settings API support. |
| .github/workflows/tests.yml | Updated integration test command to include more verbose logging. |

@Strift requested a review from brunoocasali on May 15, 2025 06:00

@Copilot (Copilot AI, Contributor) left a comment

Pull Request Overview

This pull request adds AI-powered search features by introducing embedders settings and hybrid/vector search capabilities, along with a new similar documents endpoint. The key changes include:

  • Updating index settings to handle embedders with get, update, and reset methods.
  • Extending search functionality to support hybrid and vector searches with optional vector retrieval.
  • Adding comprehensive tests and necessary model adjustments to support the new features.

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/test/java/com/meilisearch/sdk/SimilarDocumentRequestTest.java | New tests for SimilarDocumentRequest JSON serialization. |
| src/test/java/com/meilisearch/sdk/SearchRequestTest.java | New tests verifying hybrid search and retrieveVectors functionality. |
| src/test/java/com/meilisearch/integration/TasksTest.java | Updated test assertions for tasks query limit. |
| src/test/java/com/meilisearch/integration/SettingsTest.java | New tests for embedders settings get, update, and reset. |
| src/test/java/com/meilisearch/integration/SearchTest.java | Added tests for vector search and retrieveVectors parameter. |
| src/main/java/com/meilisearch/sdk/model/Settings.java | Updated embedders field type from Embedders to Embedder. |
| src/main/java/com/meilisearch/sdk/model/SearchResult.java | Added optional _vectors field in search responses. |
| src/main/java/com/meilisearch/sdk/model/Hybrid.java | Introduced Hybrid model for configuring hybrid search. |
| src/main/java/com/meilisearch/sdk/model/EmbedderDistribution.java | New model for embedder distribution configuration. |
| src/main/java/com/meilisearch/sdk/model/Embedder.java | New Embedder model replacing the removed Embedders class. |
| src/main/java/com/meilisearch/sdk/SimilarDocumentRequest.java | Updated JSON building logic to omit null fields. |
| src/main/java/com/meilisearch/sdk/SettingsHandler.java | Added methods to manage embedders settings through HTTP requests. |
| src/main/java/com/meilisearch/sdk/SearchRequest.java | Added hybrid search configuration along with vector and retrieveVectors options. |
| src/main/java/com/meilisearch/sdk/IndexSearchRequest.java | Extended search request to include retrieveVectors parameter. |
| src/main/java/com/meilisearch/sdk/Index.java | New methods for similar document search and embedders settings management. |
| .github/workflows/tests.yml | Minor update to test workflow command with additional logging info. |

Comments suppressed due to low confidence (1)

src/main/java/com/meilisearch/sdk/SimilarDocumentRequest.java:24

  • Consider updating the constructor comment to use the correct class name 'SimilarDocumentRequest' for consistency.
/** Constructor for SimilarDocumentsRequest for building search request for similar documents */

brunoocasali previously approved these changes May 21, 2025
@Strift self-assigned this on May 28, 2025
@brunoocasali (Member) commented

bors merge

meili-bors bot added a commit that referenced this pull request Jun 2, 2025
827: Add AI-powered search features  r=brunoocasali a=Strift

# Pull Request

## Related issue
Fixes #817 

## What does this PR do?

### Update settings to handle embedders

Docs: https://www.meilisearch.com/docs/reference/api/settings#embedders

Update the methods `getEmbedders`, `updateEmbedders`, `resetEmbedders`. Also, the method `updateSettings` should be able to accept the new `embedders` parameter. 

Here is the list of fields in the `embedders` object:
  - [x] `source` sub field is available and accepts: `ollama`, `rest`, `openAi`, `huggingFace` and `userProvided`
  - [x] `apiKey` sub field is available (string) - optional because not compatible with all sources. Only for `openAi`, `ollama`, `rest`.
  - [x] `model` sub field is available (string) - optional because not compatible with all sources. Only for `ollama`, `openAi`, `huggingFace`
  - [x] `documentTemplate` sub field is available (string) - optional
  - [x] `dimensions` - optional because not compatible with all sources. Only for `openAi`, `huggingFace`, `ollama`, and `rest`
  - [x] `distribution` - optional
  - [x] `request` - mandatory only if using `rest` embedder
  - [x] `response`  - mandatory only if using `rest` embedder
  - [x] `documentTemplateMaxBytes` - optional
  - [x] `revision` - optional, only for `huggingFace`
  - [x] `headers` - optional, only for `rest`
  - [x] `binaryQuantized` - optional

### Update search to handle vector search and hybrid search

Docs: https://www.meilisearch.com/docs/reference/api/search

Update the `search` method:
  - [x] `hybrid` search parameter, with sub fields `semanticRatio` and `embedder`. `embedder` is mandatory if `hybrid` is set.
  - [x] `vector` parameter is available
  - [x] `retrieveVectors` parameter available
  - [ ] ~~`semanticHitCount` in search response~~
  - [ ] ~~Accept `_semanticScore` in the search response (optional)~~
  - [ ] ~~`vector` should be returned in the search response, but optional (because depends on search parameters)~~
  - [x] `_vectors` present in the search response, but optional

### Add similar documents endpoint

Docs: https://www.meilisearch.com/docs/reference/api/similar

- [x] Implement `searchSimilarDocuments` associated with the `POST /indexes/:uid/similar`. Do NOT implement with `GET`.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
  - Added support for managing embedders settings, including retrieving, updating, and resetting embedders on indexes.
  - Introduced semantic similarity search for documents.
  - Added hybrid search configuration and vector-based search with options to retrieve vector data in search results.
  - Introduced new models for embedder configuration, embedder distribution, and hybrid search.

- **Bug Fixes**
  - Improved JSON serialization to omit optional fields with null values in search requests and similar document requests.

- **Tests**
  - Added comprehensive tests for embedders settings, semantic similarity search, hybrid search, vector search, and serialization of new features.

- **Chores**
  - Enhanced test workflow logging for better visibility during CI runs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Strift <lau.cazanove@gmail.com>
Co-authored-by: Laurent Cazanove <lau.cazanove@gmail.com>

meili-bors bot (Contributor) commented Jun 2, 2025

Build failed:

  • integration-and-unit-tests

@Strift requested a review from brunoocasali on June 5, 2025 05:36
@brunoocasali merged commit a13f631 into main on Jun 5, 2025
4 checks passed
@brunoocasali deleted the feat/add-ai-powered-search branch on June 5, 2025 18:19
Labels: enhancement (New feature or request)
Linked issue: [v1.13] Stabilize AI-powered search
2 participants