Releases: meilisearch/meilisearch
v1.12.1
v1.12.0 🦗
Meilisearch v1.12 introduces significant indexing speed improvements, almost halving the time required to index large datasets. This release also introduces new settings to customize and potentially further increase indexing speed.
🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 and 48 hours after a new version becomes available.
Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).
New features and updates 🔥
Improve indexing speed
Indexing time is improved across the board!
- Performance is maintained or better on smaller machines
- On bigger machines with multiple cores and good IO, Meilisearch v1.12 is much faster than Meilisearch v1.11:
  - More than twice as fast for raw document insertion tasks
  - More than 4x as fast for incrementally updating documents in a large database
  - Embeddings generation is also up to 1.5x faster for some workloads
The new indexer also makes task cancellation faster.
Done by @dureuill, @ManyTheFish, and @Kerollmops in #4900.
New index settings: use facetSearch and prefixSearch to improve indexing speed
v1.12 introduces two new index settings: facetSearch and prefixSearch.
Both settings allow you to skip parts of the indexing process. This leads to significant improvements to indexing speed, but may negatively impact search experience in some use cases.
Done by @ManyTheFish in #5091
facetSearch
Use this setting to toggle facet search:
curl \
-X PUT 'http://localhost:7700/indexes/books/settings/facet-search' \
-H 'Content-Type: application/json' \
--data-binary 'true'
The default value for facetSearch is true. When set to false, this setting disables facet search for all filterable attributes in an index.
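The curl call above can also be prepared from a script. The following is a minimal sketch using only the Python standard library; the helper name build_setting_request is hypothetical, and the base URL and index name are simply copied from the curl example. It builds the request without sending it, so the JSON-encoded body can be inspected first.

```python
import json
from urllib.request import Request


def build_setting_request(base_url: str, index_uid: str, setting_route: str, value) -> Request:
    """Prepare (but do not send) a PUT request for one index sub-setting.

    Hypothetical helper: the value is JSON-encoded, so False becomes the
    body `false` and a string such as "disabled" becomes `"disabled"`.
    """
    url = f"{base_url}/indexes/{index_uid}/settings/{setting_route}"
    body = json.dumps(value).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Disable facet search on the "books" index used in the curl example.
request = build_setting_request("http://localhost:7700", "books", "facet-search", False)
print(request.get_method(), request.full_url, request.data)
# To actually send it against a running instance: urllib.request.urlopen(request)
```

Encoding the value with json.dumps keeps the body valid JSON for both boolean and string settings.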
prefixSearch
Use this setting to configure the ability to search a word by prefix on an index:
curl \
-X PUT 'http://localhost:7700/indexes/books/settings/prefix-search' \
-H 'Content-Type: application/json' \
--data-binary '"disabled"'
prefixSearch accepts one of the following values:
- "indexingTime": enables prefix processing during indexing. This is the default Meilisearch behavior
- "disabled": deactivates prefix search completely
Disabling prefix search means the query he will no longer match the word hello. This may significantly impact search result relevancy, but speeds up the indexing process.
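To make the trade-off concrete, here is a toy illustration of the difference between the two values. It is not how Meilisearch matches words internally (it ignores typo tolerance, ranking, and everything else); the function word_matches is hypothetical and only models "whole word" versus "whole word or prefix".

```python
def word_matches(query: str, word: str, prefix_search: str = "indexingTime") -> bool:
    """Toy matcher: an exact match always counts; a prefix match counts
    only when prefix search is enabled ("indexingTime")."""
    if query == word:
        return True
    return prefix_search == "indexingTime" and word.startswith(query)


for mode in ("indexingTime", "disabled"):
    print(mode, word_matches("he", "hello", mode))
```

With "indexingTime" the query he matches hello; with "disabled" only the full word hello would.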
New API route: /batches
The new /batches endpoint allows you to query information about task batches.
GET /batches returns a list of batch objects:
curl -X GET 'http://localhost:7700/batches'
This endpoint accepts the same parameters as the GET /tasks route, allowing you to narrow down which batches you want to see. Parameters used with GET /batches apply to the tasks, not the batches themselves. For example, GET /batches?uids=0 returns batches containing tasks with a taskUid of 0, not batches with a batchUid of 0.
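A filtered batches query can be assembled as in the sketch below. The helper batches_url is hypothetical, and the filter names uids and statuses are assumptions taken from the filters available on GET /tasks; check the API reference for the exact list supported by your version.

```python
from urllib.parse import urlencode


def batches_url(base_url: str, **filters) -> str:
    """Build a GET /batches URL; list or tuple values become comma-separated."""
    params = {
        name: ",".join(str(v) for v in value) if isinstance(value, (list, tuple)) else str(value)
        for name, value in filters.items()
    }
    # Keep commas readable instead of percent-encoding them.
    query = urlencode(params, safe=",")
    return f"{base_url}/batches" + (f"?{query}" if query else "")


# Batches containing task 0 or task 1 (filters apply to tasks, not batches).
print(batches_url("http://localhost:7700", uids=[0, 1], statuses="succeeded"))
```

Remember that the filters select batches by the tasks they contain, so the uids values here are task uids.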
You may also query GET /batches/:uid to retrieve information about a single batch object:
curl -X GET 'http://localhost:7700/batches/BATCH_UID'
/batches/:uid does not accept any parameters.
Batch objects contain the following fields:
{
  "uid": 160,
  "progress": {
    "steps": [
      {
        "currentStep": "processing tasks",
        "finished": 0,
        "total": 2
      },
      {
        "currentStep": "indexing",
        "finished": 2,
        "total": 3
      },
      {
        "currentStep": "extracting words",
        "finished": 3,
        "total": 13
      },
      {
        "currentStep": "document",
        "finished": 12300,
        "total": 19546
      }
    ],
    "percentage": 37.986263
  },
  "details": {
    "receivedDocuments": 19547,
    "indexedDocuments": null
  },
  "stats": {
    "totalNbTasks": 1,
    "status": {
      "processing": 1
    },
    "types": {
      "documentAdditionOrUpdate": 1
    },
    "indexUids": {
      "mieli": 1
    }
  },
  "duration": null,
  "startedAt": "2024-12-12T09:44:34.124726733Z",
  "finishedAt": null
}
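The progress field of a batch object can be condensed into a one-line status. The sketch below assumes, based only on the sample above, that steps are listed from outermost to innermost and that progress may be absent or null once a batch is no longer processing; the function describe_progress is hypothetical.

```python
def describe_progress(batch: dict) -> str:
    """Summarize a batch object's progress as a single line.

    Assumptions: `steps` runs from outermost to innermost step, and
    `progress` may be missing or null when nothing is being processed.
    """
    progress = batch.get("progress")
    if not progress:
        return f"batch {batch['uid']}: no progress information"
    steps = progress["steps"]
    innermost = steps[-1]
    path = " > ".join(step["currentStep"] for step in steps)
    return (
        f"batch {batch['uid']}: {progress['percentage']:.1f}% "
        f"({path}: {innermost['finished']}/{innermost['total']})"
    )


# Trimmed-down version of the batch object shown above.
sample = {
    "uid": 160,
    "progress": {
        "steps": [
            {"currentStep": "indexing", "finished": 2, "total": 3},
            {"currentStep": "document", "finished": 12300, "total": 19546},
        ],
        "percentage": 37.986263,
    },
}
print(describe_progress(sample))
```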
Additionally, task objects now include a new field, batchUid. Use this field together with /batches/:uid to retrieve data on a specific batch.
{
  "uid": 154,
  "batchUid": 142,
  "indexUid": "movies_test2",
  "status": "succeeded",
  "type": "documentAdditionOrUpdate",
  "canceledBy": null,
  "details": {
    "receivedDocuments": 1,
    "indexedDocuments": 1
  },
  "error": null,
  "duration": "PT0.027766819S",
  "enqueuedAt": "2024-12-02T14:07:34.974430765Z",
  "startedAt": "2024-12-02T14:07:34.99021667Z",
  "finishedAt": "2024-12-02T14:07:35.017983489Z"
}
Done by @irevoire in #5060, #5070, #5080
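Going from a task to its batch is then a matter of reading batchUid and building the /batches/:uid URL. A minimal sketch follows; the helper batch_url_for_task is hypothetical, and it assumes batchUid can be missing or null for tasks that have not been picked up by a batch yet.

```python
from typing import Optional


def batch_url_for_task(base_url: str, task: dict) -> Optional[str]:
    """Return the /batches/:uid URL for a task, or None if it has no batch.

    Assumption: `batchUid` may be absent or null for tasks that are not
    yet part of a batch.
    """
    batch_uid = task.get("batchUid")
    if batch_uid is None:
        return None
    return f"{base_url}/batches/{batch_uid}"


# Fields taken from the task object shown above.
task = {"uid": 154, "batchUid": 142, "status": "succeeded"}
print(batch_url_for_task("http://localhost:7700", task))
```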
Other improvements
- New query parameter for GET /tasks: reverse. If reverse is set to true, tasks will be returned in reversed order, from oldest to newest. Done by @irevoire in #5048
- Phrase searches with showMatchesPosition set to true now give a single location for the whole phrase, by @flevi29 in #4928
- New Prometheus metrics by @PedroTurik in #5044
- When a query finds matching terms in document fields with array values, Meilisearch now adds an indices field to _matchesPosition specifying which array elements contain the matches, by @LukasKalbertodt in #5005
- ⚠️ Breaking vectorStore change: field distribution no longer contains _vectors. Its value used to be incorrect, and there is no current use case for the fixed, most likely empty, value. Done as part of #4900
- Improve error message by adding index name, by @airycanon in #5056
Fixes 🐞
- Return appropriate error when primary key is greater than 512 bytes, by @flevi29 in #4930
- Fix issue where numbers were segmented in different ways depending on tokenizer, by @dqkqd in meilisearch/charabia#311
- Fix pagination when embedding fails by @dureuill in #5063
- Fix issue causing Meilisearch to ignore stop words in some cases by @ManyTheFish in #5062
- Fix phrase search with attributesToSearchOn, by @ManyTheFish in #5062
Misc
- Dependencies updates
  - Update benchmarks to match the new crates subfolder by @Kerollmops in #5021
  - Fix the benchmarks by @irevoire in #5037
  - Bump Swatinem/rust-cache from 2.7.1 to 2.7.5 in #5030
  - Update charabia v0.9.2 by @ManyTheFish in #5098
  - Update mini-dashboard to v0.2.16 version by @curquiza in #5102
- CIs and tests
  - Improve performance of delete_index.rs by @DerTimonius in #4963
  - Improve performance of create_index.rs by @DerTimonius in #4962
  - Improve performance of get_documents.rs by @PedroTurik in #5025
  - Improve performance of formatted.rs by @PedroTurik in #5043
  - Fix the path used in the flaky tests CI by @Kerollmops in #5049
- Misc
  - Rollback the Meilisearch Kawaii logo by @Kerollmops in #5017
  - Add image source label to Dockerfile by @wuast94 in #4990
  - Hide code complexity into a subfolder by @Kerollmops in #5016
  - Internal tool: implement offline upgrade from v1.10 to v1.11 by @irevoire in #5034
  - Internal tool: implement offline upgrade from v1.11 to v1.12 by @ManyTheFish in #5146
  - Meilisearch is now able to retrieve Katakana words from a Hiragana query by @tats-u in meilisearch/charabia#312
  - Improve error handling when writing into LMDB by @Kerollmops in #5089
❤️ Thanks again to our external contributors:
v1.12.0-rc.6
Warning
Since this is a release candidate (RC), we do NOT recommend using it in a production environment. Is something not working as expected? We welcome bug reports and feedback about new features.
User facing changes
Internal changes
- Use bumparaw-collections in Meilisearch/milli by @Kerollmops in #5145
- Reintroduce the Document Addition Logs by @Kerollmops in #5150
- Do not duplicate NDJson data when unnecessary by @Kerollmops in #5148
- Offline upgrade v1.12 by @ManyTheFish in #5146
- Return docid in case of errors while rendering the document template by @dureuill in #5153
- Make xtasks be able to use the specified binary by @Kerollmops in #5152
- Indexer edition 2024 fix facet fst by @ManyTheFish in #5158
- Fix the New Indexer Spilling by @Kerollmops in #5159
Full Changelog: v1.12.0-rc.5...v1.12.0-rc.6
v1.12.0-rc.5 🦗
What's Changed
- Settings opt out error msg by @ManyTheFish in #5119
- Fix batch details by @irevoire in #5123
- Ignore documents whose selected fields didn't change by @dureuill in #5131
- Allow xtask bench to proceed without a commit message by @dureuill in #5138
- Attach index name in error message by @airycanon in #5056
- Use the right amount of max memory and not impact the settings by @Kerollmops in #5141
- Improve the merging of bitmaps in the merger by @ManyTheFish in #5142
New Contributors
- @airycanon made their first contribution in #5056
Full Changelog: v1.12.0-rc.4...v1.12.0-rc.5
v1.12.0-rc.4 🦗
What's Changed
- Update BBQueue repo to point to the Meilisearch org by @Kerollmops in #5111
- Change the reserve and grant function to accept a closure by @Kerollmops in #5118
- Increase margin on deletion of task by @irevoire in #5110
- Yield the BBQueue writing loop by @Kerollmops in #5122
- Add cross tasks by @ManyTheFish in #5120
- Make the tasks pulling timeout configurable by @Kerollmops in #5121
- Fix the Minimum BBQueue channel threshold by @Kerollmops in #5113
- Optimize Prefixes and Merges by @Kerollmops in #5124
- Change the default max memory usage to 5% of the total memory by @Kerollmops in #5125
Full Changelog: v1.12.0-rc.3...v1.12.0-rc.4
v1.12.0-rc.3 🦗
What's Changed
- Fix an issue where, when repeatedly polling the batches route, a processing batch could briefly appear as missing before showing up as finished. Batches now go directly from processing to finished. By @irevoire in #5107
- Fix autobatch by @dureuill in #5109
- Implement a bbqueue channel between the extractors and the writer by @Kerollmops in #5094
Full Changelog: v1.12.0-rc.2...v1.12.0-rc.3
v1.12.0-rc.2 🦗
What's Changed
- Fix index settings opt out by @ManyTheFish in #5101
- Update mini-dashboard to v0.2.16 version by @curquiza in #5102
Full Changelog: v1.12.0-rc.1...v1.12.0-rc.2
v1.12.0-rc.1 🦗
What's Changed
- Use the published crates versions by @Kerollmops in #5090
- Improve error handling when writing into LMDB by @Kerollmops in #5089
- Fix bugs for v1.12 by @ManyTheFish in #5062
- Precise spans for new indexer by @dureuill in #5092
- Settings opt out by @ManyTheFish in #5091
- Fix pagination when embedding fails by @dureuill in #5063
- Update charabia v0.9.2 by @ManyTheFish in #5098
- Span to measure the part of db writes that is after the merge/extraction by @dureuill in #5095
Full Changelog: v1.12.0-rc.0...v1.12.0-rc.1
v1.12.0-rc.0 🦗
Warning
Since this is a release candidate (RC), we do NOT recommend using it in a production environment. Is something not working as expected? We welcome bug reports and feedback about new features.
Meilisearch v1.12 introduces huge improvements to indexing speed. The team worked hard to halve the time needed to import huge datasets. You can also customize your settings to match your indexing needs and get the indexing speed that fits your usage.
New features and updates 🔥
Improve indexing speed
Indexing time for huge dataset imports (multiple millions of documents) is divided by two!
Done by @dureuill, @ManyTheFish, and @Kerollmops in #4900.
More visibility around indexing tasks
To give more visibility into how your tasks are processed, especially when indexing big documents, new routes have been introduced:
- GET /batches/:uid: returns a specific batch object
- GET /batches: returns a list of batch objects
The same query parameters as the GET /tasks route can be used to apply filtering.
Also, a new field is introduced in the task object: batch.
Done by @irevoire in #5060, #5070, #5080
Other improvements
- Introduce the reverse query parameter for the GET /tasks route, set to false by default. If set to true, tasks are returned in reversed order (the oldest first). Done by @irevoire in #5048.
- Make matches consider phrases as a single Match by @flevi29 in #4928
- Add new metrics to Prometheus by @PedroTurik in #5044
- Add indices field to _matchesPosition to specify where in an array a match comes from, by @LukasKalbertodt in #5005
Fixes 🐞
Misc
- Dependencies updates
  - Update benchmarks to match the new crates subfolder by @Kerollmops in #5021
  - Fix the benchmarks by @irevoire in #5037
  - Bump Swatinem/rust-cache from 2.7.1 to 2.7.5 in #5030
- CIs and tests
  - Improve performance of delete_index.rs by @DerTimonius in #4963
  - Improve performance of create_index.rs by @DerTimonius in #4962
  - Improve performance of get_documents.rs by @PedroTurik in #5025
  - Improve performance of formatted.rs by @PedroTurik in #5043
  - Fix the path used in the flaky tests CI by @Kerollmops in #5049
- Misc
  - Rollback the Meilisearch Kawaii logo by @Kerollmops in #5017
  - Add image source label to Dockerfile by @wuast94 in #4990
  - Hide code complexity into a subfolder by @Kerollmops in #5016
  - Internal tool: implement offline upgrade from v1.10 to v1.11 by @irevoire in #5034
❤️ Thanks again to our external contributors:
v1.11.3 🐿️
What's Changed
- For REST/OpenAI/ollama autoembedders users: Retry if deserialization of remote response failed by @dureuill in #5058
Full Changelog: v1.11.2...v1.11.3