Return distinct items from GetMany and SourceMany #4353
This commit fixes a bug where a GetMany or SourceMany API call with a repeated id would return a cartesian product of ids and documents. Take, for example, the same id repeated 3 times: Elasticsearch returns 3 documents in the response, where each JSON object is a representation of the same underlying document. Since all 3 documents have the same id, GetMany and SourceMany would return all 3 documents as a match for the first id, all 3 as a match for the second id, and so on. With this fix, only the first matching document is returned for each id. One _could_ argue that only a single document should be returned for the same repeated id. This fix, however, tries to reflect what is in the Elasticsearch response.
Fixes #4342
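The cartesian product described above can be reproduced without the client at all. The following is a minimal sketch, assuming the response is modelled as plain tuples; `docs`, `cartesian`, and `firstMatch` are hypothetical names for illustration and are not NEST types or the code changed in this PR.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an mget response: the same id requested three
// times comes back as three entries representing one underlying document.
var ids = new[] { "1", "1", "1" };
var docs = ids.Select(id => (Id: id, Source: "doc-" + id)).ToList();

// Buggy shape: every requested id is matched against every returned document.
var cartesian = ids.SelectMany(id => docs.Where(d => d.Id == id)).ToList();

// Shape described in this commit: for each id, keep only the first match.
var firstMatch = ids.Select(id => docs.First(d => d.Id == id)).ToList();

Console.WriteLine(cartesian.Count);  // 9 (3 ids x 3 documents)
Console.WriteLine(firstMatch.Count); // 3 (one document per requested id)
```

Three requested ids matched against three returned documents give nine results, which is the behaviour reported in #4342.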
This PR needs further work; an mget API call can return documents from one or more indices, such that documents with the same id from different indices can be returned.
One approach that may be acceptable would be to return only a document per index matching each |
I think the returning multiple per |
This commit updates the GetMany and SourceMany implementations to keep track of the seen indices for each id, so that a single document is returned for each id and index combination _per_ id. That is, if the same id is specified multiple times in the GetMany or SourceMany call and only a single index is targeted, each id will return just a single document.
I've added a commit to keep track of the seen indices for each Thoughts? |
This commit enumerates only distinct ids when retrieving hits or source from GetMany and SourceMany, so that the same id supplied to either will return only a single document per targeted index.
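A minimal sketch of the combined behaviour (distinct ids, plus a set of seen indices per id) might look like the following. It assumes hits are modelled as `(Index, Id)` tuples; the local function `GetManySketch` is a hypothetical name for illustration and is not the actual `MultiGetResponse` code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical hits from an mget across two indices. Id "1" was requested
// twice, so index "a" returns it twice.
var allHits = new List<(string Index, string Id)>
{
    ("a", "1"), ("a", "1"), ("b", "1"), ("a", "2"),
};
var requestedIds = new[] { "1", "1", "2" };

// Enumerate distinct ids only; for each id, yield at most one hit per index.
static IEnumerable<(string Index, string Id)> GetManySketch(
    IEnumerable<(string Index, string Id)> source, IEnumerable<string> ids)
{
    var seenIndices = new HashSet<string>();
    foreach (var id in ids.Distinct())
    {
        seenIndices.Clear();
        foreach (var hit in source)
        {
            // HashSet.Add returns false when the index was already seen for this id.
            if (hit.Id == id && seenIndices.Add(hit.Index))
                yield return hit;
        }
    }
}

var result = GetManySketch(allHits, requestedIds).ToList();
foreach (var r in result)
    Console.WriteLine($"{r.Index}/{r.Id}");
// prints a/1, b/1 and a/2 on separate lines
```

With a single targeted index, each repeated id therefore yields exactly one document; with several indices, one document per index.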
Ready for review again, @Mpdreamz, when you get a chance
src/Nest/Document/Multiple/MultiGet/Response/MultiGetResponse.cs
This commit fixes a bug in the MultiGetRequestFormatter whereby the document index is removed when a request index is specified, without checking whether the document index matches the request index. It also adds an integration test for the fix.
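The formatter decision described here can be sketched as a small pure function. This is an illustration only, under the assumption that the formatter chooses between writing and omitting a per-document index; `IndexToWriteBuggy` and `IndexToWriteFixed` are hypothetical helpers, not members of `MultiGetRequestFormatter`.

```csharp
#nullable enable
using System;

// Each helper returns the index to write on the per-document entry,
// or null when the index should be omitted from that entry.

// Buggy shape: any request-level index causes the document index to be dropped.
static string? IndexToWriteBuggy(string? requestIndex, string? documentIndex) =>
    requestIndex != null ? null : documentIndex;

// Fixed shape: drop the document index only when it matches the request index.
static string? IndexToWriteFixed(string? requestIndex, string? documentIndex) =>
    requestIndex != null && requestIndex == documentIndex ? null : documentIndex;

Console.WriteLine(IndexToWriteBuggy("logs", "metrics") ?? "(omitted)"); // (omitted)
Console.WriteLine(IndexToWriteFixed("logs", "metrics") ?? "(omitted)"); // metrics
Console.WriteLine(IndexToWriteFixed("logs", "logs") ?? "(omitted)");    // (omitted)
```

In the buggy shape, a document that targets `metrics` silently loses its index and would be fetched from the request-level index `logs` instead.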
This commit fixes a bug where a GetMany or SourceMany API call with a repeated id would return a cartesian product of id and documents.
- enumerate only distinct ids when retrieving hits or source from GetMany and SourceMany, so that the same id input to either will return only a single document per target index.
- fix a bug in the MultiGetRequestFormatter whereby the document index is removed when a request index is specified, without checking whether the document index matches the request index.
Fixes #4342
(cherry picked from commit 8cbc1fe)
#4353 fixed an issue with the GetMany helpers, which returned the cartesian product of all ids specified rather than creating a distinct list if more than one index was targeted. This PR also updated the routine in the serializer to omit the index name from each item if the index is already specified on the URL in the case of multiple indices. This updated routine in `7.6.0` could throw if you are calling `client.GetMany<T>(ids, "indexName");` without configuring `ConnectionSettings()` with either a default index for T or a global default index.