[Bug]: query error #38831

Open
1 task done
li-yongyu opened this issue Dec 29, 2024 · 9 comments
Assignees
Labels
kind/bug Issues or changes related a bug triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@li-yongyu

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.5.1-gpu or 2.5.0-beta-gpu
- Deployment mode(standalone or cluster): standalone 
- SDK version(e.g. pymilvus v2.0.0rc2): Java v2.5.2
- OS(Ubuntu or CentOS): Ubuntu

Current Behavior

  1. When searching through milvusClientV2.search with outputFields set to *, the request fails with the error 'incomplete query result, missing id %s, len(searchIDs) = %s, len(queryIDs) = %s, collection=%s: inconsistent requery result'. The same error can occur when I manually list all fields (including vector fields), but when the vector fields are left out of the list the results are returned correctly.
  2. The result of a filter expression in query is wrong (fetching by ID with get gives the same result): not all entities that match the `in` condition are returned.

Expected Behavior

  1. Correct full field results based on query conditions
  2. All results that meet the in condition are returned correctly

Steps To Reproduce

1. Run a search with outputFields set to *.
2. Fetch multiple entities with the `get` method. Even though every ID in the set exists in the collection, get returns only some of them, not all. Using the `in` filter expression gives the same result.
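Step 2 is easier to confirm with a small check that compares the requested IDs against what came back. The following is a minimal sketch in plain Python, not SDK code: the helper names `build_in_filter` and `missing_ids`, the primary key field name `id`, and the sample rows are all assumptions for illustration.

```python
from typing import Iterable, List, Mapping, Sequence


def build_in_filter(pk_field: str, ids: Sequence[int]) -> str:
    """Build a filter such as 'id in [1, 2, 3]' for integer primary keys."""
    return f"{pk_field} in [{', '.join(str(i) for i in ids)}]"


def missing_ids(requested: Sequence[int], rows: Iterable[Mapping], pk_field: str) -> List[int]:
    """Return the requested IDs that are absent from the returned rows, in request order."""
    returned = {row[pk_field] for row in rows}
    return [i for i in requested if i not in returned]


requested = [101, 102, 103, 104]
# Stand-in for the rows returned by get/query: only two of the four IDs came back.
rows = [{"id": 101}, {"id": 103}]

print(build_in_filter("id", requested))    # id in [101, 102, 103, 104]
print(missing_ids(requested, rows, "id"))  # [102, 104]
```

Attaching the list of missing IDs to the report makes it clear which entities were dropped and whether the set changes between runs.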

Milvus Log

No response

Anything else?

No response

@li-yongyu li-yongyu added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 29, 2024
@yanliang567
Contributor

@li-yongyu

  1. The first one, "incomplete query result, missing id", is a known issue and we are working on a fix. Please stay tuned.
  2. As for the second issue: if you are using search with expr in, this could be expected, because ANN search is approximate. If you are using query with expr in, then it could be a bug; please upload the Milvus logs for investigation.
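The wording of the first error suggests that the IDs returned by the search are fetched again with a query, and that the request fails when one of them does not come back. The sketch below is a simplified Python model of such a check, written only from the error text quoted in this issue. It is not the Milvus implementation, and `check_requery` and `InconsistentRequeryError` are hypothetical names.

```python
from typing import Iterable, Mapping, Sequence


class InconsistentRequeryError(Exception):
    """Raised when an ID found by the search is absent from the follow-up query."""


def check_requery(search_ids: Sequence[int], query_rows: Iterable[Mapping],
                  pk_field: str, collection: int) -> None:
    """Verify that every ID found by the search was also returned by the query."""
    query_ids = [row[pk_field] for row in query_rows]
    seen = set(query_ids)
    for pk in search_ids:
        if pk not in seen:
            # Message layout copied from the error text quoted in this issue.
            raise InconsistentRequeryError(
                f"incomplete query result, missing id {pk}, "
                f"len(searchIDs) = {len(search_ids)}, "
                f"len(queryIDs) = {len(query_ids)}, "
                f"collection={collection}: inconsistent requery result"
            )


try:
    # The search found three IDs, but the follow-up query returned only one row.
    check_requery([1, 2, 3], [{"id": 1}], "id", collection=42)
except InconsistentRequeryError as err:
    print(err)
```

Under this reading, the counts in the message show how many search hits could not be fetched again.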

/assign @li-yongyu
/unassign

@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 30, 2024
@li-yongyu
Author

@li-yongyu

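  1. The first one, "incomplete query result, missing id", is a known issue and we are working on a fix. Please stay tuned.
  2. As for the second issue: if you are using search with expr in, this could be expected, because ANN search is approximate. If you are using query with expr in, then it could be a bug; please upload the Milvus logs for investigation.

/assign @li-yongyu /unassign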
logs.log

The log covers four queries:

  1. A vector search with all fields, outputFields="*", which fails.
  2. A search with the fields listed manually (vector fields excluded; I am not sure whether the vector fields are what cause the error), which returns results correctly.
  3. A lookup by ID through get in the Java SDK. It seems that this is also implemented with an `in` expression. The returned results are incorrect: 10 IDs that exist in the database are not found.
  4. A query through the Java SDK for the same 10 IDs, with the `in` expression set manually in expr (filter). The result is the same as in 3: nothing is returned (sometimes a few results come back, but never all of them), whereas an earlier test returned 3 results.

@xiaofan-luan
Collaborator

The get API should be 100% correct.

It seems certain that you have some duplicated PKs, but right now we don't have details on what is happening in your cluster.

Here are some guidelines for how you can help:

  1. If you generate a backup of your data and share it with us, that would help the investigation a lot.
  2. Run a get request and collect the logs. There should be some error messages for us to debug.
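One way to test the duplicated-PK hypothesis before sharing a backup is to count how often each primary key occurs in an export of the collection. This is a minimal sketch under assumptions: the rows are already available as a list of dicts, the primary key field is called `id`, and `duplicated_pks` is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Mapping


def duplicated_pks(rows: Iterable[Mapping], pk_field: str) -> Dict[Hashable, int]:
    """Return each primary key that occurs more than once, with its count."""
    counts = Counter(row[pk_field] for row in rows)
    return {pk: n for pk, n in counts.items() if n > 1}


# Stand-in for exported rows: key 1 appears three times, key 2 twice.
rows = [{"id": 1}, {"id": 2}, {"id": 1}, {"id": 3}, {"id": 2}, {"id": 1}]
print(duplicated_pks(rows, "id"))  # {1: 3, 2: 2}
```

An empty result would argue against duplicated keys as the cause.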

@xiaofan-luan
Collaborator

The get API should be 100% correct.

It seems certain that you have some duplicated PKs, but right now we don't have details on what is happening in your cluster.

Here are some guidelines for how you can help:

  1. If you generate a backup of your data and share it with us, that would help the investigation a lot.
  2. Run a get request and collect the logs. There should be some error messages for us to debug.

get is just a wrapper around query, so there should be no difference between the two.

@li-yongyu
Author

The get API should be 100% correct.

It seems certain that you have some duplicated PKs, but right now we don't have details on what is happening in your cluster.

Here are some guidelines for how you can help:

  1. If you generate a backup of your data and share it with us, that would help the investigation a lot.
  2. Run a get request and collect the logs. There should be some error messages for us to debug.

The operation log is attached above, and it includes the calls to the query API.

@yanliang567 I have sent the data by email.

@yanliang567
Contributor

@li-yongyu thank you for the update. I got the mail and will keep you posted

@yanliang567
Contributor

/assign @congqixia

@kagaho

kagaho commented Jan 7, 2025

Same here, getting a similar issue, this id (my PK) doesn't even exist:

[2025/01/07 17:56:48.529 +00:00] [WARN] [proxy/task_search.go:748] ["failed to requery"] [traceID=f8e3ff633f9f7509468772cd75efb512] [nq=1] [error="incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589: inconsistent requery result"] [errorVerbose="incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589: inconsistent requery result\n(1) attached stack trace\n  -- stack trace:\n  | github.com/milvus-io/milvus/pkg/util/merr.WrapErrInconsistentRequery\n  | \t/workspace/source/pkg/util/merr/utils.go:1104\n  | github.com/milvus-io/milvus/internal/proxy.(*searchTask).Requery\n  | \t/workspace/source/internal/proxy/task_search.go:914\n  | github.com/milvus-io/milvus/internal/proxy.(*searchTask).PostExecute\n  | \t/workspace/source/internal/proxy/task_search.go:746\n  | github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).processTask\n  | \t/workspace/source/internal/proxy/task_scheduler.go:485\n  | github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).queryLoop.func1\n  | \t/workspace/source/internal/proxy/task_scheduler.go:566\n  | github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n  | \t/workspace/source/pkg/util/conc/pool.go:82\n  | github.com/panjf2000/ants/v2.(*goWorker).run.func1\n  | \t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67\n  | runtime.goexit\n  | \t/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.0.linux-amd64/src/runtime/asm_amd64.s:1695\nWraps: (2) incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589\nWraps: (3) inconsistent requery result\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) merr.milvusError"]
[2025/01/07 17:56:48.529 +00:00] [WARN] [proxy/task_scheduler.go:488] ["Failed to post-execute task: "] [traceID=f8e3ff633f9f7509468772cd75efb512] [error="incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589: inconsistent requery result"] [errorVerbose="incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589: inconsistent requery result\n(1) attached stack trace\n  -- stack trace:\n  | github.com/milvus-io/milvus/pkg/util/merr.WrapErrInconsistentRequery\n  | \t/workspace/source/pkg/util/merr/utils.go:1104\n  | github.com/milvus-io/milvus/internal/proxy.(*searchTask).Requery\n  | \t/workspace/source/internal/proxy/task_search.go:914\n  | github.com/milvus-io/milvus/internal/proxy.(*searchTask).PostExecute\n  | \t/workspace/source/internal/proxy/task_search.go:746\n  | github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).processTask\n  | \t/workspace/source/internal/proxy/task_scheduler.go:485\n  | github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).queryLoop.func1\n  | \t/workspace/source/internal/proxy/task_scheduler.go:566\n  | github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n  | \t/workspace/source/pkg/util/conc/pool.go:82\n  | github.com/panjf2000/ants/v2.(*goWorker).run.func1\n  | \t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67\n  | runtime.goexit\n  | \t/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.0.linux-amd64/src/runtime/asm_amd64.s:1695\nWraps: (2) incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589\nWraps: (3) inconsistent requery result\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) merr.milvusError"]
[2025/01/07 17:56:48.529 +00:00] [WARN] [proxy/impl.go:3389] ["HybridSearch failed to WaitToFinish"] [traceID=f8e3ff633f9f7509468772cd75efb512] [role=proxy] [db=default] [collection=Strata_01] [partitions="[]"] [OutputFields="[doc_id,text,metadata]"] [ConsistencyLevel=Strong] [useDefaultConsistency=true] [error="incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589: inconsistent requery result"] [errorVerbose="incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589: inconsistent requery result\n(1) attached stack trace\n  -- stack trace:\n  | github.com/milvus-io/milvus/pkg/util/merr.WrapErrInconsistentRequery\n  | \t/workspace/source/pkg/util/merr/utils.go:1104\n  | github.com/milvus-io/milvus/internal/proxy.(*searchTask).Requery\n  | \t/workspace/source/internal/proxy/task_search.go:914\n  | github.com/milvus-io/milvus/internal/proxy.(*searchTask).PostExecute\n  | \t/workspace/source/internal/proxy/task_search.go:746\n  | github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).processTask\n  | \t/workspace/source/internal/proxy/task_scheduler.go:485\n  | github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).queryLoop.func1\n  | \t/workspace/source/internal/proxy/task_scheduler.go:566\n  | github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n  | \t/workspace/source/pkg/util/conc/pool.go:82\n  | github.com/panjf2000/ants/v2.(*goWorker).run.func1\n  | \t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67\n  | runtime.goexit\n  | \t/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.0.linux-amd64/src/runtime/asm_amd64.s:1695\nWraps: (2) incomplete query result, missing id 455115042215584610, len(searchIDs) = 6, len(queryIDs) = 0, collection=455115042216245589\nWraps: (3) inconsistent requery result\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) merr.milvusError"]
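When collecting logs like these for the maintainers, it can help to pull the key numbers out of each error line. The following is a minimal sketch in plain Python that assumes the message layout shown in the log lines above; `parse_requery_error` is a hypothetical helper, not part of any Milvus tool.

```python
import re
from typing import Dict, Optional, Union

# Pattern derived from the error text in the log lines above.
REQUERY_ERROR = re.compile(
    r"incomplete query result, missing id (?P<missing_id>[^,]+), "
    r"len\(searchIDs\)\s*=\s*(?P<search_ids>\d+), "
    r"len\(queryIDs\)\s*=\s*(?P<query_ids>\d+), "
    r"collection=(?P<collection>\d+)"
)


def parse_requery_error(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Extract the missing ID, both ID counts, and the collection ID from a log line."""
    match = REQUERY_ERROR.search(line)
    if match is None:
        return None
    return {
        "missing_id": match.group("missing_id"),
        "search_ids": int(match.group("search_ids")),
        "query_ids": int(match.group("query_ids")),
        "collection": match.group("collection"),
    }


line = ('[error="incomplete query result, missing id 455115042215584610, '
        'len(searchIDs) = 6, len(queryIDs) = 0, '
        'collection=455115042216245589: inconsistent requery result"]')
print(parse_requery_error(line))
```

For the lines above this yields 6 search IDs and 0 query IDs, which matches the message itself.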

In my setup I am running milvus-standalone. I came from Milvus 2.4.14, so I am still using an external BM25 model (and BGE-M3) for the hybrid vector embeddings. Could that be the trigger?

The issue does not happen with 2.4.14, where I have a similar collection loaded.

My config is quite simple: a single collection with around 115k doc IDs.

Let me know if I can help with anything.

Thanks!

@collinpu

collinpu commented Jan 8, 2025

I'm also having this problem using Milvus 2.5.2.
It also only occurs when I return a vector field as part of the output. If I remove that field from the output everything works.
Being able to retrieve this vector field matters to me: as a workaround, I now have to run a redundant get query to retrieve the missing vectors.

Thanks for the great work!
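The workaround described in this comment, searching without the vector field and then fetching the vectors with a separate get, needs a merge step on the client. This is a minimal sketch of that step in plain Python; the field names `id` and `vector`, the sample data, and the helper `attach_vectors` are assumptions, and the actual search and get calls are left out.

```python
from typing import Any, Dict, List, Mapping, Sequence


def attach_vectors(hits: Sequence[Mapping[str, Any]],
                   fetched: Sequence[Mapping[str, Any]],
                   pk_field: str = "id",
                   vector_field: str = "vector") -> List[Dict[str, Any]]:
    """Add the vector from the get result to each search hit, keeping the hit order.

    Hits whose ID is absent from the get result receive None for the vector.
    """
    by_pk = {row[pk_field]: row[vector_field] for row in fetched}
    return [{**hit, vector_field: by_pk.get(hit[pk_field])} for hit in hits]


# Stand-ins for a search result without vectors and a get result with them.
hits = [{"id": 7, "distance": 0.12}, {"id": 3, "distance": 0.34}]
fetched = [{"id": 3, "vector": [0.1, 0.2]}, {"id": 7, "vector": [0.9, 0.8]}]

for row in attach_vectors(hits, fetched):
    print(row)
```

Keeping the search order matters because the get result is not guaranteed to arrive in ranking order.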


6 participants