[ Bug ] Neural Search Leads to "modelId is marked non-null but is null" when Targeting Multiple Indices #759

imbarazz · 2024-05-24T10:29:21Z

Note: this error was observed on OpenSearch 2.11 running on AWS cloud.

Performing a neural search against an alias, or performing a multi-search with multiple indices in a single header, leads to the following error:

"null_pointer_exception: modelId is marked non-null but is null".

This is problematic when searching across different indices, each with its own embedding model.

Reproduction

Search Pipeline for Embedder Model 1

PUT /_search/pipeline/embed_pipeline_1

{
  "request_processors": [
    {
      "neural_query_enricher": {
        "neural_field_default_id": {
          "common_vector_field": "embed_model_id_1"
        }
      }
    }
  ]
}

Search Pipeline for Embedder Model 2

PUT /_search/pipeline/embed_pipeline_2

{
  "request_processors": [
    {
      "neural_query_enricher": {
        "neural_field_default_id": {
          "common_vector_field": "embed_model_id_2"
        }
      }
    }
  ]
}

Update Index 1 with Pipeline 1

PUT /index1/_settings

{
  "index.search.default_pipeline" : "embed_pipeline_1"
}

Update Index 2 with Pipeline 2

PUT /index2/_settings

{
  "index.search.default_pipeline" : "embed_pipeline_2"
}
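
The four setup requests above follow one pattern per index. As a minimal sketch, they can be generated as plain data in Python. No client library is assumed, and the helper name `setup_requests` is hypothetical; the pipeline, index, field, and model names are the ones used in this reproduction.

```python
import json


def setup_requests(index_to_model, vector_field="common_vector_field"):
    """Return (method, path, body) tuples mirroring the setup requests above.

    `setup_requests` is a hypothetical helper for illustration only.
    """
    requests = []
    for n, (index, model_id) in enumerate(index_to_model.items(), start=1):
        pipeline = f"embed_pipeline_{n}"
        # Search pipeline that supplies a default model ID for the vector field.
        requests.append(("PUT", f"/_search/pipeline/{pipeline}", {
            "request_processors": [
                {"neural_query_enricher": {
                    "neural_field_default_id": {vector_field: model_id}
                }}
            ]
        }))
        # Make that pipeline the default for searches against this index.
        requests.append(("PUT", f"/{index}/_settings", {
            "index.search.default_pipeline": pipeline
        }))
    return requests


if __name__ == "__main__":
    mapping = {"index1": "embed_model_id_1", "index2": "embed_model_id_2"}
    for method, path, body in setup_requests(mapping):
        print(method, path)
        print(json.dumps(body, indent=2))
```

Each tuple can be sent with whatever HTTP client is at hand; the sketch only builds the request bodies and does not contact a cluster.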

Perform Multi-Search

GET /_msearch

{"index": ["index1", "index2"]}
{"query": {"neural": {"common_vector_field": {"query_text": "How do I perform a neural multi-search when dealing with multiple indices?", "k": 5}}}, "from": 0, "size": 5}
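
When sending this request outside the Dev Tools console, the _msearch body has to be newline-delimited JSON: one line for the header, one line for the search body, and a trailing newline. A minimal sketch of building that payload in Python follows. The helper name `build_msearch_body` is hypothetical, and nothing is sent to a cluster.

```python
import json


def build_msearch_body(pairs):
    """Serialize (header, body) pairs as newline-delimited JSON.

    `build_msearch_body` is a hypothetical helper for illustration only.
    """
    lines = []
    for header, body in pairs:
        lines.append(json.dumps(header))  # header line: target indices
        lines.append(json.dumps(body))    # body line: the search request
    # The payload must end with a newline.
    return "\n".join(lines) + "\n"


header = {"index": ["index1", "index2"]}
search = {
    "query": {
        "neural": {
            "common_vector_field": {
                "query_text": "How do I perform a neural multi-search "
                              "when dealing with multiple indices?",
                "k": 5,
            }
        }
    },
    "from": 0,
    "size": 5,
}
payload = build_msearch_body([(header, search)])

if __name__ == "__main__":
    print(payload, end="")
```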


dblock · 2024-06-24T16:47:44Z

Catch All Triage - 1 2 3 4 5 6

imbarazz · 2024-07-12T09:11:26Z

Playing around with this further, I've come to the realization that the issue here is that the index.search.default_pipeline assigned to an index does not take effect when targeting indices via an alias. To get this to work, one must explicitly pass the search pipeline via the search_pipeline URL parameter, like so:

/my_alias/_search?search_pipeline=my_search_pipeline
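
For clients that let you set the request path directly, the workaround amounts to appending one query-string parameter. A minimal sketch in Python, using only the standard library; the helper name `search_path` is hypothetical, and the alias and pipeline names are the placeholders from the path above.

```python
from urllib.parse import quote, urlencode


def search_path(target, search_pipeline=None):
    """Build a _search path, optionally naming the search pipeline explicitly.

    `search_path` is a hypothetical helper for illustration only.
    """
    # Keep commas and wildcards so multi-index targets stay readable.
    path = f"/{quote(target, safe=',*')}/_search"
    if search_pipeline:
        path += "?" + urlencode({"search_pipeline": search_pipeline})
    return path


if __name__ == "__main__":
    print(search_path("my_alias", "my_search_pipeline"))
```

Without the second argument the helper returns the plain path, which relies on index.search.default_pipeline and therefore hits the behaviour described in this issue when the target is an alias.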

vibrantvarun · 2024-07-15T21:25:48Z

@martin-gaievski I don't think this is a bug. Thoughts?

martin-gaievski · 2024-07-15T23:18:03Z

If I understood this correctly, index.search.default_pipeline supports only single-index search, correct? If that's the case, then the enricher processor will work the same way, and this is expected. I do have doubts about the error message, though: we can return a more meaningful error, something that is actionable from the user's perspective.
We also need to state this clearly in the documentation for the enricher processor: https://opensearch.org/docs/latest/search-plugins/search-pipelines/neural-query-enricher/

imbarazz · 2024-07-16T09:09:56Z

> if I understood this correctly the index.search.default_pipeline does support only single index search, correct? If that's the case then enricher processor will work the same way, this is expected.

Thanks for the reply.

Currently, an index's configured neural_query_enricher is completely ignored during a neural search against an alias. Is this expected behaviour? This seems strange to me.

martin-gaievski · 2024-07-17T16:01:22Z

It's not expected behavior; it's more of a gap. The team didn't check this scenario.

Do these steps summarize the issue correctly, @imbarazz?

1. Create index index_A with a default model, e.g. model_id_1, assigned via the index.search.default_pipeline index setting.
2. Create alias alias_A with some filter, pointing to index_A.
3. Run a hybrid query with alias_A as the target index. Use a neural sub-query and do not specify any model ID.

Expected result: the hybrid query is executed, and for the neural sub-query the model model_id_1 is picked up from the index setting for index_A.
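
The scenario above can be sketched as a sequence of REST requests, written here as plain Python data. Everything not named in the steps is an assumption made for illustration: the helper name `scenario_requests`, the pipeline name pipeline_A, the alias filter, and the field names text_field and vector_field. Index mappings and the normalization processor that a hybrid query typically also needs are omitted.

```python
def scenario_requests(model_id="model_id_1"):
    """Return (method, path, body) tuples for the alias scenario above.

    Hypothetical helper; names other than index_A, alias_A and model_id_1
    are placeholders, not taken from the issue.
    """
    return [
        # Step 1a: pipeline that supplies the default model for the field.
        ("PUT", "/_search/pipeline/pipeline_A", {
            "request_processors": [
                {"neural_query_enricher": {
                    "neural_field_default_id": {"vector_field": model_id}
                }}
            ]
        }),
        # Step 1b: assign it as the index's default search pipeline.
        ("PUT", "/index_A/_settings", {
            "index.search.default_pipeline": "pipeline_A"
        }),
        # Step 2: filtered alias pointing at index_A (filter is a placeholder).
        ("POST", "/_aliases", {
            "actions": [
                {"add": {
                    "index": "index_A",
                    "alias": "alias_A",
                    "filter": {"term": {"category": "docs"}},
                }}
            ]
        }),
        # Step 3: hybrid query via the alias; the neural clause has no model ID.
        ("GET", "/alias_A/_search", {
            "query": {
                "hybrid": {
                    "queries": [
                        {"match": {"text_field": {"query": "example text"}}},
                        {"neural": {"vector_field": {
                            "query_text": "example text",
                            "k": 5,
                        }}},
                    ]
                }
            }
        }),
    ]


if __name__ == "__main__":
    for method, path, _body in scenario_requests():
        print(method, path)
```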

Sylver11 · 2024-07-24T10:51:58Z

Is this being picked up? To me this is a fatal one and the opposite of expected behaviour. It took me an hour to figure out what the problem was.

navneet1v · 2024-07-29T19:19:56Z

@minalsha, @martin-gaievski , @vibrantvarun

Romasato · 2024-08-20T16:04:02Z

Ran into this very same issue just today. I was hoping to get away from needing to pass the ML model ID in the search query.

Then, as a workaround, I was planning to use the search_pipeline= query string parameter with the search request, but I could not find a way to pass this extra parameter via either the high-level or the low-level .NET client library.

Any suggestions?

aalbahem · 2024-08-30T07:08:59Z

I encountered the same bug today. It was unexpected, and it would be beneficial to resolve it or prioritize it. Additionally, highlighting this issue in the documentation for now would be helpful.

vibrantvarun · 2024-10-11T17:41:26Z

Until now, search pipelines have not been supported with msearch. A PR adding this support has already been raised and merged by @owaiskazi19 in OpenSearch. Starting with the next OpenSearch release, search pipelines will be supported with msearch.

imbarazz added the bug and untriaged labels May 24, 2024

dblock removed the untriaged label Jun 24, 2024

naveentatikonda changed the title ~~Neural Search Leads to "modelId is marked non-null but is null" when Targeting Multiple Indices~~ [ Bug ] Neural Search Leads to "modelId is marked non-null but is null" when Targeting Multiple Indices Sep 18, 2024

jmazanec15 assigned vibrantvarun Oct 2, 2024
