Fix remote model with embedding input issue #3289

Merged: 5 commits, Dec 30, 2024

Collaborator

@b4sjoo b4sjoo commented Dec 18, 2024

Description

All remote embedding models throw an exception when they receive an embedding input instead of a remote input. However, Neural Search uses the hardcoded value FunctionName.TEXT_EMBEDDING when instantiating the MLInput:
https://github.com/opensearch-project/neural-search/blob/7feacd67b3c7694ff4a1c1c2b430f2447a1ed4ab/src/main/java/org/opensearch/neuralsearch/ml/MLCommonsClientAccessor.java#L293

Therefore, we removed the required string parameter from the input interface so that a remote model can be used with both kinds of input. As a long-term fix, we plan to use a different interface for each usage within the same connector.
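
To illustrate why a strict interface rejects embedding input, here is a minimal sketch in plain Java. It is not ml-commons code: the class InterfaceValidationSketch, the accepts helper, and the field names "parameters", "inputText", and "text_docs" are assumptions made for this example, and it assumes the removed requirement was a top-level required field. The real interface is a JSON schema; this toy version only checks that the required top-level fields are present.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch only: hypothetical names, not ml-commons classes.
class InterfaceValidationSketch {

    // Toy stand-in for interface validation: the request body is accepted
    // when every required top-level field is present.
    static boolean accepts(List<String> requiredFields, Map<String, Object> requestBody) {
        return requestBody.keySet().containsAll(requiredFields);
    }

    public static void main(String[] args) {
        // Assumed shape of a remote input: values arrive under "parameters".
        Map<String, Object> remoteInput = Map.of("parameters", Map.of("inputText", "hello"));
        // Assumed shape of an embedding input (the FunctionName.TEXT_EMBEDDING path).
        Map<String, Object> embeddingInput = Map.of("text_docs", List.of("hello"));

        // Before the fix (assumption): the interface requires "parameters".
        List<String> strict = List.of("parameters");
        // After the fix (assumption): no top-level field is required.
        List<String> relaxed = List.of();

        System.out.println("strict, remote input:     " + accepts(strict, remoteInput));     // true
        System.out.println("strict, embedding input:  " + accepts(strict, embeddingInput));  // false
        System.out.println("relaxed, remote input:    " + accepts(relaxed, remoteInput));    // true
        System.out.println("relaxed, embedding input: " + accepts(relaxed, embeddingInput)); // true
    }
}
```

Under these assumptions, only the strict interface combined with embedding input fails, which matches the exception described above. Relaxing the requirement lets the same model accept both input shapes, at the cost of weaker validation for remote input.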

Related Issues

Resolves #3261

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: b4sjoo <sicheng.song@outlook.com>
@dhrubo-os
Collaborator

  1. Can we have any unit test which can reflect this change?
  2. Is there any documentation for model interface? Do we need to update the documentation too?

@b4sjoo
Collaborator Author

b4sjoo commented Dec 19, 2024

  1. Can we have any unit test which can reflect this change?
  2. Is there any documentation for model interface? Do we need to update the documentation too?

Unit tests added.
Interface doc: https://opensearch.org/docs/latest/ml-commons-plugin/api/model-apis/register-model/#the-interface-parameter
No documentation update is needed because this change restores the interface to its expected behavior.

jngz-es previously approved these changes Dec 19, 2024
@Zhangxunmt Zhangxunmt merged commit b631d89 into opensearch-project:main Dec 30, 2024
7 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Dec 30, 2024
* Fix remote model with embedding input issue

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Add UT

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Spotless

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Add UT for both embedding and remote cases for all remote embedding schema

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Remove hardcoded test schema

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

---------

Signed-off-by: b4sjoo <sicheng.song@outlook.com>
(cherry picked from commit b631d89)
@opensearch-trigger-bot
Contributor

The backport to 2.16 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.16 2.16
# Navigate to the new working tree
cd .worktrees/backport-2.16
# Create a new branch
git switch --create backport/backport-3289-to-2.16
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 b631d89eb3b38fe233060a3c5826f42af2a8ca99
# Push it to GitHub
git push --set-upstream origin backport/backport-3289-to-2.16
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.16

Then, create a pull request where the base branch is 2.16 and the compare/head branch is backport/backport-3289-to-2.16.

opensearch-trigger-bot bot pushed a commit that referenced this pull request Dec 30, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Dec 30, 2024
Zhangxunmt pushed a commit that referenced this pull request Dec 31, 2024
* Fix remote model with embedding input issue

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Add UT

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Spotless

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Add UT for both embedding and remote cases for all remote embedding schema

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Remove hardcoded test schema

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

---------

Signed-off-by: b4sjoo <sicheng.song@outlook.com>
(cherry picked from commit b631d89)

Co-authored-by: Sicheng Song <sicheng.song@outlook.com>
Zhangxunmt pushed a commit that referenced this pull request Dec 31, 2024
Zhangxunmt pushed a commit that referenced this pull request Dec 31, 2024
b4sjoo added a commit to b4sjoo/ml-commons that referenced this pull request Jan 2, 2025
Zhangxunmt pushed a commit that referenced this pull request Jan 2, 2025
)

* cherry-picking commit b29b893

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Fix remote model with embedding input issue (#3289)

* Fix remote model with embedding input issue

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Add UT

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Spotless

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Add UT for both embedding and remote cases for all remote embedding schema

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

* Remove hardcoded test schema

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

---------

Signed-off-by: b4sjoo <sicheng.song@outlook.com>

---------

Signed-off-by: b4sjoo <sicheng.song@outlook.com>
Co-authored-by: opensearch-trigger-bot[bot] <98922864+opensearch-trigger-bot[bot]@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

[BUG] Bedrock multimodal model interface can't work with neural search