adding multi-modal pre-processor for cohere #3219

Merged
dhrubo-os merged 2 commits into opensearch-project:main on Nov 18, 2024

dhrubo-os
Collaborator

@dhrubo-os dhrubo-os commented Nov 13, 2024

Description

Adds a multi-modal pre-processor for Cohere.

The Cohere multi-modal model accepts either an array of texts or an array of base64 images (the image array may hold only one item). Customers who want to embed images can use this pre-processor.

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
public RemoteInferenceInputDataSet process(MLInput mlInput) {
    TextDocsInputDataSet inputData = (TextDocsInputDataSet) mlInput.getInputDataset();
    Map<String, String> parametersMap = new HashMap<>();
    parametersMap.put("images", inputData.getDocs().getFirst());
Collaborator

Does this only support image input? Should we also consider supporting text input?

Collaborator Author

Cohere's multi-modal model doesn't accept text together with image input. A request carries either image input or text input. Example notebook

For text input, we can use our regular pre-processor: connector.pre_process.cohere.embedding

Collaborator

@ylwu-amzn ylwu-amzn Nov 14, 2024

Got it. Could you add some Javadoc to explain this?
Does the model support multiple images or just one?

Collaborator Author

Just one per request.

Collaborator Author

An array of image data URIs for the model to embed. Maximum number of images per call is 1.
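
Since the input is described as an image data URI, a caller has to wrap the raw image bytes before sending them. The following is a minimal sketch of that step using only the JDK. The class name `ImageDataUriSketch`, the method `toDataUri`, and the `image/png` MIME type are illustrative assumptions, not part of ml-commons or the Cohere API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper (not an ml-commons class) that wraps image bytes in a base64 data URI. */
final class ImageDataUriSketch {

    private ImageDataUriSketch() {}

    /**
     * Encodes the given bytes as "data:<mimeType>;base64,<payload>".
     * Which MIME types the remote model accepts is not checked here.
     */
    static String toDataUri(byte[] imageBytes, String mimeType) {
        if (imageBytes == null || imageBytes.length == 0) {
            throw new IllegalArgumentException("imageBytes must not be empty");
        }
        String payload = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + payload;
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for real image content read from a file.
        byte[] placeholder = "hi".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toDataUri(placeholder, "image/png"));
    }
}
```

In practice the bytes would come from an image file, and the resulting string would be the single document passed to the pre-processor.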

Collaborator

Got it. I suggest adding more Javadoc to explain these details.
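
To make the constraints from this thread concrete (image-only input, one image per request), here is a dependency-free sketch of the mapping with Javadoc attached. It is not the code merged in this PR: `CohereImagePreProcessSketch` and `buildImageParameters` are hypothetical names, plain `List` and `Map` stand in for the ml-commons dataset types, and the strict rejection of more than one document is an assumption added for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical, standalone sketch of the image pre-processing step.
 * Text input is out of scope here; the thread points to
 * connector.pre_process.cohere.embedding for that case.
 */
final class CohereImagePreProcessSketch {

    private CohereImagePreProcessSketch() {}

    /**
     * Builds the connector parameters for a single-image embedding request.
     *
     * @param docs input documents, expected to hold exactly one base64 image
     * @return a map with the image stored under the "images" key
     * @throws IllegalArgumentException if docs does not hold exactly one non-blank entry
     */
    static Map<String, String> buildImageParameters(List<String> docs) {
        // Assumption: reject extra documents instead of silently ignoring them.
        if (docs == null || docs.size() != 1) {
            throw new IllegalArgumentException("Exactly one image is expected per request");
        }
        // get(0) compiles on older JDKs; List.getFirst() is only available from Java 21.
        String image = docs.get(0);
        if (image == null || image.isBlank()) {
            throw new IllegalArgumentException("The image must not be empty");
        }
        Map<String, String> parameters = new HashMap<>();
        parameters.put("images", image);
        return parameters;
    }

    public static void main(String[] args) {
        System.out.println(buildImageParameters(List.of("data:image/png;base64,AAAA")));
    }
}
```

The snippet under review takes the first document and does not validate the list size, so the check above is a design choice to weigh rather than existing behavior.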

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
@b4sjoo
Collaborator

b4sjoo commented Nov 15, 2024

Regarding this PR, do we need to make any alterations to the current blueprint?

@dhrubo-os
Collaborator Author

Regarding this PR, do we need to make any alterations to the current blueprint?

This is a new pre-process function, so I'll add a new blueprint doc.

@dhrubo-os dhrubo-os merged commit 7041c22 into opensearch-project:main Nov 18, 2024
8 of 9 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 18, 2024
* adding multi-modal pre-processor for cohere

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* added javadoc

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
(cherry picked from commit 7041c22)
@opensearch-trigger-bot
Contributor

The backport to feature/multi_tenancy failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-feature/multi_tenancy feature/multi_tenancy
# Navigate to the new working tree
cd .worktrees/backport-feature/multi_tenancy
# Create a new branch
git switch --create backport/backport-3219-to-feature/multi_tenancy
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 7041c225875709719262853064ae7465bc4cd042
# Push it to GitHub
git push --set-upstream origin backport/backport-3219-to-feature/multi_tenancy
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-feature/multi_tenancy

Then, create a pull request where the base branch is feature/multi_tenancy and the compare/head branch is backport/backport-3219-to-feature/multi_tenancy.

dhrubo-os added a commit to dhrubo-os/ml-commons that referenced this pull request Nov 18, 2024
* adding multi-modal pre-processor for cohere

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* added javadoc

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
dhrubo-os added a commit that referenced this pull request Nov 18, 2024
* adding multi-modal pre-processor for cohere (#3219)

* adding multi-modal pre-processor for cohere

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* added javadoc

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* changing getFirst method

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 19, 2024
* adding multi-modal pre-processor for cohere (#3219)

* adding multi-modal pre-processor for cohere

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* added javadoc

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* changing getFirst method

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
(cherry picked from commit 622f73d)
dhrubo-os added a commit that referenced this pull request Nov 19, 2024
* adding multi-modal pre-processor for cohere (#3219)

* adding multi-modal pre-processor for cohere

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* added javadoc

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* changing getFirst method

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
(cherry picked from commit 622f73d)

Co-authored-by: Dhrubo Saha <dhrubo@amazon.com>
3 participants