adding multi-modal pre-processor for cohere #3219

dhrubo-os · 2024-11-13T22:51:09Z

Description

Adds a multi-modal pre-processor for Cohere.

The Cohere multi-modal model accepts either an array of texts or an array of base64 images (with only 1 item in the array). Customers who want to embed images can use this pre-processor.
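
The behaviour described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the ml-commons implementation: the class name `CohereImagePreProcessSketch`, the method `toParameters`, and the empty-input check are hypothetical. Only the `images` parameter key and the use of the first input doc are taken from the diff under review in this PR.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the pre-processor; not the ml-commons class. */
public class CohereImagePreProcessSketch {

    /**
     * Maps a list of input docs to connector parameters.
     * Only the first doc is forwarded, under the "images" key,
     * because the model accepts a single image per request.
     */
    public static Map<String, String> toParameters(List<String> docs) {
        // Assumption: reject empty input explicitly instead of failing later.
        if (docs == null || docs.isEmpty()) {
            throw new IllegalArgumentException("expected one base64 image, got none");
        }
        Map<String, String> parameters = new HashMap<>();
        // get(0) is used rather than List.getFirst(), which only exists from Java 21.
        parameters.put("images", docs.get(0));
        return parameters;
    }

    public static void main(String[] args) {
        System.out.println(toParameters(List.of("data:image/png;base64,AAAA")));
    }
}
```

Any docs after the first are silently ignored in this sketch, so callers that hold several images would need to issue one request per image.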

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

ylwu-amzn · 2024-11-14T05:54:22Z

...ch/ml/common/connector/functions/preprocess/CohereMultiModalEmbeddingPreProcessFunction.java

+    public RemoteInferenceInputDataSet process(MLInput mlInput) {
+        TextDocsInputDataSet inputData = (TextDocsInputDataSet) mlInput.getInputDataset();
+        Map<String, String> parametersMap = new HashMap<>();
+        parametersMap.put("images", inputData.getDocs().getFirst());


Is only image input supported? Should we also consider supporting text input?

Cohere multi-modal doesn't support combining text with image input; a request carries either image input or text input. Example notebook

For text input, we could use our regular one: connector.pre_process.cohere.embedding

Got it. Could you add some Javadoc to explain this?
Does the model support multiple images or just one?

Just one per request.

An array of image data URIs for the model to embed. Maximum number of images per call is 1.

Got it. I suggest adding more Javadoc to explain these details.
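
To make the constraints from this thread concrete (image data URIs, at most one image per call), here is a small Java sketch with the kind of Javadoc the reviewer asks for. The class `ImageDataUriSketch` and both helper methods are hypothetical and are not part of ml-commons or the Cohere SDK. Note that `requireSingleImage` is stricter than the diff above, which simply takes the first doc.

```java
import java.util.Base64;
import java.util.List;

/** Hypothetical helpers illustrating the input rules discussed in the review. */
public class ImageDataUriSketch {

    /**
     * Encodes raw image bytes as a data URI of the form
     * data:<mimeType>;base64,<payload>.
     */
    public static String toDataUri(byte[] imageBytes, String mimeType) {
        return "data:" + mimeType + ";base64,"
            + Base64.getEncoder().encodeToString(imageBytes);
    }

    /**
     * Returns the only image in the list.
     * Rejects null, empty, or multi-image input, since the model
     * accepts a maximum of one image per call.
     */
    public static String requireSingleImage(List<String> images) {
        int count = images == null ? 0 : images.size();
        if (count != 1) {
            throw new IllegalArgumentException(
                "exactly one image per request is supported, got " + count);
        }
        return images.get(0);
    }

    public static void main(String[] args) {
        String uri = toDataUri(new byte[] {1, 2, 3}, "image/png");
        System.out.println(requireSingleImage(List.of(uri)));
    }
}
```

Failing fast on multiple images is a design choice for this sketch: it surfaces the one-image limit to the caller instead of dropping the extra inputs.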

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

b4sjoo · 2024-11-15T21:27:49Z

Regarding this PR, do we need to make any alterations to the current blueprint?

dhrubo-os · 2024-11-18T18:25:07Z

Regarding this PR, do we need to make any alterations to the current blueprint?

This is a new pre-process function, so I'll add a new blueprint doc.

* adding multi-modal pre-processor for cohere
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
* added javadoc
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
---------
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
(cherry picked from commit 7041c22)

opensearch-trigger-bot · 2024-11-18T18:33:23Z

The backport to feature/multi_tenancy failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-feature/multi_tenancy feature/multi_tenancy
# Navigate to the new working tree
cd .worktrees/backport-feature/multi_tenancy
# Create a new branch
git switch --create backport/backport-3219-to-feature/multi_tenancy
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 7041c225875709719262853064ae7465bc4cd042
# Push it to GitHub
git push --set-upstream origin backport/backport-3219-to-feature/multi_tenancy
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-feature/multi_tenancy

Then, create a pull request where the base branch is feature/multi_tenancy and the compare/head branch is backport/backport-3219-to-feature/multi_tenancy.

* adding multi-modal pre-processor for cohere
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
* added javadoc
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
---------
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* adding multi-modal pre-processor for cohere (#3219)
* adding multi-modal pre-processor for cohere
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
* added javadoc
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
---------
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
* changing getFirst method
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
---------
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* adding multi-modal pre-processor for cohere (#3219)
* adding multi-modal pre-processor for cohere
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
* added javadoc
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
---------
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
* changing getFirst method
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
---------
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
(cherry picked from commit 622f73d)

* adding multi-modal pre-processor for cohere (#3219)
* adding multi-modal pre-processor for cohere
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
* added javadoc
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
---------
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
* changing getFirst method
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
---------
Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
(cherry picked from commit 622f73d)
Co-authored-by: Dhrubo Saha <dhrubo@amazon.com>

dhrubo-os requested review from b4sjoo, jngz-es, model-collapse, rbhavna, ylwu-amzn, zane-neo, Zhangxunmt, austintlee, HenryL27, sam-herman and xinyual as code owners November 13, 2024 22:51

dhrubo-os temporarily deployed to ml-commons-cicd-env November 13, 2024 22:51 — with GitHub Actions Inactive

dhrubo-os had a problem deploying to ml-commons-cicd-env November 13, 2024 22:51 — with GitHub Actions Failure

adding multi-modal pre-processor for cohere

5a2cd94

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

dhrubo-os force-pushed the main branch from 9fba0b2 to 5a2cd94 Compare November 13, 2024 22:52

dhrubo-os temporarily deployed to ml-commons-cicd-env November 13, 2024 22:52 — with GitHub Actions Inactive

dhrubo-os had a problem deploying to ml-commons-cicd-env November 13, 2024 22:52 — with GitHub Actions Failure

ylwu-amzn reviewed Nov 14, 2024

dhrubo-os had a problem deploying to ml-commons-cicd-env November 14, 2024 22:25 — with GitHub Actions Failure

added javadoc

0500d11

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

dhrubo-os temporarily deployed to ml-commons-cicd-env November 14, 2024 22:38 — with GitHub Actions Inactive

dhrubo-os had a problem deploying to ml-commons-cicd-env November 14, 2024 22:38 — with GitHub Actions Failure

dhrubo-os had a problem deploying to ml-commons-cicd-env November 14, 2024 23:26 — with GitHub Actions Failure

ylwu-amzn approved these changes Nov 15, 2024

dhrubo-os added the backport 2.x and backport feature/multi_tenancy labels Nov 15, 2024

b4sjoo approved these changes Nov 18, 2024

dhrubo-os merged commit 7041c22 into opensearch-project:main Nov 18, 2024
8 of 9 checks passed

opensearch-trigger-bot bot mentioned this pull request Nov 18, 2024

[Backport 2.x] adding multi-modal pre-processor for cohere #3225

Closed

dhrubo-os mentioned this pull request Nov 18, 2024

[BackPort to 2.x] adding multi-modal pre-processor for cohere #3227

Merged

5 tasks
