adding blue print doc for cohere multi-modal model (opensearch-project#3229) (opensearch-project#3232)

* adding blue print doc

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* addressed comments

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

* addressed comment

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>

---------

Signed-off-by: Dhrubo Saha <dhrubo@amazon.com>
(cherry picked from commit de301b7)

Co-authored-by: Dhrubo Saha <dhrubo@amazon.com>
opensearch-trigger-bot[bot] and dhrubo-os authored Nov 19, 2024
1 parent 4f21953 commit f9cbf15
Showing 4 changed files with 332 additions and 8 deletions.
@@ -25,7 +25,7 @@ public CohereMultiModalEmbeddingPreProcessFunction() {
public void validate(MLInput mlInput) {
validateTextDocsInput(mlInput);
List<String> docs = ((TextDocsInputDataSet) mlInput.getInputDataset()).getDocs();
- if (docs == null || docs.isEmpty() || (docs.size() == 1 && docs.get(0) == null)) {
+ if (docs == null || docs.isEmpty() || docs.get(0) == null) {
throw new IllegalArgumentException("No image provided");
}

@@ -34,15 +34,15 @@ public void validate(MLInput mlInput) {
@Override
public RemoteInferenceInputDataSet process(MLInput mlInput) {
TextDocsInputDataSet inputData = (TextDocsInputDataSet) mlInput.getInputDataset();
- Map<String, String> parametersMap = new HashMap<>();
+ Map<String, Object> parametersMap = new HashMap<>();

/**
* Cohere multi-modal model expects either image or texts, not both.
* For image, customer can use this pre-process function. For texts, customer can use
* connector.pre_process.cohere.embedding
* Cohere expects An array of image data URIs for the model to embed. Maximum number of images per call is 1.
*/
- parametersMap.put("images", inputData.getDocs().get(0));
+ parametersMap.put("images", inputData.getDocs());
return RemoteInferenceInputDataSet
.builder()
.parameters(convertScriptStringToJsonString(Map.of("parameters", parametersMap)))
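
To make the effect of the changes in this file concrete, the following standalone Java sketch compares the old and new null checks and shows the new shape of the `images` parameter. The class `ImageDocsCheck` and its methods are hypothetical names used only for illustration; they are not part of the commit, and the sketch deliberately leaves out `MLInput`, `TextDocsInputDataSet`, and `convertScriptStringToJsonString`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the commit.
class ImageDocsCheck {

    // The rejection condition as it read before the commit.
    static boolean rejectedBefore(List<String> docs) {
        return docs == null || docs.isEmpty() || (docs.size() == 1 && docs.get(0) == null);
    }

    // The rejection condition as it reads after the commit.
    static boolean rejectedAfter(List<String> docs) {
        return docs == null || docs.isEmpty() || docs.get(0) == null;
    }

    // Mirrors the new put call: the whole list is stored, not only its first element.
    static Map<String, Object> buildParameters(List<String> docs) {
        Map<String, Object> parametersMap = new HashMap<>();
        parametersMap.put("images", docs);
        return parametersMap;
    }

    public static void main(String[] args) {
        // A null first element followed by a real entry.
        List<String> docs = Arrays.asList(null, "imageString");
        System.out.println("rejected before: " + rejectedBefore(docs));
        System.out.println("rejected after:  " + rejectedAfter(docs));

        Map<String, Object> parameters = buildParameters(Arrays.asList("imageString"));
        System.out.println("images is a List: " + (parameters.get("images") instanceof List));
    }
}
```

Under the old condition, a list whose first element is null but whose size is not 1 passes validation; under the new condition it is rejected as soon as the first element is null.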
@@ -66,7 +66,7 @@ public void testProcess_whenCorrectInput_expectCorrectOutput() {
MLInput mlInput = MLInput.builder().algorithm(FunctionName.TEXT_EMBEDDING).inputDataset(textDocsInputDataSet).build();
RemoteInferenceInputDataSet dataSet = function.apply(mlInput);
assertEquals(1, dataSet.getParameters().size());
- assertEquals("imageString", dataSet.getParameters().get("images"));
+ assertEquals("[\"imageString\"]", dataSet.getParameters().get("images"));

}
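
The updated expectation reflects that the parameter value is now the JSON form of a list, not a bare string. The sketch below uses a hypothetical hand-written helper, `toJsonArray`, as a stand-in for whatever serialization the plugin actually performs. It is an assumption for illustration only and shows just how a one-element list renders as the string asserted in the test.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration class; not the plugin's serialization code.
class JsonArraySketch {

    // Renders a list of strings as a JSON array, escaping only backslashes and double quotes.
    static String toJsonArray(List<String> items) {
        return items.stream()
            .map(s -> "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
            .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        // A single image entry, as in the unit test above.
        System.out.println(toJsonArray(List.of("imageString")));
    }
}
```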

@@ -1,6 +1,6 @@
### Cohere Embedding Connector Blueprint:

- This blueprint will show you how to connect a Cohere embedding model to your Opensearch cluster, including creating a k-nn index and your own Embedding pipeline. You will require a Cohere API key to create a connector.
+ This blueprint will show you how to connect a Cohere embedding model to your OpenSearch cluster, including creating a k-nn index and your own Embedding pipeline. You will require a Cohere API key to create a connector.

Cohere currently offers the following Embedding models (with model name and embedding dimensions). Note that only the following have been tested with the blueprint guide.

@@ -97,7 +97,7 @@ The last step is to deploy your model. Use the `model_id` returned by the regist
POST /_plugins/_ml/models/<MODEL_ID>/_deploy
```

- This will once again spawn a task to deploy your Model, with a response that will look like:
+ This will once again spawn a task to deploy your model, with a response that will look like:

```json
{
@@ -113,11 +113,11 @@ You can run the GET tasks request again to verify the status.
GET /_plugins/_ml/tasks/<TASK_ID>
```

- Once this is complete, your Model is deployed and ready!
+ Once this is complete, your model is deployed and ready!

##### 1e. Test model

- You can try this request to test that the Model behaves correctly:
+ You can try this request to test that the model behaves correctly:

```json
POST /_plugins/_ml/models/<MODEL_ID_HERE>/_predict