
Add AI connector blueprints for Aleph Alpha luminous-base embedding model #1925

Closed
ulan-yisaev opened this issue Jan 25, 2024 · 7 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@ulan-yisaev (Contributor) commented Jan 25, 2024

Is your feature request related to a problem?
I'm proposing to add an AI connector blueprint for the Aleph Alpha Luminous-Base Embedding Model to the current collection of remote inference blueprints in OpenSearch ML Commons.

This model is particularly effective for German language applications, providing nuanced and contextually relevant embeddings. Given the increasing demand for robust language model solutions in different languages, integrating this model could significantly enhance your offerings for German-language processing tasks.

What solution would you like?
A markdown file containing an AI connector blueprint for the Aleph Alpha luminous-base embedding model.

What alternatives have you considered?
I was able to write one myself, using the existing blueprints as a reference:

{
  "name": "Aleph Alpha Connector: luminous-base, representation: document",
  "description": "The connector to the Aleph Alpha luminous-base embedding model with representation: document",
  "version": "0.1",
  "protocol": "http",
  "parameters": {
    "endpoint": "api.aleph-alpha.com",
    "representation": "document",
    "normalize": true
  },
  "credential": {
    "AlephAlpha_API_Token": "XXXXXXXXXXXXXXXXXX"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/semantic_embed",
      "headers": {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Bearer ${credential.AlephAlpha_API_Token}"
      },
      "request_body": "{ \"model\": \"luminous-base\", \"prompt\": \"${parameters.input}\", \"representation\": \"${parameters.representation}\", \"normalize\": ${parameters.normalize}}",
      "pre_process_function": "\n    StringBuilder builder = new StringBuilder();\n    builder.append(\"\\\"\");\n    String first = params.text_docs[0];\n    builder.append(first);\n    builder.append(\"\\\"\");\n    def parameters = \"{\" +\"\\\"input\\\":\" + builder + \"}\";\n    return  \"{\" +\"\\\"parameters\\\":\" + parameters + \"}\";",
      "post_process_function": "\n      def name = \"embedding\";\n      def dataType = \"FLOAT32\";\n      if (params.embedding == null || params.embedding.length == 0) {\n        return params.message;\n      }\n      def shape = [params.embedding.length];\n      def json = \"{\" +\n                 \"\\\"name\\\":\\\"\" + name + \"\\\",\" +\n                 \"\\\"data_type\\\":\\\"\" + dataType + \"\\\",\" +\n                 \"\\\"shape\\\":\" + shape + \",\" +\n                 \"\\\"data\\\":\" + params.embedding +\n                 \"}\";\n      return json;\n    "
    }
  ]
}
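For readers unfamiliar with Painless, the two script functions in the blueprint above simply build JSON strings by concatenation. The following Python sketch (an illustration of what the scripts produce, not part of the blueprint itself) mirrors that logic:

```python
import json

def pre_process(text_docs):
    # Mirrors the Painless pre_process_function: wrap the first
    # document in {"parameters": {"input": "<doc>"}}.
    first = text_docs[0]
    parameters = '{' + '"input":' + '"' + first + '"' + '}'
    return '{' + '"parameters":' + parameters + '}'

def post_process(embedding):
    # Mirrors the Painless post_process_function: wrap the returned
    # embedding array in the tensor-style JSON shape ML Commons expects
    # (name, data_type, shape, data).
    shape = [len(embedding)]
    return (
        '{'
        '"name":"embedding",'
        '"data_type":"FLOAT32",'
        f'"shape":{shape},'
        f'"data":{embedding}'
        '}'
    )
```

Note that, like the Painless originals, `pre_process` does no escaping, so a document containing a double quote would break the generated JSON; the blueprint inherits the same limitation.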
@ulan-yisaev ulan-yisaev added enhancement New feature or request untriaged labels Jan 25, 2024
@saratvemulapalli (Member) commented:

Thanks @ulan-yisaev. Do you want to contribute the change in a PR?

@saratvemulapalli saratvemulapalli added the documentation Improvements or additions to documentation label Jan 25, 2024
@ulan-yisaev (Contributor, Author) commented:

Hi @saratvemulapalli ,
Sure thing, I'll be happy to contribute.

@ramda1234786 commented:

Hi @ulan-yisaev, I see you have used embedding models outside of Cohere, Bedrock, and OpenAI. I have been trying something similar with a Hugging Face text generation model's post_process_function, but have not been able to get it working. Any idea how to achieve this post_process_function?

I have this

[
    {
        "generated_text": "Your Generated text"
    }
]

and I want to convert it to the following using a post_process_function:

    {
        "completion": "Your Generated text"
    }

This is what I have tried so far:

"post_process_function": "\n def json = \"{\" +\n \"\\\"completion\\\":\\\"\" + params['response'][0].generated_text + \"\\\" }\";\n return json;\n "

@saratvemulapalli, do you have any ideas on this as well?

@ulan-yisaev (Contributor, Author) commented:

Hi @ramda1234786 ,
Please note that I haven't tested generation models, as my work primarily focuses on embedding models. But I suppose you could try the following function:

"post_process_function": "\n def generatedText = params.response[0].generated_text;\n def json = \"{\\\"completion\\\":\\\"\" + generatedText + \"\\\"}\";\n return json;\n"
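One general caveat with this concatenation approach (a side note, not tested against ml-commons): if the generated text contains double quotes or newlines, the concatenated result is invalid JSON. A Python sketch of the same transform, using a JSON library so escaping is handled for you:

```python
import json

def to_completion(response):
    # Convert [{"generated_text": "..."}] into {"completion": "..."},
    # letting json.dumps escape quotes and newlines that naive string
    # concatenation would pass through unescaped.
    return json.dumps({"completion": response[0]["generated_text"]})
```

In the Painless version you would need to escape `generated_text` yourself before concatenating it into the output string.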

@ramda1234786 commented:

Thanks for your response, @ulan-yisaev. I tried the function below, but no luck.

I get this from the Predict API without a post_process_function:

{
    "inference_results": [
        {
            "output": [
                {
                    "name": "response",
                    "dataAsMap": {
                        "response": [
                            {
                                "generated_text": "The title is Rush, year is 2013, budget is 500000, earning is 300000, genere is action.What is the budget of Rush?\n\nThe budget of Rush is 500000...................."
                            }
                        ]
                    }
                }
            ],
            "status_code": 200
        }
    ]
}

but when I add the post_process_function you suggested, I get the error below:

{
    "error": {
        "root_cause": [
            {
                "type": "script_exception",
                "reason": "runtime error",
                "script_stack": [
                    "generatedText = params.response[0].generated_text;\n def ",
                    "                      ^---- HERE"
                ],
                "script": " ...",
                "lang": "painless",
                "position": {
                    "offset": 28,
                    "start": 6,
                    "end": 62
                }
            }
        ],
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
            "generatedText = params.response[0].generated_text;\n def ",
            "                      ^---- HERE"
        ],
        "script": " ...",
        "lang": "painless",
        "position": {
            "offset": 28,
            "start": 6,
            "end": 62
        },
        "caused_by": {
            "type": "null_pointer_exception",
            "reason": "Cannot invoke \"Object.getClass()\" because \"callArgs[0]\" is null"
        }
    },
    "status": 400
}

Not sure how to fix this.
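The `null_pointer_exception` at `params.response[0]` suggests `params.response` is null inside the script, even though the raw Predict output nests the list under `dataAsMap.response`. How ml-commons exposes those fields to the script is not verified here, but the hard failure itself can be avoided with a null check before indexing. A Python sketch of that defensive lookup (key names taken from the response above; the fallback order is an assumption):

```python
def extract_generated_text(params):
    # Defensive lookup mirroring what a Painless null check would do:
    # try params["response"] first, fall back to the nested
    # dataAsMap.response shape seen in the raw Predict output, and
    # return None instead of raising when neither is present.
    response = params.get("response")
    if response is None:
        response = params.get("dataAsMap", {}).get("response")
    if not response:
        return None
    return response[0].get("generated_text")
```

In Painless the equivalent guard would be an explicit `if (params.response == null)` branch (or the `?.` null-safe operator) before dereferencing `[0].generated_text`.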

@mashah commented Jan 29, 2024

Version 2.12 is still under development. If you need RAG with OpenSearch my recommendation is to try Sycamore. We know that path works, though it's using 2.11. Once 2.12 is ready, we will have that working with Sycamore as well.

@b4sjoo b4sjoo moved this to Untriaged in ml-commons projects Jan 30, 2024
@ylwu-amzn ylwu-amzn moved this from Untriaged to In Progress in ml-commons projects Feb 2, 2024
@HenryL27 (Collaborator) commented Apr 9, 2024

Closing as the connector blueprint was added

@HenryL27 HenryL27 closed this as completed Apr 9, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in ml-commons projects Apr 9, 2024
@github-project-automation github-project-automation bot moved this to Planned work items in Test roadmap format Apr 19, 2024