Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
229 changes: 229 additions & 0 deletions docs/remote_inference_blueprints/ollama_connector_chat_blueprint.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,229 @@
# Ollama (OpenAI compatible) connector blueprint example for chat

This is an AI connector blueprint for Ollama or any other local/self-hosted LLM as long as it is OpenAI compatible (Ollama, llama.cpp, vLLM, etc)

## 1. Add connector endpoint to trusted URLs

Adjust the Regex to your local IP. The following example allows all URLs.

```json
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.trusted_connector_endpoints_regex": [
".*$"
]
}
}
```

## 2. Enable private addresses

```json
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.connector.private_ip_enabled": true
}
}
```

## 3. Create the connector

In a local setting, `openAI_key` might not be needed. In case you can either set it to something irrelevant, or if removed, you need to update the `Authorization` header in the `actions`.

```json
POST /_plugins/_ml/connectors/_create
{
"name": "<YOUR CONNECTOR NAME>",
"description": "<YOUR CONNECTOR DESCRIPTION>",
"version": "<YOUR CONNECTOR VERSION>",
"protocol": "http",
"parameters": {
"endpoint": "127.0.0.1:11434",
"model": "qwen3:4b"
},
"credential": {
"openAI_key": "<YOUR API KEY HERE IF NEEDED>"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://${parameters.endpoint}/v1/chat/completions",
"headers": {
"Authorization": "Bearer ${credential.openAI_key}"
},
"request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
}
]
}
```

### Sample response

```json
{
"connector_id": "Keq5FpkB72uHgF272LWj"
}
```

## 4. Register the model and deploy the model

Getting a model to work is a 2-step process. Register and Deploy.

### A: Register and Deploy in two steps

One way to do this is a `_register` call followed by a `_deploy` call.
First you register:

```json
POST /_plugins/_ml/models/_register
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you can do register and deploy is true, which save one step, POST /_plugins/_ml/models/_register?deploy=true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had that originally and then removed because IMO POST /_plugins/_ml/models/_register? Deploy=true gives a sense of automagic things happening. And with the explicit POST readers have to understand that there is this step (which you can automate) of deploying the connector.

After your comment about the tutorial, I think adding all the options in the tutorial to keep the blueprint cleaner would be a good one. But I'm open to exactly the opposite: making the blueprint very clean, even with such automations in place (and a note saying that they are there) and make the tutorial more dense with explanations about the deploying (and others) of models.

What option do you prefer?

PS: Just for reference, my first commit didn't have any deploy because I have auto deploy in my test cluster, that is something I would like to avoid for new readers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice to have both options, so the blueprint is friendly for starter and also giving a good tip for advanced users who wants to speed up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do that.

{
"name": "Local LLM Model",
"function_name": "remote",
"description": "Ollama model",
"connector_id": "Keq5FpkB72uHgF272LWj"
}
```

You get a response like this:

```json
{
"task_id": "oEdPqZQBQwAL8-GOCJbw",
"status": "CREATED",
"model_id": "oUdPqZQBQwAL8-GOCZYL"
}
```

Take note of the `model_id`, it is needed for the `_deploy` call.

Then you deploy:

Use `model_id` in place of the `<MODEL_ID>` placeholder.

```json
POST /_plugins/_ml/models/<MODEL_ID>/_deploy
```

#### Sample response

Once you get a response like this, your model is ready to use.

```json
{
"task_id": "oEdPqZQBQwAL8-GOCJbw",
"status": "CREATED",
"model_id": "oUdPqZQBQwAL8-GOCZYL"
}
```

### B: Register and Deploy in a single step

Another way is doing the 2 steps at once with `deploy=true`:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
"name": "Local LLM Model",
"function_name": "remote",
"description": "Ollama model",
"connector_id": "Keq5FpkB72uHgF272LWj"
}
```

#### Sample response

Once you get a response like this, your model is ready to use.

```json
{
"task_id": "oEdPqZQBQwAL8-GOCJbw",
"status": "CREATED",
"model_id": "oUdPqZQBQwAL8-GOCZYL"
}
```

### 5. Corresponding Predict request example

Notice how you have to create the whole message structure, not just the message to send.
Use `model_id` in place of the `<MODEL_ID>` placeholder.

```json
POST /_plugins/_ml/models/<MODEL_ID>/_predict
{
"parameters": {
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Why is the sky blue"
}
]
}
}
```

### Sample response

```json
{
"inference_results": [
{
"output": [
{
"name": "response",
"dataAsMap": {
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": """The sky appears blue due to a phenomenon called Rayleigh scattering. Here's a simple explanation:

1. **Sunlight Composition**: Sunlight appears white, but it's actually a mix of all colors of the visible spectrum (red, orange, yellow, green, blue, indigo, violet).

2. **Atmospheric Scattering**: When sunlight enters Earth's atmosphere, it interacts with the gas molecules and tiny particles in the air. Shorter wavelengths of light (like blue and violet) are scattered more than other colors because they travel in shorter, smaller waves.

3. **Why Blue Dominates**: Although violet light is scattered even more than blue light, the sky appears blue, not violet, because:
- Our eyes are more sensitive to blue light than violet light.
- The sun emits more blue light than violet light.
- Some of the violet light gets absorbed by the upper atmosphere.

4. **Time of Day**: The sky appears blue during the day because we're seeing the scattered blue light from all directions. At sunrise or sunset, the light has to pass through more of the atmosphere, scattering the blue light away and leaving mostly red and orange hues.

This scattering effect is named after Lord Rayleigh, who mathematically described the phenomenon in the 19th century."""
}
}
],
"created": 1757369906,
"model": "qwen3:4b",
"system_fingerprint": "b6259-cebb30fb",
"object": "chat.completion",
"usage": {
"completion_tokens": 264,
"prompt_tokens": 563,
"total_tokens": 827
},
"id": "chatcmpl-iHioFpaxa8K2SXgAHd4FhQnbewLQ9PjB",
"timings": {
"prompt_n": 563,
"prompt_ms": 293.518,
"prompt_per_token_ms": 0.5213463587921847,
"prompt_per_second": 1918.1106439809487,
"predicted_n": 264,
"predicted_ms": 5084.336,
"predicted_per_token_ms": 19.258848484848485,
"predicted_per_second": 51.92418439693993
}
}
}
],
"status_code": 200
}
```
Loading
Loading