Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ai-rag plugin #11568

Merged
merged 22 commits into from
Oct 16, 2024
Merged

Conversation

shreemaan-abhishek
Copy link
Contributor

@shreemaan-abhishek shreemaan-abhishek commented Sep 11, 2024

Description

Implement the ai-rag plugin that uses RAG (Retrieval Augmented Generation), to return information about untrained events/facts from LLMs.

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)


request_schema.properties.ai_rag.properties.vector_search = vs_req_schema
request_schema.properties.ai_rag.properties.embeddings = emb_req_schema

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

construct a request_body_schema according to the service providers

core.log.error("dibag err: ", err)
core.log.warn("dibag res: ", core.json.encode(res, true))

body_tab["ai_rag"] = nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove ai_rag from request body because their purpose is served and also these values will cause failure when proxying requests to LLM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment the purpose in source code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if not body_tab.messages then
body_tab.messages = {}
end
decorate(decorator_conf, body_tab)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to take the vector search result and append it to the messages field in request body. This messages will be sent to LLM to get the response.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the code, I think you mean "prepend" instead of "append"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yes. My bad, it should be prepend.

@shreemaan-abhishek shreemaan-abhishek marked this pull request as ready for review September 13, 2024 08:29
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Sep 13, 2024
@dosubot dosubot bot added enhancement New feature or request plugin labels Sep 13, 2024
Copy link
Contributor

@zhoujiexiong zhoujiexiong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is documentation related to this PR being prepared?


if ngx.req.get_method() ~= "POST" then
ngx.status = 400
ngx.say("Unsupported request method: ", ngx.req.get_method())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then return?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return
end

if header_auth == "key" then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeated judgment?

["Content-Type"] = "application/json",
["api-key"] = conf.api_key,
},
body = core.json.encode(body)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check err?

body = core.json.encode(body)
})
if not res or not res.body then
return nil, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, err
return nil, STATUS, err

return nil, internal_server_error, err
end

if type(res_tab.data) ~= "table" or #res_tab.data < 1 then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if type(res_tab.data) ~= "table" or #res_tab.data < 1 then
if type(res_tab.data) ~= "table" or core.table.isempty(res_tab.data) then

end

local embeddings_provider = next(conf.embeddings_provider)
local embeddings_provider_conf = conf.embeddings_provider[next(conf.embeddings_provider)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
local embeddings_provider_conf = conf.embeddings_provider[next(conf.embeddings_provider)]
local embeddings_provider_conf = conf.embeddings_provider[embeddings_provider]

local vs_req_schema = vector_search_driver.request_schema
local emb_req_schema = embeddings_driver.request_schema

request_schema.properties.ai_rag.properties.vector_search = vs_req_schema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there risks with module variables in concurrent scenarios?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, can you elaborate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the process again and there is no risk of race condition. Ignore it.

core.log.error("dibag err: ", err)
core.log.warn("dibag res: ", core.json.encode(res, true))

body_tab["ai_rag"] = nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment the purpose in source code?

if not body_tab.messages then
body_tab.messages = {}
end
decorate(decorator_conf, body_tab)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the code, I think you mean "prepend" instead of "append"?

-- limitations under the License.
--
local core = require("apisix.core")
local internal_server_error = ngx.HTTP_INTERNAL_SERVER_ERROR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
local internal_server_error = ngx.HTTP_INTERNAL_SERVER_ERROR
local INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR

const var, sugg. this style, others elsewhere the same :)

@bzp2010 bzp2010 self-requested a review September 22, 2024 18:49
Copy link
Contributor

@bzp2010 bzp2010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no documentation and I don't know how to use it.

Comment on lines 24 to 47
local azure_ai_search_schema = {
type = "object",
properties = {
endpoint = {
type = "string",
},
api_key = {
type = "string",
},
}
}

local azure_openai_embeddings = {
type = "object",
properties = {
endpoint = {
type = "string",
},
api_key = {
type = "string",
},
},
required = { "endpoint", "api_key" }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to the provider Lua file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -114,4 +114,5 @@ function _M.rewrite(conf, ctx)
end


_M.__decorate = decorate -- for ai-rag plugin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is unsafe, each plugin should be independent.

if different plugins are using same function, we should push this function to public library

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a better way to augument the request body.


local http = require("resty.http")
local core = require("apisix.core")
local decorate = require("apisix.plugins.ai-prompt-decorator").__decorate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

local azure_openai_embeddings = require("apisix.plugins.ai-rag.embeddings.azure_openai").schema
local azure_ai_search_schema = require("apisix.plugins.ai-rag.vector-search.azure_ai_search").schema

local INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we keep the prefix HTTP_

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


local _M = {
version = 0.1,
priority = 1060, -- TODO check with other ai plugins
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this comment TODO check with other ai plugins?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we don't need this anymore

-- limitations under the License.
--
local core = require("apisix.core")
local INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, keep prefix HTTP_

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

-- limitations under the License.
--
local core = require("apisix.core")
local INTERNAL_SERVER_ERROR = ngx.HTTP_INTERNAL_SERVER_ERROR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

@membphis membphis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know much about this feature, only check the code style and how to use this plugin

| **Field** | **Required** | **Type** | **Description** |
| ----------------------------------------------- | ------------ | -------- | -------------------------------------------- |
| embeddings_provider | Yes | Object | Configuration for the embeddings provider |
| embeddings_provider.azure_openai | Yes | Object | Configuration for Azure OpenAI embeddings |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one question:

do we will support other provider?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we might have to support other providers if the need be. So I have written code in a modular way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue, I think we'd better to add tip in doc for APISIX user, they should know this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

| vector_search_provider | Yes | Object | Configuration for the vector search provider |
| vector_search_provider.azure_ai_search | Yes | Object | Configuration for Azure AI Search |
| vector_search_provider.azure_ai_search.endpoint | No | String | Azure AI Search endpoint |
| vector_search_provider.azure_ai_search.api_key | No | String | Azure AI Search API key |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

How come the Azure AI Search config details (endpoint and api key) are optional when their parent azure_ai_search is required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, my bad.

docs/en/latest/plugins/ai-rag.md Outdated Show resolved Hide resolved
docs/en/latest/plugins/ai-rag.md Outdated Show resolved Hide resolved
docs/en/latest/plugins/ai-rag.md Outdated Show resolved Hide resolved

| **Field** | **Required** | **Type** | **Description** |
| --------- | ------------ | -------- | ---------------------------- |
| fields | Yes | String | Fields for the vector search |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casing consistency:

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| fields | Yes | String | Fields for the vector search |
| fields | Yes | string | Fields for the vector search |


The `ai-rag` plugin integrates Retrieval-Augmented Generation (RAG) capabilities with AI models.
It allows efficient retrieval of relevant documents or information from external data sources and
augments the AI's responses with that data, improving the accuracy and context of generated outputs.
Copy link
Member

@kayx23 kayx23 Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
augments the AI's responses with that data, improving the accuracy and context of generated outputs.
augments the LLM responses with that data, improving the accuracy and contextual relevance of the generated outputs.

By using the term LLM here you could probably save this entire sentence
image

It allows efficient retrieval of relevant documents or information from external data sources and
augments the AI's responses with that data, improving the accuracy and context of generated outputs.

**_This plugin must be used in routes that proxy requests to LLMs only._**
Copy link
Member

@kayx23 kayx23 Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


**_This plugin must be used in routes that proxy requests to LLMs only._**

**_As of now only Azure OpenAI and Azure AI Search services are supported for generating embeddings and performing vector search respectively, PRs for introducing support for other service providers are welcome._**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grammar fix and links

Suggested change
**_As of now only Azure OpenAI and Azure AI Search services are supported for generating embeddings and performing vector search respectively, PRs for introducing support for other service providers are welcome._**
**_As of now only [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) and [Azure AI Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search) services are supported for generating embeddings and performing vector search respectively. PRs for introducing support for other service providers are welcomed._**


| **Field** | **Required** | **Type** | **Description** |
| --------- | ------------ | -------- | ---------------------------- |
| fields | Yes | String | Fields for the vector search |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| fields | Yes | String | Fields for the vector search |
| fields | Yes | string | Fields for the vector search |

}
},
"override": {
"endpoint": "'"$azure_openai_endpoint"'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the variables in the PUT request should be all caps as well to match the env vars above

return nil, HTTP_INTERNAL_SERVER_ERROR, err
end

if res.status ~= 200 then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ngx const var instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return nil, HTTP_INTERNAL_SERVER_ERROR, err
end

if res.status ~= 200 then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ngx const var instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

local embeddings_provider_conf = conf.embeddings_provider[next(conf.embeddings_provider)]
local embeddings_driver = require("apisix.plugins.ai-rag.embeddings." .. embeddings_provider)

local vector_search_provider = next(conf.vector_search_provider)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also ref #11541 (comment) and adjust related?

local vs_req_schema = vector_search_driver.request_schema
local emb_req_schema = embeddings_driver.request_schema

request_schema.properties.ai_rag.properties.vector_search = vs_req_schema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the process again and there is no risk of race condition. Ignore it.

@bzp2010 bzp2010 self-requested a review October 9, 2024 05:18
Copy link
Contributor

@bzp2010 bzp2010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no more comments, so if other reviewers agree with it, I'll agree with it.

docs/en/latest/plugins/ai-rag.md Outdated Show resolved Hide resolved
Copy link
Contributor

@zhoujiexiong zhoujiexiong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@shreemaan-abhishek shreemaan-abhishek merged commit 11c9d29 into apache:master Oct 16, 2024
48 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request plugin size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants