Add guide for Multi-Agent RAG with Gen2 and Cortex #2750
base: master
This document provides a comprehensive guide on building a production-grade multi-agent Retrieval-Augmented Generation (RAG) system using Snowflake Gen2 Warehouses and Cortex. It includes architecture, setup instructions, and sample queries.
- Complete 6-agent architecture with full SQL/Python implementations
- Semantic chunking UDF + Cortex embeddings + external vector DB
- Gen2 warehouse optimization with 30-50% performance gains
- Production deployment, monitoring, and troubleshooting guides
- Based on Medium article about multi-agent RAG with Gen2 warehouses
Updated duration formatting throughout the document to include 'Hours'.
Updated duration estimates for various sections of the document to reflect more accurate time requirements.
Added overview section for multi-agent RAG system.
Could a maintainer please assign a reviewer for this PR? This guide covers building a production-grade multi-agent RAG system using Snowflake Gen2 Warehouses and Cortex. Thank you!
sfc-gh-jreini
left a comment
Requested some changes
    nltk.download('punkt', quiet=True)

    def chunk_doc(doc_text: str) -> List[str]:
Can you use the native PARSE_DOCUMENT feature instead of a Python UDF?
    import numpy as np
    import _snowflake

    def search_vectors(query_embedding: list, top_k: int) -> list:
Can you replicate this with Cortex Search instead of Pinecone? Or is the Pinecone integration something you're trying to show in the guide?
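For context on what is being replaced: a function with the `search_vectors(query_embedding, top_k)` signature usually ranks stored embeddings by similarity to the query and returns the best `top_k`. The following is a minimal in-memory sketch of that idea, not the PR's Pinecone-backed implementation. The name `search_vectors_local`, the extra `index` parameter, and the choice of cosine similarity are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_vectors_local(
    query_embedding: Sequence[float],
    index: Sequence[Tuple[str, Sequence[float]]],  # hypothetical (doc_id, embedding) pairs
    top_k: int,
) -> List[Tuple[str, float]]:
    """Return the top_k (doc_id, score) pairs, highest similarity first."""
    scored = [
        (doc_id, cosine_similarity(query_embedding, vector))
        for doc_id, vector in index
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

A managed service such as Cortex Search or Pinecone performs the same ranking step server-side, typically with an approximate index, so the guide's function would mostly reduce to a single query call.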
    ### Create Document Retriever Agent

    ```sql
    CREATE OR REPLACE FUNCTION document_retriever_agent(user_query STRING)
This seems more like a retrieval tool than an agent: no agent reasoning is used here, it just returns documents. Suggest renaming.
    ```

    ## Agent 2: SQL Generator
    Duration: 1.5 Hours
Very long duration here. Is that intended?
    LANGUAGE SQL
    AS
    $$
    SELECT SNOWFLAKE.CORTEX.COMPLETE(
Use AI_COMPLETE instead of SNOWFLAKE.CORTEX.COMPLETE
    ### Create SQL Generation Agent

    ```sql
    CREATE OR REPLACE FUNCTION sql_generator_agent(user_question STRING)
Better to use Cortex Analyst instead of hand-rolling this SQL generator agent. The quality of SQL generation will be much higher and the result more extensible.
    ### Business Logic Layer

    ```sql
    CREATE OR REPLACE FUNCTION semantic_model_agent(entity STRING, operation STRING)
What is this agent doing?
    synthesized_response AS (
    SELECT
    SNOWFLAKE.CORTEX.COMPLETE(
    'mixtral-8x7b',
I would recommend using a stronger model here.
    AS
    $$
    WITH agent_results AS (
    -- Execute all agents in parallel on Gen2 warehouse
Why execute all agents in parallel if some may not be needed, depending on the query?
It would be better to have the coordinator agent do planning and tool selection instead of invoking every agent regardless of the query.
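To make the suggestion concrete, here is a minimal sketch of a coordinator that selects tools before calling them, so unneeded agents never run. It is only an illustration: simple keyword matching stands in for the LLM-based planning a real coordinator would do, and the tool names, trigger keywords, and the `plan_tools` and `run_plan` helpers are all hypothetical rather than taken from the guide.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry: the names and trigger keywords are illustrative only.
TOOLS: Dict[str, List[str]] = {
    "document_retriever": ["policy", "document", "manual", "explain"],
    "sql_generator": ["how many", "total", "average", "count", "revenue"],
}


def plan_tools(user_query: str) -> List[str]:
    """Return the tools whose trigger keywords appear in the query.

    Keyword matching is a stand-in for a planning step done by a model.
    """
    query = user_query.lower()
    selected = [
        name
        for name, keywords in TOOLS.items()
        if any(keyword in query for keyword in keywords)
    ]
    # Fall back to document retrieval when no keyword matches.
    return selected or ["document_retriever"]


def run_plan(
    user_query: str, handlers: Dict[str, Callable[[str], str]]
) -> Dict[str, str]:
    """Invoke only the selected tools and collect their outputs by tool name."""
    return {name: handlers[name](user_query) for name in plan_tools(user_query)}
```

With this shape, a question such as "How many orders last week?" reaches only the SQL tool, while the synthesis step still receives a dictionary of results keyed by tool name.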