diff --git a/docs/2.developers/4.user-guide/50.llm-xpack/.vectorstore_pipeline/article.py b/docs/2.developers/4.user-guide/50.llm-xpack/.vectorstore_pipeline/article.py
index 3dd7f352..21b729fe 100644
--- a/docs/2.developers/4.user-guide/50.llm-xpack/.vectorstore_pipeline/article.py
+++ b/docs/2.developers/4.user-guide/50.llm-xpack/.vectorstore_pipeline/article.py
@@ -175,7 +175,7 @@
# ### Langchain
#
# You can use a Pathway Vector Store in LangChain pipelines with `PathwayVectorClient`
-# and configure a `VectorStoreServer` using LangChain components. For more information see [our article](/developers/templates/langchain-integration) or [LangChain documentation](https://python.langchain.com/v0.1/docs/integrations/vectorstores/pathway/).
+# and configure a `VectorStoreServer` using LangChain components. For more information, see [our article](/blog/langchain-integration) or the [LangChain documentation](https://python.langchain.com/v0.1/docs/integrations/vectorstores/pathway/).
#
# %%
diff --git a/docs/2.developers/4.user-guide/50.llm-xpack/10.overview.md b/docs/2.developers/4.user-guide/50.llm-xpack/10.overview.md
index 06410987..9a204ab1 100644
--- a/docs/2.developers/4.user-guide/50.llm-xpack/10.overview.md
+++ b/docs/2.developers/4.user-guide/50.llm-xpack/10.overview.md
@@ -174,7 +174,7 @@
You can learn more about Vector Store in Pathway in a [dedicated tutorial](/deve

### Integrating with LlamaIndex and LangChain

-Vector Store offer integrations with both LlamaIndex and LangChain. These allow you to incorporate Vector Store Client in your LlamaIndex and LangChain pipelines or use LlamaIndex and LangChain components in the Vector Store. Read more about the integrations in the [article on LlamaIndex](/developers/templates/llamaindex-pathway) and [on LangChain](/developers/templates/langchain-integration).
+Vector Store offers integrations with both LlamaIndex and LangChain. These allow you to incorporate the Vector Store Client in your LlamaIndex and LangChain pipelines or use LlamaIndex and LangChain components in the Vector Store. Read more about the integrations in the [article on LlamaIndex](/blog/llamaindex-pathway) and [on LangChain](/blog/langchain-integration).
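For a quick sense of what this looks like on the LangChain side, here is a minimal sketch of connecting to an already-running `VectorStoreServer` with `PathwayVectorClient`; the host and port are placeholders for wherever your server is listening:

```python
from langchain_community.vectorstores import PathwayVectorClient

# Connect to a running VectorStoreServer (placeholder host/port)
client = PathwayVectorClient(host="127.0.0.1", port=8666)

# Use it like any other LangChain vector store, e.g. for similarity search
docs = client.similarity_search("What is Pathway?")
```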
## Rerankers diff --git a/docs/2.developers/7.templates/.gemini-multimodal-rag/__init__.py b/docs/2.developers/7.templates/.gemini-multimodal-rag/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/docs/2.developers/7.templates/.gemini-multimodal-rag/article.py b/docs/2.developers/7.templates/.gemini-multimodal-rag/article.py deleted file mode 100644 index 427cd372..00000000 --- a/docs/2.developers/7.templates/.gemini-multimodal-rag/article.py +++ /dev/null @@ -1,328 +0,0 @@ -# --- -# title: Multimodal RAG with Gemini -# description: "End-to-end template showing how you can launch a document processing RAG pipeline that utilizes Gemini and Pathway" -# aside: true -# article: -# thumbnail: '/assets/content/showcases/gemini_rag/Blog_Banner.png' -# thumbnailFit: 'contain' -# date: '2024-08-06' -# tags: ['showcase', 'llm'] -# keywords: ['LLM', 'RAG', 'GPT', 'OpenAI','Gemini', 'multimodal RAG', 'MM-RAG', 'unstructured', 'notebook', 'Gemini RAG', 'RAG Gemini'] -# notebook_export_path: notebooks/showcases/multimodal-rag-using-Gemini.ipynb -# run_template: "/developers/templates/template-multimodal-rag" -# author: 'pathway' -# jupyter: -# jupytext: -# text_representation: -# extension: .py -# format_name: light -# format_version: '1.5' -# jupytext_version: 1.16.2 -# kernelspec: -# display_name: Python 3 (ipykernel) -# language: python -# name: python3 -# --- - -# ::true-img -# --- -# src: '/assets/content/showcases/gemini_rag/Blog_Banner.png' -# alt: "blog banner" -# --- -# :: - -# # Multimodal RAG with Pathway and Gemini - -# The recent release of **Google Gemini 1.5**, with its impressive **1 million token context length window**, has sparked discussions about the future of RAG. However, it hasn't rendered it obsolete. This system still offers unique advantages, especially in curating and optimizing the context provided to the model, ensuring relevance and accuracy. What is particularly interesting is how these advancements can be harnessed to enhance our projects and streamline our workflows. -# -# In this article, you'll learn how to set up a **Multimodal Retrieval-Augmented Generation (MM-RAG)** system using **Pathway** and **Google Gemini**. You will walk through each step comprehensively, ensuring a solid understanding of both the theoretical and practical aspects of implementing Multimodal LLM and RAG applications. -# -# You'll explore how to leverage the capabilities of **Gemini 1.5 Flash** and **Pathway** together. If you're interested in building RAG pipelines with OpenAI, we also have an article on **Multimodal RAG using GPT-4o**, which you can check out [here](/developers/templates/multimodal-rag). -# -# If you want to skip the explanations, you can directly find the code [here](#hands-on-multimodal-rag-with-google-gemini). -# - -# ## What this article will cover: -# - What is Retrieval-Augmented Generation (RAG)? -# - Multimodality in LLMs -# - Why is Multimodal RAG (MM-RAG) Needed? -# - What is Multimodal RAG and Use Cases? -# - Gemini Models -# - Release of Gemini 1.5 and its impact on RAG architectures -# - Comparing LlamaIndex and Pathway -# - Hands-on Multimodal RAG with Google Gemini - -# ## Foundational Concepts - -# + [markdown] jp-MarkdownHeadingCollapsed=true -# ### Why is Multimodal Rag needed? -# -# **Retrieval-Augmented Generation (RAG)** enhances large language models by incorporating external knowledge sources before generating responses. This approach ensures relevant and accurate output. 
In today's data-rich world, documents often combine text and images to convey information comprehensively. However, most Retrieval Augmented Generation (RAG) systems overlook the valuable insights locked within images. As Multimodal Large Language Models (LLMs) gain prominence, it's crucial to explore how we can leverage visual content alongside text in RAG, unlocking a deeper understanding of the information landscape. -# -# **Multimodal RAG** is an advanced form of Retrieval-Augmented Generation (RAG) that goes beyond text to incorporate various data types like images, charts, and tables. This expanded capability allows for a deeper understanding of complex information, leading to more accurate and informative outputs. -# -# #### Two options for Multimodal RAG -# 1. **Multimodal Embeddings** - -# The multimodal embeddings model generates vectors based on the input you provide, which can include a combination of image, text, and video data. The image embedding vector and text embedding vector are in the same semantic space with the same dimensionality. Consequently, these vectors can be used interchangeably for use cases like searching image by text, or searching video by image. -# Utilize multimodal embeddings to integrate text and images, retrieve relevant content through similarity search, and then provide both the raw image and text chunks to a multimodal LLM for answer synthesis. -# -# -# 2. **Text Embeddings** - -# Generate text summaries of images using a multimodal LLM, embed and retrieve the text, and then pass the text chunks to the LLM for answer synthesis. -# -# -# #### Comparing text-based and multimodal RAG -# Multimodal RAG offers several advantages over text-based RAG: -# - **Enhanced knowledge access**: Multimodal RAG can access and process both textual and visual information, providing a richer and more comprehensive knowledge base for the LLM. -# - **Improved reasoning capabilities**: By incorporating visual cues, multimodal RAG can make better informed inferences across different types of data modalities. -# -# #### Key Advantages of MM-RAG: -# - Comprehensive Understanding: Processes multiple data formats for a better picture. -# - Improved Performance: Visual data enhances efficiency in complex tasks. -# - Versatile Applications: Useful in finance, healthcare, scientific research, and more. -# -# - - -# ### Gemini Models -# **Gemini** is Google's most capable and general AI model to date. Google has released several Gemini model variants, each tailored for different use cases and performance requirements. -# -# #### Main Gemini Models: -# - Gemini Ultra: The most powerful and advanced model, capable of handling complex tasks and offering state-of-the-art performance. -# - Gemini Pro: A versatile model that balances performance and efficiency, suitable for a wide range of applications. -# - Gemini Advanced: Designed for a broader set of tasks, offering a good balance of capabilities. -# - Gemini Lite: A smaller, more efficient model focused on speed and responsiveness, ideal for resource-constrained environments. -# -# Additional Variants: -# - Gemini 1.5 Flash: Optimized for high-volume, cost-effective applications. -# - Gemini 1.5 Pro: Offers a balance of performance and capabilities. -# - Gemini 1.0 Pro Vision: Includes vision capabilities for processing images and videos. -# - Gemini 1.0 Pro: Text-based model for general language tasks. -# -# #### Benefits of Building with Gemini: -# **Free Credits**: Google Cloud offers new users up to $300 in free credits. 
This can be used to experiment with Gemini models and other Google Cloud services. -# You can also seamlessly integrate MM-RAG applications with Google's Vertex AI platform for streamlined machine learning workflows. -# - -# ### Release of Gemini 1.5 and its impact on RAG architectures -# The Gemini 1.5 Flash model, released on May 24, 2024, revolutionized AI with its enhanced speed, efficiency, cost-effectiveness, long context window, and multimodal reasoning capabilities. -# -# #### Did Google Gemini 1.5 Kill the need of RAG? -# In one word **“No”**. Gemini 1.5, with a 1M context length window, has sparked a new debate about whether RAG (Retrieval Augmented Generation) is still relevant or not. LLMs commonly struggle with hallucination. To address this challenge, two solutions were introduced, one involving an increased context window and the other utilizing RAG. Gemini 1.5 outperforms Claude 2.1 and GPT-4 Turbo as it can assimilate entire code bases, process over 100 papers, and various documents, but it surely hasn’t killed RAG. -# -# RAG leverages your private knowledge database for effective Q&A while ensuring the security of sensitive information like trade secrets, confidential IP, GDPR-protected data, and internal documents. For more detailed insights explore our article on Private RAG with Connected Data Sources using Mistral, Ollama, and Pathway [here](/developers/templates/private-rag-ollama-mistral). -# -# Additionally in traditional RAG pipelines, you can enhance performance by tweaking the retrieval process, changing the embedding model, adjusting chunking strategies, or improving source data. However, with a "stuff-the-context-window-1M-tokens" strategy, your only option is to improve the source data since all data is given to the model within the token limit. Additionally the context window may be filled with many relevant facts, but 40% or more of them are “lost” to the model. If you want to make sure the model is actually using the context you are sending it, you are best off curating it first and only sending the most relevant context. In other words, doing traditional RAG. -# -# Here in this template you will use the Gemini 1.5 Flash but you can also use other multimodal models by gemini accordingly. - - -# ::true-img -# --- -# src: '/assets/content/showcases/gemini_rag/gemini1.5flashtable.png' -# alt: "Gemini 1.5 flash overview" -# --- -# :: - -# ### Multimodality with Gemini-1.5-Flash -# Gemini 1.5 Flash is the newest addition to the Gemini family of large language models, and it’s specifically designed to be fast, efficient, and cost-effective for high-volume tasks. This is achieved by being a lighter model than the Gemini 1.5 Pro. -# -# According to the paper from Google DeepMind, Gemini 1.5 Flash is “a more lightweight variant designed for efficiency with minimal regression in quality” and uses the transformer decoder model architecture “and multimodal capabilities as Gemini 1.5 Pro, designed for efficient utilization of tensor processing units (TPUs) with lower latency for model serving.” -# -# ### Gemini 1.5 Flash: Key Features -# -# - **Speed and Efficiency**: Fastest Gemini model at 60 tokens/second, ideal for real-time tasks, reducing costs by delaying autoscaling. -# - **Cost-Effective**: 1/10 the price of Gemini 1.5 Pro and cheaper than GPT-3.5. -# - **Long Context Window**: Processes up to one million tokens, handling one hour of video, 11 hours of audio, or 700,000 words without losing accuracy. 
-# - **Multimodal Reasoning**: Understands text, images, audio, video, PDFs, and tables. Supports function calling and real-time data access. -# - **Great Performance**: High performance with large context windows, excelling in long-document QA, long-video QA, and long-context ASR. -# - -# ::true-img -# --- -# src: '/assets/content/showcases/gemini_rag/gemini1.5flashdetails.png' -# alt: "Gemini 1.5 flash overview" -# --- -# :: - -# ## Hands on Multimodal RAG with Google Gemini - -# ![Gemini RAG overview](/assets/content/showcases/gemini_rag/RAG_diagram.png) - -# ### Step 1: Installation -# -# First, we need to install the required packages: pathway[all], litellm==1.40.0, surya-ocr==0.4.14, and google-generativeai. - -# + -# _MD_SHOW_!pip install 'pathway[all]>=0.14.0' litellm==1.40.0 -# - - -# ### Step 2: Imports and Environment Setup -# -# Next, we import the necessary libraries and set up the environment variables. - -# + -import litellm -import os -import pathway as pw -import logging -import google.generativeai as genai -from pathway.udfs import DiskCache, ExponentialBackoffRetryStrategy -from pathway.xpacks.llm import embedders, prompts, llms, parsers -from pathway.xpacks.llm.question_answering import BaseRAGQuestionAnswerer -from pathway.xpacks.llm.vector_store import VectorStoreServer - -# Set the logging level for LiteLLM to DEBUG -os.environ['LITELLM_LOG'] = 'DEBUG' #to help in debugging -# - - -# ### Step 3: API Key Setup and License Key Setup -# -# Set up the API key and the Pathway license key: - -# + -# Api key setup -GEMINI_API_KEY = "Paste your Gemini API Key here" - -os.environ['GEMINI_API_KEY'] = GEMINI_API_KEY -os.environ["TESSDATA_PREFIX"] = "/usr/share/tesseract/tessdata/" -genai.configure(api_key=GEMINI_API_KEY) - -# License key setup -pw.set_license_key("demo-license-key-with-telemetry") - -logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s') -# - - -# ### Step 4: Upload your file -# -# Create a `./data` directory if it doesn't already exist. This is where the uploaded files will be stored. Then upload your pdf documents. -# -# You can also omit this cell if you are running locally on your system - in that case create a `data` folder in the current directory and copy the files and comment out this cell. - -# !mkdir -p data - - -#Demo pdf for testing -# !wget -q -P ./data/ https://github.com/pathwaycom/llm-app/raw/main/examples/pipelines/gpt_4o_multimodal_rag/data/20230203_alphabet_10K.pdf - - -# #### Reading PDF Data -# -# Next, we read the PDF data from a folder. - -# Read the PDF data -folder = pw.io.fs.read( - path="./data/", - format="binary", - with_metadata=True, -) -sources = [folder] # you can add any other Pathway connector here! - -# ### Step 5: Document Processing and Question Answering Setup - -# #### Setting Up LiteLLM Chat -# -# Set up a LiteLLM chat instance with retry and cache strategies: - -# Setup LiteLLM chat -chat = llms.LiteLLMChat( - model="gemini/gemini-1.5-flash", # Model specified for LiteLLM - retry_strategy=ExponentialBackoffRetryStrategy(max_retries=6,backoff_factor=2.5), - temperature=0.0 -) - -# #### Setting Up Embedder -# -# Let's utilize Gemini embedders. The `GeminiEmbedder` class in Pathway provides an interface for interacting with Gemini embedders. It generates semantic embeddings with a specified model, providing methods for single items (`embed`), batches (`embed_batch`), and direct calls. 
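As a side note, xpack embedders are Pathway UDFs, so the same embedder can also be applied directly to a text column of a Pathway table. A minimal sketch, in which the demo table and column name are illustrative only:

```python
import pathway as pw
from pathway.xpacks.llm import embedders

gemini_embedder = embedders.GeminiEmbedder(model="models/embedding-001")

# Tiny in-memory table just to illustrate applying the embedder to a column
docs = pw.debug.table_from_rows(
    schema=pw.schema_from_types(text=str),
    rows=[("Pathway processes data streams incrementally.",)],
)
embedded = docs.select(text=pw.this.text, embedding=gemini_embedder(pw.this.text))
```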
- -# Setup embedder -embedder = embedders.GeminiEmbedder(model="models/embedding-001", retry_strategy=ExponentialBackoffRetryStrategy( - max_retries=6, backoff_factor=2.5)) # Specify embedder here - -# #### Setting Up Parser -# -# Next, we set up a parser for the document store. - -# + -# Setup parser -table_args = { - "parsing_algorithm": "llm",# for tables - "llm": chat, - "prompt": prompts.DEFAULT_MD_TABLE_PARSE_PROMPT, - } - -image_args ={ - "parsing_algorithm": "llm", # for images - "llm": chat, - "prompt": prompts.DEFAULT_IMAGE_PARSE_PROMPT, -} - -parser = parsers.OpenParse(table_args=table_args, image_args=image_args, parse_images=True) -# - - -# #### Setting Up Document Store -# -# We will set up the document store with the sources, embedder, and parser. - -# + -#Setup document store -#_MD_SHOW_doc_store = VectorStoreServer( -#_MD_SHOW_ *sources, -#_MD_SHOW_ embedder=embedder, -#_MD_SHOW_ splitter=None, -#_MD_SHOW_ parser=parser, -#_MD_SHOW_) -# - - -# ### Step 6: Setting Up Question Answerer Application -# -# We will set up the question answerer application using the LiteLLM-based chat object. - -# + -#Setup question answerer application -# _MD_SHOW_app = BaseRAGQuestionAnswerer( -# _MD_SHOW_ llm=chat, # Using the LiteLLM-based chat object -# _MD_SHOW_ indexer=doc_store, search_topk=2, -# _MD_SHOW_ short_prompt_template=prompts.prompt_qa) -# - - -# #### Building and Running the Server -# -# Finally, we build and run the server. - -# Build and run the server -app_host = "0.0.0.0" -app_port = 8000 -# _MD_SHOW_app.build_server(host=app_host, port=app_port) - -# + -# _MD_SHOW_import threading -# _MD_SHOW_t = threading.Thread(target=app.run_server, name="BaseRAGQuestionAnswerer") -# _MD_SHOW_t.daemon = True -# _MD_SHOW_thr = t.start() - -# + -from pathway.xpacks.llm.question_answering import RAGClient - -# Initialize the RAG client -client = RAGClient(host='0.0.0.0', port=8000) - -# + -# Example usage - -# _MD_SHOW_response = client.pw_ai_answer("What is the Total Stockholders' equity as of December 31, 2022?") -# _MD_SHOW_print(response) -# _MD_COMMENT_START_ -print("$256,144 million") -# _MD_COMMENT_END_ -# - - -# Now your chatbot is now running live! You can ask any questions and get information from your documents instantly. - -# ## Conclusion -# -# This article demonstrated how to implement a Multimodal RAG service using Pathway and Gemini. The setup leverages the capabilities of LiteLLM to process and query multimodal data effectively. If you're looking for a cost-effective alternative, consider using the Gemini Mini, which provides great performance at a lower cost. -# -# For more detailed insights and an alternative approach, check out our article on multimodal RAG using GPT-4o [here](/developers/templates/multimodal-rag). This will give you another perspective on how to handle multimodal RAG applications using different models and techniques. -# By following the steps outlined above, you can efficiently integrate and utilize various data types to enhance your AI applications, ensuring more accurate and contextually rich outputs. 
-#
diff --git a/docs/2.developers/7.templates/.langchain-integration/.gitignore b/docs/2.developers/7.templates/.langchain-integration/.gitignore
deleted file mode 100644
index 0219b0c2..00000000
--- a/docs/2.developers/7.templates/.langchain-integration/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-data/pathway_readme.md
diff --git a/docs/2.developers/7.templates/.langchain-integration/__init__.py b/docs/2.developers/7.templates/.langchain-integration/__init__.py
deleted file mode 100644
index e69de29b..00000000
diff --git a/docs/2.developers/7.templates/.langchain-integration/article.py b/docs/2.developers/7.templates/.langchain-integration/article.py
deleted file mode 100644
index 942acf9c..00000000
--- a/docs/2.developers/7.templates/.langchain-integration/article.py
+++ /dev/null
@@ -1,169 +0,0 @@
-# ---
-# title: 'LangChain and Pathway: RAG Apps with always-up-to-date knowledge'
-# description: ''
-# article:
-#   date: '2024-05-18'
-#   thumbnail: '/assets/content/showcases/vectorstore/Langchain-Pathway.png'
-#   tags: ['showcase', 'llm']
-# author: 'szymon'
-# notebook_export_path: notebooks/showcases/langchain-integration.ipynb
-# keywords: ['LLM', 'RAG', 'GPT', 'OpenAI', 'LangChain', 'notebook']
-# popular: true
-# ---
-
-# # LangChain and Pathway: RAG Apps with always-up-to-date knowledge
-#
-#
-#
-# You can now use Pathway in your RAG applications to give LLMs always up-to-date knowledge from your documents, thanks to the LangChain integration.
-#
-# Pathway is now available on [LangChain](https://python.langchain.com/docs/integrations/vectorstores/pathway/), a framework for developing applications powered by large language models (LLMs).
-# You can now query Pathway and access up-to-date documents for your RAG applications from LangChain using [PathwayVectorClient](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.pathway.PathwayVectorClient.html).
-#
-# With this new integration, you will be able to use Pathway Vector Store natively in LangChain. In this guide, you will have a quick dive into Pathway + LangChain to learn how to create a simple, yet powerful RAG solution.
-
-# ## Prerequisites
-#
-# To work with LangChain you need to install the `langchain` package, as it is not a dependency of Pathway. In the example in this guide you will also use the `OpenAIEmbeddings` class, for which you need the `langchain_openai` package.
-
-# _MD_SHOW_!pip install langchain
-# _MD_SHOW_!pip install langchain_community
-# _MD_SHOW_!pip install langchain_openai
-# _MD_COMMENT_START_
-if 1:  # group to prevent isort messing up
-    import json
-    import os
-
-    from common.shadows import fs
-
-    os.environ["OPENAI_API_KEY"] = json.loads(
-        fs.open("vault://kv.v2:deployments@/legal_rag_demo").read()
-    )["OPENAI_KEY"]
-# _MD_COMMENT_END_
-
-# ## Using LangChain components in Pathway Vector Store
-# When using Pathway [`VectorStoreServer`](/developers/api-docs/pathway-xpacks-llm/vectorstore#pathway.xpacks.llm.vector_store.VectorStoreServer), you can use a LangChain embedder and splitter for processing documents. To do that, use the [`from_langchain_components`](/developers/api-docs/pathway-xpacks-llm/vectorstore#pathway.xpacks.llm.vector_store.VectorStoreServer.from_langchain_components) class method.
-#
-# To start, you need to create a folder Pathway will listen to. Feel free to skip this if you already have a folder on which you want to build your RAG application. You can also use Google Drive, Sharepoint, or any other source from [pathway-io](/developers/api-docs/pathway-io).
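For instance, a Google Drive folder could be streamed instead of a local directory using the Google Drive connector. A rough sketch, in which the folder ID and the service-account credentials path are hypothetical placeholders:

```python
import pathway as pw

# Sketch only: replace the object_id and credentials path with your own values
drive_data = pw.io.gdrive.read(
    object_id="your-google-drive-folder-id",
    service_user_credentials_file="gdrive_credentials.json",
    with_metadata=True,
)
```

The rest of this guide sticks to a local folder: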
-
-# !mkdir -p 'data/'
-
-# To run this example you also need to set the OpenAI API key, or change the embedder.
-
-# +
-import os
-import getpass
-
-# Set OpenAI API Key
-if "OPENAI_API_KEY" in os.environ:
-    api_key = os.environ["OPENAI_API_KEY"]
-else:
-    api_key = getpass.getpass("OpenAI API Key:")
-# -
-
-# To run the server, use the Pathway filesystem connector to read files from the `data` folder.
-
-# +
-import pathway as pw
-from pathway.xpacks.llm.vector_store import VectorStoreServer
-from langchain_openai import OpenAIEmbeddings
-from langchain.text_splitter import CharacterTextSplitter
-
-data = pw.io.fs.read(
-    "./data",
-    format="binary",
-    mode="streaming",
-    with_metadata=True,
-)
-# -
-
-# Then pass the documents to the server, which will split them using `CharacterTextSplitter` and embed them using `OpenAIEmbeddings`, both from LangChain.
-
-# +
-embeddings = OpenAIEmbeddings(api_key=api_key)
-splitter = CharacterTextSplitter()
-
-host = "127.0.0.1"
-port = 8666
-
-server = VectorStoreServer.from_langchain_components(
    data, embedder=embeddings, splitter=splitter
-)
-# _MD_SHOW_server.run_server(host, port=port, with_cache=True, cache_backend=pw.persistence.Backend.filesystem("./Cache"), threaded=True)
-# -
-
-# The server is now running and ready for querying with a [`VectorStoreClient`](/developers/api-docs/pathway-xpacks-llm/vectorstore#pathway.xpacks.llm.vector_store.VectorStoreClient) or with a `PathwayVectorClient` from `langchain-community`, described in the next section.
-
-# ## Using Pathway as a Vector Store in LangChain pipelines
-#
-# Once you have a `VectorStoreServer` running, you can access it from a LangChain pipeline by using [PathwayVectorClient](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.pathway.PathwayVectorClient.html).
-#
-# To do that, you need to provide either the `url` or the `host` and `port` of the running `VectorStoreServer`. In the code example below, you will connect to the `VectorStoreServer` defined in the previous section, so make sure it's running before making queries. Alternatively, you can also use a publicly available [demo pipeline](https://pathway.com/solutions/rag-pipelines#try-it-out) to test your client. You can access its REST API at `https://demo-document-indexing.pathway.stream`. This demo ingests documents from [Google Drive](https://drive.google.com/drive/u/0/folders/1cULDv2OaViJBmOfG5WB0oWcgayNrGtVs) and [Sharepoint](https://navalgo.sharepoint.com/sites/ConnectorSandbox/Shared%20Documents/Forms/AllItems.aspx?id=%2Fsites%2FConnectorSandbox%2FShared%20Documents%2FIndexerSandbox&p=true&ga=1).
-
-# +
-from langchain_community.vectorstores import PathwayVectorClient
-
-client = PathwayVectorClient(host=host, port=port)
-# -
-
-query = "What is Pathway?"
-# _MD_SHOW_docs = client.similarity_search(query)
-# _MD_SHOW_print(docs)
-
-# As you can see, the search has no useful documents to return yet, as the folder is still empty, but this is where Pathway shines. Add new data to the folder Pathway is listening to, then run the query again to see how the results change.
-# To do that, you can download the repo readme of Pathway into our data folder:
-
-# !wget 'https://raw.githubusercontent.com/pathwaycom/pathway/main/README.md' -O 'data/pathway_readme.md' -q -nc
-
-# Try again to query with the new data:
-
-# +
-# _MD_SHOW_docs = client.similarity_search(query)
-# _MD_SHOW_print(docs)
-# -
-
-# ### RAG pipeline in LangChain

-# The next step is to write a chain in LangChain.
The following example implements a simple RAG chain that, given a question, retrieves documents from the Pathway Vector Store. These are then used as context for the question in a prompt sent to the OpenAI chat.
-
-# +
-from langchain_core.output_parsers import StrOutputParser
-from langchain_core.prompts import ChatPromptTemplate
-from langchain_core.runnables import RunnablePassthrough
-from langchain_openai import ChatOpenAI
-
-retriever = client.as_retriever()
-
-template = """
-You are a smart assistant that helps users with their documents on Google Drive and Sharepoint.
-Given a context, respond to the user question.
-CONTEXT:
-{context}
-QUESTION: {question}
-YOUR ANSWER:"""
-
-prompt = ChatPromptTemplate.from_template(template)
-llm = ChatOpenAI()
-chain = (
-    {"context": retriever, "question": RunnablePassthrough()}
-    | prompt
-    | llm
-    | StrOutputParser()
-)
-# -
-
-# Now you have a RAG chain written in LangChain that uses Pathway as its Vector Store. Test it by asking a question.
-
-# +
-# _MD_SHOW_chain.invoke("What is Pathway?")
-# -
-
-# ### Vector Store statistics
-#
-# Just like [`VectorStoreClient`](/developers/api-docs/pathway-xpacks-llm/vectorstore#pathway.xpacks.llm.vector_store.VectorStoreClient) from the Pathway LLM xpack, `PathwayVectorClient` gives you two methods for getting information about indexed documents.
-#
-# The first one is [`get_vectorstore_statistics`](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.pathway.PathwayVectorClient.html#langchain_community.vectorstores.pathway.PathwayVectorClient.get_vectorstore_statistics), which gives essential statistics on the state of the vector store, like the number of indexed files and the timestamp of the last updated one. The second one is [`get_input_files`](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.pathway.PathwayVectorClient.html#langchain_community.vectorstores.pathway.PathwayVectorClient.get_input_files), which gets the list of indexed files along with the associated metadata.
-
-# +
-# _MD_SHOW_print(client.get_vectorstore_statistics())
-# _MD_SHOW_print(client.get_input_files())
diff --git a/docs/2.developers/7.templates/.multimodal-rag/article.py b/docs/2.developers/7.templates/.multimodal-rag/article.py
index 38f1e6f1..a7b4c9a5 100644
--- a/docs/2.developers/7.templates/.multimodal-rag/article.py
+++ b/docs/2.developers/7.templates/.multimodal-rag/article.py
@@ -103,7 +103,7 @@
# + [markdown] id="DBo5YKJzKpdR"
# ## **What's the main difference between LlamaIndex and Pathway?**
#
-# Pathway offers an indexing solution that always provides the latest information to your LLM application: Pathway Vector Store preprocesses and indexes your data in real time, always giving up-to-date answers. LlamaIndex is a framework for writing LLM-enabled applications. Pathway and LlamaIndex are best [used together](/developers/templates/llamaindex-pathway). Pathway vector store is natively available in LlamaIndex.
+# Pathway offers an indexing solution that always provides the latest information to your LLM application: Pathway Vector Store preprocesses and indexes your data in real time, always giving up-to-date answers. LlamaIndex is a framework for writing LLM-enabled applications. Pathway and LlamaIndex are best [used together](/blog/llamaindex-pathway). Pathway Vector Store is natively available in LlamaIndex.
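As an illustration of that integration, a Pathway vector store can be queried from LlamaIndex through the dedicated retriever package (`llama-index-retrievers-pathway`). A minimal sketch, assuming a Pathway server is already running on the given host and port:

```python
from llama_index.retrievers.pathway import PathwayRetriever

# Placeholders: point these at your running Pathway vector store
retriever = PathwayRetriever(host="127.0.0.1", port=8754)
nodes = retriever.retrieve("What is Pathway?")
```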
# + [markdown] id="CzxFvo4S_RIj" # ## **Architecture Used for Multimodal RAG for Production Use Cases** diff --git a/docs/2.developers/7.templates/.private_rag_ollama_mistral/article.py b/docs/2.developers/7.templates/.private_rag_ollama_mistral/article.py index 514136e6..6b1ce23f 100644 --- a/docs/2.developers/7.templates/.private_rag_ollama_mistral/article.py +++ b/docs/2.developers/7.templates/.private_rag_ollama_mistral/article.py @@ -7,7 +7,7 @@ # thumbnailFit: 'contain' # tags: ['showcase', 'llm'] # date: '2024-04-23' -# related: ['/developers/templates/adaptive-rag', '/developers/templates/llamaindex-pathway'] +# related: ['/developers/templates/adaptive-rag', '/developers/templates/demo-question-answering'] # notebook_export_path: notebooks/showcases/mistral_adaptive_rag_question_answering.ipynb # author: 'berke' # keywords: ['LLM', 'RAG', 'Adaptive RAG', 'prompt engineering', 'explainability', 'mistral', 'ollama', 'private rag', 'local rag', 'ollama rag', 'notebook', 'docker'] diff --git a/docs/2.developers/7.templates/1001.template-adaptive-rag.md b/docs/2.developers/7.templates/1001.template-adaptive-rag.md index dea8756f..414fe04c 100644 --- a/docs/2.developers/7.templates/1001.template-adaptive-rag.md +++ b/docs/2.developers/7.templates/1001.template-adaptive-rag.md @@ -10,7 +10,6 @@ article: author: "pathway" keywords: ['LLM', 'RAG', 'Adaptive RAG', 'prompt engineering', 'prompt', 'explainability', 'docker'] docker_github_link: "https://github.com/pathwaycom/llm-app/tree/main/examples/pipelines/adaptive-rag" -#hide: true --- ::alert{type="info" icon="heroicons:information-circle-16-solid"} diff --git a/docs/2.developers/7.templates/1003.template-multimodal-rag.md b/docs/2.developers/7.templates/1003.template-multimodal-rag.md index a5c45830..b1e661fa 100644 --- a/docs/2.developers/7.templates/1003.template-multimodal-rag.md +++ b/docs/2.developers/7.templates/1003.template-multimodal-rag.md @@ -10,6 +10,7 @@ article: author: "pathway" keywords: ['LLM', 'RAG', 'GPT', 'OpenAI', 'GPT-4o', 'multimodal RAG', 'unstructured', 'docker'] docker_github_link: "https://github.com/pathwaycom/llm-app/tree/main/examples/pipelines/gpt_4o_multimodal_rag" +popular: true --- ::alert{type="info" icon="heroicons:information-circle-16-solid"} diff --git a/docs/2.developers/7.templates/1009.drive-alert.md b/docs/2.developers/7.templates/1009.drive-alert.md new file mode 100644 index 00000000..cbe4b0d6 --- /dev/null +++ b/docs/2.developers/7.templates/1009.drive-alert.md @@ -0,0 +1,13 @@ +--- +title: "Alerting when answers change on Google Drive" +description: "Ask questions about your private data (docs), and tell the app to alert you whenever responses change. The app is always connected to your Google Docs folder and listening for changes. Whenever new relevant information is added to the data sources, the LLM decides if there is a substantial difference in response and notifies the user with a Slack message." 
+article: + tags: ['showcase', 'llm'] + date: '2024-11-07' + related: false +author: "pathway" +keywords: ['LLM', 'RAG', 'GPT', 'OpenAI', 'slack', 'indexing', 'Google Drive', 'Gdrive', 'docker'] +docker_github_link: "https://github.com/pathwaycom/llm-app/tree/main/examples/pipelines/drive_alert" +--- + +:ArticleFromUrl{url="drive_alert"} diff --git a/docs/2.developers/7.templates/165.langchain-integration.md b/docs/2.developers/7.templates/165.langchain-integration.md deleted file mode 120000 index 550f398f..00000000 --- a/docs/2.developers/7.templates/165.langchain-integration.md +++ /dev/null @@ -1 +0,0 @@ -.langchain-integration/article.md \ No newline at end of file diff --git a/docs/2.developers/7.templates/170.enterprise_rag_sharepoint.md b/docs/2.developers/7.templates/170.enterprise_rag_sharepoint.md deleted file mode 100644 index 5d247b89..00000000 --- a/docs/2.developers/7.templates/170.enterprise_rag_sharepoint.md +++ /dev/null @@ -1,275 +0,0 @@ ---- -title: 'Dynamic Enterprise RAG with SharePoint' -description: 'This article presents Dynamic Enterprise RAG application that integrates with Microsoft SharePoint as a data source' -aside: true -article: - thumbnail: '/assets/content/showcases/enterprise_sharepoint_rag/Enterprise_RAG-thumbnail.png' - thumbnailFit: 'contain' - tags: ['showcase', 'llm'] - date: '2024-07-15' -author: 'saksham' -keywords: ['LLM', 'RAG', 'Dynamic RAG', 'Explainability', 'Enterprise RAG', 'Docker', 'SharePoint'] -docker_github_link: "https://github.com/pathwaycom/llm-app/tree/main/examples/pipelines/demo-question-answering" ---- - - -# Dynamic Enterprise RAG with SharePoint and Pathway | Docker AI App Template - -Retrieval Augmented Generation (RAG) applications empower you to deliver context-specific answers based on private knowledge bases using LLMs/Gen AI. - -SharePoint offered via Microsoft 365 is a common data source on which you might want to build your RAG applications. Microsoft SharePoint leverages workflow applications, "list" databases, and other web parts and security features to enable business teams to collaborate effectively and is widely used by Microsoft Office users for sharing files in a SharePoint document library. - -[Pathway](/), on the other hand, is crucial for building successful Enterprise RAG systems and managing dynamic data sources like Microsoft SharePoint while maintaining high accuracy and reliability. - -## What is Dynamic RAG? - -In practical scenarios, files in data repositories are dynamic, i.e., frequently added, deleted, or modified. These ongoing changes require real-time synchronization and efficient incremental indexing to ensure the most current information is always available. - -Dynamic Enterprise RAG Applications help you build RAG applications that are in permanent sync with your dynamic data sources. - -This app template will help you build a Dynamic Enterprise RAG application that integrates with Microsoft SharePoint as a data source. Your application will always provide up-to-date knowledge, synchronized with any file insertions, deletions, or changes at any point in time, making your work easier. It avoids the need for constant ETL (Extract, Transform and Load) adjustments for such bound-to-implement considerations. - -You can easily run this app template in minutes using Docker containers while ensuring the best practices needed in an enterprise setup. 
- -## Features of Dynamic Enterprise RAG with SharePoint - -### Real-Time Synchronization - -Dynamic RAG Apps must stay in sync with your data repositories to provide relevant responses. -- Pathway's SharePoint connector supports both static and streaming modes, enabling real-time synchronization of SharePoint data. -- Ensures that your app continuously indexes documents from SharePoint, maintaining an up-to-date knowledge base. - -Imagine senior executives making strategic decisions based on last month's financial reports or outdated project statuses. This lag in information leads to misinformed decisions, missed opportunities, or significant financial losses. Real-time synchronization ensures your app delivers the most current and accurate information, preventing such scenarios. - -### Detailed Metadata Handling - -Enterprise RAG applications include comprehensive metadata such as file paths, modification times, and creation times in the output table. This additional context is crucial for effectively tracking and managing documents. -- Pathway's streaming mode ensures that this metadata is always up-to-date. - -### High Security with Certificate-Based Authentication - -Enterprise workflows must ensure high security and compliance with enterprise standards. -- Pathway's certificate-based authentication future-proofs your system against the potential deprecation of simpler authentication methods by SharePoint. -- For enhanced security, locally deployed LLMs can be set up within an isolated environment, like a Faraday cage, that protects against external interference. This setup ensures that sensitive data remains secure and private, adhering to the highest security standards. - -While this template uses the OpenAI API as an example, you can easily swap it with private RAG setups using the additional resources provided at the end. - -### Scalable and Production-Ready Deployment - -Enterprise applications handle vast and ever-growing data sources, often increasing as many users within a company work on them. -- Pathway provides fast, built-in, and persistent vector indexing for up to millions of pages of documents, eliminating the need for complex ETL processes. -- Pathway is built for scale, and it offers an integrated solution where the server and endpoints are part of the same application. -- The easy Docker setup ensures consistency across different environments. - -### High Accuracy and Enhanced Query Capabilities - -Pathway's SharePoint connector allows you to easily query and manage your datasets stored in SharePoint, providing flexible and powerful options for accessing your data. -- You can configure the connector to read data from specific directories or entire subsites, with options for both recursive and non-recursive scans. -- Starting with a basic RAG pipeline provides initial accuracy, but leveraging more advanced methods such as hybrid indexing and multimodal search can increase accuracy up to 98% and beyond. - -By using this app template, you will leverage Pathway and Microsoft SharePoint to build a dynamic, secure, and efficient Enterprise RAG system tailored to your specific needs. - -## Prerequisites for the Enterprise RAG App Template - -1. Docker Desktop: You can download it from the [Docker website](https://www.docker.com/products/docker-desktop/). -2. OpenAI API Key: Sign up on the [OpenAI website](https://www.openai.com/) and generate an API key from the [API Key Management page](https://platform.openai.com/account/api-keys). 
Keep this key secure as you will need to use it in your configuration. -3. Pathway License Key: Get your free license key [here](/get-license). -4. Certificate-Based Authentication Setup for SharePoint Integration - -For better security, we use certificate-based authentication to access data from SharePoint. For this we use Azure AD, which is now renamed to Microsoft Entra ID. - -You can follow the steps in the video below to create and upload your SSL certificate to obtain necessary parameters for [Pathway's SharePoint connector](/developers/api-docs/pathway-xpacks-sharepoint). - -[![How to Create and Register an Application on Microsoft Entra ID (previously Azure AD](https://img.youtube.com/vi/9ks6zhAPAz4/0.jpg)](https://www.youtube.com/watch?v=9ks6zhAPAz4) - -Once done, you will use these parameters to update the `config.yaml` file to successfully build and deploy your Dynamic Enterprise RAG application using Microsoft SharePoint and Pathway. - -## Components of your RAG Pipeline - -- `app.py`, the application code using Pathway and written in Python. -- `config.yaml`, the file containing configuration stubs for the data sources, the LLM model, and the web server. -- `requirements.txt`, the dependencies for the pipeline. -- `Dockerfile`, the Docker configuration for running the pipeline in the container. -- `.env`, an environment variable configuration file where you will store the OpenAI API key. - -## Step-by-Step Process to Implement Production-Ready Enterprise RAG Pipeline - -### Step 1: Clone the Pathway LLM App Repository - -Clone the llm-app repository from GitHub. This repository contains all the files you’ll need. - -```Bash -git clone https://github.com/pathwaycom/llm-app.git -``` - -If you have previously cloned an older version, update it using a pull command. - -```Bash -git pull -``` - -## Step 2: Navigate to the Project Directory - -Make sure you are in the right directory. - -```Bash -cd llm-app/examples/pipelines/demo-question-answering -``` - -## Step 3: Create a `.env` File and put your Open AI API key - -Configure your key in a `.env` file. - -``` -OPENAI_API_KEY=sk-******* -``` - -Save the file as `.env` in the `demo-question-answering` folder. - -## Step 4: Update your Pathway license key in `app.py` - -Update your free license key in the application code. Rest of the code is already configured for you. - -```python -# Set up license key for using Sharepoint feature -pw.set_license_key("demo-license-key-with-telemetry") -``` - -## Step 5: Update the `config.yaml` File - -To include the SharePoint configuration in the `config.yaml` file, follow the steps below. You can change the model specification from gpt-3.5-turbo to other OpenAI models like GPT-4 or GPT-4o as needed. Additionally, you can use 300+ LLMs via [Pathway LiteLLM Class](/developers/user-guide/llm-xpack/overview/#what-about-other-models) or build it with open-source models hosted locally. 
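If you later move from the YAML configuration to building the pipeline in code, a locally hosted model can be plugged in through Pathway's LiteLLM wrapper. This is only a sketch; the model string and the local endpoint are examples rather than part of this template:

```python
from pathway.xpacks.llm import llms

# Sketch: any LiteLLM-supported model string works here; the Ollama endpoint is an assumed local setup
chat = llms.LiteLLMChat(
    model="ollama/mistral",
    api_base="http://localhost:11434",
    temperature=0.0,
)
```

For this template itself, the model is configured directly in `config.yaml`: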
- -```python -llm_config: - model: "gpt-3.5-turbo" -host_config: - host: "0.0.0.0" - port: 8000 -cache_options: - with_cache: True - cache_folder: "./Cache" -sources: - - sharepoint_folder: - kind: sharepoint - config: - # The sharepoint is part of Pathway's commercial offering, please contact us for a demo - # Please contact here: `contact@pathway.com` - root_path: ROOT_PATH - url: SHAREPOINT_URL - tenant: SHAREPOINT_TENANT - client_id: SHAREPOINT_CLIENT_ID - cert_path: SHAREPOINT.pem - thumbprint: SHAREPOINT_THUMBPRINT - refresh_interval: 5 -``` - -Mandatory Parameters: - -- `url`: The SharePoint site URL, including the site's path. For example: https://company.sharepoint.com/sites/MySite. -- `tenant`: The ID of the SharePoint tenant, typically a GUID. -- `client_id`: The Client ID of the SharePoint application with the required grants to access the data. -- `cert_path`: The path to the certificate (typically a .pem file) added to the application for authentication. -- `thumbprint`: The thumbprint for the specified certificate. -- `root_path`: The path for a directory or file within the SharePoint space to be read. -- `refresh_interval`: Time in seconds between scans if the mode is set to "streaming". - -For more details on additional configurations, visit Pathway's [SharePoint Connector page](/developers/api-docs/pathway-xpacks-sharepoint/#pathway.xpacks.connectors.sharepoint.read). - -Example Configuration: - -To illustrate the utility of this connector, consider a scenario where you need to access a dataset stored in the `Shared Documents/Data` directory of the SharePoint site `Datasets`. Below is a basic example demonstrating how to configure the connector for reading this dataset in streaming mode: - -```python -t = pw.xpacks.connectors.sharepoint.read( - url="https://company.sharepoint.com/sites/Datasets", - tenant="c2efaf1f-8add-4334-b1ca-32776acb61ea", - client_id="f521a53a-0b36-4f47-8ef7-60dc07587eb2", - cert_path="certificate.pem", - thumbprint="33C1B9D17115E848B1E956E54EECAF6E77AB1B35", - root_path="Shared Documents/Data", -) -``` - -In this setup, the connector targets the `Shared Documents/Data` directory and recursively scans all subdirectories. This method ensures that no file is overlooked, providing comprehensive access to all pertinent data within the specified path. - -## Step 6: Build the Docker Image for your Enterprise RAG App - -This step might take a few minutes. Ensure you have enough space on your device (approximately 8 GB). - -```Bash -docker build -t ragshare . -``` - -## Step 7: Run the Docker Container - -Run the Docker container and expose via a port, i.e. 8000 in this template. You can pick another port as well that isn't used. - -```Bash -docker run -p 8000:8000 ragshare -``` - -Open up another terminal window to follow the next steps. - -## Step 8: Check the List of Files - -Check if your files in SharePoint are indexed for information retrieval for LLMs. To test it, query to get the list of available inputs and associated metadata using curl: - -```Bash -curl -X 'POST' 'http://localhost:8000/v1/pw_list_documents' -H 'accept: */*' -H 'Content-Type: application/json' -``` - -This will return the list of files e.g. 
if you start with this file uploaded on your sharepoint the answer will be as follows: - -`[{"created_at": null, "modified_at": 1718810417, "owner": "root", "path":"data/IdeanomicsInc_20160330_10-K_EX-10.26_9512211_EX-10.26_Content License Agreement.pdf", "seen_at": 1718902304}]` - -## Step 9: Last Step – Run the RAG Service - -You can now run the RAG service. Start by asking a simple question. For example: - -```Bash -curl -X 'POST' \ - 'http://0.0.0.0:8000/v1/pw_ai_answer' \ - -H 'accept: */*' \ - -H 'Content-Type: application/json' \ - -d '{ - "prompt": "What is the start date of the contract?" -}' -``` - - -This will return the following answer: - -`December 21, 2015` - -## Conclusions - -In this app template, you: -- Learned about Dynamic RAG and key considerations for Enterprise RAG applications. -- Successfully created and deployed a Dynamic Enterprise RAG application using Pathway with Microsoft SharePoint as a data source. - -By leveraging the combined power of Pathway and Microsoft SharePoint, you built a secure, efficient, and scalable Enterprise RAG system tailored to your specific needs. Pathway enables you to rapidly deliver production-ready, enterprise-grade LLM projects at a fraction of the cost. This traditional RAG setup can be refined with rerankers, adaptive RAG, multimodal RAG, and other techniques. - -## Additional Resources on Enterprise RAG - -- **Slides AI Search**: Set up high accuracy multimodal RAG pipelines for presentations and PDFs on the [Slides AI Search GitHub repo](https://github.com/pathwaycom/llm-app/tree/main/examples/pipelines/slides_ai_search). This template helps you build a multi-modal search service using GPT-4o with Metadata Extraction and Vector Index. You can also try out the [hosted demo here](https://sales-rag-chat.demo.pathway.com/#search-your-slide-decks). -- **Private RAG with Connected Data Sources using Mistral, Ollama, and Pathway**: Set up a private RAG pipeline with adaptive retrieval using Pathway, Mistral, and Ollama. This app template allows you to run the entire application locally while ensuring low costs without compromising on accuracy, making it ideal for production use-cases with sensitive data and explainable AI needs. Get started with the [app template here](/developers/templates/private-rag-ollama-mistral). -- **Multimodal RAG for PDFs with Text, Images, and Charts**: This showcase demonstrates how you can launch a MultiModal RAG pipeline that utilizes GPT-4o in the parsing stage. Pathway extracts information from unstructured financial documents in your folders, updating results as documents change or new ones arrive. Learn more [here](https://github.com/pathwaycom/llm-app/tree/main/examples/pipelines/gpt_4o_multimodal_rag). - -### Need More Help? - -Pathway is trusted by thousands of developers and enterprises, including Intel, Formula 1 Teams, CMA CGM, and more. Reach out for assistance with your enterprise applications. [Contact us here](/solutions/enterprise-generative-ai?modal=requestdemo) to discuss your project needs or to request a demo. - -### Troubleshooting - -To provide feedback or report a bug, please raise an issue on our [issue tracker](https://github.com/pathwaycom/pathway/issues). You can also join the Pathway Discord server ([#get-help](https://discord.com/invite/pathway)) and let us know how the Pathway community can help you. 
- -::shoutout-banner ---- -href: "https://discord.gg/pathway" -icon: "ic:baseline-discord" ---- -#title -Discuss tricks & tips for RAG -#description -Join our Discord community and dive into discussions on tricks and tips for mastering Retrieval Augmented Generation -:: diff --git a/docs/2.developers/7.templates/6.llamaindex-pathway.md b/docs/2.developers/7.templates/6.llamaindex-pathway.md deleted file mode 100644 index 248f42de..00000000 --- a/docs/2.developers/7.templates/6.llamaindex-pathway.md +++ /dev/null @@ -1,192 +0,0 @@ ---- -title: 'LlamaIndex and Pathway: RAG Apps with always-up-to-date knowledge' -description: 'Pathway is now available in LlamaIndex as Reader and Retriever' -author: 'pathway' -article: - date: '2024-01-12' - thumbnail: '/assets/content/showcases/vectorstore/llamaindexpathway.png' - tags: ['showcase', 'llm'] -keywords: ['LLM', 'RAG', 'GPT', 'OpenAI', 'LlamaIndex', 'docker'] -docker_github_link: "https://github.com/pathway-labs/realtime-indexer-qa-chat/tree/main" ---- - - - -# LlamaIndex and Pathway: RAG Apps with always-up-to-date knowledge - -You can now use Pathway in your RAG applications which enables always up-to-date knowledge from your documents to LLMs with LlamaIndex integration. - -Pathway is now available on [LlamaIndex](https://docs.llamaindex.ai/en/stable/), a data framework for LLM-based applications to ingest, structure, and access private or domain-specific data. -You can now query Pathway and access up-to-date documents for your RAG applications from LlamaIndex using Pathway [Reader](https://docs.llamaindex.ai/en/stable/examples/data_connectors/PathwayReaderDemo.html#pathway-reader) and [Retriever](https://docs.llamaindex.ai/en/stable/examples/retrievers/pathway_retriever.html#pathway-retriever). - -With this new integration, you will be able to use Pathway vector store natively in LlamaIndex, which opens up endless new possibilities! -In this article, you will have a quick dive into Pathway + LlamaIndex to explore how to create a simple, yet powerful RAG solution using PathwayRetriever. - - -## Why Pathway? - -Pathway offers an indexing solution that is always up to date without the need for traditional ETL pipelines, which are needed in regular VectorDBs. It can monitor several data sources (files, S3 folders, cloud storage) and provide the latest information to your LLM application. - -## Learning outcomes -You will learn how to create a simple RAG solution using Pathway and LlamaIndex. - -This article consists of: -- Create data sources. Define data sources Pathway will read and keep the vector store updated. -- Creating a transformation pipeline (parsing, splitting, embedding) for loading documents into Vector store -- Querying your data and getting answers from LlamaIndex. - -## Prerequisites - -### Installing Pathway and LlamaIndex. -```bash -pip install pathway -pip install llama-index -pip install llama-index-retrievers-pathway -pip install llama-index-embeddings-openai -``` - -### Setting up a folder -To start, you need to create a folder Pathway will listen to. Feel free to skip this if you already have a folder on which you want to build your RAG application. You can also use Google Drive, Sharepoint, or any other source from [pathway-io](/developers/api-docs/pathway-io). 
-```bash -mkdir -p 'data/' -``` - -### Set up OpenAI API Key - -```python -import getpass -import os - -# omit if embedder of choice is not OpenAI -if "OPENAI_API_KEY" not in os.environ: - os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:") -``` - -### Define data sources - -Pathway can listen to many sources simultaneously, such as local files, S3 folders, cloud storage, and any data stream. - -See [pathway-io](/developers/api-docs/pathway-io) for more information. - -You can easily connect to the data inside the folder with the Pathway file system connector. The data will automatically be updated by Pathway whenever the content of the folder changes. - -```python -import pathway as pw - -data_sources = [] -data_sources.append( - pw.io.fs.read( - "./data", - format="binary", - mode="streaming", - with_metadata=True, - ) # This creates a `pathway` connector that tracks - # all the files in the ./data directory -) -``` - -### Create the document indexing pipeline - -Now that the data is ready, you must create the document indexing pipeline. The transformations should be a list of `TransformComponent`s ending with an Embedding transformation. - -First, split the text using `TokenTextSplitter`, then embed it with `OpenAIEmbedding`. - -Finally, you can run the server with `run_server`. - -```python -from pathway.xpacks.llm.vector_store import VectorStoreServer -from llama_index.embeddings.openai import OpenAIEmbedding -from llama_index.core.node_parser import TokenTextSplitter - -embed_model = OpenAIEmbedding(embed_batch_size=10) - -transformations_example = [ - TokenTextSplitter( - chunk_size=150, - chunk_overlap=10, - separator=" ", - ), - embed_model, -] - -processing_pipeline = VectorStoreServer.from_llamaindex_components( - *data_sources, - transformations=transformations_example, -) - -# Define the Host and port that Pathway will be on -PATHWAY_HOST = "127.0.0.1" -PATHWAY_PORT = 8754 - -# `threaded` runs pathway in detached mode, you have to set it to False when running from terminal or container -# for more information on `with_cache` check out /developers/api-docs/persistence-api -processing_pipeline.run_server( - host=PATHWAY_HOST, port=PATHWAY_PORT, with_cache=False, threaded=True -) -``` - -Awesome! The vector store is now active, you're set to start sending queries. - -### Create LlamIndex Retriever and create Query Engine - -```python -from llama_index.retrievers.pathway import PathwayRetriever - -retriever = PathwayRetriever(host=PATHWAY_HOST, port=PATHWAY_PORT) -retriever.retrieve(str_or_query_bundle="what is pathway") - - -from llama_index.core.query_engine import RetrieverQueryEngine - -query_engine = RetrieverQueryEngine.from_args( - retriever, -) - -response = query_engine.query("What is Pathway?") -print(str(response)) -``` - -``` -Out[]: Empty Response -``` - -As you can see, the LLM cannot respond clearly as it lacks current knowledge, but this is where Pathway shines. Add new data to the folder Pathway is listening to, then ask our agent again to see how it responds. - -To do that, you can download the repo readme of Pathway into our `data` folder: - -```bash -wget 'https://raw.githubusercontent.com/pathwaycom/pathway/main/README.md' -O 'data/pathway_readme.md' -``` - -Try again to query with the new data: - -```python -response = query_engine.query("What is Pathway?") -print(str(response)) -``` - -``` -Out[]: Pathway is a Python framework that allows for high-throughput and low-latency real-time data processing... 
-``` - -As you can see, after downloading the document to the folder Pathway is listening to, changes are reflected to the query engine immediately. -LLM responses are up to date with the latest changes in the documents which would require extra ETL steps in regular Vector DBs. - -## Conclusion - -With the integration of Pathway within LlamaIndex, you can now access up-to-date documents for your RAG applications from LlamaIndex. -You should now be able to use Pathway Reader and Retriever to connect to your data sources and monitor for changes, providing always up-to-date documents for your LlamaIndex application. - -If you are interested in building RAG solutions with Pathway, don't hesitate to read [how the vector store pipeline is built with Pathway](/developers/user-guide/llm-xpack/vectorstore_pipeline/). -To learn more about the possibilities of combining the live indexing pipeline of Pathway and LLMs, check out [real-time RAG alerting with Pathway](https://github.com/pathwaycom/llm-app/tree/main/examples/pipelines/drive_alert) and [ingesting unstructured data to structured](/developers/templates/unstructured-to-structured/). - -::shoutout-banner ---- -href: "https://discord.gg/pathway" -icon: "ic:baseline-discord" ---- -#title -Discuss tricks & tips for RAG -#description -Join our Discord community and dive into discussions on tricks and tips for mastering Retrieval Augmented Generation -:: diff --git a/docs/2.developers/7.templates/gemini-rag.md b/docs/2.developers/7.templates/gemini-rag.md deleted file mode 120000 index 45736380..00000000 --- a/docs/2.developers/7.templates/gemini-rag.md +++ /dev/null @@ -1 +0,0 @@ -.gemini-multimodal-rag/article.md \ No newline at end of file