diff --git a/docs/source/examples.md b/docs/source/examples.md index adbc8d9b74..9659496be0 100644 --- a/docs/source/examples.md +++ b/docs/source/examples.md @@ -30,3 +30,18 @@ Ensure the environment is set up by following [Getting Started with Morpheus](./ * [Completion](../../examples/llm/completion/README.md) * [VDB Upload](../../examples/llm/vdb_upload/README.md) * [Retreival Augmented Generation (RAG)](../../examples/llm/rag/README.md)
+
+
+## Environments
+Morpheus supports multiple environments, each intended to support a given use case. Each example documents which environments it can run in. With the exception of the Morpheus Release Container, the examples require fetching the models and example datasets via the `fetch_data.py` script:
+```bash
+./scripts/fetch_data.py fetch examples models
+```
+
+The following are the supported environments:
+| Environment | Description |
+|-------------|-------------|
+| [Conda](./developer_guide/contributing.md#build-in-a-conda-environment) | Morpheus is built from source by the end user, and dependencies are installed via the Conda package manager. |
+| [Morpheus Docker Container](./developer_guide/contributing.md#build-in-docker-container) | A Docker container built from source by the end user; Morpheus is then built from source within the container. |
+| [Morpheus Release Container](./getting_started.md#building-the-morpheus-container) | A pre-built Docker container built by the Morpheus team, available for download from the [NGC container registry](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/morpheus/containers/morpheus/tags); it can also be built locally from source. |
+| [Dev Container](https://github.com/nv-morpheus/Morpheus/blob/branch-24.06/.devcontainer/README.md) | A [Dev Container](https://containers.dev/) built from source by the end user; Morpheus is then built from source within the container. |
diff --git a/examples/README.md b/examples/README.md index 4bdc94648f..983adc08db 100644 --- a/examples/README.md +++ b/examples/README.md @@ -29,4 +29,18 @@ limitations under the License. * [Agents](./llm/agents/README.md) * [Completion](./llm/completion/README.md) * [VDB Upload](./llm/vdb_upload/README.md) - * [Retreival Augmented Generation (RAG)](./llm/rag/README.md) + * [Retrieval Augmented Generation (RAG)](./llm/rag/README.md)
+
+## Environments
+Morpheus supports multiple environments, each intended to support a given use case. Each example documents which environments it can run in. With the exception of the Morpheus Release Container, the examples require fetching the models and example datasets via the `fetch_data.py` script:
+```bash
+./scripts/fetch_data.py fetch examples models
+```
+
+The following are the supported environments:
+| Environment | Description |
+|-------------|-------------|
+| [Conda](../docs/source/developer_guide/contributing.md#build-in-a-conda-environment) | Morpheus is built from source by the end user, and dependencies are installed via the Conda package manager. |
+| [Morpheus Docker Container](../docs/source/developer_guide/contributing.md#build-in-docker-container) | A Docker container built from source by the end user; Morpheus is then built from source within the container. |
+| [Morpheus Release Container](../docs/source/getting_started.md#building-the-morpheus-container) | A pre-built Docker container built by the Morpheus team, available for download from the [NGC container registry](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/morpheus/containers/morpheus/tags); it can also be built locally from source. |
+| [Dev Container](../.devcontainer/README.md) | A [Dev Container](https://containers.dev/) built from source by the end user; Morpheus is then built from source within the container. |
diff --git a/examples/abp_nvsmi_detection/README.md b/examples/abp_nvsmi_detection/README.md index e92e5f0ed6..1927ffc9cb 100644 --- a/examples/abp_nvsmi_detection/README.md +++ b/examples/abp_nvsmi_detection/README.md @@ -19,6 +19,15 @@ limitations under the License. This example illustrates how to use Morpheus to automatically detect abnormal behavior in NVIDIA SMI logs by utilizing a Forest Inference Library (FIL) model and Triton Inference Server. The particular behavior we will be searching for is cryptocurrency mining.
+## Supported Environments
+| Environment | Supported | Notes |
+|-------------|-----------|-------|
+| Conda | ✔ | |
+| Morpheus Docker Container | ✔ | Requires launching Triton on the host |
+| Morpheus Release Container | ✔ | Requires launching Triton on the host |
+| Dev Container | ✔ | Requires using the `dev-triton-start` script and replacing `--server_url=localhost:8000` with `--server_url=triton:8000` |
+
+
## Background The goal of this example is to identify whether or not a monitored NVIDIA GPU is actively mining for cryptocurrencies and take corrective action if detected. Cryptocurrency mining can be a large resource drain on GPU clusters and detecting mining can be difficult since mining workloads appear similar to other valid workloads.
diff --git a/examples/abp_pcap_detection/README.md b/examples/abp_pcap_detection/README.md index 440c3fb783..8ff40ac2e9 100644 --- a/examples/abp_pcap_detection/README.md +++ b/examples/abp_pcap_detection/README.md @@ -17,6 +17,13 @@ limitations under the License. # ABP Detection Example Using Morpheus
+## Supported Environments
+| Environment | Supported | Notes |
+|-------------|-----------|-------|
+| Conda | ✔ | |
+| Morpheus Docker Container | ✔ | Requires launching Triton on the host |
+| Morpheus Release Container | ✔ | Requires launching Triton on the host |
+| Dev Container | ✔ | Requires using the `dev-triton-start` script. If using the `run.py` script this requires adding the `--server_url=triton:8000` flag. If using the CLI example this requires replacing `--server_url=localhost:8000` with `--server_url=triton:8000` |
## Setup To run this example, an instance of Triton Inference Server and a sample dataset is required. The following steps will outline how to build and run Triton with the provided FIL model.
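Several of the tables above note that Triton must be launched on the host. As a rough sketch of what that looks like (the Triton image tag and the `--load-model` value below are illustrative assumptions; use the values given in each example's setup section), run the following from the root of the Morpheus repo:

```bash
# Sketch only: launch Triton Inference Server on the host with the Morpheus model
# repository mounted. Substitute the image tag and model name from the example
# being run (abp-pcap-xgb is shown here purely as a placeholder).
docker run --rm -ti --gpus=all \
    -p8000:8000 -p8001:8001 -p8002:8002 \
    -v $PWD/models:/models \
    nvcr.io/nvidia/tritonserver:23.06-py3 \
    tritonserver --model-repository=/models/triton-model-repo \
    --exit-on-error=false \
    --model-control-mode=explicit \
    --load-model abp-pcap-xgb
```

Once Triton reports the model as `READY`, the example pipelines can connect to it at `localhost:8000` (or `triton:8000` from within the Dev Container).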
diff --git a/examples/developer_guide/1_simple_python_stage/README.md b/examples/developer_guide/1_simple_python_stage/README.md new file mode 100644 index 0000000000..e09c13b783 --- /dev/null +++ b/examples/developer_guide/1_simple_python_stage/README.md @@ -0,0 +1,25 @@ + + + +## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | | +| Morpheus Release Container | ✔ | | +| Dev Container | ✔ | | diff --git a/examples/developer_guide/2_1_real_world_phishing/README.md b/examples/developer_guide/2_1_real_world_phishing/README.md new file mode 100644 index 0000000000..af607cd3a0 --- /dev/null +++ b/examples/developer_guide/2_1_real_world_phishing/README.md @@ -0,0 +1,25 @@ + + + +## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching Triton on the host | +| Morpheus Release Container | ✔ | Requires launching Triton on the host | +| Dev Container | ✔ | Requires using the `dev-triton-start` script and replacing `--server_url=localhost:8000` with `--server_url=triton:8000` | diff --git a/examples/developer_guide/2_2_rabbitmq/README.md b/examples/developer_guide/2_2_rabbitmq/README.md index 053f4a28a2..51ebee8347 100644 --- a/examples/developer_guide/2_2_rabbitmq/README.md +++ b/examples/developer_guide/2_2_rabbitmq/README.md @@ -18,6 +18,15 @@ limitations under the License. # Example RabbitMQ stages This example includes two stages `RabbitMQSourceStage` and `WriteToRabbitMQStage` +## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching the RabbitMQ container on the host | +| Morpheus Release Container | ✔ | Requires launching the RabbitMQ container on the host | +| Dev Container | ✘ | | + + ## Testing with a RabbitMQ container Testing can be performed locally with the RabbitMQ supplied docker image from the [RabbitMQ container registry](https://registry.hub.docker.com/_/rabbitmq/): ```bash diff --git a/examples/developer_guide/3_simple_cpp_stage/README.md b/examples/developer_guide/3_simple_cpp_stage/README.md new file mode 100644 index 0000000000..6e62534325 --- /dev/null +++ b/examples/developer_guide/3_simple_cpp_stage/README.md @@ -0,0 +1,25 @@ + + + +## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | | +| Morpheus Release Container | ✔ | Requires adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-121_arch-x86_64.yaml` | +| Dev Container | ✔ | | diff --git a/examples/developer_guide/4_rabbitmq_cpp_stage/README.md b/examples/developer_guide/4_rabbitmq_cpp_stage/README.md index 3e3364d22b..313fa34f98 100644 --- a/examples/developer_guide/4_rabbitmq_cpp_stage/README.md +++ b/examples/developer_guide/4_rabbitmq_cpp_stage/README.md @@ -20,6 +20,14 @@ This example builds upon the `examples/developer_guide/2_2_rabbitmq` example add This example adds two flags to the `read_simple.py` script. A `--use_cpp` flag which defaults to `True` and a `--num_threads` flag which defaults to the number of cores on the system as returned by `os.cpu_count()`. 
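Where the notes above call for adding development packages (for example, to build the C++ stage examples inside the release container), the update is a single `conda env update` command. A minimal sketch, assuming it is run from the root of a Morpheus checkout (inside the release container the same file is available under `/workspace`):

```bash
# Sketch: add the development packages required to build the C++ stage examples
# to the existing "morpheus" Conda environment.
conda env update --solver=libmamba -n morpheus \
    --file conda/environments/dev_cuda-121_arch-x86_64.yaml
```

The `examples_cuda-121_arch-x86_64.yaml` and `all_cuda-121_arch-x86_64.yaml` environment files referenced by other examples below can be supplied to `--file` in the same way.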
+## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching the RabbitMQ container on the host | +| Morpheus Release Container | ✔ | Requires launching the RabbitMQ container on the host, and adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-121_arch-x86_64.yaml` | +| Dev Container | ✘ | | + ## Installing Pika The `RabbitMQSourceStage` and `WriteToRabbitMQStage` stages use the [pika](https://pika.readthedocs.io/en/stable/#) RabbitMQ client for Python. To install this into the current env run: ```bash diff --git a/examples/developer_guide/7_python_modules/README.md b/examples/developer_guide/7_python_modules/README.md new file mode 100644 index 0000000000..e09c13b783 --- /dev/null +++ b/examples/developer_guide/7_python_modules/README.md @@ -0,0 +1,25 @@ + + + +## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | | +| Morpheus Release Container | ✔ | | +| Dev Container | ✔ | | diff --git a/examples/digital_fingerprinting/starter/README.md b/examples/digital_fingerprinting/starter/README.md index bad1a347ca..69e545e48b 100644 --- a/examples/digital_fingerprinting/starter/README.md +++ b/examples/digital_fingerprinting/starter/README.md @@ -14,6 +14,8 @@ # limitations under the License. --> +> **Warning**: This example is currently broken and fails with a Segmentation fault [#1641](https://github.com/nv-morpheus/Morpheus/issues/1641) + # "Starter" Digital Fingerprinting Pipeline We show here how to set up and run the DFP pipeline for three log types: CloudTrail, Duo, and Azure. Each of these log types uses a built-in source stage that handles that specific data format. New source stages can be added to allow the DFP pipeline to process different log types. All stages after the source stages are identical across all log types but can be configured differently via pipeline or stage configuration options. diff --git a/examples/gnn_fraud_detection_pipeline/README.md b/examples/gnn_fraud_detection_pipeline/README.md index 6fed193dca..9084471400 100644 --- a/examples/gnn_fraud_detection_pipeline/README.md +++ b/examples/gnn_fraud_detection_pipeline/README.md @@ -16,6 +16,15 @@ limitations under the License. --> # GNN Fraud Detection Pipeline +## Supported Environments +All environments require additional Conda packages which can be installed with either the `conda/environments/all_cuda-121_arch-x86_64.yaml` or `conda/environments/examples_cuda-121_arch-x86_64.yaml` environment files. Refer to the [Requirements](#requirements) section for more information. +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | | +| Morpheus Release Container | ✔ | | +| Dev Container | ✔ | | + ## Requirements Prior to running the GNN fraud detection pipeline, additional requirements must be installed in to your Conda environment. A supplemental requirements file has been provided in this example directory. diff --git a/examples/llm/agents/README.md b/examples/llm/agents/README.md index f336fac245..c9392a692a 100644 --- a/examples/llm/agents/README.md +++ b/examples/llm/agents/README.md @@ -34,6 +34,15 @@ limitations under the License. 
- [Run example (Simple Pipeline)](#run-example-simple-pipeline) - [Run example (Kafka Pipeline)](#run-example-kafka-pipeline) +## Supported Environments +All environments require additional Conda packages which can be installed with either the `conda/environments/all_cuda-121_arch-x86_64.yaml` or `conda/environments/examples_cuda-121_arch-x86_64.yaml` environment files. Refer to the [Install Dependencies](#install-dependencies) section for more information. +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | | +| Morpheus Release Container | ✔ | | +| Dev Container | ✔ | | + # Background Information ### Purpose diff --git a/examples/llm/completion/README.md b/examples/llm/completion/README.md index e2d2d7091f..50a0deae58 100644 --- a/examples/llm/completion/README.md +++ b/examples/llm/completion/README.md @@ -28,6 +28,16 @@ limitations under the License. - [Setting up NGC API Key](#setting-up-ngc-api-key) - [Running the Morpheus Pipeline](#running-the-morpheus-pipeline) +## Supported Environments +All environments require additional Conda packages which can be installed with either the `conda/environments/all_cuda-121_arch-x86_64.yaml` or `conda/environments/examples_cuda-121_arch-x86_64.yaml` environment files. Refer to the [Install Dependencies](#install-dependencies) section for more information. +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | | +| Morpheus Release Container | ✔ | | +| Dev Container | ✔ | | + + ## Background Information ### Purpose diff --git a/examples/llm/rag/README.md b/examples/llm/rag/README.md index 6bb35978b8..1fb5d451f7 100644 --- a/examples/llm/rag/README.md +++ b/examples/llm/rag/README.md @@ -17,6 +17,15 @@ limitations under the License. # Retrieval Augmented Generation (RAG) Pipeline +## Supported Environments +All environments require additional Conda packages which can be installed with either the `conda/environments/all_cuda-121_arch-x86_64.yaml` or `conda/environments/examples_cuda-121_arch-x86_64.yaml` environment files. This example also requires the [VDB upload](../vdb_upload/README.md) pipeline to have been run previously. +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching Milvus on the host | +| Morpheus Release Container | ✔ | Requires launching Milvus on the host | +| Dev Container | ✘ | | + ## Table of Contents ## Background Information @@ -36,12 +45,6 @@ additional background contextual and factual information which the LLM can pull - An example of populating a database is illustrated in [VDB upload](../vdb_upload/README.md) - This example assumes that pipeline has already been run to completion. -### Embedding Model - -- This pipeline can support any type of embedding model that can convert text into a vector of floats. -- For the example, we will use `all-MiniLM-L6-v2`. It is small, accurate, and included in the Morpheus repo via LFS; - it is also the default model used in the [VDB upload](../vdb_upload/README.md) pipeline. - ### Vector Database Service - Any vector database can be used to store the resulting embedding and corresponding metadata. @@ -67,8 +70,6 @@ were incorporated: ### Rationale Behind Design Decisions -- **Choice of Embedding Model:** all-MiniLM-L6-v2 was chosen due to its compactness and accuracy. 
This makes it ideal - for real-time operations and ensures that the embeddings are of high quality. - **Using Milvus as VDB:** Milvus offers scalable and efficient vector search capabilities, making it a natural choice for embedding retrieval in real-time. - **Flexible LLM integration:** The LLM is integrated into the pipeline as a standalone component, which allows for @@ -92,59 +93,6 @@ The standalone Morpheus pipeline is built using the following components: > **Note:** For this to function correctly, the VDB upload pipeline must have been run previously. -### Persistent Morpheus Pipeline - -#### Technical Overview - -![Example RAG Pipeline Diagram](./images/persistent_pipeline.png) - -The provided diagram illustrates the structural composition of the Morpheus data processing pipeline. This pipeline is -designed with the intent to handle various data streams in support of Retrieval Augmented Generation. - -> **Note**: The current `persistent` pipeline implementation differs from the above diagram in the follwing ways: - -- The source for the upload and retrieval are both KafkaSourceStage to make it easy for the user to control when - messages are processed by the example pipeline. -- There is a SplitStage added after the embedding portion of the pipeline which determines which sink to send each - message to. -- The final sink for the retrieval task is sent to another Kafka topic retrieve_output. - -#### Data Input Points - -The pipeline has multiple data input avenues: - -1. **User Uploaded Documents**: Raw documents provided by users for further processing. -2. **Streaming Event Logs**: Logs that are streamed in real-time. -3. **Streaming Data Feeds**: Continuous streams of data that could be dynamic in nature. -4. **RSS Threat Intel Feeds**: RSS-based feeds that might focus on threat intelligence. -5. **LLM Input Query**: Queries that are inputted for processing by the Large Language Model. - -#### Data Processing Stages - -The ingested data traverses several processing stages: - -1. **Sources Integration**: Data from different origins such as REST Server, Kafka, and RSS Scraper are directed into - the pipeline. -2. **Batching**: Data items are grouped together for more efficient bulk processing. -3. **Data Transformation**: - - **Chunking**: Data might be broken into smaller chunks if too large. - - **Tokenization**: Textual data is typically converted into tokens suitable for model processing. - - **Embedding**: This step likely converts data into its vector representation. - - **Mean Pooling**: Embeddings might be combined to yield a mean vector. - -4. **Inference**: Models may be used to extract patterns or make predictions from the data. -5. **Storage and Retrieval**: Vector representations are stored in the Vector DB and can be fetched upon request. - Retrieval might employ GPU-based IVF indexes (such as RAFT or FAISS). - -### Backend Components - -The pipeline is supported by a set of backend components: - -1. **Knowledge Graph DB & Service**: This serves as a potential repository and query mechanism for stored knowledge. -2. **Vector DB & Service**: Appears to handle the storage and querying of vector data. -3. **Triton Inference Server**: An inference component that interfaces with the LLM Service. - -## Getting Started ## Prerequisites @@ -178,27 +126,6 @@ Before running the pipeline, we need to ensure that the following services are r - Follow the instructions [here](https://milvus.io/docs/install_standalone-docker.md) to install and run a Milvus service. 
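The RAG example requires Milvus to be running on the host. A minimal sketch of one way to start it, assuming the standalone Docker Compose file (`milvus-standalone-docker-compose.yml`) has already been downloaded by following the Milvus documentation linked above:

```bash
# Assumption: milvus-standalone-docker-compose.yml was obtained from the Milvus
# documentation referenced above; the exact file name and version may differ.
docker compose -f milvus-standalone-docker-compose.yml up -d

# Confirm the Milvus, etcd, and MinIO containers are up before running the pipeline.
docker compose -f milvus-standalone-docker-compose.yml ps
```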
-### Triton Service - -- Pull the Docker image for Triton: - ```bash - docker pull nvcr.io/nvidia/tritonserver:23.06-py3 - ``` - -- From the Morpheus repo root directory, run the following to launch Triton and load the `all-MiniLM-L6-v2` model: - ```bash - docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:23.06-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model all-MiniLM-L6-v2 - ``` - - This will launch Triton and only load the `all-MiniLM-L6-v2` model. Once Triton has loaded the model, the following - will be displayed: - ``` - +------------------+---------+--------+ - | Model | Version | Status | - +------------------+---------+--------+ - | all-MiniLM-L6-v2 | 1 | READY | - +------------------+---------+--------+ - ``` ### Running the Morpheus Pipeline @@ -208,8 +135,6 @@ pipeline option of `rag`: ### Run example (Standalone Pipeline): -**TODO:** Add model specification syntax - **Using NGC Nemo LLMs** ```bash @@ -223,53 +148,3 @@ python examples/llm/main.py rag pipeline export OPENAI_API_KEY=[YOUR_KEY_HERE] python examples/llm/main.py rag pipeline --llm_service=OpenAI --model_name=gpt-3.5-turbo ``` - -### Run example (Persistent Pipeline): - -**TODO** - -**Using NGC Nemo LLMs** - -```bash -export NGC_API_KEY=[YOUR_KEY_HERE] -python examples/llm/main.py rag persistent -``` - -**Using OpenAI LLM models** - -```bash -export OPENAI_API_KEY=[YOUR_KEY_HERE] -python examples/llm/main.py rag persistent -``` - -### Options: - -- `--log_level [CRITICAL|FATAL|ERROR|WARN|WARNING|INFO|DEBUG]` - - **Description**: Specifies the logging level. - - **Default**: `INFO` - -- `--use_cpp BOOLEAN` - - **Description**: Opt to use C++ node and message types over python. Recommended only in case of bugs. - - **Default**: `False` - -- `--version` - - **Description**: Display the script's current version. - -- `--help` - - **Description**: Show the help message with options and commands details. - -### Commands: - -- ... other pipelines ... -- `rag` - ---- - -## Options for `rag` Command - -The `rag` command has its own set of options and commands: - -### Commands: - -- `persistant` -- `pipeline` diff --git a/examples/llm/rag/images/persistent_pipeline.png b/examples/llm/rag/images/persistent_pipeline.png deleted file mode 100644 index 26b188f214..0000000000 Binary files a/examples/llm/rag/images/persistent_pipeline.png and /dev/null differ diff --git a/examples/llm/rag/persistant_pipeline.py b/examples/llm/rag/persistant_pipeline.py deleted file mode 100644 index 7f5cc4f756..0000000000 --- a/examples/llm/rag/persistant_pipeline.py +++ /dev/null @@ -1,197 +0,0 @@ -# Copyright (c) 2023-2024, NVIDIA CORPORATION. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import time - -import mrc -import mrc.core.operators as ops -from mrc.core.node import Broadcast - -from morpheus.config import Config -from morpheus.config import PipelineModes -from morpheus.llm import LLMEngine -from morpheus.llm.nodes.extracter_node import ExtracterNode -from morpheus.llm.nodes.rag_node import RAGNode -from morpheus.llm.task_handlers.simple_task_handler import SimpleTaskHandler -from morpheus.messages import ControlMessage -from morpheus.messages import MessageMeta -from morpheus.pipeline.pipeline import Pipeline -from morpheus.pipeline.stage import Stage -from morpheus.pipeline.stage_schema import StageSchema -from morpheus.service.vdb.vector_db_service import VectorDBResourceService -from morpheus.stages.inference.triton_inference_stage import TritonInferenceStage -from morpheus.stages.input.kafka_source_stage import KafkaSourceStage -from morpheus.stages.llm.llm_engine_stage import LLMEngineStage -from morpheus.stages.output.write_to_kafka_stage import WriteToKafkaStage -from morpheus.stages.output.write_to_vector_db_stage import WriteToVectorDBStage -from morpheus.stages.preprocess.deserialize_stage import DeserializeStage -from morpheus.stages.preprocess.preprocess_nlp_stage import PreprocessNLPStage - -from ..common.utils import build_default_milvus_config -from ..common.utils import build_llm_service -from ..common.utils import build_milvus_service - - -class SplitStage(Stage): - - def __init__(self, c: Config): - super().__init__(c) - - self._create_ports(1, 2) - - @property - def name(self) -> str: - return "split" - - def supports_cpp_node(self): - return False - - def compute_schema(self, schema: StageSchema): - schema.output_schemas[0].set_type(schema.input_type) - schema.output_schemas[1].set_type(schema.input_type) - - def _build(self, builder: mrc.Builder, input_nodes: list[mrc.SegmentObject]) -> list[mrc.SegmentObject]: - assert len(input_nodes) == 1, "Only 1 input supported" - - # Create a broadcast node - broadcast = Broadcast(builder, "broadcast") - builder.make_edge(input_nodes[0], broadcast) - - def filter_higher_fn(data: MessageMeta): - return MessageMeta(data.df[data.df["v2"] >= 0.5]) - - def filter_lower_fn(data: MessageMeta): - return MessageMeta(data.df[data.df["v2"] < 0.5]) - - # Create a node that only passes on rows >= 0.5 - filter_higher = builder.make_node("filter_higher", ops.map(filter_higher_fn)) - builder.make_edge(broadcast, filter_higher) - - # Create a node that only passes on rows < 0.5 - filter_lower = builder.make_node("filter_lower", ops.map(filter_lower_fn)) - builder.make_edge(broadcast, filter_lower) - - return [filter_higher, filter_lower] - - -def _build_engine(model_name: str, vdb_service: VectorDBResourceService, llm_service: str): - engine = LLMEngine() - - engine.add_node("extracter", node=ExtracterNode()) - - prompt = """You are a helpful assistant. 
Given the following background information:\n -{% for c in contexts -%} -Title: {{ c.title }} -Summary: {{ c.summary }} -Text: {{ c.page_content }} -{% endfor %} - -Please answer the following question: \n{{ query }}""" - - llm_service = build_llm_service(model_name, llm_service=llm_service, temperature=0.5, tokens_to_generate=200) - - engine.add_node("rag", - inputs=[("/extracter/*", "*")], - node=RAGNode(prompt=prompt, vdb_service=vdb_service, embedding=None, llm_client=llm_service)) - - engine.add_task_handler(inputs=["/rag"], handler=SimpleTaskHandler()) - - return engine - - -def pipeline(num_threads, pipeline_batch_size, model_max_batch_size, embedding_size, model_name, llm_service: str): - config = Config() - config.mode = PipelineModes.OTHER - - # Below properties are specified by the command line - config.num_threads = num_threads - config.pipeline_batch_size = pipeline_batch_size - config.model_max_batch_size = model_max_batch_size - config.mode = PipelineModes.NLP - config.edge_buffer_size = 128 - - vdb_service = build_milvus_service(embedding_size=embedding_size) - - upload_task = {"task_type": "upload", "task_dict": {"input_keys": ["questions"], }} - retrieve_task = {"task_type": "retrieve", "task_dict": {"input_keys": ["questions", "embedding"], }} - - pipe = Pipeline(config) - - # Source of the retrieval queries - retrieve_source = pipe.add_stage(KafkaSourceStage(config, bootstrap_servers="auto", input_topic=["retrieve_input"])) - - retrieve_deserialize = pipe.add_stage( - DeserializeStage(config, message_type=ControlMessage, task_type="llm_engine", task_payload=retrieve_task)) - - pipe.add_edge(retrieve_source, retrieve_deserialize) - - # Source of continually uploading documents - upload_source = pipe.add_stage(KafkaSourceStage(config, bootstrap_servers="auto", input_topic=["upload"])) - - upload_deserialize = pipe.add_stage( - DeserializeStage(config, message_type=ControlMessage, task_type="llm_engine", task_payload=upload_task)) - - pipe.add_edge(upload_source, upload_deserialize) - - # Join the sources into one for tokenization - preprocess = pipe.add_stage( - PreprocessNLPStage(config, - vocab_hash_file="data/bert-base-uncased-hash.txt", - do_lower_case=True, - truncation=True, - add_special_tokens=False, - column='content')) - - pipe.add_edge(upload_deserialize, preprocess) - pipe.add_edge(retrieve_deserialize, preprocess) - - inference = pipe.add_stage( - TritonInferenceStage(config, - model_name=model_name, - server_url="localhost:8001", - force_convert_inputs=True, - use_shared_memory=True)) - pipe.add_edge(preprocess, inference) - - # Split the results based on the task - split = pipe.add_stage(SplitStage(config)) - pipe.add_edge(inference, split) - - # If it's a retrieve task, branch to the LLM engine for RAG - retrieve_llm_engine = pipe.add_stage( - LLMEngineStage(config, - engine=_build_engine(model_name=model_name, - vdb_service=vdb_service.load_resource("RSS"), - llm_service=llm_service))) - pipe.add_edge(split.output_ports[0], retrieve_llm_engine) - - retrieve_results = pipe.add_stage( - WriteToKafkaStage(config, bootstrap_servers="auto", output_topic="retrieve_output")) - pipe.add_edge(retrieve_llm_engine, retrieve_results) - - # If it's an upload task, then send it to the database - upload_vdb = pipe.add_stage( - WriteToVectorDBStage(config, - resource_name="RSS", - resource_kwargs=build_default_milvus_config(embedding_size=embedding_size), - recreate=True, - service=vdb_service)) - pipe.add_edge(split.output_ports[1], upload_vdb) - - start_time = 
time.time() - - # Run the pipeline - pipe.run() - - return start_time diff --git a/examples/llm/rag/run.py b/examples/llm/rag/run.py index c060f34127..ace82fea0f 100644 --- a/examples/llm/rag/run.py +++ b/examples/llm/rag/run.py @@ -88,47 +88,3 @@ def pipeline(**kwargs): from .standalone_pipeline import standalone return standalone(**kwargs) - - -@run.command() -@click.option( - "--num_threads", - default=os.cpu_count(), - type=click.IntRange(min=1), - help="Number of internal pipeline threads to use", -) -@click.option( - "--pipeline_batch_size", - default=1024, - type=click.IntRange(min=1), - help=("Internal batch size for the pipeline. Can be much larger than the model batch size. " - "Also used for Kafka consumers"), -) -@click.option( - "--model_max_batch_size", - default=64, - type=click.IntRange(min=1), - help="Max batch size to use for the model", -) -@click.option( - "--embedding_size", - default=384, - type=click.IntRange(min=1), - help="The output size of the embedding calculation. Depends on the model supplied by --model_name", -) -@click.option( - "--model_name", - required=True, - type=str, - default='gpt-43b-002', - help="The name of the model that is deployed on Triton server", -) -@click.option("--llm_service", - default="NemoLLM", - type=click.Choice(['NemoLLM', 'OpenAI'], case_sensitive=False), - help="LLM service to issue requests to, should be used in conjunction with --model_name.") -def persistant(**kwargs): - - from .persistant_pipeline import pipeline as _pipeline - - return _pipeline(**kwargs) diff --git a/examples/llm/vdb_upload/README.md b/examples/llm/vdb_upload/README.md index 3e73f3524f..b8a3ef35e5 100644 --- a/examples/llm/vdb_upload/README.md +++ b/examples/llm/vdb_upload/README.md @@ -33,6 +33,16 @@ limitations under the License. - [Options for vdb_upload Command](#options-for-vdb_upload-command) - [Exporting and Deploying a Different Model from Huggingface](#exporting-and-deploying-a-different-model-from-huggingface) +## Supported Environments +All environments require additional Conda packages which can be installed with either the `conda/environments/all_cuda-121_arch-x86_64.yaml` or `conda/environments/examples_cuda-121_arch-x86_64.yaml` environment files. +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching Triton and Milvus on the host | +| Morpheus Release Container | ✔ | Requires launching Triton and Milvus on the host | +| Dev Container | ✘ | | + + ## Background Information ### Purpose diff --git a/examples/log_parsing/README.md b/examples/log_parsing/README.md index 425e1c0b1c..eff8d62538 100644 --- a/examples/log_parsing/README.md +++ b/examples/log_parsing/README.md @@ -18,6 +18,14 @@ Example Morpheus pipeline using Triton Inference server and Morpheus. +## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching Triton on the host | +| Morpheus Release Container | ✔ | Requires launching Triton on the host | +| Dev Container | ✔ | Requires using the `dev-triton-start` script. If using the `run.py` script this requires adding the `--server_url=triton:8000` flag. 
If using the CLI example this requires replacing `--server_url=localhost:8000` with `--server_url=triton:8000` | + ### Set up Triton Inference Server ##### Pull Triton Inference Server Docker Image diff --git a/examples/nlp_si_detection/README.md b/examples/nlp_si_detection/README.md index 33081caf00..19d38e19c0 100644 --- a/examples/nlp_si_detection/README.md +++ b/examples/nlp_si_detection/README.md @@ -19,6 +19,14 @@ limitations under the License. This example illustrates how to use Morpheus to detect Sensitive Information (SI) in network packets automatically by utilizing a Natural Language Processing (NLP) neural network and Triton Inference Server. +## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching Triton on the host | +| Morpheus Release Container | ✔ | Requires launching Triton on the host | +| Dev Container | ✔ | Requires using the `dev-triton-start` script and replacing `--server_url=localhost:8000` with `--server_url=triton:8000` | + ## Background The goal of this example is to identify potentially sensitive information in network packet data as quickly as possible to limit exposure and take corrective action. Sensitive information is a broad term but can be generalized to any data that should be guarded from unauthorized access. Credit card numbers, passwords, authorization keys, and emails are all examples of sensitive information. diff --git a/examples/ransomware_detection/README.md b/examples/ransomware_detection/README.md index 6c04feae46..6b9e19f1ac 100644 --- a/examples/ransomware_detection/README.md +++ b/examples/ransomware_detection/README.md @@ -19,6 +19,14 @@ limitations under the License. Example of a Morpheus Pipeline using Triton Inference server. +## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching Triton on the host | +| Morpheus Release Container | ✔ | Requires launching Triton on the host | +| Dev Container | ✔ | Requires using the `dev-triton-start` script. If using the `run.py` script this requires adding the `--server_url=triton:8000` flag. If using the CLI example this requires replacing `--server_url=localhost:8000` with `--server_url=triton:8000` | + ## Setup Triton Inference Server ##### Pull Triton Inference Server Docker Image @@ -59,12 +67,6 @@ Once Triton server finishes starting up, it will display the status of all loade > **Note**: If this is not present in the output, check the Triton log for any error messages related to loading the model. -## Requirements -> **Note**: Make sure `dask` and `distributed` are installed in your Conda environment before running the ransomware detection pipeline. Run the installation command specified below if not. - -```bash -mamba install 'dask>=2023.1.1' 'distributed>=2023.1.1' -``` ## Run Ransomware Detection Pipeline Run the following from the root of the Morpheus repo to start the ransomware detection pipeline: diff --git a/examples/root_cause_analysis/README.md b/examples/root_cause_analysis/README.md index b456c3ff72..84f3d47b2a 100644 --- a/examples/root_cause_analysis/README.md +++ b/examples/root_cause_analysis/README.md @@ -19,6 +19,14 @@ limitations under the License. These examples illustrate how to use Morpheus to build a binary sequence classification pipelines to perform root cause analysis on DGX kernel logs. 
+## Supported Environments +| Environment | Supported | Notes | +|-------------|-----------|-------| +| Conda | ✔ | | +| Morpheus Docker Container | ✔ | Requires launching Triton on the host | +| Morpheus Release Container | ✔ | Requires launching Triton on the host | +| Dev Container | ✔ | Requires using the `dev-triton-start` script and replacing `--server_url=localhost:8000` with `--server_url=triton:8000` | + ## Background Like any other Linux based machine, DGX's generate a vast amount of logs. Analysts spend hours trying to identify the root causes of each failure. There could be infinitely many types of root causes of the failures. Some patterns might help to narrow it down; however, regular expressions can only help to identify previously known patterns. Moreover, this creates another manual task of maintaining a search script.