Documentation improvements (#2117)
* Consolidate the `examples/digital_fingerprinting/production/README.md` and `docs/source/developer_guide/guides/5_digital_fingerprinting.md` documents (#2107)
  * Ensure that the `README.md` file refers to the `5_digital_fingerprinting.md` file.
  * Remove redundant build instructions from `5_digital_fingerprinting.md` and instead direct the user to `README.md`.
  * The `README.md` file now documents how to build and run the example.
  * The `5_digital_fingerprinting.md` file now serves as a reference for features and output fields, along with guiding the user for customizing the pipeline.
* Support ARM builds for DFP containers
* Remove DFP documentation regarding helm charts.
* Document the requirement for installing `model-utils` dependency target for the `onnx-to-trt` tool (#2103).
* Update the `onnx-to-trt` import error message to reflect the `model-utils` Conda env file. Rather than logging-and-raising, place the error message directly into the exception; this prevents the message from being lost in the traceback.
* Update the `--seq_length` flag in the `onnx-to-trt` command for converting the phishing model (#2116).
* Replace hard-coded instances of `x86_64` (#2114).
* Add ARM to matrix for the `model-utils` target.
* Add `.cache*` to `.gitignore`, allowing for platform-specific `.cache` directories.
* Ignore verifying anchor tags for github.com; the way github.com handles anchor tags into Markdown files conflicts with the link checker.

Closes #2103
Closes #2107
Closes #2114
Closes #2116


## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #2117
dagardner-nv authored Jan 18, 2025
1 parent cf8a9df commit c25de50
Showing 30 changed files with 108 additions and 233 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -88,7 +88,7 @@ htmlcov/
.tox/
.coverage
.coverage.*
.cache
.cache*
nosetests.xml
coverage.xml
*.cover
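As a quick sanity check of the broadened pattern (sketched with Python's `fnmatch`, which approximates gitignore's glob semantics for this simple case):

```python
from fnmatch import fnmatch

# `.cache*` matches the original `.cache` plus platform-specific variants.
for name in (".cache", ".cache-x86_64", ".cache-aarch64"):
    assert fnmatch(name, ".cache*")

# A bare "cache" (no leading dot) is still not ignored.
assert not fnmatch("cache", ".cache*")
```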
6 changes: 6 additions & 0 deletions ci/release/update-version.sh
@@ -98,11 +98,17 @@ sed_runner 's/'"VERSION ${CURRENT_FULL_VERSION}.*"'/'"VERSION ${NEXT_FULL_VERSIO
examples/developer_guide/3_simple_cpp_stage/CMakeLists.txt \
examples/developer_guide/4_rabbitmq_cpp_stage/CMakeLists.txt

# docs/source/basics/overview.rst
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" docs/source/basics/overview.rst

# docs/source/cloud_deployment_guide.md
sed_runner "s|${CURRENT_SHORT_TAG}.tgz|${NEXT_SHORT_TAG}.tgz|g" docs/source/cloud_deployment_guide.md
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" docs/source/cloud_deployment_guide.md
sed_runner "s|tree/branch-${CURRENT_SHORT_TAG}|tree/branch-${NEXT_SHORT_TAG}|g" docs/source/cloud_deployment_guide.md

# docs/source/developer_guide/guides/5_digital_fingerprinting.md
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" docs/source/developer_guide/guides/5_digital_fingerprinting.md

# docs/source/examples.md
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" docs/source/examples.md
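The `sed_runner` substitutions above amount to the following rewrite, sketched in Python with hypothetical current/next tags (the real script derives these from the release version being set):

```python
import re

# Hypothetical tags for illustration; update-version.sh computes these.
current_short_tag, next_short_tag = "25.02", "25.06"

text = "https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/models/README.md"
# Equivalent of: sed "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g"
updated = re.sub(rf"blob/branch-{re.escape(current_short_tag)}",
                 f"blob/branch-{next_short_tag}", text)
assert "blob/branch-25.06" in updated
```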

26 changes: 26 additions & 0 deletions conda/environments/model-utils_cuda-125_arch-aarch64.yaml
@@ -0,0 +1,26 @@
# This file is generated by `rapids-dependency-file-generator`.
# To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
channels:
- conda-forge
- huggingface
- rapidsai
- rapidsai-nightly
- nvidia
- nvidia/label/dev
- pytorch
dependencies:
- cuml=24.10.*
- jupyterlab
- matplotlib
- onnx
- pandas
- pip
- python=3.10
- scikit-learn=1.3.2
- seaborn
- seqeval=1.2.2
- transformers=4.36.2
- xgboost
- pip:
- tensorrt-cu12
name: model-utils_cuda-125_arch-aarch64
2 changes: 1 addition & 1 deletion dependencies.yaml
@@ -164,7 +164,7 @@ files:
output: conda
matrix:
cuda: ["12.5"]
arch: [x86_64]
arch: [x86_64, aarch64]
includes:
- model-training-tuning
- python
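Adding `aarch64` to the matrix fans the generator out to one environment file per (cuda, arch) pair. Roughly (a sketch; the actual files are produced by `rapids-dependency-file-generator`, with the naming inferred from the generated `model-utils_cuda-125_arch-aarch64.yaml` in this commit):

```python
from itertools import product

# One generated environment file per (cuda, arch) combination.
cudas, arches = ["12.5"], ["x86_64", "aarch64"]
files = [f"model-utils_cuda-{c.replace('.', '')}_arch-{a}.yaml"
         for c, a in product(cudas, arches)]
assert "model-utils_cuda-125_arch-aarch64.yaml" in files
```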
9 changes: 9 additions & 0 deletions docs/source/basics/overview.rst
@@ -107,6 +107,15 @@ queried in the same manner:
--max_workspace_size INTEGER [default: 16000]
--help Show this message and exit.
ONNX To TensorRT
----------------
The ONNX to TensorRT (TRT) conversion utility requires additional packages, which can be installed using the following command:
```bash
conda env update --solver=libmamba -n morpheus --file conda/environments/model-utils_cuda-125_arch-$(arch).yaml
```

Example usage of the ONNX to TRT conversion utility can be found in `models/README.md <https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/models/README.md#generating-trt-models-from-onnx>`_.
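The `$(arch)` in the command above expands to the host's machine architecture; the equivalent selection in Python (a sketch, assuming the two supported architectures):

```python
import platform

# `$(arch)` in the shell resolves to e.g. "x86_64" or "aarch64".
arch = platform.machine()
env_file = f"conda/environments/model-utils_cuda-125_arch-{arch}.yaml"
```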

AutoComplete
------------

5 changes: 4 additions & 1 deletion docs/source/conf.py
@@ -193,13 +193,16 @@
# Config linkcheck
# Ignore localhost and url prefix fragments
# Ignore openai.com links, as these always report a 403 when requested by the linkcheck agent
# The way Github handles anchors into markdown files is not compatible with the way linkcheck handles them.
# This allows us to continue to verify that the links are valid, but ignore the anchors.
linkcheck_ignore = [
r'http://localhost:\d+/',
r'https://localhost:\d+/',
r'^http://$',
r'^https://$',
r'https://(platform\.)?openai.com',
r'https://code.visualstudio.com'
r'https://code.visualstudio.com',
r"^https://github.com/nv-morpheus/Morpheus/blob/.*#.+$"
]
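To illustrate the new ignore entry: the pattern skips only `blob/` links that carry an anchor fragment, while plain file links are still verified by linkcheck:

```python
import re

pattern = r"^https://github.com/nv-morpheus/Morpheus/blob/.*#.+$"

with_anchor = ("https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/"
               "models/README.md#generating-trt-models-from-onnx")
without_anchor = ("https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/"
                  "models/README.md")

assert re.match(pattern, with_anchor)        # ignored by linkcheck
assert not re.match(pattern, without_anchor)  # still checked
```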

# Add any paths that contain templates here, relative to this directory.
14 changes: 7 additions & 7 deletions docs/source/developer_guide/contributing.md
@@ -159,11 +159,11 @@ Morpheus provides multiple Conda environment files to support different workflow
The following are the available Conda environment files, all are located in the `conda/environments` directory, with the following naming convention: `<environment>_<cuda_version>_arch-<architecture>.yaml`.
| Environment | File | Description |
| --- | --- | --- |
| `all` | `all_cuda-125_arch-x86_64.yaml` | All dependencies required to build, run and test Morpheus, along with all of the examples. This is a superset of the `dev`, `runtime` and `examples` environments. |
| `dev` | `dev_cuda-125_arch-x86_64.yaml` | Dependencies required to build, run and test Morpheus. This is a superset of the `runtime` environment. |
| `examples` | `examples_cuda-125_arch-x86_64.yaml` | Dependencies required to run all examples. This is a superset of the `runtime` environment. |
| `model-utils` | `model-utils_cuda-125_arch-x86_64.yaml` | Dependencies required to train models independent of Morpheus. |
| `runtime` | `runtime_cuda-125_arch-x86_64.yaml` | Minimal set of dependencies strictly required to run Morpheus. |
| `all` | `all_cuda-125_arch-<arch>.yaml` | All dependencies required to build, run and test Morpheus, along with all of the examples. This is a superset of the `dev`, `runtime` and `examples` environments. |
| `dev` | `dev_cuda-125_arch-<arch>.yaml` | Dependencies required to build, run and test Morpheus. This is a superset of the `runtime` environment. |
| `examples` | `examples_cuda-125_arch-<arch>.yaml` | Dependencies required to run all examples. This is a superset of the `runtime` environment. |
| `model-utils` | `model-utils_cuda-125_arch-<arch>.yaml` | Dependencies required to train models independent of Morpheus. |
| `runtime` | `runtime_cuda-125_arch-<arch>.yaml` | Minimal set of dependencies strictly required to run Morpheus. |
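The naming convention described above can be sketched as follows (`env_file` is a hypothetical helper for illustration, not part of Morpheus):

```python
# Hypothetical helper illustrating the
# `<environment>_<cuda_version>_arch-<architecture>.yaml` convention.
def env_file(environment: str, cuda: str = "cuda-125", arch: str = "x86_64") -> str:
    return f"{environment}_{cuda}_arch-{arch}.yaml"

assert env_file("dev") == "dev_cuda-125_arch-x86_64.yaml"
assert env_file("model-utils", arch="aarch64") == "model-utils_cuda-125_arch-aarch64.yaml"
```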


##### Updating Morpheus Dependencies
@@ -200,11 +200,11 @@ When ready, commit both the changes to the `dependencies.yaml` file and the upda
```
1. Create the Morpheus Conda environment using either the `dev` or `all` environment file. Refer to the [Conda Environment YAML Files](#conda-environment-yaml-files) section for more information.
```bash
conda env create --solver=libmamba -n morpheus --file conda/environments/dev_cuda-125_arch-x86_64.yaml
conda env create --solver=libmamba -n morpheus --file conda/environments/dev_cuda-125_arch-$(arch).yaml
```
or
```bash
conda env create --solver=libmamba -n morpheus --file conda/environments/all_cuda-125_arch-x86_64.yaml
conda env create --solver=libmamba -n morpheus --file conda/environments/all_cuda-125_arch-$(arch).yaml
```

144 changes: 4 additions & 140 deletions docs/source/developer_guide/guides/5_digital_fingerprinting.md
@@ -22,8 +22,11 @@ Every account, user, service, and machine has a digital fingerprint that represe

To construct this digital fingerprint, we will be training unsupervised behavioral models at various granularities, including a generic model for all users in the organization along with fine-grained models for each user to monitor their behavior. These models are continuously updated and retrained over time​, and alerts are triggered when deviations from normality occur for any user​.

## Running the DFP Example
Instructions for building and running the DFP example are available in the [`examples/digital_fingerprinting/production/README.md`](https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/examples/digital_fingerprinting/production/README.md) guide in the Morpheus repository.

## Training Sources
The data we will want to use for the training and inference will be any sensitive system that the user interacts with, such as VPN, authentication and cloud services. The digital fingerprinting example (`examples/digital_fingerprinting/README.md`) included in Morpheus ingests logs from [Azure Active Directory](https://docs.microsoft.com/en-us/azure/active-directory/reports-monitoring/concept-sign-ins), and [Duo Authentication](https://duo.com/docs/adminapi).
The data we will want to use for the training and inference will be any sensitive system that the user interacts with, such as VPN, authentication and cloud services. The digital fingerprinting example ([`examples/digital_fingerprinting/production/README.md`](https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/examples/digital_fingerprinting/production/README.md)) included in Morpheus ingests logs from [Azure Active Directory](https://docs.microsoft.com/en-us/azure/active-directory/reports-monitoring/concept-sign-ins), and [Duo Authentication](https://duo.com/docs/adminapi).

The location of these logs could be either local to the machine running Morpheus, a shared file system like NFS, or on a remote store such as [Amazon S3](https://aws.amazon.com/s3/).

@@ -131,145 +134,6 @@ The reference architecture is composed of the following services:
| `morpheus_pipeline` | Used for executing both training and inference pipelines |
| `fetch_data` | Downloads the example datasets for the DFP example |

### Running via `docker-compose`
#### System requirements
* [Docker](https://docs.docker.com/get-docker/) and [docker-compose](https://docs.docker.com/compose/) installed on the host machine​
* Supported GPU with [NVIDIA Container Toolkit​](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)

> **Note:** For GPU Requirements refer to the [Getting Started](../../getting_started.md#requirements) guide.
#### Building the services
From the root of the Morpheus repo, run:
```bash
cd examples/digital_fingerprinting/production
export MORPHEUS_CONTAINER_VERSION="$(git describe --tags --abbrev=0)-runtime"
docker compose build
```

> **Note:** This requires version 1.28.0 or higher of Docker Compose, and preferably v2. If you encounter an error similar to:
>
> ```
> ERROR: The Compose file './docker-compose.yml' is invalid because:
> services.jupyter.deploy.resources.reservations value Additional properties are not allowed ('devices' was
> unexpected)
> ```
>
> This is most likely due to using an older version of the `docker-compose` command; instead, re-run the build with `docker compose`. Refer to [Migrate to Compose V2](https://docs.docker.com/compose/migrate/) for more information.
#### Downloading the example datasets
First, we will need to install additional requirements into the Conda environment, then run the `examples/digital_fingerprinting/fetch_example_data.py` script. This will download the example data into the `examples/data/dfp` dir.
The script can be run from within the `fetch_data` Docker Compose service, or from within a Conda environment on the host machine.
##### Docker Compose Service Method
This approach has the advantage of not requiring any additional setup on the host machine. From the `examples/digital_fingerprinting/production` dir run:
```bash
docker compose up fetch_data
```
##### Conda Environment Method
This approach is useful for users who have already set up a Conda environment on their host machine, and has the advantage that the downloaded data will be owned by the host user.

If a Conda environment has already been created, it can be updated by running the following command from the root of the Morpheus repo:
```bash
conda env update --solver=libmamba \
-n ${CONDA_DEFAULT_ENV} \
--file ./conda/environments/examples_cuda-125_arch-x86_64.yaml
```

If a Conda environment has not been created, it can be created by running the following command from the root of the Morpheus repo:
```bash
conda env create --solver=libmamba \
-n morpheus \
--file ./conda/environments/all_cuda-125_arch-x86_64.yaml
```

Once the Conda environment has been updated or created, fetch the data with the following command:
```bash
python examples/digital_fingerprinting/fetch_example_data.py all
```

#### Running the services
##### Jupyter Server
From the `examples/digital_fingerprinting/production` dir run:
```bash
docker compose up jupyter
```

Once the build is complete and the service has started, a message similar to the following should display:
```
jupyter | To access the server, open this file in a browser:
jupyter | file:///root/.local/share/jupyter/runtime/jpserver-7-open.html
jupyter | Or copy and paste one of these URLs:
jupyter | http://localhost:8888/lab?token=<token>
jupyter | or http://127.0.0.1:8888/lab?token=<token>
```

Copy and paste the URL into a web browser. There are four notebooks included with the DFP example:
* `dfp_azure_training.ipynb` - Training pipeline for Azure Active Directory data
* `dfp_azure_inference.ipynb` - Inference pipeline for Azure Active Directory data
* `dfp_duo_training.ipynb` - Training pipeline for Duo Authentication
* `dfp_duo_inference.ipynb` - Inference pipeline for Duo Authentication

> **Note:** The token in the URL is a one-time use token and a new one is generated with each invocation.
##### Morpheus Pipeline
By default, the `morpheus_pipeline` service runs the training pipeline for Duo data. From the `examples/digital_fingerprinting/production` dir, run:
```bash
docker compose up morpheus_pipeline
```

If instead you want to run a different pipeline, then from the `examples/digital_fingerprinting/production` dir run:
```bash
docker compose run morpheus_pipeline bash
```


From the prompt within the `morpheus_pipeline` container, you can run either the `dfp_azure_pipeline.py` or `dfp_duo_pipeline.py` pipeline scripts.
```bash
python dfp_azure_pipeline.py --help
python dfp_duo_pipeline.py --help
```

Both scripts are capable of running either a training or inference pipeline for their respective data sources. The command-line options for both are the same:
| Flag | Type | Description |
| ---- | ---- | ----------- |
| `--train_users` | One of: `all`, `generic`, `individual`, `none` | Indicates whether to train per-user models or a generic model for all users. Selecting `none` runs the inference pipeline. |
| `--skip_user` | TEXT | User IDs to skip. Mutually exclusive with `only_user` |
| `--only_user` | TEXT | Only users specified by this option will be included. Mutually exclusive with `skip_user` |
| `--start_time` | TEXT | The start of the time window; if undefined, `start_time` will be `now()-duration` |
| `--duration` | TEXT | The duration to run starting from `start_time` [default: `60d`] |
| `--cache_dir` | TEXT | The location to cache data such as S3 downloads and pre-processed data [environment variable: `DFP_CACHE_DIR`; default: `./.cache/dfp`] |
| `--log_level` | One of: `CRITICAL`, `FATAL`, `ERROR`, `WARN`, `WARNING`, `INFO`, `DEBUG` | Specify the logging level to use. [default: `WARNING`] |
| `--sample_rate_s` | INTEGER | Minimum time step, in milliseconds, between object logs. [environment variable: `DFP_SAMPLE_RATE_S`; default: 0] |
| `-f`, `--input_file` | TEXT | List of files to process. Can specify multiple arguments for multiple files. Also accepts glob (*) wildcards and schema prefixes such as `s3://`. For example, to make a local cache of an s3 bucket, use `filecache::s3://mybucket/*`. Refer to [`fsspec` documentation](https://filesystem-spec.readthedocs.io/en/latest/api.html?highlight=open_files#fsspec.open_files) for list of possible options. |
| `--watch_inputs` | FLAG | Instructs the pipeline to continuously check the paths specified by `--input_file` for new files. This assumes that at least one of the paths contains a wildcard. |
| `--watch_interval` | FLOAT | Amount of time, in seconds, to wait between checks for new files. Only used if `--watch_inputs` is set. [default: `1.0`] |
| `--tracking_uri` | TEXT | The MLflow tracking URI to connect to. [default: `http://localhost:5000`] |
| `--help` | | Show this message and exit. |
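The `--start_time`/`--duration` defaulting described in the table works out to the following (a sketch; parsing of duration strings like `60d` is handled by the pipeline scripts themselves):

```python
from datetime import datetime, timedelta

# Default window per the table: start_time = now() - duration, duration = 60d.
duration = timedelta(days=60)
start_time = datetime.now() - duration
window = (start_time, start_time + duration)
assert window[1] - window[0] == timedelta(days=60)
```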


To run the DFP pipelines with the example datasets within the container, run:

* Duo Training Pipeline
```bash
python dfp_duo_pipeline.py --train_users=all --start_time="2022-08-01" --input_file="/workspace/examples/data/dfp/duo-training-data/*.json"
```

* Duo Inference Pipeline
```bash
python dfp_duo_pipeline.py --train_users=none --start_time="2022-08-30" --input_file="/workspace/examples/data/dfp/duo-inference-data/*.json"
```

* Azure Training Pipeline
```bash
python dfp_azure_pipeline.py --train_users=all --start_time="2022-08-01" --input_file="/workspace/examples/data/dfp/azure-training-data/*.json"
```

* Azure Inference Pipeline
```bash
python dfp_azure_pipeline.py --train_users=none --start_time="2022-08-30" --input_file="/workspace/examples/data/dfp/azure-inference-data/*.json"
```

##### Output Fields
The output files will contain those logs from the input dataset for which an anomaly was detected; this is determined by the z-score in the `mean_abs_z` field. By default, any logs with a z-score of 2.0 or higher are considered anomalous. Refer to [`DFPPostprocessingStage`](6_digital_fingerprinting_reference.md#post-processing-stage-dfppostprocessingstage).
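The filtering described above amounts to the following (a sketch with made-up rows; the real `DFPPostprocessingStage` operates on the pipeline's DataFrame):

```python
# Keep only logs whose mean absolute z-score marks them anomalous;
# the 2.0 threshold matches the default described above.
rows = [
    {"user": "alice", "mean_abs_z": 0.7},
    {"user": "bob", "mean_abs_z": 3.1},
]
anomalies = [r for r in rows if r["mean_abs_z"] >= 2.0]
assert [r["user"] for r in anomalies] == ["bob"]
```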

2 changes: 1 addition & 1 deletion examples/abp_nvsmi_detection/README.md
@@ -63,7 +63,7 @@ This example can be easily applied to datasets generated from your own NVIDIA GP

pyNVML is not installed by default, use the following command to install it:
```bash
conda env update --solver=libmamba -n morpheus --file conda/environments/examples_cuda-125_arch-x86_64.yaml
conda env update --solver=libmamba -n morpheus --file conda/environments/examples_cuda-125_arch-$(arch).yaml
```

Run the following to start generating your dataset:
2 changes: 1 addition & 1 deletion examples/developer_guide/3_simple_cpp_stage/README.md
@@ -21,5 +21,5 @@ limitations under the License.
|-------------|-----------|-------|
| Conda || |
| Morpheus Docker Container || |
| Morpheus Release Container || Requires adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-125_arch-x86_64.yaml` |
| Morpheus Release Container || Requires adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-125_arch-$(arch).yaml` |
| Dev Container || |
2 changes: 1 addition & 1 deletion examples/developer_guide/4_rabbitmq_cpp_stage/README.md
@@ -23,7 +23,7 @@ This example builds upon the `examples/developer_guide/2_2_rabbitmq` example add
|-------------|-----------|-------|
| Conda || |
| Morpheus Docker Container || Requires launching the RabbitMQ container on the host |
| Morpheus Release Container || Requires launching the RabbitMQ container on the host, and adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-125_arch-x86_64.yaml` |
| Morpheus Release Container || Requires launching the RabbitMQ container on the host, and adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-125_arch-$(arch).yaml` |
| Dev Container || |

## Installing Pika
10 changes: 5 additions & 5 deletions examples/digital_fingerprinting/production/Dockerfile
@@ -16,7 +16,7 @@
ARG BASE_IMG=nvcr.io/nvidia/cuda
ARG BASE_IMG_TAG=12.5.1-base-ubuntu22.04

FROM ${BASE_IMG}:${BASE_IMG_TAG} AS base
FROM --platform=$TARGETPLATFORM ${BASE_IMG}:${BASE_IMG_TAG} AS base

# Install necessary dependencies using apt-get
RUN apt-get update && apt-get install -y \
@@ -26,7 +26,7 @@ RUN apt-get update && apt-get install -y \
&& apt-get clean

# Install miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh \
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-$(arch).sh -O /tmp/miniconda.sh \
&& bash /tmp/miniconda.sh -b -p /opt/conda \
&& rm /tmp/miniconda.sh

@@ -48,20 +48,20 @@ WORKDIR /workspace/examples/digital_fingerprinting/production
COPY . /workspace/examples/digital_fingerprinting/production/

# Create a conda env with morpheus-dfp and any additional dependencies needed to run the examples
RUN conda env create --solver=libmamba -y --name morpheus-dfp --file ./conda/environments/dfp_example_cuda-125_arch-x86_64.yaml
RUN conda env create --solver=libmamba -y --name morpheus-dfp --file ./conda/environments/dfp_example_cuda-125_arch-$(arch).yaml

ENTRYPOINT [ "/opt/conda/envs/morpheus-dfp/bin/tini", "--", "/workspace/examples/digital_fingerprinting/production/docker/entrypoint.sh" ]

SHELL ["/bin/bash", "-c"]

# ===== Setup for running unattended =====
FROM base AS runtime
FROM --platform=$TARGETPLATFORM base AS runtime

# Launch morpheus
CMD ["./launch.sh"]

# ===== Setup for running Jupyter =====
FROM base AS jupyter
FROM --platform=$TARGETPLATFORM base AS jupyter

# Install the jupyter specific requirements
RUN source activate morpheus-dfp &&\
