Documentation improvements (#2117)
* Consolidate the `examples/digital_fingerprinting/production/README.md` and `docs/source/developer_guide/guides/5_digital_fingerprinting.md` documents (#2107)
  * Ensure that the `README.md` file refers to the `5_digital_fingerprinting.md` file.
  * Remove redundant build instructions from `5_digital_fingerprinting.md` and instead direct the user to `README.md`.
  * The `README.md` file now documents how to build and run the example.
  * The `5_digital_fingerprinting.md` file now serves as a reference for features and output fields, along with guiding the user for customizing the pipeline.
* Support ARM builds for DFP containers
* Remove DFP documentation regarding helm charts.
* Document the requirement for installing `model-utils` dependency target for the `onnx-to-trt` tool (#2103).
* Update the `onnx-to-trt` import error message to reflect the `model-utils` Conda env file. Rather than logging-and-raising, place the error message directly into the exception; this prevents the message from being lost in the traceback.
* Update the `--seq_length` flag in the `onnx-to-trt` command for converting the phishing model (#2116).
* Replace hard-coded instances of `x86_64` (#2114).
* Add ARM to matrix for the `model-utils` target.
* Add `.cache*` to `.gitignore`, allowing for platform-specific `.cache` directories.
* Ignore verifying anchor tags for github.com; the way github.com handles anchor tags into Markdown files conflicts with the link checker.

Closes #2103
Closes #2107
Closes #2114
Closes #2116


## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #2117
dagardner-nv authored Jan 18, 2025
1 parent cf8a9df commit c25de50
Showing 30 changed files with 108 additions and 233 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -88,7 +88,7 @@ htmlcov/
.tox/
.coverage
.coverage.*
.cache
.cache*
nosetests.xml
coverage.xml
*.cover
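As a quick sanity check of the broadened pattern (sketched with Python's `fnmatch`, which approximates gitignore's glob semantics for this simple case):

```python
from fnmatch import fnmatch

# `.cache*` matches the original `.cache` plus platform-specific variants.
for name in (".cache", ".cache-x86_64", ".cache-aarch64"):
    assert fnmatch(name, ".cache*")

# A bare "cache" (no leading dot) is still not ignored.
assert not fnmatch("cache", ".cache*")
```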
6 changes: 6 additions & 0 deletions ci/release/update-version.sh
@@ -98,11 +98,17 @@ sed_runner 's/'"VERSION ${CURRENT_FULL_VERSION}.*"'/'"VERSION ${NEXT_FULL_VERSIO
examples/developer_guide/3_simple_cpp_stage/CMakeLists.txt \
examples/developer_guide/4_rabbitmq_cpp_stage/CMakeLists.txt

# docs/source/basics/overview.rst
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" docs/source/basics/overview.rst

# docs/source/cloud_deployment_guide.md
sed_runner "s|${CURRENT_SHORT_TAG}.tgz|${NEXT_SHORT_TAG}.tgz|g" docs/source/cloud_deployment_guide.md
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" docs/source/cloud_deployment_guide.md
sed_runner "s|tree/branch-${CURRENT_SHORT_TAG}|tree/branch-${NEXT_SHORT_TAG}|g" docs/source/cloud_deployment_guide.md

# docs/source/developer_guide/guides/5_digital_fingerprinting.md
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" docs/source/developer_guide/guides/5_digital_fingerprinting.md

# docs/source/examples.md
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" docs/source/examples.md
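The `sed_runner` substitutions above amount to the following rewrite, sketched in Python with hypothetical current/next tags (the real script derives these from the release version being set):

```python
import re

# Hypothetical tags for illustration; update-version.sh computes these.
current_short_tag, next_short_tag = "25.02", "25.06"

text = "https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/models/README.md"
# Equivalent of: sed "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g"
updated = re.sub(rf"blob/branch-{re.escape(current_short_tag)}",
                 f"blob/branch-{next_short_tag}", text)
assert "blob/branch-25.06" in updated
```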

26 changes: 26 additions & 0 deletions conda/environments/model-utils_cuda-125_arch-aarch64.yaml
@@ -0,0 +1,26 @@
# This file is generated by `rapids-dependency-file-generator`.
# To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
channels:
- conda-forge
- huggingface
- rapidsai
- rapidsai-nightly
- nvidia
- nvidia/label/dev
- pytorch
dependencies:
- cuml=24.10.*
- jupyterlab
- matplotlib
- onnx
- pandas
- pip
- python=3.10
- scikit-learn=1.3.2
- seaborn
- seqeval=1.2.2
- transformers=4.36.2
- xgboost
- pip:
- tensorrt-cu12
name: model-utils_cuda-125_arch-aarch64
2 changes: 1 addition & 1 deletion dependencies.yaml
@@ -164,7 +164,7 @@ files:
output: conda
matrix:
cuda: ["12.5"]
arch: [x86_64]
arch: [x86_64, aarch64]
includes:
- model-training-tuning
- python
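Adding `aarch64` to the matrix fans the generator out to one environment file per (cuda, arch) pair. Roughly (a sketch; the actual files are produced by `rapids-dependency-file-generator`, with the naming inferred from the generated `model-utils_cuda-125_arch-aarch64.yaml` in this commit):

```python
from itertools import product

# One generated environment file per (cuda, arch) combination.
cudas, arches = ["12.5"], ["x86_64", "aarch64"]
files = [f"model-utils_cuda-{c.replace('.', '')}_arch-{a}.yaml"
         for c, a in product(cudas, arches)]
assert "model-utils_cuda-125_arch-aarch64.yaml" in files
```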
9 changes: 9 additions & 0 deletions docs/source/basics/overview.rst
@@ -107,6 +107,15 @@ queried in the same manner:
--max_workspace_size INTEGER [default: 16000]
--help Show this message and exit.
ONNX To TensorRT
----------------
The ONNX to TensorRT (TRT) conversion utility requires additional packages, which can be installed using the following command:
```bash
conda env update --solver=libmamba -n morpheus --file conda/environments/model-utils_cuda-125_arch-$(arch).yaml
```

Example usage of the ONNX to TRT conversion utility can be found in `models/README.md <https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/models/README.md#generating-trt-models-from-onnx>`_.
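The `$(arch)` in the command above expands to the host's machine architecture; the equivalent selection in Python (a sketch, assuming the two supported architectures):

```python
import platform

# `$(arch)` in the shell resolves to e.g. "x86_64" or "aarch64".
arch = platform.machine()
env_file = f"conda/environments/model-utils_cuda-125_arch-{arch}.yaml"
```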

AutoComplete
------------

5 changes: 4 additions & 1 deletion docs/source/conf.py
@@ -193,13 +193,16 @@
# Config linkcheck
# Ignore localhost and url prefix fragments
# Ignore openai.com links, as these always report a 403 when requested by the linkcheck agent
# The way Github handles anchors into markdown files is not compatible with the way linkcheck handles them.
# This allows us to continue to verify that the links are valid, but ignore the anchors.
linkcheck_ignore = [
r'http://localhost:\d+/',
r'https://localhost:\d+/',
r'^http://$',
r'^https://$',
r'https://(platform\.)?openai.com',
r'https://code.visualstudio.com'
r'https://code.visualstudio.com',
r"^https://github.com/nv-morpheus/Morpheus/blob/.*#.+$"
]
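To illustrate the new ignore entry: the pattern skips only `blob/` links that carry an anchor fragment, while plain file links are still verified by linkcheck:

```python
import re

pattern = r"^https://github.com/nv-morpheus/Morpheus/blob/.*#.+$"

with_anchor = ("https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/"
               "models/README.md#generating-trt-models-from-onnx")
without_anchor = ("https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/"
                  "models/README.md")

assert re.match(pattern, with_anchor)        # ignored by linkcheck
assert not re.match(pattern, without_anchor)  # still checked
```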

# Add any paths that contain templates here, relative to this directory.
14 changes: 7 additions & 7 deletions docs/source/developer_guide/contributing.md
@@ -159,11 +159,11 @@ Morpheus provides multiple Conda environment files to support different workflow
The following are the available Conda environment files, all are located in the `conda/environments` directory, with the following naming convention: `<environment>_<cuda_version>_arch-<architecture>.yaml`.
| Environment | File | Description |
| --- | --- | --- |
| `all` | `all_cuda-125_arch-x86_64.yaml` | All dependencies required to build, run and test Morpheus, along with all of the examples. This is a superset of the `dev`, `runtime` and `examples` environments. |
| `dev` | `dev_cuda-125_arch-x86_64.yaml` | Dependencies required to build, run and test Morpheus. This is a superset of the `runtime` environment. |
| `examples` | `examples_cuda-125_arch-x86_64.yaml` | Dependencies required to run all examples. This is a superset of the `runtime` environment. |
| `model-utils` | `model-utils_cuda-125_arch-x86_64.yaml` | Dependencies required to train models independent of Morpheus. |
| `runtime` | `runtime_cuda-125_arch-x86_64.yaml` | Minimal set of dependencies strictly required to run Morpheus. |
| `all` | `all_cuda-125_arch-<arch>.yaml` | All dependencies required to build, run and test Morpheus, along with all of the examples. This is a superset of the `dev`, `runtime` and `examples` environments. |
| `dev` | `dev_cuda-125_arch-<arch>.yaml` | Dependencies required to build, run and test Morpheus. This is a superset of the `runtime` environment. |
| `examples` | `examples_cuda-125_arch-<arch>.yaml` | Dependencies required to run all examples. This is a superset of the `runtime` environment. |
| `model-utils` | `model-utils_cuda-125_arch-<arch>.yaml` | Dependencies required to train models independent of Morpheus. |
| `runtime` | `runtime_cuda-125_arch-<arch>.yaml` | Minimal set of dependencies strictly required to run Morpheus. |
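The naming convention described above can be sketched as follows (`env_file` is a hypothetical helper for illustration, not part of Morpheus):

```python
# Hypothetical helper illustrating the
# `<environment>_<cuda_version>_arch-<architecture>.yaml` convention.
def env_file(environment: str, cuda: str = "cuda-125", arch: str = "x86_64") -> str:
    return f"{environment}_{cuda}_arch-{arch}.yaml"

assert env_file("dev") == "dev_cuda-125_arch-x86_64.yaml"
assert env_file("model-utils", arch="aarch64") == "model-utils_cuda-125_arch-aarch64.yaml"
```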


##### Updating Morpheus Dependencies
@@ -200,11 +200,11 @@ When ready, commit both the changes to the `dependencies.yaml` file and the upda
```
1. Create the Morpheus Conda environment using either the `dev` or `all` environment file. Refer to the [Conda Environment YAML Files](#conda-environment-yaml-files) section for more information.
```bash
conda env create --solver=libmamba -n morpheus --file conda/environments/dev_cuda-125_arch-x86_64.yaml
conda env create --solver=libmamba -n morpheus --file conda/environments/dev_cuda-125_arch-$(arch).yaml
```
or
```bash
conda env create --solver=libmamba -n morpheus --file conda/environments/all_cuda-125_arch-x86_64.yaml
conda env create --solver=libmamba -n morpheus --file conda/environments/all_cuda-125_arch-$(arch).yaml
```

144 changes: 4 additions & 140 deletions docs/source/developer_guide/guides/5_digital_fingerprinting.md
@@ -22,8 +22,11 @@ Every account, user, service, and machine has a digital fingerprint that represe

To construct this digital fingerprint, we will be training unsupervised behavioral models at various granularities, including a generic model for all users in the organization along with fine-grained models for each user to monitor their behavior. These models are continuously updated and retrained over time​, and alerts are triggered when deviations from normality occur for any user​.

## Running the DFP Example
Instructions for building and running the DFP example are available in the [`examples/digital_fingerprinting/production/README.md`](https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/examples/digital_fingerprinting/production/README.md) guide in the Morpheus repository.

## Training Sources
The data we will want to use for the training and inference will be any sensitive system that the user interacts with, such as VPN, authentication and cloud services. The digital fingerprinting example (`examples/digital_fingerprinting/README.md`) included in Morpheus ingests logs from [Azure Active Directory](https://docs.microsoft.com/en-us/azure/active-directory/reports-monitoring/concept-sign-ins), and [Duo Authentication](https://duo.com/docs/adminapi).
The data we will want to use for the training and inference will be any sensitive system that the user interacts with, such as VPN, authentication and cloud services. The digital fingerprinting example ([`examples/digital_fingerprinting/production/README.md`](https://github.com/nv-morpheus/Morpheus/blob/branch-25.02/examples/digital_fingerprinting/production/README.md)) included in Morpheus ingests logs from [Azure Active Directory](https://docs.microsoft.com/en-us/azure/active-directory/reports-monitoring/concept-sign-ins), and [Duo Authentication](https://duo.com/docs/adminapi).

The location of these logs could be either local to the machine running Morpheus, a shared file system like NFS, or on a remote store such as [Amazon S3](https://aws.amazon.com/s3/).

@@ -131,145 +134,6 @@ The reference architecture is composed of the following services:
| `morpheus_pipeline` | Used for executing both training and inference pipelines |
| `fetch_data` | Downloads the example datasets for the DFP example |

### Running via `docker-compose`
#### System requirements
* [Docker](https://docs.docker.com/get-docker/) and [docker-compose](https://docs.docker.com/compose/) installed on the host machine​
* Supported GPU with [NVIDIA Container Toolkit​](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)

> **Note:** For GPU Requirements refer to the [Getting Started](../../getting_started.md#requirements) guide.
#### Building the services
From the root of the Morpheus repo, run:
```bash
cd examples/digital_fingerprinting/production
export MORPHEUS_CONTAINER_VERSION="$(git describe --tags --abbrev=0)-runtime"
docker compose build
```

> **Note:** This requires version 1.28.0 or higher of Docker Compose, and preferably v2. If you encounter an error similar to:
>
> ```
> ERROR: The Compose file './docker-compose.yml' is invalid because:
> services.jupyter.deploy.resources.reservations value Additional properties are not allowed ('devices' was
> unexpected)
> ```
>
> This is most likely due to using an older version of the `docker-compose` command; instead, re-run the build with `docker compose`. Refer to [Migrate to Compose V2](https://docs.docker.com/compose/migrate/) for more information.
#### Downloading the example datasets
First, we will need to install additional requirements into the Conda environment, then run the `examples/digital_fingerprinting/fetch_example_data.py` script. This will download the example data into the `examples/data/dfp` dir.
The script can be run from within the `fetch_data` Docker Compose service, or from within a Conda environment on the host machine.
##### Docker Compose Service Method
This approach has the advantage of not requiring any additional setup on the host machine. From the `examples/digital_fingerprinting/production` dir run:
```bash
docker compose up fetch_data
```
##### Conda Environment Method
This approach is useful for users who have already set up a Conda environment on their host machine, and has the advantage that the downloaded data will be owned by the host user.

If a Conda environment has already been created, it can be updated by running the following command from the root of the Morpheus repo:
```bash
conda env update --solver=libmamba \
-n ${CONDA_DEFAULT_ENV} \
--file ./conda/environments/examples_cuda-125_arch-x86_64.yaml
```

If a Conda environment has not been created, it can be created by running the following command from the root of the Morpheus repo:
```bash
conda env create --solver=libmamba \
-n morpheus \
--file ./conda/environments/all_cuda-125_arch-x86_64.yaml
```

Once the Conda environment has been updated or created, fetch the data with the following command:
```bash
python examples/digital_fingerprinting/fetch_example_data.py all
```

#### Running the services
##### Jupyter Server
From the `examples/digital_fingerprinting/production` dir run:
```bash
docker compose up jupyter
```

Once the build is complete and the service has started, a message similar to the following should display:
```
jupyter | To access the server, open this file in a browser:
jupyter | file:///root/.local/share/jupyter/runtime/jpserver-7-open.html
jupyter | Or copy and paste one of these URLs:
jupyter | http://localhost:8888/lab?token=<token>
jupyter | or http://127.0.0.1:8888/lab?token=<token>
```

Copy and paste the URL into a web browser. There are four notebooks included with the DFP example:
* `dfp_azure_training.ipynb` - Training pipeline for Azure Active Directory data
* `dfp_azure_inference.ipynb` - Inference pipeline for Azure Active Directory data
* `dfp_duo_training.ipynb` - Training pipeline for Duo Authentication
* `dfp_duo_inference.ipynb` - Inference pipeline for Duo Authentication

> **Note:** The token in the URL is a one-time use token and a new one is generated with each invocation.
##### Morpheus Pipeline
By default, the `morpheus_pipeline` service runs the training pipeline for Duo data. From the `examples/digital_fingerprinting/production` dir, run:
```bash
docker compose up morpheus_pipeline
```

If instead you want to run a different pipeline, then from the `examples/digital_fingerprinting/production` dir run:
```bash
docker compose run morpheus_pipeline bash
```


From the prompt within the `morpheus_pipeline` container, you can run either the `dfp_azure_pipeline.py` or `dfp_duo_pipeline.py` pipeline scripts.
```bash
python dfp_azure_pipeline.py --help
python dfp_duo_pipeline.py --help
```

Both scripts are capable of running either a training or inference pipeline for their respective data sources. The command-line options for both are the same:
| Flag | Type | Description |
| ---- | ---- | ----------- |
| `--train_users` | One of: `all`, `generic`, `individual`, `none` | Indicates whether to train per-user models or a generic model for all users. Selecting `none` runs the inference pipeline. |
| `--skip_user` | TEXT | User IDs to skip. Mutually exclusive with `only_user` |
| `--only_user` | TEXT | Only users specified by this option will be included. Mutually exclusive with `skip_user` |
| `--start_time` | TEXT | The start of the time window; if undefined, `start_time` will be `now()-duration` |
| `--duration` | TEXT | The duration to run starting from `start_time` [default: `60d`] |
| `--cache_dir` | TEXT | The location to cache data such as S3 downloads and pre-processed data [environment variable: `DFP_CACHE_DIR`; default: `./.cache/dfp`] |
| `--log_level` | One of: `CRITICAL`, `FATAL`, `ERROR`, `WARN`, `WARNING`, `INFO`, `DEBUG` | Specify the logging level to use. [default: `WARNING`] |
| `--sample_rate_s` | INTEGER | Minimum time step, in milliseconds, between object logs. [environment variable: `DFP_SAMPLE_RATE_S`; default: 0] |
| `-f`, `--input_file` | TEXT | List of files to process. Can specify multiple arguments for multiple files. Also accepts glob (*) wildcards and schema prefixes such as `s3://`. For example, to make a local cache of an s3 bucket, use `filecache::s3://mybucket/*`. Refer to [`fsspec` documentation](https://filesystem-spec.readthedocs.io/en/latest/api.html?highlight=open_files#fsspec.open_files) for list of possible options. |
| `--watch_inputs` | FLAG | Instructs the pipeline to continuously check the paths specified by `--input_file` for new files. This assumes that at least one of the paths contains a wildcard. |
| `--watch_interval` | FLOAT | Amount of time, in seconds, to wait between checks for new files. Only used if `--watch_inputs` is set. [default: `1.0`] |
| `--tracking_uri` | TEXT | The MLflow tracking URI to connect to. [default: `http://localhost:5000`] |
| `--help` | | Show this message and exit. |
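The `--start_time`/`--duration` defaulting described in the table works out to the following (a sketch; parsing of duration strings like `60d` is handled by the pipeline scripts themselves):

```python
from datetime import datetime, timedelta

# Default window per the table: start_time = now() - duration, duration = 60d.
duration = timedelta(days=60)
start_time = datetime.now() - duration
window = (start_time, start_time + duration)
assert window[1] - window[0] == timedelta(days=60)
```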


To run the DFP pipelines with the example datasets within the container, run:

* Duo Training Pipeline
```bash
python dfp_duo_pipeline.py --train_users=all --start_time="2022-08-01" --input_file="/workspace/examples/data/dfp/duo-training-data/*.json"
```

* Duo Inference Pipeline
```bash
python dfp_duo_pipeline.py --train_users=none --start_time="2022-08-30" --input_file="/workspace/examples/data/dfp/duo-inference-data/*.json"
```

* Azure Training Pipeline
```bash
python dfp_azure_pipeline.py --train_users=all --start_time="2022-08-01" --input_file="/workspace/examples/data/dfp/azure-training-data/*.json"
```

* Azure Inference Pipeline
```bash
python dfp_azure_pipeline.py --train_users=none --start_time="2022-08-30" --input_file="/workspace/examples/data/dfp/azure-inference-data/*.json"
```

##### Output Fields
The output files will contain those logs from the input dataset for which an anomaly was detected; this is determined by the z-score in the `mean_abs_z` field. By default, any logs with a z-score of 2.0 or higher are considered anomalous. Refer to [`DFPPostprocessingStage`](6_digital_fingerprinting_reference.md#post-processing-stage-dfppostprocessingstage).
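The filtering described above amounts to the following (a sketch with made-up rows; the real `DFPPostprocessingStage` operates on the pipeline's DataFrame):

```python
# Keep only logs whose mean absolute z-score marks them anomalous;
# the 2.0 threshold matches the default described above.
rows = [
    {"user": "alice", "mean_abs_z": 0.7},
    {"user": "bob", "mean_abs_z": 3.1},
]
anomalies = [r for r in rows if r["mean_abs_z"] >= 2.0]
assert [r["user"] for r in anomalies] == ["bob"]
```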

2 changes: 1 addition & 1 deletion examples/abp_nvsmi_detection/README.md
@@ -63,7 +63,7 @@ This example can be easily applied to datasets generated from your own NVIDIA GP

pyNVML is not installed by default, use the following command to install it:
```bash
conda env update --solver=libmamba -n morpheus --file conda/environments/examples_cuda-125_arch-x86_64.yaml
conda env update --solver=libmamba -n morpheus --file conda/environments/examples_cuda-125_arch-$(arch).yaml
```

Run the following to start generating your dataset:
2 changes: 1 addition & 1 deletion examples/developer_guide/3_simple_cpp_stage/README.md
@@ -21,5 +21,5 @@ limitations under the License.
|-------------|-----------|-------|
| Conda || |
| Morpheus Docker Container || |
| Morpheus Release Container || Requires adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-125_arch-x86_64.yaml` |
| Morpheus Release Container || Requires adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-125_arch-$(arch).yaml` |
| Dev Container || |
2 changes: 1 addition & 1 deletion examples/developer_guide/4_rabbitmq_cpp_stage/README.md
@@ -23,7 +23,7 @@ This example builds upon the `examples/developer_guide/2_2_rabbitmq` example add
|-------------|-----------|-------|
| Conda || |
| Morpheus Docker Container || Requires launching the RabbitMQ container on the host |
| Morpheus Release Container || Requires launching the RabbitMQ container on the host, and adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-125_arch-x86_64.yaml` |
| Morpheus Release Container || Requires launching the RabbitMQ container on the host, and adding development packages to the container's Conda environment via `conda env update --solver=libmamba -n morpheus --file /workspace/conda/environments/dev_cuda-125_arch-$(arch).yaml` |
| Dev Container || |

## Installing Pika
10 changes: 5 additions & 5 deletions examples/digital_fingerprinting/production/Dockerfile
@@ -16,7 +16,7 @@
ARG BASE_IMG=nvcr.io/nvidia/cuda
ARG BASE_IMG_TAG=12.5.1-base-ubuntu22.04

FROM ${BASE_IMG}:${BASE_IMG_TAG} AS base
FROM --platform=$TARGETPLATFORM ${BASE_IMG}:${BASE_IMG_TAG} AS base

# Install necessary dependencies using apt-get
RUN apt-get update && apt-get install -y \
@@ -26,7 +26,7 @@ RUN apt-get update && apt-get install -y \
&& apt-get clean

# Install miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh \
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-$(arch).sh -O /tmp/miniconda.sh \
&& bash /tmp/miniconda.sh -b -p /opt/conda \
&& rm /tmp/miniconda.sh

@@ -48,20 +48,20 @@ WORKDIR /workspace/examples/digital_fingerprinting/production
COPY . /workspace/examples/digital_fingerprinting/production/

# Create a conda env with morpheus-dfp and any additional dependencies needed to run the examples
RUN conda env create --solver=libmamba -y --name morpheus-dfp --file ./conda/environments/dfp_example_cuda-125_arch-x86_64.yaml
RUN conda env create --solver=libmamba -y --name morpheus-dfp --file ./conda/environments/dfp_example_cuda-125_arch-$(arch).yaml

ENTRYPOINT [ "/opt/conda/envs/morpheus-dfp/bin/tini", "--", "/workspace/examples/digital_fingerprinting/production/docker/entrypoint.sh" ]

SHELL ["/bin/bash", "-c"]

# ===== Setup for running unattended =====
FROM base AS runtime
FROM --platform=$TARGETPLATFORM base AS runtime

# Launch morpheus
CMD ["./launch.sh"]

# ===== Setup for running Jupyter =====
FROM base AS jupyter
FROM --platform=$TARGETPLATFORM base AS jupyter

# Install the jupyter specific requirements
RUN source activate morpheus-dfp &&\
