forked from nv-morpheus/Morpheus
Merge morpheus core spear phishing components. (nv-morpheus#1044)
Add Morpheus core spear phishing components.

Authors:
- Devin Robison (https://github.com/drobison00)
- Bhargav Suryadevara (https://github.com/bsuryadevara)

Approvers:
- Michael Demoret (https://github.com/mdemoret-nv)

URL: nv-morpheus#1044
1 parent 7c2db78, commit 0987dfb
Showing 52 changed files with 2,722 additions and 74 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
<!-- | ||
SPDX-FileCopyrightText: Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
SPDX-License-Identifier: Apache-2.0 | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
## SQL Loader | ||
|
||
The [DataLoader](./../../modules/core/data_loader.md) module is configured to use this loader function. The SQL loader | |
fetches data from a SQL database, stores it in a DataFrame, and returns the updated ControlMessage object with the | |
payload as a MessageMeta. | |
|
||
### Example Loader Configuration | ||
|
||
```json | ||
{ | ||
"loaders": [ | ||
{ | ||
"id": "SQLLoader" | ||
} | ||
] | ||
} | ||
``` | ||
|
||
**Note**: Loaders can receive configuration from the `load` task via a ControlMessage at runtime. | |
|
||
### Task Configurable Parameters | ||
|
||
The following parameters can be configured for this loader at the load task level: | |
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|--------------|------------|------------------------------------------|--------------------|---------------| | ||
| `strategy` | string | Strategy for combining queries | "aggregate" | `aggregate` | | ||
| `loader_id`  | string     | Unique identifier for the loader         | "SQLLoader"        | `[Required]`  | | |
| `sql_config` | dictionary | Dictionary containing SQL queries to run | {"queries": [...]} | `See below`   | | |
|
||
`sql_config` | ||
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|-----------|------|---------------------------------------------------|--------------------------------------------|---------------| | ||
| `queries` | list | List of dictionaries composing a query definition | "[query_dict_1, ..., query_dict_n]" | `See below` | | ||
|
||
`queries` | ||
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|---------------------|------------|--------------------------------------|-----------------------------------------------------------------|---------------| | ||
| `connection_string` | string     | Connection string for the database to query | "postgresql://postgres:postgres@localhost:5432/postgres" | `[Required]` | | |
| `query`             | string     | SQL query to execute                 | "SELECT * FROM test_table WHERE id IN (?, ?, ?)"                | `[Required]`  | | |
| `params`            | dictionary or list | Named or positional parameter values | "[foo, bar, baz]"                                       | `-`           | | |
|
||
### Example Load Task Configuration | ||
|
||
The JSON configuration below shows how to pass additional configuration to the loader through a control message task at | |
runtime. | |
|
||
```json | ||
{ | ||
"type": "load", | ||
"properties": { | ||
"loader_id": "SQLLoader", | ||
"strategy": "aggregate", | ||
"sql_config": { | ||
"queries": [ | ||
{ | ||
"connection_string": "postgresql://postgres:postgres@localhost:5431/postgres", | ||
"query": "SELECT * FROM test_table WHERE id IN (?, ?, ?)", | ||
"params": [ | ||
"foo", | ||
"bar", | ||
"baz" | ||
] | ||
} | ||
] | ||
} | ||
} | ||
} | ||
``` |
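The required fields in the tables above can be checked before a task is submitted. The following is a minimal sketch, not part of Morpheus: `validate_sql_load_task` is a hypothetical helper, and the rule that `queries` must be non-empty is an assumption rather than documented behavior.

```python
def validate_sql_load_task(task: dict) -> list:
    """Return a list of problems found in a SQL loader `load` task (empty if none).

    Hypothetical helper: it only checks the fields marked [Required] in the
    tables above and does not connect to any database.
    """
    problems = []
    if task.get("type") != "load":
        problems.append("type must be 'load'")

    props = task.get("properties", {})
    if not props.get("loader_id"):
        problems.append("properties.loader_id is required")

    queries = props.get("sql_config", {}).get("queries", [])
    if not queries:
        # Assumption: a task without queries is treated as an error here.
        problems.append("sql_config.queries must contain at least one query")

    for i, query_def in enumerate(queries):
        for key in ("connection_string", "query"):
            if not query_def.get(key):
                problems.append(f"queries[{i}].{key} is required")
    return problems


task = {
    "type": "load",
    "properties": {
        "loader_id": "SQLLoader",
        "strategy": "aggregate",
        "sql_config": {
            "queries": [
                {
                    "connection_string": "postgresql://postgres:postgres@localhost:5432/postgres",
                    "query": "SELECT * FROM test_table WHERE id IN (?, ?, ?)",
                    "params": ["foo", "bar", "baz"],
                }
            ]
        },
    },
}

print(validate_sql_load_task(task))  # prints []
```

Returning a list of problems instead of raising on the first one lets a caller report every missing field at once.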
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
<!-- | ||
SPDX-FileCopyrightText: Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
SPDX-License-Identifier: Apache-2.0 | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
## Batch Data Payload Module | ||
|
||
This module splits the data payload of an incoming control message into smaller batches according to the specified configuration. | |
|
||
### Configurable Parameters | ||
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|-----------------------------|------------|-----------------------------------|---------------------------------|---------------| | ||
| `max_batch_size` | integer | The maximum size of each batch | 256 | `256` | | ||
| `raise_on_failure` | boolean | Whether to raise an exception if a failure occurs during processing | false | `false` | | ||
| `group_by_columns` | list | The column names to group by when batching | ["col1", "col2"] | `[]` | | ||
| `disable_max_batch_size` | boolean | Whether to disable the `max_batch_size` and only batch by group | false | `false` | | ||
| `timestamp_column_name` | string | The name of the timestamp column | None | `None` | | ||
| `timestamp_pattern` | string | The pattern to parse the timestamp column | None | `None` | | ||
| `period` | string | The period for grouping by timestamp | H | `D` | | ||
|
||
|
||
### Example JSON Configuration | ||
|
||
```json | ||
{ | ||
"max_batch_size": 256, | ||
"raise_on_failure": false, | ||
"group_by_columns": [], | ||
"disable_max_batch_size": false, | ||
"timestamp_column_name": null, | ||
"timestamp_pattern": null, | ||
"period": "D" | ||
} | ||
``` |
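To illustrate how `max_batch_size`, `group_by_columns`, and `disable_max_batch_size` could interact, here is a minimal sketch over plain Python dictionaries. It is not the Morpheus implementation: `batch_rows` is a hypothetical function, rows are assumed to be a list of dicts rather than a DataFrame, and timestamp-based grouping (`timestamp_column_name`, `timestamp_pattern`, `period`) is left out.

```python
from itertools import groupby


def batch_rows(rows, max_batch_size=256, group_by_columns=(), disable_max_batch_size=False):
    """Split a list of row dicts into batches (illustrative only).

    Rows are first grouped by the values of `group_by_columns` (if any), then
    each group is cut into chunks of at most `max_batch_size` rows unless
    `disable_max_batch_size` is set.
    """
    if group_by_columns:
        def key(row):
            return tuple(row[col] for col in group_by_columns)

        # groupby only merges adjacent items, so sort by the same key first.
        groups = [list(g) for _, g in groupby(sorted(rows, key=key), key=key)]
    else:
        groups = [list(rows)]

    batches = []
    for group in groups:
        if not group:
            continue
        if disable_max_batch_size:
            batches.append(group)
        else:
            for start in range(0, len(group), max_batch_size):
                batches.append(group[start:start + max_batch_size])
    return batches


rows = [{"user": "a", "n": i} for i in range(5)] + [{"user": "b", "n": i} for i in range(2)]

sizes = [len(b) for b in batch_rows(rows, max_batch_size=2, group_by_columns=["user"])]
print(sizes)  # prints [2, 2, 1, 2]

sizes = [len(b) for b in batch_rows(rows, group_by_columns=["user"], disable_max_batch_size=True)]
print(sizes)  # prints [5, 2]
```

In this sketch a batch never mixes rows from two groups, which is one plausible reading of "only batch by group" in the parameter table.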
42 changes: 42 additions & 0 deletions
docs/source/modules/examples/spear_phishing/sp_email_enrichment.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
<!-- | ||
SPDX-FileCopyrightText: Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
SPDX-License-Identifier: Apache-2.0 | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
## Spear Phishing Email Enrichment Module | ||
|
||
Module ID: email_enrichment | ||
Module Namespace: morpheus_spear_phishing | ||
|
||
This module performs spear phishing email enrichment. | ||
|
||
### Configurable Parameters | ||
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|--------------------------|------|---------------------------------------------------------------------|------------------------|---------------| | ||
| `sender_sketches` | list | List of sender strings naming sender sketch inputs. | ["sender1", "sender2"] | `[]` | | ||
| `intents` | list | List of intent strings naming computed intent inputs. | ["intent1", "intent2"] | `[]` | | ||
| `raise_on_failure`       | boolean | If true, treat processing errors as pipeline failures.           | false                  | `false`       | | |
| `token_length_threshold` | integer | Minimum token length to use when computing syntax similarity.    | 5                      | `None`        | | |
|
||
### Example JSON Configuration | ||
|
||
```json | ||
{ | ||
"sender_sketches": ["sender1", "sender2"], | ||
"intents": ["intent1", "intent2"], | ||
"raise_on_failure": false, | ||
"token_length_threshold": 5 | ||
} | |
``` |
54 changes: 54 additions & 0 deletions
docs/source/modules/examples/spear_phishing/sp_inference_intent.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
<!-- | ||
SPDX-FileCopyrightText: Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
SPDX-License-Identifier: Apache-2.0 | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
## Inference Intent | ||
|
||
Module ID: infer_email_intent | ||
Module Namespace: morpheus_spear_phishing | ||
|
||
Infers an 'intent' for a given email body. | ||
|
||
### Configurable Parameters | ||
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|--------------------|------|-----------------------------------------|-----------------------|-------------------------| | ||
| `intent` | string | The intent for the model | "classify" | `None` | | ||
| `task` | string | The task for the model | "text-classification" | `"text-classification"` | | ||
| `model_path` | string | The path to the model | "/path/to/model" | `None` | | ||
| `truncation` | boolean | If true, truncates inputs to max_length | true | `true` | | ||
| `max_length` | integer | Maximum length for model input | 512 | `512` | | ||
| `batch_size` | integer | The size of batches for processing | 256 | `256` | | ||
| `feature_col` | string | The feature column to use | "body" | `"body"` | | ||
| `label_col` | string | The label column to use | "label" | `"label"` | | ||
| `device` | integer | The device to run on | 0 | `0` | | ||
| `raise_on_failure` | boolean | If true, raise exceptions on failures | false | `false` | | ||
|
||
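The defaults in the table above can be combined with a partial user configuration. The sketch below is only an illustration: `INTENT_DEFAULTS` and `resolve_intent_config` are hypothetical names, the default values are copied from the table, and rejecting unknown keys is an assumption rather than documented module behavior.

```python
# Default values copied from the parameter table above.
INTENT_DEFAULTS = {
    "intent": None,
    "task": "text-classification",
    "model_path": None,
    "truncation": True,
    "max_length": 512,
    "batch_size": 256,
    "feature_col": "body",
    "label_col": "label",
    "device": 0,
    "raise_on_failure": False,
}


def resolve_intent_config(user_config: dict) -> dict:
    """Fill in documented defaults for any parameter the user did not set."""
    unknown = set(user_config) - set(INTENT_DEFAULTS)
    if unknown:
        # Assumption: misspelled or unsupported keys are reported, not ignored.
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**INTENT_DEFAULTS, **user_config}


config = resolve_intent_config({"intent": "classify", "model_path": "/path/to/model"})
print(config["max_length"], config["feature_col"])  # prints 512 body
```
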
### Example JSON Configuration | ||
|
||
```json | ||
{ | ||
"intent": "classify", | ||
"task": "text-classification", | ||
"model_path": "/path/to/model", | ||
"truncation": true, | ||
"max_length": 512, | ||
"batch_size": 256, | ||
"feature_col": "body", | ||
"label_col": "label", | ||
"device": 0, | ||
"raise_on_failure": false | ||
} | |
``` |
42 changes: 42 additions & 0 deletions
docs/source/modules/examples/spear_phishing/sp_inference_sp_classifier.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
<!-- | ||
SPDX-FileCopyrightText: Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
SPDX-License-Identifier: Apache-2.0 | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
## Spear Phishing Inference Module | ||
|
||
Module ID: inference | ||
Module Namespace: morpheus_spear_phishing | ||
|
||
This module defines a setup for spear-phishing inference. | ||
|
||
### Configurable Parameters | ||
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|------------------------|------|---------------------------------------|--------------------|---------------| | ||
| `tracking_uri` | string | The tracking URI for the model | "/path/to/uri" | `None` | | ||
| `registered_model` | string | The registered model for inference | "model_1" | `None` | | ||
| `input_model_features` | list | The input features for the model | ["feat1", "feat2"] | `[]` | | ||
| `raise_on_failure` | boolean | If true, raise exceptions on failures | false | `false` | | ||
|
||
### Example JSON Configuration | ||
|
||
```json | ||
{ | ||
"tracking_uri": "/path/to/uri", | ||
"registered_model": "model_1", | ||
"input_model_features": ["feat1", "feat2"], | ||
"raise_on_failure": false | ||
} | |
``` |
38 changes: 38 additions & 0 deletions
docs/source/modules/examples/spear_phishing/sp_label_and_score.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
<!-- | ||
SPDX-FileCopyrightText: Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
SPDX-License-Identifier: Apache-2.0 | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
## Spear Phishing Email Scoring Module | ||
|
||
Module ID: label_and_score | ||
Module Namespace: morpheus_spear_phishing | ||
|
||
This module defines a setup for spear-phishing email scoring. | ||
|
||
### Configurable Parameters | ||
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|--------------------|------|---------------------------------------|---------------------------|---------------| | ||
| `scoring_config` | dictionary | The scoring configuration | {"method": "probability"} | `None` | | ||
| `raise_on_failure` | boolean | If true, raise exceptions on failures | false | `false` | | ||
|
||
### Example JSON Configuration | ||
|
||
```json | ||
{ | ||
"scoring_config": {"method": "probability"}, | ||
"raise_on_failure": false | ||
} | |
``` |
38 changes: 38 additions & 0 deletions
docs/source/modules/examples/spear_phishing/sp_preprocessing.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
<!-- | ||
SPDX-FileCopyrightText: Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
SPDX-License-Identifier: Apache-2.0 | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
|
||
## Spear Phishing Inference Pipeline Preprocessing Module | ||
|
||
Module ID: inference_pipeline_preproc | ||
Module Namespace: morpheus_spear_phishing | ||
|
||
This module defines a pre-processing setup for the spear phishing inference pipeline. | ||
|
||
### Configurable Parameters | ||
|
||
| Parameter | Type | Description | Example Value | Default Value | | ||
|--------------------|------|---------------------------------------------------|---------------|---------------| | ||
| `attach_uuid` | boolean | If true, attach a unique identifier to each input | true | `false` | | ||
| `raise_on_failure` | boolean | If true, raise exceptions on failures | false | `false` | | ||
|
||
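As a rough illustration of what `attach_uuid` describes, the sketch below tags each input record with a unique identifier. It is not the module's code: `attach_uuids` is a hypothetical function, records are assumed to be plain dicts, and the `uuid` column name is an assumption.

```python
import uuid


def attach_uuids(records, attach_uuid=False, column="uuid"):
    """Return copies of the records, optionally tagged with a unique identifier.

    The column name "uuid" is a placeholder; the name the module actually uses
    is not stated in this document.
    """
    if not attach_uuid:
        return [dict(r) for r in records]
    return [{**r, column: str(uuid.uuid4())} for r in records]


emails = [{"body": "Please review the attached invoice."}, {"body": "Lunch on Friday?"}]
tagged = attach_uuids(emails, attach_uuid=True)
print(all("uuid" in row for row in tagged))  # prints True
```

The input records are copied rather than modified in place, so the untagged originals remain available to the caller.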
### Example JSON Configuration | ||
|
||
```json | ||
{ | ||
"attach_uuid": false, | ||
"raise_on_failure": false | ||
} | |
``` |