[FR][DAC] Add Support for Custom Schemas #3679
@@ -0,0 +1,44 @@
+# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+# or more contributor license agreements. Licensed under the Elastic License
+# 2.0; you may not use this file except in compliance with the Elastic License
+# 2.0.
+
+"""Custom Schemas management."""
+from pathlib import Path
+
+import eql
+import eql.types
+
+from .config import parse_rules_config
+from .utils import cached
+
+RULES_CONFIG = parse_rules_config()
+
+
+@cached
+def get_custom_schemas(stack_version: str) -> dict:
+    """Load custom schemas if present."""
+    custom_schema_dump = {}
+    stack_schema_map = RULES_CONFIG.stack_schema_map[stack_version]
+
+    for schema, value in stack_schema_map.items():
+        if schema not in ["beats", "ecs", "endgame"]:
+            schema_path = Path(value)
+            if not schema_path.is_absolute():
+                schema_path = RULES_CONFIG.stack_schema_map_file.parent / value
+            if schema_path.is_file():
+                custom_schema_dump.update(eql.utils.load_dump(str(schema_path)))
+            elif schema_path.is_dir():
+                custom_schema_dump.update(load_schemas_from_dir(schema_path))
+
+    return custom_schema_dump
+
+
+def load_schemas_from_dir(schema_dir: Path) -> dict:
+    """Load all schemas from a directory."""
+    schemas_dump = {}
+    for file_path in schema_dir.iterdir():
+        if file_path.is_file() and file_path.suffix in [".json"]:
+            schemas_dump.update(eql.utils.load_dump(str(file_path)))
+
+    return schemas_dump
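To make the loading behaviour of the new file easier to reason about, here is a minimal, self-contained sketch of the same two steps: resolving a relative schema path against a base directory, then merging every `.json` file in a directory. It is not the PR's code. `resolve_schema_path` and `load_json_schemas_from_dir` are hypothetical names, `json.loads` stands in for `eql.utils.load_dump`, the sample schema contents are made up, and the `sorted()` call is an addition (the PR iterates in whatever order `iterdir()` yields).

```python
import json
import tempfile
from pathlib import Path


def resolve_schema_path(value: str, base_dir: Path) -> Path:
    """Return ``value`` unchanged when absolute, else relative to ``base_dir``."""
    path = Path(value)
    return path if path.is_absolute() else base_dir / path


def load_json_schemas_from_dir(schema_dir: Path) -> dict:
    """Merge every top-level ``*.json`` file in ``schema_dir`` into one dict."""
    merged = {}
    # Path.iterdir() yields entries in arbitrary order; sorting is an assumption
    # added here so that overlapping keys are resolved the same way on every run.
    for file_path in sorted(schema_dir.iterdir()):
        if file_path.is_file() and file_path.suffix == ".json":
            merged.update(json.loads(file_path.read_text(encoding="utf-8")))
    return merged


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    schema_dir = resolve_schema_path("custom-schemas", base)
    schema_dir.mkdir()
    # Made-up sample data: two files that define the same top-level key.
    (schema_dir / "a.json").write_text(json.dumps({"custom-*": {"field.one": "keyword"}}))
    (schema_dir / "b.json").write_text(json.dumps({"custom-*": {"field.two": "long"}}))
    (schema_dir / "notes.txt").write_text("ignored")  # non-JSON files are skipped
    merged = load_json_schemas_from_dir(schema_dir)

print(merged)  # {'custom-*': {'field.two': 'long'}}
```

Note that `dict.update` is a shallow merge: when two files share a top-level key, the later file replaces the earlier entry rather than combining the nested fields. Without a stable iteration order, which file wins may differ between runs.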
@@ -18,7 +18,7 @@
 import kql

 from . import ecs, endgame
-from .config import load_current_package_version
+from .config import CUSTOM_RULES_DIR, load_current_package_version
 from .integrations import (get_integration_schema_data,
                            load_integrations_manifests)
 from .rule import (EQLRuleData, QueryRuleData, QueryValidator, RuleMeta,
@@ -192,11 +192,17 @@ def validate_integration(
                 integration_schema_data["integration"],
             )
             integration_schema = integration_schema_data["schema"]
+            stack_version = integration_schema_data["stack_version"]

             # Add non-ecs-schema fields
             for index_name in data.index:
                 integration_schema.update(**ecs.flatten(ecs.get_index_schema(index_name)))

+            # Add custom schema fields for appropriate stack version
+            if data.index and CUSTOM_RULES_DIR:
+                for index_name in data.index:
+                    integration_schema.update(**ecs.flatten(ecs.get_custom_index_schema(index_name, stack_version)))
Comment on lines +201 to +204

nit: This looks identical to lines L396-L400. Might be worth moving to a small method for cleanliness.

I agree it would be better, but both validation methods share a number of nearly identical code blocks, so I think this would be better done in a larger refactor. For instance, another identical block is the one for non-ecs schema fields.
+
             # Add endpoint schema fields for multi-line fields
             integration_schema.update(**ecs.flatten(ecs.get_endpoint_schemas()))
             if integration:
@@ -387,6 +393,11 @@ def validate_integration(self, data: QueryRuleData, meta: RuleMeta,
             for index_name in data.index:
                 integration_schema.update(**ecs.flatten(ecs.get_index_schema(index_name)))

+            # Add custom schema fields for appropriate stack version
+            if data.index and CUSTOM_RULES_DIR:
+                for index_name in data.index:
+                    integration_schema.update(**ecs.flatten(ecs.get_custom_index_schema(index_name, stack_version)))
+
             # add endpoint schema fields for multi-line fields
             integration_schema.update(**ecs.flatten(ecs.get_endpoint_schemas()))
             package_schemas[package].update(**integration_schema)
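The review comment on lines +201 to +204 suggests moving the repeated "flatten and merge per index" block into a small method. One possible shape for such a helper is sketched below. It is only an illustration, not part of the PR: `merge_index_schemas` is a hypothetical name, `demo_flatten` is a stand-in for `ecs.flatten`, and `DEMO_SCHEMAS` is made-up data replacing the real schema lookups.

```python
from typing import Callable, Iterable, Optional


def merge_index_schemas(
    target: dict,
    index_names: Optional[Iterable[str]],
    get_schema: Callable[[str], dict],
    flatten: Callable[[dict], dict],
) -> dict:
    """Flatten the schema of each index and merge it into ``target`` in place."""
    for index_name in index_names or ():
        target.update(flatten(get_schema(index_name)))
    return target


def demo_flatten(schema: dict, prefix: str = "") -> dict:
    """Hypothetical stand-in for ``ecs.flatten``: nested dicts become dotted keys."""
    flat = {}
    for key, value in schema.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(demo_flatten(value, name))
        else:
            flat[name] = value
    return flat


# Made-up custom schemas keyed by index pattern.
DEMO_SCHEMAS = {"custom-*": {"acme": {"user": {"id": "keyword"}}}}

integration_schema = {"host.name": "keyword"}
merge_index_schemas(
    integration_schema,
    ["custom-*", "logs-*"],
    lambda index_name: DEMO_SCHEMAS.get(index_name, {}),
    demo_flatten,
)
print(integration_schema)  # {'host.name': 'keyword', 'acme.user.id': 'keyword'}
```

In the validators, the same helper could presumably cover both the non-ecs block and the custom block by passing a different `get_schema` callable, for example a lambda that closes over `stack_version` for the custom schema lookup. Whether that fits is a question for the larger refactor mentioned in the reply.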
What happens if it is one of these three?

I think it should ignore the schema if it is one of those, as those are reserved "schema words" that are by definition not custom. Do you agree?

Yes, I think that is the smart approach, but won't this return an empty dict in this case? It would be better to raise an error forbidding the use of reserved words. Thoughts?

I think that is a good idea. It will need to allow each of the words once, since we want users to be able to specify beats, etc., or to leave them out if desired. But we would not want them to use a word multiple times, as the result would be confusing 👍

Upon further testing and reflection, I think the test for additional beats, ecs, or endgame schemas should not be in `custom_schemas.py`, as it might be confusing to have the custom schema loader validate non-custom schemas. I think this check would be considered increased validation for the schema map in general. As such, I think it could target main, as the check is not DAC specific.
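For reference, the "allow each reserved word once" rule discussed in this thread could be sketched roughly as follows. This is an assumption-heavy illustration rather than the check that was eventually written: `check_reserved_schemas` is a hypothetical name, the entries are assumed to be available as `(name, value)` pairs before they are collapsed into a dict, the case-insensitive comparison is an added assumption, and the version strings are made up.

```python
from collections import Counter
from typing import Iterable, Tuple

RESERVED_SCHEMAS = ("beats", "ecs", "endgame")


def check_reserved_schemas(entries: Iterable[Tuple[str, str]]) -> None:
    """Raise ValueError if a reserved schema name is declared more than once."""
    # Assumption: names are compared case-insensitively and ignoring whitespace.
    counts = Counter(name.strip().lower() for name, _ in entries)
    repeated = sorted(name for name in RESERVED_SCHEMAS if counts[name] > 1)
    if repeated:
        raise ValueError(
            f"Reserved schema(s) declared more than once: {', '.join(repeated)}"
        )


# Each reserved name at most once, plus a custom entry: accepted silently.
check_reserved_schemas(
    [("beats", "8.12.2"), ("ecs", "8.11.0"), ("custom", "custom-schemas/")]
)

# The same reserved name twice: rejected.
try:
    check_reserved_schemas([("ecs", "8.11.0"), ("ECS", "8.10.0")])
except ValueError as err:
    print(err)  # Reserved schema(s) declared more than once: ecs
```

A check like this would sit with the general schema map validation rather than in the custom schema loader, in line with the last comment above.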