Skip to content

Conversation

@AsuNa-jp
Copy link
Contributor

@AsuNa-jp AsuNa-jp commented May 30, 2024

Change Summary

This PR adds new metrics fields for the following events. Furthermore, I added the missing fields to sample_event.json.

  • ETW WMI Activity Provider events
    • Endpoint.metrics.system_impact.wmi_events.week_idle_ms
    • Endpoint.metrics.system_impact.wmi_events.week_ms
  • Credential access events
    • Endpoint.metrics.system_impact.cred_access_events.week_idle_ms
    • Endpoint.metrics.system_impact.cred_access_events.week_ms
  • Ransomware events
    • Endpoint.metrics.system_impact.ransomware.week_idle_ms
    • Endpoint.metrics.system_impact.ransomware.week_ms
  • Process Injection events
    • Endpoint.metrics.system_impact.process_injection.week_idle_ms
    • Endpoint.metrics.system_impact.process_injection.week_ms
  • Memory scan events
    • Endpoint.metrics.system_impact.memory_scan.week_idle_ms
    • Endpoint.metrics.system_impact.memory_scan.week_ms
  • Behavior protection events
    • Endpoint.metrics.system_impact.behavior_protection.week_idle_ms
    • Endpoint.metrics.system_impact.behavior_protection.week_ms
  • Diagnostic behavior protection events
    • Endpoint.metrics.system_impact.diagnostic_behavior_protection.week_idle_ms
    • Endpoint.metrics.system_impact.diagnostic_behavior_protection.week_ms
  • ETW TCP/IP Provider events
    • Endpoint.metrics.system_impact.tcpip_events.week_idle_ms
    • Endpoint.metrics.system_impact.tcpip_events.week_ms
  • Attack surface events
    • Endpoint.metrics.system_impact.attack_surface_events.week_idle_ms
    • Endpoint.metrics.system_impact.attack_surface_events.week_ms
  • Volume device events  (CC: @Trinity2019 )
    • Endpoint.metrics.system_impact.volume_device_events.week_idle_ms
    • Endpoint.metrics.system_impact.volume_device_events.week_ms
    • Endpoint.metrics.documents_volume.volume_device_events.sent_bytes
    • Endpoint.metrics.documents_volume.volume_device_events.sent_count
    • Endpoint.metrics.documents_volume.volume_device_events.suppressed_bytes
    • Endpoint.metrics.documents_volume.volume_device_events.suppressed_count

Sample values (Example No.1)

                    "cred_access_events": {
                        "week_idle_ms": 0,
                        "week_ms": 20
                    },
                    "wmi_events": {
                        "week_idle_ms": 0,
                        "week_ms": 10
                    },
                    "process_injection": {
                        "week_idle_ms": 0,
                        "week_ms": 100
                    },
                    "memory_scan": {
                        "week_idle_ms": 0,
                        "week_ms": 30
                    },
                    "behavior_protection": {
                        "week_idle_ms": 0,
                        "week_ms": 40
                    },
                    "diagnostic_behavior_protection": {
                        "week_idle_ms": 0,
                        "week_ms": 50
                    },
                    "tcpip_events": {
                        "week_idle_ms": 0,
                        "week_ms": 10
                    },
                    "volume_device_events": {
                        "week_idle_ms": 0,
                        "week_ms": 20
                    },
                    "attack_surface_events": {
                        "week_idle_ms": 0,
                        "week_ms": 10
                    }

Sample values (Example No.2)

                "volume_device_events": {
                    "suppressed_count": 0,
                    "suppressed_bytes": 0,
                    "sent_count": 2,
                    "sent_bytes": 3783
                },

Release Target

8.15.

Q/A

For mapping changes:

  • I ran make after making the schema changes, and committed all changes
  • If these field(s) are "exception"-able, I made a companion PR to Kibana adding it (see Readme)
  • If this is a metadata change, I also updated both transform destination schemas to match

@AsuNa-jp AsuNa-jp requested a review from a team as a code owner May 30, 2024 06:28
@AsuNa-jp AsuNa-jp requested review from szwarckonrad and tomsonpl May 30, 2024 06:28
@AsuNa-jp AsuNa-jp self-assigned this May 30, 2024
- name: metrics.system_impact.wmi_events.week_idle_ms
level: custom
type: unsigned_long
index: false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these will be shipped as index: false and doc_values: false

which is roughly the same as not mapping them at all. They will be present in _source, but not indexed or searchable.

Generally, one only needs to update the mappings in the package when a field needs to be indexed for search. Everything else is freely returned in _source as it was sent in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pzl

Thank you for pointing out that setting index: false and doc_values: false will make no mappings.

However, on the endpoint side, if these fields are not in the endpoint-package, an error occurs as follows. Therefore, although I do not intend to map them, I would like to add them to the endpoint-package.

Endpoint.metrics.system_impact.keyboard_events.week_idle_ms - Did not pass ecs_type cast check: Term not found in validation schema
Endpoint.metrics.system_impact.keyboard_events.week_ms - Did not pass ecs_type cast check: Term not found in validation schema
2023-08-23 02:40:25.051 - ERROR: es_client.__check_schema:241  Document is not valid ECS:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are field limits, caps, and performance concerns with adding a lot of fields, and not a lot of benefit if those fields are then not actually indexed.

The validation errors you posted sound like an external (to this repo) validation tool that has built-in the assumption that documents sent by Endpoint must be fully-mapped here in the endpoint package. On the longer term, it sounds like that assumption needs to be challenged, and we find a way to describe or validate fields like this, when it does not make sense to add them to the package mappings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR - I suggest that we don't merge this PR.

☝️ @nfritts It looks like our current design is having downstream impact. We might need to address this sooner than previously planned.

@pzl We're considering this internally also, but needed to consider the ramifications of making a breaking change.

Option 1 -
We just leave these fields as permanently unmapped. It sounds like this might not actually be a problem - just a misunderstanding?
We could even consider removing some of the existing non-indexed mappings - since they appear to be doing more harm than good?

Option 2 -
We design and implement a new approach to these metrics.
And include an advanced policy flag to include the old-style fields in the document also - but have it off by default (and mentioned in the release notes).

The original design of the detailed system impact metrics document requires new fields for each new metric type. An alternative approach would be to have a array with a type field -

system_impact.detailed: [
    { type: "keyboard_events", week_idle_ms: 12, week_idle_ms: 34 },
    { type: "wmi_events", week_idle_ms: 56, week_idle_ms: 78 }
]

Of note, this PR includes quite a few fields that we discovered we weren't generating impact metrics for after an audit by @gabriellandau. The fact that their absence hasn't caused any issues makes me suspect that a breaking change will actually have little customer impact here.

Copy link
Contributor

@intxgo intxgo Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pzl It's unfortunate that this this was not caught earlier. We can skip validating those fields, but endpoint-package is also a source of public facing custom documentation for endpoint documents. If we don't add them here, these fields won't be documented and that's bad as we need to provide for some customers an exact description of all documents nodes send up the stack from endpoint.

cc @AsuNa-jp @nfritts @ferullo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears I wasn't aware of the discussion https://elastic.slack.com/archives/CP7FEJS4U/p1718197314863479

all seems agreed

@pzl pzl requested review from pzl and removed request for szwarckonrad and tomsonpl May 30, 2024 13:47
@AsuNa-jp
Copy link
Contributor Author

@pzl Thank you for the review!
I have also asked my team members to review this PR. As a result, we are currently having various discussions, and I will update this PR once those discussions are finished.

@AsuNa-jp AsuNa-jp requested review from jdu2600 and removed request for jdu2600 June 7, 2024 13:31
@AsuNa-jp AsuNa-jp changed the title WMI Activity API events metrics WMI Activity API events (and + 7 events) metrics Jun 10, 2024
@AsuNa-jp AsuNa-jp requested a review from jdu2600 June 10, 2024 03:36
@AsuNa-jp AsuNa-jp changed the title WMI Activity API events (and + 7 events) metrics WMI Activity API events (and + 9 events) metrics Jun 12, 2024
@AsuNa-jp AsuNa-jp requested a review from Trinity2019 June 12, 2024 09:03
@AsuNa-jp
Copy link
Contributor Author

@pzl
After discussing with our team, I have added the metrics to the endpoint-package that we previously overlooked. I would appreciate it if you could review this PR again.

Additionally, I have answered above regarding why I set index: false and doc_values: false.

type: long
description: Total size of suppressed documents

- name: metrics.documents_volume.volume_device_events.sent_count
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole section will be indexed, is that intentional?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think volume_device is a production event, as long as the related metrics behavior align with the other production events we're good.

@Trinity2019
Copy link
Contributor

Thanks for adding the volume_device fields, very much appreciated @AsuNa-jp !

@AsuNa-jp AsuNa-jp closed this Jul 1, 2024
@AsuNa-jp AsuNa-jp deleted the 8.15_wmi_events branch October 4, 2024 04:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants