Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept events which fail to parse when 'RenderedText' used #46

Open
dvas0004 opened this issue Sep 29, 2023 · 10 comments
Open

Accept events which fail to parse when 'RenderedText' used #46

dvas0004 opened this issue Sep 29, 2023 · 10 comments

Comments

@dvas0004
Copy link
Contributor

There seem to be instances when if --content-format RenderedText is used in a subscription, windows doesn't send properly formatted XML messages. This in turn causes oenwec to return HTTP error 500 back to the source as it fails to parse the XML (as expected) :

image

I double checked and I very much doubt the malformed XML is an openwec problem (i'd be happy to hear otherwise), but if you turn on tracing you'll observe that the underlying hyper server reports receiving a malformed payload - which happens before openwec does much processing:

image

In the above screenshot note how the event is sent with a truncated RenderingInfo XML tag, though it is correct encapsulated in a SOAP event... again leading me to suspect this is a windows problem rather than openwec. This doesnt happen with all events. The same source on the same subscription will send properly formatted events - the issue seems related to those events which contain tabs/newlines/etc in their rendered info (such as most security events)

In this kind of situation, it might be beneficial to accept the event anyway but mark it as malformed, while saving the original content. Like that the source can keep on sending events (rather than stopping at an HTTP 500), and admins can troubleshoot why an event was not accepted.

I realise this is not ideal in all environments since it depends on your use case... in which case maybe the behavior can be controlled by a CLI flag?

I am using this branch to address the above (though the behavior is hardcoded rather than behind a flag as I suggested): https://github.com/CyberSift/openwec/tree/flexible_error

In this branch if the event contains a malformed event it is anyway stored, but we modified the fields stored to contain an error message and the original event content, for example, note the new Error field containing the ErrorMessage and OriginalContent fields:

{
   "System":{
      "Provider":{
         
      },
      "EventID":0,
      "Computer":""
   },
   "OpenWEC":{
      "IpAddress":"127.0.0.1",
      "TimeReceived":"2023-09-29T07:38:13.575037814+00:00",
      "Principal":"demo-client",
      "Subscription":{
         "Uuid":"91E05B32-F8F6-48CF-8AB4-4038233B83AC",
         "Version":"D5FB6FFA-F3FB-4B0D-9655-398EA6114D0F",
         "Name":"my-test-subscription"
      },
      "Errors":{
         "OriginalContent":"<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Provider Name='Microsoft-Windows-Security-Auditing' Guid='{54849625-5478-4994-a5ba-3e3b0328c30d}'/><EventID>5061</EventID><Version>0</Version><Level>0</Level><Task>12290</Task><Opcode>0</Opcode><Keywords>0x8020000000000000</Keywords><TimeCreated SystemTime='2023-09-29T07:25:36.3365627Z'/><EventRecordID>978308</EventRecordID><Correlation ActivityID='{f59bb999-ec5b-0008-f6b9-9bf55becd901}'/><Execution ProcessID='1440' ThreadID='16716'/><Channel>Security</Channel><Computer>xxxxxxxx</Computer><Security/></System><EventData><Data Name='SubjectUserSid'>S-1-5-20</Data><Data Name='SubjectUserName'>xxxxxxxx$</Data><Data Name='SubjectDomainName'>WORKGROUP</Data><Data Name='SubjectLogonId'>0x3e4</Data><Data Name='ProviderName'>Microsoft Software Key Storage Provider</Data><Data Name='AlgorithmName'>ECDH_P384</Data><Data Name='KeyName'>{6DF73E17-F1C9-481A-9B2F-8B16352E5D6E}</Data><Data Name='KeyType'>%%2499</Data><Data Name='Operation'>%%2480</Data><Data Name='ReturnCode'>0x0</Data></EventData><RenderingInfo Culture='en-US'><Message>Cryptographic operation.&#13;&#10;&#13;&#10;Subject:&#13;&#10;&#9;",
         "ErrorMessage":"Failed to parse event XML"
      }
   }
}
@dvas0004
Copy link
Contributor Author

P.S. if anyone can point me in the right direction to actually get WEF to send me full events rather than truncated events I'd be very grateful 😃

@vruello
Copy link
Contributor

vruello commented Sep 29, 2023

Hi @dvas0004!

I haven't come across such a situation 😞 . I do agree that it seems to be more a problem with the Windows event logging system rather than openwec. To confirm this statement and perhaps try to diagnose the problem:

  • Did you try to look at the problematic event(s) in Windows Event Viewer?
  • Did you try to look at the problematic event(s) directly in .evtx files using a parser such as evtx_dump?
  • What version of Windows are you running, with what language packs installed?
  • Have you been able to reproduce the problem on other computers?
  • Did you search for an event in other channels (such as Applications.../Windows/Eventlog) that might be related to this problem?

As you said, not being able to parse an event will "block" the events stream from that machine for this subscription, which is undesirable in most cases. In fact, except for debugging openwec, I can't find any real-world use cases where the current behavior is better than what you're suggesting. I don't think that the event forwarder is going to "correct" the event it's trying to send because it gets a 500... So I don't even think it's necessary to make it configurable.

If you are willing to create a PR, I would rename Errors to Error since it can only represent one error, and also rename ErrorMessage to Message as it is already in an Error object. Also, in case of an error, I think it would be better not to send the System object at all instead of sending it almost empty... What do you think?

By the way, I would recommend using the Raw content format which seems to be more reliable.

dvas0004 added a commit to CyberSift/openwec that referenced this issue Sep 29, 2023
@dvas0004
Copy link
Contributor Author

  • Did you try to look at the problematic event(s) in Windows Event Viewer?

It appears rendered properly:

image

  • Did you try to look at the problematic event(s) directly in .evtx files using a parser such as evtx_dump?

(PS thanks didnt know about this util! 🙂) They are shown correct but without the RenderInfo field... same as when we use OpenWEC's Raw content type. So probably something is going wrong when the windows event forwarding plugin tries to render the event, then wrap it in a SOAP message for forwarding...

  • What version of Windows are you running, with what language packs installed?

windows 11 with US lang. What may be pertinent is that it's windows "home" edition and not joined to a domain

  • Have you been able to reproduce the problem on other computers?

i will have the opportunity of testing this later.. at the moment this happens on just one PC (my dev laptop)

  • Did you search for an event in other channels (such as Applications.../Windows/Eventlog) that might be related to this problem?

no unexpected error messages

If you are willing to create a PR, I would rename Errors to Error since it can only represent one error, and also rename ErrorMessage to Message as it is already in an Error object. Also, in case of an error, I think it would be better not to send the System object at all instead of sending it almost empty... What do you think?

I agree... I modified the System struct to be Option<System>, but I also tried to make a "best effort". Seeing as that you (and I for that matter) have never seen this behavior except for this particular case, I took a stab at "recovering" the event as you can see here:

https://github.com/CyberSift/openwec/blob/ca7e6b5faf0d622d2018fcc3a0c528f544724e07/server/src/event.rs#L130-L154

The resulting "bad" event looks like this after testing:

{
   "System":{
      "Provider":{
         "Name":"Microsoft-Windows-Security-Auditing",
         "Guid":"{54849625-5478-4994-a5ba-3e3b0328c30d}"
      },
      "EventID":4798,
      "Version":0,
      "Level":0,
      "Task":13824,
      "Opcode":0,
      "Keywords":"0x8020000000000000",
      "TimeCreated":"2023-09-29T13:39:08.7234692Z",
      "EventRecordID":980236,
      "Correlation":{
         "ActivityID":"{f59bb999-ec5b-0008-f6b9-9bf55becd901}"
      },
      "Execution":{
         "ProcessID":1440,
         "ThreadID":16952
      },
      "Channel":"Security",
      "Computer":"xxxx_xps"
   },
   "EventData":{
      "SubjectLogonId":"0x3e7",
      "TargetDomainName":"xxxxx_xps",
      "CallerProcessId":"0x28d4",
      "CallerProcessName":"C:\\\\Windows\\\\System32\\\\svchost.exe",
      "TargetUserName":"davev",
      "SubjectDomainName":"WORKGROUP",
      "SubjectUserName":"xxxxx_XPS$",
      "TargetSid":"S-1-5-21-1604529354-1295832394-4197355770-1001",
      "SubjectUserSid":"S-1-5-18"
   },
   "OpenWEC":{
      "IpAddress":"127.0.0.1",
      "TimeReceived":"2023-09-29T14:33:12.574363325+00:00",
      "Principal":"demo-client",
      "Subscription":{
         "Uuid":"91E05B32-F8F6-48CF-8AB4-4038233B83AC",
         "Version":"523D1886-E73E-4A96-A95D-F0326CB282F0",
         "Name":"my-test-subscription"
      },
      "Errors":{
         "OriginalContent":"<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Provider Name='Microsoft-Windows-Security-Auditing' Guid='{54849625-5478-4994-a5ba-3e3b0328c30d}'/><EventID>4798</EventID><Version>0</Version><Level>0</Level><Task>13824</Task><Opcode>0</Opcode><Keywords>0x8020000000000000</Keywords><TimeCreated SystemTime='2023-09-29T13:39:08.7234692Z'/><EventRecordID>980236</EventRecordID><Correlation ActivityID='{f59bb999-ec5b-0008-f6b9-9bf55becd901}'/><Execution ProcessID='1440' ThreadID='16952'/><Channel>Security</Channel><Computer>dvas0004_xps</Computer><Security/></System><EventData><Data Name='TargetUserName'>davev</Data><Data Name='TargetDomainName'>xxxxx_xps</Data><Data Name='TargetSid'>S-1-5-21-1604529354-1295832394-4197355770-1001</Data><Data Name='SubjectUserSid'>S-1-5-18</Data><Data Name='SubjectUserName'>xxxxx_XPS$</Data><Data Name='SubjectDomainName'>WORKGROUP</Data><Data Name='SubjectLogonId'>0x3e7</Data><Data Name='CallerProcessId'>0x28d4</Data><Data Name='CallerProcessName'>C:\\\\Windows\\\\System32\\\\svchost.exe</Data></EventData><RenderingInfo Culture='en-US'><Message>A user's local group membership was enumerated.&#13;&#10;&#13;&#10;Subject:&#13;&#10;&#9;",
         "ErrorMessage":"Failed to parse event XML"
      }
   }
}

if you like this approach i'll issue a PR next week 🙂otherwise i'll revert the behavior to not sending an empty System struct

By the way, I would recommend using the Raw content format which seems to be more reliable.

you are very right... though after a very quick survey of SOC analyst colleagues, they find the rendered field very useful, and they would like to keep it if possible....

@vruello
Copy link
Contributor

vruello commented Sep 30, 2023

It appears rendered properly:

The fact that the Event Viewer formats it correctly is unexpected. I would assume that the code that builds RenderingInfo would be the same in both cases (event viewer and event forwarding), but maybe it isn't, or the problem comes from somewhere else...

windows 11 with US lang. What may be pertinent is that it's windows "home" edition and not joined to a domain

I installed a Windows 11 Home VM, version 22H2 (build 22621.1413) with the US language pack, but I couldn't reproduce the problem...

  • Can you share your Windows build number?
  • I assume that you are using TLS encryption?
  • Can you share your subscription settings (the output of openwec subscriptions show <name>)?

I agree... I modified the System struct to be Option, but I also tried to make a "best effort". Seeing as that you (and I for that matter) have never seen this behavior except for this particular case, I took a stab at "recovering" the event as you can see here

The "best effort" approach is interesting, especially if the problem occurs regularly (does it?). The Event schema specifies that RenderingInfo must be the last "specified" node of an Event (see event.xsd), so we shouldn't miss anything.
However, I think the implementation can be improved:

  • I would prefer to use rsplit_once (https://doc.rust-lang.org/std/string/struct.String.html#method.rsplit_once) to split on the last occurrence of <RenderingInfo, even if it should be the only one (if some other fields contain a "<RenderingInfo" string, it would be encoded as &lt;RenderingInfo). With this in mind, I'm not sure about calling Event::from_str recursively since we should only try it once.
  • Comments should be added to explain why we need to do this sometimes
  • Some debug logging messages would be a great addition
  • We should add some unit tests to cover these cases

@vruello
Copy link
Contributor

vruello commented Sep 30, 2023

What do you think of the following implementation: vruello@92f20d3? I've used the data you provided to create test cases, I hope it's ok for you.

@dvas0004
Copy link
Contributor Author

dvas0004 commented Oct 1, 2023

Can you share your Windows build number?

Edition Windows 11 Home
Version 22H2
OS build 22621.2283

I assume that you are using TLS encryption?

That's right! Though I doubt the issue is here, as there is no indication of HTTP(S) level issues... the event isnt exactly truncated, only the RenderingInfo field is truncated and then wrapped in valid SOAP

Can you share your subscription settings (the output of openwec subscriptions show )?

./target/release/openwec -c /etc/openwec.toml subscriptions show my-test-subscription
Subscription my-test-subscription
        UUID: 91E05B32-F8F6-48CF-8AB4-4038233B83AC
        Version: 523D1886-E73E-4A96-A95D-F0326CB282F0
        URI: Not configured
        Heartbeat interval: 3600s
        Connection retry count: 5
        Connection retry interval: 60s
        Max time without heartbeat/events: 30s
        Max envelope size: 512000 bytes
        ReadExistingEvents: false
        Content format: RenderedText
        Ignore channel error: true
        Principal filter: Not configured
        Outputs:
        - 0: Enabled: true, Format: Json, Output: Redis(RedisConfiguration { addr: "127.0.0.1:6377", list: "wec" })
        - 1: Enabled: true, Format: Raw, Output: Files(FileConfiguration { base: "/data/logs", split_on_addr_index: None, append_node_name: false, filename: "messages" })
        Enabled: true

Event filter query:

<!-- query.xml -->
<QueryList>
    <Query Id="0">
        <Select Path="Application">*</Select>
        <Select Path="Security">*</Select>
        <Select Path="Setup">*</Select>
        <Select Path="System">*</Select>
    </Query>
</QueryList>

Worth noting I see the same issue on both outputs (files and redis)...

What do you think of the following implementation: vruello@92f20d3? I've used the data you provided to create test cases, I hope it's ok for you.

This is phenomenal - nicely done!!!! 💪
It is absolutely fine by me and if you need any other data or any contribution from me let me know!

The "best effort" approach is interesting, especially if the problem occurs regularly (does it?)

On my test setup, almost all the "security" event have this issue, but i suspect it's because these issues tend to have a lot of newlines and tabs in their rendered content. So it happens very often - in fact I came up my initial implementation of the "recoverable error" to see if it affected all event or just some.... (it affects some, but not all)

@vruello
Copy link
Contributor

vruello commented Oct 1, 2023

Could you enable the Applications and Services logs/Microsoft/Windows/Windows Remote Management/Analytic channel and look at the contents of events 771 (which contain the raw payload sent by the forwarder)? In particular, we want to make sure that it is already malformed here :)

This is a bit more work on your end, but it would be interesting to see what happens with an official Windows "wec" server. I would happily do it myself but for now I can't reproduce the problem...

@dvas0004
Copy link
Contributor Author

dvas0004 commented Oct 2, 2023

Could you enable the Applications and Services logs/Microsoft/Windows/Windows Remote Management/Analytic channel and look at the contents of events 771 (which contain the raw payload sent by the forwarder)? In particular, we want to make sure that it is already malformed here :)

Confirming it's already malformed there:

image

image

This is a bit more work on your end, but it would be interesting to see what happens with an official Windows "wec" server. I would happily do it myself but for now I can't reproduce the problem...

time permitting i'll try this and let you know 🙂

@vruello
Copy link
Contributor

vruello commented Nov 22, 2023

Hi @dvas0004! Have you found some time to try and reproduce this behavior with a Windows Server WEC?

@dvas0004
Copy link
Contributor Author

hey @vruello !!!

unfortunately not 😟😟 but please keep this issue on me and I will try especially if I manage to get a test setup up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants