Cherry-pick #23678 to 7.x: [Libbeat][New Processor] XML Decode #24049

Merged Feb 15, 2021 (1 commit)

@P1llus P1llus commented Feb 15, 2021

Cherry-pick of PR #23678 to 7.x branch. Original message:

What does this PR do?

This PR adds 2 components to Libbeat.

  1. A small XML unmarshal helper function in Libbeat's common folder. It lives there so that inputs, not only processors, can use it when needed. For example, I plan to reuse it in http_endpoint to add XML support there as well. Winlogbeat is another Beat that would benefit from it.
    The helper function expects a []byte containing valid XML object(s) and returns them as a struct that reuses the names of the XML object(s).
    It supports lists, arrays, nested fields, and object identifiers (for example <book seq="1">).
    If the XML is not valid, it returns an error message describing why it failed.
    For more information on supported formats, see the included unit test files.

  2. A decode_xml processor. This is the first processor I have written, so please review it carefully. I used the urldecode, mime_type, and decode_json_fields processors as guidelines.
    For the supported objects and more information on supported formats, see the included unit test files.
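
To illustrate the kind of conversion described in item 1, here is a minimal sketch. It is not the code added by this PR: it walks the token stream of Go's standard encoding/xml package and builds nested maps rather than a struct. The function names (decodeXML, decodeElement, addChild) and the conventions (attributes stored as plain keys, character data under "#text", repeated sibling elements collected into a slice) are assumptions made for illustration only. The included unit test files remain the reference for the actual behavior.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"errors"
	"io"
	"strings"
)

// decodeXML converts an XML document into nested maps keyed by element name.
// The conventions used here are illustrative assumptions, not the PR's API.
func decodeXML(data []byte) (map[string]interface{}, error) {
	dec := xml.NewDecoder(bytes.NewReader(data))
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return nil, errors.New("no XML element found")
		}
		if err != nil {
			return nil, err
		}
		// Skip declarations, comments, and whitespace before the root element.
		if start, ok := tok.(xml.StartElement); ok {
			val, err := decodeElement(dec, start)
			if err != nil {
				return nil, err
			}
			return map[string]interface{}{start.Name.Local: val}, nil
		}
	}
}

// decodeElement reads tokens until the end tag that matches start.
func decodeElement(dec *xml.Decoder, start xml.StartElement) (interface{}, error) {
	node := map[string]interface{}{}
	for _, attr := range start.Attr {
		node[attr.Name.Local] = attr.Value
	}
	var text strings.Builder
	for {
		tok, err := dec.Token()
		if err != nil {
			return nil, err // syntax errors and unexpected EOF end up here
		}
		switch t := tok.(type) {
		case xml.StartElement:
			child, err := decodeElement(dec, t)
			if err != nil {
				return nil, err
			}
			addChild(node, t.Name.Local, child)
		case xml.CharData:
			text.Write(t)
		case xml.EndElement:
			s := strings.TrimSpace(text.String())
			if len(node) == 0 {
				return s, nil // leaf element: return its text as a plain string
			}
			if s != "" {
				node["#text"] = s
			}
			return node, nil
		}
	}
}

// addChild stores a child value, turning repeated names into a slice.
func addChild(node map[string]interface{}, name string, child interface{}) {
	switch existing := node[name].(type) {
	case nil:
		node[name] = child
	case []interface{}:
		node[name] = append(existing, child)
	default:
		node[name] = []interface{}{existing, child}
	}
}
```

With this sketch, a document such as <catalog><book seq="1"><title>A</title></book><book seq="2"><title>B</title></book></catalog> decodes to a map in which catalog.book is a two-element slice and each book carries its seq attribute next to its title. Malformed input such as <a><b></a> returns the decoder's syntax error instead of a partial result.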

Why is it important?

This adds the ability to read XML files with file/log inputs and to decode XML strings in existing message fields, for example XML strings embedded in JSON fields.
It also provides a simple helper function that other parts of Beats can reuse to unmarshal XML into a struct.
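
As a rough illustration of the JSON use case above, the following sketch relies only on Go's standard library, not on the helper from this PR. The event and book types and the decodeEmbedded function are hypothetical names chosen for the example. It assumes a log line whose "message" field carries an XML string.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
)

// event mimics a JSON log line whose "message" field holds an XML string.
type event struct {
	Message string `json:"message"`
}

// book is a hypothetical target struct for the embedded XML.
type book struct {
	Seq   string `xml:"seq,attr"`
	Title string `xml:"title"`
}

// decodeEmbedded parses the JSON envelope first, then unmarshals the XML
// string found inside it.
func decodeEmbedded(line []byte) (book, error) {
	var ev event
	if err := json.Unmarshal(line, &ev); err != nil {
		return book{}, err
	}
	var b book
	err := xml.Unmarshal([]byte(ev.Message), &b)
	return b, err
}
```

A fixed struct like book only fits input whose shape is known ahead of time. A generic decoder is the more practical choice when the XML layout varies between events.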

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

* stashing before initial commit

* Initial commit

* updating go.sum

* updating it again

* adding feedback from PR comments and removing expandkeys config entry

* Updating changelog

* removing expanded_keys from allowed fields

* adding new changes based on PR comments, a few more changes remains

* moving the xml decoder to its own subpackage based on PR comments

* reverting back to Target being a string pointer, to be able to differentiate between null and empty string

* Updating certain tests to fit the new ignore_failure and ignore_missing options

* Updating unit test to test with missing field

* updating license headers

* adding benchmark test

* benchmark, now also with allocation results

* updating changelog entry

* removing duplicate Changelog entry

* changing changelog entry name to new name

* Simplify error handling and fix race

$ benchcmp old.txt new.txt
benchmark                                             old ns/op     new ns/op     delta
BenchmarkProcessor_Run/single_object-12               15691         15686         -0.03%
BenchmarkProcessor_Run/nested_and_array_object-12     39673         39098         -1.45%

benchmark                                             old allocs     new allocs     delta
BenchmarkProcessor_Run/single_object-12               158            158            +0.00%
BenchmarkProcessor_Run/nested_and_array_object-12     376            374            -0.53%

benchmark                                             old bytes     new bytes     delta
BenchmarkProcessor_Run/single_object-12               8597          8597          +0.00%
BenchmarkProcessor_Run/nested_and_array_object-12     20310         19798         -2.52%

* internal xml to json implementation

* Use internal xml to json decoder

benchmark                                             old ns/op     new ns/op     delta
BenchmarkProcessor_Run/single_object-12               15686         8051          -48.67%
BenchmarkProcessor_Run/nested_and_array_object-12     39098         20540         -47.47%

benchmark                                             old allocs     new allocs     delta
BenchmarkProcessor_Run/single_object-12               158            75             -52.53%
BenchmarkProcessor_Run/nested_and_array_object-12     374            184            -50.80%

benchmark                                             old bytes     new bytes     delta
BenchmarkProcessor_Run/single_object-12               8597          3520          -59.06%
BenchmarkProcessor_Run/nested_and_array_object-12     19798         7824          -60.48%

* changelog fix

* Update docs

* Add godoc example of xml to json

* updating test name to fit Example naming convention

Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
(cherry picked from commit 6839307)
@P1llus added the [zube]: In Review, backport, Team:Elastic-Agent, Team:Automation, Team:Ingest Management, and Team:Integrations labels Feb 15, 2021
@elasticmachine

Pinging @elastic/integrations (Team:Integrations)

@elasticmachine

Pinging @elastic/agent (Team:Agent)

@elasticmachine

Pinging @elastic/ingest-management (Team:Ingest Management)

@botelastic bot added the needs_team label Feb 15, 2021
@elasticmachine

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@botelastic bot removed the needs_team label Feb 15, 2021
@elasticmachine

💚 Build Succeeded

Build stats

  • Build Cause: Pull request #24049 opened

  • Start Time: 2021-02-15T20:37:17.994+0000

  • Duration: 95 min 13 sec

  • Commit: 454e7d8

Test stats 🧪

Test Results
Failed 0
Passed 46428
Skipped 4799
Total 51227

💚 Flaky test report

Tests succeeded.


@P1llus P1llus merged commit 4e9bc56 into elastic:7.x Feb 15, 2021
@zube zube bot removed the [zube]: Done label May 17, 2021
Labels
backport, Team:Automation, Team:Elastic-Agent, Team:Integrations
3 participants