[Security Solution] Optimizes the index queries to not block the NodeJS event loop #75716
Pinging @elastic/siem (Team:SIEM)
💚 Build Succeeded
spong
left a comment
LGTM! Appreciate the test coverage and detailed comments that ensure the HOT CODE PATHs are easily identified, so other developers know the implications of changes to this part of the code base. Many thanks for all the work here @FrankHassanabad!
…ic#75716)

## Summary

Before this PR you can see event loop block times of:

```ts
formatIndexFields: 7986.884ms
```

After this PR you will see event loop block times of:

```ts
formatIndexFields: 85.012ms
```

within the file:

```ts
x-pack/plugins/security_solution/server/lib/index_fields/elasticsearch_adapter.ts
```

For the GraphQL query of `SourceQuery`/`IndexFields`

This also fixes the issue of `unknown` being returned to the front end by removing code that no longer functions as intended. During testing of this PR, ensure that blank/default and nonexistent indexes within `securitySolution:defaultIndex` still work as expected.

Before, notice the `unknown` instead of the `filebeat-*`:
<img width="733" alt="Screen Shot 2020-08-20 at 4 55 52 PM" src="https://user-images.githubusercontent.com/1151048/90949129-f5047900-e402-11ea-9278-b4c7bf5cd16d.png">

After:
<img width="830" alt="Screen Shot 2020-08-20 at 4 56 03 PM" src="https://user-images.githubusercontent.com/1151048/90949133-02b9fe80-e403-11ea-8504-f5bbe043048a.png">

An explanation of how to see the block times for before and after
---

For perf testing you first add timed testing to the file:

```ts
x-pack/plugins/security_solution/server/lib/index_fields/elasticsearch_adapter.ts
```

Before this PR, around line 42:

```ts
console.time('formatIndexFields'); // <--- start timer
const fields = formatIndexFields(
  responsesIndexFields,
  Object.keys(indexesAliasIndices) as IndexAlias[]
);
console.timeEnd('formatIndexFields'); // <--- outputs the end timer
return fields;
```

After this PR, around line 42:

```ts
console.time('formatIndexFields'); // <--- start timer
const fields = await formatIndexFields(responsesIndexFields, indices);
console.timeEnd('formatIndexFields'); // <--- outputs the end timer
return fields;
```

And then reload the security solutions application web page here:

```
http://localhost:5601/app/security/timelines/default
```

Be sure to load it _twice_ for testing, as NodeJS will sometimes report better numbers the second time because it optimizes some code paths after the first time it encounters them.

You will begin to see numbers similar to this before this PR:

```ts
formatIndexFields: 2553.279ms
```

This indicates that it is blocking the event loop for ~2.5 seconds before this fix. If you add indexes with additional fields to your `securitySolution:defaultIndex` setting, this amount will increase sharply. For developers using our test servers, I created two other indexes called delme-1 and delme-2 with additional mappings, which you can add like below:

```ts
apm-*-transaction*, auditbeat-*, endgame-*, filebeat-*, logs-*, packetbeat-*, winlogbeat-*, delme-1, delme-2
```

<img width="980" alt="Screen Shot 2020-08-21 at 8 21 50 PM" src="https://user-images.githubusercontent.com/1151048/90949142-211ffa00-e403-11ea-8ab2-f66de977dce3.png">

Then you are going to see times approaching 8 seconds of blocking the event loop like so:

```ts
formatIndexFields: 7986.884ms
```

After this fix, on the first unoptimized pass it will report:

```ts
formatIndexFields: 373.082ms
```

Then, after it optimizes the code paths on a second page load, it will report:

```ts
formatIndexFields: 84.304ms
```

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
…ic#75716)
# Conflicts:
# x-pack/plugins/security_solution/server/lib/index_fields/elasticsearch_adapter.ts
# x-pack/plugins/security_solution/server/utils/beat_schema/index.test.ts
# x-pack/plugins/security_solution/server/utils/beat_schema/index.ts
… (#75942)
# Conflicts:
# x-pack/plugins/security_solution/server/lib/index_fields/elasticsearch_adapter.ts
# x-pack/plugins/security_solution/server/utils/beat_schema/index.test.ts
# x-pack/plugins/security_solution/server/utils/beat_schema/index.ts
… (#75941)
Pinging @elastic/security-solution (Team: SecuritySolution)
Summary
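Before this PR you can see event loop block times of `formatIndexFields: 7986.884ms`.
After this PR you will see event loop block times of `formatIndexFields: 85.012ms`
within the file `x-pack/plugins/security_solution/server/lib/index_fields/elasticsearch_adapter.ts`,
for the GraphQL query of `SourceQuery`/`IndexFields`.
This also fixes the issue of `unknown` being returned to the front end by removing code that no longer functions as intended. During testing of this PR, ensure that blank/default and nonexistent indexes within `securitySolution:defaultIndex` still work as expected.
Before, notice the `unknown` instead of the `filebeat-*`:
<img width="733" alt="Screen Shot 2020-08-20 at 4 55 52 PM" src="https://user-images.githubusercontent.com/1151048/90949129-f5047900-e402-11ea-9278-b4c7bf5cd16d.png">
After:
<img width="830" alt="Screen Shot 2020-08-20 at 4 56 03 PM" src="https://user-images.githubusercontent.com/1151048/90949133-02b9fe80-e403-11ea-8504-f5bbe043048a.png">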

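As a rough illustration of the general technique behind this kind of fix, the sketch below processes a large list in chunks and yields to the event loop between chunks so that pending requests and timers can run. It is not the code from this PR: the `FieldItem` shape, `yieldToEventLoop`, `mergeFieldsInChunks`, and the chunk size are all hypothetical names and values chosen for the example.

```typescript
// Hypothetical field shape for illustration; not the plugin's real type.
interface FieldItem {
  name: string;
  indexes: string[];
}

// Yield control so queued I/O callbacks and timers get a chance to run.
const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

// Merge items that share a name, working through the input one chunk at a
// time and yielding to the event loop after each chunk.
async function mergeFieldsInChunks(
  items: FieldItem[],
  chunkSize = 1000 // assumed value; tune against real measurements
): Promise<FieldItem[]> {
  const byName = new Map<string, FieldItem>();
  for (let start = 0; start < items.length; start += chunkSize) {
    for (const item of items.slice(start, start + chunkSize)) {
      const existing = byName.get(item.name);
      if (existing) {
        existing.indexes.push(...item.indexes);
      } else {
        // Copy so the caller's input is never mutated.
        byName.set(item.name, { name: item.name, indexes: [...item.indexes] });
      }
    }
    await yieldToEventLoop();
  }
  return Array.from(byName.values());
}
```

The total amount of work is unchanged; what changes is that no single synchronous stretch runs longer than one chunk, which is why the call site has to `await` the result.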
An explanation of how to see the block times for before and after
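For perf testing you first add timed testing to the file `x-pack/plugins/security_solution/server/lib/index_fields/elasticsearch_adapter.ts`.
Before this PR, around line 42, place the synchronous call `formatIndexFields(responsesIndexFields, Object.keys(indexesAliasIndices) as IndexAlias[])` between `console.time('formatIndexFields')` and `console.timeEnd('formatIndexFields')`.
After this PR, around line 42, place the awaited call `await formatIndexFields(responsesIndexFields, indices)` between the same two timer calls.
And then reload the security solutions application web page at `http://localhost:5601/app/security/timelines/default`.
Be sure to load it twice for testing, as NodeJS will sometimes report better numbers the second time because it optimizes some code paths after the first time it encounters them.
You will begin to see numbers similar to `formatIndexFields: 2553.279ms` before this PR.
This indicates that it is blocking the event loop for ~2.5 seconds before this fix. If you add indexes with additional fields to your `securitySolution:defaultIndex` setting, this amount will increase sharply. For developers using our test servers, I created two other indexes called delme-1 and delme-2 with additional mappings, which you can add like so: `apm-*-transaction*, auditbeat-*, endgame-*, filebeat-*, logs-*, packetbeat-*, winlogbeat-*, delme-1, delme-2`.
Then you are going to see times approaching 8 seconds of blocking the event loop, like `formatIndexFields: 7986.884ms`.
After this fix, on the first unoptimized pass it will report `formatIndexFields: 373.082ms`.
Then, after it optimizes the code paths on a second page load, it will report `formatIndexFields: 84.304ms`.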
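A timer wrapped around an awaited call reports elapsed wall time, which is not necessarily the same as the time the event loop was blocked. As a complementary check, the sketch below measures how late a 0 ms timer fires while some synchronous work runs. This is only an illustration: `busyWait` and `measureTimerLag` are hypothetical helpers, and the 200 ms figure is an arbitrary stand-in for real CPU-bound work.

```typescript
// Spin for roughly `ms` milliseconds to simulate synchronous CPU-bound work.
function busyWait(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // intentionally empty: this blocks the event loop
  }
}

// Schedule a 0 ms timer, run `work` synchronously, and resolve with how many
// milliseconds passed before the timer callback was finally able to run.
function measureTimerLag(work: () => void): Promise<number> {
  return new Promise<number>((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    work(); // the timer cannot fire until this returns
  });
}

measureTimerLag(() => busyWait(200)).then((lagMs) => {
  console.log(`0 ms timer fired after ${lagMs}ms`);
});
```

A large lag means other requests handled by the same process were stalled for about that long; work that yields regularly should keep the lag close to the duration of a single chunk.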
Checklist
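Delete any items that are not applicable to this PR.
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios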