Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of pattern matching in OpenMetrics #5764

Merged
merged 3 commits into from
Feb 20, 2020

Conversation

ahmed-mez
Copy link
Contributor

@ahmed-mez ahmed-mez commented Feb 17, 2020

What does this PR do?

Follow-up on https://github.com/DataDog/integrations-core/pull/5759/files#r380296940

Compile and cache wildcards instead of calling fnmatchcase to match metric names.

Motivation

Perf improvment

Additional Notes

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config
)
return # Ignore the metric
if scraper_config['_ignored_patterns']:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can also wrap if metric.name in scraper_config['_ignored_metrics'] if you change to if config['ignore_metrics']

@ofek
Copy link
Contributor

ofek commented Feb 17, 2020

Just to clarify: I was saying fnmatchcase is super slow, as is everything else in the fnmatch module

The performant way to do this is calling https://docs.python.org/3/library/fnmatch.html#fnmatch.translate on all patterns, then '|'.join(patterns), then compile that and cache it

In the end you just call .search once

@ahmed-mez ahmed-mez force-pushed the ahmed-mez/check-ignore_metrics-config branch from a3d2c79 to 81a67c4 Compare February 18, 2020 09:23
@ahmed-mez
Copy link
Contributor Author

@ofek yes I got you point and this is not related to fnmatch. Just to avoid two unnecessary lookups in case ignore_metrics wasn't configured in the first place.

@vboulineau
Copy link
Contributor

Just to clarify: I was saying fnmatchcase is super slow, as is everything else in the fnmatch module

The performant way to do this is calling https://docs.python.org/3/library/fnmatch.html#fnmatch.translate on all patterns, then '|'.join(patterns), then compile that and cache it

In the end you just call .search once

Out of curiosity, I wanted to bench this good idea (on the Mac + Python 3.7) on a KSM output (320M), time for 10 runs.

./test_py.py
With 5 excludes
FNMatch: 41.420358180999756
RESearch: 16.89269208908081
With 10 excludes
FNMatch: 78.64589977264404
RESearch: 22.81194281578064

Worth changing to this suggestion!

@ahmed-mez
Copy link
Contributor Author

pushed an attempt with translate, compile and search.

ofek
ofek previously approved these changes Feb 18, 2020
vboulineau
vboulineau previously approved these changes Feb 18, 2020
@ahmed-mez
Copy link
Contributor Author

@ofek shouldn't we implement a similar logic to match whitelisted metrics? https://github.com/DataDog/integrations-core/blob/master/datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py#L564
I'd be happy to implement it in this PR.

@ofek
Copy link
Contributor

ofek commented Feb 18, 2020

Sure!

@ofek ofek changed the title Small perf improvement Improve performance of pattern matching in OpenMetrics Feb 18, 2020
@ahmed-mez ahmed-mez dismissed stale reviews from vboulineau and ofek via ccdec75 February 18, 2020 17:16
@ahmed-mez ahmed-mez requested a review from ofek February 18, 2020 17:22
@ahmed-mez ahmed-mez force-pushed the ahmed-mez/check-ignore_metrics-config branch from ccdec75 to e9b2e8b Compare February 18, 2020 17:27
@ahmed-mez ahmed-mez merged commit efde0a9 into master Feb 20, 2020
@ahmed-mez ahmed-mez deleted the ahmed-mez/check-ignore_metrics-config branch February 20, 2020 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants