Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exclude lists for flask integration #630

Merged
merged 14 commits into from
May 3, 2020

Conversation

lzchen
Copy link
Contributor

@lzchen lzchen commented Apr 29, 2020

Leverage global configurations to allow users to specify paths and hosts that they do not want to trace within their Flask applications.

OPENTELEMETRY_PYTHON_FLASK_EXCLUDED_HOSTS and OPENTELEMETRY_PYTHON_FLASK_EXCLUDED_PATHS are the env variables used. Use a comma delimited string to represent seperate hosts/paths to blacklist.

TODO:

  1. Perhaps allow users to pro-grammatically configure these (through code, maybe Configurations should be modifiable through code as well)
  2. Related to [make suppress_instrumentation emit a defaultspan #581]. Perhaps we should emit a default span instead of not patching. I don't really see a benefit for this though.

@lzchen lzchen requested a review from a team April 29, 2020 22:53
@ocelotl
Copy link
Contributor

ocelotl commented Apr 29, 2020

I think this fixes #565

@lzchen
Copy link
Contributor Author

lzchen commented Apr 29, 2020

@ocelotl Oh yes, thanks for including that!

Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking on this work! My suggestions are mostly around updating the terminology to use excluded/exclude list. Otherwise this is looking good.

ext/opentelemetry-ext-flask/CHANGELOG.md Outdated Show resolved Hide resolved
ext/opentelemetry-ext-flask/README.rst Outdated Show resolved Hide resolved
opentelemetry-api/src/opentelemetry/util/__init__.py Outdated Show resolved Hide resolved
opentelemetry-api/src/opentelemetry/util/__init__.py Outdated Show resolved Hide resolved
opentelemetry-api/src/opentelemetry/util/__init__.py Outdated Show resolved Hide resolved
@codeboten
Copy link
Contributor

Just looked at the go repo and it looks like they're using the term filter:
open-telemetry/opentelemetry-go@2ef25ea#diff-a86e092adb1036300f3e9802aaca1e48R44-R46

@lzchen
Copy link
Contributor Author

lzchen commented Apr 30, 2020

@codeboten
Made the changes you've requested.

@lzchen lzchen changed the title Add blacklisting for flask integration Add exclude lists for flask integration Apr 30, 2020
Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@codeboten codeboten added this to the Beta v0.7 milestone Apr 30, 2020
@ocelotl
Copy link
Contributor

ocelotl commented Apr 30, 2020

Leverage global configurations to allow users to specify paths and hosts that they do not want to trace within their Flask applications.

OPENTELEMETRY_PYTHON_FLASK_EXCLUDED_HOSTS and OPENTELEMETRY_PYTHON_FLASK_EXCLUDED_PATHS are the env variables used. Use a comma delimited string to represent seperate hosts/paths to blacklist.

TODO:

  1. Perhaps allow users to pro-grammatically configure these (through code, maybe Configurations should be modifiable through code as well)
  2. Related to [make suppress_instrumentation emit a defaultspan #581]. Perhaps we should emit a default span instead of not patching. I don't really see a benefit for this though.

I think we need the configuration object to be immutable. Since it is global, and every component of OpenTelemetry can use it to read configuration, if a component changed one of its attributes this could break another component that relied on the original value of the same attribute.

In order to avoid doing so, the BaseInstrumentor interface for instrument supports optional arguments in **kwargs that can be used to pass values that can override the configuration attributes read in the instrument (or uninstrument) code. In this way, the user has a programmatic control of the very same options that can be configured for automatic instrumentation without putting any other component at risk.

Copy link
Contributor

@ocelotl ocelotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments suggesting using regular expressions in the environment variable(s).

opentelemetry-api/src/opentelemetry/util/__init__.py Outdated Show resolved Hide resolved
Comment on lines +165 to +166
excluded_hosts = configuration.Configuration().FLASK_EXCLUDED_HOSTS
excluded_paths = configuration.Configuration().FLASK_EXCLUDED_PATHS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this separation necessary? I think we can just define a sequence of regular expressions in one environment variable and if the hostname or path matches any regular expression it is rejected. This will be useful for the end user who would not need to repeat a lot of related paths in the environment variable, but use a single regex that matches them all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry a little bit about that, as it makes it hard if you happen to have a host or path that share the same name.

Horrible example, but what about the following:

host: localhost
path: check_if_localhost

and you only want to filter for the localhost host.

@toumorokoshi
Copy link
Member

Perhaps allow users to pro-grammatically configure these (through code, maybe Configurations should be modifiable through code as well

I think we should just do this work in Configurations itself, But it's great that this PR is utilization the global configurations and can leverage that behavior.

Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good! But I think the Configuration clearing mechanism should be added so we don't have the hacky re-init and slots.

Or if you want to file an issue for that and take it on as a follow-up PR, I'm fine with that too.

Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll switch this to approve, the config work can be picked up in a follow-up PR.

Copy link
Contributor

@ocelotl ocelotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might be a bug in the regex, please take a look, @lzchen.

*************
Excludes certain hosts and paths from being tracked. Pass in comma delimited string into environment variables.

Excluded hosts: OPENTELEMETRY_PYTHON_FLASK_EXCLUDED_HOSTS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to be explained here that the values of these environment variables are regexes that should match with everything but the r"(https?|ftp)://.*?/" (are there any other protocols besides http, https and ftp that we want to take into consideration?) part of the URL the user wants excluded from tracing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've included an explanation of what hosts and paths means. I don't think we need to inform the user of how the implementation is actually doing this. As for other protocols, these are the ones being excluded in OpenCensus, any new protocols I feel can be added later on.

opentelemetry-api/src/opentelemetry/util/__init__.py Outdated Show resolved Hide resolved
Copy link
Contributor

@ocelotl ocelotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

@toumorokoshi toumorokoshi merged commit 1d84ee9 into open-telemetry:master May 3, 2020
@lzchen lzchen deleted the blacklist branch May 4, 2020 03:45
toumorokoshi pushed a commit that referenced this pull request May 5, 2020
It is a common occurrence in tests that the global Configuration object needs to be "reset" between tests. This means that its attributes need to be set back to their original values. Since the Configuration object is immutable by design, some additional, non-production available mechanism is needed to perform this action.

The need for this feature was mentioned in a conversation in #630.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants