-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
include source_app label in StatsD #3645
Conversation
lib/statsd_middleware.rb
Outdated
@@ -31,16 +77,24 @@ def call(env) | |||
if path_parameters | |||
controller = path_parameters[:controller] | |||
action = path_parameters[:action] | |||
source_app = env['HTTP_SOURCE_APP_NAME'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# We should never use a dynamic path to apply the tag for the instrumentation,
# since this will permit a rogue actor to increase the number of time series
# exported from the process and causes instability in the metrics system.
@wyattwalter do you see a problem with this given the ominous comment ☝️ . We do perform a whitelist later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose my real question is: does this warrant some changes in our prometheus storage config?
I came across a prior postmortem, where we created a new label that had an infinite number of possible values. In practice, I'm not sure where that breaking point is of how many possible values warrant config changes.
Is trying to calculate the storage impact upfront a practical solution? Or is it best to merge and then monitor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - I'd still like to hear @wyattwalter's opinion on additional tags before it's merged
verify | ||
veteran-id-card | ||
veteran-representative | ||
vic-v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cvalarida I got this list by looking in vets-website for all manifest.json files and finding the line:
"entryName": "foo"
Do you know if this list is complete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's as complete as I think we can get right now. You can verify this by running yarn apps
or (npm run apps
). That'll do the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ideally we'd put this into configuration somehow. Either load it from an endpoint on process startup or during the rendering of the deploy configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a ticket for this: department-of-veterans-affairs/va.gov-team#4421
lib/statsd_middleware.rb
Outdated
@@ -3,6 +3,52 @@ | |||
class StatsdMiddleware | |||
STATUS_KEY = 'api.rack.request' | |||
DURATION_KEY = 'api.rack.request.duration' | |||
SOURCE_APP_NAMES = %w[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do we intend for FE engineers to know to keep this list up to date? Can you add a comment, so people in the future know that this should be kept up to date manually and FE engineers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Some thoughts and stuff:
- added logic to warn to sentry when an unrecognized
Source-App-Name
header comes in - add instructional comment on how to regenerate
- if we're out of date on the whitelist it isnt that big of a deal, it will result in some small holes in our metrics
lib/statsd_middleware.rb
Outdated
return nil if source_app.nil? | ||
return source_app if SOURCE_APP_NAMES.include?(source_app) | ||
|
||
Raven.capture_message('Unrecognized Source App Request Header', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this send an alert to Sentry? That would create a similar problem if we're generating a job just for that. Though, I suppose we want to know when it happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it sends to sentry, and yes we want to know when this happens.
You're poking at the fact that a malicious user could load up our sidekiq queues? Since we use sidekiq enterprise, we could enforce some rate limiting / throttling on this call to Sentry.
Alternatively, we could use the Rails.logger and setup a metric log filter + alert.
I think I'll just remove it altogether for now and create a new ticket since knowing when this happens can happen a bit later.
lib/statsd_middleware.rb
Outdated
|
||
def instrument_statsd(status, duration, controller, action, source_app) | ||
duration_tags = ["controller:#{controller}", "action:#{action}"] | ||
duration_tags.push("source_app:#{source_app}") unless source_app.nil? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make sure that every metric sent to statsd with the same name has the same labels too. I'm not sure if null
or an empty string is preferable to the library, but they should all have the same "shape" (i.e. labels).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
K. J/w - what happens if we don't?
Description of change
Adds a
source_app
label to the StatsD metrics that vets-api serves to prometheus. This value comes from a header thatvets-website
passes on each request to vets-api (link to that PR)Testing done
rspec
Testing planned
Applies to all PRs