-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sinatra] tag span with request id if it is available in response headers #427
Conversation
8c62952
to
63e6316
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the spirit of what's being done here: extract header tagging to a level where both Rack and Sinatra can use it. This will add a lot of value to our integrations moving forward.
However, I'm not sure to what modules these new responsibilities are assigned, or what they're named is quite right yet.
- The main library probably shouldn't own Rack specific responsibilities.
- HTTP probably shouldn't be a kind of tagging, and the act of tagging itself probably shouldn't know anything about HTTP. HTTP should probably implement tagging behavior.
Maybe we need some set of layered responsibilities here, like: Tagging < HTTPTagger < RackTagger
? Just a very rough idea there.
I think I'd like to see a refactored, alternative design that incorporates the above points, while keeping the general strategy implemented here.
Overall, this is a good start!
test/contrib/sinatra/tracer_test.rb
Outdated
@@ -30,6 +30,10 @@ class TracerTestApp < Sinatra::Application | |||
get '/literal-template' do | |||
erb '<%= msg %>', locals: { msg: 'hello' } | |||
end | |||
|
|||
before do | |||
response.headers['X-Request-ID'] = 'request-id' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given these are integration tests, is it better to force headers here, or to pass them in from rack-test
's #get
method? This kind of change makes some assumptions about how the Sinatra API works. For test resiliency, it might be beneficial to take a more black-box approach here.
test/contrib/sinatra/tracer_test.rb
Outdated
|
||
assert_equal(200, last_response.status) | ||
|
||
spans = @writer.spans() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could drop the ()
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
test/contrib/sinatra/tracer_test.rb
Outdated
@@ -237,4 +244,28 @@ def test_literal_template | |||
assert_equal(0, root.status) | |||
assert_nil(root.parent) | |||
end | |||
|
|||
def test_tagging_request_headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test covering all the header options? I think it could use some more assertions around both request and response tags. I might suggest looking at Rack's header tagging test, which might not be perfect, but has some identical behavior its testing for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added tests for both cases. Which showed that Sinatra's after
hook doesn't have all response headers yet, so I had to introduce simple rack Middleware to correctly handle that case.
lib/ddtrace/utils/mass_tagger.rb
Outdated
module Datadog | ||
module Utils | ||
# Helper class to used to tag configured headers | ||
module MassTagger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Datadog::Utils::MassTagger
is a bit ambiguous. What kind of object is it tagging? What kind of data is it doing en masse?
I'd suggest making the name more specific, or splitting the behavior into smaller modules/files that are more specific. Perhaps something like: Datadog::Utils::Tagging::Rack::Request
or Datadog::Utils::Tagging::Rack::Response
?
Furthermore... is this the best place for this behavior? I have mixed feelings that something named "Rack" exists in the main library. Rack is a very specific integration, and I think our main library probably should reference a minimal number of external integrations, even if its only doing so by name.
In this sense, my suggestion here would to move this behavior under ddtrace/contrib/rack/utils
namespace. Any other integration that requires this behavior, e.g. Rails or Sinatra which specifies Rack as a dependency anyways, could then require 'ddtrace/contrib/rack/utils/tagging/request
or require 'ddtrace/contrib/rack/utils/tagging/response
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed MassTagger to simply Tagger
. As its quite generic and could be used outside of Rack (i.e. GRPC).
Rack specific code now lives under contrib/rack/tagging
. Thanks for suggesting it, I wasn't sure if code should be shared among 'contribs'.
lib/ddtrace/utils/mass_tagger.rb
Outdated
module Datadog | ||
module Utils | ||
# Helper class to used to tag configured headers | ||
module MassTagger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add unit tests for this module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed this to Tagger. And it now has tests 👍
@@ -32,18 +36,19 @@ module Tracer | |||
option :tracer, default: Datadog.tracer | |||
option :resource_script_names, default: false | |||
option :distributed_tracing, default: false | |||
option :headers, default: Utils::MassTagger::DEFAULT_HEADERS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that both integrations will have the same set of default options. DRY is good!
However, something about the name of the constant it sources from doesn't sit quite right: why does a MassTagger
own a list of default headers? Does tagging implicitly include HTTP? Or does HTTP instead implement tagging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved that to RequestSpanMiddleware which I think might be more appropriate
end | ||
end | ||
def set_header_tags!(span, env, headers) | ||
Datadog::Utils::MassTagger.tag( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks cleaner!
@@ -169,6 +159,10 @@ def set_request_tags!(request_span, env, status, headers, response, original_env | |||
end | |||
end | |||
|
|||
def configuration | |||
Datadog.configuration[:rack] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
docs/GettingStarted.md
Outdated
@@ -980,6 +980,7 @@ Where `options` is an optional `Hash` that accepts the following parameters: | |||
| ``resource_script_names`` | Prepend resource names with script name | ``false`` | | |||
| ``distributed_tracing`` | Enables [distributed tracing](#distributed-tracing) so that this service trace is connected with a trace of another service if tracing headers are received | `false` | | |||
| ``tracer`` | A ``Datadog::Tracer`` instance used to instrument the application. Usually you don't need to set that. | ``Datadog.tracer`` | | |||
| ``headers`` | Hash of HTTP request or response headers to add as tags to the `rack.request`. Accepts `request` and `response` keys with Array values e.g. `['Last-Modified']`. Adds `http.request.headers.*` and `http.response.headers.*` tags respectively. | ``{ response: ['Content-Type', 'X-Request-ID'] }`` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy/paste error here: it doesn't add to the rack.request
, should be Sinatra.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Thanks for review @delner. I think the code looks better now, and is ready for next review. |
We chatted a little more about this via Slack. Overall, I think the general direction of refactoring some generic Rack behavior out of the tracer and into modules is a great way to go. This will help make our design a bit easier to use and integrate into other Rack-based contribs. For now, in the interest of simplifying things, I'd suggest extracting out some of these broader Rack changes into its own refactor PR, so its clearer what is necessary for this Sinatra feature, and what's a general improvement upon other parts of the code. That would allow us to focus on getting the Sinatra portion merged, so we could afterwards turn our attention to improving code sharing between the contribs. |
f429673
to
5b21ab9
Compare
I've extracted previous changes to #432. While I've kept changes in this PR to minimum |
888a118
to
7ae275e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for extracting those changes, this is now much simpler and easier to evaluate!
I have a big question regarding the use of middleware, and then some smaller questions regarding implementation/feature details. See below.
@@ -71,6 +76,8 @@ def render(engine, data, *) | |||
end | |||
end | |||
|
|||
app.use RequestSpanMiddleware |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of me likes the idea of adding this middleware, but part of me is a little confused on the boundaries here.
- What parts of tracing is
Datadog::Contrib::Sinatra::Tracer
responsible for? And this new middleware? - Are these co-dependent?
- Are they doing things the other can't?
I think I need some clarification on these points so I can understand how this design improves the Sinatra tracer by some measure of coupling, simplicity, and extensibility. I'm concerned we're introducing two co-dependent modules here which increases complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good questions.
I've introduced the RequestSpanMiddleware
because its actually impossible to fetch all response headers from app.after
hook, as it this filter happens before the response is fully rendered.
As such I think its still worth to have Sinatra specific instrumentation in Tracer
as it provides access to Sinatra's specific code and information, which might not be possible from within Rack Middleware.
One thing that could be improved here though is coupling via calling order. I.e. Sinatra::Tracer relies on the span to be created with RequestSpanMiddleware
. While in general this is not a problem because Middleware is always executed before. However its still unnecessary coupling I'll address that in next commit
span = tracer.trace('sinatra.request', | ||
service: Datadog.configuration[:sinatra][:service_name], | ||
span_type: Datadog::Ext::HTTP::TYPE) | ||
span = RequestSpanMiddleware.fetch_sinatra_trace(request.env) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below theres a @datadog_request_span
. Is that obsolete? Should it just be using this middleware to drive span storage/retrieval?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems I missed your changes when backporting some of these concepts. I've reallign it now to use RequestSpan
module to process creation, storage and retrieval of spans.
@@ -4,6 +4,7 @@ | |||
class TracerTest < TracerTestBase | |||
class TracerTestApp < Sinatra::Application | |||
get '/request' do | |||
headers['X-Request-Id'] = request.env['HTTP_X_REQUEST_ID'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since current implementation only observes existing headers. So to actually test request_id handling I need to somehow inject the header into response. I'll clarify that in the comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see; so this is to emulate a Sinatra app that adds a request ID header to its response? Seems okay to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly.
def call(env) | ||
span = request_span!(env) | ||
_, headers, = @app.call(env) | ||
ensure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this ensure
here, and set_error
below in tracer.rb
. Are we appropriately setting failed requests with errors? And where should this be done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good question. Tracer uses Sinatra facilities to process error, which I think is appropriate in this context.
286e59e
to
98af086
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have a bug in this design, but I proposed an idea of how we could fix it, while improving on the design here a little further.
Let me know what you think about the proposal; if that sounds good to you, I think it might get us to "approved" status. (Very optimistic we're close on this one!)
env[SINATRA_REQUEST_SPAN] ||= build_span(env) | ||
end | ||
|
||
def fetch_span(env) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a functional difference between this and #span!
? Is there some place in our code you'd want to fetch
a span and get nothing back? And that would be okay?
If not, couldn't you just use #span!
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pawelchcki thoughts here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added fetch_span
to support the existing case in Tracer
response filter, where an error is logged if span doesn't exist. (Not sure how valid that approach is, but It wasn't something I wanted to change at this point).
span!
will populate the env
if the span does not exist hence !
to indicate that. fetch_span
is only meant to hide implementation detail of env[SINATRA_REQUEST_SPAN]
span = RequestSpan.span!(env) | ||
|
||
# Request headers | ||
parse_request_headers(env).each do |name, value| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I understand correctly, you mentioned that the reason for doing this header tagging in the middleware, instead of in the existing tracer extension, was because we can only access some data we need from this middleware layer. Is that right?
If that is correct, what data can we exclusively access from this middleware, that we can't access in the tracer?
status, headers, response_body = @app.call(env) | ||
|
||
# Response headers | ||
parse_response_headers(headers).each do |name, value| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you confirm if this is the call order?
Middleware.call(env)
RequestSpan.span!(env)
Datadog::Tracer.tracer.trace('sinatra.request')
parse_request_headers(env)
span.set_tag
@app.call(env)
app.before
app.after
span.finish
parse_response_headers(env)
span.set_tag
If so, I think there's a problem with having a span.set_tag
call after we do span.finish
. We shouldn't modify spans that have already been finished, because they might be immediately flushed by the writer. All modifications to the span should happen before its closed.
My proposal would be to change this HeaderTaggerMiddleware
into TracerMiddleware
that is responsible for both starting and finishing the span. Then we should refactor as much of the behavior as we can from the Tracer
into this Middleware
instead. For whatever can't move into this middleware (e.g. setting the resource as the route, tags, etc), the existing Tracer
extension can instead access & modify the span via env
, like you've had it doing already.
Beyond solving the existing span start/finish problem, what I like about this is that the Tracer
extension effectively becomes an extension to this middleware, and the middleware is no longer co-dependent on the Tracer
extension to be in place. It also means Sinatra app users could, in theory, extend or otherwise modify the Tracer
module to introduce some custom modifications to the Sinatra tracing.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it looks like TracerMiddleware
is better name when it handles whole lifecycle of the span.
As for old Tracer
its still there to fill in Sinatra provided data that is not otherwise available from Rack context. Currently both do not depend on each other. I.e. they will work independently but to get good span information both should run.
I don't think there is an alternative approach other than not using Rack Middleware
and not being able to tag every header in the response.
span.set_tag(name, value) if span.get_tag(name).nil? | ||
end | ||
|
||
span.finish |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this'll need to be in an ensure
block, and only call finish
for the span if it isn't finished. Otherwise if the application raises an error, we could have an unclosed span which would cause the trace to disappear. We should also add a test to verify this doesn't happen (if we don't already have one.)
+ ensure span is finished
@@ -71,52 +76,30 @@ def render(engine, data, *) | |||
end | |||
end | |||
|
|||
app.use HeaderTaggerMiddleware |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You renamed the file but didn't rename the constant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for that, some brain meltdown happened
module Contrib | ||
module Sinatra | ||
# Middleware used for automatically tagging configured headers and handle request span | ||
class HeaderTaggerMiddleware |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constant needs to be renamed here too?
@delner I've addressed the comments - its ready for next review round. Feel free to merge this if you find it ok. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Thanks for your patience and hard work on this one @pawelchcki !
💃 Thanks for help and reviews! First PRs are often a bit involved. Happy to see this one merged. |
Add header tagging support to Sinatra in order to support tagging request id.
This PR refactors out the implementation of header tagging from Rack and makes it more generic - able to be used in other situations that require tagging based on a list of potential tags.
Related to: #375