-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Be able to correlate timeouts in reverse-proxy layer in front of Synapse (tag specific request headers) #13786
Changes from all commits
2a3f3c7
7b73d0d
e244ee5
b52f921
8aa4be5
1e4f918
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Add `opentracing.request_headers_to_tag` config to specify which request headers extract and tag traces with in order to correlate timeouts in reverse-proxy layers in front of Synapse with traces. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -173,6 +173,21 @@ async def get_user_by_req( | |
parent_span.set_tag("device_id", requester.device_id) | ||
if requester.app_service is not None: | ||
parent_span.set_tag("appservice_id", requester.app_service.id) | ||
|
||
# Tag any headers that we need to extract from the request. This | ||
# is useful to specify any headers that a reverse-proxy in front | ||
# of Synapse may be sending to correlate and match up something | ||
# in that layer to a Synapse trace. ex. when Cloudflare times | ||
# out it gives a `cf-ray` header which we can also tag here to | ||
# find the trace. | ||
for header_name in self.hs.config.tracing.request_headers_to_tag: | ||
headers = request.requestHeaders.getRawHeaders(header_name) | ||
MadLittleMods marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if headers is not None: | ||
parent_span.set_tag( | ||
SynapseTags.REQUEST_HEADER_PREFIX + header_name, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not really familiar with opentracing tags, is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The tag can be whatever. And since it's the name of a header, I think it's probably easiest to follow with the actual header name. Similar to what we do with function arguments but those happen to conform to the Python snake casing already. These are the OpenTelemetry docs but here are some naming conventions, https://opentelemetry.io/docs/reference/specification/common/attribute-naming/ |
||
str(headers[0]), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why the call to See: |
||
) | ||
Comment on lines
+177
to
+189
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've tested that this works locally but I'm not sure if the Based on the following documentation and that we see it in our haproxy logs already, I assume this will be viable,
|
||
|
||
return requester | ||
|
||
@cancellable | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to throw a wrench in these plans, but if other reverse proxies do something similar; I wonder if it would make more sense to have a header which could be the source for
SynapseRequest.request_id
(instead of generating a new one):synapse/synapse/http/site.py
Lines 174 to 175 in 9772e36
Would that be clearer instead of having two separate IDs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Seems reasonable to me and a little cleaner. I'll create a new PR with this solution ⏩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this would make it harder to track things in the logs or anything, but seems like it would be a lot cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried this out in #13801
Losing the simplicity of the incrementing integer to reference in logs as you scroll kinda sucks though. I'm leaning back to this option I think 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that really used though? I usually just copy and paste for grep. Once the number gets beyond a few digits it is useless anyway IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not the one having to log dive and debug big Synapse instances so I don't know. In tests, we will still have the same sequential request ID's since this config won't be defined. We're not changing any default either so it's basically opt-in.
If anyone has a strong desire for the old way, we can always revert as well. If you're happy with #13801, we can move forward with that ⏩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like we should ask folks real quick then!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for asking in #synapse-dev, https://matrix.to/#/!vcyiEtMVHIhWXcJAfl:sw1v.org/$kMyUMCOU4fZpH6AOUPOIqGyZADn38ebt2kn2wo9cAXU?via=matrix.org&via=element.io&via=beeper.com ❤️
Discussion being tracked in #13801 (comment)