This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Be able to correlate timeouts in reverse-proxy layer in front of Synapse (pull request ID from header) #13801
Changes from 10 commits
103aa86
41d5244
b448894
51cb363
f6fb0c8
bf76f22
2d09324
b1f527f
f851f58
3d73210
5e79b02
Because we don't use the simple sequential ID when this config is defined, it might be harder to follow each request in the logs (you have to remember a random string like `GET-74a2179568cf26ff-MSP` vs. an incrementing integer, `GET-17`). But when the sequential number is sufficiently big, it's hard to remember as well.

In tests, we still use the same sequential request IDs since this config won't be defined there. We're not changing any default either, so it's basically opt-in, but the plan is to define this on matrix.org so we can correlate Cloudflare timeouts with the trace in Synapse.

Is there any strong preference for keeping the `GET-17`, `POST-541`, etc.? Any real-world usefulness from debugging a big Synapse instance?
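To make the trade-off concrete, here is a minimal sketch of the opt-in behaviour described above. It is not the code from this PR: the class name `RequestIdGenerator`, the method `new_id`, and the fallback to a sequential number when the header is missing are all assumptions made for illustration.

```python
import itertools
from typing import Mapping, Optional


class RequestIdGenerator:
    """Hypothetical helper: builds request IDs such as GET-17 or GET-<header value>."""

    def __init__(self, request_id_header: Optional[str] = None) -> None:
        # Name of the header set by the reverse proxy (assumed config value).
        # None means the option is not configured, which is the default.
        self._request_id_header = request_id_header
        self._counter = itertools.count(1)

    def new_id(self, method: str, headers: Mapping[str, str]) -> str:
        if self._request_id_header is not None:
            wanted = self._request_id_header.lower()
            # HTTP header names are case-insensitive, so compare lower-cased.
            for name, value in headers.items():
                if name.lower() == wanted and value:
                    return f"{method}-{value}"
        # Option unset, or header absent: use the incrementing integer.
        return f"{method}-{next(self._counter)}"


proxied = RequestIdGenerator(request_id_header="cf-ray")
print(proxied.new_id("GET", {"CF-Ray": "74a2179568cf26ff-MSP"}))

default = RequestIdGenerator()
print(default.new_id("GET", {}))
```

With the header configured, the ID carries the proxy's own identifier, so a timeout logged by the proxy can be matched to the Synapse log lines for the same request. Without it, IDs stay sequential, as they do in tests.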
It's often easier in my experience to know the ordering of requests from their number rather than their timestamp, given punctuation can obscure the problem: `2022-09-13 12:15:08` and `2022-09-13 12:16:09` look very similar in a pile of logs, but they're actually a full minute apart. The request ID would be able to show that there were a thousand requests in the meantime, for example.

Not that it's a super strong argument though - would just be sad to see it go.
Personally, I usually just copy one of those and grep the logs to find matching log lines; I'm not sure there's much gained by them being incrementing.
As you mention, this is also "opt-in", but I find it much simpler to have a single request ID you can use across all logs related to a request (e.g. this would let you pivot across logs + Jaeger more easily than #13801).
Summarizing the discussion in #synapse-devs:
Overall, no strong objections.