This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we get any benefit from splitting up the traces by instance or instance type before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason I've been holding off from doing this is that we lose the "colouring" of the traces, i.e. it becomes much harder to tell when processing moves to a different process. Not sure if there's another way of achieving the same effect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I would second Erik's sort-of objection here; although having to find the right service is a minor pain, I really find the colouring useful because it helps to spot the points where the processing moves over replication.
Perhaps something else to consider: I wonder if e.g. Grafana's interface for viewing traces is better for finding traces regardless of service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also do something like change the service name to the worker type? I.e. the instance name without the index. That way it'd be easier to find stuff (as you don't have to figure out the instance), and for the most part instances don't talk to the same type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was going to be my suggestion!
We could also add a tag which is the entire instance name if you did want to filter by that. I think the only time that is useful is if you want to see what happens when an instance is under load or something, but...I don't think this is the right interface for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, makes sense. I have been mostly looking at the
RoomMessageListRestServlet
which is all turqoise in one process.For reference, an example that spans multiple processes is
RoomStateEventRestServlet
Any tips for how to accomplish because this is what I also saw looking at this yesterday and just went with
- Synapse
?We could strip number suffixes from the opaque worker name. Can we consider that a safe-enough convention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't be too offended if we took an educated guess and stripped
"[_-]?[0-9]+$"
off the end of the worker name. I think it's mainly just us/developers that would be using Jaeger anyway, so I don't consider it too harmful.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make it configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated to the sensible default of stripping off the number suffix. Our
matrix.org
worker naming convention seems sane and we can avoid the config.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split that out to #13761