-
Notifications
You must be signed in to change notification settings - Fork 180
RUST-1587 Implement server selection tracing events #805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUST-1587 Implement server selection tracing events #805
Conversation
@@ -11,7 +11,8 @@ use crate::{ | |||
|
|||
/// A description of the most up-to-date information known about a topology. Further details can | |||
/// be found in the [Server Discovery and Monitoring specification](https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst). | |||
#[derive(Clone)] | |||
#[derive(Clone, derive_more::Display)] | |||
#[display(fmt = "{}", description)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this just means that the output will be the Display impl for description
. I took advantage of derive_more::Display
in a couple places in this PR for the sake of brevity.
also, it was an intentional choice to leave Display
prefixed with derive_more
rather than importing it throughout this PR. I found myself confused other places in the code base that we were using an imported derive_more
because it is easy to mistake it for the std
trait with the same name.
@@ -102,38 +103,3 @@ impl fmt::Debug for TopologyDescription { | |||
.finish() | |||
} | |||
} | |||
|
|||
impl fmt::Display for TopologyDescription { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this implementation was combined with the existing and more bare-bones implementation of Display
on the internal crate::sdam::TopologyDescription
type, so that this logic can be reused for the tracing messages, where we only have the internal type available.
@@ -25,7 +25,7 @@ use crate::{ | |||
use self::server_selection::IDLE_WRITE_PERIOD; | |||
|
|||
/// The possible types for a topology. | |||
#[derive(Debug, Clone, Copy, Eq, PartialEq, Deserialize, Serialize)] | |||
#[derive(Debug, Clone, Copy, Eq, PartialEq, Deserialize, Serialize, derive_more::Display)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seemed worth adding Display here for use in TopologyDescription
's Display
impl.
write!(f, "{{ Type: {:?}, Servers: [ ", self.topology_type)?; | ||
for server_info in self.servers.values().map(ServerInfo::new_borrowed) { | ||
write!(f, "{}, ", server_info)?; | ||
fn fmt(&self, f: &mut fmt::Formatter) -> std::result::Result<(), fmt::Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic is largely copied over from the event::sdam::TopologyDescription
implementation.
@@ -13,18 +13,20 @@ use crate::{ | |||
}; | |||
|
|||
/// Describes which servers are suitable for a given operation. | |||
#[derive(Clone, Derivative)] | |||
#[derive(Clone, Derivative, derive_more::Display)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see unit tests in src/test/spec/trace.rs to see how the SelectionCriteria
gets stringified by this logic.
@@ -128,13 +128,16 @@ impl TestCursor { | |||
|
|||
impl ClientEntity { | |||
pub(crate) fn new( | |||
client: Client, | |||
client_options: ClientOptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
related to the issue we discussed in standup/slack, we just pass the options here now so we can make sure the observer exists first.
@@ -281,7 +280,7 @@ impl TestRunner { | |||
self.sync_workers().await; | |||
|
|||
let all_tracing_events = tracing_subscriber | |||
.collect_events(Duration::from_millis(500), |_| true) | |||
.collect_events(Duration::from_millis(1000), |_| true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was running into some cases on macOS sharded clusters with SSL/auth/compression where we occasionally wouldn't see all of the tracing events within 500ms. increasing this fixed that
type Representation; | ||
|
||
fn tracing_representation(self) -> Self::Representation; | ||
fn tracing_representation(&self) -> Self::Representation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only implementation of this trait that needed to take ownership of the value was for bson::Document
. to save a bunch of unneeded clones of e.g. the topology description, I opted to update this to take self by reference and to remove the implementation of the trait for bson::Document
. all document serialization is already currently handled via serialize_command_or_reply
so it seemed OK to not rely on the trait and just serialize directly in that method.
.description | ||
.server_selection_timeout_error_message(criteria), | ||
); | ||
match result { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really unfortunate how much noise this adds to the function - can logging be factored out to helper methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great idea, done. I ended up adding a helper type to store the shared info and avoid having to pass it with each message. this matches nicely with our existing XEventEmitter
types for CMAP and command tracing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just a few comments that don't require re-review.
src/client/mod.rs
Outdated
@@ -480,13 +481,18 @@ impl Client { | |||
&self, | |||
criteria: Option<&SelectionCriteria>, | |||
) -> Result<ServerAddress> { | |||
let server = self.select_server(criteria).await?; | |||
let server = self.select_server(criteria, "test_select_server").await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The names given elsewhere are short phrases rather than a snake_case tag, might be better to be consistent one way or the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to "Test select server" for consistency.
.description | ||
.server_selection_timeout_error_message(criteria), | ||
); | ||
match result { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
src/trace/server_selection.rs
Outdated
) { | ||
if trace_or_log_enabled!(target: SERVER_SELECTION_TRACING_EVENT_TARGET, TracingOrLogLevel::Debug) | ||
// TODO: RUST-1585 Remove this condition. | ||
&& self.operation_name != "Check sessions support status" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these values are being matched on, they should be named constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. since the constant needs to be used unconditional of the tracing feature (in get_session_support_status
), this couldn't be hidden away in the trace module. I ended up putting it in client/mod.rs
for now.
pushed one more small change: I noticed that the other event emitter types were named |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
This implements the server selection tracing/log events/messages as defined in mongodb/specifications#1341.
There is a workaround in here for the fact that we still currently perform server selection to detect sessions support; logging messages in those cases makes these tests flaky. That flakiness is why the spec PR hasn't been merged yet, though I expect for any driver that doesn't log on server selection attempts for sessions support the tests will pass.
We could merge this and remove the workaround as part of RUST-1499, or alternatively we can wait to merge until that work is in; I don't have a strong opinion either way.
Will leave some more comments inline.