Skip to content

RUST-580 Add tracing integration and command tracing events #757

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Oct 20, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
f654434
add tracing support and command events
kmahar Sep 29, 2022
f8caf7a
use relaxed, not canonical, extJSON
kmahar Oct 5, 2022
9d47a00
add RUST-1496 TODO
kmahar Oct 7, 2022
8f13e8e
use camelcase for tracing event property names
kmahar Oct 10, 2022
b986b86
expand comments
kmahar Oct 10, 2022
4657388
only generate command event once and only clone if needed
kmahar Oct 10, 2022
cf22534
more robust error redaction
kmahar Oct 10, 2022
bd0fbf3
sync redaction tests and incorporate updated error redaction logic
kmahar Oct 11, 2022
d288519
format
kmahar Oct 11, 2022
6aff38f
incorporate latest spec changes
kmahar Oct 12, 2022
2673195
Merge remote-tracking branch 'origin' into RUST-580/logging-commands
kmahar Oct 12, 2022
f15e1e6
add gridfs error to match
kmahar Oct 12, 2022
49c4af6
add csfle to feature-combinations.sh
kmahar Oct 13, 2022
7e06949
bump tracing-subscriber version and remove RUSTSEC-2021-0139 from ign…
kmahar Oct 13, 2022
21787e8
remove redundant comment
kmahar Oct 13, 2022
c657018
schema version bump
kmahar Oct 14, 2022
38eeb34
more robust redaction testing
kmahar Oct 14, 2022
4e4e4bc
as_test_file_entities -> test_file_entities
kmahar Oct 14, 2022
4f518d4
fix compilation with csfle, remove unused import warning
kmahar Oct 14, 2022
5e0c680
format again
kmahar Oct 14, 2022
0586381
include topology ID in events rather than using entity ID
kmahar Oct 17, 2022
1323f8e
topology_id -> topologyId for consistency with other fields
kmahar Oct 17, 2022
761893c
Abraham comments
kmahar Oct 18, 2022
7e88293
Isabel comments
kmahar Oct 19, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 6 additions & 17 deletions .evergreen/compile-only.sh
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
#!/bin/bash

set -o errexit
set -o xtrace
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple changes to this script:

  • switch to using the feature-combinations.sh script to produce feature combinations to test. as a side effect of this we no longer have separate compilation tasks based on runtime; there is just one task and between all the values in the feature combinations list we get coverage of both tokio and async-std.
  • xtrace seemed useful so that when there is a failure you can tell which combination of feature flags actually caused it.


source ./.evergreen/configure-rust.sh
rustup update $RUST_VERSION
Expand All @@ -10,21 +11,9 @@ if [ "$MSRV" = "true" ]; then
cp .evergreen/MSRV-Cargo.lock Cargo.lock
fi

if [ "$ASYNC_RUNTIME" = "tokio" ]; then
FEATURE_FLAGS=snappy-compression,zlib-compression
source ./.evergreen/feature-combinations.sh

# Zstd requires Rust version 1.54
if [[ $RUST_VERSION == "nightly" ]]; then
FEATURE_FLAGS=$FEATURE_FLAGS,zstd-compression
fi

rustup run $RUST_VERSION cargo build --features $FEATURE_FLAGS
rustup run $RUST_VERSION cargo build --features tokio-sync,$FEATURE_FLAGS

elif [ "$ASYNC_RUNTIME" = "async-std" ]; then
rustup run $RUST_VERSION cargo build --no-default-features --features async-std-runtime
rustup run $RUST_VERSION cargo build --no-default-features --features sync
else
echo "invalid async runtime: ${ASYNC_RUNTIME}" >&2
exit 1
fi
# build with all available features to ensure all optional dependencies are brought in too.
for ((i = 0; i < ${#FEATURE_COMBINATIONS[@]}; i++)); do
rustup run $RUST_VERSION cargo build ${FEATURE_COMBINATIONS[$i]}
done
5 changes: 2 additions & 3 deletions .evergreen/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -548,7 +548,7 @@ functions:
working_dir: "src"
script: |
${PREPARE_SHELL}
ASYNC_RUNTIME=${ASYNC_RUNTIME} RUST_VERSION=${RUST_VERSION} MSRV=${MSRV} .evergreen/compile-only.sh
RUST_VERSION=${RUST_VERSION} MSRV=${MSRV} .evergreen/compile-only.sh
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related to changes in the compile-only.sh script, we no longer need to pass a runtime here, and no longer need to include async runtime in the matrix definition below.


"check cargo deny":
- command: shell.exec
Expand Down Expand Up @@ -1858,9 +1858,8 @@ buildvariants:
matrix_spec:
os:
- ubuntu-18.04
async-runtime: "*"
extra-rust-versions: "*"
display_name: "! Compile on Rust ${extra-rust-versions} with ${async-runtime}"
display_name: "! Compile on Rust ${extra-rust-versions}"
tasks:
- "compile-only"

Expand Down
2 changes: 1 addition & 1 deletion .evergreen/feature-combinations.sh
Original file line number Diff line number Diff line change
Expand Up @@ -6,5 +6,5 @@
export FEATURE_COMBINATIONS=(
'' # default features
'--no-default-features --features async-std-runtime,sync' # features that conflict w/ default features
'--features tokio-sync,zstd-compression,snappy-compression,zlib-compression,openssl-tls,aws-auth,csfle' # additive features
'--features tokio-sync,zstd-compression,snappy-compression,zlib-compression,openssl-tls,aws-auth,tracing-unstable,csfle' # additive features
)
2 changes: 1 addition & 1 deletion .evergreen/run-tests.sh
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ if [ "$SINGLE_THREAD" = true ]; then
OPTIONS="$OPTIONS --test-threads=1"
fi

FEATURE_FLAGS="zstd-compression,snappy-compression,zlib-compression,${TLS_FEATURE}"
FEATURE_FLAGS="zstd-compression,snappy-compression,zlib-compression,tracing-unstable,${TLS_FEATURE}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially had planned to introduce separate tasks for testing tracing to avoid introducing any potential extra overhead from having a tracing subscriber registered in every test. however, I tried out just running the tracing tests alongside the other ones, and the total runtime for a full patch seemed to fall within the usual bounds; for the last 10 commits to main patches took anywhere from 14 days, 7 hours to 16 days, 2 hours computing time, and both of mine took around 15 days, 8 hours.

So there does not seem to be a ton of overhead. also, if we split out the tracing tasks we would need to separately install rustup, cargo, and driver deps for even more tasks which would probably cancel out any speedup in the other tests.


if [ "$ASYNC_RUNTIME" = "tokio" ]; then
ASYNC_FEATURE_FLAGS=${FEATURE_FLAGS}
Expand Down
8 changes: 8 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,11 @@ snappy-compression = ["snap"]
# DO NOT USE; see https://jira.mongodb.org/browse/RUST-569 for the status of CSFLE support in the Rust driver.
csfle = ["mongocrypt", "rayon", "num_cpus"]

# DO NOT USE; see https://jira.mongodb.org/browse/RUST-580 for the status of tracing/logging support in the Rust driver.
# Enables support for emitting tracing events.
# TODO: pending https://github.com/tokio-rs/tracing/issues/2036 stop depending directly on log.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this doesn't end up getting in before we release I don't think it is a huge deal because we already transitively depend on log via a bunch of our dependencies so we're not adding anything extra to compile here.

tracing-unstable = ["tracing", "log"]

[dependencies]
async-trait = "0.1.42"
base64 = "0.13.0"
Expand All @@ -85,6 +90,7 @@ futures-executor = "0.3.14"
hex = "0.4.0"
hmac = "0.12.1"
lazy_static = "1.4.0"
log = { version = "0.4.17", optional = true }
md-5 = "0.10.1"
mongocrypt = { git = "https://github.com/mongodb/libmongocrypt-rust.git", branch = "main", optional = true }
num_cpus = { version = "1.13.1", optional = true }
Expand All @@ -105,6 +111,7 @@ strsim = "0.10.0"
take_mut = "0.2.2"
thiserror = "1.0.24"
tokio-openssl = { version = "0.6.3", optional = true }
tracing = { version = "0.1.36", optional = true }
trust-dns-proto = "0.21.2"
trust-dns-resolver = "0.21.2"
typed-builder = "0.10.0"
Expand Down Expand Up @@ -172,6 +179,7 @@ serde = { version = "*", features = ["rc"] }
serde_json = "1.0.64"
semver = "1.0.0"
time = "0.3.9"
tracing-subscriber = "0.3.16"
regex = "1.6.0"

[package.metadata.docs.rs]
Expand Down
43 changes: 24 additions & 19 deletions src/client/executor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,12 @@ use crate::{
TRANSIENT_TRANSACTION_ERROR,
UNKNOWN_TRANSACTION_COMMIT_RESULT,
},
event::command::{CommandFailedEvent, CommandStartedEvent, CommandSucceededEvent},
event::command::{
CommandEvent,
CommandFailedEvent,
CommandStartedEvent,
CommandSucceededEvent,
},
hello::LEGACY_HELLO_COMMAND_NAME_LOWERCASE,
operation::{
AbortTransaction,
Expand Down Expand Up @@ -624,23 +629,21 @@ impl Client {
bytes: serialized,
};

self.emit_command_event(|handler| {
self.emit_command_event(|| {
let command_body = if should_redact {
Document::new()
} else {
Document::from_reader(raw_cmd.bytes.as_slice())
.unwrap_or_else(|e| doc! { "serialization error": e.to_string() })
};
let command_started_event = CommandStartedEvent {
CommandEvent::Started(CommandStartedEvent {
command: command_body,
db: raw_cmd.target_db.clone(),
command_name: raw_cmd.name.clone(),
request_id,
connection: connection_info.clone(),
service_id,
};

handler.handle_command_started_event(command_started_event);
})
});

let start_time = Instant::now();
Expand Down Expand Up @@ -727,17 +730,20 @@ impl Client {

match command_result {
Err(mut err) => {
self.emit_command_event(|handler| {
let command_failed_event = CommandFailedEvent {
self.emit_command_event(|| {
let mut err = err.clone();
if should_redact {
err.redact();
}

CommandEvent::Failed(CommandFailedEvent {
duration,
command_name: cmd_name,
failure: err.clone(),
command_name: cmd_name.clone(),
failure: err,
request_id,
connection: connection_info,
connection: connection_info.clone(),
service_id,
};

handler.handle_command_failed_event(command_failed_event);
})
});

if let Some(ref mut session) = session {
Expand All @@ -750,7 +756,7 @@ impl Client {
op.handle_error(err)
}
Ok(response) => {
self.emit_command_event(|handler| {
self.emit_command_event(|| {
let reply = if should_redact {
Document::new()
} else {
Expand All @@ -759,15 +765,14 @@ impl Client {
.unwrap_or_else(|e| doc! { "deserialization error": e.to_string() })
};

let command_succeeded_event = CommandSucceededEvent {
CommandEvent::Succeeded(CommandSucceededEvent {
duration,
reply,
command_name: cmd_name.clone(),
request_id,
connection: connection_info,
connection: connection_info.clone(),
service_id,
};
handler.handle_command_succeeded_event(command_succeeded_event);
})
});

#[cfg(feature = "csfle")]
Expand Down
47 changes: 44 additions & 3 deletions src/client/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,13 @@ use derivative::Derivative;

#[cfg(test)]
use crate::options::ServerAddress;
#[cfg(feature = "tracing-unstable")]
use crate::trace::{
command::CommandTracingEventEmitter,
trace_or_log_enabled,
TracingOrLogLevel,
COMMAND_TRACING_EVENT_TARGET,
};
use crate::{
bson::Document,
change_stream::{
Expand All @@ -25,7 +32,7 @@ use crate::{
concern::{ReadConcern, WriteConcern},
db::Database,
error::{ErrorKind, Result},
event::command::CommandEventHandler,
event::command::{handle_command_event, CommandEvent},
operation::{AggregateTarget, ListDatabases},
options::{
ClientOptions,
Expand All @@ -39,6 +46,7 @@ use crate::{
sdam::{server_selection, SelectedServer, SessionSupportStatus, Topology},
ClientSession,
};

pub(crate) use executor::{HELLO_COMMAND_NAMES, REDACTED_COMMANDS};
pub(crate) use session::{ClusterTime, SESSIONS_UNSUPPORTED_COMMANDS};

Expand Down Expand Up @@ -145,9 +153,42 @@ impl Client {
Ok(client)
}

pub(crate) fn emit_command_event(&self, emit: impl FnOnce(&Arc<dyn CommandEventHandler>)) {
#[cfg(not(feature = "tracing-unstable"))]
pub(crate) fn emit_command_event(&self, generate_event: impl FnOnce() -> CommandEvent) {
if let Some(ref handler) = self.inner.options.command_event_handler {
emit(handler);
let event = generate_event();
handle_command_event(handler.as_ref(), event);
}
}

#[cfg(feature = "tracing-unstable")]
pub(crate) fn emit_command_event(&self, generate_event: impl FnOnce() -> CommandEvent) {
let tracing_emitter = if trace_or_log_enabled!(
target: COMMAND_TRACING_EVENT_TARGET,
TracingOrLogLevel::Debug
) {
Some(CommandTracingEventEmitter::new(
self.inner.options.tracing_max_document_length_bytes,
self.inner.topology.id,
))
} else {
None
};
let apm_event_handler = self.inner.options.command_event_handler.as_ref();
if !(tracing_emitter.is_some() || apm_event_handler.is_some()) {
return;
}

let event = generate_event();
if let (Some(event_handler), Some(ref tracing_emitter)) =
(apm_event_handler, &tracing_emitter)
{
handle_command_event(event_handler.as_ref(), event.clone());
handle_command_event(tracing_emitter, event);
} else if let Some(event_handler) = apm_event_handler {
handle_command_event(event_handler.as_ref(), event);
} else if let Some(ref tracing_emitter) = tracing_emitter {
handle_command_event(tracing_emitter, event);
}
}

Expand Down
16 changes: 16 additions & 0 deletions src/client/options/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -530,6 +530,20 @@ pub struct ClientOptions {
#[builder(default)]
pub tls: Option<Tls>,

/// The maximum number of bytes that the driver should include in a tracing event
/// or log message's extended JSON string representation of a BSON document, e.g. a
/// command or reply from the server.
/// If truncation of a document at the exact specified length would occur in the middle
/// of a Unicode codepoint, the document will be truncated at the closest larger length
/// which falls on a boundary between codepoints.
/// Note that in cases where truncation occurs the output will not be valid JSON.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Note that in cases where truncation occurs the output will not be valid JSON.
/// Note that in cases where truncation occurs the output may not be valid JSON.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually never mind, I got mixed up here. "will" is the right word to use

///
/// The default value is 1000.
#[cfg(any(feature = "tracing-unstable", docsrs))]
#[cfg_attr(docsrs, doc(cfg(feature = "tracing-unstable")))]
#[builder(default)]
pub tracing_max_document_length_bytes: Option<usize>,

/// Specifies the default write concern for operations performed on the Client. See the
/// WriteConcern type documentation for more details.
#[builder(default)]
Expand Down Expand Up @@ -1259,6 +1273,8 @@ impl ClientOptions {
sdam_event_handler: None,
#[cfg(test)]
test_options: None,
#[cfg(feature = "tracing-unstable")]
tracing_max_document_length_bytes: None,
}
}

Expand Down
3 changes: 2 additions & 1 deletion src/client/session/test/causal_consistency.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,9 @@ use crate::{
client::options::SessionOptions,
coll::options::CollectionOptions,
error::Result,
event::command::CommandEvent,
options::ReadConcern,
test::{log_uncaptured, CommandEvent, EventClient, LOCK},
test::{log_uncaptured, EventClient, LOCK},
ClientSession,
Collection,
};
Expand Down
Loading