Description
Versions/Environment
- What version of Rust are you using? 1.66.0
- What operating system are you using? MacOS
- What versions of the driver and its dependencies are you using? (Run
cargo pkgid mongodb
&cargo pkgid bson
)https://github.com/rust-lang/crates.io-index#mongodb@2.6.0
andhttps://github.com/rust-lang/crates.io-index#bson@2.6.1
- What version of MongoDB are you using? (Check with the MongoDB shell using
db.version()
) 6.0.8 - What is your MongoDB topology (standalone, replica set, sharded cluster, serverless)? Replica Set
Describe the bug
When writing concurrently do a MongoDB collection, the full time until execution continues after the insert has been completed grows non-stop linearly. So if you do 10 concurrent writes, whichever one is the first to go will be done in ~60ms but the slowest will take about 480ms. This slowness growths linearly with a 100 concurrent writes the slowest one takes more than 1 second.
This is despite the tracing logs seemingly saying that the command continues to succeed in the appropriate amount of time:
2023-07-19T19:10:08.449191Z DEBUG mongodb::command: Command succeeded topologyId="64b8350f0d64165cf22c9b29" reply="{\"n\":1,\"electionId\":{\"$oid\":\"7fffffff000000000000001d\"},\"opTime\":{\"ts\":{\"$timestamp\":{\"t\":1689793808,\"i\":10}},\"t\":29},\"ok\":1.0,\"$clusterTime\":{\"clusterTime\":{\"$timestamp\":{\"t\":1689793808,\"i\":10}},\"signature\":{\"hash\":{\"$binary\":{\"base64\":\"nLRyMv2XttX7Rkr34LTRkAiMgEg=\",\"subType\":\"00\"}},\"keyId\":7210123958148923394}},\"operationTime\":{\"$timestamp\":{\"t\":1689793808,\"i\":10}}}" commandName="insert" requestId=34 driverConnectionId=4 serverConnectionId=1179 serverHost="ac-qwvoave-shard-00-01.prroxke.mongodb.net" serverPort=27017 durationMS=46
What I would expect is that the difference between running 1 write operation concurrently and running 2 to not be meaningfully different. While I'd expect latency to get worse with a significant amount of concurrent writes, I'd expect that to occur more gradually and under more significant load. And that the time it takes to do the write would be about the same as what the command succeeded tracing logs emits.
We've had this issue for a while but only recently took the time to isolate the issue. I've posted a rust program that replicates the issue with the relevant details below
Things I've tested out:
- Upgrading the cluster to have more computing. Specifically migrating from a M10 to an M30 on Atlas. No change at all
- This cluster is used for other things, maybe its some caching issue with other usage. I ran it in an isolated M30 Cluster on Atlas and the issue persists
- I've tried to change the minimum and maximums pool sizes for the driver. Reducing the maximum further does slow it down but increasing the minimum or the maximum has minimal/no effect.
- Various BSON-related things: Seems to have no effect at all
Based on the command log, it implies that the driver is getting things back at the time that I'd expect but that's possibly a red herring Apart from some lock-related issue I'm not sure what could be causing this. Do y'all have any idea of what the issue could be
To Reproduce
Use the following program to reproduce the issue and run it with the following command:
RUST_LOG='mongodb::command=debug' cargo run
main.rs
use serde;
#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq)]
struct EventDocument {
#[serde(
rename = "_id",
with = "mongodb::bson::serde_helpers::uuid_1_as_binary"
)] // rename for it to be picked up by the _id index
pub id: uuid::Uuid, // UUID
pub owner_id: String,
}
#[tokio::main]
async fn main() {
tracing_subscriber::fmt::init();
let db_name = "slow_test";
let uri = "<mongo_uri>";
let db = {
let client_options = mongodb::options::ClientOptions::parse(uri).await.unwrap();
let client = mongodb::Client::with_options(client_options).unwrap();
client.database(db_name)
};
let collection = db.collection::<EventDocument>("events");
collection.list_index_names().await.unwrap();
let total_count = 2;
let total_instant = tokio::time::Instant::now();
let mut handles = vec![];
for i in 0..total_count {
let collection = collection.clone();
let handle: tokio::task::JoinHandle<()> = tokio::spawn(async move {
let instant = tokio::time::Instant::now();
let event = EventDocument {
id: uuid::Uuid::new_v4(),
owner_id: format!("event-{}", i),
};
collection.insert_one(event, None).await.unwrap();
println!("inserted {} in {:?}", i, instant.elapsed());
});
handles.push(handle);
}
futures::future::join_all(handles).await;
println!("total time {:?}", total_instant.elapsed());
}
Cargo.toml:
[package]
name = "slow-mongo-test"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
mongodb = { version = "2.6.0", features = ["tracing-unstable", "bson-chrono-0_4", "bson-serde_with", "bson-uuid-1"]}
tokio = { version = "1.29.1", features = ["macros", "rt-multi-thread"] }
serde = { version = "1.0.136", features = ["derive"] }
uuid = "1.4.1"
futures = "0.3.28"
rand = "0.8.5"
tracing = "0.1.37"
tracing-subscriber = "0.3.17"