Mongoose is throwing Unhandled Quiesce mode error that the Mongo driver doesn't throw  #14834

Closed as not planned

Description

Prerequisites

  • I have written a descriptive issue title
  • I have searched existing issues to ensure the bug has not already been reported

Mongoose version

8.3.2

Node.js version

20.x

MongoDB server version

7.x (Atlas)

TypeScript version (if applicable)

5.3.3

Description

We get the following error whenever our MongoDB cluster changes its primary server during upgrades.

{
  "errorType":"Runtime.UnhandledPromiseRejection",
  "errorMessage":"MongoServerError: The server is in quiesce mode and will shut down"
}

We load tested the same code in a staging environment, once with Mongoose and once with only the base MongoDB driver. With only the base driver, we see this error in roughly one in a million requests. With Mongoose, nearly 10% of those million requests fail with this error. It looks like this has [been fixed before](#11661), so this is possibly a regression?

Using Mongo v7 on Atlas

[Node] mongoose version: 8.3.2

[Node] mongo driver version (for testing): 6.6.2

Our APIs run on Node.js AWS Lambda functions (no callbacks; we use async/await). We use the default settings except for { autoIndex: false, bufferCommands: false }. After investigating, we believe Mongoose is throwing this error somewhere we cannot catch it: we wrapped all of our Mongoose code in a try/catch block, and the error still surfaces as an unhandled promise rejection.
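For readers unfamiliar with how a rejection can escape a try/catch block, here is a minimal general-purpose sketch. It does not use Mongoose and is not a diagnosis of Mongoose's internals. `failingOperation`, `awaited`, and `floating` are hypothetical names, and the rejected promise only stands in for a driver call that fails during a failover.

```typescript
// Hypothetical stand-in for a database call that fails during a failover.
function failingOperation(): Promise<never> {
  return Promise.reject(
    new Error("The server is in quiesce mode and will shut down")
  );
}

// Case 1: the promise is awaited, so the surrounding try/catch sees the rejection.
async function awaited(): Promise<string> {
  try {
    await failingOperation();
    return "ok";
  } catch (err) {
    return `caught: ${(err as Error).message}`;
  }
}

// Case 2: the promise is started but not awaited (a "floating" promise).
// The try block completes before the rejection is delivered, so the catch
// clause never runs. The .catch(onEscaped) call exists only so this demo can
// observe the escaped error; without it, Node.js would report an unhandled
// rejection, which AWS Lambda surfaces as Runtime.UnhandledPromiseRejection.
function floating(onEscaped: (err: Error) => void): string {
  try {
    failingOperation().catch(onEscaped);
    return "returned without error";
  } catch {
    return "caught";
  }
}
```

If something similar happens inside a library (for example, a background task whose promise is never returned to the caller), no amount of try/catch in application code will intercept it.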

We have an extra layer in our Lambdas that creates an Express app so we can take advantage of a tool called Fern, which validates API request and response shapes. The layer creates an Express server and wraps it, using the serverless-http package, in a function that should still follow the async/await pattern. We thought that might be the issue, so we added an Express middleware to catch errors, but it did not catch them either. We also see these errors on APIs that don't have that Express wrapper.

We were only able to reproduce the issue once we scaled our Atlas cluster up to M60/M80 and hit it with heavy load. We are not sure exactly why heavy load is needed, but perhaps quiesce mode lasts longer when the database is under more load, trending toward its maximum of 15 seconds.

We also ran tests with a longer serverSelectionTimeoutMS and with the timeout on our Lambdas removed entirely, in case the operations simply needed more time to complete or we needed to address query performance. No matter how much time we allowed in testing (well beyond a reasonable duration for an API call), it didn't solve the problem.

We didn't see this issue on the few APIs that we run in AWS containers, only in Lambdas.

Steps to Reproduce

While load testing an endpoint, run a resilience test in MongoDB Atlas. More details about our system are above.

Expected Behavior

Mongoose and the mongo driver together should handle Quiesce mode without errors.
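Until that is the case, one possible application-level stopgap is to retry operations that fail with this specific message. The sketch below is an assumption, not part of Mongoose or the MongoDB driver: `isQuiesceError` and `withQuiesceRetry` are hypothetical helpers, and the error is matched on the message text quoted in this report. It can only help when the rejection actually reaches the caller, which may not be the case here.

```typescript
// Message text taken from the error quoted in this report.
const QUIESCE_MESSAGE = "The server is in quiesce mode and will shut down";

// Hypothetical helper: detects the quiesce error by its message.
function isQuiesceError(err: unknown): boolean {
  return err instanceof Error && err.message.includes(QUIESCE_MESSAGE);
}

// Hypothetical helper: retries `operation` while it fails with the quiesce
// error, waiting a little longer before each new attempt. Any other error is
// rethrown immediately. `sleep` is injectable so tests can skip the delay.
async function withQuiesceRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (!isQuiesceError(err)) throw err;
      lastError = err;
      if (attempt < maxAttempts) await sleep(delayMs * attempt);
    }
  }
  // All attempts failed with the quiesce error; surface the last one.
  throw lastError;
}
```

A call site would wrap a single awaited query, for example `await withQuiesceRetry(() => runQuery())`, where `runQuery` is a placeholder for whatever function issues the database call.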

Labels: Stale; help (This issue can likely be resolved in GitHub issues. No bug fixes, features, or docs necessary)