Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRIVERS-1003 Introduce CMAP error tests #1369

Open
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

patrickfreed
Copy link
Contributor

DRIVERS-1003

This PR introduces test coverage for some basic CMAP error cases. The changes involve a few new CMAP format tests as well as a few new unified format tests. The CMAP format's version of runOnRequirements was updated to support the auth requirement from the Unified Test Format. Also, the existing logging tests were moved into a single unified test directory along with the new tests.

This PR also addresses DRIVERS-1784 and DRIVERS-1785, both of which were closely related.

@@ -519,10 +527,6 @@ Populating the pool MUST NOT block any application threads. For example, it
could be performed on a background thread or via the use of non-blocking/async
I/O. Populating the pool MUST NOT be performed unless the pool is "ready".

If an error is encountered while populating a connection, it MUST be handled via
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This notice and associated psuedocode was moved to the "Establishing a Connection" section above to avoid duplication with "Checking out a Connection". The requirement still remains.

@patrickfreed patrickfreed marked this pull request as ready for review January 20, 2023 21:58
@patrickfreed patrickfreed requested a review from a team as a code owner January 20, 2023 21:58
@patrickfreed patrickfreed requested review from BorisDog and ShaneHarvey and removed request for a team January 20, 2023 21:58
Copy link
Member

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff! I will give this a more thorough review including POCing the new tests asap.

=============

Tests for connection pool logging can be found in the `/logging <./logging>`__ subdirectory and are written in the
`Unified Test Format <../../unified-test-format/unified-test-format.rst>`__.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this intentionally removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the logging and new unified tests were combined into a single unified directory, rather than having a separate one for logging.

count: 3
- name: ready
events:
- type: ConnectionPoolReady
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On non standalone topologies we'll be getting multiple ConnectionPoolReady events.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cmap-format tests are unit tests of an individual pool, so the events in each one should only be associated with that pool. If a driver is testing this via an actual client which makes pools per server, I think it would probably be best if the test runner for that driver just filtered out any events observed from other pools.

}
},
{
"name": "find",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might be a race here:
Pool is paused before Find happens, and then we don't get connectionCreated/connectionClosed event.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, i added a really high heartbeatFrequency to this test so that the failpoint wont be triggered by the monitors.

Copy link
Contributor

@BorisDog BorisDog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Tests work in .NET.

reason: connectionError
ignore:
- ConnectionPoolCreated
- ConnectionPoolReady
Copy link
Member

@ShaneHarvey ShaneHarvey Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails in Python. For example the python cmap-format test runner mocks various pieces of the MongoClient and Topology to get a connection Pool suitable for the tests (based on the other requirements like the fact that the pool needs to start in the Paused state and not be marked Ready by SDAM), but this test expects to observe the effects of the Topology's error handling rules. I think this kind of test is outside the scope of the cmap test format and has to be implemented in the unified runner. Any test for the interaction between SDAM and CMAP needs to be moved over to the unified runner IMO. Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BorisDog, how does this test pass in .NET?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an integration tests format, so it runs against real deployment, with real SDAM in .NET.

Copy link
Member

@ShaneHarvey ShaneHarvey Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure using SDAM here is expected. If SDAM is running then the heartbeats will fail and clear the pool at any time, thus the events would be non-deterministic. How does .NET avoid those kind of issues here? This is why we had to stub out the Topology in these tests in Python.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we hack it a bit: we set 10 mins heartbeat interval, and then force the pool to mark itself as ready.

Copy link
Member

@ShaneHarvey ShaneHarvey Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I can implement more hacks to get this to "work" but at that point it wouldn't even test the same code path that the driver takes for handling an error while checking out a connection for an operation. I'm worried other teams will encounter the same problems trying to implement these tests. Is there any reason not to move these tests over to the unified format?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In C# it's just the initial setup that is adjusted, so the rest of SDAM works as usual.

But I agree that if unified format fits better, than it's preferable.
BTW this PR already contains an unified test similar to this one: "Pool properly handles network error during checkout", not sure if they test the exact same thing...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Yes it looks like that test is exactly what I'm asking for. Let's remove this test. Let me check which other cmap-format tests are problematic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The integration format was introduced at a time when I think the unified format didn't have CMAP events nor the ability to run multiple threads, though now that it does, I imagine a lot of those tests can just be written in the unified format.

One thing that's still difficult to represent in the unified format is a checkOut that doesn't do much with the connection and just holds on to it until the test needs to check it back in for the purposes of testing some specific behavior. The "checking in a connection after checkout fails closes connection" test added in this PR is one such example, though its possible this could be implemented with a server side sleep or something. The rest of the ones introduced in this PR can pretty easily be (or already are) represented as unified tests though, so I'll update them.

Of the existing integration tests, only "threads blocked by maxConnecting check out returned connections" jumps out at me as being hard to do in the unified format, but that one doesn't rely on any SDAM functionality, so it could probably just stay as an integration test. I filed DRIVERS-2555 for updating the existing tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for letting this one simmer for a while--I've gone ahead and updated all the newly introduced CMAP-format tests to the unified format, with the exception of "checking in a connection after checkout fails closes connection" as mentioned above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants