-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRIVERS-1003 Introduce CMAP error tests #1369
base: master
Are you sure you want to change the base?
DRIVERS-1003 Introduce CMAP error tests #1369
Conversation
pendingConnectionCount in psuedocode
@@ -519,10 +527,6 @@ Populating the pool MUST NOT block any application threads. For example, it | |||
could be performed on a background thread or via the use of non-blocking/async | |||
I/O. Populating the pool MUST NOT be performed unless the pool is "ready". | |||
|
|||
If an error is encountered while populating a connection, it MUST be handled via |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This notice and associated psuedocode was moved to the "Establishing a Connection" section above to avoid duplication with "Checking out a Connection". The requirement still remains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great stuff! I will give this a more thorough review including POCing the new tests asap.
============= | ||
|
||
Tests for connection pool logging can be found in the `/logging <./logging>`__ subdirectory and are written in the | ||
`Unified Test Format <../../unified-test-format/unified-test-format.rst>`__. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this intentionally removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, the logging and new unified tests were combined into a single unified
directory, rather than having a separate one for logging.
count: 3 | ||
- name: ready | ||
events: | ||
- type: ConnectionPoolReady |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On non standalone topologies we'll be getting multiple ConnectionPoolReady
events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cmap-format tests are unit tests of an individual pool, so the events in each one should only be associated with that pool. If a driver is testing this via an actual client which makes pools per server, I think it would probably be best if the test runner for that driver just filtered out any events observed from other pools.
source/connection-monitoring-and-pooling/tests/cmap-format/pool-checkout-error-auth.json
Outdated
Show resolved
Hide resolved
source/connection-monitoring-and-pooling/tests/unified/network-error.json
Show resolved
Hide resolved
source/connection-monitoring-and-pooling/tests/unified/network-error.json
Show resolved
Hide resolved
} | ||
}, | ||
{ | ||
"name": "find", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there might be a race here:
Pool is paused before Find happens, and then we don't get connectionCreated
/connectionClosed
event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, i added a really high heartbeatFrequency to this test so that the failpoint wont be triggered by the monitors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Tests work in .NET.
reason: connectionError | ||
ignore: | ||
- ConnectionPoolCreated | ||
- ConnectionPoolReady |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test fails in Python. For example the python cmap-format test runner mocks various pieces of the MongoClient and Topology to get a connection Pool suitable for the tests (based on the other requirements like the fact that the pool needs to start in the Paused state and not be marked Ready by SDAM), but this test expects to observe the effects of the Topology's error handling rules. I think this kind of test is outside the scope of the cmap test format and has to be implemented in the unified runner. Any test for the interaction between SDAM and CMAP needs to be moved over to the unified runner IMO. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BorisDog, how does this test pass in .NET?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an integration tests format, so it runs against real deployment, with real SDAM in .NET.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure using SDAM here is expected. If SDAM is running then the heartbeats will fail and clear the pool at any time, thus the events would be non-deterministic. How does .NET avoid those kind of issues here? This is why we had to stub out the Topology in these tests in Python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we hack it a bit: we set 10 mins heartbeat interval, and then force the pool to mark itself as ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose I can implement more hacks to get this to "work" but at that point it wouldn't even test the same code path that the driver takes for handling an error while checking out a connection for an operation. I'm worried other teams will encounter the same problems trying to implement these tests. Is there any reason not to move these tests over to the unified format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In C# it's just the initial setup that is adjusted, so the rest of SDAM works as usual.
But I agree that if unified format fits better, than it's preferable.
BTW this PR already contains an unified test similar to this one: "Pool properly handles network error during checkout", not sure if they test the exact same thing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Yes it looks like that test is exactly what I'm asking for. Let's remove this test. Let me check which other cmap-format tests are problematic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The integration format was introduced at a time when I think the unified format didn't have CMAP events nor the ability to run multiple threads, though now that it does, I imagine a lot of those tests can just be written in the unified format.
One thing that's still difficult to represent in the unified format is a checkOut that doesn't do much with the connection and just holds on to it until the test needs to check it back in for the purposes of testing some specific behavior. The "checking in a connection after checkout fails closes connection" test added in this PR is one such example, though its possible this could be implemented with a server side sleep or something. The rest of the ones introduced in this PR can pretty easily be (or already are) represented as unified tests though, so I'll update them.
Of the existing integration tests, only "threads blocked by maxConnecting check out returned connections" jumps out at me as being hard to do in the unified format, but that one doesn't rely on any SDAM functionality, so it could probably just stay as an integration test. I filed DRIVERS-2555 for updating the existing tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for letting this one simmer for a while--I've gone ahead and updated all the newly introduced CMAP-format tests to the unified format, with the exception of "checking in a connection after checkout fails closes connection" as mentioned above.
DRIVERS-1003
This PR introduces test coverage for some basic CMAP error cases. The changes involve a few new CMAP format tests as well as a few new unified format tests. The CMAP format's version of runOnRequirements was updated to support the
auth
requirement from the Unified Test Format. Also, the existing logging tests were moved into a singleunified
test directory along with the new tests.This PR also addresses DRIVERS-1784 and DRIVERS-1785, both of which were closely related.