Skip to content

Fix testGetSnapshotsMultipleRepos #43993

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

original-brownbear
Copy link
Contributor

@original-brownbear original-brownbear commented Jul 4, 2019

The order tested in the equals check in the un-muted test needs exact timestamps. Otherwise two subsequent snapshots might have the same timestamps and hence will get ordered differently than the list of snapshot names that was built as the snapshots were created => Fixed by having exact timestamps in this test.

This was triggered by #43148 indirectly since we now use cached time when creating SnapshotInfo so if the test runs very quickly the ordering of SnapshotInfo by time is gone because two subsequent instances have the same timestamp due to the now higher time uncertainty form the caching.

The order doesn't matter here and this just failed on me randomly with the order being off -> fixed by using unordered assertion
@original-brownbear original-brownbear added >test Issues or PRs that are addressing/adding tests :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 labels Jul 4, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@original-brownbear
Copy link
Contributor Author

Jenkins run elasticsearch-ci/2 (unrelated watcher test failure)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this is a 100% reliable fix. It's possible that two snapshots could occur in the same millisecond (and/or the clock isn't granular enough). I think we should do something like this in the test at the point that we require time to advance:

                for (final ThreadPool threadPool : internalCluster().getInstances(ThreadPool.class)) {
                    final long startMillis = threadPool.absoluteTimeInMillis();
                    assertBusy(() -> assertThat(threadPool.absoluteTimeInMillis(), greaterThan(startMillis)));
                }

@original-brownbear
Copy link
Contributor Author

@DaveCTurner I'm not sure this is a real world possibility with us fsyncing a ton files for each snapshot, but sure I can add that real quick.

@original-brownbear
Copy link
Contributor Author

@DaveCTurner added that clock wait :)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's hard to judge how likely this is, but if nothing else there's a benefit to documenting the fact that we're relying on time moving forwards. Also we've been bitten in the past with assumptions that certain operations always take a measurable time, particularly across platforms (looking at you, Windows 👀).

Also "fsyncing" isn't really fsyncing in tests IIRC.

@original-brownbear
Copy link
Contributor Author

Jenkins test elasticsearch-ci/1 (reported unrelated failure in #44012 )

@original-brownbear original-brownbear merged commit 3572bfd into elastic:master Jul 5, 2019
@original-brownbear original-brownbear deleted the fix-snapshot-test branch July 5, 2019 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test Issues or PRs that are addressing/adding tests v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants