Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Test Fix #3424] PointInTimeOperationTest.listAllPits_positive #3611

Conversation

EduardoCorazon
Copy link
Contributor

@EduardoCorazon EduardoCorazon commented Oct 28, 2023

Description

This PR is meant to address the issue of cleanUpPits() returning before all of the PITs have actually been deleted on the cluster as referenced in issue #3424.

[Describe what this change achieves]

  • Category: Test fix
  • Why these changes are required? : The file PointInTimeOperationTest.java has functions listAllPits_positive() and deleteAllPits_positive() calling cleanUpPits(). However, it was suggested that cleanUpPits() was returning before all of the PITs actually got deleted. Upon testing, the call to delete all pits in restHighLevelClient.deleteAllPits(DEFAULT) yields the prevalent error:
java.io.IOException: Unable to parse response body for Response{requestLine=DELETE /_search/point_in_time/_all HTTP/1.1, host=https://127.0.0.1:47210, response=HTTP/2.0 200 OK}
 at __randomizedtesting.SeedInfo.seed([14B732A7AAEADD96:E3D6D13BF1B12B63]:0) ... Caused by: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input: was expecting closing '"' for name
 at [Source: (ByteArrayInputStream); line: 1, column: 12]
  • What is the old behavior before changes and new behavior after changes? : The old behavior as mentioned above, resulted in failed integration tests using gradlew and vs code debuggers. The new changes fix this issue by fetching information about existing PIT searches before attempting to delete them. It's possible this solutions works either because there might be some internal synchronization or update to the PIT search state when we call getAllPits or it could be a race condition whereby there is an attempt to delete PITs before internal operations have completed.

Issues Resolved

#3424

Is this a backport? If so, please add backport PR # and/or commits #

Testing

Integration tests with gradlew were run on both listAllPits_positive() and the entire class PointInTimeOperationtest(). These tests were set to run at least 10 times by configuring ci.yml as outlined here: how to do this

Exit Criteria

  • Ensure that PITs have been cleaned up in a deterministic way - note this could be a product change too
  • Run the test at least 10 times and see no failures - how to do this

Check List

  • New functionality includes testing
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Eduardo Corazon <corazoneduardo@att.net>
Signed-off-by: Eduardo Corazon <corazoneduardo@att.net>
Signed-off-by: Eduardo Corazon <corazoneduardo@att.net>
@EduardoCorazon
Copy link
Contributor Author

Currently investigating why solution works & properly implementing the fix. Will @ when ready for full review.

…eOperationTests PR

Signed-off-by: Eduardo Corazon <corazoneduardo@att.net>
@peternied
Copy link
Member

I'm going to switch this to draft so folks get notified when you are ready for a review.

image

@peternied peternied marked this pull request as draft October 30, 2023 14:55
Signed-off-by: Eduardo Corazon <corazoneduardo@att.net>
@EduardoCorazon EduardoCorazon marked this pull request as ready for review November 12, 2023 02:24
@EduardoCorazon
Copy link
Contributor Author

@DarshitChanpura I've marked this PR as ready, the original problem for cleanUpPits has been addressed. I've reverted all accidental changes from CrossClusterSearchTests.java but I'm still getting conflicts. I was wondering why this might be as there were no changes to the file outside the reverted accidental change?

@EduardoCorazon
Copy link
Contributor Author

I also deeply apologize for the delayed response regarding this matter.

@cwperks
Copy link
Member

cwperks commented Nov 13, 2023

Thank you @EduardoCorazon. Can you explain how adding restHighLevelClient.getAllPits(DEFAULT); resolves the issue with cleaning up the PITs? What is the response body when the IOException occurs?

Copy link
Member

@DarshitChanpura DarshitChanpura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EduardoCorazon Seems like there are conflicts, would you please address them?

try (RestHighLevelClient restHighLevelClient = cluster.getRestHighLevelClient(ADMIN_USER)) {

restHighLevelClient.getAllPits(DEFAULT);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq, why add this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seemingly fixes the test flakiness. I've found that even just having the declaration of restHighLevelClient.getAllPits(DEFAULT) before attempting to delete all PITS makes the listAllPits_positive integration test succeed as opposed to failing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EduardoCorazon Have we verified the cause of this?

Signed-off-by: Eduardo Corazon <corazoneduardo@att.net>
Signed-off-by: Eduardo Corazon <corazoneduardo@att.net>
@EduardoCorazon
Copy link
Contributor Author

EduardoCorazon commented Nov 17, 2023

@cwperks @DarshitChanpura I think there might be a synchronization issue that's affecting the behavior of cleanUpPits() and it could be why the test is flaky. Adding restHighLevelClient.getAllPits(DEFAULT) before attempting to delete all PITs (even if you declare .getAllPits(DEFAULT) as an object that will never be used) seemingly fixes the flaky test. It could be that the getAllPits method might be triggering some internal update that stabilizes all PITs before deleting them.

This can be verified by running ./gradlew integrationTest --tests org.opensearch.security.PointInTimeOperationTest.listAllPits_positive and if you include restHighLevelClient.getAllPits(DEFAULT) the test is a success but if it's commented it out, the test fails.

@EduardoCorazon
Copy link
Contributor Author

Also, @cwperks do you know how I could push out the raw HTTP response as standard output? I've exhausted my attempts at attempting to do such given the current libraries. I apologize for the simple question.

@peternied
Copy link
Member

Doesn't look like this is being worked on anymore, going to close this out. Please feel free to reopen if you re-engage on this @EduardoCorazon

@peternied peternied closed this Dec 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants