-
Notifications
You must be signed in to change notification settings - Fork 9
GEN-5150/GEN-5151 sdk #76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of tighter scoping / more context on the TimeOfUseServiceTests please.
On the other one, how did you identify it's an indexing concern? We've been tracking an issue with intermittent errors trying to use profiles shortly after creation... that one's looking like replication lag between read-replicas. Might that be also what's going on here?
@@ -213,7 +213,17 @@ public void deletePrivateTouGroupWorksCorrectly() { | |||
} | |||
|
|||
private void cleanUpPrivateTou(TimeOfUseGroup grp) { | |||
touService.deletePrivateTimeOfUseGroup(grp.getLseId(), grp.getTouGroupId()); | |||
try { | |||
// Check if the TOU group has been deleted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add context to someone reading the code without knowledge of this issue: "Try to delete the private TOU only if it still exists (sometimes it might be already deleted)"
touService.deletePrivateTimeOfUseGroup(grp.getLseId(), grp.getTouGroupId()); | ||
} | ||
} catch (GenabilityException e) { | ||
if (!e.getMessage().contains("404")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have two calls, which throws the GenabilityException? Wrap the try/catch around just that one -- you generally want try/catches as tightly scoped as possible.
Also explain why a 404-containing message should be ignored (didn't you already check it exists?)
@@ -132,6 +132,13 @@ public void testPaginatedAccountList() { | |||
} | |||
|
|||
try { | |||
// Add a delay to allow for indexing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh. How did you figure out it's an indexing concern? (Can you add a comment or note on the PR?)
How confident are you that 2000ms is enough? If it's just a guess / subject to additional tuning, it's helpful for future developers to eg add a comment to the line that errors like "This line fail with xxxException? Consider tuning the sleep above to give more time..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good question - I assumed it was indexing or a consistency issue because the test creates accounts and immediately searches for them using a name filter. I added the .sleep(2000) to give mongo time to catch up, which fixed the flakiness locally, but its definitely just a best guess. Its kind of tricky to test this since the failures were intermittent.
// Could be due to indexing delay or read-replica sync delay. | ||
// 2000ms is a best-guess based on some testing; if this test continues to fail, consider increasing this delay. | ||
try { | ||
Thread.sleep(2000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 seconds is a high value. I think replica lag would be under 100ms typically. I wonder if 500ms or 200ms are better starting values.
try { | ||
touService.deletePrivateTimeOfUseGroup(grp.getLseId(), grp.getTouGroupId()); | ||
} catch (GenabilityException e) { | ||
// It's possible the TOU group was deleted after the check but before the delete call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my concern that we are retrieving/creating TOU grp by generic test name but deleting TOUs by integer ID. This opens us up to concurrency condition with other simultaneous tests. Not sure with github job runner if this a real issue, but why would the tou group be deleted before the clean up?
Intermittent failure: SDK > TimeOfUseServiceTests.addPrivateTouGroupWorksCorrectly
The test itself passes (adding the TOU group works)
The failure occurs during cleanup with a 404 error when trying to delete the TOU group
The resource was getting deleted by another process or there's a race condition
Intermittent failure: SDK > AccountServiceTests.testPaginatedAccountList:139