Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transition the Database Entry Event handling from experimental to production #4985

Open
edwbuck opened this issue Mar 15, 2024 · 4 comments
Open
Labels
priority/backlog Issue is approved and in the backlog

Comments

@edwbuck
Copy link
Contributor

edwbuck commented Mar 15, 2024

After a sufficient amount of time has passed, we should remove the experimental categorization of the database entry event performance improvements.

Much of the testing of the experimental feature has already been performed in production environments. This issue is to coordinate any remaining required efforts and the final code changes necessary to effect the change.

@evan2645 evan2645 added the triage/in-progress Issue triage is in progress label Mar 19, 2024
edwbuck added a commit to edwbuck/spire that referenced this issue Mar 25, 2024
Closes spiffe#4985

Signed-off-by: Edwin Buck <edwbuck@gmail.com>
@azdagron azdagron added this to the 1.10.0 milestone Mar 28, 2024
@azdagron azdagron added priority/backlog Issue is approved and in the backlog and removed triage/in-progress Issue triage is in progress labels Mar 28, 2024
@azdagron azdagron removed this from the 1.10.0 milestone Mar 28, 2024
@faisal-memon
Copy link
Contributor

@edwbuck
Copy link
Contributor Author

edwbuck commented Apr 12, 2024

@evan2645 Please retarget this for 1.11.0. The issues at Uber are just too close to the 1.10.0 release to have enough observation period to build confidence that no new issues are discovered / introduced.

@stevend-uber
Copy link
Contributor

Can we add an exit criteria for this feature to be production ready to have a holistic integration test for all of the different workflows we've talked about in the past few months?

I think this workflow that @faisal-memon is working on faisal-memon#344, is both valuable but complex and we should have a deep integration test for the whole event-based feature to give confidence for other users.

@edwbuck
Copy link
Contributor Author

edwbuck commented Apr 16, 2024

@stevend-uber The only bug that we have encountered that is a race condition in the relational database, one that is well known in database access programming and has a known solution.

The solution you mention "is complex" is, in simple terms:

  • We rely on the database's guaranteed to issue ID numbers within the events table.
  • We don't rely on the database issuing them in order (here is where the prior mistake was made).
  • To avoid relying on them being issued in order, we now take note of the skipped IDs and prior to processing new events prior to pulling a new batch.

For handling the request of a holistic integration test, I would like to note that the unit testing can capture detecting skipped ids, and note that unit testing can check that skipped ids are processed in order prior to the currently existing batch processing. This would put the entire solution under test, and is the current test plan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/backlog Issue is approved and in the backlog
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants