Description
User Story
As a user of Sleeper, I want changes to the Sleeper table state to be applied quickly and reliably, so that my data is not lost and I can retrieve the data I expect in a timely manner.
Description / Background
At the time of writing, the multithreaded state store committer uses a single StateStoreProvider and a single TablePropertiesProvider, and retrieves from both of them on many threads at once. Each of these is backed by a HashMap, so both put and get calls are made on those maps from many threads concurrently.
We'd like to avoid any concurrency bugs or problems related to this.
Technical Notes / Implementation Details
HashMap is documented as unsynchronized: if multiple threads access it concurrently and at least one of them modifies the map structurally, it must be synchronized externally. It's not clear what the behaviour will be when conflicts or race conditions occur. It could result in the state not being cached when it should be, or it could be worse than that.
We can refactor the use of StateStoreCommitter to allow reading from the HashMap objects in the main thread, before handing off to the thread per table.
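A minimal sketch of that refactor, under stated assumptions: `CachingProvider`, `Request` and `applyAll` are hypothetical stand-ins invented for illustration, not Sleeper's actual classes or signatures. The idea it shows is that every HashMap-backed lookup happens on the main thread, and each per-table task only receives the already-resolved object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MainThreadLookupSketch {

    /** Hypothetical stand-in for a provider that caches per-table objects in a plain HashMap. */
    static class CachingProvider {
        private final Map<String, String> cache = new HashMap<>();
        private int loads = 0;

        // In this sketch this is only ever called from the main thread,
        // so the unsynchronized HashMap is never touched concurrently.
        String getByTableId(String tableId) {
            return cache.computeIfAbsent(tableId, id -> {
                loads++;
                return "state-for-" + id;
            });
        }

        int loadCount() {
            return loads;
        }
    }

    /** Hypothetical commit request: just a table ID and a payload. */
    static class Request {
        final String tableId;
        final String payload;

        Request(String tableId, String payload) {
            this.tableId = tableId;
            this.payload = payload;
        }
    }

    static List<String> applyAll(CachingProvider provider, List<Request> requests) throws Exception {
        // Step 1 (main thread): group requests by table and resolve each table's cached object once.
        Map<String, List<Request>> byTable = new LinkedHashMap<>();
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Request request : requests) {
            byTable.computeIfAbsent(request.tableId, id -> new ArrayList<>()).add(request);
            resolved.computeIfAbsent(request.tableId, provider::getByTableId);
        }

        // Step 2 (worker threads): one task per table, given only the pre-resolved object.
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Map.Entry<String, List<Request>> entry : byTable.entrySet()) {
                String state = resolved.get(entry.getKey());
                List<Request> tableRequests = entry.getValue();
                futures.add(executor.submit(() -> state + " applied " + tableRequests.size() + " request(s)"));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        CachingProvider provider = new CachingProvider();
        List<Request> requests = List.of(
                new Request("table-a", "commit-1"),
                new Request("table-b", "commit-2"),
                new Request("table-a", "commit-3"));
        applyAll(provider, requests).forEach(System.out::println);
    }
}
```

The worker tasks never see the provider, so they cannot trigger a put or get on its HashMap. An alternative would be to make the caches themselves thread-safe (for example with ConcurrentHashMap), but that is a different change from the one proposed here.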
Dependencies / Blockers
Conflicts with: