Persistence: purge unreferenced `Obj`s (WIP)
#9688
Draft
This is an attempt to implement the algorithm mentioned in PR #9401.
The `Obj.referenced()` attribute contains the timestamp at which the object was last "referenced" (that is, when a write was last attempted). It is ...
* ... `storeObj()`
* ... `storeObj()`
* ... `upsertObj()`
* ... `updateConditional()`
Let's assume that there is a mechanism to identify the IDs of all referenced objects (it would be very similar to what the export functionality does). The algorithm to purge unreferenced objects must never delete an object that is referenced at any point in time. In particular, it must handle an object that was unreferenced when the purge routine started but became referenced while the routine was running.
An approach could work as follows:
* the ID is not in the set (or bloom filter) generated in step 2 ...
* AND have a `referenced` timestamp less than the memoized timestamp.

Any deletion in the backing database would follow the meaning of this pseudo SQL: `DELETE FROM objs WHERE obj_id = :objId AND referenced < :memoizedTimestamp`.

Note that the `referenced` attribute is generally inaccurate when retrieved from the objects cache (that is, during normal operations). This is not a problem, because the `referenced` attribute is irrelevant for production accesses.

There are two edge cases / race conditions:
* A `storeObj()` operation detects that the object already exists, then the purge routine deletes that object, and then the `storeObj()` tries to update the `referenced` attribute. The result is the loss of that object. This race condition can only occur if the object existed but was not referenced.
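The purge condition and the race described above can be illustrated with a small in-memory model. This is only a sketch, written in Python for brevity rather than in the project's own code. `ObjStore`, `Row`, `touch()` and `purge_unreferenced()` are hypothetical names, not part of the persistence API. It assumes that the timestamp has already been memoized and that the set of referenced IDs has already been collected before the scan starts.

```python
from dataclasses import dataclass


@dataclass
class Row:
    obj_id: str
    referenced: int  # timestamp of the last write attempt


class ObjStore:
    """Hypothetical dict-backed stand-in for the backing database."""

    def __init__(self):
        self.rows = {}

    def exists(self, obj_id):
        return obj_id in self.rows

    def insert(self, obj_id, now):
        self.rows[obj_id] = Row(obj_id, now)

    def touch(self, obj_id, now):
        # Models an UPDATE of `referenced`; returns False when no row matched.
        row = self.rows.get(obj_id)
        if row is None:
            return False
        row.referenced = now
        return True

    def delete_if_referenced_before(self, obj_id, memoized_timestamp):
        # Mirrors the pseudo SQL:
        #   DELETE FROM objs WHERE obj_id = :objId AND referenced < :memoizedTimestamp
        row = self.rows.get(obj_id)
        if row is not None and row.referenced < memoized_timestamp:
            del self.rows[obj_id]
            return True
        return False


def purge_unreferenced(store, referenced_ids, memoized_timestamp):
    """Delete every object that is not in referenced_ids and is old enough."""
    purged = []
    for obj_id in list(store.rows):  # copy the keys: rows shrink during the scan
        if obj_id in referenced_ids:
            continue
        if store.delete_if_referenced_before(obj_id, memoized_timestamp):
            purged.append(obj_id)
    return purged


# Case 1: an object written again after the timestamp was memoized survives.
store = ObjStore()
for obj_id in ("live", "garbage", "rewritten"):
    store.insert(obj_id, now=10)
store.touch("rewritten", now=150)  # re-referenced while the purge is running
purged = purge_unreferenced(store, referenced_ids={"live"}, memoized_timestamp=100)
print(purged)  # ['garbage']

# Case 2: the storeObj() race, written out as an explicit interleaving.
race = ObjStore()
race.insert("x", now=10)
saw_existing = race.exists("x")                                # storeObj(): object exists
race.delete_if_referenced_before("x", memoized_timestamp=100)  # purge deletes it
updated = race.touch("x", now=150)                             # storeObj(): update finds no row
print(saw_existing, updated, race.exists("x"))  # True False False
```

In case 1 the timestamp predicate on the delete protects `rewritten`, even though its ID is missing from the referenced set. In case 2 the object is gone after the interleaving, but the update reports that it matched no row, so the writer can at least detect the loss.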