feat: Rely on storage for metadata persistence and caching instead of ShapeStatus
#3572
Conversation
Codecov Report: ✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #3572 +/- ##
=======================================
Coverage 88.13% 88.13%
=======================================
Files 18 18
Lines 1643 1643
Branches 409 412 +3
=======================================
Hits 1448 1448
Misses 193 193
Partials 2 2
Force-pushed 20001a4 to 8ee5277

benchmark this
One thing I figured out is that the read-through cache is not actually active if the shape consumer is not "alive", because when a consumer is shut down we also take down the stack ETS entry:

electric/packages/sync-service/lib/electric/shape_cache/pure_file_storage.ex, lines 537 to 540 in c6fb885
electric/packages/sync-service/lib/electric/shapes/consumer.ex, lines 818 to 830 in c6fb885

So somehow we need to keep this read-through cache alive, or use a different one, since this one is also not alive if the consumer is never started (which would be common with lazy initialization). Open to ideas - I would ideally like a single cache rather than a separate one. Perhaps upon recovering shapes we could do an initialization on them that isn't necessarily starting a whole process? @magnetised would like to hear your thoughts.
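For context, a minimal sketch of the read-through lookup pattern being discussed, assuming a stack-level ETS table keyed per shape; the table layout and the read_metadata_from_disk helper are illustrative, not the actual pure_file_storage code:

```elixir
# Illustrative only: if the consumer has shut down (or was never started), the
# ETS entry is gone and every metadata read falls through to disk.
defp read_metadata(stack_ets, shape_handle, key) do
  case :ets.lookup(stack_ets, {shape_handle, key}) do
    [{_ets_key, value}] ->
      {:cached, value}

    [] ->
      # no in-memory entry, so fall back to reading the metadata file
      # (read_metadata_from_disk/2 is an assumed helper)
      {:disk, read_metadata_from_disk(shape_handle, key)}
  end
end
```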
magnetised left a comment:
Very nice. Feels like we're working with the grain of this thing
    end

    defp mark_snapshot_started(%State{stack_id: stack_id, shape_handle: shape_handle} = state) do
      :ok = ShapeCache.ShapeStatus.mark_snapshot_as_started(stack_id, shape_handle)
re our conversation yesterday about giving shape db some kind of validity flag, is it worth keeping this?
I think not, because the validity will rely on a snapshot_finished? check rather than this snapshot-started flag, so it needs to be implemented in a different way anyway.
    ctx do
      expect_shape_status(
        set_latest_offset: fn _, @shape_handle1, _ ->
          raise "The unexpected error"
could we keep this test but just move the crash onto a function that is called?
I tried but couldn't find something that made sense - we are literally crashing the process in the next test (which is exactly what a raise does: it sends an exit signal to itself, AFAIK), so I figured this is unnecessary coverage.
benchmark this

Good catch. We need some stack-level equivalent to the shape status ets for the storage. We have the
Force-pushed 8ee5277 to 1f87aca

benchmark this
@magnetised I've added a read-through cache on the storage layer so that readers go directly to memory when the shape is "sleeping".
magnetised left a comment:
Looks good, though I'm confused why we're only reading from the read-through cache when the writer is inactive. Feels messy to have two modes. Would it be horrible to remove the split?
      end
    end

    defp delete_shape_ets_read_through_cache_entry(stack_id, shape_handle) do
nit: I'm all for clear function names but this is a little verbose, no?
very verbose - but couldn't think of a better name that retains the clarity
      WriteLoop.cached_chunk_boundaries(initial_acc)
    )

    # remove any existing read-through cache now that writer is active
because?
the read-through cache is now unnecessary, and will also be outdated, so I'd rather remove it than leave outdated data hanging in memory
@magnetised the read-through cache is a cache that gets populated on read. If you're willing to have a go at consolidating them I'd be happy to accept some help on this. We could also merge this and do it in a separate PR (?)
Force-pushed b4abfa9 to 9e45664

Force-pushed 7e0d8c1 to f6dca76
alco left a comment:
Lovely!
Force-pushed f6dca76 to 6f73ec4
@msfstef I switched things so that some keys are always written to the stack cache, even if there's a writer active. Simplifies things at the expense of splitting reads/writes across two tables.
msfstef left a comment:
@magnetised love the change - had some comments about naming where it felt hard to decode the intent, and questions around what the existing ETS storage does now that it is not being updated for these keys.
    ) do
      if old_last_persisted_txn_offset != last_persisted_txn_offset do
        write_metadata!(opts, :last_persisted_txn_offset, last_persisted_txn_offset)
        write_cached_metadata!(opts, :last_persisted_txn_offset, last_persisted_txn_offset)
Not sure how this affects performance - I had decided to leave it as is purely to avoid additional round trips to ETS, as there seemed to have been an intentional effort to avoid them.
This call is on the hot path, so it's an important thing to note.
this last_persisted_txn_offset is available to the read path, so if we don't do a write-through here then reads will go to the disk (or will be stale).
also note that the original does a write to the disk here, then a write to ets immediately below, so there's not a huge difference (apart from having to split the writes between tables, and so not being able to issue a single update_element call).
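As a rough sketch of the write-through being discussed, write_cached_metadata! could mirror the just-persisted value into the stack-level ETS record. The record layout and the cached_key_position helper below are assumptions; only the function names appear in the diff:

```elixir
# Hypothetical write-through: after write_metadata!/3 has persisted the value
# to disk, mirror it into the stack ETS cache so readers never see a stale
# last_persisted_txn_offset. Assumes the writer has already initialised the
# cache entry for this shape.
defp write_cached_metadata!(%__MODULE__{} = opts, key, value) do
  # the tuple position for each cached key is illustrative
  pos = cached_key_position(key)
  true = :ets.update_element(opts.stack_ets, opts.shape_handle, {pos, value})
  :ok
end
```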
      do: Postgrex.query!(ctx.db_conn, stmt)

    - Process.sleep(100)
    + Process.sleep(120)
we could alternatively just have a helper assert_with_timeout where it retries the assertion for a maximum of 500ms
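A minimal sketch of such a helper, assuming ExUnit; the name, the 500ms default, and the retry interval are just the suggestion above, not existing test code:

```elixir
# Hypothetical test helper: retry an assertion until it passes or the timeout
# expires, instead of a single fixed Process.sleep/1 before asserting.
defp assert_with_timeout(assertion_fun, timeout_ms \\ 500, interval_ms \\ 10) do
  assertion_fun.()
rescue
  error in [ExUnit.AssertionError] ->
    if timeout_ms > 0 do
      Process.sleep(interval_ms)
      assert_with_timeout(assertion_fun, timeout_ms - interval_ms, interval_ms)
    else
      reraise error, __STACKTRACE__
    end
end
```

A call site would then read e.g. `assert_with_timeout(fn -> assert File.exists?(snapshot_path) end)` (with `snapshot_path` standing in for whatever the test is waiting on) instead of sleeping for a fixed interval.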
    ]

    # Record that caches read-through metadata to avoid disk reads
    defrecord :storage_meta_stack,
I have to admit the name storage_meta_stack, as opposed to storage_meta, is a bit confusing - what was the reasoning for this name?
not sure what you mean by "as opposed to" here. I renamed it from storage_meta_read_through to storage_meta_stack - the reasoning is that these are the values maintained at the stack level, not by the writer.
I'll admit the naming isn't 100%; perhaps storage_meta_global or something.
That's what I figured; the confusing aspect is that storage_meta is also stored in what we call the "stack ets", so the whole thing is a bit hard to decode.
A reference to it being a cache record (similar to how you renamed the ets to a cache table) might work?
    end

    defp write_metadata_cache(%__MODULE__{} = opts, key, value)
         when key in @stack_cached_keys do
if we're always writing to this, and not the other, should the other not be storing them in ETS at all? This looks like a good change but I'm confused that it's leaving "code residue" in the existing ETS that's a bit hard to decode
the existing ets is still used for everything that isn't made available to the read path. Also, what do you mean by "the other" here? Want to be sure what you mean.
by "the other" I do mean the original stack ets - basically I'm asking, given that we're storing the same keys in two different ETS tables and records, how are we ensuring they are never out of sync?
Force-pushed 6f73ec4 to 32829d4

benchmark this
@msfstef this now uses the stack ets for both the read cache and the write cache; I've merged it into this commit. The read path is careful about initializing the cache from disk -- there's a potential race with the writer initialising, so it uses insert_new. The write path isn't - it just initializes the cached values because it knows that it's in charge. Rather than deleting the cached values when the writer terminates, it just nils the cached values that aren't needed for the write path. I've looked at clean-up as well and decided to always explicitly delete the writer's ets table, at the cost of a few extra file writes.
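In sketch form, the insert_new guard on the read path amounts to something like this; the entry shape and the read_entry_from_disk helper are assumptions, only the use of :ets.insert_new/2 comes from the comment above:

```elixir
# Hypothetical read-path initialisation: only claim the cache entry if the
# writer hasn't already done so. :ets.insert_new/2 is atomic, so a concurrent
# writer start-up can't be clobbered by a stale disk read.
# read_entry_from_disk/1 is assumed to return a tuple keyed by shape_handle.
defp ensure_cached(stack_ets, shape_handle) do
  case :ets.lookup(stack_ets, shape_handle) do
    [entry] ->
      entry

    [] ->
      entry = read_entry_from_disk(shape_handle)

      if :ets.insert_new(stack_ets, entry) do
        entry
      else
        # the writer initialised the entry concurrently, so prefer its version
        [entry] = :ets.lookup(stack_ets, shape_handle)
        entry
      end
  end
end
```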
Force-pushed 28d702d to fe0d5c8
msfstef left a comment:
Beautiful, much better @magnetised
Only question is around populating the read through cache - it seems that if it is missing, we populate it in its entirety with all read path keys (rather than lazily populating each key) - was this deliberate? Why not only load the desired key with actual data and the rest with placeholders?
        :removed -> :ok
      end

      # always need to terminate writer to remove the writer ets (which belongs
      # to this process). leads to unnecessary writes in the case of a deleted
To clarify - by "unnecessary writes" do you mean that after the shape has been marked for removal it might receive some events that it will write to the log? In a future PR we could probably just fix that, since we mark the removal anyway.
Otherwise terminate_writer technically just closes files (which is a no-op if they were already closed).
A notionally deleted shape may have some buffered writes that the Storage.terminate call will flush to disk. We could save those writes, which is what I was doing by deleting the writer, but in the end I think it's worth knowing that the cleanup will always execute.
    defp populate_read_through_cache!(%__MODULE__{} = opts, extra_keys) do
      %{shape_handle: handle, stack_ets: stack_ets} = opts

      read_keys = Enum.into(extra_keys, MapSet.new(@read_path_keys))
so to clarify - if there's no active "cache", then any request for a single key will trigger as many reads as there are @read_path_keys? I thought the idea was to only populate it with the desired key, and everything else would have a placeholder :not_cached.
Unless the thought here is that if someone is doing a read, they will probably need all the read path keys anyway, so might as well load them all up front if someone needs a read. Would appreciate knowing what the logic is.
> Unless the thought here is that if someone is doing a read, they will probably need all the read path keys anyway, so might as well load them all up front if someone needs a read

That's the logic -- there is no partial read: if you need one value you'll need them all.
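In sketch form, the "no partial read" population could look roughly like this; the disk-read helper and the cached entry shape are illustrative, while the function head and the read_keys line come from the diff above:

```elixir
# Hypothetical continuation of populate_read_through_cache!/2: load every
# read-path key from disk in one pass and cache them as a single entry, since
# a reader that needs one of these values will need the rest anyway.
defp populate_read_through_cache!(%__MODULE__{} = opts, extra_keys) do
  %{shape_handle: handle, stack_ets: stack_ets} = opts

  read_keys = Enum.into(extra_keys, MapSet.new(@read_path_keys))

  values =
    for key <- read_keys, into: %{} do
      # read_metadata_from_disk!/2 is an assumed helper
      {key, read_metadata_from_disk!(opts, key)}
    end

  # insert_new so that a concurrently-starting writer's entry wins the race
  :ets.insert_new(stack_ets, {handle, values})
  :ok
end
```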
This PR has been released! 🚀 The following packages include changes from this PR:

Thanks for contributing to Electric!
Closes #3451

With @magnetised 's move of the handle -> shape and shape -> handle lookups into the ShapeDb abstraction, we actually no longer need to persist anything else in ShapeStatus.

For snapshot_started? we can directly read from the storage, which has its own read-through cache, and since this is only called in await_snapshot_start? we can just call it there directly and get rid of all the unused APIs.

For latest_offset, I've taken that out of ShapeStatus and instead we read it from storage with the same read-through cache. The ShapeCache APIs are responsible for collating shape handles found in the indices with an offset from the storage to maintain the same outward-facing API behaviour.

With this change, ShapeStatus no longer recovers any of the metadata (getting rid of 66% of reads on a cold restore), nor persists anything other than the lookups. Everything else is populated upon recovery.

I've gotten rid of all unused APIs and calls in the consumer and shape status, and I took the chance to do some API renaming for consistent behaviour (e.g. fetch_handle_by_shape and fetch_shape_by_handle, which both return {:ok, res} or :error).

I've also split Storage.get_current_position into Storage.get_latest_offset and Storage.get_pg_snapshot to avoid doing two reads when only one is needed, since in most places we need either one or the other. Only the consumer on initialization needs both, and it can simply call both APIs with the same performance.

I've also changed the ETS tables for most things to set from ordered_set - basically all except the relation lookups, since most don't do any ordered/range lookups.

If I have time I will also benchmark cold shape recovery for 50k shapes with this change.
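A rough usage sketch of the renamed/split APIs described above; the arities, arguments, and the create_new_shape helper are assumptions, only the function names and the {:ok, result} / :error convention come from this description:

```elixir
# Hypothetical call site showing the consistent return shapes and the split
# storage reads.
case ShapeStatus.fetch_handle_by_shape(stack_id, shape) do
  {:ok, shape_handle} ->
    # only the latest offset is needed here, so the split API avoids also
    # reading the pg_snapshot from storage
    {shape_handle, Storage.get_latest_offset(storage)}

  :error ->
    # assumed helper for the "shape not found" branch
    create_new_shape(stack_id, shape)
end
```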