Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sync-service): Clean up publication filters #2154

Merged
merged 24 commits into from
Dec 17, 2024

Conversation

msfstef
Copy link
Contributor

@msfstef msfstef commented Dec 11, 2024

Closes #1774

This work started to introduce column filters (see #1831) but ended up on a road block because of us using REPLICA IDENTITY FULL - however the work also takes care of cleaning up filters.

  • Introduced singular process for updating publication - we were locking on it before anyway, might as well linearise it ourselves.
  • Process maintains reference counted structure for the filters per relation, including where clauses and filtered columns, in order to produce correct overall filters per relation
  • Update to the publication is debounced to allow batching together many shape creations
  • Every update does a complete rewrite of the publication filters so they are maintained clean - but also introduced a remove_shape call so that if electric remains with no shapes it should also have no subscriptions to tables.

TODOs

  • Write tests for PublicationManager
  • Write procedure for recovering in-memory state from shape_status.list_shapes in recover_shapes
  • Split where clauses at top-level ANDs to improve filter optimality (suggested be @icehaunter ) - [edit: not doing this now, as we can be smart about this an do even more "merging" of where clauses like x = 1 and x = 2 to x in (1, 2) - separate PR]

@msfstef msfstef force-pushed the msfstef/filter-publication-columns branch 2 times, most recently from eb3a49c to be3538e Compare December 12, 2024 12:37
@msfstef msfstef marked this pull request as ready for review December 12, 2024 14:24
@msfstef
Copy link
Contributor Author

msfstef commented Dec 12, 2024

Regarding #1831 - we need to figure out when we need REPLICA IDENTITY FULL and make it so that it is only set when necessary (and column filters are removed in that case)

Once we determine if we can afford not to use REPLICA IDENTITY FULL at all times then we should be able to do column filtering easily with these changes.

Copy link
Contributor

@robacourt robacourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 👍

@msfstef
Copy link
Contributor Author

msfstef commented Dec 12, 2024

benchmark this

@msfstef msfstef force-pushed the msfstef/filter-publication-columns branch from f8edde4 to 64729a4 Compare December 12, 2024 16:56
Copy link
Contributor

github-actions bot commented Dec 12, 2024

Benchmark results, triggered for f8edd

concurrent shape creation completed

OLD

image

NEW

image

@electric-sql electric-sql deleted a comment from github-actions bot Dec 12, 2024
@msfstef msfstef force-pushed the msfstef/filter-publication-columns branch 2 times, most recently from 5105ced to 6e1aeca Compare December 16, 2024 13:33
@msfstef msfstef force-pushed the msfstef/filter-publication-columns branch from 6e1aeca to d0f488c Compare December 16, 2024 13:37
Copy link
Contributor

@icehaunter icehaunter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job, and good tests. Coupld of nitpicks but feel free to ignore

@msfstef msfstef force-pushed the msfstef/filter-publication-columns branch from d0f488c to 5e5531a Compare December 16, 2024 14:12
@electric-sql electric-sql deleted a comment from github-actions bot Dec 17, 2024
@msfstef
Copy link
Contributor Author

msfstef commented Dec 17, 2024

benchmark this

Copy link
Contributor

github-actions bot commented Dec 17, 2024

Benchmark results, triggered for 8d697

  • write fanout completed

write fanout results

  • diverse shape fanout completed

diverse shape fanout results

  • concurrent shape creation completed

concurrent shape creation results

  • many shapes one client latency completed

many shapes one client latency results

  • unrelated shapes one client latency completed

unrelated shapes one client latency results

@msfstef msfstef merged commit d7e7c72 into main Dec 17, 2024
32 checks passed
@msfstef msfstef deleted the msfstef/filter-publication-columns branch December 17, 2024 10:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clean up publication filters when shapes are removed
3 participants