Improving usability of subscription #18208

Open
3 of 5 tasks
hzxa21 opened this issue Aug 23, 2024 · 2 comments
Assignees
Labels
needs-discussion type/enhancement Improvements to existing implementation. type/feature
Milestone

Comments

@hzxa21
Collaborator

hzxa21 commented Aug 23, 2024

  • Change the default behavior of DECLARE cursor_name SUBSCRIPTION CURSOR FOR subscription_name to SINCE now() without backfilling historical data.
  • Change the op column type from int16 to varchar to make it easier to read: 1 -> insert, 2 -> update_insert, 3 -> delete, 4 -> update_delete
  • While waiting for new subscription data, change FETCH from returning an empty result to blocking (with an optional timeout) until new data arrives. This makes it easier for users to build applications in an event-driven manner. See: make FETCH NEXT FROM cur a blocking call #18107
  • When the user session is active (i.e. the client-FE connection is alive), automatically retry querying the log store on retryable errors, including cluster recovery, query stream timeouts, and transient network errors between FE and CN.
  • Show more information in SHOW CURSORS. See: feat(suscription): Improving usability of subscription #18217 (comment)
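
Until FETCH blocks on the server side, the blocking behavior proposed above can be approximated on the client. The following is only a sketch under assumptions: `fetch_once` is a hypothetical callable that runs one FETCH and returns its rows (an empty list when no new data has arrived), and `fetch_blocking`, `decode_op` and `OP_NAMES` are illustrative names, not part of RisingWave. The op mapping is copied from the list above.

```python
import time

# Mapping copied from the proposal above (int16 code -> proposed varchar value).
OP_NAMES = {1: "insert", 2: "update_insert", 3: "delete", 4: "update_delete"}


def decode_op(code):
    """Translate a numeric op code into the proposed readable name."""
    return OP_NAMES[code]


def fetch_blocking(fetch_once, timeout=None, poll_interval=0.1,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_once() until it returns rows or the optional timeout elapses.

    fetch_once is a hypothetical callable wrapping a single FETCH; it is
    assumed to return an empty list when no new data is available.
    Returns the rows, or an empty list if the timeout expired first.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        rows = fetch_once()
        if rows:
            return rows
        if deadline is not None and clock() >= deadline:
            return []
        sleep(poll_interval)
```

With a DB-API cursor, `fetch_once` could be a small function that executes `FETCH NEXT FROM cur` and returns `fetchall()`. Polling still costs one round trip per interval, which is part of why a server-side blocking FETCH is preferable.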

Feel free to post more ideas under this issue.

@hzxa21
Collaborator Author

hzxa21 commented Aug 27, 2024

When user session is active (i.e. client-FE connection is alive), automatically retry querying log store on retryable errors, including cluster recovery, query stream timeout, transient network error between FE and CN.
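
One way to picture the retry behavior quoted above is a small wrapper with exponential backoff. This is a sketch, not the FE implementation: `RetryableError`, `with_retry`, `session_alive` and the delay parameters are all hypothetical names chosen for illustration, and which errors count as retryable is exactly what the proposal leaves to be defined.

```python
import time


class RetryableError(Exception):
    """Hypothetical marker for errors worth retrying, e.g. cluster recovery,
    query stream timeout, or a transient FE-CN network error."""


def with_retry(op, session_alive, max_attempts=5, base_delay=0.1,
               max_delay=5.0, sleep=time.sleep):
    """Call op() and retry on RetryableError while the session is alive.

    op and session_alive are hypothetical zero-argument callables.
    Delays double after each failure and are capped at max_delay.
    The last error is re-raised once attempts run out or the session ends.
    """
    attempt = 0
    while True:
        try:
            return op()
        except RetryableError:
            attempt += 1
            if attempt >= max_attempts or not session_alive():
                raise
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
```

Non-retryable errors propagate immediately because only `RetryableError` is caught.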

Recently we found that if a user declares a cursor but doesn't fetch from it frequently, the query stream may remain valid but unpolled for a long time, which may cause the storage epoch to be pinned for a long time. We may extend the above idea to actively shut down an idle query stream and re-create one from the previous position in the log store when the cursor is fetched again.
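
A rough model of the idle-shutdown idea, to make the state involved explicit. Everything here is an assumption for illustration: `ResumableStream`, `open_stream(pos)` (which is taken to return a stream resuming after `pos`, or from the start when `pos` is None) and the `(pos, row)` item shape are hypothetical and do not describe the actual FE or log store code.

```python
import time


class ResumableStream:
    """Closes an idle stream and lazily re-opens it from the last position."""

    def __init__(self, open_stream, idle_timeout, clock=time.monotonic):
        # open_stream(pos) is hypothetical: it returns an object with
        # next() -> (pos, row) or None, and close().
        self._open_stream = open_stream
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._stream = None
        self._pos = None  # position of the last row handed to the client
        self._last_fetch = clock()

    def reap_if_idle(self):
        """Close the stream if it has not been fetched for idle_timeout.

        Returns True if a stream was closed. Closing it is what would
        release the pinned epoch in the scenario described above.
        """
        idle = self._clock() - self._last_fetch
        if self._stream is not None and idle >= self._idle_timeout:
            self._stream.close()
            self._stream = None
            return True
        return False

    def fetch(self):
        """Return the next row, re-creating the stream first if needed."""
        if self._stream is None:
            self._stream = self._open_stream(self._pos)
        item = self._stream.next()
        self._last_fetch = self._clock()
        if item is None:
            return None
        self._pos, row = item
        return row
```

The only state that has to survive the shutdown is the last delivered position, which is why the stream can be dropped without the client noticing.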

@lmatz
Contributor

lmatz commented Sep 11, 2024

https://docs.risingwave.com/docs/current/subscription/#persisting-the-consumption-progress requires users to persist the consumption progress themselves. We can let RW handle this internally by exposing an "ack" function to users.

In terms of what RW does underneath the "ack",

cur.execute("INSERT INTO subscription_progress (sub_name, progress) VALUES (%s, %s)", (sub_name, progress))
cur.execute("FLUSH")

I wonder if there could be cases (e.g. when barriers have already piled up in the system) where "flush" takes a non-negligible amount of time, so that "exactly-once delivery" and getting the latest results at low latency via subscribe cannot both be achieved at the same time.
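
To check whether this is a problem in practice, a client-side "ack" could time the FLUSH step. This is a sketch only: `ack` is a hypothetical helper, `conn` is assumed to be a DB-API connection whose cursor works as a context manager (as in psycopg2), the `VALUES (%s, %s)` clause is an assumption since the snippet above does not show one, and whether a plain INSERT replaces an existing row depends on how `subscription_progress` is defined.

```python
import time


def ack(conn, sub_name, progress, clock=time.monotonic):
    """Persist the consumption progress, then FLUSH.

    Returns the number of seconds spent waiting for FLUSH, so callers can
    observe how much latency the acknowledgement adds.
    """
    with conn.cursor() as cur:
        # Assumed statement shape; the table name comes from the snippet above.
        cur.execute(
            "INSERT INTO subscription_progress (sub_name, progress) "
            "VALUES (%s, %s)",
            (sub_name, progress),
        )
        start = clock()
        cur.execute("FLUSH")
        return clock() - start
```

Logging the returned duration per ack would show whether FLUSH latency grows when barriers pile up, which is the case in question.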
