-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
coord: Optimize single statement read transaction #15255
Conversation
76b19e8
to
77b749f
Compare
The PostgreSQL extended wire protocol (https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY) leaves it ambiguous as to whether or not the current statement will be single statement implicit transaction or a multi statement implicit transaction. Due to this ambiguity, we previously removed all of our single statement read transactions in a887755. This ended up negatively impacting query freshness and latency in the Serializable isolation level. PostgreSQL will eagerly commit implicit transactions after executing certain statements but before receiving a Sync message. See https://www.postgresql.org/message-id/flat/17434-d9f7a064ce2a88a3%40postgresql.org and https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=f92944137. This commit will eagerly commit ALL statements in an implicit transaction when using the extended protocol. This vastly simplifies things and allows us to re-apply the optimizations we previously made to single statement transactions. Multi-statement implicit transactions using the extended protocol are extremely rare and an edge case. They are not possible unless a client uses pipelining in psql or a custom driver. The PostgreSQL BEGIN documentation even wrongly implies that they don't exist (https://www.postgresql.org/docs/current/sql-begin.html). Therefore, it's much better to apply the optimizations in the common case than to allow this edge case feature. Much of this commit is reversing certain parts of a887755. Fixes #14696
77b749f
to
9538e2e
Compare
@@ -0,0 +1,49 @@ | |||
# Test that multiple DDL statements in the same extended request are supported |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this differ from postgres (why is this file in pgtest-mz instead of pgtest)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Postgres the Insert will be aborted while in Materialize the Insert will be committed in it's own transaction. I've added a SELECT
at the end to clarify this.
The single-statement optimization introduced in MaterializeInc#15255 caused Materialize to deviate in behavior from PostgreSQL when statements are submitted in implicit transactions using the extended query protocol. The deviation in behavior was intentional, as doing so greatly simplifies our code and it's fairly hard to hit this case in the protocol. However, a user recently discovered that the Node.js pg-query-stream library does not work correctly with Materialize due to this deviation in behavior. pg-query-stream attempts to page through a portal that is established in an implicit transaction via the extended query protocol, but the portal is immediately expired after the first batch of results. This commit adjusts the adapter/pgwire protocol to special-case paging through a portal established in an implicit transaction via the extended protocol, without losing the single-statement optimizations from MaterializeInc#15255. It adds an end-to-end test to verify that the Node.js pg-query-stream library is now working as expected. Fix #24059.
The single-statement optimization introduced in MaterializeInc#15255 caused Materialize to deviate in behavior from PostgreSQL when statements are submitted in implicit transactions using the extended query protocol. The deviation in behavior was intentional, as doing so greatly simplifies our code and it's fairly hard to hit this case in the protocol. However, a user recently discovered that the Node.js pg-query-stream library does not work correctly with Materialize due to this deviation in behavior. pg-query-stream attempts to page through a portal that is established in an implicit transaction via the extended query protocol, but the portal is immediately expired after the first batch of results. This commit adjusts the adapter/pgwire protocol to special-case paging through a portal established in an implicit transaction via the extended protocol, without losing the single-statement optimizations from MaterializeInc#15255. It adds an end-to-end test to verify that the Node.js pg-query-stream library is now working as expected. Fix #24059.
coord: Optimize single statement read transaction
The PostgreSQL extended wire protocol
(https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY)
leaves it ambiguous as to whether or not the current statement will be
single statement implicit transaction or a multi statement implicit
transaction. Due to this ambiguity, we previously removed all of our
single statement read transactions in
a887755. This ended up negatively
impacting query freshness and latency in the Serializable isolation
level.
PostgreSQL will eagerly commit implicit transactions after executing
certain statements but before receiving a Sync message. See
https://www.postgresql.org/message-id/flat/17434-d9f7a064ce2a88a3%40postgresql.org
and
https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=f92944137.
This commit will eagerly commit ALL statements in an implicit
transaction when using the extended protocol. This vastly simplifies
things and allows us to re-apply the optimizations we previously made
to single statement transactions.
Multi-statement implicit transactions using the extended protocol are
extremely rare and an edge case. They are not possible unless a client
uses pipelining in psql or a custom driver. The PostgreSQL BEGIN
documentation even wrongly implies that they don't exist
(https://www.postgresql.org/docs/14/sql-begin.html). Therefore,
it's much better to apply the optimizations in the common case than to
allow this edge case feature.
Much of this commit is reversing certain parts of
a887755.
Fixes MaterializeInc/database-issues#4202
Fixes MaterializeInc/database-issues#4010
Motivation
This PR adds a known-desirable feature.
Tips for reviewer
Checklist
This PR has adequate test coverage / QA involvement has been duly considered.
This PR evolves an existing
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way) and therefore is tagged with aT-proto
label.This PR includes the following user-facing behavior changes: