Multiple requests in a transaction #286
I'm curious that there hasn't been a response to this. How are people handling many-to-many relationships with postgrest?
Hey, sorry for the slow response. I've been away on vacation this week and not on the computer much. There is currently no general way to batch operations in a single transaction. As a workaround you can create a stored procedure to do complicated updates and call it with the rpc interface. It would be good to invent a nice interface for batch operations. We could take advantage of HTTP/2 multiplexing and share a single transaction per connection. Then the client could send several requests in the same connection to execute atomically.
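For illustration, a minimal sketch of that workaround, assuming the event/participant schema described in the issue body (table and column names here are assumptions). Since PostgREST runs each request, including an RPC call, in a single transaction, the whole function is atomic:

```sql
-- Hypothetical schema: event(id, name), event_participant(event_id, participant_id).
-- Replaces an event's name and participant list in one atomic RPC call.
create or replace function set_event(
  _event_id     int,
  _name         text,
  _participants int[]
) returns void
language plpgsql
as $$
begin
  update event set name = _name where id = _event_id;

  delete from event_participant where event_id = _event_id;

  insert into event_participant (event_id, participant_id)
  select _event_id, unnest(_participants);
end;
$$;
```

The client would then issue a single `POST /rpc/set_event` with a JSON body like `{"_event_id": 1, "_name": "conference", "_participants": [1, 2]}`.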
Thanks for the reply. An interface for batch operations would be awesome. But apart from the "how can I do this atomically" question, I guess I was wondering if there is a better way to model the data relationship given the capabilities of postgrest now? Given my naive model above, if I want to update an event along with the list of associated participants, I need to: …
Perhaps, as you say, the RPC approach is the best one.
The RPC is your best bet at the moment, but we really need a batch interface. I can see that being high priority after the upcoming v0.3 release.
Is this still a priority? When this is implemented, I will be able to use PostgREST in many of my projects.
Not in the immediate future, but there is a possibility of manually implementing this by having a trigger on insert/update in the parent table take the values that are meant for the child table and send them there.
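A rough sketch of that trigger idea, with entirely hypothetical table names: the parent row carries a transient json column holding the child values, and a trigger fans them out within the same transaction:

```sql
-- Hypothetical: parent(id, name, child_values json), child(parent_id, value).
-- An AFTER trigger, so the parent row already exists when the child rows
-- reference it.
create or replace function fan_out_children() returns trigger
language plpgsql
as $$
begin
  insert into child (parent_id, value)
  select new.id, t.value
  from json_array_elements_text(new.child_values) as t(value);
  return null;  -- return value is ignored for AFTER triggers
end;
$$;

create trigger parent_fan_out
  after insert or update on parent
  for each row
  when (new.child_values is not null)
  execute procedure fan_out_children();
```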
Our web server supports HTTP/2, so we can do this bulk operation stuff at some point, but it'll take a significant amount of work and there are a number of more pressing issues to fix before attempting it.
I'm not sure about the relative stupidity of this suggestion, but couldn't you use custom headers (or reuse semantically appropriate standard ones) to track requests that should be wrapped in a single transaction?
Your suggestion makes sense. I've seen a related scheme recommended in various places on the web: expose an HTTP resource that represents transactions. Let me outline how it could work, then talk about some problems.

How it could work

The server can expose … With the txid you could then send regular postgrest requests and either include it in a query param like …

Challenges

OK, that's the general idea, and here are the challenges we would face: …

So I still think HTTP/2 is the most elegant way, but it may have problems of its own, like client compatibility. What do you think? Do you have solutions for these problems, or perhaps problems I didn't imagine?
How would HTTP/2 handle arbitrary SELECT/UPDATE type transactions?
What about, until a good way of utilizing HTTP/2 connections can be established, just supporting a simple /transaction endpoint? I'm thinking the format could look like this:

```
POST /transaction
Content-Type: application/vnd.pgrst.patch+json

[
  {
    "verb": "DELETE",
    "endpoint": "/dogs?alive=is.false"
  },
  {
    "verb": "POST",
    "endpoint": "/dogs",
    "data": "{\"name\": \"Foo\"}"
  }
]
```
Aside from the http2 interface to implement this feature, we also need a way to restrict the cost of what could be many expensive queries/mutations in a single transaction, and that way avoid the risk of clients DDoSing the database. I think this would be better enforced at the database level with a …
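The exact setting meant above is lost in this extract; one database-level guard along those lines (an assumption, not a quote) would be a per-role `statement_timeout`:

```sql
-- Assumes 'web_anon' is the role PostgREST switches to for anonymous
-- requests; any single statement exceeding the timeout is cancelled.
alter role web_anon set statement_timeout = '2s';
```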
Comment from the peanut gallery about request batching: there's a standard …
This approach seems to be the best for current versions. However, it would be great to see something more transaction-friendly in the future.
The approach seems something like https://github.com/tobyhede/postgresql-event-sourcing, but it would need a complete rewrite of the application logic, something that is not always possible.
Even if the http2 interface is implemented, that still wouldn't solve the problem of INSERTing a row dependent on the id of another row (child/parent table). I made a proposal for solving that part in #818 (comment).
Just throwing out an idea here: implement a /batch or /transaction endpoint. You can POST your array of commands: … These would all be performed inside of a transaction. It should work for other HTTP requests as well. Thoughts?
What's the benefit of a transaction interface over exposing an rpc function that takes json objects, arrays and whatnot? I know it's been said before, but I'm saying it again: you can do it all in your function. Sure, creating such a function means putting some effort into handling json and arrays (of json objects) as arguments. But implementing an http request to construct a transaction (as proposed above) requires effort too. I would put my effort into building the Postgres function instead of such an http request. There's an argument to be made for keeping PostgREST lean, because that's more likely not to introduce extra maintenance and bugs. Offload your requirements to Postgres, not PostgREST, when possible.
The benefit is that while a json-parsing /rpc function would need to be individually coded for every parent/child relationship of every project that uses PostgREST, a transaction interface as proposed above would be coded once into PostgREST. I agree that a json-parsing /rpc function can be written, but it's not necessarily easy to write one well that handles error cases correctly and reports meaningful error messages. And these /rpc functions have a big maintenance problem in that any change to the schema would require changes to the /rpc function. There's probably more to consider regarding how to declare the arguments meaningfully in the OpenAPI spec, but I haven't thought that through. From the client's perspective, I think it's a minimal effort to batch the calls. If handling multiple-table relationships transactionally can somehow be accommodated in PostgREST, I believe it would appeal to a significantly larger user base. As you can see above, others like me would find PostgREST the perfect tool if it could handle this very common use case.
To avoid the above drawbacks and have client-safe transactions, one option could be to maintain state for our regular requests. Maintaining a hashmap of the transaction guid plus a list of the generated queries would allow us to send them all at the same time using a single pool connection and transaction. We would take the hit in memory consumption instead of the db, since we'd store the state until the transaction is sent. An inactivity timeout for saving memory would be needed; this would account for HTTP 1 latency and it'd be configurable. HTTP/2 would improve this as it reduces latency.

Interface

Reusing our Prefer header:

```
PATCH /tbl?id=eq.1
Prefer: tx=begin; guid=550e8400-e29b-41d4-a716-446655440000

{"body": "value"}

200 OK
Preference-Applied: tx=begin; guid=550e8400-e29b-41d4-a716-446655440000
```

```
POST /other
Prefer: tx=continue; guid=550e8400-e29b-41d4-a716-446655440000

{"body": "value"}

200 OK
Preference-Applied: tx=continue; guid=550e8400-e29b-41d4-a716-446655440000
```

```
DELETE /another?id=eq.3
Prefer: tx=commit; guid=550e8400-e29b-41d4-a716-446655440000

200 OK
Preference-Applied: tx=commit; guid=550e8400-e29b-41d4-a716-446655440000

{"..."}
```

Limitations

…
Further possibilities
It could be possible to lift the above restriction if we refine the interface to construct a single query instead of multiple, by using Data-Modifying Statements in WITH. The basic idea is to add the previous request's result as a body of the next and refer to it with underscore operators.
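To make that concrete, a sketch of the kind of single statement such an interface could generate (table names taken from the issue's example, the rest assumed):

```sql
-- Two dependent writes as one atomic statement: the second insert refers
-- to the id generated by the first via a data-modifying WITH clause.
with new_event as (
  insert into event (name)
  values ('conference')
  returning id
)
insert into event_participant (event_id, participant_id)
select new_event.id, p.participant_id
from new_event,
     (values (1), (2)) as p(participant_id);
```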
@juanvilladev For transactional data, it will only work in the DB layer. There is no workaround for this using only the Supabase client. You can venture into using a full-fledged ORM like Prisma or MikroORM if they support the feature you want in your service layer.
I'll hide some of the Supabase-specific comments as they're off topic. I suggest moving those to a discussion on Supabase repos (maybe here or here).
@gitbugr True, so there's another way of combining both reads + writes in one single request. Unlike the above proposal, it doesn't require maintaining state or multiple requests. Basically, a custom content type:

```
POST / HTTP/1.1
Content-Type: multipart/vnd.pgrst.transaction;boundary="boundary"

--boundary
Content-Type: application/vnd.pgrst.search+json; col.eq=$field;
Content-Disposition: target="a_table";

{"field": "val"}
--boundary
Content-Type: application/vnd.pgrst.patch+json; another_col.eq=$another_field;
Content-Disposition: target="a_view";

{"another_field": "another_val"}
```

It might be possible to have a placeholder like …
It's somewhat similar to the JSON batch idea above, but has some advantages: …
@bombillazo ORMs are inherently different, as they're in a trusted environment (the backend) and can leave a transaction open. In our case we need to provide client-side transactions for web users (untrusted); leaving them open can potentially DoS the database (intentionally or not). Additionally, we need to ensure that this is cohesive with the rest of the feature set. Otherwise it will be hard to maintain and prone to bugs.
Any updates on this feature? I can't imagine writing a bunch of RPC functions for an app that deals with finances. Is there an ETA? Thanks.
I can't imagine making multiple http requests or crafting a complex single request when my goal is to write multiple things to the database in a single transaction, especially when my data is about finance. Why go through that if you can simply write a postgres function and do a single POST to its endpoint?
So you've never used an ORM? You write all your calls as postgres stored procedures? You rewrite code you've already written in your app's codebase? Manage debugging and versioning how? Doing everything in the db layer makes work in the real world pretty difficult to manage and increases the load on the db where it may be better suited for the client/server. As was mentioned here, even supabase edge functions don't support transactions, so it's not even a question of "don't do it client side", it's use RPC or bust.
Way back, but not anymore. All business logic inside PostgreSQL. Dumb client only, no other server "app".
Certainly not all, because a lot of things work very nicely with regular requests on …
I'm writing my app from scratch with PostgREST.
All SQL code is written and managed in git, versioned and migrated through postgresql extensions. CD workflow with GitOps principles, fully automated.
Yes, the tooling is not perfect yet. But it's certainly possible to do it.
I have not had that experience at all. And I even store all files, up to 1GB in size, in PostgreSQL. I stream videos from it (with nginx as a cache layer). It is possible, just a different mind-set.
True, but I meant: the app supports multiple features, so... many rpc functions to maintain. It's a headache. Also, I'll have to do a ton of input sanitization before doing an INSERT; with TS I can use zod, but with postgres I'll have to write a bunch of ugly and verbose IF checks! So yeah... currently I'm looking into a solution where I call my node api that uses a pg lib like node-postgres, then set jwt claims like supabase does in order to enforce RLS, and then do the operations. I like my business logic living at the service layer instead of the database.
Apart from requests to views, yeah. My work focuses on the database, where almost all business rules are, and the client. I rarely code in the server. I don't use ORMs anymore, but I was never a heavy user of those, to be honest. When non-database stuff needs to occur, I write a trigger on a table or let a function create a job that a graphile-worker handles to run some Node.js task. Lots of exposed postgres functions to maintain: I just counted them, 150 of them in my current project (and 77 exposed views). I could probably do with less. But nevertheless that logic must sit somewhere, and I've gotten so used to and comfortable with having it all in the database and having it exposed generically by Postgrest (or Postgraphile). By only needing to code in the database and the client, I found that my velocity has increased compared to my old days when I also had to code in the server.
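For reference, that trigger-to-job pattern might look like this with graphile-worker's SQL API (the task name and `users` table are made up):

```sql
-- Enqueue a background job whenever a user row is created; a Node.js
-- graphile-worker picks it up and runs the non-database work.
create or replace function enqueue_welcome_email() returns trigger
language plpgsql
as $$
begin
  perform graphile_worker.add_job(
    'send_welcome_email',
    json_build_object('user_id', new.id, 'email', new.email)
  );
  return new;
end;
$$;

create trigger users_welcome_email
  after insert on users
  for each row execute procedure enqueue_welcome_email();
```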
@christiaanwesterbeek If I may ask, how do you do data transformation and/or validation before …
This is not some far-fetched concept; in fact, it is how it's been historically done, and I'm willing to bet it is the industry standard when it comes to many battle-tested, legacy yet super robust scaled solutions.
Debugging is a pain in SQL for sure, but it's doable and eased with a good SQL IDE. You can use migrations and a git repo to store the latest state of your schema and your functions.
People underestimate the processing capabilities of PostgreSQL and its server. RPCs have been used for decades for high-processing framework applications, before the advent of ORMs and Query Builders. They're proven to handle huge loads of data and processing.
I responded to that thread; the issue isn't edge functions, it is that Supabase chose PostgREST as the basis of its client, which is not suited for transactions. So either PostgREST resolves the transactions dilemma somehow, or Supabase will NEED to abandon PostgREST and build its client around an ORM (or build their own) to support transactions.
You can create reusable internal RPCs that validate the different parts of your payloads, and call them inside your other RPCs.
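A sketch of that composition, with hypothetical names: a validation function lives in a private (unexposed) schema and is reused by the exposed RPC:

```sql
create schema if not exists private;

-- Reusable check, callable from any exposed RPC but not from the API.
create or replace function private.assert_email(_email text) returns void
language plpgsql
as $$
begin
  if _email !~ '^[^@]+@[^@]+$' then
    raise exception 'invalid email: %', _email;
  end if;
end;
$$;

-- Exposed RPC composing the internal validation with the insert.
create or replace function api.signup(_email text, _name text) returns void
language plpgsql
as $$
begin
  perform private.assert_email(_email);
  insert into accounts (email, name) values (_email, _name);
end;
$$;
```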
Oh yeah, 100%.
In my case, I have a lot of constraints on the tables/columns and make sure to have them properly typed. Even if bad data makes it through the api layer, the storage layer will not be happy and will error out.
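For instance (a hypothetical table), constraints like these reject bad data no matter which API path it arrives through:

```sql
create table payment (
  id           bigint generated always as identity primary key,
  amount_cents bigint not null check (amount_cents > 0),
  currency     char(3) not null check (currency ~ '^[A-Z]{3}$'),
  paid_at      timestamptz not null default now()
);
```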
Browsing through this thread I am asking myself: what are some actual real-world use-cases here? I only see the case in the issue body itself (event + event_participants + participants). This can be nicely solved with a view. If you don't want to create that view, but use embedding here, then you're looking for #818 instead. Would love to see some more examples that don't fit the view or relational insert pattern. Some questions to answer for each use-case: …
My gut feeling tells me that for the latter case, where return values are required for the next statement, but everything should still happen atomically, RPCs are indeed the best (only) answer. But I think there is a gray area between "clearly a relational insert/update" and "clearly an RPC". I have one example myself: in my project, I can remove a user from a group. In theory this is a simple … If I could send multiple requests in one transaction, I would be sending a …
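For comparison, the RPC answer to that gray-area case could look like this sketch (schema names assumed):

```sql
-- Remove a member and, if they were the last one, the group itself,
-- atomically within the single RPC transaction.
create or replace function leave_group(_group_id int, _user_id int)
returns void
language plpgsql
as $$
begin
  delete from group_members
  where group_id = _group_id and user_id = _user_id;

  delete from groups g
  where g.id = _group_id
    and not exists (select 1 from group_members m where m.group_id = g.id);
end;
$$;
```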
@wolfgangwalther I think the most common use case is the comfort and friendly DX of atomically making any data changes in your DB using JS/TS. For example, creating an account may require data insertion in related tables in the same transaction, so that any errors or failures will revert the intermediary data. Another example is making an API call to an external service during multiple inserts that must be atomic, and reverting if the response is inadequate, etc. I can't deny it is much easier to develop if you're using TypeScript; you have typing cues while building the logic, versus pure SQL queries where column types, looping, iterations, JSON transforms, etc., are more complex to program. Debugging in SQL is also less friendly, having to RAISE logs and frequently run queries to test the logic, versus having step debuggers, types, and other advanced tools. I have hit a sweet spot with Kysely; it is a query builder that supports auto-generated typings for your DB schema, supports transactions (and other DB-native features), and has the flexibility for different styles of programming. I view it almost as "writing SQL in TypeScript".
That's very much the "relational insert" mentioned above.
If I understand correctly, you want to make one request to PostgREST, then another one to an external service, then a second one to PostgREST, and all of them should be atomic on the PostgREST side. This is not going to happen, because it would mean we'd need to keep the transaction open on the server side, and that's going to block other requests. This does not scale well at all, and also is just not... REST. None of it. So I don't see us doing that.
I find complex operations involving data much easier to do in SQL than in JavaScript, TypeScript or anything else.
I use pgtap, run my tests with a temporary throw-away database and a watcher on my SQL and spec files, pretty much the same as my jest setup for the client. Works great, actually. I agree some better tooling here would be awesome, though.
That reads pretty much like "all business logic in the client". I don't think that is the PostgREST approach. For something like that, you are indeed much better served with just a database connection directly from the client.
I want to mention Kanel, which I started using this year.
When I need to do data transformation while POSTing or PATCHing, I would create an api view that has exactly the fields that the UI needs to send. Next I write an instead-of insert/update trigger on the view that calls a trigger function that validates the data, modifies it, and writes it to the necessary tables. I might also simply create an rpc function that the UI can POST to. I couldn't really tell you when to use the view or the rpc, since I am not sure myself.
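A minimal sketch of that view-plus-trigger pattern (all names hypothetical, reusing the event/participant example from the issue):

```sql
-- The view exposes exactly what the UI sends; the INSTEAD OF trigger
-- validates, transforms and routes the data to the underlying tables.
create view api.event_form as
select e.id, e.name, null::json as participants
from event e;

create or replace function api.event_form_write() returns trigger
language plpgsql
as $$
begin
  if new.name is null or btrim(new.name) = '' then
    raise exception 'event name is required';
  end if;

  insert into event (name)
  values (btrim(new.name))
  returning id into new.id;

  insert into event_participant (event_id, participant_id)
  select new.id, t.value::int
  from json_array_elements_text(new.participants) as t(value);

  return new;
end;
$$;

create trigger event_form_ins
  instead of insert on api.event_form
  for each row execute procedure api.event_form_write();
```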
I want to mirror what this comment is saying, now that I'm using Supabase. Functions are enough to do all of my complicated inserts that need to be atomic. It's as simple as writing a function with parameters and then calling the function using PostgREST RPC. It's been working well so far. I'm struggling to find a use-case where this feature makes sense.
It does mostly come down to developer experience, as there are "ways around" each of these issues, but with scale they become larger and larger issues. It's a little like saying "why use a power tool when my screwdriver does the job?".
This is a much more generic problem: writing backwards-compatible APIs. Supporting multiple requests in a transaction does not solve this challenge.
Wait, I can put this code on the server side and make sure that anyone calling it will do the right steps in that transaction? I don't need to replicate the same steps in different places when multiple clients / multiple parts of my client need to do "the same thing"? That's re-using code.
PostgREST is not a tool to avoid SQL. The opposite, actually. PostgREST allows you to build an API with almost only SQL. The thing you're avoiding by using PostgREST is the server-side code in other languages. So context switching between <your-favorite-client-language> and SQL is more than expected.
I get the DevEx points, though. The issue that affects the DevEx here is the leakage of using PostgREST via other client libraries. The natural and immediate inclination of developers using PostgREST via a JS client like the Supabase client, where queries/inserts/updates are highly abstracted from their SQL implementations, is to start using PostgREST for everything directly in the front end (FE): simple insert logic, updating tables, complex join queries, etc. This happens because it is effortless and quick to "build queries" using these libraries that abstract the SQL. But the moment you need transactional logic, it all breaks down, because it forces one to move code from the FE logic into the backend (BE, meaning RPCs) and juggle what should go in the FE or the BE. We went through that process. At that point, moving more and more logic to RPCs reached its natural conclusion: we opted to simply abandon the PostgREST approach and do our queries all in the backend, because having to track what was implemented using PostgREST vs directly in RPCs, and having to call RPCs for some things while using PostgREST for others, was becoming increasingly complex as more logic and resources were added to the application. If one keeps using PostgREST, though, migrations can take care of the downtime issue for updating function logic. Still, I get that starting to use the PostgREST API logic from the FE and suddenly needing to migrate or duplicate code in the RPCs is a bad DevEx, but like I said, I think it was a misunderstanding of the limits of the technology.
To an extent. But the lack of transactions exacerbates the issue / gives you more to worry about. No RPC = no need to keep your RPC backward compatible. The new behaviour will simply be reflected when the user's cache is updated.
If you put all your queries on the server side and call them from the client, yes, you can re-use them. But the server cannot call … I think @bombillazo's reply above about abandoning PostgREST in favour of doing all queries in the backend illustrates my point.
You know what I mean...
Imho, everything has been said about this feature request. I don't expect any new input to come up anymore. Since this issue receives off-topic / spam comments repeatedly, I will lock the conversation for now. Note that this issue is not closed, just locked. We can re-evaluate it once we check the following boxes: …
My personal opinion is that we will be able to happily close this issue once we have all of that, because there will not be a need anymore. But we will see.
I have a schema capturing events and participants:
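(The schema snippet itself is missing from this extract; a plausible reconstruction from the table names used throughout the thread, with assumed columns:)

```sql
create table event (
  id   serial primary key,
  name text not null
);

create table participant (
  id   serial primary key,
  name text not null
);

-- many-to-many join table
create table event_participant (
  event_id       int not null references event (id),
  participant_id int not null references participant (id),
  primary key (event_id, participant_id)
);
```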
I can expose these tables directly with 3 postgrest views. However, I require that inserting/updating an event and the participants in that event be atomic.
I think I need a single view that somehow combines data from event and event_participant, and is insertable/updatable.
Is there any way to do this with postgrest?