
Multiple requests in a transaction #286

Open
timbod7 opened this issue Sep 13, 2015 · 59 comments
Labels
enhancement a feature, ready for implementation

Comments

@timbod7

timbod7 commented Sep 13, 2015

I have a schema capturing events and participants:

CREATE TABLE demo.participant(
  id BIGSERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL
  );

CREATE TABLE demo.event(
  id BIGSERIAL PRIMARY KEY,
  title VARCHAR(255) NOT NULL
  );

CREATE TABLE demo.event_participant(
  event_id BIGINT NOT NULL REFERENCES demo.event(id),
  participant_id BIGINT NOT NULL REFERENCES demo.participant(id),
  PRIMARY KEY(event_id,participant_id)
  );

I can expose these tables directly with 3 postgrest views. However, I require that inserting/updating an event and the participants in that event be atomic.

I think I need a single view that somehow combines data from event and event_participant, and is insertable/updateable.

Is there any way to do this with postgrest?

@timbod7
Author

timbod7 commented Sep 21, 2015

I'm curious that there hasn't been a response to this. How are people handling many-many relationships with postgrest?

@begriffs
Member

Hey sorry for the slow response. I've been away on vacation this week and not on the computer much.

There is currently no general way to batch operations in a single transaction. As a workaround you can create a stored procedure to do complicated updates and call it with the rpc interface.
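For the schema above, a minimal sketch of such a procedure could look like this (the function name, signature and array-of-ids approach are illustrative, not an established convention):

CREATE OR REPLACE FUNCTION demo.set_event_participants(
  p_event_id        BIGINT,
  p_title           VARCHAR,
  p_participant_ids BIGINT[]
) RETURNS VOID AS $$
BEGIN
  -- upsert the event
  UPDATE demo.event SET title = p_title WHERE id = p_event_id;
  IF NOT FOUND THEN
    INSERT INTO demo.event(id, title) VALUES (p_event_id, p_title);
  END IF;

  -- replace the join rows for this event
  DELETE FROM demo.event_participant WHERE event_id = p_event_id;
  INSERT INTO demo.event_participant(event_id, participant_id)
    SELECT p_event_id, unnest(p_participant_ids);
END;
$$ LANGUAGE plpgsql;

Since each request runs in a single transaction, a POST to /rpc/set_event_participants would update the event and replace its participant links atomically.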

It would be good to invent a nice interface for batch operations. We could take advantage of HTTP/2 multiplexing and share a single transaction per request. Then the client could send several requests in the same connection to execute atomically.

@timbod7
Author

timbod7 commented Sep 21, 2015

Thanks for the reply. An interface for batch operations would be awesome.

But apart from the "how can I do this atomically" question, I guess I was wondering if there was a better way to model the data relationship given the capabilities of postgrest now? Given my naive model above, if I want to update an event along with its associated list of participants, I need to:

  - PUT the new event value
  - DELETE the existing values from event_participant
  - POST a csv to bulk insert the new values in event_participant

Perhaps, as you say the RPC approach is the best one.

@begriffs
Member

The RPC is your best bet at the moment, but we really need a batch interface. I can see that being high priority after the upcoming v0.3 release.

begriffs changed the title from "Atomic updates to join tables" to "Multiple requests in a transaction" on Apr 3, 2016
begriffs added the enhancement (a feature, ready for implementation) label on Apr 3, 2016
@dcominottim

Is this still a priority? When this is implemented, I will be able to use PostgREST in many of my projects.

@ruslantalpa
Contributor

Not in the immediate future but there is a possibility of manually implementing this by having a trigger on insert/update in the parent table take the values that are meant for the child table and send them there.
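A rough sketch of that idea, assuming the parent table carries the child values in an extra array column (all names here are illustrative):

ALTER TABLE demo.event ADD COLUMN participant_ids BIGINT[];

CREATE FUNCTION demo.route_participants() RETURNS trigger AS $$
BEGIN
  -- fan the values meant for the child table out to the join table,
  -- inside the same transaction as the parent insert/update
  IF NEW.participant_ids IS NOT NULL THEN
    DELETE FROM demo.event_participant WHERE event_id = NEW.id;
    INSERT INTO demo.event_participant(event_id, participant_id)
      SELECT NEW.id, unnest(NEW.participant_ids);
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER route_participants
  AFTER INSERT OR UPDATE ON demo.event
  FOR EACH ROW EXECUTE PROCEDURE demo.route_participants();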

@begriffs
Member

begriffs commented Oct 7, 2016

Our web server supports HTTP/2, so we can do this bulk operation stuff at some point but it'll take a significant amount of work and there are a number of more pressing issues to fix before attempting it.

@ephetic

ephetic commented May 9, 2017

I'm not sure about the relative stupidity of this suggestion, but couldn't you use custom headers (or reuse semantically appropriate standard ones) to track requests that should be wrapped in a single transaction?

X-TRANSACTION-START: f058ebd6-02f7-4d3f-942e-904344e8cde5
...
X-TRANSACTION-ID: f058ebd6-02f7-4d3f-942e-904344e8cde5
...
X-TRANSACTION-COMMIT: f058ebd6-02f7-4d3f-942e-904344e8cde5

@begriffs
Member

Your suggestion makes sense. I've seen a related scheme recommended on various places on the web: expose an HTTP resource that represents transactions. Let me outline how it could work, then talk about some problems.

How it could work

The server can expose /transactions which accepts only POST. When you post to it you can include a json object with options like the desired isolation level. It then responds with a transaction id (txid). Using this id you can access another endpoint, /transactions/:txid. DELETE would roll it back, POST (or maybe PUT) would commit it, GET would give statistics about it.

With the txid you could then send regular postgrest requests and either include it in a query param like &txid=foo or as an HTTP header if we find one that is appropriate. The web server thread would then find the existing db connection and resume sending commands on it. After doing regular postgrest requests the client would commit or rollback the transaction with HTTP calls on its resource.
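A hypothetical exchange, just to make the flow concrete (paths, parameters and response bodies here are illustrative, not a settled design):

POST /transactions
{"isolation": "serializable"}

201 Created
{"txid": "f058ebd6-02f7-4d3f-942e-904344e8cde5"}

PATCH /event?id=eq.1&txid=f058ebd6-02f7-4d3f-942e-904344e8cde5
{"title": "new title"}

POST /transactions/f058ebd6-02f7-4d3f-942e-904344e8cde5     (commit)
DELETE /transactions/f058ebd6-02f7-4d3f-942e-904344e8cde5   (or roll back instead)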

Challenges

OK that's the general idea, and here are challenges we would face:

  • Making sure that users couldn't spoof the txid to get their requests executed under another user's transaction with escalated permissions. PostgreSQL already has the notion of txid, accessible with txid_current(), but it is a simple number counting upward, easy to guess. We would have to obfuscate it with a big guid alias at the very least.
  • We would want an aggressive inactivity timeout on the open transactions so that a user couldn't easily lock up all our db pool connections.
  • The URLs I proposed don't match the postgrest url conventions or typical verbs. Could adjust it a little, but it makes the interface inconsistent.
  • Have to find a way to maintain affinity between client and postgrest server under load balancing. Sending the updates in an HTTP/2 request would get around this problem.
  • Holding transactions open is always expensive because postgresql is an MVCC system. Memory gets used, rows can't be vacuumed, etc. Waiting for all those round-trip HTTP requests to finish holds the transaction open longer than sending the requests in a single HTTP/2 request.

So I still think HTTP/2 is the most elegant way, but it may have problems of its own like client compatibility. What do you think? Do you have solutions for these problems, or perhaps problems I didn't imagine?

@ephetic

ephetic commented May 10, 2017

  • guid alias sounds sufficient
  • reasonable, if configurable
  • should be out-of-band. perhaps rpc endpoint
  • umm...pass
  • yeah, pass again

How would HTTP/2 handle arbitrary SELECT/UPDATE type transactions?

@jacobsvante

jacobsvante commented Jun 28, 2017

What about - until a good way of utilizing HTTP2 connections can be established - just supporting a simple /transaction endpoint which handles RFC6902 JSON PATCH-esque payloads with a very specific content-type? This way we'd have a good stable baseline for what I feel is an absolutely essential part of a REST API today, and perhaps especially in this project, considering the transactional nature of PostgreSQL.

I'm thinking the format could look like this:

POST /transaction
Content-Type: application/vnd.pgrst.patch+json
[
  {
    "verb": "DELETE",
    "endpoint": "/dogs?alive=is.false"
  },
  {
    "verb": "POST",
    "endpoint": "/dogs",
    "data": "{\"name\": \"Foo\"}"
  }
]

@steve-chavez
Member

Aside from the http2 interface to implement this feature, we also need a way to restrict the cost of what could be many expensive queries/mutations in a single transaction and that way avoid the risk of clients ddosing the database.

I think this would be better enforced at the database level with a statement_timeout or perhaps an explain cost maximum, so maybe it would be better to address #249 before this is implemented.
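For reference, such a per-role limit can already be set with plain SQL (the role name below is only an example of the role PostgREST switches to):

-- cap how long any statement may run for the impersonated role
ALTER ROLE web_anon SET statement_timeout = '2s';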

@jackfirth

jackfirth commented Oct 6, 2017

Comment from the peanut gallery about request batching: there's a standard application/http content type defined in the HTTP/1.1 spec that represents a sequence of requests or a sequence of responses. Posting that to some sort of "batch operations" resource could serve as an alternative to clients creating, manipulating, and committing transaction resources with a series of requests.
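Hypothetically, posting such a payload to a batch resource could look roughly like this (the /batch path and the exact message framing are assumptions; an application/http body is essentially a pipeline of HTTP messages):

POST /batch HTTP/1.1
Content-Type: application/http

PATCH /event?id=eq.1 HTTP/1.1
Content-Type: application/json

{"title": "new title"}
DELETE /event_participant?event_id=eq.1 HTTP/1.1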

@wildsurfer

there is a possibility of manually implementing this by having a trigger on insert/update in the parent table take the values that are meant for the child table and send them there.

This approach seems to be the best for current versions. However, it would be great to see something more transaction-friendly in the future.

@ilkosta

ilkosta commented Jan 13, 2018

Not in the immediate future but there is a possibility of manually implementing this by having a trigger on insert/update in the parent table take the values that are meant for the child table and send them there.

The approach seems something like https://github.com/tobyhede/postgresql-event-sourcing, but it would need a complete rewrite of the application logic, something that is not always possible.

@steve-chavez
Member

Even if the http2 interface is implemented, that still wouldn't solve the problem of INSERTing a row dependent on the id of another row (child/parent table).

I made a proposal for solving that part in #818 (comment).

@dmulfinger

Just throwing out an idea here:

Implement a /batch or /transaction endpoint. You can POST your array of commands:

[
{ "endpoint": "/authors", "payload": { "author_id" : 1, "first": "Robert", "last": Frost" } },
{ "endpoint": "/works", "payload": { "work_id": 1, "author_id": 1, "title": "Mending Wall" } },
{ "endpoint": "/works", "payload": { "work_id": 2, "author_id": 1, "title": "The Road Not Taken"} }
]

These would all be performed inside of a transaction. Should work for other HTTP requests as well.

Thoughts?

@christiaanwesterbeek

christiaanwesterbeek commented Aug 29, 2018

What's the benefit of a transaction interface over exposing an rpc function that takes json objects, arrays and what not? I know it's been said before, but I'm saying it again; you can do it all in your function.

Sure, creating such a function means putting some effort into handling json and arrays (of json objects) as arguments. But implementing an http request to construct a transaction (as proposed above) requires effort too.

I would put my effort in building the Postgres function instead of such an http request.

There's an argument to be made for keeping PostgREST lean, because that makes it less likely to introduce extra maintenance and bugs. Offload your requirements to Postgres, not PostgREST, when possible.

@dmulfinger

The benefit is that while a json-parsing /rpc function would need to be individually coded for every parent/child relationship of every project that uses PostgREST, a transaction interface as proposed above would be coded once into PostgREST. I agree that a json-parsing /rpc function can be written, but it's not necessarily easy to write one well that handles error cases correctly and reports meaningful error messages. And these /rpc functions have a big maintenance problem in that any change to the schema would require changes to the /rpc function. There's probably more to consider regarding how to declare the arguments meaningfully in the OpenAPI spec, but I haven't thought that through.

From the client's perspective, I think it's a minimal effort to batch the calls.

If handling multiple-table relationships transactionally can somehow be accommodated in PostgREST, I believe it would appeal to a significantly larger user base. As you can see above, others like me would find PostgREST the perfect tool if it could handle this very common use case.

@steve-chavez
Member

steve-chavez commented Aug 24, 2022

  • We would want an aggressive inactivity timeout on the open transactions so that a user couldn't easily lock up all our db pool connections.
  • The URLs I proposed don't match the postgrest url conventions or typical verbs. Could adjust it a little, but it makes the interface inconsistent.
  • Holding transactions open is always expensive because postgresql is an MVCC system. Memory gets used, rows can't be vacuumed, etc.

To avoid the above drawbacks and have client-safe transactions, one option could be to maintain state for our regular requests. Maintaining a hashmap of the transaction guid + a list of the generated queries would allow us to send them all at the same time using a single pool connection + transaction.

We would take the hit in memory consumption instead of the db, since we'd store the state until the transaction is sent. An inactivity timeout for saving memory would be needed; this would account for HTTP 1 latency, and it'd be configurable. HTTP2 would improve this as it reduces latency.

Interface

Reusing our Prefer: tx=commit/rollback header, the interface could look like:

PATCH /tbl?id=eq.1
Prefer: tx=begin; guid=550e8400-e29b-41d4-a716-446655440000

{"body": "value"}

200 OK
Preference-Applied: tx=begin; guid=550e8400-e29b-41d4-a716-446655440000

POST /other
Prefer: tx=continue; guid=550e8400-e29b-41d4-a716-446655440000

{"body": "value"}

200 OK
Preference-Applied: tx=continue; guid=550e8400-e29b-41d4-a716-446655440000

DELETE /another?id=eq.3
Prefer: tx=commit; guid=550e8400-e29b-41d4-a716-446655440000

200 OK
Preference-Applied: tx=commit; guid=550e8400-e29b-41d4-a716-446655440000
{"..."}

Limitations

  • Since there's no response from the database yet, the tx=begin|continue requests won't return a body (Prefer: return=representation won't be applied). The last query in the transaction would return its result or an error message.
  • Because of the above, GET/HEAD requests would have to fail in a tx context (or better, they would succeed and ignore the Prefer: tx=...).
  • No ability to obtain a previous query result to make a dependent mutation (comment above).
  • Could be a slow operation if latency is high. HTTP2 would help with this.
  • The memory consumption per tx would be the sum of the request bodies (needs solving: perf: Avoid copying the request body more than once #2261).
  • Not sure about this part: if HTTP2 can parallelize requests, we might have to add a position to the Prefer: tx.

Further possibilities

  • No ability to obtain a previous query result to make a dependent mutation

It could be possible to lift the above restriction if we refine the interface to construct a single query instead of multiple by using Data-Modifying Statements in WITH. The basic idea is to add the previous request result as a body of the next and refer to it with underscore operators.
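For reference, the kind of single statement this would boil down to, using data-modifying CTEs (the exact query PostgREST would generate is an open question; this only illustrates the SQL feature, reusing the tables from the issue body):

WITH new_event AS (
  INSERT INTO demo.event(title) VALUES ('Team meetup')
  RETURNING id
)
INSERT INTO demo.event_participant(event_id, participant_id)
SELECT new_event.id, p.participant_id
FROM new_event,
     (VALUES (1::bigint), (2::bigint)) AS p(participant_id);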

@bombillazo

@juanvilladev for transactional data, it will only work in the DB layer. There is no workaround for this using only the Supabase client. You can venture into using a full-fledged ORM like Prisma or MikroORM if they support the feature you want in your service layer.

@gitbugr

This comment was marked as off-topic.

@steve-chavez
Member

steve-chavez commented Jul 19, 2023

I'll hide some of the Supabase-specific comments as they're off topic. I suggest moving those to a discussion on Supabase repos (maybe here or here).


Agreed.. Kinda. In isolation this would be more akin to Firebase's "batched writes",

@gitbugr True, so there's another way of combining both reads + writes in one single request. Unlike the above proposal, it doesn't require maintaining state or multiple requests. Basically a custom multipart media type:

POST / HTTP/1.1
Content-Type: multipart/vnd.pgrst.transaction;boundary="boundary"

--boundary
Content-Type: application/vnd.pgrst.search+json; col.eq=$field;
Content-Disposition: target="a_table";

{"field": "val"}

--boundary
Content-Type: application/vnd.pgrst.patch+json; another_col.eq=$another_field;
Content-Disposition: target="a_view";

{"another_field": "another_val"}

It might be possible to have a placeholder like $previous to refer to a previous result (like SEARCH). This would also play along with other features.

It's somewhat similar to the JSON batch idea above, but has some advantages:

  • We don't need to put everything into a single JSON, which would require more parsing.
  • We get to reuse our other types of requests.
  • Additionally, we don't have to build a big CTE like in a previous idea; each statement can be separate.

What are the main challenges in delivering this feature? Would it be helpful to see how ORMs typically handle transactions or are these libraries too different from the REST API-based approach of PostgREST?

@bombillazo ORMs are inherently different as they're in a trusted environment (backend) and can leave a transaction open. In our case we need to provide client-side transactions for web users (untrusted); leaving them open can potentially DoS the database (intentionally or not). Additionally, we need to ensure that this is cohesive with the rest of the feature set. Otherwise it will be hard to maintain and prone to bugs.

@abdirahmn1

Any updates on this feature? I can't imagine writing a bunch of RPC functions for an app that deals with finances. Is there an ETA? Thanks.

@christiaanwesterbeek

I can't imagine making multiple http request or crafting a complex single request when my goal is to write multiple things to the database in a single transaction, especially when my data is about finance. Why go through that if you can simply write a postgres function and do a single POST to its endpoint?

@gitbugr

gitbugr commented Nov 16, 2023

I can't imagine making multiple http request or crafting a complex single request when my goal is to write multiple things to the database in a single transaction, especially when my data is about finance. Why go through that if you can simply write a postgres function and do a single POST to its endpoint?

So you've never used an ORM? You write all your calls as postgres stored procedures? You rewrite code you've written in your app's codebase? Manage debugging and versioning how? Doing everything to the db layer makes work in the real world pretty difficult to manage and increases the load on the db where it may be better suited for the client/server.

As was mentioned here even supabase edge functions don't support transactions, so it's not even a question of "don't do it client side", it's use RPC or bust.

@wolfgangwalther
Member

So you've never used an ORM?

Way back. But not anymore. All business logic inside PostgreSQL. Dumb client only, no other server "app".

You write all your calls as postgres stored procedures?

Certainly not all - because a lot of things work very nicely with regular requests on VIEWs. But yeah - a lot of stored procedures, too. Current project has 29 exposed views, 44 exposed RPCs, 43 hidden tables.

You rewrite code you've written in your app's codebase?

I'm writing my app from scratch with PostgREST.

Manage debugging and versioning how?

All SQL code is written and managed in git, versioned and migrated through postgresql extensions. CD workflow with GitOps principles - fully automated.

Doing everything to the db layer makes work in the real world pretty difficult to manage

Yes, the tooling is not perfect, yet. But it's certainly possible to do it.

and increases the load on the db where it may be better suited for the client/server.

That has not been my experience at all. And I even store all files, up to 1GB in size, in PostgreSQL. I stream videos from it (with nginx as a cache layer).

It is possible - just a different mind-set.

@abdirahmn1

I can't imagine making multiple http request or crafting a complex single request when my goal is to write multiple things to the database in a single transaction, especially when my data is about finance. Why go through that if you can simply write a postgres function and do a single POST to its endpoint?

True, but I meant that the app supports multiple features, so there are many RPC functions to maintain. It's a headache. Also, I'll have to do a ton of input sanitization before doing an INSERT; with TS I can use zod, but with Postgres I'll have to write a bunch of ugly and verbose IF checks! So yeah...

Currently I'm looking into a solution where I call my Node API that uses a pg lib like node-postgres; I then set JWT claims like Supabase does in order to enforce RLS, then I do the operations.

I like my business logic living at the service layer instead of the database.

@christiaanwesterbeek

christiaanwesterbeek commented Nov 16, 2023

So you've never used an ORM? You write all your calls as postgres stored procedures?

Apart from requests to views, yeah. My work focuses on the database, where almost all business rules are, and the client. I rarely code in the server. I do not use ORMs anymore, but I was never a heavy user of them to be honest.

When non-database stuff needs to occur, I write a trigger on a table or let a function create a job that a graphile-worker handles to run some Node.js task.

Lots of exposed postgres functions to maintain. I just counted them: 150 in my current project (and 77 exposed views). I could probably do with fewer. But nevertheless that logic must sit somewhere. And I've gotten so used to and comfortable with having it all in the database and having it exposed generically by Postgrest (or Postgraphile).

By only needing to code in the database and the client, I found that my velocity has increased compared to my old days where I also had to code in the server.

@abdirahmn1


@christiaanwesterbeek if I may ask, how do you do data transformation and/or validation before INSERTing or UPDATEing a record? IF checks work just fine if the input data is small, but what about big input data? Postgres is a mature DB, but the tooling around it is not so great.

@bombillazo

bombillazo commented Nov 16, 2023

You write all your calls as postgres stored procedures? You rewrite code you've written in your app's codebase?

This is not some far-fetched concept; in fact, it is how it's been historically done, and I'm willing to bet it is the industry standard when it comes to many battle-tested, legacy yet super robust scaled solutions.

Manage debugging and versioning how?

Debugging is a pain in SQL for sure, but it's doable and eased with a good SQL IDE. You can use migrations and a git repo to store the latest state of your schema and your functions.

Doing everything to the db layer makes work in the real world pretty difficult to manage and increases the load on the db where it may be better suited for the client/server.

People underestimate the processing capabilities of PostgreSQL and its server. RPCs have been used for decades for high-processing framework applications before the advent of ORMs and Query Builders. They're proven to handle huge loads of data and processing.

As was mentioned here even supabase edge functions don't support transactions, so it's not even a question of "don't do it client side", it's use RPC or bust.

I responded to that thread; the issue isn't edge functions, it is that Supabase chose PostgREST as the basis of its client, which is not suited for transactions. So either PostgREST resolves the transactions dilemma somehow, or Supabase will NEED to abandon PostgREST and build its client around an ORM (or build their own) to support transactions.

@christiaanwesterbeek if I may ask, how do you do data transformation and/or validation before INSERTing or UPDATEing a record? IF checks work just fine if the input data is small, but what about big input data? Postgres is a mature DB, but the tooling around it is not so great.

You can create reusable internal RPCs that validate the different parts of your payloads, and call them inside your other RPCs.

@wolfgangwalther
Member

By only needing to code in the database and the client, I found that my velocity has increased compared to my old days where I also had to code in the server.

Oh yeah, 100%.

if I may ask, how do you do data transformation and/or validation before INSERTing or UPDATEing a record? IF checks work just fine if the input data is small, but what about big input data?

In my case, I have a lot of constraints on the tables/columns and make sure to have them properly typed. Even if bad data makes it through the api layer, the storage layer will not be happy and error out. DOMAINs with constraints are also nice to validate input parameters of stored procedures, without handling that in the code itself.
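For example, a constrained DOMAIN used as a parameter type rejects bad input before the function body runs (the names below are illustrative, not from this thread):

-- casting the argument to the domain checks the constraint up front;
-- the users table is assumed for illustration only
CREATE DOMAIN valid_email AS text
  CHECK (VALUE ~ '^[^@[:space:]]+@[^@[:space:]]+$');

CREATE FUNCTION signup(p_email valid_email) RETURNS void AS $$
  INSERT INTO users(email) VALUES (p_email);
$$ LANGUAGE sql;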

@wolfgangwalther
Member

wolfgangwalther commented Nov 16, 2023

Browsing through this thread I am asking myself: What are some actual real-world use-cases here?

I only see the case in issue body itself (event + event_participants + participants). This can be nicely solved with a VIEW joining those three tables and then creating some INSTEAD OF triggers for INSERT and UPDATE. Nowadays, MERGE can be used in the update trigger to update the related tables nicely. This works well and gives you a clean HTTP interface for one "resource" (event), which is internally represented by three normalized data tables. Example here: #2933 (reply in thread)
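Roughly, for the schema in the issue body, a simplified sketch could be (it uses delete-and-reinsert instead of MERGE, and the api schema name is just a common convention):

CREATE VIEW api.event_with_participants AS
SELECT e.id, e.title,
       array_agg(ep.participant_id) AS participant_ids
FROM demo.event e
LEFT JOIN demo.event_participant ep ON ep.event_id = e.id
GROUP BY e.id, e.title;

CREATE FUNCTION api.event_with_participants_write() RETURNS trigger AS $$
BEGIN
  -- upsert the event row
  INSERT INTO demo.event(id, title) VALUES (NEW.id, NEW.title)
  ON CONFLICT (id) DO UPDATE SET title = EXCLUDED.title;

  -- replace the join rows for this event
  DELETE FROM demo.event_participant WHERE event_id = NEW.id;
  INSERT INTO demo.event_participant(event_id, participant_id)
    SELECT NEW.id, unnest(NEW.participant_ids);

  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER event_with_participants_write
  INSTEAD OF INSERT OR UPDATE ON api.event_with_participants
  FOR EACH ROW EXECUTE PROCEDURE api.event_with_participants_write();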

If you don't want to create that view, but use embedding here, then you're looking for #818 instead.

Would love to see some more examples that don't fit the view or relational insert pattern. Some questions to answer for each use-case:

  1. Do you need to send multiple "types" of requests (POST, PATCH, DELETE, ...) together - or would sending the same type of request to different endpoints at the same time be enough? For example, POSTing to 3 endpoints at the same time instead of only one.

  2. Do you need to use return values of a statement in subsequent statements? Or could you send all statements at once?

My gut feeling tells me, that for the latter case, where return values are required for the next statement, but everything should still happen atomically - RPCs are indeed the best (only) answer. But I think there is a gray area between "clearly a relational insert/update" and "clearly an RPC".

I have one example myself:

In my project, I can remove a user from a group. In theory this is a simple DELETE /members?group=eq.X&user=eq.Y. But this user can have group-related data, which can either be kept or be removed as well. This data is referenced via foreign keys, so when the admin wants to keep the data, this FK relationship must first be cleared, so that the data now only belongs "to the user" and not "to the membership" anymore. Basically there are two kinds of "delete": A soft delete and a hard delete. We have implemented this via RPC now, but that feels wrong: Why am I POSTing to some RPC, when I in fact want to DELETE from my regular endpoint? Although to be fair, I really don't have a problem creating an RPC here. I just don't like missing HTTP/REST semantics here...

If I could send multiple requests in one transaction, I would be sending a PATCH + DELETE request in this case. I could send those two requests together, too - they don't depend on each other's results. But in my perfect world, I would be able to pass an "argument" to a regular DELETE on the view, which I could then somehow read inside the INSTEAD OF trigger on that view. This way I could influence the "kind of delete" I want to have. I wouldn't know how to do that nicely, though.

@bombillazo

bombillazo commented Nov 16, 2023

@wolfgangwalther I think the most common use case is the comfort and friendly DX of atomically making any data changes in your DB using JS/TS. For example, creating an account may require data insertion in related tables in the same transaction so that any errors or failures will revert the intermediary data. Another example is making an API call to an external service during multiple inserts that must be atomic and reverting if the response is inadequate, etc.

I can't deny it is much easier to develop if you're using TypeScript; you have typing cues while building the logic versus pure SQL queries where column types, looping, iterations, JSON transforms, etc, are more complex to program. Debugging in SQL is also less friendly, having to RAISE logs and frequently run queries to test the logic versus having step debuggers, types, and other advanced tools.

I have hit a sweet spot with Kysely; it is a query builder that supports auto-generated typings for your DB schema, supports transactions (and other DB native features), and has the flexibility for different styles of programming. I view it almost as "writing SQL on TypeScript".

@wolfgangwalther
Member

creating an account may require data insertion in related tables in the same transaction so that any errors or failures will revert the intermediary data.

That's very much the "relational insert" mentioned above.

Another example is making an API call to an external service during multiple inserts that must be atomic and reverting if the response is inadequate

If I understand correctly, you want to make one request to PostgREST, then another one to an external service, then a second one to PostgREST - and all of them should be atomic on the PostgREST side. This is not going to happen - because that would mean, we'd need to keep the transaction open on the server-side and that's going to be blocking other requests. This does not scale well at all, and also is just not... REST. None of it. So I don't see us doing that.

I can't deny it is much easier to develop if you're using TypeScript; you have typing cues while building the logic versus pure SQL queries where column types, looping, iterations, JSON transforms, etc, are more complex to program.

I find complex operations involving data much easier to do in SQL than in JavaScript, TypeScript or anything.

Debugging in SQL is also less friendly, having to RAISE logs and frequently run queries to test the logic versus having step debuggers, types, and other advanced tools.

I use pgtap, run my tests with a temporary throw-away database and a watcher on my SQL and spec files - pretty much the same as my jest setup for the client. Works great, actually.

I agree some better tooling here would be awesome, though.

I have hit a sweet spot with Kysely; it is a query builder that supports auto-generated typings for your DB schema, supports transactions (and other DB native features), and has the flexibility for different styles of programming. I view it almost as "writing SQL on TypeScript".

That reads pretty much like "all business logic in the client". I don't think that is the PostgREST approach. For something like that, you are indeed much more helped with just a database connection directly from the client.

@christiaanwesterbeek

I want to mention Kanel, which I started using this year.

Generate Typescript types from a live Postgres database.

This is for people who don't like ORMs but who do like intellisense and type checking for their database access code.

@christiaanwesterbeek

@christiaanwesterbeek if I may ask, how do you do data transformation and/or validation before INSERTing or UPDATEing a record? IF checks work just fine if the input data is small, but what about big input data? Postgres is a mature DB, but the tooling around it is not so great.

When I need to do data transformation while POSTing or PATCHing, I would create an api view that has exactly the fields the UI needs to send. Next I write an instead-of insert/update trigger on the view that calls a trigger function that validates the data, modifies it, and writes it to the necessary tables. I might also simply create an rpc function that the UI can POST to. I couldn't really tell you when to use the view or the rpc, since I am not sure myself.

@abdirahmn1

This comment was marked as off-topic.

@divmgl

divmgl commented Feb 24, 2024

Want to mirror what this comment is saying now that I'm using Supabase.

Functions are enough to do all of my complicated inserts that need to be atomic. It's as simple as writing a function with parameters and then calling the function using PostgREST RPC. It's been working well so far.

I'm struggling to find a use-case where this feature makes sense.

@gitbugr

gitbugr commented Feb 26, 2024

I'm struggling to find a use-case where this feature makes sense.

It does mostly come down to developer experience as there are "ways around" each of these issues, but with scale, they become larger and larger issues. It's a little like saying "why use a power tool when my screwdriver does the job?".

  • Avoid query duplication for small changes.
    (What do I mean by this? When keeping code in an RPC you're removing it from your codebase - therefore, when pushing an update to your app that requires a change in that transaction, you now need to modify the RPC, requiring downtime. The way to avoid that issue is to avoid such destructive changes by making a new RPC with a different name and using that in your updated code, then deleting the old RPC some time after the update. But you can see how that can get out of hand the larger your app and the more RPCs you have.)

  • Avoiding the unnecessary rewriting of queries in SQL that you have already coded (in full or in part) using JS.
    (Reusing code through functions, classes, etc. is a very useful tool. If you want to use a query you've previously written in JS in your RPC... you can't. That requires you to rewrite it in SQL.)

  • Avoiding the unnecessary mental context switching between JS and SQL.

@gitbugr

gitbugr commented Feb 26, 2024

As a side note, this is now the ticket with the most comments, which isn't the most accurate metric for "how important is this feature", but it's certainly something with a lot of eyes on it and worth considering.


@wolfgangwalther
Member

wolfgangwalther commented Feb 26, 2024

Avoid query duplication for small changes.
(What do I mean by this? When keeping code in an RPC you're removing it from your codebase - therefore, when pushing an update to your app that requires a change in that transaction, you now need to modify the RPC, requiring downtime. The way to avoid that issue is to avoid such destructive changes by making a new RPC with a different name and using that in your updated code, then deleting the old RPC some time after the update. But you can see how that can get out of hand the larger your app and the more RPCs you have.)

This is a much more generic problem: Writing backwards-compatible APIs.

Supporting multiple requests in a transaction does not solve this challenge.

Avoiding the unnecessary rewriting of queries in SQL that you have already coded (in full or in part) using JS.
(Reusing code through functions, classes, etc. is a very useful tool. If you want to use a query you've previously written in JS in your RPC... you can't. That requires you to rewrite it in SQL.)

Wait, I can put this code on the server-side and make sure that anyone calling this, will do the right steps in that transaction? I don't need to replicate the same steps in different places, when multiple clients / multiple parts of my client need to do "the same thing"? That's re-using code.

Avoiding the unnecessary mental context switching between JS and SQL.

PostgREST is not a tool to avoid SQL. The opposite, actually. PostgREST allows to build an API with almost only SQL. The thing you're avoiding by using PostgREST is the server-side code in other languages.

So context switching between <your-favorite-client-language> and SQL is more than expected.

@bombillazo

This is a much more generic problem: Writing backwards-compatible APIs.

Supporting multiple requests in a transaction does not solve this challenge.

PostgREST is not a tool to avoid SQL. The opposite, actually. PostgREST allows to build an API with almost only SQL. The thing you're avoiding by using PostgREST is the server-side code in other languages.

So context switching between <your-favorite-client-language> and SQL is more than expected.

I get the DevEx points, though.

The issue that affects the DevEx here is the leakage of using PostgREST via other client libraries. The natural and immediate inclination of developers using PostgREST via a JS client like the Supabase client, where querying/inserts/updates are highly abstracted from their SQL implementations, is to start using PostgREST for everything directly in the front end (FE): simple insert logic, updating tables, complex join queries, etc. This happens because it is effortless and quick to "build queries" using these libraries that abstract the SQL. But the moment you need transactional logic, it all breaks down, because it forces one to move code from your FE logic into the backend (BE, meaning RPCs) and juggle what should go in the FE or BE.

We went through that process. At that point, moving more and more logic to RPCs reached its natural conclusion: we opted to simply abandon the PostgREST approach and do our queries all in the backend, because having to track what was implemented using PostgREST vs directly in RPCs, and having to call RPCs for some things and PostgREST for others, was becoming increasingly complex as more logic and resources were added to the application.

If one keeps using PostgREST, though, migrations can take care of the downtime issue for updating function logic. Still, I get that starting to use the PostgREST API logic from the FE and suddenly needing to migrate or duplicate code in the RPCs is a bad DevEx, but like I said, I think it was a misunderstanding of the limits of the technology.

@gitbugr

gitbugr commented Feb 26, 2024

This is a much more generic problem: Writing backwards-compatible APIs.

To an extent. But the lack of transactions exacerbates the issue / gives you more to worry about. No RPC = No need to keep your RPC backward compatible. The new behaviour will simply be reflected when the user's cache is updated.

Wait, I can put this code on the server-side and make sure that anyone calling this, will do the right steps in that transaction? I don't need to replicate the same steps in different places, when multiple clients / multiple parts of my client need to do "the same thing"? That's re-using code.

If you put all your queries on the server-side and call them from the client, yes, you can re-use them. But the server cannot call getSomeQuery() on the client side and receive some SQL. In which case, why use the client-side library for writing queries at all?

I think @bombillazo's reply above about abandoning PostgREST in favour of doing all queries in the backend illustrates my point.

PostgREST is not a tool to avoid SQL. The opposite, actually. PostgREST allows to build an API with almost only SQL. The thing you're avoiding by using PostgREST is the server-side code in other languages.

So context switching between and SQL is more than expected.

You know what I mean..

@abdirahmn1

This comment was marked as off-topic.

@zerogtiger

This comment was marked as off-topic.

@ProductOfAmerica

This comment was marked as off-topic.

@wolfgangwalther
Member

Imho, everything has been said about this feature request. I don't expect any new input to come up anymore. Since this issue receives off-topic comments / spam repeatedly, I will lock the conversation for now. Note that this issue is not closed, just locked.

We can re-evaluate this issue once we check the following boxes:

My personal opinion is that we will be able to happily close this issue once we have all of that, because there will not be a need anymore - but we will see.

PostgREST locked this issue as too heated and limited conversation to collaborators on May 4, 2024