-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resource embedding disambiguation #1419
Conversation
I don't agree with this change for several reasons:
|
That's because of a lack of documentation. Given the mediums, users will annotate and care for their schema. Recently I saw this project through postgresweekly. Note the usage of postgraphile specific annotations.
Views obey constraints defined on their source tables, it's not true that they have nothing to do with constraints.
No need to look at the database or the openapi schema in case there's an ambiguous request. That's the whole point of the error message added in #1412. To "menially" disambiguate.
"prior knowledge" is also required for the previous "column" solution so I don't see the point there. In fact, with this fk solution the api creator could better guide the api consumer with a proper constraint name.
With the "column" method it's basically impossible right now to disambiguate if the FKs are composite. Also as I mentioned, for api users it'll also make the disambiguation error message clearer. There are many disadvantages with the #918 approach. First of all, the duck typing stuff is just "magical" and most users just randomly try the url(we have had many issues that proves this) that works for their desired result. The worst of the previous approach is that is not deterministic or "future proof" since the requests can give different results if the schema becomes more complex. |
I take the point about api consumers. The name "foreignKey" in the error could be a bit harsh. How about this, I think this example would better illustrate the advantage of this approach. Using these orders-addresses tables, rename their fk constraints to GET /orders?select=*,addresses(*)" You'd get a 300 Multiple Choices error with this body: {
"details": [
{
"cardinality": "m2o(many-to-one)",
"relationship": "billing",
"source": "test.orders",
"target": "test.addresses"
},
{
"cardinality": "m2o(many-to-one)",
"relationship": "shipping",
"source": "test.orders",
"target": "test.addresses"
}
],
"hint": "Disambiguate by choosing a relationship from the `details` key",
"message": "More than one relationship was found for orders and addresses"
} I've renamed "foreignKey" to "relationship"(name comes from foreign key relationship) and omitted the fk cols. Also note that the error message now no longer seems like it's "leaking internal details". It denotes a relationship between two resources. So the user would have to disambiguate with: GET /orders?select=*,addresses!billing(*)"
-- or
GET /orders?select=*,addresses!shipping(*)" |
That's just one more thing to do (that people do not usually do) to be able have a good api when it's not needed
I am talking about how things are on the db side (not postgrest), there is no concept of FK for views, only tables have this. This is why I say they have to think twice
I don't think you understand what I am saying here.
No it's not, all the issues were for two reasons
That probably is true but this PR also does not support that and also in 5 years I never saw ppl mentioning composite keys at all (I am sure it comes up even less in the embeding context)
This is just not true. If you specify the FK column based on which to embed the request is always deterministic (just as with constraint name). If you do not specify the the FK column, and there is a conflict, you get a error (just as with the constraint name) this is in the master after your PR. If at the beginning, there is no conflict, and you do not specify the FK col, and later another column is added and there is conflict, old request will start to fail (error message for conflict), and this will be the exact same behavior as with the "constraint name" To summarize:
With just documenting the current format in the "Resource embedding" section (which anyway has to be done for this pr or else the users will "randomly try urls" as before ) and in the "conflict error message" substitute Another problem I see with constraint_name approach (I did not see any examples) |
The problem about the previous "column" approach is that even with the improved ambiguity detection the resulting error message is very confusing: postgrest/test/Feature/EmbedDisambiguationSpec.hs Lines 41 to 131 in 2e6c78d
This is all because of trying to "get clever" with the regex. You can match undesired results in unexpected ways. The way I see it, the duck typing was just a temporary hack. It was also frowned upon by @begriffs when it was proposed. Design-wise, using the fk names is much better and also simpler for the codebase. Breaking changes are not bad if done gradually, in this release this would be the only major breaking change. Btw, the previously merged #1412 would also break old clients, likely all of the clients that used the #918 additions(because it just threw the first found result). If we're already going to break clients, let's do this right and not expect someone reporting a composite key disambiguation case to break clients again.
It's a matter of not being ambiguous and overload the dot
That works the same as before, and there are better tests about that. |
so change that and make it clear, to me this argument is like saying "this technical implementation is bad because it's undocumented"?
I just rechecked the code to remember the regex part. Breaking changes need to have a reason beyond "i think this implementation is more solid", the reason should be something like "i can not implement this new major/important feature without breaking some old functionality", anything less that that is disregard for the users.
If it were a comment form someone else it would be an argument but this is your comment so the "being confusing" is your opinion/bias, not a established fact (by lots of people pointing that out). Any issues related to resource embedding being sometimes confusing is strictly related to it not being properly documented (0 mentions on how to specify the used FK or even a hint of some kind) and not returning an error when there is a conflict and silently using the first match, it has no relation to the select parameter format. The
In the comment that you linked you criticized the format because it meant different things in different situations (column vs table), and in this PR i see the part after If this was a green field thing, i'd be less against this (for example i don't have a problem with Your feeling of this PR feeling more "solid" compared to current master comes from a mix of things which are not really true if you carefully look at them:
Talking about this has distilled this for me a bit, so the debate is:
|
I already mentioned a migration path for requests like
Once again, composite key disambiguation cannot be done without this feature and the ambiguous error messages are confusing for the users. I'd say disregard for the users is to put known half-baked designs that are known to fail in obvious ways. Documenting that previous design will also make PostgREST design seem "sloppy". It is obvious the "column" approach will fail. Also, once again, #1412 is already a breaking change.
Only on the M2M cases it's interpreted as a junction(or link) table. For the other cases it'll be the fk constraint. This can be documented as a |
I've added a composite fk test case(based off a production deployment I have). In general these are rare but I wouldn't want to influence the schema design because of a shortcoming in pgrst(this would be similar to ORMs that don't support composite fks). I've also improved the error message in case of ambiguity: {
"details": [
{
"cardinality": "m2o",
"relationship": "billing",
"source": "test.orders",
"target": "test.addresses"
},
{
"cardinality": "m2o",
"relationship": "shipping",
"source": "test.orders",
"target": "test.addresses"
}
],
"hint": "By following the 'details' key, disambiguate the request by changing the url to /source?select=relationship(*) or /source?select=target!<relationship|cardinality|junction>(*)",
"message": "More than one relationship was found for orders and addresses"
} The
I don't think this error message can be shortened more and remain legible. In principle, we could keep backwards compatibility and the previous "column" approach. But that would make the error much more verbose(like the one I show here plus we'd need to add things like Ideally we wouldn't have breaking changes, but this is sometimes necessary to keep us improving. This issue in particular would make #818 doable(ambiguity stopped me from pursuing that one). Also I don't think this as a "major" breaking change(such as the one regarding |
Adding my two cents into this, the renaming of FKs is no simple task for us, especially because our ORM (SQLAlchemy) generates these automatically (and thus can get gnarly and long), and Alembic which applies them via generated migration plans. This would require quite a lot of tedious updates to our python code, to manually specify and rename foreignkeys, and make sure our developers require that this is now necessary rather than letting the ORM/Migration planner just handle this automatically. This also means our schema output is now dependant on also displaying foreign key constraint names. (We do not use the openAPI in postgrest as our primary API does a lot more and thus our schema inspection for users to view handles other use cases). Generally for our use case, its easy for our schema to output model definitions and users understand column names that point to other tables to use in postgrest calls. My two cents on the FK rename thing, this requirement will ripple through our code base. |
@bolerodan Can you estimate how many cases of ambiguous resource embedding do you have? I remember you mentioned one case in #1406. |
Ok, points taken. I've decided to minimize the impact of this breaking change.
Since self reference will be a special case and handled as before, perhaps the ability to disambiguate with the cardinality( I'll give this more thought and see if cardinality disambiguation can be removed. Or perhaps just enable it for self reference cases. |
This PR had too many breaking changes, besides removing the column name disambiguation(which could be made to work with the fk rename, without patching the client) the self reference cases were broken and in this case patching the client was needed. So in #1430, I've left these options like before and instead I've focused on adding the FK capability(not dropping support for column) and removing the duck typing. Let's continue there. |
Finishes the work started on #918. This should make resource embedding handle complex OLTP schemas in a non-ambiguous way.
The idea is to use the foreign key constraint name(as opposed to a column that it's part of the FK, previous discussion in #1078 (comment)) and the cardinality to do the disambiguation.
(These ideas share some similarities with Sybase's KEY JOIN, which also does an
AUTO JOIN
based on a FK)Foreign key constraint name
With the fk name, the previous duck typing features are not necessary and "shorthands" can be defined in the schema. For example:
And the request with the embedding would be like:
Another advantage with this is that we handle composite keys disambiguations with a short url.
Cardinality
In self reference cases the FK name is not enough to do the embedding since there's an m2o(many-to-one) and o2m(one-to-many) relationship with the same FK. So the cardinality can be specified to disambiguate:
many-to-many would be
m2m
. In this case the ability to specify the junction table remains the same as in #918.Also added some tests for circular references.
Combined
These capabilities can be combined:
Though in most cases just one hint will be enough to disambiguate. The error is also more clear(see the tests) and points what hint to use to disambiguate.
Limitation
There's a limitation for junction tables that have more than one FK to the source and target. This is a rare case that cannot be disambiguated without complicating the syntax. In this case the table should be split(which I'd argue it's better design from a normalization standpoint). A clear error will be shown in this case(see the tests).
Edit: this limitation can be lifted later if the need arises. The solution would be to allow to specify two more fks. Like
/users?select=tasks!m2m!fk1!fk2(*)
(long url, though there's no other choice). This could be added without causing a breaking change.