feat: user-set rewrites #877
Conversation
Already? Wow!
Overall I find the code very enjoyable to read. I have only found some minor things and questions. I haven't looked into the algorithm itself yet.
@hperl if you decide to venture into caching, let's please sync first. At Ory, we've been badly burned by caching in security modules, and there are several considerations we need to address to get a consistent and, in particular, secure cache! I'd also be interested to see a couple of benchmark tests to understand what we can improve and where :)
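(For illustration, a check-path benchmark could look roughly like the sketch below; the engine type and Check signature are placeholders, not Keto's actual API.)

```go
package check_test

import (
	"context"
	"testing"
)

// engine is a trivial placeholder for the real check engine.
type engine struct{}

// Check is a stand-in with a plausible shape; the real engine would
// traverse relation tuples and userset rewrites here.
func (engine) Check(ctx context.Context, object, relation, subject string) (bool, error) {
	return true, nil
}

// BenchmarkCheck runs the hot path b.N times so latency and allocations
// show up in `go test -bench . -benchmem`.
func BenchmarkCheck(b *testing.B) {
	ctx := context.Background()
	var e engine

	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := e.Check(ctx, "doc:readme", "viewer", "user:alice"); err != nil {
			b.Fatal(err)
		}
	}
}
```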
@zepatrik let's merge that soon before I have to rebase again :)
childCtx, cancelFn := context.WithCancel(ctx)
defer cancelFn()

for _, check := range checks {
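(Reconstructed for context, since only the lines above appear in the diff: the pattern under review fans each check out into its own goroutine and cancels the shared child context once one check succeeds. Everything below besides those lines is an assumption about the surrounding code, not a quote from the PR.)

```go
package check

import "context"

// Reconstruction (not the PR's code): run all checks concurrently and
// short-circuit as soon as one succeeds, cancelling the rest via the
// shared child context.
func runChecks(ctx context.Context, checks []func(context.Context) bool) bool {
	childCtx, cancelFn := context.WithCancel(ctx)
	defer cancelFn()

	results := make(chan bool, len(checks)) // buffered: senders never block

	for _, check := range checks {
		check := check
		go func() { results <- check(childCtx) }()
	}

	for range checks {
		if <-results {
			return true // deferred cancelFn aborts the remaining checks
		}
	}
	return false
}
```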
Spawning a potentially unlimited number of goroutines that execute very little logic, or that cause a lot of load, should be avoided at all costs.
Making things asynchronous only makes sense if the workload is worth the overhead. If we are spawning a goroutine just to return a (more or less static) value, the overhead of spawning it is not worth it.
When using concurrency to execute costly functions (usually functions that make RPC calls, such as SQL, Redis, protobuf, ...), we need to limit how much load a single node can generate.
Please re-evaluate the concurrency pattern here and decide whether it is needed or not. If it is needed, please replace this with a configurable worker pool. If you need inspiration: https://github.com/aeneasr/pool (more of a 30-minute hack challenge, potentially not perfectly cancelable yet).
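(For illustration, one minimal way to bound the fan-out is a buffered channel used as a semaphore, as sketched below. Note that this only limits a single request, which is exactly the limitation discussed further down; the function name, signature, and limit are illustrative, not Keto's actual API.)

```go
package check

import "context"

// Sketch of bounding in-flight checks with a buffered channel used as a
// semaphore. The limit would come from configuration.
func runChecksBounded(ctx context.Context, checks []func(context.Context) bool, limit int) bool {
	sem := make(chan struct{}, limit)
	results := make(chan bool, len(checks)) // buffered: senders never block

	for _, check := range checks {
		check := check
		sem <- struct{}{} // blocks while `limit` checks are already running
		go func() {
			defer func() { <-sem }()
			results <- check(ctx)
		}()
	}

	for range checks {
		if <-results {
			return true // short-circuit on the first successful check
		}
	}
	return false
}
```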
Good point. We have implemented it this way because at a later point we want to off-load some of the concurrent processing to other Keto nodes, essentially a cluster mode.
The parallelism of the binary operations is directly bounded by the branching of the AST. So if we put an upper bound on the number of expressions that can be, e.g., 'or'ed together, we automatically limit the parallelism here.
I am talking about expressions like this:
this.related.P.traverse(p => p.permits.view(ctx)) ||
this.related.A.includes(ctx.subject) ||
this.related.B.includes(ctx.subject) ||
this.related.C.includes(ctx.subject) ||
this.related.D.includes(ctx.subject)
which will result in all 5 checks executing in parallel.
I still think that executing these in parallel is warranted, since each check will ultimately execute one or more database queries.
I think the better approach is to limit the branching in the AST (and leave this code as is). Right now, only the depth of the AST is limited.
WDYT?
While that might limit the concurrency of a single check, it won't limit the concurrency of millions of checks running in a multi-tenant system. If enough requests hit, and the parallelism is not bounded by a worker pool, we will run into DoS issues.
I thus recommend implementing a worker pool regardless of the decisions in the AST. Limiting depth and branching is a good idea in general, as an unbounded model could also lead to DoS attacks against Ory Cloud (by using a very wide or potentially recursive policy).
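(A sketch of what enforcing both limits could look like when validating a rewrite AST. The node type, limit parameters, and error messages are illustrative, not the types this PR introduces.)

```go
package namespace

import "fmt"

// node is an illustrative stand-in for a rewrite-AST node.
type node struct {
	op       string // e.g. "or", "and", "computed_userset"
	children []*node
}

// validate rejects trees that are too deep or too wide, bounding the
// worst-case fan-out a single check request can cause.
func validate(n *node, depth, maxDepth, maxBranch int) error {
	if depth > maxDepth {
		return fmt.Errorf("rewrite tree exceeds maximum depth %d", maxDepth)
	}
	if len(n.children) > maxBranch {
		return fmt.Errorf("%q has more than %d children", n.op, maxBranch)
	}
	for _, c := range n.children {
		if err := validate(c, depth+1, maxDepth, maxBranch); err != nil {
			return err
		}
	}
	return nil
}
```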
OK, makes sense. This will then be a global worker pool for all checks, right? Not scoped to one top-level check request?
@aeneasr I implemented a worker pool (basically https://gobyexample.com/worker-pools), PTAL 😃
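(Condensed, the referenced pattern from gobyexample.com/worker-pools looks like this: a fixed number of workers drain a shared jobs channel, so total concurrency is bounded globally rather than per request. This is a sketch of the pattern, not the PR's actual code.)

```go
package main

import "fmt"

// worker drains the shared jobs channel. Only numWorkers of these run,
// so at most numWorkers jobs execute concurrently across all requests.
func worker(jobs <-chan func() bool, results chan<- bool) {
	for job := range jobs {
		results <- job()
	}
}

func main() {
	const numWorkers = 4
	jobs := make(chan func() bool, 16)
	results := make(chan bool, 16)

	for w := 0; w < numWorkers; w++ {
		go worker(jobs, results)
	}

	// In Keto, the jobs would be the individual subtree checks produced
	// by evaluating a userset-rewrite expression.
	const n = 10
	for i := 0; i < n; i++ {
		i := i
		jobs <- func() bool { return i%3 == 0 }
	}
	close(jobs)

	allowed := false
	for i := 0; i < n; i++ {
		allowed = allowed || <-results
	}
	fmt.Println("allowed:", allowed)
}
```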
This will allow reusing the tree to provide debug info on how a check decision was reached.

Co-authored-by: Henning Perl <hperl@users.noreply.github.com>
Rewrote the history a bit, see https://github.com/ory/keto/tree/feat/userset-rewrites-backup for the old history.
This PR implements userset rewrites as described in the Zanzibar paper (https://storage.googleapis.com/pub-tools-public-publication-data/pdf/10683a8987dbf0c6d4edcafb9b4f05cc9de5974a.pdf). This is a work in progress.
The first part of the implementation focuses on checks against an internal representation of the namespace configuration, a representation similar to the protobufs in the Zanzibar paper.
A later part will then translate from a high-level configuration language down to this internal representation.
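(For a rough idea of what such an internal representation could look like, here is a sketch loosely following the Zanzibar paper's userset-rewrite protobufs. All type and field names below are illustrative assumptions, not the types this PR introduces.)

```go
package namespace

// Sketch of a Zanzibar-style userset rewrite, loosely following the
// paper's protobuf messages; names are illustrative only.

// Rewrite is the root of a userset-rewrite expression for one relation.
type Rewrite struct {
	// Exactly one of these is set.
	Union        []*Child // union of the children's usersets
	Intersection []*Child
	Exclusion    *Exclusion
}

// Child is a single operand of a set operation.
type Child struct {
	// Exactly one of these is set.
	This            bool             // "_this": the relation's own tuples
	ComputedUserset *ComputedUserset // e.g. viewer -> editor
	TupleToUserset  *TupleToUserset  // e.g. viewers of the parent folder
	Rewrite         *Rewrite         // nested set operation
}

// ComputedUserset points at another relation on the same object.
type ComputedUserset struct {
	Relation string
}

// TupleToUserset walks a tupleset relation and evaluates a computed
// userset on each object it finds.
type TupleToUserset struct {
	TuplesetRelation string
	ComputedUserset  ComputedUserset
}

// Exclusion subtracts one userset from another.
type Exclusion struct {
	Include *Child
	Exclude *Child
}
```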
Related issue(s)
#263