-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RLP atomic override #523
RLP atomic override #523
Conversation
// It iterates through the RateLimitPolicyList to find overrides for the provided target HTTPRoute. | ||
// If an override is found, it updates the limits in the RateLimitPolicySpec in accordingly. | ||
func (r *RateLimitPolicyReconciler) applyOverrides(ctx context.Context, rlp *kuadrantv1beta2.RateLimitPolicy, targetNetworkObject client.Object) error { | ||
if route, ok := targetNetworkObject.(*gatewayapiv1.HTTPRoute); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could follow a different approach here that is indifferent to the fact that nowadays we only allow one policy of a kind targeting a given network resource at a time.
Instead, what about the following?
- List all the policies that target the same resource, plus, if it's a route policy, all the policies that target any of the route's parent Gateways
- Sort the list of policies by precedence – i.e. the higher the target is in the hierarchy, the higher the precedence; same level in the hierarchy, older policy beats newer ones
- Drop all policies with lower precedence than
rlp
- Iterate from highest to lowest until finding a block of overrides, in which case replace the
rlp
's limits with it andbreak
In the future, for adding support for the merge
strategy, we invert the order of the list (iterating from lowest to highest) and go all the way through the list (i.e. without breaking at the first match.)
A remaining problem to solve – which is an issue in your current proposed implementation also – is what to do in case of multiple parent gateways with override policies targeting them. I suppose we'll have to add the host names in.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we agreed on not necessarily using the DAG in the first iteration. I have to mention though that the fetching and sorting of the policies is precisely what it can help us with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am torn with my view on this. Yes the use of the DAG make it an easier problem. If code freeze is in two weeks I think it would be better to have a working solution sooner.
The conflict I have is with the hierarchy. Currently we only support polices attached to the gateway or the route. If the gateway has an atomic override the number of policies attached to route does not matter, the gateway always wins.
The same goes for the multiply routes, we currently don't support that as a feature. While we possible should to me it is feeling like scope creep and solving a problem of tomorrow. A problem we are not sure even exists in the wild.
This is were my conflict is. I think we should be solving that problem, and that we should support polices attached to the gateway class. That is not the product of today, but of tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point, @Boomatang. The part that I don't like is the spreading of one decision across the entire code, i.e. the coupling of implementations at multiple levels to a rule otherwise enforced at a single point, possibly at a level far beyond. We are just increasing the number of points in the code that will need to change when the rule also shifts.
In this case, the "decision" (or "rule") we'd be coupling the implementation to or not is the current behaviour of 1 policy of a kind per target. I already lost count of how many places in the code we rely on that for something. And this one is particularly nasty because the rule itself is not even obvious at a glance. Instead, one who reads the code needs to have that click that "oh, right! this is because of only 1 policy per..."
Other examples include for sure the kinds of resources that can be targeted by a policy. Dozens of ifs and switch cases all over for this one too, when IMO in many cases the types could/should abstract the complexity. (BTW, you can git blame me on several of those.)
As for using the DAG, I may be overlooking the difficulties to add it to implementation here, so I'll let you and @KevFan decide. My initial intuition is that it could result in some DRYer code and arguably improve performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed an implememtation using the DAG that is focused solely on the overriden portion of the reconcile logic based on the algorithm suggested (#527 would be the general refactor to use the DAG throughout)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently doesn't account for:
A remaining problem to solve – which is an issue in your current proposed implementation also – is what to do in case of multiple parent gateways with override policies targeting them. I suppose we'll have to add the host names in.
In this scenario, the oldest Gateway RLP's override block is used
100514e
to
3ad3f9c
Compare
550539b
to
365f209
Compare
3231881
to
c7b7552
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're still missing overriding the rules in the wasm plugin config.
The reason why the verification steps seem to work is because, coincidentally, the limit is named "create-toy" in both policies. But, of course, we shouldn't expect that to be always the case. In fact, other differences between the override and the lower-tier policies, such as on when
conditions and routeSelectors
, would also cause divergences here.
In the verification, I personally found useful checking the states of the WasmPlugin and Limitador CRs after each step of applying the policies: kubectl get wasmplugin/kuadrant-istio-ingressgateway -n istio-system -o yaml
kubectl get limitador/limitador -n kuadrant-system -o yaml |
c8aa3f0
to
721e1cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not much is standing out to me. I do have two question in my comments.
Due to a local issue I have with ratelimiting I can not test this locally
if slices.Contains(utils.Map(policyList, func(p kuadrantgatewayapi.Policy) client.ObjectKey { | ||
return client.ObjectKeyFromObject(p) | ||
}), client.ObjectKeyFromObject(rlp)) { | ||
affectedPolicies = append(affectedPolicies, policyList...) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this what is expected. I am reading this as if there is one policy in the policy list that contains the object key for the RLP, all the policies are added to the affected list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, currently that is exactly what is expected. This functions returns all policies that is associated with a gateway, if the current policy is contained within (i.e. Route RLP may be associated with multiple gateways but gateway RLP should just return its's own associated policies ) which can be further filtered out later on a per use basis
rlp.Spec.CommonSpec().Limits = p.Spec.Overrides.Limits | ||
logger.V(1).Info("applying overrides from parent policy", "parentPolicy", client.ObjectKeyFromObject(p)) | ||
r.AffectedPolicyMap.SetAffectedPolicy(rlp, []client.ObjectKey{client.ObjectKeyFromObject(p)}) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not be continue
instead of break
? break will exit the loop but there still might be policies left in the filteredPolicies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I went with break
because the applied overrides currently can only come from one gateway RLP, with the oldest having the precdence. In this case, we should break out the loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course it is sorted by the time stamp. Makes perfect sense now.
…ly exclusive with defaults
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verification steps working. Good job, @KevFan!
🚀
* feat: apply RLP gateway overrides * feat: RLP CEL implicit default and override mutual exclusivity * refactor: use DAG for applying RLP overrides * refactor: RLP CEL for override - only allowed for gateways and mutually exclusive with defaults * docs: RLP overrides * feat: rlp enforced condition * refactor: event handler for limitador status changes only * refactor: overridden policy map * refactor: override logic and integration tests * refactor: overridden to affected policy map * tests: add testing enforced condition to other integration tests * fix: wasm plugin config not accounting for limit overrides * tests: AuthPolicy enforced condition message * fix: invalid reason not deleting second ns after test & fix missing comments
Description
Closes: #463
RLP atomic override. RLP targetting gateways that define an override block, will override the limits of HTTPRoute RLPs.
CEL validation was added to:
defaults
andoverride
block mutually exclusiveoverride
block mutually exclusiveoverride
block for RLPs targetting Gateway resourceVerification
The scenario's is already generally tested with the integration tests added
If you want to manually verify:
kubectl apply -f examples/toystore/toystore.yaml kubectl wait --timeout=300s --for=condition=Available deployment toystore