RLP atomic override #523

KevFan · 2024-04-08T14:45:08Z

Description

Closes: #463

RLP atomic override. An RLP that targets a Gateway and defines an overrides block overrides the limits of HTTPRoute RLPs.

CEL validation was added to:

make the defaults block and the overrides block mutually exclusive
make the implicit limits block and the overrides block mutually exclusive
only allow the overrides block for RLPs targeting a Gateway resource
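
The three rules above are enforced with CEL on the CRD. As a rough illustration only, the same checks can be written as a plain Go function. The `Spec` type and its field names below are hypothetical stand-ins, not the actual RateLimitPolicy API types, and the error messages are invented for the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Spec is a hypothetical, trimmed-down stand-in for a RateLimitPolicy spec.
// Only the presence of each block matters for these checks.
type Spec struct {
	TargetKind   string // "Gateway" or "HTTPRoute"
	HasLimits    bool   // top-level limits block (implicit defaults) is set
	HasDefaults  bool   // explicit defaults block is set
	HasOverrides bool   // overrides block is set
}

// validate mirrors the three validation rules listed above.
func validate(s Spec) error {
	if s.HasOverrides && s.HasDefaults {
		return errors.New("overrides and explicit defaults are mutually exclusive")
	}
	if s.HasOverrides && s.HasLimits {
		return errors.New("overrides and implicit defaults are mutually exclusive")
	}
	if s.HasOverrides && s.TargetKind != "Gateway" {
		return errors.New("overrides are only allowed for policies targeting a Gateway")
	}
	return nil
}

func main() {
	// A Gateway policy with only an overrides block passes all three rules.
	fmt.Println(validate(Spec{TargetKind: "Gateway", HasOverrides: true}))
	// An HTTPRoute policy with an overrides block is rejected.
	fmt.Println(validate(Spec{TargetKind: "HTTPRoute", HasOverrides: true}))
}
```

The last three verification steps below exercise exactly these rejected combinations against the real CRD.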

Verification

The scenarios are already largely covered by the integration tests added in this PR.

If you want to manually verify:

Checkout this branch and deploy cluster

make local-setup

Deploy Kuadrant CR

kubectl -n kuadrant-system apply -f - <<EOF
apiVersion: kuadrant.io/v1beta1
kind: Kuadrant
metadata:
  name: kuadrant
spec: {}
EOF

Deploy toystore

kubectl apply -f examples/toystore/toystore.yaml
kubectl wait --timeout=300s --for=condition=Available deployment toystore

Create HTTP Route

kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: toystore
spec:
  parentRefs:
  - name: istio-ingressgateway
    namespace: istio-system
  hostnames:
  - api.toystore.com
  rules:
  - matches:
    - method: GET
      path:
        type: PathPrefix
        value: "/toys"
    backendRefs:
    - name: toystore
      port: 80
  - matches: # it has to be a separate HTTPRouteRule so we do not rate limit other endpoints
    - method: POST
      path:
        type: Exact
        value: "/toys"
    backendRefs:
    - name: toystore
      port: 80
EOF

Create GW RLP with defaults

kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: gw-rlp
  namespace: istio-system
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: istio-ingressgateway
  defaults:
    limits:
      "gateway":
        rates:
        - limit: 5
          duration: 10
          unit: second
EOF

Port forward gateway service

kubectl port-forward -n istio-system service/istio-ingressgateway-istio 9080:80 2>&1 >/dev/null &
export GATEWAY_URL=localhost:9080

Ensure the HTTPRoute is rate limited at 5 requests per 10 seconds

while :; do curl --write-out '%{http_code}\n' --silent --output /dev/null -H 'Host: api.toystore.com' http://$GATEWAY_URL/toys -X POST | grep -E --color "\b(429)\b|$"; sleep 1; done

Create HTTPRoute RLP with implicit default limits

kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: toystore
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: toystore
  limits:
    "route":
      rates:
      - limit: 8
        duration: 10
        unit: second
EOF

Ensure the HTTPRoute is rate limited at 8 requests per 10 seconds (the Route RLP defaults take precedence over the gateway defaults). Note: you might need to port-forward the gateway service again.

while :; do curl --write-out '%{http_code}\n' --silent --output /dev/null -H 'Host: api.toystore.com' http://$GATEWAY_URL/toys -X POST | grep -E --color "\b(429)\b|$"; sleep 1; done

Update GW RLP to set overrides instead

kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: gw-rlp
  namespace: istio-system
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: istio-ingressgateway
  overrides:
    limits:
      "gateway":
        rates:
        - limit: 5
          duration: 10
          unit: second
EOF

Ensure the HTTPRoute is rate limited at 5 requests per 10 seconds again (the GW RLP overrides take precedence)

while :; do curl --write-out '%{http_code}\n' --silent --output /dev/null -H 'Host: api.toystore.com' http://$GATEWAY_URL/toys -X POST | grep -E --color "\b(429)\b|$"; sleep 1; done

Verify overrides are not allowed on RLPs targeting HTTPRoutes

kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: toystore
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: toystore
  overrides:
    limits:
      "create-toy":
        rates:
        - limit: 8
          duration: 10
          unit: second
EOF

Verify explicit defaults and overrides blocks are mutually exclusive

kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: gw-rlp
  namespace: istio-system
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: istio-ingressgateway
  defaults:
    limits:
      "create-toy":
        rates:
        - limit: 1
          duration: 10
          unit: second
  overrides:
    limits:
      "create-toy":
        rates:
        - limit: 5
          duration: 10
          unit: second
EOF

Verify implicit defaults and overrides blocks are mutually exclusive

kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: gw-rlp
  namespace: istio-system
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: istio-ingressgateway
  limits:
    "create-toy":
      rates:
      - limit: 1
        duration: 10
        unit: second
  overrides:
    limits:
      "create-toy":
        rates:
        - limit: 5
          duration: 10
          unit: second
EOF

controllers/ratelimitpolicy_controller_test.go

guicassolato · 2024-04-10T09:18:32Z

controllers/ratelimitpolicy_limits.go

+// It iterates through the RateLimitPolicyList to find overrides for the provided target HTTPRoute.
+// If an override is found, it updates the limits in the RateLimitPolicySpec in accordingly.
+func (r *RateLimitPolicyReconciler) applyOverrides(ctx context.Context, rlp *kuadrantv1beta2.RateLimitPolicy, targetNetworkObject client.Object) error {
+	if route, ok := targetNetworkObject.(*gatewayapiv1.HTTPRoute); ok {


I think we could follow a different approach here that is indifferent to the fact that nowadays we only allow one policy of a kind targeting a given network resource at a time.

Instead, what about the following?

List all the policies that target the same resource, plus, if it's a route policy, all the policies that target any of the route's parent Gateways

Sort the list of policies by precedence – i.e. the higher the target is in the hierarchy, the higher the precedence; same level in the hierarchy, older policy beats newer ones

Drop all policies with lower precedence than rlp

Iterate from highest to lowest until finding a block of overrides, in which case replace the rlp's limits with it and break

In the future, for adding support for the merge strategy, we invert the order of the list (iterating from lowest to highest) and go all the way through the list (i.e. without breaking at the first match.)

A remaining problem to solve – which is an issue in your current proposed implementation also – is what to do in case of multiple parent gateways with override policies targeting them. I suppose we'll have to add the host names in.

WDYT?

I know we agreed on not necessarily using the DAG in the first iteration. I have to mention though that the fetching and sorting of the policies is precisely what it can help us with.

I am torn on this. Yes, using the DAG makes it an easier problem. If code freeze is in two weeks, I think it would be better to have a working solution sooner.

The conflict I have is with the hierarchy. Currently we only support policies attached to the gateway or the route. If the gateway has an atomic override, the number of policies attached to the route does not matter; the gateway always wins.

The same goes for multiple routes; we currently don't support that as a feature. While we possibly should, to me it feels like scope creep and solving a problem of tomorrow, a problem we are not even sure exists in the wild.

This is where my conflict is. I think we should be solving that problem, and that we should support policies attached to the gateway class. That is not the product of today, but of tomorrow.

I see your point, @Boomatang. The part that I don't like is the spreading of one decision across the entire code, i.e. the coupling of implementations at multiple levels to a rule otherwise enforced at a single point, possibly at a level far beyond. We are just increasing the number of points in the code that will need to change when the rule also shifts.

In this case, the "decision" (or "rule") we'd be coupling the implementation to or not is the current behaviour of 1 policy of a kind per target. I already lost count of how many places in the code we rely on that for something. And this one is particularly nasty because the rule itself is not even obvious at a glance. Instead, one who reads the code needs to have that click that "oh, right! this is because of only 1 policy per..."

Other examples include for sure the kinds of resources that can be targeted by a policy. Dozens of ifs and switch cases all over for this one too, when IMO in many cases the types could/should abstract the complexity. (BTW, you can git blame me on several of those.)

As for using the DAG, I may be overlooking the difficulties to add it to implementation here, so I'll let you and @KevFan decide. My initial intuition is that it could result in some DRYer code and arguably improve performance.

Hmm, yeah, I might play around with this, although part of it, at least for RLP, might already be addressed by @eguzki, since he has a draft PR (#527) introducing a new Limitador limits controller that uses the DAG.

Pushed an implementation using the DAG that focuses solely on the override portion of the reconcile logic, based on the suggested algorithm (#527 would be the general refactor to use the DAG throughout).

Currently doesn't account for:

A remaining problem to solve – which is an issue in your current proposed implementation also – is what to do in case of multiple parent gateways with override policies targeting them. I suppose we'll have to add the host names in.

In this scenario, the oldest Gateway RLP's override block is used

pkg/library/gatewayapi/types.go

controllers/ratelimitpolicy_limits.go

guicassolato

We're still missing overriding the rules in the wasm plugin config.

The reason why the verification steps seem to work is because, coincidentally, the limit is named "create-toy" in both policies. But, of course, we shouldn't expect that to be always the case. In fact, other differences between the override and the lower-tier policies, such as on when conditions and routeSelectors, would also cause divergences here.
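
To make the name coincidence concrete, here is a minimal sketch of the kind of mismatch described. It assumes, purely for illustration, that the data-plane (wasm) configuration and the limiter configuration each carry a list of named limits and that requests are only limited when the names line up. The `Limit` type and `missingInLimiter` helper are hypothetical and do not reflect the actual WasmPlugin or Limitador schemas.

```go
package main

import "fmt"

// Limit is a hypothetical named limit; real limits also carry conditions,
// route selectors, and so on.
type Limit struct {
	Name  string
	Limit int
}

// missingInLimiter returns the names referenced on the wasm side that have
// no matching limit on the limiter side.
func missingInLimiter(wasmLimits, limiterLimits []Limit) []string {
	known := make(map[string]bool, len(limiterLimits))
	for _, l := range limiterLimits {
		known[l.Name] = true
	}
	var missing []string
	for _, l := range wasmLimits {
		if !known[l.Name] {
			missing = append(missing, l.Name)
		}
	}
	return missing
}

func main() {
	routeLimits := []Limit{{Name: "route", Limit: 8}}
	gatewayOverrides := []Limit{{Name: "gateway", Limit: 5}}

	// wasm config built from the route policy, limiter built from the overrides:
	// the names diverge, so the override would not be enforced as intended.
	fmt.Println(missingInLimiter(routeLimits, gatewayOverrides))

	// Both sides built from the same effective (overridden) limits: no mismatch.
	fmt.Println(missingInLimiter(gatewayOverrides, gatewayOverrides))
}
```

When both policies happen to use the same limit name, the first call also reports no mismatch, which is why the verification steps appeared to pass.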

guicassolato · 2024-04-18T06:27:15Z

During verification, I personally found it useful to check the state of the WasmPlugin and Limitador CRs after each step of applying the policies:

kubectl get wasmplugin/kuadrant-istio-ingressgateway -n istio-system -o yaml
kubectl get limitador/limitador -n kuadrant-system -o yaml

Boomatang

Not much is standing out to me. I do have two questions in my comments.
Due to a local issue I have with rate limiting, I cannot test this locally.

Boomatang · 2024-04-22T10:50:41Z

controllers/ratelimitpolicy_limits.go

+		if slices.Contains(utils.Map(policyList, func(p kuadrantgatewayapi.Policy) client.ObjectKey {
+			return client.ObjectKeyFromObject(p)
+		}), client.ObjectKeyFromObject(rlp)) {
+			affectedPolicies = append(affectedPolicies, policyList...)
+		}


Is this what is expected? I am reading this as: if one policy in the policy list has the object key of the RLP, all the policies are added to the affected list.

Yes, currently that is exactly what is expected. This function returns all policies associated with a gateway if the current policy is contained within them (i.e. a Route RLP may be associated with multiple gateways, but a gateway RLP should just return its own associated policies), which can be further filtered later on a per-use basis.
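
The behaviour described in this exchange can be reproduced in isolation. The sketch below is an assumption-laden simplification: `Key`, `Policy`, `mapSlice`, and `affected` are hypothetical stand-ins for `client.ObjectKey`, the policy interface, `utils.Map`, and the surrounding function, and the per-gateway policy lists are passed in directly rather than read from the topology.

```go
package main

import (
	"fmt"
	"slices"
)

// Key is a hypothetical stand-in for client.ObjectKey.
type Key struct{ Namespace, Name string }

// Policy is a hypothetical minimal policy; only its identity matters here.
type Policy struct{ Namespace, Name string }

// Key returns the namespaced name identifying the policy.
func (p Policy) Key() Key { return Key{p.Namespace, p.Name} }

// mapSlice plays the role of utils.Map in the diff: it applies f to each element.
func mapSlice[T, U any](in []T, f func(T) U) []U {
	out := make([]U, 0, len(in))
	for _, v := range in {
		out = append(out, f(v))
	}
	return out
}

// affected returns every policy of every gateway whose policy list contains rlp.
// Lists are appended whole and are not de-duplicated, mirroring the snippet above;
// callers are expected to filter further.
func affected(rlp Policy, perGateway [][]Policy) []Policy {
	var out []Policy
	for _, policyList := range perGateway {
		keys := mapSlice(policyList, Policy.Key)
		if slices.Contains(keys, rlp.Key()) {
			out = append(out, policyList...)
		}
	}
	return out
}

func main() {
	route := Policy{"default", "toystore"}
	gwA := Policy{"istio-system", "gw-a"}
	gwB := Policy{"istio-system", "gw-b"}
	perGateway := [][]Policy{{gwA, route}, {gwB, route}}

	fmt.Println(len(affected(route, perGateway))) // route is attached to both gateways
	fmt.Println(len(affected(gwA, perGateway)))   // gateway policy only sees its own list
}
```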

Boomatang · 2024-04-22T10:56:58Z

controllers/ratelimitpolicy_limits.go

+			rlp.Spec.CommonSpec().Limits = p.Spec.Overrides.Limits
+			logger.V(1).Info("applying overrides from parent policy", "parentPolicy", client.ObjectKeyFromObject(p))
+			r.AffectedPolicyMap.SetAffectedPolicy(rlp, []client.ObjectKey{client.ObjectKeyFromObject(p)})
+			break


Should this not be continue instead of break? break will exit the loop, but there might still be policies left in filteredPolicies.

No, I went with break because the applied overrides can currently only come from one gateway RLP, with the oldest taking precedence. In this case, we should break out of the loop.

Of course it is sorted by the time stamp. Makes perfect sense now.

guicassolato

Verification steps working. Good job, @KevFan!

🚀

* feat: apply RLP gateway overrides
* feat: RLP CEL implicit default and override mutual exclusivity
* refactor: use DAG for applying RLP overrides
* refactor: RLP CEL for override - only allowed for gateways and mutually exclusive with defaults
* docs: RLP overrides
* feat: rlp enforced condition
* refactor: event handler for limitador status changes only
* refactor: overridden policy map
* refactor: override logic and integration tests
* refactor: overridden to affected policy map
* tests: add testing enforced condition to other integration tests
* fix: wasm plugin config not accounting for limit overrides
* tests: AuthPolicy enforced condition message
* fix: invalid reason not deleting second ns after test & fix missing comments

guicassolato reviewed Apr 10, 2024

controllers/ratelimitpolicy_controller_test.go (outdated, resolved)

guicassolato reviewed Apr 10, 2024

KevFan force-pushed the rlp-override branch 2 times, most recently from 100514e to 3ad3f9c Compare April 10, 2024 14:52

KevFan changed the title ~~[WIP]RLP atomic override~~ [WIP] RLP atomic override Apr 10, 2024

KevFan added the kind/enhancement (New feature or request) and area/api (Changes user facing APIs) labels Apr 10, 2024

KevFan force-pushed the rlp-override branch 5 times, most recently from 550539b to 365f209 Compare April 12, 2024 13:11

KevFan changed the title ~~[WIP] RLP atomic override~~ RLP atomic override Apr 12, 2024

KevFan changed the title ~~RLP atomic override~~ [WIP] RLP atomic override Apr 12, 2024

KevFan force-pushed the rlp-override branch 3 times, most recently from 3231881 to c7b7552 Compare April 12, 2024 15:40

KevFan changed the title ~~[WIP] RLP atomic override~~ RLP atomic override Apr 16, 2024

KevFan marked this pull request as ready for review April 16, 2024 09:58

KevFan requested a review from a team as a code owner April 16, 2024 09:58

KevFan self-assigned this Apr 16, 2024

KevFan mentioned this pull request Apr 17, 2024

[wip] feat: rlp enforced condition #533

Closed

guicassolato reviewed Apr 18, 2024

pkg/library/gatewayapi/types.go (resolved)

guicassolato reviewed Apr 18, 2024

controllers/ratelimitpolicy_limits.go (outdated, resolved)

guicassolato requested changes Apr 18, 2024

KevFan force-pushed the rlp-override branch 2 times, most recently from c8aa3f0 to 721e1cd Compare April 19, 2024 11:25

Boomatang reviewed Apr 22, 2024

feat: apply RLP gateway overrides

2f3d1bd

KevFan added 11 commits April 22, 2024 15:19

feat: RLP CEL implicit default and override mutual exclusivity

d6ac7a0

refactor: use DAG for applying RLP overrides

d5f37c7

refactor: RLP CEL for override - only allowed for gateways and mutually exclusive with defaults

3fc0f8e

docs: RLP overrides

8bafd7f

feat: rlp enforced condition

26ec9df

refactor: event handler for limitador status changes only

d7c61f7

refactor: overridden policy map

db61290

refactor: override logic and integration tests

cf11463

refactor: overridden to affected policy map

bf18ae1

tests: add testing enforced condition to other integration tests

707e779

fix: wasm plugin config not accounting for limit overrides

8f371e7

KevFan force-pushed the rlp-override branch from 5ca771b to 8f371e7 Compare April 22, 2024 14:23

tests: AuthPolicy enforced condition message

cb4c778

guicassolato approved these changes Apr 22, 2024

KevFan force-pushed the rlp-override branch from 839a18d to 85b6a96 Compare April 22, 2024 17:43

fix: invalid reason not deleting second ns after test & fix missing comments

76cc36f

KevFan force-pushed the rlp-override branch from 85b6a96 to 76cc36f Compare April 22, 2024 18:41

KevFan merged commit 0eb260b into Kuadrant:main Apr 22, 2024
13 checks passed

guicassolato mentioned this pull request Apr 23, 2024

RateLimitPolicy status for D/O #465

Closed

This was referenced Apr 23, 2024

RLP Enforced condition #414

Closed

RateLimitPolicy conditions don't indicate that the limit is applied #140

Closed

martinhesko mentioned this pull request Apr 23, 2024

Atomic overrides testing Kuadrant/testsuite#380

Closed
