Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RLP atomic override #523

Merged
merged 14 commits into from
Apr 22, 2024
Merged

RLP atomic override #523

merged 14 commits into from
Apr 22, 2024

Conversation

KevFan
Copy link
Contributor

@KevFan KevFan commented Apr 8, 2024

Description

Closes: #463

RLP atomic override. RLP targetting gateways that define an override block, will override the limits of HTTPRoute RLPs.

CEL validation was added to:

  • make defaults and override block mutually exclusive
  • make implicit limits block and override block mutually exclusive
  • only allow override block for RLPs targetting Gateway resource

Verification

The scenario's is already generally tested with the integration tests added

If you want to manually verify:

  • Checkout this branch and deploy cluster
make local-setup
  • Deploy Kuadrant CR
kubectl -n kuadrant-system apply -f - <<EOF                 
apiVersion: kuadrant.io/v1beta1                       
kind: Kuadrant
metadata:
  name: kuadrant
spec: {}
EOF
  • Deploy toystore
kubectl apply -f examples/toystore/toystore.yaml
kubectl wait --timeout=300s --for=condition=Available deployment toystore
  • Create HTTP Route
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: toystore
spec:
  parentRefs:
  - name: istio-ingressgateway
    namespace: istio-system
  hostnames:
  - api.toystore.com
  rules:
  - matches:
    - method: GET
      path:
        type: PathPrefix
        value: "/toys"
    backendRefs:
    - name: toystore
      port: 80
  - matches: # it has to be a separate HTTPRouteRule so we do not rate limit other endpoints
    - method: POST
      path:
        type: Exact
        value: "/toys"
    backendRefs:
    - name: toystore
      port: 80
EOF
  • Create GW RLP with defaults
kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: gw-rlp
  namespace: istio-system
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: istio-ingressgateway
  defaults:
    limits:
      "gateway":
        rates:
        - limit: 5
          duration: 10
          unit: second
EOF
  • Port forward gateway service
kubectl port-forward -n istio-system service/istio-ingressgateway-istio 9080:80 2>&1 >/dev/null &
export GATEWAY_URL=localhost:9080
  • Ensure HTTP Route is rate limit at 5 requests per 10 seconds
while :; do curl --write-out '%{http_code}\n' --silent --output /dev/null -H 'Host: api.toystore.com' http://$GATEWAY_URL/toys -X POST | grep -E --color "\b(429)\b|$"; sleep 1; done
  • Create HTTPRoute RLP with implicit default limits
kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: toystore
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: toystore
  limits:
    "route":
      rates:
      - limit: 8
        duration: 10
        unit: second
EOF
  • Ensure HTTP Route is rate limted at 8 requests per second (Route RLP defaults takes precedence over gateway defaults) (Note: you might need to port forward the gateway service again)
while :; do curl --write-out '%{http_code}\n' --silent --output /dev/null -H 'Host: api.toystore.com' http://$GATEWAY_URL/toys -X POST | grep -E --color "\b(429)\b|$"; sleep 1; done
  • Update GW RLP to set overrides instead
kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: gw-rlp
  namespace: istio-system
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: istio-ingressgateway
  overrides:
    limits:
      "gateway":
        rates:
        - limit: 5
          duration: 10
          unit: second
EOF
  • Ensure HTTP Route is rate limits at 5 requests per second again (GW RLP Override takes precedence)
while :; do curl --write-out '%{http_code}\n' --silent --output /dev/null -H 'Host: api.toystore.com' http://$GATEWAY_URL/toys -X POST | grep -E --color "\b(429)\b|$"; sleep 1; done
  • Verify overrides to not allowed on RLPs targetting HTTPRoutes
kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: toystore
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: toystore
  overrides:
    limits:
      "create-toy":
        rates:
        - limit: 8
          duration: 10
          unit: second
EOF
  • Verify explicit defaults and overrides blocks are mutually exclusive
kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: gw-rlp
  namespace: istio-system
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: istio-ingressgateway
  defaults:
    limits:
      "create-toy":
        rates:
        - limit: 1
          duration: 10
          unit: second
  overrides:
    limits:
      "create-toy":
        rates:
        - limit: 5
          duration: 10
          unit: second
EOF
  • Verify implicit defaults and overrides blocks are mutually exclusive
kubectl apply -f - <<EOF
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: gw-rlp
  namespace: istio-system
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: istio-ingressgateway
  limits:
    "create-toy":
      rates:
      - limit: 1
        duration: 10
        unit: second
  overrides:
    limits:
      "create-toy":
        rates:
        - limit: 5
          duration: 10
          unit: second
EOF

// It iterates through the RateLimitPolicyList to find overrides for the provided target HTTPRoute.
// If an override is found, it updates the limits in the RateLimitPolicySpec in accordingly.
func (r *RateLimitPolicyReconciler) applyOverrides(ctx context.Context, rlp *kuadrantv1beta2.RateLimitPolicy, targetNetworkObject client.Object) error {
if route, ok := targetNetworkObject.(*gatewayapiv1.HTTPRoute); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could follow a different approach here that is indifferent to the fact that nowadays we only allow one policy of a kind targeting a given network resource at a time.

Instead, what about the following?

  1. List all the policies that target the same resource, plus, if it's a route policy, all the policies that target any of the route's parent Gateways
  2. Sort the list of policies by precedence – i.e. the higher the target is in the hierarchy, the higher the precedence; same level in the hierarchy, older policy beats newer ones
  3. Drop all policies with lower precedence than rlp
  4. Iterate from highest to lowest until finding a block of overrides, in which case replace the rlp's limits with it and break

In the future, for adding support for the merge strategy, we invert the order of the list (iterating from lowest to highest) and go all the way through the list (i.e. without breaking at the first match.)

A remaining problem to solve – which is an issue in your current proposed implementation also – is what to do in case of multiple parent gateways with override policies targeting them. I suppose we'll have to add the host names in.

WDYT?

Copy link
Contributor

@guicassolato guicassolato Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we agreed on not necessarily using the DAG in the first iteration. I have to mention though that the fetching and sorting of the policies is precisely what it can help us with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am torn with my view on this. Yes the use of the DAG make it an easier problem. If code freeze is in two weeks I think it would be better to have a working solution sooner.

The conflict I have is with the hierarchy. Currently we only support polices attached to the gateway or the route. If the gateway has an atomic override the number of policies attached to route does not matter, the gateway always wins.

The same goes for the multiply routes, we currently don't support that as a feature. While we possible should to me it is feeling like scope creep and solving a problem of tomorrow. A problem we are not sure even exists in the wild.

This is were my conflict is. I think we should be solving that problem, and that we should support polices attached to the gateway class. That is not the product of today, but of tomorrow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point, @Boomatang. The part that I don't like is the spreading of one decision across the entire code, i.e. the coupling of implementations at multiple levels to a rule otherwise enforced at a single point, possibly at a level far beyond. We are just increasing the number of points in the code that will need to change when the rule also shifts.

In this case, the "decision" (or "rule") we'd be coupling the implementation to or not is the current behaviour of 1 policy of a kind per target. I already lost count of how many places in the code we rely on that for something. And this one is particularly nasty because the rule itself is not even obvious at a glance. Instead, one who reads the code needs to have that click that "oh, right! this is because of only 1 policy per..."

Other examples include for sure the kinds of resources that can be targeted by a policy. Dozens of ifs and switch cases all over for this one too, when IMO in many cases the types could/should abstract the complexity. (BTW, you can git blame me on several of those.)

As for using the DAG, I may be overlooking the difficulties to add it to implementation here, so I'll let you and @KevFan decide. My initial intuition is that it could result in some DRYer code and arguably improve performance.

Copy link
Contributor Author

@KevFan KevFan Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah might play around with this, although part of this, at least for RLP, might be addressed/implemented by @eguzki already since he has draft PR introducing a new limitador limits controller which uses the DAG #527

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed an implememtation using the DAG that is focused solely on the overriden portion of the reconcile logic based on the algorithm suggested (#527 would be the general refactor to use the DAG throughout)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently doesn't account for:

A remaining problem to solve – which is an issue in your current proposed implementation also – is what to do in case of multiple parent gateways with override policies targeting them. I suppose we'll have to add the host names in.

In this scenario, the oldest Gateway RLP's override block is used

@KevFan KevFan force-pushed the rlp-override branch 2 times, most recently from 100514e to 3ad3f9c Compare April 10, 2024 14:52
@KevFan KevFan changed the title [WIP]RLP atomic override [WIP] RLP atomic override Apr 10, 2024
@KevFan KevFan added kind/enhancement New feature or request area/api Changes user facing APIs labels Apr 10, 2024
@KevFan KevFan force-pushed the rlp-override branch 5 times, most recently from 550539b to 365f209 Compare April 12, 2024 13:11
@KevFan KevFan changed the title [WIP] RLP atomic override RLP atomic override Apr 12, 2024
@KevFan KevFan changed the title RLP atomic override [WIP] RLP atomic override Apr 12, 2024
@KevFan KevFan force-pushed the rlp-override branch 3 times, most recently from 3231881 to c7b7552 Compare April 12, 2024 15:40
@KevFan KevFan changed the title [WIP] RLP atomic override RLP atomic override Apr 16, 2024
@KevFan KevFan marked this pull request as ready for review April 16, 2024 09:58
@KevFan KevFan requested a review from a team as a code owner April 16, 2024 09:58
@KevFan KevFan self-assigned this Apr 16, 2024
Copy link
Contributor

@guicassolato guicassolato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're still missing overriding the rules in the wasm plugin config.

The reason why the verification steps seem to work is because, coincidentally, the limit is named "create-toy" in both policies. But, of course, we shouldn't expect that to be always the case. In fact, other differences between the override and the lower-tier policies, such as on when conditions and routeSelectors, would also cause divergences here.

@guicassolato
Copy link
Contributor

In the verification, I personally found useful checking the states of the WasmPlugin and Limitador CRs after each step of applying the policies:

kubectl get wasmplugin/kuadrant-istio-ingressgateway -n istio-system -o yaml
kubectl get limitador/limitador -n kuadrant-system -o yaml

@KevFan KevFan force-pushed the rlp-override branch 2 times, most recently from c8aa3f0 to 721e1cd Compare April 19, 2024 11:25
Copy link
Contributor

@Boomatang Boomatang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not much is standing out to me. I do have two question in my comments.
Due to a local issue I have with ratelimiting I can not test this locally

Comment on lines +181 to +185
if slices.Contains(utils.Map(policyList, func(p kuadrantgatewayapi.Policy) client.ObjectKey {
return client.ObjectKeyFromObject(p)
}), client.ObjectKeyFromObject(rlp)) {
affectedPolicies = append(affectedPolicies, policyList...)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what is expected. I am reading this as if there is one policy in the policy list that contains the object key for the RLP, all the policies are added to the affected list.

Copy link
Contributor Author

@KevFan KevFan Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently that is exactly what is expected. This functions returns all policies that is associated with a gateway, if the current policy is contained within (i.e. Route RLP may be associated with multiple gateways but gateway RLP should just return its's own associated policies ) which can be further filtered out later on a per use basis

rlp.Spec.CommonSpec().Limits = p.Spec.Overrides.Limits
logger.V(1).Info("applying overrides from parent policy", "parentPolicy", client.ObjectKeyFromObject(p))
r.AffectedPolicyMap.SetAffectedPolicy(rlp, []client.ObjectKey{client.ObjectKeyFromObject(p)})
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be continue instead of break? break will exit the loop but there still might be policies left in the filteredPolicies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I went with break because the applied overrides currently can only come from one gateway RLP, with the oldest having the precdence. In this case, we should break out the loop

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course it is sorted by the time stamp. Makes perfect sense now.

Copy link
Contributor

@guicassolato guicassolato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verification steps working. Good job, @KevFan!

🚀

@KevFan KevFan merged commit 0eb260b into Kuadrant:main Apr 22, 2024
13 checks passed
maleck13 pushed a commit that referenced this pull request Apr 25, 2024
* feat: apply RLP gateway overrides

* feat: RLP CEL implicit default and override mutual exclusivity

* refactor: use DAG for applying RLP overrides

* refactor: RLP CEL for override - only allowed for gateways and mutually exclusive with defaults

* docs: RLP overrides

* feat: rlp enforced condition

* refactor: event handler for limitador status changes only

* refactor: overridden policy map

* refactor: override logic and integration tests

* refactor: overridden to affected policy map

* tests: add testing enforced condition to other integration tests

* fix: wasm plugin config not accounting for limit overrides

* tests: AuthPolicy enforced condition message

* fix: invalid reason not deleting second ns after test & fix missing comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Changes user facing APIs kind/enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add full overrides to the RateLimitPolicy API
3 participants