Skip to content

Make auth-proxy query EventPolicies dynamically#8870

Merged
knative-prow[bot] merged 5 commits intoknative:mainfrom
creydr:make-auth-proxy-load-ep-dynamically
Feb 4, 2026
Merged

Make auth-proxy query EventPolicies dynamically#8870
knative-prow[bot] merged 5 commits intoknative:mainfrom
creydr:make-auth-proxy-load-ep-dynamically

Conversation

@creydr
Copy link
Copy Markdown
Member

@creydr creydr commented Feb 3, 2026

IntegrationSink was baking EventPolicies into the AUTH_POLICIES env var, requiring deployment rollouts whenever policies changed. This required a redeploy when Policies changes (and lead to many flaky test failures because old pods with stale policies continued serving traffic during RollingUpdate).

This PR addresses it and changes auth-proxy to query EventPolicies dynamically using a namespace-scoped informer, similar to how Broker and Channel work. This eliminates deployment rollouts when EventPolicies change.

  • Add knative-eventing-eventpolicy-reader ClusterRole
  • Create namespace-scoped EventPolicy informer in auth-proxy
  • Add parent resource env vars to identify which resource to query policies for
  • Create RoleBinding in sink's namespace for EventPolicy access
  • Remove AUTH_POLICIES env var from deployment spec
  • Add test coverage for OIDC-enabled deployments with RoleBindings

Makes tests more stable and required for #8867

@creydr creydr requested review from matzew and simkam February 3, 2026 11:28
@knative-prow knative-prow bot requested review from Cali0707 and aliok February 3, 2026 11:28
@knative-prow knative-prow bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 3, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Feb 3, 2026

Codecov Report

❌ Patch coverage is 12.43243% with 162 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.91%. Comparing base (0ef7bc1) to head (5c9fe25).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
cmd/auth_proxy/main.go 0.00% 84 Missing ⚠️
...iler/integration/sink/resources/container_image.go 2.27% 43 Missing ⚠️
pkg/reconciler/integration/sink/integrationsink.go 43.13% 26 Missing and 3 partials ⚠️
pkg/reconciler/testing/v1alpha1/integrationsink.go 0.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8870      +/-   ##
==========================================
- Coverage   51.11%   50.91%   -0.21%     
==========================================
  Files         409      409              
  Lines       21381    21524     +143     
==========================================
+ Hits        10928    10958      +30     
- Misses       9602     9710     +108     
- Partials      851      856       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

IntegrationSink was baking EventPolicies into the AUTH_POLICIES env var,
requiring deployment rollouts whenever policies changed. This caused test
failures because old pods with stale policies continued serving traffic
during RollingUpdate.

Change auth-proxy to query EventPolicies dynamically using a namespace-scoped
informer, similar to how Broker and Channel work. This eliminates deployment
rollouts when EventPolicies change.

- Add knative-eventing-eventpolicy-reader ClusterRole
- Create namespace-scoped EventPolicy informer in auth-proxy
- Add parent resource env vars to identify which resource to query policies for
- Create RoleBinding in sink's namespace for EventPolicy access
- Remove AUTH_POLICIES env var from deployment spec
- Add test coverage for OIDC-enabled deployments with RoleBindings
@creydr creydr force-pushed the make-auth-proxy-load-ep-dynamically branch from a847b16 to ff8caa6 Compare February 3, 2026 11:39
@knative-prow knative-prow bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 3, 2026
@creydr
Copy link
Copy Markdown
Member Author

creydr commented Feb 3, 2026

/cc @simkam

@creydr
Copy link
Copy Markdown
Member Author

creydr commented Feb 3, 2026

/cc @Cali0707

Copy link
Copy Markdown
Member

@Cali0707 Cali0707 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this big change @creydr !

Left a few comments, but overall looks good

Comment thread cmd/auth_proxy/main.go
}

// Create namespace-scoped informer factory
eventingInformerFactory := eventinginformers.NewSharedInformerFactoryWithOptions(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on making this namespaced 😄

Comment thread cmd/auth_proxy/main.go
logger.Debugf("Found %d EventPolicies for %s/%s", len(policies), h.config.ParentNamespace, h.config.ParentName)

// Convert EventPolicies to SubjectsWithFilters
subjectsWithFilters := make([]auth.SubjectsWithFilters, 0, len(policies))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can cache this & recompute whenever the informer has a new event?

In a NS with a lot of policies this will probably really slow the proxy down (since it happens on every request as far as I can tell)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the informer cache should be enought. But you're right with the computation of the filters on each request. I updated it

func (r *Reconciler) reconcileEventPolicyRoleBinding(ctx context.Context, sink *sinks.IntegrationSink) error {
expected := resources.MakeEventPolicyRoleBinding(sink)

rb, err := r.kubeClientSet.RbacV1().RoleBindings(expected.Namespace).Get(ctx, expected.Name, metav1.GetOptions{})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we are using the kubeclient directly instead of an informer here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll update it

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated it

Comment thread pkg/reconciler/integration/sink/integrationsink.go
Comment thread config/core/roles/clusterrole-namespaced.yaml
@creydr
Copy link
Copy Markdown
Member Author

creydr commented Feb 4, 2026

/cc @Cali0707

@knative-prow knative-prow bot requested a review from Cali0707 February 4, 2026 10:36
Copy link
Copy Markdown
Member

@Cali0707 Cali0707 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Thanks @creydr !

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Feb 4, 2026
@knative-prow
Copy link
Copy Markdown

knative-prow bot commented Feb 4, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Cali0707, creydr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot merged commit 4ee1519 into knative:main Feb 4, 2026
39 of 41 checks passed
creydr added a commit to creydr/knative-eventing that referenced this pull request Feb 5, 2026
* Make auth-proxy query EventPolicies dynamically

IntegrationSink was baking EventPolicies into the AUTH_POLICIES env var,
requiring deployment rollouts whenever policies changed. This caused test
failures because old pods with stale policies continued serving traffic
during RollingUpdate.

Change auth-proxy to query EventPolicies dynamically using a namespace-scoped
informer, similar to how Broker and Channel work. This eliminates deployment
rollouts when EventPolicies change.

- Add knative-eventing-eventpolicy-reader ClusterRole
- Create namespace-scoped EventPolicy informer in auth-proxy
- Add parent resource env vars to identify which resource to query policies for
- Create RoleBinding in sink's namespace for EventPolicy access
- Remove AUTH_POLICIES env var from deployment spec
- Add test coverage for OIDC-enabled deployments with RoleBindings

* Get resync period from context

* Use rolebindingLister instead of kubeclient directly

* Delete EventPolicy RBAC when OIDC gets disabled

* Recreate subjectsWithFilters only on eventPolicy changes
openshift-merge-bot bot pushed a commit to openshift-knative/eventing that referenced this pull request Feb 5, 2026
* Make auth-proxy query EventPolicies dynamically (knative#8870)

* Make auth-proxy query EventPolicies dynamically

IntegrationSink was baking EventPolicies into the AUTH_POLICIES env var,
requiring deployment rollouts whenever policies changed. This caused test
failures because old pods with stale policies continued serving traffic
during RollingUpdate.

Change auth-proxy to query EventPolicies dynamically using a namespace-scoped
informer, similar to how Broker and Channel work. This eliminates deployment
rollouts when EventPolicies change.

- Add knative-eventing-eventpolicy-reader ClusterRole
- Create namespace-scoped EventPolicy informer in auth-proxy
- Add parent resource env vars to identify which resource to query policies for
- Create RoleBinding in sink's namespace for EventPolicy access
- Remove AUTH_POLICIES env var from deployment spec
- Add test coverage for OIDC-enabled deployments with RoleBindings

* Get resync period from context

* Use rolebindingLister instead of kubeclient directly

* Delete EventPolicy RBAC when OIDC gets disabled

* Recreate subjectsWithFilters only on eventPolicy changes

* Add probes to IntegrationSource & -Sink deployments (knative#8867)

* Add probes to IntegrationSource deployments

* Fix auth-proxy SINK_URI missing value error

The auth-proxy container was crashing with "required key SINK_URI
missing value" due to a circular dependency in the reconciliation
order. The deployment (with auth-proxy) was being created before
the IntegrationSink status.Address was set, causing the auth-proxy
to fail during startup.

This commit fixes the issue by deriving the SINK_URI directly from
the sink name and namespace (using network.GetServiceHostname())
instead of reading it from status.Address. This matches how the
reconcileAddress() function constructs the URL, but without
requiring the status to be set first.

The same approach is now used for the SINK_AUDIENCE when OIDC
authentication is enabled.

This eliminates the circular dependency and ensures the auth-proxy
always has a valid SINK_URI, regardless of reconciliation timing.

* Add readiness check to auth-proxy

* Fix unit tests

* Use same timings for auth-proxy probes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants