CNF-13731: Cert Manager HTTP01 Proxy #1773

Open: wants to merge 7 commits into master from enhancement_cert_manager_http01_proxy
Conversation

@sebrandon1 (Member) commented Mar 28, 2025

Draft for adding an enhancement for the Cert Manager HTTP01 proxy.

Based on #1682 for inspiration. Thanks @swghosh for pointing me to that.

cc @mvazquezc

@openshift-ci-robot added the jira/valid-reference label Mar 28, 2025
@openshift-ci-robot commented Mar 28, 2025

@sebrandon1: This pull request references CNF-13731 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.19.0" version, but no target version was set.

In response to this:

Draft for adding an enhancement for the Cert Manager HTTP01 proxy.

Based on #1682 for inspiration. Thanks @swghosh for pointing me to that.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci bot requested review from ashcrow and celebdor March 28, 2025 19:40
openshift-ci bot added the needs-ok-to-test label Mar 28, 2025
openshift-ci bot (Contributor) commented Mar 28, 2025

Hi @sebrandon1. Thanks for your PR.

I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


@openshift-bot commented

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

openshift-ci bot added the lifecycle/stale label Apr 26, 2025
@sebrandon1 force-pushed the enhancement_cert_manager_http01_proxy branch from faf4430 to b77ce7e on April 28, 2025 20:38
@everettraven (Contributor) commented

It looks like the markdownlint job is, correctly, failing due to missing sections:

```
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "### Goals"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "## Proposal"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "#### Hypershift / Hosted Control Planes"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "#### Standalone Clusters"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "#### Single-node Deployments or MicroShift"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "## Alternatives (Not Implemented)"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "## Test Plan"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "## Graduation Criteria"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "### Dev Preview -> Tech Preview"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "### Tech Preview -> GA"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "### Removing a deprecated feature"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "## Upgrade / Downgrade Strategy"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "## Version Skew Strategy"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "## Operational Aspects of API Extensions"
enhancements/cert-manager-proxy/http01_challenge_proxy.md missing "## Support Procedures"
```

I'm not sure what the policy is on removing certain sections altogether, but at the very least I would expect to still see the following sections:

  • Hypershift / Hosted Control Planes
  • Standalone Clusters
  • Single-node Deployments or MicroShift
  • Test Plan
  • Graduation Criteria
  • Dev Preview -> Tech Preview
  • Tech Preview -> GA
  • Upgrade / Downgrade Strategy
  • Version Skew Strategy
  • Operational Aspects of API Extensions
  • Support Procedures

If you need a more explicit template to follow for this EP, there is one with some good general advice for what should be in each section here: https://github.com/openshift/enhancements/blob/master/guidelines/enhancement_template.md

@openshift-bot commented

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-ci bot added the lifecycle/rotten label and removed the lifecycle/stale label May 9, 2025
@sebrandon1 force-pushed the enhancement_cert_manager_http01_proxy branch from ae29d68 to 21b454b on May 12, 2025 17:41
openshift-ci bot (Contributor) commented May 12, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yuqi-zhang for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


### Implementation Details/Notes/Constraints

- The proxy will be deployed as a DaemonSet to ensure it runs on all nodes which may host the API VIP in the cluster.
Contributor:

What will deploy the DaemonSet? Will this be a default component on OpenShift clusters? Does it need to be a part of the core payload or can it be a "layered product"?

@sebrandon1 (Member Author):

@mvazquezc can correct me if I'm wrong but this will be an optional feature that Telco customers can enable.

Contributor:

What approaches have you considered for allowing this to be an opt-in feature?

Member:

I think the idea was to make it part of the core payload. The DaemonSet could be deployed by the OpenShift API Server Operator, maybe?

By default we don't expect the proxy to be deployed. This will be an opt-in feature that the customer needs to activate through the proposed HTTP01Proxy API.
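For illustration, opting in might look something like the following. This is only a sketch: the API group, version, and field names are assumptions, since the HTTP01Proxy API shape is still being proposed.

```yaml
# Hypothetical opt-in CR; group/version, kind, and fields are illustrative.
apiVersion: config.openshift.io/v1alpha1
kind: HTTP01Proxy
metadata:
  name: cluster   # singleton convention used by cluster-scoped OpenShift config APIs
spec:
  enabled: true   # hypothetical field; false/absent would mean no proxy is deployed
```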

@sebrandon1 (Member Author):

I'm assuming we need to agree on how/who/what deploys the DaemonSet here before the enhancement is finalized, right?


- The proxy will be deployed as a DaemonSet to ensure it runs on all nodes which may host the API VIP in the cluster.
- The nftables rules will be added to the nodes. The proxy will listen on port 8888 and redirect traffic to the OpenShift Ingress Routers.
- The implementation relies on `nftables` for traffic redirection, which must be supported and enabled on the cluster nodes.
Contributor:

What should happen if nftables is not enabled on the cluster nodes?

Member:

AFAIK nftables is required for OCP nodes to run. OVN-K requires it (maybe other CNIs don't), but either way I don't think it can be disabled.

Contributor:

Any references that state it cannot be disabled? Isn't nftables a systemd service that runs on the nodes? Can someone modify the configurations that MCO applies to mark it as disabled?

Maybe we should consult someone from the MCO team to verify whether or not nftables can be disabled?

Contributor:

Maybe @yuqi-zhang ?

Contributor:

The MCO doesn't directly manage nftables in any way on the nodes. I believe it's being shipped (and mostly configured) in RHCOS directly.

As a user, you can technically do whatever you want, including disabling nftables via a MachineConfig, but that would be an explicit user-driven decision and not something we manage.

Contributor:

Thanks for the additional information @yuqi-zhang !

With that in mind, and with no other information indicating it is required, I'd like this EP's considerations to include what should happen if nftables isn't present or is disabled, and how we plan to detect that.

Reply:

Nftables is always present; it is part of the RHCOS payload.

Baremetal and AWS (at least) OCP clusters install nftables tables, chains, and rules. Execute $ nft list ruleset on any node to list them. OVN-Kubernetes relies on netfilter (iptables/nftables) for features like UDN, Egress, and Services.

Enabling/disabling nftables has been mentioned a few times in this thread. Note that the nftables systemd unit is disabled out of the box, but netfilter itself is on and can be configured via the nftables CLI/API without that unit ever being enabled.

The nftables systemd service is just a helper to facilitate loading/unloading custom firewall rules:

```ini
[Unit]
Description=Netfilter Tables
Documentation=man:nft(8)
Wants=network-pre.target
Before=network-pre.target

[Service]
Type=oneshot
ProtectSystem=full
ProtectHome=true
ExecStart=/sbin/nft -f /etc/sysconfig/nftables.conf
ExecReload=/sbin/nft 'flush ruleset; include "/etc/sysconfig/nftables.conf";'
ExecStop=/sbin/nft flush ruleset
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

Contributor:

Let's include some of this context in the enhancement.

I think it is important for us to document and understand whether the feature we are relying on could be absent due to user configuration. If that is not possible, let's explicitly detail that. If it is possible, let's document it and how we take it into consideration in this design.

From this additional information, it sounds like it is not possible to remove the thing we would be relying on here.

```yaml
message: "HTTP01ChallengeProxy is ready"
```

### Implementation Details/Notes/Constraints
Contributor:

I left some comments with further questions on how this is being done, but I'd like to see a more descriptive section here that explains how certain aspects are going to be implemented.

For example, does this need to be a core component of the OpenShift payload that is shipped with every cluster? If so, you may want to consider the cluster operator approach: https://github.com/openshift/enhancements/blob/master/dev-guide/operators.md

If not, could this be implemented as an operator installable through OLM?

Should multiple instances of a proxy be allowed to run? If so, what are some issues that may be encountered when running multiple instances? How do you plan to mitigate these issues?

Member:

Answering only the last part; I'll let others chime in for the first part.

No, multiple instances of the proxy are not allowed to run. There would be port collisions, since the port used (8888) is bound in the host network namespace. This can be mitigated by checking whether the port is in use when initializing the proxy. During upgrades, etc., the idea is to use the recreate strategy so the existing proxy gets deleted before the new one gets created.
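As a sketch of that delete-before-create behavior, assuming the proxy ships as a host-network DaemonSet (the image and names below are placeholders, and a zero-surge rolling update is the DaemonSet analogue of the Deployment "Recreate" strategy):

```yaml
# Illustrative only: the old pod on each node is removed before the
# replacement starts, so port 8888 is never claimed twice.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: http01-challenge-proxy   # placeholder name
spec:
  selector:
    matchLabels:
      app: http01-challenge-proxy
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # delete the old pod first...
      maxSurge: 0         # ...never run old and new side by side
  template:
    metadata:
      labels:
        app: http01-challenge-proxy
    spec:
      hostNetwork: true    # the proxy binds :8888 in the host network namespace
      containers:
      - name: proxy
        image: example.com/http01-proxy:latest   # placeholder image
        ports:
        - containerPort: 8888
          hostPort: 8888
```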

Contributor:

How do you plan to ensure that only a single instance of the proxy runs? Is this one proxy per node? One proxy per cluster?

If there is a risk of port collision, would it make sense to let the user configure the port?

Reply:

Echoing the above: if the API for this feature is cluster-scoped and allows configuration of the internal port used, I think we avoid both risks.

To me, delivering this as part of the OpenShift payload makes sense, but a full ClusterOperator or standalone OLM operator feels heavyweight for the function. Adding this functionality to an existing component would likely fall to either the API server or cert-manager.
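A sketch of that suggestion, extending the hypothetical CR from earlier with a configurable internal port (the field name is an assumption):

```yaml
# Illustrative spec fragment: a cluster-scoped singleton plus a
# user-configurable internal port to sidestep host port collisions.
spec:
  enabled: true
  internalPort: 8888   # hypothetical field; would default to 8888
```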

Contributor:

It certainly doesn't have to be a full ClusterOperator or a standalone OLM operator. You just need to further specify in this enhancement what component(s) will be responsible for ensuring the proxy gets stamped out onto the cluster and under which conditions.


## Upgrade / Downgrade Strategy

Updated versions of the proxy can be applied to the cluster in the same way as the initial deployment.
Contributor:

Beyond the proxy itself, how do you plan to handle typical upgrade situations where there may be older and newer instances of the proxy running at the same time due to a rolling upgrade? Are there any special considerations needed for this scenario?

What happens if an upgrade fails midway through? What steps would a user need to take to get their cluster back to the previous state?

Member:

We can't have multiple instances of the proxy running; they listen on port :8888 in the host network. We can control that with a recreate policy for the deployment/daemonset. In terms of new/old versions, the only thing that may change (and I doubt it) is the nftables rule that redirects traffic from port 80 to proxy port 8888. Still, we should ensure backwards compatibility in the proxy code.
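To make the redirect concrete, here is roughly what such a rule could look like if it were installed by an init container on the proxy pod. The table/chain names and the exact nft expressions are assumptions, not the actual implementation; a real rule would likely also be scoped to the API VIP and the ACME challenge paths:

```yaml
# Illustrative pod-spec fragment: an init container with enough privilege
# to program a hypothetical nftables NAT rule sending port 80 to the proxy.
initContainers:
- name: install-nft-redirect
  image: example.com/http01-proxy:latest   # placeholder image
  securityContext:
    capabilities:
      add: ["NET_ADMIN"]   # needed to program netfilter on the host
  command:
  - /bin/sh
  - -c
  - |
    nft add table ip http01proxy
    nft 'add chain ip http01proxy prerouting { type nat hook prerouting priority dstnat; }'
    nft add rule ip http01proxy prerouting tcp dport 80 redirect to :8888
```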

@sebrandon1 (Member Author):

Adding additional context here.


## Version Skew Strategy

Any changes to the proxy's behavior will be documented to ensure compatibility with older cluster versions.
Contributor:

Do any special considerations need to be made for the interactions between the component that deploys the proxy and updates node configurations, and the component that rolls out node configuration changes?

During upgrades there will always be version skew between components, so capturing how your component handles this is helpful.

@sebrandon1 (Member Author):

@mvazquezc mentioned in another comment that there will be some checking of the OCP version as part of the proxy deployment.


## Support Procedures

Support for the proxy will be provided through standard OpenShift support channels. Administrators can refer to the deployment documentation and logs for troubleshooting.
Contributor:

I'd like to see a bit more information here; "standard support channels" isn't all that helpful in conveying how someone providing support can evaluate whether this component is behaving correctly, or what steps they can take to mitigate issues it may be causing on the system.

The template section has some good examples for things to take into consideration: https://github.com/openshift/enhancements/blob/master/guidelines/enhancement_template.md#support-procedures

@sebrandon1 (Member Author):

Updated based on template.

@openshift-bot commented

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

openshift-ci bot closed this May 30, 2025
openshift-ci bot (Contributor) commented May 30, 2025

@openshift-bot: Closed this PR.

In response to this:

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@sebrandon1 (Member Author):

/reopen

openshift-ci bot reopened this May 30, 2025
openshift-ci bot (Contributor) commented May 30, 2025

@sebrandon1: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci-robot commented May 30, 2025

@sebrandon1: This pull request references CNF-13731 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.20.0" version, but no target version was set.

In response to this:

Draft for adding an enhancement for the Cert Manager HTTP01 proxy.

Based on #1682 for inspiration. Thanks @swghosh for pointing me to that.

cc @mvazquezc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sebrandon1 force-pushed the enhancement_cert_manager_http01_proxy branch from bc39b89 to ab590b5 on June 3, 2025 19:38
@sebrandon1 force-pushed the enhancement_cert_manager_http01_proxy branch from 0975367 to 67d3e63 on June 11, 2025 21:14
openshift-ci bot (Contributor) commented Jun 11, 2025

@sebrandon1: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/markdownlint | 67d3e63 | link | true | /test markdownlint |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels: jira/valid-reference, lifecycle/rotten, needs-ok-to-test
8 participants