WIP: OCPBUGS-64574: FOR TESTING ONLY (different method) #1138
@tmshort: This pull request references Jira Issue OCPBUGS-64574, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: tmshort
Fix CatalogSource reporting TRANSIENT_FAILURE in Hypershift guest clusters by automatically using the "passthrough" resolver scheme when a proxy is detected.
Root Cause:
The migration from grpc.Dial() to grpc.NewClient() introduced a resolver scheme issue. When grpc.NewClient() is used with WithContextDialer (for proxy support), gRPC defaults to the "dns" resolver which tries to resolve addresses client-side. In Hypershift, the catalog operator runs in the management cluster and connects via SOCKS5 proxy to catalog pods in the guest cluster. Service addresses like "service.namespace.svc:50051" only exist in the guest cluster's DNS and cannot be resolved from the management cluster, causing connections to fail with TRANSIENT_FAILURE.
Solution:
Automatically detect when a proxy is being used (proxyURL != nil) and prepend "passthrough:///" to the target address. The passthrough resolver bypasses client-side DNS resolution and delegates it to the custom dialer (proxy), which resolves addresses in the guest cluster where they exist.
This solution:
- Requires no environment variables or configuration
- Automatically activates only when proxy is used
- Follows gRPC best practices per documentation
- Simpler than alternative env var approaches (e.g., PR #3699)
Fixes: OCPBUGS-64574
Related: OCPBUGS-64631, operator-framework/operator-lifecycle-manager#3698, operator-framework/operator-lifecycle-manager#3699
🤖 Generated with [Claude Code](https://claude.com/claude-code) via /jira:solve OCPBUGS-64574
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Todd Short <todd.short@me.com>
423b43a to 0b36b69
@tmshort: The following tests failed, say
The new code works, thanks. I already updated the ticket.
Closing this to do the work upstream.
@tmshort: This pull request references Jira Issue OCPBUGS-64574. The bug has been updated to no longer refer to the pull request using the external bug tracker.
fix(grpc): Use passthrough resolver when proxy is detected
Fix CatalogSource reporting TRANSIENT_FAILURE in Hypershift guest clusters by automatically using the "passthrough" resolver scheme when a proxy is detected.
Root Cause:
The migration from grpc.Dial() to grpc.NewClient() introduced a resolver scheme issue. When grpc.NewClient() is used with WithContextDialer (for proxy support), gRPC defaults to the "dns" resolver which tries to resolve addresses client-side. In Hypershift, the catalog operator runs in the management cluster and connects via SOCKS5 proxy to catalog pods in the guest cluster. Service addresses like "service.namespace.svc:50051" only exist in the guest cluster's DNS and cannot be resolved from the management cluster, causing connections to fail with TRANSIENT_FAILURE.
Solution:
Automatically detect when a proxy is being used (proxyURL != nil) and prepend "passthrough:///" to the target address. The passthrough resolver bypasses client-side DNS resolution and delegates it to the custom dialer (proxy), which resolves addresses in the guest cluster where they exist.
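This solution:
- Requires no environment variables or configuration
- Automatically activates only when proxy is used
- Follows gRPC best practices per documentation
- Simpler than alternative env var approaches (e.g., PR #3699)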
Fixes: OCPBUGS-64574
Related: OCPBUGS-64631, operator-framework/operator-lifecycle-manager#3698, operator-framework/operator-lifecycle-manager#3699
🤖 Generated with Claude Code via /jira:solve OCPBUGS-64574