Skip to content

Adding context timeout for Ansible proxy #2264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Apr 22, 2020
Merged

Adding context timeout for Ansible proxy #2264

merged 13 commits into from
Apr 22, 2020

Conversation

camilamacedo86
Copy link
Contributor

@camilamacedo86 camilamacedo86 commented Nov 25, 2019

Description of the change:
Add timeout in the watch feature for Ansible based-operators proxy to avoid appears that the reconcile is stuck and hang when the operator has not the correct permissions to List and Watch the resources.

Motivation for the change:

Note
Also, solved by kubernetes-sigs/controller-runtime#663.

Steps to test it locally

  • Build an ansible image with this PR
  • Use the image in an ansible POC ( can be the Memchaced sample )
  • Remove the required permission to perform some operations defined in the playbook/roles. E.g to create a resource.
  • After installing the CR which will call the playbook/role check in the logs the permissions issues and then verify that the CR will be updated properly with the error faced.
    PS.: The tests made to verify this fix was by removing the deployment permission.

Screenshot 2020-04-21 at 12 21 45

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 25, 2019
@camilamacedo86 camilamacedo86 added language/ansible Issue is related to an Ansible operator project bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. kind/bug Categorizes issue or PR as related to a bug. labels Nov 25, 2019
@camilamacedo86 camilamacedo86 changed the title Fix issue bug 1701041 WIP: Fix issue bug 1701041 Nov 25, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 25, 2019
@camilamacedo86 camilamacedo86 changed the title WIP: Fix issue bug 1701041 Fix issue bug 1701041 Nov 26, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 26, 2019
@joelanford joelanford changed the title Fix issue bug 1701041 Bug 1701041: Adding timeout on dependent watch establishment Nov 26, 2019
@openshift-ci-robot openshift-ci-robot added bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. and removed bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Nov 26, 2019
@openshift-ci-robot
Copy link

@camilamacedo86: This pull request references Bugzilla bug 1701041, which is invalid:

  • expected the bug to be open, but it isn't
  • expected the bug to target the "4.4.0" release, but it targets "4.2.0" instead
  • expected the bug to be in one of the following states: NEW, ASSIGNED, ON_DEV, POST, POST, but it is CLOSED (ERRATA) instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1701041: Adding timeout on dependent watch establishment

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@joelanford
Copy link
Member

@camilamacedo86 @shawn-hurley @fabianvf The BZ for this issue was verified and closed over a month ago. Is this PR an improvement on that fix, or should the BZ be re-opened?

@camilamacedo86
Copy link
Contributor Author

@joelanford I understand that the Bugzilla was closed because automatically when we close the PR #1638 and carry on @shawn-hurley code here.

@camilamacedo86 camilamacedo86 changed the title Bug 1701041: Adding timeout on dependent watch establishment WIP: Bug 1701041: Adding timeout on dependent watch establishment Dec 2, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 2, 2019
@joelanford
Copy link
Member

joelanford commented Dec 3, 2019

Will this PR be made obsolete (or at least much simpler) with kubernetes-sigs/controller-runtime#663?

EDIT: Looking at that other PR closer, I think it is required to be merged before my suggestion will work.

@camilamacedo86
Copy link
Contributor Author

camilamacedo86 commented Dec 3, 2019

Hi @joelanford,

Sorry, your comment #2264 (comment) is not clear.

EDIT: Looking at that other PR closer, I think it is required to be merged before my suggestion will work.

What do you mean with I think it is required to be merged before my suggestion will work.? What suggestion? Here or there? What do you think that is required to be merged first? Is the PR kubernetes-sigs/controller-runtime#663 or this one without address your comments?

@openshift-ci-robot
Copy link

@camilamacedo86: This pull request references Bugzilla bug 1701041, which is invalid:

  • expected the bug to be open, but it isn't
  • expected the bug to target the "4.4.0" release, but it targets "4.2.0" instead
  • expected the bug to be in one of the following states: NEW, ASSIGNED, ON_DEV, POST, POST, but it is CLOSED (ERRATA) instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

WIP: Bug 1701041: Adding timeout on dependent watch establishment

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 9, 2019
@camilamacedo86 camilamacedo86 changed the title WIP: Bug 1701041: Adding timeout on dependent watch establishment Adding context timeout for Ansible proxy Apr 21, 2020
@openshift-ci-robot openshift-ci-robot removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Apr 21, 2020
@openshift-ci-robot
Copy link

@camilamacedo86: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

In response to this:

Adding context timeout for Ansible proxy

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@camilamacedo86
Copy link
Contributor Author

camilamacedo86 commented Apr 21, 2020

Hi @joelanford and @fabianvf,

I think that all is properly addressed now. Could you please check this one?

Copy link
Member

@joelanford joelanford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry one more suggestion, I skipped right past this the first time around! LGTM after addressing it.

# release notes and/or the migration guide
entries:
- description: >
Fix issue by adding proxy timeout for the Ansible based-operators to add error conditional status properly when the operator has not the required permissions (RBAC) to perform the operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a little cleanup to provide more clarity about the fix.

Suggested change
Fix issue by adding proxy timeout for the Ansible based-operators to add error conditional status properly when the operator has not the required permissions (RBAC) to perform the operations.
Added timeout to child resource watches for Ansible based-operators. This is necessary to avoid blocking the reconciler when the operator does not have the required RBAC permissions to list and watch the child resource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @joelanford
The reconcile was NOT blocked before this fix. (Just missing the status.conditions)
Applied your suggestion with a small change to clarifies it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like Added timeout to the Ansible Operator proxy, which enables error reporting for requests that fail due to RBAC permissions issues

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very good :-)

@camilamacedo86 camilamacedo86 removed the request for review from shawn-hurley April 21, 2020 21:57
# release notes and/or the migration guide
entries:
- description: >
Added timeout to the Ansible based-operator proxy, which enables error reporting for requests that fail due to RBAC permissions issues to List and Watch the resources.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fabianvf done

@camilamacedo86 camilamacedo86 merged commit bebd746 into operator-framework:master Apr 22, 2020
@camilamacedo86 camilamacedo86 deleted the fix-issue-bug-1701041 branch April 22, 2020 18:38
@estroz
Copy link
Member

estroz commented May 13, 2020

/cherry-pick v0.17.x

@openshift-cherrypick-robot

@estroz: new pull request created: #3025

In response to this:

/cherry-pick v0.17.x

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. language/ansible Issue is related to an Ansible operator project size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants