Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect await_container_completion in KubernetesPodOperator #26796

Closed
1 of 2 tasks
windmark opened this issue Sep 30, 2022 · 2 comments · Fixed by #28771
Closed
1 of 2 tasks

Incorrect await_container_completion in KubernetesPodOperator #26796

windmark opened this issue Sep 30, 2022 · 2 comments · Fixed by #28771
Labels
area:providers kind:bug This is a clearly a bug provider:cncf-kubernetes Kubernetes provider related issues

Comments

@windmark
Copy link

windmark commented Sep 30, 2022

Apache Airflow version

2.3.4

What happened

The await_container_completion function has a negated condition

while not self.container_is_running(pod=pod, container_name=container_name):

that causes our Airflow tasks running <1 s to never be completed, causing an infinite loop.

I see this was addressed and released in #23883, but later reverted in #24474. How come it was reverted? The thread on that revert PR with comments from @jedcunningham and @potiuk didn't really address why the fix was reverted.

What you think should happen instead

Pods finishing within 1s should be properly handled.

How to reproduce

No response

Operating System

Linux

Versions of Apache Airflow Providers

apache-airflow-providers-cncf-kubernetes==4.3.0

Deployment

Other 3rd-party Helm chart

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@windmark windmark added area:core kind:bug This is a clearly a bug labels Sep 30, 2022
@eladkal
Copy link
Contributor

eladkal commented Sep 30, 2022

It caused failure of tests and we were unable to point the source so we had to revert
#24479

https://apache-airflow.slack.com/archives/CCPRP7943/p1655306336391429
https://apache-airflow.slack.com/archives/CCPRP7943/p1655300699968739

Quoting Ash from slack:

That'll learn us for merging a bug fix without any new tests

Feel free to raise PR with tests :)

@eladkal eladkal added the provider:cncf-kubernetes Kubernetes provider related issues label Sep 30, 2022
@windmark
Copy link
Author

It caused failure of tests and we were unable to point the source so we had to revert #24479

https://apache-airflow.slack.com/archives/CCPRP7943/p1655306336391429 https://apache-airflow.slack.com/archives/CCPRP7943/p1655300699968739

Quoting Ash from slack:

That'll learn us for merging a bug fix without any new tests

Feel free to raise PR with tests :)

I see, thanks for the links. Though it's quite a severe bug if the condition incorrectly is negated when waiting for completed pods. Let me read up on the threads.

eladkal pushed a commit that referenced this issue Jan 7, 2023
* Fix #26796 .. await_container_completion has a negated condition

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers kind:bug This is a clearly a bug provider:cncf-kubernetes Kubernetes provider related issues
Projects
None yet
3 participants