Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exponential Backoff Not Functioning in BaseSensorOperator Reschedule Mode #39823

Merged
merged 9 commits into from
Jun 16, 2024

Conversation

sanket2000
Copy link
Contributor

Fix: Increment try_number/poke_count in BaseSensorOperator for correct exponential backoff in reschedule mode.

What happened?

  • The BaseSensorOperator does not correctly implement exponential backoff when in reschedule mode.
  • The sensor reschedules at a constant interval instead of increasing the interval exponentially.

What you think should happen instead?

  • The try_number (or poke_count) should increment on each reschedule too, ensuring the backoff interval increases exponentially.

How to reproduce?

  • Set up a BaseSensorOperator with mode="reschedule" and exponential_backoff=True.
  • Observe the interval between reschedules; it remains constant instead of increasing exponentially.

#6654


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added the area:core-operators Operators, Sensors and hooks within Core Airflow label May 24, 2024
Copy link

boring-cyborg bot commented May 24, 2024

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@sanket2000
Copy link
Contributor Author

To reproduce:

from airflow.models.dag import DAG
from airflow.sensors.base import BaseSensorOperator


class DummySensor(BaseSensorOperator):
    def __init__(self, return_value=False, **kwargs):
        super().__init__(**kwargs)
        self.return_value = return_value

    def poke(self, context):
        return self.return_value


with DAG(dag_id="base_sensor_test") as d:

    s = DummySensor(
        task_id="sensor",
        mode="reschedule",
        exponential_backoff=True,
    )

Task Log: attempt=1.log

image

@sanket2000
Copy link
Contributor Author

How can I request Reviewers?

Copy link
Collaborator

@dirrao dirrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the test case for the bug?

@sanket2000
Copy link
Contributor Author

Thank you for the review, this is my first PR, I wanted some guidance on the testing.
This should be the file I need to add my test tests/sensors/test_base.py right?

@sanket2000
Copy link
Contributor Author

sanket2000 commented May 25, 2024

I looked into tests/sensors/test_base.py and couldn't find any reference tests that specifically test the execute function of the sensor. To test the increment of poke_count (or try_number for older versions), such a test might be necessary. Any guidance or assistance in implementing this test would be greatly appreciated.

airflow/sensors/base.py Outdated Show resolved Hide resolved
@sanket2000
Copy link
Contributor Author

sanket2000 commented May 26, 2024

I have added a change, if it looks good poke_count can potentially be remove altogether. We can estimate the pokes based on start time and run duration.

@sanket2000
Copy link
Contributor Author

I would also encourage authors and participants of PR #30669 to take a look as this approach doesn't involve DB column changes proposed in the above mentioned PR

@eladkal
Copy link
Contributor

eladkal commented May 28, 2024

I would also encourage authors and participants of PR #30669 to take a look as this approach doesn't involve DB column changes proposed in the above mentioned PR

cc @hussein-awala

@RNHTTR
Copy link
Contributor

RNHTTR commented May 30, 2024

Thank you for the review, this is my first PR, I wanted some guidance on the testing. This should be the file I need to add my test tests/sensors/test_base.py right?

yes

@sanket2000
Copy link
Contributor Author

I feel this PR is ready for review along with tests.
It really is a roller-coaster to work on the test files :)

@potiuk
Copy link
Member

potiuk commented Jun 16, 2024

This one looks good. One thing to add is that you should add newsfragment describing the change (see newsfragment directory). This is a change in behaviour (a good one - fixing unintended behaviour) - but it also should be noted in release notes when this one gets merged.

@potiuk potiuk added this to the Airflow 2.9.3 milestone Jun 16, 2024
@sanket2000
Copy link
Contributor Author

This one looks good. One thing to add is that you should add newsfragment describing the change (see newsfragment directory). This is a change in behaviour (a good one - fixing unintended behaviour) - but it also should be noted in release notes when this one gets merged.

Thanks for the review, I have added the necessary file.

@sanket2000 sanket2000 requested a review from potiuk June 16, 2024 17:14
@potiuk potiuk merged commit 841b28c into apache:main Jun 16, 2024
52 checks passed
Copy link

boring-cyborg bot commented Jun 16, 2024

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

@utkarsharma2 utkarsharma2 added the type:bug-fix Changelog: Bug Fixes label Jul 1, 2024
utkarsharma2 pushed a commit that referenced this pull request Jul 2, 2024
…Mode (#39823)

* Fix: Increment try_number/poke_count in BaseSensorOperator for correct exponential backoff in reschedule mode.

* exponential backoff handling for reschedule mode

---------

Co-authored-by: sanket2000 <email@example.com>
(cherry picked from commit 841b28c)
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024
…Mode (apache#39823)

* Fix: Increment try_number/poke_count in BaseSensorOperator for correct exponential backoff in reschedule mode.

* exponential backoff handling for reschedule mode

---------

Co-authored-by: sanket2000 <email@example.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core-operators Operators, Sensors and hooks within Core Airflow type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants