Fixes to how DebugExecutor handles sensors #28528

vandonr-amz · 2022-12-21T23:11:57Z

Current behavior of the Debug Executor with sensors:

When the sensor code gets executed, the (copy of the) task is in "reschedule" mode, so the sensor releases it and reschedules it for now + poke_interval (i.e. in the future).

airflow/airflow/sensors/base.py

Lines 263 to 265 in 13b43a7

    
    if conf.get("core", "executor") == "DebugExecutor":
        self.log.warning("DebugExecutor changes sensor mode to 'reschedule'.")
        task.mode = "reschedule"

The original task object stays in "poke" mode, so when the rescheduling event is received, it isn't handled properly, and the task is rescheduled for immediate execution (instead of being rescheduled in the future). This is taking place here:

airflow/airflow/ti_deps/deps/ready_to_reschedule.py

Lines 47 to 52 in 681835a

    
    if not is_mapped and not getattr(ti.task, "reschedule", False):
        # Mapped sensors don't have the reschedule property (it can only
        # be calculated after unmapping), so we don't check them here.
        # They are handled below by checking TaskReschedule instead.
        yield self._passing_status(reason="Task is not in reschedule mode.")
        return

Because of this, the poke method is effectively called in a tight loop, hammering whichever API it's querying, eventually leading to trouble such as rate-limiting if the sensor waits long enough, which hampers debugging, the initial purpose of this executor.
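To make the mismatch concrete, here is a minimal sketch of the situation described above. `SensorSketch` is a hypothetical stand-in, not the real sensor class; it only assumes that "being in reschedule mode" is derived from the `mode` attribute, and that the executor mutates a copy of the task rather than the original.

```python
import copy
from dataclasses import dataclass


@dataclass
class SensorSketch:
    """Hypothetical, stripped-down sensor: only the fields needed here."""

    task_id: str
    mode: str = "poke"

    @property
    def reschedule(self) -> bool:
        # Assumption: reschedule behavior is derived from the mode string.
        return self.mode == "reschedule"


# The task as declared in the DAG stays in "poke" mode...
original = SensorSketch(task_id="wait_for_file")

# ...while the copy that actually runs is switched to "reschedule".
running_copy = copy.copy(original)
running_copy.mode = "reschedule"

print(running_copy.reschedule)  # the running copy asks to be rescheduled
print(original.reschedule)      # the object inspected later still says "poke"
```

The dependency check looks at the original object, so it never sees the mode change and lets the task run again right away.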

In this change, I propose that we add a special case for the debug executor in the ready_to_reschedule file to always reschedule when running with the debug executor.
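A rough sketch of the proposed special case, written as a standalone function rather than the real dependency class. The name `ready_to_run` and the `force_reschedule` parameter are assumptions made for illustration; they are not part of the Airflow code base.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def ready_to_run(
    now: datetime,
    next_reschedule_date: Optional[datetime],
    task_in_reschedule_mode: bool,
    force_reschedule: bool = False,
) -> bool:
    """Simplified stand-in for the ready-to-reschedule dependency check."""
    # Without the special case, a task that is not in reschedule mode
    # passes immediately, whatever reschedule date was recorded.
    if not task_in_reschedule_mode and not force_reschedule:
        return True
    # No recorded reschedule request: nothing to wait for.
    if next_reschedule_date is None:
        return True
    # Otherwise the task may only run once the reschedule date is reached.
    return now >= next_reschedule_date


now = datetime(2022, 12, 21, 23, 0, tzinfo=timezone.utc)
poke_again_at = now + timedelta(seconds=60)

# Current behavior: the original task is in "poke" mode, so it runs at once.
before_fix = ready_to_run(now, poke_again_at, task_in_reschedule_mode=False)

# Proposed behavior: the executor-specific flag makes the check honor the date.
after_fix = ready_to_run(
    now, poke_again_at, task_in_reschedule_mode=False, force_reschedule=True
)
```

With the flag set, the task is held back until `poke_again_at`, which is what stops the tight loop.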

While testing this change, I noticed that the executor itself also spins in a tight loop when it has no task to execute, leading to unnecessary resource usage and huge log files. With limited knowledge of how executors work, I'm proposing a poor man's fix for this here as well, where the executor would sleep for 500ms if there are no tasks ready to be executed.
I think this time is short enough that humans won't have to wait too long for their tasks to be picked up when ready, and also long enough that the amount of logs is manageable and can be reasonably scrolled.
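The idle sleep can be sketched roughly as follows. This is not the executor's actual loop: `run_loop`, `IDLE_SLEEP_SECONDS` and the injectable `sleep` parameter are hypothetical names, and the loop is bounded by `max_iterations` only so the example terminates.

```python
import time
from collections import deque
from typing import Callable, Deque

IDLE_SLEEP_SECONDS = 0.5  # the 500 ms proposed above


def run_loop(
    ready_tasks: Deque[Callable[[], None]],
    max_iterations: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run ready tasks; back off briefly when idle. Returns the idle count."""
    idle_iterations = 0
    for _ in range(max_iterations):
        if not ready_tasks:
            # Nothing to do: sleep instead of spinning and flooding the logs.
            sleep(IDLE_SLEEP_SECONDS)
            idle_iterations += 1
            continue
        task = ready_tasks.popleft()
        task()
    return idle_iterations


# Usage with a fake sleep, so the example runs instantly.
slept = []
results = []
queue = deque([lambda: results.append("done")])
idle = run_loop(queue, max_iterations=3, sleep=slept.append)
```

Passing the sleep function in makes the back-off easy to test without actually waiting.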


uranusjr · 2023-01-04T07:24:01Z

airflow/ti_deps/deps/ready_to_reschedule.py


        is_mapped = isinstance(ti.task, MappedOperator)
-        if not is_mapped and not getattr(ti.task, "reschedule", False):
+        is_debug_executor = conf.get("core", "executor") == "DebugExecutor"


I wonder if there’s a better way to obtain this information. But since this pattern is used elsewhere in the code base, it’s probably good enough.

@o-nikolas - maybe we could also add "is_debug" as part of AIP-51 ?

Yeah not great to pull this from a global static place, but looking through what we have accessible here, I don't see anything that could help us. And injecting this all the way through here looks like a lot of pain.

I'm pushing a commit to use the constant that's available, making this piece of code a tiny bit better I think ?

Oh no! This is definitely new executor coupling being added, we should not add this as it is.

I apologize but I'm missing some context for this specific line change. Why do we need to know if this is the debug executor or not? Shouldn't we just need to know that the task is being rescheduled (by any executor)?

> maybe we could also add "is_debug" as part of AIP-51?

@potiuk this is covered by the single threaded case actually, which is more precisely what the issue is. Any single threaded executor needs to reschedule sensors otherwise the thread will be blocked when the sensor sleeps.

See 2c here: #27929

> @potiuk this is covered by the single threaded case actually, which is more precisely what the issue is. Any single threaded executor needs to reschedule sensors otherwise the thread will be blocked when the sensor sleeps.

Yep. You are right. Single-threaded is the right check here and we should use it here. I think being single-threaded is the only reason we are doing it.

OK, I just checked, and the SequentialExecutor has the same behavior, in that it does not wait to reschedule; it's just slower at it, so it spams fewer requests.
As such, it appears that checking on single-threaded is indeed the way to go.
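One way to picture the single-threaded check discussed in this thread is a class-level flag on the executor that the dependency check consults instead of comparing executor names. Everything below is a hypothetical sketch: the class names, the `is_single_threaded` attribute and `must_honor_reschedule` are illustrative and not taken from the merged code.

```python
class BaseExecutorSketch:
    # Hypothetical flag: by default a blocking sensor only ties up one worker.
    is_single_threaded: bool = False


class DebugExecutorSketch(BaseExecutorSketch):
    # Runs everything in one thread, so a sleeping sensor blocks all tasks.
    is_single_threaded = True


class SequentialExecutorSketch(BaseExecutorSketch):
    is_single_threaded = True


class PoolExecutorSketch(BaseExecutorSketch):
    # Stand-in for any executor with several workers; keeps the default.
    pass


def must_honor_reschedule(executor_cls: type, task_in_reschedule_mode: bool) -> bool:
    """Decide whether the recorded reschedule date has to be respected."""
    return task_in_reschedule_mode or executor_cls.is_single_threaded


for cls in (DebugExecutorSketch, SequentialExecutorSketch, PoolExecutorSketch):
    print(cls.__name__, must_honor_reschedule(cls, task_in_reschedule_mode=False))
```

This keeps the dependency check free of any knowledge about specific executors, which is the coupling concern raised above.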

I'm a bit confused by DebugExecutor. Didn't we deprecate it with the intention of removing it completely in Airflow 3?
#28861 (comment)

Yeah it was removed in the context of the dag.test() command in this commit. But AIP-47 compliant system tests still depend on that executor, so we can't kill it just yet without finding a replacement for that.

What's more, I think we possibly do not even want to replace it for AIP-47. Why would we? Maybe we can rename it to SystemTestExecutor, but I think it does the job nicely?

A flag for executors that require reschedule mode has been added in #28934, I've just merged it, you can now make use of it here @vandonr-amz

potiuk · 2023-01-19T23:01:59Z

Not sure - after all the discussions and other things happening what would be the best course of action here. @o-nikolas - maybe you can make a call (especially in relation to AIP-51). Do you think we need some more discussion / feedback / brainstorming?

vandonr-amz · 2023-01-19T23:06:01Z

> Not sure - after all the discussions and other things happening what would be the best course of action here.

I think the course of action is to wait for an is_single_threaded_executor to be available through the work of @utkarsharma2, and use that instead of the current check on the debug executor?

potiuk · 2023-01-19T23:07:28Z

> Not sure - after all the discussions and other things happening what would be the best course of action here.

> I think the course of action is to wait for an is_single_threaded_executor to be available through the works of @utkarsharma2 , and use that instead of the current check on debug executor ?

Yeah. that's what I thought too. wasn't sure though :)

o-nikolas · 2023-01-19T23:08:37Z

> Not sure - after all the discussions and other things happening what would be the best course of action here.

> I think the course of action is to wait for an is_single_threaded_executor to be available through the works of @utkarsharma2 , and use that instead of the current check on debug executor ?

> Yeah. that's what I thought too. wasn't sure though :)

Yupp, this is the current plan!

vandonr-amz · 2023-01-23T22:59:34Z

AIP-51 has been merged, so I'm using it !
Should be good to go now !

o-nikolas

Changes look good with AIP-51 updates included 👍

pierrejeambrun · 2023-03-06T20:06:48Z

Marking for 2.6.0, this relies on AIP-51

fixes to the DebugExecutor

cc508a6

vandonr-amz requested review from XD-DENG, ashb and kaxil as code owners December 21, 2022 23:11

boring-cyborg bot added area:core-operators area:Scheduler including HA (high availability) scheduler labels Dec 21, 2022

vandonr-amz commented Dec 21, 2022

airflow/sensors/base.py (outdated)

uranusjr reviewed Dec 22, 2022

airflow/executors/debug_executor.py (outdated)

use 'not' instead of len == 0

fb61e48

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>

uranusjr reviewed Dec 27, 2022

airflow/executors/debug_executor.py (outdated)

airflow/sensors/base.py (outdated)

vandonr-amz and others added 3 commits January 3, 2023 10:21

typo

b91c652

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>

Revert "fixes to the DebugExecutor"

13b43a7

This reverts commit cc508a6.

move fix to ready_to_reschedule

5d43aea

uranusjr approved these changes Jan 4, 2023

replace debug executor string with constant

cbc824d

o-nikolas mentioned this pull request Jan 6, 2023

AIP-51 - Single Threaded Executors #27929

Closed

vandonr-amz added 3 commits January 23, 2023 14:15

Merge remote-tracking branch 'origin/main' into vandonr/fix

2827d56

replace check on debug exec with new property

ea2bd4a

revert changes to configuration.py

fd82ab6

reverts commit cbc824d.

o-nikolas reviewed Jan 25, 2023

o-nikolas approved these changes Jan 25, 2023

o-nikolas merged commit a35ec95 into apache:main Jan 25, 2023

vandonr-amz deleted the vandonr/fix branch January 25, 2023 17:44

vandonr-amz mentioned this pull request Jan 26, 2023

shorten poke intervals on systems tests #29183

Merged

eladkal mentioned this pull request Jan 27, 2023

Refactor S3KeysUnchangedSensor system test #26725

Closed

pierrejeambrun added the type:bug-fix Changelog: Bug Fixes label Feb 27, 2023

pierrejeambrun modified the milestones: Airflow 2.5.2, Airflow 2.6.0 Feb 27, 2023

vincbeck mentioned this pull request Apr 20, 2023

Remove @poke_mode_only from EmrStepSensor #30774

Merged
