-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Demonstrate usage of the PythonSensor #29143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
josh-fell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're really shining up these docs lately. Just a comment on the example snippet.
|
Very nice. Just a question - should not we make it back into the example_sensors.py and include from there? I think the big benefit of include_examples is that the documentaion can actually be tested very easily and in case we make any refactoring in the future it is far less likely to be "invalid" - and for example the discussion #29143 (comment) would not be likely needed? I think the code could be simply copied into the example_sensors.py and end up with nearly the same result (barring the example_include header) Why not doing it this way ? |
|
I suggest showing reproducible examples. The current example for the PythonSensor is not easily reproducible because it doesn't show imports. |
Yes I understand that. What I am suggesting to also surround import with "exaampleinclude" start/end comments. There is no reason exampleicnlude start/end comments shoud be limited to in-dag. the new # Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# [START example_python_sensors]
import datetime
from airflow.decorators import dag, task
from airflow.sensors.python import PythonSensor
@dag(start_date=datetime.datetime(2023, 1, 1), schedule=None)
def example():
# TaskFlow sensor
@task.sensor
def wait_for_success_taskflow():
return datetime.datetime.now().minute % 2 == 0
wait_for_success_taskflow()
# Equivalent functionality using PythonSensor class
def wait_for_success_pythonsensor():
return datetime.datetime.now().minute % 2 == 0
PythonSensor(task_id="wait_for_success_pythonsensor", python_callable=wait_for_success_pythonsensor)
example()
# [END example_python_sensors](whole file). The weekday |
|
Exactly same effect for docs, example is runnable with |
|
Also, I think example DAGs is the wrong place to store code samples shown in the docs. With the default config, users see example DAGs. There's currently already a large number of example DAGs and your average Airflow user does not search example DAG code when in need of certain functionality. Instead, they google "how do I do XYZ in Airflow?" and end up in the docs. Seeing the first view of Airflow cluttered with example DAGs and then having to google how to get rid of those is a bad user experience. If we must keep code in separate files for testing/migration purposes, I suggest storing it together with the docs. So for this example, we'd have e.g. # Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# [START example]
import datetime
from airflow.decorators import dag, task
from airflow.sensors.python import PythonSensor
@dag(start_date=datetime.datetime(2023, 1, 1), schedule=None)
def example():
# TaskFlow sensor
@task.sensor
def wait_for_success_taskflow():
return datetime.datetime.now().minute % 2 == 0
wait_for_success_taskflow()
# Equivalent functionality with PythonSensor class
def wait_for_success_pythonsensor():
return datetime.datetime.now().minute % 2 == 0
PythonSensor(task_id="wait_for_success_pythonsensor", python_callable=wait_for_success_pythonsensor)
example()
# [END example]And the code in the .rst file will be: .. exampleinclude:: pythonsensor.py
:start-after: [START example]
:end-before: [END example]WDYT? |
|
Oh yes - I absolutely agree example dags should not be loaded by default.This is even one of the reasons why in Breeze they are disabled by default. And if anything I would argue for changing the flag to False. Since this is not a production setting, I think we could flip it to False without breaking any "backwards compatibility promise". Maybe that is a better approach - keep the examples where they are but disable them by default. Or maybe even we could fine-grain it and enable/disable different types of the examples. That sounds like a better idea that introducing a new place where example dags are stored without the "easiness" of running them. I think you might not understood why I think it's good to have the code in example_dags and copied from there - it's not because it is a python file, but it is because I can easily run airflow wiith a flag and run them without even copy-pasting the code. This is for example how I as a maintainer test a new release of airflow. I run But yes - keeping the flag disabled by default is a very good idea (like it is in breeze). |
|
We should think better about the user experience. Currently, example DAGs are displayed by default. I've never met a user that actually wants that. The first question is usually "how do I get rid of these example DAGs?", and you have to start looking up Airflow configuration straight away. Not a smooth experience... The current example DAGs are being used as integration tests and end up in the UI which impacts user experience. Having all code that's displayed in the docs in the example DAGs would result in hundreds of example DAGs which sounds like a really bad idea. I think there should be only 1 example DAG, which is enough for users to click around and get to know the user interface. For anything beyond that, users can go to the docs. We should separate purposes and store code for docs separate from tests, i.e. in the |
I think we agree on that. I am absolutely for separating those two use cases for users. Users when installing Airflow by default (for production) should not see those. But I think those This is great value for people who are exploring airlfow and want to get the grasp of it. Equivakent to "playground" in a number of services that show how an application works. I think this is really valuable that the user workflow for such a "playful" user is:
And be able to run them. Same for So I have no problems with moving the dags to "docs" section, as long as the workflows above continue to work (but being opt-in rather than opt-out). It basically means that the example dags (or doc dags) will have to be embedded in airflow code when package is prepared (or taken from docs or wherever they are when run from developer's venv/breeze). |
|
@BasPH Also one more case why having example_dags as "executable" and separate python file here: #28325 Our example dags are actually imported at tested during the CI. Originally the taskflow example dag has failed when submitted and required some fixes in the test code. So it was a test issue not "airflow core" issue, however it could also be "core" issue, regression that could have been missed in unit tests and only surface at the moment real DAG is parsed/serialized. And IMHO it underlines how important it is to have the example dags as separate python files that can be both - run manually very easily and testsd automatically with every PR. In Polish, we have the idiom "wylać dziecko z kąpielą" - which freely translated means "to throw away the baby together with the bathing water". Trying to move example_dags to be just part of the docs IMHO is a very good example of that - by trying to improve the docs - the attempt here is to remove example python files that are giving us extra test harness that adds yet another level of making sure our code does what it should do. I'd say we should find a way to improve the docs while keeping the baby. |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
Update of the https://airflow.apache.org/docs/apache-airflow/stable/howto/operator/python.html#pythonsensor paragraph. All other paragraphs in the doc also demonstrate the taskflow-style operator, but the PythonSensor paragraph did not. This PR shows both style + adds a reproducible example.
Before:

After:

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in newsfragments.