@@ -131,7 +131,7 @@ Airflow defines the specification as `hookspec <https://github.com/apache/airflo

To include the listener in your Airflow installation, include it as a part of an :doc:`Airflow Plugin </administration-and-deployment/plugins>`.

-Listener API is meant to be called across all dags and all operators. You can't listen to events generated by specific dags. For that behavior, try methods like ``on_success_callback`` and ``pre_execute``. These provide callbacks for particular DAG authors or operator creators. The logs and ``print()`` calls will be handled as part of the listeners.
+Listener API is meant to be called across all dags and all operators. You can't listen to events generated by specific dags. For that behavior, try methods like ``on_success_callback`` and ``pre_execute``. These provide callbacks for particular Dag authors or operator creators. The logs and ``print()`` calls will be handled as part of the listeners.
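
For illustration, a global listener is registered through a plugin rather than through per-dag callbacks. A minimal sketch, assuming the ``on_task_instance_success`` hook from the hookspec and a hypothetical ``my_listener`` module:

.. code-block:: python

    # my_listener.py -- a hypothetical module shipped alongside the plugin
    from airflow.listeners import hookimpl


    @hookimpl
    def on_task_instance_success(previous_state, task_instance):
        # Fires for every successful task instance, across all dags.
        print(f"{task_instance.task_id} succeeded")


    # my_plugin.py -- registers the listener as part of an Airflow plugin
    from airflow.plugins_manager import AirflowPlugin

    import my_listener


    class ListenerPlugin(AirflowPlugin):
        name = "listener_plugin"
        listeners = [my_listener]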


Compatibility note
2 changes: 1 addition & 1 deletion airflow-core/docs/authoring-and-scheduling/deferring.rst
@@ -31,7 +31,7 @@ An overview of how this process works:
* The trigger runs until it fires, at which point its source task is re-scheduled by the scheduler.
* The scheduler queues the task to resume on a worker node.

-You can either use pre-written deferrable operators as a DAG author or write your own. Writing them, however, requires that they meet certain design criteria.
+You can either use pre-written deferrable operators as a Dag author or write your own. Writing them, however, requires that they meet certain design criteria.
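
For a rough idea of the pattern, a deferrable operator hands a trigger to the triggerer and names the method to resume in once the trigger fires. A minimal sketch, assuming Airflow 2.x import paths (the trigger class moved under the standard provider in newer versions):

.. code-block:: python

    from datetime import timedelta

    from airflow.models.baseoperator import BaseOperator
    from airflow.triggers.temporal import TimeDeltaTrigger


    class WaitOneHourOperator(BaseOperator):
        """Illustrative operator that frees its worker slot while waiting."""

        def execute(self, context):
            # Suspend the task and hand the wait over to the triggerer process.
            self.defer(
                trigger=TimeDeltaTrigger(timedelta(hours=1)),
                method_name="execute_complete",
            )

        def execute_complete(self, context, event=None):
            # The scheduler re-queues the task here after the trigger fires.
            return "done"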

Using Deferrable Operators
--------------------------
@@ -21,7 +21,7 @@
Dynamic Task Mapping
====================

-Dynamic Task Mapping allows a way for a workflow to create a number of tasks at runtime based upon current data, rather than the DAG author having to know in advance how many tasks would be needed.
+Dynamic Task Mapping allows a way for a workflow to create a number of tasks at runtime based upon current data, rather than the Dag author having to know in advance how many tasks would be needed.

This is similar to defining your tasks in a for loop, but instead of having the DAG file fetch the data and do that itself, the scheduler can do this based on the output of a previous task. Right before a mapped task is executed the scheduler will create *n* copies of the task, one for each input.
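
For example, ``expand()`` creates one mapped task instance per element of its input; the sketch below assumes the TaskFlow decorators exposed by ``airflow.sdk`` (``airflow.decorators`` in Airflow 2):

.. code-block:: python

    from airflow.sdk import dag, task


    @dag(schedule=None)
    def mapped_example():
        @task
        def make_numbers():
            return [1, 2, 3]

        @task
        def add_one(x):
            return x + 1

        # The scheduler creates one "add_one" task instance per returned element.
        add_one.expand(x=make_numbers())


    mapped_example()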

12 changes: 6 additions & 6 deletions airflow-core/docs/best-practices.rst
@@ -954,12 +954,12 @@ The benefits of the operator are:
Airflow dependencies) to make use of multiple virtual environments
* You can run tasks with different sets of dependencies on the same workers - thus Memory resources are
reused (though see below about the CPU overhead involved in creating the venvs).
-* In bigger installations, DAG Authors do not need to ask anyone to create the venvs for you.
-As a DAG Author, you only have to have virtualenv dependency installed and you can specify and modify the
+* In bigger installations, Dag authors do not need to ask anyone to create the venvs for you.
+As a Dag author, you only have to have virtualenv dependency installed and you can specify and modify the
environments as you see fit.
* No changes in deployment requirements - whether you use Local virtualenv, or Docker, or Kubernetes,
the tasks will work without adding anything to your deployment.
-* No need to learn more about containers, Kubernetes as a DAG Author. Only knowledge of Python requirements
+* No need to learn more about containers, Kubernetes as a Dag author. Only knowledge of Python requirements
is required to author dags this way.
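
As a concrete shape of this, a task can declare its own requirements with ``@task.virtualenv``; a minimal sketch, assuming the TaskFlow decorator from ``airflow.sdk`` (``airflow.decorators`` in Airflow 2) and an arbitrary pinned dependency:

.. code-block:: python

    from airflow.sdk import task


    @task.virtualenv(requirements=["pandas==2.2.*"], system_site_packages=False)
    def summarize():
        # Runs in a virtualenv created just for this task, isolated from the
        # worker's own dependencies; pandas here is only an example requirement.
        import pandas as pd

        return int(pd.Series([1, 2, 3]).sum())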

There are certain limitations and overhead introduced by this operator:
@@ -1005,7 +1005,7 @@ and available in all the workers in case your Airflow runs in a distributed environment

This way you avoid the overhead and problems of re-creating the virtual environment but they have to be
prepared and deployed together with Airflow installation. Usually people who manage Airflow installation
-need to be involved, and in bigger installations those are usually different people than DAG Authors
+need to be involved, and in bigger installations those are usually different people than Dag authors
(DevOps/System Admins).

Those virtual environments can be prepared in various ways - if you use LocalExecutor they just need to be installed
@@ -1024,7 +1024,7 @@ The benefits of the operator are:
be added dynamically. This is good for both, security and stability.
* Limited impact on your deployment - you do not need to switch to Docker containers or Kubernetes to
make a good use of the operator.
-* No need to learn more about containers, Kubernetes as a DAG Author. Only knowledge of Python, requirements
+* No need to learn more about containers, Kubernetes as a Dag author. Only knowledge of Python, requirements
is required to author dags this way.
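
Illustratively, ``@task.external_python`` points the task at one of those pre-built interpreters instead of creating a venv per run; the interpreter path below is hypothetical:

.. code-block:: python

    from airflow.sdk import task

    # Hypothetical virtualenv prepared and deployed by the Deployment Manager.
    VENV_PYTHON = "/opt/airflow/venvs/reporting/bin/python"


    @task.external_python(python=VENV_PYTHON)
    def transform():
        # Runs in the pre-built environment, so there is no per-run venv creation.
        import pandas as pd

        return pd.Timestamp.now(tz="UTC").isoformat()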

The drawbacks:
@@ -1045,7 +1045,7 @@ The drawbacks:
same worker might be affected by previous tasks creating/modifying files etc.

You can think about the ``PythonVirtualenvOperator`` and ``ExternalPythonOperator`` as counterparts -
-that make it smoother to move from development phase to production phase. As a DAG author you'd normally
+that make it smoother to move from development phase to production phase. As a Dag author you'd normally
iterate with dependencies and develop your DAG using ``PythonVirtualenvOperator`` (thus decorating
your tasks with ``@task.virtualenv`` decorators) while after the iteration and changes you would likely
want to change it for production to switch to the ``ExternalPythonOperator`` (and ``@task.external_python``)
10 changes: 5 additions & 5 deletions airflow-core/docs/core-concepts/overview.rst
@@ -98,15 +98,15 @@ and can be scaled by running multiple instances of the components above.
The separation of components also allow for increased security, by isolating the components from each other
and by allowing to perform different tasks. For example separating *dag processor* from *scheduler*
allows to make sure that the *scheduler* does not have access to the *DAG files* and cannot execute
-code provided by *DAG author*.
+code provided by *Dag author*.

Also while single person can run and manage Airflow installation, Airflow Deployment in more complex
setup can involve various roles of users that can interact with different parts of the system, which is
an important aspect of secure Airflow deployment. The roles are described in detail in the
:doc:`/security/security_model` and generally speaking include:

* Deployment Manager - a person that installs and configures Airflow and manages the deployment
-* DAG author - a person that writes dags and submits them to Airflow
+* Dag author - a person that writes dags and submits them to Airflow
* Operations User - a person that triggers dags and tasks and monitors their execution

Architecture Diagrams
@@ -153,13 +153,13 @@ Distributed Airflow architecture
................................

This is the architecture of Airflow where components of Airflow are distributed among multiple machines
-and where various roles of users are introduced - *Deployment Manager*, **DAG author**,
+and where various roles of users are introduced - *Deployment Manager*, **Dag author**,
**Operations User**. You can read more about those various roles in the :doc:`/security/security_model`.

In the case of a distributed deployment, it is important to consider the security aspects of the components.
The *webserver* does not have access to the *DAG files* directly. The code in the ``Code`` tab of the
UI is read from the *metadata database*. The *webserver* cannot execute any code submitted by the
-**DAG author**. It can only execute code that is installed as an *installed package* or *plugin* by
+**Dag author**. It can only execute code that is installed as an *installed package* or *plugin* by
the **Deployment Manager**. The **Operations User** only has access to the UI and can only trigger
dags and tasks, but cannot author dags.

@@ -178,7 +178,7 @@ Separate DAG processing architecture
In a more complex installation where security and isolation are important, you'll also see the
standalone *dag processor* component that allows to separate *scheduler* from accessing *DAG files*.
This is suitable if the deployment focus is on isolation between parsed tasks. While Airflow does not yet
-support full multi-tenant features, it can be used to make sure that **DAG author** provided code is never
+support full multi-tenant features, it can be used to make sure that **Dag author** provided code is never
executed in the context of the scheduler.

.. image:: ../img/diagram_dag_processor_airflow_architecture.png
2 changes: 1 addition & 1 deletion airflow-core/docs/core-concepts/params.rst
@@ -191,7 +191,7 @@ JSON Schema Validation
.. note::
If ``schedule`` is defined for a DAG, params with defaults must be valid. This is validated during DAG parsing.
If ``schedule=None`` then params are not validated during DAG parsing but before triggering a DAG.
-This is useful in cases where the DAG author does not want to provide defaults but wants to force users provide valid parameters
+This is useful in cases where the Dag author does not want to provide defaults but wants to force users provide valid parameters
at time of trigger.
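
For instance, with ``schedule=None`` a required param can be declared without a usable default and is only validated when a run is triggered; a small sketch, assuming ``DAG`` and ``Param`` as exposed by ``airflow.sdk`` (``airflow.models.param`` in Airflow 2):

.. code-block:: python

    from airflow.sdk import DAG, Param

    with DAG(
        dag_id="manual_only_example",
        schedule=None,  # params are validated at trigger time, not at parse time
        params={
            # No default: whoever triggers the run must supply a positive integer.
            "batch_size": Param(type="integer", minimum=1, description="Rows per batch"),
        },
    ):
        pass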

.. note::
2 changes: 1 addition & 1 deletion airflow-core/docs/installation/index.rst
@@ -347,7 +347,7 @@ The requirements that Airflow might need depend on many factors, including (but
the technology/cloud/integration of monitoring etc.
* Technical details of database, hardware, network, etc. that your deployment is running on
* The complexity of the code you add to your DAGS, configuration, plugins, settings etc. (note, that
-Airflow runs the code that DAG author and Deployment Manager provide)
+Airflow runs the code that Dag author and Deployment Manager provide)
* The number and choice of providers you install and use (Airflow has more than 80 providers) that can
be installed by choice of the Deployment Manager and using them might require more resources.
* The choice of parameters that you use when tuning Airflow. Airflow has many configuration parameters
4 changes: 2 additions & 2 deletions airflow-core/docs/installation/upgrading_to_airflow3.rst
@@ -50,7 +50,7 @@ Airflow 3.x Architecture

Database Access Restrictions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-In Airflow 3, direct metadata database access from task code is now restricted. This is a key security and architectural improvement that affects how DAG authors interact with Airflow resources:
+In Airflow 3, direct metadata database access from task code is now restricted. This is a key security and architectural improvement that affects how Dag authors interact with Airflow resources:

- **No Direct Database Access**: Task code can no longer directly import and use Airflow database sessions or models.
- **API-Based Resource Access**: All runtime interactions (state transitions, heartbeats, XComs, and resource fetching) are handled through a dedicated Task Execution API.
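
As a small illustration of what this means in task code, runtime data such as XComs is reached through the Task Context rather than an ORM session; the sketch assumes ``get_current_context`` from ``airflow.sdk`` and a hypothetical upstream task id:

.. code-block:: python

    from airflow.sdk import get_current_context, task


    @task
    def consume():
        # Allowed: go through the Task Context (backed by the Task Execution API).
        ctx = get_current_context()
        value = ctx["ti"].xcom_pull(task_ids="produce")  # "produce" is hypothetical

        # Not allowed anymore: importing airflow.settings / airflow.models and
        # querying the metadata database directly from task code.
        return value
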
@@ -83,7 +83,7 @@ Step 2: Clean and back up your existing Airflow Instance
ensure you deploy your changes to your old instance prior to upgrade, and wait until your dags have all been reprocessed
(and all errors gone) before you proceed with upgrade.

-Step 3: Dag Authors - Check your Airflow dags for compatibility
+Step 3: Dag authors - Check your Airflow dags for compatibility
----------------------------------------------------------------

To minimize friction for users upgrading from prior versions of Airflow, we have created a dag upgrade check utility using `Ruff <https://docs.astral.sh/ruff/>`_ combined with `AIR <https://docs.astral.sh/ruff/rules/#airflow-air>`_ rules.
24 changes: 12 additions & 12 deletions airflow-core/docs/public-airflow-interface.rst
@@ -36,8 +36,8 @@ and extending Airflow capabilities by writing new executors, plugins, operators
Public Interface can be useful for building custom tools and integrations with other systems,
and for automating certain aspects of the Airflow workflow.

-The primary public interface for DAG Authors and task execution is using task SDK
-Airflow task SDK is the primary public interface for DAG Authors and for task execution
+The primary public interface for Dag authors and task execution is using task SDK
+Airflow task SDK is the primary public interface for Dag authors and for task execution
:doc:`airflow.sdk namespace <core-concepts/taskflow>`. Direct access to the metadata database
from task code is no longer allowed. Instead, use the :doc:`Stable REST API <stable-rest-api-ref>`,
`Python Client <https://github.com/apache/airflow-client-python>`_, or Task Context methods.
@@ -87,12 +87,12 @@ in details (such as output format and available flags) so if you want to rely on
way, the Stable REST API is recommended.


-Using the Public Interface for DAG Authors
+Using the Public Interface for Dag authors
==========================================

-The primary interface for DAG Authors is the :doc:`airflow.sdk namespace <core-concepts/taskflow>`.
+The primary interface for Dag authors is the :doc:`airflow.sdk namespace <core-concepts/taskflow>`.
This provides a stable, well-defined interface for creating DAGs and tasks that is not subject to internal
-implementation changes. The goal of this change is to decouple DAG authoring from Airflow internals (Scheduler,
+implementation changes. The goal of this change is to decouple Dag authoring from Airflow internals (Scheduler,
API Server, etc.), providing a version-agnostic, stable interface for writing and maintaining DAGs across Airflow versions.
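
As a quick sketch of what that looks like in practice (assuming the ``airflow.sdk`` exports named here), a complete dag can be written without importing anything from Airflow internals:

.. code-block:: python

    from airflow.sdk import DAG, task

    with DAG(dag_id="hello_sdk", schedule=None):

        @task
        def hello():
            # Plain TaskFlow task; no scheduler or API-server internals involved.
            return "hello"

        hello()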

**Key Imports from airflow.sdk:**
@@ -164,17 +164,17 @@ You can read more about dags in :doc:`Dags <core-concepts/dags>`.
References for the modules used in dags are here:

.. note::
-The airflow.sdk namespace provides the primary interface for DAG Authors.
+The airflow.sdk namespace provides the primary interface for Dag authors.
For detailed API documentation, see the `Task SDK Reference <https://airflow.apache.org/docs/task-sdk/stable/>`_.

.. note::
The :class:`~airflow.models.dagbag.DagBag` class is used internally by Airflow for loading DAGs
-from files and folders. DAG Authors should use the :class:`~airflow.sdk.DAG` class from the
+from files and folders. Dag authors should use the :class:`~airflow.sdk.DAG` class from the
airflow.sdk namespace instead.

.. note::
The :class:`~airflow.models.dagrun.DagRun` class is used internally by Airflow for DAG run
-management. DAG Authors should access DAG run information through the Task Context via
+management. Dag authors should access DAG run information through the Task Context via
:func:`~airflow.sdk.get_current_context` or use the :class:`~airflow.sdk.types.DagRunProtocol`
interface.

@@ -231,7 +231,7 @@ Example of accessing task instance information through Task Context:

.. note::
The :class:`~airflow.models.taskinstancekey.TaskInstanceKey` class is used internally by Airflow
-for identifying task instances. DAG Authors should access task instance information through the
+for identifying task instances. Dag authors should access task instance information through the
Task Context via :func:`~airflow.sdk.get_current_context` instead.
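
For instance, the identifying fields that ``TaskInstanceKey`` carries are all reachable from the context; a minimal sketch, assuming the runtime task instance exposes ``dag_id``, ``task_id``, ``run_id`` and ``try_number``:

.. code-block:: python

    from airflow.sdk import get_current_context, task


    @task
    def who_am_i():
        ti = get_current_context()["ti"]
        # Identity of the running task instance, with no TaskInstanceKey or DB access.
        return f"{ti.dag_id}.{ti.task_id} run={ti.run_id} try={ti.try_number}"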


@@ -257,7 +257,7 @@ by extending them:
Public Airflow utilities
========================

-When writing or extending Hooks and Operators, DAG Authors and developers can
+When writing or extending Hooks and Operators, Dag authors and developers can
use the following classes:

* The :class:`~airflow.sdk.Connection`, which provides access to external service credentials and configuration.
@@ -485,10 +485,10 @@ implemented in the community providers.

Decorators
==========
-DAG Authors can use decorators to author dags using the :doc:`TaskFlow <core-concepts/taskflow>` concept.
+Dag authors can use decorators to author dags using the :doc:`TaskFlow <core-concepts/taskflow>` concept.
All Decorators derive from :class:`~airflow.sdk.bases.decorator.TaskDecorator`.
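
A brief sketch of a few of these decorators composed together, assuming the ``airflow.sdk`` exports listed below:

.. code-block:: python

    from airflow.sdk import dag, task, task_group


    @dag(schedule=None)
    def decorated_example():
        @task
        def extract():
            return [1, 2, 3]

        @task
        def load(rows):
            print(f"loaded {len(rows)} rows")

        @task_group
        def etl():
            load(extract())

        etl()


    decorated_example()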

-The primary decorators for DAG Authors are now in the airflow.sdk namespace:
+The primary decorators for Dag authors are now in the airflow.sdk namespace:
:func:`~airflow.sdk.dag`, :func:`~airflow.sdk.task`, :func:`~airflow.sdk.asset`,
:func:`~airflow.sdk.setup`, :func:`~airflow.sdk.task_group`, :func:`~airflow.sdk.teardown`,
:func:`~airflow.sdk.chain`, :func:`~airflow.sdk.chain_linear`, :func:`~airflow.sdk.cross_downstream`,