5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,10 @@

- Gracefully shut down all concurrent tasks by forwarding the SIGTERM signal ([#741]).
- Bump testing-tools to `0.3.0-stackable0.0.0-dev` ([#733]).
- BREAKING: Replace Airflow credentials-secret with database/broker connections ([#743]).
- Existing secret is retained for the admin user alone.
- Database/broker connections can be defined either using structs or as a generic connection string (see ADR 29).
- Removed standalone examples folder (not affecting the documentation).

### Fixed

@@ -28,6 +32,7 @@
[#734]: https://github.com/stackabletech/airflow-operator/pull/734
[#741]: https://github.com/stackabletech/airflow-operator/pull/741
[#742]: https://github.com/stackabletech/airflow-operator/pull/742
[#743]: https://github.com/stackabletech/airflow-operator/pull/743

## [25.11.0] - 2025-11-07

23 changes: 8 additions & 15 deletions docs/modules/airflow/examples/example-airflow-dags-configmap.yaml
@@ -7,9 +7,12 @@ spec:
image:
productVersion: 3.1.6
clusterConfig:
loadExamples: false
exposeConfig: false
credentialsSecret: simple-airflow-credentials
credentialsSecret: admin-user-credentials
metadataDatabase:
postgresql:
host: airflow-postgresql
databaseName: airflow
credentialsSecret: postgresql-credentials
volumes:
- name: cm-dag # <3>
configMap:
@@ -23,18 +26,8 @@
listenerClass: external-unstable
roleGroups:
default:
envOverrides:
envOverrides: &envOverrides
AIRFLOW__CORE__DAGS_FOLDER: "/dags" # <8>
replicas: 1
celeryExecutors:
roleGroups:
default:
envOverrides:
AIRFLOW__CORE__DAGS_FOLDER: "/dags" # <8>
replicas: 2
schedulers:
roleGroups:
default:
envOverrides:
AIRFLOW__CORE__DAGS_FOLDER: "/dags" # <8>
replicas: 1
...
27 changes: 19 additions & 8 deletions docs/modules/airflow/examples/example-airflow-incluster.yaml
@@ -9,28 +9,39 @@ spec:
clusterConfig:
loadExamples: false
exposeConfig: false
credentialsSecret: simple-airflow-credentials
credentialsSecret: admin-user-credentials
metadataDatabase:
postgresql:
host: airflow-postgresql
databaseName: airflow
credentialsSecret: postgresql-credentials
webservers:
roleConfig:
listenerClass: external-unstable
roleGroups:
default:
envOverrides:
envOverrides: &envOverrides
AIRFLOW_CONN_KUBERNETES_IN_CLUSTER: "kubernetes://?__extra__=%7B%22extra__kubernetes__in_cluster%22%3A+true%2C+%22extra__kubernetes__kube_config%22%3A+%22%22%2C+%22extra__kubernetes__kube_config_path%22%3A+%22%22%2C+%22extra__kubernetes__namespace%22%3A+%22%22%7D"
replicas: 1
schedulers:
roleGroups:
default:
envOverrides:
AIRFLOW_CONN_KUBERNETES_IN_CLUSTER: "kubernetes://?__extra__=%7B%22extra__kubernetes__in_cluster%22%3A+true%2C+%22extra__kubernetes__kube_config%22%3A+%22%22%2C+%22extra__kubernetes__kube_config_path%22%3A+%22%22%2C+%22extra__kubernetes__namespace%22%3A+%22%22%7D"
envOverrides: *envOverrides
replicas: 1
celeryExecutors:
celeryResultBackend:
postgresql:
host: airflow-postgresql
databaseName: airflow
credentialsSecret: postgresql-credentials
celeryBrokerUrl:
redis:
host: airflow-redis-master
credentialsSecret: redis-credentials
roleGroups:
default:
envOverrides:
AIRFLOW_CONN_KUBERNETES_IN_CLUSTER: "kubernetes://?__extra__=%7B%22extra__kubernetes__in_cluster%22%3A+true%2C+%22extra__kubernetes__kube_config%22%3A+%22%22%2C+%22extra__kubernetes__kube_config_path%22%3A+%22%22%2C+%22extra__kubernetes__namespace%22%3A+%22%22%7D"
envOverrides: *envOverrides
replicas: 1
# in case of using kubernetesExecutors
# kubernetesExecutors:
# envOverrides:
# AIRFLOW_CONN_KUBERNETES_IN_CLUSTER: "kubernetes://?__extra__=%7B%22extra__kubernetes__in_cluster%22%3A+true%2C+%22extra__kubernetes__kube_config%22%3A+%22%22%2C+%22extra__kubernetes__kube_config_path%22%3A+%22%22%2C+%22extra__kubernetes__namespace%22%3A+%22%22%7D"
# envOverrides: *envOverrides
16 changes: 0 additions & 16 deletions docs/modules/airflow/examples/example-airflow-secret.yaml

This file was deleted.

@@ -2,15 +2,27 @@
apiVersion: v1
kind: Secret
metadata:
name: simple-airflow-credentials
name: admin-user-credentials
type: Opaque
stringData:
adminUser.username: airflow
adminUser.firstname: Airflow
adminUser.lastname: Admin
adminUser.email: airflow@airflow.com
adminUser.password: airflow
connections.sqlalchemyDatabaseUri: postgresql+psycopg2://airflow:airflow@airflow-postgresql.default.svc.cluster.local/airflow
# Only needed when using celery workers (instead of Kubernetes executors)
connections.celeryResultBackend: db+postgresql://airflow:airflow@airflow-postgresql.default.svc.cluster.local/airflow
connections.celeryBrokerUrl: redis://:redis@airflow-redis-master:6379/0
---
apiVersion: v1
kind: Secret
metadata:
name: postgresql-credentials
stringData:
username: airflow
password: airflow
---
apiVersion: v1
kind: Secret
metadata:
name: redis-credentials
stringData:
username: ""
password: redis
16 changes: 15 additions & 1 deletion docs/modules/airflow/examples/getting_started/code/airflow.yaml
@@ -10,14 +10,28 @@ spec:
clusterConfig:
loadExamples: true
exposeConfig: false
credentialsSecret: simple-airflow-credentials
credentialsSecret: admin-user-credentials
metadataDatabase:
postgresql:
host: airflow-postgresql
databaseName: airflow
credentialsSecret: postgresql-credentials
webservers:
roleConfig:
listenerClass: external-unstable
roleGroups:
default:
replicas: 1
celeryExecutors:
celeryResultBackend:
postgresql:
host: airflow-postgresql
databaseName: airflow
credentialsSecret: postgresql-credentials
celeryBrokerUrl:
redis:
host: airflow-redis-master
credentialsSecret: redis-credentials
roleGroups:
default:
replicas: 1
21 changes: 13 additions & 8 deletions docs/modules/airflow/pages/getting_started/first_steps.adoc
@@ -10,9 +10,10 @@ With the external dependencies required by Airflow (Postgresql and Redis) instal

Supported versions for PostgreSQL and Redis can be found in the https://airflow.apache.org/docs/apache-airflow/stable/installation/prerequisites.html#prerequisites[Airflow documentation].

== Secret with Airflow credentials
== Airflow secrets

Create a Secret with the necessary credentials; this entails database connection credentials as well as an admin account for Airflow itself.
Secrets are required for the mandatory metadata database connection and the Airflow admin user.
When using the Celery executor, credentials for the Celery result backend database and the broker must also be provided.
Create a file called `airflow-credentials.yaml`:

[source,yaml]
@@ -23,13 +24,12 @@
[source,bash]
include::example$getting_started/code/getting_started.sh[tag=apply-airflow-credentials]

`connections.sqlalchemyDatabaseUri` must contain the connection string to the SQL database storing the Airflow metadata.
`postgresql-credentials` contains credentials for the SQL database storing the Airflow metadata.
In this example the same PostgreSQL database is used for both the Airflow metadata and the Celery result backend.

`connections.celeryResultBackend` must contain the connection string to the SQL database storing the job metadata (the example above uses the same PostgreSQL database for both).
`redis-credentials` contains credentials for the Redis instance used for queuing the jobs submitted to the Airflow executor(s).

`connections.celeryBrokerUrl` must contain the connection string to the Redis instance used for queuing the jobs submitted to the airflow executor(s).

The `adminUser` fields are used to create an admin user.
The `adminUser` fields in `admin-user-credentials` are used to create an admin user.

NOTE: The admin user is disabled if you use a non-default authentication mechanism like LDAP.

@@ -61,20 +61,25 @@
Where:

* `metadata.name` contains the name of the Airflow cluster.
* `spec.clusterConfig.metadataDatabase` specifies one of the supported database types (in this case, `postgresql`) along with references to the host, database and the secret containing the connection credentials.
* the product version of the Docker image provided by Stackable must be set in `spec.image.productVersion`.
* `spec.celeryExecutors`: deploys executors managed by Airflow's Celery engine.
Alternatively you can use `kubernetesExecutors`, which use Airflow's Kubernetes engine for executor management.
For more information see the https://airflow.apache.org/docs/apache-airflow/stable/executor/index.html#executor-types[Airflow documentation on executor types].
* `spec.celeryExecutors.celeryResultBackend`: specifies one of the supported database types (in this case, `postgresql`) along with references to the host, database and the secret containing the connection credentials.
* `spec.celeryExecutors.celeryBrokerUrl`: specifies one of the supported queue/broker types (in this case, `redis`) along with references to the host and the secret containing the connection credentials.
* the `spec.clusterConfig.loadExamples` key is optional and defaults to `false`.
It is set to `true` here as the example DAGs are used when verifying the installation.
* the `spec.clusterConfig.exposeConfig` key is optional and defaults to `false`. Set it to `true` only as an aid to verify the configuration; it should never be enabled in anything other than test or demo clusters.
* the previously created secret must be referenced in `spec.clusterConfig.credentialsSecret`.
* the secret containing the admin user information must be referenced in `spec.clusterConfig.credentialsSecret`.
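
The fields described above can be pulled together into a condensed sketch of the cluster definition (the `apiVersion`/`kind` header is assumed from the standard Stackable CRD layout, and the `schedulers` role is omitted for brevity):

[source,yaml]
----
apiVersion: airflow.stackable.tech/v1alpha1  # assumed CRD header
kind: AirflowCluster
metadata:
  name: airflow
spec:
  image:
    productVersion: 3.1.6
  clusterConfig:
    loadExamples: true
    exposeConfig: false
    credentialsSecret: admin-user-credentials
    metadataDatabase:
      postgresql:
        host: airflow-postgresql
        databaseName: airflow
        credentialsSecret: postgresql-credentials
  webservers:
    roleConfig:
      listenerClass: external-unstable
    roleGroups:
      default:
        replicas: 1
  celeryExecutors:
    celeryResultBackend:
      postgresql:
        host: airflow-postgresql
        databaseName: airflow
        credentialsSecret: postgresql-credentials
    celeryBrokerUrl:
      redis:
        host: airflow-redis-master
        credentialsSecret: redis-credentials
    roleGroups:
      default:
        replicas: 1
----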

NOTE: The version you need to specify for `spec.image.productVersion` is the desired version of Apache Airflow.
You can optionally pin `spec.image.stackableVersion` to a certain release like `23.11.0`, but it is recommended to leave it out and use the default provided by the operator.
Check our https://oci.stackable.tech/[image registry,window=_blank] for a list of available versions. Information on how to browse the registry can be found xref:contributor:project-overview.adoc#docker-images[here,window=_blank].
It should generally be safe to simply use the latest version that is available.

NOTE: Refer to xref:usage-guide/db-connect.adoc[] for more information about database/broker connections.

This creates the actual Airflow cluster.

After a while, all the Pods in the StatefulSets should be ready:
78 changes: 78 additions & 0 deletions docs/modules/airflow/pages/usage-guide/db-connect.adoc
@@ -0,0 +1,78 @@
= Database connections
:description: Configure Airflow Database connectivity.

Airflow requires a metadata database for storing e.g. DAG, task and job data.
The actual connection string is constructed by the operator, so the user does not need to remember its exact structure.
The same database can be accessed using different drivers: this is also handled by the operator, since the context (e.g. job metadata vs. queued job metadata) is known when parsing the resource file.
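
For example, the same PostgreSQL database is addressed with different URI schemes depending on the context; the operator derives these automatically (the concrete host and credentials below are the demo values from the example secrets):

[source]
----
postgresql+psycopg2://airflow:airflow@airflow-postgresql/airflow   # SQLAlchemy (metadata database)
db+postgresql://airflow:airflow@airflow-postgresql/airflow         # Celery result backend
----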

== Typed connections

[source,yaml]
----
---
spec:
  clusterConfig:
    metadataDatabase:
      postgresql: # <1>
        host: airflow-postgresql
        databaseName: airflow
        credentialsSecret: postgresql-credentials # <2>
----
<1> A reference to one of the supported database backends (e.g. `postgresql`).
<2> A reference to a secret which must contain the two fields `username` and `password`.

The queue/broker connection details are only needed when running the Celery executor.
The `celeryResultBackend` definition uses the same structure as `metadataDatabase` shown above.
The `celeryBrokerUrl` definition is similar but does not require a `databaseName`.

[source,yaml]
----
---
spec:
  celeryExecutors:
    celeryResultBackend:
      postgresql: # <1>
        host: airflow-postgresql
        databaseName: airflow
        credentialsSecret: postgresql-credentials # <2>
    celeryBrokerUrl:
      redis: # <3>
        host: airflow-redis-master
        credentialsSecret: redis-credentials # <2>
----
<1> A reference to one of the supported database backends (e.g. `postgresql`).
<2> A reference to a secret which must contain the two fields `username` and `password`.
<3> A reference to one of the supported queue brokers (e.g. `redis`).
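
The referenced Secrets follow a plain `username`/`password` layout; a sketch with the demo values used in the examples:

[source,yaml]
----
---
apiVersion: v1
kind: Secret
metadata:
  name: postgresql-credentials
stringData:
  username: airflow
  password: airflow
---
apiVersion: v1
kind: Secret
metadata:
  name: redis-credentials
stringData:
  username: ""  # the example Redis setup authenticates with the password only
  password: redis
----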

== Generic connections

Alternatively, these connections can be defined in full in a referenced secret:

[source,yaml]
----
---
spec:
  clusterConfig:
    metadataDatabase:
      generic:
        uriSecret: postgresql-metadata # <1>
----

[source,yaml]
----
---
spec:
  celeryExecutors:
    celeryResultBackend:
      generic:
        uriSecret: postgresql-celery # <2>
    celeryBrokerUrl:
      generic:
        uriSecret: redis-celery # <3>
----

<1> A reference to a secret which must contain the single field `uri`, e.g.
`uri: postgresql+psycopg2://airflow:airflow@airflow-postgresql/airflow`
<2> A reference to a secret which must contain the single field `uri`, e.g.
`uri: db+postgresql://airflow:airflow@airflow-postgresql/airflow`
<3> A reference to a secret which must contain the single field `uri`, e.g.
`uri: redis://:redis@airflow-redis-master:6379/0`
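
A referenced URI secret is therefore just a single-key Secret; for example (demo credentials, matching the URIs shown above):

[source,yaml]
----
apiVersion: v1
kind: Secret
metadata:
  name: postgresql-metadata
stringData:
  uri: postgresql+psycopg2://airflow:airflow@airflow-postgresql/airflow
----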
1 change: 1 addition & 0 deletions docs/modules/airflow/pages/usage-guide/logging.adoc
@@ -23,6 +23,7 @@ spec:
"flask_appbuilder":
level: WARN
celeryExecutors:
...
config:
logging:
enableVectorAgent: true
@@ -27,6 +27,7 @@ spec:
default:
replicas: 2
celeryExecutors:
...
config:
resources:
cpu:
@@ -12,6 +12,7 @@ E.g. you would change the following example
----
spec:
celeryExecutors:
...
roleGroups:
default:
replicas: 2
1 change: 1 addition & 0 deletions docs/modules/airflow/partials/nav.adoc
@@ -4,6 +4,7 @@
* xref:airflow:required-external-components.adoc[]
* xref:airflow:usage-guide/index.adoc[]
** xref:airflow:usage-guide/db-init.adoc[]
** xref:airflow:usage-guide/db-connect.adoc[]
** xref:airflow:usage-guide/mounting-dags.adoc[]
** xref:airflow:usage-guide/applying-custom-resources.adoc[]
** xref:airflow:usage-guide/listenerclass.adoc[]