Create guide for Machine Learning Engine operators #8207 #8968

Closed

Changes from 1 commit (155 commits in this pull request)
a28c66f
[AIRFLOW-4734] Upsert functionality for PostgresHook.insert_rows() (#…
oxymor0n Apr 30, 2020
4a1d71d
Fix the process of requirements generations (#8648)
potiuk Apr 30, 2020
b185b36
Reduce response payload size of /dag_stats and /task_stats (#8633)
XD-DENG Apr 30, 2020
4421f01
Improve template capabilities of EMR job and step operators (#8572)
oripwk May 1, 2020
6560f29
Enhanced documentation around Cluster Policy (#8661)
vardancse May 1, 2020
511d98e
[AIRFLOW-4363] Fix JSON encoding error (#8287)
retornam May 1, 2020
ce50538
Add check for pre-2.0 style hostname_callable config value (#8637)
dimberman May 1, 2020
0a7b500
Fix displaying Executor Class Name in "Base Job" table (#8679)
kaxil May 2, 2020
d92e848
Persist start/end date and duration for DummyOperator Task Instance (…
XD-DENG May 2, 2020
0954140
Ensure "started"/"ended" in tooltips are not shown if job not started…
XD-DENG May 2, 2020
19ac45a
Add support for fetching logs from running pods (#8626)
msumit May 3, 2020
1100cea
Remove _get_pretty_exception_message in PrestoHook
fusuiyi123 May 3, 2020
62796b9
Improve tutorial - Include all imports statements (#8670)
Lyalpha May 3, 2020
dd6a7bc
Group Google services in one section (#8623)
mik-laj May 3, 2020
ac59735
Refactor test_variable_command.py (#8535)
May 3, 2020
bc45fa6
Add system test and docs for Facebook Ads operators (#8503)
randr97 May 3, 2020
0b598a2
Fix connection add/edit for spark (#8685)
XD-DENG May 3, 2020
ffbbbfc
Sort connection type list in add/edit page alphabetically (#8692)
XD-DENG May 3, 2020
d8cb0b5
Support k8s auth method in Vault Secrets provider (#8640)
May 3, 2020
67caae0
Add system test for gcs_to_bigquery (#8556)
joppevos May 4, 2020
aec768b
[AIRFLOW-7008] Add perf kit with common used decorators/contexts (#7650)
mik-laj May 4, 2020
c3a46b9
Invalid output in test_variable assertion (#8698)
mik-laj May 4, 2020
5ddc458
Change provider:GCP to provider:Google for Labeler Bot (#8697)
mik-laj May 4, 2020
caa60b1
Remove config side effects from tests (#8607)
turbaszek May 4, 2020
923f423
Check consistency between the reference list and howto directory (#8690)
mik-laj May 4, 2020
b31ad51
Prevent clickable sorting on non sortable columns in TI view (#8681)
Acehaidrey May 4, 2020
6600e47
Import Connection directly from multiprocessing.connection. (#8711)
jhtimmins May 4, 2020
2c92a29
Fix typo in Google Display & Video 360 guide
michalslowikowski00 May 5, 2020
41b4c27
Carefully parse warning messages when building documentation (#8693)
mik-laj May 5, 2020
8d6f1aa
Support num_retries field in env var for GCP connection (#8700)
mik-laj May 5, 2020
c717d12
Add __repr__ for DagTag so tags display properly in /dagmodel/show (#…
XD-DENG May 5, 2020
487b5cc
Add guide for Apache Spark operators (#8305)
May 5, 2020
520aeed
Fix pickling failure when spawning processes (#8671)
jhtimmins May 6, 2020
25ee421
Support all RuntimeEnvironment parameters in DataflowTemplatedJobStar…
mik-laj May 6, 2020
d923b5b
Add jinja template test for AirflowVersion (#8505)
mik-laj May 6, 2020
e673413
Avoid loading executors in jobs (#7888)
mik-laj May 6, 2020
3437bea
Optimize count query on /home (#8729)
mik-laj May 6, 2020
336aa27
Correctly deserialize dagrun_timeout field on DAGs (#8735)
ashb May 6, 2020
2e9ef45
Stop Stalebot on Github issues (#8738)
kaxil May 6, 2020
fd6e057
Make loading plugins from entrypoint fault-tolerant (#8732)
kaxil May 6, 2020
bd29ee3
Ensure test_logging_config.test_reload_module works in spawn mode. (#…
jhtimmins May 6, 2020
d15839d
Latest debian-buster release broke image build (#8758)
potiuk May 7, 2020
ff5b701
Add google_api_to_s3_transfer example dags and system tests (#8581)
feluelle May 7, 2020
7c04604
Add google_api_to_s3_transfer docs howto link (#8761)
feluelle May 7, 2020
723c52c
Add documentation for SpannerDeployInstanceOperator (#8750)
ephraimbuddy May 7, 2020
6e4f5fa
[AIRFLOW-4568]The ExternalTaskSensor should be configurable to raise …
lokeshlal May 7, 2020
b7566e1
Add SQL query tracking for pytest (#8754)
mik-laj May 8, 2020
58aefb2
Added SDFtoGCSOperator (#8740)
michalslowikowski00 May 8, 2020
b37ce29
Patch Pool.DEFAULT_POOL_NAME in BaseOperator (#8587)
vshshjn7 May 8, 2020
2bd3e76
Support same short flags for `create user` as 1.10 did for `user_crea…
ashb May 8, 2020
09770e4
Add WorldRemit as Airflow user (#8786)
May 8, 2020
a091c1f
fix typing errors reported by dmypy (#8773)
May 8, 2020
42c5975
Update example SingularityOperator DAG (#8790)
ashb May 8, 2020
791d1a7
Backport packages are renamed to include backport in their name (#8767)
potiuk May 9, 2020
100f530
Fixed test-target command (#8795)
potiuk May 9, 2020
db1b51d
Make celery worker_prefetch_multiplier configurable (#8695)
nadflinn May 9, 2020
bc19778
[AIP-31] Implement XComArg to pass output from one operator to the ne…
jonathanshir May 9, 2020
7506c73
Add default `conf` parameter to Spark JDBC Hook (#8787)
May 9, 2020
5e1c33a
Fix docs on creating CustomOperator (#8678)
JonnyWaffles May 10, 2020
21cc7d7
Document default timeout value for SSHOperator (#8744)
abhilash1in May 10, 2020
cd635dd
[AIRFLOW-5906] Add authenticator parameter to snowflake_hook (#8642)
koszti May 10, 2020
c7788a6
Add imap_attachment_to_s3 example dag and system test (#8669)
feluelle May 10, 2020
a715aa6
Correctly store non-default Nones in serialized tasks/dags (#8772)
ashb May 10, 2020
280f1f0
Correctly restore upstream_task_ids when deserializing Operators (#8775)
ashb May 10, 2020
cbebed2
Allow passing backend_kwargs to AWS SSM client (#8802)
kaxil May 10, 2020
79ef8be
Added Upload Multiple Entity Read Files to specified big query datase…
michalslowikowski00 May 10, 2020
e1cc17e
Remove old airflow logger causing side effects in tests (#8746)
kaxil May 10, 2020
9bb91ef
Add comments to breeze scripts (#8797)
potiuk May 10, 2020
493b685
Add separate example DAGs and system tests for google cloud speech (#…
ephraimbuddy May 10, 2020
bed1995
Avoid color info in response of /dag_stats & /task_stats (#8742)
XD-DENG May 11, 2020
b59adab
Support cron presets in date_range function (#7777)
Rcharriol May 11, 2020
5f3774a
[AIRFLOW-6921] Fetch celery states in bulk (#7542)
mik-laj May 11, 2020
d5c4001
Useful help information in test-target and docker-compose commands (#…
potiuk May 11, 2020
a6434a5
Fix bash command in performance test dag (#8812)
ashb May 11, 2020
0c3db84
[AIRFLOW-7068] Create EC2 Hook, Operator and Sensor (#7731)
mustafagok May 11, 2020
5ae76d8
Option to set end_date for performance testing dag. (#8817)
ashb May 11, 2020
2ec0130
[AIRFLOW-4549] Allow skipped tasks to satisfy wait_for_downstream (#7…
teddyhartanto May 11, 2020
1fb9f07
Synchronize extras between airflow and providers (#8819)
potiuk May 11, 2020
d590e5e
Add option to propagate tags in ECSOperator (#8811)
JPonte May 11, 2020
f410d64
Use fork when test relies on mock.patch in parent process. (#8794)
jhtimmins May 11, 2020
3ad4f96
[AIRFLOW-1156] BugFix: Unpausing a DAG with catchup=False creates an …
kaxil May 11, 2020
4375607
Fix typo. 'zobmies' => 'zombies'. (#8832)
jhtimmins May 12, 2020
78a48db
Add support for non-default orientation in `dag show` command (#8834)
klsnreddy May 12, 2020
7533378
Access function to be pickled as attribute, not method, to avoid erro…
jhtimmins May 12, 2020
1d12c34
Refactor BigQuery check operators (#8813)
turbaszek May 12, 2020
4b06fde
Fix Flake8 errors (#8841)
kaxil May 12, 2020
6911dfe
Fix template fields in Google operators (#8840)
turbaszek May 12, 2020
01db738
Azure storage 0.37.0 is not installable any more (#8833)
potiuk May 12, 2020
578fc51
[AIRFLOW-4543] Update slack operator to support slackclient v2 (#5519)
serkef May 12, 2020
7236862
[AIRFLOW-2310] Enable AWS Glue Job Integration (#6007)
abdulbasitds May 12, 2020
8b54919
Refactor BigQuery hook methods to use python library (#8631)
turbaszek May 12, 2020
7d69987
Remove duplicate code from perf_kit (#8843)
kaxil May 12, 2020
e1e833b
Update GoogleBaseHook to not follow 308 and use 60s timeout (#8816)
waiyan1612 May 13, 2020
8a94d18
Fix Environment Variable in perf/scheduler_dag_execution_timing.py (#…
kaxil May 13, 2020
ed3f513
Correctly pass sleep time from AWSAthenaOperator down to the hook. (#…
ashb May 13, 2020
f1dc2e0
The librabbitmq library stopped installing for python3.7 (#8853)
potiuk May 13, 2020
c3af681
Convert tests/jobs/test_base_job.py to pytest (#8856)
ashb May 13, 2020
81fb9d6
Add metric for monitoring email notification failures (#8771)
May 13, 2020
2878f17
Relax Flask-Appbuilder version to ~=2.3.4 (#8857)
feluelle May 13, 2020
e61b9bb
Add AWS EMR System tests (#8618)
xinbinhuang May 13, 2020
fc862a3
Do not create a separate process for one task in CeleryExecutor (#8855)
mik-laj May 14, 2020
961c710
Make Custom XCom backend a subsection of XCom docs (#8869)
turbaszek May 14, 2020
fe42191
Don't use ProcessorAgent to test ProcessorManager (#8871)
ashb May 14, 2020
4813b94
Create log file w/abs path so tests pass on MacOS (#8820)
jhtimmins May 14, 2020
35c523f
Fix list formatting of plugins doc. (#8873)
ashb May 15, 2020
85bbab2
Add EMR operators howto docs (#8863)
xinbinhuang May 15, 2020
f82ad45
Fix KubernetesPodOperator pod name length validation (#8829)
dsaiztc May 15, 2020
92585ca
Added automated release notes generation for backport operators (#8807)
potiuk May 15, 2020
82de6f7
Spend less time waiting for DagFileProcessor processes to complete (#…
ashb May 15, 2020
a3a4bac
JIRA and Github issues explanation (#8539)
mschickensoup May 16, 2020
f4edd90
Speed up TestAwsLambdaHook by not actually running a function (#8882)
ashb May 16, 2020
15273f0
Check for same task instead of Equality to detect Duplicate Tasks (#8…
kaxil May 16, 2020
a3a3411
Fix master failing on generating requirements (#8885)
potiuk May 16, 2020
f3521fb
Regenerate readme files for backport package release (#8886)
potiuk May 16, 2020
f6d5917
Updated docs for experimental API /dags/<DAG_ID>/dag_runs (#8800)
randr97 May 16, 2020
707bb0c
[AIRFLOW-6535] Add AirflowFailException to fail without any retry (#7…
jstern May 16, 2020
a546a10
Add Snowflake system test (#8422)
dhuang May 16, 2020
8985df0
Monitor pods by labels instead of names (#6377)
dimberman May 16, 2020
ff342fc
Added SalesforceHook missing method to return only dataframe (#8565) …
pranjalmittal May 17, 2020
12c5e5d
Prepare release candidate for backport packages (#8891)
potiuk May 17, 2020
2121f49
Avoid failure on transient requirements in CI image (#8892)
potiuk May 17, 2020
841d816
Allow setting the pooling time in DLPHook (#8824)
xuan616 May 19, 2020
dd57ec9
Fix task and dag stats on home page (#8865)
May 19, 2020
375d1ca
Release candidate 2 for backport packages 2020.05.20 (#8898)
potiuk May 19, 2020
bae5cc2
Fix race in Celery tests by pre-creating result tables (#8909)
ashb May 19, 2020
499493c
[AIRFLOW-6586] Improvements to gcs sensor (#7197)
May 19, 2020
ce7fdea
UX Fix: Prevent undesired text selection with DAG title selection in …
ryanahamilton May 19, 2020
fef00e5
Use Debian's provided JRE from Buster (#8919)
ashb May 20, 2020
5360045
Fix incorrect Env Var to stop Scheduler from creating DagRuns (#8920)
kaxil May 20, 2020
51d9557
Re-run all tests when Dockerfile or Github worflow change (#8924)
ashb May 20, 2020
c6224e2
Remove unused self.max_threads argument in SchedulerJob (#8935)
kaxil May 21, 2020
12c22e0
Added Greytip to Airflow Users list (#8887)
Sarankrishna May 21, 2020
8476c1e
Hive/Hadoop minicluster needs JDK8 and JAVA_HOME to work (#8938)
ashb May 21, 2020
f17b4bb
Fix DagRun Prefix for Performance script (#8934)
kaxil May 21, 2020
a9dfd7d
Remove side-effect of session in FAB (#8940)
mik-laj May 21, 2020
f3f74c7
Add TaskInstance state to TI Tooltip to be colour-blind friendlier (#…
harrisjoseph May 21, 2020
8d3acd7
Fix docstring in DagFileProcessor._schedule_task_instances (#8948)
kaxil May 21, 2020
47413d9
Remove singularity from CI images (#8945)
ashb May 21, 2020
16206cd
Update example webserver_config.py to show correct CSRF config (#8944)
ashb May 21, 2020
97b6cc7
Add note in Updating.md about the removel of DagRun.ID_PREFIX (#8949)
kaxil May 21, 2020
41481bb
Python base images are stored in cache (#8943)
potiuk May 21, 2020
b26b3ca
Don't hard-code constants in scheduler_dag_execution_timing (#8950)
ashb May 21, 2020
113982b
Make scheduler_dag_execution_timing grok dynamic start date of elasti…
ashb May 21, 2020
90a07d8
Cache 1 10 ci images (#8955)
dimberman May 21, 2020
dd72040
Pin Version of Azure Cosmos to <4 (#8956)
kaxil May 21, 2020
94a7673
Pin google-cloud-datacatalog to <0.8 (#8957)
kaxil May 21, 2020
9a4a2d1
[AIRFLOW-5262] Update timeout exception to include dag (#8466)
curiousjazz77 May 22, 2020
b055151
Add context to execution_date_fn in ExternalTaskSensor (#8702)
Acehaidrey May 22, 2020
f107338
Add support for spark python and submit tasks in Databricks operator(…
siddartha-ravichandran May 22, 2020
e742ef7
Fix typo in test_project_structure (#8978)
mik-laj May 23, 2020
4d67704
Remove duplicate line from CONTRIBUTING.rst (#8981)
kaxil May 23, 2020
db70da2
Flush pending Sentry exceptions before exiting (#7232)
mikeclarke May 23, 2020
cf5cf45
Support YAML input for CloudBuildCreateOperator (#8808)
joppevos May 23, 2020
bdb8369
Add secrets to test_deprecated_packages (#8979)
mik-laj May 23, 2020
f3456b1
Fix formatting code block in TESTING.rst (#8985)
ad-m May 23, 2020
Added automated release notes generation for backport operators (#8807)
We now have a mechanism to keep release notes for the
backport operators updated in an automated way.

It generates all the necessary information:

* summary of requirements for each backport package
* list of dependencies (including the extras needed to install them) when a package
  depends on other provider packages
* table of new hooks/operators/sensors/protocols/secrets
* table of moved hooks/operators/sensors/protocols/secrets with
  information about where they were moved from
* changelog of all the changes to the provider package (this will be
  updated with an incremental changelog whenever we decide to
  release separate packages)

The system is fully automated - we will be able to produce release notes
automatically (per package) whenever we decide to release a new version of
a package in the future.
potiuk authored May 15, 2020
commit 92585ca4cb375ac879f4ab331b3a063106eb7b92
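
As a rough sketch of the workflow this commit enables (command names and flags are taken from the BREEZE.rst changes shown below; the release date and package list here are only illustrative):

.. code-block:: bash

    # Regenerate the per-package README files for a planned CALVER release date
    ./breeze generate-backport-readme -- 2020.05.20 google amazon

    # Build the backport packages themselves, optionally marking them as a release candidate
    ./breeze prepare-backport-packages --version-suffix rc1 -- google amazon
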
16 changes: 6 additions & 10 deletions .pre-commit-config.yaml
@@ -69,7 +69,8 @@ repos:
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all JINJA template files
files: ^airflow/www/templates/.*\.html$|^docs/templates/.*\.html$|^airflow/contrib/plugins/metastore_browser/templates/.*\.html$ # yamllint disable-line rule:line-length
files: "^airflow/www/templates/.*\\.html$|^docs/templates/.*\\.html$|^airflow/contrib/plugins/\
metastore_browser/templates/.*\\.html$|.*\\.jinja2"
exclude: ^\.github/.*$|^airflow/_vendor/.*$
args:
- --comment-style
@@ -120,7 +121,7 @@ repos:
- id: insert-license
name: Add license for all md files
files: \.md$
exclude: ^\.github/.*$|^airflow/_vendor/.*$
exclude: ^\.github/.*$|^airflow/_vendor/.*|PROVIDERS_CHANGES.*\.md
args:
- --comment-style
- "<!--|| -->"
@@ -132,16 +133,10 @@
hooks:
- id: doctoc
name: Add TOC for md files
files: ^README\.md$|^CONTRIBUTING\.md$|^UPDATING.md$|^dev/README\.md$
files: ^README\.md$|^CONTRIBUTING\.md$|^UPDATING.md$|^dev/README\.md$|^dev/BACKPORT_PACKAGES.md$
args:
- "--maxlevel"
- "2"
- repo: https://github.com/thlorenz/doctoc.git
rev: v1.4.0
hooks:
- id: doctoc
name: Add TOC for backport readme files
files: BACKPORT_README\.md$
- repo: meta
hooks:
- id: check-hooks-apply
@@ -277,7 +272,8 @@
^airflow/contrib/.*\.py$
- id: provide-create-sessions
language: pygrep
name: To avoid import cycles make sure provide_session and create_session are imported from airflow.utils.session # yamllint disable-line rule:line-length
name: To avoid import cycles make sure provide_session and create_session are imported from
airflow.utils.session
entry: "from airflow\\.utils\\.db import.* (provide_session|create_session)"
files: \.py$
pass_filenames: true
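
Assuming a standard local pre-commit setup (this invocation is an illustration, not part of the commit), the hooks whose file patterns changed above can be re-run on the whole tree with:

.. code-block:: bash

    # Re-apply license headers, now also covering *.jinja2 templates
    pre-commit run insert-license --all-files
    # Regenerate TOCs, now including dev/BACKPORT_PACKAGES.md
    pre-commit run doctoc --all-files
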
1 change: 1 addition & 0 deletions .rat-excludes
@@ -75,6 +75,7 @@ rat-results.txt
apache-airflow-.*\+source.tar.gz.*
apache-airflow-.*\+bin.tar.gz.*
PULL_REQUEST_TEMPLATE.md
PROVIDERS_CHANGES*.md

# vendored modules
_vendor/*
87 changes: 87 additions & 0 deletions BREEZE.rst
@@ -655,6 +655,8 @@ This is the current syntax for `./breeze <./breeze>`_:
cleanup-image Cleans up the container image created
exec Execs into running breeze container in new terminal
generate-requirements Generates pinned requirements for pip dependencies
generate-backport-readme Generates backport packages readme files
prepare-backport-packages Prepares backport packages
initialize-local-virtualenv Initializes local virtualenv
setup-autocomplete Sets up autocomplete for breeze
stop Stops the docker-compose environment
@@ -870,6 +872,84 @@ This is the current syntax for `./breeze <./breeze>`_:
####################################################################################################


Detailed usage for command: generate-backport-readme

breeze [FLAGS] generate-backport-readme -- <EXTRA_ARGS>

Prepares README.md files for backport packages. You can provide (after --) an optional version
in the form of YYYY.MM.DD, optionally followed by the list of packages to generate readmes for.
If the first parameter is not formatted as a date, then today is used as the version.
If no packages are specified, readmes for all packages are generated.
If no date is specified, the current date + 3 days is used (allowing time for PMC votes to pass).

Examples:

'breeze generate-backport-readme' or
'breeze generate-backport-readme -- 2020.05.10' or
'breeze generate-backport-readme -- 2020.05.10 https google amazon'

General form:

'breeze generate-backport-readme -- YYYY.MM.DD <PACKAGE_ID> ...'

* YYYY.MM.DD - is the CALVER version of the package to prepare. Note that this date
cannot be earlier than the already released version (the script will fail if it
is). It can be set in the future, anticipating the future release date.

* <PACKAGE_ID> is usually a directory in the airflow/providers folder (for example
'google'), but in several cases it might be one level deeper, separated with
'.', for example 'apache.hive'

Flags:

-v, --verbose
Show verbose information about executed commands (enabled by default for running tests).
Note that you can further increase verbosity and see all the commands executed by breeze
by running 'export VERBOSE_COMMANDS="true"' before running breeze.


####################################################################################################


Detailed usage for command: prepare-backport-packages

breeze [FLAGS] prepare-backport-packages -- <EXTRA_ARGS>

Builds backport packages. You can provide (after --) an optional list of packages to prepare.
If no packages are specified, all packages are built. You can specify the optional
--version-suffix flag to generate rc candidates for the packages.

Make sure to set the right version in './backport_packages/setup_backport_packages.py'

Examples:

'breeze prepare-backport-packages' or
'breeze prepare-backport-packages -- google' or
'breeze prepare-backport-packages --version-suffix rc1 -- http google amazon'

General form:

'breeze prepare-backport-packages -- <PACKAGE_ID> ...'

* <PACKAGE_ID> is usually a directory in the airflow/providers folder (for example
'google'), but in several cases, it might be one level deeper separated with '.'
for example 'apache.hive'

Flags:

-S, --version-suffix
Adds optional suffix to the generated backport package version. It can be used to generate
rc1/rc2 ... versions of the packages.

-v, --verbose
Show verbose information about executed commands (enabled by default for running tests).
Note that you can further increase verbosity and see all the commands executed by breeze
by running 'export VERBOSE_COMMANDS="true"' before running breeze.


####################################################################################################


Detailed usage for command: initialize-local-virtualenv

breeze [FLAGS] initialize-local-virtualenv -- <EXTRA_ARGS>
@@ -1340,6 +1420,13 @@ This is the current syntax for `./breeze <./breeze>`_:
-H, --dockerhub-repo
DockerHub repository used to pull, push, build images. Default: airflow.

****************************************************************************************************
Flags for generation of the backport packages

-S, --version-suffix
Adds optional suffix to the generated backport package version. It can be used to generate
rc1/rc2 ... versions of the packages.

****************************************************************************************************
Increase verbosity of the scripts

58 changes: 5 additions & 53 deletions CONTRIBUTING.rst
@@ -313,14 +313,15 @@ This is the full list of those extras:

.. START EXTRAS HERE

all, all_dbs, amazon, apache.atlas, apache.cassandra, apache.druid, apache.hdfs, apache.hive,
apache.pinot, apache.webhdfs, async, atlas, aws, azure, cassandra, celery, cgroups, cloudant,
cncf.kubernetes, dask, databricks, datadog, devel, devel_ci, devel_hadoop, doc, docker, druid,
all_dbs, amazon, apache.atlas, apache_beam, apache.cassandra, apache.druid, apache.hdfs,
apache.hive, apache.pinot, apache.webhdfs, async, atlas, aws, azure, cassandra, celery, cgroups,
cloudant, cncf.kubernetes, dask, databricks, datadog, devel, devel_hadoop, doc, docker, druid,
elasticsearch, exasol, facebook, gcp, gcp_api, github_enterprise, google, google_auth, grpc,
hashicorp, hdfs, hive, jdbc, jira, kerberos, kubernetes, ldap, microsoft.azure, microsoft.mssql,
microsoft.winrm, mongo, mssql, mysql, odbc, oracle, pagerduty, papermill, password, pinot, postgres,
presto, qds, rabbitmq, redis, salesforce, samba, segment, sendgrid, sentry, singularity, slack,
snowflake, ssh, statsd, tableau, vertica, virtualenv, webhdfs, winrm, yandexcloud
snowflake, spark, ssh, statsd, tableau, vertica, virtualenv, webhdfs, winrm, yandexcloud, all,
devel_ci

.. END EXTRAS HERE

@@ -981,52 +982,3 @@ Resources & Links
- `Airflow’s official documentation <http://airflow.apache.org/>`__

- `More resources and links to Airflow related content on the Wiki <https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Links>`__

Preparing backport packages
===========================

As part of preparation to Airflow 2.0 we decided to prepare backport of providers package that will be
possible to install in the Airflow 1.10.*, Python 3.6+ environment.
Some of those packages will be soon (after testing) officially released via PyPi, but you can build and
prepare such packages on your own easily.

* The setuptools.py script only works in python3.6+. This is also our minimally supported python
version to use the packages in.

* Make sure you have ``setuptools`` and ``wheel`` installed in your python environment. The easiest way
to do it is to run ``pip install setuptools wheel``

* Run the following command:

.. code-block:: bash

./scripts/ci/ci_prepare_packages.sh

* Usually you only build some of the providers package. The ``providers`` directory is separated into
separate providers. You can see the list of all available providers by running
``./scripts/ci/ci_prepare_packages.sh --help``. You can build the backport package
by running ``./scripts/ci/ci_prepare_packages.sh <PROVIDER_NAME>``. Note that there
might be (and are) dependencies between some packages that might prevent subset of the packages
to be used without installing the packages they depend on. This will be solved soon by
adding cross-dependencies between packages.

* This creates a wheel package in your ``dist`` folder with a name similar to:
``apache_airflow_backport_providers-0.0.1-py2.py3-none-any.whl``

* You can install this package with ``pip install <PACKAGE_FILE>``


* You can also build sdist (source distribution packages) by running
``python setup.py <PROVIDER_NAME> sdist`` but this is only needed in case of distribution of the packages.

Each package has a description generated from the general ``backport_packages/README.md`` file with the
following replacements:

* ``{{ PACKAGE_NAME }}`` is replaced with the name of the package
(``apache-airflow-backport-providers-<NAME>``)
* ``{{ PACKAGE_DEPENDENCIES }}`` is replaced with list of optional dependencies for the package
* ``{{ PACKAGE_BACKPORT_README }}`` is replaced with the content of ``BACKPORT_README.md`` file in the
package folder if it exists.

Note that those are unofficial packages yet - they are not yet released in PyPi, but you might use them to
test the master versions of operators/hooks/sensors in Airflow 1.10.* environment with Python3.6+
9 changes: 5 additions & 4 deletions INSTALL
@@ -44,14 +44,15 @@ pip install . --constraint requirements/requirements-python3.7.txt
# You can also install Airflow with extras specified. The list of available extras:
# START EXTRAS HERE

all, all_dbs, amazon, apache.atlas, apache.cassandra, apache.druid, apache.hdfs, apache.hive,
apache.pinot, apache.webhdfs, async, atlas, aws, azure, cassandra, celery, cgroups, cloudant,
cncf.kubernetes, dask, databricks, datadog, devel, devel_ci, devel_hadoop, doc, docker, druid,
all_dbs, amazon, apache.atlas, apache_beam, apache.cassandra, apache.druid, apache.hdfs,
apache.hive, apache.pinot, apache.webhdfs, async, atlas, aws, azure, cassandra, celery, cgroups,
cloudant, cncf.kubernetes, dask, databricks, datadog, devel, devel_hadoop, doc, docker, druid,
elasticsearch, exasol, facebook, gcp, gcp_api, github_enterprise, google, google_auth, grpc,
hashicorp, hdfs, hive, jdbc, jira, kerberos, kubernetes, ldap, microsoft.azure, microsoft.mssql,
microsoft.winrm, mongo, mssql, mysql, odbc, oracle, pagerduty, papermill, password, pinot, postgres,
presto, qds, rabbitmq, redis, salesforce, samba, segment, sendgrid, sentry, singularity, slack,
snowflake, ssh, statsd, tableau, vertica, virtualenv, webhdfs, winrm, yandexcloud
snowflake, spark, ssh, statsd, tableau, vertica, virtualenv, webhdfs, winrm, yandexcloud, all,
devel_ci

# END EXTRAS HERE

10 changes: 5 additions & 5 deletions TESTING.rst
@@ -547,12 +547,12 @@ Preparing backport packages for System Tests for Airflow 1.10.* series
----------------------------------------------------------------------

To run system tests with old Airflow version you need to prepare backport packages. This
can be done by running ``./scripts/ci/ci_prepare_packages.sh <PACKAGES TO BUILD>``. For
can be done by running ``./breeze prepare-backport-packages -- <PACKAGES TO BUILD>``. For
example the below command will build google postgres and mysql packages:

.. code-block:: bash

./scripts/ci/ci_prepare_packages.sh google postgres mysql
./breeze prepare-backport-packages -- google postgres mysql

Those packages will be prepared in ./dist folder. This folder is mapped to /dist folder
when you enter Breeze, so it is easy to automate installing those packages for testing.
@@ -614,7 +614,7 @@ Here is the typical session that you need to do to run system tests:

.. code-block:: bash

./scripts/ci/ci_prepare_packages.sh google postgres mysql
./breeze prepare-backport-packages -- google postgres mysql

2. Enter breeze with installing Airflow 1.10.*, forwarding credentials and installing
backported packages (you need an appropriate line in ``./files/airflow-breeze-config/variables.env``)
@@ -686,7 +686,7 @@ The typical session then looks as follows:

.. code-block:: bash

./scripts/ci/ci_prepare_packages.sh google postgres mysql
./breeze prepare-backport-packages -- google postgres mysql

2. Enter breeze with installing Airflow 1.10.*, forwarding credentials and installing
backported packages (you need an appropriate line in ``./files/airflow-breeze-config/variables.env``)
@@ -716,7 +716,7 @@ In the host:

.. code-block:: bash

./scripts/ci/ci_prepare_packages.sh google
./breeze prepare-backport-packages -- google

In the container:

8 changes: 7 additions & 1 deletion airflow/contrib/operators/bigquery_operator.py
@@ -19,7 +19,13 @@

import warnings

from airflow.providers.google.cloud.operators.bigquery import BigQueryExecuteQueryOperator
# pylint: disable=unused-import
from airflow.providers.google.cloud.operators.bigquery import ( # noqa; noqa; noqa; noqa; noqa
BigQueryCreateEmptyDatasetOperator, BigQueryCreateEmptyTableOperator, BigQueryCreateExternalTableOperator,
BigQueryDeleteDatasetOperator, BigQueryExecuteQueryOperator, BigQueryGetDatasetOperator,
BigQueryGetDatasetTablesOperator, BigQueryPatchDatasetOperator, BigQueryUpdateDatasetOperator,
BigQueryUpsertTableOperator,
)

warnings.warn(
"This module is deprecated. Please use `airflow.providers.google.cloud.operators.bigquery`.",
1 change: 1 addition & 0 deletions airflow/jobs/base_job.py
@@ -271,6 +271,7 @@ def reset_state_for_orphaned_tasks(self, filter_by_dag_run=None, session=None):
TI.dag_id == DR.dag_id,
TI.execution_date == DR.execution_date))
.filter(
# pylint: disable=comparison-with-callable
DR.state == State.RUNNING,
DR.run_id.notlike(f"{DagRunType.BACKFILL_JOB.value}__%"),
TI.state.in_(resettable_states))).all()
1 change: 1 addition & 0 deletions airflow/jobs/scheduler_job.py
@@ -1091,6 +1091,7 @@ def _change_state_for_tis_without_dagrun(self,
.filter(models.TaskInstance.dag_id.in_(simple_dag_bag.dag_ids)) \
.filter(models.TaskInstance.state.in_(old_states)) \
.filter(or_(
# pylint: disable=comparison-with-callable
models.DagRun.state != State.RUNNING,
models.DagRun.state.is_(None))) # pylint: disable=no-member
# We need to do this for mysql as well because it can cause deadlocks
2 changes: 2 additions & 0 deletions airflow/models/serialized_dag.py
@@ -140,6 +140,7 @@ def remove_dag(cls, dag_id: str, session=None):
:param dag_id: dag_id to be deleted
:param session: ORM Session
"""
# pylint: disable=no-member
session.execute(cls.__table__.delete().where(cls.dag_id == dag_id))

@classmethod
@@ -158,6 +159,7 @@ def remove_stale_dags(cls, expiration_date, session=None):
"scheduler since %s from %s table ", expiration_date, cls.__tablename__)

session.execute(
# pylint: disable=no-member
cls.__table__.delete().where(cls.last_updated < expiration_date)
)
