Upgrade to Airflow 2.11.0 #158

Open
wants to merge 31 commits into
base: master
Lee-W
Member

@Lee-W Lee-W commented May 25, 2025

Types of changes

  • Refactoring
  • Breaking change (any change that would cause existing functionality to not work as expected)
  • Documentation Update
  • Other (please describe)

Description

This PR is large, but the commits are organized into several categories. If necessary, it can be split into four parts: the upgrade to Airflow 2.11, MkDocs, TaskFlow, and the refactoring of posts and insights. Since we are not squashing commits and each commit message details its changes, the PR should be straightforward to review commit by commit.

  • Upgrade airflow to 2.11.0
    • fix(dags): remove outdated airflow 1 syntax from "airflow-log-cleanup.py" and "airflow-db-cleanup.py"
    • build(docker-compose): add airflow-init for enabling airflow migration
    • build(docker-compose): add healthcheck to webserver
    • build(Dockerfile): add patches to installed airflow
    • fix(patch): add migration patch
    • feat: upgrade to airflow 2.11.0
    • feat: upgrade to airflow 2.10.3
    • feat(airflow.cfg): migrate old airflow configuration to 2.6.3 version
    • feat(airflow-config): add default webserver_config.py
    • feat: update airflow to 2.6.3 and Python to 3.10
  • dockerfile enhancement
    • build(Dockerfile): remove gcc installation
    • build(Dockerfile): remove git installation as no longer needed
  • dags refactoring
    • refactor(dag:*post_insights): extract common dump_to_bq_logic
    • refactor(dag:*post_insights): extract bq_client as cached_property
    • refactor(dag:*post_insights): deduplicate common code with *PostsInsightsParser
    • refactor(dag:FB_POST_INSIGHTS_V1): improve typing and simplify logic
    • docs(dag:TWITTER_POST_NOTIFICATION_BOT_V2): add airflow 3 upgrade note
    • refactor(dag:KKTIX_DISCORD_BOT_FOR_TEAM_REGISTRATION): improve typing and simplify logic
    • refactor(dag:DISCORD_PROPOSAL_REMINDER_v3): remove unnecessary kwargs
    • refactor(dag:DISCORD_FINANCE_REMINDER): improve typing and simplify logic
    • refactor(dag:DISCORD_CHORES_REMINDER): remove unnecessary kwargs
    • refactor(dags): update dags to use taskflow syntax whenever possible
    • fix: add __init__.py for each module
  • setup mkdocs for future docs improvement
    • docs: initialize mkdocs structure
    • build: add mkdocs and mkdocs-material to doc dependency group
  • misc
    • style: improve or temporarily ignore mypy warnings
    • ci(github): update google-github-actions/auth to v2
    • style: format through ruff
    • build(dev-dep): add types-requests>=2.31.0.1 and types-python-dateutil>=2.8.19.13
    • fix: move airflow.db to sqlite/airflow.db
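
To illustrate the `bq_client` refactor listed under "dags refactoring", here is a minimal sketch of the `functools.cached_property` pattern. `FakeBigQueryClient` and `PostsInsightsParser` are hypothetical stand-ins written for this example, not the classes in this repo; the real code presumably builds an actual BigQuery client instead.

```python
from functools import cached_property


class FakeBigQueryClient:
    """Hypothetical stand-in for a real BigQuery client."""

    instances_created = 0

    def __init__(self) -> None:
        # Count constructions so the caching effect is observable.
        FakeBigQueryClient.instances_created += 1


class PostsInsightsParser:
    """Hypothetical parser; only the caching pattern matters here."""

    @cached_property
    def bq_client(self) -> FakeBigQueryClient:
        # Built on first access, then stored on the instance and reused.
        return FakeBigQueryClient()


parser = PostsInsightsParser()
first = parser.bq_client
second = parser.bq_client
```

Both accesses return the same object, so the client is constructed once per parser instance rather than once per call.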

Checklist

  • Add test cases to all the changes you introduce
  • Run make lint and make test locally to ensure all linter checks and testing pass
  • Update the documentation if necessary

Steps to Test This Pull Request

Expected behavior

Related Issue

Additional context

After this is merged and deployed, we'll need to do the following things.

  1. Create a superuser account.
  2. Disable the currently active dags and keep a record of the list.
  3. Back up the database.
  4. Move the database from airflow.db to sqlite/airflow.db.
  5. Upgrade the database.
  6. Purge the connections table and recreate the connections manually (variables somehow work fine).
  7. Re-enable the dags that were previously turned on.
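
Steps 3 and 4 could be scripted roughly as follows. This is a sketch under assumptions: the function name `backup_and_move`, the `.bak` suffix, and the refusal to overwrite an existing target are choices made for this example, and Airflow should be stopped while it runs.

```python
import shutil
import sqlite3
from pathlib import Path


def backup_and_move(old_db: Path, new_dir: Path) -> tuple[Path, Path]:
    """Back up old_db next to itself, then move it to new_dir/airflow.db."""
    backup_path = old_db.with_name(old_db.name + ".bak")

    # sqlite3's online backup API copies a consistent snapshot of the database.
    src = sqlite3.connect(old_db)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()

    new_dir.mkdir(parents=True, exist_ok=True)
    new_path = new_dir / "airflow.db"
    if new_path.exists():
        # Assumption for this sketch: never clobber an existing database.
        raise FileExistsError(f"{new_path} already exists; refusing to overwrite")
    shutil.move(str(old_db), str(new_path))
    return backup_path, new_path
```

For example, `backup_and_move(Path("airflow.db"), Path("sqlite"))` leaves `airflow.db.bak` in place and the live database at `sqlite/airflow.db`.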

@Lee-W Lee-W force-pushed the upgrade-to-airflow-2 branch 2 times, most recently from cf6bcd2 to bdeac52 Compare May 25, 2025 12:41
@Lee-W
Member Author

Lee-W commented May 25, 2025

We'll need to manually backport apache/airflow#50745 if we don't want to wait for 2.11.1

@Lee-W Lee-W force-pushed the upgrade-to-airflow-2 branch 5 times, most recently from 1247cab to 918475b Compare June 1, 2025 10:00
Lee-W added 3 commits June 1, 2025 18:11
This is the highest Airflow version with no package conflicts and no extra build needed on an M-series Mac.

The way the volume is defined in docker-compose creates an "airflow.db" directory if the `airflow.db` file is absent, causing Airflow to load the empty directory as a SQLite database.

Changing the mount to a "sqlite" directory means Airflow loads the database inside it if present and creates a new one if not.
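
The difference between the two layouts can be sketched in plain Python. The helper name `open_sqlite_db` is hypothetical and this is not how Airflow itself opens its metadata database; it only shows why a directory-level location is safe to create on demand while a directory sitting at the file path is not.

```python
import sqlite3
from pathlib import Path


def open_sqlite_db(db_dir: Path, filename: str = "airflow.db") -> sqlite3.Connection:
    """Open db_dir/filename, creating the directory (and later the file) if absent."""
    db_dir.mkdir(parents=True, exist_ok=True)
    db_path = db_dir / filename
    if db_path.is_dir():
        # The failure mode described above: a directory sits where the file should be.
        raise IsADirectoryError(f"{db_path} is a directory, not a SQLite file")
    # sqlite3 creates the database file itself when it does not exist yet.
    return sqlite3.connect(db_path)
```

With the directory as the unit that gets created, a missing database simply results in a fresh file inside it.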
@Lee-W Lee-W force-pushed the upgrade-to-airflow-2 branch from 6c857a0 to 525bd05 Compare June 1, 2025 10:11
@Lee-W Lee-W force-pushed the upgrade-to-airflow-2 branch from 525bd05 to 14fbd3f Compare June 1, 2025 11:31
@Lee-W Lee-W changed the title Upgrade to airflow 2 Upgrade to Airflow 2.11.0 Jun 1, 2025
@Lee-W Lee-W marked this pull request as ready for review June 1, 2025 11:41