Skip to content

Cache user objects for given id in auth manager for better performance #60265

@tirkarthi

Description

@tirkarthi

Apache Airflow version

main (development)

If "Other Airflow 3 version" selected, which one?

No response

What happened?

We noticed in our instance the API calls had varied amount of time for same payload. For example the API call to fetch latest_run will be slow on the first load of the dag details page but subsequent auto refresh calls were faster. To debug this we enabled sql logging and the query for the endpoint was simple without any indexing or n+1 issues.

Later we commented out the dependencies for the endpoint Depends(requires_access_dag(method="GET", access_entity=DagAccessEntity.RUN)) and found the timings significantly improved. Then later we debugged and found that calls to convert the user_id from decoded jwt to User object from database was affecting the timings which includes a call to fetch user and also a complex join call to fetch permissions which could be causing the delays. We wrote a simple dict to store the id with the User object since we store the users in db and id is not reused. Thus it immutably holds a User to an id forever without any future conflicts. We applied the changes to our custom auth manager from fab auth manager and then noticed significant performance improvements. The distribution of the histogram was also much narrower without much deviation.

The frontend also became snappy since the auto refresh calls and other calls are much faster.

Database: MySQL
Workers: 4 (Detault as per airflow api-server command)

hey -H "Cookie:  _token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxIiwianRpIjoiMjI1MGFiMjE0NDIzNDQ2Mzk5YTQ5MDM4ZmY1Y2RiNjIiLCJpc3MiOltdLCJhdWQiOiJhcGFjaGUtYWlyZmxvdyIsIm5iZiI6MTc2Nzg3OTIwMSwiZXhwIjoxNzY3ODgyODAxLCJpYXQiOjE3Njc4NzkyMDF9.TljAVQubDebbjXl3gSROZG4H5_X7yU_wpU-IR4PMDfnvzcg4aJE7tHbHCLEGzDYsoa2uLQJLqyxA3V-8sqCJEg" http://localhost:8000/ui/dags/example_complex/latest_run

Summary:
  Total:	2.2426 secs
  Slowest:	1.1800 secs
  Fastest:	0.0623 secs
  Average:	0.4897 secs
  Requests/sec:	89.1818
  

Response time histogram:
  0.062 [1]	|■
  0.174 [11]	|■■■■■■
  0.286 [36]	|■■■■■■■■■■■■■■■■■■■■
  0.398 [72]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.509 [26]	|■■■■■■■■■■■■■■
  0.621 [4]	|■■
  0.733 [0]	|
  0.845 [7]	|■■■■
  0.956 [13]	|■■■■■■■
  1.068 [16]	|■■■■■■■■■
  1.180 [14]	|■■■■■■■■


Latency distribution:
  10% in 0.2208 secs
  25% in 0.2948 secs
  50% in 0.3543 secs
  75% in 0.8007 secs
  90% in 1.0434 secs
  95% in 1.1479 secs
  99% in 1.1799 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0003 secs, 0.0623 secs, 1.1800 secs
  DNS-lookup:	0.0001 secs, 0.0000 secs, 0.0008 secs
  req write:	0.0001 secs, 0.0000 secs, 0.0016 secs
  resp wait:	0.4890 secs, 0.0621 secs, 1.1787 secs
  resp read:	0.0002 secs, 0.0000 secs, 0.0019 secs

Status code distribution:
  [200]	200 responses

Queries

[2026-01-08T13:35:14.606436Z] {base.py:2710} INFO - BEGIN (implicit)
[2026-01-08T13:35:14.608468Z] {base.py:1846} INFO - SELECT ab_user.id, ab_user.first_name, ab_user.last_name, ab_user.username, ab_user.password, ab_user.active, ab_user.email, ab_user.last_login, ab_user.login_count, ab_user.fail_login_count, ab_user.created_on, ab_user.changed_on, ab_user.created_by_fk, ab_user.changed_by_fk 
FROM ab_user 
WHERE ab_user.id = %(id_1)s
[2026-01-08T13:35:14.608612Z] {base.py:1846} INFO - [generated in 0.00025s] {'id_1': 1}
[2026-01-08T13:35:14.616390Z] {base.py:1846} INFO - SELECT ab_user_1.id AS ab_user_1_id, ab_role.id AS ab_role_id, ab_role.name AS ab_role_name, ab_permission_1.id AS ab_permission_1_id, ab_permission_1.name AS ab_permission_1_name, ab_view_menu_1.id AS ab_view_menu_1_id, ab_view_menu_1.name AS ab_view_menu_1_name, ab_permission_view_1.id AS ab_permission_view_1_id, ab_permission_view_1.permission_id AS ab_permission_view_1_permission_id, ab_permission_view_1.view_menu_id AS ab_permission_view_1_view_menu_id 
FROM ab_user AS ab_user_1 INNER JOIN ab_user_role AS ab_user_role_1 ON ab_user_1.id = ab_user_role_1.user_id INNER JOIN ab_role ON ab_role.id = ab_user_role_1.role_id LEFT OUTER JOIN (ab_permission_view_role AS ab_permission_view_role_1 INNER JOIN ab_permission_view AS ab_permission_view_1 ON ab_permission_view_1.id = ab_permission_view_role_1.permission_view_id) ON ab_role.id = ab_permission_view_role_1.role_id LEFT OUTER JOIN ab_permission AS ab_permission_1 ON ab_permission_1.id = ab_permission_view_1.permission_id LEFT OUTER JOIN ab_view_menu AS ab_view_menu_1 ON ab_view_menu_1.id = ab_permission_view_1.view_menu_id 
WHERE ab_user_1.id IN (%(primary_keys_1)s)
[2026-01-08T13:35:14.616567Z] {base.py:1846} INFO - [generated in 0.00048s] {'primary_keys_1': 1}
[2026-01-08T13:35:14.622088Z] {base.py:2716} INFO - COMMIT
[2026-01-08T13:35:14.627531Z] {base.py:2710} INFO - BEGIN (implicit)
[2026-01-08T13:35:14.731046Z] {base.py:1846} INFO - SELECT team.name 
FROM dag_bundle INNER JOIN dag_bundle_team AS dag_bundle_team_1 ON dag_bundle.name = dag_bundle_team_1.dag_bundle_name INNER JOIN team ON team.name = dag_bundle_team_1.team_name INNER JOIN dag ON dag.bundle_name = dag_bundle.name 
WHERE dag.dag_id = %(dag_id_1)s
[2026-01-08T13:35:14.731210Z] {base.py:1846} INFO - [generated in 0.00030s] {'dag_id_1': 'example_complex'}
[2026-01-08T13:35:14.732375Z] {base.py:2716} INFO - COMMIT
[2026-01-08T13:35:14.734832Z] {base.py:2710} INFO - BEGIN (implicit)
[2026-01-08T13:35:14.736355Z] {base.py:1846} INFO - SELECT dag_run.id, dag_run.dag_id, dag_run.run_id, dag_run.end_date, dag_run.logical_date, dag_run.run_after, dag_run.start_date, dag_run.state 
FROM dag_run 
WHERE dag_run.dag_id = %(dag_id_1)s ORDER BY dag_run.run_after DESC 
 LIMIT %(param_1)s
[2026-01-08T13:35:14.736486Z] {base.py:1846} INFO - [generated in 0.00022s] {'dag_id_1': 'example_complex', 'param_1': 1}
[2026-01-08T13:35:14.738814Z] {base.py:2716} INFO - COMMIT
INFO:     127.0.0.1:58590 - "GET /ui/dags/example_complex/latest_run HTTP/1.1" 200 OK

After 1 iteration to ensure all workers have the cached value

hey -H "Cookie:  _token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxIiwianRpIjoiMjI1MGFiMjE0NDIzNDQ2Mzk5YTQ5MDM4ZmY1Y2RiNjIiLCJpc3MiOltdLCJhdWQiOiJhcGFjaGUtYWlyZmxvdyIsIm5iZiI6MTc2Nzg3OTIwMSwiZXhwIjoxNzY3ODgyODAxLCJpYXQiOjE3Njc4NzkyMDF9.TljAVQubDebbjXl3gSROZG4H5_X7yU_wpU-IR4PMDfnvzcg4aJE7tHbHCLEGzDYsoa2uLQJLqyxA3V-8sqCJEg" http://localhost:8000/ui/dags/example_complex/latest_run

Summary:
  Total:	0.4605 secs
  Slowest:	0.1938 secs
  Fastest:	0.0415 secs
  Average:	0.1033 secs
  Requests/sec:	434.3205
  

Response time histogram:
  0.042 [1]	|■
  0.057 [7]	|■■■■■
  0.072 [9]	|■■■■■■
  0.087 [28]	|■■■■■■■■■■■■■■■■■■■■
  0.102 [56]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.118 [44]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.133 [33]	|■■■■■■■■■■■■■■■■■■■■■■■■
  0.148 [11]	|■■■■■■■■
  0.163 [9]	|■■■■■■
  0.179 [1]	|■
  0.194 [1]	|■


Latency distribution:
  10% in 0.0748 secs
  25% in 0.0893 secs
  50% in 0.1020 secs
  75% in 0.1205 secs
  90% in 0.1344 secs
  95% in 0.1488 secs
  99% in 0.1640 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0002 secs, 0.0415 secs, 0.1938 secs
  DNS-lookup:	0.0001 secs, 0.0000 secs, 0.0024 secs
  req write:	0.0001 secs, 0.0000 secs, 0.0008 secs
  resp wait:	0.1027 secs, 0.0413 secs, 0.1924 secs
  resp read:	0.0002 secs, 0.0001 secs, 0.0018 secs

Status code distribution:
  [200]	200 responses
[2026-01-08T13:36:15.881633Z] {base.py:2710} INFO - BEGIN (implicit)
[2026-01-08T13:36:15.881946Z] {base.py:1846} INFO - SELECT team.name 
FROM dag_bundle INNER JOIN dag_bundle_team AS dag_bundle_team_1 ON dag_bundle.name = dag_bundle_team_1.dag_bundle_name INNER JOIN team ON team.name = dag_bundle_team_1.team_name INNER JOIN dag ON dag.bundle_name = dag_bundle.name 
WHERE dag.dag_id = %(dag_id_1)s
[2026-01-08T13:36:15.882026Z] {base.py:1846} INFO - [cached since 4.759s ago] {'dag_id_1': 'example_complex'}
[2026-01-08T13:36:15.883135Z] {base.py:2716} INFO - COMMIT
[2026-01-08T13:36:15.885019Z] {base.py:2710} INFO - BEGIN (implicit)
[2026-01-08T13:36:15.885327Z] {base.py:1846} INFO - SELECT dag_run.id, dag_run.dag_id, dag_run.run_id, dag_run.end_date, dag_run.logical_date, dag_run.run_after, dag_run.start_date, dag_run.state 
FROM dag_run 
WHERE dag_run.dag_id = %(dag_id_1)s ORDER BY dag_run.run_after DESC 
 LIMIT %(param_1)s
[2026-01-08T13:36:15.885404Z] {base.py:1846} INFO - [cached since 4.757s ago] {'dag_id_1': 'example_complex', 'param_1': 1}
[2026-01-08T13:36:15.887501Z] {base.py:2716} INFO - COMMIT
INFO:     127.0.0.1:58600 - "GET /ui/dags/example_complex/latest_run HTTP/1.1" 200 OK

Query analysis


mysql> explain SELECT ab_user_1.id AS ab_user_1_id, ab_role.id AS ab_role_id, ab_role.name AS ab_role_name, ab_permission_1.id AS ab_permission_1_id, ab_permission_1.name AS ab_permission_1_name, ab_view_menu_1.id AS ab_view_menu_1_id, ab_view_menu_1.name AS ab_view_menu_1_name, ab_permission_view_1.id AS ab_permission_view_1_id, ab_permission_view_1.permission_id AS ab_permission_view_1_permission_id, ab_permission_view_1.view_menu_id AS ab_permission_view_1_view_menu_id FROM ab_user AS ab_user_1 INNER JOIN ab_user_role AS ab_user_role_1 ON ab_user_1.id = ab_user_role_1.user_id INNER JOIN ab_role ON ab_role.id = ab_user_role_1.role_id LEFT OUTER JOIN (ab_permission_view_role AS ab_permission_view_role_1 INNER JOIN ab_permission_view AS ab_permission_view_1 ON ab_permission_view_1.id = ab_permission_view_role_1.permission_view_id) ON ab_role.id = ab_permission_view_role_1.role_id LEFT OUTER JOIN ab_permission AS ab_permission_1 ON ab_permission_1.id = ab_permission_view_1.permission_id LEFT OUTER JOIN ab_view_menu AS ab_view_menu_1 ON ab_view_menu_1.id = ab_permission_view_1.view_menu_id WHERE ab_user_1.id IN (1)
    -> ;
+----+-------------+---------------------------+------------+--------+-------------------------------------------------------+-------------+---------+---------------------------------------------------------+------+----------+--------------------------+
| id | select_type | table                     | partitions | type   | possible_keys                                         | key         | key_len | ref                                                     | rows | filtered | Extra                    |
+----+-------------+---------------------------+------------+--------+-------------------------------------------------------+-------------+---------+---------------------------------------------------------+------+----------+--------------------------+
|  1 | SIMPLE      | ab_user_1                 | NULL       | const  | PRIMARY                                               | PRIMARY     | 4       | const                                                   |    1 |   100.00 | Using index              |
|  1 | SIMPLE      | ab_user_role_1            | NULL       | ref    | user_id,role_id                                       | user_id     | 5       | const                                                   |    1 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | ab_role                   | NULL       | eq_ref | PRIMARY                                               | PRIMARY     | 4       | airflow_db.ab_user_role_1.role_id                       |    1 |   100.00 | NULL                     |
|  1 | SIMPLE      | ab_permission_view_role_1 | NULL       | ref    | permission_view_id,idx_role_id,idx_permission_view_id | idx_role_id | 5       | airflow_db.ab_user_role_1.role_id                       |   62 |   100.00 | NULL                     |
|  1 | SIMPLE      | ab_permission_view_1      | NULL       | eq_ref | PRIMARY                                               | PRIMARY     | 4       | airflow_db.ab_permission_view_role_1.permission_view_id |    1 |   100.00 | NULL                     |
|  1 | SIMPLE      | ab_permission_1           | NULL       | eq_ref | PRIMARY                                               | PRIMARY     | 4       | airflow_db.ab_permission_view_1.permission_id           |    1 |   100.00 | NULL                     |
|  1 | SIMPLE      | ab_view_menu_1            | NULL       | eq_ref | PRIMARY                                               | PRIMARY     | 4       | airflow_db.ab_permission_view_1.view_menu_id            |    1 |   100.00 | NULL                     |
+----+-------------+---------------------------+------------+--------+-------------------------------------------------------+-------------+---------+---------------------------------------------------------+------+----------+--------------------------+
7 rows in set, 1 warning (0.01 sec)

mysql> explain analyze SELECT ab_user_1.id AS ab_user_1_id, ab_role.id AS ab_role_id, ab_role.name AS ab_role_name, ab_permission_1.id AS ab_permission_1_id, ab_permission_1.name AS ab_permission_1_name, ab_view_menu_1.id AS ab_view_menu_1_id, ab_view_menu_1.name AS ab_view_menu_1_name, ab_permission_view_1.id AS ab_permission_view_1_id, ab_permission_view_1.permission_id AS ab_permission_view_1_permission_id, ab_permission_view_1.view_menu_id AS ab_permission_view_1_view_menu_id FROM ab_user AS ab_user_1 INNER JOIN ab_user_role AS ab_user_role_1 ON ab_user_1.id = ab_user_role_1.user_id INNER JOIN ab_role ON ab_role.id = ab_user_role_1.role_id LEFT OUTER JOIN (ab_permission_view_role AS ab_permission_view_role_1 INNER JOIN ab_permission_view AS ab_permission_view_1 ON ab_permission_view_1.id = ab_permission_view_role_1.permission_view_id) ON ab_role.id = ab_permission_view_role_1.role_id LEFT OUTER JOIN ab_permission AS ab_permission_1
ON ab_permission_1.id = ab_permission_view_1.permission_id LEFT OUTER JOIN ab_view_menu AS ab_view_menu_1 ON ab_view_menu_1.id = ab_permission_view_1.view_menu_id WHERE ab_user_1.id IN (1) \G;
*************************** 1. row ***************************
EXPLAIN: -> Nested loop left join  (cost=51.3 rows=62.2) (actual time=0.127..0.669 rows=99 loops=1)
    -> Nested loop left join  (cost=29.5 rows=62.2) (actual time=0.123..0.524 rows=99 loops=1)
        -> Nested loop left join  (cost=7.68 rows=62.2) (actual time=0.119..0.395 rows=99 loops=1)
            -> Nested loop inner join  (cost=0.7 rows=1) (actual time=0.0137..0.0152 rows=1 loops=1)
                -> Filter: (ab_user_role_1.role_id is not null)  (cost=0.35 rows=1) (actual time=0.00628..0.00747 rows=1 loops=1)
                    -> Covering index lookup on ab_user_role_1 using user_id (user_id=1)  (cost=0.35 rows=1) (actual time=0.00567..0.00673 rows=1 loops=1)
                -> Single-row index lookup on ab_role using PRIMARY (id=ab_user_role_1.role_id)  (cost=0.35 rows=1) (actual time=0.0063..0.00635 rows=1 loops=1)
            -> Nested loop inner join  (cost=28.8 rows=62.2) (actual time=0.105..0.368 rows=99 loops=1)
                -> Index lookup on ab_permission_view_role_1 using idx_role_id (role_id=ab_user_role_1.role_id)  (cost=6.98 rows=62.2) (actual time=0.101..0.176 rows=99 loops=1)
                -> Single-row index lookup on ab_permission_view_1 using PRIMARY (id=ab_permission_view_role_1.permission_view_id)  (cost=0.252 rows=1) (actual time=0.00161..0.00166 rows=1 loops=99)
        -> Single-row index lookup on ab_permission_1 using PRIMARY (id=ab_permission_view_1.permission_id)  (cost=0.252 rows=1) (actual time=978e-6..0.00103 rows=1 loops=99)
    -> Single-row index lookup on ab_view_menu_1 using PRIMARY (id=ab_permission_view_1.view_menu_id)  (cost=0.252 rows=1) (actual time=0.00114..0.00119 rows=1 loops=99)

1 row in set (0.01 sec)

ERROR: 
No query specified
diff --git a/providers/fab/src/airflow/providers/fab/auth_manager/fab_auth_manager.py b/providers/fab/src/airflow/providers/fab/auth_manager/fab_auth_manager.py
index c5486bcdc1..8df85f081f 100644
--- a/providers/fab/src/airflow/providers/fab/auth_manager/fab_auth_manager.py
+++ b/providers/fab/src/airflow/providers/fab/auth_manager/fab_auth_manager.py
@@ -176,6 +176,7 @@ class FabAuthManager(BaseAuthManager[User]):
     This auth manager is responsible for providing a backward compatible user management experience to users.
     """
 
+    user_cache: dict[int, User] = {}
     appbuilder: AirflowAppBuilder | None = None
 
     def init_flask_resources(self) -> None:
@@ -254,8 +255,15 @@ class FabAuthManager(BaseAuthManager[User]):
         return current_user
 
     def deserialize_user(self, token: dict[str, Any]) -> User:
-        with create_session() as session:
-            return session.scalars(select(User).where(User.id == int(token["sub"]))).one()
+        id = int(token["sub"])
+        user = self.user_cache.get(id)
+
+        if not user:
+            with create_session() as session:
+                user = session.scalars(select(User).where(User.id == id)).one()
+                self.user_cache[id] = user
+
+        return user
 
     def serialize_user(self, user: User) -> dict[str, Any]:
         return {"sub": str(user.id)}

What you think should happen instead?

No response

How to reproduce

  1. Trigger a dagrun and then copy the token in cookie to get the latest dagrun as above.
  2. Apply the patch and make one iteration to ensure the workers have their cache filled. Repeat the benchmark.

Operating System

Ubuntu 20.04

Versions of Apache Airflow Providers

No response

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions