After upgrade from Airflow 2.2.4, grid disappears for some DAGs #23588

Closed
1 of 2 tasks
rotemseekingalpha opened this issue May 9, 2022 · 33 comments · Fixed by #32992
Labels
affected_version:2.3 Issues Reported for 2.3 area:API Airflow's REST/HTTP API area:UI Related to UI/UX. For Frontend Developers. kind:bug This is a clearly a bug

Comments

@rotemseekingalpha

Apache Airflow version

2.3.0 (latest released)

What happened

After the upgrade from 2.2.4 to 2.3.0, the grid data for some DAGs appears to be missing, and the UI renders blank.

What you think should happen instead

When I click the grid for a specific execution date, I expect to be able to click the tasks and view the log, render jinja templating, and clear status

How to reproduce

Run an upgrade from 2.2.4 to 2.3.0 with a huge database (we have ~750 DAGs with a minimum of 10 tasks each).
In addition, we heavily rely on XCom.

Operating System

Ubuntu 20.04.3 LTS

Versions of Apache Airflow Providers

apache-airflow apache_airflow-2.3.0-py3-none-any.whl
apache-airflow-providers-amazon apache_airflow_providers_amazon-3.3.0-py3-none-any.whl
apache-airflow-providers-ftp apache_airflow_providers_ftp-2.1.2-py3-none-any.whl
apache-airflow-providers-http apache_airflow_providers_http-2.1.2-py3-none-any.whl
apache-airflow-providers-imap apache_airflow_providers_imap-2.2.3-py3-none-any.whl
apache-airflow-providers-mongo apache_airflow_providers_mongo-2.3.3-py3-none-any.whl
apache-airflow-providers-mysql apache_airflow_providers_mysql-2.2.3-py3-none-any.whl
apache-airflow-providers-pagerduty apache_airflow_providers_pagerduty-2.1.3-py3-none-any.whl
apache-airflow-providers-postgres apache_airflow_providers_postgres-4.1.0-py3-none-any.whl
apache-airflow-providers-sendgrid apache_airflow_providers_sendgrid-2.0.4-py3-none-any.whl
apache-airflow-providers-slack apache_airflow_providers_slack-4.2.3-py3-none-any.whl
apache-airflow-providers-sqlite apache_airflow_providers_sqlite-2.1.3-py3-none-any.whl
apache-airflow-providers-ssh apache_airflow_providers_ssh-2.4.3-py3-none-any.whl
apache-airflow-providers-vertica apache_airflow_providers_vertica-2.1.3-py3-none-any.whl

Deployment

Virtualenv installation

Deployment details

Python 3.8.10

Anything else

For the affected DAGs, all the time

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@rotemseekingalpha rotemseekingalpha added area:core kind:bug This is a clearly a bug labels May 9, 2022
@boring-cyborg

boring-cyborg bot commented May 9, 2022

Thanks for opening your first issue here! Be sure to follow the issue template!

@rotemseekingalpha
Author

Video that shows the problem

@jpipas

jpipas commented May 9, 2022

Getting the same thing. It appears to be an issue with the getTask API (there are several JavaScript errors in the console), most notably the /tasks endpoint for a given DAG.

Python version: 3.9.12
Airflow version: 2.3.0+astro.3

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python3.9/site-packages/connexion/decorators/decorator.py", line 68, in wrapper
    response = function(request)
  File "/usr/local/lib/python3.9/site-packages/connexion/decorators/uri_parsing.py", line 149, in wrapper
    response = function(request)
  File "/usr/local/lib/python3.9/site-packages/connexion/decorators/validation.py", line 399, in wrapper
    return function(request)
  File "/usr/local/lib/python3.9/site-packages/connexion/decorators/response.py", line 112, in wrapper
    response = function(request)
  File "/usr/local/lib/python3.9/site-packages/connexion/decorators/parameter.py", line 116, in wrapper
    return function(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/airflow/api_connexion/security.py", line 49, in decorated
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/airflow/api_connexion/endpoints/task_endpoint.py", line 67, in get_tasks
    return task_collection_schema.dump(task_collection)
  File "/usr/local/lib/python3.9/site-packages/marshmallow/schema.py", line 552, in dump
    result = self._serialize(processed_obj, many=many)
  File "/usr/local/lib/python3.9/site-packages/marshmallow/schema.py", line 520, in _serialize
    value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
  File "/usr/local/lib/python3.9/site-packages/marshmallow/fields.py", line 338, in serialize
    return self._serialize(value, attr, obj, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/marshmallow/fields.py", line 765, in _serialize
    return [self.inner._serialize(each, attr, obj, **kwargs) for each in value]
  File "/usr/local/lib/python3.9/site-packages/marshmallow/fields.py", line 765, in <listcomp>
    return [self.inner._serialize(each, attr, obj, **kwargs) for each in value]
  File "/usr/local/lib/python3.9/site-packages/marshmallow/fields.py", line 634, in _serialize
    return schema.dump(nested_obj, many=many)
  File "/usr/local/lib/python3.9/site-packages/marshmallow/schema.py", line 552, in dump
    result = self._serialize(processed_obj, many=many)
  File "/usr/local/lib/python3.9/site-packages/marshmallow/schema.py", line 520, in _serialize
    value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
  File "/usr/local/lib/python3.9/site-packages/marshmallow/fields.py", line 338, in serialize
    return self._serialize(value, attr, obj, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/marshmallow/fields.py", line 1870, in _serialize
    return self._serialize_method(obj)
  File "/usr/local/lib/python3.9/site-packages/airflow/api_connexion/schemas/task_schema.py", line 70, in get_params
    return {k: v.dump() for k, v in params.items()}
  File "/usr/local/lib/python3.9/site-packages/airflow/api_connexion/schemas/task_schema.py", line 70, in <dictcomp>
    return {k: v.dump() for k, v in params.items()}
AttributeError: 'dict' object has no attribute 'dump'
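
The last frames of the traceback show get_params calling .dump() on every value in params, which fails as soon as one value is a plain dict. A minimal sketch of a more tolerant serializer follows, assuming values may be either objects with a dump() method or plain values. The names dump_params and FakeParam are hypothetical and used only for illustration; this is not the fix that was applied in Airflow.

```python
from typing import Any, Dict


class FakeParam:
    """Hypothetical stand-in for an object that exposes dump()."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def dump(self) -> Dict[str, Any]:
        return {"value": self.value}


def dump_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Serialize params, tolerating values that have no dump() method."""
    result: Dict[str, Any] = {}
    for key, value in params.items():
        dump = getattr(value, "dump", None)
        # Call dump() only when it exists; pass plain values through unchanged.
        result[key] = dump() if callable(dump) else value
    return result


if __name__ == "__main__":
    mixed = {"wrapped": FakeParam(1), "plain": {"x": 2}}
    print(dump_params(mixed))
```

With this shape, a plain dict in params is returned as-is instead of raising AttributeError.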

@bbovenzi bbovenzi added area:API Airflow's REST/HTTP API area:UI Related to UI/UX. For Frontend Developers. and removed area:core labels May 9, 2022
@bbovenzi
Contributor

bbovenzi commented May 9, 2022

Hmm, that's a new one. @norm would you mind taking a look at get tasks?

@jpipas

jpipas commented May 14, 2022

Just wanted to update that this appears to happen only for DAGs that have TaskGroups. When viewing my "non-taskgroup" DAGs, the grid view works as expected. On a DAG with task groups (which therefore has an expandable/collapsible grid), clicking an individual task makes the entire grid view disappear.

@ashb
Member

ashb commented May 17, 2022

Can someone give us a DAG and re-production steps please?

@bbovenzi
Contributor

Also, what are the other js console errors aside from the get task endpoint? Even if that endpoint throws an error, the UI shouldn't crash

@rotemseekingalpha
Author

rotemseekingalpha commented May 19, 2022

I get the first error message twice, and then the second, when the Grid view loads:

# First
VM35:1          
GET https://****************/api/v1/dags/some_dag/tasks 500
    (anonymous)                 @   VM35:1
    (anonymous)                 @   tree.ac7202f….js:2
    e.exports                   @   tree.ac7202f….js:2
    e.exports                   @   tree.ac7202f….js:2
    c.request                   @   tree.ac7202f….js:2
    r.forEach.c.<computed>      @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    fetchFn                     @   tree.ac7202f….js:2
    s                           @   tree.ac7202f….js:2
    c                           @   tree.ac7202f….js:2
    t.fetch                     @   tree.ac7202f….js:2
    n.executeFetch              @   tree.ac7202f….js:2
    n.onSubscribe               @   tree.ac7202f….js:2
    t.subscribe                 @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    Dl                          @   tree.ac7202f….js:2
    t.unstable_runWithPriority  @   tree.ac7202f….js:2
    Go                          @   tree.ac7202f….js:2
    Bl                          @   tree.ac7202f….js:2
    bl                          @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    t.unstable_runWithPriority  @   tree.ac7202f….js:2
    Go                          @   tree.ac7202f….js:2
    Yo                          @   tree.ac7202f….js:2
    Xo                          @   tree.ac7202f….js:2
    xl                          @   tree.ac7202f….js:2
    is                          @   tree.ac7202f….js:2
    t.render                    @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    n                           @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    n                           @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2
    (anonymous)                 @   tree.ac7202f….js:2

# Second
Error: Request failed with status code 500
    at e.exports (tree.ac7202f96db7173e9baf.js:2:141127)
    at e.exports (tree.ac7202f96db7173e9baf.js:2:297309)
    at XMLHttpRequest.E (tree.ac7202f96db7173e9baf.js:2:139518)

And finally, the following error twice when I click one of the tasks:

tree.ac7202f96db7173e9baf.js:2 
TypeError: Cannot read properties of undefined (reading 'tasks')
    at B_ (tree.ac7202f96db7173e9baf.js:2:739128)
    at ui (tree.ac7202f96db7173e9baf.js:2:228656)
    at Qu (tree.ac7202f96db7173e9baf.js:2:280737)
    at Pl (tree.ac7202f96db7173e9baf.js:2:268010)
    at jl (tree.ac7202f96db7173e9baf.js:2:267938)
    at Cl (tree.ac7202f96db7173e9baf.js:2:267801)
    at bl (tree.ac7202f96db7173e9baf.js:2:264788)
    at tree.ac7202f96db7173e9baf.js:2:214575
    at t.unstable_runWithPriority (tree.ac7202f96db7173e9baf.js:2:291432)
    at Go (tree.ac7202f96db7173e9baf.js:2:214352)
cu                          @ tree.ac7202f96db7173e9baf.js:2
n.callback                  @ tree.ac7202f96db7173e9baf.js:2
ma                          @ tree.ac7202f96db7173e9baf.js:2
gu                          @ tree.ac7202f96db7173e9baf.js:2
Il                          @ tree.ac7202f96db7173e9baf.js:2
t.unstable_runWithPriority  @ tree.ac7202f96db7173e9baf.js:2
Go                          @ tree.ac7202f96db7173e9baf.js:2
Tl                          @ tree.ac7202f96db7173e9baf.js:2
bl                          @ tree.ac7202f96db7173e9baf.js:2
(anonymous)                 @ tree.ac7202f96db7173e9baf.js:2
t.unstable_runWithPriority  @ tree.ac7202f96db7173e9baf.js:2
Go                          @ tree.ac7202f96db7173e9baf.js:2
Yo                          @ tree.ac7202f96db7173e9baf.js:2
Xo                          @ tree.ac7202f96db7173e9baf.js:2
Me                          @ tree.ac7202f96db7173e9baf.js:2
Yt                          @ tree.ac7202f96db7173e9baf.js:2

@ashb
Member

ashb commented May 19, 2022

Great thanks -- the second error is likely a side effect of the first.

What error appears in the webserver logs when you do the first action?

@jpipas

jpipas commented May 19, 2022

The web server error is what I posted above... ends with:

AttributeError: 'dict' object has no attribute 'dump'

I am getting the same JS errors that @rotemseekingalpha is showing. My theory about TaskGroups, however, appears to be wrong, because it's also happening on "non-taskgroup" DAGs. In my case they're "previously parsed" DAGs that were there before the upgrade to 2.3.0, so I assume there's something in the migration for these DAGs, or something that happened to them in the past, that 2.3.0 just doesn't like.

@ashb
Member

ashb commented May 19, 2022

Do you have task.params or dag.params set, by any chance?

If someone can share with me a dag that exhibits this behaviour that would be very useful.

@rotemseekingalpha
Author

@ashb we rely heavily on task.params

@ashb
Member

ashb commented May 20, 2022

I'll say it for the third time: we need a DAG that has this behaviour to reproduce and fix it please.

@zachliu
Contributor

zachliu commented May 25, 2022

@ashb steps to reproduce: #23908 which seems to be a more severe version of this issue: the grid view can't be rendered at all

@ashb
Member

ashb commented May 27, 2022

@zachliu That grid view issue had a different cause (removing a task from a DAG after it has run) from the 500 errors reported in this thread.

@maf-rnmourao

The issue seems to happen only when you are not logged in. It can probably be solved by customizing some permissions.

Not logged in:
Screen Shot 2022-06-02 at 11 04 00 AM

Logged in:
Screen Shot 2022-06-02 at 11 07 05 AM

@bbovenzi
Contributor

bbovenzi commented Jun 2, 2022

Got it. At least one step: We need to make sure an error on /tasks doesn't crash the UI

@NilsJPWerner

NilsJPWerner commented Jun 8, 2022

I'm having the same issue:

image

Thank you for your fix @bbovenzi! Looking forward to it in the next version!

@junaidnasir-ps

It's also happening for us when the user role is Viewer (not logged in). For the Admin role it works fine.
We are also using Helm and Kubernetes with airflow:2.3.2-debian-11-r2 (docker.io/bitnami/airflow:2.3.2-debian-11-r2).

@jpipas

jpipas commented Jun 13, 2022

We were able to determine that this issue was caused by a custom operator that erroneously had params as part of its template_fields. Therefore any DAG that contained this operator would "flash" the grid - and then the subsequent API calls would fail - which then caused the grid to disappear. After removing this "keyword" used by the baseoperator from template_fields the grids load properly now.

Before: template_fields = ["sql_query", "s3_key","params"]
After: template_fields = ["sql_query", "s3_key"]
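
The before/after change above can be turned into a quick check for other custom operators. This is a minimal sketch, assuming the only problematic entry is "params" (the one confirmed in this thread). RESERVED_TEMPLATE_FIELDS, reserved_template_fields, and the two stand-in classes are hypothetical names for illustration; real operators would subclass Airflow's BaseOperator.

```python
from typing import List

# Assumption: only "params" is confirmed by this thread as a name that
# should not appear in template_fields.
RESERVED_TEMPLATE_FIELDS = {"params"}


def reserved_template_fields(operator_cls: type) -> List[str]:
    """Return the entries of template_fields that collide with reserved names."""
    fields = getattr(operator_cls, "template_fields", ())
    return [name for name in fields if name in RESERVED_TEMPLATE_FIELDS]


class BrokenOperator:
    """Hypothetical stand-in mirroring the 'Before' line."""

    template_fields = ["sql_query", "s3_key", "params"]


class FixedOperator:
    """Hypothetical stand-in mirroring the 'After' line."""

    template_fields = ["sql_query", "s3_key"]


if __name__ == "__main__":
    for cls in (BrokenOperator, FixedOperator):
        print(cls.__name__, reserved_template_fields(cls))
```

Running a check like this over custom operator classes may help locate DAGs affected by the same mistake before opening the grid view.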

@uranusjr
Member

@jpipas Do you know what error the API is failing with? (It may be available in the web server logs.) Ideally the web server should be a bit more resilient to this kind of user error.

@IonutArmeanu-work

IonutArmeanu-work commented Jun 16, 2022

I get the same error when I'm using AUTH_ROLE_PUBLIC = 'Admin' in webserver_config.py.
The /tasks endpoint fails with HTTP 401 ( Unauthorized )

@j-adamczyk

I get the same error, but Grid View disappears almost immediately after entering that page, without clicking anything. Also, I have a small number of DAGs.

@ephraimbuddy ephraimbuddy added this to the Airflow 2.4.0 milestone Aug 16, 2022
@eladkal eladkal added the affected_version:2.3 Issues Reported for 2.3 label Aug 16, 2022
@ashb ashb modified the milestones: Airflow 2.4.0, Airflow 2.4.1 Sep 8, 2022
@kobethuwis
Contributor

kobethuwis commented Sep 21, 2022

@cloventt @rotemseekingalpha I'm using the official Airflow helm chart and encountered the same error. In the chart, we also set AUTH_ROLE_PUBLIC = 'Admin' in the webserver config.

I was able to fix this error by adding the following to the chart:

extraEnv: |
  - name: AIRFLOW__API__AUTH_BACKENDS
    value: "airflow.api.auth.backend.default"
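
To confirm that the setting above actually reached the webserver process, a small check can be run in the same environment. This is a minimal sketch, assuming the variable holds a comma-separated list of backends; configured_auth_backends and has_default_backend are hypothetical helper names, not part of Airflow.

```python
import os
from typing import List, Mapping, Optional

DEFAULT_BACKEND = "airflow.api.auth.backend.default"


def configured_auth_backends(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Split AIRFLOW__API__AUTH_BACKENDS into a list of backend names."""
    source = os.environ if env is None else env
    raw = source.get("AIRFLOW__API__AUTH_BACKENDS", "")
    # Assumption: entries are comma-separated; blanks are ignored.
    return [item.strip() for item in raw.split(",") if item.strip()]


def has_default_backend(env: Optional[Mapping[str, str]] = None) -> bool:
    """Report whether the default backend from the workaround is configured."""
    return DEFAULT_BACKEND in configured_auth_backends(env)


if __name__ == "__main__":
    print("backends:", configured_auth_backends())
    print("default backend present:", has_default_backend())
```

Passing a mapping instead of reading os.environ keeps the helper easy to exercise outside a running deployment.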

@cgadam

cgadam commented Sep 21, 2022

> @cloventt @rotemseekingalpha I'm using the official Airflow helm chart. Encountered the same error. In the chart, we use AUTH_ROLE_PUBLIC = 'Admin' as well as webserver config.
>
> Have been able to fix this error by adding the following to the chart:
>
> extraEnv: |
>   - name: AIRFLOW__API__AUTH_BACKENDS
>     value: "airflow.api.auth.backend.default"

@kobethuwis thank you for this tip! Adding airflow.api.auth.backend.default to AIRFLOW__API__AUTH_BACKENDS also fixed this issue in my case! Nice workaround until we have this issue solved! Thanks!

@ephraimbuddy ephraimbuddy modified the milestones: Airflow 2.4.3, Airflow 2.4.4 Nov 9, 2022
@ephraimbuddy ephraimbuddy modified the milestones: Airflow 2.4.4, Airflow 2.5.0, Airflow 2.5.1 Nov 23, 2022
@eladkal eladkal removed this from the Airflow 2.6.1 milestone Apr 28, 2023
@radiophysicist

We have the same problem in some DAGs (others work as expected) after upgrading from 1.0.15 to 2.5.3.
There are no errors in either the browser console or the webserver logs.

@mjkonarski-b
Contributor

mjkonarski-b commented Jun 26, 2023

I believe I found the root cause of the problem.

A new feature was introduced in 2.3.0: the "details drawer" in Grid View (#22123). Under the hood it calls the /confirm endpoint. The problem is that this endpoint requires the CAN_EDIT RESOURCE_DAG and CAN_EDIT RESOURCE_TASK_INSTANCE permissions, which are obviously not granted to the Viewer role by default.

As a result the /confirm XHR call is redirected to the login page and returns its HTML content, which breaks the logic behind the details drawer and we get a blank page.

On top of that #30373 introduced more improvements to the details drawer and now there's also a POST being sent to /clear which requires even more permissions.

EDIT: It's not actually the details drawer itself that is problematic, but its "Mark task as" modals.
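
The failure mode described above (an XHR call that receives the login page's HTML where JSON was expected) can be guarded against by checking the response before parsing it. The Grid View itself is JavaScript; the following is only a Python sketch of the idea, and parse_json_or_none is a hypothetical helper, not code from Airflow.

```python
import json
from typing import Any, Optional


def parse_json_or_none(content_type: str, body: str) -> Optional[Any]:
    """Return parsed JSON, or None when the response is not JSON.

    A redirect to a login page typically yields an HTML body, which should
    be treated as "no data" rather than passed on to rendering logic.
    """
    if "application/json" not in content_type.lower():
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    print(parse_json_or_none("text/html; charset=utf-8", "<html>Sign in</html>"))
    print(parse_json_or_none("application/json", '{"state": "ok"}'))
```

A caller that receives None can show an error or hide the affected control instead of rendering a blank page.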

@potiuk
Member

potiuk commented Jun 26, 2023

cc: @bbovenzi @pierrejeambrun ^^
