
DagProcessor restarts constantly when running as a standalone process #30251

@antoniocorralsierra

Apache Airflow version

2.5.2

What happened

I'm running Airflow locally in my minikube cluster. For the deployment I use the Official Apache Airflow Helm Chart (1.8.0) with the following values.yaml (helm install airflow-release -f values.yaml apache-airflow/airflow):

defaultAirflowTag: "2.5.2"
airflowVersion: "2.5.2"

dagProcessor:
  enabled: true
  replicas: 1

env:
  - name: "AIRFLOW__CORE__LOAD_EXAMPLES"
    value: "True"

All components are deployed correctly, but the dag processor pod restarts every 5 minutes. When I inspected this pod, I found that the liveness probe fails due to a timeout. The command executed by the probe is "sh -c CONNECTION_CHECK_MAX_COUNT=0 AIRFLOW__LOGGING__LOGGING_LEVEL=ERROR exec /entrypoint \\nairflow jobs check --hostname $(hostname)\n".

The following error message is reported:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/loading.py", line 1241, in configure_subclass_mapper
    sub_mapper = mapper.polymorphic_map[discriminator]
KeyError: 'DagProcessorJob'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/__main__.py", line 48, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 52, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/utils/session.py", line 75, in wrapper
    return func(*args, session=session, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/jobs_command.py", line 47, in check
    jobs: list[BaseJob] = query.all()
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 2773, in all
    return self._iter().all()
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1476, in all
    return self._allrows()
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 401, in _allrows
    rows = self._fetchall_impl()
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1389, in _fetchall_impl
    return self._real_result._fetchall_impl()
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1813, in _fetchall_impl
    return list(self.iterator)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/loading.py", line 151, in chunks
    rows = [proc(row) for row in fetch]
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/loading.py", line 151, in <listcomp>
    rows = [proc(row) for row in fetch]
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/loading.py", line 1269, in polymorphic_instance
    _instance = polymorphic_instances[discriminator]
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/_collections.py", line 746, in __missing__
    self[key] = val = self.creator(key)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/loading.py", line 1244, in configure_subclass_mapper
    "No such polymorphic_identity %r is defined" % discriminator
AssertionError: No such polymorphic_identity 'DagProcessorJob' is defined
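
To illustrate the shape of this failure, here is a minimal, stdlib-only sketch of a polymorphic loader. It is not SQLAlchemy's or Airflow's actual code: the classes, the `load_row` helper, and the sample rows are all hypothetical. It assumes, as one plausible reading of the traceback, that a row in the job table carries the discriminator 'DagProcessorJob' while no class with that identity has been registered in the process running the check.

```python
# Simplified model of a polymorphic loader (hypothetical, not SQLAlchemy's code).

class BaseJob:
    """Stand-in for a polymorphic base class."""

    # Maps discriminator values (e.g. a job_type column) to Python classes.
    polymorphic_map = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # A subclass is registered only once its definition has been executed,
        # i.e. once the module that defines it has been imported.
        BaseJob.polymorphic_map[cls.__name__] = cls


class SchedulerJob(BaseJob):
    """Registered, because the class definition has run."""


# Assumption: no DagProcessorJob class is defined in this process.


def load_row(row):
    """Turn a {'job_type': ...} row into an instance of the mapped class."""
    discriminator = row["job_type"]
    try:
        cls = BaseJob.polymorphic_map[discriminator]
    except KeyError:
        # Raising here chains the two exceptions, like the traceback above.
        raise AssertionError(
            "No such polymorphic_identity %r is defined" % discriminator
        )
    return cls()


# Hypothetical table contents: one known and one unknown job type.
rows = [{"job_type": "SchedulerJob"}, {"job_type": "DagProcessorJob"}]

loaded = []
error = None
for row in rows:
    try:
        loaded.append(type(load_row(row)).__name__)
    except AssertionError as exc:
        error = str(exc)

print(loaded)
print(error)
```

In this sketch the unfiltered scan over all rows fails as soon as it reaches the unregistered identity, which matches the point in the traceback where query.all() raises inside the jobs check command.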

What you think should happen instead

I think there is an error in the file /airflow/airflow/cli/cli_parser.py (at the 2.5.2 tag commit). On line 919 I found this:

ARG_JOB_TYPE_FILTER = Arg(
    ("--job-type",),
    choices=("BackfillJob", "LocalTaskJob", "SchedulerJob", "TriggererJob"),
    action="store",
    help="The type of job(s) that will be checked.",
)

As we can see, DagProcessorJob does not appear in choices. I think this could be part of the problem.

PS: In more recent versions of the code, cli_parser.py has been split up, and this code now lives in cli_config.py.
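
For reference, a `choices` restriction like the one above can be sketched with plain argparse. This is an illustration under assumptions, not Airflow's CLI: the parser name and the `accepts` helper are hypothetical, and only the option name, the `choices` tuple, and the help text are copied from the snippet quoted above.

```python
import argparse
import contextlib
import io

# Copied from the choices tuple quoted above; DagProcessorJob is absent.
JOB_TYPES = ("BackfillJob", "LocalTaskJob", "SchedulerJob", "TriggererJob")


def build_parser():
    """Build a hypothetical stand-in parser with the same --job-type option."""
    parser = argparse.ArgumentParser(prog="jobs-check-sketch")
    parser.add_argument(
        "--job-type",
        choices=JOB_TYPES,
        action="store",
        help="The type of job(s) that will be checked.",
    )
    return parser


def accepts(job_type):
    """Return True if the parser accepts the given --job-type value."""
    try:
        # argparse writes its usage error to stderr; silence it here.
        with contextlib.redirect_stderr(io.StringIO()):
            build_parser().parse_args(["--job-type", job_type])
        return True
    except SystemExit:
        # argparse exits when a value is not in choices.
        return False


print(accepts("SchedulerJob"))
print(accepts("DagProcessorJob"))
print(build_parser().parse_args([]).job_type)
```

With this sketch, a value outside `choices` is rejected at parse time, while omitting the option leaves `job_type` as None. Note that the probe command quoted under "What happened" passes only --hostname, so in this model the `choices` check would not be reached for it.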

How to reproduce

Deploy Airflow with the Official Helm Chart (1.8.0) on a minikube cluster using the configuration shown under "What happened".

Operating System

Ubuntu 20.04.6 LTS

Versions of Apache Airflow Providers

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Labels

area:core, kind:bug, needs-triage
