"AirflowException: Invalid arguments were passed to GlueJobOperator" when setting update_config=True #35637

@andrewparsons-janus

Apache Airflow version

Other Airflow 2 version (please specify below)

Airflow version: 2.4.3
apache-airflow-providers-amazon==8.7.1

What happened

Exception when loading DAG:

Broken DAG: [/opt/airflow/dags/<project_directory>/<dag>.py] Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 408, in apply_defaults
    result = func(self, **kwargs, default_args=default_args)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 756, in __init__
    raise AirflowException(
airflow.exceptions.AirflowException: Invalid arguments were passed to GlueJobOperator (task_id: submit_glue_job). Invalid arguments were:
**kwargs: {'update_config': True}
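
To illustrate how a traceback like this can arise, here is a simplified model, not Airflow's actual code. It assumes the operator forwards every keyword it does not declare to its base class, and that the base class rejects whatever is left over. `SketchBaseOperator`, `SketchGlueOperator`, and `describe_failure` are hypothetical names used only for this sketch.

```python
class SketchBaseOperator:
    """Hypothetical stand-in for a base operator; not Airflow's real code."""

    def __init__(self, task_id: str, retries: int = 0, **kwargs):
        # Anything still in kwargs here was not claimed by any subclass.
        if kwargs:
            raise ValueError(
                f"Invalid arguments were passed to {type(self).__name__} "
                f"(task_id: {task_id}). Invalid arguments were:\n"
                f"**kwargs: {kwargs}"
            )
        self.task_id = task_id
        self.retries = retries


class SketchGlueOperator(SketchBaseOperator):
    """Hypothetical operator whose __init__ does not declare update_config."""

    def __init__(self, *, job_name: str, **kwargs):
        # Undeclared keywords (such as update_config) are passed upward.
        super().__init__(**kwargs)
        self.job_name = job_name


def describe_failure() -> str:
    """Build the sketch operator with an undeclared keyword and return the error text."""
    try:
        SketchGlueOperator(
            task_id="submit_glue_job", job_name="demo", update_config=True
        )
    except ValueError as exc:
        return str(exc)
    return "no error"
```

If this model matches what is happening, the operator class that was actually imported does not declare `update_config` in its `__init__`, whatever the documentation being read says.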

What you think should happen instead

I don't believe that passing update_config, a documented GlueJobOperator init parameter, should raise an exception. I also don't believe that I should have to set ALLOW_ILLEGAL_ARGUMENTS to use it. Why would a documented parameter be illegal?

If this is expected behavior, then it should be better documented.

How to reproduce

If needed, I can provide a comprehensive example, but the task below should be easy to adapt into a basic DAG.

@task_group(group_id="...", default_args=None)
def run_glue_job(job_name: str, script_args: dict):
    submit_glue_job = GlueJobOperator(

        # BaseOperator kwargs
        task_id="submit_glue_job",
        retries=0,
        on_success_callback=foo,
        on_failure_callback=[foo, bar],

        # GlueJobOperator kwargs
        wait_for_completion=False,
        job_name=job_name,
        script_location=f"{S3_PATH}/glue_script.py",
        script_args=script_args,

        update_config=True,  # <-- this causes an exception!

        **EXTRA_KWARGS_GLUE,
    )

    wait_on_glue_job = GlueJobSensor(
        task_id="wait_on_glue_job",
        job_name=job_name,
        run_id=submit_glue_job.output,  # type: ignore
        on_failure_callback=bar,
    )

    submit_glue_job >> wait_on_glue_job
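
A quick way to see whether the operator class in the running environment declares `update_config` is to inspect its constructor signature and print the installed provider version. The sketch below is only a diagnostic aid. The helper names are hypothetical, and it assumes that `inspect.signature` can see through any wrapper Airflow places around `__init__`.

```python
import inspect
from importlib import metadata
from typing import Optional


def declares_parameter(cls: type, name: str) -> bool:
    """Return True if cls.__init__ explicitly declares a parameter called `name`."""
    params = inspect.signature(cls.__init__).parameters
    return name in params


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_glue_operator() -> Optional[bool]:
    """Report on GlueJobOperator; returns None when the provider is not importable."""
    try:
        from airflow.providers.amazon.aws.operators.glue import GlueJobOperator
    except ImportError:
        return None
    print("provider version:", installed_version("apache-airflow-providers-amazon"))
    return declares_parameter(GlueJobOperator, "update_config")
```

Running `check_glue_operator()` inside the scheduler or worker container (rather than on the host) shows what the DAG parser actually imports. A result of `False` would be consistent with the traceback above.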

Operating System

macOS 14.1.1

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.7.1

Deployment

Docker-Compose

Deployment details

Docker version 20.10.24, build 297e128

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Edit 1: adjust listed version for apache-airflow-providers-amazon
