Incorrect type annotation for num_of_dpus in GlueJobOperator/GlueJobHook  #29091

@amarjanovic-as24

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==7.1.0

Apache Airflow version

2.2.2

Operating System

macOS Ventura 13.1

Deployment

Docker-Compose

Deployment details

No response

What happened

When calling GlueJobOperator with create_job_kwargs={"Command": {"Name": "pythonshell"}}, I need to specify MaxCapacity. Based on the code here, MaxCapacity is set from num_of_dpus, and that parameter is annotated as an integer, as stated here.
I want to use pythonshell, and for that job type AWS Glue accepts only 0.0625 or 1 DPU. The value 0.0625 cannot be expressed as an integer.
When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.

I tried passing MaxCapacity directly, as create_job_kwargs={"Command": {"Name": "pythonshell"}, "MaxCapacity": 0.0625}, but that throws an error.

What you think should happen instead

I think the num_of_dpus parameter should accept a float (double), or MaxCapacity should be settable as a float when pythonshell is selected in Command -> Name.
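
To make the proposal concrete, here is a minimal sketch of the behaviour I have in mind. It is not the provider's actual code: `resolve_max_capacity` and `PYTHONSHELL_DPUS` are hypothetical names used only for illustration, and the allowed values come from the AWS Glue note quoted above.

```python
from __future__ import annotations

# Allowed MaxCapacity values for Python shell jobs, per the AWS Glue note
# quoted in this issue (0.0625 or 1 DPU, default 0.0625).
PYTHONSHELL_DPUS = (0.0625, 1.0)


def resolve_max_capacity(
    command_name: str, num_of_dpus: int | float | None = None
) -> float | None:
    """Hypothetical helper: pick the MaxCapacity value for a Glue job.

    This is an illustration of the proposed typing change, not the
    implementation used by GlueJobHook.
    """
    if command_name == "pythonshell":
        if num_of_dpus is None:
            # Fall back to the documented default for Python shell jobs.
            return 0.0625
        value = float(num_of_dpus)
        if value not in PYTHONSHELL_DPUS:
            raise ValueError(
                f"pythonshell jobs accept 0.0625 or 1 DPU, got {num_of_dpus}"
            )
        return value
    # Other job types are out of scope for this sketch: pass the value
    # through as a float, or return None so the caller keeps its own default.
    return None if num_of_dpus is None else float(num_of_dpus)
```

With `num_of_dpus` annotated as `int | float | None`, a call such as `resolve_max_capacity("pythonshell", 0.0625)` is accepted, while existing integer values for other job types keep working.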

How to reproduce

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
