SparkKubernetesOperator doesn't respect name from application_file parameter (yaml) #41188

@andallo

Description

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

8.3.1

Apache Airflow version

2.9.2

Operating System

Debian GNU/Linux 12 (bookworm)

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened

SparkKubernetesOperator creates the SparkApplication with a templated name. The name always consists of the task name and a unique 8-character string, joined by '-'. The operator ignores the SparkApplication name from the YAML it consumes via the application_file parameter:

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
    name: user-specified-name
    namespace: forge
spec:
...

This is inconvenient when many DAGs have SparkKubernetesOperator tasks with the same task name: all the resulting SparkApplications look alike. It also leads to similar driver names, because a driver's name consists of the SparkApplication name plus the suffix '-driver'.
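
For illustration, a minimal task definition that consumes the manifest above might look like this; the dag_id, file path, and the random suffix shown are illustrative, not taken from the provider:

from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import (
    SparkKubernetesOperator,
)

with DAG(dag_id="example_dag", start_date=datetime(2024, 1, 1), schedule=None):
    # application_file points at the manifest above, which sets
    # metadata.name: user-specified-name
    submit = SparkKubernetesOperator(
        task_id="submit",
        namespace="forge",
        application_file="spark_application.yaml",
    )

# Observed: the created SparkApplication is named like "submit-ab12cd34"
# (task name plus random suffix), not "user-specified-name".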

What you think should happen instead

SparkKubernetesOperator should check whether the YAML passed via the application_file parameter specifies a name for the SparkApplication (path in the YAML: metadata.name). If a name is present, the operator should use it for the SparkApplication CRD; if no name is specified, the operator should use the templated name, as sketched below.
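
A minimal sketch of the proposed resolution order, assuming the manifest has already been parsed into a dict; resolve_application_name and templated_name are illustrative names, not the provider's actual internals:

def resolve_application_name(manifest: dict, templated_name: str) -> str:
    # Prefer an explicit metadata.name from the user's manifest.
    explicit_name = manifest.get("metadata", {}).get("name")
    if explicit_name:
        return explicit_name
    # Otherwise keep the current behavior: the templated name
    # (task name plus a unique 8-character suffix).
    return templated_name

# e.g. with the manifest above:
# resolve_application_name(manifest, "submit-ab12cd34") -> "user-specified-name"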

How to reproduce

Start a SparkApplication using SparkKubernetesOperator, passing a YAML file with metadata.name set via the application_file parameter. The name of the created SparkApplication will not match the one in the YAML.
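
One way to confirm the mismatch is to list the SparkApplication objects in the namespace, for example with the Kubernetes Python client (the cluster access setup here is illustrative):

from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster
api = client.CustomObjectsApi()
apps = api.list_namespaced_custom_object(
    group="sparkoperator.k8s.io",
    version="v1beta2",
    namespace="forge",
    plural="sparkapplications",
)
# Expected to contain "user-specified-name"; instead it shows the
# templated task-name-plus-suffix name.
for item in apps["items"]:
    print(item["metadata"]["name"])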

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct
