Dynamic queueing via REST API trigger #11080

Open
FelixReuthlingerBMW opened this issue Sep 22, 2020 · 6 comments
Labels
area:API (Airflow's REST/HTTP API), area:core, area:Scheduler (including HA (high availability) scheduler), kind:feature (Feature Requests)

Comments

@FelixReuthlingerBMW

FelixReuthlingerBMW commented Sep 22, 2020

Description

I want to run Airflow DAGs triggered through the REST API in different queues, and to set the queue name in the REST API call.

Use case / motivation

Currently, Airflow only supports setting queues at the DAG deployment level. If I have a DAG that I want to run once with high priority and thousands of times with lower priority, I cannot do this through the REST API, which we need for programmatic integration with other software and services.

A workaround is to deploy the DAG twice, each with a different hard-coded queue value, but this duplicates a large amount of code and/or deployments and will probably cause naming conflicts, among other problems. These workarounds are needed only because the REST API's data model offers no way to set the queue for a DAG run.

I would like to just send JSON like this to Airflow and have it figure out how to queue the DAG run:

{
  "conf": {
    ... my application config for the DAG run...
  },
  "queue": "..."
}

Related Issues

None that I am aware of yet.

@FelixReuthlingerBMW FelixReuthlingerBMW added the kind:feature Feature Requests label Sep 22, 2020
@mik-laj
Member

mik-laj commented Sep 22, 2020

This is not officially supported, but you can try to use a cluster policy to configure the queue based on DAGRun.conf. See:
https://airflow.readthedocs.io/en/latest/concepts.html#mutate-task-instances-before-task-execution
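As a rough illustration of this suggestion, the sketch below shows what such a cluster policy might look like in `airflow_local_settings.py`. It assumes the DAG run's `conf` carries a `"queue"` key, that `task_instance.get_dagrun()` returns the run (or `None`) at the point the hook is called, and that the queue names in `ALLOWED_QUEUES` exist; the key name, the queue names and the `pick_queue` helper are hypothetical. As noted above, this is not an officially supported pattern.

```python
# airflow_local_settings.py (sketch only, not an officially supported pattern)

# Hypothetical queue names; replace with the queues your workers listen on.
ALLOWED_QUEUES = {"default", "high_prio"}


def pick_queue(conf, default):
    """Return the queue requested in a DAG run's conf, or `default`."""
    if not isinstance(conf, dict):
        return default
    requested = conf.get("queue")  # assumed key name
    return requested if requested in ALLOWED_QUEUES else default


def task_instance_mutation_hook(task_instance):
    """Cluster policy: route each task instance to the queue named in its run's conf."""
    dag_run = task_instance.get_dagrun()  # assumed to be available here
    conf = getattr(dag_run, "conf", None)
    # Keep the task's own queue when the run does not ask for a known queue.
    task_instance.queue = pick_queue(conf, default=task_instance.queue)
```

Restricting the value to a known set keeps a typo in the request from routing tasks to a queue that no worker consumes.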

@FelixReuthlingerBMW
Author

Thanks for the hint. If I understand it correctly, this would only apply different settings to individual tasks, not to the whole DAG run, right?

The problem then would be: if I first start 1000 DAG runs, a high-priority DAG run queued as number 1001 would still have to wait until all the other DAG runs had finished, regardless of which queue its tasks are executed in, right?

@mik-laj
Member

mik-laj commented Sep 22, 2020

@FelixReuthlingerBMW Yes. But you can assign all tasks from a given DAG to one queue. In Airflow, a queue is assigned to tasks, not to a DAG run.

No. The tasks will run according to the queues to which they have been assigned. You can create two queues to run some tasks faster.
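On the client side, one way to carry a queue name today is to nest it inside `conf`, since the trigger endpoint has no dedicated `queue` field. The sketch below only builds the request and does not send it. The default path follows the stable REST API (`/api/v1/dags/{dag_id}/dagRuns`); older deployments using the experimental API need a different path. The `"queue"` key inside `conf`, the helper name and the example values are assumptions, and authentication is omitted.

```python
import json
import urllib.request


def build_trigger_request(base_url, dag_id, app_conf, queue,
                          path_template="/api/v1/dags/{dag_id}/dagRuns"):
    """Build (but do not send) a POST request that triggers a DAG run.

    The queue name travels inside ``conf`` under an assumed "queue" key.
    """
    payload = {"conf": {**app_conf, "queue": queue}}
    url = base_url.rstrip("/") + path_template.format(dag_id=dag_id)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: prepare a high-priority trigger for a hypothetical DAG.
request = build_trigger_request(
    "http://localhost:8080", "my_dag", {"input": "s3://bucket/key"}, "high_prio"
)
```

Something on the Airflow side still has to read `conf["queue"]` and act on it, for example a cluster policy as suggested earlier in the thread.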

@FelixReuthlingerBMW
Author

I should mention that we limit parallel runs:

dag:
  concurrency: 60
  max_active_runs: 40

So this limits us to a maximum of 40 DAG runs in parallel. I guess Airflow would then not start more DAG runs, even if some of their tasks were assigned to different queues, right?

@mik-laj
Member

mik-laj commented Sep 22, 2020

I am not sure if max_active_runs applies to externally triggered DAG Runs. It would have to be checked.

@FelixReuthlingerBMW
Author

FelixReuthlingerBMW commented Sep 22, 2020

It does, works pretty fine ;)

If triggered via the REST API, a DAG run is created, but because the number of active DAG runs is capped, it will not start queuing tasks while that cap is already reached.
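To make the concern concrete, here is a toy model of that cap. It assumes runs are admitted strictly in creation order and that the oldest active run always finishes first. Both are simplifications for illustration, not a description of how the Airflow scheduler actually orders runs. The function name is hypothetical.

```python
from collections import deque


def finished_before_start(num_queued_ahead, max_active_runs):
    """Count how many earlier runs finish before the last-created run may start.

    Toy model: FIFO admission, and the oldest active run finishes first.
    """
    if max_active_runs < 1:
        raise ValueError("max_active_runs must be at least 1")
    target = num_queued_ahead  # index of the late, high-priority run
    pending = deque(range(num_queued_ahead + 1))
    active = deque()
    finished = 0
    while True:
        # Fill free slots in creation order.
        while pending and len(active) < max_active_runs:
            run = pending.popleft()
            if run == target:
                return finished
            active.append(run)
        # The oldest active run completes and frees one slot.
        active.popleft()
        finished += 1


# 1000 low-priority runs created first, cap of 40 active runs.
print(finished_before_start(1000, 40))  # prints 961
```

Under these assumptions the task-level queue never comes into play for the late run until a run slot frees up, which is why a per-task queue alone does not let it overtake the runs created before it.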

@mik-laj mik-laj added the area:API Airflow's REST/HTTP API label Sep 30, 2020
@jscheffl jscheffl added area:Scheduler including HA (high availability) scheduler area:core labels Aug 21, 2023