Skip to content

Conversation

@alexott
Copy link
Contributor

@alexott alexott commented Nov 1, 2022

Existing implementation uses list API to find all possible jobs and filtering out them by name. This leads to huge load on the API backend for workspaces with big number of jobs (> 1k). Databricks Workflows team extended list API with search functionality.

@alexott
Copy link
Contributor Author

alexott commented Nov 1, 2022

Instead of explicit parameter we can check for presence of an environment variable. Not sure what is the best way, especially when we depreciate old implementation completely

@o-nikolas
Copy link
Contributor

Instead of explicit parameter we can check for presence of an environment variable. Not sure what is the best way, especially when we depreciate old implementation completely

Could you not attempt to make the new API request and then handle the error/exception you get back if it isn't supported by the backend being used, then automatically fall back to the old api in that case?

@alexott
Copy link
Contributor Author

alexott commented Nov 2, 2022

@o-nikolas it's not really a completely new API, it's a new parameter to that API. If API doesn't support that parameter it just returns all data as before...

@alexott alexott marked this pull request as draft November 2, 2022 07:10
@o-nikolas
Copy link
Contributor

@o-nikolas it's not really a completely new API, it's a new parameter to that API. If API doesn't support that parameter it just returns all data as before...

If that's the case, then it should be easy to build this in a backwards compatible way without the use_old_list_impl flag right?

Existing implementation uses list API to find all possible jobs and filtering out them by
name. This leads to huge load on the API backend for workspaces with big number of jobs
(> 1k). Databricks Workflows team extended list API with search functionality.
@alexott alexott force-pushed the databricks-use-name-search-listing branch from c464aad to 3dc2d7c Compare November 7, 2022 18:32
@alexott alexott marked this pull request as ready for review November 7, 2022 18:32
@alexott
Copy link
Contributor Author

alexott commented Nov 7, 2022

I've reimplemented it differently, so old review comments aren't much relevant

@alexott alexott requested a review from ephraimbuddy November 7, 2022 18:33
@alexott
Copy link
Contributor Author

alexott commented Nov 7, 2022

Hmmm, not sure if it fails in static check... Looks like mypy inferred type incorrectly

@alexott
Copy link
Contributor Author

alexott commented Nov 8, 2022

@ephraimbuddy comments are addressed & mypy error has been fixed. Can you please review again?
thank you

@potiuk potiuk merged commit eb06c65 into apache:main Nov 11, 2022
Adityamalik123 pushed a commit to Adityamalik123/airflow that referenced this pull request Nov 12, 2022
…27446)

* Use new job search API for triggering Databricks job by name

Existing implementation uses list API to find all possible jobs and filtering out them by
name. This leads to huge load on the API backend for workspaces with big number of jobs
(> 1k). Databricks Workflows team extended list API with search functionality.

* address review comments

* fix mypy error
@alexott alexott deleted the databricks-use-name-search-listing branch November 19, 2022 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants