Add deployment guide for Prefect 2.0 #2431

Closed
deepyaman opened this issue Mar 16, 2023 · 18 comments
Labels
Component: Documentation 📄 Issue/PR for markdown and API documentation Issue: Feature Request New feature or improvement to existing feature

Comments

@deepyaman
Member

deepyaman commented Mar 16, 2023

Description

The existing deployment guides cover Prefect 1.x, but not Prefect 2.0, and there have been a number of questions about this recently (e.g. https://www.linen.dev/s/kedro/t/9703126/hi-everyone-does-anyone-know-whether-kedro-supports-prefect-#d8f1e73e-ee9f-4399-ae4b-31f75d657879).

Context

Deploying to Prefect 2.0 should look different from deploying to the Prefect 1.x series, so it's not as simple as slightly modifying the deployment instructions. Furthermore, the existing Prefect deployment guide creates the underlying Prefect objects directly, which is not how the Prefect tutorial demonstrates it, so updating the guide properly requires looking at the Prefect source code.

Possible Implementation

N/A

Possible Alternatives

Support Prefect 2.0 as a built-in runner.

@deepyaman added the Issue: Feature Request New feature or improvement to existing feature label Mar 16, 2023
@astrojuanlu
Member

Related: do we want to keep the existing Prefect 1.0 guide? I'm not familiar enough with the project to judge.

@deepyaman
Member Author

Related: do we want to keep the existing Prefect 1.0 guide? I'm not familiar enough with the project to judge.

I'm not familiar enough, either, but I assume most new deployments should be to Prefect 2.0 (so I'd say it's not as important to keep Prefect 1.0).

@ofir-insait

What about simply calling os.system("kedro run") from a Prefect workflow? What are the disadvantages of doing such a thing?

@astrojuanlu
Member

@ofir-insait from my limited understanding of Prefect, I don't think there would be anything wrong with doing it. You would just lose granularity compared with, say, a Kedro pipeline transformed into a Prefect flow.
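
To illustrate the trade-off, here is a minimal sketch of the `kedro run` approach, assuming Prefect 2.x's top-level `flow` and `task` decorators. It uses `subprocess.run` in place of `os.system` so that a failed run raises an error. The names `build_kedro_command`, `run_kedro`, and `kedro_cli_flow` are hypothetical, and the `try`/`except` fallback is there only so the sketch also runs where Prefect is not installed.

```python
import subprocess
from typing import List, Optional

try:
    # Prefect 2.x exposes the `flow` and `task` decorators at the top level.
    from prefect import flow, task
except ImportError:
    # Assumption for this sketch only: without Prefect installed, the
    # decorators become no-ops and the functions stay plain Python.
    def flow(fn):
        return fn

    def task(fn):
        return fn


def build_kedro_command(
    pipeline: Optional[str] = None, env: Optional[str] = None
) -> List[str]:
    """Build the argument list for a `kedro run` invocation (hypothetical helper)."""
    command = ["kedro", "run"]
    if pipeline:
        command += ["--pipeline", pipeline]
    if env:
        command += ["--env", env]
    return command


@task
def run_kedro(command: List[str], project_dir: str) -> int:
    """Run Kedro as one opaque step: Prefect sees a single task, not one per node."""
    # check=True raises CalledProcessError on a non-zero exit code,
    # so a failed Kedro run also fails the task and the flow.
    completed = subprocess.run(command, cwd=project_dir, check=True)
    return completed.returncode


@flow
def kedro_cli_flow(project_dir: str = ".", pipeline: Optional[str] = None) -> int:
    """Hypothetical flow that shells out to `kedro run` in `project_dir`."""
    return run_kedro(build_kedro_command(pipeline), project_dir)
```

Calling `kedro_cli_flow()` from inside a Kedro project directory would record a single task run for the whole Kedro pipeline, which is the loss of granularity mentioned above.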

@ofir-insait

Gotcha, thanks!

@jmalovera10
Contributor

jmalovera10 commented May 14, 2023

Hi! I wanted to ask whether the Prefect deployment guide is going to be updated soon, since Prefect Cloud 1.0 is going to be frozen on May 15th 2023. To keep using the service, we must migrate to Prefect 2.0.

@hugocool

One point to consider is that the Prefect 2.0 server is not open source, so one has to use their cloud.
Therefore, for sensitive applications (like healthcare) Prefect 2.0 is not an option, and Prefect 1.x is actually preferred.
And while their Prefect 1.0 server may be sunset, the version you deploy yourself is still a great option, all things considered.

@astrojuanlu
Member

astrojuanlu commented May 31, 2023

@hugocool thanks for chiming in (see the Slack conversation for more context). I went and checked and apparently there is a Prefect 2.0 server:

https://docs.prefect.io/2.10.11/host/

About Prefect 1.0 transition: https://www.prefect.io/guide/blog/freezing-legacy-prefect-cloud-1-accounts-on-starter-and-standard-plans/

Today we are announcing that legacy Prefect Cloud 1 accounts will be frozen for some users starting May 15th 2023.

@jmalovera10 we have limited capacity and unfortunately updating the Prefect docs is not our priority in the short term, but we would be more than happy to review community contributions.

@noklam
Contributor

noklam commented Jun 19, 2023

Adding this for reference
image

From this Slack thread

@jmalovera10
Contributor

@astrojuanlu thank you for replying! I understand if it is not a priority right now. Maybe I could help: I managed to build a working example for the project I am working on. Should I make a PR for documentation?

@astrojuanlu
Member

Should I make a PR for documentation?

@jmalovera10 Yes please! 💖

@jmalovera10 mentioned this issue Jun 26, 2023
@astrojuanlu
Member

Reviews welcome on gh-2725 🚀

@hugocool

@jmalovera10, thanks for including me/us.
We just finished two deployments of Kedro: one on AWS Batch and one on Prefect v1.
From my perspective, the deployment on Prefect is not feature complete, specifically with respect to filtering. As far as I know, one of the main differences between the two is that Batch runs every node as a separate job, which is not ideal but does keep the filtering feature working, whereas Prefect runs a whole pipeline as a flow. So it wasn't clear to me whether Prefect could run single nodes and/or entire pipelines in a way that lets one keep using the traditional kedro run commands. We also found some issues with node hooks, specifically when using kedro-mlflow.
Basically, we would like to contribute a plugin for this deployment.

Also, did you figure out how the different agents used in v1 (DockerAgent, LocalAgent, KubernetesAgent) map to v2?

@astrojuanlu
Member

@hugocool your feedback is really appreciated and I'm hoping @jmalovera10 can comment on that. In the meantime, we have merged gh-2748 which explains a simple approach that works on Prefect 2.0, hence closing the original scope of this issue. Feel free to keep discussing, or otherwise open a new issue for further improvements.

@jmalovera10
Contributor

@hugocool thanks for the question. The deployment script I adapted from Prefect 1.0 takes a pipeline and wraps its Kedro nodes as Prefect tasks. This keeps the mappings consistent: pipelines -> flows and nodes -> tasks. The filtering you describe could be done by initializing the filtered pipeline and extracting its topological order. You could then wrap the nodes as tasks and execute them by topological "layers", in other words, run first the nodes that other nodes depend on. To modify the script this way, you could replace this pipeline declaration:

pipeline = pipelines.get(pipeline_name)

with this one:

pipeline = pipelines.get(pipeline_name).filter(...)

The limitation of this approach is that you must have a pipeline for the script to work. However, the filtering functions allow you to filter by node names, so providing a single node name would suffice for single node executions.
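
As a rough illustration of the "layers" idea, the sketch below groups nodes into topological layers and filters by node name using only the Python standard library (`graphlib`, Python 3.9+). It is not the deployment script itself: the node names, the `dependencies` mapping, and the helpers `topological_layers` and `filter_by_node_names` are all hypothetical. In a real project the filtering would come from Kedro's own `Pipeline.filter`, which this sketch deliberately avoids so that it runs without Kedro installed.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Set


def topological_layers(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group node names into layers; each layer depends only on earlier layers."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    layers: List[List[str]] = []
    while sorter.is_active():
        # Nodes whose dependencies have all been marked as done.
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers


def filter_by_node_names(
    dependencies: Dict[str, Set[str]], node_names: Iterable[str]
) -> Dict[str, Set[str]]:
    """Keep only the named nodes and the edges between them."""
    keep = set(node_names)
    return {name: deps & keep for name, deps in dependencies.items() if name in keep}


# Hypothetical pipeline: node name -> names of the nodes it depends on.
dependencies = {
    "preprocess_companies": set(),
    "preprocess_shuttles": set(),
    "create_model_input": {"preprocess_companies", "preprocess_shuttles"},
    "train_model": {"create_model_input"},
}

# Full pipeline: three layers, the first of which could run concurrently.
layers = topological_layers(dependencies)

# Single-node execution: filtering by one node name leaves a one-node pipeline.
single = topological_layers(filter_by_node_names(dependencies, ["train_model"]))
```

Each inner list holds nodes that could be submitted as Prefect tasks together, since nothing in a layer depends on another node in the same layer.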

On the other hand, I have only worked with the Process Block, which executes a local entrypoint when an Agent receives an execution event. What changed in Prefect 2.0 is that the Agent is independent of the infrastructure, so you rely on a single Agent type and define the entrypoint for the infrastructure you want to use. I haven't tried it, but my guess is that configuring, for example, a Docker Block would make the deployment similar to one with a Process Block. Let me know if this is helpful 😄

@astrojuanlu
Member

Heads up for people subscribed to this issue: there seem to be some problems with Kedro on Prefect under Windows, see #2346.

7 participants