Argo events for triggering pipelines #651

Closed
swiftdiaries opened this issue Jan 8, 2019 · 35 comments
Labels
area/backend, help wanted, kind/feature, lifecycle/stale, priority/p1

Comments

@swiftdiaries
Member

It'd be great if we could trigger pipelines automatically in response to events.
Use Case 1:
When a model is uploaded to an object store -> trigger a step (pipeline) to deploy.
Use Case 2:
When data arrives at a local volume / external storage -> trigger a pipeline to train.

This is related to #604.

I'd love to see this feature and to help with the implementation through some PRs as well (if it's on the roadmap).

@paveldournov
Contributor

@swiftdiaries - yes, this feature is on the roadmap. Let's collaborate on the design.

/assign @vicaire

@swiftdiaries
Member Author

Awesome! Looking forward to this :)

@vicaire
Contributor

vicaire commented Jan 25, 2019

I will follow up on this thread as soon as we start tackling this. Thanks.

@vicaire
Contributor

vicaire commented Feb 13, 2019

@swiftdiaries

It's a bit short but I provided an outline of how we plan to support event-driven pipelines here: https://docs.google.com/document/d/1O5n02SzMYmLH0cMkykxHWWWe7eMzaP1vk7Y3fBbLoD8/edit#heading=h.mhe3tnle0c9o

(See event-driven pipelines and data-driven pipelines)

In a nutshell:

  • We will have a metadata store storing info about the data generated by a workflow (metadata).
  • Events from various sources (webhook, pub/sub, etc.) can also be stored in that metadata store, using a piece of infrastructure decoupled from the rest of the system.
  • An event-driven CRD will let users specify a workflow to execute each time new data of a particular type is added to the metadata store.

WDYT?
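
For illustration, the three bullets above can be sketched as a toy, in-memory model. This is only a sketch under assumptions, not the planned design: `MetadataStore`, `Artifact`, and `on_new` are hypothetical names, and a plain Python callback stands in for the workflow that the event-driven CRD would launch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Artifact:
    """A piece of metadata about data produced by a workflow (hypothetical shape)."""
    type_name: str
    uri: str


class MetadataStore:
    """Toy in-memory stand-in for the metadata store; not a real KFP component."""

    def __init__(self) -> None:
        self._artifacts: List[Artifact] = []
        self._triggers: Dict[str, List[Callable[[Artifact], None]]] = {}

    def on_new(self, type_name: str, workflow: Callable[[Artifact], None]) -> None:
        """Register a workflow to run whenever data of `type_name` is added."""
        self._triggers.setdefault(type_name, []).append(workflow)

    def add(self, artifact: Artifact) -> None:
        """Record the artifact, then run every workflow registered for its type."""
        self._artifacts.append(artifact)
        for workflow in self._triggers.get(artifact.type_name, []):
            workflow(artifact)
```

For example, registering a "deploy" callback for the `Model` type and then adding a `Model` artifact would invoke that callback once, while adding a `Dataset` artifact would not.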

@swiftdiaries
Member Author

Sorry for the late reply.

The overall idea is sound. I found this thread on kubeflow-discuss, about how Argo Events is integrated with Argo Workflows at GitHub, quite interesting.

Also, what is the status of this? If there are tasks to be done, I'm happy to work together on this one.

@vicaire
Contributor

vicaire commented Mar 27, 2019

@swiftdiaries,

The metadata store is currently being designed in collaboration with the KF community.

We could start by looking at the best way to integrate Argo events with KFP for common use cases. Adding the "help wanted" flag. Contributions/Proposals are welcome.

@vicaire vicaire added the help wanted label Mar 27, 2019
@vicaire
Contributor

vicaire commented Apr 11, 2019

Note: resolving this issue should enable support for continuous online learning, as requested in #1053.

@animeshsingh
Contributor

Do we need to make it specific to Argo Events? Can it be designed in a generic way to support something like Knative Eventing? @vicaire please include us if there are any backdoor design discussions going on at your end.

@VaibhavPage

@jingzhang36 Is this feature being actively worked on?

@stale

stale bot commented Jun 24, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Jun 24, 2020
@stale

stale bot commented Jul 2, 2020

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@stale stale bot closed this as completed Jul 2, 2020
@jondoering

Any updates on this feature?

@Bobgy
Contributor

Bobgy commented Aug 5, 2020

/reopen
looks like someone cares

no one is working on this.

I am curious what makes it different from using the KFP SDK, triggered by the event.

@k8s-ci-robot k8s-ci-robot reopened this Aug 5, 2020
@k8s-ci-robot
Contributor

@Bobgy: Reopened this issue.


@imagr-pat

+1 for this issue.

We would like to be able to trigger pipeline runs from GCP Pub/Sub events.

@Bobgy
Contributor

Bobgy commented Sep 2, 2020

@imagr-pat for GCP Pub/Sub events, it's possible to add a Cloud Function that listens for them and runs a KFP client. Does that work for you?
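
A minimal sketch of that workaround, assuming a Pub/Sub-triggered background Cloud Function written in Python. `PIPELINE_PACKAGE`, `EXPERIMENT_NAME`, and the `params` key in the message payload are hypothetical choices for illustration; `kfp.Client().create_run_from_pipeline_package` is the KFP SDK call used to start the run, and the optional `client` argument exists only so the handler can be exercised without a cluster.

```python
import base64
import json

# Hypothetical values for illustration; neither comes from this thread.
PIPELINE_PACKAGE = "pipeline.yaml"
EXPERIMENT_NAME = "event-triggered"


def decode_pubsub_message(event):
    """Return the JSON payload carried in the base64-encoded 'data' field."""
    raw = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(raw)


def trigger_pipeline(event, context=None, client=None):
    """Entry point: start one pipeline run per Pub/Sub message."""
    if client is None:
        # Imported lazily so the decoding logic works without kfp installed.
        import kfp

        # Assumption: endpoint and credentials are already configured for
        # this environment; otherwise pass host=... explicitly.
        client = kfp.Client()

    payload = decode_pubsub_message(event)
    # Assumption: the publisher puts pipeline parameters under "params".
    arguments = payload.get("params", {})
    return client.create_run_from_pipeline_package(
        PIPELINE_PACKAGE,
        arguments=arguments,
        experiment_name=EXPERIMENT_NAME,
    )
```

Keeping the decoding step separate from the submission step means the same handler shape could be reused for other event sources by swapping only `decode_pubsub_message`.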

@codebeard1

codebeard1 commented Sep 22, 2020

+1 from me on this issue as well.

Ideally I would like to see native Kafka support for event-based triggering of Kubeflow pipelines. That way we wouldn't need something external like NiFi or Airflow to trigger pipelines in response to an event. The goal is better native support for online learning, which is event-driven: mini-batches of training data constantly flow into the pipelines to re-train and re-deploy a model.

@stale

stale bot commented Dec 24, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Dec 24, 2020
@maganaluis
Contributor

hold

@stale stale bot removed the lifecycle/stale label Jan 28, 2021
@albertshx

Looking forward to seeing this feature so we don't need AWS Lambda or Cloud Functions to chain the relevant pipelines. A big thank you!

@stale

stale bot commented Jun 9, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Jun 9, 2021
@jeanphilippelingrand

Our team would like to integrate with an SQS queue.
The use case is the following: we would have a data pipeline on Airflow and an ML pipeline on Kubeflow.
The integration would allow us to run the ML pipeline once the data pipeline has completed.

@stale stale bot removed the lifecycle/stale label Jun 29, 2021
@midhun1998
Member

+1 on this issue.
Resolving this issue could partially address CD too. We could have an Argo workflow that does CD for us, triggered by a GitHub webhook.

@stale

stale bot commented Mar 3, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Mar 3, 2022
@xin-hao-awx

@chensun has this been designed? Or are you open to a community design proposal?

@stale stale bot removed the lifecycle/stale label Apr 15, 2022
@WaterKnight1998

Any news on this?

I am looking for event-driven / data-driven pipelines that get triggered when new data arrives.

@droctothorpe
Contributor

+1

@satriawadhipurusa

+1

@rsr23

rsr23 commented Sep 5, 2022

+1

@magdalenakuhn

magdalenakuhn commented Mar 5, 2023

Any update on this? We're also quite interested in this! The current workaround would be to use an AWS Lambda function or Google Cloud Function, as described here https://amygdala.github.io/gcp_blog/ml/kfp/mlops/tfdv/gcf/2021/02/26/kfp_tfdv_event_triggered.html#event-triggered-pipeline-runs, that simply executes kfp.Client().run_pipeline().

Linchin pushed a commit to Linchin/pipelines that referenced this issue Apr 11, 2023
* There is a bug in the SSL certificate cleanup code which is preventing
  certificates from being GC'd.

* Fix kubeflow#636
@charlesmelby

Another request for updates on this issue!

@titoeb

titoeb commented Sep 21, 2023

+1


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale label Jun 26, 2024

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
