Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parameterize the artifact path for mlpipeline ui-metadata and metrics #998

Merged
merged 5 commits into from
Apr 25, 2019

Conversation

Tomcli
Copy link
Member

@Tomcli Tomcli commented Mar 19, 2019

We have some use case where our containerOps doesn't have permission to access the root directory. Therefore, we want to parameterize the artifact path for mlpipeline ui-metadata and metrics because those artifact paths currently are hard coded in root.

Example usage:

containerOps().set_mlpipeline_ui_metadata_path(path)

cc: @animeshsingh


This change is Reviewable

@k8s-ci-robot
Copy link
Contributor

Hi @Tomcli. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Copy link
Contributor

Hi @Tomcli. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Ark-kun
Copy link
Contributor

Ark-kun commented Mar 20, 2019

Hi @Tomcli
Thank you for tackling this problem. I'd love to fix it in a slightly different way. I'd like to get your feedback on this solution.

Instead of extending the user-facing API surface (adding methods to ContainerOp), let's add proper support for output artifacts the same way output parameters are supported (e.g. similar to file_outputs):

def my_op():
  return ContainerOp(
    ...,
    output_artifact_paths = {
      'mlpipeline-ui-metadata': '/outputs/mlpipeline-ui-metadata/data',
      'mlpipeline-metrics': '/outputs/mlpipeline-metrics/data',
    }
  )

This is a pretty trivial change to the ContainerOp constructor and the DSL compiler.

To preserve backwards compatibility the compiler can still add the mlpipeline-ui-metadata and mlpipeline-metrics artifacts with default paths if they're not configured in output_artifact_paths.

Could you try to do it like that?

@Ark-kun Ark-kun self-assigned this Mar 20, 2019
@Ark-kun
Copy link
Contributor

Ark-kun commented Mar 20, 2019

/ok-to-test

@Tomcli
Copy link
Member Author

Tomcli commented Mar 20, 2019

Hi @Ark-kun, I updated the implementation to take the artifact paths as containerOps arguments. Does it look good to you? Thank you for the reviews and suggestions.

@Tomcli
Copy link
Member Author

Tomcli commented Mar 20, 2019

/retest

@Tomcli
Copy link
Member Author

Tomcli commented Mar 20, 2019

Looks like Travis CI failed on getting some go packages.

@hongye-sun
Copy link
Contributor

/hold

@Ark-kun
Hold the PR as I have question on why exposing output_artifact_paths on ContainerOp constructor. It is very hard to change and deprecate in the future. Why not we just adding argo artifacts low level APIs on the ContainerOp?

@Ark-kun
Copy link
Contributor

Ark-kun commented Mar 20, 2019

Hold the PR as I have question on why exposing output_artifact_paths on ContainerOp constructor. It is very hard to change and deprecate in the future. Why not we just adding argo artifacts low level APIs on the ContainerOp?

We're exposing Argo artifacts API.

Argo artifacts API for container templates looks like this:

outputs:
  artifacts:
  - name: art1
    path: /tmp/art1.txt
  - name: art2
    path: /tmp/art2.txt

This is a mapping between the artifact names and artifact files. We're exposing this Argo artifact mapping as output_artifact_paths (again, using Argo terminology).

It is in the official Argo sample on artifact passing. https://github.com/argoproj/argo/blob/bb4520a6f65d4e8e765ce4d426befa583721c194/examples/artifact-passing.yaml#L26

We should expose this Argo functionality so that our customers can create components that produce artifacts.

@Ark-kun
Copy link
Contributor

Ark-kun commented Mar 20, 2019

Looks like Travis CI failed on getting some go packages.

I've restarted the tests.

@Ark-kun
Copy link
Contributor

Ark-kun commented Mar 29, 2019

hey I am sorry, just saw this PR.
Cuz I needed this feature too, so I made a similar PR at #1064
Might want to take a look?

I'd say that your PR is orthogonal. This PR is about specifying the list of output artifacts and their paths, while your PR is about configuring the storage (which I think we should configure more globally - per pipeline or per cluster), not per output.

@animeshsingh
Copy link
Contributor

@Ark-kun @neuromage coming back to it. This one is needed for us to be use k8sapi executor from Argo - since IBM cloud no longer supports docker, and KFP and Argo use docker executor by default
cc @vicaire

@Ark-kun
Copy link
Contributor

Ark-kun commented Apr 25, 2019

Let's merge this PR.
It's a good small PR that's been waiting for too long.

@animeshsingh
Copy link
Contributor

+1

@hongye-sun
Copy link
Contributor

Discussed with @neuromage offline. With our current thinking of metadata design, it seems that exposing argo's artifact in ContainerOp won't cause conflicts. I will unhold the PR to unblock the hardcoded uimetadata and metrics path issue.

/hold cancel

@Tomcli
Copy link
Member Author

Tomcli commented Apr 25, 2019

Thanks @hongye-sun. @Ark-kun I updated the code with the latest master branch and all the tests have passed. Is this looking good to you?

@vicaire
Copy link
Contributor

vicaire commented Apr 25, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot merged commit bb0a5e3 into kubeflow:master Apr 25, 2019
@animeshsingh
Copy link
Contributor

@neuromage @hongye-sun - how far is the metadata effort (pipeline metadata being sent to metadata tracking server) from realization? any more concrete design docs there?

@neuromage
Copy link
Contributor

@animeshsingh We're still working on the design. We will share more once things get more concrete. Thanks.

@Ark-kun Ark-kun mentioned this pull request Apr 29, 2019
Linchin pushed a commit to Linchin/pipelines that referenced this pull request Apr 11, 2023
magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this pull request Oct 22, 2023
…ubeflow#998)

This allows a more performant implementation due to the way protobuf
clearing and reuse operates in conjunction with GRPC APIs.
@Tomcli Tomcli deleted the dsl branch February 28, 2024 23:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants