Pipeline run created from python shows experiment but not pipeline (in UI). #617

Closed
TimZaman opened this issue Jan 3, 2019 · 13 comments

@TimZaman
Contributor

TimZaman commented Jan 3, 2019

See title.

GKE, Kubeflow 0.3.5, KFP 0.1.4 and 0.1.6.

Reproduce:

  1. Run the sample notebook (using an IPython notebook; sketched below): https://github.com/kubeflow/pipelines/blob/master/samples/notebooks/Lightweight%20Python%20components%20-%20basics.ipynb
  2. Click on its pipeline (screenshot).
  3. Looks good (screenshot).
  4. Pipelines view: the pipeline does not appear (screenshot).

Expected result:
I expected the pipeline to be visible in the pipelines view UI, especially because in the UI it is impossible to even create an experiment without a pipeline.
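
For reference, the notebook in step 1 boils down to roughly the following (a minimal sketch assuming the kfp 0.1.x client API; the component, experiment, and run names here are made up):

    import kfp
    import kfp.components as comp
    import kfp.compiler as compiler
    import kfp.dsl as dsl

    # A trivial lightweight Python component (illustrative only).
    def add(a: float, b: float) -> float:
        return a + b

    add_op = comp.func_to_container_op(add)

    @dsl.pipeline(name='Calculation pipeline', description='A toy lightweight-component pipeline.')
    def calc_pipeline(a='1', b='7'):
        add_task = add_op(a, b)

    # Compile and submit a run directly from Python. No pipeline package is uploaded
    # to the pipeline store, which is why only the experiment and the run show up in the UI.
    compiler.Compiler().compile(calc_pipeline, 'calc_pipeline.tar.gz')
    client = kfp.Client()
    exp = client.create_experiment(name='lightweight-sample')
    run = client.run_pipeline(exp.id, 'calc-run', 'calc_pipeline.tar.gz', params={'a': '2', 'b': '5'})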

@yebrahim
Contributor

yebrahim commented Jan 4, 2019

This was done on purpose, to allow fast iteration on pipelines using the Python SDK (modify pipeline -> create run -> modify -> create run, etc.) without polluting the pipeline list in the UI. The latter is mainly aimed at sharing pipelines, IMO. We don't have a good sharing story yet, but you can imagine that in a multi-tenant cluster you could upload pipelines there that you think may be useful for others to try out.

You can create an experiment without a pipeline, there's no dependency there. Just go to the experiments view and hit "New experiment."

/cc @IronPan and @ajayalfred for more context.
/area front-end

@TimZaman
Contributor Author

TimZaman commented Jan 4, 2019 via email

@yebrahim
Contributor

yebrahim commented Jan 4, 2019

Well, the API supports that; I'm not sure why I don't see it in the Python SDK either. Ideally the SDK should be auto-generated from the swagger files so that it aligns with the protos and the other clients. @gaoning777 is this not the case?

@IronPan
Member

IronPan commented Jan 4, 2019

The Python SDK currently doesn't support this, but it won't be hard to add. It just needs to auto-generate the pipeline API client as part of its build script here:
https://github.com/kubeflow/pipelines/blob/master/sdk/python/build.sh#L39

Then you should be able to do import kfp_pipeline and call the API to upload a pipeline from Python.
@qimingj @gaoning777 Is that a reasonable change?
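
For illustration, using such a generated client might look like the following (a sketch only; the kfp_pipeline package, class, and method names are assumptions modeled on the generated experiment client shown later in this thread):

    import kfp_pipeline  # hypothetical name for the swagger-generated pipeline client

    config = kfp_pipeline.configuration.Configuration()
    config.host = 'ml-pipeline.kubeflow.svc.cluster.local:8888'  # assumed in-cluster endpoint
    api_client = kfp_pipeline.api_client.ApiClient(config)
    upload_api = kfp_pipeline.api.pipeline_upload_service_api.PipelineUploadServiceApi(api_client)

    # Upload a compiled pipeline package so it shows up in the Pipelines view.
    response = upload_api.upload_pipeline('calc_pipeline.tar.gz')
    print(response.id, response.name)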

@gaoning777
Contributor

Good call. It is part of the Python SDK plan to expose the upload pipelines API.
It should be pretty straightforward and we only need to wrap the auto-generated swagger library and expose the API in a more friendly way.
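
A sketch of what that friendlier surface could look like (hypothetical; an upload_pipeline method does not exist on the wrapper client yet, so its name and signature are assumptions):

    import kfp

    client = kfp.Client()
    pipeline = client.upload_pipeline('calc_pipeline.tar.gz')  # hypothetical wrapper method
    print(pipeline.id, pipeline.name)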

@IronPan
Member

IronPan commented Jan 4, 2019

As a side note, I don't think we should build thick syntactic-sugar wrappers on top of the API just to make the API calls easier:

  • It makes the experience across clients inconsistent. A function supported by the Python client is not supported by the CLI or UI, which builds up UX gaps across clients.
  • It duplicates enhancement effort. If an enhancement is done at the API level, it is immediately available to the other clients.
    E.g. there is a get-experiment wrapper here that can look up by ID or name; this could well be a utility API supported by the backend (expanded in the sketch just below):
    def get_experiment(self, experiment_id=None, experiment_name=None):
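
Expanded for illustration, that wrapper presumably does something like the following client-side scan (a sketch, not the actual SDK code; the generated API attribute and method names are assumptions):

    def get_experiment(self, experiment_id=None, experiment_name=None):
        """Get an experiment by ID or, failing that, by scanning every page for a matching name."""
        if experiment_id is not None:
            return self._experiment_api.get_experiment(id=experiment_id)
        # Name lookup: page through all experiments client-side and return the first match.
        # A backend filter (conceptually a single indexed SELECT ... WHERE name = ?) would be
        # cheaper, and would not silently break if experiment names ever stop being unique.
        page_token = ''
        while True:
            response = self._experiment_api.list_experiment(page_token=page_token, page_size=100)
            for experiment in (response.experiments or []):
                if experiment.name == experiment_name:
                    return experiment
            page_token = response.next_page_token
            if not page_token:
                raise ValueError('No experiment found with name %s' % experiment_name)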

Any thoughts? @qimingj @gaoning777 @Ark-kun @yebrahim

@yebrahim
Contributor

yebrahim commented Jan 4, 2019

+1 to these points. There's also the added maintenance effort whenever the underlying APIs change.

IMO, APIs should aim to be as user-friendly as possible. If clients are spending significant effort improving the experience around calling the APIs, that's a signal to improve the APIs themselves. We're seeing the same problem with the UI, which has to make many calls to gather the information the user needs; that too needs improvement at the API level.

If you still think the Python SDK needs to have a friendlier API, let's track that separately. The first step is to give the Python client parity with other clients.

@qimingj
Contributor

qimingj commented Jan 4, 2019

I believe our REST APIs are friendly enough. The problem is the Python code generated by the swagger tool. For example, to create an experiment using the swagger-generated lib, users need to do:

    import kfp_experiment
    from kfp import Client  # only for the in-cluster DNS default

    config = kfp_experiment.configuration.Configuration()
    config.host = host if host else Client.IN_CLUSTER_DNS_NAME
    api_client = kfp_experiment.api_client.ApiClient(config)
    experiment_api = kfp_experiment.api.experiment_service_api.ExperimentServiceApi(api_client)
    exp = kfp_experiment.models.ApiExperiment(name=EXPERIMENT_NAME)
    response = experiment_api.create_experiment(body=exp)
    return response.id

With the wrapper client, users do:

    import kfp

    client = kfp.Client()
    exp = client.create_experiment(name=EXPERIMENT_NAME)

In the wrapper API we also make it more friendly for notebook users (we return clickable URLs).

In general, the Python API generated by swagger has the following problems:

  1. It creates separate packages and namespaces for run, experiment, job, and upload. They cannot share a root namespace (see the import sketch after this list).
  2. The "Api" and "Model" design works well for arbitrary APIs, but it is less friendly (see the example above).
  3. It lacks notebook support (returning URLs, and perhaps HTML in the future).
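
To illustrate point 1, today each generated client lives in its own top-level package, whereas the wrapper offers a single entry point (the generated package names below are assumptions based on the snippets above):

    # Generated clients: one top-level package per service, no shared root namespace.
    import kfp_experiment   # experiments (as in the snippet above)
    import kfp_run          # runs (assumed name)
    import kfp_pipeline     # pipeline upload (assumed name)

    # Wrapper: a single namespace.
    import kfp
    client = kfp.Client()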

I understand the extra effort of maintaining the client API. But I think it is well justified.

@IronPan
Member

IronPan commented Jan 5, 2019

I agree with your points, and I think a thin wrapper is OK (and necessary in this case). My concern is with a thick wrapper that provides conveniences that should be optimized in the API itself.

As in the example I gave above, getting an experiment by name should be supported at the API level:

  • In the code we loop through all pages and pick the one that matches. This could instead be an efficient SQL SELECT on the backend.
  • If we later decide experiment names are no longer unique, this code breaks silently.

Hope this makes the point clearer.

@qimingj
Contributor

qimingj commented Jan 5, 2019

Ah, with that I totally agree. I think this is the only case where we build thick logic inside the client library. Once that is supported in the backend, I'll be happy to move over.

@IronPan
Member

IronPan commented Jan 9, 2019

@TimZaman with PR #634 you should be able to access all APIs from Python. It should be available in the latest release, v0.1.7.

@yebrahim
Contributor

Closing this one then.

@TimZaman
Contributor Author

Sweet!

Linchin pushed a commit to Linchin/pipelines that referenced this issue Apr 11, 2023
* There was a syntax error so we weren't GC'ing deployments

Fix: kubeflow#616
magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this issue Oct 22, 2023
HumairAK pushed a commit to red-hat-data-services/data-science-pipelines that referenced this issue Mar 11, 2024
Resolves kubeflow#595

Signed-off-by: Christian Kadner <ckadner@us.ibm.com>