Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile samples instead of hard code them in API server #76

Merged
merged 15 commits into from
Nov 6, 2018

Conversation

IronPan
Copy link
Member

@IronPan IronPan commented Nov 6, 2018

Please ignore the sample names in the sample configs since they are still TBD.

This change is Reviewable

@IronPan
Copy link
Member Author

IronPan commented Nov 6, 2018

/assign @qimingj @vicaire

@@ -14,6 +14,15 @@ COPY . .
RUN apk add --update gcc musl-dev
RUN go build -o /bin/apiserver backend/src/apiserver/*.go

FROM python:3.5.0-slim as compiler

# This is hard coded to 0.0.26. Once kfp DSK release process is automated,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already in (#70). Give it a try?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the test need to update to adopt that. it's kind of late for the last minute change. I'll do it post beta.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

"file":"/samples/tfx/taxi-cab-classification-pipeline.py.tar.gz"
},
{
"name":"[Sample] ML - CMLE - Resnet Training",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to remove this one --- CMLE is not ready for TPU yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

RUN pip install https://storage.googleapis.com/ml-pipeline/release/0.0.26/kfp-0.0.26.tar.gz --upgrade
WORKDIR /samples
COPY ./samples .
RUN find . -maxdepth 2 -name "*.py" -exec python3 {} \;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is more robust to call dsl-compile on these py files --- Not all py files come with self-compilation code. It is not required.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -132,16 +132,26 @@ func loadSamples(resourceManager *resource.ResourceManager) {
}
var configs []config
if json.Unmarshal(configBytes, &configs) != nil {
glog.Warningf("Failed to read sample configurations. Err: %v", err.Error())
return
glog.Fatalf("Failed to read sample configurations. Err: %v", err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please return errors in this method and have the top level decide whether to abort? (i.e. let's avoid functions with side effects as it makes code hard to reuse).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
resourceManager.CreatePipeline(config.Name, config.Description, sampleBytes)
// Decompress if file is tarball
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to test whether it is a tarball or not?

Why not reusing the logic in the API that already turns a file (yaml or tarball) into a pipeline object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -14,6 +14,15 @@ COPY . .
RUN apk add --update gcc musl-dev
RUN go build -o /bin/apiserver backend/src/apiserver/*.go

FROM python:3.5.0-slim as compiler

# This is hard coded to 0.0.26. Once kfp DSK release process is automated,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

@qimingj
Copy link
Contributor

qimingj commented Nov 6, 2018

/lgtm

@@ -132,16 +138,26 @@ func loadSamples(resourceManager *resource.ResourceManager) {
}
var configs []config
if json.Unmarshal(configBytes, &configs) != nil {
glog.Warningf("Failed to read sample configurations. Err: %v", err.Error())
return
return errors.New(fmt.Sprintf("Failed to parse sample configurations."))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not returning the original error message as part of the result?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -72,7 +72,7 @@ if [ "$CLUSTER_TYPE" == "create-gke" ]; then
echo "Delete cluster..."
gcloud container clusters delete ${TEST_CLUSTER} --async
}
trap delete_cluster EXIT
# trap delete_cluster EXIT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we removing this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for debugging. revert

loadSamples(resourceManager)
err:= loadSamples(resourceManager)
if err!=nil{
glog.Fatalf("Failed to load samples. Err: %v", err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the server restarts?

Does this prevent the server from ever restarting unless the storage is deleted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the samples are baked into the image so it won't be a problem

@vicaire
Copy link
Contributor

vicaire commented Nov 6, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot removed the lgtm label Nov 6, 2018
@IronPan
Copy link
Member Author

IronPan commented Nov 6, 2018

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: IronPan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@IronPan
Copy link
Member Author

IronPan commented Nov 6, 2018

/lgtm

@k8s-ci-robot
Copy link
Contributor

@IronPan: you cannot LGTM your own PR.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

1 similar comment
@google-prow-robot
Copy link
Collaborator

@IronPan: you cannot LGTM your own PR.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@IronPan
Copy link
Member Author

IronPan commented Nov 6, 2018

/lgtm

@k8s-ci-robot
Copy link
Contributor

@IronPan: you cannot LGTM your own PR.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

1 similar comment
@google-prow-robot
Copy link
Collaborator

@IronPan: you cannot LGTM your own PR.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vicaire
Copy link
Contributor

vicaire commented Nov 6, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot merged commit 7847b74 into master Nov 6, 2018
k8s-ci-robot pushed a commit that referenced this pull request Nov 7, 2018
* Moves docs from the /samples README to the wiki

I've copied all the info from the googleprivate/samples README.md to the wiki. This PR removes most of the content from the README, adds a simple introduction, and points people to the wiki docs for further information.

Relevant wiki files:
https://github.com/kubeflow/pipelines/wiki/Deploy-the-Kubeflow-Pipelines-Service
https://github.com/kubeflow/pipelines/wiki/Build-a-Pipeline
https://github.com/kubeflow/pipelines/wiki/Build-Your-Own-Component

* SDK/Tests/Components - Improve temporary file handling (#37)

* Improve temporary file handling in python op tests

* More small temp path fixes

* Fix tfx name bug in the tfma sample test (#67)

* fix tfx name bug

* update release build for the data publish

* SDK/DSL/Compiler - Fixed compilation of dsl.Condition (#28)

* Fixed compilation of dsl.Conditional
The compiler no longer produced intermediate steps.

* Got rid of _create_new_groups

* Changed the sub_group.type check

* Fix tfx name bug in the tfma sample test (#67)

* fix tfx name bug

* update release build for the data publish

* Build Python SDK in the releasing (#70)

* publish python sdk in the cloud build

* update cloudbuild

* adjust output path

* add dependency for buildapiserver

* mlp -> kfp.dsl (#88)

* Removed instruction to clone the repo.

* Changed link to samples doc on wiki.

* Add %%docker magic to jupyter kernel. It helps submitting a docker build job more easily with one cell. (#72)

* fix miscellaneous List API issue (#90)

* fix

* Update job_store.go

* Update run_store.go

* Update presubmit-tests.sh

* Update job_store.go

* Moves docs from pipelines main README to wiki (#83)

I've copied all the info from the kubeflow/pipelines README.md to the wiki. This PR removes most of the content from the README, adds a simple introduction, and points people to the wiki docs for further information.

Relevant wiki files:
https://github.com/kubeflow/pipelines/wiki
https://github.com/kubeflow/pipelines/wiki/Deploy-the-Kubeflow-Pipelines-Service
https://github.com/kubeflow/pipelines/wiki/Build-a-Pipeline

* SDK/DSL/Compiler - Reverted fix of dsl.Condition until the UI is ready. (#94)

* sort by run display name by default (#96)

* CSS changes for nav menu and tables (#99)

* debug tfma failure (#91)

* debug tfma failure

* tft version bug

* minor fix

* comment the test validation

* Fix validation check for maximum size limit (#104)

* Fixed the Minikube tests after moving to the new repo (#98)

* Don't barf when experiment name is already used (#101)

* don't barf when experiment name is already used

* make mock backend use case insensitive names

* ExperimentList tests, use immer.js (#86)

* experiment list tests

* use produce in more places

* remove extra test

* remove extra import

* Use the experiment's resource reference in the listJobs request (#105)

* use the experiment's resource reference in the listJobs request

* fix test import

* Add Ning and Alexey to OWNERS for components, samples and sample-test (#102)

* Compile samples instead of hard code them in API server (#76)

* compile samples

* update logging

* update description

* update sample

* add immediate value sample

* revert

* fail fast if the samples are failed to load

* comment

* address comments

* comment out

* update command

* comments

* Account for padding in metric progress fill (#107)

* Account for padding in metric progress fill

* small mock backend fix

* move to css classes, add color

* changes to breadcrumb style

* increase width of summary card

* tests

* merge tests

* First integration test for the ML Pipeline CLI (Pipeline List). (#81)

* First integration test for the ML Pipeline CLI (Pipeline List).

* Fixing an issue with an undefined variable

* Adding the --debug flag to help with debugging.

* Changing the namespace to Kubeflow.

* Add tests for the NewExperiment page (#109)

* Add tests for the NewExperiment page

* Fix test name

* Remove obsolete tests

* Clean up

* add xgboost: migrate from the old repo (#46)

* migrate from the old repo

* fix bug: accidentally override tfma test

* add tfma test back

* add tfma back

* typo fix

* fix small typo

* if job fails, exit after logs are output

* Remove CMLE sample for now since we are waiting for a service fix to support TPU. (#113)

* image tag update for release (#114)

* update image tag for new releases

* add more

* delete the accidentaly added sample

* fix typo (#116)

* Fix an issue that %%docker doesn't work. (#119)

* updated favicon to monochrome color (#118)

* Expanded row changes (#120)

* updated favicon to monochrome color

* simple CSS changes to expanded row

* Removed mentions of ark7 in tests (#111)

* Add basic sample tests (#79)

* add sequential sample test

* add condition basic sample

* reuse script

* add all the other basic tests

* update sample test dockerfile to add run_basic_test file

* write test output

* typo bug
@IronPan IronPan deleted the yangpa/samples branch November 8, 2018 00:32
Linchin pushed a commit to Linchin/pipelines that referenced this pull request Apr 11, 2023
* Add ssh for minikube testing.

* We use ssh to execute code on the VM used for minikube.

* Log the file used to configure kubectl.
magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this pull request Oct 22, 2023
HumairAK referenced this pull request in red-hat-data-services/data-science-pipelines Mar 11, 2024
HumairAK referenced this pull request in red-hat-data-services/data-science-pipelines Mar 11, 2024
Include permission for get services
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants