Create TFX Example.ipynb #913

Merged
paveldournov merged 1 commit into master from paveldournov-notebook-2
Mar 6, 2019

@paveldournov paveldournov commented Mar 5, 2019

Adding a notebook for running TFX Example from https://github.com/tensorflow/tfx/tree/master/examples/chicago_taxi_pipeline


@paveldournov paveldournov requested review from neuromage and removed request for gaoning777 March 5, 2019 22:26
@Ark-kun

Ark-kun commented Mar 5, 2019

/lgtm
/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@neuromage

/lgtm

@paveldournov paveldournov merged commit 1a4922c into master Mar 6, 2019
@Ark-kun

Ark-kun commented Mar 6, 2019

Issues when running:

# copy the trainer code to a storage bucket 
!gsutil cp tfx/examples/chicago_taxi_pipeline/taxi_utils.py gs://avolkov/temp/tfx/examples/chicago_taxi_pipeline/taxi_utils.py

Copying file://tfx/examples/chicago_taxi_pipeline/taxi_utils.py [Content-Type=text/x-python]...
AccessDeniedException: 403 Insufficient Permission

This code works:

from tensorflow import gfile
gfile.Copy('tfx/examples/chicago_taxi_pipeline/taxi_utils.py', 'gs://avolkov/temp/tfx/examples/chicago_taxi_pipeline/taxi_utils.py')

@Ark-kun

Ark-kun commented Mar 6, 2019

GCS storage bucket name (replace my-bucket)

_input_bucket = 'gs://my-bucket'

It is not obvious which bucket should go here. Nothing connects this variable to the earlier gsutil step, which is itself unexplained.

@Ark-kun

Ark-kun commented Mar 6, 2019

To avoid an exception when the pipeline is submitted a second time, replace the code with the following:

# Get the existing experiment, or create it on the first run
import kfp
client = kfp.Client()
experiment_name = 'TFX Examples'
try:
    experiment_id = client.get_experiment(experiment_name=experiment_name).id
except Exception:  # lookup failed, e.g. the experiment does not exist yet
    experiment_id = client.create_experiment(experiment_name).id

pipeline_filename = 'chicago_taxi_pipeline_kubeflow.tar.gz'

# Submit a pipeline run
run_name = 'Run 1'
run_result = client.run_pipeline(experiment_id, run_name, pipeline_filename, {})

@Ark-kun

Ark-kun commented Mar 6, 2019

Neither the README.md nor the notebook states that the Dataflow API must be enabled for the project.
It would be better to link to the page for enabling Dataflow: https://console.developers.google.com/apis/api/dataflow.googleapis.com/overview

apitools.base.py.exceptions.HttpForbiddenError: HttpError accessing <https://dataflow.googleapis.com/v1b3/projects/avolkov/locations/us-central1/jobs?alt=json>: response: <{'status': '403', 'content-length': '755', 'x-xss-protection': '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF', '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Wed, 06 Mar 2019 02:11:09 GMT', 'x-frame-options': 'SAMEORIGIN', 'content-type': 'application/json; charset=UTF-8'}>, content <{
  "error": {
    "code": 403,
    "message": "Dataflow API has not been used in project 140626129697 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/dataflow.googleapis.com/overview?project=140626129697 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.",
    "status": "PERMISSION_DENIED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Google developers console API activation",
            "url": "https://console.developers.google.com/apis/api/dataflow.googleapis.com/overview?project=140626129697"
          }
        ]
      }
    ]
  }
}
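
The error body already carries the activation link in its `google.rpc.Help` detail, so a notebook could surface that link directly instead of the raw traceback. A minimal sketch, assuming the JSON that follows `content <` has been captured as a string. `activation_links` is a hypothetical helper, not part of apitools or the KFP SDK.

```python
import json

HELP_TYPE = 'type.googleapis.com/google.rpc.Help'


def activation_links(error_body):
    """Return the help URLs listed in a Google API error body (JSON text)."""
    error = json.loads(error_body).get('error', {})
    links = []
    for detail in error.get('details', []):
        # Only Help details carry links; other detail types are skipped.
        if detail.get('@type') == HELP_TYPE:
            for link in detail.get('links', []):
                links.append(link['url'].strip())
    return links
```

An empty list means the error carries no help link, which is the case for the permission error in the next comment.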

@Ark-kun

Ark-kun commented Mar 6, 2019

The user also needs to configure access. Here the account lacks the dataflow.jobs.create permission on the project:

apitools.base.py.exceptions.HttpForbiddenError: HttpError accessing <https://dataflow.googleapis.com/v1b3/projects/avolkov/locations/us-central1/jobs?alt=json>: response: <{'status': '403', 'content-length': '280', 'x-xss-protection': '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF', '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Wed, 06 Mar 2019 02:14:07 GMT', 'x-frame-options': 'SAMEORIGIN', 'content-type': 'application/json; charset=UTF-8'}>, content <{
  "error": {
    "code": 403,
    "message": "(25bb56c762755b0d): Could not create workflow; user does not have write access to project: avolkov Causes: (25bb56c7627555d2): Permission 'dataflow.jobs.create' denied on project: 'avolkov'",
    "status": "PERMISSION_DENIED"
  }
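
Both failures are HTTP 403, so the status code alone does not tell the user what to fix. As a rough sketch, the denied permission and project can be pulled out of the message text. This assumes the wording matches the log above, and `denied_permission` is a hypothetical helper.

```python
import re

# Matches the "Permission '<name>' denied on project: '<project>'" fragment
# seen in the Dataflow error message above.
_PERMISSION_RE = re.compile(r"Permission '([^']+)' denied on project: '([^']+)'")


def denied_permission(message):
    """Return (permission, project) from an error message, or None if absent."""
    match = _PERMISSION_RE.search(message)
    return match.groups() if match else None
```

A `None` result suggests a different cause, such as the disabled API from the previous comment.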

@Ark-kun

Ark-kun commented Mar 6, 2019

We need to highlight that the user must specify the project ID (which they don't usually see) and not the project name.
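
A notebook cell could catch the most obvious mix-ups before anything is submitted. The sketch below checks a value against the documented project ID format (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen). `looks_like_project_id` is a hypothetical helper, and it is only a format check: a display name that happens to fit the format will still pass.

```python
import re

# Project ID format: 6-30 chars, lowercase letters, digits, hyphens;
# must start with a letter and must not end with a hyphen.
_PROJECT_ID_RE = re.compile(r'[a-z][a-z0-9-]{4,28}[a-z0-9]')


def looks_like_project_id(value):
    """Return True if value has the shape of a GCP project ID."""
    return _PROJECT_ID_RE.fullmatch(value) is not None
```

For example, a display name such as `My Project` or a project number such as `140626129697` is rejected, while `avolkov` is accepted.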

Ark-kun added a commit to Ark-kun/pipelines that referenced this pull request Mar 6, 2019
Addressed the issues discovered here: kubeflow#913
@paveldournov
Contributor Author

@Ark-kun can you please go ahead and submit a PR with these fixes?

Ark-kun added a commit to Ark-kun/pipelines that referenced this pull request Mar 6, 2019
Addressed the issues described here: kubeflow#913
k8s-ci-robot pushed a commit that referenced this pull request Mar 7, 2019
* Improved the TFX OSS notebook and README
Addressed the issues described here: #913

* Addressed the PR feedback.
@gaoning777

Please add a sample test entry to verify the sample.

cheyang pushed a commit to alibaba/pipelines that referenced this pull request Mar 28, 2019
* Improved the TFX OSS notebook and README
Addressed the issues described here: kubeflow#913

* Addressed the PR feedback.
@IronPan IronPan deleted the paveldournov-notebook-2 branch June 28, 2019 18:48
Linchin pushed a commit to Linchin/pipelines that referenced this pull request Apr 11, 2023
Signed-off-by: Theofilos Papapanagiotou <theofilos@gmail.com>