Create README.md #909

Merged
paveldournov merged 1 commit into master from paveldournov-tfx-readme-0
Mar 5, 2019
Conversation

@paveldournov paveldournov commented Mar 5, 2019

TFX OSS instructions for running Taxi example.


Ark-kun commented Mar 5, 2019

/lgtm

Install TFX and Kubeflow Pipelines SDK
```
!pip3 install https://storage.googleapis.com/ml-pipeline/tfx/tfx-0.12.0rc0-py2.py3-none-any.whl
!pip3 install https://storage.googleapis.com/ml-pipeline/release/0.1.10/kfp.tar.gz --upgrade
```

JFYI: the latest version is 0.1.11, and 0.1.12 will have improved experiment creation.


Ark-kun commented Mar 5, 2019

/approve


@neuromage neuromage left a comment


/lgtm

```
conda create -n tfx-kfp pip python=3.5.3
```
then activate the environment.

Maybe we can add how to activate:

conda activate tfx-kfp

- GCS storage bucket name (replace "my-bucket")
- GCP project ID (replace "my-gcp-project")
- Make sure the path to the taxi_utils.py is correct
- Set the limit on the BigQuery query. The original dataset has 100M rows, which can take time to process. Set it to 20000 to run a sample test.

Right now, I changed it to use RAND() < 0.01. So we can say:

"Change the sampling rate, or alternately, replace it with a LIMIT clause to process a smaller dataset. We recommend using at least 20000 rows in your sample."

or something like this.
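
To make the two options in this suggestion concrete, here is a minimal sketch of how the query could be built with either a sampling rate or a LIMIT clause. The helper name `build_taxi_query`, the `SELECT *` column list, and the table name are assumptions for illustration only; they are not taken from the README or the pipeline code under review.

```python
# Illustrative sketch only. The table name, the SELECT * column list, and the
# helper name are assumptions, not taken from the README under review.
TAXI_TABLE = "bigquery-public-data.chicago_taxi_trips.taxi_trips"  # assumed source table


def build_taxi_query(sample_rate=None, row_limit=None, table=TAXI_TABLE):
    """Return a BigQuery Standard SQL query that reads a subset of the table.

    Exactly one of sample_rate (a fraction in (0, 1]) or row_limit
    (a positive integer) must be given.
    """
    if (sample_rate is None) == (row_limit is None):
        raise ValueError("Pass exactly one of sample_rate or row_limit")

    query = "SELECT * FROM `{}`".format(table)

    if sample_rate is not None:
        if not 0 < sample_rate <= 1:
            raise ValueError("sample_rate must be in (0, 1]")
        # Random sampling: each row is kept with probability sample_rate.
        return query + " WHERE RAND() < {}".format(sample_rate)

    if row_limit <= 0:
        raise ValueError("row_limit must be positive")
    # Fixed-size subset: returns at most row_limit rows, not randomly chosen.
    return query + " LIMIT {}".format(int(row_limit))


if __name__ == "__main__":
    print(build_taxi_query(sample_rate=0.01))
    print(build_taxi_query(row_limit=20000))
```

With the 100M-row dataset mentioned in the README text, `RAND() < 0.01` keeps roughly 1M rows in expectation, and the exact count varies from run to run. A `LIMIT` clause gives a fixed upper bound on the row count, but without an `ORDER BY` the rows returned are arbitrary rather than a random sample.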

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@paveldournov paveldournov merged commit f492baa into master Mar 5, 2019
cheyang pushed a commit to alibaba/pipelines that referenced this pull request Mar 28, 2019
TFX OSS instructions for running Taxi example.
@IronPan IronPan deleted the paveldournov-tfx-readme-0 branch June 28, 2019 18:48
Linchin pushed a commit to Linchin/pipelines that referenced this pull request Apr 11, 2023
HumairAK pushed a commit to red-hat-data-services/data-science-pipelines that referenced this pull request Mar 11, 2024
4 participants