Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create TFX Example.ipynb #913

Merged
merged 1 commit into from
Mar 6, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
122 changes: 122 additions & 0 deletions samples/tfx-oss/TFX Example.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,122 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# TFX on KubeFlow Pipelines Example\n",
"\n",
"### Install TFX and KFP packages"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip3 install https://storage.googleapis.com/ml-pipeline/tfx/tfx-0.12.0rc0-py2.py3-none-any.whl \n",
"!pip3 install https://storage.googleapis.com/ml-pipeline/release/0.1.10/kfp.tar.gz --upgrade\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get the TFX repo with sample pipeline\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!git clone https://github.com/tensorflow/tfx"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# copy the trainer code to a storage bucket \n",
"!gsutil cp tfx/examples/chicago_taxi_pipeline/taxi_utils.py gs://<my bucket>/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure the TFX pipeline example\n",
"\n",
"Reload this cell by running the load command to get the pipeline configuration file\n",
"```\n",
"%load tfx/examples/chicago_taxi_pipeline/taxi_pipeline_kubeflow.py\n",
"```\n",
"\n",
"Configure:\n",
"- GCS storage bucket name (replace my-bucket)\n",
"- GCP project ID (replace my-gcp-project)\n",
"- Make sure the path to the taxi_utils.py is correct\n",
"\n",
"The dataset in BigQuery has 100M rows, you can change the query parameters in WHERE clause to limit the number of rows used.\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"%load tfx/examples/chicago_taxi_pipeline/taxi_pipeline_kubeflow.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Compile the pipeline and submit a run to the Kubeflow cluster"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get or create a new experiment\n",
"import kfp\n",
"client = kfp.Client()\n",
"experiment = client.create_experiment(\"TFX Examples\")\n",
"pipeline_filename = \"chicago_taxi_pipeline_kubeflow.tar.gz\"\n",
"\n",
"#Submit a pipeline run\n",
"run_name = 'Run 1'\n",
"run_result = client.run_pipeline(experiment.id, run_name, pipeline_filename, {})\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}