
Commit

[Doc] Change sample/component/sdk documentation to not use `use_gcp_secret` (kubeflow#2782)

* Update use_gcp_secret documentation to point to the Authenticating Pipelines to GCP doc

* Update Local Development Quickstart.ipynb
Bobgy authored and rui5i committed Jan 16, 2020
1 parent 4c2682e commit cccbb9b
Showing 32 changed files with 117 additions and 264 deletions.
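Every file below gets the same treatment: the per-task `.apply(gcp.use_gcp_secret('user-gcp-sa'))` call and the `import kfp.gcp as gcp` line are dropped, and the docs instead point at cluster-level authentication. A minimal sketch of the before/after pattern (the component URL, pipeline name, and argument values are illustrative, not taken from any one file):

```python
import kfp.dsl as dsl
import kfp.gcp as gcp  # only needed for the old, secret-based style
import kfp.components as comp

# Illustrative component; each doc below pins its own component.yaml version.
bigquery_query_op = comp.load_component_from_url(
    'https://raw.githubusercontent.com/kubeflow/pipelines/master/'
    'components/gcp/bigquery/query/component.yaml')

@dsl.pipeline(name='Auth example', description='use_gcp_secret vs. cluster-level auth')
def auth_example_pipeline(query='SELECT 1', project_id='my-project'):
    # Old style (removed throughout this commit): mount the 'user-gcp-sa'
    # Kubernetes secret on the task.
    # bigquery_query_op(query=query, project_id=project_id).apply(
    #     gcp.use_gcp_secret('user-gcp-sa'))

    # New style: no per-task secret; the cluster supplies credentials
    # (see https://www.kubeflow.org/docs/gke/authentication-pipelines/).
    bigquery_query_op(query=query, project_id=project_id)
```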
9 changes: 2 additions & 7 deletions components/gcp/bigquery/query/README.md
@@ -52,11 +52,7 @@ output_gcs_path | The path to the Cloud Storage bucket containing the query outp
To use the component, the following requirements must be met:

* The BigQuery API is enabled.
* The component is running under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow Pipeline cluster. For example:

```
bigquery_query_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
* The Kubeflow user service account is a member of the `roles/bigquery.admin` role of the project.
* The Kubeflow user service account is a member of the `roles/storage.objectCreator` role of the Cloud Storage output bucket.

@@ -125,7 +121,6 @@ OUTPUT_PATH = '{}/bigquery/query/questions.csv'.format(GCS_WORKING_DIR)

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Bigquery query pipeline',
@@ -147,7 +142,7 @@ def pipeline(
table_id=table_id,
output_gcs_path=output_gcs_path,
dataset_location=dataset_location,
job_config=job_config).apply(gcp.use_gcp_secret('user-gcp-sa'))
job_config=job_config)
```

#### Compile the pipeline
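The compile-and-run cells of this README are collapsed in this view. For reference, the usual KFP v1 pattern for the `pipeline` function defined above looks roughly like the sketch below; the package filename and experiment name are illustrative.

```python
import kfp
import kfp.compiler as compiler

pipeline_filename = 'bigquery_query_pipeline.zip'  # illustrative package name
compiler.Compiler().compile(pipeline, pipeline_filename)

# Submit the compiled package to a Kubeflow Pipelines cluster.
client = kfp.Client()
experiment = client.create_experiment('Bigquery - Query')  # illustrative experiment name
arguments = {}  # run with the pipeline's default parameter values
client.run_pipeline(experiment.id, 'bigquery query run', pipeline_filename, arguments)
```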
11 changes: 3 additions & 8 deletions components/gcp/bigquery/query/sample.ipynb
@@ -57,11 +57,7 @@
"To use the component, the following requirements must be met:\n",
"\n",
"* The BigQuery API is enabled.\n",
"* The component is running under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow Pipeline cluster. For example:\n",
"\n",
" ```\n",
" bigquery_query_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
" ```\n",
"* The component can authenticate to use GCP APIs. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"* The Kubeflow user service account is a member of the `roles/bigquery.admin` role of the project.\n",
"* The Kubeflow user service account is a member of the `roles/storage.objectCreator `role of the Cloud Storage output bucket.\n",
"\n",
@@ -179,7 +175,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Bigquery query pipeline',\n",
@@ -201,7 +196,7 @@
" table_id=table_id, \n",
" output_gcs_path=output_gcs_path, \n",
" dataset_location=dataset_location, \n",
" job_config=job_config).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" job_config=job_config)"
]
},
{
@@ -301,4 +296,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
16 changes: 6 additions & 10 deletions components/gcp/dataflow/launch_python/README.md
@@ -63,14 +63,11 @@ job_id | The ID of the Cloud Dataflow job that is created.
## Cautions & requirements
To use the components, the following requirements must be met:
- Cloud Dataflow API is enabled.
- The component is running under a secret Kubeflow user service account in a Kubeflow Pipelines cluster. For example:
```
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
The Kubeflow user service account is a member of:
- `roles/dataflow.developer` role of the project.
- `roles/storage.objectViewer` role of the Cloud Storage Objects `python_file_path` and `requirements_file_path`.
- `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`.
- The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
- The Kubeflow user service account is a member of:
- `roles/dataflow.developer` role of the project.
- `roles/storage.objectViewer` role of the Cloud Storage Objects `python_file_path` and `requirements_file_path`.
- `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`.

## Detailed description
The component does several things during the execution:
@@ -221,7 +218,6 @@ OUTPUT_FILE = '{}/wc/wordcount.out'.format(GCS_STAGING_DIR)

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataflow launch python pipeline',
@@ -243,7 +239,7 @@ def pipeline(
staging_dir = staging_dir,
requirements_file_path = requirements_file_path,
args = args,
wait_interval = wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))
wait_interval = wait_interval)
```
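`json` is imported in the snippet above because list-valued component inputs such as `args` are passed as JSON strings; the collapsed cells build them roughly like this sketch (the GCS paths are placeholders):

```python
import json

# Placeholder staging location; the sample derives it from a GCS working directory.
output_file = 'gs://my-bucket/staging/wc/wordcount.out'

# Arguments forwarded to the Apache Beam script that the component launches.
args = json.dumps(['--output', output_file])
```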

#### Compile the pipeline
18 changes: 7 additions & 11 deletions components/gcp/dataflow/launch_python/sample.ipynb
@@ -47,14 +47,11 @@
"## Cautions & requirements\n",
"To use the components, the following requirements must be met:\n",
"- Cloud Dataflow API is enabled.\n",
"- The component is running under a secret Kubeflow user service account in a Kubeflow Pipeline cluster. For example:\n",
"```\n",
"component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
"```\n",
"The Kubeflow user service account is a member of:\n",
"- `roles/dataflow.developer` role of the project.\n",
"- `roles/storage.objectViewer` role of the Cloud Storage Objects `python_file_path` and `requirements_file_path`.\n",
"- `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`. \n",
"- The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"- The Kubeflow user service account is a member of:\n",
" - `roles/dataflow.developer` role of the project.\n",
" - `roles/storage.objectViewer` role of the Cloud Storage Objects `python_file_path` and `requirements_file_path`.\n",
" - `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`. \n",
"\n",
"## Detailed description\n",
"The component does several things during the execution:\n",
@@ -295,7 +292,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Dataflow launch python pipeline',\n",
@@ -317,7 +313,7 @@
" staging_dir = staging_dir, \n",
" requirements_file_path = requirements_file_path, \n",
" args = args,\n",
" wait_interval = wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" wait_interval = wait_interval)"
]
},
{
@@ -417,4 +413,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
10 changes: 3 additions & 7 deletions components/gcp/dataflow/launch_template/README.md
@@ -37,11 +37,8 @@ job_id | The id of the Cloud Dataflow job that is created.

To use the component, the following requirements must be met:
- Cloud Dataflow API is enabled.
- The component is running under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow Pipeline cluster. For example:
```
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The Kubeflow user service account is a member of:
- The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
- The Kubeflow user service account is a member of:
- `roles/dataflow.developer` role of the project.
    - `roles/storage.objectViewer` role of the Cloud Storage Object `gcs_path`.
    - `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`.
@@ -102,7 +99,6 @@ OUTPUT_PATH = '{}/out/wc'.format(GCS_WORKING_DIR)

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataflow launch template pipeline',
@@ -128,7 +124,7 @@ def pipeline(
location = location,
validate_only = validate_only,
staging_dir = staging_dir,
wait_interval = wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))
        wait_interval = wait_interval)
```
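The template location and its runtime parameters are set in the collapsed cells. They are passed as a GCS path plus a JSON-encoded launch body; a sketch under the assumption that the component's `launch_parameters` input mirrors the Dataflow `templates.launch` request (all values are placeholders):

```python
import json

# Placeholder: a Google-provided template published in GCS.
gcs_path = 'gs://dataflow-templates/latest/Word_Count'

# LaunchTemplateParameters-style body, serialized for the component's
# launch_parameters input (assumed name).
launch_parameters = json.dumps({
    'parameters': {
        'output': 'gs://my-bucket/out/wc'  # placeholder output prefix
    }
})
```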

#### Compile the pipeline
12 changes: 4 additions & 8 deletions components/gcp/dataflow/launch_template/sample.ipynb
@@ -42,11 +42,8 @@
"\n",
"To use the component, the following requirements must be met:\n",
"- Cloud Dataflow API is enabled.\n",
"- The component is running under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow Pipeline cluster. For example:\n",
" ```\n",
" component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
" ```\n",
"* The Kubeflow user service account is a member of:\n",
"- The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"- The Kubeflow user service account is a member of:\n",
" - `roles/dataflow.developer` role of the project.\n",
" - `roles/storage.objectViewer` role of the Cloud Storage Object `gcs_path.`\n",
" - `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir.` \n",
@@ -155,7 +152,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Dataflow launch template pipeline',\n",
@@ -181,7 +177,7 @@
" location = location, \n",
" validate_only = validate_only,\n",
" staging_dir = staging_dir,\n",
" wait_interval = wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" wait_interval = wait_interval)"
]
},
{
@@ -282,4 +278,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
9 changes: 2 additions & 7 deletions components/gcp/dataproc/create_cluster/README.md
@@ -62,11 +62,7 @@ Note: You can recycle the cluster by using the [Dataproc delete cluster componen

To use the component, you must:
* Set up the GCP project by following these [steps](https://cloud.google.com/dataproc/docs/guides/setup-project).
* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:

```
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
* Grant the following types of access to the Kubeflow user service account:
* Read access to the Cloud Storage buckets which contain the initialization action files.
* The role, `roles/dataproc.editor`, on the project.
@@ -114,7 +110,6 @@ EXPERIMENT_NAME = 'Dataproc - Create Cluster'

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataproc create cluster pipeline',
@@ -140,7 +135,7 @@ def dataproc_create_cluster_pipeline(
config_bucket=config_bucket,
image_version=image_version,
cluster=cluster,
wait_interval=wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))
wait_interval=wait_interval)
```
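As the note at the top of this README says, the created cluster can be recycled with the delete-cluster component. A sketch of wiring the two together in one pipeline; the component URLs and the `cluster_name` output name are assumptions:

```python
import kfp.dsl as dsl
import kfp.components as comp

# Illustrative component URLs; the docs pin specific released versions.
create_cluster_op = comp.load_component_from_url(
    'https://raw.githubusercontent.com/kubeflow/pipelines/master/'
    'components/gcp/dataproc/create_cluster/component.yaml')
delete_cluster_op = comp.load_component_from_url(
    'https://raw.githubusercontent.com/kubeflow/pipelines/master/'
    'components/gcp/dataproc/delete_cluster/component.yaml')

@dsl.pipeline(name='Create and recycle cluster',
              description='Illustrative create + delete wiring')
def create_and_delete_pipeline(project_id='my-project', region='us-central1'):
    create_task = create_cluster_op(project_id=project_id, region=region)
    # Tear the cluster down afterwards; 'cluster_name' is the assumed output name.
    delete_cluster_op(project_id=project_id, region=region,
                      name=create_task.outputs['cluster_name'])
```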

#### Compile the pipeline
11 changes: 3 additions & 8 deletions components/gcp/dataproc/create_cluster/sample.ipynb
@@ -46,11 +46,7 @@
"\n",
"To use the component, you must:\n",
"* Set up the GCP project by following these [steps](https://cloud.google.com/dataproc/docs/guides/setup-project).\n",
"* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:\n",
"\n",
" ```\n",
" component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
" ```\n",
"* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"* Grant the following types of access to the Kubeflow user service account:\n",
" * Read access to the Cloud Storage buckets which contains initialization action files.\n",
" * The role, `roles/dataproc.editor` on the project.\n",
@@ -137,7 +133,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Dataproc create cluster pipeline',\n",
@@ -163,7 +158,7 @@
" config_bucket=config_bucket, \n",
" image_version=image_version, \n",
" cluster=cluster, \n",
" wait_interval=wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" wait_interval=wait_interval)"
]
},
{
@@ -248,4 +243,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
9 changes: 2 additions & 7 deletions components/gcp/dataproc/delete_cluster/README.md
@@ -43,11 +43,7 @@ ML workflow:
## Cautions & requirements
To use the component, you must:
* Set up a GCP project by following this [guide](https://cloud.google.com/dataproc/docs/guides/setup-project).
* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:

```
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
* Grant the Kubeflow user service account the role, `roles/dataproc.editor`, on the project.

## Detailed description
@@ -98,7 +94,6 @@ EXPERIMENT_NAME = 'Dataproc - Delete Cluster'

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataproc delete cluster pipeline',
@@ -112,7 +107,7 @@ def dataproc_delete_cluster_pipeline(
dataproc_delete_cluster_op(
project_id=project_id,
region=region,
name=name).apply(gcp.use_gcp_secret('user-gcp-sa'))
name=name)
```

#### Compile the pipeline
9 changes: 2 additions & 7 deletions components/gcp/dataproc/delete_cluster/sample.ipynb
@@ -33,11 +33,7 @@
"## Cautions & requirements\n",
"To use the component, you must:\n",
"* Set up a GCP project by following this [guide](https://cloud.google.com/dataproc/docs/guides/setup-project).\n",
"* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:\n",
"\n",
" ```\n",
" component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
" ```\n",
"* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"* Grant the Kubeflow user service account the role `roles/dataproc.editor` on the project.\n",
"\n",
"## Detailed description\n",
@@ -125,7 +121,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Dataproc delete cluster pipeline',\n",
@@ -139,7 +134,7 @@
" dataproc_delete_cluster_op(\n",
" project_id=project_id, \n",
" region=region, \n",
" name=name).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" name=name)"
]
},
{
9 changes: 2 additions & 7 deletions components/gcp/dataproc/submit_hadoop_job/README.md
@@ -60,11 +60,7 @@ job_id | The ID of the created job. | String
To use the component, you must:
* Set up a GCP project by following this [guide](https://cloud.google.com/dataproc/docs/guides/setup-project).
* [Create a new cluster](https://cloud.google.com/dataproc/docs/guides/create-cluster).
* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:

```python
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
* Grant the Kubeflow user service account the role, `roles/dataproc.editor`, on the project.

## Detailed description
@@ -135,7 +131,6 @@ Caution: This will remove all blob files under `OUTPUT_GCS_PATH`.

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataproc submit Hadoop job pipeline',
@@ -164,7 +159,7 @@ def dataproc_submit_hadoop_job_pipeline(
args=args,
hadoop_job=hadoop_job,
job=job,
wait_interval=wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))
wait_interval=wait_interval)
```
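The job configuration itself is collapsed here. As a typical setup, the stock Hadoop WordCount example can be configured roughly like the sketch below; all values are placeholders, and the examples jar is preinstalled on Dataproc nodes, so `main_jar_file_uri` can usually stay empty.

```python
import json

# Main class from the Hadoop examples jar available on Dataproc workers,
# so no jar URI needs to be supplied.
main_class = 'org.apache.hadoop.examples.WordCount'

# WordCount reads the input text and writes word counts under the output prefix.
input_gcs_path = 'gs://my-bucket/dataproc/wordcount/input.txt'   # placeholder
output_gcs_path = 'gs://my-bucket/dataproc/wordcount/out'        # placeholder
args = json.dumps([input_gcs_path, output_gcs_path])
```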

#### Compile the pipeline
