Commit

[Doc] Change sample/component/sdk documentation to not use `use_gcp_secret` (#2782)

* Update use_gcp_secret documentation to point to authenticating pipelines to GCP doc

* Update Local Development Quickstart.ipynb
Bobgy authored and k8s-ci-robot committed Jan 9, 2020
1 parent 41a1da2 commit 589aaa9
Showing 32 changed files with 117 additions and 264 deletions.
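
For context, the change follows a single pattern across all of the files below: the old examples attached the `user-gcp-sa` secret to every GCP step with `kfp.gcp.use_gcp_secret`, while the revised examples call the component ops directly and point readers to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for cluster-level credentials (for example, Workload Identity). The sketch below illustrates that before/after pattern; the placeholder op, image, and project ID are assumptions for illustration only and are not taken from any of the changed files.

```python
import kfp.dsl as dsl
import kfp.gcp as gcp  # only needed for the legacy, secret-based line kept in the comment


def gcp_op(project_id) -> dsl.ContainerOp:
    """Placeholder for any GCP component, such as a BigQuery or Dataflow op."""
    return dsl.ContainerOp(
        name='gcp-step',
        image='google/cloud-sdk:slim',
        command=['gcloud', 'config', 'list', '--project', project_id],
    )


@dsl.pipeline(
    name='Auth example pipeline',
    description='Contrasts the old and new authentication patterns.',
)
def pipeline(project_id='my-project'):
    # Legacy pattern removed from the docs: mount the 'user-gcp-sa' secret on each step.
    # gcp_op(project_id).apply(gcp.use_gcp_secret('user-gcp-sa'))

    # Pattern the docs now assume: the cluster supplies credentials
    # (for example via Workload Identity), so the op is used unchanged.
    gcp_op(project_id)
```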
9 changes: 2 additions & 7 deletions components/gcp/bigquery/query/README.md
@@ -52,11 +52,7 @@ output_gcs_path | The path to the Cloud Storage bucket containing the query outp
To use the component, the following requirements must be met:

* The BigQuery API is enabled.
* The component is running under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow Pipeline cluster. For example:

```
bigquery_query_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
* The Kubeflow user service account is a member of the `roles/bigquery.admin` role of the project.
* The Kubeflow user service account is a member of the `roles/storage.objectCreator` role of the Cloud Storage output bucket.

@@ -125,7 +121,6 @@ OUTPUT_PATH = '{}/bigquery/query/questions.csv'.format(GCS_WORKING_DIR)

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Bigquery query pipeline',
@@ -147,7 +142,7 @@ def pipeline(
table_id=table_id,
output_gcs_path=output_gcs_path,
dataset_location=dataset_location,
job_config=job_config).apply(gcp.use_gcp_secret('user-gcp-sa'))
job_config=job_config)
```

#### Compile the pipeline
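
The body of this section is collapsed in the diff. For reference, compiling a KFP v1 pipeline function into a submittable package generally looks like the sketch below; the output filename is an assumption.

```python
import kfp.compiler as compiler

# 'pipeline' is the function decorated with @dsl.pipeline defined above.
compiler.Compiler().compile(pipeline, 'bigquery_query_pipeline.zip')
```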
11 changes: 3 additions & 8 deletions components/gcp/bigquery/query/sample.ipynb
@@ -57,11 +57,7 @@
"To use the component, the following requirements must be met:\n",
"\n",
"* The BigQuery API is enabled.\n",
"* The component is running under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow Pipeline cluster. For example:\n",
"\n",
" ```\n",
" bigquery_query_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
" ```\n",
"* The component can authenticate to use GCP APIs. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"* The Kubeflow user service account is a member of the `roles/bigquery.admin` role of the project.\n",
"* The Kubeflow user service account is a member of the `roles/storage.objectCreator `role of the Cloud Storage output bucket.\n",
"\n",
@@ -179,7 +175,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Bigquery query pipeline',\n",
@@ -201,7 +196,7 @@
" table_id=table_id, \n",
" output_gcs_path=output_gcs_path, \n",
" dataset_location=dataset_location, \n",
" job_config=job_config).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" job_config=job_config)"
]
},
{
@@ -301,4 +296,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
16 changes: 6 additions & 10 deletions components/gcp/dataflow/launch_python/README.md
@@ -63,14 +63,11 @@ job_id | The ID of the Cloud Dataflow job that is created.
## Cautions & requirements
To use the components, the following requirements must be met:
- Cloud Dataflow API is enabled.
- The component is running under a secret Kubeflow user service account in a Kubeflow Pipelines cluster. For example:
```
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
The Kubeflow user service account is a member of:
- `roles/dataflow.developer` role of the project.
- `roles/storage.objectViewer` role of the Cloud Storage Objects `python_file_path` and `requirements_file_path`.
- `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`.
- The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
- The Kubeflow user service account is a member of:
- `roles/dataflow.developer` role of the project.
- `roles/storage.objectViewer` role of the Cloud Storage Objects `python_file_path` and `requirements_file_path`.
- `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`.

## Detailed description
The component does several things during the execution:
@@ -221,7 +218,6 @@ OUTPUT_FILE = '{}/wc/wordcount.out'.format(GCS_STAGING_DIR)

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataflow launch python pipeline',
@@ -243,7 +239,7 @@ def pipeline(
staging_dir = staging_dir,
requirements_file_path = requirements_file_path,
args = args,
wait_interval = wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))
wait_interval = wait_interval)
```

#### Compile the pipeline
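
The compile and submit steps for this sample are likewise collapsed in the diff. A minimal sketch of submitting the compiled package with the KFP client is shown below; the experiment name, package path, and parameter values are assumptions.

```python
import kfp

client = kfp.Client()  # assumes in-cluster or port-forwarded access to the KFP API
experiment = client.create_experiment('dataflow-launch-python')

run = client.run_pipeline(
    experiment_id=experiment.id,
    job_name='dataflow-launch-python-run',
    pipeline_package_path='dataflow_launch_python_pipeline.zip',
    params={'project': 'my-project'},
)
```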
18 changes: 7 additions & 11 deletions components/gcp/dataflow/launch_python/sample.ipynb
@@ -47,14 +47,11 @@
"## Cautions & requirements\n",
"To use the components, the following requirements must be met:\n",
"- Cloud Dataflow API is enabled.\n",
"- The component is running under a secret Kubeflow user service account in a Kubeflow Pipeline cluster. For example:\n",
"```\n",
"component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
"```\n",
"The Kubeflow user service account is a member of:\n",
"- `roles/dataflow.developer` role of the project.\n",
"- `roles/storage.objectViewer` role of the Cloud Storage Objects `python_file_path` and `requirements_file_path`.\n",
"- `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`. \n",
"- The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"- The Kubeflow user service account is a member of:\n",
" - `roles/dataflow.developer` role of the project.\n",
" - `roles/storage.objectViewer` role of the Cloud Storage Objects `python_file_path` and `requirements_file_path`.\n",
" - `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir`. \n",
"\n",
"## Detailed description\n",
"The component does several things during the execution:\n",
@@ -295,7 +292,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Dataflow launch python pipeline',\n",
@@ -317,7 +313,7 @@
" staging_dir = staging_dir, \n",
" requirements_file_path = requirements_file_path, \n",
" args = args,\n",
" wait_interval = wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" wait_interval = wait_interval)"
]
},
{
@@ -417,4 +413,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
10 changes: 3 additions & 7 deletions components/gcp/dataflow/launch_template/README.md
@@ -37,11 +37,8 @@ job_id | The id of the Cloud Dataflow job that is created.

To use the component, the following requirements must be met:
- Cloud Dataflow API is enabled.
- The component is running under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow Pipeline cluster. For example:
```
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The Kubeflow user service account is a member of:
- The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
- The Kubeflow user service account is a member of:
- `roles/dataflow.developer` role of the project.
- `roles/storage.objectViewer` role of the Cloud Storage Object `gcs_path.`
- `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir.`
@@ -102,7 +99,6 @@ OUTPUT_PATH = '{}/out/wc'.format(GCS_WORKING_DIR)

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataflow launch template pipeline',
@@ -128,7 +124,7 @@ def pipeline(
location = location,
validate_only = validate_only,
staging_dir = staging_dir,
wait_interval = wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))
wait_interval = wait_interval)
```

#### Compile the pipeline
12 changes: 4 additions & 8 deletions components/gcp/dataflow/launch_template/sample.ipynb
@@ -42,11 +42,8 @@
"\n",
"To use the component, the following requirements must be met:\n",
"- Cloud Dataflow API is enabled.\n",
"- The component is running under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow Pipeline cluster. For example:\n",
" ```\n",
" component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
" ```\n",
"* The Kubeflow user service account is a member of:\n",
"- The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"- The Kubeflow user service account is a member of:\n",
" - `roles/dataflow.developer` role of the project.\n",
" - `roles/storage.objectViewer` role of the Cloud Storage Object `gcs_path.`\n",
" - `roles/storage.objectCreator` role of the Cloud Storage Object `staging_dir.` \n",
@@ -155,7 +152,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Dataflow launch template pipeline',\n",
@@ -181,7 +177,7 @@
" location = location, \n",
" validate_only = validate_only,\n",
" staging_dir = staging_dir,\n",
" wait_interval = wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" wait_interval = wait_interval)"
]
},
{
@@ -282,4 +278,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
9 changes: 2 additions & 7 deletions components/gcp/dataproc/create_cluster/README.md
@@ -62,11 +62,7 @@ Note: You can recycle the cluster by using the [Dataproc delete cluster componen

To use the component, you must:
* Set up the GCP project by following these [steps](https://cloud.google.com/dataproc/docs/guides/setup-project).
* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:

```
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
* Grant the following types of access to the Kubeflow user service account:
* Read access to the Cloud Storage buckets which contain the initialization action files.
* The role, `roles/dataproc.editor`, on the project.
@@ -114,7 +110,6 @@ EXPERIMENT_NAME = 'Dataproc - Create Cluster'

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataproc create cluster pipeline',
@@ -140,7 +135,7 @@ def dataproc_create_cluster_pipeline(
config_bucket=config_bucket,
image_version=image_version,
cluster=cluster,
wait_interval=wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))
wait_interval=wait_interval)
```

#### Compile the pipeline
11 changes: 3 additions & 8 deletions components/gcp/dataproc/create_cluster/sample.ipynb
@@ -46,11 +46,7 @@
"\n",
"To use the component, you must:\n",
"* Set up the GCP project by following these [steps](https://cloud.google.com/dataproc/docs/guides/setup-project).\n",
"* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:\n",
"\n",
" ```\n",
" component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
" ```\n",
"* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"* Grant the following types of access to the Kubeflow user service account:\n",
" * Read access to the Cloud Storage buckets which contains initialization action files.\n",
" * The role, `roles/dataproc.editor` on the project.\n",
@@ -137,7 +133,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Dataproc create cluster pipeline',\n",
@@ -163,7 +158,7 @@
" config_bucket=config_bucket, \n",
" image_version=image_version, \n",
" cluster=cluster, \n",
" wait_interval=wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" wait_interval=wait_interval)"
]
},
{
@@ -248,4 +243,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
9 changes: 2 additions & 7 deletions components/gcp/dataproc/delete_cluster/README.md
@@ -43,11 +43,7 @@ ML workflow:
## Cautions & requirements
To use the component, you must:
* Set up a GCP project by following this [guide](https://cloud.google.com/dataproc/docs/guides/setup-project).
* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:

```
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
* Grant the Kubeflow user service account the role, `roles/dataproc.editor`, on the project.

## Detailed description
@@ -98,7 +94,6 @@ EXPERIMENT_NAME = 'Dataproc - Delete Cluster'

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataproc delete cluster pipeline',
@@ -112,7 +107,7 @@ def dataproc_delete_cluster_pipeline(
dataproc_delete_cluster_op(
project_id=project_id,
region=region,
name=name).apply(gcp.use_gcp_secret('user-gcp-sa'))
name=name)
```

#### Compile the pipeline
9 changes: 2 additions & 7 deletions components/gcp/dataproc/delete_cluster/sample.ipynb
@@ -33,11 +33,7 @@
"## Cautions & requirements\n",
"To use the component, you must:\n",
"* Set up a GCP project by following this [guide](https://cloud.google.com/dataproc/docs/guides/setup-project).\n",
"* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:\n",
"\n",
" ```\n",
" component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))\n",
" ```\n",
"* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.\n",
"* Grant the Kubeflow user service account the role `roles/dataproc.editor` on the project.\n",
"\n",
"## Detailed description\n",
@@ -125,7 +121,6 @@
"outputs": [],
"source": [
"import kfp.dsl as dsl\n",
"import kfp.gcp as gcp\n",
"import json\n",
"@dsl.pipeline(\n",
" name='Dataproc delete cluster pipeline',\n",
@@ -139,7 +134,7 @@
" dataproc_delete_cluster_op(\n",
" project_id=project_id, \n",
" region=region, \n",
" name=name).apply(gcp.use_gcp_secret('user-gcp-sa'))"
" name=name)"
]
},
{
9 changes: 2 additions & 7 deletions components/gcp/dataproc/submit_hadoop_job/README.md
@@ -60,11 +60,7 @@ job_id | The ID of the created job. | String
To use the component, you must:
* Set up a GCP project by following this [guide](https://cloud.google.com/dataproc/docs/guides/setup-project).
* [Create a new cluster](https://cloud.google.com/dataproc/docs/guides/create-cluster).
* Run the component under a secret [Kubeflow user service account](https://www.kubeflow.org/docs/started/getting-started-gke/#gcp-service-accounts) in a Kubeflow cluster. For example:

```python
component_op(...).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
* The component can authenticate to GCP. Refer to [Authenticating Pipelines to GCP](https://www.kubeflow.org/docs/gke/authentication-pipelines/) for details.
* Grant the Kubeflow user service account the role, `roles/dataproc.editor`, on the project.

## Detailed description
@@ -135,7 +131,6 @@ Caution: This will remove all blob files under `OUTPUT_GCS_PATH`.

```python
import kfp.dsl as dsl
import kfp.gcp as gcp
import json
@dsl.pipeline(
name='Dataproc submit Hadoop job pipeline',
@@ -164,7 +159,7 @@ def dataproc_submit_hadoop_job_pipeline(
args=args,
hadoop_job=hadoop_job,
job=job,
wait_interval=wait_interval).apply(gcp.use_gcp_secret('user-gcp-sa'))
wait_interval=wait_interval)
```

#### Compile the pipeline