
[Doc] Update permission requirement in README #2422

Merged: 10 commits, Oct 29, 2019
23 changes: 23 additions & 0 deletions manifests/gcp_marketplace/guide.md
@@ -43,6 +43,7 @@ gcloud projects add-iam-policy-binding $PROJECT_ID \
--member=serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com \
--role=roles/storage.admin

# Note that you cannot bind multiple roles in one command.
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member=serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com \
--role=roles/ml.admin
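Since each `gcloud projects add-iam-policy-binding` invocation accepts a single `--role`, a small loop is an equivalent sketch for granting several roles, assuming `$PROJECT_ID` and `$SA_NAME` are set as above:

```bash
# Bind each role in its own call; one invocation cannot bind multiple roles.
for ROLE in roles/storage.admin roles/ml.admin; do
  gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" \
    --role="$ROLE"
done
```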
@@ -98,3 +99,25 @@ as `Service Account User`. The Google Service Account is [Compute Engine default
- Please also add your account as `Project Viewer` via [IAM](https://console.cloud.google.com/iam-admin/iam).

For simplicity, though not ideal for security, adding your account as `Project Editor` also works.
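
As a sketch of the `Project Viewer` grant from the CLI (the user email below is a placeholder, not from this guide):

```bash
# Hypothetical account; substitute your own email.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="user:you@example.com" \
  --role="roles/viewer"
```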

### Pipeline steps get insufficient permissions
If you see an error message stating that a pipeline step has insufficient
permissions, for example:

```
Error executing an HTTP request: HTTP response code 403 with body '{
"error": {
"errors": [
{
"domain": "global",
"reason": "insufficientPermissions",
"message": "Insufficient Permission"
}
],
"code": 403,
"message": "Insufficient Permission"
}
}
```
please make sure you have followed the procedure in [credential setup](#gcp-service-account-credentials). IAM configuration and/or
API enablement may take up to 5 minutes to propagate.
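
To check which roles the service account actually holds, one option is to query the project's IAM policy; a sketch, assuming the same `$PROJECT_ID` and `$SA_NAME` as above:

```bash
# List the roles bound to the service account in this project.
gcloud projects get-iam-policy "$PROJECT_ID" \
  --flatten="bindings[].members" \
  --format="table(bindings.role)" \
  --filter="bindings.members:serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com"
```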
13 changes: 8 additions & 5 deletions samples/contrib/parameterized_tfx_oss/README.md
@@ -1,4 +1,4 @@
# Parameterized TFX pipeline sample
# Overview

[Tensorflow Extended (TFX)](https://github.com/tensorflow/tfx) is a Google-production-scale machine
learning platform based on TensorFlow. It provides a configuration framework to express ML pipelines
@@ -9,15 +9,19 @@ This sample demonstrates how to author a ML pipeline in TFX and run it on a KFP
Please refer to inline comments for the purpose of each step.

In order to successfully compile this sample, you'll need to have a TFX installation at HEAD.
First, you can clone their repo and run `python setup.py install` from `tfx/tfx`.
First, you can clone their repo and run `python setup.py install` from `tfx/`.
The image used in the pipeline is specified as `tfx_image` in the
`KubeflowDagRunnerConfig`. Currently we're using our own patched version of the TFX image, which contains visualization support.
A list of officially released nightly build images can be found [here](https://hub.docker.com/r/tensorflow/tfx/tags).

After that, running
`python3 chicago_taxi_pipeline_simple.py` compiles the TFX pipeline into a KFP pipeline package.
This pipeline requires google storage permission to run.
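
Putting those steps together, a rough sketch of the setup and compile flow (repository URL and working directory are assumptions, not spelled out in this README):

```bash
# Install TFX at HEAD, then compile the sample into a KFP pipeline package.
git clone https://github.com/tensorflow/tfx.git
(cd tfx && python setup.py install)
python3 chicago_taxi_pipeline_simple.py   # writes a pipeline package for upload to KFP
```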

# Permission

This pipeline requires Google Cloud Storage permission to run.
If KFP was deployed through the K8S marketplace, please follow the instructions in [the guideline](https://github.com/kubeflow/pipelines/blob/master/manifests/gcp_marketplace/guide.md#gcp-service-account-credentials)
to make sure the service account has the `storage.admin` role.

## Caveats

@@ -28,8 +32,7 @@ objects `dsl.PipelineParam` and appending them to the `KubeflowDagRunner._params
KubeflowDagRunner can correctly identify those pipeline parameters and interpret them as Argo
placeholders during compilation. However, this parameterization approach is a hack and
we have no plan for long-term support. Instead we're working with the TFX team to support
pipeline parameterization using their [RuntimeParameter](https://github.com/tensorflow/tfx/blob/46bb4f975c36ea1defde4b3c33553e088b3dc5b8/tfx/orchestration/data_types.py#L108).

pipeline parameterization using their [RuntimeParameter](https://github.com/tensorflow/tfx/blob/46bb4f975c36ea1defde4b3c33553e088b3dc5b8/tfx/orchestration/data_types.py#L108).
### Known issues
* This approach only works for string-typed quantities. For example, you cannot parameterize
`num_steps` of `Trainer` in this way.
7 changes: 5 additions & 2 deletions samples/core/xgboost_training_cm/README.md
@@ -12,8 +12,11 @@ or not.

## Requirements

Preprocessing uses Google Cloud DataProc. Therefore, you must enable the [DataProc API](https://cloud.google.com/endpoints/docs/openapi/enable-api) for the given GCP project.

Preprocessing uses Google Cloud DataProc. Therefore, you must enable the
[Cloud Dataproc API](https://console.cloud.google.com/apis/library/dataproc.googleapis.com) for the given GCP project. This is the
general [guideline](https://cloud.google.com/endpoints/docs/openapi/enable-api) for enabling GCP APIs.
If KFP was deployed through the K8S marketplace, please follow the instructions in [the guideline](https://github.com/kubeflow/pipelines/blob/master/manifests/gcp_marketplace/guide.md#gcp-service-account-credentials)
to make sure the service account has the `storage.admin` and `dataproc.admin` roles.
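
A sketch of enabling the API from the CLI instead of the console (assuming `gcloud` is authenticated and `$PROJECT_ID` is set):

```bash
# Enable the Cloud Dataproc API, then confirm it shows up as enabled.
gcloud services enable dataproc.googleapis.com --project="$PROJECT_ID"
gcloud services list --enabled --project="$PROJECT_ID" --filter="name:dataproc"
```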
Contributor: Probably also add the API enabled requirement, such as "enable the DataProc API for the given GCP project"

Contributor: We can have a direct link to enabling that API.

Author: Add a link to the API page.
@Ark-kun, are you saying there is a direct link that can enable the API with one click? If so, that would be better.

Contributor: When you try running a Dataproc or Dataflow component, that link is in the error log.
Here is the Dataflow link: https://console.cloud.google.com/apis/api/dataflow.googleapis.com/overview?project=1234567


## Compile
