chore: move samples from python-docs-sample (#66)
* Add XMPP Sample

* Add Dataproc Sample

* Add more region tags

* Minor dataproc fixes

* Fix Dataproc e2e for Python 3

* Update reqs

* updating requirements [(#358)](#358)

Change-Id: I6177a17fad021e26ed76679d9db34848c17b62a8

* Update Reqs

* Wrong arg description

* Auto-update dependencies. [(#456)](#456)

* Auto-update dependencies. [(#459)](#459)

* Fix import order lint errors

Change-Id: Ieaf7237fc6f925daec46a07d2e81a452b841198a

* bump

Change-Id: I02e7767d13ba267ee9fc72c5b68a57013bb8b8d3

* Auto-update dependencies. [(#486)](#486)

* Auto-update dependencies. [(#540)](#540)

* Auto-update dependencies. [(#542)](#542)

* Move to google-cloud [(#544)](#544)

* Auto-update dependencies. [(#584)](#584)

* Auto-update dependencies. [(#629)](#629)

* Update samples to support latest Google Cloud Python [(#656)](#656)

* Update README.md [(#691)](#691)

* Auto-update dependencies. [(#715)](#715)

* Auto-update dependencies. [(#735)](#735)

* Auto-update dependencies.
* Fix language OCR sample
* Remove unused import

* Auto-update dependencies. [(#790)](#790)

* Remove usage of GoogleCredentials [(#810)](#810)

* Fix a typo [(#813)](#813)

* Remove cloud config fixture [(#887)](#887)

* Remove cloud config fixture

* Fix client secrets

* Fix bigtable instance

* Fix reference to our testing tools

* Auto-update dependencies. [(#914)](#914)

* Auto-update dependencies.

* xfail the error reporting test

* Fix lint

* Auto-update dependencies. [(#922)](#922)

* Auto-update dependencies.

* Fix pubsub iam samples

* Auto-update dependencies. [(#1005)](#1005)

* Auto-update dependencies.

* Fix bigtable lint

* Fix IOT iam interaction

* Auto-update dependencies. [(#1011)](#1011)

* Properly forwarding the "region" parameter provided as an input argument. [(#1029)](#1029)

* Auto-update dependencies. [(#1055)](#1055)

* Auto-update dependencies.

* Explicitly use latest bigtable client

Change-Id: Id71e9e768f020730e4ca9514a0d7ebaa794e7d9e

* Revert language update for now

Change-Id: I8867f154e9a5aae00d0047c9caf880e5e8f50c53

* Remove pdb. smh

Change-Id: I5ff905fadc026eebbcd45512d4e76e003e3b2b43

* Fix region handling and allow to use an existing cluster. [(#1053)](#1053)

* Auto-update dependencies. [(#1094)](#1094)

* Auto-update dependencies.

* Relax assertions in the ocr_nl sample

Change-Id: I6d37e5846a8d6dd52429cb30d501f448c52cbba1

* Drop unused logging apiary samples

Change-Id: I545718283773cb729a5e0def8a76ebfa40829d51

* Auto-update dependencies. [(#1133)](#1133)

* Auto-update dependencies.

* Fix missing http library

Change-Id: I99faa600f2f3f1f50f57694fc9835d7f35bda250

* Auto-update dependencies. [(#1186)](#1186)

* Auto-update dependencies. [(#1199)](#1199)

* Auto-update dependencies.

* Fix iot lint

Change-Id: I6289e093bdb35e38f9e9bfc3fbc3df3660f9a67e

* Fixed Failed Kokoro Test (Dataproc) [(#1203)](#1203)

* Fixed Failed Kokoro Test (Dataproc)

* Fixed Lint Error

* Update dataproc_e2e_test.py

* Update dataproc_e2e_test.py

* Fixing More Lint Errors

* Fixed b/65407087

* Revert "Merge branch 'master' of https://github.com/michaelawyu/python-docs-samples"

This reverts commit 1614c7d, reversing
changes made to cd1dbfd.

* Revert "Fixed b/65407087"

This reverts commit cd1dbfd.

* Fixed Lint Error

* Fixed Lint Error

* Auto-update dependencies. [(#1208)](#1208)

* Dataproc GCS sample plus doc touchups [(#1151)](#1151)

* Auto-update dependencies. [(#1217)](#1217)

* Auto-update dependencies. [(#1239)](#1239)

* Added "Open in Cloud Shell" buttons to README files [(#1254)](#1254)

* Auto-update dependencies. [(#1282)](#1282)

* Auto-update dependencies.

* Fix storage acl sample

Change-Id: I413bea899fdde4c4859e4070a9da25845b81f7cf

* Auto-update dependencies. [(#1309)](#1309)

* Auto-update dependencies. [(#1320)](#1320)

* Auto-update dependencies. [(#1355)](#1355)

* Auto-update dependencies. [(#1359)](#1359)

* Auto-update dependencies.

* update Dataproc region tags to standard format [(#1826)](#1826)

* Update submit_job_to_cluster.py [(#1708)](#1708)

switch region to new 'global' region and remove unnecessary function.

* Auto-update dependencies. [(#1846)](#1846)

ACK, merging.

* Need separate install for google-cloud-storage [(#1863)](#1863)

* Revert "Update dataproc/submit_job_to_cluster.py" [(#1864)](#1864)

* Revert "Remove test configs for non-testing directories [(#1855)](#1855)"

This reverts commit 73a7332.

* Revert "Auto-update dependencies. [(#1846)](#1846)"

This reverts commit 3adc94f4d0c14453153968c3851fae100e2c5e44.

* Revert "Tweak slack sample [(#1847)](#1847)"

This reverts commit a48c010.

* Revert "Non-client library example of constructing a Signed URL [(#1837)](#1837)"

This reverts commit fc3284d.

* Revert "GCF samples: handle {empty JSON, GET} requests + remove commas [(#1832)](#1832)"

This reverts commit 6928491.

* Revert "Correct the maintenance event types [(#1830)](#1830)"

This reverts commit c22840f.

* Revert "Fix GCF region tags [(#1827)](#1827)"

This reverts commit 0fbfef2.

* Revert "Updated to Flask 1.0 [(#1819)](#1819)"

This reverts commit d52ccf9.

* Revert "Fix deprecation warning [(#1801)](#1801)"

This reverts commit 981737e.

* Revert "Update submit_job_to_cluster.py [(#1708)](#1708)"

This reverts commit df1f2b22547b7ca86bbdb791ad930003a815a677.

* Create python-api-walkthrough.md [(#1966)](#1966)

* Create python-api-walkthrough.md

This Google Cloud Shell walkthrough is linked to Cloud Dataproc documentation to be published at: https://cloud.google.com/dataproc/docs/tutorials/python-library-example

* Update python-api-walkthrough.md

* Update list_clusters.py [(#1887)](#1887)

* Auto-update dependencies. [(#1980)](#1980)

* Auto-update dependencies.

* Update requirements.txt

* Update requirements.txt

* Update Dataproc samples. [(#2158)](#2158)

* Update requirements.txt

* Update python-api-walkthrough.md

* Update submit_job_to_cluster.py

* Update list_clusters.py

* Update python-api-walkthrough.md [(#2172)](#2172)

* Adds updates including compute [(#2436)](#2436)

* Adds updates including compute

* Python 2 compat pytest

* Fixing weird \r\n issue from GH merge

* Put asset tests back in

* Re-add pod operator test

* Hack parameter for k8s pod operator

* feat: adding samples for dataproc - create cluster [(#2536)](#2536)

* adding sample for cluster create

* small fix

* Add create cluster samples

* Fixed copyright, added 'dataproc' to region tag and changed imports from 'dataproc' to 'dataproc_v1'

* Fix copyright in create_cluster.py

* Auto-update dependencies. [(#2005)](#2005)

* Auto-update dependencies.

* Revert update of appengine/flexible/datastore.

* revert update of appengine/flexible/scipy

* revert update of bigquery/bqml

* revert update of bigquery/cloud-client

* revert update of bigquery/datalab-migration

* revert update of bigtable/quickstart

* revert update of compute/api

* revert update of container_registry/container_analysis

* revert update of dataflow/run_template

* revert update of datastore/cloud-ndb

* revert update of dialogflow/cloud-client

* revert update of dlp

* revert update of functions/imagemagick

* revert update of functions/ocr/app

* revert update of healthcare/api-client/fhir

* revert update of iam/api-client

* revert update of iot/api-client/gcs_file_to_device

* revert update of iot/api-client/mqtt_example

* revert update of language/automl

* revert update of run/image-processing

* revert update of vision/automl

* revert update testing/requirements.txt

* revert update of vision/cloud-client/detect

* revert update of vision/cloud-client/product_search

* revert update of jobs/v2/api_client

* revert update of jobs/v3/api_client

* revert update of opencensus

* revert update of translate/cloud-client

* revert update to speech/cloud-client

Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
Co-authored-by: Doug Mahugh <dmahugh@gmail.com>

* feat: dataproc quickstart sample added and create_cluster updated [(#2629)](#2629)

* Adding quickstart sample

* Added new quickstart sample and updated create_cluster sample

* Fix to create_cluster.py

* deleted dataproc quickstart files not under dataproc/quickstart/

* Added quickstart test

* Linting and formatting fixes

* Revert "Linting and formatting fixes"

This reverts commit c5afcbc.

* Added bucket cleanup to quickstart test

* Changes to samples and tests

* Linting fixes

* Removed todos in favor of clearer docstring

* Fixed lint error

Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>

* Update Python Cloud Shell walkthrough script [(#2733)](#2733)

Cloud Shell walkthrough scripts no longer support enabling APIs. APIs must be enabled by linking to the console.
Updated product name: "Cloud Dataproc" -> "Dataproc".

* fix: added cli functionality to dataproc quickstart example [(#2734)](#2734)

* Added CLI functionality to quickstart

* Fixed Dataproc quickstart test to properly clean up GCS bucket [(#3001)](#3001)

* splitting up #2651 part 1/3 - dataproc + endpoints [(#3025)](#3025)

* splitting up #2651

* fix typos

* chore(deps): update dependency google-auth to v1.11.2 [(#2724)](#2724)

Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>

* chore(deps): update dependency google-cloud-storage to v1.26.0 [(#3046)](#3046)

* chore(deps): update dependency google-cloud-storage to v1.26.0

* chore(deps): specify dependencies by python version

* chore: up other deps to try to remove errors

Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
Co-authored-by: Leah Cole <coleleah@google.com>

* chore(deps): update dependency google-cloud-dataproc to v0.7.0 [(#3083)](#3083)

* feat: added dataproc workflows samples [(#3056)](#3056)

* Added workflows sample

* chore(deps): update dependency grpcio to v1.27.2 [(#3173)](#3173)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [grpcio](https://grpc.io) | minor | `==1.25.0` -> `==1.27.2` |
| [grpcio](https://grpc.io) | minor | `==1.23.0` -> `==1.27.2` |
| [grpcio](https://grpc.io) | minor | `==1.26.0` -> `==1.27.2` |
| [grpcio](https://grpc.io) | patch | `==1.27.1` -> `==1.27.2` |

---

### Renovate configuration

:date: **Schedule**: At any time (no schedule defined).

:vertical_traffic_light: **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

:recycle: **Rebasing**: Never, or you tick the rebase/retry checkbox.

:no_bell: **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [WhiteSource Renovate](https://renovate.whitesourcesoftware.com). View repository job log [here](https://app.renovatebot.com/dashboard#GoogleCloudPlatform/python-docs-samples).

* Simplify noxfile setup. [(#2806)](#2806)

* chore(deps): update dependency requests to v2.23.0

* Simplify noxfile and add version control.

* Configure appengine/standard to only test Python 2.7.

* Update Kokokro configs to match noxfile.

* Add requirements-test to each folder.

* Remove Py2 versions from everything except appengine/standard.

* Remove conftest.py.

* Remove appengine/standard/conftest.py

* Remove 'no-sucess-flaky-report' from pytest.ini.

* Add GAE SDK back to appengine/standard tests.

* Fix typo.

* Roll pytest to python 2 version.

* Add a bunch of testing requirements.

* Remove typo.

* Add appengine lib directory back in.

* Add some additional requirements.

* Fix issue with flake8 args.

* Even more requirements.

* Readd appengine conftest.py.

* Add a few more requirements.

* Even more Appengine requirements.

* Add webtest for appengine/standard/mailgun.

* Add some additional requirements.

* Add workaround for issue with mailjet-rest.

* Add responses for appengine/standard/mailjet.

Co-authored-by: Renovate Bot <bot@renovateapp.com>

* fix: add mains to samples [(#3284)](#3284)

Added mains to two samples: create_cluster and instantiate_inline_workflow_templates.

Fixed their associated tests to accommodate this.

Removed subprocess from quickstart/quickstart_test.py to fix [2873](#2873)

fixes #2873

* Update dependency grpcio to v1.28.1 [(#3276)](#3276)

Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>

* Update dependency google-auth to v1.14.0 [(#3148)](#3148)

Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>

* chore(deps): update dependency google-auth to v1.14.1 [(#3464)](#3464)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-auth](https://togithub.com/googleapis/google-auth-library-python) | patch | `==1.14.0` -> `==1.14.1` |
| [google-auth](https://togithub.com/googleapis/google-auth-library-python) | minor | `==1.11.2` -> `==1.14.1` |

---

### Release Notes

<details>
<summary>googleapis/google-auth-library-python</summary>

### [`v1.14.1`](https://togithub.com/googleapis/google-auth-library-python/blob/master/CHANGELOG.md#&#8203;1141-httpswwwgithubcomgoogleapisgoogle-auth-library-pythoncomparev1140v1141-2020-04-21)

[Compare Source](https://togithub.com/googleapis/google-auth-library-python/compare/v1.14.0...v1.14.1)

</details>


* chore(deps): update dependency google-cloud-storage to v1.28.0 [(#3260)](#3260)

Co-authored-by: Takashi Matsuo <tmatsuo@google.com>

* chore(deps): update dependency google-auth to v1.14.2 [(#3724)](#3724)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-auth](https://togithub.com/googleapis/google-auth-library-python) | patch | `==1.14.1` -> `==1.14.2` |

---

### Release Notes

<details>
<summary>googleapis/google-auth-library-python</summary>

### [`v1.14.2`](https://togithub.com/googleapis/google-auth-library-python/blob/master/CHANGELOG.md#&#8203;1142-httpswwwgithubcomgoogleapisgoogle-auth-library-pythoncomparev1141v1142-2020-05-07)

[Compare Source](https://togithub.com/googleapis/google-auth-library-python/compare/v1.14.1...v1.14.2)

</details>


* chore: some lint fixes [(#3743)](#3743)

* chore(deps): update dependency google-auth to v1.14.3 [(#3728)](#3728)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-auth](https://togithub.com/googleapis/google-auth-library-python) | patch | `==1.14.2` -> `==1.14.3` |

---

### Release Notes

<details>
<summary>googleapis/google-auth-library-python</summary>

### [`v1.14.3`](https://togithub.com/googleapis/google-auth-library-python/blob/master/CHANGELOG.md#&#8203;1143-httpswwwgithubcomgoogleapisgoogle-auth-library-pythoncomparev1142v1143-2020-05-11)

[Compare Source](https://togithub.com/googleapis/google-auth-library-python/compare/v1.14.2...v1.14.3)

</details>


* chore(deps): update dependency grpcio to v1.29.0 [(#3786)](#3786)

* chore(deps): update dependency google-cloud-storage to v1.28.1 [(#3785)](#3785)

* chore(deps): update dependency google-cloud-storage to v1.28.1

* [asset] testing: use uuid instead of time

Co-authored-by: Takashi Matsuo <tmatsuo@google.com>

* update google-auth to 1.15.0 part 3 [(#3816)](#3816)

* Update dependency google-cloud-dataproc to v0.8.0 [(#3837)](#3837)

* chore(deps): update dependency google-auth to v1.16.0 [(#3903)](#3903)

* update google-auth part 3 [(#3963)](#3963)

* chore(deps): update dependency google-cloud-dataproc to v0.8.1 [(#4015)](#4015)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-dataproc](https://togithub.com/googleapis/python-dataproc) | patch | `==0.8.0` -> `==0.8.1` |

---

### Release Notes

<details>
<summary>googleapis/python-dataproc</summary>

### [`v0.8.1`](https://togithub.com/googleapis/python-dataproc/blob/master/CHANGELOG.md#&#8203;081-httpswwwgithubcomgoogleapispython-dataproccomparev080v081-2020-06-05)

[Compare Source](https://togithub.com/googleapis/python-dataproc/compare/v0.8.0...v0.8.1)

</details>


* Replace GCLOUD_PROJECT with GOOGLE_CLOUD_PROJECT. [(#4022)](#4022)

* Update dependency google-auth to v1.17.0 [(#4058)](#4058)

* chore(deps): update dependency google-auth to v1.17.1 [(#4073)](#4073)

* Update dependency google-auth to v1.17.2 [(#4083)](#4083)

* Update dependency google-auth to v1.18.0 [(#4125)](#4125)

* Update dependency google-cloud-dataproc to v1 [(#4109)](#4109)

Co-authored-by: Takashi Matsuo <tmatsuo@google.com>

* chore(deps): update dependency google-cloud-storage to v1.29.0 [(#4040)](#4040)

* chore(deps): update dependency grpcio to v1.30.0 [(#4143)](#4143)

Co-authored-by: Takashi Matsuo <tmatsuo@google.com>

* Update dependency google-auth-httplib2 to v0.0.4 [(#4255)](#4255)

Co-authored-by: Takashi Matsuo <tmatsuo@google.com>

* chore(deps): update dependency pytest to v5.4.3 [(#4279)](#4279)

* chore(deps): update dependency pytest to v5.4.3

* specify pytest for python 2 in appengine

Co-authored-by: Leah Cole <coleleah@google.com>

* chore(deps): update dependency google-auth to v1.19.0 [(#4293)](#4293)

* chore(deps): update dependency google-cloud-dataproc to v1.0.1 [(#4309)](#4309)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-cloud-dataproc](https://togithub.com/googleapis/python-dataproc) | patch | `==1.0.0` -> `==1.0.1` |

---

### Release Notes

<details>
<summary>googleapis/python-dataproc</summary>

### [`v1.0.1`](https://togithub.com/googleapis/python-dataproc/blob/master/CHANGELOG.md#&#8203;101-httpswwwgithubcomgoogleapispython-dataproccomparev100v101-2020-07-16)

[Compare Source](https://togithub.com/googleapis/python-dataproc/compare/v1.0.0...v1.0.1)

</details>


* chore(deps): update dependency google-auth to v1.19.1 [(#4304)](#4304)

* chore(deps): update dependency google-auth to v1.19.2 [(#4321)](#4321)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [google-auth](https://togithub.com/googleapis/google-auth-library-python) | patch | `==1.19.1` -> `==1.19.2` |

---

### Release Notes

<details>
<summary>googleapis/google-auth-library-python</summary>

### [`v1.19.2`](https://togithub.com/googleapis/google-auth-library-python/blob/master/CHANGELOG.md#&#8203;1192-httpswwwgithubcomgoogleapisgoogle-auth-library-pythoncomparev1191v1192-2020-07-17)

[Compare Source](https://togithub.com/googleapis/google-auth-library-python/compare/v1.19.1...v1.19.2)

</details>


* Update dependency google-auth to v1.20.0 [(#4387)](#4387)

* Update dependency pytest to v6 [(#4390)](#4390)

* Update dependency grpcio to v1.31.0 [(#4438)](#4438)

* chore(deps): update dependency google-auth to v1.20.1 [(#4452)](#4452)

* chore: update templates

Co-authored-by: Bill Prin <waprin@google.com>
Co-authored-by: Bill Prin <waprin@gmail.com>
Co-authored-by: Jon Wayne Parrott <jonwayne@google.com>
Co-authored-by: Eran Kampf <eran@ekampf.com>
Co-authored-by: DPE bot <dpebot@google.com>
Co-authored-by: aman-ebay <amancuso@google.com>
Co-authored-by: Martial Hue <martial.hue@gmail.com>
Co-authored-by: Gioia Ballin <gioia.ballin@gmail.com>
Co-authored-by: michaelawyu <chenyumic@google.com>
Co-authored-by: michaelawyu <michael.a.w.yu@hotmail.com>
Co-authored-by: Alix Hamilton <ajhamilton@google.com>
Co-authored-by: James Winegar <jameswinegar@users.noreply.github.com>
Co-authored-by: Charles Engelke <github@engelke.com>
Co-authored-by: Gus Class <gguuss@gmail.com>
Co-authored-by: Brad Miro <bmiro@google.com>
Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
Co-authored-by: Doug Mahugh <dmahugh@gmail.com>
Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
Co-authored-by: WhiteSource Renovate <bot@renovateapp.com>
Co-authored-by: Leah Cole <coleleah@google.com>
Co-authored-by: Takashi Matsuo <tmatsuo@google.com>
22 people authored Aug 8, 2020
0 parents commit 5ab0fe2
Showing 17 changed files with 1,627 additions and 0 deletions.
84 changes: 84 additions & 0 deletions dataproc/snippets/README.md
@@ -0,0 +1,84 @@
# Cloud Dataproc API Examples

[![Open in Cloud Shell][shell_img]][shell_link]

[shell_img]: http://gstatic.com/cloudssh/images/open-btn.png
[shell_link]: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=dataproc/README.md

Sample command-line programs for interacting with the Cloud Dataproc API.

See [the tutorial on using the Dataproc API with the Python client
library](https://cloud.google.com/dataproc/docs/tutorials/python-library-example)
for a walkthrough you can run to try out the Cloud Dataproc API sample code.

Note that while this sample demonstrates interacting with Dataproc via the API, the functionality demonstrated here could also be accomplished using the Cloud Console or the gcloud CLI.

`list_clusters.py` is a simple command-line program to demonstrate connecting to the Cloud Dataproc API and listing the clusters in a region.

`submit_job_to_cluster.py` demonstrates how to create a cluster, submit the
`pyspark_sort.py` job, download the output from Google Cloud Storage, and output the result.

`single_job_workflow.py` uses the Cloud Dataproc InstantiateInlineWorkflowTemplate API to create an ephemeral cluster, run a job, then delete the cluster with one API request.
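The inline template that `single_job_workflow.py` sends is just a nested structure; a minimal sketch of its shape (field names follow the Dataproc v1 API, but the cluster name, job, and machine sizes below are placeholders, not the sample's actual values):

```python
# Minimal inline workflow template: one managed (ephemeral) cluster
# and a single PySpark job. All concrete values are placeholders.
template = {
    'jobs': [
        {
            'pyspark_job': {
                'main_python_file_uri': 'gs://my-bucket/pyspark_sort.py',
            },
            'step_id': 'pyspark-sort',
        },
    ],
    'placement': {
        'managed_cluster': {
            'cluster_name': 'my-ephemeral-cluster',
            'config': {
                'master_config': {
                    'num_instances': 1,
                    'machine_type_uri': 'n1-standard-1',
                },
                'worker_config': {
                    'num_instances': 2,
                    'machine_type_uri': 'n1-standard-1',
                },
            },
        },
    },
}
```

Passing a template like this to `WorkflowTemplateServiceClient.instantiate_inline_workflow_template` creates the cluster, runs the job, and tears the cluster down as a single long-running operation.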

`pyspark_sort.py_gcs` is the same as `pyspark_sort.py` but demonstrates
reading from a GCS bucket.

## Prerequisites to run locally:

* [pip](https://pypi.python.org/pypi/pip)

Go to the [Google Cloud Console](https://console.cloud.google.com).

Under API Manager, search for the Google Cloud Dataproc API and enable it.

## Set Up Your Local Dev Environment

To install, run the following commands. If you want to use [virtualenv](https://virtualenv.readthedocs.org/en/latest/)
(recommended), run the commands within a virtualenv.

* pip install -r requirements.txt

## Authentication

Please see the [Google Cloud authentication guide](https://cloud.google.com/docs/authentication/).
The recommended approach for running these samples is to use a service account with a JSON key.
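For example, after downloading a service-account key, point the client libraries at it via the standard environment variable (the path below is a placeholder):

```shell
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
```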

## Environment Variables

Set the following environment variables:

GOOGLE_CLOUD_PROJECT=your-project-id
REGION=us-central1 # or your region
CLUSTER_NAME=waprin-spark7
ZONE=us-central1-b

## Running the samples

To run list_clusters.py:

python list_clusters.py $GOOGLE_CLOUD_PROJECT --region=$REGION

`submit_job_to_cluster.py` can create the Dataproc cluster or use an existing cluster. To create a cluster before running the code, you can use the [Cloud Console](https://console.cloud.google.com) or run:

gcloud dataproc clusters create your-cluster-name

To run submit_job_to_cluster.py, first create a GCS bucket (used by Cloud Dataproc to stage files) from the Cloud Console or with gsutil:

gsutil mb gs://<your-staging-bucket-name>

Next, set the following environment variables:

BUCKET=your-staging-bucket
CLUSTER=your-cluster-name

Then, if you want to use an existing cluster, run:

python submit_job_to_cluster.py --project_id=$GOOGLE_CLOUD_PROJECT --zone=us-central1-b --cluster_name=$CLUSTER --gcs_bucket=$BUCKET

Alternatively, to create a new cluster, which will be deleted at the end of the job, run:

python submit_job_to_cluster.py --project_id=$GOOGLE_CLOUD_PROJECT --zone=us-central1-b --cluster_name=$CLUSTER --gcs_bucket=$BUCKET --create_new_cluster

The script will set up a cluster, upload the PySpark file, submit the job, print the result, and then, if it created the cluster, delete it.

Optionally, pass the `--pyspark_file` argument to run a script other than the default `pyspark_sort.py`.
77 changes: 77 additions & 0 deletions dataproc/snippets/create_cluster.py
@@ -0,0 +1,77 @@
#!/usr/bin/env python

# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This sample walks a user through creating a Cloud Dataproc cluster using
# the Python client library.
#
# This script can be run on its own:
# python create_cluster.py ${PROJECT_ID} ${REGION} ${CLUSTER_NAME}


import sys

# [START dataproc_create_cluster]
from google.cloud import dataproc_v1 as dataproc


def create_cluster(project_id, region, cluster_name):
"""This sample walks a user through creating a Cloud Dataproc cluster
using the Python client library.
Args:
project_id (string): Project to use for creating resources.
region (string): Region where the resources should live.
cluster_name (string): Name to use for creating a cluster.
"""

# Create a client with the endpoint set to the desired cluster region.
cluster_client = dataproc.ClusterControllerClient(client_options={
'api_endpoint': f'{region}-dataproc.googleapis.com:443',
})

# Create the cluster config.
cluster = {
'project_id': project_id,
'cluster_name': cluster_name,
'config': {
'master_config': {
'num_instances': 1,
'machine_type_uri': 'n1-standard-1'
},
'worker_config': {
'num_instances': 2,
'machine_type_uri': 'n1-standard-1'
}
}
}

# Create the cluster.
operation = cluster_client.create_cluster(project_id, region, cluster)
result = operation.result()

# Output a success message.
print(f'Cluster created successfully: {result.cluster_name}')
# [END dataproc_create_cluster]


if __name__ == "__main__":
if len(sys.argv) < 4:
sys.exit('python create_cluster.py project_id region cluster_name')

project_id = sys.argv[1]
region = sys.argv[2]
cluster_name = sys.argv[3]
create_cluster(project_id, region, cluster_name)
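The regional `api_endpoint` string used above follows a simple convention; a small helper capturing it (the helper name is ours, not part of the sample):

```python
def regional_endpoint(region):
    """Return the Dataproc API endpoint for a region.

    Dataproc uses regional endpoints of the form
    '<region>-dataproc.googleapis.com:443'; the special 'global'
    region uses the default endpoint instead.
    """
    if region == 'global':
        return 'dataproc.googleapis.com:443'
    return f'{region}-dataproc.googleapis.com:443'


print(regional_endpoint('us-central1'))  # us-central1-dataproc.googleapis.com:443
```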
47 changes: 47 additions & 0 deletions dataproc/snippets/create_cluster_test.py
@@ -0,0 +1,47 @@
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import uuid

from google.cloud import dataproc_v1 as dataproc
import pytest

import create_cluster


PROJECT_ID = os.environ['GOOGLE_CLOUD_PROJECT']
REGION = 'us-central1'
CLUSTER_NAME = 'py-cc-test-{}'.format(str(uuid.uuid4()))


@pytest.fixture(autouse=True)
def teardown():
yield

cluster_client = dataproc.ClusterControllerClient(client_options={
'api_endpoint': f'{REGION}-dataproc.googleapis.com:443'
})
# Client library function
operation = cluster_client.delete_cluster(PROJECT_ID, REGION, CLUSTER_NAME)
# Wait for cluster to delete
operation.result()


def test_cluster_create(capsys):
# Wrapper function for client library function
create_cluster.create_cluster(PROJECT_ID, REGION, CLUSTER_NAME)

out, _ = capsys.readouterr()
assert CLUSTER_NAME in out
32 changes: 32 additions & 0 deletions dataproc/snippets/dataproc_e2e_donttest.py
@@ -0,0 +1,32 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Integration tests for Dataproc samples.

Creates a Dataproc cluster, uploads a pyspark file to Google Cloud Storage,
submits a job to Dataproc that runs the pyspark file, then downloads
the output logs from Cloud Storage and verifies the expected output.
"""

import os

import submit_job_to_cluster

PROJECT = os.environ['GOOGLE_CLOUD_PROJECT']
BUCKET = os.environ['CLOUD_STORAGE_BUCKET']
CLUSTER_NAME = 'testcluster3'
ZONE = 'us-central1-b'


def test_e2e():
    output = submit_job_to_cluster.main(
        PROJECT, ZONE, CLUSTER_NAME, BUCKET)
    assert b"['Hello,', 'dog', 'elephant', 'panther', 'world!']" in output
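The expected byte string above is simply Python's default sort of the words in the sample input; code-point ordering places the capitalized `'Hello,'` before all of the lowercase words. A quick local check of that expectation (the word list is inferred from the assertion, not read from the pyspark file itself):

```python
# Words from the sample input, in arbitrary order.
words = ['world!', 'panther', 'dog', 'Hello,', 'elephant']

# Default string sort compares code points, so 'H' (0x48) precedes
# every lowercase letter (0x61 and up).
expected = sorted(words)
print(expected)  # -> ['Hello,', 'dog', 'elephant', 'panther', 'world!']
```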
107 changes: 107 additions & 0 deletions dataproc/snippets/instantiate_inline_workflow_template.py
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This sample walks a user through instantiating an inline
# workflow for Cloud Dataproc using the Python client library.
#
# This script can be run on its own:
# python instantiate_inline_workflow_template.py ${PROJECT_ID} ${REGION}


import sys

# [START dataproc_instantiate_inline_workflow_template]
from google.cloud import dataproc_v1 as dataproc


def instantiate_inline_workflow_template(project_id, region):
    """Submits an inline workflow template to Cloud Dataproc
    using the Python client library.

    Args:
        project_id (string): Project to use for running the workflow.
        region (string): Region where the workflow resources should live.
    """

    # Create a client with the endpoint set to the desired region.
    workflow_template_client = dataproc.WorkflowTemplateServiceClient(
        client_options={
            'api_endpoint': f'{region}-dataproc.googleapis.com:443'
        }
    )

    parent = workflow_template_client.region_path(project_id, region)

    template = {
        'jobs': [
            {
                'hadoop_job': {
                    'main_jar_file_uri': 'file:///usr/lib/hadoop-mapreduce/'
                                         'hadoop-mapreduce-examples.jar',
                    'args': [
                        'teragen',
                        '1000',
                        'hdfs:///gen/'
                    ]
                },
                'step_id': 'teragen'
            },
            {
                'hadoop_job': {
                    'main_jar_file_uri': 'file:///usr/lib/hadoop-mapreduce/'
                                         'hadoop-mapreduce-examples.jar',
                    'args': [
                        'terasort',
                        'hdfs:///gen/',
                        'hdfs:///sort/'
                    ]
                },
                'step_id': 'terasort',
                'prerequisite_step_ids': [
                    'teragen'
                ]
            }
        ],
        'placement': {
            'managed_cluster': {
                'cluster_name': 'my-managed-cluster',
                'config': {
                    'gce_cluster_config': {
                        # Leave 'zone_uri' empty for 'Auto Zone Placement'.
                        # 'zone_uri': ''
                        'zone_uri': 'us-central1-a'
                    }
                }
            }
        }
    }

    # Submit the request to instantiate the workflow from an inline template.
    operation = workflow_template_client.instantiate_inline_workflow_template(
        parent, template
    )
    operation.result()

    # Output a success message.
    print('Workflow ran successfully.')
# [END dataproc_instantiate_inline_workflow_template]


if __name__ == "__main__":
    if len(sys.argv) < 3:
        sys.exit('python instantiate_inline_workflow_template.py '
                 'project_id region')

    project_id = sys.argv[1]
    region = sys.argv[2]
    instantiate_inline_workflow_template(project_id, region)
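The template above chains `terasort` after `teragen` via `prerequisite_step_ids`. A purely local sketch of validating that wiring before submission (the `validate_steps` helper is our own illustration, not a client-library API; Dataproc performs its own server-side validation when the template is instantiated):

```python
def validate_steps(jobs):
    """Check that step ids are unique and every prerequisite refers to
    another step in the template (no self-dependencies)."""
    ids = [job['step_id'] for job in jobs]
    if len(ids) != len(set(ids)):
        raise ValueError('duplicate step_id in template')
    for job in jobs:
        for dep in job.get('prerequisite_step_ids', []):
            if dep == job['step_id'] or dep not in ids:
                raise ValueError('step {!r} has bad prerequisite {!r}'
                                 .format(job['step_id'], dep))


# The two jobs from the inline template above, reduced to their wiring.
jobs = [
    {'step_id': 'teragen'},
    {'step_id': 'terasort', 'prerequisite_step_ids': ['teragen']},
]
validate_steps(jobs)  # passes silently for the teragen -> terasort chain
```

Running the same check on a job whose prerequisite names a nonexistent step raises `ValueError`, which catches a common typo before any cluster is provisioned.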
31 changes: 31 additions & 0 deletions dataproc/snippets/instantiate_inline_workflow_template_test.py
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import instantiate_inline_workflow_template


PROJECT_ID = os.environ['GOOGLE_CLOUD_PROJECT']
REGION = 'us-central1'


def test_workflows(capsys):
    # Exercise the sample's wrapper around the client library call.
    instantiate_inline_workflow_template.instantiate_inline_workflow_template(
        PROJECT_ID, REGION
    )

    out, _ = capsys.readouterr()
    assert "successfully" in out
0 comments on commit 5ab0fe2