BigQuery: Moves BigQuery tutorial for Dataproc to python-docs-samples #1494
Conversation
def run_natality_tutorial():
    # [START bigquery_query_natality_tutorial]
Question: do we want to own this as a BigQuery sample or should it go under Dataproc?
I'm leaning towards BigQuery, too, since there doesn't appear to be anything Dataproc-specific here.
I put it as BigQuery and didn't include Dataproc in the name because there's nothing Dataproc-related in this sample (except the docstring, which we could move into the docs instead of the sample) and we could potentially leverage this sample for other purposes.
In the code below, the following actions are taken:
* A new dataset is created "natality_regression."
* A new table "regression_input" is created to hold the inputs for our
We don't have to actually create the table, right? The query job should be able to do that for us, as long as the `configuration.query.createDisposition` BigQuery job property is correct.
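To illustrate the point above, here is a minimal sketch of the REST-level job configuration the comment refers to. With `createDisposition` set to `CREATE_IF_NEEDED` (the API default), the query job creates the destination table itself, so no separate table-creation step is needed. The project ID and query text below are placeholders, not values from this PR.

```python
# Sketch of a BigQuery jobs.insert request body (REST API), showing the
# configuration.query.createDisposition property the reviewer mentions.
job_config = {
    "configuration": {
        "query": {
            "query": "SELECT weight_pounds FROM `bigquery-public-data.samples.natality` LIMIT 10",
            "destinationTable": {
                "projectId": "your-project",  # placeholder
                "datasetId": "natality_regression",
                "tableId": "regression_input",
            },
            # CREATE_IF_NEEDED tells the query job to create the
            # destination table if it does not already exist.
            "createDisposition": "CREATE_IF_NEEDED",
            "useLegacySql": False,
        }
    }
}
```

With this disposition, the explicit `tables.insert` (or client-library `create_table`) call in the sample becomes unnecessary.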
# In the new BigQuery dataset, create a new table.
table_ref = dataset.table('regression_input')
# The table needs a schema before it can be created and accept data.
I actually think the query job will run just fine without a schema, since BigQuery knows all the correct types when it runs the query.
SELECT weight_pounds, mother_age, father_age, gestation_weeks,
    weight_gain_pounds, apgar_5min
FROM `bigquery-public-data.samples.natality`
WHERE weight_pounds is not null
Let's use consistent casing for SQL keywords (`AND`, `IS`, `NOT`, `NULL`).
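Applying that nitpick to the snippet above, the query with SQL keywords consistently uppercased would read roughly as follows (column list taken from the diff; formatting is one possible choice):

```python
# The query from the diff, with all SQL keywords uppercased,
# including IS NOT NULL in the WHERE clause.
query = """
    SELECT weight_pounds, mother_age, father_age, gestation_weeks,
        weight_gain_pounds, apgar_5min
    FROM `bigquery-public-data.samples.natality`
    WHERE weight_pounds IS NOT NULL
"""
```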
job_config.destination = table_ref
# BigQuery can auto-detect the schema based on the source table.
job_config.autodetect = True
`QueryJobConfig` doesn't have an `autodetect` parameter. This line is not needed.
Sorry if my previous comment was confusing. I mean that when BigQuery runs a query, it knows the types of the various columns in the query results based on the query text, so no explicit schema is needed.
Oops, removed the nonsense attribute.
Adds the BigQuery portion of the Dataproc tutorial to this repo so it can be regularly tested (previously it was hard-coded in the docs).