Commit c2c2958

feat: allow to set clustering and time partitioning options at table creation (#928)
* refactor: standardize bigquery options handling to manage more options
* feat: handle table partitioning, table clustering and more table options (expiration_timestamp, require_partition_filter, default_rounding_mode) via create_table dialect options
* fix: having clustering fields and partitioning exposed as table indexes leads to a bad autogenerated version file

      def upgrade() -> None:
          # ### commands auto generated by Alembic - please adjust! ###
          op.drop_index('clustering', table_name='dataset.some_table')
          op.drop_index('partition', table_name='dataset.some_table')
          # ### end Alembic commands ###

      def downgrade() -> None:
          # ### commands auto generated by Alembic - please adjust! ###
          op.create_index('partition', 'dataset.some_table', ['createdAt'], unique=False)
          op.create_index('clustering', 'dataset.some_table', ['id', 'createdAt'], unique=False)
          # ### end Alembic commands ###

* docs: update README to describe how to create clustered and partitioned tables as well as other newly supported table options
* test: adjust system tests since indexes are no longer populated from table partitions and clustering info
* test: alembic now supports creating partitioned tables
* test: run integration tests with all the new create_table options
* chore: rename variables to represent what they are a bit more clearly
* fix: assertions should not be used to validate user inputs
* refactor: extract process_option_value() from post_create_table() for improved readability
* docs: add docstring to post_create_table() and _process_option_value()
* test: increase code coverage by testing error cases
* refactor: better represent the distinction between the option value data type check and the transformation into a SQL literal
* test: adding test cases for _validate_option_value_type() and _process_option_value()
* chore: coding style
* chore: reformat files with black
* test: typo in tests
* feat: change the option name for partitioning to leverage the TimePartitioning interface of the Python Client for Google BigQuery
* fix: TimePartitioning.field is optional
* chore: coding style
* test: fix system test with table option bigquery_require_partition_filter
* feat: add support for experimental range_partitioning option
* test: fix system test with new bigquery_time_partitioning table option
* docs: update README with time_partitioning and range_partitioning
* test: relevant comments in unit tests
* test: cover all error cases
* chore: no magic numbers
* chore: consistency in docstrings
* chore: no magic number
* chore: better error types
* chore: fix W605 invalid escape sequence
1 parent ac74a34 commit c2c2958
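
The commit message separates the type check on an option value from its transformation into a SQL literal, and notes that user input should be rejected with exceptions rather than assertions. The following is a minimal, stdlib-only sketch of that two-step idea. It is not the code from this commit: `OPTION_TYPES`, `validate_option_value_type`, `render_sql_literal` and `render_options` are hypothetical names, and the accepted types per option are assumptions made for illustration.

```python
import datetime

# Hypothetical mapping from option name to accepted Python type. The option
# names appear in the commit message; the types are assumptions.
OPTION_TYPES = {
    "description": str,
    "friendly_name": str,
    "default_rounding_mode": str,
    "expiration_timestamp": datetime.datetime,
    "require_partition_filter": bool,
}


def validate_option_value_type(name, value):
    """Step 1: reject bad user input with an exception, not an assert."""
    expected = OPTION_TYPES.get(name)
    if expected is None:
        raise ValueError(f"unknown option: {name!r}")
    if not isinstance(value, expected):
        raise TypeError(
            f"option {name!r} expects {expected.__name__}, "
            f"got {type(value).__name__}"
        )


def render_sql_literal(value):
    """Step 2: turn an already-validated value into a SQL literal."""
    # bool is checked first so that True/False render as true/false.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, datetime.datetime):
        return f"TIMESTAMP '{value.isoformat(sep=' ')}'"
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def render_options(options):
    """Build an OPTIONS(...) clause from a dict of option names to values."""
    parts = []
    for name, value in options.items():
        validate_option_value_type(name, value)
        parts.append(f"{name}={render_sql_literal(value)}")
    return "OPTIONS(" + ", ".join(parts) + ")"
```

Keeping the two steps apart means a wrong type is reported against the option name before any SQL text is produced, which is the distinction the refactor in the commit message describes.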

7 files changed: 799 additions, 67 deletions

README.rst

Lines changed: 52 additions & 1 deletion
@@ -292,14 +292,65 @@ To add metadata to a table:
 
 .. code-block:: python
 
-    table = Table('mytable', ..., bigquery_description='my table description', bigquery_friendly_name='my table friendly name')
+    table = Table('mytable', ...,
+        bigquery_description='my table description',
+        bigquery_friendly_name='my table friendly name',
+        bigquery_default_rounding_mode="ROUND_HALF_EVEN",
+        bigquery_expiration_timestamp=datetime.datetime.fromisoformat("2038-01-01T00:00:00+00:00"),
+    )
 
 To add metadata to a column:
 
 .. code-block:: python
 
     Column('mycolumn', doc='my column description')
 
+To create a clustered table:
+
+.. code-block:: python
+
+    table = Table('mytable', ..., bigquery_clustering_fields=["a", "b", "c"])
+
+To create a time-unit column-partitioned table:
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_time_partitioning=bigquery.TimePartitioning(
+            field="mytimestamp",
+            type_="MONTH",
+            expiration_ms=1000 * 60 * 60 * 24 * 30 * 6, # 6 months
+        ),
+        bigquery_require_partition_filter=True,
+    )
+
+To create an ingestion-time partitioned table:
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_time_partitioning=bigquery.TimePartitioning(),
+        bigquery_require_partition_filter=True,
+    )
+
+To create an integer-range partitioned table
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_range_partitioning=bigquery.RangePartitioning(
+            field="zipcode",
+            range_=bigquery.PartitionRange(start=0, end=100000, interval=10),
+        ),
+        bigquery_require_partition_filter=True,
+    )
+
 
 Threading and Multiprocessing
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
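
The `expiration_ms` value in the time-unit partitioning example is written as a product of constants that treats "6 months" as six 30-day months (180 days). As an illustrative alternative, the same number can be derived from a `datetime.timedelta`, which keeps the unit conversion explicit. The helper name `to_milliseconds` is hypothetical and not part of the library.

```python
import datetime


def to_milliseconds(delta):
    """Convert a timedelta to a whole number of milliseconds."""
    # Floor-dividing one timedelta by another yields an int.
    return delta // datetime.timedelta(milliseconds=1)


# Six 30-day months, matching the arithmetic used in the README example.
six_months_ms = to_milliseconds(datetime.timedelta(days=30 * 6))
print(six_months_ms)
```

The result could then be passed as `expiration_ms=six_months_ms` in place of the inline product; this is only a readability choice and does not change the value.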
