Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add table expiry #230

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

Conversation

thomas-vl
Copy link
Contributor

@thomas-vl thomas-vl commented Aug 21, 2023

Description & motivation

Add hours_to_expiration as an option to external tables in bigquery
resolves: #225

version: 2

sources:
  - name: test
    loader: gcloud storage
    dataset: "{{ target.dataset }}"
    tables:
      - name: test
        description: Sheet
        external:
          hours_to_expiration: 24
          location: 'https://docs.google.com/spreadsheets/d/id'
          options:
            format: google_sheets
            skip_leading_rows: 1
            sheet_range: 'sheetrange'
        columns:
          - name: column_name
            data_type: STRING
            description: "Column Name"

Checklist

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added an integration test for my fix/feature (if applicable)

Copy link
Collaborator

@dataders dataders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hours_to_expiration should be nested within options, just as the now merged max_staleness does. accordingly this new clause should be adjusted to happen within the existing options loop, as described below.

if not a CETAS options clause, to what does dbt's external.options pertain?

@@ -4,11 +4,11 @@
{%- set external = source_node.external -%}
{%- set partitions = external.partitions -%}
{%- set options = external.options -%}

{%- set hours_to_expiration = external.get('hours_to_expiration') -%}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this is all that is needed

Suggested change
{%- set hours_to_expiration = external.get('hours_to_expiration') -%}
{%- set hours_to_expiration = external.hours_to_expiration -%}

@@ -45,5 +45,8 @@
{%- endif -%}
{%- endfor -%}
{%- endif -%}
{%- if hours_to_expiration -%}
, expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL {{hours_to_expiration}} hour)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we integrate this into the already existing enumeration of options.items() (L39-L47)? perhaps something like the below?

this solution isn't very clean either though, imho. I thought to use the fancy jinja loop.first to exclude the leading comma in the first example, but expiration_timestamp will never be first in the list of options as uris are a required option. 😵 . perhaps we could refactor so that the currently unused external field is where uris are given??

            {%- if options is mapping -%}
            {%- for key, value in options.items() if key != 'uris' %}
                {%- if value is string -%}
                    , {{key}} = '{{value}}'
                {%- else -%}
                    {%- if key == "hours_to_expiration" -%}
                        , expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL {{hours_to_expiration}} hour)
                    {%- else -%}
                        , {{key}} = {{value}}
                {%- endif -%}
            {%- endfor -%}
            {%- endif -%}

@dataders
Copy link
Collaborator

dataders commented Apr 5, 2024

any progress on this @thomas-vl?

Copy link
Collaborator

@dataders dataders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomas-vl have you had a chance to consider my feedback from the previous review?

Copy link

This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DBT 'hours_to_expiration' on table [BigQuery]
2 participants