Closed
Description
Is this your first time submitting a feature request?
- I have read the expectations for open source contributors
- I have searched the existing issues, and I could not find an existing issue for this feature
- I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
Describe the feature
There are many use cases where it would be helpful to know the event time field of a given model:
- filtering your project to only run on the last X days of data (sampling)
- incremental extensions (lookback window, generate the "where" clause for you)
- intelligently set your clustering key
- etc.
This is similar to a handful of our current configs:
We believe this is distinct from your partition key, because you likely want to filter on something different (event time) from what you partition on (processing time).
We should add a new config to allow folks to specify the event time field for a given model.
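To make the sampling and lookback use cases concrete, here is a minimal sketch of how a "where" clause could be generated from the configured event time expression. The function name `build_event_time_filter`, its parameters, and the timestamp literal format are assumptions for illustration only; none of this is existing dbt functionality.

```python
from datetime import datetime, timedelta


def build_event_time_filter(event_time_sql: str, lookback_days: int, now: datetime) -> str:
    """Return a SQL predicate that keeps rows from the last `lookback_days` days.

    `event_time_sql` stands for the value of the proposed event time config:
    a column name or a SQL expression such as to_date(my_time_col).
    """
    if lookback_days < 0:
        raise ValueError("lookback_days must be non-negative")
    cutoff = now - timedelta(days=lookback_days)
    # Assumption: a plain quoted timestamp literal; real adapters may need
    # different quoting or casting.
    cutoff_literal = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    return f"{event_time_sql} >= '{cutoff_literal}'"


# Example: keep only the last 3 days of data for a model.
reference_time = datetime(2024, 3, 10, 12, 0, 0)
print(build_event_time_filter("my_time_column", 3, reference_time))
```

Passing the reference time in explicitly keeps the sketch deterministic; a real implementation would presumably derive it from the invocation start time.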
Acceptance Criteria
- Create a new config event_time that accepts SQL (the name of a column, or SQL like to_date(my_time_col); similar to loaded_at_field)
models:
  - name: my_model
    config:
      event_time: my_time_column
    columns:
      - name: my_time_column
        data_type: timestamp
        ...
- Config can be set in a config block, in dbt_project.yml, or in schema YML
- Config can be set for models and sources
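Since the config could be set in three places, the resolved value depends on which location wins. The sketch below assumes the general dbt precedence (in-file config block over schema YML over dbt_project.yml) would apply to this config too; the function `resolve_config` and the plain dictionaries are hypothetical stand-ins, not dbt internals.

```python
from typing import Any, Mapping, Optional


def resolve_config(
    key: str,
    project_cfg: Mapping[str, Any],
    schema_cfg: Mapping[str, Any],
    block_cfg: Mapping[str, Any],
) -> Optional[Any]:
    """Return the most specific value for `key`, or None if it is unset."""
    # Check the most specific location first, then fall back to broader ones.
    for layer in (block_cfg, schema_cfg, project_cfg):
        if key in layer:
            return layer[key]
    return None


project = {"event_time": "loaded_at"}        # from dbt_project.yml
schema = {"event_time": "my_time_column"}    # from schema YML
block = {}                                   # nothing set in the model file

print(resolve_config("event_time", project, schema, block))
```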
YML design options
Option 1: field name field
models:
  - name: my_model
    config:
      event_time_field: my_time_field
    columns:
      - name: my_time_field
        data_type: timestamp
        ...
        granularity: day
Pros | Cons |
---|---|
can set in dbt_project.yml | duplicate column name in YML |
reuse same data_type config block | |
could use any SQL, like loaded_at_field | |
Option 2: dictionary config
models:
  - name: my_model
    config:
      event_time:
        field_name: my_time_field
        granularity: day
        ...
    columns:
      - name: my_time_field
        data_type: timestamp
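One way to compare the two options is to see what a parser would have to accept for each. The sketch below normalizes either shape (a bare string as in Option 1, or a dictionary as in Option 2) into one structure. The `EventTime` class and `normalize_event_time` function are hypothetical names for illustration, not part of dbt.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class EventTime:
    field_name: str
    granularity: Optional[str] = None


def normalize_event_time(raw: Any) -> EventTime:
    """Accept the Option 1 string form or the Option 2 dictionary form."""
    if isinstance(raw, str):
        # Option 1: the value is just the field name (or a SQL expression).
        return EventTime(field_name=raw)
    if isinstance(raw, dict):
        # Option 2: a dictionary with field_name and an optional granularity.
        unknown = set(raw) - {"field_name", "granularity"}
        if unknown:
            raise ValueError(f"unknown event_time keys: {sorted(unknown)}")
        if "field_name" not in raw:
            raise ValueError("event_time requires field_name")
        return EventTime(raw["field_name"], raw.get("granularity"))
    raise TypeError("event_time must be a string or a dictionary")


print(normalize_event_time("my_time_field"))
print(normalize_event_time({"field_name": "my_time_field", "granularity": "day"}))
```

The dictionary form leaves room for further keys later, at the cost of stricter validation; the string form is simpler to set in dbt_project.yml.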
Open considerations
- What should the config be called? event_time, event_time_field, something else?
- Should we also support this configuration on other resource types (sources, snapshots, etc.)?
Additional requirement
While working on this ticket, add some documentation on how config on the node and config in node.config work (code-level implementation).
related issue #7157
https://docs.getdbt.com/reference/configs-and-properties