Full refresh model kind creating multiple batches #3729

Open
@MikeWallis42

Description

I'm still trying to narrow down the exact conditions, but what triggered this behaviour was setting the default start in config.yml to a date far in the past:

model_defaults:
  dialect: trino
  start: 2020-03-09

After doing this, we noticed that the plan mentioned 4 batches. The run then failed our duplicate primary key audit: there are no temporal macros in the full models, so the same data was inserted 4 times.
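To illustrate the failure mode described above, here is a minimal sketch in plain Python. It does not use SQLMesh. It assumes each batch appends the query result to the target instead of replacing it, which matches what we observed but is not confirmed from the SQLMesh source. The names `full_query`, `run_batches` and `duplicate_keys` are hypothetical.

```python
from collections import Counter


def full_query():
    # Hypothetical FULL model query: it has no time filter, so every call
    # returns the complete result set regardless of the batch interval.
    return [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]


def run_batches(num_batches):
    # Assumption: each batch appends its rows to the target table.
    table = []
    for _ in range(num_batches):
        table.extend(full_query())
    return table


def duplicate_keys(rows, key="id"):
    # Stand-in for a duplicate primary key audit: list keys seen more than once.
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


table = run_batches(4)
print(len(table), duplicate_keys(table))
```

With a single batch no key is duplicated, so the audit would only start failing once the plan splits the model into more than one batch.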

I initially thought it could be caused by the variable BATCH_SIZE = 10000, since our models are set to interval_unit 'hour', but that only seems to be used when creating source queries for dataframes.
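For reference, the arithmetic behind that initial suspicion can be sketched as follows. This is only a back-of-the-envelope check, not how SQLMesh computes batches. `count_batches` is a hypothetical helper, and the idea that hourly intervals get chunked by 10000 is exactly the assumption that appears not to hold here.

```python
import math
from datetime import datetime, timedelta


def count_batches(start, end, interval_hours=1, batch_size=10000):
    # Number of whole intervals between start and end.
    total_seconds = (end - start).total_seconds()
    intervals = int(total_seconds // (interval_hours * 3600))
    # Number of chunks needed if intervals were grouped batch_size at a time.
    batches = math.ceil(intervals / batch_size) if intervals > 0 else 0
    return intervals, batches


start = datetime(2020, 3, 9)
# Illustrative end point only: 25000 hours after the configured start.
end = start + timedelta(hours=25000)
print(count_batches(start, end))
```

Passing the actual plan date as `end` shows whether a chunk size of 10000 hourly intervals would yield the batch count reported by the plan.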

At the moment our workaround is to set the start directly in the full models to '1 day ago'.
