Description
Feature description
As a technical dlt user who gets multiple duplicate-cursor warnings during extract, I would like the warning to state which resource breached the threshold.
Examples
{"written_at":"2024-01-15T14:18:11.004Z","written_ts":12323743747832468,"component_name":"salesforce","process":12345,"taskName":null,"msg":"Large number of records (201) sharing the same value of cursor field '<yourcursorfield>'. This can happen if the cursor field has a low resolution (e.g., only stores dates without times), causing many records to share the same cursor value. Consider using a cursor column with higher resolution to reduce the deduplication state size.","type":"log","logger":"dlt","thread":"MainThread","level":"WARNING","module":"__init__","line_no":600,"version":{"dlt_version":"1.15.1","pipeline_name":"<pipeline_name>"}}
Are you a dlt user?
Yes, I'm already a dlt user.
Use case
With Salesforce, for example, it is hard to tell which resource is being deduplicated or has source data issues.
Proposed solution
Add the name of the resource in question to the logger.warning call in _check_duplicate_cursor_threshold.
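A rough sketch of what the proposed message could look like. This is not dlt's actual code: the function signatures, the `resource_name` parameter, the threshold value of 200, and the resource name `account` are all assumptions for illustration. Only the warning wording is taken from the log line above.

```python
import logging

logger = logging.getLogger("dlt_sketch")  # stand-in logger, not dlt's own

# Assumed threshold for illustration; the real value in dlt may differ.
DUPLICATE_CURSOR_WARNING_THRESHOLD = 200


def format_duplicate_cursor_warning(resource_name: str, cursor_path: str, count: int) -> str:
    """Build the warning text, naming the resource next to the cursor field."""
    return (
        f"Large number of records ({count}) sharing the same value of "
        f"cursor field '{cursor_path}' in resource '{resource_name}'. "
        "This can happen if the cursor field has a low resolution "
        "(e.g., only stores dates without times), causing many records "
        "to share the same cursor value. Consider using a cursor column "
        "with higher resolution to reduce the deduplication state size."
    )


def check_duplicate_cursor_threshold(
    resource_name: str,
    cursor_path: str,
    count: int,
    threshold: int = DUPLICATE_CURSOR_WARNING_THRESHOLD,
) -> bool:
    """Log a warning and return True when count exceeds the threshold."""
    if count <= threshold:
        return False
    logger.warning(format_duplicate_cursor_warning(resource_name, cursor_path, count))
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # Hypothetical resource and cursor field names.
    check_duplicate_cursor_threshold("account", "SystemModstamp", 201)
```

With the resource name in the message, a warning from a multi-resource source such as Salesforce can be traced to a single resource without extra debugging.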
Related issues
No response