@halconel commented Feb 6, 2026

Summary

  • Add a max_records_per_run parameter to limit the number of records processed per run
  • Add diagnostic logging when the offset is not calculated (empty output or NULL max_update_ts)
  • Add diagnostic logging when the offset is not updated in run_full

Motivation

After PR #372 was merged, analysis showed that 34% of transformations (73 of 215) have stuck offsets even though processing succeeds. This PR adds:

  1. Protection against processing millions of records in a single run
  2. Diagnostic logging to identify why offset updates fail

Changes

  • Added the max_records_per_run parameter to BaseBatchTransformStep.__init__
  • Applied a SQL LIMIT in get_full_process_ids when max_records_per_run is set
  • Added logging in store_batch_result when the offset is not calculated
  • Added logging in run_full when changes.offsets is empty
  • Updated the BatchTransform and DatatableBatchTransform dataclasses
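
For readers who want the shape of the two ideas without opening the diff, here is a minimal standalone sketch. It is not the code from this PR: it uses the standard-library sqlite3 module instead of the project's database layer, and the table input_meta, the functions get_process_ids and compute_offset, and the logger name are all hypothetical names chosen for illustration. It only shows a LIMIT being applied when max_records_per_run is set, and a log line for each reason the offset could not be calculated.

```python
import logging
import sqlite3
from typing import List, Optional

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("offsets_sketch")


def get_process_ids(
    conn: sqlite3.Connection,
    max_records_per_run: Optional[int] = None,
) -> List[int]:
    """Return ids to process, oldest update first, capped when a limit is set."""
    # Assumed schema: input_meta(id INTEGER, update_ts REAL).
    sql = "SELECT id FROM input_meta ORDER BY update_ts, id"
    params: tuple = ()
    if max_records_per_run is not None:
        # Cap the work done in one run so a large backlog is drained in slices.
        sql += " LIMIT ?"
        params = (max_records_per_run,)
    return [row[0] for row in conn.execute(sql, params)]


def compute_offset(conn: sqlite3.Connection, ids: List[int]) -> Optional[float]:
    """Return max(update_ts) over the given ids, logging why when it is None."""
    if not ids:
        logger.warning("offset not calculated: empty output")
        return None
    placeholders = ",".join("?" * len(ids))
    row = conn.execute(
        f"SELECT MAX(update_ts) FROM input_meta WHERE id IN ({placeholders})",
        ids,
    ).fetchone()
    if row[0] is None:
        logger.warning(
            "offset not calculated: max update_ts is NULL for %d records", len(ids)
        )
    return row[0]
```

Ordering by update_ts before applying the limit is a choice made for this sketch, so that each capped run takes the oldest records and the computed offset can move forward; whether get_full_process_ids orders its results the same way is not stated in this description.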

@halconel halconel merged commit 7882c9d into epoch8:feat/offsets Feb 6, 2026
37 checks passed