fordN commented Feb 3, 2026

This PR adds support for Prometheus metrics and defines an initial set of loader metrics.

Summary

  • Add AmpMetrics singleton with Prometheus metrics for loaders
  • Track records processed, latency, errors, batch sizes, connections
  • Handle metric re-registration in test environments
  • Add metrics documentation
  • Add prometheus-client dependency
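
The singleton and re-registration bullets above can be sketched as follows. This is a minimal, stdlib-only illustration, not the PR's actual AmpMetrics class: `ToyRegistry`, `ToyCounter`, `AmpMetricsSketch`, and the metric name `amp_records_processed` are all hypothetical stand-ins. The toy registry mimics the relevant behaviour of prometheus-client, whose registry raises `ValueError` when the same metric name is registered twice, which is what typically happens when tests rebuild a metrics object in one process.

```python
import threading


class DuplicateMetricError(ValueError):
    """Raised by the toy registry when a metric name is registered twice."""


class ToyRegistry:
    """Hypothetical stand-in for a metrics registry that rejects duplicates."""

    def __init__(self):
        self._metrics = {}

    def register(self, name, metric):
        if name in self._metrics:
            raise DuplicateMetricError(f"duplicate metric: {name}")
        self._metrics[name] = metric

    def get(self, name):
        return self._metrics[name]


class ToyCounter:
    """Minimal labelled counter: one running total per loader name."""

    def __init__(self):
        self.values = {}

    def inc(self, loader, amount=1):
        self.values[loader] = self.values.get(loader, 0) + amount


class AmpMetricsSketch:
    """Hypothetical singleton that tolerates metric re-registration."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self, registry):
        self.records_processed = self._get_or_create(
            registry, "amp_records_processed", ToyCounter
        )

    @staticmethod
    def _get_or_create(registry, name, factory):
        metric = factory()
        try:
            registry.register(name, metric)
        except DuplicateMetricError:
            # Already registered (e.g. a test rebuilt the singleton): reuse it.
            metric = registry.get(name)
        return metric

    @classmethod
    def get_instance(cls, registry):
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(registry)
            return cls._instance

    @classmethod
    def reset_instance(cls):
        """Drop the cached instance, as a test teardown might."""
        with cls._lock:
            cls._instance = None
```

With this shape, a test can call `reset_instance()` and build a fresh object without tripping the duplicate-name error, because the second construction reuses the metric that is already registered.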

Depends on PR #32 (test generalization)
Resolves #7

fordN added 15 commits January 8, 2026 18:12
Foundational work to enable all loader tests to inherit common test
patterns.

Adds generalized streaming test infrastructure and migrates Redis and
Snowflake loader tests to use the shared base classes.

Migrates the final three loader test suites to use the shared base test
infrastructure.

The Iceberg loader used an outdated reorg deletion method that
relied on a _meta_block_ranges column instead of the current
state_store + _amp_batch_id approach.

Changes:
1. _handle_reorg now uses state_store.invalidate_from_block() to get
   affected batch IDs, matching the PostgreSQL/Snowflake/DeltaLake approach

2. _perform_reorg_deletion now filters rows by _amp_batch_id instead of
   trying to parse the non-existent _meta_block_ranges JSON column

3. Filtering is efficient because it uses set membership checks on batch IDs
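
The three changes above amount to "ask the state store which batches are invalid, then drop rows carrying those batch IDs". A minimal sketch under stated assumptions: `StateStoreSketch` is a hypothetical in-memory stand-in, the signature and return type of `invalidate_from_block` are guesses rather than the project's API, and rows are plain dicts rather than Iceberg data.

```python
from dataclasses import dataclass, field


@dataclass
class StateStoreSketch:
    """Hypothetical state store: batch_id -> (start_block, end_block)."""

    batches: dict = field(default_factory=dict)

    def invalidate_from_block(self, block_number):
        """Forget and return IDs of batches that reach block_number or later."""
        affected = {
            batch_id
            for batch_id, (_, end_block) in self.batches.items()
            if end_block >= block_number
        }
        for batch_id in affected:
            del self.batches[batch_id]
        return affected


def perform_reorg_deletion(rows, affected_batch_ids):
    """Keep only rows whose _amp_batch_id is not in the invalidated set."""
    affected = set(affected_batch_ids)  # set gives O(1) membership checks
    return [row for row in rows if row["_amp_batch_id"] not in affected]
```

For example, with batches covering blocks 100-109, 110-119, and 120-129, a reorg at block 115 invalidates the last two batches, and only rows from the first batch survive the filter.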
Changed test_append_mode to append data with different IDs (6-10) instead
of reusing the same IDs (1-5) to avoid duplicate key conflicts in key-value
stores like LMDB and Redis.
- Redis stream storage: Corrected test to use f'{table_name}:stream' key
  format to match how Redis loader stores stream data
- LMDB overwrite mode: Fixed _clear_data() to properly delete named
  databases when overwriting data
- LMDB streaming: Added tx_hash column to test data for compatibility
  with key pattern requirements
- Base streaming tests: Updated column references from transaction_hash
  to tx_hash for consistency across all loaders
**Iceberg Loader:**
- Added snapshot_id to metadata for test compatibility
- Modified base loader to pass table_name in kwargs to metadata methods
- Skipped partition_spec test (requires PartitionSpec object implementation)

**PostgreSQL Loader:**
- Fixed _clear_table() to check table existence before TRUNCATE
- Prevents "relation does not exist" errors in overwrite mode
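
An existence check before TRUNCATE could look roughly like the sketch below. It is not the loader's actual `_clear_table()`: the function name `clear_table`, the lowercase-only name validation, and the use of a plain DB-API cursor with `%s` placeholders are assumptions for illustration. `to_regclass()` is a PostgreSQL function that returns NULL rather than raising when the relation does not exist.

```python
import re

# Conservative whitelist: lowercase, unqualified identifiers only (an assumption).
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def clear_table(cursor, table_name):
    """TRUNCATE table_name if it exists; return True if a TRUNCATE was issued."""
    # Identifiers cannot be bound as query parameters, so validate the name
    # before interpolating it into the TRUNCATE statement.
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError(f"unsafe table name: {table_name!r}")

    # to_regclass() yields NULL for a missing relation instead of an error.
    cursor.execute("SELECT to_regclass(%s)", (table_name,))
    row = cursor.fetchone()
    if row is None or row[0] is None:
        return False  # nothing to clear; avoids "relation does not exist"

    cursor.execute(f'TRUNCATE TABLE "{table_name}"')
    return True
```

In overwrite mode, the first load into a fresh database then becomes a no-op clear followed by table creation, instead of failing on the TRUNCATE.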

**DeltaLake Loader:**
- Added partition_by property for convenient access
- Added delta_version and files_added metadata aliases
- Fixed test fixture to use unique table paths per test
- Prevents data accumulation across tests

**Test Infrastructure:**
- Updated delta_basic_config fixture to generate unique paths per test
- Prevents cross-test contamination in DeltaLake tests
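
The idea behind the unique-path fixture can be sketched without pytest. The helper name `make_delta_config` and the config keys `table_path` and `partition_by` are hypothetical and may not match the real `delta_basic_config` fixture; the point is only that each call yields a fresh directory name, so no test appends to another test's table.

```python
import uuid
from pathlib import Path


def make_delta_config(base_dir, table_name="test_table"):
    """Return a config dict whose table_path is unique on every call."""
    # A random suffix keeps tables from different tests in separate directories.
    unique_dir = Path(base_dir) / f"{table_name}_{uuid.uuid4().hex}"
    return {"table_path": str(unique_dir), "partition_by": []}
```

In a pytest suite, a fixture would typically call a helper like this with the built-in `tmp_path` directory so the files are also cleaned up per test.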
Key fixes:
- Changed loader.conn to loader.connection (Snowflake uses different attribute name)
- Set supports_overwrite = False (Snowflake doesn't support OVERWRITE mode)
- Set requires_existing_table = False (Snowflake auto-creates tables)
- Added cleanup_tables fixture for Snowflake-specific test cleanup
fordN self-assigned this Feb 3, 2026
fordN changed the base branch from main to ford/generalize-loader-tests February 3, 2026 01:26
fordN force-pushed the ford/metrics-instrumentation branch from 60b9016 to 67c07a0 Compare February 3, 2026 01:31
fordN added the metrics label Feb 3, 2026
edgeandnode deleted a comment from github-actions bot Feb 3, 2026
- Use the new `prometheus-client` dependency
fordN force-pushed the ford/metrics-instrumentation branch from 67c07a0 to 5ee72d0 Compare February 3, 2026 01:59
fordN marked this pull request as draft February 3, 2026 04:09
Base automatically changed from ford/generalize-loader-tests to main February 9, 2026 22:00