feat: add observoor CPU utilization model #218
Merged
Add external model for observoor.cpu_utilization and transformation model fct_node_cpu_utilization that enriches with node_class for EIP-7870 reference node filtering. Includes migration 070 and auto-generated proto/Go bindings. Also fixes proto comments for int_engine_new_payload_fastest_execution_by_node_class that referenced the wrong table name.
Fix fct_node_cpu_utilization dependency format from
"observoor.cpu_utilization" to "{{external}}.observoor_cpu_utilization"
to match the expected dependency pattern. Update FROM clause to use
the correct dep helper syntax.
Add pectra and fusaka transformation tests with assertions covering
data integrity, CPU percentage bounds, and node_class enrichment logic.
The CI test runner needs this table schema to exist so it can clone it and load parquet test data for fct_node_cpu_utilization.
The test runner's CloneExternalDatabase always looked in the `default` database for external table schemas. External models with a `database` field in their frontmatter (e.g., observoor_cpu_utilization → observoor.cpu_utilization) need to be cloned from the correct source database.
Changes:
- Add Database field to Frontmatter, SourceDB/SourceTable to ModelMetadata
- Add ExternalTableRef type to carry cross-database source info through the pipeline
- Update CloneExternalDatabase to accept ExternalTableRef list with per-table source DB
- Update cloneTableWithUniqueReplicaPath and modifyCreateTableForClone to handle table rename when source table name differs from model name
- Remove migration 071 (incorrectly placed external table on CBT cluster)
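To make the cross-database clone easier to picture, here is a minimal sketch of what a reference type like ExternalTableRef could carry. The field names, the Source and NeedsRename helpers, and the fallback behaviour are assumptions for illustration only; the actual type in this PR may look different.

```go
package main

import "fmt"

// ExternalTableRef is a hypothetical shape for the cross-database source info
// described above. Field names are assumptions, not the PR's actual code.
type ExternalTableRef struct {
	ModelName   string // model identifier, e.g. "observoor_cpu_utilization"
	SourceDB    string // frontmatter database; empty means the default database
	SourceTable string // frontmatter table, e.g. "cpu_utilization"
}

// Source returns the database and table the schema should be cloned from.
// It falls back to "default" and the model name when the fields are unset.
func (r ExternalTableRef) Source() (db, table string) {
	db, table = r.SourceDB, r.SourceTable
	if db == "" {
		db = "default"
	}
	if table == "" {
		table = r.ModelName
	}
	return db, table
}

// NeedsRename reports whether the cloned table has to be renamed because the
// source table name differs from the model name.
func (r ExternalTableRef) NeedsRename() bool {
	_, table := r.Source()
	return table != r.ModelName
}

func main() {
	ref := ExternalTableRef{
		ModelName:   "observoor_cpu_utilization",
		SourceDB:    "observoor",
		SourceTable: "cpu_utilization",
	}
	db, table := ref.Source()
	fmt.Printf("clone %s.%s as %s (rename: %v)\n", db, table, ref.ModelName, ref.NeedsRename())
}
```

Keeping the model name and the source table as separate fields is what lets the clone step decide on a rename without guessing from string prefixes.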
The frontmatter `table` field for cross-database external models contains the source table name (e.g., "cpu_utilization"), not the model identifier (e.g., "observoor_cpu_utilization"). The test runner was caching the model under the wrong key, so lookups failed and table cloning fell back to the default database.
The global word-boundary regex was causing double-prefixing of table names in cross-database clones (observoor_observoor_cpu_utilization_local). Replace with targeted string replacements that only modify the Distributed engine's local table reference.
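The targeted replacement can be sketched roughly as follows. The helper name rewriteDistributedLocal and the sample DDL (cluster macro, sharding key) are assumptions for illustration; the point is only that matching the quoted local-table argument leaves every other occurrence of the name alone and stays stable when applied more than once.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteDistributedLocal replaces only the quoted local-table argument of a
// Distributed engine clause. Hypothetical helper, not the PR's actual code.
func rewriteDistributedLocal(ddl, oldLocal, newLocal string) string {
	// Matching the surrounding single quotes keeps the edit away from other
	// occurrences of the table name elsewhere in the statement.
	return strings.Replace(ddl, "'"+oldLocal+"'", "'"+newLocal+"'", 1)
}

func main() {
	// Illustrative DDL; the cluster macro and sharding key are assumptions.
	ddl := "CREATE TABLE observoor.cpu_utilization AS observoor.cpu_utilization_local " +
		"ENGINE = Distributed('{cluster}', 'observoor', 'cpu_utilization_local', rand())"

	once := rewriteDistributedLocal(ddl, "cpu_utilization_local", "observoor_cpu_utilization_local")
	twice := rewriteDistributedLocal(once, "cpu_utilization_local", "observoor_cpu_utilization_local")

	fmt.Println(once)
	fmt.Println("idempotent:", once == twice)
}
```

Because the already-rewritten argument no longer starts with a quote directly before the old name, a second pass finds nothing to replace, which is the property a broad regex over the whole statement does not give.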
Two fixes for fct_node_cpu_utilization producing 0 rows in CI:
1. Fix transformation dep key: CBT resolves external dependency entries
using eConfig.Table (from frontmatter), not the model name. For
cross-database models where table != model name, the dep key must
match the frontmatter table name ("cpu_utilization"), not the model
name ("observoor_cpu_utilization").
2. Fix parquet data loading: Cross-database external models have their
bounds scan and dependency helpers resolve to the source database
(e.g., observoor.cpu_utilization). The test runner must load parquet
data into the source database, not the per-test ext_XXX database,
so the CBT engine finds data during bounds scanning.
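The two fixes above can be sketched as a pair of small helpers. The ExternalModel type, the function names depKey and loadTarget, and the test database name "ext_abc123" are all hypothetical; this is only a sketch of the decision logic as described in the commit message, not the test runner's actual code.

```go
package main

import "fmt"

// ExternalModel holds the frontmatter fields this sketch needs.
// The type and its field names are hypothetical.
type ExternalModel struct {
	Name     string // model name, e.g. "observoor_cpu_utilization"
	Database string // frontmatter database; empty for default-database models
	Table    string // frontmatter table, e.g. "cpu_utilization"
}

// depKey returns the key a transformation would list for this dependency:
// the frontmatter table when set, otherwise the model name.
func depKey(m ExternalModel) string {
	if m.Table != "" {
		return m.Table
	}
	return m.Name
}

// loadTarget picks where parquet test data is loaded. Cross-database models
// go to their source database; everything else goes to the per-test database.
func loadTarget(m ExternalModel, testDB string) (db, table string) {
	if m.Database != "" {
		return m.Database, depKey(m)
	}
	return testDB, m.Name
}

func main() {
	m := ExternalModel{Name: "observoor_cpu_utilization", Database: "observoor", Table: "cpu_utilization"}
	db, table := loadTarget(m, "ext_abc123")
	fmt.Printf("dep key: %s, load into: %s.%s\n", depKey(m), db, table)
}
```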
…l models
The {{external}} placeholder substitutes the default external database,
which doesn't match cross-database models that register under their own
database (e.g., observoor.cpu_utilization). Use the literal database.table
format so CBT's DAG lookup resolves correctly.
Also adds resolveExternalDependency to the test runner so "observoor.cpu_utilization"
maps back to the canonical model name "observoor_cpu_utilization".
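A minimal sketch of the reverse mapping, assuming the test runner has a list of external models with their frontmatter database and table. Only the function name resolveExternalDependency comes from the commit message; the signature, the ExternalModel type, and the pass-through fallback are assumptions.

```go
package main

import "fmt"

// ExternalModel holds the frontmatter fields this sketch needs.
// The type and its field names are hypothetical.
type ExternalModel struct {
	Name     string // canonical model name
	Database string // frontmatter database
	Table    string // frontmatter table
}

// resolveExternalDependency maps a literal "database.table" dependency back
// to the canonical model name. Unknown dependencies are returned unchanged.
// The signature is an assumption, not the PR's actual code.
func resolveExternalDependency(dep string, models []ExternalModel) string {
	for _, m := range models {
		if m.Database != "" && dep == m.Database+"."+m.Table {
			return m.Name
		}
	}
	return dep
}

func main() {
	models := []ExternalModel{
		{Name: "observoor_cpu_utilization", Database: "observoor", Table: "cpu_utilization"},
	}
	fmt.Println(resolveExternalDependency("observoor.cpu_utilization", models))
}
```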
Better aligns with naming conventions since the data is per-process (keyed by pid + client_type) within each node.
Summary
- observoor_cpu_utilization to ingest eBPF CPU utilization data from the observoor database
- fct_node_cpu_utilization with node_class enrichment for EIP-7870 filtering
- _local + distributed pattern

Depends on observoor data being available on the xatu ClickHouse cluster. Frontend counterpart: ethpandaops/lab#417