-
Notifications
You must be signed in to change notification settings - Fork 123
feat(l1): rpc instrumentation and panels #5311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| "disableTextWrap": false, | ||
| "editorMode": "code", | ||
| "expr": "(\n sum by (function_name) (\n increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name!~\"Block execution|Execute Block|Trie update\"}[$__range])\n )\n)\nor\nlabel_replace(\n sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name=\"Block execution\"}[$__range]) )\n - sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name=~\"Storage read|Account read|Account code read|Block hash read\"}[$__range]) ),\n \"function_name\",\"Block Execution (w/o reads)\",\"__name__\",\".*\"\n)\nor\nlabel_replace(\n sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name=\"Execute Block\"}[$__range]) )\n - sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name=\"Block execution\"}[$__range]) ),\n \"function_name\",\"Merkleization Diff\",\"__name__\",\".*\"\n)", | ||
| "expr": "(\n sum by (function_name) (\n increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name!~\"Block execution|Execute Block|Trie update\"}[$__range])\n )\n)\nor\nlabel_replace(\n sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name=\"Block execution\"}[$__range]) )\n - sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name=~\"Storage read|Account read|Account code read|Block hash read\"}[$__range]) ),\n \"function_name\",\"Block Execution (w/o reads)\",\"__name__\",\".*\"\n)\nor\nlabel_replace(\n sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name=\"Execute Block\"}[$__range]) )\n - sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name=\"Block execution\"}[$__range]) ),\n \"function_name\",\"Merkleization Diff\",\"__name__\",\".*\"\n)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of matching with the newly added block_execution namespace we are matching against the 2 additional ones (RPC and engine) this is to support the current running nodes when this changes are deployed to our centralized grafana instance. It avoids breaking the current Execution Breakdown for nodes that doesn'ta have this change and here is an issue to made this change in the following days: #5312
| record_async_duration(namespace, req.method.as_str(), async move { | ||
| request.handle(context).await | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to have this behind the metrics feature flag maybe? I'll take a look at how the rest of the instrumentation was working with the previous breakdown panels and in any case create an issue to follow-up on this.
Lines of code reportTotal lines added: Detailed view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds comprehensive RPC and Engine API instrumentation to improve observability and performance monitoring. The changes introduce a namespace-based profiling system that categorizes metrics by their source (block_execution, rpc, engine), along with new Grafana dashboard panels to visualize these metrics.
Key Changes
- Introduced
record_async_durationhelper function for timing async operations without requiring#[instrument]attributes - Added namespace field to all instrumentation spans for better metric categorization
- Created new RPC and Engine API dashboard panels showing request rates, latency, and method breakdowns
Reviewed Changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/blockchain/metrics/profiling.rs | Added namespace support via NamespaceVisitor, implemented record_async_duration for async timing, and updated histogram to include namespace label |
| crates/networking/rpc/rpc.rs | Integrated record_async_duration in RPC handler to instrument all RPC/Engine calls with appropriate namespace |
| crates/networking/rpc/Cargo.toml | Added ethrex-metrics dependency for profiling functionality |
| crates/blockchain/blockchain.rs | Added namespace="block_execution" to Execute Block, Trie update, and Block DB update spans |
| crates/blockchain/vm.rs | Added namespace="block_execution" to database read operations (Account, Storage, Block hash, Account code) |
| crates/vm/backends/mod.rs | Added namespace="block_execution" to Block execution span |
| Cargo.toml | Added ethrex-metrics to workspace dependencies |
| Cargo.lock | Updated dependency graph for ethrex-rpc with ethrex-metrics |
| metrics/provisioning/grafana/dashboards/common_dashboards/ethrex_l1_perf.json | Added RPC and Engine API panels with request rates, latency charts, and method breakdowns; updated existing queries to filter by namespace |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
metrics/provisioning/grafana/dashboards/common_dashboards/ethrex_l1_perf.json
Outdated
Show resolved
Hide resolved
metrics/provisioning/grafana/dashboards/common_dashboards/ethrex_l1_perf.json
Outdated
Show resolved
Hide resolved
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
MegaRedHand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
**Motivation** Add rpc and engine instrumentation and panels to our dashboards <!-- Why does this pull request exist? What are its goals? --> **Description** This PR adds a couple of changes to profiling.rs and our dashboard: - Add a simple way to time async funcrions without relying on `#instrument` (which would be more complex for every rpc) - Add namespace as a field to the instrumentation spans - Set a namespace for the old ones (`block_execution`) - Make the breakdown panels work with the new changes - Add new RPC and Engine panels <!-- A clear and concise general description of the changes this PR introduces --> <img width="2551" height="780" alt="image" src="https://github.com/user-attachments/assets/3b9feead-fc21-4115-b91b-118157477f18" /> <!-- Link to issues: Resolves lambdaclass#111, Resolves lambdaclass#222 --> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
**Motivation** Document the new execution breakdown and RPC panels in the dashboard as well as other metric related docs **Description** This PR documents both #5220 and #5311 additions, so we might want to merge #5311 first. It updates the gap analysis, the actual dashboard doc and some metrics related docs. It also sets prometheus scrape to 5s as already done today on our centralized grafana instance. Quick [link](https://github.com/lambdaclass/ethrex/blob/update-dashboard-documentation/docs/developers/l1/dashboards.md) to check the updated dashboard doc out with the viewer --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Motivation
Add rpc and engine instrumentation and panels to our dashboards
Description
This PR adds a couple of changes to profiling.rs and our dashboard:
#instrument(which would be more complex for every rpc)block_execution)