Skip to content

Conversation

@rodrigo-o
Copy link
Collaborator

@rodrigo-o rodrigo-o commented Nov 12, 2025

Motivation

Add rpc and engine instrumentation and panels to our dashboards

Description

This PR adds a couple of changes to profiling.rs and our dashboard:

  • Add a simple way to time async funcrions without relying on #instrument (which would be more complex for every rpc)
  • Add namespace as a field to the instrumentation spans
  • Set a namespace for the old ones (block_execution)
  • Make the breakdown panels work with the new changes
  • Add new RPC and Engine panels
image

"disableTextWrap": false,
"editorMode": "code",
"expr": "(\n sum by (function_name) (\n increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name!~\"Block execution|Execute Block|Trie update\"}[$__range])\n )\n)\nor\nlabel_replace(\n sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name=\"Block execution\"}[$__range]) )\n - sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name=~\"Storage read|Account read|Account code read|Block hash read\"}[$__range]) ),\n \"function_name\",\"Block Execution (w/o reads)\",\"__name__\",\".*\"\n)\nor\nlabel_replace(\n sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name=\"Execute Block\"}[$__range]) )\n - sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", function_name=\"Block execution\"}[$__range]) ),\n \"function_name\",\"Merkleization Diff\",\"__name__\",\".*\"\n)",
"expr": "(\n sum by (function_name) (\n increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name!~\"Block execution|Execute Block|Trie update\"}[$__range])\n )\n)\nor\nlabel_replace(\n sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name=\"Block execution\"}[$__range]) )\n - sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name=~\"Storage read|Account read|Account code read|Block hash read\"}[$__range]) ),\n \"function_name\",\"Block Execution (w/o reads)\",\"__name__\",\".*\"\n)\nor\nlabel_replace(\n sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name=\"Execute Block\"}[$__range]) )\n - sum( increase(function_duration_seconds_sum{job=\"$job\", instance=~\"$instance(:\\\\d+)?$\", namespace!~\"rpc|engine\", function_name=\"Block execution\"}[$__range]) ),\n \"function_name\",\"Merkleization Diff\",\"__name__\",\".*\"\n)",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of matching with the newly added block_execution namespace we are matching against the 2 additional ones (RPC and engine) this is to support the current running nodes when this changes are deployed to our centralized grafana instance. It avoids breaking the current Execution Breakdown for nodes that doesn'ta have this change and here is an issue to made this change in the following days: #5312

Comment on lines +200 to +202
record_async_duration(namespace, req.method.as_str(), async move {
request.handle(context).await
})
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to have this behind the metrics feature flag maybe? I'll take a look at how the rest of the instrumentation was working with the previous breakdown panels and in any case create an issue to follow-up on this.

@github-actions
Copy link

github-actions bot commented Nov 13, 2025

Lines of code report

Total lines added: 103
Total lines removed: 0
Total lines changed: 103

Detailed view
+-----------------------------------------------+-------+------+
| File                                          | Lines | Diff |
+-----------------------------------------------+-------+------+
| ethrex/crates/blockchain/blockchain.rs        | 1313  | +15  |
+-----------------------------------------------+-------+------+
| ethrex/crates/blockchain/metrics/profiling.rs | 110   | +55  |
+-----------------------------------------------+-------+------+
| ethrex/crates/blockchain/vm.rs                | 124   | +20  |
+-----------------------------------------------+-------+------+
| ethrex/crates/networking/rpc/rpc.rs           | 843   | +8   |
+-----------------------------------------------+-------+------+
| ethrex/crates/vm/backends/mod.rs              | 171   | +5   |
+-----------------------------------------------+-------+------+

@github-actions github-actions bot added the L1 Ethereum client label Nov 13, 2025
@rodrigo-o rodrigo-o marked this pull request as ready for review November 13, 2025 13:41
@rodrigo-o rodrigo-o requested a review from a team as a code owner November 13, 2025 13:41
Copilot AI review requested due to automatic review settings November 13, 2025 13:41
@ethrex-project-sync ethrex-project-sync bot moved this to In Review in ethrex_l1 Nov 13, 2025
Copilot finished reviewing on behalf of rodrigo-o November 13, 2025 13:43
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds comprehensive RPC and Engine API instrumentation to improve observability and performance monitoring. The changes introduce a namespace-based profiling system that categorizes metrics by their source (block_execution, rpc, engine), along with new Grafana dashboard panels to visualize these metrics.

Key Changes

  • Introduced record_async_duration helper function for timing async operations without requiring #[instrument] attributes
  • Added namespace field to all instrumentation spans for better metric categorization
  • Created new RPC and Engine API dashboard panels showing request rates, latency, and method breakdowns

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
crates/blockchain/metrics/profiling.rs Added namespace support via NamespaceVisitor, implemented record_async_duration for async timing, and updated histogram to include namespace label
crates/networking/rpc/rpc.rs Integrated record_async_duration in RPC handler to instrument all RPC/Engine calls with appropriate namespace
crates/networking/rpc/Cargo.toml Added ethrex-metrics dependency for profiling functionality
crates/blockchain/blockchain.rs Added namespace="block_execution" to Execute Block, Trie update, and Block DB update spans
crates/blockchain/vm.rs Added namespace="block_execution" to database read operations (Account, Storage, Block hash, Account code)
crates/vm/backends/mod.rs Added namespace="block_execution" to Block execution span
Cargo.toml Added ethrex-metrics to workspace dependencies
Cargo.lock Updated dependency graph for ethrex-rpc with ethrex-metrics
metrics/provisioning/grafana/dashboards/common_dashboards/ethrex_l1_perf.json Added RPC and Engine API panels with request rates, latency charts, and method breakdowns; updated existing queries to filter by namespace

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rodrigo-o and others added 2 commits November 13, 2025 14:54
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Collaborator

@MegaRedHand MegaRedHand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jrchatruc jrchatruc added this pull request to the merge queue Nov 14, 2025
Merged via the queue into main with commit 9fc41e7 Nov 14, 2025
49 of 55 checks passed
@jrchatruc jrchatruc deleted the rpc-instrumentation-and-panels branch November 14, 2025 20:48
@github-project-automation github-project-automation bot moved this from In Review to Done in ethrex_l1 Nov 14, 2025
lakshya-sky pushed a commit to lakshya-sky/ethrex that referenced this pull request Nov 17, 2025
**Motivation**

Add rpc and engine instrumentation and panels to our dashboards

<!-- Why does this pull request exist? What are its goals? -->

**Description**

This PR adds a couple of changes to profiling.rs and our dashboard:

- Add a simple way to time async funcrions without relying on
`#instrument` (which would be more complex for every rpc)
- Add namespace as a field to the instrumentation spans
- Set a namespace for the old ones (`block_execution`)
- Make the breakdown panels work with the new changes
- Add new RPC and Engine panels

<!-- A clear and concise general description of the changes this PR
introduces -->

<img width="2551" height="780" alt="image"
src="https://github.com/user-attachments/assets/3b9feead-fc21-4115-b91b-118157477f18"
/>

<!-- Link to issues: Resolves lambdaclass#111, Resolves lambdaclass#222 -->

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
github-merge-queue bot pushed a commit that referenced this pull request Nov 17, 2025
**Motivation**

Document the new execution breakdown and RPC panels in the dashboard as
well as other metric related docs

**Description**

This PR documents both #5220 and #5311 additions, so we might want to
merge #5311 first.

It updates the gap analysis, the actual dashboard doc and some metrics
related docs. It also sets prometheus scrape to 5s as already done today
on our centralized grafana instance.

Quick
[link](https://github.com/lambdaclass/ethrex/blob/update-dashboard-documentation/docs/developers/l1/dashboards.md)
to check the updated dashboard doc out with the viewer

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

L1 Ethereum client

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants