Description
This is a follow-up to the work for #2408 and PR #2409.
Suggestions
We discussed this with operators yesterday. Here are some takeaways, also based on my observations so far:
- Even after a channel is cleared, `oldest_*` metrics can remain at the same value (i.e., not reset to `0`).
  - I'm not sure why that would be the case, because in v1-rc.0 we do reset the field to `0`:
    - https://github.com/informalsystems/ibc-rs/blob/b80bceab591a291203af8c41732b016cf612a296/telemetry/src/state.rs#L468-L470
- I think we should clarify what `oldest_timestamp` is. It seems this field is a timestamp local to the Hermes process, not an on-chain packet timestamp (when the packet was created), which is not clear from the telemetry help message, specifically:

      # HELP oldest_timestamp The timestamp of the oldest sequence number in seconds
      # TYPE oldest_timestamp gauge
      oldest_timestamp{chain="ibc-0",channel="channel-3",counterparty="ibc-1",port="transfer"} 0

- Let's rename `oldest_*` metrics to `backlog_*` and additionally:
  - make it clear these metrics are per-channel
  - add a `backlog_size` metric to capture the number of pending packets in a channel.
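
To make the proposal concrete, here is a rough sketch of how the three per-channel `backlog_*` values could be derived from a single map of pending packets. This is not the existing telemetry code: `ChannelKey`, `BacklogGauges`, `BacklogTracker` and all their methods are hypothetical names, and the label set is only assumed from the sample output above. The sketch assumes the timestamp is the local time at which the relayer first observed the packet, and that all three values fall back to `0` once the channel has no pending packets.

```rust
use std::collections::{BTreeMap, HashMap};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical label set identifying one channel (assumed from the sample output).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ChannelKey {
    pub chain: String,
    pub channel: String,
    pub counterparty: String,
    pub port: String,
}

/// Snapshot of the three proposed gauges for one channel.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BacklogGauges {
    pub oldest_sequence: u64,
    /// Local observation time in Unix seconds, NOT an on-chain timestamp.
    pub oldest_timestamp: u64,
    pub size: u64,
}

/// Hypothetical tracker: pending sequence number -> local observation time.
#[derive(Default)]
pub struct BacklogTracker {
    pending: HashMap<ChannelKey, BTreeMap<u64, u64>>,
}

impl BacklogTracker {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record a pending packet, stamped with the local wall clock.
    pub fn insert(&mut self, key: &ChannelKey, sequence: u64) {
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        self.insert_at(key, sequence, now);
    }

    /// Same as `insert`, with an explicit observation time (handy for tests).
    pub fn insert_at(&mut self, key: &ChannelKey, sequence: u64, observed_at: u64) {
        self.pending
            .entry(key.clone())
            .or_default()
            .insert(sequence, observed_at);
    }

    /// Drop a packet once it has been relayed or has timed out.
    pub fn remove(&mut self, key: &ChannelKey, sequence: u64) {
        if let Some(seqs) = self.pending.get_mut(key) {
            seqs.remove(&sequence);
        }
    }

    /// Compute the gauges; everything is 0 when the channel has no backlog.
    pub fn gauges(&self, key: &ChannelKey) -> BacklogGauges {
        let oldest = self.pending.get(key).and_then(|seqs| {
            // BTreeMap iterates in key order, so the first entry is the oldest sequence.
            seqs.iter()
                .next()
                .map(|(&seq, &ts)| (seq, ts, seqs.len() as u64))
        });
        match oldest {
            Some((oldest_sequence, oldest_timestamp, size)) => BacklogGauges {
                oldest_sequence,
                oldest_timestamp,
                size,
            },
            None => BacklogGauges {
                oldest_sequence: 0,
                oldest_timestamp: 0,
                size: 0,
            },
        }
    }
}
```

Deriving all three values from the same map means they cannot drift apart: when the last pending packet is removed, the size, the oldest sequence and the timestamp all go back to `0` together, which is the behavior operators expected after clearing a channel.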