[Sui] add a process label to uptime metric (MystenLabs#14951)
## Description 

Adds a `process` label to the uptime metric to differentiate between
validators and fullnodes. When validators have stuck consensus, the two
are harder to tell apart via other metrics.

## Test Plan 

CI. Private testnet.

---
If your changes are not user-facing and not a breaking change, you can
skip the following section. Otherwise, please indicate what changed, and
then add to the Release Notes section as highlighted during the release
process.

### Type of Change (Check all that apply)

- [ ] protocol change
- [ ] user-visible impact
- [ ] breaking change for client SDKs
- [ ] breaking change for FNs (FN binary must upgrade)
- [ ] breaking change for validators or node operators (must upgrade binaries)
- [ ] breaking change for on-chain data layout
- [ ] necessitate either a data wipe or data migration

### Release notes
mwtian authored Nov 21, 2023
1 parent 75c7d3f commit 01a8af7
Showing 3 changed files with 18 additions and 5 deletions.
10 changes: 7 additions & 3 deletions crates/mysten-metrics/src/lib.rs
@@ -319,13 +319,17 @@ impl RegistryService {
 }

 /// Create a metric that measures the uptime from when this metric was constructed.
-/// The metric is labeled with the provided 'version' label (this should generally be of the
-/// format: 'semver-gitrevision') and the provided 'chain_identifier' label.
+/// The metric is labeled with:
+/// - 'process': the process type, differentiating between validator and fullnode
+/// - 'version': binary version, generally be of the format: 'semver-gitrevision'
+/// - 'chain_identifier': the identifier of the network which this process is part of
 pub fn uptime_metric(
+    process: &str,
     version: &'static str,
     chain_identifier: &str,
 ) -> Box<dyn prometheus::core::Collector> {
     let opts = prometheus::opts!("uptime", "uptime of the node service in seconds")
+        .variable_label("process")
         .variable_label("version")
         .variable_label("chain_identifier");

@@ -335,7 +339,7 @@ pub fn uptime_metric(
         opts,
         prometheus_closure_metric::ValueType::Counter,
         uptime,
-        &[version, chain_identifier],
+        &[process, version, chain_identifier],
     )
     .unwrap();
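
For orientation, with the three labels in place a scrape of this metric should contain a sample roughly like the line the sketch below prints. This is a std-only illustration of the Prometheus text exposition format, not the `prometheus` crate's encoder: `render_uptime_sample` is a hypothetical helper, the version and chain identifier values are made up, and label-value escaping is deliberately left out.

```rust
use std::time::Instant;

/// Hypothetical helper (not part of the commit): renders one `uptime` sample
/// in the Prometheus text exposition format. Labels appear in the same order
/// as the label-value slice in the diff: process, version, chain_identifier.
/// Assumes the label values need no escaping (no quotes, backslashes, newlines).
fn render_uptime_sample(
    process: &str,
    version: &str,
    chain_identifier: &str,
    uptime_secs: u64,
) -> String {
    format!(
        "uptime{{process=\"{process}\",version=\"{version}\",chain_identifier=\"{chain_identifier}\"}} {uptime_secs}"
    )
}

fn main() {
    // Uptime is measured from the moment the metric is constructed.
    let constructed_at = Instant::now();
    let uptime_secs = constructed_at.elapsed().as_secs();

    // Illustrative label values only.
    println!(
        "{}",
        render_uptime_sample("validator", "1.0.0-abc123", "unknown", uptime_secs)
    );
}
```

Because `process` is now part of the label set, a query can select `uptime{process="validator"}` or `uptime{process="fullnode"}` separately, which is the differentiation the description asks for.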

7 changes: 6 additions & 1 deletion crates/sui-node/src/main.rs
@@ -123,12 +123,17 @@ fn main() {
         let node = node_once_cell_clone.get().await;
         let chain_identifier = match node.state().get_chain_identifier() {
             Some(chain_identifier) => chain_identifier.to_string(),
-            None => "Unknown".to_string(),
+            None => "unknown".to_string(),
         };

         info!("Sui chain identifier: {chain_identifier}");
         prometheus_registry
             .register(mysten_metrics::uptime_metric(
+                if is_validator {
+                    "validator"
+                } else {
+                    "fullnode"
+                },
                 VERSION,
                 chain_identifier.as_str(),
             ))
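
The label selection in this hunk can be read in isolation as two small mappings. The sketch below restates them as standalone functions so they can be checked without a running node; `process_label` and `chain_identifier_label` are hypothetical names that do not appear in the commit, and the generic `Option<T>` stands in for whatever `get_chain_identifier()` actually returns.

```rust
/// Hypothetical helper: mirrors the inline `if is_validator { .. } else { .. }`
/// expression passed as the `process` argument.
fn process_label(is_validator: bool) -> &'static str {
    if is_validator {
        "validator"
    } else {
        "fullnode"
    }
}

/// Hypothetical helper: mirrors the `match` on the chain identifier, falling
/// back to the lowercase "unknown" introduced by this commit.
fn chain_identifier_label<T: ToString>(chain_identifier: Option<T>) -> String {
    match chain_identifier {
        Some(chain_identifier) => chain_identifier.to_string(),
        None => "unknown".to_string(),
    }
}

fn main() {
    // Illustrative values only; "abc123" is not a real chain identifier.
    let labels = [
        process_label(true).to_string(),
        chain_identifier_label(Some("abc123")),
        chain_identifier_label(None::<&str>),
    ];
    println!("{labels:?}");
}
```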
6 changes: 5 additions & 1 deletion crates/sui-proxy/src/main.rs
@@ -91,7 +91,11 @@ async fn main() -> Result<()> {
     let registry_service = metrics::start_prometheus_server(metrics_listener);
     let prometheus_registry = registry_service.default_registry();
     prometheus_registry
-        .register(mysten_metrics::uptime_metric(VERSION, "sui-proxy"))
+        .register(mysten_metrics::uptime_metric(
+            "sui-proxy",
+            VERSION,
+            "unavailable",
+        ))
         .unwrap();
     let app = app(
         Labels {