
Improve HotStuff Metrics: Add Internal Metrics Abstraction, Runtime Sampling, and Bandwidth Metric #281

@meling


Summary

Our current metrics flow is built around protobuf messages that are collected at worker nodes (clients and replicas) and sent to the experiment controller. This works well, but the way we record metrics internally should follow the style of Go's runtime/metrics package. We should also collect selected Go runtime metrics at each worker and add support for bandwidth measurements between replicas.

Motivation

  • We want a consistent internal representation for all metrics—HotStuff-specific and Go runtime metrics—before they are encoded into protobuf messages.
  • Sampling runtime metrics such as GC cycles, heap usage, and goroutine counts provides valuable context when analyzing performance anomalies.

Proposed Changes

1. Add an internal metrics abstraction

Follow Go's runtime/metrics convention in how we record metrics before encoding with protobuf.

  • a Metric{Name, Unit, Kind, Value} record for each measurement
  • metric names in path:unit form, e.g. /hotstuff/net/bytes_sent:bytes
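A minimal sketch of what such a record could look like, assuming the path:unit key is split into the Name and Unit fields. The Kind constants, the `any`-typed Value, and the ParseKey and Key helpers are hypothetical names chosen for illustration, not an existing HotStuff API.

```go
package main

import (
	"fmt"
	"strings"
)

// Kind says which Go type Value holds (hypothetical, modeled on runtime/metrics value kinds).
type Kind int

const (
	KindBad Kind = iota
	KindUint64
	KindFloat64
)

// Metric is an assumed internal record with the fields named in the issue.
type Metric struct {
	Name  string // path part, e.g. "/hotstuff/net/bytes_sent"
	Unit  string // unit part, e.g. "bytes"
	Kind  Kind
	Value any // uint64 or float64, depending on Kind
}

// Key joins Name and Unit back into the path:unit form.
func (m Metric) Key() string {
	return m.Name + ":" + m.Unit
}

// ParseKey splits a path:unit key at its last colon and rejects malformed keys.
func ParseKey(key string) (name, unit string, err error) {
	i := strings.LastIndexByte(key, ':')
	if i <= 0 || i == len(key)-1 || key[0] != '/' {
		return "", "", fmt.Errorf("metric key %q is not in path:unit form", key)
	}
	return key[:i], key[i+1:], nil
}

func main() {
	name, unit, err := ParseKey("/hotstuff/net/bytes_sent:bytes")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	m := Metric{Name: name, Unit: unit, Kind: KindUint64, Value: uint64(4096)}
	fmt.Println(m.Key(), m.Value)
}
```

Keeping Name and Unit separate lets the protobuf encoder carry the unit explicitly, while Key restores the single-string form used by runtime/metrics.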

2. Implement a metrics sampler on each worker

Add a sampler component that:

  • collects HotStuff-specific metrics (commit latency, view changes, QC sizes, etc.)
  • reads selected Go runtime metrics via runtime/metrics.Read
  • converts everything into the internal metrics abstraction
  • forwards the aggregated metrics through the existing protobuf pipeline
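The runtime half of the sampler could be sketched as follows. It covers only the metrics.Read step and the conversion; HotStuff-specific metrics and the protobuf forwarding are left out. The Metric type, the SampleRuntime function, and the choice of the three runtime metric names are assumptions made for this example.

```go
package main

import (
	"fmt"
	"runtime/metrics"
	"strings"
)

// Metric is an assumed internal record; see the fields proposed in this issue.
type Metric struct {
	Name  string
	Unit  string
	Kind  metrics.ValueKind
	Value any // uint64 or float64, depending on Kind
}

// runtimeKeys is an illustrative selection of Go runtime metrics.
var runtimeKeys = []string{
	"/gc/cycles/total:gc-cycles",
	"/memory/classes/heap/objects:bytes",
	"/sched/goroutines:goroutines",
}

// SampleRuntime reads the given runtime metrics once and converts them to Metric records.
func SampleRuntime(keys []string) []Metric {
	samples := make([]metrics.Sample, len(keys))
	for i, k := range keys {
		samples[i].Name = k
	}
	metrics.Read(samples)

	out := make([]Metric, 0, len(samples))
	for _, s := range samples {
		name, unit, ok := strings.Cut(s.Name, ":")
		if !ok {
			continue
		}
		m := Metric{Name: name, Unit: unit, Kind: s.Value.Kind()}
		switch s.Value.Kind() {
		case metrics.KindUint64:
			m.Value = s.Value.Uint64()
		case metrics.KindFloat64:
			m.Value = s.Value.Float64()
		default:
			// KindBad (name unknown to this Go version) and histograms are skipped in this sketch.
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	for _, m := range SampleRuntime(runtimeKeys) {
		fmt.Printf("%s:%s = %v\n", m.Name, m.Unit, m.Value)
	}
}
```

A real sampler would probably call this on a ticker and merge the result with the HotStuff-specific metrics before encoding.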

3. Add a bandwidth metric between replicas

Extend the metrics system to measure and report bandwidth usage between replicas. Options include:

  • counting bytes sent/received per peer over each interval
  • integrating with network transport wrappers for transparent measurement
  • deciding whether to conduct measurements using an external tool such as iperf, netperf, or Wireshark, or via HotStuff protocol messaging
  • emitting metrics such as:
    • /hotstuff/net/bytes_sent:bytes
    • /hotstuff/net/bytes_received:bytes
    • /hotstuff/net/bandwidth_mbps:float

This should be included in the metrics sampler and exported with the other metrics.
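The byte-counting and transport-wrapper options could be combined roughly as below. This is a sketch under assumptions: countingConn, Mbps, and loopback are hypothetical names, and the in-memory net.Pipe stands in for a real replica-to-replica connection.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync/atomic"
	"time"
)

// countingConn is a hypothetical wrapper that tallies bytes on one peer connection.
type countingConn struct {
	net.Conn
	sent     atomic.Uint64
	received atomic.Uint64
}

// Read counts the bytes actually read, including on partial reads.
func (c *countingConn) Read(p []byte) (int, error) {
	n, err := c.Conn.Read(p)
	c.received.Add(uint64(n))
	return n, err
}

// Write counts the bytes actually written, including on partial writes.
func (c *countingConn) Write(p []byte) (int, error) {
	n, err := c.Conn.Write(p)
	c.sent.Add(uint64(n))
	return n, err
}

// Mbps converts a byte count observed over an interval into megabits per second.
func Mbps(bytes uint64, interval time.Duration) float64 {
	if interval <= 0 {
		return 0
	}
	return float64(bytes) * 8 / 1e6 / interval.Seconds()
}

// loopback sends n bytes (n must be positive) through a wrapped in-memory pipe
// and returns the sender's and receiver's byte counts.
func loopback(n int) (sent, received uint64, err error) {
	a, b := net.Pipe()
	src := &countingConn{Conn: a}
	dst := &countingConn{Conn: b}
	defer src.Close()
	defer dst.Close()

	writeErr := make(chan error, 1)
	go func() {
		_, werr := src.Write(make([]byte, n))
		writeErr <- werr
	}()
	if _, err := io.ReadFull(dst, make([]byte, n)); err != nil {
		return 0, 0, err
	}
	if err := <-writeErr; err != nil {
		return 0, 0, err
	}
	return src.sent.Load(), dst.received.Load(), nil
}

func main() {
	sent, received, err := loopback(1000)
	if err != nil {
		fmt.Println("loopback failed:", err)
		return
	}
	fmt.Printf("sent=%d received=%d rate=%.3f Mbps if spread over 1s\n",
		sent, received, Mbps(sent, time.Second))
}
```

The sampler would read the counters at each interval and report the difference from the previous reading, so the bandwidth value reflects only the bytes moved during that interval.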

Labels: enhancement