Add Canonical Representation for Precompile Configs #268

Open
@aaronbuchwald

Goal

The goal of this ticket is to provide a way for nodes in a subnet to let each other know when there is a pending network upgrade that they might not know about.

Currently, we apply a network upgrade to a Subnet by supplying AvalancheGo with a JSON file that specifies the upgrade bytes. AvalancheGo then passes the upgrade bytes to the VM via vm.Initialize(...). Subnet-EVM handles this here: https://github.com/ava-labs/subnet-evm/blob/master/plugin/evm/vm.go#L387.

AvalancheGo reads in the chain configs here: https://github.com/ava-labs/avalanchego/blob/master/config/config.go#L1059 and supplies the upgrade bytes to the VM in order to tell it what network upgrades should be activated.

Applying network upgrades on a distributed network has been a pain point for Subnet operators, who have sometimes missed applying the upgrade to one or more nodes.

The goal of this ticket is to provide a way for the nodes of a Subnet to communicate their current upgrade bytes, and to improve observability so that an operator can tell when a node is about to fall behind a network upgrade or has already done so.

Implementation Idea

Create a canonical representation for precompile configs and the entire upgrade config, so that we can gossip a hash of the canonical upgrade config.
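To make the idea concrete, here is a minimal sketch of one possible canonicalization, not a proposal for the final encoding. It assumes the upgrade config is JSON and relies on Go's encoding/json writing map keys in sorted order with no extra whitespace. The function name `canonicalUpgradeHash`, the choice of SHA-256, and the sample key `exampleConfig` are all hypothetical.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// canonicalUpgradeHash is a hypothetical helper, not an existing Subnet-EVM API.
// It decodes the upgrade JSON into generic values and re-encodes it, so that
// key order and whitespace in the operator's file do not affect the hash.
func canonicalUpgradeHash(upgradeBytes []byte) ([32]byte, error) {
	dec := json.NewDecoder(bytes.NewReader(upgradeBytes))
	// Keep number literals as written instead of converting them to float64.
	// Note that "1" and "1.0" would therefore still hash differently.
	dec.UseNumber()

	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return [32]byte{}, fmt.Errorf("parsing upgrade config: %w", err)
	}

	// json.Marshal emits map keys in sorted order and no insignificant whitespace.
	canonical, err := json.Marshal(v)
	if err != nil {
		return [32]byte{}, fmt.Errorf("re-encoding upgrade config: %w", err)
	}
	return sha256.Sum256(canonical), nil
}

func main() {
	// Two spellings of the same (illustrative) config.
	a := []byte(`{"precompileUpgrades": [{"exampleConfig": {"blockTimestamp": 100, "enabled": true}}]}`)
	b := []byte(`{"precompileUpgrades":[{"exampleConfig":{"enabled":true,"blockTimestamp":100}}]}`)

	ha, err := canonicalUpgradeHash(a)
	if err != nil {
		panic(err)
	}
	hb, err := canonicalUpgradeHash(b)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x\n%x\nequal: %v\n", ha, hb, ha == hb)
}
```

A typed canonical form per precompile config would likely be more robust than generic JSON re-encoding, since it could also normalize things like address casing and omitted default fields. The sketch only illustrates why a canonical form is needed before hashing.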

We can distribute this hash when nodes connect to each other as a VM-level handshake. The VM-level handshake could look like the following:

  1. Each node sends a hash of its upgrade config to the peer
  2. If the hash I receive matches my own, send my own hash back to confirm and do nothing further
  3. If it does not match my own, request the peer's full config, so that I can attempt to parse it and see how the upgrades differ
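The decision made on receiving a peer's hash could be sketched roughly as follows. This models only the comparison step; the message types, network transport, and the names `Hash`, `Action`, and `onPeerUpgradeHash` are hypothetical and do not correspond to existing Subnet-EVM or AvalancheGo code.

```go
package main

import "fmt"

// Hash is a hypothetical fixed-size digest of the canonical upgrade config.
type Hash [32]byte

// Action is what the local node does after seeing a peer's upgrade config hash.
type Action int

const (
	// ConfirmMatch: reply with our own hash and do nothing further.
	ConfirmMatch Action = iota
	// RequestFullConfig: ask the peer for its full upgrade config so we can diff it.
	RequestFullConfig
)

func (a Action) String() string {
	switch a {
	case ConfirmMatch:
		return "ConfirmMatch"
	case RequestFullConfig:
		return "RequestFullConfig"
	default:
		return "Unknown"
	}
}

// onPeerUpgradeHash decides the next handshake step from the two hashes.
func onPeerUpgradeHash(local, peer Hash) Action {
	if local == peer {
		return ConfirmMatch
	}
	return RequestFullConfig
}

func main() {
	local := Hash{0x01}
	samePeer := Hash{0x01}
	otherPeer := Hash{0x02}

	fmt.Println(onPeerUpgradeHash(local, samePeer))
	fmt.Println(onPeerUpgradeHash(local, otherPeer))
}
```

Keeping the decision as a pure function of the two hashes would make it easy to unit test separately from the networking code.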

Once the VM-level handshake is in place, we can determine whether we are in sync with the rest of the network and, if not, what the difference is. The next question is which metrics and logs we should surface to the node operator.

I'd lean towards:

  1. Add a log similar to the AvalancheGo message that tells the node operator peers have a higher version: https://github.com/ava-labs/avalanchego/blob/v1.10.3/network/peer/peer.go#L880
  2. Add a metric that reports what portion of stake (maybe also number of peers?) on the network is in agreement about the network upgrade bytes
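As a rough illustration of the second item, the agreement metric could be computed from the hashes collected during the handshake. This is a sketch under assumptions: `peerInfo`, its fields, and `upgradeAgreement` are hypothetical names, the local node's own stake is ignored, and summed weights are assumed not to overflow uint64.

```go
package main

import "fmt"

// peerInfo is a hypothetical record of what we learned about a peer
// during the VM-level handshake.
type peerInfo struct {
	Weight      uint64   // stake weight of the peer
	UpgradeHash [32]byte // hash of the peer's canonical upgrade config
}

// upgradeAgreement returns the fraction of peer stake whose upgrade hash
// matches ours, along with the number of agreeing peers.
// It returns 0 for the fraction when the total weight is zero.
func upgradeAgreement(local [32]byte, peers []peerInfo) (float64, int) {
	var total, agreeing uint64
	count := 0
	for _, p := range peers {
		total += p.Weight
		if p.UpgradeHash == local {
			agreeing += p.Weight
			count++
		}
	}
	if total == 0 {
		return 0, count
	}
	return float64(agreeing) / float64(total), count
}

func main() {
	local := [32]byte{0x01}
	peers := []peerInfo{
		{Weight: 50, UpgradeHash: [32]byte{0x01}},
		{Weight: 25, UpgradeHash: [32]byte{0x01}},
		{Weight: 25, UpgradeHash: [32]byte{0x02}},
	}
	fraction, count := upgradeAgreement(local, peers)
	fmt.Printf("stake in agreement: %.2f (%d of %d peers)\n", fraction, count, len(peers))
}
```

Reporting both the stake fraction and the peer count would let an operator distinguish a single large validator disagreeing from many small ones.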

Open Questions

  1. Should this be included in the health check? Most likely no, because we don't want other nodes upgrading to cause our node to fail the health check
  2. How should node operators monitor this? One option is a warning-level monitor that lets node operators know when a pending upgrade is going to happen
  3. Can we do better than this and parse network upgrades that we do not yet recognize? Can we include information such as exactly when an upgrade that we have not activated ourselves is going to happen? Or perhaps add this to the health check only when a sufficient percentage of the network has performed the upgrade and it has already activated?

Status

In Progress 🏗️