[Epic]: Rolling Upgrades #18523

Open
@tac0turtle

Description

Summary

Historically, upgrading a Cosmos SDK chain has required validators to run cosmovisor, be present at the time of the upgrade, or rely on a third-party tool to start the new binary while shutting down the old one. Much of the ecosystem has become accustomed to this method, but it has increased the maintenance burden on application developers.

Secondly, syncing from genesis means lining up all the right versions in cosmovisor, which is a mess. Even then, it is unclear whether the binaries will work as intended, and that is assuming the team never issued an emergency binary. As a result, many people are unable to sync from genesis on newer chains.

Note: Node operators who run archive nodes cannot serve queries against old versions through the running binary. A second or third binary needs to be provided in order to query the old state.

For the reasons listed above and those not listed, we would like to explore rolling upgrades.

In a rolling upgrade, node operators upgrade their binaries ahead of time, allowing the chain to upgrade on its own without intervention by developers or node operators. This will simplify the operation of a node, allow node operators to sync from genesis, and allow historical versions to be run without operating many different binaries.
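
To make the idea concrete, here is a minimal sketch in Go of one possible direction: a single binary that carries several app versions and picks one by block height. This is not an existing Cosmos SDK API or a proposed design. `Router`, `Register`, `VersionAt`, and the heights and version names are all hypothetical and used only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// versionEntry records the first block height at which a named app version
// becomes active. Both fields are illustrative, not SDK types.
type versionEntry struct {
	startHeight int64
	name        string
}

// Router is a hypothetical lookup table from block height to app version.
type Router struct {
	entries []versionEntry // kept sorted by startHeight, ascending
}

// NewRouter returns an empty Router.
func NewRouter() *Router { return &Router{} }

// Register declares that the version called name takes over at startHeight.
// It rejects a second registration for the same height.
func (r *Router) Register(startHeight int64, name string) error {
	for _, e := range r.entries {
		if e.startHeight == startHeight {
			return fmt.Errorf("height %d already assigned to %s", startHeight, e.name)
		}
	}
	r.entries = append(r.entries, versionEntry{startHeight: startHeight, name: name})
	sort.Slice(r.entries, func(i, j int) bool {
		return r.entries[i].startHeight < r.entries[j].startHeight
	})
	return nil
}

// VersionAt returns the version active at height, or false if no registered
// version covers that height.
func (r *Router) VersionAt(height int64) (string, bool) {
	// Find the first entry that starts strictly after height; the entry
	// just before it is the one in effect.
	i := sort.Search(len(r.entries), func(i int) bool {
		return r.entries[i].startHeight > height
	})
	if i == 0 {
		return "", false
	}
	return r.entries[i-1].name, true
}

func main() {
	r := NewRouter()
	// Heights and version names are made up for this example.
	for _, e := range []versionEntry{{1, "v1"}, {1000, "v2"}, {5000, "v3"}} {
		if err := r.Register(e.startHeight, e.name); err != nil {
			panic(err)
		}
	}
	for _, h := range []int64{999, 1000, 7000} {
		v, _ := r.VersionAt(h)
		fmt.Printf("height %d -> %s\n", h, v)
	}
}
```

With a table like this, the same lookup could serve live block processing, replay from genesis, and historical queries, which is why it touches all three pain points described above. Whether versions would actually be keyed by height, and how state and protobuf types would be shared between them, is left open here.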

Goals

The goals of this work are:

  • Minimal Downtime: Networks should be upgradable with minimal or no downtime for their users.
  • Backwards Compatibility: Allow node operators the option to query historical state and process historical transactions. If holding onto older versions of the app proves to have a performance overhead, we should allow operators to compile only the latest version of the app.

Problem Definition

Upgrades are cumbersome for node operators, who may need to be awake at all hours of the day for an upgrade and must make sure they upgrade at the correct time. Application developers carry the larger burden of maintaining historical binaries and hoping that the block protocol will not change from version to version.

Work Breakdown

Because the Cosmos SDK has adopted protobuf, there are some gotchas in how this can be done.

We should build a few demos that explore different ways of supporting many app versions. These will help shape the final design.

This is meant as a tracking issue and will be updated once we are ready to begin this work.

Project status: ☃️ Icebox