Skip to content

Commit

Permalink
fix: Dumping upgrade info interrupted by cosmovisor (#9384) (#9608)
Browse files Browse the repository at this point in the history
Solution:
- dumping upgrade info before emit `UPGRADED NEEDED` log which will
  cause cosmovisor to kill chain process

<!-- < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < ☺
v                               ✰  Thanks for creating a PR! ✰
v    Before smashing the submit button please review the checkboxes.
v    If a checkbox is n/a - please still include it but + a little note why
☺ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >  -->

## Description

The problematic procedure:

1. chain process output UPGRADE NEEDED log
2. cosmovisor see the message and kill the chain binary
3. chain process dump upgrade info and panic itself

the step 2 and 3 runs concurrently, so the dumping process can be interrupted by cosmovisor's terminate signal. The proposed solution is to dump upgrade info before emitting the log.
there are two problematic situation:
1. the upgrade info file is created, but content is not written or flushed before killed, when the chain process restart, it'll panic because of json parsing error.
2. the upgrade info file is not created at all, when the chain process restart, the [store upgrades](https://github.com/crypto-org-chain/chain-main/blob/master/app/app.go#L436) are not activated, will cause app hash mismatch error later on.

---

Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

- [x] Targeted PR against correct branch (see [CONTRIBUTING.md](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [ ] Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
- [x] Code follows the [module structure standards](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules/structure.md).
- [ ] Wrote unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] Updated relevant documentation (`docs/`) or specification (`x/<module>/spec/`)
- [ ] Added relevant `godoc` [comments](https://blog.golang.org/godoc-documenting-go-code).
- [ ] Added a relevant changelog entry to the `Unreleased` section in `CHANGELOG.md`
- [x] Re-reviewed `Files changed` in the Github PR explorer
- [x] Review `Codecov Report` in the comment section below once CI passes

(cherry picked from commit 0540ed2)

Co-authored-by: yihuang <huang@crypto.com>
  • Loading branch information
mergify[bot] and yihuang authored Jun 29, 2021
1 parent 919ae01 commit 7b48375
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions x/upgrade/abci.go
Original file line number Diff line number Diff line change
Expand Up @@ -56,17 +56,17 @@ func BeginBlocker(k keeper.Keeper, ctx sdk.Context, _ abci.RequestBeginBlock) {
}

if !k.HasHandler(plan.Name) {
upgradeMsg := BuildUpgradeNeededMsg(plan)
// We don't have an upgrade handler for this upgrade name, meaning this software is out of date so shutdown
ctx.Logger().Error(upgradeMsg)

// Write the upgrade info to disk. The UpgradeStoreLoader uses this info to perform or skip
// store migrations.
err := k.DumpUpgradeInfoToDisk(ctx.BlockHeight(), plan.Name)
if err != nil {
panic(fmt.Errorf("unable to write upgrade info to filesystem: %s", err.Error()))
}

upgradeMsg := BuildUpgradeNeededMsg(plan)
// We don't have an upgrade handler for this upgrade name, meaning this software is out of date so shutdown
ctx.Logger().Error(upgradeMsg)

panic(upgradeMsg)
}
// We have an upgrade handler for this upgrade name, so apply the upgrade
Expand Down

0 comments on commit 7b48375

Please sign in to comment.