Race condition between cosmovisor and upgrade handler #8964
Description
Summary of Bug
Under certain conditions the cosmovisor process can terminate the blockchain executable before the upgrade plan file is flushed to disk. The upgrade then fails because the upgrade info file is missing.
Version
Tested against 0.42.2
Issue is present in latest/master
Steps to Reproduce
Create a local instance of the network running under cosmovisor. Use a local binary for the upgrade so that there is no latency from downloading a binary. Cosmovisor will terminate the process as soon as the log message is received, before the upgrade info file can be persisted to disk.
In the following code, the upgrade-required message is written to the log on line 45, while the upgrade info file is dumped on line 49.
Lines 45 to 52 in a78f777
Meanwhile, in the cosmovisor process, the monitor performs an unclean process termination to force an immediate exit as soon as the message appears in the logs:
cosmos-sdk/cosmovisor/process.go
Lines 118 to 124 in a78f777
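To make the timing concrete, here is a minimal sketch of the monitor side of the race. It is not the cosmovisor source: `upgradeMarker`, `watchLog`, and `simulate` are hypothetical names, and the trigger string is a simplified stand-in for whatever pattern the real monitor matches. The point is only that the kill happens on the log line, without checking that the upgrade info file exists.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// upgradeMarker is a hypothetical, simplified trigger string. The real
// monitor matches a more specific pattern in the node's log output.
const upgradeMarker = "UPGRADE NEEDED"

// watchLog scans r line by line and calls kill as soon as a line contains
// upgradeMarker. It reports whether a kill happened and whether infoPath
// existed at that moment. Nothing here waits for the file to appear.
func watchLog(r io.Reader, infoPath string, kill func()) (killed, infoPresent bool) {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		if strings.Contains(sc.Text(), upgradeMarker) {
			_, err := os.Stat(infoPath) // record the state of the file at kill time
			kill()
			return true, err == nil
		}
	}
	return false, false
}

// simulate feeds logText to watchLog with a kill callback that only prints.
func simulate(logText, infoPath string) (killed, infoPresent bool) {
	return watchLog(strings.NewReader(logText), infoPath, func() {
		fmt.Println("monitor: terminating node immediately")
	})
}

func main() {
	// The node logged the upgrade message but has not yet written the file.
	killed, present := simulate(
		"committed block 99\nUPGRADE NEEDED: demo\n",
		"/no/such/dir/upgrade-info.json",
	)
	fmt.Println("killed:", killed, "upgrade info present:", present)
}
```

With a node that logs first and writes second, the monitor can win this race whenever it reads the log line before the write completes.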
Remediation
Move the log message on line 45 of abci.go to after the k.DumpUpgradeInfoToDisk call on line 49.
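The proposed ordering can be sketched as follows. This is an illustration under stated assumptions, not the actual handler: `dumpUpgradeInfo` is a hypothetical stand-in for `k.DumpUpgradeInfoToDisk`, and the explicit `Sync` call is an assumption about what "flushed to disk" should mean here. `haltForUpgrade`, `demo`, and the payload are likewise made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dumpUpgradeInfo is a hypothetical stand-in for k.DumpUpgradeInfoToDisk.
// It writes the payload and fsyncs before returning (an assumption).
func dumpUpgradeInfo(path string, payload []byte) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o600)
	if err != nil {
		return err
	}
	if _, err := f.Write(payload); err != nil {
		f.Close()
		return err
	}
	if err := f.Sync(); err != nil { // flush to stable storage
		f.Close()
		return err
	}
	return f.Close()
}

// haltForUpgrade persists the upgrade info first and only then emits the
// log line that the monitor reacts to.
func haltForUpgrade(path string, payload []byte, logf func(msg string)) error {
	if err := dumpUpgradeInfo(path, payload); err != nil {
		return err
	}
	logf("UPGRADE NEEDED: demo")
	return nil
}

// demo runs haltForUpgrade in a temporary directory and reports whether the
// file was already on disk at the moment the log line was emitted.
func demo() (bool, error) {
	dir, err := os.MkdirTemp("", "upgrade-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "upgrade-info.json")
	present := false
	err = haltForUpgrade(path, []byte(`{"name":"demo"}`), func(string) {
		_, statErr := os.Stat(path)
		present = statErr == nil
	})
	return present, err
}

func main() {
	present, err := demo()
	fmt.Println("file present when log line emitted:", present, "err:", err)
}
```

Because the log line is the only signal the monitor acts on, emitting it last means a kill triggered by that line can no longer precede the write.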
For Admin Use
- Not duplicate issue
- Appropriate labels applied
- Appropriate contributors tagged
- Contributor assigned/self-assigned