fix: [Interchain Security] validatorUnbondingCanComplete must take into account (re)bonded validators #12796

Merged
mpoke merged 7 commits into interchain-security-rebase from marius/rebonded-validator on Aug 5, 2022

@mpoke mpoke commented Aug 2, 2022

Closes cosmos/interchain-security#232

The UnbondingOnHold flag of a validator is replaced with a counter (i.e., UnbondingOnHoldRefCount) that is incremented when PutUnbondingOnHold is called and decremented when UnbondingCanComplete is called.

This fix enables the following (see cosmos/interchain-security#232):

  • val is removed from the validator set (bonded -> unbonding) and the unbonding operation with id x is put on hold;
  • val is added back to the validator set (unbonding -> bonded);
  • val is removed again from the validator set (bonded -> unbonding) and the unbonding operation with id y is put on hold;
  • UnbondingCanComplete(x) is called
    • previously, this would have set val.UnbondingOnHold to false and the unbonding of val could have completed;
    • now, this decreases val.UnbondingOnHoldRefCount to 1 and thus, the unbonding of val cannot complete;
  • UnbondingCanComplete(y) is called
    • previously, this could have triggered a panic when calling UnbondingToUnbonded(val) if val is already unbonded (see here);
    • now, this decreases val.UnbondingOnHoldRefCount to 0 and thus, the unbonding of val can complete.
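
The scenario above can be replayed with a minimal sketch of the counter. This is not the SDK code: the `validator` type, the method names, and the error value are simplified, hypothetical stand-ins, and only the hold counter is modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// validator is a simplified stand-in for the staking Validator type;
// only the hold counter introduced by this PR is modeled.
type validator struct {
	UnbondingOnHoldRefCount int64
}

// errNotOnHold is a hypothetical error for releasing a hold that was never taken.
var errNotOnHold = errors.New("cannot un-hold unbonding operation that is not on hold")

// putUnbondingOnHold models PutUnbondingOnHold: one more hold on the validator.
func (v *validator) putUnbondingOnHold() {
	v.UnbondingOnHoldRefCount++
}

// unbondingCanComplete models UnbondingCanComplete: it releases one hold and
// reports whether the validator unbonding is now free to complete.
func (v *validator) unbondingCanComplete() (bool, error) {
	if v.UnbondingOnHoldRefCount <= 0 {
		return false, errNotOnHold
	}
	v.UnbondingOnHoldRefCount--
	return v.UnbondingOnHoldRefCount == 0, nil
}

func main() {
	val := &validator{}
	val.putUnbondingOnHold() // unbonding op x (bonded -> unbonding)
	// val rebonds here (unbonding -> bonded); the hold for x is still outstanding
	val.putUnbondingOnHold() // unbonding op y (bonded -> unbonding again)

	free, _ := val.unbondingCanComplete()          // UnbondingCanComplete(x)
	fmt.Println(val.UnbondingOnHoldRefCount, free) // 1 false
	free, _ = val.unbondingCanComplete()           // UnbondingCanComplete(y)
	fmt.Println(val.UnbondingOnHoldRefCount, free) // 0 true
}
```

With a boolean flag, the first release would already have reported the validator as free; with the counter, only the last release does.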

For consistency, the same change (UnbondingOnHold is replaced by UnbondingOnHoldRefCount) is done for both undelegation entries and redelegation entries.

Closes cosmos/interchain-security#233: putValidatorOnHold only increments UnbondingOnHoldRefCount and validatorUnbondingCanComplete only decrements it. As a result, the order of staking.EndBlock and UnbondingCanComplete is irrelevant: if staking.EndBlock is called first, the validator unbonding completes in the next block. To enable this, only validators that are mature and not on hold are removed from the unbonding validator queue (instead of all validators with the same key, i.e., unbonding time).
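
A rough sketch of the queue behavior described above, assuming a flat slice in place of the real time-keyed queue. `queuedValidator` and `unbondAllMature` are hypothetical names, and maturity is reduced to a single integer comparison.

```go
package main

import "fmt"

// queuedValidator is a hypothetical, simplified queue entry.
type queuedValidator struct {
	Name          string
	UnbondingTime int64 // time at which the unbonding matures
	HoldRefCount  int64 // outstanding holds
}

// unbondAllMature models one EndBlock pass: validators that are mature and
// not on hold are completed and leave the queue; all others stay queued.
func unbondAllMature(queue []queuedValidator, now int64) (completed, remaining []queuedValidator) {
	for _, v := range queue {
		if v.UnbondingTime <= now && v.HoldRefCount == 0 {
			completed = append(completed, v)
		} else {
			remaining = append(remaining, v)
		}
	}
	return completed, remaining
}

func main() {
	queue := []queuedValidator{
		{Name: "a", UnbondingTime: 10, HoldRefCount: 0},
		{Name: "b", UnbondingTime: 10, HoldRefCount: 1}, // same key as a, but on hold
	}
	done, queue := unbondAllMature(queue, 10)
	fmt.Println(len(done), len(queue)) // 1 1

	queue[0].HoldRefCount--                  // UnbondingCanComplete arrives after EndBlock
	done, queue = unbondAllMature(queue, 11) // next block
	fmt.Println(len(done), len(queue))       // 1 0
}
```

Because the on-hold validator stays in the queue, it does not matter whether the hold is released before or after the EndBlock pass at maturity.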

Also closes cosmos/interchain-security#111, as val.IsMature is no longer needed.

@mpoke mpoke requested review from danwt and jtremback August 2, 2022 08:18
@mpoke mpoke changed the title validatorUnbondingCanComplete must take into account (re)bonded validators fix: validatorUnbondingCanComplete must take into account (re)bonded validators Aug 2, 2022
@mpoke mpoke changed the title fix: validatorUnbondingCanComplete must take into account (re)bonded validators fix: [Interchain Security] validatorUnbondingCanComplete must take into account (re)bonded validators Aug 2, 2022
@@ -786,7 +786,7 @@ func (k Keeper) CompleteUnbonding(ctx sdk.Context, delAddr sdk.AccAddress, valAd
 	// loop through all the entries and try to complete unbonding mature entries
 	for i := 0; i < len(ubd.Entries); i++ {
 		entry := ubd.Entries[i]
-		if entry.IsMature(ctxTime) && !entry.UnbondingOnHold {
+		if entry.IsMature(ctxTime) && !entry.OnHold() {
Contributor

Minor nitpick: why not call it UnbondingOnHold()?

Author

My idea was to keep it consistent with IsMature (it is not UnbondingIsMature). The Unbonding Delegation Entry is matured or on hold. But I don't have a strong position about it. @jtremback Let me know if I should change to UnbondingOnHold().

	if !val.IsMature(ctx.BlockTime(), ctx.BlockHeight()) {
		val.UnbondingOnHold = false
		k.SetValidator(ctx, val)
	} else {
Contributor

Why did this get removed? Doesn't it remove functionality from this function? Where does

		// If unbonding is mature complete it
		val = k.UnbondingToUnbonded(ctx, val)
		if val.GetDelegatorShares().IsZero() {
			k.RemoveValidator(ctx, val.GetOperator())
		}

		k.DeleteUnbondingIndex(ctx, id)

happen now?

Contributor

I see it is also happening in UnbondAllMatureValidators, but I'm not sure that's enough, because there it is only triggered when unbonding on the provider completes, while it is triggered here when unbonding on the consumer completes.

Contributor

Seems like the logic might now be broken if unbonding completes on the provider chain first, but I think the unit tests test for that so I'm confused.

Contributor

Huh, looks like they don't. I don't remember why not.

Author

@jtremback UnbondAllMatureValidators is called in every staking.EndBlock(). As long as a mature validator val (that is on hold due to Interchain Security, i.e., val.UnbondingOnHoldRefCount > 0) is not removed from the unbonding validator queue, then once UnbondingCanComplete is called enough times to decrement val.UnbondingOnHoldRefCount to 0, the next call to staking.EndBlock() will move val to unbonded.

Author

@mpoke mpoke Aug 3, 2022

Notice that for UBDEs and REDEs, we cannot remove the code from unbondingDelegationEntryCanComplete() and redelegationEntryCanComplete(), respectively. This is because in staking.EndBlock(), mature unbonding delegations and redelegations are removed from their respective queues, see DequeueAllMatureUBDQueue and DequeueAllMatureRedelegationQueue here.
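
To illustrate why the completion logic has to stay in the entry-level functions, here is a hedged sketch with a hypothetical `ubde` type. It assumes, as the comment above describes, that a mature entry is dequeued at EndBlock whether or not it is on hold, so a later release of the hold is the only remaining place where it can complete.

```go
package main

import "fmt"

// ubde is a hypothetical, simplified unbonding delegation entry.
type ubde struct {
	CompletionTime int64
	HoldRefCount   int64
	Completed      bool
}

// endBlockDequeue models EndBlock for one entry that was just dequeued as
// mature: it completes only if no hold is outstanding. The entry is not
// re-queued either way.
func endBlockDequeue(e *ubde, now int64) {
	if e.CompletionTime <= now && e.HoldRefCount == 0 {
		e.Completed = true
	}
}

// canComplete models unbondingDelegationEntryCanComplete: it releases one hold
// and, if the entry is already mature and free, completes it right here,
// because EndBlock will not visit the entry again.
func canComplete(e *ubde, now int64) {
	if e.HoldRefCount <= 0 {
		return // nothing to release; the real code returns an error here
	}
	e.HoldRefCount--
	if e.HoldRefCount == 0 && e.CompletionTime <= now {
		e.Completed = true
	}
}

func main() {
	e := &ubde{CompletionTime: 10, HoldRefCount: 1}
	endBlockDequeue(e, 10)   // mature but on hold: dequeued, not completed
	fmt.Println(e.Completed) // false
	canComplete(e, 12)       // hold released after maturity
	fmt.Println(e.Completed) // true
}
```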

@@ -434,15 +432,15 @@ func (k Keeper) UnbondAllMatureValidators(ctx sdk.Context) {
 			if !val.IsUnbonding() {
 				panic("unexpected validator in unbonding queue; status was not unbonding")
 			}
-			if !val.UnbondingOnHold {
+			if val.UnbondingOnHoldRefCount == 0 {
Contributor

Isn't .OnHold() doing this elsewhere?

Author

OnHold() is defined for unbonding delegation entries and redelegation entries. It is useful there since we use it in multiple places in the code, similarly to IsMature(). For validators, there is no IsMature(). Also, this is the only place where we check whether val.UnbondingOnHoldRefCount == 0.

Contributor

@jtremback jtremback left a comment

Looks like it is probably good, but it seems like it might break the case where a validator unbonding completes first on the provider and then on the consumer. Let's go over it together.

@jtremback jtremback self-requested a review August 2, 2022 21:39
Contributor

@jtremback jtremback left a comment

Hmmm... I see you thought about this a lot here cosmos/interchain-security#111 (comment), and modified unit tests that deal with this scenario. If you're sure that it's possible for a validator to complete unbonding in the scenario where unbonding tries to complete first on the provider, and then the consumer, go ahead and merge it.

require.NoError(t, err)

// Try again to unbond validators
app.StakingKeeper.UnbondAllMatureValidators(ctx)
Contributor

What triggers this to be called IRL?

Author

The next block. app.StakingKeeper.UnbondAllMatureValidators will eventually be called again as long as the chain is producing new blocks (i.e., liveness is not violated).

mpoke commented Aug 3, 2022

> Looks like it is probably good, but it seems like it might break the case where a validator unbonding completes first on the provider and then on the consumer. Let's go over it together.

see #12796 (comment)

> If you're sure that it's possible for a validator to complete unbonding in the scenario where unbonding tries to complete first on the provider, and then the consumer, go ahead and merge it.

I would feel more confident once @danwt is also reviewing it. :)

mpoke commented Aug 3, 2022

@jtremback db20718 makes sure that UnbondingOnHoldRefCount will never be negative. In other words, the number of calls to UnbondingCanComplete(id) shouldn't exceed the number of calls to PutUnbondingOnHold(id). I think it's an important check to avoid future misuse of these two functions. What do you think?

Contributor

@danwt danwt left a comment

I took a short look at this and think it works. Also seems to pass diff test. I think refCnt is a bit engineered since for validators when the unbond->rebond->unbond happens you no longer care whatsoever about 1st unbond since the channel is ordered and thus by the time 2nd unbond is received the 1st is already received.

Also don't see need for changing redel/undels, especially since they are already unique and refCnt in {0,1}.

Despite the above, I do believe perfect is the enemy of good enough and this seems good enough so merge 👍

I'd like more time to take a look but going on vacation and understand sdk changes are blocking.

Comment on lines 256 to 263
	if ubd.Entries[i].UnbondingOnHoldRefCount <= 0 {
		return true,
			sdkerrors.Wrapf(
				types.ErrUnbondingOnHoldRefCountNegative,
				"undelegation unbondingId(%d), expecting UnbondingOnHoldRefCount > 0, got %T",
				id, ubd.Entries[i].UnbondingOnHoldRefCount,
			)
	}
Contributor

Isn't this really a panic situation?

Author

I think it's better to return an error that will be handled at an upper layer, e.g., in the provider CCV module where we actually panic (see https://github.com/cosmos/interchain-security/blob/3c647c2db5a733737b27ccf9da02bb4e5db18114/x/ccv/provider/keeper/relay.go#L82).

@@ -50,4 +50,5 @@ var (
 	ErrNoHistoricalInfo = sdkerrors.Register(ModuleName, 38, "no historical info found")
 	ErrEmptyValidatorPubKey = sdkerrors.Register(ModuleName, 39, "empty validator public key")
 	ErrUnbondingNotFound = sdkerrors.Register(ModuleName, 40, "unbonding operation not found")
+	ErrUnbondingOnHoldRefCountNegative = sdkerrors.Register(ModuleName, 41, "cannot un-hold unbonding operation that is not on hold")
Contributor

Think this is a panic situation, not an error, right? Errors are for recoverable, possible things but this is something that would imply a bug in our code and isn't recoverable.

mpoke commented Aug 4, 2022

@danwt

> I think refCnt is a bit engineered since for validators when the unbond->rebond->unbond happens you no longer care whatsoever about 1st unbond since the channel is ordered and thus by the time 2nd unbond is received the 1st is already received.
> Also don't see need for changing redel/undels, especially since they are already unique and refCnt in {0,1}.

This also enables multiple external modules to use these two functions, i.e., PutUnbondingOnHold and UnbondingCanComplete. Plus it fixes cosmos/interchain-security#233
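
As a rough illustration of the multi-module point, the sketch below keeps one counter per unbonding id and lets two callers hold the same operation. The `holds` type and the module labels in the comments are hypothetical; the real functions live on the staking keeper.

```go
package main

import "fmt"

// holds tracks the number of outstanding holds per unbonding operation id.
type holds map[uint64]int64

// putUnbondingOnHold registers one more hold on the operation.
func (h holds) putUnbondingOnHold(id uint64) {
	h[id]++
}

// unbondingCanComplete releases one hold and reports whether the operation
// is now free of holds.
func (h holds) unbondingCanComplete(id uint64) (bool, error) {
	if h[id] <= 0 {
		return false, fmt.Errorf("unbonding operation %d is not on hold", id)
	}
	h[id]--
	return h[id] == 0, nil
}

func main() {
	h := holds{}
	h.putUnbondingOnHold(42) // hypothetical module A
	h.putUnbondingOnHold(42) // hypothetical module B

	free, _ := h.unbondingCanComplete(42) // module A is done
	fmt.Println(free)                     // false
	free, _ = h.unbondingCanComplete(42)  // module B is done
	fmt.Println(free)                     // true
}
```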

@mpoke mpoke merged commit bbb8d86 into interchain-security-rebase Aug 5, 2022
@mpoke mpoke deleted the marius/rebonded-validator branch August 5, 2022 07:51
jtremback added a commit that referenced this pull request Nov 1, 2022
* feat: store ABCI validator updates in transient store

* fix test build

* change transient key name

* add UnbondingDelegationEntryCreated hook

* add id to UnbondingDelegationEntry

* changes to add simple version of staking ccv hooks

* ubde id to string

* rough draft of more efficient technique

* change hook api and do some clean ups

* use ByEntry index and keep stopped entries in Entries array

* correct error convention

* add comment

* some cleanups

* comment cleanup

* finish hooking up hooks

* get the tests to pass

* proof of concept with embedded mock hooks

* first unit test of CCV hooks

* fix forgotten pointer bug

* move hook mocks into own file

* added test for completing stopped unbonding early

* added staking hooks template

* correct file and module names

* clean up and fix import error

* move staking hooks template to types

* fix hooks after merge

* fix silly proto bug

* feat: Add AfterValidatorDowntime hook in slashing module (#10938)

Create a slashing hook that allows external modules to register an execution when a validator has downtime.

Implementation details:

* Call hook in HandleValidatorSignature (x/slashing/keeper/infractions.go) which updates validator SignInfo data

* Defer hook execution in case it also wants to update validator SignInfo data

* Add methods to update SignInfo to slashing keeper interface(/x/slashing/types/expected_keepers.go)

* update: Remove slashing module hooks (#11425)

* update: Remove slashing module hooks

Hooks are not required anymore to implement the slashing for downtime in CCV. The logic is now using the staking keeper interface definition from the slashing module.

The SDK changes are the following:
- /x/slashing/keeper/infractions.go - remove hook calls and don't update validators missed blocks when jailed

- /x/slashing/types/expected_keepers.go - remove `AfterValidatorDowntime` hook interface and add `IsJailed()` method to staking interface definition

- /x/staking/keeper/validator.go - implement `IsJailed()` method

* fix last details

* Finish staking hooks merge (#11677)

* allow stopping and completing of redelegations

* refactor to remove BeforeUnbondingDelegationEntryComplete hook and notes for validator unbonding

* WIP rough draft of validator unbonding hooks

* add many of marius's suggested changes

* More review changes

* unbonding delegation tests pass

* WIP adding redelegation tests

* WIP redelegation tests work

* unbondingDelegation and redelegation tests pass and cleanup

* WIP validator unbonding tests almost pass

* tests for all new functionality pass

* fix index deleting logic

* clean up TODOs

* fix small logic bug

* fix slashing tests

* Rename statements containing 'UnbondingOp' to 'Unbond' in code, docs and proto files

Co-authored-by: Simon <simon.ntz@gmail.com>

* feat: enable double-signing evidence in Interchain-Security (#11921)

* Add a `InfractionType` enum to Slash function arguments

* Remove pubkey condition in HandleEquivocation

* Update docs/core/proto-docs.md

Co-authored-by: billy rennekamp <billy.rennekamp@gmail.com>

* Update proto/cosmos/staking/v1beta1/staking.proto

Co-authored-by: billy rennekamp <billy.rennekamp@gmail.com>

* add a possible solution to the evidence module issue

Co-authored-by: billy rennekamp <billy.rennekamp@gmail.com>

* chore: remove direct reliance on staking from slashing (backport #12177) (#12181)

* fix: make ModuleAccountInvariants pass for IS SDK changes (#12554)

* fix bug breaking ModuleAccountInvariants

* set UnbondingOnHold to false explicitly

* Fixes staking hooks safety issues (#12578)

Co-authored-by: Daniel T <30197399+danwt@users.noreply.github.com>

* Revert "fix: make ModuleAccountInvariants pass for IS SDK changes (#12554)" (#12782)

This reverts commit 67c8163.

* fix: make ModuleAccountInvariants pass for IS SDK changes (#12783)

* fix bug breaking ModuleAccountInvariants

* set UnbondingOnHold to false explicitly

* fix: [Interchain Security] `validatorUnbondingCanComplete` must take into account (re)bonded validators (#12796)

* replace val.UnbondingOnHold w/ UnbondingOnHoldRefCount

* add UnbondingOnHoldRefCount for undel and red (for consistency)

* improve comments

* improve TestValidatorUnbondingOnHold test

* ret error if UnbondingOnHoldRefCount is negative

* adding extra validator unbonding test

* change OnHold() def

* fix: [Interchain Security] Fix leak in unbonding hooks (#12849)

* remove leak for UBDEs and REDEs

* remove leak for val unbondings

* docs: [Interchain Security] update spec (#12848)

* updating staking spec

* clarify code

* fix typo

* store ValidatorUpdates in normal store (#12845)

* Update x/slashing/keeper/signing_info.go

Co-authored-by: Simon Noetzlin <simon.ntz@gmail.com>

* Update x/staking/keeper/val_state_change.go

* Update x/staking/keeper/val_state_change.go

* Update x/slashing/keeper/infractions.go

Co-authored-by: Simon Noetzlin <simon.ntz@gmail.com>

* Update x/staking/keeper/val_state_change.go

* Update x/staking/keeper/val_state_change.go

* fix compile errors

* remove stakingtypes.TStoreKey

* fix: decrease minimums for genesis parameters (#13106)

* Update genesis.go

* Update genesis.go

Co-authored-by: Federico Kunze <federico.kunze94@gmail.com>
Co-authored-by: Aditya Sripal <adityasripal@gmail.com>
Co-authored-by: Simon Noetzlin <simon.ntz@gmail.com>
Co-authored-by: billy rennekamp <billy.rennekamp@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Marius Poke <marius.poke@posteo.de>
Co-authored-by: Daniel T <30197399+danwt@users.noreply.github.com>
Co-authored-by: Shawn Marshall-Spitzbart <44221603+smarshall-spitzbart@users.noreply.github.com>
sainoe pushed a commit that referenced this pull request Nov 2, 2022
…into account (re)bonded validators (#12796)

* replace val.UnbondingOnHold w/ UnbondingOnHoldRefCount

* add UnbondingOnHoldRefCount for undel and red (for consistency)

* improve comments

* improve TestValidatorUnbondingOnHold test

* ret error if UnbondingOnHoldRefCount is negative

* adding extra validator unbonding test

* change OnHold() def
sainoe pushed four more commits with the same message that referenced this pull request: Nov 3, 2022; Nov 16, 2022; Jan 26, 2023; Feb 9, 2023.
@faddat faddat mentioned this pull request Mar 23, 2023