
Multiset semantics for Coins and DecCoins #11223

Open
@JimLarson

Description

Summary

Cleans up potentially dangerous inconsistencies in Coins and DecCoins by establishing a multiset model for semantics and a normalization policy.

Problem Definition

Coins (and DecCoins) are sometimes vague and surprising in their semantics.

  • IsEqual() returns false when comparing Coins of different lengths, but panics when comparing Coins of the same length that differ in denoms (see the snippet after this list). This appears to be intentional, since there are unit tests for the behavior, but there doesn't seem to be a sensible use case for these semantics.

  • The call a.IsAnyGT(b) sounds like it ought to be equivalent to !a.IsAllLTE(b), but it's something quite different, and deliberately so. For instance, the doc comment says {2A, 3B}.IsAnyGT({5C}) = false, despite the fact that the 2 of denom A is greater than the zero amount of A in {5C}. Similarly for IsAnyGTE().

  • There is inconsistency in canonicalization of Coins values.

    • Negative quantities are forbidden by Validate() but returned by SafeSub(). Add() promises not to return any negative quantities but has no checks to enforce this.

    • Some methods explicitly handle non-valid values, others panic or return wrong results. For the empty Coins value, NewCoins() uses the empty slice, but Add() and Sub() use nil. Most uses are happy with either, but some tests require one specifically.

  • There's much hand-wringing about the existence of zero-quantity Coin values but the non-existence of zero quantities in the representation of Coins values. There's also the question of whether an empty Coins value is valid. See Write ADR for Coins #7046 and its references.
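
To make the first two points concrete, here is a small illustration of the behavior described above (the denoms and amounts are made up for the example):

```go
package main

import (
	"fmt"

	sdk "github.com/cosmos/cosmos-sdk/types"
)

func main() {
	a := sdk.NewCoins(sdk.NewInt64Coin("atom", 2), sdk.NewInt64Coin("btc", 3))
	b := sdk.NewCoins(sdk.NewInt64Coin("eth", 5))

	// false today, even though a's 2atom exceeds b's implicit 0atom.
	fmt.Println(a.IsAnyGT(b))

	c := sdk.NewCoins(sdk.NewInt64Coin("atom", 1))
	d := sdk.NewCoins(sdk.NewInt64Coin("eth", 1))

	// Same length, different denoms: panics instead of returning false.
	fmt.Println(c.IsEqual(d))
}
```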

This all seems to stem from confusing the implementation of Coins with the semantics it provides, the lack of a clear definition of those semantics, and the absence of a normalization policy.

Proposal

Multiset model

Each Coins value should be abstractly thought of as a collection of coins of all possible denominations. This makes Coins a multiset, which suggests some operations and their proper behavior.

A multiset has an obvious partial order based on containment: A <= B if and only if, for every denom D, A.AmountOf(D) <= B.AmountOf(D). Note that this is only a partial order, so it's possible that neither A <= B nor B <= A holds.
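
As a concrete reading of that definition, here is roughly what the containment test looks like when phrased in terms of AmountOf(); allLTE is a hypothetical helper, not an existing SDK method, and it assumes both values are valid (no negative amounts):

```go
package coinsdemo

import sdk "github.com/cosmos/cosmos-sdk/types"

// allLTE reports whether a <= b in the multiset order: for every denom D,
// a.AmountOf(D) <= b.AmountOf(D). Checking only the denoms present in a is
// enough, because absent denoms contribute zero and b's amounts are assumed
// non-negative.
func allLTE(a, b sdk.Coins) bool {
	for _, c := range a {
		if c.Amount.GT(b.AmountOf(c.Denom)) {
			return false
		}
	}
	return true
}
```

With this reading, {2A, 3B} and {5C} are incomparable: neither value is <= the other. Proposal 2 below amounts to defining IsAnyGT() as the negation of exactly this check.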

The partial order gives rise to well-known mathematical structure: with Min() and Max(), Coins forms a lattice. See #10995, which adds those operations; they are indispensable in various vesting account calculations.

If the set of denoms were stable, the SDK could have defined Coins as a fixed-length vector of Int, with each denom getting a well-known offset. However, since we expect a large and dynamic set of denoms, with most Coins values having only a few nonzero entries, the current "sparse" representation makes more sense. But the semantics should be the same either way.

This interpretation settles the hand-wringing over the possibility of zero-quantity denoms in Coin but not in Coins. That worry confuses the sparse implementation of Coins with the data it represents: the zero is there if you ask for it with AmountOf().

Normalization policy

The Coins representation as an array of (denom, amount) pairs makes sense, and allows direct representation in proto-derived structs. But it also allows representation of some values that do not match up with the abstract model of Coins:

  • invalid denoms
  • duplicate denoms
  • unsorted entries
  • zero-quantity entries
  • negative-quantity entries

When the representation is created with the NewCoins() constructor, all of these are checked for, but protos generate the representation directly, as do some tests and an unknown amount of user code.

Nothing can be done about invalid denoms or negative quantities. Sort() will handle order, and a trip through NewCoins() will handle both order and zero quantities. Nothing fixes duplicates at the moment, but we could potentially consolidate them by adding the quantities together. It's not clear whether reforming such content would fix bugs or merely obscure them.
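
If we did want to reform such content, one possible shape for the pass is sketched below; normalize is a hypothetical helper written against the public Coins API, not existing SDK code, and it consolidates duplicates by adding their amounts as suggested above:

```go
package coinsdemo

import (
	"fmt"

	sdk "github.com/cosmos/cosmos-sdk/types"
)

// normalize turns a raw []sdk.Coin (e.g. decoded straight from a proto
// message) into a valid Coins value: duplicate denoms are summed, zero
// amounts dropped, and the result sorted. Invalid denoms and negative totals
// are reported as errors, since nothing sensible can be done with them.
func normalize(raw []sdk.Coin) (sdk.Coins, error) {
	sums := make(map[string]sdk.Int, len(raw))
	for _, c := range raw {
		if err := sdk.ValidateDenom(c.Denom); err != nil {
			return nil, err
		}
		if prev, ok := sums[c.Denom]; ok {
			sums[c.Denom] = prev.Add(c.Amount)
		} else {
			sums[c.Denom] = c.Amount
		}
	}
	out := sdk.Coins{}
	for denom, amt := range sums {
		if amt.IsNegative() {
			return nil, fmt.Errorf("negative amount %s for denom %s", amt, denom)
		}
		if amt.IsZero() {
			continue
		}
		out = append(out, sdk.Coin{Denom: denom, Amount: amt})
	}
	return out.Sort(), nil
}
```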

Methods vary in their tolerance of invalid representations. IsZero() seems to exist (as distinct from Empty()) only to cope with zero-quantity entries. IsEqual() sorts its inputs just to be sure, while AmountOf() silently relies on its input being sorted. Application code can't tell what to expect without reading each implementation. This is a mess.

The "highbrow" approach to regularizing this would be to create a ValidCoins type carrying all of the operations, leaving Coins as the type of the proto field with only a single method, Validate() (ValidCoins, error), to canonicalize the coins.

A more middle-brow approach would be simply to let most methods assume valid input but require all of them to produce valid output. Application logic should explicitly normalize values coming from "outside", e.g. transactions, but can assume the store holds only valid values.
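
A rough sketch of the highbrow option follows; the ValidCoins name and wrapper shape are illustrative only, and the gate simply reuses the existing Coins.Validate() error check:

```go
package coinsdemo

import sdk "github.com/cosmos/cosmos-sdk/types"

// Coins stays as the raw, proto-facing slice type. ValidCoins wraps a value
// known to be sorted, free of duplicates and zeros, and non-negative, and is
// the only type carrying arithmetic and comparison methods.
type ValidCoins struct {
	coins sdk.Coins // invariant: valid per the normalization policy
}

// NewValidCoins is the single gate from the raw representation into
// ValidCoins, corresponding to the Coins.Validate() (ValidCoins, error)
// method described above.
func NewValidCoins(raw sdk.Coins) (ValidCoins, error) {
	if err := raw.Validate(); err != nil {
		return ValidCoins{}, err
	}
	return ValidCoins{coins: raw}, nil
}

// Add can then assume both operands are valid and only has to keep the
// invariant on its output.
func (v ValidCoins) Add(other ValidCoins) ValidCoins {
	return ValidCoins{coins: v.coins.Add(other.coins...)}
}
```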

DecCoin

DecCoins is mostly a clone of Coins, with Dec instead of Int for quantities, and similarly requires positive quantities for valid values. Thanks to Go's impoverished type system, we need to duplicate the implementation.

All of the above concerns and suggestions seem to apply.

Specific Changes

Let me sum this all up in the following specific proposals.

  1. Fix IsEqual() to not panic. Though technically an incompatible change, it's hard to imagine anyone relying on the current behavior.

  2. Rewrite IsAnyGT() as !IsAllLTE() and IsAnyGTE() as !IsAllLT(), adopting multiset semantics. Again, an incompatible change, but it's likely that existing usages actually expect the multiset behavior.

  3. Deprecate or remove DenomsSubsetOf() since the very question doesn't make sense in multiset semantics.

  4. SafeSub() should return empty coins if hasNeg is true, to prevent negative-quantity Coins from leaking out. A quick review of the code suggests that the returned coins value is never used when the hasNeg flag is set, except for one case in a test, which can probably be rewritten. (See the sketch after this list.)

  5. All methods should handle nil and empty slice the same way. E.g. always test with len(coins) == 0 instead of coins == nil. Code (including tests) should not try to distinguish between the two.

  6. Nevertheless, pick nil as the canonical representation of the empty Coins value since it's the "zero value" dictated by the Go language.

  7. Application code and tests should not construct Coins values as raw slices but should instead use NewCoins().

  8. Expect validation / canonicalization at the boundary (via NewCoins(dirtyCoinArray...) or an equivalent alias) and don't do checks or conversions in other methods. Possibly have a transitional period where all methods check for invalid input and panic.

  9. Equivalent changes in DecCoins.
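
For items 4 and 5, a minimal sketch of the intended behavior; safeSub and isEmpty are free-standing stand-ins for the eventual method bodies, written against AmountOf() and assuming both inputs are already valid:

```go
package coinsdemo

import sdk "github.com/cosmos/cosmos-sdk/types"

// safeSub sketches proposal 4: subtract b from a denom by denom; if any
// amount would go negative, report hasNeg = true and return the canonical
// empty value (nil, per proposal 6) instead of a negative-quantity Coins.
func safeSub(a, b sdk.Coins) (sdk.Coins, bool) {
	for _, c := range b {
		if c.Amount.GT(a.AmountOf(c.Denom)) {
			return nil, true
		}
	}
	diff := sdk.Coins{}
	for _, c := range a {
		if amt := c.Amount.Sub(b.AmountOf(c.Denom)); amt.IsPositive() {
			diff = append(diff, sdk.Coin{Denom: c.Denom, Amount: amt})
		}
	}
	return diff, false
}

// isEmpty sketches proposal 5: nil and the empty slice are treated
// identically by only ever asking for the length.
func isEmpty(coins sdk.Coins) bool {
	return len(coins) == 0
}
```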


