
Multiset semantics for Coins and DecCoins #11223

Open
@JimLarson

Description

Summary

Cleans up potentially dangerous inconsistencies in Coins and DecCoins by establishing a multiset model for semantics and a normalization policy.

Problem Definition

Coins (and DecCoins) are sometimes vague and surprising in their semantics.

  • IsEqual() returns false when comparing Coins of different lengths, but panics when comparing Coins of the same length that differ in denoms (see the snippet after this list). This appears to be intentional, since there are unit tests for the behavior, but there doesn't seem to be a sensible use case for these semantics.

  • The call a.IsAnyGT(b) sounds like it ought to be equivalent to !a.IsAllLTE(b), but it's something quite different, and deliberately so. For instance, the doc comment says {2A, 3B}.IsAnyGT({5C}) = false, despite the fact that the 2 of denom A is greater than the zero amount of A in {5C}. Similarly for IsAnyGTE().

  • There is inconsistency in canonicalization of Coins values.

    • Negative quantities are forbidden by Validate() but returned by SafeSub(). Add() promises not to return any negative quantities but has no checks to enforce this.

    • Some methods explicitly handle non-valid values, others panic or return wrong results. For the empty Coins value, NewCoins() uses the empty slice, but Add() and Sub() use nil. Most uses are happy with either, but some tests require one specifically.

  • There's much hand-wringing about the existence of zero-quantity Coin values but the non-existence of zero quantities in the representation of Coins values. There's also the question of whether an empty Coins value is valid. See Write ADR for Coins #7046 and its references.
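
To make the first two points concrete, here is a small illustration of the behavior described above (the denoms and amounts are made up for the example):

```go
package main

import (
	"fmt"

	sdk "github.com/cosmos/cosmos-sdk/types"
)

func main() {
	a := sdk.NewCoins(sdk.NewInt64Coin("atom", 2), sdk.NewInt64Coin("btc", 3))
	b := sdk.NewCoins(sdk.NewInt64Coin("eth", 5))

	// false today, even though a's 2atom exceeds b's implicit 0atom.
	fmt.Println(a.IsAnyGT(b))

	c := sdk.NewCoins(sdk.NewInt64Coin("atom", 1))
	d := sdk.NewCoins(sdk.NewInt64Coin("eth", 1))

	// Same length, different denoms: panics instead of returning false.
	fmt.Println(c.IsEqual(d))
}
```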

This all seems to stem from confusing the implementation of Coins with the semantics it provides, the lack of a clear definition of those semantics, and the absence of a normalization policy.

Proposal

Multiset model

Each Coins value should be abstractly thought of as a collection of coins of all possible denominations. This makes Coins a multiset, which suggests some operations and their proper behavior.

A multiset has an obvious partial order based on containment: A <= B if and only if, for every denom D, A.AmountOf(D) <= B.AmountOf(D). Note that this is only a partial order, so it's possible that neither A <= B nor B <= A holds.
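
As a concrete reading of that definition, here is roughly what the containment test looks like when phrased in terms of AmountOf(); allLTE is a hypothetical helper, not an existing SDK method, and it assumes both values are valid (no negative amounts):

```go
package coinsdemo

import sdk "github.com/cosmos/cosmos-sdk/types"

// allLTE reports whether a <= b in the multiset order: for every denom D,
// a.AmountOf(D) <= b.AmountOf(D). Checking only the denoms present in a is
// enough, because absent denoms contribute zero and b's amounts are assumed
// non-negative.
func allLTE(a, b sdk.Coins) bool {
	for _, c := range a {
		if c.Amount.GT(b.AmountOf(c.Denom)) {
			return false
		}
	}
	return true
}
```

With this reading, {2A, 3B} and {5C} are incomparable: neither value is <= the other. Proposal 2 below amounts to defining IsAnyGT() as the negation of exactly this check.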

The partial order gives rise to well-known mathematical structure: with Min() and Max(), Coins forms a lattice. See #10995, which adds those operations; they are indispensable in various vesting account calculations.

If the set of denoms were stable, the SDK could have defined Coins as a fixed-length vector of Int, with each denom getting a well-known offset. However, since we expect a large and dynamic set of denoms, with most Coins values having only a few nonzero entries, the current "sparse" representation makes more sense. But the semantics should be the same either way.

This interpretation settles the hand-wringing over the possibility of zero-quantity denoms in Coin but not in Coins. That worry confuses the sparse implementation of Coins with the data it represents: the zero is there if you ask for it with AmountOf().

Normalization policy

The Coins representation as an array of (denom, amount) pairs makes sense, and allows direct representation in proto-derived structs. But it also allows representation of some values that do not match up with the abstract model of Coins:

  • invalid denoms
  • duplicate denoms
  • unsorted entries
  • zero-quantity entries
  • negative-quantity entries

When the representation is created with the NewCoins() constructor, all of these are checked for, but protos generate the representation directly, as do some tests and an unknown amount of user code.

Nothing can be done about invalid denoms or negative quantities. Sort() will handle order, and a trip through NewCoins() will handle both order and zero quantities. Nothing fixes duplicates at the moment, but we could potentially consolidate them by adding the quantities together. It's not clear whether reforming such content would fix bugs or merely obscure them.
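
If we did want to reform such content, one possible shape for the pass is sketched below; normalize is a hypothetical helper written against the public Coins API, not existing SDK code, and it consolidates duplicates by adding their amounts as suggested above:

```go
package coinsdemo

import (
	"fmt"

	sdk "github.com/cosmos/cosmos-sdk/types"
)

// normalize turns a raw []sdk.Coin (e.g. decoded straight from a proto
// message) into a valid Coins value: duplicate denoms are summed, zero
// amounts dropped, and the result sorted. Invalid denoms and negative totals
// are reported as errors, since nothing sensible can be done with them.
func normalize(raw []sdk.Coin) (sdk.Coins, error) {
	sums := make(map[string]sdk.Int, len(raw))
	for _, c := range raw {
		if err := sdk.ValidateDenom(c.Denom); err != nil {
			return nil, err
		}
		if prev, ok := sums[c.Denom]; ok {
			sums[c.Denom] = prev.Add(c.Amount)
		} else {
			sums[c.Denom] = c.Amount
		}
	}
	out := sdk.Coins{}
	for denom, amt := range sums {
		if amt.IsNegative() {
			return nil, fmt.Errorf("negative amount %s for denom %s", amt, denom)
		}
		if amt.IsZero() {
			continue
		}
		out = append(out, sdk.Coin{Denom: denom, Amount: amt})
	}
	return out.Sort(), nil
}
```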

Methods vary in their tolerance of invalid representations. IsZero() seems to exist (as distinct from Empty()) only to cope with zero-quantity entries. IsEqual() sorts its inputs just to be sure, while AmountOf() silently relies on its input being sorted. Application code can't tell what to expect without reading each implementation. This is a mess.

The "highbrow" approach to regularizing this would be to create a ValidCoins type carrying all of the operations, leaving Coins as the type of the proto field with only a single method, Validate() (ValidCoins, error), to canonicalize the coins.

A more middle-brow approach would be simply to let most methods assume valid input but require all of them to produce valid output. Application logic should explicitly normalize values coming from "outside", e.g. transactions, but can assume the store holds only valid values.
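
A rough sketch of the highbrow option follows; the ValidCoins name and wrapper shape are illustrative only, and the gate simply reuses the existing Coins.Validate() error check:

```go
package coinsdemo

import sdk "github.com/cosmos/cosmos-sdk/types"

// Coins stays as the raw, proto-facing slice type. ValidCoins wraps a value
// known to be sorted, free of duplicates and zeros, and non-negative, and is
// the only type carrying arithmetic and comparison methods.
type ValidCoins struct {
	coins sdk.Coins // invariant: valid per the normalization policy
}

// NewValidCoins is the single gate from the raw representation into
// ValidCoins, corresponding to the Coins.Validate() (ValidCoins, error)
// method described above.
func NewValidCoins(raw sdk.Coins) (ValidCoins, error) {
	if err := raw.Validate(); err != nil {
		return ValidCoins{}, err
	}
	return ValidCoins{coins: raw}, nil
}

// Add can then assume both operands are valid and only has to keep the
// invariant on its output.
func (v ValidCoins) Add(other ValidCoins) ValidCoins {
	return ValidCoins{coins: v.coins.Add(other.coins...)}
}
```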

DecCoin

DecCoins is mostly a clone of Coins, with Dec instead of Int for quantities, and similarly requires positive quantities for valid values. Thanks to Go's impoverished type system, we need to duplicate the implementation.

All of the above concerns and suggestions seem to apply.

Specific Changes

Let me sum this all up in the following specific proposals.

  1. Fix IsEqual() to not panic. Though technically an incompatible change, it's hard to imagine anyone relying on the current behavior.

  2. Rewrite IsAnyGT() as !IsAllLTE() and IsAnyGTE() as !IsAllLT(), adopting multiset semantics. Again, an incompatible change, but it's likely that existing usages actually expect the multiset behavior.

  3. Deprecate or remove DenomsSubsetOf() since the very question doesn't make sense in multiset semantics.

  4. SafeSub() should return empty coins if hasNeg is true, to prevent negative-quantity Coins from leaking out. A quick review of the code suggests that the returned coins value is never used when the hasNeg flag is set, except for one case in a test, which can probably be rewritten. (See the sketch after this list.)

  5. All methods should handle nil and empty slice the same way. E.g. always test with len(coins) == 0 instead of coins == nil. Code (including tests) should not try to distinguish between the two.

  6. Nevertheless, pick nil as the canonical representation of the empty Coins value since it's the "zero value" dictated by the Go language.

  7. Application code and tests should not construct Coins values as raw slices but should instead use NewCoins().

  8. Expect validation / canonicalization at the boundary (via NewCoins(dirtyCoinArray...) or an equivalent alias) and don't do checks or conversions in other methods. Possibly have a transitional period where all methods check for invalid input and panic.

  9. Equivalent changes in DecCoins.
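
For items 4 and 5, a minimal sketch of the intended behavior; safeSub and isEmpty are free-standing stand-ins for the eventual method bodies, written against AmountOf() and assuming both inputs are already valid:

```go
package coinsdemo

import sdk "github.com/cosmos/cosmos-sdk/types"

// safeSub sketches proposal 4: subtract b from a denom by denom; if any
// amount would go negative, report hasNeg = true and return the canonical
// empty value (nil, per proposal 6) instead of a negative-quantity Coins.
func safeSub(a, b sdk.Coins) (sdk.Coins, bool) {
	for _, c := range b {
		if c.Amount.GT(a.AmountOf(c.Denom)) {
			return nil, true
		}
	}
	diff := sdk.Coins{}
	for _, c := range a {
		if amt := c.Amount.Sub(b.AmountOf(c.Denom)); amt.IsPositive() {
			diff = append(diff, sdk.Coin{Denom: c.Denom, Amount: amt})
		}
	}
	return diff, false
}

// isEmpty sketches proposal 5: nil and the empty slice are treated
// identically by only ever asking for the length.
func isEmpty(coins sdk.Coins) bool {
	return len(coins) == 0
}
```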


