Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abstract Map #497

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from
Draft

Abstract Map #497

wants to merge 7 commits into from

Conversation

kentosugama
Copy link
Contributor

@kentosugama kentosugama commented Jan 6, 2023

Create implementation agnostic Map, List, and Set classes. This lets the user not have to think about the implementation of the data structure, and we can pick the best “general case” implementation for each abstract data dat structure.

Currently, if you want to map from Text to Nat what you usually do is:

import TrieMap "mo:base/TrieMap";
import Text "mo:base/Text";

let map = TrieMap.TrieMap<Text, Nat>(Text.equal, Text.hash);

The first attempt is to wrap TrieMap in an abstract class:

import { Map } "mo:base/Map";
import Text "mo:base/Text";

let map = Map<Text, Nat>(Text.equal, Text.hash);

This is a clear improvement and should be done.

What about the equal and hash arguments?

For primitive types, if you have the constructor take in a module that contains these functions, the code becomes very clean because the base library already has these modules defined.

import { Map } "mo:base/Map";
import Text "mo:base/Text";

let map1 = Map<Text, Nat>(Text);

What about mapping from a custom type C to some other type?

module Mod {
  public class C() {
    public let x = 4
  };
  public func compare(c1 : C, c2 : C) : Order.Order {
    Nat.compare(c1.x, c2.x)
  };
  public func hash(c : C) : Hash.Hash {
    Hash.hash(c.x)
  }
};

// -------- Most likely in a different file -----------

let map2 = Map<Mod.C, Nat>(Mod);

This initialization of map2 is again really nice to read, but the user has to wrap the compare and hash functions in a module. Maybe this is okay, because those functions have to be defined somewhere, so it's just a matter of wrapping their definitions in a module. But it does add a layer of friction (albeit only to users advanced enough to be mapping their own custom types).

Alternatively, you can create the module at intialization (e.g. if compare and hash are defined in separate places). But then you get an ugly initialization.

class D() {
  public let x = 4
};

// -------- In different locations -----------
func compare2(d1 : D, d2 : D) : Order.Order {
  Nat.compare(d1.x, d2.x)
};

// -------- In different locations -----------
func hash2(d : D) : Hash.Hash {
  Hash.hash(d.x)
};

// -------- In different locations -----------
let map3 = Map<D, Nat>(
  module {
    public let compare : (D, D) -> Order.Order = compare2;
    public let hash : D -> Hash.Hash = hash2
  }
);

So, for custom types, map2 is much cleaner but it necessitates users predefining the module (which they probably would do wherever they define the class C).

If modules could be considered subtypes of records, then even map3 could become clean to use.

let map4 = Map<C, Nat>({ compare; hash })

But wrapping up, I think users can get away with the map1 and map2 situations most of the time. map3 is not ideal at all, and if that is actually the common case then a different approach should be taken, or the compiler should be modified to allow for map4 to be used in place of map3.

@chenyan-dfinity
Copy link
Contributor

For map2, it's likely that a module contains multiple type definitions, we may need to remap the hash/compare function names in this case.

For map1, it's nice that subtyping works in our favor, but I personally prefer Map<Text, Nat>(Text.compare, Text.hash) or Map<Text, Nat>({ compare = Text.compare; hash = Text.hash }), as it makes explicit that this data structure relies on compare and hash.

Another concern with class is that it cannot be a stable variable. We at least need share and unshare function in the class to facilitate upgrade. Or some compiler changes to allow the data part of class to be stable.

@kentosugama
Copy link
Contributor Author

Added share and unshare

@kentosugama
Copy link
Contributor Author

If the user cares about implementation, I think it is best if they choose a specific implementation (such as TrieMap or HashMap) instead of this wrapper.

I think the main point of this class is to hide underlying implementation details from the user, who might have the desire of "I just want to map things, I don't really care how". In this case, I really like that the hash, and equal functions are hidden. It reduces the chance of choice paralysis for new users of the language who have other things to worry about.

This also let's us swap out map implementations from under the hood as the best "general case map implementation" changes as the IC changes (e.g. Hash based maps vs trees).

@kentosugama
Copy link
Contributor Author

kentosugama commented Jan 6, 2023

Ahh didn't think of modules with multiple types... That's a good point.
I think for situations with multiple types and multiple hash / compare functions, we're back in the map3 (hopefully map4) situation.

@chenyan-dfinity
Copy link
Contributor

We can get map4 by setting the class to take record arguments instead of module. This way, we are only losing map1. I think map1 is hiding too much information that it can easily lead to foot guns. For example, I probably never want to use Float.equal or Float.compare instead of a custom function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants