Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record representation is unstable #7031

Open
cometkim opened this issue Sep 10, 2024 · 9 comments
Open

Record representation is unstable #7031

cometkim opened this issue Sep 10, 2024 · 9 comments

Comments

@cometkim
Copy link
Member

cometkim commented Sep 10, 2024

Since we added support for optional fields on records, a crucial assumption was broken: up until then, records can be expected to have a fixed layout, as its definition.

So records with the same values for the same keys always had the same hash.

However, by introducing optional fields, we can easily create values ​​that pass the equality check but have different hashes.

type t = {
  a: string,
  b?: string,
  c: string,
}

let v1 = {
  a: "a",
  c: "c",
}

let v2 = {
  ...v1,
  b: "b",
}

let v3 = {
  a: "a",
  b: "b",
  c: "c"
}

assert (Hashtbl.hash(v2) == Hashtbl.hash(v3))

https://rescript-lang.org/try?version=v11.1.4&code=C4TwDgpgBMULxQN4CgpQIYC4oGdgCcBLAOwHMAaVKAIwH5s8izK0BjBgki5AX2WQA2EWADcAjPCRUsUAETpZLKOzmtFvfkNEAmSSjQA6I+KXVss6ur6DhUEQGY908wtPnLSlbLWVr6HDgQ+MAAFAAS-gAWwNQCBpFRIeIAlPAIETjRsfGJDsnJyEA

v2 and v3 have the same values for the same keys but in different order.

  • Obj.entries(v2) => [['a', 'a'], ['c', 'c'], ['b', 'b']]
  • Obj.entries(v3) => [['a', 'a'], ['b', 'b'], ['c', 'c']]

This literally means that users can't use a record as a key in a hashtable or other data structure. In other words, it's not a record.

@cometkim
Copy link
Member Author

cometkim commented Sep 10, 2024

One possible fix is to add entry sorting at runtime, which will add serious performance degradation wherever optional fields are used (e.g. JSXv4)

Another way is to initialize optional fields to None, but this will again change the JS semantic.

@zth
Copy link
Collaborator

zth commented Sep 11, 2024

Coercion for records also break this assumption (more fields can be on the object than what's represented by the record).

Honestly I think these are just quirks we'll have to live with in some form if we want optional record fields (which we do) and coercion (which we also do).

Semi-unrelated, but one thing I've thought about before is whether one could have an annotation for a record definition type that disallows optional fields and/or coercion. Or one annotation for each thing. That way you could set up a "protected" record that you know will have the exact runtime shape you expect (unless it's from an external, but 🤷 ).

@shulhi
Copy link
Member

shulhi commented Sep 11, 2024

Coercion for records also break this assumption (more fields can be on the object than what's represented by the record).

Honestly I think these are just quirks we'll have to live with in some form if we want optional record fields (which we do) and coercion (which we also do).

I think we should mention this caveat in the docs for optional fields. I rarely use record as the key (or a complex type), but who knows.

@cristianoc
Copy link
Collaborator

Actually that was never true because of FFI.

@cometkim
Copy link
Member Author

Coercion for records also break this assumption (more fields can be on the object than what's represented by the record).

IMO it should have runtime effects, not just be a simple type assertion.

type a = {
  foo: string,
  bar: string,
}

type b = {
  bar: string,
}

a :> b

Output should be { bar: a.bar } rather than a. It also makes more sense to extend it to custom encoding/decoding later.

I know it has a tradeoff, but we need to care a little more for integrity, even if it needs runtime.

@cometkim
Copy link
Member Author

Actually that was never true because of FFI.

At least when using FFI explicitly, users can be quite careful about it. But in pure ReScript-produced code, this behavior is somewhat surprising to me.

@cometkim
Copy link
Member Author

Honestly I think these are just quirks we'll have to live with in some form if we want optional record fields (which we do) and coercion (which we also do).

If so, we should deprecate supporting comparison operations (==, >, <, >=, <=) on records. It's a general property when dealing with immutable data, but if we can't fully support it, it's just an unreliable footgun.

@zth
Copy link
Collaborator

zth commented Sep 12, 2024

If so, we should deprecate supporting comparison operations (==, >, <, >=, <=) on records. It's a general property when dealing with immutable data, but if we can't fully support it, it's just an unreliable footgun.

This is definitely a good idea. I wonder however if anyone is using them. That type of comparison could be quite powerful (== for records anyway) but I wonder whether it actually works as expected today?

@cometkim
Copy link
Member Author

This is definitely a good idea. I wonder however if anyone is using them. That type of comparison could be quite powerful (== for records anyway) but I wonder whether it actually works as expected today?

OCaml std users may be using it, but there's no guarantee that it will work without bugs.

We're dropping OCaml std, and Belt is a little better because it makes users explicitly build their own Hashable and Comparable modules.

But still, using Hashtbl.hash or comparison operators as its implementation is a common usage pattern that as far as I know.

However, the advantage of module functions is that it makes it easy to replace the implementation. Users can get both correctness and performance by replacing in a custom implementation.

Then we could extend language support something like @deriving to support deriving hash and comparison, like Rust's derive macro, or Java's Lombok does.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants