Add support for standard library types #120

Open
27 of 33 tasks
ibokuri opened this issue Jul 23, 2023 · 6 comments
Labels
accepted: This proposal is planned.
proposal: This issue suggests modifications. If it also has the "accepted" label then it is planned.

Comments

@ibokuri
Contributor

ibokuri commented Jul 23, 2023

General

If you'd like to see a certain std type gain support in Getty, please leave a comment and I'll add it to the hit list.

Also, feel free to work on any of the types listed below. If you have any questions, you can ask them on our Discord or in this issue.

The Hit List

  • ArrayHashMap (includes Auto, String)
  • ArrayHashMapUnmanaged (includes Auto, String)
  • ArrayListAligned (includes ArrayList)
  • ArrayListAlignedUnmanaged (includes ArrayListUnmanaged)
  • IntegerBitSet (includes half of StaticBitSet)
  • ArrayBitSet (includes half of StaticBitSet)
  • DynamicBitSetUnmanaged
  • DynamicBitSet
  • BoundedArray
  • BufMap
  • BufSet
  • ComptimeStringMap
  • BoundedEnumMultiset (includes EnumMultiset)
  • IndexedArray (includes EnumArray)
  • IndexedSet
  • IndexedMap
  • LinearFifo
  • HashMap (includes Auto, String)
  • HashMapUnmanaged (includes Auto, String)
  • SinglyLinkedList
  • TailQueue
  • MultiArrayList
  • net.Address
    • Serialization support is done.
    • Deserialization support is currently broken by an aliasing @memcpy panic in std's net.zig:
      $ zig build test
      run deserialization test: error: thread 1007393 panic: @memcpy arguments alias
      /Users/jason/.asdf/installs/zig/master/lib/std/net.zig:553:70: 0x102138407 in resolve (deserialization test)
                  @memcpy(result.sa.addr[16 - index ..][0..index], ip_slice[0..index]);
                                                                           ^
      /Users/jason/.asdf/installs/zig/master/lib/std/net.zig:85:54: 0x102138dc7 in resolveIp6 (deserialization test)
              return Address{ .in6 = try Ip6Address.resolve(buf, port) };
                                                           ^
      /Users/jason/.asdf/installs/zig/master/lib/std/net.zig:58:23: 0x10213900f in resolveIp (deserialization test)
              if (resolveIp6(name, port)) |ip6| return ip6 else |err| switch (err) {
                            ^
      /Users/jason/Projects/Personal/getty/src/de/blocks/net_address.zig:63:50: 0x10213a413 in test.deserialize - std.net.Address (deserialization test)
                      .want = std.net.Address.resolveIp(ipv6, 0) catch return error.UnexpectedTestError,
                                                       ^
      /Users/jason/.asdf/installs/zig/master/lib/test_runner.zig:99:29: 0x1020cb44b in mainServer (deserialization test)
                      test_fn.func() catch |err| switch (err) {
                                  ^
      /Users/jason/.asdf/installs/zig/master/lib/test_runner.zig:33:26: 0x1020c4d57 in main (deserialization test)
              return mainServer() catch @panic("internal test runner failure");
                               ^
      /Users/jason/.asdf/installs/zig/master/lib/std/start.zig:598:22: 0x1020c48f7 in main (deserialization test)
                  root.main();
  • net.Ip4Address (?, might be covered by net.Address)
  • net.Ip6Address (?, might be covered by net.Address)
  • net.AddressList
  • PackedIntArrayEndian (includes PackedIntArray)
  • PackedIntSliceEndian (includes PackedIntSlice)
  • PriorityDequeue
  • PriorityQueue
  • SegmentedList
  • SemanticVersion
  • Uri
@ibokuri added the "accepted" and "proposal" labels on Jul 23, 2023
@ibokuri added this to the 0.5.0 milestone on Jul 24, 2023
@polykernel
Contributor

How should a data structure with multiple possible serializations be serialized? The motivating example is PriorityQueue: one possible serialization takes elements in the order returned by popping the queue, while another iterates over the queue with an iterator. Furthermore, for data structures such as EnumMultiset, there is no single obvious serialized form (e.g. a multiset can be serialized as a map or a list).

@ibokuri
Contributor Author

ibokuri commented Jul 30, 2023

I usually try to follow these rules:

  • Do what most people would expect, or whatever would serve as a reasonable default.
  • The value being serialized shouldn't be modified.

So for PriorityQueue, I'd prefer iterating over it instead of popping values off to avoid modifying the queue.
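
As a rough, language-agnostic illustration of the second rule, here is a minimal sketch that uses Python's `heapq` as a stand-in for `std.PriorityQueue`. It is not Getty's API, and `serialize_by_iterating` and `serialize_by_popping` are hypothetical names. Iterating visits elements in the heap's internal order and leaves the queue intact, whereas popping yields priority order but empties the queue:

```python
import heapq
import json


def serialize_by_iterating(heap):
    """Serialize elements in the heap's internal order; the heap is not modified."""
    return json.dumps(list(heap))


def serialize_by_popping(heap):
    """Serialize elements in priority order; this drains the heap as a side effect."""
    out = []
    while heap:
        out.append(heapq.heappop(heap))
    return json.dumps(out)


queue = []
for value in [5, 1, 4, 2, 3]:
    heapq.heappush(queue, value)

before = list(queue)
iterated = serialize_by_iterating(queue)
unchanged = queue == before      # True: iteration leaves the queue intact

popped = serialize_by_popping(queue)
drained = len(queue) == 0        # True: popping consumed every element
```

In this sketch the iterated form is not sorted, which only matters to consumers that expect priority order; a deserializer can rebuild a valid heap from either form.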

As for EnumMultiset, serializing it as a sequence seems appropriate to me. I usually think of sets as sequences and the doc comment for EnumMultiset states that it's backed by a dense array.

@ibokuri
Contributor Author

ibokuri commented Jul 30, 2023

@polykernel, after thinking for a bit, I feel like serializing EnumMultisets as maps (something like {"enum_foo": 1}, where 1 is the number of enum_foos in the set) makes more sense. Logically, serializing them as sequences seems nice but practically speaking that pretty much always just results in a ton of unnecessary tokens and parsing time for everybody.

Thoughts on representing EnumMultisets as maps instead?
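
To make the two candidate forms concrete, here is a small sketch with Python's `Counter` and `Enum` standing in for `EnumMultiset`. The `Color` enum and the `to_seq` / `to_map` helpers are hypothetical and only illustrate the shape of the output, not how Getty would implement it:

```python
import json
from collections import Counter
from enum import Enum


class Color(Enum):  # hypothetical enum used only for illustration
    RED = "red"
    GREEN = "green"


def to_seq(multiset):
    """Sequence form: one entry per element, so duplicates are repeated."""
    return json.dumps([member.value for member in multiset.elements()])


def to_map(multiset):
    """Map form: one entry per distinct member, with its count as the value."""
    return json.dumps({member.value: count for member, count in multiset.items()})


colors = Counter({Color.RED: 3, Color.GREEN: 1})
seq_form = to_seq(colors)   # '["red", "red", "red", "green"]'
map_form = to_map(colors)   # '{"red": 3, "green": 1}'
```

The sequence form grows with every duplicate, while the map form grows only with the number of distinct members.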

@polykernel
Contributor

polykernel commented Jul 31, 2023

I think it is sensible to represent EnumMultisets as maps by default given multisets are usually represented as maps in practice, but it might be useful in some cases to serialize them as sequences. Perhaps there could be a block-specific attribute to control the serialized format, but I am not sure whether block-specific attributes are desirable or scalable.

@ibokuri
Contributor Author

ibokuri commented Jul 31, 2023

Ahh okay, I haven't worked with multisets often so I wasn't aware that they're usually maps. I'll note that down in the original post.

@polykernel
Contributor

polykernel commented Aug 1, 2023

> I think it is sensible to represent EnumMultisets as maps by default given multisets are usually represented as maps in practice

@ibokuri Sorry, I worded this terribly. By "represented as maps", I meant implemented as (or similarly to) maps, not represented as maps in serialized form. On second thought, I overgeneralized: I know that in C++ (at least in libstdc++ and libc++), multiset is implemented like map except that the value being stored is the same as the key, but I am definitely not qualified to assess what the usual implementation strategy for multisets is in general.

After some more pondering, I came up with a list comparing the advantages and disadvantages of seq and map serialization. Please let me know if there are points I missed.

# Seq
+ Preserves semantics: a multiset is semantically a type of unordered collection
- Redundant processing: multiplicity is not stored explicitly in the serialized form,
  so the receiving end must recount elements to recover it
- Succinctness: the size of the encoding is proportional to the number of values in the multiset

# Map
+ Succinctness: the size of the encoding is proportional to the number of unique values in the multiset
+ Readability: a key-value mapping is more readable than a sequence with unspecified ordering
- Breaks semantics: a multiset is not semantically equivalent to a map, but rather an unordered
  collection with additional information

Based on the comparison, it seems serializing to maps is the better option. Furthermore, it may be worthwhile to support deserializing from maps as well. I will take a shot at implementing this when I have some time.
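
The trade-offs listed above can be sketched from the receiving side as well. This is a hypothetical Python illustration (`from_map` and `from_seq` are made-up names, and `Counter` stands in for a multiset), not a proposal for Getty's deserialization API. Reading the map form takes counts directly, while reading the sequence form has to recount every element:

```python
import json
from collections import Counter


def from_map(text):
    """Rebuild a multiset from the map form; counts are read directly."""
    counts = json.loads(text)
    for n in counts.values():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(n, bool) or not isinstance(n, int) or n < 0:
            raise ValueError("counts must be non-negative integers")
    return Counter({key: n for key, n in counts.items() if n > 0})


def from_seq(text):
    """Rebuild a multiset from the sequence form; counts are recovered by recounting."""
    return Counter(json.loads(text))


map_text = '{"a": 1000, "b": 1}'
seq_text = json.dumps(["a"] * 1000 + ["b"])

same = from_map(map_text) == from_seq(seq_text)   # both describe the same multiset
map_is_smaller = len(map_text) < len(seq_text)    # map size tracks unique values only
```

Note that the map form needs an extra validation step for the counts, which the sequence form gets for free.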

2 participants