Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Tuple Types for Go #64457

Closed
griesemer opened this issue Nov 30, 2023 · 23 comments
Closed

proposal: Tuple Types for Go #64457

griesemer opened this issue Nov 30, 2023 · 23 comments
Labels
dotdotdot ... LanguageChange Suggested changes to the Go language Proposal v2 An incompatible library change
Milestone

Comments

@griesemer
Copy link
Contributor

griesemer commented Nov 30, 2023

NOTE: This is a write-up of an internally worked out idea for tuple types, inspired by some of the tuple proposals listed below. The primary reason for this write-up is simply to document this work for future reference.

Importantly, at this point we do not actually propose that we pursue any of these ideas further because there don't seem to be convincingly strong use cases (where a struct wouldn't be a better choice) that justify the extra language complexity.

Tuple Types for Go

Various proposals suggest the introduction of a tuple type or related mechanism to Go:

This is an alternative proposal, most closely related to #32941, but with more details worked out.
It uses the same lightweight and intuitive syntax suggested by some of these earlier proposals.
The syntax directly matches the commonly used mathematical notation for tuples: (a, b), (a, b, c), etc.
The proposed changes are backward compatible with existing code, with a tiny exception: array literals that use the [...]E array type notation where the closing bracket ] is not on the same line as the ... will fail to parse. Such code is extremely unlikely to occur in practice.

Key features:

  • A tuple type is a new type (not a struct).

  • An implicitly typed new tuple value is created by writing down the tuple elements enclosed in parentheses,
    like (a, b, c) for the 3-tuple with elements a, b, and c.

  • A tuple type literal is written like a tuple value, but with types rather than values as elements.

  • A tuple value is unpacked into a multi-value (like the result of a multi-valued function) with the ... operator:
    (a, b, c)... unpacks the tuple (a, b, c) into the values a, b, c.
    (This avoids the need for an unpack builtin and reads pretty nicely.)

  • A tuple element may be accessed (but not set) with a selector indicating the element field number.
    For instance, t.0 and t.1 access the elements 0 and 1 of tuple t.
    (This may be a feature that we may want to leave out, at least in the beginning.)

  • A tuple may be converted into a struct and vice versa if they have the same sequence of element and field types, respectively.

That is the essence of the entire proposal.

Tuple types

A tuple is a finite sequence of values, called the tuple elements, each of which has its own type.
An n-tuple is a tuple with n elements, and n is the length of the tuple.
If n is zero, the tuple is the empty tuple.
A tuple type defines the set of all tuples of the same length and with the same sequence of element types.

TupleType = "(" [ Type { "," Type } [ "," ] ] ")" .

To distinguish a single-element tuple type from a parenthesized (scalar) type, the element type must be followed by a trailing comma.
Note: We may want to disallow single-element tuple types (see the discussion section).

The result parameter list of a multi-valued function is not a tuple type, rather the existing syntax requires that such multi-valued results are enclosed in parentheses.

To avoid ambiguities (and for backward compatibility), a function returning a single tuple must enclose a tuple type literal (but not a named tuple type) in an extra pair of parentheses (see the discussion section).

()                      // empty tuple type
(int,)                  // single-element tuple type, requires trailing ","
(int, string)           // pair
(int, int, error)       // 3-tuple type
(int, int, error,)      // trailing comma is permitted
func f() ((int, int))   // f returns a single (int, int) tuple result

Implicitly typed tuple expressions

A tuple expression creates a value of an n-tuple type from n individual values.
For a given tuple expression with n elements (x0, x1, …, xn-1), the corresponding tuple type is the n-tuple type (t0, t1, …, tn-1) where the types ti correspond to types of the corresponding tuple elements xi.
If a tuple element is an untyped constant, the corresponding tuple element type is the default type of that constant.

TupleExpr = "(" [ Expr { "," Expr } [ "," ] ] ")" .

To distinguish a single-element tuple expression from a parenthesized (scalar) value, the tuple element must be followed by a trailing comma.
Note: We may want to disallow single-element tuple expressions (see the discussion section).

()                    // empty tuple value
(1,)                  // single-element type value, requires trailing ","
(1, "foo")            // (int, string) tuple
(1, "foo", false)     // (int, string, bool) tuple
(1, "foo", false,)    // trailing comma is permitted
return (1, 2)         // function returning a single tuple value

Typed tuple expressions

If a tuple type is provided explicitly, the usual composite literal syntax may be used to create tuple values.
Tuple elements may be omitted from right to left; omitted elements assume their respective zero value for the given tuple type.

type T (uint, string)
T{}                          // zero value of type T
T{1}                         // (uint(1), "") of type T
T{1.0, "foo"}                // (uint(1), "foo") of type T
(float32, bool){2, false}    // (float32(2), false) of type (float32, bool)

Unpacking tuples

A n-tuple value unpacks into a sequence of n values (a multi-value) with the ... operator.
The sequence of the result types is the sequence of the tuple element types.

The result values may be assigned to variables or passed as arguments to functions following the same rules that apply to functions returning multiple values.

UnpackExpr = PrimaryExpr "..." .

An unpacked tuple may not appear as part of a list of other values.

()...                           // not permitted in statement context

x, s := (1, "foo")...           // x = 1, s = "foo"

t := (s, 1.2)                   // t = ("foo", 1.2)
y, _ := t...                    // y = "foo"

func f(string, float64)
f(t...)                         // f("foo", 1.2)

To make unpack operations work syntactically (without the need for an explicit semicolon), if the ... token appears at the end of a line, a semicolon is automatically inserted into the token stream immediately after the ... token. This will break array literals using the [...]E{...} notation if the closing bracket ] appears on a different line than the ..., a situation that is extremely unlikely to occur in actual Go code.

It may be possible to avoid the unpacking operator ... entirely by adjusting assignment rules such that "the right thing" happens based on the number of variables on the LHS of an assignment and whether the RHS is a tuple or not. For instance

a, b, c = t

could be permitted if t is a 3-tuple (to be clear, we are not suggesting this in this proposal - it's just a possibilty one might want to explore more). Using an explicit unpacking operator like ... seems like a good idea for readability, especially when passing an n-tuple to a function requiring n arguments. It also avoids an ambiguity with comma-ok expressions such as t, ok := <-c where it would be unclear what should happen if c's element type is a tuple type.

Accessing tuple elements

A tuple element may be accessed (read) directly with an element selector expression indicating the tuple element number, which must be in the range from zero to the tuple length minus one. Unless it is zero, the element number must start with a non-zero decimal digit.

ElementSelectorExpr = PrimaryExpr "." elementNumber .
elementNumber       = decimal_digit { decimal_digit } .

The element number is either 0 or must start with a non-zero digit.
An element selector expression denotes a value, not a variable that can be set.
It is not an addressable operand.

a, b, c, d := t...
x := t.0                        // x == a
z := t.2                        // z == c
t.1 = 3.14                      // invalid: cannot set a tuple element
p := &t.3                       // invalid: tuple elements are not addressable

Note: This is perhaps the most controversial part of this proposal.
It may be better to leave this feature out, to discourage uses of tuples where a proper struct would be the better choice.
See also the section Tuples considered harmful? below.

Tuple conversions

A tuple value t can be explicitly converted to a struct type S (or a struct value s to a tuple type T) if the tuple length matches the number of struct fields and each tuple element type in the sequence of tuple elements is identical to the corresponding struct field type in the sequence of struct fields.

type S struct {
	x int
	_ float64
	s string
	*embedded
}

t := (1, 2.0, "foo", (*embedded)(nil))
s := S(t)

type T (int, float64, string, *embedded)
t = T(s)

Tuple identity

Two tuple types are identical if they have the same length and if corresponding tuple element types are identical.

Tuple comparisons

Tuple types are comparable if all their element types are comparable.
Two tuple values are equal if their corresponding element values are equal.
The elements are compared in source order, and comparison stops as soon as two element values differ (or all elements have been compared).

Zero values for tuples

The zero value for a tuple type is a tuple where each element has the zero value for the element type.

Discussion

Empty tuples

The primary use of empty tuples is situations where we currently use empty dummy structs.
For instance, one might define a set type using a map and tuple type like so:

type IntSet map[int]()

or send a signal without payload on a channel

ch <- ()

Single-element tuples

Single-element tuples are problematic as they overlap syntactically and semantically with scalar values.
The proposed syntax proposes using a trailing comma to distinguish between the parenthesized value (x) and the single-element tuple (x,).
A function signature declaring a result type of the form (T,) returns a scalar value of type T rather than a value of tuple type (T,) because a single function result parameter may be enclosed in parentheses and be followed by a trailing comma.

It is not obvious that single-element tuple types are worth the (small) extra complexity. Nor is it clear that they would be particularly useful. There are several approaches one could take:

  1. Do nothing
    Single-element tuples are permitted as proposed for consistency.

  2. Disallow single-element tuples.
    A single-element tuple type (T,) or tuple value (x,) is rejected.
    This can be achieved in three ways: either by treating (T,) or (x,) like a parenthesized type or scalar value (and thus newly permitting a trailing comma in parenthesized expressions); by rejecting a tuple type or value if n == 1 (assuming a trailing comma is present); or by changing the syntax such that trailing commas are not permitted.

  3. Treat a single-element tuple as a parenthesized scalar - they are one and the same.
    (T,) and (x) mean the same as (T) and (x), and a parenthesized scalar may be used like a single-element tuple.
    Unpacking a parenthesized scalar is a no-op: (x)... = (x) = x.
    Converting a parenthesized scalar to a matching single-field struct is permitted: S((x)) = S{x}.
    With this approach, and independently from it, there is still a choice between permitting or disallowing trailing commas syntactically.

Suggested initial approach

Disallowing single-element tuples by changing the syntax to

TupleType = "(" [ Type { "," Type } ] ")" .
TupleExpr = "(" [ Expr { "," Expr } ] ")" .

(which makes it impossible to distinguish between a tuple and a scalar) would permit switching to one of the other approaches in a backward-compatible way in the future. This might be a good initial approach to take.

A word on function signatures

If Go were designed with tuple types in mind from the start, functions would not need the notion of a multi-value return.
Instead, they might simply return a tuple value if needed.
The ... unpack operator would be used to unpack a function tuple result into a multi-value.
In such a world one would not need the extra parentheses around the literal result tuple type.

func f() (int, string)    // hypothetical: f returns an (int, string) tuple
a, b := f()...            // unpack result for multi-value assignment

However, in such a world it would also be impossible to name individual function result parameters (assuming that there's no explicit way to name tuple fields except for access).

To disambiguate between a function returning multiple results and a function returning a single tuple, either a named tuple type or extra parentheses must be used:

func f() T                  // g returns a tuple of type T
func g() ((int, string))    // f returns an (int, string) tuple

Tuples considered harmful?

Various sources report that using tuples indiscriminately harms code readability and makes code evolution
difficult (thanks to @bcmills for this list):

  • Google's Guava libraries in Java
  • Chromium project in C++
  • Tuples are evil (Scala)
  • std::pair considered harmful (C++)

A primary case against tuples is their use when a struct type would be a better choice, because of the ability to give names to struct fields, because fields can be added in the future by simply changing the struct type, and because some tuple libraries permit individual tuple field (read and write) access.

In contrast, tuple types as presented here can be given names like any other types, so extra tuple elements can be added by changing the tuple definition in one place (actual tuple expressions will still need to be updated, though, and that may be desirable).

Individual tuple elements can be accessed but not set or addressed.

Finally, tuples represent immutable values: once created they can only be taken apart, not modified.

Implementation

To study issues with this proposal we have built a basic prototype implementation that can parse, print, and type-check most tuple operations as proposed. The implementation is surprisingly straightforward.

No particular difficulties are anticipated for the compiler's backend as a tuple value can be treated essentially like a struct value.

The API of the reflect package and various libraries (go/ast, go/types) would need to be adjusted to accommodate for the new type.

An alternative (and much simpler implementation) might treat tuples as syntactic sugar for special tuple types (see #63221). Such an approach would simplify the design and implementation significantly because all the usual rules for structs would apply, and all but parsing code would be shared. A tempting choice is to map an n-tuple to a struct with n fields of blank (_) name. That falls apart because blank fields are ignored for struct comparison. But there are other alternatives.

@gopherbot gopherbot added this to the Proposal milestone Nov 30, 2023
@griesemer griesemer changed the title proposal: tuple types proposal: Tuple Types for Go Nov 30, 2023
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/546076 mentions this issue: cmd/compile/internal/syntax: implement tuple parsing

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/546077 mentions this issue: cmd/compile/internal/syntax: implement tuple printing

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/546078 mentions this issue: cmd/compile/internal/types2: implement tuple type-checking

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/546079 mentions this issue: cmd/compile/internal/syntax, types2: implement ... unpack operator

@afocus
Copy link

afocus commented Nov 30, 2023

If single-element tuples are prohibited, the expander '...' can be omitted.

If the left side is a single value then it returns the tuple itself
If it is multi-valued, the tuple is considered to have been expanded.

This is also very compatible with functions with multiple return values.

@jimmyfrasche
Copy link
Member

In the Tuple Conversions section, the value 2 is assigned to the field _ float64. Does that get ignored?

If you have, say, a func f() (int, error) how do you convert the result of f() into a (int, error) tuple? (without having to write N generic functions for each N-length tuple you need to handle)

Can you use keyed literals like (int, string){1: "?"} == (0, "?")

How would these tuples be represented in go/ast?

You can send a tuple value in an interface to a dependency written for an earlier version of Go unaware of the new reflect.Kind. Will any special steps need to be taken to rollout to help with situations like that?

@felix021
Copy link

For a function returning a 2-tuple:

func foo() ((int, int))

Is it both legal to get the result as:

t := f() // get a tuple

a, b := f() // unpack the tuple

Might be a bit confusing when writing code.

Or, it might be better to introduce a new syntax for unpacking (at least in this case):

(a, b) := f() // unpack the tuple

@atdiar
Copy link

atdiar commented Nov 30, 2023

@felix021 from what I understand you need the ... operator to unpack.

Proposal lgtm overall.
I'd have some questions regarding type parameters and variadics but that's the direction anyway and that can be addressed later.

@dfinkel
Copy link
Contributor

dfinkel commented Nov 30, 2023

It's worth noting that the inability to assign to tuple fields nor take their addresses creates a back-door syntax for marking a set of values as immutable at runtime.

I'd rather have a real readonly attribute, but without an equivalent of C/C++'s const keyword, I can see some interfaces using tuples instead of structs in some places to get immutability. (which would be a net reduction in readability)

@DeedleFake
Copy link

DeedleFake commented Nov 30, 2023

Avoiding ... as proposed would make it impossible to unpack a tuple directly out of chan read or a map access, unless it was assumed to always be an unpack which I would very much not like. For example, what happens if I do

c := make(chan (int, string), 1)
c <- (3, "Example")
v, ok := <-c // Is this a read and unpack or just a regular two-value read?

Also, for a bit of prior art, Python handles single-value tuples in the way proposed, requiring a comma to indicate them, as does Rust. Just thought that that should be mentioned somewhere.

My personal opinion on tuples is that while I do very much want them to be added to Go, I think that there's little point in adding them at this point unless they're either some kind of syntax sugar for struct types or a new type of struct. I quite like #63221, for example, but I could also see them being added as something like var v struct(int, string) and v = struct(3, "Example").

@earthboundkid
Copy link
Contributor

I imagine this would be legal?

type MyTuple (int, string)

func (m MyTuple) DoSomething() {}

If tuples aren't addressable, I guess func (m *MyTuple) DoSomething() would be illegal?

@Merovius
Copy link
Contributor

A word on function signatures […]

I understand why this is being proposed that way. But the single biggest reason I want tuples is to be able to write generic functions that operate on function types with an arbitrary number of arguments/returns. In particular, this means we can't use tuples to circumvent the Seq vs. Seq2 issue of #61897 - you can't stuff strconv.Atoi into a generic Map, for example, as func(string) (int, error) has the wrong number of returns. So you need to explicitly wrap it into a func(string) ((int, error)), writing

for t := xiter.Map(flag.Args(), func(s string) ((int, error)) { return strconv.Atoi(s) }) {
    i, err := t...
    // …
}

TBQH I don't particularly see the point of tuples, at that point. They are immutable, sure, but why are immutable product types important, but immutable pointers or slices not?

I think the only benefit that's left is that they are syntactically lighter than structs. Which, fair enough.

Arguably, my generics-concern is better addressed by some form of variadic type parameters anyways, as you probably want a way to manipulate the wrapped function arguments anyways (e.g. you might want a func[In...,Out... any] Must(f func(In) (Out, error) func(In) Out - strawman syntax, because I can't actually imagine how that would look). But that doesn't make tuples any more useful.

So… my immediate reaction to this proposal is a resounding shrug, I think.

@jimmyfrasche
Copy link
Member

@dfinkel while the tuple value itself is immutable but you could still have:

var s *T = newT()
t := (s, 0)
s.Mutate()

@carlmjohnson my understanding is that the values in the tuple aren't addressable so you can't do &t.0 but I imagine the tuple itself follows the ordinary rules

@Merovius I don't think tuples could address that because you'd still need to specify the tuple itself, specifically its types and therefore their number.

@griesemer
Copy link
Contributor Author

Thanks for all the feedback!

Just to be clear, as I said in the opening section, at this point we do not actually propose that we pursue any of these ideas further. This was simply meant to document a concrete experiment that could (if one wanted) be made into something real.

I was primarily interested (at first) to find out if there were any syntactic problems with using a nice tuple notation that doesn't require special markers (there doesn't seem to be any problems); and secondly, I wanted to explore some of the semantic aspects.

@griesemer
Copy link
Contributor Author

Some people took the time to think this through and brought up good points, so even though we don't plan to do anything further here, I think they deserve answers:

@jimmyfrasche (comment) The 2 value that gets assigned to the _ float64 field should be treated the same way as it would if a struct literal was assigned instead of a tuple.

Converting the result of a multi-valued function into a tuple would require explicit assignment to variables and then the creation of a tuple. A tuple type is a real type, different from a multi-valued result.

Also, one cannot use keyed literals. The point here was to restrict tuples to areas where a struct may be overkill. In most cases, a struct is the better solution. I think it's probably a mistake to create "data structures" where tuples play a more important role besides just basic data values, exactly for the reasons pointed out by some of the critics of tuple types. In other words, I see a tuple as a kind of "compound basic type", nothing more.

To see how they are implemented in the go/ast, look at CL 546076.

Sending a tuple value via an interface to a piece of code unaware of tuples would have to be handled somehow. I haven't thought about it.

@felix021 (comment) A tuple must be explicitly unpacked per this proposal. I mentioned at the end that one might be able to make it all work without explicit unpacking, but as you say, it could be confusing, which is why I think an explicit unpack operation is better.

@dfinkel (comment) That point is that tuples are immutable - like other basic types.

@DeedleFake (comment) Unpacking must be done explicitly. So v, ok := <-c is the usual comma-ok operation, and v may be a tuple value. You could do <-v... in which case the unpack happens immediately, but you don't get the ok (and you may get a panic). Having an automatic unpack wouldn't work in this case because of the ambiguity, so this is actually a good example justifying the need for an explicit unpack operation.

Also, thanks for the information about the trailing comma use in Python and Rust. I didn't know about that detail, but it's also a pretty obvious approach (we do this in Go to address an ambiguity with type parameters and array declarations).

@carlmjohnson (comment) A tuple type is a real type and so one could define methods on it. Tuples are addressable but tuple elements are not. Similar to how a float64 value is addressable, but it's exponent field is not.

@Merovius (comment) I think using tuples in any more complex way leads to all the problems outlined by the critics of tuples. I really see them just as super-lightweight structs, to be used exactly (and only) when a struct seems overkill. I see them as a way to make slightly more complex basic types. For everything else, we should be using what we already have. But of course that's just my opinion.

@jimmyfrasche
Copy link
Member

Converting the result of a multi-valued function into a tuple would require explicit assignment to variables and then the creation of a tuple. A tuple type is a real type, different from a multi-valued result.

One of the use cases for tuples is to send the result of a function down a channel. For example a chan (T, error) is quite ergonomic to receive from but without a way to lift a multi-valued result into a tuple sending would require two lines:

v, err := f()
ch <- (v, err)

You could do

func T2[X0, X1 any](x0 X0, x1 X1) ((X0, X1)) {
  return (x0, x1)
}

and then write

ch <- T2(f())

but then you need a T3 and maybe a T4 etc. and you're in the same position as you were pre-tuples where you're having to write multiple versions of the same thing each for a different index. So it seems like you'd still need the pack builtin from #63221 but only for this one case.

@atdiar
Copy link

atdiar commented Nov 30, 2023

@Merovius that's a bit early (to say the least) to imagine how that could work but I think that one would want In and Out to be types.
In that case, tuples would be amenable to this.
It's probably not perfect but I don't see how these variadic type parameters could work otherwise.

@ianlancetaylor ianlancetaylor added LanguageChange Suggested changes to the Go language v2 An incompatible library change labels Dec 2, 2023
@urandom
Copy link

urandom commented Dec 2, 2023

One thing I'm missing is the ability to be able to convert between a function's return values and a tuple, for the purpose of easily passing the function's values through a channel.

That's probably the place I'd use such a type myself

@jimmyfrasche

This comment was marked as outdated.

@DeedleFake
Copy link

@jimmyfrasche

That's unrelated. @urandom is asking for the ability to do something more like

func Example() (int, string) { ... }

c := make(chan (int, string))
c <- Example()

@jimmyfrasche
Copy link
Member

@DeedleFake I was replying to another comment that has since been deleted

The specific case @urandom's asking about is part of the more general "convert multiple returns into a tuple".

Since t... is being used to unpack ...f() would be the natural choice, making your example c <- ...Example(), though that doesn't look great to me

@Merovius
Copy link
Contributor

Merovius commented Dec 6, 2023

c <= (Example(),) might be a syntactical option.

@griesemer
Copy link
Contributor Author

griesemer commented Dec 8, 2023

I am going to retract this proposal in favor of #33080 and its rephrasing #64613.
I think those proposals are more light-weight, get us close to what tuples would provide without the need to introduce a new type, and the ideas there have received a lot of positive feedback (see emoji voting in #33080).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dotdotdot ... LanguageChange Suggested changes to the Go language Proposal v2 An incompatible library change
Projects
None yet
Development

No branches or pull requests