Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

maps: add iterator-related functions #61900

Closed
rsc opened this issue Aug 9, 2023 · 27 comments
Closed

maps: add iterator-related functions #61900

rsc opened this issue Aug 9, 2023 · 27 comments

Comments

@rsc
Copy link
Contributor

rsc commented Aug 9, 2023

We propose to add the following functions to package maps, to provide good support for code using iterators.

This is one of a collection of proposals updating the standard library for the new 'range over function' feature (#61405). It would only be accepted if that proposal is accepted. See #61897 for a list of related proposals.

All serves as a “source” for iterators.

// All returns an iterator over key-value pairs from m.
func All[Map ~map[K]V, K comparable, V any](m Map) iter.Seq2[K, V] {
	return func(yield func(K, V) bool) bool {
		for k, v := range m {
			if !yield(k, v) {
				return false
			}
		}
		return true
	}
}

Keys and Values are like All: not terribly useful by themselves but useful as inputs to other iteration adapters.
In particular, we expect that x := slices.Sorted(maps.Keys(m)) will be a common pattern.

// Keys returns an iterator over keys in m.
func Keys[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[K] {
	return func(yield func(K) bool) bool {
		for k := range m {
			if !yield(k) {
				return false
			}
		}
		return true
	}
}
// Values returns an iterator over values in m.
func Values[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[V] {
	return func(yield func(V) bool) bool {
		for _, v := range m {
			if !yield(v) {
				return false
			}
		}
		return true
	}
}

Insert and Collect serve as “sinks” for iterators.

// Insert adds the key-value pairs from seq to m.
func Insert[Map ~map[K]V, K comparable, V any](m M, seq iter.Seq2[K, V]) {
	for k, v := range seq {
		m[k] = v
	}
	return m
}
// Collect collects key-value pairs from seq into a new map
// and returns it.
func Collect[K comparable, V any](seq iter.Seq2[K, V]) map[K]V {
	m := make(map[K]V)
	Insert(m, seq)
	return m
}
@rsc
Copy link
Contributor Author

rsc commented Aug 9, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor Author

rsc commented Aug 30, 2023

Finishing this proposal discussion is blocked on #61405.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/558736 mentions this issue: maps: add All, Keys, Values, Insert, Collect

@earthboundkid
Copy link
Contributor

It is unfortunate that slices.Sorted(maps.Keys(m)) won't get a length hint from m and may end up doing unnecessary allocations and copies. What if there were also maps.SortedKeys(m) that did the right thing?

@Merovius
Copy link
Contributor

@earthboundkid ISTM if that is a concern, it should maybe rather be slices.SortedLength[E any](int, iter.Seq[E]) (modulo name), as any use of "make a sorted slice from an iterator" would suffer the same issue.

I get, FWIW, that "get the sorted keys from a map" is a fairly common use case, to try and get deterministic order from the specifically non-deterministic maps. But it still seems a more general problem.

@gophun
Copy link

gophun commented Jan 30, 2024

maps.Keys could return a sequence implementation with a length hint and slices.Sorted could type assert for that, similar to what some io.Reader functions do.

@gophun
Copy link

gophun commented Jan 30, 2024

Nevermind, iter.Seq is a concrete type, not an interface.

@rsc
Copy link
Contributor Author

rsc commented Jan 31, 2024

It's true that slices.Sorted(maps.Keys(m)) will not pre-size the slice, but that's not necessarily a strike against the iterator forms. We could have a separate discussion about maps.KeysSorted and maps.KeysSortedFunc as optimizations, if that became a concern.

@rsc
Copy link
Contributor Author

rsc commented Feb 2, 2024

Have all remaining concerns about this proposal been addressed?

The full godoc is:

func All[Map ~map[K]V, K comparable, V any](m Map) iter.Seq2[K, V]
    All returns an iterator over key-value pairs from m.

func Collect[K comparable, V any](seq iter.Seq2[K, V]) map[K]V
    Collect collects key-value pairs from seq into a new map and returns it.

func Insert[Map ~map[K]V, K comparable, V any](m Map, seq iter.Seq2[K, V])
    Insert adds the key-value pairs from seq to m.

func Keys[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[K]
    Keys returns an iterator over keys in m.

func Values[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[V]
    Values returns an iterator over values in m.

@earthboundkid
Copy link
Contributor

SortedKeys can wait and be a separate proposal.

@zephyrtronium
Copy link
Contributor

It's true that slices.Sorted(maps.Keys(m)) will not pre-size the slice, but that's not necessarily a strike against the iterator forms.

More wordy, but slices.Sorted(slices.Append(make([]T, 0, len(m)), maps.Keys(m))) does preallocate the slice, as long as T can be named. And an extra generic helper func PreallocKeys[Map ~map[K]V, K comparable, V any](m Map) []K { return make([]K, 0, len(m)) } addresses the situations where it can't.

@rsc
Copy link
Contributor Author

rsc commented Feb 8, 2024

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

The full godoc is:

func All[Map ~map[K]V, K comparable, V any](m Map) iter.Seq2[K, V]
    All returns an iterator over key-value pairs from m.

func Collect[K comparable, V any](seq iter.Seq2[K, V]) map[K]V
    Collect collects key-value pairs from seq into a new map and returns it.

func Insert[Map ~map[K]V, K comparable, V any](m Map, seq iter.Seq2[K, V])
    Insert adds the key-value pairs from seq to m.

func Keys[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[K]
    Keys returns an iterator over keys in m.

func Values[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[V]
    Values returns an iterator over values in m.

@magical
Copy link
Contributor

magical commented Feb 9, 2024

func Keys[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[K]
Keys returns an iterator over keys in m.

func Values[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[V]
Values returns an iterator over values in m.

What, if anything, is guaranteed about the iteration order of Keys and Values (EDIT: and All)? I believe this should be specified in the docs.

Does these two for loops yield the same key order? I would assume not, based on analogy to range-over-map.

for k := range maps.Keys(m) {
    println(k)
}
for k := range maps.Keys(m) {
    println(k)
}

What about this?

seq := maps.Keys(m)

for k := range seq {
    println(k)
}
for k := range seq {
    println(k)
}

I can see why it might make sense from an implementation standpoint for this to also be randomized. However it also seems weird for a value of type Seq, which nominally represents a sequence, to in fact yield multiple different sequences.

@jimmyfrasche
Copy link
Member

Insert should clearly specify whether it overwrites existing keys

@fzipp
Copy link
Contributor

fzipp commented Feb 9, 2024

@earthboundkid An iter.Seq is not an iterator and does not exhaust.

@earthboundkid
Copy link
Contributor

Yes, I deleted my comment but you beat me to it. 😄

@rsc
Copy link
Contributor Author

rsc commented Feb 14, 2024

Happy to update docs to say that iteration order is undefined (different each time) and Insert does overwrite keys.

@rsc
Copy link
Contributor Author

rsc commented Feb 14, 2024

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

The full godoc is:

func All[Map ~map[K]V, K comparable, V any](m Map) iter.Seq2[K, V]
    All returns an iterator over key-value pairs from m.

func Collect[K comparable, V any](seq iter.Seq2[K, V]) map[K]V
    Collect collects key-value pairs from seq into a new map and returns it.

func Insert[Map ~map[K]V, K comparable, V any](m Map, seq iter.Seq2[K, V])
    Insert adds the key-value pairs from seq to m.

func Keys[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[K]
    Keys returns an iterator over keys in m.

func Values[Map ~map[K]V, K comparable, V any](m Map) iter.Seq[V]
    Values returns an iterator over values in m.

@rsc rsc changed the title proposal: maps: add iterator-related functions maps: add iterator-related functions Feb 14, 2024
@rsc rsc modified the milestones: Proposal, Backlog Feb 14, 2024
@magical
Copy link
Contributor

magical commented Feb 15, 2024

@rsc I don't see any doc changes

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/586716 mentions this issue: maps: add All, Keys, Values, Insert, Collect

@andig
Copy link
Contributor

andig commented Jun 30, 2024

ISTM if that is a concern, it should maybe rather be slices.SortedLength[E any](int, iter.Seq[E]) (modulo name), as any use of "make a sorted slice from an iterator" would suffer the same issue.

@Merovius there are other cases when you need to collect the iterator, like when that can only be safely done under a lock and you wouldn't want to hold onto the lock while rangeing over it. That sounds more like a general pre-allocating keys/values to slice function is desirable than a specific slices.SortedLength (refs #61626 (comment)).

@earthboundkid
Copy link
Contributor

I added an issue for slices.CollectN #68261.

@jub0bs
Copy link

jub0bs commented Sep 3, 2024

I'm coming to this quite late, but I have a question: Why do functions All, Insert, Key, and Values introduce a Map ~map[K]V type parameter? Is it purely to make their signatures more readable?

AFAIU, all those functions could have been written as follows without losing any expressive power:

func All[K comparable, V any](m map[K]V) iter.Seq2[K, V]
func Insert[K comparable, V any](m map[K]V, seq iter.Seq2[K, V])
func Keys[K comparable, V any](m map[K]V) iter.Seq[K]
func Keys[K comparable, V any](m map[K]V) iter.Seq[V]

@DeedleFake
Copy link

@jub0bs

Because custom map types, i.e. type Example map[string]string, could require manual conversion otherwise.

@jub0bs
Copy link

jub0bs commented Sep 3, 2024

@DeedleFake No; as long as the argument is of some type whose underlying type is some map, no explicit conversion is required from the caller.

package main

import (
	"iter"
)

type Example map[string]string

func main() {
	m := Example{"key": "value"}
	_ = Keys(m) // no compilation error
}

func Keys[K comparable, V any](m map[K]V) iter.Seq[K] {
	return func(yield func(K) bool) {
		for k := range m {
			if !yield(k) {
				return
			}
		}
	}
}

(playground)

@Merovius
Copy link
Contributor

Merovius commented Sep 3, 2024

@jub0bs I think the answer is "because of consistency". It does make a difference for functions like maps.Equal, because a usage like slices.EqualFunc([]T{}, maps.Equal) would not work if T is a defined map type, otherwise. That's still a bit exotic, but it made a bit more sense when we talked about slices.Equal, for example, as defined slice types are a bit more common. At that point, we pretty much decided to go all the way and put the extra type-parameters into all the functions.

Of course now people are justifiably confused about when to add them and do it inappropriately, because they cargo-cult from the slices and maps package. I've seen the question a couple of times lately and I plan to write a relatively comprehensive post about it soon.

In general, if it is foreseeable that a function might be used with a higher-order function, it makes sense to add the extra type-parameter. And to be fair, I can imagine that functions like All and Keys have usages like that, if it becomes sufficiently common to write functional pipelines over iterators. I can imagine having a higher-order iterator transformer needing something like a func(T) iter.Seq[T].

@ianlancetaylor
Copy link
Contributor

@jub0bs For consistency with the other functions in the maps package, so that an explicit instantiation always starts with the map type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

Successfully merging a pull request may close this issue.