Description
After reading several recent questions about broadcasting divrem and sincos (not in Base) over arrays, I think the current situation could be improved. To summarize, the issue arises when trying to compute sin.(A), cos.(A) (where A is a vector) efficiently. (minmax, extrema, divrem, fldmod, fldmod1, and reim are similar.)
The approach with tmp = sincos.(A); first.(tmp), last.(tmp) is unsatisfactory because it requires an intermediate allocation.
The approach with sin.(A), cos.(A) is unsatisfactory because there may be efficiency gains from computing sin and cos at once. I suspect some of the desire for an unzip function is caused by this problem, which has no convenient solution at the moment, aside from writing out the loop. Thinking about this problem, I have a few proposals.
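For reference, the "writing out the loop" workaround mentioned above might look roughly like the following. This is only a sketch: the name sincos_unzipped is made up for illustration, and it assumes a Julia version in which sincos is available.

```julia
# Hypothetical helper (not part of Base): fill both outputs in a single pass,
# without first building an intermediate array of (sin, cos) tuples.
function sincos_unzipped(A::AbstractVector{<:AbstractFloat})
    s = similar(A)
    c = similar(A)
    for i in eachindex(A)
        # One sincos call per element instead of separate sin and cos calls.
        s[i], c[i] = sincos(A[i])
    end
    return s, c
end

A = [0.0, 0.5, 1.0]
s, c = sincos_unzipped(A)
```

This works, but it has to be rewritten by hand for every function pair (divrem, fldmod, reim, ...), which is what the proposals below try to avoid.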
Proposal A
It may make sense to allow
(sin, cos).(A)
or
broadcast((sin, cos), A)
which returns sin.(A), cos.(A) but perhaps computed more efficiently. (In other words, an "unzipped broadcast".)
However this raises the question about whether sincos(x) and divrem(x, y) themselves might better be called (sin, cos)(x)...
Proposal B
If this syntax is too revolutionary, an alternative is to offer unzip (#13942), along with a way to make the inner broadcast lazy. That is, perhaps we could have
unzip(sincos@.(A))
which expands to
unzip(Broadcast(sincos, A))
where Broadcast is a minimal iterable object which calls broadcast upon collect, but unzip on it can be specialized to avoid the intermediate allocation.
Here I'm proposing @. as an idiom for a lazy broadcast; this has a reasonably nice parallel to @[...] for lazy indexing (i.e. viewing). This approach has the advantage that lazy broadcasts have themselves been a commonly requested feature.
(Now, unzip might itself be reasonably expected to be lazy, so we might wish to require collect(unzip(sincos@.(A))).)
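To make Proposal B more concrete, here is a rough sketch of how a lazy broadcast object and a specialized unzip could fit together. Everything here is an assumption for illustration: LazyBroadcast stands in for the Broadcast object described above (renamed to avoid clashing with other uses of that name), it handles only a single non-empty array argument, unzip is defined locally since Base does not provide one, and the output element types are guessed from the first element.

```julia
# Hypothetical minimal lazy broadcast over a single array argument.
struct LazyBroadcast{F,T<:AbstractArray}
    f::F
    arg::T
end

# Collecting the lazy object just performs the ordinary eager broadcast.
Base.collect(b::LazyBroadcast) = broadcast(b.f, b.arg)

# Hypothetical unzip, specialized so that no array of tuples is allocated.
function unzip(b::LazyBroadcast)
    isempty(b.arg) && error("this sketch assumes a non-empty argument")
    # Evaluate the first element once to learn the output element types.
    first_val = b.f(first(b.arg))
    outs = map(x -> similar(b.arg, typeof(x)), first_val)
    for i in eachindex(b.arg)
        vals = b.f(b.arg[i])
        for k in 1:length(outs)
            outs[k][i] = vals[k]
        end
    end
    return outs
end

q, r = unzip(LazyBroadcast(x -> divrem(x, 3), [7, 8, 9]))
# q == [2, 2, 3], r == [1, 2, 0]
```

A real implementation would need to handle multiple arguments, shape broadcasting, and element types that vary between calls; the point is only that unzip can dispatch on the lazy object and write directly into separate outputs.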
Proposal C
Ideally sin.(A), cos.(A) would be as efficient as computing sincos.(A). It may be possible to intercept this pattern in lowering: a tuple of broadcasts over the same symbol arguments would lower to Base._tuple_broadcast((sin, cos), A), which by default falls back to broadcast(sin, A), broadcast(cos, A) but allows specialization. I don't particularly like this kind of solution, because it breaks referential transparency.
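As a sketch of what the fallback plus specialization could look like, assuming the lowering step existed: tuple_broadcast below is a hypothetical user-level stand-in for the proposed Base._tuple_broadcast (which does not exist), and the (sin, cos) method assumes a Julia version in which sincos is available.

```julia
# Hypothetical stand-in for the proposed Base._tuple_broadcast.
# Default: broadcast each function separately, as described above.
tuple_broadcast(fs::Tuple, A) = map(f -> broadcast(f, A), fs)

# Specialization for the (sin, cos) pair: one pass, one sincos call per element.
function tuple_broadcast(::Tuple{typeof(sin),typeof(cos)},
                         A::AbstractArray{<:AbstractFloat})
    s = similar(A)
    c = similar(A)
    for i in eachindex(A)
        s[i], c[i] = sincos(A[i])
    end
    return s, c
end

A = [0.0, 0.5, 1.0]
s, c = tuple_broadcast((sin, cos), A)    # uses the specialized method
a, g = tuple_broadcast((abs, sign), A)   # uses the generic fallback
```

Dispatch on the tuple of function types is what lets packages add fused methods for pairs such as div and rem without changing the generic fallback.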