Proposal: add a built-in reduce function on the type level #12512

Closed

Description

This is my first proposal here, so feedback welcome.

TL;DR: I noticed arrays in the standard lib are assumed homogeneous (Array<T>). I'd like to see better support for heterogeneous arrays (i.e. tuples), and in looking for solutions I concluded that being able to use reduce in the type language could help a lot: it would make hard typings feasible and cut walls of generics down to one line.
To prevent confusion: I started out looking for a way to type reduce-like functions, but ended up finding a way to use a reduce-like function (in the type language) to solve something more general.

The current Array.prototype.reduce definitions naively assume that the initial value, the output, and all intermediate values share one and the same type, all following the 'homogeneous array of unspecified length' Array<T> presumption:

reduce(callbackfn: (previousValue: T, currentValue: T, currentIndex: number, array: T[]) => T, initialValue?: T): T;
reduce<U>(callbackfn: (previousValue: U, currentValue: T, currentIndex: number, array: T[]) => U, initialValue: U): U;

This assumption holds in many cases, but not in all. The remaining cases are harder to capture in the current type system, as each successive result really depends on what came before it. I believe that to properly capture this, the type system would need to be able to iterate just as the original function does (dealing with types rather than values).

Code illustrating the issue:

const pathFn = <T, K extends string|number>(obj: T, path: K[]) => path.reduce(<U>(o: U, k: K extends keyof U) => o[k], obj);
// ^ mirrors Ramda's `R.path`, or Lodash's `_.get` with an array path
// (note: `k: K extends keyof U` is pseudo-syntax for a constraint TS cannot currently express)

interface Cat {
  age: number;
  children: Cat[];
}
let bigglesworth: Cat = { age: 5, children: [ { age: 1, children: [] } ] };
let pets = { bigglesworth };
let firstChildAge = pathFn(pets, ['bigglesworth', 'children', 0, 'age']); // 1

Expected behavior:

Expected inferred type: number

Actual behavior:

Reduce concludes: object in, object out.
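As an aside (not part of this proposal): the inference asked for here can be sketched with recursive conditional types, a feature TypeScript gained later. The `PathResult` type below walks the key tuple the way a type-level reduce would; `getPath` is a hypothetical name for illustration.

```typescript
// Sketch only: PathResult recursively consumes one key per step,
// mimicking what a type-level reduce over the path would compute.
interface Cat {
  age: number;
  children: Cat[];
}

type PathResult<T, P extends readonly (string | number)[]> =
  P extends readonly [infer K, ...infer Rest]
    ? K extends keyof T
      ? Rest extends readonly (string | number)[]
        ? PathResult<T[K], Rest>
        : never
      : undefined
    : T;

function getPath<T, P extends readonly (string | number)[]>(
  obj: T,
  path: P
): PathResult<T, P> {
  // The runtime part is a plain reduce; the interesting work is in the type.
  return path.reduce((o: any, k) => o?.[k], obj) as PathResult<T, P>;
}

const pets = { bigglesworth: { age: 5, children: [{ age: 1, children: [] }] } };
const firstChildAge = getPath(pets, ['bigglesworth', 'children', 0, 'age'] as const);
// inferred as number
```

The `as const` at the call site keeps the path as a tuple of literal keys, which is exactly the "info on the array" the proposal says iteration needs.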

Relevance:

Isn't this some weird edge case?

I hear you asking.

Well, of the functions contained in Ramda, I'm under the impression that the following utilize this kind of reduce logic, and that their typings would benefit from solving this problem:

  • reduce, reduceBy, scan
  • path, pathEq, pathOr, pathSatisfies, assocPath, dissocPath, lensPath **
  • invertObj **, invert
  • mergeAll, mergeWithKey, mergeWith (though not merge)
  • pickBy (infer which to pick, types if heterogeneous)
  • fromPairs, toPairs, toPairsIn **
  • where, whereEq (relevant if calculating the result on the type-level, which requires all values at compile-time, for where functions returning actual boolean values e.g. type guards, for whereEq a value equality check)
  • flatten *
  • indexBy
  • sequence, transduce, traverse
  • applySpec
  • filter (for objects this allows typing properly rather than as a Partial given known boolean outcomes, as is possible using type guards)
  • xprod *
  • zipObj *, **
  • reverse *, **
  • pipe, pipeP, pipeK **
  • compose, composeP, composeK **
  • curry? **

*: added value presuming heterogeneous content (tuple input).
**: as is currently done for pipe/compose/curry (and, in retrospect, applicable to the others marked this way, including ironically the path example I used above), these can be worked around with overload lists covering the different possible sizes (squared in the case of xprod). That works, but it makes definition files harder to read, write, and maintain (especially when all functions are curried as well). For those cases this proposal amounts to no more than syntactic sugar collapsing all those predictable variants into one line.
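The overload-list workaround just described can be made concrete for `path` (names hypothetical; each extra path length needs one more overload, hence the "wall"):

```typescript
// One overload per supported path length; the wall grows linearly.
function path<T, K1 extends keyof T>(obj: T, p: [K1]): T[K1];
function path<T, K1 extends keyof T, K2 extends keyof T[K1]>(
  obj: T,
  p: [K1, K2]
): T[K1][K2];
// ...and so on for each additional length.
function path(obj: any, p: (string | number)[]): any {
  return p.reduce((o, k) => o?.[k], obj);
}

const pets = { bigglesworth: { age: 5 } };
const age = path(pets, ['bigglesworth', 'age']); // inferred as number
```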

I ignored silly ones like range / times that'd technically get more type info (-> array size known) but wouldn't plausibly benefit from it.

I might be off by a bit, but this seems to be on the order of 37 functions (out of 234), including ones I've yet to use, but also some I'd deem indispensable.

Summarizing:

Now, I'd wondered if we needed some generic construct for iteration on the type level, but that might raise a whole slew of new questions (what would this look like syntax-wise? where else might it be used? wouldn't the type language become a copy of the actual language?). That's not the way I'd like to go here.

Information we have:

  1. The type language supports function application.
  2. The type language has certain values built-in, e.g. the standard JavaScript types.
  3. In this case, all the above-mentioned functions could get typed better because they specifically (directly or by dependency) follow this reduce pattern (the non-trivial version, i.e. cases where the current typing system fails).

Proposal:

I believe functional programming in TypeScript could be significantly improved by exposing a reduce function on the type level.
Note:

  • This would be a function rather than a method like Array.prototype.reduce, ideally with the array as the third argument, e.g. function reduce(callback /* params: (accumulator, currentValue, currentIndex) */, initialValue, values /* Array; don't even bother running if info like length is unknown */).
  • Unlike JS's Array.prototype.reduce, this would deal with types rather than values.
  • I'd be cool with any recognizable name/capitalization, from reduce (good/bad because it seems like a regular function) to _RDC (looks more like a type-level name, unlikely to clash, still kinda recognizable, but ugly).

Examples:

Using _RDC as a name for example purposes, we can retry the example from above:

reduce<T, U, V>(callbackfn: T, initialValue?: U, values: V): _RDC(T, U, V);
const typedPathFn = <T, K extends string|number>(obj: T, path: K[]) => reduce(<U>(o: U, k: K extends keyof U) => o[k], obj, path);
let typedFirstChildAge = typedPathFn(pets, ['bigglesworth', 'children', 0, 'age']);
// number \o/

But we can do more than that.
The bigger picture here would be to also fix all of the Array methods from the standard library, so let's start looking at a simple example from there, reversing a tuple.

A function version might look like this:

reverse(tpl: []): [];
reverse<A>(tpl: [A]): [A];
reverse<A,B>(tpl: [A,B]): [B,A];
reverse<A,B,C>(tpl: [A,B,C]): [C,B,A];
...

Can we reduce that?

reverse<Ts>(tpl: [...Ts]): _RDC((acc, v) => [v, ...acc], [], Ts);
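As an aside: this particular fold can be written today with a recursive conditional type (a later TS feature), without needing a general type-level reduce. `reverseTuple` is a hypothetical helper for illustration:

```typescript
// Reverse recursively peels off the head and appends it after
// the reversed tail -- the same fold the _RDC sketch expresses.
type Reverse<T extends readonly unknown[]> =
  T extends readonly [infer Head, ...infer Tail]
    ? [...Reverse<Tail>, Head]
    : [];

function reverseTuple<T extends readonly unknown[]>(
  tpl: readonly [...T]
): Reverse<T> {
  // Runtime reverse; the cast asserts the type-level result.
  return [...tpl].reverse() as unknown as Reverse<T>;
}

const r = reverseTuple([1, 'a', true]); // inferred as [boolean, string, number]
```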

Let's look at a definition of compose, here with 1 param on the outer function:

compose<V0, T1>(fn0: (x0: V0) => T1): (x0: V0) => T1;
compose<V0, T1, T2>(fn1: (x: T1) => T2, fn0: (x0: V0) => T1): (x0: V0) => T2;
compose<V0, T1, T2, T3>(fn2: (x: T2) => T3, fn1: (x: T1) => T2, fn0: (x: V0) => T1): (x: V0) => T3;
...

Can we generalize this? I think so.

compose<V0, T0, V, T>(...fns: ...(x: V) => T, f0: (x0: V0) => T0): (x: V0) => ..._RDC((acc, f) => [f(...acc)], [V0], reverse([...fns, f0]));

We mentioned these all use 1 input param. What does varying that part look like?

compose<V0, T1>(fn0: (x0: V0) => T1): (x0: V0) => T1;
compose<V0, V1, T1>(fn0: (x0: V0, x1: V1) => T1): (x0: V0, x1: V1) => T1;
compose<V0, V1, V2, T1>(fn0: (x0: V0, x1: V1, x2: V2) => T1): (x0: V0, x1: V1, x2: V2) => T1;

Can we incorporate that?

compose<Vs, T0, V, T>(...fns: ...(x: V) => T, f0: (...xs: ...Vs) => T0): (...xs: Vs) => ..._RDC((acc, f) => [f(...acc)], [...Vs], reverse([...fns, f0]));

The opposite (R.pipe) is easier, no flipping:

pipe<Vs, T0, V, T>(f0: (...xs: ...Vs) => T0, ...fns: ...(x: V) => T): (...xs: Vs) => ..._RDC((acc, f) => [f(...acc)], [...Vs], [f0, ...fns]);
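For the unary-function case, the fold over the function list can likewise be sketched today with a recursive conditional type; `PipeResult` plays the role of `_RDC` here (names hypothetical, and this sketch does not cover the variadic first function):

```typescript
// PipeResult threads the input type through each function type in turn.
type PipeResult<In, Fns extends readonly ((x: any) => any)[]> =
  Fns extends readonly [(x: infer A) => infer B, ...infer Rest]
    ? In extends A
      ? Rest extends readonly ((x: any) => any)[]
        ? PipeResult<B, Rest>
        : never
      : never
    : In;

function pipe<In, Fns extends readonly ((x: any) => any)[]>(
  x: In,
  ...fns: Fns
): PipeResult<In, Fns> {
  // Runtime part is an ordinary left fold over the functions.
  return fns.reduce((acc: any, f) => f(acc), x);
}

const out = pipe(2, (n: number) => n + 1, (n: number) => String(n));
// inferred as string
```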

How about curry, as given by sandersn in his Variadic Kinds proposal? There, he gave an example for the case of dividing the parameters over 2 function applications: curry<...T,...U,V>(f: (...args:[...T,...U]) => V, ...ts:...T): (...us: ...U) => V. Let's try to make that more general:

curry<V,...A>(f: (...args:[...A]) => V, ...as:...A): V
curry<V,...A,...B>(f: (...args:[...A,...B]) => V, ...as:...A): (...bs: ...B) => V
curry<V,...A,...B,...C>(f: (...args:[...A,...B,...C]) => V, ...as:...A): (...bs: ...B) => (...cs: ...C) => V
curry<V,...A,...B,...C,...D>(f: (...args:[...A,...B,...C,...D]) => V, ...as:...A): (...bs: ...B) => (...cs: ...C) => (...ds: ...D) => V
...

Let's try to reduce.

curry<V,...A,...T>(f: (...args:[...A, ...T]) => V, ...as?:...A): _RDC((acc, par) => (par) => acc, V, reverse(T))

This appears to cover only the case where arguments not passed immediately must then be passed in one by one (so the full wall above is still more useful), but it's a start.
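That one-argument-at-a-time shape can also be sketched today with a recursive conditional type plus a small runtime helper (`Curried` and `curry` are hypothetical names for illustration):

```typescript
// Curried peels one parameter type per application until none remain.
type Curried<Args extends readonly unknown[], R> =
  Args extends readonly [infer H, ...infer T] ? (arg: H) => Curried<T, R> : R;

function curry<Args extends unknown[], R>(
  f: (...args: Args) => R
): Curried<Args, R> {
  // Collect arguments one at a time until f's arity is reached.
  const step = (collected: unknown[]): unknown =>
    collected.length >= f.length
      ? f(...(collected as Args))
      : (arg: unknown) => step([...collected, arg]);
  return step([]) as Curried<Args, R>;
}

const add3 = curry((a: number, b: number, c: number) => a + b + c);
const six = add3(1)(2)(3); // inferred as number
```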

To revisit those reduce definitions from the standard library, these could then be fixed up with a list of generic tuple definitions for various lengths:

interface Tuple2<T,U> extends [T,U] {
  reduce<X,Y>(callbackfn: X, initialValue?: Y): _RDC(X, Y, this /* is this right? */);
}

Note that these would also need to capture the case of homogeneous arrays, since info on array size (and individual keys) is relevant. Note also that the maximum tuple length would in this way become a bottleneck for the method version of the definition.

So, would you have these definitions generate the original definition walls?

I think that if we generated the 'resulting' possible types (i.e. the list of options [A], [A,B], [A,B,C], ...) when going from TS -> .d.ts, we'd have to choose between (1) limiting to a fixed number and (2) keeping our files clean. That sounds like a lose-lose trade-off, so I'd rather either generate the options in-memory or have TS calculate them as needed.

Language Feature Checklist:

SYNTAX:
Since from a definition writer's perspective this feels intuitively similar to using 'regular' functions (function application: F(T)) on the type level, I'd be inclined to have it use the same syntax.
Since this is on the type level and uses existing syntax, I expect no clashes with ES.

SEMANTICS:

  1. What is an error under the proposed feature? Show many examples of both errors and non-errors
  • I believe this should function largely the same as applying any function on the type level, although under the hood it would be different, in the sense that TS would need to trigger this 'special' function rather than doing the type-level function application directly. So if one iteration of applying the callback function were to result in a TS error, this should throw to bubble up rather than continue to iterate.
const prop = <U,T>(o: U, k: T extends keyof U) => o[k];
_RDC(prop, pets, ['bigglesworth', 'children', 0, 'age']) // ok, number
let runtime_path: string[];
_RDC(prop, pets, runtime_path) // not good, any
let runtime_pets: { [k: string]: Cat };
_RDC(prop, runtime_pets, ['bigglesworth', 'children']) // Cat[]? <-- this should probably work? assumes existence of Cat 'bigglesworth'. Otherwise make it `Cat[] | null` / `undefined`?
_RDC(prop, runtime_pets, ['bigglesworth', 'children', 0, 'age']) // number? <-- also a 'should probably work, I guess?'; on top of the above, it also assumes existence of child 0.
// not shown: bad callback functions (parameter 1). This would just be any type-level function, and be applied as such in the reduce function with the accumulator and next value, so wherever that fails, you have a bad... something.
  2. How does the feature impact subtype, supertype, identity, and assignability relationships?
  • I don't foresee special interaction.
  3. How does the feature interact with generics?
  • I don't foresee special interaction. That said, for arrays typed such that info on their distinct values/types has been lost, no iteration could be done (-> any).

EMIT:

  1. What are the effects of this feature on JavaScript emit? Be specific; show examples
  • N/A
  2. Does this emit correctly in the presence of variables of type ‘any’? Features cannot rely on runtime type information
  • It'd need to return any as well in this case -- all required info must be inferable for this to add value.
  3. What are the impacts to declaration file (.d.ts) emit?
  • N/A
  4. Does this feature play well with external modules?
  • N/A

COMPATIBILITY:
No breaking changes.

OTHER:

  1. Can the feature be implemented without negatively affecting compiler performance?
  • Looping through would cost more compute than ignoring these cases. But then the relevant type info would be lost, and presumably most usage would be fairly limited in scope -- when accurate info on the array (e.g. length) is unknown, iterating can be skipped, as there would be no point (not enough info to infer anything more specific than the existing Array<T>-based methods).
  2. What impact does it have on tooling scenarios, such as member completion and signature help in editors?
  • Positive -- inference should greatly improve for code making use of this.

P.S.: I'm also hoping for improvement for map. I think these two form the basis for a lot of the basic FP operations, and with these out of the way FP in TS should become a lot more pleasant.
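As an aside on `map`: homomorphic mapped types (a TS feature in this area) already map over tuples element-wise on the type level. A hedged sketch, with `Boxed` and `boxAll` as hypothetical names:

```typescript
// A mapped type over a tuple produces a tuple: each element type T[K]
// is transformed in place, here by wrapping it in { value: ... }.
type Boxed<T extends readonly unknown[]> = { [K in keyof T]: { value: T[K] } };

function boxAll<T extends readonly unknown[]>(...xs: T): Boxed<T> {
  // Runtime map; the cast asserts the element-wise type-level result.
  return xs.map((v) => ({ value: v })) as unknown as Boxed<T>;
}

const boxed = boxAll(1, 'two');
// inferred element-wise: [{ value: number }, { value: string }]
```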

