Composite operators / subgraphs #907

@fdwr

ML libraries collectively define hundreds of operators, and implementing them all in a web standard is not feasible. Instead, we should at least support a core operator set mature enough that larger aggregate operators can be composed from it. Ningxin presented this idea at TPAC 2024: you define a composite operator (such as multi-head attention, which is not in WebNN) and then execute it. If the backend has a compatible built-in implementation of the subgraph, passing that higher-level expression through the user agent to the backend can be simpler and more efficient than recognizing the patterns nested throughout the graph, modifying the graph, and re-fusing them.

Below is a brief sketch of the idea, which may turn out quite differently after we think more about it:

// Assume tanh was not already a built-in WebNN operator:
// tanh(x) = (exp(2 * x) - 1) / (exp(2 * x) + 1)
function buildTanh(builder, inputDesc)
{
    // Declare the input once and share exp(2 * x) between numerator and denominator.
    const input = builder.input("input", inputDesc);
    const exp2x = builder.exp(builder.mul(builder.constant(inputDesc.dataType, 2), input));
    const one = builder.constant(inputDesc.dataType, 1);
    const tanh = builder.div(builder.sub(exp2x, one), builder.add(exp2x, one));
    // buildSubgraph is the proposed API (it does not exist in WebNN today);
    // the name lists identify the subgraph's inputs and outputs.
    return builder.buildSubgraph(tanh, ["input"], ["output"]);
}

...

let tanh = buildTanh(graphBuilder, inputDesc);
let tanhResult = graphBuilder.subgraph(tanh, {"input": input});
let mulResult = graphBuilder.mul(tanhResult.output, ...);
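
As a quick sanity check of the decomposition itself, independent of WebNN, the identity can be evaluated on scalars in plain JavaScript and compared against Math.tanh. The helper names tanhDecomposed and maxAbsError are hypothetical, chosen only for this sketch.

```javascript
// Scalar version of the decomposition used in buildTanh:
// tanh(x) = (exp(2 * x) - 1) / (exp(2 * x) + 1)
function tanhDecomposed(x) {
    const exp2x = Math.exp(2 * x);
    return (exp2x - 1) / (exp2x + 1);
}

// Largest absolute difference from Math.tanh over the given sample points.
function maxAbsError(samples) {
    return Math.max(...samples.map((x) => Math.abs(tanhDecomposed(x) - Math.tanh(x))));
}

const error = maxAbsError([-3, -1, -0.5, 0, 0.5, 1, 3]);
console.log("max abs error:", error);

// For large positive x, exp(2 * x) overflows to Infinity and the quotient is NaN.
console.log("tanhDecomposed(1000):", tanhDecomposed(1000));
```

The overflow case suggests that a decomposition meant as a real fallback would probably want a numerically safer form, particularly for float16, where exp overflows at much smaller inputs.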

Benefits:

  • Enables the API to support new operators earlier (for niche operators that might never be part of WebNN or are not in the official API yet), falling back to the decomposition when absent.
  • Boosts performance when the backend has a native mapping for the composite operator.
  • Supports large operators like "attention" and "mixture of experts" without permanently complicating the API with heavyweight but potentially non-durable operators (as we've seen with large operators like LSTM and GRU that rise and fall in popularity).
  • The pattern matching only has to be done once (at subgraph creation), not multiple times across potentially thousands of nodes.
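
The fallback behavior described in these benefits could look roughly like the sketch below. It is a plain JavaScript model, not WebNN API: lowerSubgraph, the subgraph record layout, and the set of fused backend operators are all hypothetical. It also assumes, purely for illustration, that a subgraph carries a name hint, which is still an open question in this issue; a pattern-matching backend would replace the lookup with a structural comparison.

```javascript
// Hypothetical model: a subgraph carries an optional name hint plus the
// list of primitive operators it decomposes into.
function lowerSubgraph(subgraph, backendFusedOps) {
    // One lookup per subgraph definition, instead of pattern matching
    // across every node of the full graph.
    if (subgraph.name !== undefined && backendFusedOps.has(subgraph.name)) {
        return { kind: "fused", ops: [subgraph.name] };
    }
    // No backend mapping: emit the decomposition unchanged.
    return { kind: "decomposed", ops: [...subgraph.decomposition] };
}

const tanhSubgraph = {
    name: "tanh",
    decomposition: ["mul", "exp", "sub", "add", "div"],
};

// A backend that has a fused tanh uses it directly.
const fused = lowerSubgraph(tanhSubgraph, new Set(["tanh"]));
// A backend without one falls back to the primitive operators.
const fallback = lowerSubgraph(tanhSubgraph, new Set(["gelu"]));

console.log(fused);
console.log(fallback);
```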

Considerations:

  • How do we propagate data types? Ideally the tanh subgraph above could be reused with either float16 or float32 inputs, without creating a separate graph per data type. Currently the input definition must be fully specified, but it would be useful for graphBuilder.input("input") to remain unresolved at subgraph creation time and be resolved at subgraph usage time (meaning shape and type propagation is delayed until knowable). This also affects constant(), which currently expects a concrete MLOperandDataType rather than lazily accepting the type of another MLOperand or offering a way to cast to the type of another tensor (like ONNX CastLike).
  • How do we propagate input shapes? We should be able to reuse a subgraph with multiple input shapes, but inputs currently require a concrete tensor description with known shapes, so these subgraphs would only work with input tensors of exactly that shape. Would this interact with "Support flexible input sizes" (#883)? This affects constants too: would we want a constant-of-shape overload?
  • Should we pass a name string along to buildSubgraph to aid the backend in verifying custom operator compatibility, rather than pure pattern matching? If so, what about backends that use different names for operators? Should it be a list of possible strings then (or is pure pattern matching better)?
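
One way to picture the deferred type and shape resolution raised above is to treat the subgraph as a template that is only bound to a descriptor when it is used. The sketch below is a self-contained model, not WebNN: MockBuilder, constantLike, and makeTanhTemplate are hypothetical names, and the shape logic only handles a scalar combined with a tensor or two tensors of identical shape.

```javascript
// Hypothetical sketch: the template is a function of the builder and the
// input descriptor, so data type and shape are bound at usage time.
function makeTanhTemplate() {
    return (builder, inputDesc) => {
        const x = builder.input("input", inputDesc);
        // Constants take the data type of the operand they combine with,
        // in the spirit of ONNX CastLike.
        const two = builder.constantLike(x, 2);
        const one = builder.constantLike(x, 1);
        const exp2x = builder.exp(builder.mul(two, x));
        return builder.div(builder.sub(exp2x, one), builder.add(exp2x, one));
    };
}

// Minimal stand-in for a graph builder that only tracks dataType and shape.
class MockBuilder {
    input(name, desc) {
        return { op: "input", name, dataType: desc.dataType, shape: [...desc.shape] };
    }
    constantLike(like, value) {
        return { op: "constant", value, dataType: like.dataType, shape: [] };
    }
    exp(a) { return { op: "exp", dataType: a.dataType, shape: a.shape }; }
    mul(a, b) { return this.binary("mul", a, b); }
    sub(a, b) { return this.binary("sub", a, b); }
    add(a, b) { return this.binary("add", a, b); }
    div(a, b) { return this.binary("div", a, b); }
    binary(op, a, b) {
        if (a.dataType !== b.dataType) throw new TypeError(`${op}: data type mismatch`);
        // Simplified broadcasting: keep the shape of the higher-rank operand.
        const shape = a.shape.length >= b.shape.length ? a.shape : b.shape;
        return { op, dataType: a.dataType, shape };
    }
}

// The same template is instantiated for two data types and shapes.
const template = makeTanhTemplate();
const f16 = template(new MockBuilder(), { dataType: "float16", shape: [2, 3] });
const f32 = template(new MockBuilder(), { dataType: "float32", shape: [1, 128] });
console.log(f16.dataType, f16.shape);
console.log(f32.dataType, f32.shape);
```

A closure like this can already be written in user code today; what the proposal would add is a way for the backend to see the subgraph boundary, so the open question is how much of this late binding should live inside the builder itself.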

(Note that this is different from #6: this issue is about subgraph composition from existing primitive operators rather than, say, interop with custom WebGPU shaders.)

Additionally:

  • Should attributes be parameterizable?
  • Should optional inputs be supported?
  • Should subgraphs optionally include a name (or a list of names)?
