Vectorize optimized_portable_ops versions of portable ops? #9241

Open
@swolchok

🚀 The feature, motivation and pitch

Similarly to #8932, we should be able to conditionally compile portable ops to do some vectorization. I imagine this would look like either passing a second lambda to our util functions, or passing template lambdas that we could then use for both a scalar T and Vectorized<T>. The second option would require a std-workalike interface to Vectorized operations so that things like exp work seamlessly; that would probably have a solution similar to pytorch/pytorch#144495.
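To make the first option concrete, here is a rough sketch of a util function that takes both a scalar lambda and a vectorized lambda and picks the vector path only when a build flag is set. Everything named here is an assumption for illustration: `Vec4` is a toy 4-lane stand-in for Vectorized<float> (plain loops, no real SIMD), and `SKETCH_ENABLE_VECTORIZATION`, `apply_unary_two_lambdas`, and `double_all` are made-up names, not existing ExecuTorch APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flag; in a real build this would come from the build system.
#define SKETCH_ENABLE_VECTORIZATION 1

// Hypothetical 4-lane stand-in for Vectorized<float>; plain loops, no real SIMD.
struct Vec4 {
  std::array<float, 4> lanes{};
  static constexpr std::size_t size() { return 4; }
  static Vec4 loadu(const float* p) {
    Vec4 v;
    for (std::size_t i = 0; i < size(); ++i) v.lanes[i] = p[i];
    return v;
  }
  void store(float* p) const {
    for (std::size_t i = 0; i < size(); ++i) p[i] = lanes[i];
  }
};

inline Vec4 operator+(const Vec4& a, const Vec4& b) {
  Vec4 r;
  for (std::size_t i = 0; i < Vec4::size(); ++i) r.lanes[i] = a.lanes[i] + b.lanes[i];
  return r;
}

// Option one: the caller supplies a scalar lambda and a separate vector lambda.
template <typename ScalarOp, typename VecOp>
void apply_unary_two_lambdas(const ScalarOp& scalar_op, const VecOp& vec_op,
                             const float* in, float* out, std::size_t n) {
  assert(n == 0 || (in != nullptr && out != nullptr));
  std::size_t i = 0;
#if SKETCH_ENABLE_VECTORIZATION
  // Vector path: process full groups of Vec4::size() elements.
  for (; i + Vec4::size() <= n; i += Vec4::size()) {
    vec_op(Vec4::loadu(in + i)).store(out + i);
  }
#else
  (void)vec_op;  // Vector lambda is unused when vectorization is compiled out.
#endif
  // Scalar path: handles the tail, or everything when the flag is off.
  for (; i < n; ++i) out[i] = scalar_op(in[i]);
}

// Example op: doubles every element, with the logic written twice.
inline std::vector<float> double_all(const std::vector<float>& in) {
  std::vector<float> out(in.size());
  apply_unary_two_lambdas(
      [](float x) { return x + x; },
      [](const Vec4& v) { return v + v; },
      in.data(), out.data(), in.size());
  return out;
}
```

The cost of this option shows up in `double_all`: the same arithmetic has to be written once per lambda, which is the duplication the template-lambda option aims to avoid.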

RFC

As a concrete example, op_add currently calls a util workhorse function with a lambda:

    utils::apply_bitensor_elementwise_fn<CTYPE_COMPUTE, op_name>(
        [val_alpha](const CTYPE_COMPUTE val_a, const CTYPE_COMPUTE val_b) {
          return val_a + val_alpha * val_b;
        },

We could imagine instead making the call look like this, with a template lambda, so that we could seamlessly use the lambda with Vectorized:

    utils::apply_bitensor_elementwise_fn<CTYPE_COMPUTE, op_name>(
        [val_alpha](const auto val_a, const auto val_b) {
          return val_a + val_alpha * val_b;
        },
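As a rough illustration of why the `auto` parameters help, the sketch below instantiates one generic lambda for both a scalar and a vector type. `Vec4` is a toy 4-lane stand-in for Vectorized<float> (plain loops, no real SIMD), and `apply_bitensor_elementwise_sketch` and `add_with_alpha` are hypothetical names, not the real util functions. The `float * Vec4` overload is an assumption made so that `val_alpha * val_b` compiles unchanged; the real Vectorized may instead need the scalar broadcast explicitly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4-lane stand-in for Vectorized<float>; plain loops, no real SIMD.
struct Vec4 {
  std::array<float, 4> lanes{};
  static constexpr std::size_t size() { return 4; }
  static Vec4 loadu(const float* p) {
    Vec4 v;
    for (std::size_t i = 0; i < size(); ++i) v.lanes[i] = p[i];
    return v;
  }
  void store(float* p) const {
    for (std::size_t i = 0; i < size(); ++i) p[i] = lanes[i];
  }
};

inline Vec4 operator+(const Vec4& a, const Vec4& b) {
  Vec4 r;
  for (std::size_t i = 0; i < Vec4::size(); ++i) r.lanes[i] = a.lanes[i] + b.lanes[i];
  return r;
}

// Assumed scalar-times-vector overload: multiplies every lane by s.
inline Vec4 operator*(float s, const Vec4& b) {
  Vec4 r;
  for (std::size_t i = 0; i < Vec4::size(); ++i) r.lanes[i] = s * b.lanes[i];
  return r;
}

// One generic lambda is instantiated twice: with Vec4 for the bulk of the
// data and with float for the tail.
template <typename Op>
void apply_bitensor_elementwise_sketch(const Op& op, const float* a,
                                       const float* b, float* out,
                                       std::size_t n) {
  std::size_t i = 0;
  for (; i + Vec4::size() <= n; i += Vec4::size()) {
    op(Vec4::loadu(a + i), Vec4::loadu(b + i)).store(out + i);
  }
  for (; i < n; ++i) out[i] = op(a[i], b[i]);
}

inline std::vector<float> add_with_alpha(const std::vector<float>& a,
                                         const std::vector<float>& b,
                                         float val_alpha) {
  assert(a.size() == b.size());
  std::vector<float> out(a.size());
  apply_bitensor_elementwise_sketch(
      [val_alpha](const auto val_a, const auto val_b) {
        return val_a + val_alpha * val_b;
      },
      a.data(), b.data(), out.data(), a.size());
  return out;
}
```

The lambda body is identical to the one in the proposed op_add call; only the helper decides which type it is instantiated with.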

A second, harder example is op_exp:

    Tensor& exp_out(KernelRuntimeContext& ctx, const Tensor& in, Tensor& out) {
      return internal::unary_ufunc_realhbbf16_to_floathbf16(std::exp, ctx, in, out);
    }

I think ideally we would find a solution to the above-mentioned PyTorch issue and then write this as

    Tensor& exp_out(KernelRuntimeContext& ctx, const Tensor& in, Tensor& out) {
      return internal::unary_ufunc_realhbbf16_to_floathbf16_v2(
          [](auto x) { return c10::math::exp(x); }, ctx, in, out);
    }

using a template lambda that could be instantiated with either a scalar or Vectorized, as outlined above.
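A minimal sketch of what such a std-workalike overload set might look like follows. It does not reflect an actual `c10::math` API; the `sketch_math` namespace, the toy `Vec4` type (a 4-lane stand-in for Vectorized<float> with no real SIMD), and the `unary_ufunc_sketch` and `exp_all` helpers are all hypothetical. The point is only that a single name with scalar and vector overloads lets one generic lambda serve both paths.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 4-lane stand-in for Vectorized<float>; plain loops, no real SIMD.
struct Vec4 {
  std::array<float, 4> lanes{};
  static constexpr std::size_t size() { return 4; }
  static Vec4 loadu(const float* p) {
    Vec4 v;
    for (std::size_t i = 0; i < size(); ++i) v.lanes[i] = p[i];
    return v;
  }
  void store(float* p) const {
    for (std::size_t i = 0; i < size(); ++i) p[i] = lanes[i];
  }
};

// Hypothetical std-workalike namespace: one name, scalar and vector overloads.
namespace sketch_math {
inline float exp(float x) { return std::exp(x); }
inline Vec4 exp(const Vec4& v) {
  Vec4 r;
  // A real vector type would use a SIMD exp here; this just loops over lanes.
  for (std::size_t i = 0; i < Vec4::size(); ++i) r.lanes[i] = std::exp(v.lanes[i]);
  return r;
}
}  // namespace sketch_math

// Applies op to full Vec4 groups first, then to the scalar tail.
template <typename Op>
void unary_ufunc_sketch(const Op& op, const float* in, float* out, std::size_t n) {
  std::size_t i = 0;
  for (; i + Vec4::size() <= n; i += Vec4::size()) {
    op(Vec4::loadu(in + i)).store(out + i);
  }
  for (; i < n; ++i) out[i] = op(in[i]);
}

inline std::vector<float> exp_all(const std::vector<float>& in) {
  std::vector<float> out(in.size());
  // Overload resolution inside the generic lambda picks the scalar or the
  // vector exp depending on the type of x.
  unary_ufunc_sketch([](auto x) { return sketch_math::exp(x); },
                     in.data(), out.data(), in.size());
  return out;
}
```

Wrapping the call in a lambda matters here: a bare overloaded name cannot be deduced as a template argument, while the lambda defers overload resolution until it is instantiated with a concrete argument type.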

cc @larryliu0820 @manuelcandales

Labels: actionable (Items in the backlog waiting for an appropriate impl/fix), module: kernels (Issues related to kernel libraries and utilities, and code under kernels/)
