✨[Feature] Reducing Overhead with C++ Torchbind operation getting called up to Python

**Is your feature request related to a problem? Please describe.**


We are seeing TorchBind operators from the C++ runtime being called up into Python in order to dispatch, which adds overhead.

**Describe the solution you'd like**


We want these operators to run in C++ without going back to Python.

Potential solutions:

- Register the operator as a CUDA op.
- Re-export so that we don't need to be lifted into Python and instead run more like what happens in AOTInductor.
- Switch to an ExecuTorch-style integration rather than TorchBind.

**Describe alternatives you've considered**


**Additional context**

✨[Feature] Reducing Overhead with C++ Torchbind operation getting called up to Python #3942
